

Notes on Classical Groups
Peter J. Cameron
School of Mathematical Sciences
Queen Mary and Westfield College
London E1 4NS

These notes are the content of an M.Sc. course I gave at Queen Mary and
Westfield College, London, in January–March 2000.
I am grateful to the students on the course for their comments; to Keldon
Drudge, for standing in for me; and to Simeon Ball, for helpful discussions.

1. Fields and vector spaces

2. Linear and projective groups

3. Polarities and forms

4. Symplectic groups

5. Unitary groups

6. Orthogonal groups

7. Klein correspondence and triality

8. Further topics

A short bibliography on classical groups

1 Fields and vector spaces
In this section we revise some algebraic preliminaries and establish notation.

1.1 Division rings and fields
A division ring, or skew field, is a structure F with two binary operations called
addition and multiplication, satisfying the following conditions:

(a) (F, +) is an abelian group, with identity 0, called the additive group of F;

(b) (F \ {0}, ·) is a group, called the multiplicative group of F;

(c) left or right multiplication by any fixed element of F is an endomorphism of
the additive group of F.

Note that condition (c) expresses the two distributive laws. Note that we must
assume both, since one does not follow from the other.
The identity element of the multiplicative group is called 1.
A field is a division ring whose multiplication is commutative (that is, whose
multiplicative group is abelian).

Exercise 1.1 Prove that the commutativity of addition follows from the other ax-
ioms for a division ring (that is, we need only assume that (F, +) is a group in (a)).

Exercise 1.2 A real quaternion has the form a + bi + cj + dk, where a, b, c, d ∈
R. Addition and multiplication are given by “the usual rules”, together with the
following rules for multiplication of the elements 1, i, j, k:

·   1   i   j   k
1   1   i   j   k
i   i  −1   k  −j
j   j  −k  −1   i
k   k   j  −i  −1

Prove that the set H of real quaternions is a division ring. (Hint: If q = a + bi +
cj + dk, let q* = a − bi − cj − dk; prove that qq* = a^2 + b^2 + c^2 + d^2.)
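The hint can be checked mechanically. The following sketch (the helper names qmul and conj are ours, not from the text) represents a quaternion a + bi + cj + dk as the tuple (a, b, c, d) and multiplies using the table above:

```python
# Quaternion product from the table above: i^2 = j^2 = k^2 = -1,
# ij = k, jk = i, ki = j, and the reversed products pick up a sign.
def qmul(p, q):
    a, b, c, d = p
    e, f, g, h = q
    return (a*e - b*f - c*g - d*h,
            a*f + b*e + c*h - d*g,
            a*g - b*h + c*e + d*f,
            a*h + b*g - c*f + d*e)

def conj(q):
    a, b, c, d = q
    return (a, -b, -c, -d)

q = (1, 2, 3, 4)
print(qmul(q, conj(q)))   # (30, 0, 0, 0): q q* = a^2 + b^2 + c^2 + d^2
```

Since qq* is a positive real for q ≠ 0, every non-zero quaternion has the inverse q*/(qq*), which is the substance of the exercise.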

Multiplication by zero induces the zero endomorphism of (F, +). Multiplication
by any non-zero element induces an automorphism (whose inverse is multiplication
by the inverse element). In particular, we see that the automorphism
group of (F, +) acts transitively on its non-zero elements. So all non-zero elements
have the same order, which is either infinite or a prime p. In the first case,
we say that the characteristic of F is zero; in the second case, it has characteristic p.
The structure of the multiplicative group is not so straightforward. However,
the possible finite subgroups can be determined. If F is a field, then any finite
subgroup of the multiplicative group is cyclic. To prove this we require Vandermonde's
Theorem:

Theorem 1.1 A polynomial equation of degree n over a field has at most n roots.

Exercise 1.3 Prove Vandermonde's Theorem. (Hint: If f(a) = 0, then f(x) =
(x − a)g(x).)

Theorem 1.2 A finite subgroup of the multiplicative group of a field is cyclic.

Proof An element ω of a field F is an nth root of unity if ω^n = 1; it is a primitive
nth root of unity if also ω^m ≠ 1 for 0 < m < n.
Let G be a subgroup of order n in the multiplicative group of the field F. By
Lagrange's Theorem, every element of G is an nth root of unity. If G contains a
primitive nth root of unity, then it is cyclic, and the number of primitive nth roots
is φ(n), where φ is Euler's function. If not, then of course the number of primitive
nth roots is zero. The same considerations apply of course to any divisor of n. So,
if ψ(m) denotes the number of primitive mth roots of unity in G, then

(a) for each divisor m of n, either ψ(m) = φ(m) or ψ(m) = 0.

Now every element of G has some finite order dividing n; so

(b) ∑_{m|n} ψ(m) = n.

Finally, a familiar property of Euler's function yields:

(c) ∑_{m|n} φ(m) = n.

From (a), (b) and (c) we conclude that ψ(m) = φ(m) for all divisors m of n. In
particular, ψ(n) = φ(n) ≠ 0, and G is cyclic.
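The counting argument in the proof is easy to check numerically. A sketch for the multiplicative group of GF(p) with p = 13 (an arbitrary prime; the names order, phi, psi are ours): the number ψ(m) of elements of order m equals φ(m) for every divisor m of n = p − 1.

```python
from math import gcd

p = 13
n = p - 1

def order(x):              # multiplicative order of x mod p
    k, y = 1, x
    while y != 1:
        y, k = y * x % p, k + 1
    return k

def phi(m):                # Euler's function, by direct count
    return sum(1 for k in range(1, m + 1) if gcd(k, m) == 1)

psi = {}
for x in range(1, p):
    psi[order(x)] = psi.get(order(x), 0) + 1

divisors = [m for m in range(1, n + 1) if n % m == 0]
print([(m, psi.get(m, 0), phi(m)) for m in divisors])
print(sum(phi(m) for m in divisors))    # = n, property (c) of the proof
```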

For division rings, the position is not so simple, since Vandermonde's Theorem
fails.

Exercise 1.4 Find all solutions of the equation x^2 + 1 = 0 in H.

However, the possibilities can be determined. Let G be a finite subgroup of
the multiplicative group of the division ring F. We claim that there is an abelian
group A such that G is a group of automorphisms of A acting semiregularly on the
non-zero elements. Let B be the subgroup of (F, +) generated by G. Then B is a
finitely generated abelian group admitting G acting semiregularly. If F has non-zero
characteristic, then B is elementary abelian; take A = B. Otherwise, choose
a prime p such that, for all x, g ∈ G, the element (xg − x)p^(−1) is not in B, and set
A = B/pB.
The structure of semiregular automorphism groups of finite groups (a.k.a.
Frobenius complements) was determined by Zassenhaus. See Passman, Permutation
Groups, Benjamin, New York, 1968, for a detailed account. In particular,
either G is metacyclic, or it has a normal subgroup isomorphic to SL(2, 3) or
SL(2, 5). (These are finite groups G having a unique subgroup Z of order 2, such
that G/Z is isomorphic to the alternating group A4 or A5 respectively. There is a
unique such group in each case.)

Exercise 1.5 Identify the division ring H of real quaternions with the real vector
space R^4 with basis {1, i, j, k}. Let U denote the multiplicative group of unit
quaternions, those elements a + bi + cj + dk satisfying a^2 + b^2 + c^2 + d^2 = 1. Show
that conjugation by a unit quaternion is an orthogonal transformation of R^4, fixing
the 1-dimensional space spanned by 1 and inducing an orthogonal transformation
on the 3-dimensional subspace spanned by i, j, k.
Prove that the map from U to the 3-dimensional orthogonal group has kernel
{±1} and image the group of rotations of 3-space (orthogonal transformations with
determinant 1).
Hence show that the groups SL(2, 3) and SL(2, 5) are finite subgroups of the
multiplicative group of H.
Remark: This construction explains why the groups SL(2, 3) and SL(2, 5) are
sometimes called the binary tetrahedral and binary icosahedral groups. Construct
also a binary octahedral group of order 48, and show that it is not isomorphic to
GL(2, 3) (the group of 2 × 2 invertible matrices over the integers mod 3), even
though both groups have normal subgroups of order 2 whose factor groups are
isomorphic to the symmetric group S4.

1.2 Finite fields

The basic facts about finite fields are summarised in the following two theorems,
due to Wedderburn and Galois respectively.

Theorem 1.3 Every finite division ring is commutative.

Theorem 1.4 The number of elements in a finite field is a prime power. Con-
versely, if q is a prime power, then there is a unique field with q elements, up to
isomorphism.

The unique finite field with a given prime power order q is called the Galois
field of order q, and denoted by GF(q) (or sometimes F_q). If q is prime, then
GF(q) is isomorphic to Z/qZ, the integers mod q.
We now summarise some results about GF(q).

Theorem 1.5 Let q = p^a, where p is prime and a is a positive integer. Let F = GF(q).

(a) F has characteristic p, and its additive group is an elementary abelian p-group.

(b) The multiplicative group of F is cyclic, generated by a primitive (p^a − 1)th
root of unity (called a primitive element of F).

(c) The automorphism group of F is cyclic of order a, generated by the Frobenius
automorphism x → x^p.

(d) For every divisor b of a, there is a unique subfield of F of order p^b, consisting
of all solutions of x^(p^b) = x; and these are all the subfields of F.

Proof Part (a) is obvious since the additive group contains an element of order p,
and part (b) follows from Theorem 1.2. Parts (c) and (d) are most easily proved
using Galois theory. Let E denote the subfield Z/pZ of F. Then the degree of
F over E is a. The Frobenius map σ : x → x^p is an E-automorphism of F, and
has order a; so F is a Galois extension of E, and σ generates the Galois group.
Now subfields of F necessarily contain E; by the Fundamental Theorem of Galois
Theory, they are the fixed fields of subgroups of the Galois group ⟨σ⟩.

For explicit calculation in F = GF(p^a), it is most convenient to represent it
as E[x]/(f), where E = Z/pZ, E[x] is the polynomial ring over E, and f is the
(irreducible) minimum polynomial of a primitive element of F. If α denotes the
coset (f) + x, then α is a root of f, and hence a primitive element.
Now every element of F can be written uniquely in the form

c_0 + c_1 α + · · · + c_(a−1) α^(a−1),

where c_0, c_1, …, c_(a−1) ∈ E; addition is straightforward in this representation. Also,
every non-zero element of F can be written uniquely in the form α^m, where
0 ≤ m < p^a − 1, since α is primitive; multiplication is straightforward in this
representation. Using the fact that f(α) = 0, it is possible to construct a table
matching up the two representations.

Example The polynomial x^3 + x + 1 is irreducible over E = Z/2Z. So the field
F = E(α) has eight elements, where α satisfies α^3 + α + 1 = 0 over E. We have
α^7 = 1, and the table of logarithms is as follows:

α^0   1
α^1   α
α^2   α^2
α^3   α + 1
α^4   α^2 + α
α^5   α^2 + α + 1
α^6   α^2 + 1

For example, (α^2 + α + 1)(α^2 + 1) = α^5 · α^6 = α^11 = α^4 = α^2 + α.
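The table above can be regenerated by machine. A sketch (the names MOD, gf8_mul and alpha are ours): an element c_0 + c_1 α + c_2 α^2 of GF(8) is stored as the integer with bit pattern c_2 c_1 c_0, and multiplication reduces modulo x^3 + x + 1.

```python
MOD = 0b1011                 # x^3 + x + 1 over GF(2)

def gf8_mul(u, v):
    r = 0
    for i in range(3):       # schoolbook multiply, coefficients mod 2
        if (v >> i) & 1:
            r ^= u << i
    for i in (5, 4, 3):      # reduce the degree-5..3 terms modulo MOD
        if (r >> i) & 1:
            r ^= MOD << (i - 3)
    return r

alpha = 0b010                # the class of x
powers = [1]
for _ in range(7):
    powers.append(gf8_mul(powers[-1], alpha))
print(powers)                    # [1, 2, 4, 3, 6, 7, 5, 1]: the log table, alpha^7 = 1
print(gf8_mul(0b111, 0b101))     # (alpha^2+alpha+1)(alpha^2+1) = 6 = alpha^2 + alpha
```

The list of powers reproduces the table (3 = α + 1, 6 = α^2 + α, and so on), and the final line checks the worked product.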

Exercise 1.6 Show that there are three irreducible polynomials of degree 4 over
the field Z/2Z, of which two are primitive. Hence construct GF(16) by the
method outlined above.

Exercise 1.7 Show that an irreducible polynomial of degree m over GF(q) has a
root in GF(q^n) if and only if m divides n.
Hence show that the number a_m of irreducible polynomials of degree m over
GF(q) satisfies

∑_{m|n} m a_m = q^n.
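The identity can be verified by brute force for small cases. A sketch for q = 2, n = 4 (the helpers gf2_polymod, irreducible and a are ours): polynomials over GF(2) are bitmasks, and irreducibility is tested by trial division by all polynomials of degree at most half the degree.

```python
def gf2_polymod(a, b):
    # remainder of a modulo b, both bitmasks of GF(2)[x] polynomials
    db = b.bit_length() - 1
    while a and a.bit_length() - 1 >= db:
        a ^= b << (a.bit_length() - 1 - db)
    return a

def irreducible(f):
    d = f.bit_length() - 1
    return d >= 1 and all(gf2_polymod(f, g) != 0
                          for g in range(2, 1 << (d // 2 + 1)))

def a(m):    # number of irreducible polynomials of degree m over GF(2)
    return sum(1 for f in range(1 << m, 1 << (m + 1)) if irreducible(f))

print([a(m) for m in (1, 2, 3, 4)])       # [2, 1, 2, 3]
print(sum(m * a(m) for m in (1, 2, 4)))   # divisors of n = 4: gives 2^4 = 16
```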

Exercise 1.8 Show that, if q is even, then every element of GF(q) is a square;
while, if q is odd, then half of the non-zero elements of GF(q) are squares and
half are non-squares.
If q is odd, show that −1 is a square in GF(q) if and only if q ≡ 1 (mod 4).
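A quick numerical check of the odd case, for the prime fields GF(13) and GF(7) (arbitrary choices; nonzero_squares is our helper name):

```python
def nonzero_squares(p):
    # the set of non-zero squares in GF(p), p an odd prime
    return {x * x % p for x in range(1, p)}

for p in (13, 7):
    sq = nonzero_squares(p)
    print(p, len(sq), (p - 1) in sq)
# 13: 6 squares, and -1 (= 12) is a square, since 13 = 1 mod 4
# 7:  3 squares, and -1 (= 6) is not,      since 7 = 3 mod 4
```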

1.3 Vector spaces

A left vector space over a division ring F is a unital left F-module. That is, it is
an abelian group V, with an anti-homomorphism from F to End(V) mapping 1 to
the identity endomorphism of V.
Writing scalars on the left, we have (cd)v = c(dv) for all c, d ∈ F and v ∈ V:
that is, scalar multiplication by cd is the same as multiplication by d followed by
multiplication by c, not vice versa. (The opposite convention would make V a
right (rather than left) vector space; scalars would more naturally be written on
the right.) The unital condition simply means that 1v = v for all v ∈ V.
Note that F is a vector space over itself, using field multiplication for the scalar
multiplication.
If F is a division ring, the opposite division ring F° has the same underlying
set as F and the same addition, with multiplication given by

a ∘ b = ba.

Now a right vector space over F can be regarded as a left vector space over F°.
A linear transformation T : V → W between two left F-vector spaces V and
W is a vector space homomorphism; that is, a homomorphism of abelian groups
which commutes with scalar multiplication. We write linear transformations on
the right, so that we have
(cv)T = c(vT)
for all c ∈ F, v ∈ V. We add linear transformations, or multiply them by scalars,
pointwise (as functions), and multiply them by function composition; the results
are again linear transformations.
If a linear transformation T is one-to-one and onto, then the inverse map is
also a linear transformation; we say that T is invertible if this occurs.
Now Hom(V, W) denotes the set of all linear transformations from V to W.
The dual space of V is V* = Hom(V, F).

Exercise 1.9 Show that V* is a right vector space over F.

A vector space is finite-dimensional if it is finitely generated as an F-module.
A basis is a minimal generating set. Any two bases have the same number of
elements; this number is usually called the dimension of the vector space, but in
order to avoid confusion with a slightly different geometric notion of dimension,
I will call it the rank of the vector space. The rank of V is denoted by rk(V ).
Every vector can be expressed uniquely as a linear combination of the vectors
in a basis. In particular, a linear combination of basis vectors is zero if and only if
all the coefficients are zero. Thus, a vector space of rank n over F is isomorphic
to F^n (with coordinatewise addition and scalar multiplication).
I will assume familiarity with standard results of linear algebra about ranks
of sums and intersections of subspaces, about ranks of images and kernels of
linear transformations, and about the representation of linear transformations by
matrices with respect to given bases.
As well as linear transformations, we require the concept of a semilinear
transformation between F-vector spaces V and W. This can be defined in two ways. It
is a map T from V to W satisfying

(a) (v_1 + v_2)T = v_1 T + v_2 T for all v_1, v_2 ∈ V;

(b) (cv)T = c^σ (vT) for all c ∈ F, v ∈ V, where σ is an automorphism of F called
the associated automorphism of T.

Note that, if T is not identically zero, the associated automorphism is uniquely
determined by T .
The second definition is as follows. Given an automorphism σ of F, we extend
the action of σ to F^n coordinatewise:

(c_1, …, c_n)σ = (c_1^σ, …, c_n^σ).

Hence we have an action of σ on any F-vector space with a given basis. Now a
σ-semilinear transformation from V to W is the composition of a linear
transformation from V to W with the action of σ on W (with respect to some basis).
The fact that the two definitions agree follows from the observations:

• the action of σ on F^n is semilinear in the first sense;

• the composition of semilinear transformations is semilinear (and the associated
automorphism is the composition of the associated automorphisms of
the factors).

This immediately shows that a semilinear map in the second sense is semilinear
in the first. Conversely, if T is semilinear with associated automorphism σ, then
the composition of T with σ^(−1) is linear, so T is σ-semilinear.

Exercise 1.10 Prove the above assertions.

If a semilinear transformation T is one-to-one and onto, then the inverse map
is also a semilinear transformation; we say that T is invertible if this occurs.
Almost exclusively, I will consider only finite-dimensional vector spaces. To
complete the picture, here is the situation in general. In ZFC (Zermelo–Fraenkel
set theory with the Axiom of Choice), every vector space has a basis (a set of
vectors with the property that every vector has a unique expression as a linear
combination of a finite set of basis vectors with non-zero coefficients), and any
two bases have the same cardinal number of elements. However, without the
Axiom of Choice, there may exist a vector space which has no basis.
Note also that there exist division rings F with bimodules V such that V has
different ranks when regarded as a left or a right vector space.

1.4 Projective spaces

It is not easy to give a concise definition of a projective space, since projective
geometry means several different things: a geometry with points, lines, planes,
and so on; a topological manifold with a strange kind of torsion; a lattice with
meet, join, and order; an abstract incidence structure; a tool for computer graphics.
Let V be a vector space of rank n + 1 over a field F. The “objects” of the
n-dimensional projective space are the subspaces of V, apart from V itself and the
zero subspace {0}. Each object is assigned a dimension which is one less than its
rank, and we use geometric terminology, so that points, lines and planes are the
objects of dimension 0, 1 and 2 (that is, rank 1, 2, 3 respectively). A hyperplane is
an object having codimension 1 (that is, dimension n − 1, or rank n). Two objects
are incident if one contains the other. So two objects of the same dimension are
incident if and only if they are equal.
The n-dimensional projective space is denoted by PG(n, F). If F is the Galois
field GF(q), we abbreviate PG(n, GF(q)) to PG(n, q). A similar convention will
be used for other geometries and groups over finite fields.
A 0-dimensional projective space has no internal structure at all, like an
idealised point. A 1-dimensional projective space is just a set of points, one more
than the number of elements of F, with (at the moment) no further structure. (If
{e_1, e_2} is a basis for V, then the points are spanned by the vectors λe_1 + e_2 (for
λ ∈ F) and e_1.)
For n > 1, PG(n, F) contains objects of different dimensions, and the relation
of incidence gives it a non-trivial structure.
Instead of our “incidence structure” model, we can represent a projective space
as a collection of subsets of a set. Let S be the set of points of PG(n, F). The point
shadow of an object U is the set of points incident with U. Now the point shadow
of a point P is simply {P}. Moreover, two objects are incident if and only if the
point shadow of one contains that of the other.
The diagram below shows PG(2, 2). It has seven points, labelled 1, 2, 3, 4, 5,
6, 7; the line shadows are 123, 145, 167, 246, 257, 347, 356 (where, for example,
123 is an abbreviation for {1, 2, 3}).

[Figure: the Fano plane PG(2, 2) — seven points and seven lines, one of the lines drawn as a circle.]
The correspondence between points and spanning vectors of the rank-1 sub-
spaces can be taken as follows:

1 2 3 4 5 6 7
(0, 0, 1) (0, 1, 0) (0, 1, 1) (1, 0, 0) (1, 0, 1) (1, 1, 0) (1, 1, 1)
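This small geometry can be generated directly from the vector space, as a sketch (variable names ours): the points of PG(2, 2) are the non-zero vectors of GF(2)^3 (each spans its own rank 1 subspace, since the only scalars are 0 and 1), and the lines are the rank 2 subspaces a·v = 0 for non-zero a.

```python
from itertools import product, combinations

points = [v for v in product((0, 1), repeat=3) if any(v)]

lines = [frozenset(v for v in points
                   if sum(x * y for x, y in zip(a, v)) % 2 == 0)
         for a in points]                 # one line per non-zero vector a

print(len(points), len(lines))            # 7 7
print(all(len(l) == 3 for l in lines))    # every line has exactly 3 points
# any two distinct points lie on exactly one common line:
print(all(sum(1 for l in lines if u in l and v in l) == 1
          for u, v in combinations(points, 2)))
```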

The following geometric properties of projective spaces are easily verified
from the rank formulae of linear algebra:

(a) Any two distinct points are incident with a unique line.

(b) Two distinct lines contained in a plane are incident with a unique point.

(c) Any three distinct points, or any two distinct collinear lines, are incident with
a unique plane.

(d) A line not incident with a given hyperplane meets it in a unique point.

(e) If two distinct points are both incident with some object of the projective
space, then the unique line incident with them is also incident with that
object.
Exercise 1.11 Prove the above assertions.

It is usual to be less formal with the language of incidence, and say “the point
P lies on the line L”, or “the line L passes through the point P” rather than “the
point P and the line L are incident”. Similar geometric language will be used
without further comment.
An isomorphism from a projective space Π1 to a projective space Π2 is a map
from the objects of Π1 to the objects of Π2 which preserves the dimensions of ob-
jects and also preserves the relation of incidence between objects. A collineation
of a projective space Π is an isomorphism from Π to Π.
The important theorem which connects this topic with that of the previous
section is the Fundamental Theorem of Projective Geometry:

Theorem 1.6 Any isomorphism of projective spaces of dimension at least two
is induced by an invertible semilinear transformation of the underlying vector
spaces. In particular, the collineations of PG(n, F) for n ≥ 2 are induced by
invertible semilinear transformations of the rank-(n + 1) vector space over F.

This theorem will not be proved here, but I make a few comments about the
proof. Consider first the case n = 2. One shows that the field F can be recovered
from the projective plane (that is, the addition and multiplication in F can
be defined by geometric constructions involving points and lines). The construction
is based on choosing four points of which no three are collinear. Hence any
collineation fixing these four points is induced by a field automorphism. Since
the group of invertible linear transformations acts transitively on quadruples of
points with this property, it follows that any collineation is induced by the
composition of a linear transformation and a field automorphism, that is, a semilinear
transformation.
For higher-dimensional spaces, we show that the coordinatisations of the planes
fit together in a consistent way to coordinatise the whole space.
In the next chapter we study properties of the collineation group of projective
spaces. Since we are concerned primarily with groups of matrices, I will normally
speak of PG(n − 1, F) as the projective space based on a vector space of rank n,
rather than PG(n, F) based on a vector space of rank n + 1.
Next we give some numerical information about ¬nite projective spaces.

Theorem 1.7 (a) The number of points in the projective space PG(n − 1, q) is
(q^n − 1)/(q − 1).

(b) More generally, the number of (m − 1)-dimensional subspaces of PG(n − 1, q) is

[(q^n − 1)(q^n − q) · · · (q^n − q^(m−1))] / [(q^m − 1)(q^m − q) · · · (q^m − q^(m−1))].

(c) The number of (m − 1)-dimensional subspaces of PG(n − 1, q) containing a
given (l − 1)-dimensional subspace is equal to the number of (m − l − 1)-
dimensional subspaces of PG(n − l − 1, q).

Proof (a) The projective space is based on a vector space of rank n, which contains
q^n vectors. One of these is the zero vector, and the remaining q^n − 1 each
span a subspace of rank 1. Each rank 1 subspace contains q − 1 non-zero vectors,
each of which spans it.
(b) Count the number of linearly independent m-tuples of vectors. The jth
vector must lie outside the rank (j − 1) subspace spanned by the preceding vectors,
so there are q^n − q^(j−1) choices for it. So the number of such m-tuples is the
numerator of the fraction. By the same argument (replacing n by m), the number
of linearly independent m-tuples which span a given rank m subspace is the
denominator of the fraction.
(c) If U is a rank l subspace of the rank n vector space V, then the Second
Isomorphism Theorem shows that there is a bijection between rank m subspaces
of V containing U, and rank (m − l) subspaces of the rank (n − l) vector space
V/U.
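The counts in the theorem are easy to evaluate in code; a sketch (the function name gaussian is ours) computes the fraction of part (b) directly:

```python
def gaussian(n, m, q):
    """Number of rank-m subspaces of a rank-n vector space over GF(q)."""
    num = den = 1
    for i in range(m):
        num *= q**n - q**i     # independent m-tuples in the whole space
        den *= q**m - q**i     # independent m-tuples spanning one subspace
    return num // den

print(gaussian(3, 1, 2))                    # 7: the points of PG(2, 2)
print(gaussian(4, 1, 2), gaussian(4, 3, 2)) # 15 15: points and hyperplanes agree
```

The equality of the last two values illustrates the duality discussed below: hyperplanes of PG(3, 2) are counted like points.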

The number given by the fraction in part (b) of the theorem is called a Gaussian
coefficient, written [n choose m]_q. Gaussian coefficients have properties resembling
those of binomial coefficients, to which they tend as q → 1.

Exercise 1.12 (a) Prove that

[n choose k]_q + q^(n−k+1) [n choose k−1]_q = [n+1 choose k]_q.

(b) Prove that for n ≥ 1,

∏_{i=0}^{n−1} (1 + q^i x) = ∑_{k=0}^{n} q^(k(k−1)/2) [n choose k]_q x^k.

(This result is known as the q-binomial theorem, since it reduces to the
binomial theorem as q → 1.)

If we regard a projective space PG(n − 1, F) purely as an incidence structure,
the dimensions of its objects are not uniquely determined. This is because there
is an additional symmetry known as duality. That is, if we regard the hyperplanes
as points, and define new dimensions by dim*(U) = n − 2 − dim(U), we again
obtain a projective space, with the same relation of incidence. The reason that it
is a projective space is as follows.
Let V* = Hom(V, F) be the dual space of V, where V is the underlying vector
space of PG(n − 1, F). Recall that V* is a right vector space over F, or equivalently
a left vector space over the opposite field F°. To each subspace U of V, there is a
corresponding subspace U† of V*, the annihilator of U, given by

U† = { f ∈ V* : uf = 0 for all u ∈ U}.

The correspondence U → U† is a bijection between the subspaces of V and the
subspaces of V*; we denote the inverse map from subspaces of V* to subspaces
of V also by †. It satisfies

(a) (U†)† = U;

(b) U_1 ≤ U_2 if and only if U_1† ≥ U_2†;

(c) rk(U†) = n − rk(U).
Thus we have:

Theorem 1.8 The dual of PG(n − 1, F) is the projective space PG(n − 1, F°). In
particular, if n ≥ 3, then PG(n − 1, F) is isomorphic to its dual if and only if F is
isomorphic to its opposite F°.

Proof The first assertion follows from our remarks. The second follows from the
first by use of the Fundamental Theorem of Projective Geometry.

Thus, PG(n − 1, F) is self-dual if F is commutative, and for some non-commutative
division rings such as H; but there are division rings F for which F ≇ F°.
An isomorphism from F to its opposite is a bijection σ satisfying

(a + b)^σ = a^σ + b^σ,
(ab)^σ = b^σ a^σ,

for all a, b ∈ F. Such a map is called an anti-automorphism of F.

Exercise 1.13 Show that H ≅ H°. (Hint: (a + bi + cj + dk)^σ = a − bi − cj − dk.)

2 Linear and projective groups
In this section, we define and study the general and special linear groups and their
projective versions. We look at the actions of the projective groups on the points of
the projective space, and discuss transitivity properties, generation, and simplicity
of these groups.

2.1 The general linear groups

Let F be a division ring. As we saw, a vector space of rank n over F can be
identified with the standard space F^n (with scalars on the left) by choosing a basis.
Any invertible linear transformation of V is then represented by an invertible n × n
matrix, acting on F^n by right multiplication.
We let GL(n, F) denote the group of all invertible n × n matrices over F, with
the operation of matrix multiplication.
The group GL(n, F) acts on the projective space PG(n − 1, F), since an invertible
linear transformation maps a subspace to another subspace of the same rank.

Proposition 2.1 The kernel of the action of GL(n, F) on the set of points of
PG(n − 1, F) is the subgroup

{cI : c ∈ Z(F), c ≠ 0}

of central scalar matrices, where Z(F) denotes the centre of F.

Proof Let A = (a_ij) be an invertible matrix which fixes every rank 1 subspace of
F^n. Thus, A maps each non-zero vector (x_1, …, x_n) to a scalar multiple (cx_1, …, cx_n)
of itself.
Let e_i be the ith basis vector, with 1 in position i and 0 elsewhere. Then
e_i A = c_i e_i, so the ith row of A is c_i e_i. This shows that A is a diagonal matrix.
Now for i ≠ j, we have

c_i e_i + c_j e_j = (e_i + e_j)A = d(e_i + e_j)

for some d. So c_i = c_j. Thus, A is a scalar matrix cI.
Finally, let a ∈ F, a ≠ 0. Then

c(ae_1) = (ae_1)A = a(e_1 A) = ace_1,

so ac = ca. Thus, c ∈ Z(F).

Let Z be the kernel of this action. We define the projective general linear
group PGL(n, F) to be the group induced on the points of the projective space
PG(n − 1, F) by GL(n, F). Thus,

PGL(n, F) ≅ GL(n, F)/Z.

In the case where F is the finite field GF(q), we write GL(n, q) and PGL(n, q)
in place of GL(n, F) and PGL(n, F) (with similar conventions for the groups we
meet later). Now we can compute the orders of these groups:
Theorem 2.2 (a) |GL(n, q)| = (q^n − 1)(q^n − q) · · · (q^n − q^(n−1));

(b) |PGL(n, q)| = |GL(n, q)|/(q − 1).
Proof (a) The rows of an invertible matrix over a field are linearly independent;
that is, for i = 1, …, n, the ith row lies outside the subspace of rank i − 1 generated
by the preceding rows. Now the number of vectors in a subspace of rank i − 1 over
GF(q) is q^(i−1), so the number of choices for the ith row is q^n − q^(i−1). Multiplying
these numbers for i = 1, …, n gives the result.
(b) PGL(n, q) is the image of GL(n, q) under a homomorphism whose kernel
consists of non-zero scalar matrices and so has order q − 1.
If the field F is commutative, then the determinant function is defined on n × n
matrices over F and is a multiplicative map to F:
det(AB) = det(A) det(B).
Also, det(A) ≠ 0 if and only if A is invertible. So det is a homomorphism from
GL(n, F) to F*, the multiplicative group of F (also known as GL(1, F)). This
homomorphism is onto, since the matrix with c in the top left corner, 1 in the
other diagonal positions, and 0 elsewhere has determinant c.
The kernel of this homomorphism is the special linear group SL(n, F), a normal
subgroup of GL(n, F) with factor group isomorphic to F*.
We define the projective special linear group PSL(n, F) to be the image of
SL(n, F) under the homomorphism from GL(n, F) to PGL(n, F), that is, the group
induced on the projective space by SL(n, F). Thus,

PSL(n, F) ≅ SL(n, F)/(SL(n, F) ∩ Z).

The kernel of this homomorphism consists of the scalar matrices cI which have
determinant 1, that is, those cI for which c^n = 1. This is a finite cyclic group
whose order divides n.
Again, for finite fields, we can calculate the orders:

Theorem 2.3 (a) |SL(n, q)| = |GL(n, q)|/(q − 1);

(b) |PSL(n, q)| = |SL(n, q)|/(n, q − 1), where (n, q − 1) is the greatest common
divisor of n and q − 1.

Proof (a) SL(n, q) is the kernel of the determinant homomorphism on GL(n, q),
whose image F* has order q − 1.
(b) From the remark before the theorem, we see that PSL(n, q) is the image of
SL(n, q) under a homomorphism whose kernel is the group of nth roots of unity
in GF(q). Since the multiplicative group of this field is cyclic of order q − 1, the
nth roots form a subgroup of order (n, q − 1).
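The order formulae of Theorems 2.2 and 2.3 are straightforward to code; a sketch (the helper names are ours):

```python
from math import gcd

def gl_order(n, q):
    # |GL(n, q)| = (q^n - 1)(q^n - q) ... (q^n - q^(n-1))
    o = 1
    for i in range(n):
        o *= q**n - q**i
    return o

def pgl_order(n, q): return gl_order(n, q) // (q - 1)
def sl_order(n, q):  return gl_order(n, q) // (q - 1)
def psl_order(n, q): return sl_order(n, q) // gcd(n, q - 1)

print(gl_order(2, 5), pgl_order(2, 5), psl_order(2, 5))  # 480 120 60
print(psl_order(2, 4), psl_order(2, 9))                  # 60 360
```

The last line recovers two classical coincidences: PSL(2, 4) and PSL(2, 5) both have order 60 (each is isomorphic to A5), and PSL(2, 9) has order 360.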

A group G acts sharply transitively on a set Ω if its action is regular; that is, it
is transitive and the stabiliser of a point is the identity.

Theorem 2.4 Let F be a division ring. Then the group PGL(n, F) acts transitively
on the set of all (n + 1)-tuples of points of PG(n − 1, F) with the property that no
n points lie in a hyperplane; the stabiliser of such a tuple is isomorphic to the
group of inner automorphisms of the multiplicative group of F. In particular, if
F is commutative, then PGL(n, F) is sharply transitive on the set of such (n + 1)-tuples.
Proof Consider n points not lying in a hyperplane. The n vectors spanning these
points form a basis, and we may assume that this is the standard basis e_1, …, e_n of
F^n, where e_i has ith coordinate 1 and all others zero. The proof of Proposition 2.1
shows that GL(n, F) acts transitively on the set of such n-tuples, and the stabiliser of the
n points is the group of diagonal matrices. Now a vector v not lying in the
hyperplane spanned by any n − 1 of the basis vectors must have all its coordinates
non-zero, and conversely. Moreover, the group of diagonal matrices acts
transitively on the set of such vectors. This proves that PGL(n, F) is transitive on the set
of (n + 1)-tuples of the given form. Without loss of generality, we may assume
that v = e_1 + · · · + e_n = (1, 1, …, 1). Then the stabiliser of the n + 1 points consists
of the group of scalar matrices, which is isomorphic to the multiplicative group
F*. We have seen that the kernel of the action on the projective space is Z(F*), so
the group induced by the scalar matrices is F*/Z(F*), which is isomorphic to the
group of inner automorphisms of F*.

Corollary 2.5 The group PGL(2, F) is 3-transitive on the points of the projective
line PG(1, F); the stabiliser of three points is isomorphic to the group of inner
automorphisms of the multiplicative group of F. In particular, if F is commutative,
then PGL(2, F) is sharply 3-transitive on the points of the projective line.
For n > 2, the group PGL(n, F) is 2-transitive on the points of the projective
space PG(n − 1, F).

This follows from the theorem because, on the projective line, the hyperplanes
are the points, and so no two distinct points lie in a hyperplane; while, in
general, any two points are independent and can be extended to an (n + 1)-tuple
as in the theorem.
We can represent the set of points of the projective line as {∞} ∪ F, where
∞ = ⟨(1, 0)⟩ and a = ⟨(a, 1)⟩ for a ∈ F. Then the stabiliser of the three points
∞, 0, 1 acts in the natural way on F \ {0, 1} by conjugation.
For consider the effect of the diagonal matrix aI on the point ⟨(x, 1)⟩. This is
mapped to ⟨(xa, a)⟩, which is the same rank 1 subspace as ⟨(a^(−1)xa, 1)⟩; so in the
new representation, aI induces the map x → a^(−1)xa.
In this convenient representation, the action of PGL(2, F) can be represented
by linear fractional transformations. The matrix with rows (a, b) and (c, d) maps
⟨(x, 1)⟩ to ⟨(xa + c, xb + d)⟩, which spans the same point as ⟨((xb + d)^(−1)(xa + c), 1)⟩
if xb + d ≠ 0, or ⟨(1, 0)⟩ otherwise. Thus the transformation induced by this
matrix can be written as

x → (xb + d)^(−1)(xa + c),

provided we make standard conventions about ∞ (for example, 0^(−1)a = ∞ for
a ≠ 0, and (∞b + d)^(−1)(∞a + c) = b^(−1)a). If F is commutative, this transformation
is conveniently written as a fraction:

x → (ax + c)/(bx + d).
Exercise 2.1 Work out carefully all the conventions required to use the linear
fractional representation of PGL(2, F).
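As a working sketch of these conventions (names act, INF and the choice p = 5 are ours), the following computes the linear fractional action of PGL(2, 5) on GF(5) ∪ {∞} and confirms sharp 3-transitivity by counting the images of the triple (∞, 0, 1):

```python
from itertools import product

p = 5
INF = 'oo'      # the point <(1, 0)>

def act(m, x):
    # m = (a, b, c, d): the matrix with rows (a, b) and (c, d),
    # sending x to (xa + c)/(xb + d) with the conventions at infinity
    a, b, c, d = m
    if x == INF:                      # (1, 0) -> (a, b)
        return INF if b == 0 else a * pow(b, -1, p) % p
    num, den = (x * a + c) % p, (x * b + d) % p
    return INF if den == 0 else num * pow(den, -1, p) % p

triples = {(act(m, INF), act(m, 0), act(m, 1))
           for m in product(range(p), repeat=4)
           if (m[0] * m[3] - m[1] * m[2]) % p != 0}
print(len(triples), (p + 1) * p * (p - 1))   # both 120: every ordered triple occurs
```

Since |PGL(2, 5)| = 120 equals the number of ordered triples of distinct points of the 6-point line, the action is sharply 3-transitive, matching Corollary 2.5.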
Exercise 2.2 By Theorem 2.4, the order of PGL(n, q) is equal to the number of
(n + 1)-tuples of points of PG(n − 1, q) for which no n lie in a hyperplane. Use
this to give an alternative proof of Theorem 2.2.
Paul Cohn constructed an example of a division ring F such that all elements
of F \ {0, 1} are conjugate in the multiplicative group of F. For a division ring
F with this property, we see that PGL(2, F) is 4-transitive on the projective line.
This is the highest degree of transitivity that can be realised in this way.

Exercise 2.3 Show that, if F is a division ring with the above property, then F
has characteristic 2, and the multiplicative group of F is torsion-free and simple.
Exercise 2.4 Let F be a commutative field. Show that, for all n ≥ 2, the group
PSL(n, F) is 2-transitive on the points of the projective space PG(n − 1, F); it is
3-transitive if and only if n = 2 and every element of F is a square.

2.2 Generation
For the rest of this section, we assume that F is a commutative field. A transvec-
tion of the F-vector space V is a linear map T : V → V which satisfies rk(T − I) = 1
and (T − I)² = 0. Thus, if we choose a basis such that e1 spans the image of T − I
and e1, …, en−1 span the kernel, then T is represented by the matrix I + U, where
U has entry 1 in the top right position and 0 elsewhere. Note that a transvection
has determinant 1. The axis of the transvection is the hyperplane ker(T − I); this
subspace is fixed elementwise by T. Dually, the centre of T is the image of T − I;
every subspace containing this point is fixed by T (so that T acts trivially on the
quotient space).
Thus, a transvection is a map of the form

    x → x + (xf)a,

where a ∈ V and f ∈ V* satisfy af = 0 (that is, f ∈ a⊥). Its centre and axis are a
and ker(f) respectively.
The transformation of projective space induced by a transvection is called an
elation. The matrix form given earlier shows that all elations lie in PSL(n, F).
Theorem 2.6 For any n ≥ 2 and commutative field F, the group PSL(n, F) is
generated by the elations.
Proof We use induction on n.
Consider the case n = 2. The elations fixing a specified point, together with
the identity, form a group which acts regularly on the remaining points. (In the
linear fractional representation, this elation group is

    {x → x + a : a ∈ F},

fixing ∞.) Hence the group G generated by the elations is 2-transitive. So it is
enough to show that the stabiliser of the two points ∞ and 0 in G is the same as in
PSL(2, F), namely

    {x → a²x : a ∈ F, a ≠ 0}.

Given a ∈ F, a ≠ 0, we have

    [1 1] [1   0] [1 −a⁻¹] [1    0]   [a  0 ]
    [0 1] [a−1 1] [0   1 ] [a−a² 1] = [0 a⁻¹],

and the last matrix induces the linear fractional map x → ax/a⁻¹ = a²x, as re-
quired.
(The proof shows that two elation groups, with centres ∞ and 0, suffice to
generate PSL(2, F).)
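The four-matrix identity used in the n = 2 step alternates between the two elation groups with centres ∞ and 0 and produces the diagonal matrix diag(a, a⁻¹). A quick numerical check over a small field (a sketch, not part of the notes; p = 11 is an arbitrary choice):

```python
# Sketch: verify over GF(p) that
#   [[1,1],[0,1]] [[1,0],[a-1,1]] [[1,-1/a],[0,1]] [[1,0],[a-a^2,1]]
# equals diag(a, 1/a), for every a != 0.

p = 11

def mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) % p for j in range(2)]
            for i in range(2)]

for a in range(1, p):
    ai = pow(a, -1, p)                       # a^{-1} mod p
    M1 = [[1, 1], [0, 1]]                    # elation fixing oo
    M2 = [[1, 0], [(a - 1) % p, 1]]          # elation fixing 0
    M3 = [[1, (-ai) % p], [0, 1]]
    M4 = [[1, 0], [(a - a * a) % p, 1]]
    prod = mul(mul(M1, M2), mul(M3, M4))
    assert prod == [[a, 0], [0, ai]], (a, prod)
print("identity holds for all a != 0 in GF(%d)" % p)
```

The same computation works symbolically over any commutative field, which is what the proof uses.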
Now for the general case, we assume that PSL(n ’ 1, F) is generated by ela-
tions. Let G be the subgroup of PSL(n, F) generated by elations. First, we observe
that G is transitive; for, given any two points p1 and p2 , there is an elation on the
line p1 , p2 carrying p1 to p2 , which is induced by an elation on the whole space
(acting trivially on a complement to the line). So it is enough to show that the
stabiliser of a point p is generated by elations. Take an element g ∈ PSL(n, F)
¬xing p.
By induction, Gp induces at least the group PSL(n − 1, F) on the quotient
space V/p. So, multiplying g by a suitable product of elations, we may assume
that g induces an element on V/p which is diagonal, with all but one of its diagonal
elements equal to 1. In other words, we can assume that g has the form

    [ λ   0  ...  0    0  ]
    [ 0   1  ...  0    0  ]
    [ .   .       .    .  ]
    [ 0   0  ...  1    0  ]
    [ x1  x2 ... xn−1 λ⁻¹ ]

By further multiplication by elations, we may assume that x1 = … = xn−1 = 0.
Now the result follows from the matrix calculation given in the case n = 2.

Exercise 2.5 A homology is an element of PGL(n, F) which fixes a hyperplane
pointwise and also fixes a point not in this hyperplane. Thus, a homology is
represented in a suitable basis by a diagonal matrix with all its diagonal entries
except one equal to 1.

(a) Find two homologies whose product is an elation.

(b) Prove that PGL(n, F) is generated by homologies.

2.3 Iwasawa's Lemma
Let G be a permutation group on a set Ω: this means that G is a subgroup of the
symmetric group on Ω. Iwasawa's Lemma gives a criterion for G to be simple.
We will use this to prove the simplicity of PSL(n, F) and various other classical
groups.
Recall that G is primitive on Ω if it is transitive and there is no non-trivial
equivalence relation on Ω which is G-invariant: equivalently, if the stabiliser Gα
of a point α ∈ Ω is a maximal subgroup of G. Any 2-transitive group is primitive.
Iwasawa's Lemma is the following.

Theorem 2.7 Let G be primitive on Ω. Suppose that there is an abelian normal
subgroup A of Gα with the property that the conjugates of A generate G. Then any
non-trivial normal subgroup of G contains G′. In particular, if G = G′, then G is
simple.

Proof Suppose that N is a non-trivial normal subgroup of G. Then N ≰ Gα for
some α. Since Gα is a maximal subgroup of G, we have NGα = G.
Let g be any element of G. Write g = nh, where n ∈ N and h ∈ Gα. Then

    gAg⁻¹ = nhAh⁻¹n⁻¹ = nAn⁻¹,

since A is normal in Gα. Since N is normal in G we have gAg⁻¹ ≤ NA. Since the
conjugates of A generate G we see that G = NA. Then

    G/N = NA/N ≅ A/(A ∩ N)

is abelian, whence N ≥ G′, and we are done.

2.4 Simplicity
We now apply Iwasawa's Lemma to prove the simplicity of PSL(n, F). First, we
consider the two exceptional cases where the group is not simple.
Recall that PSL(2, q) is a subgroup of the symmetric group S(q+1), having order
(q + 1)q(q − 1)/(q − 1, 2).
(a) If q = 2, then PSL(2, q) is a subgroup of S3 of order 6, so PSL(2, 2) ≅ S3.
It is not simple, having a normal subgroup of order 3.
(b) If q = 3, then PSL(2, q) is a subgroup of S4 of order 12, so PSL(2, 3) ≅ A4.
It is not simple, having a normal subgroup of order 4.

(c) For comparison, we note that, if q = 4, then PSL(2, q) is a subgroup of S5
of order 60, so PSL(2, 4) ≅ A5. This group is simple.
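The orders quoted in (a)–(c) come from the standard order formula for PSL(n, q). A sketch (not part of the notes; the formula is the usual one, with gcd(n, q − 1) accounting for the scalars in SL(n, q)):

```python
# Sketch: |PSL(n, q)| = q^(n(n-1)/2) * prod_{i=2}^{n} (q^i - 1) / gcd(n, q - 1).
from math import gcd, prod

def psl_order(n, q):
    return (q ** (n * (n - 1) // 2)
            * prod(q ** i - 1 for i in range(2, n + 1))
            // gcd(n, q - 1))

# The three small cases above:
print(psl_order(2, 2), psl_order(2, 3), psl_order(2, 4))  # 6 12 60
```

The same function gives |PSL(3, 2)| = 168 and |PSL(4, 2)| = |PSL(3, 4)| = 20160, which are used later in this section.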

Lemma 2.8 The group PSL(n, F) is equal to its derived group if n > 2 or if
|F| > 3.
Proof The group G = PSL(n, F) acts transitively on incident point-hyperplane
pairs. Each such pair defines a unique elation group. So all the elation groups are
conjugate. These groups generate G. So the proof will be concluded if we can
show that some elation group is contained in G′.
Suppose that |F| > 3. It is enough to consider n = 2, since we can extend all
matrices in the argument below to rank n by appending a block consisting of the
identity of rank n − 2. There is an element a ∈ F with a² ≠ 0, 1. We saw in the
proof of Theorem 2.6 that SL(2, F) contains the matrix

    [a  0 ]
    [0 a⁻¹].

Now

    [1 −x] [a  0 ] [1 x] [a⁻¹ 0]   [1 (a² − 1)x]
    [0  1] [0 a⁻¹] [0 1] [0   a] = [0     1    ];

this equation expresses any element of the corresponding transvection group as a
commutator.
Finally suppose that |F| = 2 or 3. As above, it is enough to consider the case
n = 3. This is easier, since we have more room to manoeuvre in three dimensions:
we have

    [1 −x 0] [1 0  0] [1 x 0] [1 0 0]   [1 0 x]
    [0  1 0] [0 1 −1] [0 1 0] [0 1 1] = [0 1 0].
    [0  0 1] [0 0  1] [0 0 1] [0 0 1]   [0 0 1]

Lemma 2.9 Let Ω be the set of points of the projective space PG(n − 1, F). Then,
for α ∈ Ω, the set of elations with centre α, together with the identity, forms an
abelian normal subgroup of Gα.

Proof This is more conveniently shown for the corresponding transvections in
SL(n, F). But the transvections with centre spanned by the vector a consist of all
maps x → x + (xf)a, for f ∈ a⊥; these clearly form an abelian group isomorphic
to the additive group of a⊥.

Theorem 2.10 The group PSL(n, F) is simple if n > 2 or if |F| > 3.

Proof Let G = PSL(n, F). Then G is 2-transitive, and hence primitive, on the
set Ω of points of the projective space. The group A of elations with centre α
is an abelian normal subgroup of Gα, and the conjugates of A generate G (by
Theorem 2.6, since every elation has a centre). Apart from the two excluded
cases, G = G′. So G is simple, by Iwasawa's Lemma.

2.5 Small fields
We now have the family PSL(n, q), for (n, q) ≠ (2, 2), (2, 3), of finite simple groups.
(The first two members are not simple: we observed that PSL(2, 2) ≅ S3 and
PSL(2, 3) ≅ A4, neither of which is simple.) As is well known, Galois showed
that the alternating group An of degree n ≥ 5 is simple.

Exercise 2.6 Prove that the alternating group An is simple for n ≥ 5.

Some of these groups coincide:

Theorem 2.11 (a) PSL(2, 4) ≅ PSL(2, 5) ≅ A5.

(b) PSL(2, 7) ≅ PSL(3, 2).

(c) PSL(2, 9) ≅ A6.

(d) PSL(4, 2) ≅ A8.

Proofs of these isomorphisms are outlined below. Many of the details are left
as exercises. There are many other ways to proceed!

Theorem 2.12 Let G be a simple group of order (p + 1)p(p − 1)/2, where p is a
prime number greater than 3. Then G ≅ PSL(2, p).

Proof By Sylow's Theorem, the number of Sylow p-subgroups is congruent to 1
mod p and divides (p + 1)(p − 1)/2; also this number is greater than 1, since G
is simple. So there are p + 1 Sylow p-subgroups; and if P is a Sylow p-subgroup
and N = N_G(P), then |N| = p(p − 1)/2.
Consider G acting as a permutation group on the set Ω of cosets of N. Let ∞
denote the coset N. Then P fixes ∞ and permutes the other p cosets regularly. So
we can identify Ω with the set {∞} ∪ GF(p) such that a generator of P acts on Ω
as the permutation x → x + 1 (fixing ∞). We see that N is permutation isomorphic
to the group

    {x → a²x + b : a, b ∈ GF(p), a ≠ 0}.

More conveniently, elements of N can be represented as linear fractional transfor-
mations of Ω with determinant 1, since

    a²x + b = (ax + a⁻¹b)/(0x + a⁻¹).

Since G is 2-transitive on Ω, N is a maximal subgroup of G, and G is gener-
ated by N and an element t interchanging ∞ and 0, which can be chosen to be an
involution. If we can show that t is also represented by a linear fractional trans-
formation with determinant 1, then G will be a subgroup of the group PSL(2, p)
of all such transformations, and comparing orders will show that G = PSL(2, p).
We treat the case p ≡ −1 (mod 4); the other case is a little bit trickier.
The element t must normalise the stabiliser of ∞ and 0, which is the cyclic
group C = {x → a²x} of order (p − 1)/2 (having two orbits of size (p − 1)/2,
consisting of the non-zero squares and the non-squares in GF(p)). Also, t has
no fixed points. For the stabiliser of three points in G is trivial, so t cannot fix
more than 2 points; but the two-point stabiliser has odd order (p − 1)/2. Thus t
interchanges the two orbits of C.
There are various ways to show that t inverts C. One of them uses Burnside's
Transfer Theorem. Let q be any prime divisor of (p − 1)/2, and let Q be a Sylow
q-subgroup of C (and hence of G). Clearly N_G(Q) = C⟨t⟩, so t must centralise or
invert Q. If t centralises Q, then Q ≤ Z(N_G(Q)), and Burnside's Transfer Theorem
implies that G has a normal q-complement, contradicting simplicity. So t inverts
every Sylow subgroup of C, and thus inverts C.
Now C⟨t⟩ is a dihedral group, containing (p − 1)/2 involutions, one inter-
changing the point 1 with each point in the other C-orbit. We may choose t so
that it interchanges 1 with −1. Then the fact that t inverts C shows that it inter-
changes a² with −a⁻² for each non-zero a ∈ GF(p). So t is the linear fractional
map x → −1/x, and we are done.
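The two maps produced by this argument, x → x + 1 and t : x → −1/x, really do generate a group of the right order. A sketch for p = 7 (not part of the notes; the closure computation and the 'oo' encoding are illustrative choices):

```python
# Sketch: on {oo} ∪ GF(7), the maps x -> x + 1 and t : x -> -1/x generate a
# permutation group of order 168 = (7 + 1)·7·(7 - 1)/2, i.e. PSL(2, 7).

p = 7
OO = 'oo'
points = [OO] + list(range(p))
idx = {pt: i for i, pt in enumerate(points)}

def perm(f):
    return tuple(f(x) for x in points)

def neg_inv(x):
    # t : x -> -1/x, with t(oo) = 0 and t(0) = oo
    if x == OO:
        return 0
    if x == 0:
        return OO
    return (-pow(x, -1, p)) % p

gens = [perm(lambda x: OO if x == OO else (x + 1) % p),  # x -> x + 1
        perm(neg_inv)]

def compose(g, h):
    """The permutation 'first g, then h'."""
    return tuple(h[idx[g[i]]] for i in range(len(points)))

# Close the generating set under right multiplication by the generators.
group = set(gens)
frontier = list(gens)
while frontier:
    new = []
    for g in frontier:
        for h in gens:
            gh = compose(g, h)
            if gh not in group:
                group.add(gh)
                new.append(gh)
    frontier = new
print(len(group))  # 168
```

Since both maps are linear fractional with determinant 1, the group they generate sits inside PSL(2, 7), and the order count shows it is the whole group.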

Theorem 2.11(b) follows, since PSL(3, 2) is a simple group of order

    (2³ − 1)(2³ − 2)(2³ − 2²) = 168 = (7 + 1)·7·(7 − 1)/2.

Exercise 2.7 (a) Complete the proof of the above theorem in the case p = 5.
Hence prove Theorem 2.11(a).

(b) Show that a simple group of order 60 has five Sylow 2-subgroups, and hence
show that any such group is isomorphic to A5. Give an alternative proof of
Theorem 2.11(a).
Proof of Theorem 2.11(d) The simple group PSL(3, 2) of order 168 is the group
of collineations of the projective plane over GF(2), shown below.

[Figure: the Fano plane, the projective plane PG(2, 2) of seven points and seven lines.]

Since its index in S7 is 30, there are 30 different ways of assigning the structure
of a projective plane to a given set N = {1, 2, 3, 4, 5, 6, 7} of seven points; and since
PSL(3, 2), being simple, contains no odd permutations, it is contained in A7, so
these 30 planes fall into two orbits of 15 under the action of A7.
Let Ω be one of the A7-orbits. Each plane contains seven lines, so there are
15 × 7 = 105 pairs (L, Π), where L is a 3-subset of N, Π ∈ Ω, and L is a line of Π.
Thus, each of the (7 choose 3) = 35 triples is a line in exactly three of the planes in Ω.
We now define a new geometry G whose 'points' are the elements of Ω, and
whose 'lines' are the triples of elements containing a fixed line L. Clearly, any
two 'points' lie in at most one 'line', and a simple counting argument shows that
in fact two 'points' lie in a unique 'line'.
Let Π be a plane from the other A7-orbit. For each point n ∈ N, the three lines
of Π containing n belong to a unique plane of the set Ω. (Having chosen three
lines through a point, there are just two ways to complete the projective plane,
differing by an odd permutation.) In this way, each of the seven points of N gives
rise to a 'point' of Ω. Moreover, the three points of a line of Π correspond to
three 'points' of a 'line' in our new geometry G. Thus, G contains 'planes', each
isomorphic to the projective plane PG(2, 2).
It follows that G is isomorphic to PG(3, 2). The most direct way to see this is
to consider the set A = {0} ∪ Ω, and define a binary operation on A by the rules

    0 + Π = Π + 0 = Π for all Π ∈ Ω;
    Π + Π = 0 for all Π ∈ Ω;
    Π + Π′ = Π″ if {Π, Π′, Π″} is a 'line'.

Then A is an elementary abelian 2-group. (The associative law follows from the
fact that any three non-collinear 'points' lie in a 'plane'.) In other words, A is the
additive group of a rank 4 vector space over GF(2), and clearly G is the projective
geometry based on this vector space.
Now A7 ≤ Aut(G) = PSL(4, 2). (The last equality comes from the Funda-
mental Theorem of Projective Geometry and the fact that PSL(4, 2) = PΓL(4, 2),
since GF(2) has no non-trivial scalars or automorphisms.) By calculating orders,
we see that A7 has index 8 in PSL(4, 2). Thus, PSL(4, 2) is a permutation group
on the cosets of A7, that is, a subgroup of S8, and a similar calculation shows that
it has index 2 in S8. We conclude that PSL(4, 2) ≅ A8.
The proof of Theorem 2.11(c) is an exercise. Two approaches are outlined
below. Fill in the details.

Exercise 2.8 The field GF(9) can be represented as {a + bi : a, b ∈ GF(3)}, where
i² = −1. Let

    A = [1 1+i]    B = [ 0 1]
        [0  1 ],       [−1 0].

Show that

    A³ = I,  B² = −I,  (AB)⁵ = −I.

So the corresponding elements a, b ∈ G = PSL(2, 9) satisfy

    a³ = b² = (ab)⁵ = 1,

and so generate a subgroup H isomorphic to A5. Then H has index 6 in G, and
the action of G on the cosets of H shows that G ≤ S6. Then consideration of order
shows that G ≅ A6.
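The matrix relations in Exercise 2.8 are easy to confirm by computation. A sketch (not part of the notes; GF(9) is encoded here as pairs (a, b) standing for a + bi with coefficients mod 3):

```python
# Sketch: GF(9) = {a + bi : a, b in GF(3)}, i^2 = -1, elements as pairs (a, b).

def add(x, y):
    return ((x[0] + y[0]) % 3, (x[1] + y[1]) % 3)

def mul(x, y):
    # (a + bi)(c + di) = (ac - bd) + (ad + bc)i
    return ((x[0] * y[0] - x[1] * y[1]) % 3, (x[0] * y[1] + x[1] * y[0]) % 3)

ZERO, ONE = (0, 0), (1, 0)

def mmul(A, B):
    return [[add(mul(A[i][0], B[0][j]), mul(A[i][1], B[1][j])) for j in range(2)]
            for i in range(2)]

def mpow(A, k):
    R = [[ONE, ZERO], [ZERO, ONE]]
    for _ in range(k):
        R = mmul(R, A)
    return R

I2 = [[ONE, ZERO], [ZERO, ONE]]
MINUS_I2 = [[(2, 0), ZERO], [ZERO, (2, 0)]]
A = [[ONE, (1, 1)], [ZERO, ONE]]     # [[1, 1+i], [0, 1]]
B = [[ZERO, ONE], [(2, 0), ZERO]]    # [[0, 1], [-1, 0]]

print(mpow(A, 3) == I2, mpow(B, 2) == MINUS_I2, mpow(mmul(A, B), 5) == MINUS_I2)
# True True True
```

So the images of A and B in PSL(2, 9) satisfy the (2, 3, 5) relations, as the exercise asserts.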
Exercise 2.9 Let G = A6, and let H be the normaliser of a Sylow 3-subgroup of
G. Let G act on the 10 cosets of H. Show that H fixes one point and that its action
on the remaining points is permutation isomorphic to the group

    {x → a²x + b : a, b ∈ GF(9), a ≠ 0}

on GF(9). Choose an element outside H and, following the proof of
Theorem 2.12, show that its action is linear fractional (if the fixed point is labelled
as ∞). Deduce that A6 ≤ PSL(2, 9), and by considering orders, show that equality
holds.
Exercise 2.10 A Hall subgroup of a finite group G is a subgroup whose order and
index are coprime. Philip Hall proved that a finite soluble group G has Hall sub-
groups of all admissible orders m dividing |G| for which (m, |G|/m) = 1, and that
any two Hall subgroups of the same order in a finite soluble group are conjugate.

(a) Show that PSL(2, 5) fails to have a Hall subgroup of some admissible order.

(b) Show that PSL(2, 7) has non-conjugate Hall subgroups of the same order.

(c) Show that PSL(2, 11) has non-isomorphic Hall subgroups of the same order.

(d) Show that each of these groups is the smallest with the stated property.

Exercise 2.11 Show that PSL(4, 2) and PSL(3, 4) are non-isomorphic simple groups
of the same order.

3 Polarities and forms
3.1 Sesquilinear forms
We saw in Chapter 1 that the projective space PG(n − 1, F) is isomorphic to its
dual if and only if the field F is isomorphic to its opposite. More precisely, we
have the following. Let σ be an anti-automorphism of F, and V an F-vector space
of rank n. A sesquilinear form B on V is a function B : V × V → F which satisfies
the following conditions:

(a) B(c1x1 + c2x2, y) = c1B(x1, y) + c2B(x2, y), that is, B is a linear function of
its first argument;

(b) B(x, c1y1 + c2y2) = B(x, y1)c1^σ + B(x, y2)c2^σ, that is, B is a semilinear function
of its second argument, with field anti-automorphism σ.

(The word 'sesquilinear' means 'one-and-a-half'.) If σ is the identity (so that F is
commutative), we say that B is a bilinear form.
The left radical of B is the subspace {x ∈ V : (∀y ∈ V) B(x, y) = 0}, and the right
radical is the subspace {y ∈ V : (∀x ∈ V) B(x, y) = 0}.

Exercise 3.1 (a) Prove that the left and right radicals are subspaces.

(b) Show that the left and right radicals have the same rank (if V has finite
rank).

(c) Construct a bilinear form on a vector space of infinite rank such that the
left radical is zero and the right radical is non-zero.

The sesquilinear form B is called non-degenerate if its left and right radicals
are zero. (By the preceding exercise, it suffices to assume that one of the radicals
is zero.)
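For a bilinear form given by a Gram matrix G (so B(x, y) = x G yᵀ), the left radical is the null space of G, and non-degeneracy amounts to G being invertible. A small sketch over GF(5) (not part of the notes; the matrix G and the prime are illustrative choices):

```python
# Sketch: the rank of a Gram matrix over GF(p) determines the radicals:
# left and right radicals both have rank n - rank(G).

p = 5
G = [[0, 1, 0], [4, 0, 0], [0, 0, 0]]  # a degenerate bilinear form on GF(5)^3

def rank_mod_p(M, p):
    """Row-reduce a copy of M over GF(p) and return its rank."""
    M = [row[:] for row in M]
    rank, rows, cols = 0, len(M), len(M[0])
    for c in range(cols):
        piv = next((r for r in range(rank, rows) if M[r][c] % p), None)
        if piv is None:
            continue
        M[rank], M[piv] = M[piv], M[rank]
        inv = pow(M[rank][c], -1, p)
        M[rank] = [(v * inv) % p for v in M[rank]]
        for r in range(rows):
            if r != rank and M[r][c] % p:
                f = M[r][c]
                M[r] = [(M[r][j] - f * M[rank][j]) % p for j in range(cols)]
        rank += 1
    return rank

print(rank_mod_p(G, p))  # 2, so each radical has rank 3 - 2 = 1
```

This also illustrates Exercise 3.1(b): over a finite-rank space the two radicals share the same rank, since row rank and column rank of G agree.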
A non-degenerate sesquilinear form induces a duality of PG(n − 1, F) (an iso-
morphism from PG(n − 1, F) to PG(n − 1, F°)) as follows: for any y ∈ V, the map
x → B(x, y) is a linear map from V to F, that is, an element of the dual space V*
(which is a left vector space of rank n over F°); if we call this element βy, then the
map y → βy is a σ-semilinear bijection from V to V*, and so induces the required
duality.

Theorem 3.1 For n ≥ 3, any duality of PG(n − 1, F) is induced in this way by a
non-degenerate sesquilinear form on V = F^n.

Proof By the Fundamental Theorem of Projective Geometry, a duality is induced
by a σ-semilinear bijection φ from V to V*, for some anti-automorphism σ. Set

    B(x, y) = x(yφ).

We can short-circuit the passage to the dual space, and write the duality as

    U → U⊥ = {x ∈ V : B(x, y) = 0 for all y ∈ U}.

Obviously, a duality applied twice is a collineation. The most important types
of dualities are those whose square is the identity. A polarity of PG(n − 1, F) is a
duality ⊥ which satisfies U⊥⊥ = U for all flats U of PG(n − 1, F).
It will turn out that polarities give rise to a class of geometries (the polar
spaces) with properties similar to those of projective spaces, and define groups
analogous to the projective groups. If a duality is not a polarity, then any collineation
which respects it must commute with its square, which is a collineation; so the
group we obtain will lie inside the centraliser of some element of the collineation
group. So the "largest" subgroups obtained will be those preserving polarities.
A sesquilinear form B is reflexive if B(x, y) = 0 implies B(y, x) = 0.

Proposition 3.2 A duality is a polarity if and only if the sesquilinear form defining
it is reflexive.

Proof B is reflexive if and only if x ∈ y⊥ implies y ∈ x⊥. Hence, if B is reflexive,
then U ⊆ U⊥⊥ for all subspaces U. But by non-degeneracy, dim U⊥⊥ = dim V −
dim U⊥ = dim U; and so U = U⊥⊥ for all U. Conversely, given a polarity ⊥, if
y ∈ x⊥, then x ∈ x⊥⊥ ⊆ y⊥ (since inclusions are reversed).

We now turn to the classification of reflexive forms. For convenience, from
now on F will always be assumed to be commutative. (Note that, if the anti-
automorphism σ is an automorphism, and in particular if σ is the identity, then F
is automatically commutative.)
The form B is said to be σ-Hermitian if B(y, x) = B(x, y)^σ for all x, y ∈ V. If B
is a non-zero σ-Hermitian form, then

(a) for any x, B(x, x) lies in the fixed field of σ;

(b) σ² = 1. For every scalar c is a value of B, say B(x, y) = c; then

    c^(σ²) = B(x, y)^(σ²) = B(y, x)^σ = B(x, y) = c.

If σ is the identity, such a form (which is bilinear) is called symmetric.
A bilinear form B is called alternating if B(x, x) = 0 for all x ∈ V. This implies
that B(x, y) = −B(y, x) for all x, y ∈ V. For

    0 = B(x + y, x + y) = B(x, x) + B(x, y) + B(y, x) + B(y, y) = B(x, y) + B(y, x).

Hence, if the characteristic is 2, then any alternating form is symmetric (but not
conversely); but, in characteristic different from 2, only the zero form is both
symmetric and alternating.
Clearly, an alternating or Hermitian form is reflexive. Conversely, we have the
following theorem.

Theorem 3.3 A non-degenerate reflexive σ-sesquilinear form is either alternat-
ing, or a scalar multiple of a σ-Hermitian form. In the latter case, if σ is the
identity, then the scalar can be taken to be 1.
Proof I will give the proof just for a bilinear form. Thus, it must be proved that
a non-degenerate reflexive bilinear form is either symmetric or alternating.
We have

    B(u, v)B(u, w) − B(u, w)B(u, v) = 0

by commutativity; that is, using bilinearity,

    B(u, B(u, v)w − B(u, w)v) = 0.

By reflexivity,

    B(B(u, v)w − B(u, w)v, u) = 0,

whence bilinearity again gives

    B(u, v)B(w, u) = B(u, w)B(v, u).    (1)

Call a vector u good if B(u, v) = B(v, u) ≠ 0 for some v. By Equation (1), if
u is good, then B(u, w) = B(w, u) for all w. Also, if u is good and B(u, v) ≠ 0,
then v is good. But, given any two non-zero vectors u1, u2, there exists v with
B(ui, v) ≠ 0 for i = 1, 2. (For there exist v1, v2 with B(ui, vi) ≠ 0 for i = 1, 2, by
non-degeneracy; and at least one of v1, v2, v1 + v2 has the required property.) So,
if some vector is good, then every non-zero vector is good, and B is symmetric.
But, putting u = w in Equation (1) gives

    B(u, u)(B(u, v) − B(v, u)) = 0

for all u, v. So, if u is not good, then B(u, u) = 0; and, if no vector is good, then B
is alternating.

Exercise 3.2 (a) Show that the left and right radicals of a reflexive form are
equal.

(b) Assuming Theorem 3.3, prove that the assumption of non-degeneracy in the
theorem can be removed.

Exercise 3.3 Let σ be a (non-identity) automorphism of F of order 2. Let E be
the subfield Fix(σ).
(a) Prove that F is of degree 2 over E, i.e., a rank 2 E-vector space.
[See any textbook on Galois theory. Alternatively, argue as follows: take λ ∈
F \ E. Then λ is quadratic over E, so E(λ) has degree 2 over E. Now E(λ)
contains an element ω such that ω^σ = −ω (if the characteristic is not 2) or ω^σ =
ω + 1 (if the characteristic is 2). Now, given two such elements, their quotient or
difference respectively is fixed by σ, so lies in E.]
(b) Prove that

    {λ ∈ F : λλ^σ = 1} = {μ/μ^σ : μ ∈ F, μ ≠ 0}.

[The left-hand set clearly contains the right. For the reverse inclusion, separate
into cases according as the characteristic is 2 or not.
If the characteristic is not 2, then we can take F = E(ω), where ω² = α ∈ E
and ω^σ = −ω. If λ = 1, then take μ = 1; otherwise, if λ = a + bω, take μ =
bα + (a − 1)ω.
If the characteristic is 2, show that we can take F = E(ω), where ω² + ω + α =
0, α ∈ E, and ω^σ = ω + 1. Again, if λ = 1, set μ = 1; else, if λ = a + bω, take
μ = (a + 1) + bω.]

Exercise 3.4 Use the result of the preceding exercise to complete the proof of
Theorem 3.3 in general.
[If B(u, u) = 0 for all u, the form B is alternating and bilinear. If not, suppose
that B(u, u) ≠ 0 and let B(u, u)^σ = λB(u, u). Choosing μ as in Exercise 3.3 and
re-normalising B, show that we may assume that λ = 1, and (with this choice) that
B is Hermitian.]

3.2 Hermitian and quadratic forms
We now change ground slightly from the last section. On the one hand, we restrict
things by excluding some bilinear forms from the discussion; on the other, we
introduce quadratic forms. The loss and gain exactly balance if the characteristic
is not 2; but, in characteristic 2, we make a net gain.
Let σ be an automorphism of the commutative field F, of order dividing 2. Let
Fix(σ) = {λ ∈ F : λ^σ = λ} be the fixed field of σ, and Tr(σ) = {λ + λ^σ : λ ∈ F}
the trace of σ. Since σ² is the identity, it is clear that Fix(σ) ⊇ Tr(σ). Moreover,
if σ is the identity, then Fix(σ) = F, and

    Tr(σ) = 0 if F has characteristic 2,
    Tr(σ) = F otherwise.

Let B be a σ-Hermitian form. We observed in the last section that B(x, x) ∈
Fix(σ) for all x ∈ V. We call the form B trace-valued if B(x, x) ∈ Tr(σ) for all
x ∈ V.
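Fix(σ) and Tr(σ) are easy to compute in a small example. A sketch (not part of the notes) for the Frobenius automorphism of GF(9), encoded as pairs (a, b) for a + bi with i² = −1:

```python
# Sketch: Fix(σ) and Tr(σ) for σ : x -> x^3 on GF(9), the non-trivial
# automorphism; GF(9) = {a + bi : a, b in GF(3)}, elements as pairs (a, b).

def mul(x, y):
    return ((x[0] * y[0] - x[1] * y[1]) % 3, (x[0] * y[1] + x[1] * y[0]) % 3)

def sigma(x):
    # x -> x^3 is the Frobenius, sending a + bi to a - bi
    return mul(mul(x, x), x)

field = [(a, b) for a in range(3) for b in range(3)]
fix = sorted(x for x in field if sigma(x) == x)
tr = sorted({((x[0] + sigma(x)[0]) % 3, (x[1] + sigma(x)[1]) % 3) for x in field})
print(fix == tr == [(0, 0), (1, 0), (2, 0)])  # True: both equal GF(3)
```

Here Fix(σ) = Tr(σ) = GF(3), consistent with Proposition 3.4 below, since the characteristic is 3 and σ is not the identity.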

Exercise 3.5 Let σ be an automorphism of a commutative field F such that σ² is
the identity.

(a) Prove that Fix(σ) is a subfield of F.

(b) Prove that Tr(σ) is closed under addition, and under multiplication by ele-
ments of Fix(σ).

Proposition 3.4 Tr(σ) = Fix(σ) unless the characteristic of F is 2 and σ is the
identity.

Proof E = Fix(σ) is a field, and K = Tr(σ) is an E-vector space contained in E
(Exercise 3.5). So, if K ≠ E, then K = 0, and σ is the map x → −x. But, since
σ is a field automorphism, this implies that the characteristic is 2 and σ is the
identity.
Thus, in characteristic 2, symmetric bilinear forms which are not alternating
are not trace-valued; but this is the only obstruction. We introduce quadratic forms
to repair this damage. But, of course, quadratic forms can be defined in any char-
acteristic. However, we note at this point that Theorem 3.3 depends in a crucial
way on the commutativity of F; this leaves open the possibility of additional types
of polar spaces defined by so-called pseudoquadratic forms. We will not pursue
this here: see Tits's classification of spherical buildings.
Let V be a vector space over F. A quadratic form on V is a function q : V → F
satisfying

(a) q(λx) = λ²q(x) for all λ ∈ F, x ∈ V;

(b) q(x + y) = q(x) + q(y) + B(x, y), where B is bilinear.

Now, if the characteristic of F is not 2, then B is a symmetric bilinear form.
Each of q and B determines the other, by

    B(x, y) = q(x + y) − q(x) − q(y),
    q(x) = (1/2)B(x, x),

the latter equation coming from the substitution x = y in (b). So nothing new is
obtained.
On the other hand, if the characteristic of F is 2, then B is an alternating bi-
linear form, and q cannot be recovered from B. Indeed, many different quadratic
forms correspond to the same bilinear form. (Note that the quadratic form does
give extra structure to the vector space; we'll see that this structure is geometri-
cally similar to that provided by an alternating or Hermitian form.)
We say that the bilinear form B is obtained by polarisation of q.
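Polarisation in characteristic 2 is easy to see concretely. A sketch (not part of the notes; the particular form q on GF(2)³ is a choice made here):

```python
# Sketch: over GF(2), polarising q(x) = x0*x1 + x2^2 gives an alternating
# bilinear form B; note that B loses the x2 part of q entirely, which is why
# q cannot be recovered from B in characteristic 2.
from itertools import product

def q(x):
    return (x[0] * x[1] + x[2] * x[2]) % 2

def B(x, y):
    s = tuple((a + b) % 2 for a, b in zip(x, y))
    return (q(s) + q(x) + q(y)) % 2     # + and - agree mod 2

V = list(product(range(2), repeat=3))
assert all(B(x, x) == 0 for x in V)                    # B is alternating
assert all(B(x, y) == B(y, x) for x in V for y in V)   # hence symmetric in char 2
print("B((1,0,0), (0,1,0)) =", B((1, 0, 0), (0, 1, 0)))  # 1
```

Changing q on the basis vectors (as in Exercise 3.6 below) changes q but leaves B untouched, illustrating that many quadratic forms polarise to the same bilinear form.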
Now let B be a symmetric bilinear form over a field of characteristic 2, which
is not alternating. Set f(x) = B(x, x). Then we have

    f(λx) = λ²f(x),
    f(x + y) = f(x) + f(y),

since B(x, y) + B(y, x) = 0. Thus f is "almost" a semilinear form; the map λ → λ²
is a homomorphism of the field F with kernel 0, but it may fail to be an automor-
phism. But in any case, the kernel of f is a subspace of V, and the restriction of
B to this subspace is an alternating bilinear form. So again, in the spirit of the
vague comment motivating the study of polarities in the last section, the structure
provided by the form B is not "primitive". For this reason, we do not consider
symmetric bilinear forms in characteristic 2 at all. However, as indicated above,
we will consider quadratic forms in characteristic 2.
Now, in characteristic different from 2, we can take either quadratic forms or
symmetric bilinear forms, since the structural content is the same. For consistency,
we will take quadratic forms in this case too. This leaves us with three "types" of
forms to study: alternating bilinear forms; σ-Hermitian forms where σ is not the
identity; and quadratic forms.
We have to define the analogue of non-degeneracy for quadratic forms. Of
course, we could require that the bilinear form obtained by polarisation is non-
degenerate; but this is too restrictive. We say that a quadratic form q is non-
degenerate if

    q(x) = 0 and (∀y ∈ V) B(x, y) = 0 imply x = 0,

where B is the associated bilinear form; that is, if the form q is non-zero on every
non-zero vector of the radical.
If the characteristic is not 2, then non-degeneracy of the quadratic form and of
the bilinear form are equivalent conditions.
Now suppose that the characteristic is 2, and let W be the radical of B. Then
B is identically zero on W; so the restriction of q to W satisfies

    q(x + y) = q(x) + q(y),
    q(λx) = λ²q(x).

As above, the restriction of q to W is very nearly semilinear.
The field F is called perfect if every element is a square. If F is perfect,
then the map x → x² is onto, and hence an automorphism of F; so the restriction
of q to W is indeed semilinear, and its kernel is a hyperplane of W. We conclude:

Theorem 3.5 Let q be a non-singular quadratic form, which polarises to B, over
a field F.

(a) If the characteristic of F is not 2, then B is non-degenerate.

(b) If F is a perfect field of characteristic 2, then the radical of B has rank at
most 1.
Exercise 3.6 Let B be an alternating bilinear form on a vector space V over a field
F of characteristic 2. Let (vi : i ∈ I) be a basis for V, and (ci : i ∈ I) any function
from I to F. Show that there is a unique quadratic form q with the properties that
q(vi) = ci for every i ∈ I, and q polarises to B.

Exercise 3.7 (a) Construct an imperfect field of characteristic 2.

(b) Construct a non-singular quadratic form with the property that the radical
of the associated bilinear form has rank greater than 1.

Exercise 3.8 Show that finite fields of characteristic 2 are perfect.
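The smallest non-prime case of Exercise 3.8 can be checked directly. A sketch (not part of the notes; GF(4) is encoded here as pairs (a, b) standing for a + bω with ω² = ω + 1):

```python
# Sketch of Exercise 3.8 for GF(4): the squaring map x -> x^2 is onto,
# so GF(4) is perfect.  GF(4) = {a + bω : a, b in GF(2)}, ω^2 = ω + 1.

def mul(x, y):
    # (a + bω)(c + dω) = ac + bd + (ad + bc + bd)ω, coefficients mod 2
    a, b = x
    c, d = y
    return ((a * c + b * d) % 2, (a * d + b * c + b * d) % 2)

field = [(a, b) for a in range(2) for b in range(2)]
squares = {mul(x, x) for x in field}
print(sorted(squares) == sorted(field))  # True: squaring is a bijection
```

In general the Frobenius map x → x² is injective on any field of characteristic 2, and on a finite field an injective map is automatically onto, which is the content of the exercise.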

3.3 Classification of forms
As explained in the last section, we now consider a vector space V of finite rank
equipped with a form of one of the following types: a non-degenerate alternating
bilinear form B; a non-degenerate trace-valued σ-Hermitian form B, where σ is
not the identity; or a non-singular quadratic form q. In the third case, we let B
be the bilinear form obtained by polarising q; then B is alternating or symmetric
according as the characteristic is or is not 2, but B may be degenerate. We also let
f denote the function q. In the other two cases, we define a function f : V → F by
f(x) = B(x, x); this is identically zero if B is alternating. See Exercise 3.10 for
the Hermitian case.
Such a pair (V, B) or (V, q) will be called a formed space.

Exercise 3.10 Let B be a σ-Hermitian form on a vector space V over F, where σ is
not the identity. Set f(x) = B(x, x). Let E = Fix(σ), and let V′ be V regarded as an
E-vector space by restricting scalars. Prove that f is a quadratic form on V′, which
polarises to the bilinear form Tr(B) defined by Tr(B)(x, y) = B(x, y) + B(x, y)^σ.
Show further that Tr(B) is non-degenerate if and only if B is.

We say that V is anisotropic if f(x) ≠ 0 for all x ≠ 0. Also, V is a hyperbolic
plane if it is spanned by vectors v and w with f(v) = f(w) = 0 and B(v, w) = 1.
(The vectors v and w are linearly independent, so V has rank 2.)
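The construction used in the proof of Theorem 3.6 below, namely finding a partner w for a singular vector v and rescaling it, can be carried out mechanically. A sketch for the alternating case (not part of the notes; the form on GF(3)⁴ and the vector v are illustrative choices):

```python
# Sketch: for the standard alternating form on GF(p)^4, find a hyperbolic
# partner for a given nonzero vector v, following the proof's recipe:
# pick w with B(v, w) != 0, then rescale so that B(v, w) = 1.
from itertools import product

p = 3

def B(x, y):
    return (x[0] * y[1] - x[1] * y[0] + x[2] * y[3] - x[3] * y[2]) % p

v = (1, 2, 0, 1)
# non-degeneracy guarantees some w with B(v, w) != 0
w = next(u for u in product(range(p), repeat=4) if B(v, u) != 0)
w = tuple((c * pow(B(v, w), -1, p)) % p for c in w)  # rescale: B(v, w) = 1
print(B(v, v), B(w, w), B(v, w))  # 0 0 1 -- a hyperbolic pair
```

For an alternating form f vanishes identically, so no further adjustment of w is needed; in the Hermitian and quadratic cases the proof replaces w by w − λv for a suitable λ, as in cases (b) and (c) below.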

Theorem 3.6 A non-degenerate formed space is the direct sum of a number r of
hyperbolic planes and an anisotropic space U. The number r and the isomorphism
type of U are invariants of V.

Proof If V is anisotropic, then there is nothing to prove, since V cannot contain
a hyperbolic plane. So suppose that V contains a vector v ≠ 0 with f(v) = 0.
We claim that there is a vector w with B(v, w) ≠ 0. In the alternating and
Hermitian cases, this follows immediately from the non-degeneracy of the form.
In the quadratic case, if no such vector exists, then v is in the radical of B; but v is
a singular vector, contradicting the non-degeneracy of f.
Multiplying w by a non-zero constant, we may assume that B(v, w) = 1.
Now, for any value of λ, we have B(v, w − λv) = 1. We wish to choose λ so
that f(w − λv) = 0; then v and w will span a hyperbolic plane. Now we distinguish
cases:

(a) If B is alternating, then any value of λ works.

(b) If B is Hermitian, we have

    f(w − λv) = f(w) − λB(v, w) − λ^σ B(w, v) + λλ^σ f(v)
              = f(w) − (λ + λ^σ);

and, since B is trace-valued, there exists λ with Tr(λ) = f(w).

(c) Finally, if f = q is quadratic, we have

    f(w − λv) = f(w) − λB(w, v) + λ²f(v)
              = f(w) − λ,
