Peter J. Cameron

School of Mathematical Sciences

Queen Mary and Westfield College

London E1 4NS

U.K.

p.j.cameron@qmw.ac.uk

These notes are the content of an M.Sc. course I gave at Queen Mary and
Westfield College, London, in January–March 2000.

I am grateful to the students on the course for their comments; to Keldon

Drudge, for standing in for me; and to Simeon Ball, for helpful discussions.

Contents:

1. Fields and vector spaces

2. Linear and projective groups

3. Polarities and forms

4. Symplectic groups

5. Unitary groups

6. Orthogonal groups

7. Klein correspondence and triality

8. Further topics

A short bibliography on classical groups


1 Fields and vector spaces

In this section we revise some algebraic preliminaries and establish notation.

1.1 Division rings and fields

A division ring, or skew field, is a structure F with two binary operations called

addition and multiplication, satisfying the following conditions:

(a) (F, +) is an abelian group, with identity 0, called the additive group of F;

(b) (F \ 0, ·) is a group, called the multiplicative group of F;

(c) left or right multiplication by any fixed element of F is an endomorphism of

the additive group of F.

Note that condition (c) expresses the two distributive laws. Note that we must

assume both, since one does not follow from the other.

The identity element of the multiplicative group is called 1.

A field is a division ring whose multiplication is commutative (that is, whose

multiplicative group is abelian).

Exercise 1.1 Prove that the commutativity of addition follows from the other ax-

ioms for a division ring (that is, we need only assume that (F, +) is a group in

(a)).

Exercise 1.2 A real quaternion has the form a + bi + cj + dk, where a, b, c, d ∈ R. Addition and multiplication are given by “the usual rules”, together with the following rules for multiplication of the elements 1, i, j, k:

·   1   i    j    k
1   1   i    j    k
i   i   −1   k    −j
j   j   −k   −1   i
k   k   j    −i   −1

Prove that the set H of real quaternions is a division ring. (Hint: If q = a + bi + cj + dk, let q* = a − bi − cj − dk; prove that qq* = a^2 + b^2 + c^2 + d^2.)
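As a quick sanity check on the exercise, the multiplication rules can be coded directly. This is an illustrative sketch, not part of the notes; the names qmul, conj, norm2 and qinv are ours.

```python
# A quaternion a + bi + cj + dk is stored as the 4-tuple (a, b, c, d).

def qmul(p, q):
    """Multiply two quaternions using the table for 1, i, j, k."""
    a, b, c, d = p
    e, f, g, h = q
    return (a*e - b*f - c*g - d*h,     # real part
            a*f + b*e + c*h - d*g,     # coefficient of i
            a*g - b*h + c*e + d*f,     # coefficient of j
            a*h + b*g - c*f + d*e)     # coefficient of k

def conj(q):
    """q* = a - bi - cj - dk."""
    a, b, c, d = q
    return (a, -b, -c, -d)

def norm2(q):
    """q q* = a^2 + b^2 + c^2 + d^2, a non-negative real number."""
    return qmul(q, conj(q))[0]

def qinv(q):
    """Why H is a division ring: every non-zero q has inverse q* / (q q*)."""
    n = norm2(q)
    a, b, c, d = conj(q)
    return (a / n, b / n, c / n, d / n)
```

The last function is the point of the hint: since qq* is a positive real for q ≠ 0, every non-zero quaternion is invertible.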


Multiplication by zero induces the zero endomorphism of (F, +). Multiplication by any non-zero element induces an automorphism (whose inverse is multiplication by the inverse element). In particular, we see that the automorphism group of (F, +) acts transitively on its non-zero elements. So all non-zero elements have the same order, which is either infinite or a prime p. In the first case, we say that the characteristic of F is zero; in the second case, it has characteristic p.

The structure of the multiplicative group is not so straightforward. However, the possible finite subgroups can be determined. If F is a field, then any finite subgroup of the multiplicative group is cyclic. To prove this we require Vandermonde's Theorem:

Theorem 1.1 A polynomial equation of degree n over a field has at most n roots.

Exercise 1.3 Prove Vandermonde's Theorem. (Hint: If f(a) = 0, then f(x) = (x − a)g(x).)

Theorem 1.2 A finite subgroup of the multiplicative group of a field is cyclic.

Proof An element ω of a field F is an nth root of unity if ω^n = 1; it is a primitive nth root of unity if also ω^m ≠ 1 for 0 < m < n.

Let G be a subgroup of order n in the multiplicative group of the field F. By Lagrange's Theorem, every element of G is an nth root of unity. If G contains a primitive nth root of unity, then it is cyclic, and the number of primitive nth roots is φ(n), where φ is Euler's function. If not, then of course the number of primitive nth roots is zero. The same considerations apply of course to any divisor of n. So, if ψ(m) denotes the number of primitive mth roots of unity in G, then

(a) for each divisor m of n, either ψ(m) = φ(m) or ψ(m) = 0.

Now every element of G has some finite order dividing n; so

(b) ∑_{m|n} ψ(m) = n.

Finally, a familiar property of Euler's function yields:

(c) ∑_{m|n} φ(m) = n.

From (a), (b) and (c) we conclude that ψ(m) = φ(m) for all divisors m of n. In particular, ψ(n) = φ(n) ≠ 0, and G is cyclic.
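The counting argument can be watched in action for a small example. The sketch below (names ours, not from the notes) tabulates element orders in the multiplicative group of Z/13Z and checks that ψ(m) = φ(m) for every divisor m of 12.

```python
from math import gcd

def element_orders(p):
    """Order of each element of the multiplicative group of Z/pZ, p prime."""
    orders = {}
    for x in range(1, p):
        k, y = 1, x
        while y != 1:          # multiply by x until we return to 1
            y = y * x % p
            k += 1
        orders[x] = k
    return orders

def phi(n):
    """Euler's function, by direct count."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)

p = 13
orders = element_orders(p)
# psi(m): number of elements of order exactly m, for each divisor m of p - 1
psi = {m: sum(1 for o in orders.values() if o == m)
       for m in range(1, p) if (p - 1) % m == 0}
```

Here every ψ(m) is non-zero, so in particular ψ(12) = φ(12) = 4 and GF(13)* is cyclic with four generators.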


For division rings, the position is not so simple, since Vandermonde's Theorem fails.

Exercise 1.4 Find all solutions of the equation x^2 + 1 = 0 in H.

However, the possibilities can be determined. Let G be a finite subgroup of the multiplicative group of the division ring F. We claim that there is an abelian group A such that G is a group of automorphisms of A acting semiregularly on the non-zero elements. Let B be the subgroup of (F, +) generated by G. Then B is a finitely generated abelian group admitting G acting semiregularly. If F has non-zero characteristic, then B is elementary abelian; take A = B. Otherwise, choose a prime p such that, for all x, g ∈ G, the element (xg − x)p^{−1} is not in B, and set A = B/pB.

The structure of semiregular automorphism groups of finite groups (a.k.a. Frobenius complements) was determined by Zassenhaus. See Passman, Permutation Groups, Benjamin, New York, 1968, for a detailed account. In particular, either G is metacyclic, or it has a normal subgroup isomorphic to SL(2, 3) or SL(2, 5). (These are finite groups G having a unique subgroup Z of order 2, such that G/Z is isomorphic to the alternating group A4 or A5 respectively. There is a unique such group in each case.)

Exercise 1.5 Identify the division ring H of real quaternions with the real vector space R^4 with basis {1, i, j, k}. Let U denote the multiplicative group of unit quaternions, those elements a + bi + cj + dk satisfying a^2 + b^2 + c^2 + d^2 = 1. Show that conjugation by a unit quaternion is an orthogonal transformation of R^4, fixing the 1-dimensional space spanned by 1 and inducing an orthogonal transformation on the 3-dimensional subspace spanned by i, j, k.

Prove that the map from U to the 3-dimensional orthogonal group has kernel {±1} and image the group of rotations of 3-space (orthogonal transformations with determinant 1).

Hence show that the groups SL(2, 3) and SL(2, 5) are finite subgroups of the multiplicative group of H.

Remark: This construction explains why the groups SL(2, 3) and SL(2, 5) are sometimes called the binary tetrahedral and binary icosahedral groups. Construct also a binary octahedral group of order 48, and show that it is not isomorphic to GL(2, 3) (the group of 2 × 2 invertible matrices over the integers mod 3), even though both groups have normal subgroups of order 2 whose factor groups are isomorphic to the symmetric group S4.


1.2 Finite fields

The basic facts about finite fields are summarised in the following two theorems, due to Wedderburn and Galois respectively.

Theorem 1.3 Every finite division ring is commutative.

Theorem 1.4 The number of elements in a finite field is a prime power. Conversely, if q is a prime power, then there is a unique field with q elements, up to isomorphism.

The unique finite field with a given prime power order q is called the Galois field of order q, and denoted by GF(q) (or sometimes F_q). If q is prime, then GF(q) is isomorphic to Z/qZ, the integers mod q.

We now summarise some results about GF(q).

Theorem 1.5 Let q = p^a, where p is prime and a is a positive integer. Let F = GF(q).

(a) F has characteristic p, and its additive group is an elementary abelian p-group.

(b) The multiplicative group of F is cyclic, generated by a primitive (p^a − 1)th root of unity (called a primitive element of F).

(c) The automorphism group of F is cyclic of order a, generated by the Frobenius automorphism x ↦ x^p.

(d) For every divisor b of a, there is a unique subfield of F of order p^b, consisting of all solutions of x^{p^b} = x; and these are all the subfields of F.

Proof Part (a) is obvious since the additive group contains an element of order p, and part (b) follows from Theorem 1.2. Parts (c) and (d) are most easily proved using Galois theory. Let E denote the subfield Z/pZ of F. Then the degree of F over E is a. The Frobenius map σ : x ↦ x^p is an E-automorphism of F, and has order a; so F is a Galois extension of E, and σ generates the Galois group. Now subfields of F necessarily contain E; by the Fundamental Theorem of Galois Theory, they are the fixed fields of subgroups of the Galois group ⟨σ⟩.


For explicit calculation in F = GF(p^a), it is most convenient to represent it as E[x]/(f), where E = Z/pZ, E[x] is the polynomial ring over E, and f is the (irreducible) minimum polynomial of a primitive element of F. If α denotes the coset (f) + x, then α is a root of f, and hence a primitive element.

Now every element of F can be written uniquely in the form

c_0 + c_1 α + · · · + c_{a−1} α^{a−1},

where c_0, c_1, . . . , c_{a−1} ∈ E; addition is straightforward in this representation. Also, every non-zero element of F can be written uniquely in the form α^m, where 0 ≤ m < p^a − 1, since α is primitive; multiplication is straightforward in this representation. Using the fact that f(α) = 0, it is possible to construct a table matching up the two representations.

Example The polynomial x^3 + x + 1 is irreducible over E = Z/2Z. So the field F = E(α) has eight elements, where α satisfies α^3 + α + 1 = 0 over E. We have α^7 = 1, and the table of logarithms is as follows:

α^0   1
α^1   α
α^2   α^2
α^3   α + 1
α^4   α^2 + α
α^5   α^2 + α + 1
α^6   α^2 + 1

Hence

(α^2 + α + 1)(α^2 + 1) = α^5 · α^6 = α^{11} = α^4 = α^2 + α.
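The log table above is easy to reproduce mechanically. A minimal sketch, representing an element of GF(8) as a 3-bit mask (bit i is the coefficient of α^i) and assuming the same modulus x^3 + x + 1; the names MOD, gf_mul and log_table are ours.

```python
MOD = 0b1011                  # x^3 + x + 1 over GF(2)

def gf_mul(u, v):
    """Multiply two elements of GF(8), reducing by x^3 + x + 1."""
    r = 0
    while v:
        if v & 1:
            r ^= u            # add (xor) the current shift of u
        u <<= 1
        if u & 0b1000:        # degree reached 3: subtract the modulus
            u ^= MOD
        v >>= 1
    return r

# Powers of the primitive element alpha (the mask 0b010).
log_table = {}
a = 1
for m in range(7):
    log_table[m] = a
    a = gf_mul(a, 0b010)
```

For instance log_table[3] is 0b011, i.e. α^3 = α + 1, and multiplying the masks for α^5 and α^6 gives the mask for α^2 + α, matching the computation above.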

Exercise 1.6 Show that there are three irreducible polynomials of degree 4 over the field Z/2Z, of which two are primitive. Hence construct GF(16) by the method outlined above.

Exercise 1.7 Show that an irreducible polynomial of degree m over GF(q) has a root in GF(q^n) if and only if m divides n.

Hence show that the number a_m of irreducible polynomials of degree m over GF(q) satisfies

∑_{m|n} m a_m = q^n.


Exercise 1.8 Show that, if q is even, then every element of GF(q) is a square; while, if q is odd, then half of the non-zero elements of GF(q) are squares and half are non-squares.

If q is odd, show that −1 is a square in GF(q) if and only if q ≡ 1 (mod 4).

1.3 Vector spaces

A left vector space over a division ring F is a unital left F-module. That is, it is an abelian group V, with an anti-homomorphism from F to End(V) mapping 1 to the identity endomorphism of V.

Writing scalars on the left, we have (cd)v = c(dv) for all c, d ∈ F and v ∈ V: that is, scalar multiplication by cd is the same as multiplication by d followed by multiplication by c, not vice versa. (The opposite convention would make V a right (rather than left) vector space; scalars would more naturally be written on the right.) The unital condition simply means that 1v = v for all v ∈ V.

Note that F is a vector space over itself, using field multiplication for the scalar multiplication.

If F is a division ring, the opposite division ring F° has the same underlying set as F and the same addition, with multiplication given by

a ∘ b = ba.

Now a right vector space over F can be regarded as a left vector space over F°.

A linear transformation T : V → W between two left F-vector spaces V and W is a vector space homomorphism; that is, a homomorphism of abelian groups which commutes with scalar multiplication. We write linear transformations on the right, so that we have

(cv)T = c(vT)

for all c ∈ F, v ∈ V. We add linear transformations, or multiply them by scalars, pointwise (as functions), and multiply them by function composition; the results are again linear transformations.

If a linear transformation T is one-to-one and onto, then the inverse map is also a linear transformation; we say that T is invertible if this occurs.

Now Hom(V, W) denotes the set of all linear transformations from V to W. The dual space of V is V* = Hom(V, F).

Exercise 1.9 Show that V* is a right vector space over F.


A vector space is finite-dimensional if it is finitely generated as an F-module. A basis is a minimal generating set. Any two bases have the same number of elements; this number is usually called the dimension of the vector space, but in order to avoid confusion with a slightly different geometric notion of dimension, I will call it the rank of the vector space. The rank of V is denoted by rk(V).

Every vector can be expressed uniquely as a linear combination of the vectors in a basis. In particular, a linear combination of basis vectors is zero if and only if all the coefficients are zero. Thus, a vector space of rank n over F is isomorphic to F^n (with coordinatewise addition and scalar multiplication).

I will assume familiarity with standard results of linear algebra about ranks of sums and intersections of subspaces, about ranks of images and kernels of linear transformations, and about the representation of linear transformations by matrices with respect to given bases.

As well as linear transformations, we require the concept of a semilinear transformation between F-vector spaces V and W. This can be defined in two ways. It is a map T from V to W satisfying

(a) (v_1 + v_2)T = v_1 T + v_2 T for all v_1, v_2 ∈ V;

(b) (cv)T = c^σ vT for all c ∈ F, v ∈ V, where σ is an automorphism of F called the associated automorphism of T.

Note that, if T is not identically zero, the associated automorphism is uniquely determined by T.

The second definition is as follows. Given an automorphism σ of F, we extend the action of σ to F^n coordinatewise:

(c_1, . . . , c_n)σ = (c_1^σ, . . . , c_n^σ).

Hence we have an action of σ on any F-vector space with a given basis. Now a σ-semilinear transformation from V to W is the composition of a linear transformation from V to W with the action of σ on W (with respect to some basis).

The fact that the two definitions agree follows from the observations

• the action of σ on F^n is semilinear in the first sense;

• the composition of semilinear transformations is semilinear (and the associated automorphism is the composition of the associated automorphisms of the factors).


This immediately shows that a semilinear map in the second sense is semilinear in the first. Conversely, if T is semilinear with associated automorphism σ, then the composition of T with σ^{−1} is linear, so T is σ-semilinear.

Exercise 1.10 Prove the above assertions.

If a semilinear transformation T is one-to-one and onto, then the inverse map is also a semilinear transformation; we say that T is invertible if this occurs.

Almost exclusively, I will consider only finite-dimensional vector spaces. To complete the picture, here is the situation in general. In ZFC (Zermelo–Fraenkel set theory with the Axiom of Choice), every vector space has a basis (a set of vectors with the property that every vector has a unique expression as a linear combination of a finite set of basis vectors with non-zero coefficients), and any two bases have the same cardinal number of elements. However, without the Axiom of Choice, there may exist a vector space which has no basis.

Note also that there exist division rings F with bimodules V such that V has different ranks when regarded as a left or a right vector space.

1.4 Projective spaces

It is not easy to give a concise definition of a projective space, since projective geometry means several different things: a geometry with points, lines, planes, and so on; a topological manifold with a strange kind of torsion; a lattice with meet, join, and order; an abstract incidence structure; a tool for computer graphics.

Let V be a vector space of rank n + 1 over a field F. The “objects” of the n-dimensional projective space are the subspaces of V, apart from V itself and the zero subspace {0}. Each object is assigned a dimension which is one less than its rank, and we use geometric terminology, so that points, lines and planes are the objects of dimension 0, 1 and 2 (that is, rank 1, 2, 3 respectively). A hyperplane is an object having codimension 1 (that is, dimension n − 1, or rank n). Two objects are incident if one contains the other. So two objects of the same dimension are incident if and only if they are equal.

The n-dimensional projective space is denoted by PG(n, F). If F is the Galois field GF(q), we abbreviate PG(n, GF(q)) to PG(n, q). A similar convention will be used for other geometries and groups over finite fields.

A 0-dimensional projective space has no internal structure at all, like an idealised point. A 1-dimensional projective space is just a set of points, one more than the number of elements of F, with (at the moment) no further structure. (If {e_1, e_2} is a basis for V, then the points are spanned by the vectors λe_1 + e_2 (for λ ∈ F) and e_1.)

For n > 1, PG(n, F) contains objects of different dimensions, and the relation of incidence gives it a non-trivial structure.

Instead of our “incidence structure” model, we can represent a projective space as a collection of subsets of a set. Let S be the set of points of PG(n, F). The point shadow of an object U is the set of points incident with U. Now the point shadow of a point P is simply {P}. Moreover, two objects are incident if and only if the point shadow of one contains that of the other.

The diagram below shows PG(2, 2). It has seven points, labelled 1, 2, 3, 4, 5, 6, 7; the line shadows are 123, 145, 167, 246, 257, 347, 356 (where, for example, 123 is an abbreviation for {1, 2, 3}).

[Figure: the Fano plane PG(2, 2), drawn with six straight lines and one circle, one for each of the seven triples listed above.]

The correspondence between points and spanning vectors of the rank-1 subspaces can be taken as follows:

Point:   1        2        3        4        5        6        7
Vector:  (0,0,1)  (0,1,0)  (0,1,1)  (1,0,0)  (1,0,1)  (1,1,0)  (1,1,1)
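This labelling can be checked by machine: over GF(2), three distinct non-zero vectors span a common rank-2 subspace exactly when they sum to zero coordinatewise. A short illustrative sketch (names ours):

```python
from itertools import combinations

point = {1: (0, 0, 1), 2: (0, 1, 0), 3: (0, 1, 1), 4: (1, 0, 0),
         5: (1, 0, 1), 6: (1, 1, 0), 7: (1, 1, 1)}

def collinear(p, q, r):
    """True when the three spanning vectors sum to zero over GF(2)."""
    return all((a + b + c) % 2 == 0
               for a, b, c in zip(point[p], point[q], point[r]))

# The seven line shadows of PG(2, 2), recovered from the coordinates.
lines = sorted(t for t in combinations(range(1, 8), 3) if collinear(*t))
```

Running this reproduces exactly the seven triples 123, 145, 167, 246, 257, 347, 356 listed above.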

The following geometric properties of projective spaces are easily verified
from the rank formulae of linear algebra:

(a) Any two distinct points are incident with a unique line.

(b) Two distinct lines contained in a plane are incident with a unique point.

(c) Any three distinct points, or any two distinct collinear lines, are incident with

a unique plane.

(d) A line not incident with a given hyperplane meets it in a unique point.

(e) If two distinct points are both incident with some object of the projective

space, then the unique line incident with them is also incident with that

object.


Exercise 1.11 Prove the above assertions.

It is usual to be less formal with the language of incidence, and say “the point

P lies on the line L”, or “the line L passes through the point P” rather than “the

point P and the line L are incident”. Similar geometric language will be used

without further comment.

An isomorphism from a projective space Π1 to a projective space Π2 is a map

from the objects of Π1 to the objects of Π2 which preserves the dimensions of ob-

jects and also preserves the relation of incidence between objects. A collineation

of a projective space Π is an isomorphism from Π to Π.

The important theorem which connects this topic with that of the previous

section is the Fundamental Theorem of Projective Geometry:

Theorem 1.6 Any isomorphism of projective spaces of dimension at least two

is induced by an invertible semilinear transformation of the underlying vector

spaces. In particular, the collineations of PG(n, F) for n ≥ 2 are induced by

invertible semilinear transformations of the rank-(n + 1) vector space over F.

This theorem will not be proved here, but I make a few comments about the proof. Consider first the case n = 2. One shows that the field F can be recovered from the projective plane (that is, the addition and multiplication in F can be defined by geometric constructions involving points and lines). The construction is based on choosing four points of which no three are collinear. Hence any collineation fixing these four points is induced by a field automorphism. Since the group of invertible linear transformations acts transitively on quadruples of points with this property, it follows that any collineation is induced by the composition of a linear transformation and a field automorphism, that is, a semilinear transformation.

For higher-dimensional spaces, we show that the coordinatisations of the planes fit together in a consistent way to coordinatise the whole space.

In the next chapter we study properties of the collineation group of projective spaces. Since we are concerned primarily with groups of matrices, I will normally speak of PG(n − 1, F) as the projective space based on a vector space of rank n, rather than PG(n, F) based on a vector space of rank n + 1.

Next we give some numerical information about finite projective spaces.

Theorem 1.7 (a) The number of points in the projective space PG(n − 1, q) is (q^n − 1)/(q − 1).

(b) More generally, the number of (m − 1)-dimensional subspaces of PG(n − 1, q) is

(q^n − 1)(q^n − q) · · · (q^n − q^{m−1}) / ((q^m − 1)(q^m − q) · · · (q^m − q^{m−1})).

(c) The number of (m − 1)-dimensional subspaces of PG(n − 1, q) containing a given (l − 1)-dimensional subspace is equal to the number of (m − l − 1)-dimensional subspaces of PG(n − l − 1, q).

Proof (a) The projective space is based on a vector space of rank n, which contains q^n vectors. One of these is the zero vector, and the remaining q^n − 1 each span a subspace of rank 1. Each rank 1 subspace contains q − 1 non-zero vectors, each of which spans it.

(b) Count the number of linearly independent m-tuples of vectors. The jth vector must lie outside the rank (j − 1) subspace spanned by the preceding vectors, so there are q^n − q^{j−1} choices for it. So the number of such m-tuples is the numerator of the fraction. By the same argument (replacing n by m), the number of linearly independent m-tuples which span a given rank m subspace is the denominator of the fraction.

(c) If U is a rank l subspace of the rank n vector space V, then the Second Isomorphism Theorem shows that there is a bijection between rank m subspaces of V containing U, and rank (m − l) subspaces of the rank (n − l) vector space V/U.
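For small parameters, part (b) can be verified by brute force. The sketch below (helper names ours) lists the rank-2 subspaces of GF(2)^4 directly and compares the count with the fraction in the theorem; over GF(2) any two distinct non-zero vectors are independent, so every unordered pair of them spans a rank-2 subspace.

```python
from itertools import combinations, product

def span2(u, v):
    """All GF(2)-linear combinations of u and v, as a frozenset of tuples."""
    return frozenset(tuple((a * x + b * y) % 2 for x, y in zip(u, v))
                     for a, b in product(range(2), repeat=2))

vectors = [v for v in product(range(2), repeat=4) if any(v)]
subspaces = {span2(u, v) for u, v in combinations(vectors, 2)}

def count_formula(n, m, q):
    """The fraction in Theorem 1.7(b)."""
    num = den = 1
    for i in range(m):
        num *= q**n - q**i
        den *= q**m - q**i
    return num // den
```

Both counts come to 35 for n = 4, m = 2, q = 2; each of the 105 pairs of non-zero vectors is counted three times, once per pair inside its subspace.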

The number given by the fraction in part (b) of the theorem is called a Gaussian coefficient, written [n, m]_q. Gaussian coefficients have properties resembling those of binomial coefficients, to which they tend as q → 1.

Exercise 1.12 (a) Prove that

[n, k]_q + q^{n−k+1} [n, k−1]_q = [n+1, k]_q.

(b) Prove that for n ≥ 1,

∏_{i=0}^{n−1} (1 + q^i x) = ∑_{k=0}^{n} q^{k(k−1)/2} [n, k]_q x^k.

(This result is known as the q-binomial theorem, since it reduces to the binomial theorem as q → 1.)
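Both identities are easy to test numerically for small n and q. The sketch below (function names ours) expands the product in part (b) as a polynomial in x and compares its coefficients with q^{k(k−1)/2} [n, k]_q.

```python
def gaussian(n, k, q):
    """The Gaussian coefficient [n, k]_q, via the fraction of Theorem 1.7(b)."""
    num = den = 1
    for i in range(k):
        num *= q**n - q**i
        den *= q**k - q**i
    return num // den

def product_coeffs(n, q):
    """Coefficients of prod_{i=0}^{n-1} (1 + q^i x), indexed by the power of x."""
    coeffs = [1]
    for i in range(n):
        new = coeffs + [0]
        for k in range(len(coeffs)):
            new[k + 1] += coeffs[k] * q**i   # multiply by (1 + q^i x)
        coeffs = new
    return coeffs
```

For example product_coeffs(2, 2) gives [1, 3, 2], matching (1 + x)(1 + 2x) = 1 + 3x + 2x^2, with 3 = [2, 1]_2.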


If we regard a projective space PG(n − 1, F) purely as an incidence structure, the dimensions of its objects are not uniquely determined. This is because there is an additional symmetry known as duality. That is, if we regard the hyperplanes as points, and define new dimensions by dim*(U) = n − 2 − dim(U), we again obtain a projective space, with the same relation of incidence. The reason that it is a projective space is as follows.

Let V* = Hom(V, F) be the dual space of V, where V is the underlying vector space of PG(n − 1, F). Recall that V* is a right vector space over F, or equivalently a left vector space over the opposite field F°. To each subspace U of V, there is a corresponding subspace U† of V*, the annihilator of U, given by

U† = { f ∈ V* : uf = 0 for all u ∈ U}.

The correspondence U ↦ U† is a bijection between the subspaces of V and the subspaces of V*; we denote the inverse map from subspaces of V* to subspaces of V also by †. It satisfies

(a) (U†)† = U;

(b) U_1 ≤ U_2 if and only if U_1† ≥ U_2†;

(c) rk(U†) = n − rk(U).

Thus we have:

Theorem 1.8 The dual of PG(n − 1, F) is the projective space PG(n − 1, F°). In particular, if n ≥ 3, then PG(n − 1, F) is isomorphic to its dual if and only if F is isomorphic to its opposite F°.

Proof The first assertion follows from our remarks. The second follows from the first by use of the Fundamental Theorem of Projective Geometry.

Thus, PG(n − 1, F) is self-dual if F is commutative, and for some non-commutative division rings such as H; but there are division rings F for which F ≇ F°.

An isomorphism from F to its opposite is a bijection σ satisfying

(a + b)^σ = a^σ + b^σ,
(ab)^σ = b^σ a^σ,

for all a, b ∈ F. Such a map is called an anti-automorphism of F.

Exercise 1.13 Show that H ≅ H°. (Hint: (a + bi + cj + dk)^σ = a − bi − cj − dk.)


2 Linear and projective groups

In this section, we define and study the general and special linear groups and their projective versions. We look at the actions of the projective groups on the points of the projective space, and discuss transitivity properties, generation, and simplicity of these groups.

2.1 The general linear groups

Let F be a division ring. As we saw, a vector space of rank n over F can be identified with the standard space F^n (with scalars on the left) by choosing a basis. Any invertible linear transformation of V is then represented by an invertible n × n matrix, acting on F^n by right multiplication.

We let GL(n, F) denote the group of all invertible n × n matrices over F, with the operation of matrix multiplication.

The group GL(n, F) acts on the projective space PG(n − 1, F), since an invertible linear transformation maps a subspace to another subspace of the same dimension.

Proposition 2.1 The kernel of the action of GL(n, F) on the set of points of PG(n − 1, F) is the subgroup

{cI : c ∈ Z(F), c ≠ 0}

of central scalar matrices in F, where Z(F) denotes the centre of F.

Proof Let A = (a_{ij}) be an invertible matrix which fixes every rank 1 subspace of F^n. Thus, A maps each non-zero vector (x_1, . . . , x_n) to a scalar multiple (cx_1, . . . , cx_n) of itself.

Let e_i be the ith basis vector, with 1 in position i and 0 elsewhere. Then e_i A = c_i e_i, so the ith row of A is c_i e_i. This shows that A is a diagonal matrix.

Now for i ≠ j, we have

c_i e_i + c_j e_j = (e_i + e_j)A = d(e_i + e_j)

for some d. So c_i = c_j. Thus, A is a diagonal matrix cI.

Finally, let a ∈ F, a ≠ 0. Then

c(ae_1) = (ae_1)A = a(e_1 A) = ace_1,

so ac = ca. Thus, c ∈ Z(F).


Let Z be the kernel of this action. We define the projective general linear group PGL(n, F) to be the group induced on the points of the projective space PG(n − 1, F) by GL(n, F). Thus,

PGL(n, F) ≅ GL(n, F)/Z.

In the case where F is the finite field GF(q), we write GL(n, q) and PGL(n, q) in place of GL(n, F) and PGL(n, F) (with similar conventions for the groups we meet later). Now we can compute the orders of these groups:

Theorem 2.2 (a) |GL(n, q)| = (q^n − 1)(q^n − q) · · · (q^n − q^{n−1});

(b) |PGL(n, q)| = |GL(n, q)|/(q − 1).

Proof (a) The rows of an invertible matrix over a field are linearly independent; that is, for i = 1, . . . , n, the ith row lies outside the subspace of rank i − 1 generated by the preceding rows. Now the number of vectors in a subspace of rank i − 1 over GF(q) is q^{i−1}, so the number of choices for the ith row is q^n − q^{i−1}. Multiplying these numbers for i = 1, . . . , n gives the result.

(b) PGL(n, q) is the image of GL(n, q) under a homomorphism whose kernel consists of non-zero scalar matrices and so has order q − 1.
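The formula in part (a) can be confirmed by brute force for small parameters. The sketch below (names ours) counts the invertible 3 × 3 matrices over GF(2), representing each row as a bit-mask and computing ranks by elimination on leading bits.

```python
from itertools import product

def rank_gf2(rows):
    """Rank over GF(2) of a sequence of rows, each row a bit-mask."""
    basis = {}                          # pivot bit -> reduced row
    for r in rows:
        while r:
            p = r.bit_length() - 1      # position of the leading 1
            if p in basis:
                r ^= basis[p]           # eliminate that bit and retry
            else:
                basis[p] = r
                break
    return len(basis)

n, q = 3, 2
count = sum(rank_gf2(rows) == n
            for rows in product(range(q**n), repeat=n))
formula = 1
for i in range(n):
    formula *= q**n - q**i              # (q^n - 1)(q^n - q)(q^n - q^2)
```

Both counts come to 168, the order of GL(3, 2) (and of PGL(3, 2), since q − 1 = 1).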

If the field F is commutative, then the determinant function is defined on n × n matrices over F and is a multiplicative map to F:

det(AB) = det(A) det(B).

Also, det(A) ≠ 0 if and only if A is invertible. So det is a homomorphism from GL(n, F) to F*, the multiplicative group of F (also known as GL(1, F)). This homomorphism is onto, since the matrix with c in the top left corner, 1 in the other diagonal positions, and 0 elsewhere has determinant c.

The kernel of this homomorphism is the special linear group SL(n, F), a normal subgroup of GL(n, F) with factor group isomorphic to F*.

We define the projective special linear group PSL(n, F) to be the image of SL(n, F) under the homomorphism from GL(n, F) to PGL(n, F), that is, the group induced on the projective space by SL(n, F). Thus,

PSL(n, F) = SL(n, F)/(SL(n, F) ∩ Z).

The kernel of this homomorphism consists of the scalar matrices cI which have determinant 1, that is, those cI for which c^n = 1. This is a finite cyclic group whose order divides n.

Again, for finite fields, we can calculate the orders:

Theorem 2.3 (a) |SL(n, q)| = |GL(n, q)|/(q − 1);

(b) |PSL(n, q)| = |SL(n, q)|/(n, q − 1), where (n, q − 1) is the greatest common divisor of n and q − 1.

Proof (a) SL(n, q) is the kernel of the determinant homomorphism on GL(n, q), whose image F* has order q − 1.

(b) From the remark before the theorem, we see that PSL(n, q) is the image of SL(n, q) under a homomorphism whose kernel is the group of nth roots of unity in GF(q). Since the multiplicative group of this field is cyclic of order q − 1, the nth roots form a subgroup of order (n, q − 1).
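These order formulae are convenient to have as code. A minimal sketch (function names ours), checked against the familiar values |GL(2, 3)| = 48, |PSL(2, 5)| = 60 and |PSL(3, 2)| = 168.

```python
from math import gcd, prod

def order_gl(n, q):
    """Theorem 2.2(a): (q^n - 1)(q^n - q) ... (q^n - q^{n-1})."""
    return prod(q**n - q**i for i in range(n))

def order_sl(n, q):
    """Theorem 2.3(a): |GL(n, q)| / (q - 1)."""
    return order_gl(n, q) // (q - 1)

def order_psl(n, q):
    """Theorem 2.3(b): |SL(n, q)| / gcd(n, q - 1)."""
    return order_sl(n, q) // gcd(n, q - 1)
```

Note the coincidence |PSL(2, 4)| = |PSL(2, 5)| = 60; both groups are in fact isomorphic to the alternating group A5, though the formulae alone do not show that.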

A group G acts sharply transitively on a set Ω if its action is regular; that is, it is transitive and the stabiliser of a point is the identity.

Theorem 2.4 Let F be a division ring. Then the group PGL(n, F) acts transitively on the set of all (n + 1)-tuples of points of PG(n − 1, F) with the property that no n points lie in a hyperplane; the stabiliser of such a tuple is isomorphic to the group of inner automorphisms of the multiplicative group of F. In particular, if F is commutative, then PGL(n, F) is sharply transitive on the set of such (n + 1)-tuples.

Proof Consider n points not lying in a hyperplane. The n vectors spanning these points form a basis, and we may assume that this is the standard basis e_1, . . . , e_n of F^n, where e_i has ith coordinate 1 and all others zero. The proof of Proposition 2.1 shows that G acts transitively on the set of such n-tuples, and the stabiliser of the n points is the group of diagonal matrices. Now a vector v not lying in the hyperplane spanned by any n − 1 of the basis vectors must have all its coordinates non-zero, and conversely. Moreover, the group of diagonal matrices acts transitively on the set of such vectors. This proves that PGL(n, F) is transitive on the set of (n + 1)-tuples of the given form. Without loss of generality, we may assume that v = e_1 + · · · + e_n = (1, 1, . . . , 1). Then the stabiliser of the n + 1 points consists of the group of scalar matrices, which is isomorphic to the multiplicative group F*. We have seen that the kernel of the action on the projective space is Z(F*), so the group induced by the scalar matrices is F*/Z(F*), which is isomorphic to the group of inner automorphisms of F*.

Corollary 2.5 The group PGL(2, F) is 3-transitive on the points of the projective line PG(1, F); the stabiliser of three points is isomorphic to the group of inner automorphisms of the multiplicative group of F. In particular, if F is commutative, then PGL(2, F) is sharply 3-transitive on the points of the projective line.

For n > 2, the group PGL(n, F) is 2-transitive on the points of the projective space PG(n − 1, F).

This follows from the theorem because, on the projective line, the hyperplanes are the points, and so no two distinct points lie in a hyperplane; while, in general, any two points are independent and can be extended to an (n + 1)-tuple as in the theorem.

We can represent the set of points of the projective line as {∞} ∪ F, where ∞ = ⟨(1, 0)⟩ and a = ⟨(a, 1)⟩ for a ∈ F. Then the stabiliser of the three points ∞, 0, 1 acts in the natural way on F \ {0, 1} by conjugation.

For consider the effect of the diagonal matrix aI on the point ⟨(x, 1)⟩. This is mapped to ⟨(xa, a)⟩, which is the same rank 1 subspace as ⟨(a^{−1}xa, 1)⟩; so in the new representation, aI induces the map x ↦ a^{−1}xa.

In this convenient representation, the action of PGL(2, F) can be represented by linear fractional transformations. The matrix

    ( a b )
    ( c d )

maps ⟨(x, 1)⟩ to ⟨(xa + c, xb + d)⟩, which spans the same point as ⟨((xb + d)⁻¹(xa + c), 1)⟩ if xb + d ≠ 0, or ⟨(1, 0)⟩ otherwise. Thus the transformation induced by this matrix can be written as

    x ↦ (xb + d)⁻¹(xa + c),

provided we make standard conventions about ∞ (for example, 0⁻¹a = ∞ for a ≠ 0, and (∞b + d)⁻¹(∞a + c) = b⁻¹a). If F is commutative, this transformation is conveniently written as a fraction:

    x ↦ (ax + c)/(bx + d).
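As a quick sanity check (not part of the notes), this action can be enumerated directly over a small prime field; the names `INF` and `act`, and the choice p = 7, are our own. Counting the induced permutations against the ordered triples of distinct points confirms sharp 3-transitivity of PGL(2, 7).

```python
# Sketch: the linear fractional action of GL(2, p) on {∞} ∪ GF(p),
# following the conventions above: x ↦ (xb + d)⁻¹(xa + c).
p = 7            # any prime
INF = 'inf'      # marker (our own) for the point at infinity

def inv(a):
    return pow(a, p - 2, p)           # inverse mod p via Fermat

def act(M, x):
    """Apply M = (a, b, c, d) to a point of {∞} ∪ GF(p)."""
    a, b, c, d = M
    if x == INF:
        # (∞b + d)⁻¹(∞a + c) = b⁻¹a, with 0⁻¹a = ∞ for a ≠ 0
        return (a * inv(b)) % p if b % p else INF
    num, den = (a * x + c) % p, (b * x + d) % p
    return (num * inv(den)) % p if den else INF

points = [INF] + list(range(p))
perms = set()            # distinct permutations induced on the line
triple_images = set()    # images of the triple (∞, 0, 1)
for a in range(p):
    for b in range(p):
        for c in range(p):
            for d in range(p):
                if (a * d - b * c) % p == 0:   # skip singular matrices
                    continue
                M = (a, b, c, d)
                perms.add(tuple(act(M, x) for x in points))
                triple_images.add((act(M, INF), act(M, 0), act(M, 1)))
```

Since the number of induced permutations equals the number of image triples, the stabiliser of (∞, 0, 1) is trivial, which is sharp 3-transitivity.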

Exercise 2.1 Work out carefully all the conventions required to use the linear fractional representation of PGL(2, F).

Exercise 2.2 By Theorem 2.4, the order of PGL(n, q) is equal to the number of (n + 1)-tuples of points of PG(n − 1, q) for which no n lie in a hyperplane. Use this to give an alternative proof of Theorem 2.2.

Paul Cohn constructed an example of a division ring F such that all elements

of F \ {0, 1} are conjugate in the multiplicative group of F. For a division ring

F with this property, we see that PGL(2, F) is 4-transitive on the projective line.

This is the highest degree of transitivity that can be realised in this way.


Exercise 2.3 Show that, if F is a division ring with the above property, then F

has characteristic 2, and the multiplicative group of F is torsion-free and simple.

Exercise 2.4 Let F be a commutative field. Show that, for all n ≥ 2, the group PSL(n, F) is 2-transitive on the points of the projective space PG(n − 1, F); it is 3-transitive if and only if n = 2 and every element of F is a square.

2.2 Generation

For the rest of this section, we assume that F is a commutative field. A transvection of the F-vector space V is a linear map T : V → V which satisfies rk(T − I) = 1 and (T − I)² = 0. Thus, if we choose a basis such that e1 spans the image of T − I and e1, . . . , en−1 span the kernel, then T is represented by the matrix I + U, where U has entry 1 in the top right position and 0 elsewhere. Note that a transvection has determinant 1. The axis of the transvection is the hyperplane ker(T − I); this subspace is fixed elementwise by T. Dually, the centre of T is the image of T − I; every subspace containing this point is fixed by T (so that T acts trivially on the quotient space).

Thus, a transvection is a map of the form

    x ↦ x + (xf)a,

where a ∈ V and f ∈ V* satisfy af = 0 (that is, f ∈ a†). Its centre and axis are ⟨a⟩ and ker(f) respectively.

The transformation of projective space induced by a transvection is called an elation. The matrix form given earlier shows that all elations lie in PSL(n, F).

Theorem 2.6 For any n ≥ 2 and commutative field F, the group PSL(n, F) is generated by the elations.

Proof We use induction on n.

Consider the case n = 2. The elations fixing a specified point, together with the identity, form a group which acts regularly on the remaining points. (In the linear fractional representation, this elation group is

    {x ↦ x + a : a ∈ F},

fixing ∞.) Hence the group G generated by the elations is 2-transitive. So it is enough to show that the stabiliser of the two points ∞ and 0 in G is the same as in PSL(2, F), namely

    {x ↦ a²x : a ∈ F, a ≠ 0}.

Given a ∈ F, a ≠ 0, we have

    ( 1 1 ) ( 1    0 ) ( 1  −a⁻¹ ) ( 1      0 )   ( a   0  )
    ( 0 1 ) ( a−1  1 ) ( 0    1  ) ( a−a²   1 ) = ( 0  a⁻¹ ),

and the last matrix induces the linear fractional map x ↦ ax/a⁻¹ = a²x, as required.

(The proof shows that two elation groups, with centres ∞ and 0, suffice to generate PSL(2, F).)

Now for the general case, we assume that PSL(n − 1, F) is generated by elations. Let G be the subgroup of PSL(n, F) generated by elations. First, we observe that G is transitive; for, given any two points p1 and p2, there is an elation on the line ⟨p1, p2⟩ carrying p1 to p2, which is induced by an elation on the whole space (acting trivially on a complement to the line). So it is enough to show that the stabiliser of a point p is generated by elations. Take an element g ∈ PSL(n, F) fixing p.

By induction, G_p induces at least the group PSL(n − 1, F) on the quotient space V/p. So, multiplying g by a suitable product of elations, we may assume that g induces an element on V/p which is diagonal, with all but one of its diagonal elements equal to 1. In other words, we can assume that g has the form

    ( λ    0   . . .   0      0   )
    ( 0    1   . . .   0      0   )
    ( .    .           .      .   )
    ( 0    0   . . .   1      0   )
    ( x1   x2  . . .   xn−1   λ⁻¹ )

By further multiplication by elations, we may assume that x1 = · · · = xn−1 = 0. Now the result follows from the matrix calculation given in the case n = 2.

Exercise 2.5 A homology is an element of PGL(n, F) which fixes a hyperplane pointwise and also fixes a point not in this hyperplane. Thus, a homology is represented in a suitable basis by a diagonal matrix with all its diagonal entries except one equal to 1.

(a) Find two homologies whose product is an elation.

(b) Prove that PGL(n, F) is generated by homologies.

2.3 Iwasawa's Lemma

Let G be a permutation group on a set Ω: this means that G is a subgroup of the symmetric group on Ω. Iwasawa's Lemma gives a criterion for G to be simple. We will use this to prove the simplicity of PSL(n, F) and various other classical groups.

Recall that G is primitive on Ω if it is transitive and there is no non-trivial equivalence relation on Ω which is G-invariant: equivalently, if the stabiliser G_α of a point α ∈ Ω is a maximal subgroup of G. Any 2-transitive group is primitive. Iwasawa's Lemma is the following.

Theorem 2.7 Let G be primitive on Ω. Suppose that there is an abelian normal subgroup A of G_α with the property that the conjugates of A generate G. Then any non-trivial normal subgroup of G contains G′. In particular, if G = G′, then G is simple.

Proof Suppose that N is a non-trivial normal subgroup of G. Then N ≰ G_α for some α. Since G_α is a maximal subgroup of G, we have NG_α = G.

Let g be any element of G. Write g = nh, where n ∈ N and h ∈ G_α. Then

    gAg⁻¹ = nhAh⁻¹n⁻¹ = nAn⁻¹,

since A is normal in G_α. Since N is normal in G we have gAg⁻¹ ≤ NA. Since the conjugates of A generate G we see that G = NA.

Hence

    G/N = NA/N ≅ A/(A ∩ N)

is abelian, whence N ≥ G′, and we are done.

2.4 Simplicity

We now apply Iwasawa's Lemma to prove the simplicity of PSL(n, F). First, we consider the two exceptional cases where the group is not simple.

Recall that PSL(2, q) is a subgroup of the symmetric group S_{q+1}, having order (q + 1)q(q − 1)/(q − 1, 2).

(a) If q = 2, then PSL(2, q) is a subgroup of S3 of order 6, so PSL(2, 2) ≅ S3. It is not simple, having a normal subgroup of order 3.

(b) If q = 3, then PSL(2, q) is a subgroup of S4 of order 12, so PSL(2, 3) ≅ A4. It is not simple, having a normal subgroup of order 4.


(c) For comparison, we note that, if q = 4, then PSL(2, q) is a subgroup of S5 of order 60, so PSL(2, 4) ≅ A5. This group is simple.

Lemma 2.8 The group PSL(n, F) is equal to its derived group if n > 2 or if |F| > 3.

Proof The group G = PSL(n, F) acts transitively on incident point-hyperplane pairs. Each such pair defines a unique elation group. So all the elation groups are conjugate. These groups generate G. So the proof will be concluded if we can show that some elation group is contained in G′.

Suppose that |F| > 3. It is enough to consider n = 2, since we can extend all matrices in the argument below to rank n by appending a block consisting of the identity of rank n − 2. There is an element a ∈ F with a² ≠ 0, 1. We saw in the proof of Theorem 2.6 that SL(2, F) contains the matrix

    ( a   0  )
    ( 0  a⁻¹ ).

Now

    ( 1 −x ) ( a   0  ) ( 1 x ) ( a⁻¹ 0 )   ( 1  (a² − 1)x )
    ( 0  1 ) ( 0  a⁻¹ ) ( 0 1 ) ( 0   a ) = ( 0      1     );

this equation expresses any element of the corresponding transvection group as a commutator.
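Again this identity is quick to verify numerically; the sketch below (ours, over GF(13)) checks it for all a ≠ 0 and all x:

```python
# Verify the commutator identity
# (1 -x)(a 0  )(1 x)(a⁻¹ 0)   (1 (a²-1)x)
# (0  1)(0 a⁻¹)(0 1)(0   a) = (0    1   )  over GF(p).
p = 13

def mul(X, Y):
    return tuple(tuple(sum(X[i][k] * Y[k][j] for k in range(2)) % p
                       for j in range(2)) for i in range(2))

def commutator_check(a, x):
    ai = pow(a, p - 2, p)                    # a⁻¹ mod p
    T1 = ((1, (-x) % p), (0, 1))
    D  = ((a, 0), (0, ai))
    T2 = ((1, x % p), (0, 1))
    Di = ((ai, 0), (0, a % p))
    P = mul(mul(mul(T1, D), T2), Di)
    return P == ((1, ((a * a - 1) * x) % p), (0, 1))

ok = all(commutator_check(a, x) for a in range(1, p) for x in range(p))
```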

Finally suppose that |F| = 2 or 3. As above, it is enough to consider the case n = 3. This is easier, since we have more room to manoeuvre in three dimensions: we have

    ( 1 −x 0 ) ( 1 0  0 ) ( 1 x 0 ) ( 1 0 0 )   ( 1 0 x )
    ( 0  1 0 ) ( 0 1 −1 ) ( 0 1 0 ) ( 0 1 1 ) = ( 0 1 0 ).
    ( 0  0 1 ) ( 0 0  1 ) ( 0 0 1 ) ( 0 0 1 )   ( 0 0 1 )

Lemma 2.9 Let Ω be the set of points of the projective space PG(n − 1, F). Then, for α ∈ Ω, the set of elations with centre α, together with the identity, forms an abelian normal subgroup of G_α.

Proof This is more conveniently shown for the corresponding transvections in SL(n, F). But the transvections with centre spanned by the vector a consist of all maps x ↦ x + (xf)a, for f ∈ a†; these clearly form an abelian group isomorphic to the additive group of a†.

Theorem 2.10 The group PSL(n, F) is simple if n > 2 or if |F| > 3.

Proof Let G = PSL(n, F). Then G is 2-transitive, and hence primitive, on the set Ω of points of the projective space. The group A of elations with centre α is an abelian normal subgroup of G_α, and the conjugates of A generate G (by Theorem 2.6, since every elation has a centre). Apart from the two excluded cases, G = G′. So G is simple, by Iwasawa's Lemma.

2.5 Small fields

We now have the family PSL(n, q), for (n, q) ≠ (2, 2), (2, 3), of finite simple groups. (The first two members are not simple: we observed that PSL(2, 2) ≅ S3 and PSL(2, 3) ≅ A4, neither of which is simple.) As is well known, Galois showed that the alternating group An of degree n ≥ 5 is simple.

Exercise 2.6 Prove that the alternating group An is simple for n ≥ 5.

Some of these groups coincide:

Theorem 2.11 (a) PSL(2, 4) ≅ PSL(2, 5) ≅ A5.

(b) PSL(2, 7) ≅ PSL(3, 2).

(c) PSL(2, 9) ≅ A6.

(d) PSL(4, 2) ≅ A8.

Proofs of these isomorphisms are outlined below. Many of the details are left as exercises. There are many other ways to proceed!

Theorem 2.12 Let G be a simple group of order (p + 1)p(p − 1)/2, where p is a prime number greater than 3. Then G ≅ PSL(2, p).

Proof By Sylow's Theorem, the number of Sylow p-subgroups is congruent to 1 mod p and divides (p + 1)(p − 1)/2; also this number is greater than 1, since G is simple. So there are p + 1 Sylow p-subgroups; and if P is a Sylow p-subgroup and N = N_G(P), then |N| = p(p − 1)/2.

Consider G acting as a permutation group on the set Ω of cosets of N. Let ∞ denote the coset N. Then P fixes ∞ and permutes the other p cosets regularly. So we can identify Ω with the set {∞} ∪ GF(p) such that a generator of P acts on Ω as the permutation x ↦ x + 1 (fixing ∞). We see that N is permutation isomorphic to the group

    {x ↦ a²x + b : a, b ∈ GF(p), a ≠ 0}.

More conveniently, elements of N can be represented as linear fractional transformations of Ω with determinant 1, since

    a²x + b = (ax + a⁻¹b)/(0x + a⁻¹).

Since G is 2-transitive on Ω, N is a maximal subgroup of G, and G is generated by N and an element t interchanging ∞ and 0, which can be chosen to be an involution. If we can show that t is also represented by a linear fractional transformation with determinant 1, then G will be a subgroup of the group PSL(2, p) of all such transformations, and comparing orders will show that G = PSL(2, p). We treat the case p ≡ −1 (mod 4); the other case is a little bit trickier.

The element t must normalise the stabiliser of ∞ and 0, which is the cyclic group C = {x ↦ a²x} of order (p − 1)/2 (having two orbits of size (p − 1)/2, consisting of the non-zero squares and the non-squares in GF(p)). Also, t has no fixed points. For the stabiliser of three points in G is trivial, so t cannot fix more than 2 points; but the two-point stabiliser has odd order (p − 1)/2. Thus t interchanges the two orbits of C.

There are various ways to show that t inverts C. One of them uses Burnside's Transfer Theorem. Let q be any prime divisor of (p − 1)/2, and let Q be a Sylow q-subgroup of C (and hence of G). Clearly N_G(Q) = C⟨t⟩, so t must centralise or invert Q. If t centralises Q, then Q ≤ Z(N_G(Q)), and Burnside's Transfer Theorem implies that G has a normal q-complement, contradicting simplicity. So t inverts every Sylow subgroup of C, and thus inverts C.

Now C⟨t⟩ is a dihedral group, containing (p − 1)/2 involutions, one interchanging the point 1 with each point in the other C-orbit. We may choose t so that it interchanges 1 with −1. Then the fact that t inverts C shows that it interchanges a² with −a⁻² for each non-zero a ∈ GF(p). So t is the linear fractional map x ↦ −1/x, and we are done.

Theorem 2.11(b) follows, since PSL(3, 2) is a simple group of order

    (2³ − 1)(2³ − 2)(2³ − 2²) = 168 = (7 + 1) · 7 · (7 − 1)/2.

Exercise 2.7 (a) Complete the proof of the above theorem in the case p = 5. Hence prove Theorem 2.11(a).


(b) Show that a simple group of order 60 has five Sylow 2-subgroups, and hence show that any such group is isomorphic to A5. Give an alternative proof of Theorem 2.11(a).

Proof of Theorem 2.11(d) The simple group PSL(3, 2) of order 168 is the group of collineations of the projective plane over GF(2), shown below.

[Figure: the Fano plane PG(2, 2): seven points and seven lines (one drawn as a circle), with three points on each line and three lines through each point.]

Since its index in S7 is 30, there are 30 different ways of assigning the structure of a projective plane to a given set N = {1, 2, 3, 4, 5, 6, 7} of seven points; and since PSL(3, 2), being simple, contains no odd permutations, it is contained in A7, so these 30 planes fall into two orbits of 15 under the action of A7.

Let Ω be one of the A7-orbits. Each plane contains seven lines, so there are 15 × 7 = 105 pairs (L, Π), where L is a 3-subset of N, Π ∈ Ω, and L is a line of Π. Thus, each of the (7 choose 3) = 35 triples is a line in exactly three of the planes in Ω.
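None of the following is in the notes, but the counting claims are small enough to verify by brute force: a line set for the Fano plane (our own labelling on {0, . . . , 6}), its 168 collineations, the fact that they are all even permutations, and the 30 plane structures on a fixed 7-set.

```python
from itertools import permutations, combinations

# One copy of the Fano plane: points 0..6, lines as frozensets of 3 points.
fano = {frozenset(l) for l in
        [(0, 1, 2), (0, 3, 4), (0, 5, 6), (1, 3, 5),
         (1, 4, 6), (2, 3, 6), (2, 4, 5)]}

def relabel(perm, lines):
    return {frozenset(perm[p] for p in l) for l in lines}

# Collineations: permutations of the 7 points preserving the line set.
autos = [g for g in permutations(range(7)) if relabel(g, fano) == fano]

def is_even(g):
    # parity via inversion count
    inv = sum(1 for i, j in combinations(range(7), 2) if g[i] > g[j])
    return inv % 2 == 0

# All distinct plane structures on the same point set: the S7-orbit of fano.
planes = {frozenset(relabel(g, fano)) for g in permutations(range(7))}
```

The orbit size 30 = 5040/168 matches the index of PSL(3, 2) in S7, and evenness of every collineation matches the containment in A7.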

We now define a new geometry G whose 'points' are the elements of Ω, and whose 'lines' are the triples of elements containing a fixed line L. Clearly, any two 'points' lie in at most one 'line', and a simple counting argument shows that in fact two 'points' lie in a unique 'line'.

Let Π be a plane from the other A7-orbit. For each point n ∈ N, the three lines of Π containing n belong to a unique plane of the set Ω. (Having chosen three lines through a point, there are just two ways to complete the projective plane, differing by an odd permutation.) In this way, each of the seven points of N gives rise to a 'point' of Ω. Moreover, the three points of a line of Π correspond to three 'points' of a 'line' in our new geometry G. Thus, G contains 'planes', each isomorphic to the projective plane PG(2, 2).

It follows that G is isomorphic to PG(3, 2). The most direct way to see this is to consider the set A = {0} ∪ Ω, and define a binary operation on A by the rules

    0 + Π = Π + 0 = Π for all Π ∈ Ω;
    Π + Π = 0 for all Π ∈ Ω;
    Π + Π′ = Π″ if {Π, Π′, Π″} is a 'line'.

Then A is an elementary abelian 2-group. (The associative law follows from the fact that any three non-collinear 'points' lie in a 'plane'.) In other words, A is the


additive group of a rank 4 vector space over GF(2), and clearly G is the projective geometry based on this vector space.

Now A7 ≤ Aut(G) = PSL(4, 2). (The last equality comes from the Fundamental Theorem of Projective Geometry and the fact that PSL(4, 2) = PΓL(4, 2), since GF(2) has no non-trivial scalars or automorphisms.) By calculating orders, we see that A7 has index 8 in PSL(4, 2). Thus, PSL(4, 2) is a permutation group on the cosets of A7, that is, a subgroup of S8, and a similar calculation shows that it has index 2 in S8. We conclude that PSL(4, 2) ≅ A8.

The proof of Theorem 2.11(c) is an exercise. Two approaches are outlined below. Fill in the details.

Exercise 2.8 The field GF(9) can be represented as {a + bi : a, b ∈ GF(3)}, where i² = −1. Let

    A = ( 1  1+i ),   B = (  0  1 ).
        ( 0   1  )        ( −1  0 )

Then

    A³ = I,   B² = −I,   (AB)⁵ = −I.

So the corresponding elements a, b ∈ G = PSL(2, 9) satisfy

    a³ = b² = (ab)⁵ = 1,

and so generate a subgroup H isomorphic to A5. Then H has index 6 in G, and the action of G on the cosets of H shows that G ≤ S6. Then consideration of order shows that G ≅ A6.
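The relations A³ = I, B² = −I, (AB)⁵ = −I can be checked mechanically; in this sketch (ours) a GF(9) element a + bi is modelled as a pair (a, b) with entries mod 3 and i² = −1.

```python
# GF(9) arithmetic on pairs (a, b) meaning a + bi, i² = −1, char 3.
def fadd(u, v):
    return ((u[0] + v[0]) % 3, (u[1] + v[1]) % 3)

def fmul(u, v):
    a, b = u
    c, d = v
    # (a + bi)(c + di) = (ac − bd) + (ad + bc)i
    return ((a * c - b * d) % 3, (a * d + b * c) % 3)

ZERO, ONE, MINUS_ONE = (0, 0), (1, 0), (2, 0)

def mmul(X, Y):
    # 2×2 matrices over GF(9)
    return tuple(tuple(fadd(fmul(X[i][0], Y[0][j]), fmul(X[i][1], Y[1][j]))
                       for j in range(2)) for i in range(2))

def mpow(X, n):
    R = ((ONE, ZERO), (ZERO, ONE))
    for _ in range(n):
        R = mmul(R, X)
    return R

A = (((1, 0), (1, 1)), ((0, 0), (1, 0)))    # [[1, 1+i], [0, 1]]
B = (((0, 0), (1, 0)), ((2, 0), (0, 0)))    # [[0, 1], [−1, 0]]
I2 = ((ONE, ZERO), (ZERO, ONE))
NEG_I2 = ((MINUS_ONE, ZERO), (ZERO, MINUS_ONE))
```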

Exercise 2.9 Let G = A6, and let H be the normaliser of a Sylow 3-subgroup of G. Let G act on the 10 cosets of H. Show that H fixes one point, and that its action on the remaining points is isomorphic to that of the group

    {x ↦ a²x + b : a, b ∈ GF(9), a ≠ 0}

on GF(9). Choose an element outside H and, following the proof of Theorem 2.12, show that its action is linear fractional (if the fixed point is labelled as ∞). Deduce that A6 ≤ PSL(2, 9), and by considering orders, show that equality holds.

Exercise 2.10 A Hall subgroup of a finite group G is a subgroup whose order and index are coprime. Philip Hall proved that a finite soluble group G has Hall subgroups of all admissible orders m dividing |G| for which (m, |G|/m) = 1, and that any two Hall subgroups of the same order in a finite soluble group are conjugate.


(a) Show that PSL(2, 5) fails to have a Hall subgroup of some admissible order.

(b) Show that PSL(2, 7) has non-conjugate Hall subgroups of the same order.

(c) Show that PSL(2, 11) has non-isomorphic Hall subgroups of the same order.

(d) Show that each of these groups is the smallest with the stated property.

Exercise 2.11 Show that PSL(4, 2) and PSL(3, 4) are non-isomorphic simple groups

of the same order.
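The order coincidences in Theorem 2.11 and Exercise 2.11 can be checked from |GL(n, q)| = ∏_{0 ≤ i < n} (qⁿ − qⁱ); the helper below (our own code) divides out the scalars and the centre of SL.

```python
from math import gcd, prod, factorial

def psl_order(n, q):
    # |PSL(n, q)| = |GL(n, q)| / (q − 1) / gcd(n, q − 1)
    gl = prod(q**n - q**i for i in range(n))
    return gl // (q - 1) // gcd(n, q - 1)

orders = {
    'PSL(2,4)': psl_order(2, 4),
    'PSL(2,5)': psl_order(2, 5),
    'PSL(2,7)': psl_order(2, 7),
    'PSL(3,2)': psl_order(3, 2),
    'PSL(2,9)': psl_order(2, 9),
    'PSL(4,2)': psl_order(4, 2),
    'PSL(3,4)': psl_order(3, 4),
}
```

Of course equal orders alone do not give isomorphisms; Exercise 2.11 is precisely the point that PSL(4, 2) and PSL(3, 4) share the order 20160 without being isomorphic.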


3 Polarities and forms

3.1 Sesquilinear forms

We saw in Chapter 1 that the projective space PG(n − 1, F) is isomorphic to its dual if and only if the field F is isomorphic to its opposite. More precisely, we have the following. Let σ be an anti-automorphism of F, and V an F-vector space of rank n. A sesquilinear form B on V is a function B : V × V → F which satisfies the following conditions:

(a) B(c1x1 + c2x2, y) = c1B(x1, y) + c2B(x2, y), that is, B is a linear function of its first argument;

(b) B(x, c1y1 + c2y2) = B(x, y1)c1^σ + B(x, y2)c2^σ, that is, B is a semilinear function of its second argument, with field anti-automorphism σ.

(The word 'sesquilinear' means 'one-and-a-half'.) If σ is the identity (so that F is commutative), we say that B is a bilinear form.

The left radical of B is the subspace {x ∈ V : (∀y ∈ V) B(x, y) = 0}, and the right radical is the subspace {y ∈ V : (∀x ∈ V) B(x, y) = 0}.

Exercise 3.1 (a) Prove that the left and right radicals are subspaces.

(b) Show that the left and right radicals have the same rank (if V has finite rank).

(c) Construct a bilinear form on a vector space of infinite rank such that the left radical is zero and the right radical is non-zero.

The sesquilinear form B is called non-degenerate if its left and right radicals are zero. (By the preceding exercise, it suffices to assume that one of the radicals is zero.)

A non-degenerate sesquilinear form induces a duality of PG(n − 1, F) (an isomorphism from PG(n − 1, F) to PG(n − 1, F°)) as follows: for any y ∈ V, the map x ↦ B(x, y) is a linear map from V to F, that is, an element of the dual space V* (which is a left vector space of rank n over F°); if we call this element β_y, then the map y ↦ β_y is a σ-semilinear bijection from V to V*, and so induces the required duality.

Theorem 3.1 For n ≥ 3, any duality of PG(n − 1, F) is induced in this way by a non-degenerate sesquilinear form on V = F^n.


Proof By the Fundamental Theorem of Projective Geometry, a duality is induced by a σ-semilinear bijection φ from V to V*, for some anti-automorphism σ. Set B(x, y) = x(yφ).

We can short-circuit the passage to the dual space, and write the duality as

    U ↦ U⊥ = {x ∈ V : B(x, y) = 0 for all y ∈ U}.

Obviously, a duality applied twice is a collineation. The most important types of dualities are those whose square is the identity. A polarity of PG(n, F) is a duality ⊥ which satisfies U⊥⊥ = U for all flats U of PG(n, F).

It will turn out that polarities give rise to a class of geometries (the polar spaces) with properties similar to those of projective spaces, and define groups analogous to the projective groups. If a duality is not a polarity, then any collineation which respects it must commute with its square, which is a collineation; so the group we obtain will lie inside the centraliser of some element of the collineation group. So the "largest" subgroups obtained will be those preserving polarities.

A sesquilinear form B is reflexive if B(x, y) = 0 implies B(y, x) = 0.

Proposition 3.2 A duality is a polarity if and only if the sesquilinear form defining it is reflexive.

Proof B is reflexive if and only if x ∈ y⊥ implies y ∈ x⊥. Hence, if B is reflexive, then U ⊆ U⊥⊥ for all subspaces U. But by non-degeneracy, dim U⊥⊥ = dim V − dim U⊥ = dim U; and so U = U⊥⊥ for all U. Conversely, given a polarity ⊥, if y ∈ x⊥, then x ∈ x⊥⊥ ⊆ y⊥ (since inclusions are reversed).

We now turn to the classification of reflexive forms. For convenience, from now on F will always be assumed to be commutative. (Note that, if the anti-automorphism σ is an automorphism, and in particular if σ is the identity, then F is automatically commutative.)

The form B is said to be σ-Hermitian if B(y, x) = B(x, y)^σ for all x, y ∈ V. If B is a non-zero σ-Hermitian form, then

(a) for any x, B(x, x) lies in the fixed field of σ;

(b) σ² = 1. For every scalar c is a value of B, say B(x, y) = c; then

    c^{σ²} = B(x, y)^{σ²} = B(y, x)^σ = B(x, y) = c.


If σ is the identity, such a form (which is bilinear) is called symmetric.

A bilinear form B is called alternating if B(x, x) = 0 for all x ∈ V. This implies that B(x, y) = −B(y, x) for all x, y ∈ V. For

    0 = B(x + y, x + y) = B(x, x) + B(x, y) + B(y, x) + B(y, y) = B(x, y) + B(y, x).

Hence, if the characteristic is 2, then any alternating form is symmetric (but not conversely); but, in characteristic different from 2, only the zero form is both symmetric and alternating.

Clearly, an alternating or Hermitian form is reflexive. Conversely, we have the following:

Theorem 3.3 A non-degenerate reflexive σ-sesquilinear form is either alternating, or a scalar multiple of a σ-Hermitian form. In the latter case, if σ is the identity, then the scalar can be taken to be 1.

Proof I will give the proof just for a bilinear form. Thus, it must be proved that a non-degenerate reflexive bilinear form is either symmetric or alternating.

We have

    B(u, v)B(u, w) − B(u, w)B(u, v) = 0

by commutativity; that is, using bilinearity,

    B(u, B(u, v)w − B(u, w)v) = 0.

By reflexivity,

    B(B(u, v)w − B(u, w)v, u) = 0,

whence bilinearity again gives

    B(u, v)B(w, u) = B(u, w)B(v, u).     (1)

Call a vector u good if B(u, v) ≠ 0 ≠ B(v, u) for some v. By Equation (1), if u is good, then B(u, w) = B(w, u) for all w. Also, if u is good and B(u, v) ≠ 0, then v is good. But, given any two non-zero vectors u1, u2, there exists v with B(ui, v) ≠ 0 for i = 1, 2. (For there exist v1, v2 with B(ui, vi) ≠ 0 for i = 1, 2, by non-degeneracy; and at least one of v1, v2, v1 + v2 has the required property.) So, if some vector is good, then every non-zero vector is good, and B is symmetric.

But, putting u = w in Equation (1) gives

    B(u, u)(B(u, v) − B(v, u)) = 0

for all u, v. So, if u is not good, then B(u, u) = 0; and, if no vector is good, then B is alternating.


Exercise 3.2 (a) Show that the left and right radicals of a reflexive form are equal.

(b) Assuming Theorem 3.3, prove that the assumption of non-degeneracy in the theorem can be removed.

Exercise 3.3 Let σ be a (non-identity) automorphism of F of order 2. Let E be the subfield Fix(σ).

(a) Prove that F is of degree 2 over E, i.e., a rank 2 E-vector space.

[See any textbook on Galois theory. Alternatively, argue as follows: Take λ ∈ F \ E. Then λ is quadratic over E, so E(λ) has degree 2 over E. Now E(λ) contains an element ω such that ω^σ = −ω (if the characteristic is not 2) or ω^σ = ω + 1 (if the characteristic is 2). Now, given two such elements, their quotient or difference respectively is fixed by σ, so lies in E.]

(b) Prove that

    {λ ∈ F : λλ^σ = 1} = {μ/μ^σ : μ ∈ F, μ ≠ 0}.

[The left-hand set clearly contains the right. For the reverse inclusion, separate into cases according as the characteristic is 2 or not.

If the characteristic is not 2, then we can take F = E(ω), where ω² = α ∈ E and ω^σ = −ω. If λ = 1, then take μ = 1; otherwise, if λ = a + bω, take μ = bα + (a − 1)ω.

If the characteristic is 2, show that we can take F = E(ω), where ω² + ω + α = 0, α ∈ E, and ω^σ = ω + 1. Again, if λ = 1, set μ = 1; else, if λ = a + bω, take μ = (a + 1) + bω.]
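Part (b) can be verified numerically in a small case (this check is ours, not from the notes): take F = GF(9) and E = GF(3), with σ the Frobenius map x ↦ x³, the unique automorphism of order 2; GF(9) is modelled as pairs (a, b) meaning a + bω with ω² = −1.

```python
# GF(9) as pairs (a, b) = a + bω, ω² = −1, entries mod 3.
def mul(u, v):
    a, b = u
    c, d = v
    return ((a * c - b * d) % 3, (a * d + b * c) % 3)

def power(u, n):
    r = (1, 0)
    for _ in range(n):
        r = mul(r, u)
    return r

def sigma(u):
    return power(u, 3)        # Frobenius x ↦ x³, order 2 on GF(9)

def inv(u):
    return power(u, 7)        # u⁻¹ = u⁷, since the unit group has order 8

units = [(a, b) for a in range(3) for b in range(3) if (a, b) != (0, 0)]
norm_one = {u for u in units if mul(u, sigma(u)) == (1, 0)}   # λλ^σ = 1
quotients = {mul(u, inv(sigma(u))) for u in units}            # μ/μ^σ
```

Both sets have 4 elements: the norm map λ ↦ λλ^σ = λ⁴ on a cyclic group of order 8 has kernel of order 4.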

Exercise 3.4 Use the result of the preceding exercise to complete the proof of Theorem 3.3 in general.

[If B(u, u) = 0 for all u, the form B is alternating and bilinear. If not, suppose that B(u, u) ≠ 0, and let B(u, u)^σ = λB(u, u). Choosing μ as in Exercise 3.3 and re-normalising B, show that we may assume that λ = 1, and (with this choice) that B is Hermitian.]

3.2 Hermitian and quadratic forms

We now change ground slightly from the last section. On the one hand, we restrict things by excluding some bilinear forms from the discussion; on the other, we introduce quadratic forms. The loss and gain exactly balance if the characteristic is not 2; but, in characteristic 2, we make a net gain.

Let σ be an automorphism of the commutative field F, of order dividing 2. Let Fix(σ) = {λ ∈ F : λ^σ = λ} be the fixed field of σ, and Tr(σ) = {λ + λ^σ : λ ∈ F} the trace of σ. Since σ² is the identity, it is clear that Fix(σ) ⊇ Tr(σ). Moreover, if σ is the identity, then Fix(σ) = F, and

    Tr(σ) = 0 if F has characteristic 2,
    Tr(σ) = F otherwise.

Let B be a σ-Hermitian form. We observed in the last section that B(x, x) ∈ Fix(σ) for all x ∈ V. We call the form B trace-valued if B(x, x) ∈ Tr(σ) for all x ∈ V.

Exercise 3.5 Let σ be an automorphism of a commutative field F such that σ² is the identity.

(a) Prove that Fix(σ) is a subfield of F.

(b) Prove that Tr(σ) is closed under addition, and under multiplication by elements of Fix(σ).

Proposition 3.4 Tr(σ) = Fix(σ) unless the characteristic of F is 2 and σ is the identity.

Proof E = Fix(σ) is a field, and K = Tr(σ) is an E-vector space contained in E (Exercise 3.5). So, if K ≠ E, then K = 0, and σ is the map x ↦ −x. But, since σ is a field automorphism, this implies that the characteristic is 2 and σ is the identity.

Thus, in characteristic 2, symmetric bilinear forms which are not alternating are not trace-valued; but this is the only obstruction. We introduce quadratic forms to repair this damage. But, of course, quadratic forms can be defined in any characteristic. However, we note at this point that Theorem 3.3 depends in a crucial way on the commutativity of F; this leaves open the possibility of additional types of polar spaces defined by so-called pseudoquadratic forms. We will not pursue this here: see Tits's classification of spherical buildings.

Let V be a vector space over F. A quadratic form on V is a function q : V → F satisfying


(a) q(λx) = λ²q(x) for all λ ∈ F, x ∈ V;

(b) q(x + y) = q(x) + q(y) + B(x, y), where B is bilinear.

Now, if the characteristic of F is not 2, then B is a symmetric bilinear form. Each of q and B determines the other, by

    B(x, y) = q(x + y) − q(x) − q(y),
    q(x) = ½B(x, x),

the latter equation coming from the substitution x = y in (b). So nothing new is obtained.

On the other hand, if the characteristic of F is 2, then B is an alternating bilinear form, and q cannot be recovered from B. Indeed, many different quadratic forms correspond to the same bilinear form. (Note that the quadratic form does give extra structure to the vector space; we'll see that this structure is geometrically similar to that provided by an alternating or Hermitian form.)

We say that the bilinear form B is obtained by polarisation of q.
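A tiny experiment (ours, not from the notes) illustrates the characteristic-2 phenomenon with V of rank 2 over GF(2): all eight functions vanishing at 0 are quadratic forms, and they fall into two groups of four sharing the same alternating polarisation.

```python
from itertools import product

V = [(0, 0), (0, 1), (1, 0), (1, 1)]     # GF(2)²

def add(x, y):
    return ((x[0] + y[0]) % 2, (x[1] + y[1]) % 2)

def polarise(q):
    # B(x, y) = q(x + y) + q(x) + q(y)  (char 2, so − is +)
    return {(x, y): (q[add(x, y)] + q[x] + q[y]) % 2 for x in V for y in V}

def is_bilinear(B):
    return all(B[(add(x, y), z)] == (B[(x, z)] + B[(y, z)]) % 2 and
               B[(z, add(x, y))] == (B[(z, x)] + B[(z, y)]) % 2
               for x in V for y in V for z in V)

# All maps q : V → GF(2) with q(0) = 0; homogeneity q(λx) = λ²q(x) is
# automatic over GF(2), since λ ∈ {0, 1}.
by_B = {}
for vals in product((0, 1), repeat=3):
    q = {(0, 0): 0, (0, 1): vals[0], (1, 0): vals[1], (1, 1): vals[2]}
    B = polarise(q)
    assert is_bilinear(B) and all(B[(x, x)] == 0 for x in V)
    by_B.setdefault(tuple(sorted(B.items())), []).append(q)
```

Grouping by polarisation gives two bilinear forms (the zero form and the standard symplectic form), each arising from four distinct quadratic forms, in line with Exercise 3.6 below: q is determined by B together with its values on a basis.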

Now let B be a symmetric bilinear form over a field of characteristic 2, which is not alternating. Set f(x) = B(x, x). Then we have

    f(λx) = λ²f(x),
    f(x + y) = f(x) + f(y),

since B(x, y) + B(y, x) = 0. Thus f is "almost" a semilinear form; the map λ ↦ λ² is a homomorphism of the field F with kernel 0, but it may fail to be an automorphism. But in any case, the kernel of f is a subspace of V, and the restriction of B to this subspace is an alternating bilinear form. So again, in the spirit of the vague comment motivating the study of polarities in the last section, the structure provided by the form B is not "primitive". For this reason, we do not consider symmetric bilinear forms in characteristic 2 at all. However, as indicated above, we will consider quadratic forms in characteristic 2.

Now, in characteristic different from 2, we can take either quadratic forms or symmetric bilinear forms, since the structural content is the same. For consistency, we will take quadratic forms in this case too. This leaves us with three "types" of forms to study: alternating bilinear forms; σ-Hermitian forms where σ is not the identity; and quadratic forms.

We have to define the analogue of non-degeneracy for quadratic forms. Of course, we could require that the bilinear form obtained by polarisation is non-degenerate; but this is too restrictive. We say that a quadratic form q is non-degenerate if

    q(x) = 0 and (∀y ∈ V) B(x, y) = 0 imply x = 0,

where B is the associated bilinear form; that is, if the form q is non-zero on every non-zero vector of the radical.

If the characteristic is not 2, then non-degeneracy of the quadratic form and of the bilinear form are equivalent conditions.

Now suppose that the characteristic is 2, and let W be the radical of B. Then B is identically zero on W; so the restriction of q to W satisfies

    q(x + y) = q(x) + q(y),
    q(λx) = λ²q(x).

As above, q is very nearly semilinear.

The field F is called perfect if every element is a square. If F is perfect, then the map x ↦ x² is onto, and hence an automorphism of F; so the restriction of q to W is indeed semilinear, and its kernel is a hyperplane of W. We conclude:

Theorem 3.5 Let q be a non-singular quadratic form, which polarises to B, over a field F.

(a) If the characteristic of F is not 2, then B is non-degenerate.

(b) If F is a perfect field of characteristic 2, then the radical of B has rank at most 1.

Exercise 3.6 Let B be an alternating bilinear form on a vector space V over a field F of characteristic 2. Let (vi : i ∈ I) be a basis for V, and (ci : i ∈ I) any function from I to F. Show that there is a unique quadratic form q with the properties that q(vi) = ci for every i ∈ I, and q polarises to B.

Exercise 3.7 (a) Construct an imperfect field of characteristic 2.

(b) Construct a non-singular quadratic form with the property that the radical of the associated bilinear form has rank greater than 1.

Exercise 3.8 Show that finite fields of characteristic 2 are perfect.



3.3 Classification of forms

As explained in the last section, we now consider a vector space V of finite rank equipped with a form of one of the following types: a non-degenerate alternating bilinear form B; a non-degenerate trace-valued σ-Hermitian form B, where σ is not the identity; or a non-singular quadratic form q. In the third case, we let B be the bilinear form obtained by polarising q; then B is alternating or symmetric according as the characteristic is or is not 2, but B may be degenerate. We also let f denote the function q. In the other two cases, we define a function f : V → F by f(x) = B(x, x); this is identically zero if B is alternating. See Exercise 3.10 for the Hermitian case.

Such a pair (V, B) or (V, q) will be called a formed space.

Exercise 3.10 Let B be a σ-Hermitian form on a vector space V over F, where σ is not the identity. Set f(x) = B(x, x). Let E = Fix(σ), and let V be V regarded as an E-vector space by restricting scalars. Prove that f is a quadratic form on V, which polarises to the bilinear form Tr(B) defined by Tr(B)(x, y) = B(x, y) + B(x, y)^σ. Show further that Tr(B) is non-degenerate if and only if B is.

We say that V is anisotropic if f(x) ≠ 0 for all x ≠ 0. Also, V is a hyperbolic plane if it is spanned by vectors v and w with f(v) = f(w) = 0 and B(v, w) = 1. (The vectors v and w are linearly independent, so V has rank 2.)

Theorem 3.6 A non-degenerate formed space is the direct sum of a number r of hyperbolic planes and an anisotropic space U. The number r and the isomorphism type of U are invariants of V.

Proof If V is anisotropic, then there is nothing to prove, since V cannot contain a hyperbolic plane. So suppose that V contains a vector v ≠ 0 with f(v) = 0.

We claim that there is a vector w with B(v, w) ≠ 0. In the alternating and Hermitian cases, this follows immediately from the non-degeneracy of the form. In the quadratic case, if no such vector exists, then v is in the radical of B; but v is a singular vector, contradicting the non-degeneracy of f.

Multiplying w by a non-zero constant, we may assume that B(v, w) = 1.

Now, for any value of λ, we have B(v, w − λv) = 1. We wish to choose λ so that f(w − λv) = 0; then v and w will span a hyperbolic plane. Now we distinguish cases.

(a) If B is alternating, then any value of λ works.


(b) If B is Hermitian, we have

    f(w − λv) = f(w) − λB(v, w) − λ^σ B(w, v) + λλ^σ f(v)
              = f(w) − (λ + λ^σ);

and, since B is trace-valued, there exists λ with Tr(λ) = f(w).

(c) Finally, if f = q is quadratic, we have

    f(w − λv) = f(w) − λB(w, v) + λ²f(v)
              = f(w) − λ,