
the group of orthogonal n × n matrixes, is a topological group in the subspace topology.
Hence also Eucl(n), the group of Euclidean motions, and the group of motions
of S^2 (see 3.5) are topological groups.

Example 5. Hyperbolic motions form a matrix group, the Lorentz group or group
of Lorentz transformations (see 3.11 for the notation and compare Theorem 3.11 and
Exercise 8.5)

$$O^+(1, 2) = \{ A \in GL(3, \mathbb{R}) \mid {}^t\!A\,J A = J, \text{ and } A \text{ preserves the halves of the cone } q_L(v) < 0 \}.$$

This is also a topological group. It and its higher dimensional colleagues O^+(1, n)
are important in special relativity and related areas of physics.

The topological groups in Examples 2–5 have an interesting 'continuous' geometry.
Here is a simple example (see Figure 8.0): recall that O(2) is the group of all
rotation matrixes $\begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}$
and reflection matrixes $\begin{pmatrix} \cos\theta & \sin\theta \\ \sin\theta & -\cos\theta \end{pmatrix}$.
Thus O(2) is a union of two connected components, each a copy of the circle S^1 parametrised by the
angle θ. One aim of this chapter is to generalise this nice description to some other
orthogonal groups.

8.2 Dimension counting

Here I begin the study of some particular aspects of the geometry of transformation

groups. In this section I want to concentrate on a measure of their size. Recall that

O(2) can be described geometrically as the union of two circles. The circle S^1 is a
one-dimensional geometric object in the sense that its points depend on one real parameter
θ; standing at a point of the circle, there is one direction in which you can move.

Without going into rigorous details, by dimension of a transformation group G,
denoted dim G, I understand the number of continuous real parameters needed to
characterise an element g ∈ G. The previous paragraph then shows that dim O(2) = 1.
Do not get confused by the fact that O(2) has two components; to characterise elements
of O(2), I need one continuous real parameter (the angle θ) and a discrete parameter
(the choice of one of the components, equivalently the sign of the determinant, or its
value ±1).

I proceed to compute the dimension of transformation groups in some nontrivial

cases. The computations will be performed by describing elements of the groups in a

way which makes it possible to count the parameters involved directly.

Proposition. An element g ∈ Eucl(n) depends on $\binom{n+1}{2}$ real parameters, so
$\dim \mathrm{Eucl}(n) = \binom{n+1}{2}$. Further,

$$\dim O(n) = \binom{n}{2}, \qquad \dim GL(n, \mathbb{R}) = n^2, \qquad \dim PGL(n + 1, \mathbb{R}) = n(n + 2).$$

Proof. The language of Euclidean frames from 1.12 gives a way of specifying
elements of the Euclidean group. Choose a reference frame {P_0, P_1, ..., P_n}; then by
Theorem 1.12, elements of the Euclidean group Eucl(n) correspond one-to-one with
the set of Euclidean frames {Q_0, Q_1, ..., Q_n}. Now calculate:

• Q_0 ∈ E^n is any point, so depends on n parameters;
• Q_1 ∈ E^n is any point with d(Q_0, Q_1) = 1, that is, it is any point of the unit sphere
  S^{n−1} with centre Q_0, hence depends on n − 1 real parameters;
• writing $e_1 = \overrightarrow{P_0P_1}$ and $e_1^\perp = E^{n-1} \subset E^n$ for the orthogonal complement, Q_2 is given
  by a point of the unit sphere S^{n−2} ⊂ E^{n−1}, so depends on n − 2 real parameters;
• similarly, Q_i is given by a point of S^{n−i}, and hence depends on n − i real parameters;
• in particular, Q_n is one of two points, so has no continuous parameter.

Thus a Euclidean frame depends on

$$\dim \mathrm{Eucl}(n) = n + (n - 1) + \cdots + 1 + 0 = \binom{n+1}{2}$$

parameters.

An element of O(n) fixes the origin, which I can take to be P_0 = Q_0 in the above
argument. Hence the dimension count is

$$\dim O(n) = (n - 1) + \cdots + 1 + 0 = \binom{n}{2},$$

agreeing with dim O(2) = 1. Said slightly differently, O(n) and Eucl(n) differ by the
translation part (compare Proposition 6.5.3), which accounts for n parameters:

$$\dim O(n) = \dim \mathrm{Eucl}(n) - n = \binom{n+1}{2} - n = \binom{n}{2}.$$

The dimension of the general linear group can be calculated in exactly the same
way. Elements of GL(n, R) correspond to invertible maps of the vector space R^n. Such
a map is determined by the images of the n usual basis vectors in R^n, parametrised
by a total of n^2 numbers (the entries of the matrix representing the map). Not all
parametrisations give invertible maps, but most do: I only have to exclude matrixes
with zero determinant. Hence there are n^2 real parameters involved, so

$$\dim GL(n, \mathbb{R}) = n^2.$$

Finally by Theorem 5.5 there are as many projective transformations as projective
frames of reference. Hence I have to pick n + 2 general points in P^n, leading to

$$\dim PGL(n + 1, \mathbb{R}) = (n + 2)n$$

parameters. Incidentally, the dimension of the projective group can also be calculated
from its definition PGL(n + 1, R) = GL(n + 1, R)/R^*, which gives

$$\dim PGL(n + 1, \mathbb{R}) = \dim GL(n + 1, \mathbb{R}) - 1 = (n + 1)^2 - 1 = (n + 2)n. \qquad \text{QED}$$
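The parameter counts in the proof can be checked numerically. The following is only a small sketch: the function names are mine, not the text's, and each function simply adds up the parameters exactly as in the argument above and compares the total with the closed formula.

```python
from math import comb


def dim_eucl(n):
    # Q_0 contributes n parameters, Q_i contributes n - i for i = 1, ..., n.
    return sum(n - i for i in range(0, n + 1))


def dim_o(n):
    # Same count with Q_0 fixed at the origin.
    return sum(n - i for i in range(1, n + 1))


def dim_gl(n):
    # One parameter per matrix entry.
    return n * n


def dim_pgl(n):
    # PGL(n + 1, R) = GL(n + 1, R) / R^*: one parameter is lost to scaling.
    return dim_gl(n + 1) - 1


for n in range(1, 10):
    assert dim_eucl(n) == comb(n + 1, 2)
    assert dim_o(n) == comb(n, 2)
    assert dim_o(n) == dim_eucl(n) - n
    assert dim_pgl(n) == n * (n + 2)
```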

You can design your own parameter counts for some other groups not mentioned

in the proposition; for example, do and generalise Exercise 8.3.

8.3 Compact and noncompact groups

Proposition. The orthogonal group O(n) is a compact topological space.

Proof. This is a simple application of Proposition 7.4.2. The orthogonal group
is a matrix group: it is a subspace of the space $\mathbb{R}^{n^2}$ of real matrixes. Hence it is
enough to show that it is closed and bounded. The equation ${}^t\!A\,A = 1_n$ defines a closed
subset of $\mathbb{R}^{n^2}$, so the main issue is boundedness. However, if A = (a_{ij}) is orthogonal,
then its columns form an orthonormal basis and in particular for every 1 ≤ k ≤ n,
$\sum_{i=1}^{n} a_{ik}^2 = 1$. Hence

$$\sum_{i,k=1}^{n} a_{ki}^2 = n,$$

which just says that every orthogonal matrix A is contained in a ball of radius $\sqrt{n}$ in
$\mathbb{R}^{n^2}$. QED
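As a quick sanity check of the boundedness step, here is a minimal sketch in plain Python. The helper names `is_orthogonal` and `frobenius_norm`, and the sample matrix, are my own choices for illustration. The sketch confirms that one particular orthogonal 3 × 3 matrix has Euclidean norm √3 as a point of R^9.

```python
import math


def is_orthogonal(a, tol=1e-12):
    # Check that the columns of a form an orthonormal basis, i.e. tA A = 1_n.
    n = len(a)
    for i in range(n):
        for j in range(n):
            dot = sum(a[k][i] * a[k][j] for k in range(n))
            if abs(dot - (1.0 if i == j else 0.0)) > tol:
                return False
    return True


def frobenius_norm(a):
    # Euclidean length of the matrix viewed as a point of R^(n^2).
    return math.sqrt(sum(x * x for row in a for x in row))


theta = 0.7
# A rotation block together with an entry -1: an element of O(3) with det = -1.
sample = [
    [math.cos(theta), -math.sin(theta), 0.0],
    [math.sin(theta), math.cos(theta), 0.0],
    [0.0, 0.0, -1.0],
]

assert is_orthogonal(sample)
assert abs(frobenius_norm(sample) - math.sqrt(3)) < 1e-12
```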

A compact space is often much more pleasant to work with than a noncompact one.

However, many transformation groups are visibly noncompact, such as the additive

group R. On the other hand, the topology and geometry of R are very simple (for

example, R is simply connected, and can be parametrised by a real parameter without

overlap). Most transformation groups are of course more complicated; however, in

a suitable sense they can be topologically decomposed as a compact group times a

group homeomorphic to R^n.


Example 1. The simplest example is the multiplicative group R^* of nonzero real
numbers. There is a homeomorphism (in this case, an isomorphism of groups)

$$\mathbb{R}_+ \times \{\pm 1\} \to \mathbb{R}^*;$$

in plain English, every nonzero number is the product of a positive number and a sign.
The space R_+ is homeomorphic to R; the group {±1} is finite so clearly compact.

Example 2. Although the next example looks similarly innocent, it appears in
many different guises throughout geometry, Fourier analysis, Lie groups, representation
theory, complex analysis and number theory. Consider the multiplicative group
C^* of nonzero complex numbers. This is a topological group; for example, I can
view C as the plane R^2 and take the subspace topology. The space C^* is obviously
noncompact. However, there is a homeomorphism (even a group isomorphism)

$$S^1 \times \mathbb{R}_+ \to \mathbb{C}^*, \qquad (\theta, r) \mapsto r \exp(i\theta).$$

Here S^1 is compact (and definitely not homeomorphic to a product of copies of R,
which is the essential content of 7.15.4, Corollary 1) and R_+ is homeomorphic to R.

Example 3. The final example is more substantial, and deals with the difference
between the groups GL(n, R) and O(n). Write T_+(n) ⊂ GL(n, R) for the set of upper
triangular matrixes with positive diagonal entries:

$$T_+(n) = \{ M = (m_{ij}) \in GL(n, \mathbb{R}) \mid m_{ij} = 0 \text{ for all } i > j, \text{ and } m_{ii} > 0 \}
= \left\{ \begin{pmatrix} + & * & \cdots & * \\ 0 & + & * & \cdots \\ 0 & \cdots & \ddots & * \\ 0 & \cdots & 0 & + \end{pmatrix} \right\}.$$

It is easy to see that T_+(n) ⊂ GL(n, R) is a subgroup.

Theorem. Every element A ∈ GL(n, R) can be written in a unique way in the form
A = BC, where B ∈ O(n) is an orthogonal matrix and C ∈ T_+(n) is an upper triangular
matrix with positive diagonal entries. Moreover, B and C depend continuously
on A. The map

    GL(n, R) → O(n) × T_+(n) given by A ↦ (B, C)

is a homeomorphism (see 7.3, but not a group homomorphism!).

Discussion. The space O(n) is compact by the above Proposition. The space T_+(n)
is homeomorphic to R^N, where $N = \binom{n+1}{2}$. Many geometric questions on GL(n, R)
reduce to similar questions on O(n); for a simple example, compare Remark 8.4. Note
also the dimension count:

$$\dim O(n) + \dim T_+(n) = \binom{n}{2} + \binom{n+1}{2} = n^2 = \dim GL(n, \mathbb{R}).$$

Proof. I view the n × n matrix A as a row made up of n column vectors f_i. Thus
{f_1, ..., f_n} is a basis of R^n because A ∈ GL(n, R). If it is an orthonormal basis then
there is no problem: A ∈ O(n), and we must take B = A and C = 1. If A is not
orthogonal to start with, then the Gram–Schmidt process described in the proof of
Theorem B.3 (1) produces an orthonormal basis. Set B to be the matrix formed from
the new basis vectors as columns, and C to be the matrix describing the change of
basis. Clearly B ∈ O(n); I leave you to check (see Exercise 8.6) that C ∈ T_+(n) and
that B, C depend continuously on A. Then the map A ↦ (B, C) is continuous, and
its inverse is matrix multiplication (B, C) ↦ BC. QED
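The construction in the proof can be tried out on a small matrix. The sketch below is an illustration under my own assumptions, not the text's method of record: it uses the classical Gram–Schmidt process in floating point, with matrixes stored as lists of rows, and the names `gram_schmidt_decompose` and `matmul` are hypothetical helpers.

```python
import math


def matmul(x, y):
    # Product of two square matrixes given as lists of rows.
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def gram_schmidt_decompose(a):
    """Return (b, c) with a = b c, b orthogonal, c upper triangular, c[j][j] > 0."""
    n = len(a)
    columns = [[a[i][j] for i in range(n)] for j in range(n)]  # the vectors f_j
    basis = []                                                 # orthonormal e_i
    c = [[0.0] * n for _ in range(n)]
    for j, f in enumerate(columns):
        v = list(f)
        for i, e in enumerate(basis):
            # Coefficient of e_i in f_j; it lands above the diagonal of c.
            c[i][j] = sum(x * y for x, y in zip(e, f))
            v = [x - c[i][j] * y for x, y in zip(v, e)]
        norm = math.sqrt(sum(x * x for x in v))
        if norm < 1e-12:
            raise ValueError("columns are linearly dependent: A is not in GL(n, R)")
        c[j][j] = norm                      # positive diagonal entry
        basis.append([x / norm for x in v])
    b = [[basis[j][i] for j in range(n)] for i in range(n)]    # e_j as columns
    return b, c


a = [[2.0, 1.0], [1.0, 3.0]]
b, c = gram_schmidt_decompose(a)
product = matmul(b, c)
assert all(abs(product[i][j] - a[i][j]) < 1e-12 for i in range(2) for j in range(2))
```

Since f_j is a combination of e_1, ..., e_j with a positive coefficient on e_j, the change of basis matrix C comes out upper triangular with positive diagonal, which is the content of Exercise 8.6.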

8.4 Components

Recall from 7.4.1 that every topological space can be decomposed into a number of

components, which are themselves connected. I repeatedly discussed the geometry
of O(2): a union of two circles. A circle S^1 is connected, so O(2) has two connected
components. This is typical:

Proposition. The group O(n) has two connected components, distinguished by
det A = ±1.

Remark. One can use Theorem 8.3 to show that GL(n, R) also has two connected
components, that are distinguished by det A > 0 and det A < 0; see Exercise 8.4. The
group O(1, 2) of all Lorentz matrixes has 4 components, as discussed in Exercise 8.5.

Proof. An orthogonal matrix has determinant ±1. (Compare 1.10; recall that I
called A direct if det A = 1 and opposite if det A = −1.) The function

    det : O(n) → {±1}

is continuous, so the two possibilities det A = ±1 determine two disjoint open and
closed sets of O(n). It remains to show that each of these sets is path connected.

Fix a matrix A ∈ O(n). By the normal form theorem 1.11, A can be written with
respect to a suitable orthonormal basis in the diagonal block form with 2 × 2 diagonal
blocks

$$B_i = \begin{pmatrix} \cos\theta_i & -\sin\theta_i \\ \sin\theta_i & \cos\theta_i \end{pmatrix},$$

and one optional block ±1. For t varying from 0 to 1, let A(t) be the matrix with the
same block form as A, but with blocks

$$B_i(t) = \begin{pmatrix} \cos t\theta_i & -\sin t\theta_i \\ \sin t\theta_i & \cos t\theta_i \end{pmatrix}.$$

The rule t ↦ A(t) gives a continuous path [0, 1] → O(n) joining A either to the
identity or to the element diag(1, ..., 1, −1). Therefore, the two subsets of O(n)
defined by det A = ±1 are both path connected. A path connected space is connected
by Lemma 7.4.1 (2). QED

The special orthogonal group is the group

    SO(n) = { A ∈ O(n) | det A = 1 }.

By the Proposition, this is a connected component of O(n). Since it is the kernel of a
group homomorphism det : O(n) → {±1}, it is also a normal subgroup of index 2 in
O(n).

In the special case n = 3, the elements of SO(3) can be described explicitly. By the
normal form theorem 1.11, any orthogonal 3 × 3 matrix of determinant 1 has the form

$$\begin{pmatrix} 1 & & \\ & \cos\theta & -\sin\theta \\ & \sin\theta & \cos\theta \end{pmatrix}$$

in a suitable basis. If l is the line through the origin with direction vector given by
the first basis element, then the motion of E^3 described by this matrix is the rotation
Rot(l, θ) around the line l. Hence SO(3) is the group of rotations of E^3 about axes
passing through O.

8.5 Quaternions, rotations and the geometry of SO(n)

As I discussed before, for n = 2 the group SO(2) is homeomorphic to the circle S^1.
The purpose here is to find a similar description of the special orthogonal groups
SO(3) and SO(4) in terms of the 3-sphere. I start with a small detour to introduce
the quaternions, the main protagonists in the game. Note that SO(n) is the group
of direct motions of E^n with a fixed point, or in other words the group of rotations
of E^n; hence the aim is to find a connection between quaternions and rotations (for
n = 3, 4).

8.5.1 Quaternions

The algebra of quaternions is the real vector space

$$\mathbb{H} = \{ a + bi + cj + dk \text{ with } a, b, c, d \in \mathbb{R} \},$$

with the multiplication law

$$i^2 = j^2 = k^2 = -1, \quad ij = k, \ jk = i, \ ki = j, \quad ji = -k, \ kj = -i, \ ik = -j.$$

The cyclic symmetry makes this easy to remember.

Some terminology, similar to the traditional language of complex numbers: if
q = a + bi + cj + dk, write q^* = a − bi − cj − dk for the conjugate quaternion.
We say that q is real if b = c = d = 0 and pure imaginary if a = 0.

Proposition

(1) H is an associative noncommutative R-algebra of dimension 4 over R.

(2) The conjugation q ↦ q^* is an antiinvolution, meaning
    (pq)^* = q^* p^* for all p, q ∈ H.

(3) |q|^2 = qq^* = q^* q = a^2 + b^2 + c^2 + d^2 is a positive definite quadratic form on H;
    therefore for any nonzero q ∈ H, the element
    q^{−1} = q^* / |q|^2
    is a 2-sided inverse of q. Hence H is a division algebra or skew field.

(4) If q ∈ H and q ∉ R, then q = A + BI with I pure imaginary, I^2 = −1 and A, B ∈ R.
    Hence the subalgebra R[q] of H generated by q is of the form R[q] ≅ C ⊂ H.

(5) If I is pure imaginary with I^2 = −1, there exist J, K ∈ H such that I, J, K have the
    same multiplication table as i, j, k, that is I^2 = J^2 = K^2 = −1 and IJ = K, etc.

Proof. (1) Noncommutativity is clear from the multiplication table: ij = k ≠ −k = ji.

Because everything is R-linear, it is enough to check the associative law a(bc) =
(ab)c for the basis elements a, b, c ∈ {1, i, j, k}. If any of a, b, c is 1 then it is OK.
By the cyclic symmetry, I can assume that the first term a = i; if only i appears, then
I am working in a copy of C. This leaves only 8 cases to check by brute force:

    i(ij) = ik = −j = (i^2)j;            i(ik) = i(−j) = −k = (i^2)k;
    i(ji) = i(−k) = j = ki = (ij)i;      i(j^2) = −i = kj = (ij)j;
    i(jk) = i^2 = −1 = k^2 = (ij)k;      i(ki) = ij = k = −ji = (ik)i;
    i(kj) = −i^2 = 1 = −j^2 = (ik)j;     i(k^2) = −i = −jk = (ik)k.

This is of course pure gobbledygook. A much more convincing argument is to say that
i, j, k are maps of something, such that multiplication coincides with composition of
maps, so is associative for a fundamental reason; see Exercise 8.8.

(2) Again because everything is R-linear, it is enough to check that (pq)^* = q^* p^*
for basis elements p, q ∈ {1, i, j, k}. The brute force method is an easy exercise:
(1i)^* = −i = (i^*)(1^*), (ij)^* = −k = (−j)(−i), etc.; see Exercise 8.9.

(3) On multiplying out the product (a + bi + cj + dk)(a − bi − cj − dk), the
terms a^2 + b^2 + c^2 + d^2 appear in the obvious way from the squared terms. The
cross terms all cancel out, either as (a × −bi) + (bi × a) = 0 or (bi × −cj) +
(cj × −bi) = −bc(i × j + j × i) = 0.

(4) Note that q + q^* = 2a and qq^* = |q|^2 ∈ R, so that q and q^* are the two roots
of a quadratic polynomial x^2 − 2ax + |q|^2 with real coefficients. Also, q − q^* =
2(bi + cj + dk) is pure imaginary, and an easy calculation similar to that in (3) shows
that (q − q^*)^2 = −4(b^2 + c^2 + d^2) < 0 (because q ∉ R), so that this has no real roots.
Thus q = A + BI where A = a, B = √(b^2 + c^2 + d^2) and I is pure imaginary with
I^2 = −1.

(5) is worked out as an exercise in Exercise 8.12. QED
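The brute force checks in (1), (2) and (3) are easy to mechanise. The following sketch represents a quaternion a + bi + cj + dk as the tuple (a, b, c, d); that representation and the names `qmul`, `qconj`, `ONE`, `I`, `J`, `K` are my own assumptions for illustration. The multiplication formula is just the table above expanded bilinearly.

```python
from itertools import product


def qmul(p, q):
    # Quaternion product, expanded from i^2 = j^2 = k^2 = -1, ij = k, jk = i, ki = j.
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,
    )


def qconj(q):
    a, b, c, d = q
    return (a, -b, -c, -d)


ONE, I, J, K = (1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)
BASIS = [ONE, I, J, K]

# (1) noncommutative, but associative on all 64 triples of basis elements.
assert qmul(I, J) == K and qmul(J, I) == (0, 0, 0, -1)
for x, y, z in product(BASIS, repeat=3):
    assert qmul(x, qmul(y, z)) == qmul(qmul(x, y), z)

# (2) conjugation reverses the order of products.
for x, y in product(BASIS, repeat=2):
    assert qconj(qmul(x, y)) == qmul(qconj(y), qconj(x))

# (3) q q* is the real number a^2 + b^2 + c^2 + d^2.
q = (1, 2, 3, 4)
assert qmul(q, qconj(q)) == (30, 0, 0, 0)
```

Integer coefficients are used so that every comparison is exact.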

Remark. (3) says that the Euclidean distance on R^4 = H is determined by the
algebra structure of H together with the antiinvolution q ↦ q^*. This has various nice
corollaries. For example, the direct sum decomposition

    H = {real quaternions} ⊕ {imaginary quaternions} = R ⊕ R^3

is orthogonal. Also, two imaginary vectors p, q anticommute, that is pq = −qp, if and only
if the corresponding vectors of R^3 are orthogonal. This point is the main reason that
quaternions can be applied to rotations of E^3 and E^4.

8.5.2 Quaternions and rotations

Write

$$U = \{\text{unit quaternions}\} = \{ q \in \mathbb{H} \mid qq^* = 1 \} = S^3 \subset \mathbb{R}^4$$

for the unit quaternions. Note that U has two structures: it is a group under
multiplication, and also has its own geometry as the sphere S^3. The two structures are
compatible as in 8.1. The group U generalises the multiplicative group of complex
numbers of modulus 1, which is the unit circle S^1 ⊂ C.

For the next theorem, identify H and its quadratic form |q|^2 with E^4 and its
Euclidean distance. The purely imaginary quaternions form a linear subspace which gets
identified with E^3.

Theorem

(1) For any p ∈ U, left multiplication a_p : x ↦ px defines a map H → H which is a
    direct motion of H = E^4 fixing the origin; the same holds for right multiplication
    b_q : x ↦ xq^*.

(2) The group homomorphism ϕ : U × U → SO(4) defined by
    ϕ(p, q) = a_p ∘ b_q : x ↦ pxq^*
    is surjective, and ϕ(p, q) = id_H if and only if (p, q) = (1, 1) or (p, q) = (−1, −1).

(3) For any q ∈ U, the map r_q : x ↦ qxq^* is a direct motion of H = E^4, which is the
    identity on real elements of H and takes pure imaginary quaternions of H to pure
    imaginary quaternions. Thus it defines a rotation of the subspace E^3 ⊂ H of pure
    imaginary quaternions.

(4) Any q ∈ U with q ∉ R has a unique expression in the form q = cos θ + I sin θ, where
    I ∈ U is a pure imaginary quaternion and θ ∈ (0, π). Then r_q = Rot(I, 2θ) is the
    rotation of R^3 about the directed axis defined by I through the angle 2θ.

(5) The group homomorphism π : U = S^3 → SO(3) defined by
    π(q) = r_q
    is surjective, and π(q_1) = π(q_2) if and only if q_1 = ±q_2.

Proof. (1) It is clear that a_p is a motion, since it fixes 0 and |px|^2 = |x|^2. Moreover,
it must be a direct motion, for example, because q ↦ det(a_q) is a continuous map from the
connected set U = S^3 to {±1}, and takes the value 1 at q = 1. (Several other proofs are
possible, see Exercise 8.15.)

I relegate (2) to Exercise 8.22.

(3) is obvious, since a ∈ R commutes with quaternion multiplication, so r_q(a) =
qaq^* = aqq^* = a. Also, if p^* = −p, then r_q(p) = qpq^* has (r_q(p))^* = (qpq^*)^* =
qp^*q^* = −qpq^*, so qpq^* is pure imaginary.

(4) follows from Proposition 8.5.1 (4): R[q] ≅ C. The equation x^2 = −1 has
exactly two roots ±I in C, and choosing the appropriate sign gives q = cos θ + I sin θ
with θ ∈ (0, π). Then r_q(I) = I follows because R[q] ≅ C is commutative, so that q^* = q^{−1} and
qIq^{−1} = I.

Now let J, K be as in Proposition 8.5.1 (5). Then

$$qJq^* = (\cos\theta + I\sin\theta)\,J\,(\cos\theta - I\sin\theta) = (\cos^2\theta - \sin^2\theta)J + (2\sin\theta\cos\theta)K,$$

and similarly qKq^* = −(2 sin θ cos θ)J + (cos^2 θ − sin^2 θ)K. Thus r_q fixes the
directed axis defined by I, and performs a rotation by 2θ in the plane spanned by J, K.

Finally (5) follows by (4); every rotation is hit exactly twice because of the
2θ. QED
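To see the angle doubling in (4) concretely, here is a small numerical sketch. Quaternions are again stored as tuples (a, b, c, d), and the helpers `qmul`, `qconj`, `unit_quaternion` and `rotate` are hypothetical names of my own. With axis I = k and θ = π/4, the map r_q should rotate the plane spanned by i and j through 2θ = π/2, so that i goes to j.

```python
import math


def qmul(p, q):
    # Quaternion product for tuples (a, b, c, d) = a + bi + cj + dk.
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,
    )


def qconj(q):
    a, b, c, d = q
    return (a, -b, -c, -d)


def unit_quaternion(axis, theta):
    # q = cos(theta) + I sin(theta), where I is the unit pure imaginary
    # quaternion in the direction of `axis`.
    length = math.sqrt(sum(x * x for x in axis))
    s = math.sin(theta) / length
    return (math.cos(theta), axis[0] * s, axis[1] * s, axis[2] * s)


def rotate(q, v):
    # r_q(x) = q x q^* applied to the pure imaginary quaternion with vector part v.
    r = qmul(qmul(q, (0.0, v[0], v[1], v[2])), qconj(q))
    return r[1:]  # the real part r[0] vanishes up to rounding


q = unit_quaternion((0.0, 0.0, 1.0), math.pi / 4)
image = rotate(q, (1.0, 0.0, 0.0))
assert all(abs(x - y) < 1e-12 for x, y in zip(image, (0.0, 1.0, 0.0)))

# q and -q define the same rotation, as in part (5).
minus_q = tuple(-x for x in q)
assert all(abs(x - y) < 1e-12 for x, y in zip(rotate(minus_q, (1.0, 0.0, 0.0)), image))
```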

8.5.3 Spheres and special orthogonal groups

After all this algebra, come the relations between groups of rotations and the sphere
S^3.

Corollary

(1) There is a homeomorphism

    SO(3) ≃ S^3/∼,

where ∼ is the equivalence relation on S^3 that identifies antipodal points x and −x.

(2) There is a homeomorphism

    SO(4) ≃ (S^3 × S^3)/≈,

where ≈ is the equivalence relation on S^3 × S^3 that identifies (x, y) with (−x, −y).

Proof. Both statements are direct corollaries of the previous theorem together
with Theorem 7.14 and the definition of the quotient topology and its UMP discussed
in 7.5.

In more detail, by Theorem 8.5.2 (5) there is a continuous surjective map
π : S^3 → SO(3), with π(x) = π(y) if and only if x = y or x = −y. By the
universal mapping property 7.5 of the quotient topology, there is consequently an induced
continuous map π̄ : (S^3/∼) → SO(3) that is clearly a bijection. Now S^3 is compact,
and therefore so is S^3/∼ by Proposition 7.4.3. Also the subspace topology of
SO(3) ⊂ R^9 = {3 × 3 matrixes} is metric and therefore Hausdorff. Therefore all the
assumptions of Theorem 7.14 are satisfied, π̄ is a homeomorphism, and (1) follows.
(2) is proved in exactly the same way using the map ϕ : U × U → SO(4) of
Theorem 8.5.2 (2). QED

Remark. The statements of the corollary generalise for all n; namely, there exists
a compact topological group Spin(n) called the spinor group with a surjective
homomorphism π : Spin(n) → SO(n) with kernel ⟨ι⟩ of order 2, so that π induces
an isomorphism of groups Spin(n)/⟨ι⟩ → SO(n) that is also a homeomorphism [15].
The pleasant thing about low dimensions is the fact that the spinor groups are spheres
or products of spheres: Spin(2) ≃ S^1, Spin(3) ≃ S^3, Spin(4) ≃ S^3 × S^3.

8.6 The group SU(2)

In this brief section, I identify the group U of unit quaternions of 8.5 as a matrix

group. This involves more linear algebra over the complex numbers, a subject that

already made a brief but important appearance in 1.11.

Let V be a 2-dimensional C-vector space together with a positive definite Hermitian
form, represented in some basis by |z_1|^2 + |z_2|^2, or the matrix
$\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$ (see B.6 for more
details on Hermitian forms). A complex linear transformation of V that preserves this
form is unitary: thus a matrix A ∈ GL(2, C) is unitary if it satisfies ${}^h\!A\,A = I_2$, where
${}^h\!A$ is the Hermitian conjugate defined by $({}^h\!A)_{ij} = \overline{A_{ji}}$. The group of all such matrixes
is the unitary group U(2). I am interested in its subgroup, the special unitary group

    SU(2) = { A ∈ U(2) | det A = 1 }.

As matrix groups, both U(2) and SU(2) are topological groups in an obvious way.

Remark. A unitary matrix A has |det A| = 1; see Exercise B.4. Thus the set of
possible values for the determinant is the unit circle S^1, which is connected. Thus
SU(2) is a normal subgroup, but not a connected component of U(2) in the same way
as SO(2) is in O(2).

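I write out explicitly the condition for a matrix A ∈ GL(2, C) to be special unitary
(compare 1.11.1). If $A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$, the equations are

$$a\bar a + c\bar c = 1, \quad b\bar b + d\bar d = 1, \quad \bar a b + \bar c d = 0, \quad \text{and} \quad \det A = ad - bc = 1. \tag{1}$$

One solves these equations more-or-less as in 1.11.1 to get $d = \bar a$ and $c = -\bar b$, where
$a\bar a + b\bar b = 1$; see Exercise 8.20. Thus

$$SU(2) = \left\{ \begin{pmatrix} a & b \\ -\bar b & \bar a \end{pmatrix} \;\middle|\; a, b \in \mathbb{C},\ |a|^2 + |b|^2 = 1 \right\}.$$

This description has an important corollary.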

This description has an important corollary.

Corollary. The map $\begin{pmatrix} a & b \\ -\bar b & \bar a \end{pmatrix} \mapsto a + bj$ defines an isomorphism from SU(2) to the
group U of unit quaternions of 8.5.2.

Proof. Write a = a_1 + a_2 i and b = b_1 + b_2 i. Then a + bj = a_1 + a_2 i + b_1 j +
b_2 k using quaternion multiplication. The condition |a|^2 + |b|^2 = 1 becomes |a_1|^2 +
|a_2|^2 + |b_1|^2 + |b_2|^2 = 1, hence a + bj has quaternion norm 1. The map SU(2) → U
is clearly a bijection. It remains to check that the map respects multiplication, so that
it becomes a group isomorphism; this is a special case of Exercise 8.14. QED
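The multiplicativity left to Exercise 8.14 can at least be spot-checked numerically. The sketch below is an illustration only: it uses Python's built-in complex numbers for the entries a, b, stores a quaternion as a tuple (a_1, a_2, b_1, b_2), and the helper names `su2`, `matmul2`, `to_quaternion`, `qmul` and `sample` are my own. It compares the image of a matrix product with the product of the images.

```python
import cmath
import math


def su2(a, b):
    # The matrix with rows (a, b) and (-conj(b), conj(a)).
    return [[a, b], [-b.conjugate(), a.conjugate()]]


def matmul2(m, n):
    return [[m[i][0] * n[0][j] + m[i][1] * n[1][j] for j in range(2)]
            for i in range(2)]


def to_quaternion(m):
    # a + bj = a1 + a2 i + b1 j + b2 k, read off from the first row of m.
    a, b = m[0][0], m[0][1]
    return (a.real, a.imag, b.real, b.imag)


def qmul(p, q):
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,
    )


def sample(t, alpha, beta):
    # An element of SU(2): |a|^2 + |b|^2 = cos^2 t + sin^2 t = 1.
    return su2(cmath.rect(math.cos(t), alpha), cmath.rect(math.sin(t), beta))


m, n = sample(0.3, 0.5, 1.1), sample(1.2, -0.4, 2.0)
lhs = to_quaternion(matmul2(m, n))
rhs = qmul(to_quaternion(m), to_quaternion(n))
assert all(abs(x - y) < 1e-12 for x, y in zip(lhs, rhs))
```

The check works because jc = c̄j for a complex number c, so (a + bj)(c + dj) = (ac − bd̄) + (ad + bc̄)j, which matches the first row of the matrix product.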

Theorem 8.5.2 (5) on the description of SO(3) can thus be reformulated as saying
that there exists a two-to-one surjective group homomorphism SU(2) → SO(3)
(compare also Exercise 8.3). The two groups are now matrix groups (over different
fields), but the existence of the two-to-one map is by no means obvious from the
matrix description: the most convincing way of going from complexes to reals is via
quaternions.

8.7 The electron spin in quantum mechanics

This section relates the geometry of SO(3) to a fundamental attribute of elementary
particles: their spin. All the mathematics needed is at hand already; however, there is no
space in the present book to introduce all the necessary background from quantum
mechanics. For more information and insight, see Feynman's classic [7], Chapters 1–3.

8.7.1 The story of the electron spin

The story begins in 1925. Two Dutch doctoral students George Uhlenbeck and Samuel
Goudsmit, halfway through their Ph.D. program, noted that the electron inside the
atom appeared to have, besides the three known 'quantum numbers' associated with
the position of the electron, its angular momentum around the nucleus and its magnetic
field, an extra degree of freedom. They postulated the existence of an extra 'quantum
number', which they called the electron spin. This new quantum number seemed
to behave in many ways like angular momentum, so they gave the interpretation
that it corresponds to some kind of intrinsic rotational motion. However, the quantum
number appeared to have just two possible values (+) and (−), and the rotation seemed
not to have a definite axis; strange facts for a 'spinning' particle. Their advisor Paul
Ehrenfest is said to have commented: 'You are both young enough to be able to afford
a stupidity!' (he realised soon afterwards though that his students had in fact made
an important discovery).

Unknown to Uhlenbeck and Goudsmit, the experimental verification of their
discovery had been around for three years in the form of the Stern–Gerlach experiment.
In 1922 the German scientists Otto Stern and Walther Gerlach built the device
illustrated schematically in Figure 8.7a. The source emits a beam of silver atoms. The
beam is directed between the poles of a magnet, which produces a magnetic field
orthogonal to the direction of the path. As the atoms are electrically neutral, they
are not expected to experience force; they should thus pass through the device
without any change in their direction. However, a screen on the other side of the device
reveals that the atoms are in fact deflected by the magnetic field, and moreover that
they follow one of two possible paths.

Figure 8.7a The Stern–Gerlach experiment. (Labels: magnet with poles N and S, beam of silver atoms, screen, silver atoms with (+) spin, silver atoms with (−) spin.)

The experiment can only be understood in terms of the notion of spin. A silver
atom has an electron on an outer shell, whose intrinsic spin interacts with the magnetic
field. Atoms whose outer electron is in the (+) spin state follow a different path from
those in the (−) spin state.

The mid-1920s was of course the time when quantum mechanics was invented.
Soon after Uhlenbeck and Goudsmit's proposal, Pauli and Dirac incorporated
electron spin into the quantum mechanical theory of the electron, also known as the
Schrödinger equation. Since this is not a course about the electron, I do not need to
worry unduly with the details.

8.7.2 Measuring spin: the Stern–Gerlach device

In the following, I assume a modified form of the Stern–Gerlach (SG) device,
illustrated in Figure 8.7b. This is only a thought experiment¹, explained in detail in [7],
pp. 5-1 and 5-2. An electron beam arrives from the left, and separates inside the device
S into two beams according to its spin under the action of the left-hand 'magnet'. A
combination of other 'magnets' forces the electrons back into their horizontal path;
the outcoming beam still consists of a mixture of electrons in the two spin states.

Figure 8.7b The modified Stern–Gerlach device. (Labels: electron beam, magnet poles N S N above and S N S below.)

Assume now that I block the path of one of the beams inside the device, as in the
case of device S of Figure 8.7c. Then the electrons leaving the device S are all in a
definite spin state (+). In this sense, I have now 'measured' the spin of this beam of
electrons: I know precisely what state they are in. (Unfortunately, I have lost about
half my electrons along the way, but that seems to be unavoidable in this kind of game.
Compare with a large accountancy firm hired to count your money.) In particular, if I
attach another SG device S′ in the same position after the first as in Figure 8.7c, then
I know the path of all the electrons inside the device; blocking the other path then
makes no difference.

Figure 8.7c Two identical SG devices S and S′.

¹ The experiment cannot be carried out as described here: the electron's wave function is too fuzzy because
of quantum mechanical effects, and the separation into two rays is not apparent. The point about the silver
atom featuring in the original Stern–Gerlach experiment is that it is electrically neutral, but has a relatively
free electron on an outer shell; its motion between magnets is thus governed by the spin of the outer
electron. In the text I stick to the thought experiment involving free electrons.

However, let us now put another SG device T in a different spatial position in the
path of my uniform spin electron ray; see Figure 8.7d. The ray now separates again;
the electrons choose two different paths in a specific ratio (which can be measured
again by blocking one or other of the paths) depending on the position of the new
SG device. Hence knowing that the electron is in spin state (+) in one direction does
not mean that it is in spin state (+) in all directions. It registers as spin (+) or (−) in
some different direction following, it seems, a fixed dress code.

8.7.3 The spin operator

As both experiment and speculation confirm, the electron spin takes two possible
values +1 and −1, where I ignore unnecessary constants. In the framework of quantum
mechanics, such a two-state system is modelled on a 2-dimensional complex vector
space V with a definite Hermitian form on it, which I denote by bracket ( , ). Every
electron in this simple model is described by its wave function ψ ∈ V, which we
normalise to unit length (ψ, ψ) = 1.

Figure 8.7d Two different SG devices S and T.

An SG device S in a fixed spatial position corresponds to a linear operator
O_S : V → V. The possible spin states with respect to this spatial direction
correspond to the different eigenvalues of this map. In the present case, the eigenvalues
must therefore be ±1. There are corresponding normalised eigenvectors $\psi_S^+$ and $\psi_S^-$:

$$O_S(\psi_S^+) = \psi_S^+, \qquad O_S(\psi_S^-) = -\psi_S^-.$$

Quantum mechanics postulates that the operator O_S is Hermitian (Exercise 8.24).
It follows that the eigenvectors are orthogonal: $(\psi_S^+, \psi_S^-) = 0$. Thus $\{\psi_S^+, \psi_S^-\}$ is a
Hermitian basis in the 2-dimensional vector space V.

The electron with wave function $\psi_S^+$ is in the (+) spin state and that with wave
function $\psi_S^-$ is in the (−) spin state. These electrons are in eigenstates of the spin
operator O_S. An arbitrary electron has a wave function ψ ∈ V which is a linear
combination of the basis vectors:

$$\psi = \alpha\psi_S^+ + \beta\psi_S^-.$$

Such a state is referred to as a mixed state.

An electron in a mixed state $\psi = \alpha\psi_S^+ + \beta\psi_S^-$ arriving at our SG device S passes
along the (+) or (−) path in the device with probability |α|^2 or |β|^2 respectively.
These numbers are called probability amplitudes. Because both basis vectors $\psi_S^\pm$ and
the vector ψ are normalised to unit length, |α|^2 + |β|^2 = 1; thus these probabilities
add to one.
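The probability rule is simple enough to compute directly. In the sketch below, which is an illustration only, the function name `path_probabilities` is hypothetical and the coefficients α, β are passed as Python complex numbers. As a convenience that is my own assumption rather than part of the text, the function normalises the pair first, so unnormalised coefficients are also accepted.

```python
def path_probabilities(alpha, beta):
    """Probabilities of the (+) and (-) paths for psi = alpha psi_S^+ + beta psi_S^-."""
    norm_sq = abs(alpha) ** 2 + abs(beta) ** 2
    if norm_sq == 0:
        raise ValueError("the zero vector is not a wave function")
    # Dividing by (psi, psi) normalises psi to unit length.
    return abs(alpha) ** 2 / norm_sq, abs(beta) ** 2 / norm_sq


# An eigenstate takes the (+) path with certainty.
assert path_probabilities(1, 0) == (1.0, 0.0)

# Equal moduli give equal probabilities, whatever the complex phases.
p_plus, p_minus = path_probabilities(1, 1j)
assert abs(p_plus - 0.5) < 1e-12 and abs(p_minus - 0.5) < 1e-12
```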

Once we block the (−) path, the outcoming electrons are all in the (+) eigenstate:
their wave function is the eigenvector $\psi_S^+ \in V$. This explains their behaviour in a
next SG device S′ in the same spatial position as S, pictured in Figure 8.7c. The
operator corresponding to the device S′ is O_{S′} = O_S, and the electrons are all in the
(+) eigenstate of this operator. So they choose the two paths with probability |α|^2 = 1,
respectively |β|^2 = 0; in other words, their path through S′ is determined.

8.7.4 Rotate the device

To perform our next thought experiment, imagine a beam of electrons leaving a device
in one of the definite eigenstates, and arriving at another device in a different spatial
position as in Figure 8.7d. The new SG device T corresponds to an operator O_T and
hence to a new Hermitian basis $\{\psi_T^+, \psi_T^-\}$ of V consisting of eigenvectors of O_T.

I wish to study an electron ray in one of the spin eigenstates $\psi_S^\pm$, when it passes
through T. The experiment says that electrons will follow one of two possible paths
in T, and I want the probability of its taking one or other of the paths. According to
the rule spelled out in the last section, I should write the vector $\psi_S^+$ (and also $\psi_S^-$) in
terms of the new basis $\{\psi_T^+, \psi_T^-\}$ to find the probability amplitudes. This is simply a
change of basis, given by a 2 × 2 matrix $A_{S \to T}$, an element of GL(2, C) (in fact U(2)
as both bases are Hermitian). The task is to find $A_{S \to T}$ from S, T.

To proceed, I need to make precise the geometry of an SG device in 3-space. Note that an SG device in physical space $\mathbb{E}^3$ determines two distinguished orthogonal directed lines; namely, there is the distinguished direction of the electron beam, and the distinguished direction of the magnetic field orthogonal to it; see Figure 8.7a. I can think of these directed lines as two coordinate axes in a coordinate system, and there is a unique way of adding a third directed coordinate axis orthogonal to the first two to make a right-handed coordinate system in 3-space. The new system $T$ determines in the same way a new right-handed coordinate system in $\mathbb{E}^3$. The transformation which gets me from $S$ to $T$ is a direct motion of $\mathbb{E}^3$, and thus a rotation $g \in \mathrm{SO}(3)$. (Note that only directions matter in this discussion; the origin of the coordinate system is not important, and I ignore translations.)

According to the earlier discussion, I need a recipe associating an element of $\mathrm{GL}(2, \mathbb{C})$ with a transformation $S \to T$, presumably in a continuous manner. In other words, I need a map

$$A : \mathrm{SO}(3) \to \mathrm{GL}(2, \mathbb{C}).$$

It can also be argued from basic principles of quantum mechanics that the map $A$ should respect composition; after all, $S \to T$ followed by $T \to R$ should be the same as $S \to R$. Hence the map $A$ should be a group homomorphism. This however presents a puzzle: there is no obvious way to map $\mathrm{SO}(3)$ to the group of linear maps on a 2-dimensional $\mathbb{C}$-vector space (apart from the map which takes every rotation to the identity matrix, which would contradict the experimentally observed fact that spin does depend on direction). In fact there is no such continuous homomorphism at all, other than this trivial one.

8.7.5 The solution

Although the expressions for $\psi_T^\pm$ in terms of $\psi_S^\pm$ and the rotation taking $S$ to $T$ can be derived from first principles, I cannot improve on Feynman’s beautiful and self-contained account (in pp. 6-1 to 6-14 of [7]), and I just state the result: namely, although there is no map $A : \mathrm{SO}(3) \to \mathrm{GL}(2, \mathbb{C})$, there is an obvious map

$$A : \mathrm{SU}(2) \to \mathrm{GL}(2, \mathbb{C})$$

from the group $\mathrm{SU}(2)$ to $\mathrm{GL}(2, \mathbb{C})$; a $2 \times 2$ unitary matrix is certainly invertible, so the inclusion map will do. On the other hand, $\mathrm{SU}(2)$ is not too different from $\mathrm{SO}(3)$; by Corollary 8.5.3, they are related by a two-to-one map. Thus $A$ can be thought of as a two-valued function on $\mathrm{SO}(3)$.

Up to a knowledge of the explicit form of the map $\mathrm{SU}(2) \to \mathrm{SO}(3)$ that can easily be derived from the expressions in 8.5.2, this answers the original question of how to compute the ratio of electrons following the two paths of Figure 8.7d: $S \to T$ is given by an element of $\mathrm{SO}(3)$, and there are two possible changes of basis

$$\psi_T^+ = \alpha_+\psi_S^+ + \beta_+\psi_S^-$$

$$\psi_T^- = \alpha_-\psi_S^+ + \beta_-\psi_S^-$$

for matrixes

$$\begin{pmatrix} \alpha_+ & \alpha_- \\ \beta_+ & \beta_- \end{pmatrix} \in \mathrm{SU}(2)$$

which differ from each other only in a change of sign; the eigenvectors are in any case determined only up to sign, and the physical meaning is only carried by the amplitudes $|\alpha_\pm|$ and $|\beta_\pm|$ which are independent of the choice of signs made.
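The two-to-one relation can be illustrated numerically. The sketch below identifies $\mathrm{SU}(2)$ with the unit quaternions $a + bi + cj + dk$ and uses the standard matrix of the rotation $x \mapsto q x q^*$ on pure imaginary quaternions; this is an assumption about conventions, and the formulas of 8.5.2 may be normalised differently. The function name and the sample quaternion are hypothetical.

```python
import math

def rotation_from_unit_quaternion(q):
    """3x3 matrix of x -> q x q* on pure imaginary quaternions, for a unit
    quaternion q = (a, b, c, d). Every entry is quadratic in q."""
    a, b, c, d = q
    return (
        (a*a + b*b - c*c - d*d, 2*(b*c - a*d), 2*(b*d + a*c)),
        (2*(b*c + a*d), a*a - b*b + c*c - d*d, 2*(c*d - a*b)),
        (2*(b*d - a*c), 2*(c*d + a*b), a*a - b*b - c*c + d*d),
    )

# Sample unit quaternion: cos(pi/4) + sin(pi/4) k, and its negative.
q = (math.cos(math.pi / 4), 0.0, 0.0, math.sin(math.pi / 4))
minus_q = tuple(-x for x in q)

r_plus = rotation_from_unit_quaternion(q)
r_minus = rotation_from_unit_quaternion(minus_q)
same_rotation = (r_plus == r_minus)
print(same_rotation)
for row in r_plus:
    print([round(x, 12) for x in row])
```

Because each matrix entry is a quadratic expression in $a, b, c, d$, replacing $q$ by $-q$ changes nothing; in this sample both give the quarter turn about the third axis.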

One way to think of the process is to start with an SG device $S$ and then start to turn it around a fixed axis. This determines a path in the group $\mathrm{SO}(3)$ starting from the identity. Starting from the identity matrix in $\mathrm{SU}(2)$, I can follow this path in $\mathrm{SU}(2)$, and see what happens to the transformation matrix. It turns out that after a full turn by $2\pi$ of my device, that is, after a loop in $\mathrm{SO}(3)$ returning to the identity, my path in $\mathrm{SU}(2)$ takes me to the negative of the identity matrix. Following the loop in $\mathrm{SO}(3)$ once again, I can continue my path in $\mathrm{SU}(2)$, and lo and behold! a turn of $4\pi$ returns me to the identity matrix in $\mathrm{SU}(2)$.
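This behaviour of paths can be checked in a small computation. The sketch assumes the usual lift of the rotation through angle $\theta$ about a fixed axis, namely the unit quaternion $\cos(\theta/2) + \sin(\theta/2)\,k$ (again with $\mathrm{SU}(2)$ identified with the unit quaternions); the function names are illustrative only.

```python
import math

def rotation_about_axis(theta):
    """Rotation of 3-space through angle theta about the third axis."""
    c, s = math.cos(theta), math.sin(theta)
    return ((c, -s, 0.0), (s, c, 0.0), (0.0, 0.0, 1.0))

def lifted_path(theta):
    """Assumed lift to the unit quaternions: cos(theta/2) + sin(theta/2) k,
    written as the 4-tuple (a, b, c, d)."""
    return (math.cos(theta / 2), 0.0, 0.0, math.sin(theta / 2))

for turns in (0, 1, 2):
    theta = 2 * math.pi * turns
    rotation = [[round(x, 12) for x in row] for row in rotation_about_axis(theta)]
    lift = [round(x, 12) for x in lifted_path(theta)]
    print(turns, rotation, lift)
```

After one full turn the rotation is back at the identity matrix while the lift sits at the quaternion $-1$; only after two full turns does the lift return to $+1$.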

This thought experiment with paths reflects the topological fact that the fundamental group of $\mathrm{SO}(3)$ is $\mathbb{Z}/2$, and its universal cover is the map $S^3 \to \mathrm{SO}(3)$ of 8.5–8.6 (see a first course in topology for the language). It is also responsible for the mysterious statement turning up frequently in physics texts, that ‘rotation by $2\pi$ does not leave the wave function of the electron invariant, but multiplies it by $(-1)$’. As I am told, this can be directly demonstrated by experiment.

As a final comment, note that in this chapter I dealt with spin for a ‘spin $\frac{1}{2}$’ particle such as the electron, whose spin can take two values (+) or (−). There are also ‘spin 1’ particles such as the heavy particles $Z$, $W^\pm$ which are responsible for the weak nuclear force. Their spin can take the values (+), 0 or (−). Much of the discussion of this chapter applies to such three-state systems; compare [7], Chapter 5. Their spin can be measured by a three-way SG device. The vector space $W$ representing spin states is now 3-dimensional over $\mathbb{C}$, and the transformation $S \to T$ between SG devices corresponds to a map $B : \mathrm{SO}(3) \to \mathrm{GL}(3, \mathbb{C})$. In this case, there is no great mystery: this map is, up to conjugation, the obvious inclusion map, where I think of a $3 \times 3$ real orthogonal matrix as a $3 \times 3$ complex invertible matrix (the ‘vector representation’). For this reason spin 1 particles are often called ‘vector particles’.

8.8 Preview of Lie groups

The topological groups $\mathrm{GL}(n, \mathbb{R})$ and $\mathrm{O}(n)$ are examples of Lie groups, groups whose elements depend on a finite number of continuous parameters. Further examples of Lie groups include the Euclidean group $\mathrm{Eucl}(n)$, the Lorentz group $\mathrm{O}^+(1, 2)$, the special linear group $\mathrm{SL}(n)$ (the group of invertible $n \times n$ matrixes with determinant 1), the spinor groups $\mathrm{Spin}(n)$, and groups defined using the complex numbers such as the group $\mathrm{GL}(n, \mathbb{C})$ of invertible matrixes over $\mathbb{C}$. Here is a list of features of general Lie group theory that made an appearance in this chapter:

Dimension. The geometry of the group around any point can be described by $d$ parameters, where the number $d$ is independent of the point chosen, and is called the dimension of the group. Examples from Proposition 8.2 are

$$\dim \mathrm{O}(n) = \binom{n}{2} \quad\text{and}\quad \dim \mathrm{Eucl}(n) = \binom{n+1}{2}.$$
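For small $n$ these two binomial coefficients are easy to tabulate. The fragment below is only an illustrative sketch; the function names are hypothetical, and the formulas are the ones quoted from Proposition 8.2.

```python
from math import comb

def dim_orthogonal(n):
    """Dimension of O(n): the binomial coefficient n choose 2."""
    return comb(n, 2)

def dim_euclidean(n):
    """Dimension of Eucl(n): the binomial coefficient (n + 1) choose 2."""
    return comb(n + 1, 2)

for n in range(1, 5):
    print(n, dim_orthogonal(n), dim_euclidean(n))
```

The difference $\dim \mathrm{Eucl}(n) - \dim \mathrm{O}(n) = n$ is the number of parameters of a translation, and $\dim \mathrm{O}(2) = 1$ agrees with the description of $\mathrm{O}(2)$ by a single angle.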

Components. A Lie group $G$ has a number of connected components (finite or infinite), all of them geometrically the same (homeomorphic). The component containing the identity is a normal subgroup, and the other components are its cosets. See 8.4 for $\mathrm{O}(n)$ and Exercise 8.5 for the group $\mathrm{O}^+(1, 2)$.

Maximal compact subgroup. A connected Lie group $G$ is homeomorphic to a product $H \times \mathbb{R}^N$ of a compact Lie group $H$ and a space $\mathbb{R}^N$ in which all loops are contractible (compare 7.15). The examples of 8.3 are typical: compactness is achieved by imposing a positive definite orthogonal or Hermitian form.

The universal cover. A connected Lie group $G$ has a cover $\widetilde{G} \to G$ by a simply connected Lie group $\widetilde{G}$ (possibly $G$ itself). The typical examples are the exponential map $\mathbb{C} \to \mathbb{C}^*$ and the two-to-one spinor covers $S^3 \to \mathrm{SO}(3)$ and $S^3 \times S^3 \to \mathrm{SO}(4)$ discussed in 8.5.3.

Complexification and real forms. The group $\mathrm{GL}(n, \mathbb{C})$ is the complexification of the group $\mathrm{GL}(n, \mathbb{R})$: the latter is a matrix group, and I can simply take complex instead of real entries. Conversely, we say that $\mathrm{GL}(n, \mathbb{R})$ is a real form of $\mathrm{GL}(n, \mathbb{C})$. Along the same lines, the group $\mathrm{O}(n, \mathbb{C})$ of $n \times n$ complex matrixes, which leave the standard quadratic (!) form $\sum_i x_i^2$ invariant, is a complexification of the group $\mathrm{O}(n)$. However, $\mathrm{O}(n)$ is not the only real form: over the complex numbers, there is no difference between the forms $\sum_i x_i^2$ and $-x_1^2 + \sum_{i>1} x_i^2$. Thus the Lorentz group $\mathrm{O}(1, n-1)$ is also a real form of $\mathrm{O}(n, \mathbb{C})$.

Linear representations. Just like finite groups, Lie groups are often studied via their linear (matrix) representations. In plain language, we associate to every group element $g \in G$ an $n \times n$ (complex) matrix $A_g$ so that $A_h A_g = A_{hg}$. In fancier language, this is nothing but a group homomorphism $G \to \mathrm{GL}(n, \mathbb{C})$; one familiar example is the map $A : \mathrm{SU}(2) \to \mathrm{GL}(2, \mathbb{C})$ from 8.7.5. I recommend Fulton and Harris [9] for further study.

Symmetry groups in physics. Lie groups commonly appear as symmetry groups of interesting physical systems. The mathematics of the group and the physics of the system are often related in beautiful and nontrivial ways. The interaction occurs on two levels: ‘classical’ (meaning Newtonian dynamics and Maxwell electromagnetic theory) and ‘modern’ (meaning relativity theory or quantum mechanics, possibly both). The story of the electron in 8.7.5 is the starting point of the ‘quantum’ level of this interaction; for more discussion, turn to 9.3 and Sternberg [23].

Exercises

8.1 How much bigger is the affine group $\mathrm{Aff}(n)$ than the Euclidean group $\mathrm{Eucl}(n)$? [Hint: compare $\mathrm{GL}(n)$ and $\mathrm{O}(n)$ in 8.3.]

8.2 (a) Show that rotations, translations, reflections and glides of $\mathbb{E}^2$ (Theorem 1.14) depend respectively on 3, 2, 2 and 3 parameters.

(b) Count parameters for each of the types of motion of Theorem 1.15. (Answers: (1) translation 3; (2) rotation 5; (3) twist 6; (4) reflection 3; (5) glide 5; (6) rotary reflection 6. For example, a rotation is specified by a line of 3-space, which depends on 4 parameters, plus an angle.)

8.3 Count the number of real parameters for the groups SO(3) and SU(2); verify that they

depend on the same number of parameters, as you would expect from the two-to-one

cover discussed in 8.6. [Hint: use Proposition 8.2, respectively the results of 8.6.]

8.4 Determine the connected components of $\mathrm{GL}(n, \mathbb{R})$ using Theorem 8.3 and Proposition 8.4.

8.5 Let

$$\mathrm{O}(1, 2) = \bigl\{ A \in \mathrm{GL}(3, \mathbb{R}) \bigm| {}^t\!A J A = J \bigr\}$$

be the group of all Lorentz matrixes, which contains the Lorentz group $\mathrm{O}^+(1, 2)$ introduced in 8.1, Example 5. Show that this group has four connected components, distinguished by whether a matrix preserves the cone $q_L(v) < 0$ or maps it to $q_L(v) > 0$ (that is, whether it is in $\mathrm{O}^+(1, 2)$), and $\det A = \pm 1$. [Hint: imitate the proof of Proposition 8.4, using the Lorentz normal form statement of Exercise B.3. Distinguish carefully between four types of possible diagonal matrixes arising as end products.]

8.6 Let $A \in \mathrm{GL}(n, \mathbb{R})$ be a matrix with columns $f_i$. Following the proof of Theorem B.3 (1) carefully, show that it is possible to construct an orthonormal basis $\{e_i\}$ of $\mathbb{R}^n$, so that in each step

$$e_i = c_{i1} f_1 + \cdots + c_{ii} f_i$$

with $c_{ii} > 0$. Let $C = (c_{ij})$ and $B$ the matrix with columns $e_i$; check that $A = BC$ and that $B \in \mathrm{O}(n)$, $C \in T_+(n)$ (compare 8.3). Check also that the entries of $B$ and $C$ depend continuously on those of $A$.
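A numerical experiment may help before attempting the proof. The sketch below runs the Gram–Schmidt process on the columns of a matrix and returns an orthogonal factor and a triangular factor. It rests on assumptions: $T_+(n)$ is taken to mean upper triangular matrixes with positive diagonal entries, the triangular factor here records the $f_j$ in terms of the $e_i$ (which need not be the same bookkeeping as the coefficients $c_{ij}$ above), and the sample matrix is hypothetical rather than one from the exercises.

```python
import math

def orthogonal_times_triangular(a):
    """Factor an invertible real matrix a (a list of rows) as a = b c, where
    b has orthonormal columns e_j and c is upper triangular with c[j][j] > 0."""
    n = len(a)
    columns = [[a[i][j] for i in range(n)] for j in range(n)]   # the vectors f_j
    es = []                                   # orthonormal vectors found so far
    c = [[0.0] * n for _ in range(n)]
    for j, f in enumerate(columns):
        rest = list(f)
        for i, e in enumerate(es):
            c[i][j] = sum(x * y for x, y in zip(e, f))      # component of f_j along e_i
            rest = [x - c[i][j] * y for x, y in zip(rest, e)]
        c[j][j] = math.sqrt(sum(x * x for x in rest))       # positive, as a is invertible
        es.append([x / c[j][j] for x in rest])
    b = [[es[j][i] for j in range(n)] for i in range(n)]    # matrix with columns e_j
    return b, c

a = [[3.0, 1.0], [4.0, 2.0]]      # hypothetical sample matrix
b, c = orthogonal_times_triangular(a)
product = [[sum(b[i][k] * c[k][j] for k in range(2)) for j in range(2)] for i in range(2)]
print(b)
print(c)
print(product)
```

Each step uses only sums, products, a square root and a division by a nonzero number, which is one way to see why the factors depend continuously on the entries of the matrix.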

8.7 Write the following matrixes in the form $BC$ of Theorem 8.3 with $B \in \mathrm{O}(n)$ and $C \in T_+(n)$:

ļ£« ļ£¶

ā 103

1+ ā

1 3 13 ļ£2 ā’1 4ļ£ø .

ā , ,

ā’1 + 3 14

3

212


Exercises on quaternions.

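8.8 Show that the 4 complex matrixes

$$1 = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \quad I = \begin{pmatrix} i & 0 \\ 0 & -i \end{pmatrix}, \quad J = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}, \quad K = \begin{pmatrix} 0 & i \\ i & 0 \end{pmatrix}$$

multiply together by the same rules as the 4 basic quaternions $1, i, j, k$. Since matrix multiplication is associative, use this to give a better proof of Proposition 8.5.1 (1).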
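A quick machine check of the multiplication rules can serve as a sanity test before writing out the argument by hand. This is a sketch only: the helper names are hypothetical, and the four matrixes are entered as $1 = \mathrm{diag}(1, 1)$, $I = \mathrm{diag}(i, -i)$, $J$ with rows $(0, 1)$, $(-1, 0)$ and $K$ with rows $(0, i)$, $(i, 0)$.

```python
def matmul(p, q):
    """Product of two 2x2 complex matrices given as nested tuples of rows."""
    return tuple(
        tuple(sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def neg(p):
    """Negative of a matrix."""
    return tuple(tuple(-x for x in row) for row in p)

ONE = ((1, 0), (0, 1))
I = ((1j, 0), (0, -1j))
J = ((0, 1), (-1, 0))
K = ((0, 1j), (1j, 0))

relations = {
    "I^2 = -1": matmul(I, I) == neg(ONE),
    "J^2 = -1": matmul(J, J) == neg(ONE),
    "K^2 = -1": matmul(K, K) == neg(ONE),
    "IJ = K": matmul(I, J) == K,
    "JK = I": matmul(J, K) == I,
    "KI = J": matmul(K, I) == J,
    "JI = -K": matmul(J, I) == neg(K),
}
for name, holds in relations.items():
    print(name, holds)
```

The check does not replace the exercise, which asks for the reason behind these identities and for the consequence for associativity.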

8.9 Complete the proof by brute force of $(pq)^* = q^* p^*$ for quaternion conjugation (Proposition 8.5.1 (2)). Give a better proof along the lines of the previous exercise.

8.10 Study the group $G_8 = \{\pm 1, \pm i, \pm j, \pm k\}$ of unit quaternions. Write out the group multiplication table, and find a convincing reason (or failing that, any reason) why $G_8$ is not isomorphic to the dihedral group $D_8$ appearing in Exercise 6.5.

8.11 If $p = ai + bj + ck$ and $q = di + ej + fk$ are two pure imaginary quaternions, calculate $pq + qp$ directly using the definition of quaternion multiplication.

8.12 Prove that a pure imaginary quaternion $p$ satisfies $p^2 = -|p|^2$. Also if $p, q$ are pure imaginary then $pq + qp = 0$ if and only if they are orthogonal with respect to the quadratic form $a^2 + b^2 + c^2 + d^2$. [Hint: orthogonal with respect to a quadratic form $Q$ is expressed in terms of the associated bilinear form $\varphi(p, q) = Q(p + q) - Q(p) - Q(q)$; apply this with $Q(q) = qq^* = -q^2$.]

Deduce that 3 vectors $I, J, K \in \mathbb{H}$ have the same multiplication table as the quaternion basis $i, j, k$ if and only if they are an oriented orthonormal frame of $\mathbb{R}^3$. Prove Proposition 8.5.1 (5).

8.13 Show how to express $\mathbb{C}$ in terms of $2 \times 2$ matrixes over $\mathbb{R}$ of the form $\begin{pmatrix} a & b \\ -b & a \end{pmatrix}$.

8.14 Show that the algebra of $2 \times 2$ matrixes over $\mathbb{C}$ of the form $\begin{pmatrix} a & b \\ -\bar{b} & \bar{a} \end{pmatrix}$ is an algebra isomorphic to the quaternions $\mathbb{H}$. [Hint: consider the basis given in Exercise 8.8 and compare also 8.6.]

8.15 Consider left multiplication by $M = \begin{pmatrix} a + ib & c + id \\ -c + id & a - ib \end{pmatrix}$ acting on $\mathbb{C}^2$. Write out the action of $M$ on $\mathbb{C}^2 = \mathbb{R}^4$ in terms of the $\mathbb{R}$-basis $(1, 0), (i, 0), (0, 1), (0, i)$ of $\mathbb{C}^2$. Prove that the determinant of the map on $\mathbb{R}^4$ is $(a^2 + b^2 + c^2 + d^2)^2$. Use this to give another proof that $a_q$ is direct in Theorem 8.5.2 (1).

8.16 Prove that $2 \times 2$ matrixes over $\mathbb{R}$ of the form $\begin{pmatrix} a & b \\ b & a \end{pmatrix}$ form an algebra $B$, and study its properties. Why is it not very interesting? [Hint: show that $B$ is closed under addition and multiplication of matrixes. Find a basis over $\mathbb{R}$, and write out the multiplication table.]

8.17 By analogy with the previous question, investigate the algebra of $2 \times 2$ matrixes over $\mathbb{C}$ of the form $\begin{pmatrix} a & b \\ -b & a \end{pmatrix}$.

8.18 Use the argument of Theorem 8.5.2 to find a unit quaternion $q$ so that the rotation $r_q : x \mapsto q x q^*$ is $(x, y, z) \mapsto (y, -x, z)$.

8.19 Find a unit quaternion $q$ so that the rotation $r_q : x \mapsto q x q^*$ is $x \to y \to z \to x$. [Hint: the effort intensive method is to use brute force. The thinking person’s method is to represent $x \to y \to z$ as a rotation through angle $\theta$ about directed axis $L$, then use Theorem 8.5.2.]

8.20 By analogy with 1.11.1, solve the relations (1) of 8.6 to get $d = \bar{a}$, $c = -\bar{b}$. [Hint: for example, do second line $\times\, d$ − third line $\times\, c$, then substitute $ad - bc = 1$ on the right-hand side.]

8.21 (Harder) Using the results of the two preceding exercises, show how to find a subgroup $BO_{48}$ of the unit quaternions which has a surjective two-to-one map to the group of rotations of the cube in $\mathrm{SO}(3)$.

8.22 (Harder) Complete the proof of Theorem 8.5.2 (2).

(a) Prove that $\varphi(p, q) = \mathrm{id}_{\mathbb{H}}$ if and only if $(p, q) = (1, 1)$ or $(p, q) = (-1, -1)$. [Hint: $p 1 q^* = 1$ if and only if $p = q$, and $p i p^* = i$ if and only if $pi = ip$ if and only if $p = a + bi$, etc.] Deduce that $\varphi$ induces an injective map $(S^3 \times S^3)/{\pm 1} \to \mathrm{SO}(4)$.

(b) Prove that $\varphi$ is surjective. [Hint: find a suitable $\varphi(p, q)$ to send 1 to a given unit vector $r \in \mathbb{H}$. Now compose with $r^*$ to assume that $1 \mapsto 1$, and apply Theorem 8.5.2 (4).]

8.23 (Harder) Consider the algebra $\mathbb{O}$ of $2 \times 2$ matrixes over the quaternions $\mathbb{H}$ of the form $\begin{pmatrix} a & b \\ -b^* & a^* \end{pmatrix}$, where $a^*$ is the quaternion conjugate of $a$ as in 8.5.1.

(a) Show that $\mathbb{O}$ is an 8-dimensional division algebra (algebra with two-sided multiplicative inverses for nonzero elements) over $\mathbb{R}$. Find an explicit basis for $\mathbb{O}$ and write out some of the multiplication table.

(b) Show that multiplication in $\mathbb{O}$ is not associative, but it satisfies the identity $x(xy) = (xx)y$ for $x, y \in \mathbb{O}$.

(c) Contemplate the possibility of doing projective geometry over the division algebra $\mathbb{O}$ (compare the end of 5.12).

$\mathbb{O}$ is the algebra of Cayley numbers or octonions. For much more on this, see Conway and Smith [4].

[Hint: you get a division algebra by introducing an octonion conjugate $a^*$ such that $aa^* = |a|^2$ is positive definite, as in 8.5.1. It is easy to find examples of nonassociative octonion multiplication; to prove the weaker identity, one possibility is to use your basis for $\mathbb{O}$ over $\mathbb{R}$ in a brute-force proof similar to that of Proposition 8.5.1 (1) given in the text. To do projective geometry, you have to start by thinking about the relation $x \sim \lambda x$ used to define projective space. Do not be surprised if you run into difficulty.]

Hermitian matrixes.

8.24 An $n \times n$ complex matrix $A$ is called Hermitian, if ${}^h\!A = A$. (See 8.6 for the Hermitian conjugate ${}^h\!A$.) Show that

(a) every eigenvalue of a Hermitian matrix is real;

(b) eigenvectors for different eigenvalues are orthogonal with respect to the Hermitian form on $\mathbb{C}^n$ (compare Step 3 in the proof of Theorem 1.11!).

9 Concluding remarks

This final chapter is quite different from the earlier ones in style and intention: I let my hair down with a number of informal fairy stories on different topics, tying together loose strands in the historical and mathematical argument of the book, and opening up some new directions. In particular, I give a ‘popular science’ discussion of some of the surprising and amazingly fertile links between the geometry, topology and Lie group theory discussed in this book and different aspects of twentieth century physics.

There are many other topics closely related to the main text, both frivolous and serious, that I would have liked to write about. But life is short, and I confine myself to a brief list of a few directions and developments. Several of these topics can form the basis for undergraduate essays or projects.

• The classification of locally Euclidean geometries in the style of Nikulin and Shafarevich [18].

• Spherical trig and geometry in the history of navigation. Modern developments: GPS (global positioning system) devices.

• Spherical geometry and cartography (map making): Mercator’s and other projections, as discussed for example in [6].

• Plane and spherical geometry and plate tectonics, following for example [8], Chapter 2. Why South America and West Africa fit together like pieces of a spherical jigsaw puzzle; Euler’s theorem and the classification of fault types.

• SO(3) and Euler angles, mechanics in moving frames, Coriolis forces.

• Symmetry groups in geometry. This is a vast subject, relating regular polyhedra and polytopes, crystallography [5, 18], the geometric patterns of the Alhambra and other Islamic art, Escher’s art and Penrose tilings.

• Subgroups of the symmetric group in puzzles and toys. Examples include the perfect shuffle groups and moves of the Rubik cube, as in [17] Chapter 19.

• Axiomatic projective geometry, leading to von Neumann’s foundations of quantum theory, $C^*$-algebras and ‘noncommutative geometry’.

• Geometry and dynamics: Newton’s equations, planetary motion and conics.

• Differential geometry of curves and surfaces. The Frenet frame, intrinsic curvature and the Gauss–Bonnet formula.

I leave you to explore details of these fascinating topics, as well as those sketched below, in or out of the confines of a degree course and its attendant examinations.

9.1 On the history of geometry

9.1.1 Greek geometry and rigour

Geometry has a very special place in the history and culture of western mathematics. Coming at the dawn of western civilisation (350 ± 200 BC), Greek philosophy and geometry, passed on to us by the more advanced culture of the Islamic world at the time of the Renaissance, have played a central role in the development of western culture, not merely for their content, but for their idea of rigour. The Greeks were not the first to attempt to describe the world around them by ‘geometry’: that credit goes to the ancient Mesopotamians (from 2500 BC), followed by the Egyptians (from 2000 BC). However, before the Greeks, geometry largely consisted of a bag of tricks for calculation that worked in practice most of the time. In contrast, Greek mathematicians elaborated the notion of logical argument. By this I do not mean the elementary and often hairsplitting logic of a ‘Foundations’ or ‘Set theory’ or ‘Abstract algebra’ course, but the idea that understanding steps at different stages in an argument from the ground up is at least as important as somehow getting an approximately correct answer. This is one of the fundamental items of intellectual equipment that set western mathematics and science apart from (and in the course of time well above) that of India and China.

Building on sources largely unknown to us, the geometer Euclid, probably working

in Alexandria in the fourth century BC, summarised the mathematical knowledge of

the time in his 13 volume Elements. Book I deals with the basic definitions of geometry.

Euclid introduces notions such as point, line, plane, distance, angle and meets, whose

meaning is supposed to be self-evident, and enunciates certain postulates (in modern

language, axioms) concerning these notions. Lengths and angles are to be thought

of as geometric quantities in their own right, not related to any algebraic or numeric

representation. For example, one of the postulates states that two line segments are

equal if they are congruent, which makes perfect sense without having to consider

the length of a line as a number.


Figure 9.1a The parallel postulate. To meet or not to meet?

9.1.2 The parallel postulate

Most of Euclid’s postulates were for a long time beyond doubt, but the last one stood out from the beginning as far less obvious:

If a line falls on two lines, with interior angles on one side adding to less than two right angles, the two lines, if extended indefinitely, meet on the side on which the angles add to less than two right angles.

This is nonobvious. Behold Figure 9.1a! Euclid’s ‘extended indefinitely’ makes it clear that the statement involves arguing on objects that are arbitrarily distant, so that it is in principle not verifiable. Through the ages, many alternative axioms were formulated, which can be proved to be equivalent to Euclid’s on the basis of the other axioms, such as:

given a line L in the plane, and a point P not on L, there exists one and only one line through

P not meeting L

(compare Figure 9.1b and Figure 3.13). Or

the sum of the angles of a triangle is equal to two right angles

(see Figure 1.16b and Theorem 3.14).

After arguably the longest dispute in intellectual history, it was discovered between about 1810 and 1830 by Bolyai, Gauss, Lobachevsky and Schweikart (independently, alphabetical order) that the parallel postulate cannot be a consequence of Euclid’s other axioms: axiomatic geometries exist which are in many ways similar to Euclidean plane geometry, sharing its aesthetic appeal and simplicity, but which do not satisfy the parallel postulate. As János Bolyai wrote to his father,

ollyan felséges dolgokat hoztam ki, hogy magam elbámultam, s örökös kár volna elveszni; ha meglátja Édes Apám megesméri; most többet nem szólhatok, tsak annyit: hogy semmiből egy ujj más világot teremtettem; mind az, valamint eddig küldöttem, tsak kártyaház a toronyhoz képest. . .

Figure 9.1b The parallel postulate in the Euclidean plane.

Or, translated from the nineteenth century Hungarian:

I deduced things so marvellous that I was enchanted myself, and it would be an eternal loss

to let them pass; Dear Father, once you see them, you will recognise their greatness yourself;

now I cannot tell you more, only this: out of the void I created a new, a different world; all that

I sent you before is like a house of cards to a tower. . .

The discovery of non-Euclidean hyperbolic geometry was indeed a landmark in modern scientific thinking, as revolutionary and as far reaching in its implications as the Copernican model of the solar system or Darwin’s theory of evolution. For an account of the very interesting history, see Greenberg [11] and Bonola [3]. The early models of hyperbolic geometry were abstract; simple coordinate models, such as that used in Chapter 3 of this course, were developed later in the second half of the nineteenth and the early twentieth centuries. As I said, the coordinate model of hyperbolic geometry constructed in Chapter 3 satisfies all of Euclid’s postulates except for the parallel postulate; the parallel postulate is therefore certainly not a logical consequence of the others. Hyperbolic geometry soon found many applications in different areas of mathematics and science; in particular, the notion of curvature in differential geometry and of curved space plays a foundational role in Einstein’s general relativity (1916).

Spherical geometry seems to have been excluded from consideration in descriptive

or axiomatic geometry from the time of Euclid for two reasons.

(a) More obviously, any two lines meet in two points (a pair of antipodal points); this is not a very serious defect, because you can pass to the geometry of $S^2/\{\pm 1\} = \mathbb{P}^2_{\mathbb{R}}$, in which every pair of lines meets in just one point.

(b) Its lines do not satisfy the order condition implicit in Euclid: given three points $P$, $Q$, $R$ on a spherical line (great circle), it is impossible to say which of the three is ‘between’ the other two. Equivalently, a point $P$ of a spherical line (great circle) does not divide it into disconnected sets. That is, given a line $L$ and a point $P$ not on it, every line $M$ through $P$ meets $L$ both over there to the left and over there to the right (see Figure 9.1c). In spherical geometry these are antipodal points; in the geometry of $S^2/\{\pm 1\} = \mathbb{P}^2_{\mathbb{R}}$, the same point.

Figure 9.1c The ‘parallel postulate’ in spherical geometry.

R

Euclid’s postulates did not discuss the separation properties of points on a line: it was supposed to be understood what it meant for $A$ to be between $P$ and $Q$ on the line segment $PQ$. (Compare the discussion in 7.3.3; separation is a topological statement about the geometry.) Thus it is not surprising that spherical geometry was overlooked; however, this is a fair indication that Euclid’s claim to rigour in a modern sense was never really watertight.

Nevertheless, spherical geometry has been around in an ‘applied’ form for centuries. Spherical trigonometry was studied in amazing detail by the great medieval Islamic geometers in the context of qibla (the sacred direction to Mecca, see for example [16]), and again from the time of Newton, to aid British ships engaged in piracy or the slave trade to navigate around the oceans of the world and return to the other origin at Greenwich. Because of winds and currents though, the lines of spherical geometry, great circles, are not always the fastest way to travel. These days, great circles are the routes taken for preference by airlines, except when no-fly zones intervene.

9.1.3 Coordinates versus axioms

Descartes’ invention of coordinate geometry is another key ingredient in modern science. It is scarcely an accident that calculus was discovered by Leibniz and Newton (independently, alphabetical order) in the fifty years following the dissemination of Descartes’ ideas. Interactions between the axiomatic and the coordinate-based points of view go in both ways: coordinate geometry gives models of axiomatic geometries, and conversely, axiomatic geometries allow the introduction of number systems and coordinates. There are several excellent books giving systematic treatments of these very interesting issues; I warmly recommend Hilbert’s classic [13].

As in art or music or politics, attitudes and fashions in mathematics vary quite sharply from one generation to the next. In the second half of the nineteenth century, up to the time of Hilbert and Poincaré, geometry was without doubt at the centre of mathematics and of large areas of theoretical physics. This position was overturned with the rise of abstract algebra, topology and set theoretic foundations of mathematics around the 1920s. The blame for this lies in part with the geometers themselves, who developed a sloppy attitude to correct statements and proofs of theorems. One example is the type of argument that involved a ‘sufficiently general position’, which might in favourable cases have a precise meaning within an epsilon neighbourhood of the author. In England, there was a brilliant school of geometers between the wars in Cambridge, which seems to have been broken up when the participants were drafted into code breaking or aeronautics during the second world war. When the senior author was an undergraduate at Cambridge (late 1960s), geometry in the sense of this course was universally considered a terribly dull fuddy-duddy subject. The position has been entirely turned around in the last 30 years, and at present geometry in its various manifestations again claims centre stage in mathematics and theoretical physics.

9.2 Group theory

9.2.1 Abstract groups versus transformation groups

According to the abstract definition (which is comparatively recent), an abstract group is a set with a composition law satisfying a couple of well known axioms. However, from the beginnings of the subject in the nineteenth century, the groups studied were always thought of as symmetry groups, that is, as transformation groups preserving some structure or other. For example, Ruffini, Abel and Galois considered permutations of the roots of a polynomial equation, and the subgroup of permutations that preserve the rules of arithmetic. From the mid-nineteenth century, many other groups arose as geometric symmetries: finite groups such as the symmetries of the regular polyhedra, infinite but discrete groups in the study of crystallography, that contain translations by a lattice as a subgroup, and Lie groups such as the Euclidean group. The idea that a group can be treated as an abstract composition law without reference to the nature of the operators that make it up was first introduced by Cayley in 1854, but its significance was not recognised until much later.

Let $G$ be a group and $X$ a set; I say that $X$ is a $G$-set or that $G$ acts on $X$, if a group homomorphism

$$\varphi : G \to \operatorname{Trans} X$$

is given from $G$ to the group of transformations of $X$ (see 6.1). That is, each $g \in G$ corresponds to a transformation (bijective map) $\varphi_g : X \to X$, in such a way that the abstract composition law in $G$ corresponds to composition of transformations of $X$. In other words, $G$ is trying to fulfil its destiny as a transformation group of $X$, as discussed in Chapter 6. One usually writes simply $\varphi_g(x) = gx$ or $g(x)$ for the action of $g \in G$ on $x \in X$.

The requirement that the map $\varphi$ is a homomorphism is written $(gh)x = g(hx)$. This looks like an associative law, but it just means that the abstract product in $G$ corresponds to composition of maps $X \to X$; compare the discussion in 2.4. Evaluating $g \in G$ on $x \in X$ provides a map $\Phi : G \times X \to X$ given by $\Phi(g, x) = \varphi_g(x)$; I leave it to you to express the condition $(gh)x = g(hx)$ in these terms.
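A small finite example may make the definition concrete. The sketch below lets the symmetric group on three letters act on the set $\{0, 1, 2\}$ and tests the condition $(gh)x = g(hx)$ for every choice of $g$, $h$ and $x$. The encoding of a permutation as a tuple and the names `compose` and `act` are illustrative assumptions, not notation from the text.

```python
from itertools import permutations

def compose(g, h):
    """Product gh in the symmetric group: apply h first, then g."""
    return tuple(g[h[x]] for x in range(len(h)))

def act(g, x):
    """The evaluation map (g, x) -> g(x) of a permutation on a point."""
    return g[x]

points = range(3)
group = list(permutations(points))    # the six permutations of {0, 1, 2}
identity = tuple(points)

# The homomorphism condition (gh)x = g(hx), tested on every triple.
is_action = all(
    act(compose(g, h), x) == act(g, act(h, x))
    for g in group for h in group for x in points
)
identity_fixes_all = all(act(identity, x) == x for x in points)
print(len(group), is_action, identity_fixes_all)
```

Here the product in the group is defined as composition of maps, so the condition holds by construction; for a group given abstractly, the same loop is a genuine test of whether a proposed evaluation map is an action.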

9.2.2 Homogeneous and principal

Definition. Let $X$ be a $G$-set. I say that $G$ acts transitively on $X$ if the action takes any point of $X$ to any other. In this case $X$ is a homogeneous space under $G$.

This idea has already appeared many times: the geometries in the earlier chapters of
