


the group of orthogonal $n \times n$ matrixes, is a topological group in the subspace
topology. Hence also Eucl(n), the group of Euclidean motions, and the group of motions
of $S^2$ (see 3.5) are topological groups.

Example 5. Hyperbolic motions form a matrix group, the Lorentz group or group
of Lorentz transformations (see 3.11 for the notation and compare Theorem 3.11 and
Exercise 8.5):

$$O^+(1,2) = \left\{ A \in GL(3,\mathbb{R}) \;\middle|\; {}^tA J A = J, \text{ and } A \text{ preserves the halves of the cone } q_L(v) < 0 \right\}.$$

This is also a topological group. It and its higher dimensional colleagues $O^+(1, n)$
are important in special relativity and related areas of physics.

The topological groups in Examples 2–5 have an interesting 'continuous' geometry.
Here is a simple example (see Figure 8.0): recall that O(2) is the group of all
rotation matrixes $\begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}$ and reflection matrixes $\begin{pmatrix} \cos\theta & \sin\theta \\ \sin\theta & -\cos\theta \end{pmatrix}$. Thus O(2) is a
union of two connected components, each a copy of the circle $S^1$ parametrised by the
angle θ. One aim of this chapter is to generalise this nice description to some other
orthogonal groups.

8.2 Dimension counting
Here I begin the study of some particular aspects of the geometry of transformation
groups. In this section I want to concentrate on a measure of their size. Recall that
O(2) can be described geometrically as the union of two circles. The circle $S^1$ is a
one-dimensional geometric object in the sense that its points depend on one real parameter
θ; standing at a point of the circle, there is one direction in which you can move.
Without going into rigorous details, by dimension of a transformation group G,
denoted dim G, I understand the number of continuous real parameters needed to
characterise an element g ∈ G. The previous paragraph then shows that dim O(2) = 1.
Do not get confused by the fact that O(2) has two components; to characterise elements
of O(2), I need one continuous real parameter (the angle θ) and a discrete parameter
(the choice of one of the components, equivalently the sign of the determinant, or its
value ±1).
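The description of O(2) by one continuous parameter and one discrete parameter can be tried out numerically. The following is a minimal Python sketch, not part of the original text; the function names are hypothetical, and the reflection is written in the standard form with rows (cos θ, sin θ) and (sin θ, −cos θ), which is an assumption about the intended normal form.

```python
import math

def o2_element(theta, det_sign):
    """Element of O(2) given by the angle theta and the sign det_sign (+1 or -1)."""
    c, s = math.cos(theta), math.sin(theta)
    if det_sign == 1:
        return [[c, -s], [s, c]]   # rotation through theta
    return [[c, s], [s, -c]]       # reflection (assumed standard form)

def det2(m):
    """Determinant of a 2 x 2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def is_orthogonal2(m, tol=1e-12):
    """Check that the columns of m are orthonormal, that is, tA A = identity."""
    for i in range(2):
        for j in range(2):
            dot = m[0][i] * m[0][j] + m[1][i] * m[1][j]
            if abs(dot - (1.0 if i == j else 0.0)) > tol:
                return False
    return True

rotation = o2_element(0.7, 1)
reflection = o2_element(0.7, -1)
```

Varying `theta` moves along one of the two circles; changing `det_sign` jumps to the other component.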
I proceed to compute the dimension of transformation groups in some nontrivial
cases. The computations will be performed by describing elements of the groups in a
way which makes it possible to count the parameters involved directly.

Proposition. An element g ∈ Eucl(n) depends on $\binom{n+1}{2}$ real parameters, so
$\dim \mathrm{Eucl}(n) = \binom{n+1}{2}$. Further,

$$\dim O(n) = \binom{n}{2}, \quad \dim GL(n, \mathbb{R}) = n^2, \quad \dim PGL(n + 1, \mathbb{R}) = n(n + 2).$$

Proof. The language of Euclidean frames from 1.12 gives a way of specifying
elements of the Euclidean group. Choose a reference frame $\{P_0, P_1, \dots, P_n\}$; then by
Theorem 1.12, elements of the Euclidean group Eucl(n) correspond one-to-one with
the set of Euclidean frames $\{Q_0, Q_1, \dots, Q_n\}$. Now calculate:
• $Q_0 \in \mathbb{E}^n$ is any point, so depends on n parameters;
• $Q_1 \in \mathbb{E}^n$ is any point with $d(Q_0, Q_1) = 1$, that is, it is any point of the unit sphere
  $S^{n-1}$ with centre $Q_0$, hence depends on n − 1 real parameters;
• writing $e_1 = \overrightarrow{P_0P_1}$ and $e_1^\perp = \mathbb{E}^{n-1} \subset \mathbb{E}^n$ for the orthogonal complement, $Q_2$ is given
  by a point of the unit sphere $S^{n-2} \subset \mathbb{E}^{n-1}$, so depends on n − 2 real parameters;
• similarly, $Q_i$ is given by a point of $S^{n-i}$, and hence depends on n − i real parameters;
• in particular, $Q_n$ is one of two points, so has no continuous parameter.
Thus a Euclidean frame depends on

$$\dim \mathrm{Eucl}(n) = n + (n - 1) + \cdots + 1 + 0 = \binom{n+1}{2}$$

real parameters. An element of O(n) fixes the origin, which I can take to be $P_0 = Q_0$ in the above
argument. Hence the dimension count is

$$\dim O(n) = (n - 1) + \cdots + 1 + 0 = \binom{n}{2},$$

agreeing with dim O(2) = 1. Said slightly differently, O(n) and Eucl(n) differ by the
translation part (compare Proposition 6.5.3), which accounts for n parameters:

$$\dim O(n) = \dim \mathrm{Eucl}(n) - n = \binom{n+1}{2} - n = \binom{n}{2}.$$

The dimension of the general linear group can be calculated in exactly the same
way. Elements of GL(n, R) correspond to invertible maps of the vector space $\mathbb{R}^n$. Such
a map is determined by the images of the n usual basis vectors in $\mathbb{R}^n$, parametrised
by a total of $n^2$ numbers (the entries of the matrix representing the map). Not all
parametrisations give invertible maps, but most do: I only have to exclude matrixes
with zero determinant. Hence there are $n^2$ real parameters involved, so

$$\dim GL(n, \mathbb{R}) = n^2.$$

Finally by Theorem 5.5 there are as many projective transformations as projective
frames of reference. Hence I have to pick n + 2 general points in $\mathbb{P}^n$, leading to

$$\dim PGL(n + 1, \mathbb{R}) = (n + 2)n$$

parameters. Incidentally, the dimension of the projective group can also be calculated
from its definition $PGL(n + 1, \mathbb{R}) = GL(n + 1, \mathbb{R})/\mathbb{R}^\times$, which gives

$$\dim PGL(n + 1, \mathbb{R}) = \dim GL(n + 1, \mathbb{R}) - 1 = (n + 1)^2 - 1 = (n + 2)n. \quad \text{QED}$$

You can design your own parameter counts for some other groups not mentioned
in the proposition; for example, do and generalise Exercise 8.3.
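The parameter counts in the proof can be replayed mechanically. The sketch below (Python, with hypothetical function names) adds up the parameters exactly as in the argument above and compares the totals with the closed formulas of the Proposition; it is a sanity check of the arithmetic, not a proof.

```python
from math import comb

def dim_eucl(n):
    """Parameters of a Euclidean frame: n for Q_0, then n - i for each Q_i."""
    return n + sum(n - i for i in range(1, n + 1))

def dim_o(n):
    """The same count with Q_0 fixed at the origin."""
    return sum(n - i for i in range(1, n + 1))

def dim_gl(n):
    """One parameter for each of the n^2 matrix entries."""
    return n * n

def dim_pgl(n):
    """dim PGL(n + 1, R): n + 2 general points of P^n, each with n parameters."""
    return (n + 2) * n

for n in range(1, 10):
    assert dim_eucl(n) == comb(n + 1, 2)
    assert dim_o(n) == comb(n, 2) == dim_eucl(n) - n
    assert dim_pgl(n) == dim_gl(n + 1) - 1
```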

8.3 Compact and noncompact groups

Proposition. The orthogonal group O(n) is a compact topological space.

Proof. This is a simple application of Proposition 7.4.2. The orthogonal group
is a matrix group: it is a subspace of the space $\mathbb{R}^{n^2}$ of real matrixes. Hence it is
enough to show that it is closed and bounded. The equation ${}^tA A = 1_n$ defines a closed
subset of $\mathbb{R}^{n^2}$, so the main issue is boundedness. However, if $A = (a_{ij})$ is orthogonal,
then its columns form an orthonormal basis and in particular for every $1 \le k \le n$,
$\sum_{i=1}^n a_{ik}^2 = 1$. Hence

$$\sum_{k,i=1}^n a_{ki}^2 = n,$$

which just says that every orthogonal matrix A is contained in a ball of radius $\sqrt{n}$ in
$\mathbb{R}^{n^2}$. QED
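The boundedness step is easy to check on an example. The following Python sketch (hypothetical helper names, standard library only) builds a 3 × 3 orthogonal matrix as a product of two coordinate-plane rotations and confirms that the sum of the squares of its entries is n = 3, so that the matrix lies at distance √3 from the origin in the space of all 3 × 3 matrixes.

```python
import math

def plane_rotation(i, j, theta, n=3):
    """n x n rotation through theta in the coordinate plane spanned by axes i and j."""
    m = [[1.0 if r == c else 0.0 for c in range(n)] for r in range(n)]
    m[i][i] = math.cos(theta)
    m[i][j] = -math.sin(theta)
    m[j][i] = math.sin(theta)
    m[j][j] = math.cos(theta)
    return m

def matmul(a, b):
    """Product of two square matrixes given as lists of rows."""
    n = len(a)
    return [[sum(a[r][k] * b[k][c] for k in range(n)) for c in range(n)]
            for r in range(n)]

def squared_norm(a):
    """Sum of the squares of all entries of a."""
    return sum(x * x for row in a for x in row)

A = matmul(plane_rotation(0, 1, 0.3), plane_rotation(1, 2, 1.1))
radius = math.sqrt(squared_norm(A))   # equals sqrt(3) up to rounding
```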
A compact space is often much more pleasant to work with than a noncompact one.
However, many transformation groups are visibly noncompact, such as the additive
group R. On the other hand, the topology and geometry of R are very simple (for
example, R is simply connected, and can be parametrised by a real parameter without
overlap). Most transformation groups are of course more complicated; however, in
a suitable sense they can be topologically decomposed as a compact group times a
group homeomorphic to $\mathbb{R}^n$.

Example 1. The simplest example is the multiplicative group $\mathbb{R}^\times$ of nonzero real
numbers. There is a homeomorphism (in this case, an isomorphism of groups)

$$\mathbb{R}_+ \times \{\pm 1\} \to \mathbb{R}^\times;$$

in plain English, every nonzero number is the product of a positive number and a sign.
The space $\mathbb{R}_+$ is homeomorphic to R; the group {±1} is finite so clearly compact.

Example 2. Although the next example looks similarly innocent, it appears in
many different guises throughout geometry, Fourier analysis, Lie groups, representation
theory, complex analysis and number theory. Consider the multiplicative group
$\mathbb{C}^\times$ of nonzero complex numbers. This is a topological group; for example, I can
view C as the plane $\mathbb{R}^2$ and take the subspace topology. The space $\mathbb{C}^\times$ is obviously
noncompact. However, there is a homeomorphism (even a group isomorphism)

$$S^1 \times \mathbb{R}_+ \to \mathbb{C}^\times, \qquad (\theta, r) \mapsto r \exp(i\theta).$$

Here $S^1$ is compact (and definitely not homeomorphic to a product of copies of R,
which is the essential content of 7.15.4, Corollary 1) and $\mathbb{R}_+$ is homeomorphic to R.
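The decomposition of a nonzero complex number into an angle and a positive radius is available directly in Python's standard `cmath` module. The sketch below uses hypothetical wrapper names and represents a point of the circle by an angle in (−π, π], which is an assumption made here for convenience; it checks that the two maps are mutually inverse and that multiplication corresponds to adding angles and multiplying radii.

```python
import cmath
import math

def to_circle_and_radius(z):
    """Send a nonzero complex number to (theta, r) with z = r exp(i theta)."""
    if z == 0:
        raise ValueError("0 is not in the multiplicative group")
    r, theta = cmath.polar(z)      # r > 0 and theta in (-pi, pi]
    return theta, r

def from_circle_and_radius(theta, r):
    """The inverse map (theta, r) -> r exp(i theta)."""
    return r * cmath.exp(1j * theta)

z, w = 3 - 4j, -1 + 2j
tz, rz = to_circle_and_radius(z)
tw, rw = to_circle_and_radius(w)
product = from_circle_and_radius(tz + tw, rz * rw)   # should equal z * w
```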

Example 3. The final example is more substantial, and deals with the difference
between the groups GL(n, R) and O(n). Write $T_+(n) \subset GL(n, \mathbb{R})$ for the set of upper
triangular matrixes with positive diagonal entries:

$$T_+(n) = \{ M = (m_{ij}) \in GL(n, \mathbb{R}) \mid m_{ij} = 0 \text{ for all } i > j, \text{ and } m_{ii} > 0 \}
= \left\{ \begin{pmatrix} + & * & \cdots & * \\ 0 & + & \cdots & * \\ \vdots & & \ddots & \vdots \\ 0 & \cdots & 0 & + \end{pmatrix} \right\}.$$

It is easy to see that $T_+(n) \subset GL(n, \mathbb{R})$ is a subgroup.

Theorem. Every element A ∈ GL(n, R) can be written in a unique way in the form
A = BC, where B ∈ O(n) is an orthogonal matrix and $C \in T_+(n)$ is an upper triangular
matrix with positive diagonal entries. Moreover, B and C depend continuously
on A. The map

$$GL(n, \mathbb{R}) \to O(n) \times T_+(n) \quad \text{given by} \quad A \mapsto (B, C)$$

is a homeomorphism (see 7.3, but not a group homomorphism!).

The space O(n) is compact by the above Proposition. The space $T_+(n)$
is homeomorphic to $\mathbb{R}^N$, where $N = \binom{n+1}{2}$. Many geometric questions on GL(n, R)
reduce to similar questions on O(n); for a simple example, compare Remark 8.4. Note
also the dimension count:

$$\dim O(n) + \dim T_+(n) = \binom{n}{2} + \binom{n+1}{2} = n^2 = \dim GL(n, \mathbb{R}).$$

Proof. I view the $n \times n$ matrix A as a row made up of n column vectors $f_i$. Thus
$\{f_1, \dots, f_n\}$ is a basis of $\mathbb{R}^n$ because A ∈ GL(n, R). If it is an orthonormal basis then
there is no problem: A ∈ O(n), and we must take B = A and C = 1. If A is not
orthogonal to start with, then the Gram–Schmidt process described in the proof of
Theorem B.3 (1) produces an orthonormal basis. Set B to be the matrix formed from
the new basis vectors as columns, and C to be the matrix describing the change of
basis. Clearly B ∈ O(n); I leave you to check (see Exercise 8.6) that $C \in T_+(n)$ and
that B, C depend continuously on A. Then the map $A \mapsto (B, C)$ is continuous, and
its inverse is matrix multiplication $(B, C) \mapsto BC$. QED
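The decomposition A = BC can be computed by running the Gram–Schmidt process on the columns of A and recording the coefficients. The Python sketch below is one possible implementation under stated assumptions: matrixes are lists of rows, the function names are hypothetical, and the tolerance used to detect a singular matrix is an arbitrary choice. It is meant as an illustration of the proof, not as numerically robust code.

```python
import math

def matmul(a, b):
    """Product of two square matrixes given as lists of rows."""
    n = len(a)
    return [[sum(a[r][k] * b[k][c] for k in range(n)) for c in range(n)]
            for r in range(n)]

def gram_schmidt_decompose(a):
    """Return (b, c) with a = b c, b orthogonal, c upper triangular, diagonal > 0."""
    n = len(a)
    columns = [[a[r][j] for r in range(n)] for j in range(n)]   # f_1, ..., f_n
    basis = []                                                   # e_1, ..., e_n
    c = [[0.0] * n for _ in range(n)]
    for j, f in enumerate(columns):
        v = f[:]
        for i, e in enumerate(basis):
            c[i][j] = sum(x * y for x, y in zip(e, f))   # component of f_j along e_i
            v = [x - c[i][j] * y for x, y in zip(v, e)]
        norm = math.sqrt(sum(x * x for x in v))
        if norm < 1e-12:
            raise ValueError("matrix is not invertible")
        c[j][j] = norm                                   # positive diagonal entry
        basis.append([x / norm for x in v])
    b = [[basis[j][r] for j in range(n)] for r in range(n)]   # e_j as columns
    return b, c

A = [[2.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 4.0]]
B, C = gram_schmidt_decompose(A)
```

Multiplying `B` and `C` back together with `matmul` recovers `A`, which is the inverse map of the Theorem.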

8.4 Components
Recall from 7.4.1 that every topological space can be decomposed into a number of
components, which are themselves connected. I repeatedly discussed the geometry
of O(2): a union of two circles. A circle $S^1$ is connected, so O(2) has two connected
components. This is typical:

Proposition. The group O(n) has two connected components, distinguished by
det A = ±1.

Remark. One can use Theorem 8.3 to show that GL(n, R) also has two connected
components, that are distinguished by det A > 0 and det A < 0; see Exercise 8.4. The
group O(1, 2) of all Lorentz matrixes has 4 components, as discussed in Exercise 8.5.

Proof. An orthogonal matrix has determinant ±1. (Compare 1.10; recall that I
called A direct if det A = 1 and opposite if det A = −1.) The function

$$\det : O(n) \to \{\pm 1\}$$

is continuous, so the two possibilities det A = ±1 determine two disjoint open and
closed sets of O(n). It remains to show that each of these sets is path connected.
Fix a matrix A ∈ O(n). By the normal form theorem 1.11, A can be written with
respect to a suitable orthonormal basis in the diagonal block form with $2 \times 2$ diagonal
blocks

$$B_i = \begin{pmatrix} \cos\theta_i & -\sin\theta_i \\ \sin\theta_i & \cos\theta_i \end{pmatrix},$$

and one optional block ±1. For t varying from 0 to 1, let A(t) be the matrix with the
same block form as A, but with blocks

$$B_i(t) = \begin{pmatrix} \cos t\theta_i & -\sin t\theta_i \\ \sin t\theta_i & \cos t\theta_i \end{pmatrix}.$$

The rule $t \mapsto A(t)$ gives a continuous path $[0, 1] \to O(n)$ joining A either to the
identity or to the element diag(1, . . . , 1, −1). Therefore, the two subsets of O(n)
defined by det A = ±1 are both path connected. A path connected space is connected
by Lemma 7.4.1 (2). QED
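The path A(t) is simple enough to write down for n = 3. The Python sketch below assumes the matrix is already in normal form with one rotation block and one block ±1 (the function names are hypothetical); it checks that the path stays inside O(3), that the determinant does not change along it, and that it ends at diag(1, 1, ±1).

```python
import math

def normal_form(theta, last):
    """3 x 3 block matrix: a rotation block with angle theta and a block last = +1 or -1."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, float(last)]]

def path(theta, last, t):
    """A(t): the same block form with angle t * theta, so A(1) = A."""
    return normal_form(t * theta, last)

def det3(m):
    """Determinant of a 3 x 3 matrix by expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def is_orthogonal(m, tol=1e-12):
    """Check that the columns of m are orthonormal."""
    n = len(m)
    return all(abs(sum(m[k][i] * m[k][j] for k in range(n)) - (1.0 if i == j else 0.0)) < tol
               for i in range(n) for j in range(n))

samples = [path(2.0, -1, t / 10) for t in range(11)]   # from diag(1, 1, -1) to A
```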

The special orthogonal group is the group

$$SO(n) = \{ A \in O(n) \mid \det A = 1 \}.$$

By the Proposition, this is a connected component of O(n). Since it is the kernel of a
group homomorphism $\det : O(n) \to \{\pm 1\}$, it is also a normal subgroup of index 2 in
O(n).
In the special case n = 3, the elements of SO(3) can be described explicitly. By the
normal form theorem 1.11, any orthogonal $3 \times 3$ matrix of determinant 1 has the form

$$\begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos\theta & -\sin\theta \\ 0 & \sin\theta & \cos\theta \end{pmatrix}$$

in a suitable basis. If l is the line through the origin with direction vector given by
the first basis element, then the motion of $\mathbb{E}^3$ described by this matrix is the rotation
Rot(l, θ) around the line l. Hence SO(3) is the group of rotations of $\mathbb{E}^3$ about axes
passing through O.

8.5 Quaternions, rotations and the geometry of SO(n)
As I discussed before, for n = 2 the group SO(2) is homeomorphic to the circle $S^1$.
The purpose here is to find a similar description of the special orthogonal groups
SO(3) and SO(4) in terms of the 3-sphere. I start with a small detour to introduce
the quaternions, the main protagonists in the game. Note that SO(n) is the group
of direct motions of $\mathbb{E}^n$ with a fixed point, or in other words the group of rotations
of $\mathbb{E}^n$; hence the aim is to find a connection between quaternions and rotations (for
n = 3, 4).

The algebra of quaternions is the real vector space

$$\mathbb{H} = \{ a + bi + cj + dk \text{ with } a, b, c, d \in \mathbb{R} \},$$

with the multiplication law

$$i^2 = j^2 = k^2 = -1, \quad ij = k, \; jk = i, \; ki = j, \quad ji = -k, \; kj = -i, \; ik = -j.$$

The cyclic symmetry makes this easy to remember.
Some terminology, similar to the traditional language of complex numbers: if
$q = a + bi + cj + dk$, write $q^* = a - bi - cj - dk$ for the conjugate quaternion.
We say that q is real if b = c = d = 0 and pure imaginary if a = 0.


Proposition.
(1) H is an associative noncommutative R-algebra of dimension 4 over R.
(2) The conjugation $q \mapsto q^*$ is an antiinvolution, meaning

$$(pq)^* = q^* p^* \quad \text{for all } p, q \in \mathbb{H}.$$

(3) $|q|^2 = qq^* = q^*q = a^2 + b^2 + c^2 + d^2$ is a positive definite quadratic form on H;
therefore for any nonzero q ∈ H, the element

$$q^{-1} = q^*/|q|^2$$

is a 2-sided inverse of q. Hence H is a division algebra or skew field.
(4) If q ∈ H and q ∉ R, then q = A + BI with I pure imaginary, $I^2 = -1$ and A, B ∈ R.
Hence the subalgebra R[q] of H generated by q is of the form $\mathbb{R}[q] \cong \mathbb{C} \subset \mathbb{H}$.
(5) If I is pure imaginary with $I^2 = -1$, there exist J, K ∈ H such that I, J, K have the
same multiplication table as i, j, k, that is $I^2 = J^2 = K^2 = -1$ and IJ = K, etc.

Proof. (1) Noncommutativity is clear from the multiplication table: $ij = k \ne -k = ji$.
Because everything is R-linear, it is enough to check the associative law a(bc) =
(ab)c for the basis elements a, b, c ∈ {1, i, j, k}. If any of a, b, c is 1 then it is OK.
By the cyclic symmetry, I can assume that the first term a = i; if only i appears, then
I am working in a copy of C. This leaves only 8 cases to check by brute force:

$$\begin{aligned}
i(ij) &= ik = -j = (i^2)j; & i(ik) &= i(-j) = -k = (i^2)k; \\
i(ji) &= i(-k) = j = ki = (ij)i; & i(j^2) &= -i = kj = (ij)j; \\
i(jk) &= i^2 = -1 = k^2 = (ij)k; & i(ki) &= ij = k = -ji = (ik)i; \\
i(kj) &= -i^2 = 1 = -j^2 = (ik)j; & i(k^2) &= -i = -jk = (ik)k.
\end{aligned}$$

This is of course pure gobbledygook. A much more convincing argument is to say that
i, j, k are maps of something, such that multiplication coincides with composition of
maps, so is associative for a fundamental reason; see Exercise 8.8.
(2) Again because everything is R-linear, it is enough to check that $(pq)^* = q^* p^*$
for basis elements p, q ∈ {1, i, j, k}. The brute force method is an easy exercise:
$(1i)^* = -i = (i^*)(1^*)$, $(ij)^* = -k = (-j)(-i)$, etc.; see Exercise 8.9.
(3) On multiplying out the product $(a + bi + cj + dk)(a - bi - cj - dk)$, the
terms $a^2 + b^2 + c^2 + d^2$ appear in the obvious way from the squared terms. The
cross terms all cancel out, either as $(a \times -bi) + (bi \times a) = 0$ or $(bi \times -cj) +
(cj \times -bi) = -bc(i \times j + j \times i) = 0$.
(4) Note that $q + q^* = 2a$ and $qq^* = |q|^2 \in \mathbb{R}$, so that q and $q^*$ are the two roots
of a quadratic polynomial $x^2 - 2ax + |q|^2$ with real coefficients. Also, $q - q^* =
2(bi + cj + dk)$ is pure imaginary, and an easy calculation similar to that in (3) shows
that $(q - q^*)^2 = -4(b^2 + c^2 + d^2) < 0$ (because q ∉ R), so that this has no real roots.
Thus q = A + BI where A = a, $B = \sqrt{b^2 + c^2 + d^2}$ and I is pure imaginary with
$I^2 = -1$.
(5) is worked out as an exercise in Exercise 8.12. QED
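The brute force checks in (1) to (3) are the kind of thing a computer does happily. The following Python sketch represents a quaternion a + bi + cj + dk as the tuple (a, b, c, d); the representation and the function names are choices made here for illustration. With integer entries all the checks are exact.

```python
from itertools import product

ONE, I, J, K = (1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)

def qmul(p, q):
    """Quaternion product, expanded from i^2 = j^2 = k^2 = -1 and ij = k, jk = i, ki = j."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
            a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
            a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
            a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2)

def qconj(q):
    """The conjugate q* = a - bi - cj - dk."""
    a, b, c, d = q
    return (a, -b, -c, -d)

def norm2(q):
    """|q|^2 = a^2 + b^2 + c^2 + d^2."""
    return sum(x * x for x in q)

# (1) associativity on all triples of basis elements, and noncommutativity
associative = all(qmul(a, qmul(b, c)) == qmul(qmul(a, b), c)
                  for a, b, c in product((ONE, I, J, K), repeat=3))

# (2) and (3) on a sample pair of quaternions with integer coefficients
p, q = (1, 2, 3, 4), (5, -6, 7, 8)
antiinvolution = qconj(qmul(p, q)) == qmul(qconj(q), qconj(p))
norm_identity = qmul(p, qconj(p)) == (norm2(p), 0, 0, 0)
```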

(3) says that the Euclidean distance on $\mathbb{R}^4 = \mathbb{H}$ is determined by the
algebra structure of H together with the antiinvolution $q \mapsto q^*$. This has various nice
corollaries. For example, the direct sum decomposition

$$\mathbb{H} = \{\text{real quaternions}\} \oplus \{\text{imaginary quaternions}\} = \mathbb{R} \oplus \mathbb{R}^3$$

is orthogonal. Also, two imaginary vectors p, q anticommute, $pq = -qp$, if and only
if the corresponding vectors of $\mathbb{R}^3$ are orthogonal. This point is the main reason that
quaternions can be applied to rotations of $\mathbb{E}^3$ and $\mathbb{E}^4$.

Write

$$U = \{\text{unit quaternions}\} = \{ q \in \mathbb{H} \mid qq^* = 1 \} = S^3 \subset \mathbb{R}^4$$

for the unit quaternions. Note that U has two structures: it is a group under
multiplication, and also has its own geometry as the sphere $S^3$. The two structures are
compatible as in 8.1. The group U generalises the multiplicative group of complex
numbers of modulus 1, which is the unit circle $S^1 \subset \mathbb{C}$.
For the next theorem, identify H and its quadratic form $|q|^2$ with $\mathbb{E}^4$ and its
Euclidean distance. The purely imaginary quaternions form a linear subspace which gets
identified with $\mathbb{E}^3$.


Theorem.
(1) For any p ∈ U, left multiplication $a_p : x \mapsto px$ defines a map H → H which is a
direct motion of $\mathbb{H} = \mathbb{E}^4$ fixing the origin; the same holds for right multiplication
$b_q : x \mapsto xq^*$.
(2) The group homomorphism $\varphi : U \times U \to SO(4)$ defined by

$$\varphi(p, q) = a_p \circ b_q : x \mapsto pxq^*$$

is surjective, and $\varphi(p, q) = \mathrm{id}_{\mathbb{H}}$ if and only if (p, q) = (1, 1) or (p, q) = (−1, −1).
(3) For any q ∈ U, the map $r_q : x \mapsto qxq^*$ is a direct motion of $\mathbb{H} = \mathbb{E}^4$, which is the
identity on real elements of H and takes pure imaginary quaternions of H to pure
imaginary quaternions. Thus it defines a rotation of the subspace $\mathbb{E}^3 \subset \mathbb{H}$ of pure
imaginary quaternions.
(4) Any q ∈ U with q ∉ R has a unique expression in the form q = cos θ + I sin θ, where
I ∈ U is a pure imaginary quaternion and θ ∈ (0, π). Then $r_q = \mathrm{Rot}(I, 2\theta)$ is the
rotation of $\mathbb{R}^3$ about the directed axis defined by I through the angle 2θ.
(5) The group homomorphism $\psi : U = S^3 \to SO(3)$ defined by

$$\psi(q) = r_q$$

is surjective, and $\psi(q_1) = \psi(q_2)$ if and only if $q_1 = \pm q_2$.

Proof. (1) It is clear that $a_p$ is a motion, since it fixes 0 and $|px|^2 = |x|^2$. Moreover,
it must be a direct motion, for example, because $\det(a_p)$ is a continuous map from the
connected set $U = S^3$ to ±1. (Several other proofs are possible, see Exercise 8.15.)
I relegate (2) to Exercise 8.22.
(3) is obvious, since a ∈ R commutes with quaternion multiplication, so $r_q(a) =
qaq^* = aqq^* = a$. Also, if $p^* = -p$, then $r_q(p) = qpq^*$ has $(r_q(p))^* = (qpq^*)^* =
qp^*q^* = -qpq^*$, so $qpq^*$ is pure imaginary.
(4) follows from Proposition 8.5.1 (4): $\mathbb{R}[q] \cong \mathbb{C}$. The equation $x^2 = -1$ has
exactly two roots ±I in C, and choosing the appropriate sign gives q = cos θ + I sin θ
with θ ∈ (0, π). Then $r_q(I) = I$ follows because $\mathbb{R}[q] \cong \mathbb{C}$, so that $q^* = q^{-1}$ and
$qIq^{-1} = I$.
Now let J, K be as in Proposition 8.5.1 (5). Then

$$\begin{aligned}
qJq^* &= (\cos\theta + I\sin\theta)\, J\, (\cos\theta - I\sin\theta) \\
&= (\cos^2\theta - \sin^2\theta) J + (2\sin\theta\cos\theta) K,
\end{aligned}$$

and similarly $qKq^* = -(2\sin\theta\cos\theta) J + (\cos^2\theta - \sin^2\theta) K$. Thus $r_q$ fixes the
directed axis defined by I, and performs a rotation by 2θ in the plane spanned by J, K.
Finally (5) follows by (4); every rotation is hit exactly twice because of the
2θ. QED
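Parts (3) to (5) can be watched in action on a concrete example. The Python sketch below (hypothetical function names; quaternions are tuples (a, b, c, d) standing for a + bi + cj + dk) takes I = i, so that J = j and K = k, and checks numerically that conjugation by q = cos θ + i sin θ fixes i, rotates the (j, k)-plane through 2θ, and is unchanged when q is replaced by −q.

```python
import math

def qmul(p, q):
    """Quaternion product with ij = k, jk = i, ki = j."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
            a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
            a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
            a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2)

def unit_quaternion(axis, theta):
    """q = cos(theta) + I sin(theta), where I has the unit vector axis as coefficients."""
    s = math.sin(theta)
    return (math.cos(theta), s * axis[0], s * axis[1], s * axis[2])

def rotate(q, v):
    """Apply r_q : x -> q x q* to the pure imaginary quaternion with coefficients v."""
    a, b, c, d = q
    x = (0.0, v[0], v[1], v[2])
    y = qmul(qmul(q, x), (a, -b, -c, -d))
    return y[1:]

theta = 0.4
q = unit_quaternion((1.0, 0.0, 0.0), theta)
minus_q = tuple(-x for x in q)
image_of_j = rotate(q, (0.0, 1.0, 0.0))   # expected (0, cos 2θ, sin 2θ)
image_of_i = rotate(q, (1.0, 0.0, 0.0))   # expected (1, 0, 0)
```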

Spheres and groups

After all this algebra, come the relations between groups of rotations and the sphere.

Corollary.
(1) There is a homeomorphism

$$SO(3) \cong S^3/{\sim},$$

where ∼ is the equivalence relation on $S^3$ that identifies antipodal points x and −x.
(2) There is a homeomorphism

$$SO(4) \cong (S^3 \times S^3)/{\approx},$$

where ≈ is the equivalence relation on $S^3 \times S^3$ that identifies (x, y) with (−x, −y).

Proof. Both statements are direct corollaries of the previous theorem together
with Theorem 7.14 and the definition of the quotient topology and its UMP discussed
in 7.5.
In more detail, by Theorem 8.5.2 (5) there is a continuous surjective map
$\psi : S^3 \to SO(3)$, with ψ(x) = ψ(y) if and only if x = y or x = −y. By the
universal mapping property 7.5 of the quotient topology, there is consequently a
continuous map $\bar\psi : (S^3/{\sim}) \to SO(3)$ that is clearly a bijection. Now $S^3$ is compact,
and therefore so is $S^3/{\sim}$ by Proposition 7.4.3. Also the subspace topology of
$SO(3) \subset \mathbb{R}^9 = \{3 \times 3 \text{ matrixes}\}$ is metric and therefore Hausdorff. Therefore all the
assumptions of Theorem 7.14 are satisfied, $\bar\psi$ is a homeomorphism, and (1) follows.
(2) is proved in exactly the same way using the map $\varphi : U \times U \to SO(4)$ of
Theorem 8.5.2 (2). QED

The statements of the corollary generalise for all n; namely, there exists
a compact topological group Spin(n) called the spinor group with a surjective
homomorphism $\pi : \mathrm{Spin}(n) \to SO(n)$ with kernel $\langle\iota\rangle$ of order 2, so that π induces
an isomorphism of groups $\mathrm{Spin}(n)/\langle\iota\rangle \to SO(n)$ that is also a homeomorphism [15].
The pleasant thing about low dimensions is the fact that the spinor groups are spheres
or products of spheres: $\mathrm{Spin}(2) \cong S^1$, $\mathrm{Spin}(3) \cong S^3$, $\mathrm{Spin}(4) \cong S^3 \times S^3$.

8.6 The group SU(2)
In this brief section, I identify the group U of unit quaternions of 8.5 as a matrix
group. This involves more linear algebra over the complex numbers, a subject that
already made a brief but important appearance in 1.11.
Let V be a 2-dimensional C-vector space together with a positive definite Hermitian
form, represented in some basis by $|z_1|^2 + |z_2|^2$, or the matrix $\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$ (see B.6 for more
details on Hermitian forms). A complex linear transformation of V that preserves this
form is unitary: thus a matrix A ∈ GL(2, C) is unitary if it satisfies ${}^hA A = I_2$, where
${}^hA$ is the Hermitian conjugate defined by $({}^hA)_{ij} = \overline{A_{ji}}$. The group of all such matrixes
is the unitary group U(2). I am interested in its subgroup, the special unitary group

$$SU(2) = \{ A \in U(2) \mid \det A = 1 \}.$$

As matrix groups, both U(2) and SU(2) are topological groups in an obvious way.

A unitary matrix A has |det A| = 1; see Exercise B.4. Thus the set of
possible values for the determinant is the unit circle $S^1$, which is connected. Thus
SU(2) is a normal subgroup, but not a connected component of U(2) in the same way
as SO(2) is in O(2).

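I write out explicitly the condition for a matrix A ∈ GL(2, C) to be special unitary
(compare 1.11.1). If $A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$, the equations are

$$a\bar a + c\bar c = 1, \quad \bar a b + \bar c d = 0, \quad b\bar b + d\bar d = 1, \quad \text{and} \quad \det A = ad - bc = 1. \tag{1}$$

One solves these equations more-or-less as in 1.11.1 to get $d = \bar a$ and $c = -\bar b$, where
$a\bar a + b\bar b = 1$; see Exercise 8.20. Thus

$$SU(2) = \left\{ \begin{pmatrix} a & b \\ -\bar b & \bar a \end{pmatrix} \;\middle|\; a, b \in \mathbb{C}, \; |a|^2 + |b|^2 = 1 \right\}.$$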

This description has an important corollary.

Corollary. The map $\begin{pmatrix} a & b \\ -\bar b & \bar a \end{pmatrix} \mapsto a + bj$ defines an isomorphism from SU(2) to the
group U of unit quaternions of 8.5.2.

Proof. Write $a = a_1 + a_2 i$ and $b = b_1 + b_2 i$. Then $a + bj = a_1 + a_2 i + b_1 j +
b_2 k$ using quaternion multiplication. The condition $|a|^2 + |b|^2 = 1$ becomes $|a_1|^2 +
|a_2|^2 + |b_1|^2 + |b_2|^2 = 1$, hence a + bj has quaternion norm 1. The map SU(2) → U
is clearly a bijection. It remains to check that the map respects multiplication, so that
it becomes a group isomorphism; this is a special case of Exercise 8.14. QED
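The remaining step, that the map respects multiplication, can at least be tested on examples. In the Python sketch below an element of SU(2) is stored as a pair of rows of complex numbers and a quaternion as a tuple (a, b, c, d); these representations and the function names are assumptions made for illustration. The check compares the image of a matrix product with the quaternion product of the images.

```python
def su2(a, b):
    """The matrix with rows (a, b) and (-conj(b), conj(a)); assumes |a|^2 + |b|^2 = 1."""
    return ((a, b), (-b.conjugate(), a.conjugate()))

def matmul2(m, n):
    """Product of two 2 x 2 complex matrixes."""
    return tuple(tuple(sum(m[r][k] * n[k][c] for k in range(2)) for c in range(2))
                 for r in range(2))

def to_quaternion(m):
    """Send the matrix to a + bj = a1 + a2 i + b1 j + b2 k, read off the first row."""
    a, b = m[0]
    return (a.real, a.imag, b.real, b.imag)

def qmul(p, q):
    """Quaternion product with ij = k, jk = i, ki = j."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
            a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
            a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
            a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2)

A = su2(complex(0.6, 0.0), complex(0.0, 0.8))
B = su2(complex(0.5, 0.5), complex(0.5, -0.5))
image_of_product = to_quaternion(matmul2(A, B))
product_of_images = qmul(to_quaternion(A), to_quaternion(B))
```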

Theorem 8.5.2 (5) on the description of SO(3) can thus be reformulated as saying
that there exists a two-to-one surjective group homomorphism SU(2) → SO(3)
(compare also Exercise 8.3). The two groups are now matrix groups (over different
fields), but the existence of the two-to-one map is by no means obvious from the
matrix description: the most convincing way of going from complexes to reals is via
quaternions.

8.7 The electron spin in quantum mechanics
This section relates the geometry of SO(3) to a fundamental attribute of elementary
particles: their spin. All the mathematics needed is at hand already; however, there is no
space in the present book to introduce all the necessary background from quantum
mechanics. For more information and insight, see Feynman's classic [7], Chapters 1–3.

The story of the electron

The story begins in 1925. Two Dutch doctoral students George Uhlenbeck and Samuel
Goudsmit, halfway through their Ph.D. program, noted that the electron inside the
atom appeared to have, besides the three known 'quantum numbers' associated with
the position of the electron, its angular momentum around the nucleus and its magnetic
field, an extra degree of freedom. They postulated the existence of an extra 'quantum
number', which they called the electron spin. This new quantum number seemed
to behave in many ways like angular momentum, so they gave the interpretation
that it corresponds to some kind of intrinsic rotational motion. However, the quantum
number appeared to have just two possible values (+) and (−), and the rotation seemed
not to have a definite axis; strange facts for a 'spinning' particle. Their advisor Paul
Ehrenfest is said to have commented: 'You are both young enough to be able to afford
a stupidity!' (he realised soon afterwards though that his students had in fact made
an important discovery).
Unknown to Uhlenbeck and Goudsmit, the experimental verification of their discovery
had been around for three years in the form of the Stern–Gerlach experiment.
In 1922 the German scientists Otto Stern and Walther Gerlach built the device illustrated
schematically in Figure 8.7a. The source emits a beam of silver atoms. The
beam is directed between the poles of a magnet, which produces a magnetic field
orthogonal to the direction of the path. As the atoms are electrically neutral, they
are not expected to experience force; they should thus pass through the device without
any change in their direction. However, a screen on the other side of the device
reveals that the atoms are in fact deflected by the magnetic field, and moreover that
they follow one of two possible paths.

Figure 8.7a The Stern–Gerlach experiment. (Labels: magnet; screen; beam of silver atoms; silver atoms with (+) spin; silver atoms with (−) spin.)

The experiment can only be understood in terms of the notion of spin. A silver
atom has an electron on an outer shell, whose intrinsic spin interacts with the magnetic
field. Atoms whose outer electron is in the (+) spin state follow a different path from
those in the (−) spin state.
The mid-1920s was of course the time when quantum mechanics was invented.
Soon after Uhlenbeck and Goudsmit's proposal, Pauli and Dirac incorporated electron
spin into the quantum mechanical theory of the electron, also known as the
Schrödinger equation. Since this is not a course about the electron, I do not need to
worry unduly with the details.

In the following, I assume a modified form of the Stern–Gerlach (SG) device, illustrated
in Figure 8.7b. This is only a thought experiment¹, explained in detail in [7],
pp. 5-1 and 5-2. An electron beam arrives from the left, and separates inside the device
S into two beams according to its spin under the action of the left-hand 'magnet'. A
combination of other 'magnets' forces the electrons back into their horizontal path;
the outcoming beam still consists of a mixture of electrons in the two spin states.
Assume now that I block the path of one of the beams inside the device, as in the
case of device S of Figure 8.7c. Then the electrons leaving the device S are all in a
definite spin state (+). In this sense, I have now 'measured' the spin of this beam of
electrons: I know precisely what state they are in. (Unfortunately, I have lost about
half my electrons along the way, but that seems to be unavoidable in this kind of game.
Compare with a large accountancy firm hired to count your money.) In particular, if I
attach another SG device S′ in the same position after the first as in Figure 8.7c, then
I know the path of all the electrons inside the device; blocking the other path then
makes no difference.

¹ The experiment cannot be carried out as described here: the electron's wave function is too fuzzy because
of quantum mechanical effects, and the separation into two rays is not apparent. The point about the silver
atom featuring in the original Stern–Gerlach experiment is that it is electrically neutral, but has a relatively
free electron on an outer shell; its motion between magnets is thus governed by the spin of the outer
electron. In the text I stick to the thought experiment involving free electrons.

Figure 8.7b The modified Stern–Gerlach device. (Label: electron beam.)

Figure 8.7c Two identical SG devices. (Labels: S, S′.)

However, let us now put another SG device T in a different spatial position in the
path of my uniform spin electron ray; see Figure 8.7d. The ray now separates again;
the electrons choose two different paths in a specific ratio (which can be measured
again by blocking one or other of the paths) depending on the position of the new
SG device. Hence knowing that the electron is in spin state (+) in one direction does
not mean that it is in spin state (+) in all directions. It registers as spin (+) or (−) in
some different direction following, it seems, a fixed dress code.

As both experiment and speculation confirm, the electron spin takes two possible
values +1 and −1, where I ignore unnecessary constants. In the framework of quantum
mechanics, such a two-state system is modelled on a 2-dimensional complex vector
space V with a definite Hermitian form on it, which I denote by bracket ( , ). Every
electron in this simple model is described by its wave function ψ ∈ V, which we
normalise to unit length (ψ, ψ) = 1.

Figure 8.7d Two different SG devices.

An SG device S in a fixed spatial position corresponds to a linear operator
$O_S : V \to V$. The possible spin states with respect to this spatial direction correspond
to the different eigenvalues of this map. In the present case, the eigenvalues
must therefore be ±1. There are corresponding normalised eigenvectors $\psi_S^+$ and $\psi_S^-$:

$$O_S(\psi_S^+) = \psi_S^+, \qquad O_S(\psi_S^-) = -\psi_S^-.$$

Quantum mechanics postulates that the operator $O_S$ is Hermitian (Exercise 8.24).
It follows that the eigenvectors are orthogonal: $(\psi_S^+, \psi_S^-) = 0$. Thus $\{\psi_S^+, \psi_S^-\}$ is a
Hermitian basis in the 2-dimensional vector space V.
The electron with wave function $\psi_S^+$ is in the (+) spin state and that with wave
function $\psi_S^-$ is in the (−) spin state. These electrons are in eigenstates of the spin
operator $O_S$. An arbitrary electron has a wave function ψ ∈ V which is a linear
combination of the basis vectors:

$$\psi = \alpha \psi_S^+ + \beta \psi_S^-.$$

Such a state is referred to as a mixed state.
An electron in a mixed state $\psi = \alpha \psi_S^+ + \beta \psi_S^-$ arriving at our SG device S passes
along the (+) or (−) path in the device with probability $|\alpha|^2$ or $|\beta|^2$ respectively.
These numbers are called probability amplitudes. Because both basis vectors $\psi_S^\pm$ and
the vector ψ are normalised to unit length, $|\alpha|^2 + |\beta|^2 = 1$; thus these probabilities
add to one.
Once we block the (−) path, the outcoming electrons are all in the (+) eigenstate:
their wave function is the eigenvector $\psi_S^+ \in V$. This explains their behaviour in a
next SG device S′ in the same spatial position as S, pictured in Figure 8.7c. The
operator corresponding to the device S′ is $O_{S'} = O_S$, and the electrons are all in the
(+) eigenstate of this operator. So they choose the two paths with probability $|\alpha|^2 = 1$,
respectively $|\beta|^2 = 0$; in other words, their path through S′ is determined.
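The rule for the two paths amounts to a one-line computation. The Python sketch below is only an illustration of that rule, with a hypothetical function name; as a convenience that is not in the text, it normalises the input first, so that any nonzero pair of coefficients can be passed in.

```python
def path_probabilities(alpha, beta):
    """Probabilities of the (+) and (-) paths for the state alpha psi_S^+ + beta psi_S^-.

    The pair (alpha, beta) is rescaled to unit length before squaring,
    so the two returned numbers always add to one.
    """
    norm2 = abs(alpha) ** 2 + abs(beta) ** 2
    if norm2 == 0:
        raise ValueError("the zero vector is not a wave function")
    return abs(alpha) ** 2 / norm2, abs(beta) ** 2 / norm2

eigenstate = path_probabilities(1, 0)        # electron already in the (+) eigenstate
mixed = path_probabilities(3, 4j)            # a mixed state with |alpha| : |beta| = 3 : 4
```

For the eigenstate the path is determined, as in the case of the device S′ above; for the mixed state the beam splits in the ratio 9 : 16.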

To perform our next thought experiment, imagine a beam of electrons leaving a device
in one of the definite eigenstates, and arriving at another device in a different spatial
position as in Figure 8.7d. The new SG device T corresponds to an operator $O_T$ and
hence to a new Hermitian basis $\{\psi_T^+, \psi_T^-\}$ of V consisting of eigenvectors of $O_T$.
I wish to study an electron ray in one of the spin eigenstates $\psi_S^\pm$, when it passes
through T. The experiment says that electrons will follow one of two possible paths
in T, and I want the probability of its taking one or other of the paths. According to
the rule spelled out in the last section, I should write the vector $\psi_S^+$ (and also $\psi_S^-$) in
terms of the new basis $\{\psi_T^+, \psi_T^-\}$ to find the probability amplitudes. This is simply a
change of basis, given by a $2 \times 2$ matrix $A_{S \to T}$, an element of GL(2, C) (in fact U(2)
as both bases are Hermitian). The task is to find $A_{S \to T}$ from S, T.
To proceed, I need to make precise the geometry of an SG device in 3-space.
Note that an SG device in physical space $\mathbb{E}^3$ determines two distinguished orthogonal
directed lines; namely, there is the distinguished direction of the electron beam, and
the distinguished direction of the magnetic field orthogonal to it; see Figure 8.7a. I can
think of these directed lines as two coordinate axes in a coordinate system, and there
is a unique way of adding a third directed coordinate axis orthogonal to the first two to
make a right-handed coordinate system in 3-space. The new system T determines in
the same way a new right-handed coordinate system in $\mathbb{E}^3$. The transformation which
gets me from S to T is a direct motion of $\mathbb{E}^3$, and thus a rotation g ∈ SO(3). (Note
that only directions matter in this discussion; the origin of the coordinate system is
not important, and I ignore translations.)
According to the earlier discussion, I need a recipe associating an element of
GL(2, C) with a transformation S → T, presumably in a continuous manner. In other
words, I need a map

$$A : SO(3) \to GL(2, \mathbb{C}).$$

It can also be argued from basic principles of quantum mechanics that the map A
should respect composition; after all, S → T followed by T → R should be the
same as S → R. Hence the map A should be a group homomorphism. This however
presents a puzzle: there is no obvious way to map SO(3) to the group of linear maps
on a 2-dimensional C-vector space (apart from the map which takes every rotation
to the identity matrix, which would contradict the experimentally observed fact that
spin does depend on direction). In fact there is absolutely no such map at all.

The solution
Although the expressions for $\psi_T^\pm$ in terms of $\psi_S^\pm$ and the rotation taking S to T
can be derived from first principles, I cannot improve on Feynman’s beautiful and
self-contained account (pp. 6-1 to 6-14 of [7]), and I just state the result: namely,
although there is no map A : SO(3) → GL(2, C), there is an obvious map

A : SU(2) → GL(2, C)

from the group SU(2) to GL(2, C); a 2 × 2 unitary matrix is certainly invertible, so
the inclusion map will do. On the other hand, SU(2) is not too different from SO(3);
by Corollary 8.5.3, they are related by a two-to-one map. Thus A can be thought of
as a two-valued function on SO(3).
Up to a knowledge of the explicit form of the map SU(2) → SO(3) that can easily
be derived from the expressions in 8.5.2, this answers the original question of how
to compute the ratio of electrons following the two paths of Figure 8.7d: S → T is
given by an element of SO(3), and there are two possible changes of basis

$$\psi_T^+ = \alpha_+ \psi_S^+ + \beta_+ \psi_S^-$$
$$\psi_T^- = \alpha_- \psi_S^+ + \beta_- \psi_S^-$$

for matrixes

$$\begin{pmatrix} \alpha_+ & \alpha_- \\ \beta_+ & \beta_- \end{pmatrix} \in SU(2)$$

which differ from each other only in a change of sign; the eigenvectors are in any
case determined only up to sign, and the physical meaning is only carried by the
amplitudes $|\alpha_\pm|$ and $|\beta_\pm|$, which are independent of the choice of signs made.
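
The independence of the sign choice can be checked numerically. The following Python sketch is only an illustration: the helper names and the sample entries are hypothetical, and it assumes the standard description of an SU(2) matrix by its first column (a, b) with |a|^2 + |b|^2 = 1, which may be normalised differently from the relations in 8.6.

```python
import cmath
import math

def su2(a, b):
    """SU(2) matrix with first column (a, b); assumes |a|^2 + |b|^2 = 1."""
    return ((a, -b.conjugate()),
            (b, a.conjugate()))

def negate(m):
    """The other matrix over the same rotation: every entry changes sign."""
    return tuple(tuple(-entry for entry in row) for row in m)

def moduli(m):
    """Absolute values of the four entries, row by row."""
    return tuple(abs(entry) for row in m for entry in row)

def determinant(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

# Hypothetical sample entries, chosen only so that |a|^2 + |b|^2 = 1.
a = cmath.rect(math.cos(0.3), 0.7)
b = cmath.rect(math.sin(0.3), -1.1)
matrix = su2(a, b)

# Both sign choices lie in SU(2) and have the same moduli.
print(moduli(matrix))
print(moduli(negate(matrix)))
```

Both printed tuples agree, since negating a complex number does not change its absolute value; only the moduli enter the physical prediction.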
One way to think of the process is to start with an SG device S and then start to
turn it around a fixed axis. This determines a path in the group SO(3) starting from the
identity. Starting from the identity matrix in SU(2), I can follow this path in SU(2),
and see what happens to the transformation matrix. It turns out that after a full turn
by 2π of my device, that is, after a loop in SO(3) returning to the identity, my path
in SU(2) takes me to the negative of the identity matrix. Following the loop in SO(3)
once again, I can continue my path in SU(2), and lo and behold! a turn of 4π returns
me to the identity matrix in SU(2).
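
The path just described can be followed numerically. The sketch below is an illustration under stated assumptions rather than the construction of 8.5.2: it identifies SU(2) with the unit quaternions, lets a unit quaternion q act by x ↦ q x q*, and assumes that cos(θ/2) + sin(θ/2) k lies over the rotation through θ about the z-axis. All function names are hypothetical.

```python
import math

def qmul(p, q):
    """Hamilton product of quaternions stored as (real, i, j, k)."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
            a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
            a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
            a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2)

def qconj(q):
    a, b, c, d = q
    return (a, -b, -c, -d)

def lift(theta):
    """Point of the lifted path over the turn by theta about the z-axis."""
    return (math.cos(theta / 2), 0.0, 0.0, math.sin(theta / 2))

def rotate(q, v):
    """Apply x -> q x q* to v in R^3, viewed as a pure imaginary quaternion."""
    x = (0.0,) + tuple(v)
    return qmul(qmul(q, x), qconj(q))[1:]

def close(u, v, tol=1e-12):
    return all(abs(s - t) < tol for s, t in zip(u, v))

x_axis = (1.0, 0.0, 0.0)
for turns in (0.25, 1.0, 2.0):
    theta = 2 * math.pi * turns
    # Print the lifted point and the image of the x-axis under the rotation.
    print(turns, lift(theta), rotate(lift(theta), x_axis))
```

Up to rounding, a quarter turn sends the x-axis to the y-axis; after one full turn the rotation is the identity but the lifted point is −1; after two full turns the lifted point is back at 1.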
This thought experiment with paths reflects the topological fact that the fundamental
group of SO(3) is Z/2, and its universal cover is the map $S^3 \to SO(3)$ of
8.5–8.6 (see a first course in topology for the language). It is also responsible for the
mysterious statement turning up frequently in physics texts, that ‘rotation by 2π does
not leave the wave function of the electron invariant, but multiplies it by (−1)’. As I
am told, this can be directly demonstrated by experiment.
As a final comment, note that in this chapter I dealt with spin for a ‘spin 1/2’ particle
such as the electron, whose spin can take two values (+) or (−). There are also
‘spin 1’ particles such as the heavy particles $Z$, $W^\pm$ which are responsible for nuclear
forces. Their spin can take the values (+), 0 or (−). Much of the discussion of this
chapter applies to such three-state systems; compare [7], Chapter 5. Their spin can
be measured by a three-way SG device. The vector space W representing spin states
is now 3-dimensional over C, and the transformation S → T between SG devices
corresponds to a map B : SO(3) → GL(3, C). In this case, there is no great mystery:
this map is, up to conjugation, the obvious inclusion map, where I think of a 3 × 3 real
orthogonal matrix as a 3 × 3 complex invertible matrix (the ‘vector representation’).
For this reason spin 1 particles are often called ‘vector particles’.

8.8 Preview of Lie groups
The topological groups GL(n, R) and O(n) are examples of Lie groups, groups whose
elements depend on a finite number of continuous parameters. Further examples of Lie groups
include the Euclidean group Eucl(n), the Lorentz group O+(1, 2), the special linear
group SL(n) (the group of invertible n × n matrixes with determinant 1), the spinor
groups Spin(n), and groups defined using the complex numbers such as the group
GL(n, C) of invertible matrixes over C. Here is a list of features of general Lie group
theory that made an appearance in this chapter:

The geometry of the group around any point can be described by d
parameters, where the number d is independent of the point chosen, and is called the
dimension of the group. Examples from Proposition 8.2 are

$$\dim O(n) = \binom{n}{2}, \qquad \dim \mathrm{Eucl}(n) = \binom{n+1}{2}.$$
A Lie group G has a number of connected components (finite or
infinite), all of them geometrically the same (homeomorphic). The component
containing the identity is a normal subgroup, and the other components are its cosets.
See 8.4 for O(n) and Exercise 8.5 for the group O+(1, 2).

Maximal compact subgroup
A connected Lie group G is homeomorphic to a
product $H \times \mathbb{R}^N$ of a compact Lie group H and a space $\mathbb{R}^N$ in which all loops are
contractible (compare 7.15). The examples of 8.3 are typical: compactness is achieved
by imposing a positive definite orthogonal or Hermitian form.

The universal cover
A connected Lie group G has a cover $\widetilde{G} \to G$ by a simply
connected Lie group $\widetilde{G}$ (possibly G itself). The typical examples are the exponential
map $\mathbb{C} \to \mathbb{C}^\times$ and the two-to-one spinor covers $S^3 \to SO(3)$ and $S^3 \times S^3 \to SO(4)$
discussed in 8.5.3.

Complexification and real forms
The group GL(n, C) is the complexification of
the group GL(n, R): the latter is a matrix group, and I can simply take complex instead
of real entries. Conversely, we say that GL(n, R) is a real form of GL(n, C). Along the
same lines, the group O(n, C) of n × n complex matrixes, which leave the standard
quadratic (!) form $\sum_i x_i^2$ invariant, is a complexification of the group O(n). However,
O(n) is not the only real form: over the complex numbers, there is no difference
between the forms $\sum_i x_i^2$ and $-x_1^2 + \sum_{i>1} x_i^2$. Thus the Lorentz group O(1, n − 1)
is also a real form of O(n, C).

Linear representations
Just like finite groups, Lie groups are often studied via their
linear (matrix) representations. In plain language, we associate to every group element
g ∈ G an n × n (complex) matrix $A_g$ so that $A_h A_g = A_{hg}$. In fancier language, this is
nothing but a group homomorphism G → GL(n, C); one familiar example is the map
A : SU(2) → GL(2, C) from 8.7.5. I recommend Fulton and Harris [9] for further

Symmetry groups in physics
Lie groups commonly appear as symmetry groups
of interesting physical systems. The mathematics of the group and the physics of the
system are often related in beautiful and nontrivial ways. The interaction occurs on
two levels: ‘classical’ (meaning Newtonian dynamics and Maxwell electromagnetic
theory) and ‘modern’ (meaning relativity theory or quantum mechanics, possibly
both). The story of the electron in 8.7.5 is the starting point of the ‘quantum’ level of
this interaction; for more discussion, turn to 9.3 and Sternberg [23].
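
The dimension counts quoted in the list above can be recovered by a simple parameter count. The Python sketch below shows one standard way to count, which is not necessarily the argument of Proposition 8.2: an n × n matrix has n^2 entries, the orthogonality condition imposes n(n + 1)/2 independent equations (one for each entry on or above the diagonal of a symmetric matrix), and a Euclidean motion needs n further parameters for its translation part. The function names are hypothetical.

```python
def dim_orthogonal(n):
    """Parameter count for O(n): n^2 entries minus n(n+1)/2 conditions."""
    entries = n * n
    conditions = n * (n + 1) // 2
    return entries - conditions

def dim_euclidean(n):
    """Parameter count for Eucl(n): an orthogonal part plus a translation."""
    return dim_orthogonal(n) + n

for n in range(1, 5):
    print(n, dim_orthogonal(n), dim_euclidean(n))
```

For n = 2 this gives dim O(2) = 1, matching the description of O(2) as a union of two circles, and dim Eucl(2) = 3.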

8.1 How much bigger is the affine group Aff(n) than the Euclidean group Eucl(n)? [Hint:
compare GL(n) and O(n) in 8.3.]
8.2 (a) Show that rotations, translations, reflections and glides of $E^2$ (Theorem 1.14)
depend respectively on 3, 2, 2 and 3 parameters.
(b) Count parameters for each of the types of motion of Theorem 1.15. (Answers:
(1) translation 3; (2) rotation 5; (3) twist 6; (4) reflection 3; (5) glide 5; (6) rotary
reflection 6. For example, a rotation is specified by a line of 3-space, which
depends on 4 parameters, plus an angle.)
8.3 Count the number of real parameters for the groups SO(3) and SU(2); verify that they
depend on the same number of parameters, as you would expect from the two-to-one
cover discussed in 8.6. [Hint: use Proposition 8.2, respectively the results of 8.6.]
8.4 Determine the connected components of GL(n, R) using Theorem 8.3 and
Proposition 8.4.
8.5 Let

$$O(1, 2) = \{ A \in GL(3, \mathbb{R}) \mid {}^t\!A J A = J \}$$

be the group of all Lorentz matrixes, which contains the Lorentz group O+(1, 2)
introduced in 8.1, Example 5. Show that this group has four connected components,
distinguished by whether a matrix preserves the cone $q_L(v) < 0$ or maps it to $q_L(v) > 0$
(that is, whether it is in O+(1, 2)), and det A = ±1. [Hint: imitate the proof of
Proposition 8.4, using the Lorentz normal form statement of Exercise B.3. Distinguish
carefully between four types of possible diagonal matrixes arising as end products.]
8.6 Let A ∈ GL(n, R) be a matrix with columns $f_i$. Following the proof of Theorem B.3
(1) carefully, show that it is possible to construct an orthonormal basis $\{e_i\}$ of $\mathbb{R}^n$, so
that in each step

$$e_i = c_{i1} f_1 + \cdots + c_{ii} f_i$$

with $c_{ii} > 0$. Let $C = (c_{ij})$ and B the matrix with columns $e_i$; check that A = BC
and that B ∈ O(n), C ∈ T+(n) (compare 8.3). Check also that the entries of B and C
depend continuously on those of A.
8.7 Write the following matrixes in the form BC of Theorem 8.3 with B ∈ O(n) and
C ∈ T+(n):
« 
√ 103
1+ √
1 3 13 2 ’1 4 .
√ , ,
’1 + 3 14

Exercises on quaternions.

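8.8 Show that 4 complex matrixes

$$1 = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \quad I = \begin{pmatrix} i & 0 \\ 0 & -i \end{pmatrix}, \quad J = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}, \quad K = \begin{pmatrix} 0 & i \\ i & 0 \end{pmatrix}$$

multiply together by the same rules as the 4 basic quaternions 1, i, j, k. Since matrix
multiplication is associative, use this to give a better proof of Proposition 8.5.1 (1).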
8.9 Complete the proof by brute force of $(pq)^* = q^* p^*$ for quaternion conjugation
(Proposition 8.5.1 (2)). Give a better proof along the lines of the previous exercise.
8.10 Study the group $G_8 = \{\pm 1, \pm i, \pm j, \pm k\}$ of unit quaternions. Write out the group
multiplication table, and find a convincing reason (or failing that, any reason) why
$G_8$ is not isomorphic to the dihedral group $D_8$ appearing in Exercise 6.5.
8.11 If p = ai + bj + ck and q = di + ej + fk are two pure imaginary quaternions,
calculate pq + qp directly using the definition of quaternion multiplication.
8.12 Prove that a pure imaginary quaternion p satisfies $p^2 = -|p|^2$. Also if p, q are
pure imaginary then pq + qp = 0 if and only if they are orthogonal with respect to
the quadratic form $a^2 + b^2 + c^2 + d^2$. [Hint: orthogonal with respect to a quadratic
form Q is expressed in terms of the associated bilinear form $\varphi(p, q) = Q(p + q) - Q(p) - Q(q)$;
apply this with $Q(q) = q q^* = -q^2$.]
8.13 Deduce that 3 vectors I, J, K ∈ H have the same multiplication table as the
quaternion basis i, j, k if and only if they are an oriented orthonormal frame of $\mathbb{R}^3$. Prove
Proposition 8.5.1 (5).
8.14 Show how to express C in terms of 2 × 2 matrixes over R of the form $\begin{pmatrix} a & b \\ -b & a \end{pmatrix}$.
Show that the algebra of 2 × 2 matrixes over C of the form $\begin{pmatrix} a & b \\ -\bar{b} & \bar{a} \end{pmatrix}$ is an algebra
isomorphic to the quaternions H. [Hint: consider the basis given in Exercise 8.8 and
compare also 8.6.]
8.15 Consider left multiplication by $M = \begin{pmatrix} a+ib & c+id \\ -c+id & a-ib \end{pmatrix}$ acting on $\mathbb{C}^2$. Write out the action
of M on $\mathbb{C}^2 = \mathbb{R}^4$ in terms of the R-basis (1, 0), (i, 0), (0, 1), (0, i) of $\mathbb{C}^2$. Prove that
the determinant of the map on $\mathbb{R}^4$ is $(a^2 + b^2 + c^2 + d^2)^2$. Use this to give another
proof that $a_q$ is direct in Theorem 8.5.2 (1).
8.16 Prove that 2 × 2 matrixes over R of the form $\begin{pmatrix} a & b \\ b & a \end{pmatrix}$ form an algebra B, and study its
properties. Why is it not very interesting? [Hint: show that B is closed under addition
and multiplication of matrixes. Find a basis over R, and write out the multiplication
8.17 By analogy with the previous question, investigate the algebra of 2 × 2 matrixes over
C of the form $\begin{pmatrix} a & b \\ -b & a \end{pmatrix}$.

8.18 Use the argument of Theorem 8.5.2 to find a unit quaternion q so that the rotation
$r_q : x \mapsto q x q^*$ is (x, y, z) ↦ (y, −x, z).
8.19 Find a unit quaternion q so that the rotation $r_q : x \mapsto q x q^*$ is x ↦ y ↦ z ↦ x.
[Hint: the effort intensive method is to use brute force. The thinking person’s method
is to represent x ↦ y ↦ z as a rotation through angle θ about directed axis L, then
use Theorem 8.5.2.]

8.20 By analogy with 1.11.1, solve the relations (1) of 8.6 to get $d = \bar{a}$, $c = -\bar{b}$. [Hint:
for example, do second line × d − third line × c, then substitute ad − bc = 1 on the
right-hand side.]
8.21 (Harder) Using the results of the two preceding exercises, show how to find a subgroup
$BO_{48}$ of the unit quaternions which has a surjective two-to-one map to the group of
rotations of the cube in SO(3).
8.22 (Harder) Complete the proof of Theorem 8.5.2 (2).
(a) Prove that $\varphi(p, q) = \mathrm{id}_H$ if and only if (p, q) = (1, 1) or (p, q) = (−1, −1).
[Hint: $p 1 q^* = 1$ if and only if p = q, and $p i p^* = i$ if and only if pi = ip if and
only if p = a + bi, etc.] Deduce that $\varphi$ induces an injective map $(S^3 \times S^3)/\pm 1 \to SO(4)$.
(b) Prove that $\varphi$ is surjective. [Hint: find a suitable $\varphi(p, q)$ to send 1 to a given
unit vector r ∈ H. Now compose with $r^*$ to assume that 1 ↦ 1, and apply
Theorem 8.5.2 (4).]
8.23 (Harder) Consider the algebra O of 2 × 2 matrixes over the quaternions H of the form
$\begin{pmatrix} a & b \\ -b^* & a^* \end{pmatrix}$, where $a^*$ is the quaternion conjugate of a as in 8.5.1.
(a) Show that O is an 8-dimensional division algebra (algebra with two-sided
multiplicative inverses for nonzero elements) over R. Find an explicit basis for O and
write out some of the multiplication table.
(b) Show that multiplication in O is not associative, but it satisfies the identity

$$x(xy) = (xx)y \quad \text{for } x, y \in O.$$

(c) Contemplate the possibility of doing projective geometry over the division
algebra O (compare the end of 5.12).
O is the algebra of Cayley numbers or octonions. For much more on this, see Conway
and Smith [4].
[Hint: you get a division algebra by introducing an octonion conjugate $\bar{a}$ such that
$a \bar{a} = |a|^2$ is positive definite, as in 8.5.1. It is easy to find examples of nonassociative
octonion multiplication; to prove the weaker identity, one possibility is to use your
basis for O over R in a brute-force proof similar to that of Proposition 8.5.1 (1) given
in the text. To do projective geometry, you have to start by thinking about the relation
x ∼ λx used to define projective space. Do not be surprised if you run into difficulty.]
Hermitian matrixes.
8.24 An n × n complex matrix A is called Hermitian if ${}^h\!A = A$. (See 8.6 for the Hermitian
conjugate ${}^h\!A$.) Show that
(a) every eigenvalue of a Hermitian matrix is real;
(b) eigenvectors for different eigenvalues are orthogonal with respect to the Hermitian
form on $\mathbb{C}^n$ (compare Step 3 in the proof of Theorem 1.11!).
9 Concluding remarks

This final chapter is quite different from the earlier ones in style and intention: I
let my hair down with a number of informal fairy stories on different topics, tying
together loose strands in the historical and mathematical argument of the book, and
opening up some new directions. In particular, I give a ‘popular science’ discussion
of some of the surprising and amazingly fertile links between the geometry, topology
and Lie group theory discussed in this book and different aspects of twentieth century
There are many other topics closely related to the main text, both frivolous and
serious, that I would have liked to write about. But life is short, and I confine myself
to a brief list of a few directions and developments. Several of these topics can form
the basis for undergraduate essays or projects.

- The classification of locally Euclidean geometries in the style of Nikulin and
  Shafarevich [18].
- Spherical trig and geometry in the history of navigation. Modern developments: GPS
  (global positioning system) devices.
- Spherical geometry and cartography (map making): Mercator’s and other projections,
  as discussed for example in [6].
- Plane and spherical geometry and plate tectonics, following for example [8], Chapter
  2. Why South America and West Africa fit together like pieces of a spherical jigsaw
  puzzle; Euler’s theorem and the classification of fault types.
- SO(3) and Euler angles, mechanics in moving frames, Coriolis forces.
- Symmetry groups in geometry. This is a vast subject, relating regular polyhedra and
  polytopes, crystallography [5, 18], the geometric patterns of the Alhambra and other
  Islamic art, Escher’s art and Penrose tilings.
- Subgroups of the symmetric group in puzzles and toys. Examples include the perfect
  shuffle groups and moves of the Rubik cube, as in [17] Chapter 19.
- Axiomatic projective geometry, leading to von Neumann’s foundations of quantum
  theory, C*-algebras and ‘noncommutative geometry’.
- Geometry and dynamics: Newton’s equations, planetary motion and conics.
- Differential geometry of curves and surfaces. The Frenet frame, intrinsic curvature
  and the Gauss–Bonnet formula.

I leave you to explore details of these fascinating topics, as well as those
sketched below, in or out of the confines of a degree course and its attendant
examinations.

9.1 On the history of geometry
9.1.1 Geometry has a very special place in the history and culture of western mathematics.
Coming at the dawn of western civilisation (350 ± 200 BC), Greek philosophy and
geometry, passed on to us by the more advanced culture of the Islamic world at the
time of the Renaissance, has played a central role in the development of western
culture, not merely for its content, but for its idea of rigour. The Greeks were not the
first to attempt to describe the world around them by ‘geometry’: that credit goes to
the ancient Mesopotamians (from 2500 BC), followed by the Egyptians (from 2000
BC). However, before the Greeks, geometry largely consisted of a bag of tricks for
calculation that worked in practice most of the time. In contrast, Greek mathematicians
elaborated the notion of logical argument. By this I do not mean the elementary
and often hairsplitting logic of a ‘Foundations’ or ‘Set theory’ or ‘Abstract algebra’
course, but the idea that understanding steps at different stages in an argument from
the ground up is at least as important as somehow getting an approximately correct
answer. This is one of the fundamental items of intellectual equipment that set western
mathematics and science apart from (and in the course of time well above) that of India
and China.
Building on sources largely unknown to us, the geometer Euclid, probably working
in Alexandria in the fourth century BC, summarised the mathematical knowledge of
the time in his 13 volume Elements. Book I deals with the basic definitions of geometry.
Euclid introduces notions such as point, line, plane, distance, angle and meets, whose
meaning is supposed to be self-evident, and enunciates certain postulates (in modern
language, axioms) concerning these notions. Lengths and angles are to be thought
of as geometric quantities in their own right, not related to any algebraic or numeric
representation. For example, one of the postulates states that two line segments are
equal if they are congruent, which makes perfect sense without having to consider
the length of a line as a number.

Figure 9.1a The parallel postulate. To meet or not to meet?

9.1.2 Most of Euclid’s postulates were for a long time beyond doubt, but the last one stood
out from the beginning as far less obvious:

If a line falls on two lines, with interior angles on one side adding to less than two right angles,
the two lines, if extended indefinitely, meet on the side on which the angles add to less than
two right angles.

This is nonobvious. Behold Figure 9.1a! Euclid’s ‘extended indefinitely’ makes it
clear that the statement involves arguing on objects that are arbitrarily distant, so
that it is in principle not verifiable. Through the ages, many alternative axioms were
formulated, which can be proved to be equivalent to Euclid’s on the basis of the other
axioms, such as:

given a line L in the plane, and a point P not on L, there exists one and only one line through
P not meeting L

(compare Figure 9.1b and Figure 3.13). Or

the sum of the angles of a triangle is equal to two right angles

(see Figure 1.16b and Theorem 3.14).
After arguably the longest dispute in intellectual history, it was discovered between
about 1810 and 1830 by Bolyai, Gauss, Lobachevsky and Schweikart (independently,
alphabetical order) that the parallel postulate cannot be a consequence of Euclid’s
other axioms: axiomatic geometries exist which are in many ways similar to Euclidean
plane geometry, sharing its aesthetic appeal and simplicity, but which do not satisfy
the parallel postulate. As János Bolyai wrote to his father,

ollyan felséges dolgokat hoztam ki, hogy magam elbámultam, s örökös kár volna elveszni; ha
meglátja Édes Apám megesméri; most többet nem szólhatok, tsak annyit: hogy semmiből egy
ujj más világot teremtettem; mind az, valamint eddig küldöttem, tsak kártyaház a toronyhoz
képest. . .

Figure 9.1b The parallel postulate in the Euclidean plane. (Labels: the unique line not meeting L; these lines all meet L.)

Or, translated from the nineteenth century Hungarian:

I deduced things so marvellous that I was enchanted myself, and it would be an eternal loss
to let them pass; Dear Father, once you see them, you will recognise their greatness yourself;
now I cannot tell you more, only this: out of the void I created a new, a different world; all that
I sent you before is like a house of cards to a tower. . .

The discovery of non-Euclidean hyperbolic geometry was indeed a landmark in
modern scientific thinking, as revolutionary and as far reaching in its implications
as the Copernican model of the solar system or Darwin’s theory of evolution. For
an account of the very interesting history, see Greenberg [11] and Bonola [3]. The
early models of hyperbolic geometry were abstract; simple coordinate models, such
as that used in Chapter 3 of this course, were developed later in the second half
of the nineteenth and the early twentieth centuries. As I said, the coordinate model
of hyperbolic geometry constructed in Chapter 3 satisfies all of Euclid’s postulates
except for the parallel postulate; the parallel postulate is therefore certainly not a
logical consequence of the others. Hyperbolic geometry soon found many applications
in different areas of mathematics and science; in particular, the notion of curvature
in differential geometry and of curved space plays a foundational role in Einstein’s
general relativity (1916).
Spherical geometry seems to have been excluded from consideration in descriptive
or axiomatic geometry from the time of Euclid for two reasons.

(a) More obviously, any two lines meet in two points (a pair of antipodal points); this is
not a very serious defect, because you can pass to the geometry of $S^2/\{\pm 1\} = \mathbb{P}^2$, in
which every pair of lines meets in just one point.
(b) Its lines do not satisfy the order condition implicit in Euclid: given three points P,
Q, R on a spherical line (great circle), it is impossible to say which of the three is
‘between’ the other two. Equivalently, a point P of a spherical line (great circle) does
not divide it into disconnected sets. That is, given a line L and a point P not on it,
every line M through P meets L both over there to the left and over there to the right
(see Figure 9.1c). In spherical geometry these are antipodal points; in the geometry
of $S^2/\{\pm 1\} = \mathbb{P}^2$, the same point.

Figure 9.1c The ‘parallel postulate’ in spherical geometry. (Label: every line through P meets L.)

Euclid’s postulates did not discuss the separation properties of points on a line:
it was supposed to be understood what it meant for A to be between P and Q on
the line segment PQ. (Compare the discussion in 7.3.3; separation is a topological
statement about the geometry.) Thus it is not surprising that spherical geometry was
overlooked; however, this is a fair indication that Euclid’s claim to rigour in a modern
sense was never really watertight.
Nevertheless, spherical geometry has been around in an ‘applied’ form for
centuries. Spherical trigonometry was studied in amazing detail by the great medieval
Islamic geometers in the context of qibla (the sacred direction to Mecca; see for
example [16]), and again from the time of Newton, to aid British ships engaged in
piracy or the slave trade to navigate around the oceans of the world and return to
the other origin at Greenwich. Because of winds and currents though, the lines of
spherical geometry, great circles, are not always the fastest way to travel. These days,
great circles are the routes taken for preference by airlines, except when no-fly zones

Descartes’ invention of coordinate geometry is another key ingredient in modern
science. It is scarcely an accident that calculus was discovered by Leibniz and Newton
(independently, alphabetical order) in the fifty years following the dissemination of
Descartes’ ideas. Interactions between the axiomatic and the coordinate-based points
of view go in both ways: coordinate geometry gives models of axiomatic geometries,
and conversely, axiomatic geometries allow the introduction of number systems and
coordinates. There are several excellent books giving systematic treatments of these
very interesting issues; I warmly recommend Hilbert’s classic [13].
As in art or music or politics, attitudes and fashions in mathematics vary quite
sharply from one generation to the next. In the second half of the nineteenth century,
up to the time of Hilbert and Poincaré, geometry was without doubt at the centre of
mathematics and of large areas of theoretical physics. This position was overturned
with the rise of abstract algebra, topology and set theoretic foundations of mathematics
around the 1920s. The blame for this lies in part with the geometers themselves, who
developed a sloppy attitude to correct statements and proofs of theorems. One example
is the type of argument that involved a ‘sufficiently general position’, which might
in favourable cases have a precise meaning within an epsilon neighbourhood of the
author. In England, there was a brilliant school of geometers between the wars in
Cambridge, which seems to have been broken up when the participants were drafted
into code breaking or aeronautics during the second world war. When the senior
author was an undergraduate at Cambridge (late 1960s), geometry in the sense of this
course was universally considered a terribly dull fuddy-duddy subject. The position
has been entirely turned around in the last 30 years, and at present geometry in
its various manifestations again claims centre stage in mathematics and theoretical
physics.

9.2 Group theory
According to the abstract definition (which is comparatively recent), an abstract group
is a set with a composition law satisfying a couple of well known axioms. However,
from the beginnings of the subject in the nineteenth century, the groups studied were
always thought of as symmetry groups, that is, as transformation groups preserving
some structure or other. For example, Ruffini, Abel and Galois considered
permutations of the roots of a polynomial equation, and the subgroup of permutations that
preserve the rules of arithmetic. From the mid-nineteenth century, many other groups
arose as geometric symmetries: finite groups such as the symmetries of the regular
polyhedra, infinite but discrete groups in the study of crystallography, that contain
translations by a lattice as a subgroup, and Lie groups such as the Euclidean group.
The idea that a group can be treated as an abstract composition law without reference
to the nature of the operators that make it up was first introduced by Cayley in 1854,
but its significance was not recognised until much later.
Let G be a group and $\Omega$ a set; I say that $\Omega$ is a G-set or that G acts on $\Omega$, if a
group homomorphism

$$\varphi : G \to \operatorname{Trans} \Omega$$

is given from G to the group of transformations of $\Omega$ (see 6.1). That is, each g ∈ G
corresponds to a transformation (bijective map) $\varphi_g : \Omega \to \Omega$, in such a way that the
abstract composition law in G corresponds to composition of transformations of $\Omega$.
In other words, G is trying to fulfil its destiny as a transformation group of $\Omega$, as
discussed in Chapter 6. One usually writes simply $\varphi_g(x) = gx$ or g(x) for the action
of g ∈ G on x ∈ $\Omega$.
The requirement that the map $\varphi$ is a homomorphism is written (gh)x = g(hx).
This looks like an associative law, but it just means that the abstract product in G
corresponds to composition of maps $\Omega \to \Omega$; compare the discussion in 2.4. Evaluating
g ∈ G on x ∈ $\Omega$ provides a map $\Phi : G \times \Omega \to \Omega$ given by $\Phi(g, x) = \varphi_g(x)$; I leave
it to you to express the condition (gh)x = g(hx) in these terms.
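
A small finite example may make the definition concrete. The Python sketch below is an illustration only, with hypothetical names throughout: it takes G to be the group of all permutations of {0, 1, 2} and lets it act on the three 2-element subsets (the edges of a triangle), then checks by brute force that (gh)x = g(hx) and that every group element acts by a bijection.

```python
from itertools import combinations, permutations

points = (0, 1, 2)
# A group element g is a tuple with g[x] the image of the point x.
group = list(permutations(points))
# The set acted on: the three 2-element subsets of the points.
edges = [frozenset(pair) for pair in combinations(points, 2)]

def compose(g, h):
    """Abstract product gh in the group: first apply h, then g."""
    return tuple(g[h[x]] for x in points)

def act(g, edge):
    """Evaluation map (g, edge) -> g(edge)."""
    return frozenset(g[x] for x in edge)

# The homomorphism condition (gh)x = g(hx), checked on every triple.
action_law_holds = all(
    act(compose(g, h), e) == act(g, act(h, e))
    for g in group for h in group for e in edges
)

# Each group element permutes the edges, so it gives a transformation.
acts_by_bijections = all(
    {act(g, e) for e in edges} == set(edges) for g in group
)

print(action_law_holds, acts_by_bijections)
```

The same brute-force check works for any finite group given by a composition function and any finite set with a candidate evaluation map.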

9.2.2 Let $\Omega$ be a G-set. I say that G acts transitively on $\Omega$ if the action takes
any point of $\Omega$ to any other. In this case $\Omega$ is a homogeneous space under G.
This idea has already appeared many times: the geometries in the earlier chapters of

