

a photon of 4-momentum $\boldsymbol{p}$ collides with an electron of 4-momentum $\boldsymbol{q}$. It is
easiest to consider the collision in the inertial frame $S$ in which the electron is at
rest and the photon is travelling along the positive $x^1$-axis (see Figure 5.4). Thus
the components of $\boldsymbol{p}$ and $\boldsymbol{q}$ in $S$ are

$$[p^\mu] = (h\nu/c,\; h\nu/c,\; 0,\; 0),$$
$$[q^\mu] = (m_e c,\; 0,\; 0,\; 0),$$

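where $\nu$ is the frequency of the photon as measured by a stationary observer in
$S$, and $m_e$ is the rest mass of the electron. Let us assume that, after the collision,
the photon and electron have 4-momenta $\bar{\boldsymbol{p}}$ and $\bar{\boldsymbol{q}}$ respectively, and that they move
off in the plane $x^3 = 0$, making angles $\theta$ and $\phi$ respectively with the $x^1$-axis. Thus

$$[\bar{p}^\mu] = \left(h\bar\nu/c,\; (h\bar\nu/c)\cos\theta,\; (h\bar\nu/c)\sin\theta,\; 0\right),$$
$$[\bar{q}^\mu] = \left(\gamma_u m_e c,\; \gamma_u m_e u\cos\phi,\; -\gamma_u m_e u\sin\phi,\; 0\right),$$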

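where $u$ is the electron's speed, $\gamma_u = (1 - u^2/c^2)^{-1/2}$, and $\bar\nu$ is the photon frequency
as measured by a stationary observer in $S$ after the collision. Conservation of total
4-momentum means that

$$\boldsymbol{p} + \boldsymbol{q} = \bar{\boldsymbol{p}} + \bar{\boldsymbol{q}},$$

which gives

$$h\nu/c + m_e c = h\bar\nu/c + \gamma_u m_e c, \tag{5.24}$$
$$h\nu/c = (h\bar\nu/c)\cos\theta + \gamma_u m_e u\cos\phi, \tag{5.25}$$
$$0 = (h\bar\nu/c)\sin\theta - \gamma_u m_e u\sin\phi. \tag{5.26}$$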

Eliminating $u$ and $\phi$ from these equations leads to the formula for Compton
scattering, which gives the frequency of the photon in $S$ after the collision:

$$\bar\nu = \nu\left[1 + \frac{h\nu}{m_e c^2}\,(1 - \cos\theta)\right]^{-1}.$$

The components of the 4-momentum $\bar{\boldsymbol{p}}$ (or $\bar{\boldsymbol{q}}$) in any other inertial frame $S'$
can be found easily by using $\bar{p}'^\mu = \Lambda^\mu{}_\nu\,\bar{p}^\nu$, where $\Lambda^\mu{}_\nu$ are the elements of the
Lorentz transformation matrix connecting the frames $S$ and $S'$.
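As a rough numerical check of the Compton formula, the sketch below evaluates the scattered frequency and confirms that the electron 4-momentum obtained from 4-momentum conservation still has squared length $m_e^2 c^2$. The function names and the illustrative frequency are assumptions made for this example; the constants are standard SI values.

```python
import math

H = 6.62607015e-34      # Planck constant, J s
C = 299792458.0         # speed of light, m/s
M_E = 9.1093837015e-31  # electron rest mass, kg

def scattered_frequency(nu, theta):
    """Photon frequency after scattering through angle theta (radians)."""
    return nu / (1.0 + (H * nu / (M_E * C**2)) * (1.0 - math.cos(theta)))

def electron_four_momentum(nu, theta):
    """Electron 4-momentum after the collision, from q_bar = p + q - p_bar."""
    k = H * nu / C
    k_bar = H * scattered_frequency(nu, theta) / C
    p = (k, k, 0.0, 0.0)
    q = (M_E * C, 0.0, 0.0, 0.0)
    p_bar = (k_bar, k_bar * math.cos(theta), k_bar * math.sin(theta), 0.0)
    return tuple(a + b - c for a, b, c in zip(p, q, p_bar))

def minkowski_square(v):
    """Squared length v.v with signature (+, -, -, -)."""
    return v[0]**2 - v[1]**2 - v[2]**2 - v[3]**2

nu = 1.0e20  # illustrative gamma-ray frequency in Hz (hypothetical value)
q_bar = electron_four_momentum(nu, math.radians(60.0))
# The ratio below should be very close to 1 if the formula is consistent.
print(minkowski_square(q_bar) / (M_E * C)**2)
```

Working with the invariant $\bar{\boldsymbol{q}}\cdot\bar{\boldsymbol{q}}$ avoids solving for $u$ and $\phi$ explicitly, which mirrors the elimination step in the text.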

5.13 Accelerating observers
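So far we have only considered inertial observers, who move at uniform speeds
with respect to one another. Let us now consider a general observer $\mathcal{O}$, who
may be accelerating with respect to some inertial frame $S$. If the observer has a
4-velocity $\boldsymbol{u}(\tau)$, where $\tau$ is the proper time measured along the worldline, then
his 4-acceleration is given by

$$\boldsymbol{a} = \frac{d\boldsymbol{u}}{d\tau}.$$

It is worth noting that, at any given event $P$, the 4-acceleration $\boldsymbol{a}$ is always
orthogonal to the corresponding 4-velocity $\boldsymbol{u}$, since $\boldsymbol{u}\cdot\boldsymbol{u} = c^2$ is constant:

$$\boldsymbol{a}\cdot\boldsymbol{u} = \frac{d\boldsymbol{u}}{d\tau}\cdot\boldsymbol{u} = \frac{1}{2}\frac{d}{d\tau}(\boldsymbol{u}\cdot\boldsymbol{u}) = \frac{1}{2}\frac{d(c^2)}{d\tau} = 0. \tag{5.27}$$
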
An accelerating observer has no inertial frame in which he or she is always at
rest. Nevertheless, at any event $P$ along the worldline we can define an
instantaneous rest frame $S'$, in which the observer $\mathcal{O}$ is momentarily at rest. Since
the observer is at rest in $S'$, the timelike basis vector $\boldsymbol{e}'_0$ of this frame must be
parallel to the 4-velocity $\boldsymbol{u}$ of the observer. The remaining spacelike basis vectors
$\boldsymbol{e}'_i$ ($i = 1, 2, 3$) of $S'$ are all orthogonal to $\boldsymbol{e}'_0$ and to one another and will depend on
the relative velocity of $S$ and $S'$ and the relative orientation of their spatial axes.
Observations made by $\mathcal{O}$ at the event $P$ thus correspond to measurements made
in the instantaneous rest frame (IRF) $S'$ at $P$. This is illustrated in Figure 5.5.
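Thus, the notion of a localised laboratory can be idealised as follows. An
observer (whether accelerating or not) carries along four orthogonal unit vectors
$\hat{\boldsymbol{e}}_\mu(\tau)$ (or tetrad), which vary along his worldline but always satisfy

$$\hat{\boldsymbol{e}}_\mu(\tau)\cdot\hat{\boldsymbol{e}}_\nu(\tau) = \eta_{\mu\nu}. \tag{5.28}$$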

Figure 5.5 The basis vectors $\boldsymbol{e}'_0$, $\boldsymbol{e}'_1$ at the event $P$ in the instantaneous rest frame
$S'$ of an observer $\mathcal{O}$ who is accelerating with respect to the inertial frame $S$.

In particular, the timelike unit vector is given by

$$\hat{\boldsymbol{e}}_0(\tau) = \hat{\boldsymbol{u}}(\tau), \tag{5.29}$$

where $\hat{\boldsymbol{u}}(\tau)$ is the normalised 4-velocity of the observer and is simply $\boldsymbol{u}(\tau)/c$. At
any event $P$ along the observer's worldline, the tetrad comprises the basis vectors
of the Cartesian IRF at the event $P$ and defines a time direction and three space
directions to which the observer will refer all measurements. Thus, the results of
any measurement made by the observer at the event $P$ are given by projections
of physical quantities (i.e. vectors and tensors) onto these tetrad vectors.
An important example occurs when the worldline of the observer intersects the
worldline of some particle at the event $P$ (at which we take the observer's proper
time to be $\tau$). If $\boldsymbol{p}$ is the 4-momentum of the particle at this event then the energy
$E$ of the particle as measured by the observer is given by

$$E/c = \boldsymbol{p}\cdot\hat{\boldsymbol{e}}_0(\tau) \quad\Longrightarrow\quad E = \boldsymbol{p}\cdot\boldsymbol{u}(\tau).$$

Similarly, the covariant components $p_i$ of the spatial momentum of the particle
as measured by the observer are given by

$$p_i = \boldsymbol{p}\cdot\hat{\boldsymbol{e}}_i(\tau).$$
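The projection $E = \boldsymbol{p}\cdot\boldsymbol{u}$ can be tried out numerically. The following is a minimal sketch, assuming the metric signature $(+,-,-,-)$ used in the text and components given in a Cartesian inertial frame; the helper names `mdot`, `four_velocity` and `measured_energy` are illustrative choices, not notation from the text.

```python
import math

C = 299792458.0  # speed of light, m/s

def mdot(a, b):
    """Minkowski scalar product with signature (+, -, -, -)."""
    return a[0] * b[0] - a[1] * b[1] - a[2] * b[2] - a[3] * b[3]

def four_velocity(vx, vy=0.0, vz=0.0):
    """Components of the 4-velocity for an observer with 3-velocity (vx, vy, vz)."""
    gamma = 1.0 / math.sqrt(1.0 - (vx * vx + vy * vy + vz * vz) / C**2)
    return (gamma * C, gamma * vx, gamma * vy, gamma * vz)

def measured_energy(p, u):
    """Energy of a particle with 4-momentum p as measured by an observer with 4-velocity u."""
    return mdot(p, u)

# A photon of energy E0 (in S) travelling along the +x direction.
E0 = 1.0e-13  # joules, illustrative value
p_photon = (E0 / C, E0 / C, 0.0, 0.0)

# An observer receding along +x at 0.6c measures a redshifted energy.
print(measured_energy(p_photon, four_velocity(0.6 * C)) / E0)
```

For an observer at rest in $S$ the function returns $E_0$; for the receding observer it returns $\gamma(1-\beta)E_0$, the longitudinal Doppler factor.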

Another example is provided by the 4-acceleration $\boldsymbol{a}$. Since at any event $P$ on the
worldline we have $\hat{\boldsymbol{e}}_0 = \hat{\boldsymbol{u}}$, the orthogonality condition (5.27) and the fact that in
the IRF $[u^\mu] = (c, \vec{0})$ imply that the components of the 4-acceleration in the IRF
are $[a^\mu] = (0, \vec{a})$. Thus the magnitude of the 3-acceleration in the IRF can be
computed from the simple invariant $\boldsymbol{a}\cdot\boldsymbol{a}$, as $|\vec{a}| = |\boldsymbol{a}\cdot\boldsymbol{a}|^{1/2}$.
It is interesting to consider how the tetrad of basis vectors $\hat{\boldsymbol{e}}_\mu(\tau)$ changes along
the worldline of an observer whose acceleration varies arbitrarily with time.
As it is transported along the observer's worldline, the tetrad must satisfy the
two requirements (5.28) and (5.29). Clearly, given $\boldsymbol{u}(\tau)$ the condition (5.29)
determines the timelike basis vector $\hat{\boldsymbol{e}}_0(\tau)$ uniquely. Unfortunately, condition
(5.28) is insufficient to determine uniquely the evolution of the spacelike
basis vectors $\hat{\boldsymbol{e}}_i(\tau)$ ($i = 1, 2, 3$), which reflects the different ways in which the
observer's local laboratory might be spinning and tumbling. An important special
case, however, is when the tetrad is 'non-rotating'.
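This last requirement needs some clarification. Clearly, the basis vectors $\hat{\boldsymbol{e}}_\mu(\tau)$ of
the tetrad at any proper time $\tau$ are related to the basis vectors $\boldsymbol{e}_\mu$ of some given
inertial frame by the Lorentz transformation

$$\hat{\boldsymbol{e}}_\mu(\tau) = \Lambda_\mu{}^\nu(\tau)\,\boldsymbol{e}_\nu.$$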

Thus the tetrad basis vectors at two successive instants must also be related to each
other by a Lorentz transformation, which can be thought of as a 'rotation' in
spacetime. A 'non-rotating' tetrad is one where the basis vectors $\hat{\boldsymbol{e}}_\mu$ change from instant
to instant by precisely the amount implied by the rate of change of $\boldsymbol{u}$ but with no
additional rotation. In other words, we accept the inevitable rotation in the timelike
plane defined by $\boldsymbol{u}$ and $\boldsymbol{a}$ but rule out any ordinary rotation of the 3-space vectors.
Since we wish to treat the time and space directions on an equal footing, we must
seek a general expression for the rate of change $d\hat{\boldsymbol{e}}_\mu/d\tau$ of a basis vector along
the worldline such that: (i) it generates the appropriate Lorentz transformation if
$\hat{\boldsymbol{e}}_\mu$ lies in the timelike plane defined by $\boldsymbol{u}$ and $\boldsymbol{a}$, and (ii) it excludes any rotation
if $\hat{\boldsymbol{e}}_\mu$ lies in any other plane, in particular any spacelike plane. A little reflection
shows that the unique answer to these requirements is

$$\frac{d\hat{\boldsymbol{e}}_\mu}{d\tau} = \frac{1}{c^2}\left[(\boldsymbol{u}\cdot\hat{\boldsymbol{e}}_\mu)\,\boldsymbol{a} - (\boldsymbol{a}\cdot\hat{\boldsymbol{e}}_\mu)\,\boldsymbol{u}\right]. \tag{5.30}$$

Any vector that undergoes the above transformation is said to be Fermi-Walker
transported along the worldline. From (5.30), we find that if $\hat{\boldsymbol{e}}_\mu$ is orthogonal to
both $\boldsymbol{u}$ and $\boldsymbol{a}$ then $d\hat{\boldsymbol{e}}_\mu/d\tau = 0$ as required. Moreover, we see that $d\hat{\boldsymbol{e}}_0/d\tau = \boldsymbol{a}/c$,
again as required.
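The Fermi-Walker condition can be checked numerically for a simple case. The sketch below assumes a uniformly accelerated observer moving along the $x$-axis (the hyperbolic worldline of Exercise 5.5), units with $c = 1$, and an arbitrary illustrative proper acceleration; the candidate tetrad vectors and all function names are choices made for this example.

```python
import math

C = 1.0  # units with c = 1 (an assumption made for readability)
A = 0.7  # proper acceleration, arbitrary illustrative value

def mdot(a, b):
    """Minkowski scalar product with signature (+, -, -, -)."""
    return a[0] * b[0] - a[1] * b[1] - a[2] * b[2] - a[3] * b[3]

def u(tau):
    """4-velocity of the uniformly accelerated observer."""
    return (C * math.cosh(A * tau / C), C * math.sinh(A * tau / C), 0.0, 0.0)

def acc(tau):
    """4-acceleration, the proper-time derivative of u."""
    return (A * math.sinh(A * tau / C), A * math.cosh(A * tau / C), 0.0, 0.0)

def e0(tau):
    """Timelike tetrad vector u/c."""
    return tuple(x / C for x in u(tau))

def e1(tau):
    """Candidate spacelike tetrad vector along the direction of acceleration."""
    return (math.sinh(A * tau / C), math.cosh(A * tau / C), 0.0, 0.0)

def fermi_walker_rhs(e, tau):
    """Right-hand side of (5.30): [(u.e) a - (a.e) u] / c^2."""
    uu, aa = u(tau), acc(tau)
    ue, ae = mdot(uu, e), mdot(aa, e)
    return tuple((ue * aa[k] - ae * uu[k]) / C**2 for k in range(4))

def derivative(f, tau, h=1e-5):
    """Central-difference estimate of df/dtau, component by component."""
    fp, fm = f(tau + h), f(tau - h)
    return tuple((fp[k] - fm[k]) / (2 * h) for k in range(4))

tau = 0.3
for e in (e0, e1):
    lhs = derivative(e, tau)
    rhs = fermi_walker_rhs(e(tau), tau)
    print(max(abs(l - r) for l, r in zip(lhs, rhs)))  # small finite-difference residual
```

The residuals are limited only by the finite-difference step, which is consistent with both vectors being Fermi-Walker transported along this particular worldline.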
A physical example of a 3-space vector that does not rotate along the worldline
is the spin (i.e. the angular momentum vector) of a gyroscope that the observer
accelerates with himself by means of forces applied to its centre of mass (so that
there are no torques). Indeed, a careful observer could set up a non-rotating tetrad
by aligning his three spatial axes using such gyroscopes.

5.14 Minkowski spacetime in arbitrary coordinates
There is no need to label events in Minkowski spacetime with the Cartesian
inertial coordinates we have used thus far. The advantage of Cartesian coordinates
$X^\mu$, which put the line element into the form$^5$

$$ds^2 = \eta_{\mu\nu}\,dX^\mu\,dX^\nu \tag{5.31}$$

(even just at a particular event $P$), is that they have a clear physical meaning, i.e.
they correspond to time and distances measured by an observer at $P$ who is at rest
in some inertial frame $S$ labelled using three-dimensional Cartesian coordinates
(we will prove this below). Nevertheless, we are free to label events in spacetime
using any arbitrary system of coordinates $x^\mu$ although, in general, the coordinates
in such an arbitrary system may not have simple physical meanings.
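Since the path of a free massive particle is a geodesic in Minkowski spacetime,
its worldline $x^\mu(\tau)$ in some arbitrary coordinate system is given by the geodesic
equation

$$\frac{d^2x^\mu}{d\tau^2} + \Gamma^\mu{}_{\nu\sigma}\frac{dx^\nu}{d\tau}\frac{dx^\sigma}{d\tau} = 0. \tag{5.32}$$
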
An inertial frame $S$ is defined as one in which a free particle moves in a straight
line with fixed speed. Thus from (5.31) it is clear that coordinates $X^\mu$, such that
(5.31) holds, define an inertial frame. In this case, the connection $\Gamma^\mu{}_{\nu\sigma}$ vanishes,
and so the worldline of a particle is given by

$$\frac{d^2X^\mu}{d\tau^2} = 0. \tag{5.33}$$

Setting $[X^\mu] = (cT, X, Y, Z)$ for the moment, the $\mu = 0$ equation (5.33) shows
that $dT/d\tau = \text{constant}$. Thus the $\mu = 1, 2, 3$ equations read

$$\frac{d^2X}{dT^2} = \frac{d^2Y}{dT^2} = \frac{d^2Z}{dT^2} = 0,$$

from which we see immediately that a free particle moves in a straight line with
constant speed in $S$.
We could label the inertial frame $S$ using three-dimensional spatial coordinates
that are not Cartesian, however. For example, we could use spherical polar
coordinates. This would correspond to making a change of variables in Minkowski
spacetime to the new system $[x^\mu] = (ct, r, \theta, \phi)$, where

$$T = t, \qquad X = r\sin\theta\cos\phi, \qquad Y = r\sin\theta\sin\phi, \qquad Z = r\cos\theta.$$

$^5$ In the interest of clarity, in this section we will denote Cartesian inertial coordinates by $X^\mu$ and an arbitrary
coordinate system by $x^\mu$.

In this case, the line element becomes

$$ds^2 = c^2\,dt^2 - dr^2 - r^2\,d\theta^2 - r^2\sin^2\theta\,d\phi^2,$$

so the metric is $[g_{\mu\nu}] = \mathrm{diag}(1, -1, -r^2, -r^2\sin^2\theta)$. From the metric we can
show that the non-vanishing components of the connection in this coordinate
system are (with $c = 1$)

$$\Gamma^1{}_{22} = -r, \qquad \Gamma^1{}_{33} = -r\sin^2\theta,$$
$$\Gamma^2{}_{12} = 1/r, \qquad \Gamma^2{}_{33} = -\sin\theta\cos\theta,$$
$$\Gamma^3{}_{13} = 1/r, \qquad \Gamma^3{}_{23} = \cot\theta,$$

together with those related to them by the symmetry $\Gamma^\mu{}_{\nu\sigma} = \Gamma^\mu{}_{\sigma\nu}$.

Thus, from (5.32), the geodesic equations for the worldline $x^\mu(\tau)$ of a free particle
are very complicated in these coordinates (exercise), in spite of the fact that, to
an observer with fixed $(r, \theta, \phi)$ coordinates (i.e. at rest in $S$), a free particle still
moves in a straight line with fixed speed.
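The connection coefficients for spherical polar coordinates can be cross-checked numerically. The sketch below is one possible approach, assuming a diagonal metric so that $\Gamma^a{}_{bc} = \tfrac{1}{2}g^{aa}(\partial_b g_{ac} + \partial_c g_{ab} - \partial_a g_{bc})$ with no sum over $a$; derivatives are estimated by central differences, and the function names are illustrative.

```python
import math

def metric(x):
    """Diagonal components g_mumu at x = (ct, r, theta, phi)."""
    _, r, theta, _ = x
    return [1.0, -1.0, -r**2, -(r * math.sin(theta))**2]

def dmetric(x, a, h=1e-6):
    """Central-difference estimate of d g_mumu / d x^a for every mu."""
    xp, xm = list(x), list(x)
    xp[a] += h
    xm[a] -= h
    return [(p - m) / (2 * h) for p, m in zip(metric(xp), metric(xm))]

def christoffel(x, a, b, c):
    """Gamma^a_{bc} for a diagonal metric (this shortcut is not valid otherwise)."""
    g = metric(x)
    term = 0.0
    if a == c:
        term += dmetric(x, b)[a]   # d_b g_ac
    if a == b:
        term += dmetric(x, c)[a]   # d_c g_ab
    if b == c:
        term -= dmetric(x, a)[b]   # d_a g_bc
    return 0.5 * term / g[a]

r, theta = 2.0, 0.7
x = (0.0, r, theta, 0.3)
print(christoffel(x, 1, 2, 2), -r)
print(christoffel(x, 1, 3, 3), -r * math.sin(theta)**2)
print(christoffel(x, 2, 3, 3), -math.sin(theta) * math.cos(theta))
print(christoffel(x, 3, 2, 3), 1.0 / math.tan(theta))
```

Each printed pair compares the numerical estimate with the closed-form expression; a symbolic algebra package would give the same results exactly, but the finite-difference version needs only the standard library.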
Alternatively, we could use three-dimensional Cartesian coordinates to label
points in a non-inertial frame $S'$ that is accelerating with respect to $S$. As an
example, consider transforming from $[X^\mu] = (cT, X, Y, Z)$ to a new system of
coordinates $[x^\mu] = (ct, x, y, z)$, where $(t, x, y, z)$ are defined by the equations$^6$

$$T = t, \qquad X = x\cos\omega t - y\sin\omega t, \qquad Y = x\sin\omega t + y\cos\omega t, \qquad Z = z.$$

Thus points with constant $(x, y, z)$ values (i.e. the values are fixed in $S'$) rotate
with angular speed $\omega$ about the $Z$-axis of $S$ (see Figure 5.6). Substituting these
definitions into (5.31), the line element becomes

$$ds^2 = \left[c^2 - \omega^2(x^2 + y^2)\right]dt^2 + 2\omega y\,dt\,dx - 2\omega x\,dt\,dy - dx^2 - dy^2 - dz^2.$$

Figure 5.6 The coordinate system $(x, y, z)$ rotating relative to the inertial
coordinate system $(X, Y, Z)$.

$^6$ For a full discussion, see for example J. Foster & J. D. Nightingale, A Short Course in General Relativity,
Springer-Verlag, 1995.

The geodesic equations (5.32) are then (exercise)

$$\ddot{t} = 0,$$
$$\ddot{x} - \omega^2 x\,\dot{t}^2 - 2\omega\,\dot{y}\,\dot{t} = 0,$$
$$\ddot{y} - \omega^2 y\,\dot{t}^2 + 2\omega\,\dot{x}\,\dot{t} = 0,$$
$$\ddot{z} = 0,$$

where the dots denote differentiation with respect to proper time $\tau$. These
equations give the worldline $x^\mu(\tau)$ of a free particle in this coordinate system. Once
again, the first equation implies that $dt/d\tau = \text{constant}$, so that we can replace
the dots in the remaining three equations with derivatives with respect to $t$.
Multiplying through by the rest mass $m$ of the particle and rearranging, these
equations become

$$m\frac{d^2x}{dt^2} = m\omega^2 x + 2m\omega\frac{dy}{dt},$$
$$m\frac{d^2y}{dt^2} = m\omega^2 y - 2m\omega\frac{dx}{dt},$$
$$m\frac{d^2z}{dt^2} = 0,$$

or, in 3-vector notation,

$$m\frac{d^2\vec{x}}{dt^2} = -m\,\vec{\omega}\times(\vec{\omega}\times\vec{x}) - 2m\,\vec{\omega}\times\frac{d\vec{x}}{dt}, \tag{5.34}$$

where $\vec{x} = (x, y, z)$ and $\vec{\omega} = (0, 0, \omega)$. Thus we recover the equation of motion
for a free particle in a rotating frame of reference, with the familiar centrifugal
and Coriolis terms on the right-hand side. We note, however, that the
coordinate $t$ is the time measured by clocks at rest in the non-rotating system $S$,
since we have set $t = T$. It is possible to rewrite the equation of motion in terms of
the proper time measured by an observer at some fixed position in $S'$, but to
do so would involve replacing (5.34) by a more complicated equation that tends to
conceal the Coriolis and centrifugal forces. Note that $t$ is exactly the proper time
for an observer situated at the common origin $O$ of the two systems, so observers
close to $O$ who are at rest in $S'$ would accept (5.34) as (approximately) valid.
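A quick numerical experiment illustrates (5.34). The sketch below takes a straight-line, constant-velocity trajectory in $S$, maps it into the rotating coordinates, and compares a finite-difference acceleration with the centrifugal and Coriolis terms. The angular speed, initial position, velocity and step size are arbitrary illustrative values, and the motion is restricted to the $z = 0$ plane.

```python
import math

OMEGA = 0.3  # angular speed of the rotating coordinates (illustrative value)

def rotating_position(t, x0=(1.0, 2.0), v=(0.5, -0.2)):
    """(x, y) in the rotating coordinates of a particle that is free in S."""
    X = x0[0] + v[0] * t
    Y = x0[1] + v[1] * t
    c, s = math.cos(OMEGA * t), math.sin(OMEGA * t)
    # Inverse of X = x cos(wt) - y sin(wt), Y = x sin(wt) + y cos(wt).
    return (X * c + Y * s, -X * s + Y * c)

def velocity(t, h=1e-3):
    """Central-difference estimate of (dx/dt, dy/dt)."""
    p, m = rotating_position(t + h), rotating_position(t - h)
    return tuple((a - b) / (2 * h) for a, b in zip(p, m))

def acceleration(t, h=1e-3):
    """Central-difference estimate of (d2x/dt2, d2y/dt2)."""
    p, c0, m = rotating_position(t + h), rotating_position(t), rotating_position(t - h)
    return tuple((a - 2 * b + c) / h**2 for a, b, c in zip(p, c0, m))

def residual(t):
    """Difference between the two sides of the component form of (5.34), per unit mass."""
    x, y = rotating_position(t)
    vx, vy = velocity(t)
    ax, ay = acceleration(t)
    return (ax - (OMEGA**2 * x + 2 * OMEGA * vy),
            ay - (OMEGA**2 * y - 2 * OMEGA * vx))

for t in (0.0, 1.5, 4.0):
    print(t, residual(t))  # residuals should be close to zero
```

The residuals are limited by the finite-difference error only, as expected when the apparent acceleration in the rotating coordinates is entirely accounted for by the two inertial terms.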
From these examples, we see that in general the geodesic equations can be
rather complicated both for non-inertial frames and for inertial frames labelled
by non-Cartesian spatial coordinates. Thus, when describing physical effects in
an inertial frame, it is conventional to use Cartesian spatial coordinates to label
points in the frame and so to work in a coordinate system $X^\mu$ for which (5.31) is
valid. It is then much easier to disentangle the physical effects from artefacts of
the coordinate system.

Exercises

5.1 Show that the transformation matrix for a Lorentz transformation from $S$ to $S'$ in
standard configuration is given by (5.4).
5.2 Show that, under a Lorentz transformation, the covariant components of a vector
transform as $v'_\mu = \Lambda_\mu{}^\nu v_\nu$. Hence show explicitly in component form that, for two
4-vectors $\boldsymbol{v}$ and $\boldsymbol{w}$, the scalar product $\boldsymbol{v}\cdot\boldsymbol{w}$ is invariant under a Lorentz transformation.
5.3 Prove that, for any timelike vector $\boldsymbol{v}$ in Minkowski space, there exists an inertial
frame in which the spatial components are zero.
5.4 Prove (a) that the sum of any two spacelike vectors is spacelike; and (b) that a
timelike vector and a null vector cannot be orthogonal.
5.5 For the spaceship discussed in Section 1.14, which maintains a uniform acceleration
$a$ in the $x$-direction of some inertial frame $S$, the worldline is given by

$$t = \frac{c}{a}\sinh\frac{a\tau}{c}, \qquad x = \frac{c^2}{a}\left(\cosh\frac{a\tau}{c} - 1\right), \qquad y = 0, \qquad z = 0,$$

where $\tau$ is the proper time of an astronaut on the spaceship. Show that the 4-velocity
of the rocket in the coordinate system $(ct, x, y, z)$ is given by

$$[u^\mu] = \left(c\cosh\frac{a\tau}{c},\; c\sinh\frac{a\tau}{c},\; 0,\; 0\right).$$

Hence show explicitly that $u_\mu u^\mu = c^2$ and that the spaceship's 3-velocity is
$\vec{u} = (c\tanh(a\tau/c), 0, 0)$.
5.6 Show that the 4-acceleration of the spaceship in Exercise 5.5 is given by

$$[a^\mu] = \left(a\sinh\frac{a\tau}{c},\; a\cosh\frac{a\tau}{c},\; 0,\; 0\right).$$

Hence show that $a_\mu a^\mu = -a^2$ and that the magnitude of the spaceship's 3-acceleration
in its own instantaneous rest frame is also $a$.
5.7 A spaceship has constant acceleration $g$ in the $x$-direction in its locally comoving
frame, i.e. the IRF. Show that, in an inertial frame, the spaceship's 4-velocity
$[u^\mu] = (u^0, u^1, 0, 0)$ and 4-acceleration $[a^\mu] = (a^0, a^1, 0, 0)$ satisfy $a^1 = g u^0/c$ and
$a^0 = g u^1/c$. Show also that

$$\frac{d^2u^\mu}{d\tau^2} = \frac{g^2}{c^2}\,u^\mu,$$

where $\tau$ is the proper time as measured by an occupant of the spaceship. A spaceship
accelerates at a constant rate $g = 9.5\ \mathrm{m\,s^{-2}}$ in its own locally comoving frame.
It starts out towards the centre of the Galaxy 10 kpc distant. After going 5 kpc
it decelerates at the same rate to come to rest again at the Galactic centre. The
outward journey is then repeated in reverse to come back home. Show that, in the
spaceship's frame, the elapsed travel time is 41.5 years. What is the elapsed time
for the waiting observer (or descendants) on Earth?
5.8 Show that in its own instantaneous rest frame (IRF), a particle's 4-acceleration is
given by $[a^\mu] = (0, \vec{a})$, where $\vec{a}$ is the 3-acceleration of the particle in the IRF.
5.9 Show that, in an inertial frame in which a particle's 3-acceleration $\vec{a}$ is orthogonal
to its 3-velocity $\vec{u}$, the particle's 4-acceleration is given by $[a^\mu] = \gamma_u^2\,(0, \vec{a})$.

5.10 Show that when an electron and a positron annihilate, more than one photon must
be produced.
5.11 Show that if a photon is reflected from a mirror moving parallel to its plane, then
the angle of incidence of the photon is equal to the angle of reflection.
5.12 An inertial frame $S'$ moves with constant velocity $u$ along the $x$-axis with respect
to frame $S$. A photon in frame $S'$ is fired at an angle $\theta'$ to the forward direction of
motion. Show that the angle $\theta$ measured in frame $S$ is given by

$$\tan\theta = \frac{\tan\theta'\,(1 - \beta^2)^{1/2}}{1 + \beta\sec\theta'},$$

where $\beta = u/c$.
5.13 A photon with energy $E$ collides with a stationary electron whose rest mass is $m_0$.
As a result of the collision the direction of the photon's motion is deflected through
an angle $\theta$ and its energy is reduced to $E'$. Show that

$$\frac{1}{E'} - \frac{1}{E} = \frac{1}{m_0 c^2}\,(1 - \cos\theta).$$

Deduce that the wavelength of the photon is increased by

$$\Delta\lambda = \frac{2h}{m_0 c}\sin^2\frac{\theta}{2},$$

where $h$ is Planck's constant. At what angle to the initial photon direction does
the electron move? Show that, if the photon is deflected through a right angle, and
the photon energy satisfies $E \ll m_0 c^2$, then after the interaction the angle of the
electron's motion to the direction of the photon's initial motion is $\phi = -\pi/4$.
5.14 Inverse Compton scattering occurs whenever a photon scatters off a particle moving
with a speed very nearly equal to that of light. Suppose that a particle of rest mass
$m_0$ and total energy $E$ collides head on with a photon of energy $E_\gamma$. Show that the
scattered photon has energy

$$E\left(1 + \frac{m_0^2 c^4}{4 E_\gamma E}\right)^{-1}.$$

Ultra-high-energy cosmic rays have energies up to $10^{20}$ eV. How much energy can
a cosmic ray proton transfer to a microwave background photon?
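5.15 For a pure 4-force $\boldsymbol{f}$ acting on a particle of rest mass $m_0$, show that the corresponding
3-force $\vec{f}$ satisfies

$$\vec{f} = \gamma_u m_0\,\vec{a} + \frac{\vec{f}\cdot\vec{u}}{c^2}\,\vec{u}.$$

Hence show that $\vec{a}$ is only parallel to $\vec{f}$ when $\vec{f}$ is either parallel or orthogonal
to $\vec{u}$. Show further that, in these two cases, one has $\vec{f} = \gamma_u^3 m_0\,\vec{a}$ and $\vec{f} = \gamma_u m_0\,\vec{a}$
respectively.
5.16 For a pure 4-force $\boldsymbol{f}$ acting on a particle of rest mass $m_0$, show that

$$[f^\mu] = \gamma_u\left(\frac{\vec{f}\cdot\vec{u}}{c},\; \vec{f}\right).$$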
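5.17 In Minkowski spacetime, consider an emitter $\mathcal{E}$ moving at speed $v$ along the positive
$x^1$-axis of the frame $S$ in which a receiver $\mathcal{R}$ is at rest. Prove the Doppler shift
formula

$$\frac{\nu_E}{\nu_R} = \gamma_v\,(1 - \beta\cos\theta),$$

where $\beta = v/c$ and $\theta$ is the angle made by the photon trajectory with the $x^1$-axis of $S$. Show that
this expression can be written in the manifestly covariant way

$$\frac{\nu_E}{\nu_R} = \frac{\boldsymbol{k}\cdot\boldsymbol{u}_E}{\boldsymbol{k}\cdot\boldsymbol{u}_R},$$

where $\boldsymbol{k}$ is the photon 4-wavevector and $\boldsymbol{u}_E$ and $\boldsymbol{u}_R$ are the 4-velocities of $\mathcal{E}$ and
$\mathcal{R}$ respectively.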
5.18 An astronaut on the space rocket in Exercise 5.5 refers all his measurements to an
orthonormal tetrad $\hat{\boldsymbol{e}}_\mu(\tau)$ that comprises the basis vectors of a Cartesian
instantaneous rest frame $S'$ at proper time $\tau$. Suppose that at $\tau = 0$ the tetrad coincides
with the fixed basis vectors $\boldsymbol{e}_\mu$ of the $(ct, x, y, z)$ coordinate system in the inertial
frame $S$ and that the rocket is not rotating in any way. Show that, in the $(ct, x, y, z)$
coordinate system, the components of the astronaut's orthonormal tetrad at some
later proper time $\tau$ are

$$[\hat{e}_0^\mu] = \left(\cosh\frac{a\tau}{c},\; \sinh\frac{a\tau}{c},\; 0,\; 0\right),$$
$$[\hat{e}_1^\mu] = \left(\sinh\frac{a\tau}{c},\; \cosh\frac{a\tau}{c},\; 0,\; 0\right),$$
$$[\hat{e}_2^\mu] = (0, 0, 1, 0),$$
$$[\hat{e}_3^\mu] = (0, 0, 0, 1).$$

The astronaut observes photons that were emitted with frequency $\nu_0$ from a star that
is stationary at the origin of $S$. Show that the frequency of the photons as measured
by the astronaut at proper time $\tau$ is given by

$$\nu = \nu_0\exp(-a\tau/c).$$

5.19 At some event $P$ in Minkowski spacetime, the worldline of a particle (either massive
or massless) and an observer cross. If, at this event, the particle has 4-momentum
$\boldsymbol{p}$ and the observer has 4-velocity $\boldsymbol{u}$ then show that the observer measures the
magnitude of the spatial momentum of the particle to be

$$|\vec{p}| = \left[\frac{(\boldsymbol{p}\cdot\boldsymbol{u})^2}{c^2} - \boldsymbol{p}\cdot\boldsymbol{p}\right]^{1/2}.$$
5.20 Repeat Exercise 1.10 using 4-vectors.
5.21 In Minkowski spacetime, the coordinates $(cT, X, Y, Z)$ correspond to a Cartesian
inertial frame. The coordinates $(ct, r, \theta, \phi)$ are related to them by $T = t$ and the equations

$$X = r\sin\theta\cos\phi, \qquad Y = r\sin\theta\sin\phi, \qquad Z = r\cos\theta.$$

Obtain the special-relativistic equations of motion of a free particle in the $(ct, r, \theta, \phi)$
coordinate system, and interpret these equations physically.
5.22 Repeat Exercise 5.21 for the coordinates $(ct, \rho, \phi, z)$ that are related to the Cartesian
inertial coordinates $(cT, X, Y, Z)$ by

$$T = t,$$
$$X = \rho\cos\phi\cos\omega t - \rho\sin\phi\sin\omega t,$$
$$Y = \rho\cos\phi\sin\omega t + \rho\sin\phi\cos\omega t,$$
$$Z = z,$$

where $\omega$ is a constant.

6 Electromagnetism

At the time special relativity was devised only two forces were known,
electromagnetism and gravity. As mentioned in Chapter 1, it was electromagnetism that
actually led to the development of special relativity. Therefore, we now discuss
electromagnetism in some detail; in particular its relativistic formulation. This
will introduce a number of ideas that we will use later in developing and applying
a relativistic formulation of gravity, namely general relativity. Our guiding prin-
ciple here is to derive tensorial equations in Minkowski spacetime. This makes
it possible to express the theory in a form that is independent of the coordinate
system used. We will see that a consistent theory of electromagnetism follows
from saying that there exists a pure 4-force that depends linearly on 4-velocity
and also on a certain property of a particle, namely its charge q. Even if one has
no prior knowledge of electromagnetism, one can derive the complete theory in
a few lines using this basic assumption and occasional appeals to simplicity.

6.1 The electromagnetic force on a moving charge
In some inertial frame $S$, the 3-force on a particle of charge $q$ moving in an
electromagnetic field is

$$\vec{f} = q\,(\vec{E} + \vec{u}\times\vec{B}),$$

where $\vec{u}$ is the particle's 3-velocity in $S$. The 3-vector fields $\vec{E}$ and $\vec{B}$ are the
electric and magnetic fields as measured in $S$. This equation suggests that for the
proper relativistic formulation we should write down a tensor equation in
four-dimensional spacetime in which the electromagnetic 4-force $\boldsymbol{f}$ depends linearly
on the particle's 4-velocity $\boldsymbol{u}$. Thus we are led to an equation of the form

$$\boldsymbol{f} = q\,\boldsymbol{F}\cdot\boldsymbol{u}, \tag{6.1}$$

where $\boldsymbol{F}$ must be a rank-2 tensor in order to make a 4-force from a 4-velocity.
We call $\boldsymbol{F}$ the electromagnetic field tensor. The scalar $q$ is some property of the
particle that determines the strength of the electromagnetic force upon it (i.e. its
charge).
We could develop the theory entirely in terms of coordinate-independent
4-vectors and 4-tensors. Nevertheless, if we label points in spacetime with some
arbitrary coordinate system $x^\mu$, we may express (6.1) in component form as

$$f_\mu = q\,F_{\mu\nu}\,u^\nu,$$

where the $F_{\mu\nu}$ are the covariant components of $\boldsymbol{F}$ in our chosen coordinate
system. In order that the rest mass of a particle is not altered by the action of
the electromagnetic force we require the latter to be a pure force, so that for any
4-velocity $\boldsymbol{u}$ we have $\boldsymbol{u}\cdot\boldsymbol{f} = 0$. In component form this reads

$$f_\mu u^\mu = q\,F_{\mu\nu}\,u^\mu u^\nu = 0,$$

which implies that the electromagnetic field tensor must be antisymmetric, i.e.

$$F_{\mu\nu} = -F_{\nu\mu}.$$

The contravariant components of $\boldsymbol{F}$ are given by

$$F^{\mu\nu} = g^{\mu\rho} g^{\nu\sigma} F_{\rho\sigma},$$

where the $g^{\mu\nu}$ are the contravariant components of the metric tensor in our
coordinate system. Since $g^{\mu\nu}$ is symmetric, it is clear that $F^{\mu\nu} = -F^{\nu\mu}$ also.

6.2 The 4-current density
So far we have found only the relativistic form of the electromagnetic force on
an idealised point particle with charge $q$ and 4-velocity $\boldsymbol{u}$, in terms of some as
yet undetermined rank-2 antisymmetric tensor $\boldsymbol{F}$. In order to develop the theory
further, we must now construct the field equations of the theory, which determine
the electromagnetic field tensor $\boldsymbol{F}(x)$ at any point in spacetime in terms of charges
and currents. To construct these field equations, we must first find a properly
relativistic (or covariant) way of expressing the source term. In other words, we
need to identify the 4-tensor, defined at each event in spacetime, that acts as the
source of the electromagnetic field.
Let us consider some general time-dependent charge distribution. At each
event $P$ in spacetime we can characterise the distribution completely by giving
the charge density $\rho$ and 3-velocity $\vec{u}$ as measured in some inertial frame. For
simplicity, let us consider the fluid in the frame $S$ in which $\vec{u} = \vec{0}$ at $P$. In this
frame, the (proper) charge density is given by $\rho_0 = q n_0$, where $q$ is the charge
on each particle and $n_0$ is the number of particles in a unit volume. In some
other frame $S'$, moving with speed $v$ relative to $S$, the volume containing a fixed
number of particles will be Lorentz contracted along the direction of motion (see
Figure 6.1). Hence in $S'$ the number density of particles is $n' = \gamma_v n_0$, from which
we obtain

$$\rho' = \gamma_v\,\rho_0.$$

Figure 6.1 The Lorentz contraction of a fluid element in the direction of motion:
a length $l$ along the direction of motion is contracted to $l' = l/\gamma$.

Thus we see that the charge density is not a 4-scalar but does transform as the
0-component of a 4-vector. This suggests that the source term in the
electromagnetic field equations should be a 4-vector. At each point in spacetime, the obvious
choice is

$$\boldsymbol{j}(x) = \rho_0(x)\,\boldsymbol{u}(x),$$

where $\rho_0(x)$ is the proper charge density of the fluid (i.e. that measured by an
observer comoving with the local flow) and $\boldsymbol{u}(x)$ is its 4-velocity. The squared
length of this 4-current density $\boldsymbol{j}$ at any event is

$$\boldsymbol{j}\cdot\boldsymbol{j} = \rho_0^2 c^2.$$

In an inertial frame $S$ the components of the 4-current density $\boldsymbol{j}$ are

$$[j^\mu] = \rho_0\gamma_u\,(c, \vec{u}) = (c\rho, \vec{j}),$$

where $\rho$ is the charge density as measured in $S$ and $\vec{j}$ is the relativistic 3-current
density in $S$. Thus, we see that $c^2\rho^2 - j^2$ is a Lorentz invariant, where $j^2 = \vec{j}\cdot\vec{j}$.
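The invariance of $c^2\rho^2 - j^2$ is easy to check numerically. The sketch below builds the components $(c\rho, \vec{j})$ from a proper charge density and a 3-velocity, and compares the invariant with $\rho_0^2 c^2$; the function names and the sample values are assumptions made for illustration.

```python
import math

C = 299792458.0  # speed of light, m/s

def four_current(rho0, u3):
    """Components (c*rho, j) of j = rho0 * u for proper density rho0 and 3-velocity u3."""
    speed2 = sum(ui * ui for ui in u3)
    gamma = 1.0 / math.sqrt(1.0 - speed2 / C**2)
    return (rho0 * gamma * C,) + tuple(rho0 * gamma * ui for ui in u3)

def invariant(j):
    """c^2 rho^2 - |j|^2, i.e. j.j with signature (+, -, -, -)."""
    return j[0]**2 - j[1]**2 - j[2]**2 - j[3]**2

rho0 = 2.0e-6  # proper charge density in C/m^3, illustrative value
for u3 in ((0.0, 0.0, 0.0), (0.6 * C, 0.0, 0.0), (0.3 * C, 0.4 * C, 0.0)):
    j = four_current(rho0, u3)
    rho = j[0] / C  # charge density measured in this frame, gamma * rho0
    print(rho / rho0, invariant(j) / (rho0 * C)**2)
```

The first printed number is the Lorentz factor $\gamma_u$ by which the measured charge density exceeds $\rho_0$; the second should remain equal to one to within rounding for every velocity.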

6.3 The electromagnetic field equations
We are now in a position to write down the electromagnetic field equations. The
simplest way in which to relate the rank-2 electromagnetic field tensor $\boldsymbol{F}$ to the
4-vector $\boldsymbol{j}$ is to contract $\boldsymbol{F}$ with some other 4-vector. Since there are no more
physical 4-vectors associated with the theory, the only other 4-vector that the field
equations can contain is the 4-gradient $\boldsymbol{\nabla}$. Thus the field equations must be of the
form

$$\boldsymbol{\nabla}\cdot\boldsymbol{F} = k\,\boldsymbol{j}, \tag{6.2}$$

where $k$ is an unimportant constant related to our choice of units. In order to make
our final results more familiar, let us work in Cartesian inertial coordinates $x^\mu$
corresponding to some inertial frame $S$. In such a system, the covariant derivative
reduces simply to the partial derivative, and so we can write (6.2) in component
form as

$$\partial_\mu F^{\mu\nu} = k\,j^\nu. \tag{6.3}$$

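We can use this field equation to obtain the law for the conservation of charge.
If we take the partial derivative $\partial_\nu$ of (6.3), we obtain

$$\partial_\nu\partial_\mu F^{\mu\nu} = k\,\partial_\nu j^\nu. \tag{6.4}$$

However, since $F^{\mu\nu}$ is antisymmetric, we can write the scalar on the left-hand
side as

$$\partial_\nu\partial_\mu F^{\mu\nu} = -\partial_\nu\partial_\mu F^{\nu\mu} = -\partial_\mu\partial_\nu F^{\nu\mu} = -\partial_\nu\partial_\mu F^{\mu\nu},$$

where the second step uses the fact that partial derivatives commute and the third
simply relabels the dummy indices. From this we deduce that $\partial_\nu\partial_\mu F^{\mu\nu} = 0$. Thus the
right-hand side of (6.4) must also be zero, so that

$$\partial_\mu j^\mu = 0.$$

Using 3-vector notation in the frame $S$, we may write this in a more familiar way:

$$\frac{\partial\rho}{\partial t} + \vec{\nabla}\cdot\vec{j} = 0,$$

which expresses the conservation of charge. This equation has the same form as
the non-relativistic equation of charge continuity, but the relativistic expressions
for $\rho$ and $\vec{j}$ must be used in it.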
It is clear, however, that we do not yet have a viable theory. The field equations
of the theory are given by (6.3), but there are six independent components in
$F^{\mu\nu}$ and only four field equations. Evidently our theory is under-determined as it
stands. This suggests that $\boldsymbol{F}$ could be constructed from a 4-vector 'potential' $\boldsymbol{A}$.
Again working in Cartesian inertial coordinates $x^\mu$, let us write

$$F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu. \tag{6.5}$$

Thus $F_{\mu\nu}$ is antisymmetric by construction and contains only four independent
fields $A_\mu$. Using the field equation (6.3), we can write

$$k\,j_\nu = k\,\eta_{\nu\sigma}\,j^\sigma = \eta_{\nu\sigma}\,\partial_\mu F^{\mu\sigma} = \partial_\mu F^\mu{}_\nu,$$

where we have used the fact that the metric coefficients in Cartesian inertial
coordinates $x^\mu$ are constants.$^1$ Hence, by substituting into the expression (6.5),
we obtain the electromagnetic field equations in terms of the 4-vector potential $\boldsymbol{A}$:

$$\partial_\mu\partial^\mu A_\nu - \partial_\mu\partial_\nu A^\mu = k\,j_\nu. \tag{6.6}$$

Alternatively, we can express electromagnetism entirely in terms of the
electromagnetic field tensor $\boldsymbol{F}$. In this case, we require the two field equations

$$\partial_\mu F^{\mu\nu} = k\,j^\nu,$$
$$\partial_\sigma F_{\mu\nu} + \partial_\nu F_{\sigma\mu} + \partial_\mu F_{\nu\sigma} = 0, \tag{6.7}$$

where the second of these is straightforwardly derived from (6.5). Using the
antisymmetrisation operation described in Section 4.3, the second equation can
also be written very succinctly as $\partial_{[\sigma}F_{\mu\nu]} = 0$. The constant $k$ may be found by
demanding consistency with the standard Maxwell equations (see Section 6.5). In
SI units we have $k = \mu_0$, where $\epsilon_0\mu_0 = 1/c^2$.

6.4 Electromagnetism in the Lorenz gauge
Suppose that we add an arbitrary 4-vector $\boldsymbol{Q}$ to the 4-potential $\boldsymbol{A}$. Thus, in
component form (in Cartesian inertial coordinates $x^\mu$, for example) we have

$$A^{(\mathrm{new})}_\mu = A_\mu + Q_\mu. \tag{6.8}$$

Note that this is not a coordinate transformation. We are still working in the same
set of coordinates $x^\mu$ but have defined a new vector $\boldsymbol{A}^{(\mathrm{new})}$, whose components
in this basis are given by (6.8). The new electromagnetic field tensor is then
given by

$$F^{(\mathrm{new})}_{\mu\nu} = \partial_\mu A^{(\mathrm{new})}_\nu - \partial_\nu A^{(\mathrm{new})}_\mu = \partial_\mu A_\nu - \partial_\nu A_\mu + \partial_\mu Q_\nu - \partial_\nu Q_\mu.$$

Clearly, we will recover the original electromagnetic field tensor provided that

$$\partial_\mu Q_\nu = \partial_\nu Q_\mu.$$

This equation can be satisfied if $Q_\mu$ is the gradient of some scalar field ($\psi$, say), so
that $Q_\mu = \partial_\mu\psi$. Thus we have uncovered a gauge freedom in the theory: we are
free to add the gradient of any scalar field $\psi$ to the 4-vector potential $\boldsymbol{A}$, giving

$$A^{(\mathrm{new})}_\mu = A_\mu + \partial_\mu\psi, \tag{6.9}$$

and still recover the same electromagnetic field tensor and hence the same
electromagnetic field equations. The transformation (6.9) is an example of a gauge
transformation and, as stated above, is distinct from a coordinate transformation.

$^1$ In fact, such an operation is valid in any coordinate system. As we showed in Chapter 4, the covariant
derivative of the metric tensor is identically zero, which means that we can interchange the order of index
raising or lowering and covariant differentiation without affecting the result.
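In the field equations

$$\partial_\mu\partial^\mu A_\nu - \partial_\mu\partial_\nu A^\mu = \mu_0\,j_\nu$$

the second term on the left-hand side can be written as $\partial_\nu(\partial_\mu A^\mu)$. Thus, we can
make this term zero by choosing a scalar field $\psi$ such that

$$\partial_\mu A^\mu = 0. \tag{6.10}$$

This condition is called the Lorenz gauge. It is worth noting that the condition
(6.10) is preserved by any further gauge transformation $A_\mu \to A_\mu + \partial_\mu\psi$ if and
only if $\partial_\mu\partial^\mu\psi = 0$.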
Adopting the Lorenz gauge allows the electromagnetic field equations to be
written very simply as

$$\partial_\mu\partial^\mu A_\nu = \partial^\mu\partial_\mu A_\nu = \mu_0\,j_\nu.$$

It is usual to write the four-dimensional Laplacian using the notation
$\Box^2 \equiv \partial_\mu\partial^\mu = \partial^\mu\partial_\mu$, where $\Box^2$ is the d'Alembertian operator.$^2$ In Cartesian inertial
coordinates $(ct, x, y, z)$,

$$\Box^2 = \frac{1}{c^2}\frac{\partial^2}{\partial t^2} - \frac{\partial^2}{\partial x^2} - \frac{\partial^2}{\partial y^2} - \frac{\partial^2}{\partial z^2}.$$

Then the electromagnetic field equations in the Lorenz gauge take the especially
simple form

$$\Box^2 A_\mu = \mu_0\,j_\mu,$$

together with the attendant gauge condition (6.10). Moreover, in the absence of
charges and currents, the right-hand side becomes zero and so $A_\mu$ has wave
solutions travelling at the speed of light, as do the components of $F_{\mu\nu}$ since in
this case we also have $\Box^2 F_{\mu\nu} = 0$.

$^2$ This operator should properly be written $\boldsymbol{\nabla}^2$, which is the inner product $\boldsymbol{\nabla}\cdot\boldsymbol{\nabla}$ of the 4-gradient with itself.
However, the notation we have adopted is quite common, since it makes clearer the distinction between the
four-dimensional Laplacian and the three-dimensional Laplacian $\vec{\nabla}^2 = \vec{\nabla}\cdot\vec{\nabla}$.

6.5 Electric and magnetic fields in inertial frames
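We have not yet identified the components of $\boldsymbol{F}$ (or $\boldsymbol{A}$) with the familiar electric
and magnetic 3-vector fields $\vec{E}$ and $\vec{B}$ as observed in some Cartesian inertial frame
$S$. This is simply a matter of convention; we just have to name the components of
$\boldsymbol{A}$ (say) in a way which results in 3-vector equations in $S$ that describe the physics
correctly in terms of the traditionally defined 3-vectors $\vec{E}$ and $\vec{B}$. Thus, in some
Cartesian inertial frame $S$, the components of $\boldsymbol{A}$ are taken to be as follows:

$$[A^\mu] = \left(\frac{\phi}{c},\; \vec{A}\right),$$

where $\phi$ is the electrostatic potential and $\vec{A}$ is the traditional three-dimensional
vector potential. In terms of $\phi$ and $\vec{A}$, the Lorenz gauge condition becomes

$$\vec{\nabla}\cdot\vec{A} + \frac{1}{c^2}\frac{\partial\phi}{\partial t} = 0,$$

and, in this gauge, the field equations take the form

$$\Box^2\vec{A} = \mu_0\,\vec{j}, \qquad \Box^2\phi = \frac{\rho}{\epsilon_0}.$$

In terms of $\phi$ and $\vec{A}$, the electric and magnetic fields in $S$ are given by

$$\vec{B} = \vec{\nabla}\times\vec{A} \qquad\text{and}\qquad \vec{E} = -\vec{\nabla}\phi - \frac{\partial\vec{A}}{\partial t}. \tag{6.11}$$

It is straightforward to show that these equations lead to the Maxwell equations
in their familiar form,

$$\vec{\nabla}\cdot\vec{E} = \frac{\rho}{\epsilon_0}, \qquad \vec{\nabla}\times\vec{E} = -\frac{\partial\vec{B}}{\partial t},$$
$$\vec{\nabla}\cdot\vec{B} = 0, \qquad \vec{\nabla}\times\vec{B} = \mu_0\,\vec{j} + \mu_0\epsilon_0\frac{\partial\vec{E}}{\partial t}.$$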
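From the expressions (6.11) and (6.5) we have

$$E^i = -\delta^{ij}\partial_j\phi - c\,\partial_0 A^i = -c\,\delta^{ij}(\partial_j A_0 - \partial_0 A_j) = -c\,\delta^{ij}F_{j0},$$

where we have used the fact that $A^0 = \eta^{0\mu}A_\mu = A_0$ (and that $A^i = -\delta^{ij}A_j$). Also, we have

$$B^1 = \partial_2 A^3 - \partial_3 A^2 = \partial_3 A_2 - \partial_2 A_3 = F_{32},$$

where we have used the fact that $A^i = \eta^{i\mu}A_\mu = -A_i$. Similar results hold for $B^2$
and $B^3$. Thus we find that the covariant components of $\boldsymbol{F}$ in $S$ are given by

$$[F_{\mu\nu}] = \begin{pmatrix} 0 & E^1/c & E^2/c & E^3/c \\ -E^1/c & 0 & -B^3 & B^2 \\ -E^2/c & B^3 & 0 & -B^1 \\ -E^3/c & -B^2 & B^1 & 0 \end{pmatrix}.$$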
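One way to test this identification is to confirm that $f_\mu = qF_{\mu\nu}u^\nu$ reproduces the Lorentz 3-force and is a pure force. The sketch below assumes SI units, a Cartesian inertial frame, and the signature $(+,-,-,-)$, so that the covariant spatial components carry the opposite sign to the 3-force components; all function names and sample field values are illustrative.

```python
import math

C = 299792458.0  # speed of light, m/s

def field_tensor(E, B):
    """Covariant components F_{mu nu} built from the 3-vectors E and B."""
    Ex, Ey, Ez = (e / C for e in E)
    Bx, By, Bz = B
    return [
        [0.0,  Ex,  Ey,  Ez],
        [-Ex, 0.0, -Bz,  By],
        [-Ey,  Bz, 0.0, -Bx],
        [-Ez, -By,  Bx, 0.0],
    ]

def four_velocity(v):
    """Contravariant components u^mu for 3-velocity v."""
    gamma = 1.0 / math.sqrt(1.0 - sum(vi * vi for vi in v) / C**2)
    return (gamma * C,) + tuple(gamma * vi for vi in v)

def four_force(q, E, B, v):
    """Covariant components f_mu = q F_{mu nu} u^nu."""
    F, u = field_tensor(E, B), four_velocity(v)
    return tuple(q * sum(F[m][n] * u[n] for n in range(4)) for m in range(4))

def lorentz_force(q, E, B, v):
    """The 3-force q (E + v x B)."""
    cross = (v[1] * B[2] - v[2] * B[1], v[2] * B[0] - v[0] * B[2], v[0] * B[1] - v[1] * B[0])
    return tuple(q * (E[i] + cross[i]) for i in range(3))

q, E, B, v = 1.0, (1e3, -2e3, 5e2), (0.1, 0.2, -0.3), (1e7, 2e7, -5e6)
f, u = four_force(q, E, B, v), four_velocity(v)
gamma = u[0] / C
print([f[i + 1] / (-gamma) for i in range(3)])  # should match the 3-force below
print(lorentz_force(q, E, B, v))
print(sum(u[m] * f[m] for m in range(4)))       # u^mu f_mu, zero up to rounding
```

The factor $-\gamma_u$ relating $f_i$ to the 3-force comes from lowering the spatial index and from the relation between the 4-force and the 3-force for a pure force.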

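The corresponding electric and magnetic fields $\vec{E}'$ and $\vec{B}'$ in some other Cartesian
inertial frame $S'$ are most easily obtained by calculating the components of the
electromagnetic field tensor $\boldsymbol{F}$ or the 4-potential $\boldsymbol{A}$ in this frame. For example, if $S'$
is moving at speed $v$ relative to $S$ in standard configuration then the components
in $S'$ are given by

$$A'^\mu = \Lambda^\mu{}_\nu A^\nu \qquad\text{and}\qquad F'^{\mu\nu} = \Lambda^\mu{}_\rho\,\Lambda^\nu{}_\sigma\,F^{\rho\sigma},$$

where the matrix $\Lambda^\mu{}_\nu$ is given in Chapter 5.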
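The transformation of the fields under a boost can be sketched numerically. The example below assumes the usual standard-configuration boost along the $x$-axis (with $\Lambda^0{}_0 = \Lambda^1{}_1 = \gamma$ and $\Lambda^0{}_1 = \Lambda^1{}_0 = -\gamma\beta$, written out here because the Chapter 5 matrix is not reproduced in this section) and contravariant components $F^{0i} = -E^i/c$, $F^{ij} = F_{ij}$ obtained by raising indices with the Minkowski metric. Function names and field values are illustrative.

```python
import math

C = 299792458.0  # speed of light, m/s

def contravariant_field_tensor(E, B):
    """F^{mu nu}, with F^{0i} = -E^i/c and F^{ij} equal to the covariant F_{ij}."""
    Ex, Ey, Ez = (e / C for e in E)
    Bx, By, Bz = B
    return [
        [0.0, -Ex, -Ey, -Ez],
        [Ex,  0.0, -Bz,  By],
        [Ey,   Bz, 0.0, -Bx],
        [Ez,  -By,  Bx, 0.0],
    ]

def boost_matrix(v):
    """Lorentz boost along x with speed v (standard configuration, assumed form)."""
    beta = v / C
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    return [
        [gamma, -gamma * beta, 0.0, 0.0],
        [-gamma * beta, gamma, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, 0.0, 1.0],
    ]

def boosted_fields(E, B, v):
    """Return (E', B') in the frame S' moving at speed v along the x-axis of S."""
    F, L = contravariant_field_tensor(E, B), boost_matrix(v)
    Fp = [[sum(L[m][r] * L[n][s] * F[r][s] for r in range(4) for s in range(4))
           for n in range(4)] for m in range(4)]
    E_new = tuple(-C * Fp[0][i] for i in (1, 2, 3))
    B_new = (Fp[3][2], Fp[1][3], Fp[2][1])
    return E_new, B_new

# A pure electric field along y in S acquires a magnetic component along z in S'.
E_new, B_new = boosted_fields((0.0, 1.0e3, 0.0), (0.0, 0.0, 0.0), 0.6 * C)
print(E_new, B_new)
```

For this input the transverse electric field is enhanced by $\gamma$ and a magnetic field $B'^3 = -\gamma v E^2/c^2$ appears, while the combination $|\vec{E}|^2 - c^2|\vec{B}|^2$ is unchanged.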

6.6 Electromagnetism in arbitrary coordinates
So far we have developed electromagnetic theory in Cartesian inertial coordinates.
In general, however, we are free to label points in the Minkowski spacetime using
any arbitrary coordinate system $x^\mu$. We could have developed the entire theory
in such an arbitrary system, or even in a coordinate-independent way by using
the 4-tensors themselves rather than their components in some coordinate system.
Nevertheless, having expressed the theory in Cartesian inertial coordinates, it is
now trivial to re-express it in a form valid in arbitrary coordinates.
As shown in (6.7), the electromagnetic field equations in Cartesian inertial
coordinates, when expressed in terms of F, are given by

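$$\partial_\mu F^{\mu\nu} = \mu_0 j^\nu$$
$$\partial_\sigma F_{\mu\nu} + \partial_\nu F_{\sigma\mu} + \partial_\mu F_{\nu\sigma} = 0$$

In such a coordinate system, the partial derivative $\partial_\mu$ is identical to the covariant
derivative $\nabla_\mu$, so we can rewrite these equations as

$$\nabla_\mu F^{\mu\nu} = \mu_0 j^\nu$$
$$\nabla_\sigma F_{\mu\nu} + \nabla_\nu F_{\sigma\mu} + \nabla_\mu F_{\nu\sigma} = 0 \tag{6.12}$$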

These new equations are now fully covariant tensor equations, however, so that if
they are valid in one system of coordinates then they are valid in all coordinate
systems. Thus, (6.12) gives the electromagnetic field equations in an arbitrary
coordinate system! Once again, using the antisymmetrisation operation discussed
in Section 4.3, one can write the second equation simply as $\nabla_{[\sigma}F_{\mu\nu]} = 0$.
A similar procedure can be performed for the electromagnetic field equations
when expressed in terms of the 4-vector potential A. From (6.6), in Cartesian
inertial coordinates we have

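$$\eta^{\mu\sigma}\partial_\mu\partial_\sigma A^\nu - \eta^{\nu\sigma}\partial_\mu\partial_\sigma A^\mu = \mu_0 j^\nu$$

Once again, we can replace $\partial_\mu$ by $\nabla_\mu$, but in this case we must also replace
$\eta^{\mu\nu}$ by $g^{\mu\nu}$, to obtain

$$g^{\mu\sigma}\nabla_\mu\nabla_\sigma A^\nu - g^{\nu\sigma}\nabla_\mu\nabla_\sigma A^\mu = \mu_0 j^\nu$$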

Again we have a fully covariant tensor equation, which must therefore be valid
in any arbitrary coordinate system, the metric coefficients of which are $g_{\mu\nu}$.
In arbitrary coordinates, the electromagnetic field equations still permit the
gauge transformation

$$A_\mu^{(\mathrm{new})} = A_\mu + \nabla_\mu\psi = A_\mu + \partial_\mu\psi$$

where the last equality holds because the covariant derivative of the scalar field
$\psi$ is simply its partial derivative. We can again choose a scalar field $\psi$, so that

$$\nabla_\mu A^\mu = 0$$

which is the Lorenz gauge condition in arbitrary coordinates. In this case the
electromagnetic field equations can again be written in the form

$$\Box^2 A^\mu = \mu_0 j^\mu$$

but now the d'Alembertian operator is given by $\Box^2 = g^{\mu\nu}\nabla_\mu\nabla_\nu = \nabla_\mu\nabla^\mu$. In vacuo,
we may again write $\Box^2 A^\mu = 0$ and $\Box^2 F_{\mu\nu} = 0$. Also, charge conservation is given
in arbitrary coordinates by
$$\nabla_\mu j^\mu = 0$$

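Finally, we note that the components of F and A in two different arbitrary
coordinate systems $x^\mu$ and $x'^\mu$ are related by
$$F'^{\mu\nu} = \frac{\partial x'^\mu}{\partial x^\rho}\frac{\partial x'^\nu}{\partial x^\sigma}F^{\rho\sigma} \qquad\text{and}\qquad A'^\mu = \frac{\partial x'^\mu}{\partial x^\nu}A^\nu$$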
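
As an illustration of this transformation law, the sketch below takes the components of a 4-vector in Cartesian inertial coordinates (ct, x, y, z) and converts them to cylindrical coordinates (ct, rho, phi, z). The choice of cylindrical coordinates, the sample point and the sample components are all assumptions made for the example. The scalar formed with the appropriate metric in each system should agree.

```python
import math


def jacobian_cart_to_cyl(x, y):
    """Matrix of partial derivatives d x'^mu / d x^nu for (ct, x, y, z) -> (ct, rho, phi, z)."""
    rho2 = x * x + y * y
    rho = math.sqrt(rho2)
    return [
        [1.0, 0.0, 0.0, 0.0],
        [0.0, x / rho, y / rho, 0.0],      # d rho / d x,  d rho / d y
        [0.0, -y / rho2, x / rho2, 0.0],   # d phi / d x,  d phi / d y
        [0.0, 0.0, 0.0, 1.0],
    ]


def transform_vector(J, A):
    """A'^mu = (d x'^mu / d x^nu) A^nu."""
    return [sum(J[m][n] * A[n] for n in range(4)) for m in range(4)]


def norm_cartesian(A):
    """eta_{mu nu} A^mu A^nu with eta = diag(1, -1, -1, -1)."""
    return A[0] ** 2 - A[1] ** 2 - A[2] ** 2 - A[3] ** 2


def norm_cylindrical(A, rho):
    """g_{mu nu} A^mu A^nu with g = diag(1, -1, -rho^2, -1)."""
    return A[0] ** 2 - A[1] ** 2 - rho ** 2 * A[2] ** 2 - A[3] ** 2


# Illustrative point and contravariant components in the Cartesian system.
x, y = 3.0, 4.0
A_cart = [2.0, 1.0, -0.5, 0.25]

A_cyl = transform_vector(jacobian_cart_to_cyl(x, y), A_cart)
rho = math.hypot(x, y)
print("A' =", A_cyl)
print("invariant:", norm_cartesian(A_cart), norm_cylindrical(A_cyl, rho))
```

The individual components change, but the contraction with the metric does not, as expected of a scalar.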

6.7 Equation of motion for a charged particle
From our original considerations in Section 6.1, we see that the coordinate-
invariant manner of writing the equation of motion of a charged particle in an
electromagnetic field is

$$\frac{d\mathbf{p}}{d\tau} = m_0\frac{d\mathbf{u}}{d\tau} = q\,\mathbf{F}\cdot\mathbf{u}$$

where $m_0$ is the rest mass of the particle, p is its 4-momentum, u is its 4-velocity
and $\tau$ is the proper time measured along its worldline. Note that the first equality
holds because the electromagnetic force is a pure force.
In Cartesian inertial coordinates, this becomes
$$m_0\frac{du^\mu}{d\tau} = qF^\mu{}_\nu u^\nu$$
In a general coordinate system, however, the left-hand side is no longer valid
since the ordinary derivative of the components of the 4-velocity along the
particle's worldline must be replaced by the intrinsic derivative along the worldline.
Using the expression for the intrinsic derivative given in Chapter 3, we find
that in an arbitrary coordinate system the equation of motion of a particle in an
electromagnetic field is
$$m_0\frac{Du^\mu}{D\tau} = m_0\left(\frac{du^\mu}{d\tau} + \Gamma^\mu{}_{\nu\sigma}u^\nu u^\sigma\right) = qF^\mu{}_\nu u^\nu$$
where we have written $dx^\mu/d\tau$ as $u^\mu$ since the 4-velocity is the tangent to the
particle's worldline $x^\mu(\tau)$.
The equation for the particle's worldline in arbitrary coordinates is thus given by

$$\frac{d^2x^\mu}{d\tau^2} + \Gamma^\mu{}_{\nu\sigma}\frac{dx^\nu}{d\tau}\frac{dx^\sigma}{d\tau} = \frac{q}{m_0}F^\mu{}_\nu\frac{dx^\nu}{d\tau} \tag{6.13}$$

In the absence of an electromagnetic field (or for an uncharged particle), the
right-hand side is zero and we can recognise the result as the equation of a
geodesic.
In summary, the general procedure for converting an equation valid in Cartesian
inertial coordinates into one that is valid in an arbitrary coordinate system is as
follows:

• replace partial derivatives with covariant derivatives;
• replace ordinary derivatives along curves with intrinsic derivatives;
• replace $\eta_{\mu\nu}$ by $g_{\mu\nu}$.
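
The effect of replacing an ordinary derivative by an intrinsic derivative can be seen in a simple case. The sketch below integrates the worldline equation for an uncharged particle (zero right-hand side) in cylindrical coordinates, for which the only non-zero connection coefficients are $\Gamma^\rho{}_{\phi\phi} = -\rho$ and $\Gamma^\phi{}_{\rho\phi} = \Gamma^\phi{}_{\phi\rho} = 1/\rho$. The choice of coordinates, the Runge-Kutta integrator, the initial data and the arbitrary units are assumptions made for the example; only the $\rho$ and $\phi$ equations are integrated, since t and z have vanishing second derivatives.

```python
import math


def rhs(state):
    """Right-hand sides of the geodesic equations in the (rho, phi) plane.

    d^2 rho / d tau^2 = rho (d phi / d tau)^2
    d^2 phi / d tau^2 = -(2 / rho) (d rho / d tau)(d phi / d tau)
    """
    rho, phi, drho, dphi = state
    return (drho, dphi, rho * dphi * dphi, -2.0 * drho * dphi / rho)


def rk4_step(state, h):
    """One classical fourth-order Runge-Kutta step of size h."""
    k1 = rhs(state)
    k2 = rhs(tuple(s + 0.5 * h * k for s, k in zip(state, k1)))
    k3 = rhs(tuple(s + 0.5 * h * k for s, k in zip(state, k2)))
    k4 = rhs(tuple(s + h * k for s, k in zip(state, k3)))
    return tuple(
        s + h / 6.0 * (a + 2.0 * b + 2.0 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def integrate(state, tau_end, steps):
    """Advance the state from tau = 0 to tau_end in equal steps."""
    h = tau_end / steps
    for _ in range(steps):
        state = rk4_step(state, h)
    return state


# Start at rho = 1, phi = 0, moving purely in the phi direction.
rho, phi, _, _ = integrate((1.0, 0.0, 0.0, 0.5), tau_end=4.0, steps=4000)

# Convert back to Cartesian coordinates to inspect the path.
x, y = rho * math.cos(phi), rho * math.sin(phi)
print("x =", x, " y =", y)
```

Although $\rho$ and $\phi$ both have non-zero second derivatives, the path is the straight line x = 1 when viewed in Cartesian coordinates: the connection terms account entirely for the curvilinear coordinates.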

Exercises

6.1 Show that the second Maxwell equation in (6.7) can be written as $\partial_{[\sigma}F_{\mu\nu]} = 0$.
6.2 Show that the Maxwell equation (6.6) is unchanged under the gauge transformation
$A_\mu \to A_\mu + \partial_\mu\psi$.
6.3 In some Cartesian inertial frame S, the contravariant components of the electric and
magnetic fields are $E^i$ and $B^i$ respectively. Show that the corresponding
electromagnetic field-strength tensor has the contravariant components

$$[F^{\mu\nu}] = \begin{pmatrix}
0 & -E^1/c & -E^2/c & -E^3/c \\
E^1/c & 0 & -B^3 & B^2 \\
E^2/c & B^3 & 0 & -B^1 \\
E^3/c & -B^2 & B^1 & 0
\end{pmatrix}$$

6.4 In a Cartesian inertial coordinate system in Minkowski spacetime the field equations
of electromagnetism can be written

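$$\partial_\mu F^{\mu\nu} = \mu_0 j^\nu$$
$$\partial_\sigma F_{\mu\nu} + \partial_\nu F_{\sigma\mu} + \partial_\mu F_{\nu\sigma} = 0$$

Show that these equations are equivalent to the standard form of Maxwell's equations
in vacuo.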
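6.5 Two Cartesian inertial frames $S$ and $S'$ are in standard configuration. Show that the
components of electric and magnetic fields in the two frames are related as follows:

$$\begin{aligned}
E'^1 &= E^1, & B'^1 &= B^1, \\
E'^2 &= \gamma(E^2 - vB^3), & B'^2 &= \gamma\left(B^2 + \frac{v}{c^2}E^3\right), \\
E'^3 &= \gamma(E^3 + vB^2), & B'^3 &= \gamma\left(B^3 - \frac{v}{c^2}E^2\right).
\end{aligned}$$

Show further that $c^2B^2 - E^2$ is Lorentz invariant.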

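6.6 Show that the transformation equations derived in Exercise 6.5 can be written as

$$\begin{aligned}
\vec{E}'_\parallel &= \vec{E}_\parallel, & \vec{E}'_\perp &= \gamma\left(\vec{E}_\perp + \vec{v}\times\vec{B}_\perp\right), \\
\vec{B}'_\parallel &= \vec{B}_\parallel, & \vec{B}'_\perp &= \gamma\left(\vec{B}_\perp - \frac{1}{c^2}\vec{v}\times\vec{E}_\perp\right),
\end{aligned}$$

where $\vec{v} = (v, 0, 0)$, and $\vec{E}_\parallel$ and $\vec{E}_\perp$ denote the projections of $\vec{E}$ parallel and
orthogonal to $\vec{v}$ respectively (and similarly for $\vec{B}$). Explain why these equations must hold
for a Lorentz boost $\vec{v}$ in an arbitrary direction with respect to the axes of S.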
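6.7 Show that one may eliminate the explicit reference to the projections of $\vec{E}$ and $\vec{B}$
in Exercise 6.6 and write the transformations as
$$\vec{E}' = \gamma\left(\vec{E} + \vec{v}\times\vec{B}\right) + \frac{1-\gamma}{v^2}(\vec{v}\cdot\vec{E})\,\vec{v}$$
$$\vec{B}' = \gamma\left(\vec{B} - \frac{1}{c^2}\vec{v}\times\vec{E}\right) + \frac{1-\gamma}{v^2}(\vec{v}\cdot\vec{B})\,\vec{v}$$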

Show that E · B is a Lorentz invariant.
6.9 In an arbitrary coordinate system, the second Maxwell equation reads

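$$\nabla_\sigma F_{\mu\nu} + \nabla_\nu F_{\sigma\mu} + \nabla_\mu F_{\nu\sigma} = 0$$

Show that this can be written as

$$\partial_\sigma F_{\mu\nu} + \partial_\nu F_{\sigma\mu} + \partial_\mu F_{\nu\sigma} = 0$$

and hence show that $\partial_{[\sigma}F_{\mu\nu]} = 0$.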
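6.10 In Cartesian inertial coordinates, the equation of motion for a charged particle in
an electromagnetic field is
$$\frac{dp^\mu}{d\tau} = qF^\mu{}_\nu u^\nu$$
Show that
$$\frac{d\vec{p}}{dt} = q\left(\vec{E} + \vec{u}\times\vec{B}\right), \qquad \frac{d\mathcal{E}}{dt} = q\,\vec{E}\cdot\vec{u}$$
where $\vec{p}$ and $\mathcal{E}$ are the 3-momentum and the energy respectively of the particle in
S. Interpret these results physically.
6.11 In some inertial frame S, show that the 3-acceleration of a charged particle in an
electromagnetic field is
$$\vec{a} = \frac{d\vec{u}}{dt} = \frac{q}{m_0\gamma_u}\left[\vec{E} + \vec{u}\times\vec{B} - \frac{1}{c^2}(\vec{u}\cdot\vec{E})\,\vec{u}\right]$$
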
The equivalence principle and spacetime curvature

We are now in a position to use the experience gained in deriving a relativistic
formulation of electromagnetism (together with some flashes of inspiration from
Einstein!) to begin our formulation of a relativistic theory of gravity, namely
general relativity.

7.1 Newtonian gravity
In our development of electromagnetism, we began by considering the electro-
magnetic 3-force on a charged particle. Let us therefore start our discussion of
gravity by considering the description of the gravitational force in the classical,
non-relativistic, theory of Newton. In the Newtonian theory, the gravitational
force $\vec{f}$ on a (test) particle of gravitational mass $m_G$ at some position is

$$\vec{f} = m_G\,\vec{g} = -m_G\nabla\Phi$$

where $\vec{g}$ is the gravitational field derived from the gravitational potential $\Phi$ at that
position. In turn, the gravitational potential is determined by Poisson's equation:

$$\nabla^2\Phi = 4\pi G\rho \tag{7.1}$$

where $\rho$ is the gravitational matter density and G is Newton's gravitational
constant. This is the field equation of Newtonian gravity.
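
As a concrete check of these two equations, the sketch below uses the familiar point-mass potential $\Phi = -GM/r$, which is not derived in the text and is assumed here. It confirms by finite differences that the potential satisfies Poisson's equation with $\rho = 0$ away from the mass and that $\vec{g} = -\nabla\Phi$ points towards the mass with magnitude $GM/r^2$. The dimensionless units (GM = 1), the step sizes and the sample point are illustrative choices.

```python
import math

GM = 1.0  # G times the source mass, in illustrative dimensionless units


def potential(p):
    """Point-mass potential Phi = -GM / r at position p = (x, y, z)."""
    return -GM / math.sqrt(sum(c * c for c in p))


def shifted(p, axis, h):
    """Return a copy of p displaced by h along the given axis."""
    q = list(p)
    q[axis] += h
    return tuple(q)


def gradient(f, p, h=1e-4):
    """Central-difference estimate of the gradient of f at p."""
    return tuple(
        (f(shifted(p, a, h)) - f(shifted(p, a, -h))) / (2.0 * h) for a in range(3)
    )


def laplacian(f, p, h=1e-3):
    """Central-difference estimate of the Laplacian of f at p."""
    return sum(
        (f(shifted(p, a, h)) + f(shifted(p, a, -h)) - 2.0 * f(p)) / (h * h)
        for a in range(3)
    )


point = (1.0, 2.0, 2.0)  # r = 3, well away from the mass at the origin
g = tuple(-c for c in gradient(potential, point))  # g = -grad(Phi)
g_mag = math.sqrt(sum(c * c for c in g))

print("g =", g)
print("|g| =", g_mag, " GM/r^2 =", GM / 9.0)
print("Laplacian of Phi =", laplacian(potential, point))
```

The Laplacian vanishes to within the discretisation error, as Poisson's equation requires in empty space, and the field is directed along $-\vec{r}$.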
It is clear from (7.1) that Newtonian gravity is not consistent with special
relativity. There is no explicit time dependence, which means that the potential $\Phi$
(and hence the gravitational force on a particle) responds instantaneously to a
disturbance in the matter density $\rho$; this violates the special-relativistic requirement
that signals cannot propagate faster than c. We might try to remedy this by noting
that the Laplacian operator $\nabla^2$ in (7.1) is equivalent to minus the d'Alembertian
operator $\Box^2$ in the limit $c \to \infty$, and thus postulate the modified field equation

$$\Box^2\Phi = -4\pi G\rho$$

However, this equation does not yield a consistent relativistic theory. It is still
not Lorentz covariant, since the matter density does not transform as a Lorentz
scalar. We shall discuss the transformation properties of the matter density later.
In addition to the incompatibility of Newtonian gravity with special relativity,
there is a second fundamental difference between the electromagnetic and
gravitational forces. The equation of motion of a particle of inertial mass $m_I$ in a
gravitational field is given by

$$\frac{d^2\vec{x}}{dt^2} = -\frac{m_G}{m_I}\nabla\Phi \tag{7.2}$$

It is a well-established experimental fact, however, that the ratio $m_G/m_I$ appearing
in the equation of motion is the same for all particles. By an appropriate choice of
units one may thus arrange for this ratio to equal unity. In contrast, the ratio $q/m_I$
occurring in the equation of motion of a charged particle in an electromagnetic
field is not the same for all particles. From (7.2), we thus see that the trajectory
through space of a particle in a gravitational field is independent of the nature of
the particle.
This equivalence of the gravitational and inertial masses (which allows us to
refer simply to 'the mass') is a truly remarkable coincidence in the Newtonian
theory. In this theory, there is no a priori reason why the quantity that determines
the magnitude of the gravitational force on the particle should equal the quantity
that determines the particle's 'resistance' to an applied force in general. It appears
as an isolated experimental result, which has since been verified to an accuracy
of at least one part in $10^{11}$ (by Dicke and co-workers).

7.2 The equivalence principle
The equality of the gravitational and inertial masses of a particle led Einstein
to his classic 'elevator' thought experiment. Consider an observer in a freely
falling elevator (i.e. after the lift cable has been cut). Objects released from
rest relative to the elevator cabin remain floating 'weightless' in the cabin.
A projectile shot from one side of the elevator to the other appears to move
in a straight line at constant velocity, rather than in the usual curved trajectory.
All this follows from the fact that the acceleration of any particle relative to
the elevator is zero: the particle and the elevator cabin have the same acceleration
relative to the Earth as a result of the equivalence of gravitational and inertial
mass.
All these observations would hold exactly if the gravitational field of the Earth
were truly uniform. Of course, the gravitational field of the Earth is not uniform
but acts radially inwards towards its centre of mass, with a strength proportional
to $1/r^2$. Thus, if the elevator were left to free-fall for a long time or if it were very
large (i.e. a significant fraction of the Earth's radius), two particles released from
rest near the walls of the elevator would gradually drift inwards, since they would
both be falling along radial lines towards the centre of the Earth (see Figure 7.1).
Furthermore, as a result of the varying strength of the gravitational field, particles
released from rest near the floor of the elevator would gradually drift downwards
whereas those near the ceiling would drift upwards. What the observer in the
elevator would be experiencing would be the tidal forces resulting from the
residual inhomogeneity in the strength and direction of the gravitational field once
the main acceleration has been subtracted. It should always be remembered that
these tidal forces can never be completely abolished in an elevator (laboratory)
of finite, i.e. non-zero, size.
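
To give a sense of scale, the sketch below estimates these tidal accelerations for a cabin of ordinary size near the Earth's surface, using Newtonian gravity for a spherically symmetric Earth. The cabin dimensions are hypothetical, and the Earth's gravitational parameter and mean radius are standard reference values rather than figures from the text. The exact differences are compared with the leading-order estimates $2GMh/r^3$ (vertical stretching) and $-GMw/r^3$ (horizontal squeezing).

```python
import math

GM_EARTH = 3.986004418e14  # m^3 s^-2, standard gravitational parameter of the Earth
R_EARTH = 6.371e6          # m, mean radius of the Earth


def accel(x, z):
    """Newtonian acceleration (a_x, a_z) at horizontal offset x and distance z
    from the Earth's centre, measured along the local vertical."""
    r3 = (x * x + z * z) ** 1.5
    return (-GM_EARTH * x / r3, -GM_EARTH * z / r3)


height = 3.0  # m, hypothetical cabin height
width = 2.0   # m, hypothetical cabin width

# Vertical: particle at the ceiling relative to a particle at the floor.
a_floor = accel(0.0, R_EARTH)
a_ceiling = accel(0.0, R_EARTH + height)
vertical_stretch = a_ceiling[1] - a_floor[1]  # positive: the pair is pulled apart

# Horizontal: particle at the right wall relative to one at the left wall.
a_left = accel(-width / 2.0, R_EARTH)
a_right = accel(width / 2.0, R_EARTH)
horizontal_squeeze = a_right[0] - a_left[0]  # negative: the pair drifts together

# Leading-order (linearised) tidal estimates.
vertical_estimate = 2.0 * GM_EARTH * height / R_EARTH ** 3
horizontal_estimate = -GM_EARTH * width / R_EARTH ** 3

print("vertical stretch  :", vertical_stretch, "estimate:", vertical_estimate)
print("horizontal squeeze:", horizontal_squeeze, "estimate:", horizontal_estimate)
```

For a cabin of this size the relative accelerations are many orders of magnitude smaller than the overall acceleration of roughly 9.8 m s^-2, which is why the cabin behaves as an inertial frame to a very good approximation over short times.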
Nevertheless, provided that we consider the elevator cabin over a short time
period and that it is spatially small, then a freely falling elevator (which may
have x, y, z coordinates marked on its walls and an elevator clock measuring
time t) resembles a Cartesian inertial frame of reference, and therefore the laws
of special relativity hold inside the elevator.¹ These observations lead to

The equivalence principle: In a freely falling (non-rotating) laboratory occupying
a small region of spacetime, the laws of physics are those of special relativity.²

7.3 Gravity as spacetime curvature
These observations led Einstein to make a profound proposal that simultaneously
provides for a relativistic description of gravity and incorporates in a natural way
the equivalence principle (and consequently the equivalence of gravitational and
inertial mass). Einstein's proposal was that gravity should no longer be regarded
as a force in the conventional sense but rather as a manifestation of the curvature
of the spacetime, this curvature being induced by the presence of matter. This is
the central idea underpinning the theory of general relativity.

¹ The elevator cabin must not only occupy a small region of spacetime but also be non-rotating with respect to
distant matter in the universe. This statement is related to Mach's principle.
² This is in fact a statement of the strong equivalence principle, since it refers to all the laws of physics. The
more modest weak equivalence principle refers only to the trajectories of freely falling particles.

Figure 7.1 An elevator in free-fall towards the Earth.

If gravity is regarded as a manifestation of the curvature of spacetime itself, and
not as the action of some 4-force f defined on the manifold, then the equation of
motion of a particle moving only under the influence of gravity must be that of a
'free' particle in the curved spacetime, i.e.

$$\frac{d\mathbf{p}}{d\tau} = \mathbf{0}$$

where p is the particle's 4-momentum and $\tau$ is the proper time measured along the
particle's worldline. Thus, the worldline of a particle freely falling under gravity
is a geodesic in the curved spacetime.
The equivalence principle restricts the possible geometry of the curved spacetime
to pseudo-Riemannian, as follows. The mathematical meaning of the equivalence
principle is that it requires that at any event P in the spacetime manifold
we must be able to define a coordinate system $X^\mu$ such that, in the local
neighbourhood of P, the line element of spacetime takes the form
$$ds^2 \approx \eta_{\mu\nu}\,dX^\mu dX^\nu$$
where exact equality holds at the event P. From the geodesic equation (as shown
in Chapter 5), in such a coordinate system the path of a 'free' particle, i.e. one
moving only under the influence of gravity, in the vicinity of the event P is
given by
$$\frac{d^2X^i}{dT^2} \approx 0$$
where i = 1, 2, 3 and we have denoted $X^0$ by cT (once again the equality in the
above equations holds exactly at P). Thus, in the vicinity of P the coordinates $X^\mu$
define a local Cartesian inertial frame (like our small elevator considered over a
short time interval), in which the laws of special relativity hold locally. In order
that we can construct such a system, spacetime must be a pseudo-Riemannian
manifold (which is curved and four-dimensional). For such a manifold, in some
arbitrary coordinate system $x^\mu$ the line element takes the general form

$$ds^2 = g_{\mu\nu}\,dx^\mu dx^\nu$$

7.4 Local inertial coordinates
The curvature of spacetime means that it is not possible to find coordinates in
which the metric $g_{\mu\nu} = \eta_{\mu\nu}$ at all points in the manifold. Thus, it is not possible
to define global Cartesian inertial frames as we could in the pseudo-Euclidean
Minkowski spacetime. Instead, we are forced to use arbitrary coordinate systems
$x^\mu$ to label events in spacetime, and these coordinates often do not have simple
physical meanings. It is often the case that $x^0$ is a timelike coordinate and the
$x^i$ (i = 1, 2, 3) are spacelike (i.e. the tangent vector to the $x^0$ coordinate curve is
timelike at all points, and similarly the tangent vectors to the $x^i$ coordinate curves
are always spacelike). This allocation of coordinates is not necessary, however,
and it is sometimes useful to define null coordinates. In any case, the arbitrary
coordinates $x^\mu$ need not have any direct physical interpretation.
Nevertheless, as demanded by the equivalence principle, problems of physical
meaning can always be overcome by transforming, at any event P in the curved
spacetime, to a local inertial coordinate system $X^\mu$, which, in a limited region of
spacetime about P, corresponds to a freely falling, non-rotating, Cartesian frame
over a short time interval. Mathematically, this corresponds to constructing about
the event P a coordinate system $X^\mu$ such that

$$g_{\mu\nu}(P) = \eta_{\mu\nu} \qquad\text{and}\qquad (\partial_\sigma g_{\mu\nu})_P = 0 \tag{7.3}$$

This also means that $\Gamma^\mu{}_{\nu\sigma}(P) = 0$ and that the coordinate basis vectors at the
event P form an orthonormal set, i.e.

$$\mathbf{e}_\mu(P)\cdot\mathbf{e}_\nu(P) = \eta_{\mu\nu} \tag{7.4}$$
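
The algebraic part of these conditions, $g_{\mu\nu}(P) = \eta_{\mu\nu}$, can be illustrated with a small numerical sketch. It assumes, purely for the example, a metric that is diagonal at P, taken to be that of spherical polar coordinates (ct, r, theta, phi), with a hypothetical event at r = 2 and theta = pi/3. Rescaling each coordinate basis vector by $1/\sqrt{|g_{\mu\mu}|}$ yields an orthonormal set, and a subsequent Lorentz boost of that set leaves it orthonormal. The sketch says nothing about the second condition in (7.3), which concerns the derivatives of the metric.

```python
import math

ETA = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, -1.0, 0.0, 0.0],
    [0.0, 0.0, -1.0, 0.0],
    [0.0, 0.0, 0.0, -1.0],
]


def matmul(A, B):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)] for i in range(4)]


def transpose(A):
    return [[A[j][i] for j in range(4)] for i in range(4)]


def transformed_metric(M, g):
    """g'_{mu nu} = M^rho_mu M^sigma_nu g_{rho sigma}, with M[rho][mu] = d x^rho / d X^mu."""
    return matmul(transpose(M), matmul(g, M))


def diagonal(values):
    return [[values[i] if i == j else 0.0 for j in range(4)] for i in range(4)]


# Metric components at the (hypothetical) event P in coordinates (ct, r, theta, phi).
r, theta = 2.0, math.pi / 3.0
g_P = diagonal([1.0, -1.0, -r * r, -(r * math.sin(theta)) ** 2])

# Rescale each basis vector so that it has unit length with respect to g_P.
M = diagonal([1.0 / math.sqrt(abs(g_P[i][i])) for i in range(4)])
g_local = transformed_metric(M, g_P)

# A Lorentz boost (speed 0.6c along the first spatial direction) applied afterwards.
beta = 0.6
gamma = 1.0 / math.sqrt(1.0 - beta * beta)
L = [
    [gamma, -gamma * beta, 0.0, 0.0],
    [-gamma * beta, gamma, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]
g_boosted = transformed_metric(matmul(M, L), g_P)

for row in g_local:
    print(row)
```

Both `g_local` and `g_boosted` reproduce the Minkowski form, so the orthonormal set at P is fixed only up to a Lorentz transformation.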

There are in fact an infinite number of local inertial coordinate systems at P, all
of which are related to one another by Lorentz transformations. In other words,
if a coordinate system $X^\mu$ satisfies the conditions (7.3), and hence the condition
(7.4), then so too will the coordinate system

