easiest to consider the collision in the inertial frame $S$ in which the electron is at
rest and the photon is travelling along the positive $x^1$-axis (see Figure 5.4). Thus
the components of $p$ and $q$ in $S$ are

$$[p^\mu] = (h\nu/c,\; h\nu/c,\; 0,\; 0),$$

$$[q^\mu] = (m_e c,\; 0,\; 0,\; 0),$$

where $\nu$ is the frequency of the photon as measured by a stationary observer in
$S$, and $m_e$ is the rest mass of the electron. Let us assume that, after the collision,
the photon and electron have 4-momenta $\bar{p}$ and $\bar{q}$ such that they move off in the
plane $x^3 = 0$, making angles $\theta$ and $\phi$ respectively with the $x^1$-axis. Thus

$$[\bar{p}^\mu] = (h\bar{\nu}/c,\; (h\bar{\nu}/c)\cos\theta,\; (h\bar{\nu}/c)\sin\theta,\; 0),$$

$$[\bar{q}^\mu] = (\gamma_u m_e c,\; \gamma_u m_e u\cos\phi,\; -\gamma_u m_e u\sin\phi,\; 0),$$

where $u$ is the electron's speed and $\bar{\nu}$ is the photon frequency as measured by
a stationary observer in $S$ after the collision. Conservation of total 4-momentum
means that

$$p + q = \bar{p} + \bar{q},$$

which gives

$$h\nu/c + m_e c = h\bar{\nu}/c + \gamma_u m_e c, \qquad (5.24)$$

$$h\nu/c = (h\bar{\nu}/c)\cos\theta + \gamma_u m_e u\cos\phi, \qquad (5.25)$$

$$0 = (h\bar{\nu}/c)\sin\theta - \gamma_u m_e u\sin\phi. \qquad (5.26)$$


Eliminating $u$ and $\phi$ from these equations leads to the formula for Compton
scattering, which gives the frequency of the photon in $S$ after the collision:

$$\bar{\nu} = \nu\left[1 + \frac{h\nu}{m_e c^2}\,(1 - \cos\theta)\right]^{-1}.$$

The components of the 4-momentum $\bar{p}$ (or $\bar{q}$) in any other inertial frame $S'$
can be found easily by using $\bar{p}'^{\mu} = \Lambda^{\mu'}{}_{\nu}\,\bar{p}^{\nu}$, where $\Lambda^{\mu'}{}_{\nu}$ are the elements of the
Lorentz transformation matrix connecting the frames $S$ and $S'$.
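As a numerical cross-check of the Compton formula, the following minimal sketch evaluates the scattered frequency and then confirms that the energy equation (5.24) is satisfied once the electron's momentum is fixed by (5.25) and (5.26). The function names and the SI values of the constants are choices made for this illustration only.

```python
import math

H = 6.62607015e-34       # Planck constant (J s)
C = 2.99792458e8         # speed of light (m/s)
M_E = 9.1093837015e-31   # electron rest mass (kg)


def scattered_frequency(nu, theta):
    """Compton formula: photon frequency after scattering through angle theta."""
    return nu / (1.0 + (H * nu / (M_E * C**2)) * (1.0 - math.cos(theta)))


def energy_residual(nu, theta):
    """Relative mismatch in the energy equation (5.24).

    The electron's final momentum is taken from the momentum equations
    (5.25) and (5.26); its energy follows from E^2 = p^2 c^2 + m^2 c^4.
    """
    nu_bar = scattered_frequency(nu, theta)
    px = H * nu / C - (H * nu_bar / C) * math.cos(theta)   # from (5.25)
    py = -(H * nu_bar / C) * math.sin(theta)               # from (5.26)
    electron_energy_over_c = math.sqrt(px**2 + py**2 + (M_E * C) ** 2)
    lhs = H * nu / C + M_E * C
    rhs = H * nu_bar / C + electron_energy_over_c
    return (lhs - rhs) / lhs


if __name__ == "__main__":
    nu = 1.0e20  # a hard X-ray photon, chosen arbitrarily
    for theta in (0.0, 0.5, math.pi / 2, math.pi):
        print(theta, scattered_frequency(nu, theta), energy_residual(nu, theta))
```

The residual should vanish to rounding error for any scattering angle, which is a quick way to catch a sign slip when eliminating $u$ and $\phi$ by hand.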

5.13 Accelerating observers

So far we have only considered inertial observers, who move at uniform speeds
with respect to one another. Let us now consider a general observer $\mathcal{O}$, who
may be accelerating with respect to some inertial frame $S$. If the observer has a
4-velocity $u(\tau)$, where $\tau$ is the proper time measured along the worldline, then
his 4-acceleration is given by

$$a = \frac{du}{d\tau}.$$

It is worth noting that, at any given event $P$, the 4-acceleration $a$ is always
orthogonal to the corresponding 4-velocity $u$, since

$$a\cdot u = \frac{1}{2}\frac{d}{d\tau}(u\cdot u) = \frac{1}{2}\frac{d}{d\tau}(c^2) = 0. \qquad (5.27)$$
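The orthogonality (5.27) can be checked numerically on a concrete worldline. The sketch below borrows the 4-velocity of the uniformly accelerated spaceship quoted in Exercise 5.5 and differentiates it by central differences; the helper names, the choice of units with $c = 1$ and the step size are assumptions of this illustration.

```python
import math

C = 1.0  # units in which c = 1 (an assumption of this sketch)


def minkowski_dot(v, w):
    """Scalar product with signature (+, -, -, -)."""
    return v[0] * w[0] - v[1] * w[1] - v[2] * w[2] - v[3] * w[3]


def four_velocity(tau, a):
    """4-velocity of the uniformly accelerated spaceship of Exercise 5.5."""
    return (C * math.cosh(a * tau / C), C * math.sinh(a * tau / C), 0.0, 0.0)


def four_acceleration(tau, a, h=1e-5):
    """a = du/dtau, estimated by a central finite difference."""
    up = four_velocity(tau + h, a)
    um = four_velocity(tau - h, a)
    return tuple((p - m) / (2.0 * h) for p, m in zip(up, um))


if __name__ == "__main__":
    tau, a = 1.3, 0.7
    u = four_velocity(tau, a)
    acc = four_acceleration(tau, a)
    print("u.u =", minkowski_dot(u, u))      # should be close to c^2
    print("a.u =", minkowski_dot(acc, u))    # should be close to 0
    print("a.a =", minkowski_dot(acc, acc))  # should be close to -a^2
```

The last line also illustrates that $a\cdot a$ is negative with this signature, so the proper acceleration is $|a\cdot a|^{1/2}$.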

An accelerating observer has no inertial frame in which he or she is always at
rest. Nevertheless, at any event $P$ along the worldline we can define an
instantaneous rest frame $S'$, in which the observer $\mathcal{O}$ is momentarily at rest. Since
the observer is at rest in $S'$, the timelike basis vector $e'_0$ of this frame must be
parallel to the 4-velocity $u$ of the observer. The remaining spacelike basis vectors
$e'_i$ ($i = 1, 2, 3$) of $S'$ are all orthogonal to $e'_0$ and to one another and will depend on
the relative velocity of $S$ and $S'$ and the relative orientation of their spatial axes.
Observations made by $\mathcal{O}$ at the event $P$ thus correspond to measurements made
in the instantaneous rest frame (IRF) $S'$ at $P$. This is illustrated in Figure 5.5.

Thus, the notion of a localised laboratory can be idealised as follows. An
observer (whether accelerating or not) carries along four orthogonal unit vectors
$\hat{e}_\mu(\tau)$ (or tetrad), which vary along his worldline but always satisfy

$$\hat{e}_\mu(\tau)\cdot\hat{e}_\nu(\tau) = \eta_{\mu\nu}. \qquad (5.28)$$

126 Special relativity revisited

Figure 5.5 The basis vectors $e'_0$ and $e'_1$ at the event $P$ in the instantaneous rest frame
$S'$ of an observer $\mathcal{O}$ who is accelerating with respect to the inertial frame $S$ (axes $x^0$ and $x^1$).

In particular, the timelike unit vector is given by

$$\hat{e}_0(\tau) = \hat{u}(\tau), \qquad (5.29)$$

where $\hat{u}(\tau)$ is the normalised 4-velocity of the observer and is simply $u(\tau)/c$. At
any event $P$ along the observer's worldline, the tetrad comprises the basis vectors
of the Cartesian IRF at the event $P$ and defines a time direction and three space
directions to which the observer will refer all measurements. Thus, the results of
any measurement made by the observer at the event $P$ are given by projections
of physical quantities (i.e. vectors and tensors) onto these tetrad vectors.

An important example occurs when the worldline of the observer intersects the
worldline of some particle at the event $P$ (at which we take the observer's proper
time to be $\tau$). If $p$ is the 4-momentum of the particle at this event then the energy
$E$ of the particle as measured by the observer is given by

$$\frac{E}{c} = p\cdot\hat{e}_0(\tau) \quad\Rightarrow\quad E = p\cdot u(\tau).$$

Similarly, the covariant components $p_i$ of the spatial momentum of the particle
as measured by the observer are given by

$$p_i = p\cdot\hat{e}_i(\tau).$$
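The projection rule $E = p\cdot u$ can be illustrated with a short calculation. The sketch below, which works in units with $c = 1$ and uses helper names invented for this example, evaluates the energy that an observer moving along the $x^1$-axis of $S$ assigns to a photon and to a massive particle at rest in $S$.

```python
import math

C = 1.0  # units in which c = 1 (an assumption of this sketch)


def minkowski_dot(v, w):
    """Scalar product with signature (+, -, -, -)."""
    return v[0] * w[0] - v[1] * w[1] - v[2] * w[2] - v[3] * w[3]


def observer_four_velocity(v):
    """4-velocity of an observer moving with speed v along the x^1-axis of S."""
    gamma = 1.0 / math.sqrt(1.0 - (v / C) ** 2)
    return (gamma * C, gamma * v, 0.0, 0.0)


def photon_four_momentum(energy, theta):
    """Photon of given energy in S, travelling at angle theta to the x^1-axis."""
    p = energy / C
    return (p, p * math.cos(theta), p * math.sin(theta), 0.0)


def measured_energy(p, u):
    """Energy of a particle with 4-momentum p seen by an observer with 4-velocity u."""
    return minkowski_dot(p, u)


if __name__ == "__main__":
    u = observer_four_velocity(0.6)
    # Photon chasing the observer: the energy is redshifted by gamma * (1 - v/c).
    print(measured_energy(photon_four_momentum(1.0, 0.0), u))
    # Particle of rest mass 2 at rest in S: the observer measures gamma * m * c^2.
    print(measured_energy((2.0 * C, 0.0, 0.0, 0.0), u))
```

No Lorentz transformation of the components of $p$ is needed: the invariant scalar product does the work, and it can be evaluated in whichever frame is convenient.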

Another example is provided by the 4-acceleration $a$. Since at any event $P$ on the
worldline we have $\hat{e}_0 = \hat{u}$, the orthogonality condition (5.27) and the fact that in
the IRF $[u^\mu] = (c, \vec{0})$ imply that the components of the 4-acceleration in the IRF
are $[a^\mu] = (0, \vec{a})$. Thus the magnitude of the 3-acceleration in the IRF can be
computed as the simple invariant $|a\cdot a|^{1/2}$.

It is interesting to consider how the tetrad of basis vectors changes along
the worldline of an observer whose acceleration varies arbitrarily with time.
As it is transported along the observer's worldline, the tetrad must satisfy the
two requirements (5.28) and (5.29). Clearly, given $u(\tau)$, the condition (5.29)
determines the timelike basis vector $\hat{e}_0(\tau)$ uniquely. Unfortunately, condition
(5.28) is obviously insufficient to determine uniquely the evolution of the spacelike
basis vectors $\hat{e}_i(\tau)$ ($i = 1, 2, 3$), which reflect the different ways in which the
observer's local laboratory might be spinning and tumbling. An important special
case, however, is when the tetrad is 'non-rotating'.

This last requirement requires some clarification. Clearly, the basis vectors of
the tetrad at any proper time $\tau$ are related to the basis vectors $e_\mu$ of some given
inertial frame by the Lorentz transformation

$$\hat{e}_\mu(\tau) = \Lambda_\mu{}^{\nu}(\tau)\,e_\nu.$$

Thus the tetrad basis vectors at two successive instants must also be related to each
other by a Lorentz transformation, which can be thought of as a 'rotation' in spacetime.
A 'non-rotating' tetrad is one where the basis vectors $\hat{e}_\mu$ change from instant
to instant by precisely the amount implied by the rate of change of $u$ but with no
additional rotation. In other words, we accept the inevitable rotation in the timelike
plane defined by $u$ and $a$ but rule out any ordinary rotation of the 3-space vectors.

Since we wish to treat the time and space directions on an equal footing, we must
seek a general expression for the rate of change $d\hat{e}_\mu/d\tau$ of a basis vector along
the worldline such that: (i) it generates the appropriate Lorentz transformation if
$\hat{e}_\mu$ lies in the timelike plane defined by $u$ and $a$, and (ii) it excludes any rotation
if $\hat{e}_\mu$ lies in any other plane, in particular any spacelike plane. A little reflection
shows that the unique answer to these requirements is

$$\frac{d\hat{e}_\mu}{d\tau} = \frac{1}{c^2}\left[(u\cdot\hat{e}_\mu)\,a - (a\cdot\hat{e}_\mu)\,u\right]. \qquad (5.30)$$

Any vector that undergoes the above transformation is said to be Fermi–Walker
transported along the worldline. From (5.30), we find that if $\hat{e}_\mu$ is orthogonal to
both $u$ and $a$ then $d\hat{e}_\mu/d\tau = 0$ as required. Moreover, we see that $d\hat{e}_0/d\tau = a/c$,
again as required.
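A concrete check of the Fermi–Walker condition (5.30) is sketched below for the uniformly accelerated spaceship, using the 4-velocity, 4-acceleration and tetrad components quoted in Exercises 5.5, 5.6 and 5.18. The code works in units with $c = 1$; the function names and the finite-difference step are assumptions of this illustration.

```python
import math

C = 1.0  # units in which c = 1 (an assumption of this sketch)


def dot(v, w):
    """Minkowski scalar product with signature (+, -, -, -)."""
    return v[0] * w[0] - v[1] * w[1] - v[2] * w[2] - v[3] * w[3]


def velocity(tau, a):
    return (C * math.cosh(a * tau / C), C * math.sinh(a * tau / C), 0.0, 0.0)


def acceleration(tau, a):
    return (a * math.sinh(a * tau / C), a * math.cosh(a * tau / C), 0.0, 0.0)


def tetrad(tau, a):
    """Non-rotating tetrad of the astronaut, as quoted in Exercise 5.18."""
    ch, sh = math.cosh(a * tau / C), math.sinh(a * tau / C)
    return [(ch, sh, 0.0, 0.0), (sh, ch, 0.0, 0.0),
            (0.0, 0.0, 1.0, 0.0), (0.0, 0.0, 0.0, 1.0)]


def fermi_walker_rhs(e, tau, a):
    """Right-hand side of (5.30) for a single basis vector e."""
    u, acc = velocity(tau, a), acceleration(tau, a)
    ue, ae = dot(u, e), dot(acc, e)
    return tuple((ue * ai - ae * ui) / C**2 for ai, ui in zip(acc, u))


def max_residual(tau, a, h=1e-5):
    """Largest mismatch between d(e_mu)/dtau and the right-hand side of (5.30)."""
    worst = 0.0
    for mu in range(4):
        ep, em = tetrad(tau + h, a)[mu], tetrad(tau - h, a)[mu]
        lhs = [(p - m) / (2.0 * h) for p, m in zip(ep, em)]
        rhs = fermi_walker_rhs(tetrad(tau, a)[mu], tau, a)
        worst = max(worst, max(abs(l - r) for l, r in zip(lhs, rhs)))
    return worst


if __name__ == "__main__":
    print(max_residual(tau=0.8, a=1.5))  # expected to be at rounding level
```

The two vectors orthogonal to both $u$ and $a$ stay constant, while the pair spanning the timelike plane undergoes exactly the boost generated by the acceleration.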

A physical example of a 3-space vector that does not rotate along the worldline

is the spin (i.e. the angular momentum vector) of a gyroscope that the observer

accelerates with himself by means of forces applied to its centre of mass (so that


there are no torques). Indeed, a careful observer could set up a non-rotating tetrad

by aligning his three spatial axes using such gyroscopes.

5.14 Minkowski spacetime in arbitrary coordinates

There is no need to label events in Minkowski spacetime with the Cartesian
inertial coordinates we have used thus far. The advantage of Cartesian coordinates
$X^\mu$, which put the line element into the form⁵

$$ds^2 = \eta_{\mu\nu}\,dX^\mu\,dX^\nu \qquad (5.31)$$

(even just at a particular event $P$), is that they have a clear physical meaning, i.e.
they correspond to time and distances measured by an observer at $P$ who is at rest
in some inertial frame $S$ labelled using three-dimensional Cartesian coordinates
(we will prove this below). Nevertheless, we are free to label events in spacetime
using any arbitrary system of coordinates $x^\mu$ although, in general, the coordinates
in such an arbitrary system may not have simple physical meanings.

Since the path of a free massive particle is a geodesic in Minkowski spacetime,
its worldline $x^\mu(\tau)$ in some arbitrary coordinate system is given by the geodesic
equations

$$\frac{d^2x^\mu}{d\tau^2} + \Gamma^\mu{}_{\nu\sigma}\frac{dx^\nu}{d\tau}\frac{dx^\sigma}{d\tau} = 0. \qquad (5.32)$$

An inertial frame $S$ is defined as one in which a free particle moves in a straight
line with fixed speed. Thus from (5.31) it is clear that coordinates $X^\mu$, such that
(5.31) holds, define an inertial frame. In this case, the connection $\Gamma^\mu{}_{\nu\sigma}$ vanishes,
and so the worldline of a particle is given by

$$\frac{d^2X^\mu}{d\tau^2} = 0. \qquad (5.33)$$

Setting $[X^\mu] = (cT, X, Y, Z)$ for the moment, the $\mu = 0$ equation (5.33) shows
that $dT/d\tau = \text{constant}$. Thus the $\mu = 1, 2, 3$ equations read

$$\frac{d^2X}{dT^2} = \frac{d^2Y}{dT^2} = \frac{d^2Z}{dT^2} = 0,$$

from which we see immediately that a free particle moves in a straight line with
constant speed in $S$.

⁵ In the interest of clarity, in this section we will denote Cartesian inertial coordinates by $X^\mu$ and an arbitrary
coordinate system by $x^\mu$.

We could label the inertial frame $S$ using three-dimensional spatial coordinates
that are not Cartesian, however. For example, we could use spherical polar
coordinates. This would correspond to making a change of variables in Minkowski
spacetime to the new system $[x^\mu] = (ct, r, \theta, \phi)$, where

$$T = t, \qquad X = r\sin\theta\cos\phi, \qquad Y = r\sin\theta\sin\phi, \qquad Z = r\cos\theta.$$

In this case, the line element becomes

$$ds^2 = c^2\,dt^2 - dr^2 - r^2\,d\theta^2 - r^2\sin^2\theta\,d\phi^2,$$

so the metric is $[g_{\mu\nu}] = \mathrm{diag}(1, -1, -r^2, -r^2\sin^2\theta)$. From the metric we can
show that the non-vanishing components of the connection in this coordinate
system are (with $c = 1$)

$$\Gamma^1{}_{22} = -r, \qquad \Gamma^1{}_{33} = -r\sin^2\theta,$$

$$\Gamma^2{}_{12} = 1/r, \qquad \Gamma^2{}_{33} = -\sin\theta\cos\theta,$$

$$\Gamma^3{}_{13} = 1/r, \qquad \Gamma^3{}_{23} = \cot\theta,$$

together with those related to them by the symmetry of the connection in its lower indices.
Thus, from (5.32), the geodesic equations for the worldline $x^\mu(\tau)$ of a free particle
are very complicated in these coordinates (exercise), in spite of the fact that, to
an observer with fixed $(r, \theta, \phi)$ coordinates (i.e. at rest in $S$), a free particle still
moves in a straight line with fixed speed.
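The connection coefficients of a diagonal metric are easy to verify numerically. The following sketch evaluates $\Gamma^\mu{}_{\nu\sigma} = \tfrac{1}{2}g^{\mu\rho}(\partial_\nu g_{\rho\sigma} + \partial_\sigma g_{\rho\nu} - \partial_\rho g_{\nu\sigma})$ by central differences for the spherical polar form of the Minkowski metric with $c = 1$. The function names and step size are assumptions of this illustration, and the shortcut $g^{\mu\mu} = 1/g_{\mu\mu}$ is valid only because the metric is diagonal.

```python
import math


def metric_diagonal(x):
    """Diagonal entries of g_{mu nu} for coordinates (t, r, theta, phi), with c = 1."""
    _, r, theta, _ = x
    return [1.0, -1.0, -r * r, -(r * math.sin(theta)) ** 2]


def d_metric(rho, sigma, k, x, h=1e-6):
    """Partial derivative of g_{rho sigma} with respect to x^k (central difference)."""
    if rho != sigma:
        return 0.0  # off-diagonal components vanish identically
    xp, xm = list(x), list(x)
    xp[k] += h
    xm[k] -= h
    return (metric_diagonal(xp)[rho] - metric_diagonal(xm)[rho]) / (2.0 * h)


def christoffel(mu, nu, sigma, x):
    """Gamma^mu_{nu sigma}; for a diagonal metric only rho = mu contributes."""
    g = metric_diagonal(x)
    return 0.5 / g[mu] * (d_metric(mu, sigma, nu, x)
                          + d_metric(mu, nu, sigma, x)
                          - d_metric(nu, sigma, mu, x))


if __name__ == "__main__":
    x = (0.0, 2.0, 0.7, 1.1)  # an arbitrary event away from the coordinate axis
    r, theta = x[1], x[2]
    print(christoffel(1, 2, 2, x), -r)
    print(christoffel(1, 3, 3, x), -r * math.sin(theta) ** 2)
    print(christoffel(2, 1, 2, x), 1.0 / r)
    print(christoffel(2, 3, 3, x), -math.sin(theta) * math.cos(theta))
    print(christoffel(3, 1, 3, x), 1.0 / r)
    print(christoffel(3, 2, 3, x), 1.0 / math.tan(theta))
```

Each printed pair should agree to the accuracy of the finite difference, and every component with a time index should vanish.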

Alternatively, we could use three-dimensional Cartesian coordinates to label
points in a non-inertial frame $S'$ that is accelerating with respect to $S$. As an
example, consider transforming from $[X^\mu] = (cT, X, Y, Z)$ to a new system of
coordinates $[x^\mu] = (ct, x, y, z)$, where $t, x, y, z$ are defined by the equations⁶

$$T = t, \qquad X = x\cos\omega t - y\sin\omega t, \qquad Y = x\sin\omega t + y\cos\omega t, \qquad Z = z.$$

Thus points with constant $x, y, z$ values (i.e. the values are fixed in $S'$) rotate
with angular speed $\omega$ about the $Z$-axis of $S$ (see Figure 5.6). Substituting these
definitions into (5.31), the line element becomes

$$ds^2 = \left[c^2 - \omega^2(x^2 + y^2)\right]dt^2 + 2\omega y\,dt\,dx - 2\omega x\,dt\,dy - dx^2 - dy^2 - dz^2,$$

and the geodesic equations (5.32) are (exercise)

$$\ddot{t} = 0,$$

$$\ddot{x} - \omega^2 x\,\dot{t}^2 - 2\omega\,\dot{y}\,\dot{t} = 0,$$

$$\ddot{y} - \omega^2 y\,\dot{t}^2 + 2\omega\,\dot{x}\,\dot{t} = 0,$$

$$\ddot{z} = 0,$$

6

For a full discussion, see for example J. Foster & J. D. Nightingale, A Short Course in General Relativity,

Springer-Verlag, 1995.


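Figure 5.6 The coordinate system $(x, y, z)$ rotating relative to the inertial coordinate
system $(X, Y, Z)$. The two systems share the origin $O$ and the axis $z = Z$; the $x$-axis makes an angle $\omega t$ with the $X$-axis.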

where the dots denote differentiation with respect to proper time $\tau$. These equations
give the worldline $x^\mu(\tau)$ of a free particle in this coordinate system. Once
again, the first equation implies that $dt/d\tau = \text{constant}$, so that we can replace
the dots in the remaining three equations with derivatives with respect to $t$.
Multiplying through by the rest mass $m$ of the particle and rearranging, these
equations become

$$m\frac{d^2x}{dt^2} = m\omega^2 x + 2m\omega\frac{dy}{dt},$$

$$m\frac{d^2y}{dt^2} = m\omega^2 y - 2m\omega\frac{dx}{dt},$$

$$m\frac{d^2z}{dt^2} = 0,$$

or, in 3-vector notation,

$$m\frac{d^2\vec{x}}{dt^2} = -m\,\vec{\omega}\times(\vec{\omega}\times\vec{x}) - 2m\,\vec{\omega}\times\frac{d\vec{x}}{dt}, \qquad (5.34)$$

where $\vec{x} = (x, y, z)$ and $\vec{\omega} = (0, 0, \omega)$. Thus we recover the equation of motion
for a free particle in a rotating frame of reference. We note, however, that the
coordinate $t$ is the time measured by clocks at rest in the non-rotating system $S$,
since we have set $t = T$. It is possible to rewrite the equation of motion in terms of
the proper time measured by an observer at some fixed position in $S'$, but to
do so would involve replacing (5.34) by a more complicated equation that tends to
conceal the Coriolis and centrifugal forces. Note that $t$ is exactly the proper time
for an observer situated at the common origin $O$ of the two systems, so observers
close to $O$ who are at rest in $S'$ would accept (5.34) as (approximately) valid.
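The equations of motion in the rotating coordinates can be tested directly: a particle moving uniformly in the inertial coordinates must satisfy them identically. The sketch below transforms a straight-line trajectory into the rotating coordinates and evaluates the residuals of the $x$ and $y$ equations by finite differences. The function names, the sample trajectory and the step size are assumptions of this illustration.

```python
import math


def rotating_coords(t, start, velocity, omega):
    """Rotating-frame (x, y) of a particle moving uniformly in the inertial (X, Y)."""
    X = start[0] + velocity[0] * t
    Y = start[1] + velocity[1] * t
    c, s = math.cos(omega * t), math.sin(omega * t)
    # Inverse of X = x cos(wt) - y sin(wt), Y = x sin(wt) + y cos(wt).
    return (X * c + Y * s, -X * s + Y * c)


def residuals(t, start, velocity, omega, h=1e-3):
    """Residuals of x'' = w^2 x + 2 w y' and y'' = w^2 y - 2 w x' at time t."""
    minus = rotating_coords(t - h, start, velocity, omega)
    here = rotating_coords(t, start, velocity, omega)
    plus = rotating_coords(t + h, start, velocity, omega)
    vel = [(p - m) / (2.0 * h) for p, m in zip(plus, minus)]
    acc = [(p - 2.0 * z + m) / h**2 for p, z, m in zip(plus, here, minus)]
    res_x = acc[0] - omega**2 * here[0] - 2.0 * omega * vel[1]
    res_y = acc[1] - omega**2 * here[1] + 2.0 * omega * vel[0]
    return res_x, res_y


if __name__ == "__main__":
    print(residuals(t=2.0, start=(1.0, -2.0), velocity=(0.3, 0.4), omega=0.5))
```

Both residuals should be zero up to the truncation error of the finite differences, confirming that the centrifugal and Coriolis terms are purely artefacts of the coordinate choice.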

From these examples, we see that in general the geodesic equations can be

rather complicated both for non-inertial frames and for inertial frames labelled

by non-Cartesian spatial coordinates. Thus, when describing physical effects in

an inertial frame, it is conventional to use Cartesian spatial coordinates to label

points in the frame and so to work in a coordinate system $X^\mu$ for which (5.31) is

valid. It is then much easier to disentangle the physical effects from artefacts of

the coordinate system.

Exercises

5.1 Show that the transformation matrix for a Lorentz transformation from $S$ to $S'$ in
standard configuration is given by (5.4).

5.2 Show that, under a Lorentz transformation, the covariant components of a vector
transform as $v'_\mu = \Lambda_{\mu'}{}^{\nu}\,v_\nu$. Hence show explicitly in component form that, for two
4-vectors $v$ and $w$, the scalar product $v\cdot w$ is invariant under a Lorentz transformation.

5.3 Prove that, for any timelike vector v in Minkowski space, there exists an inertial

frame in which the spatial components are zero.

5.4 Prove (a) that the sum of any two spacelike vectors is spacelike; and (b) that a

timelike vector and a null vector cannot be orthogonal.

5.5 For the spaceship discussed in Section 1.14, which maintains a uniform acceleration
$a$ in the $x$-direction of some inertial frame $S$, the worldline is given by

$$t = \frac{c}{a}\sinh\frac{a\tau}{c}, \qquad x = \frac{c^2}{a}\left(\cosh\frac{a\tau}{c} - 1\right), \qquad y = 0, \qquad z = 0,$$

where $\tau$ is the proper time of an astronaut on the spaceship. Show that the 4-velocity
of the rocket in the coordinate system $(ct, x, y, z)$ is given by

$$[u^\mu] = \left(c\cosh\frac{a\tau}{c},\; c\sinh\frac{a\tau}{c},\; 0,\; 0\right).$$

Hence show explicitly that $u_\mu u^\mu = c^2$ and that the spaceship's 3-velocity is

$$\vec{u} = \left(c\tanh\frac{a\tau}{c},\; 0,\; 0\right).$$

5.6 Show that the 4-acceleration of the spaceship in Exercise 5.5 is given by

$$[a^\mu] = \left(a\sinh\frac{a\tau}{c},\; a\cosh\frac{a\tau}{c},\; 0,\; 0\right).$$

Hence show that $a_\mu a^\mu = -a^2$ and that the magnitude of the spaceship's 3-acceleration
in its own instantaneous rest frame is also $a$.

5.7 A spaceship has constant acceleration $g$ in the $x$-direction in its locally comoving
frame, i.e. the IRF. Show that, in an inertial frame, the spaceship's 4-velocity $[u^\mu] =
(u^0, u^1, 0, 0)$ and 4-acceleration $[a^\mu] = (a^0, a^1, 0, 0)$ satisfy $a^1 = g u^0/c$ and $a^0 =
g u^1/c$. Show also that

$$\frac{d^2u^\mu}{d\tau^2} = \frac{g^2}{c^2}\,u^\mu,$$

where $\tau$ is the proper time as measured by an occupant of the spaceship. A spaceship
accelerates at a constant rate $g = 9.5\ \mathrm{m\,s^{-2}}$ in its own locally comoving frame.
It starts out towards the centre of the Galaxy 10 kpc distant. After going 5 kpc
it decelerates at the same rate to come to rest again at the Galactic centre. The
outward journey is then repeated in reverse to come back home. Show that, in the
spaceship's frame, the elapsed travel time is 41.5 years. What is the elapsed time
for the waiting observer (or descendants) on Earth?

5.8 Show that in its own instantaneous rest frame (IRF), a particle's 4-acceleration is
given by $[a^\mu] = (0, \vec{a})$, where $\vec{a}$ is the 3-acceleration of the particle in the IRF.

5.9 Show that, in an inertial frame in which a particle's 3-acceleration $\vec{a}$ is orthogonal
to its 3-velocity $\vec{u}$, the particle's 4-acceleration is given by $[a^\mu] = \gamma_u^2\,(0, \vec{a})$.

2

5.10 Show that when an electron and a positron annihilate, more than one photon must

be produced.

5.11 Show that if a photon is reflected from a mirror moving parallel to its plane, then

the angle of incidence of the photon is equal to the angle of reflection.

5.12 An inertial frame $S'$ moves with constant velocity $u$ along the $x$-axis with respect
to frame $S$. A photon in frame $S$ is fired at an angle $\theta$ to the forward direction of
motion. Show that the angle measured in frame $S'$ is

$$\tan\theta' = \frac{\tan\theta\,(1 - \beta^2)^{1/2}}{1 + \beta\sec\theta},$$

where $\beta = u/c$.

5.13 A photon with energy $E$ collides with a stationary electron whose rest mass is $m_0$.
As a result of the collision the direction of the photon's motion is deflected through
an angle $\theta$ and its energy is reduced to $E'$. Show that

$$\frac{1}{E'} - \frac{1}{E} = \frac{1}{m_0 c^2}\,(1 - \cos\theta).$$

Deduce that the wavelength of the photon is increased by

$$\Delta\lambda = \frac{2h}{m_0 c}\sin^2\frac{\theta}{2},$$

where $h$ is Planck's constant. At what angle to the initial photon direction does
the electron move? Show that, if the photon is deflected through a right angle, and
the photon energy satisfies $E \ll m_0 c^2$, then after the interaction the angle of the
electron's motion to the direction of the photon's initial motion is $\phi = -\pi/4$.

5.14 Inverse Compton scattering occurs whenever a photon scatters off a particle moving
with a speed very nearly equal to that of light. Suppose that a particle of rest mass
$m_0$ and total energy $E$ collides head on with a photon of energy $E_\gamma$. Show that the
scattered photon has energy

$$E\left(1 + \frac{m_0^2 c^4}{4 E E_\gamma}\right)^{-1}.$$

Ultra-high-energy cosmic rays have energies up to $10^{20}$ eV. How much energy can
a cosmic ray proton transfer to a microwave background photon?

5.15 For a pure 4-force $f$ acting on a particle of rest mass $m_0$, show that the corresponding
3-force $\vec{f}$ satisfies

$$\vec{f} = \gamma_u m_0\,\vec{a} + \frac{\vec{f}\cdot\vec{u}}{c^2}\,\vec{u}.$$

Hence show that $\vec{a}$ is only parallel to $\vec{f}$ when $\vec{f}$ is either parallel or orthogonal
to $\vec{u}$. Show further that, in these two cases, one has $\vec{f} = \gamma_u^3 m_0\,\vec{a}$ and $\vec{f} = \gamma_u m_0\,\vec{a}$
respectively.

5.16 For a pure 4-force f acting on a particle of rest mass m0 , show that

du

= uf

m0

d

5.17 In Minkowski spacetime, consider an emitter $\mathcal{E}$ moving at speed $v$ along the positive
$x^1$-axis of the frame $S$ in which a receiver $\mathcal{R}$ is at rest. Prove the Doppler shift
formula

$$\frac{\nu_E}{\nu_R} = \gamma_v\left(1 - \frac{v}{c}\cos\theta\right),$$

where $\theta$ is the angle made by the photon trajectory with the $x^1$-axis of $S$. Show that
this expression can be written in the manifestly covariant way

$$\frac{\nu_E}{\nu_R} = \frac{u_E^\mu k_\mu}{u_R^\mu k_\mu},$$

where $k$ is the photon 4-wavevector and $u_E$ and $u_R$ are the 4-velocities of $\mathcal{E}$ and $\mathcal{R}$
respectively.

5.18 An astronaut on the space rocket in Exercise 5.5 refers all his measurements to an
orthonormal tetrad $\hat{e}_\mu(\tau)$ that comprises the basis vectors of a Cartesian
instantaneous rest frame $S'$ at proper time $\tau$. Suppose that at $\tau = 0$ the tetrad coincides
with the fixed basis vectors $e_\mu$ of the $(ct, x, y, z)$ coordinate system in the inertial
frame $S$ and that the rocket is not rotating in any way. Show that, in the $(ct, x, y, z)$
coordinate system, the components of the astronaut's orthonormal tetrad at some
later proper time $\tau$ are

$$[\hat{e}_0^{\,\mu}] = \left(\cosh\frac{a\tau}{c},\; \sinh\frac{a\tau}{c},\; 0,\; 0\right),$$

$$[\hat{e}_1^{\,\mu}] = \left(\sinh\frac{a\tau}{c},\; \cosh\frac{a\tau}{c},\; 0,\; 0\right),$$

$$[\hat{e}_2^{\,\mu}] = (0,\; 0,\; 1,\; 0),$$

$$[\hat{e}_3^{\,\mu}] = (0,\; 0,\; 0,\; 1).$$

The astronaut observes photons that were emitted with frequency $\nu_0$ from a star that
is stationary at the origin of $S$. Show that the frequency of the photons as measured
by the astronaut at proper time $\tau$ is given by

$$\nu(\tau) = \nu_0\exp(-a\tau/c).$$

5.19 At some event $P$ in Minkowski spacetime, the worldline of a particle (either massive
or massless) and an observer cross. If, at this event, the particle has 4-momentum
$p$ and the observer has 4-velocity $u$ then show that the observer measures the
magnitude of the spatial momentum of the particle to be

$$|\vec{p}| = \left[\frac{(p\cdot u)^2}{c^2} - p\cdot p\right]^{1/2}.$$

5.20 Repeat Exercise 1.10 using 4-vectors.

5.21 In Minkowski spacetime, the coordinates $(cT, X, Y, Z)$ correspond to a Cartesian
inertial frame. The coordinates $(ct, r, \theta, \phi)$ are related to them by the equations

$$X = r\sin\theta\cos\phi, \qquad Y = r\sin\theta\sin\phi, \qquad Z = r\cos\theta.$$

Obtain the special-relativistic equations of motion of a free particle in the $(ct, r, \theta, \phi)$
coordinate system, and interpret these equations physically.

5.22 Repeat Exercise 5.21 for the coordinates $(ct, \rho, \phi, z)$ that are related to the Cartesian
inertial coordinates $(cT, X, Y, Z)$ by

$$T = t,$$

$$X = \rho\cos\phi\cos\omega t - \rho\sin\phi\sin\omega t,$$

$$Y = \rho\cos\phi\sin\omega t + \rho\sin\phi\cos\omega t,$$

$$Z = z,$$

where $\omega$ is a constant.

6

Electromagnetism

At the time special relativity was devised only two forces were known, electro-

magnetism and gravity. As mentioned in Chapter 1, it was electromagnetism that

actually led to the development of special relativity. Therefore, we now discuss

electromagnetism in some detail; in particular its relativistic formulation. This

will introduce a number of ideas that we will use later in developing and applying

a relativistic formulation of gravity, namely general relativity. Our guiding prin-

ciple here is to derive tensorial equations in Minkowski spacetime. This makes

it possible to express the theory in a form that is independent of the coordinate

system used. We will see that a consistent theory of electromagnetism follows

from saying that there exists a pure 4-force that depends linearly on 4-velocity

and also on a certain property of a particle, namely its charge q. Even if one has

no prior knowledge of electromagnetism, one can derive the complete theory in

a few lines using this basic assumption and occasional appeals to simplicity.

6.1 The electromagnetic force on a moving charge

In some inertial frame $S$, the 3-force on a particle of charge $q$ moving in an
electromagnetic field is

$$\vec{f} = q\,(\vec{E} + \vec{u}\times\vec{B}),$$

where $\vec{u}$ is the particle's 3-velocity in $S$. The 3-vector fields $\vec{E}$ and $\vec{B}$ are the
electric and magnetic fields as measured in $S$. This equation suggests that for the
proper relativistic formulation we should write down a tensor equation in
four-dimensional spacetime in which the electromagnetic 4-force $f$ depends linearly
on the particle's 4-velocity $u$. Thus we are led to an equation of the form

$$f = q\,F\cdot u, \qquad (6.1)$$


where $F$ must be a rank-2 tensor in order to make a 4-force from a 4-velocity.
We call $F$ the electromagnetic field tensor. The scalar $q$ is some property of the
particle that determines the strength of the electromagnetic force upon it (i.e. its
charge).

We could develop the theory entirely in terms of coordinate-independent
4-vectors and 4-tensors. Nevertheless, if we label points in spacetime with some
arbitrary coordinate system $x^\mu$, we may express (6.1) in component form as

$$f_\mu = q\,F_{\mu\nu}\,u^\nu,$$

where the $F_{\mu\nu}$ are the covariant components of $F$ in our chosen coordinate
system. In order that the rest mass of a particle is not altered by the action of
the electromagnetic force we require the latter to be a pure force, so that for any
4-velocity $u$ we have $u\cdot f = 0$. In component form this reads

$$f_\mu u^\mu = q\,F_{\mu\nu}\,u^\mu u^\nu = 0,$$

which implies that the electromagnetic field tensor must be antisymmetric, i.e.

$$F_{\mu\nu} = -F_{\nu\mu}.$$

The contravariant components of $F$ are given by

$$F^{\mu\nu} = g^{\mu\rho}\,g^{\nu\sigma}\,F_{\rho\sigma},$$

where the $g^{\mu\nu}$ are the contravariant components of the metric tensor in our
coordinate system. Since $g^{\mu\nu}$ is symmetric, it is clear that $F^{\mu\nu} = -F^{\nu\mu}$ also.

6.2 The 4-current density

So far we have found only the relativistic form of the electromagnetic force on

an idealised point particle with charge q and 4-velocity u, in terms of some as

yet undetermined rank-2 antisymmetric tensor F. In order to develop the theory

further, we must now construct the field equations of the theory, which determine

the electromagnetic field tensor $F(x)$ at any point in spacetime in terms of charges

and currents. To construct these field equations, we must first find a properly

relativistic (or covariant) way of expressing the source term. In other words, we

need to identify the 4-tensor, defined at each event in spacetime, that acts as the

source of the electromagnetic field.

Let us consider some general time-dependent charge distribution. At each
event $P$ in spacetime we can characterise the distribution completely by giving
the charge density $\rho$ and 3-velocity $\vec{u}$ as measured in some inertial frame. For
simplicity, let us consider the fluid in the frame $S$ in which $\vec{u} = \vec{0}$ at $P$. In this
frame, the (proper) charge density is given by $\rho_0 = q n_0$, where $q$ is the charge
on each particle and $n_0$ is the number of particles in a unit volume. In some
other frame $S'$, moving with speed $v$ relative to $S$, the volume containing a fixed
number of particles will be Lorentz contracted along the direction of motion (see
Figure 6.1). Hence in $S'$ the number density of particles is $n' = \gamma_v n_0$, from which
we obtain

$$\rho' = \gamma_v\,\rho_0.$$

Figure 6.1 The Lorentz contraction of a fluid element in the direction of motion. A cube of
side $l$ is contracted to length $l' = l/\gamma$ along the direction of motion.

Thus we see that the charge density is not a 4-scalar but does transform as the
0-component of a 4-vector. This suggests that the source term in the electromagnetic
field equations should be a 4-vector. At each point in spacetime, the obvious
choice is

$$j(x) = \rho_0(x)\,u(x),$$

where $\rho_0(x)$ is the proper charge density of the fluid (i.e. that measured by an
observer comoving with the local flow) and $u(x)$ is its 4-velocity. The squared
length of this 4-current density $j$ at any event is

$$j\cdot j = \rho_0^2\,c^2.$$

In an inertial frame $S$ the components of the 4-current density $j$ are

$$[j^\mu] = \rho_0\gamma_u\,(c, \vec{u}) = (c\rho, \vec{j}),$$

where $\rho$ is the charge density as measured in $S$ and $\vec{j}$ is the relativistic 3-current
density in $S$. Thus, we see that $c^2\rho^2 - j^2$ is a Lorentz invariant, where $j^2 = \vec{j}\cdot\vec{j}$.
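The invariance of $c^2\rho^2 - j^2$ is simple to confirm numerically. The sketch below builds the components of the 4-current density from a proper charge density and a fluid 3-velocity and evaluates the invariant; the function names and the sample values are assumptions made for this illustration.

```python
import math

C = 2.99792458e8  # speed of light (m/s)


def four_current(rho0, u3, c=C):
    """Components [j^mu] = rho0 * gamma_u * (c, u) for fluid 3-velocity u3."""
    speed_sq = sum(ui * ui for ui in u3)
    gamma = 1.0 / math.sqrt(1.0 - speed_sq / c**2)
    return (rho0 * gamma * c,) + tuple(rho0 * gamma * ui for ui in u3)


def invariant(j):
    """(c rho)^2 - j.j, which should equal rho0^2 c^2 in every inertial frame."""
    return j[0] ** 2 - sum(ji * ji for ji in j[1:])


if __name__ == "__main__":
    rho0 = 1.0e-6  # proper charge density in C m^-3, chosen arbitrarily
    for u3 in ((0.0, 0.0, 0.0), (0.6 * C, 0.0, 0.0), (0.3 * C, -0.5 * C, 0.2 * C)):
        j = four_current(rho0, u3)
        rho = j[0] / C  # charge density measured in this frame
        print(rho / rho0, invariant(j) / (rho0 * C) ** 2)
```

The first printed ratio is the Lorentz factor $\gamma_u$ by which the measured charge density exceeds $\rho_0$; the second should equal one for every velocity.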


6.3 The electromagnetic field equations

We are now in a position to write down the electromagnetic field equations. The
simplest way in which to relate the rank-2 electromagnetic field tensor $F$ to the
4-vector $j$ is to contract $F$ with some other 4-vector. Since there are no more
physical 4-vectors associated with the theory, the only other 4-vector that the field
equations can contain is the 4-gradient $\nabla$. Thus the field equations must be of the
form

$$\nabla\cdot F = k\,j, \qquad (6.2)$$

where $k$ is an unimportant constant related to our choice of units. In order to make
our final results more familiar, let us work in Cartesian inertial coordinates $x^\mu$
corresponding to some inertial frame $S$. In such a system, the covariant derivative
reduces simply to the partial derivative, and so we can write (6.2) in component
form as

$$\partial_\mu F^{\mu\nu} = k\,j^\nu. \qquad (6.3)$$

We can use this field equation to obtain the law for the conservation of charge.
If we take the partial derivative $\partial_\nu$ of (6.3), we obtain

$$\partial_\nu\partial_\mu F^{\mu\nu} = k\,\partial_\nu j^\nu. \qquad (6.4)$$

However, since $F^{\mu\nu}$ is antisymmetric, we can write the scalar on the left-hand
side as

$$\partial_\nu\partial_\mu F^{\mu\nu} = -\partial_\nu\partial_\mu F^{\nu\mu} = -\partial_\mu\partial_\nu F^{\nu\mu} = -\partial_\nu\partial_\mu F^{\mu\nu},$$

from which we deduce that $\partial_\nu\partial_\mu F^{\mu\nu} = 0$. Thus the right-hand side of (6.4) must
also be zero, so that

$$\partial_\mu j^\mu = 0.$$

Using 3-vector notation in the frame $S$, we may write this in a more familiar way:

$$\frac{\partial\rho}{\partial t} + \vec{\nabla}\cdot\vec{j} = 0,$$

which expresses the conservation of charge. This equation has the same form as
the non-relativistic equation of charge continuity, but the relativistic expressions
for $\rho$ and $\vec{j}$ must be used in it.

It is clear, however, that we do not yet have a viable theory. The field equations
of the theory are given by (6.3), but there are six independent components in
$F_{\mu\nu}$ and only four field equations. Evidently our theory is under-determined as it
stands. This suggests that $F$ could be constructed from a 4-vector 'potential' $A$.
Again working in Cartesian inertial coordinates $x^\mu$, let us write

$$F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu. \qquad (6.5)$$

Thus $F_{\mu\nu}$ is antisymmetric by construction and contains only four independent
fields $A_\mu$. Using the field equation (6.3), we can write

$$k\,j_\mu = k\,g_{\mu\nu}\,j^\nu = g_{\mu\nu}\,\partial_\sigma F^{\sigma\nu} = \partial_\sigma F^{\sigma}{}_{\mu},$$

where we have used the fact that the metric coefficients $g_{\mu\nu}$ in Cartesian inertial
coordinates $x^\mu$ are constants.¹ Hence, by substituting the expression (6.5),
we obtain the electromagnetic field equations in terms of the 4-vector potential $A$
as

$$\partial_\sigma\partial^\sigma A_\mu - \partial_\mu\partial_\sigma A^\sigma = k\,j_\mu. \qquad (6.6)$$

Alternatively, we can express electromagnetism entirely in terms of the
electromagnetic field tensor $F$. In this case, we require the two field equations

$$\partial_\mu F^{\mu\nu} = k\,j^\nu,$$
$$\partial_\sigma F_{\mu\nu} + \partial_\nu F_{\sigma\mu} + \partial_\mu F_{\nu\sigma} = 0, \qquad (6.7)$$

where the second of these is straightforwardly derived from (6.5). Using the
antisymmetrisation operation described in Section 4.3, the second equation can
also be written very succinctly as $\partial_{[\sigma}F_{\mu\nu]} = 0$. The constant $k$ may be found by
demanding consistency with the standard Maxwell equations (see Section 6.5). In
SI units we have $k = \mu_0$, where $\epsilon_0\mu_0 = 1/c^2$.

6.4 Electromagnetism in the Lorenz gauge

Suppose that we add an arbitrary 4-vector $Q$ to the 4-potential $A$. Thus, in
component form (in Cartesian inertial coordinates $x^\mu$, for example) we have

$$A^{(\mathrm{new})}_\mu = A_\mu + Q_\mu. \qquad (6.8)$$

Note that this is not a coordinate transformation. We are still working in the same
set of coordinates $x^\mu$ but have defined a new vector $A^{(\mathrm{new})}$, whose components
in this basis are given by (6.8). The new electromagnetic field tensor is then
given by

$$F^{(\mathrm{new})}_{\mu\nu} = \partial_\mu A^{(\mathrm{new})}_\nu - \partial_\nu A^{(\mathrm{new})}_\mu = \partial_\mu A_\nu - \partial_\nu A_\mu + \partial_\mu Q_\nu - \partial_\nu Q_\mu.$$

¹ In fact, such an operation is valid in any coordinate system. As we showed in Chapter 4, the covariant
derivative of the metric tensor is identically zero, which means that we can interchange the order of index
raising or lowering and covariant differentiation without affecting the result.

Clearly, we will recover the original electromagnetic field tensor provided that

$$\partial_\mu Q_\nu = \partial_\nu Q_\mu.$$

This equation can be satisfied if $Q_\mu$ is the gradient of some scalar field ($\psi$, say), so
that $Q_\mu = \partial_\mu\psi$. Thus we have uncovered a gauge freedom in the theory: we are
free to add the gradient of any scalar field $\psi$ to the 4-vector potential $A$, giving

$$A^{(\mathrm{new})}_\mu = A_\mu + \partial_\mu\psi, \qquad (6.9)$$

and still recover the same electromagnetic field tensor and hence the same
electromagnetic field equations. The transformation (6.9) is an example of a gauge
transformation and, as stated above, is distinct from a coordinate transformation.
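Gauge invariance of the field tensor can be demonstrated numerically. In the sketch below, the 4-potential and the scalar field are arbitrary smooth functions invented purely for illustration; the field tensor $F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu$ is estimated by central differences before and after the gauge transformation (6.9).

```python
import math


def potential(x):
    """A hypothetical smooth 4-potential A_mu(x), chosen only for illustration."""
    x0, x1, x2, x3 = x
    return (x1 * x2, x0 * math.sin(x3), x0 * x0, math.cos(x1))


def grad_psi(x):
    """Analytic gradient of the illustrative scalar psi = x1^2 sin(x0) + x2 x3."""
    x0, x1, x2, x3 = x
    return (x1 * x1 * math.cos(x0), 2.0 * x1 * math.sin(x0), x3, x2)


def gauge_transformed_potential(x):
    """A_mu + d_mu psi, as in the gauge transformation (6.9)."""
    return tuple(a + dpsi for a, dpsi in zip(potential(x), grad_psi(x)))


def field_tensor(A, x, h=1e-5):
    """F_{mu nu} = d_mu A_nu - d_nu A_mu, using central differences."""
    dA = []  # dA[mu][nu] approximates d_mu A_nu
    for mu in range(4):
        xp, xm = list(x), list(x)
        xp[mu] += h
        xm[mu] -= h
        Ap, Am = A(xp), A(xm)
        dA.append([(Ap[nu] - Am[nu]) / (2.0 * h) for nu in range(4)])
    return [[dA[mu][nu] - dA[nu][mu] for nu in range(4)] for mu in range(4)]


def max_difference(F, G):
    return max(abs(F[m][n] - G[m][n]) for m in range(4) for n in range(4))


if __name__ == "__main__":
    event = (0.3, -1.2, 0.7, 2.0)
    F_old = field_tensor(potential, event)
    F_new = field_tensor(gauge_transformed_potential, event)
    print(max_difference(F_old, F_new))  # expected to be at the level of numerical noise
```

The difference vanishes because partial derivatives commute, so $\partial_\mu\partial_\nu\psi - \partial_\nu\partial_\mu\psi = 0$ for any smooth $\psi$.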

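In the field equations

$$\partial_\sigma\partial^\sigma A_\mu - \partial_\mu\partial_\sigma A^\sigma = \mu_0\,j_\mu,$$

the second term on the left-hand side can be written as $\partial_\mu(\partial_\sigma A^\sigma)$. Thus, we can
make this term zero by choosing a scalar field $\psi$ such that

$$\partial_\sigma A^\sigma = 0. \qquad (6.10)$$

This condition is called the Lorenz gauge. It is worth noting that the condition
(6.10) is preserved by any further gauge transformation $A_\mu \to A_\mu + \partial_\mu\psi$ if and
only if $\partial_\sigma\partial^\sigma\psi = 0$.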

Adopting the Lorenz gauge allows the electromagnetic field equations to be
written very simply as

$$\partial_\sigma\partial^\sigma A_\mu = \partial^\sigma\partial_\sigma A_\mu = \mu_0\,j_\mu.$$

It is usual to write the four-dimensional Laplacian using the notation $\Box^2 \equiv
\partial_\sigma\partial^\sigma = \partial^\sigma\partial_\sigma$, where $\Box^2$ is the d'Alembertian operator.² In Cartesian inertial
coordinates $(ct, x, y, z)$,

$$\Box^2 = \frac{1}{c^2}\frac{\partial^2}{\partial t^2} - \frac{\partial^2}{\partial x^2} - \frac{\partial^2}{\partial y^2} - \frac{\partial^2}{\partial z^2}.$$

² This operator should properly be written $\nabla^2$, which is the inner product $\nabla\cdot\nabla$ of the 4-gradient with itself.
However, the notation we have adopted is quite common, since it makes clearer the distinction between the
four-dimensional Laplacian and the three-dimensional Laplacian $\vec{\nabla}^2 = \vec{\nabla}\cdot\vec{\nabla}$.

Then the electromagnetic field equations in the Lorenz gauge take the especially
simple form

$$\Box^2 A_\mu = \mu_0\,j_\mu,$$

together with the attendant gauge condition (6.10). Moreover, in the absence of
charges and currents, the right-hand side becomes zero and so $A_\mu$ has wave
solutions travelling at the speed of light, as do the components of $F_{\mu\nu}$ since in
this case we also have $\Box^2 F_{\mu\nu} = 0$.

6.5 Electric and magnetic fields in inertial frames

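We have not yet identified the components of $F$ (or $A$) with the familiar electric
and magnetic 3-vector fields $\vec{E}$ and $\vec{B}$ as observed in some Cartesian inertial frame
$S$. This is simply a matter of convention; we just have to name the components of
$A$ (say) in a way which results in 3-vector equations in $S$ that describe the physics
correctly in terms of the traditionally defined 3-vectors $\vec{E}$ and $\vec{B}$. Thus, in some
Cartesian inertial frame $S$, the components of $A$ are taken to be as follows:

$$[A^\mu] = \left(\frac{\phi}{c},\; \vec{A}\right),$$

where $\phi$ is the electrostatic potential and $\vec{A}$ is the traditional three-dimensional
vector potential. In terms of $\phi$ and $\vec{A}$, the Lorenz gauge condition becomes

$$\vec{\nabla}\cdot\vec{A} + \frac{1}{c^2}\frac{\partial\phi}{\partial t} = 0,$$

and, in this gauge, the field equations take the form

$$\Box^2\vec{A} = \mu_0\,\vec{j} \qquad\text{and}\qquad \Box^2\phi = \frac{\rho}{\epsilon_0}.$$

In terms of $\phi$ and $\vec{A}$, the electric and magnetic fields in $S$ are given by

$$\vec{B} = \vec{\nabla}\times\vec{A} \qquad\text{and}\qquad \vec{E} = -\vec{\nabla}\phi - \frac{\partial\vec{A}}{\partial t}. \qquad (6.11)$$

It is straightforward to show that these equations lead to the Maxwell equations
in their familiar form,

$$\vec{\nabla}\cdot\vec{E} = \frac{\rho}{\epsilon_0}, \qquad \vec{\nabla}\times\vec{E} = -\frac{\partial\vec{B}}{\partial t},$$

$$\vec{\nabla}\cdot\vec{B} = 0, \qquad \vec{\nabla}\times\vec{B} = \mu_0\,\vec{j} + \mu_0\epsilon_0\frac{\partial\vec{E}}{\partial t}.$$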


From the expressions (6.11) and (6.5) we have

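$$E^i = -\delta^{ij}\partial_j\phi - c\,\partial_0 A^i = -c\,\delta^{ij}(\partial_j A_0 - \partial_0 A_j) = -c\,\delta^{ij}F_{j0},$$

where we have used the fact that $A^0 = \eta^{0\mu}A_\mu = A_0$. Also, we have

$$B^1 = \partial_2 A^3 - \partial_3 A^2 = \partial_3 A_2 - \partial_2 A_3 = F_{32},$$

where we have used the fact that $A^i = \eta^{i\mu}A_\mu = -A_i$. Similar results hold for $B^2$ and $B^3$. Thus we find that the covariant components of F in $S$ are given by

$$[F_{\mu\nu}] = \begin{pmatrix} 0 & E^1/c & E^2/c & E^3/c \\ -E^1/c & 0 & -B^3 & B^2 \\ -E^2/c & B^3 & 0 & -B^1 \\ -E^3/c & -B^2 & B^1 & 0 \end{pmatrix}.$$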
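As an illustrative sketch (not part of the original derivation), the matrix of covariant components can be assembled numerically and the identifications $E^i = -cF_{i0}$ and $B^1 = F_{32}$ checked directly. The function name `field_tensor_covariant` and the field values are hypothetical choices made for this example, and the metric signature $(+,-,-,-)$ is assumed.

```python
# Sketch: assemble the covariant field tensor F_{mu nu} from E and B.
# Assumptions: signature (+,-,-,-), index 0 is time, SI units.
C = 299_792_458.0  # speed of light in m/s


def field_tensor_covariant(E, B, c=C):
    """Return the nested list [F_{mu nu}] for the 3-vectors E and B."""
    Ex, Ey, Ez = (e / c for e in E)
    Bx, By, Bz = B
    return [
        [0.0, Ex, Ey, Ez],
        [-Ex, 0.0, -Bz, By],
        [-Ey, Bz, 0.0, -Bx],
        [-Ez, -By, Bx, 0.0],
    ]


E = (1.0e3, -2.0e3, 5.0e2)       # V/m, arbitrary test values
B = (1.0e-6, 3.0e-6, -2.0e-6)    # T, arbitrary test values
F = field_tensor_covariant(E, B)

# Antisymmetry: F_{mu nu} = -F_{nu mu}
antisymmetric = all(F[m][n] == -F[n][m] for m in range(4) for n in range(4))

# Recover the fields: E^i = -c F_{i0}; B^1 = F_32, B^2 = F_13, B^3 = F_21
E_back = tuple(-C * F[i][0] for i in (1, 2, 3))
B_back = (F[3][2], F[1][3], F[2][1])
```

The cyclic pattern $B^1 = F_{32}$, $B^2 = F_{13}$, $B^3 = F_{21}$ is what "similar results hold for $B^2$ and $B^3$" amounts to in this layout.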

The corresponding electric and magnetic fields $\vec{E}'$ and $\vec{B}'$ in some other Cartesian inertial frame $S'$ are most easily obtained by calculating the components of the electromagnetic field tensor F or the 4-potential A in this frame. For example, if $S'$ is moving at speed $v$ relative to $S$ in standard configuration then the components in $S'$ are given by

$$A'^\mu = \Lambda^\mu{}_\nu A^\nu \qquad\text{and}\qquad F'^{\mu\nu} = \Lambda^\mu{}_\rho\Lambda^\nu{}_\sigma F^{\rho\sigma},$$

where the matrix $[\Lambda^\mu{}_\nu]$ is given in Chapter 5.
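The transformation rule for the field tensor can be tried out numerically. The following is a minimal sketch, assuming the standard-configuration boost matrix with $\Lambda^0{}_0 = \Lambda^1{}_1 = \gamma$ and $\Lambda^0{}_1 = \Lambda^1{}_0 = -\gamma v/c$, and the contravariant components $F^{i0} = E^i/c$, $F^{32} = B^1$ (and cyclic). All function names and field values are hypothetical. The result is compared with the closed-form relations quoted in Exercise 6.5, and the two quantities $c^2\vec{B}^2 - \vec{E}^2$ and $\vec{E}\cdot\vec{B}$ are evaluated in both frames.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def boost_matrix(v, c=C):
    """[Lambda^mu_nu] for S' moving at speed v along the x^1-axis of S."""
    gamma = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    beta = v / c
    return [
        [gamma, -gamma * beta, 0.0, 0.0],
        [-gamma * beta, gamma, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, 0.0, 1.0],
    ]


def contravariant_field_tensor(E, B, c=C):
    """[F^{mu nu}] with signature (+,-,-,-)."""
    Ex, Ey, Ez = (e / c for e in E)
    Bx, By, Bz = B
    return [
        [0.0, -Ex, -Ey, -Ez],
        [Ex, 0.0, -Bz, By],
        [Ey, Bz, 0.0, -Bx],
        [Ez, -By, Bx, 0.0],
    ]


def transform_tensor(F, L):
    """F'^{mu nu} = L^mu_rho L^nu_sigma F^{rho sigma}."""
    return [[sum(L[m][r] * L[n][s] * F[r][s]
                 for r in range(4) for s in range(4))
             for n in range(4)] for m in range(4)]


def fields_from_tensor(F, c=C):
    """Read E and B back off the contravariant components."""
    E = tuple(c * F[i][0] for i in (1, 2, 3))
    B = (F[3][2], F[1][3], F[2][1])
    return E, B


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


E = (1.0e3, -2.0e3, 5.0e2)       # V/m, arbitrary test values
B = (1.0e-6, 3.0e-6, -2.0e-6)    # T, arbitrary test values
v = 0.6 * C

F_prime = transform_tensor(contravariant_field_tensor(E, B), boost_matrix(v))
E_prime, B_prime = fields_from_tensor(F_prime)

# Closed-form component relations (Exercise 6.5)
gamma = 1.0 / math.sqrt(1.0 - (v / C) ** 2)
E_expected = (E[0], gamma * (E[1] - v * B[2]), gamma * (E[2] + v * B[1]))
B_expected = (B[0],
              gamma * (B[1] + v * E[2] / C ** 2),
              gamma * (B[2] - v * E[1] / C ** 2))

# The two quadratic quantities, evaluated in S and in S'
inv1 = (C ** 2 * dot(B, B) - dot(E, E),
        C ** 2 * dot(B_prime, B_prime) - dot(E_prime, E_prime))
inv2 = (dot(E, B), dot(E_prime, B_prime))
```

Working with the tensor components keeps the bookkeeping uniform: the same `transform_tensor` routine would apply to any rank-2 contravariant tensor, whereas the closed-form relations are specific to a boost along the $x^1$-axis.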

6.6 Electromagnetism in arbitrary coordinates

So far we have developed electromagnetic theory in Cartesian inertial coordinates.

In general, however, we are free to label points in the Minkowski spacetime using

any arbitrary coordinate system $x^\mu$. We could have developed the entire theory

in such an arbitrary system, or even in a coordinate-independent way by using

the 4-tensors themselves rather than their components in some coordinate system.

Nevertheless, having expressed the theory in Cartesian inertial coordinates, it is

now trivial to re-express it in a form valid in arbitrary coordinates.

As shown in (6.7), the electromagnetic field equations in Cartesian inertial

coordinates, when expressed in terms of F, are given by

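$$\partial_\mu F^{\mu\nu} = \mu_0 j^\nu,$$

$$\partial_\sigma F_{\mu\nu} + \partial_\nu F_{\sigma\mu} + \partial_\mu F_{\nu\sigma} = 0.$$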


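In such a coordinate system, the partial derivative $\partial_\mu$ is identical to the covariant derivative $\nabla_\mu$, so we can rewrite these equations as

$$\begin{aligned} \nabla_\mu F^{\mu\nu} &= \mu_0 j^\nu,\\ \nabla_\sigma F_{\mu\nu} + \nabla_\nu F_{\sigma\mu} + \nabla_\mu F_{\nu\sigma} &= 0. \end{aligned} \tag{6.12}$$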

These new equations are now fully covariant tensor equations, however, so that if

they are valid in one system of coordinates then they are valid in all coordinate

systems. Thus, (6.12) gives the electromagnetic field equations in an arbitrary

coordinate system! Once again, using the antisymmetrisation operation discussed in Section 4.3, one can write the second equation simply as $\nabla_{[\sigma}F_{\mu\nu]} = 0$.

A similar procedure can be performed for the electromagnetic field equations

when expressed in terms of the 4-vector potential A. From (6.6), in Cartesian

inertial coordinates we have

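$$\partial_\mu\partial^\mu A^\nu - \eta^{\nu\sigma}\partial_\sigma(\partial_\mu A^\mu) = \mu_0 j^\nu.$$

Once again, we can replace $\partial_\mu$ by $\nabla_\mu$, but in this case we must also replace $\eta^{\mu\nu}$ by $g^{\mu\nu}$, to obtain

$$\nabla_\mu\nabla^\mu A^\nu - g^{\nu\sigma}\nabla_\sigma(\nabla_\mu A^\mu) = \mu_0 j^\nu.$$

Again we have a fully covariant tensor equation, which must therefore be valid in any arbitrary coordinate system, the metric coefficients of which are $g_{\mu\nu}$.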

In arbitrary coordinates, the electromagnetic field equations still permit the

gauge transformation

$$A_\mu^{(\text{new})} = A_\mu + \nabla_\mu\psi = A_\mu + \partial_\mu\psi,$$

where the last equality holds because the covariant derivative of the scalar field $\psi$ is simply its partial derivative. We can again choose a scalar field $\psi$, so that

$$\nabla_\mu A^\mu = 0,$$

which is the Lorenz gauge condition in arbitrary coordinates. In this case the electromagnetic field equations can again be written in the form

$$\Box^2 A^\mu = \mu_0 j^\mu,$$


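but now the d'Alembertian operator is given by $\Box^2 = g^{\mu\nu}\nabla_\mu\nabla_\nu = \nabla^\mu\nabla_\mu$. In vacuo, we may again write $\Box^2 A^\mu = 0$ and $\Box^2 F_{\mu\nu} = 0$. Also, charge conservation is given in arbitrary coordinates by

$$\nabla_\mu j^\mu = 0.$$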

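Finally, we note that the components of F and A in two different arbitrary coordinate systems $x^\mu$ and $x'^\mu$ are related by

$$A'^\mu = \frac{\partial x'^\mu}{\partial x^\nu}A^\nu \qquad\text{and}\qquad F'^{\mu\nu} = \frac{\partial x'^\mu}{\partial x^\rho}\frac{\partial x'^\nu}{\partial x^\sigma}F^{\rho\sigma}.$$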

6.7 Equation of motion for a charged particle

From our original considerations in Section 6.1, we see that the coordinate-

invariant manner of writing the equation of motion of a charged particle in an

electromagnetic field is

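$$\frac{d\boldsymbol{p}}{d\tau} = m_0\frac{d\boldsymbol{u}}{d\tau} = q\,\boldsymbol{F}\cdot\boldsymbol{u},$$

where $m_0$ is the rest mass of the particle, $\boldsymbol{p}$ is its 4-momentum, $\boldsymbol{u}$ is its 4-velocity and $\tau$ is the proper time measured along its worldline. Note that the first equality holds because the electromagnetic force is a pure force.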

In Cartesian inertial coordinates, this becomes

$$m_0\frac{du^\mu}{d\tau} = qF^\mu{}_\nu u^\nu.$$

In a general coordinate system, however, the left-hand side is no longer valid since the ordinary derivative of the components of the 4-velocity along the particle's worldline must be replaced by the intrinsic derivative along the worldline. Using the expression for the intrinsic derivative given in Chapter 3, we find that in an arbitrary coordinate system the equation of motion of a particle in an electromagnetic field is

$$m_0\frac{Du^\mu}{D\tau} = m_0\left(\frac{du^\mu}{d\tau} + \Gamma^\mu{}_{\nu\sigma}u^\nu u^\sigma\right) = qF^\mu{}_\nu u^\nu,$$

where we have written $dx^\mu/d\tau$ as $u^\mu$ since the 4-velocity is the tangent to the particle's worldline $x^\mu(\tau)$.

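The equation for the particle's worldline in arbitrary coordinates is thus given by

$$\frac{d^2x^\mu}{d\tau^2} + \Gamma^\mu{}_{\nu\sigma}\frac{dx^\nu}{d\tau}\frac{dx^\sigma}{d\tau} = \frac{q}{m_0}F^\mu{}_\nu\frac{dx^\nu}{d\tau}. \tag{6.13}$$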
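To make the equation of motion concrete, here is a minimal numerical sketch for the special case of Cartesian inertial coordinates, where the connection coefficients vanish and the equation reduces to $m_0\,du^\mu/d\tau = qF^\mu{}_\nu u^\nu$. It assumes units with $c = 1$ and $q/m_0 = 1$, a uniform magnetic field along the $x^3$-axis, and a classical fourth-order Runge-Kutta step; the function names and parameter values are hypothetical. The mixed components used are $F^\mu{}_\nu = F^{\mu\sigma}\eta_{\sigma\nu}$ with signature $(+,-,-,-)$.

```python
import math


def mixed_field_tensor(E, B, c=1.0):
    """[F^mu_nu] in Cartesian inertial coordinates, signature (+,-,-,-)."""
    Ex, Ey, Ez = (e / c for e in E)
    Bx, By, Bz = B
    return [
        [0.0, Ex, Ey, Ez],
        [Ex, 0.0, Bz, -By],
        [Ey, -Bz, 0.0, Bx],
        [Ez, By, -Bx, 0.0],
    ]


def rhs(u, F, q_over_m):
    """du^mu/dtau = (q/m0) F^mu_nu u^nu."""
    return [q_over_m * sum(F[m][n] * u[n] for n in range(4)) for m in range(4)]


def rk4_step(u, F, q_over_m, dtau):
    """Advance the 4-velocity by one proper-time step dtau."""
    k1 = rhs(u, F, q_over_m)
    k2 = rhs([a + 0.5 * dtau * k for a, k in zip(u, k1)], F, q_over_m)
    k3 = rhs([a + 0.5 * dtau * k for a, k in zip(u, k2)], F, q_over_m)
    k4 = rhs([a + dtau * k for a, k in zip(u, k3)], F, q_over_m)
    return [a + dtau / 6.0 * (p + 2.0 * q + 2.0 * r + s)
            for a, p, q, r, s in zip(u, k1, k2, k3, k4)]


def minkowski_norm2(u):
    """u.u = (u^0)^2 - (u^1)^2 - (u^2)^2 - (u^3)^2."""
    return u[0] ** 2 - u[1] ** 2 - u[2] ** 2 - u[3] ** 2


c = 1.0
speed = 0.5                                   # initial 3-speed in units of c
gamma = 1.0 / math.sqrt(1.0 - speed ** 2)
u_start = [gamma * c, gamma * speed, 0.0, 0.0]

F = mixed_field_tensor(E=(0.0, 0.0, 0.0), B=(0.0, 0.0, 1.0))
q_over_m = 1.0

# With |qB/m0| = 1 the spatial part of u rotates once per proper time 2*pi.
n_steps = 2000
dtau = 2.0 * math.pi / n_steps

u = list(u_start)
for _ in range(n_steps):
    u = rk4_step(u, F, q_over_m, dtau)
```

Because $F_{\mu\nu}$ is antisymmetric, $u_\mu\,du^\mu/d\tau = 0$, so $\boldsymbol{u}\cdot\boldsymbol{u} = c^2$ should be preserved along the worldline; checking `minkowski_norm2(u)` after the loop is a simple test of the integration. In curvilinear coordinates the term $\Gamma^\mu{}_{\nu\sigma}u^\nu u^\sigma$ would have to be added to `rhs`.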


In the absence of an electromagnetic field (or for an uncharged particle), the

right-hand side is zero and we can recognise the result as the equation of a

geodesic.

In summary, the general procedure for converting an equation valid in Cartesian

inertial coordinates into one that is valid in an arbitrary coordinate system is as

follows:

• replace partial derivatives with covariant derivatives;

• replace ordinary derivatives along curves with intrinsic derivatives;

• replace $\eta_{\mu\nu}$ by $g_{\mu\nu}$.

Exercises

6.1 Show that the second Maxwell equation in (6.7) can be written as $\partial_{[\sigma}F_{\mu\nu]} = 0$.

6.2 Show that the Maxwell equation (6.6) is unchanged under the gauge transformation

(6.9).

6.3 In some Cartesian inertial frame $S$, the contravariant components of the electric and magnetic fields are $E^i$ and $B^i$ respectively. Show that the corresponding electromagnetic field-strength tensor has the contravariant components

$$[F^{\mu\nu}] = \begin{pmatrix} 0 & -E^1/c & -E^2/c & -E^3/c \\ E^1/c & 0 & -B^3 & B^2 \\ E^2/c & B^3 & 0 & -B^1 \\ E^3/c & -B^2 & B^1 & 0 \end{pmatrix}.$$

6.4 In a Cartesian inertial coordinate system in Minkowski spacetime the field equations

of electromagnetism can be written

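$$\partial_\mu F^{\mu\nu} = \mu_0 j^\nu,$$

$$\partial_\sigma F_{\mu\nu} + \partial_\nu F_{\sigma\mu} + \partial_\mu F_{\nu\sigma} = 0.$$

Show that these equations are equivalent to the standard form of Maxwell's equations in vacuo.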

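6.5 Two Cartesian inertial frames $S$ and $S'$ are in standard configuration. Show that the components of electric and magnetic fields in the two frames are related as follows:

$$\begin{aligned}
E'^1 &= E^1, & B'^1 &= B^1,\\
E'^2 &= \gamma(E^2 - vB^3), & B'^2 &= \gamma\left(B^2 + \frac{v}{c^2}E^3\right),\\
E'^3 &= \gamma(E^3 + vB^2), & B'^3 &= \gamma\left(B^3 - \frac{v}{c^2}E^2\right).
\end{aligned}$$

Show further that $c^2\vec{B}^2 - \vec{E}^2$ is Lorentz invariant.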


6.6 Show that the transformation equations derived in Exercise 6.5 can be written as

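$$\begin{aligned}
\vec{E}'_\parallel &= \vec{E}_\parallel, & \vec{E}'_\perp &= \gamma\left(\vec{E}_\perp + \vec{v}\times\vec{B}_\perp\right),\\
\vec{B}'_\parallel &= \vec{B}_\parallel, & \vec{B}'_\perp &= \gamma\left(\vec{B}_\perp - \frac{1}{c^2}\vec{v}\times\vec{E}_\perp\right),
\end{aligned}$$

where $\vec{v} = (v, 0, 0)$, and $\vec{E}_\parallel$ and $\vec{E}_\perp$ denote the projections of $\vec{E}$ parallel and orthogonal to $\vec{v}$ respectively (and similarly for $\vec{B}$). Explain why these equations must hold for a Lorentz boost $\vec{v}$ in an arbitrary direction with respect to the axes of $S$.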

6.7 Show that one may eliminate the explicit reference to the projections of E and B

in Exercise 6.6 and write the transformations as

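$$\vec{E}' = \gamma\left(\vec{E} + \vec{v}\times\vec{B}\right) + \frac{1-\gamma}{v^2}(\vec{v}\cdot\vec{E})\,\vec{v},$$

$$\vec{B}' = \gamma\left(\vec{B} - \frac{1}{c^2}\vec{v}\times\vec{E}\right) + \frac{1-\gamma}{v^2}(\vec{v}\cdot\vec{B})\,\vec{v}.$$

Show that $\vec{E}\cdot\vec{B}$ is a Lorentz invariant.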

6.8

6.9 In an arbitrary coordinate system, the second Maxwell equation reads

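$$\nabla_\sigma F_{\mu\nu} + \nabla_\nu F_{\sigma\mu} + \nabla_\mu F_{\nu\sigma} = 0.$$

Show that this can be written as

$$\partial_\sigma F_{\mu\nu} + \partial_\nu F_{\sigma\mu} + \partial_\mu F_{\nu\sigma} = 0$$

and hence show that $\nabla_{[\sigma}F_{\mu\nu]} = 0$.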

6.10 In Cartesian inertial coordinates, the equation of motion for a charged particle in

an electromagnetic field is

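$$m_0\frac{du^\mu}{d\tau} = qF^\mu{}_\nu u^\nu.$$

Show that

$$\frac{d\vec{p}}{dt} = q\left(\vec{E} + \vec{u}\times\vec{B}\right) \qquad\text{and}\qquad \frac{d\mathcal{E}}{dt} = q\vec{E}\cdot\vec{u},$$

where $\vec{p}$ and $\mathcal{E}$ are the 3-momentum and the energy respectively of the particle in $S$. Interpret these results physically.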

6.11 In some inertial frame S, show that the 3-acceleration of a charged particle in an

electromagnetic field is

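$$\vec{a} = \frac{d\vec{u}}{dt} = \frac{q}{m_0\gamma_u}\left[\vec{E} + \vec{u}\times\vec{B} - \frac{1}{c^2}(\vec{u}\cdot\vec{E})\,\vec{u}\right].$$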

7

The equivalence principle and spacetime curvature

We are now in a position to use the experience gained in deriving a relativistic

formulation of electromagnetism (together with some flashes of inspiration from

Einstein!) to begin our formulation of a relativistic theory of gravity, namely

general relativity.

7.1 Newtonian gravity

In our development of electromagnetism, we began by considering the electro-

magnetic 3-force on a charged particle. Let us therefore start our discussion of

gravity by considering the description of the gravitational force in the classical,

non-relativistic, theory of Newton. In the Newtonian theory, the gravitational

force $\vec{f}$ on a (test) particle of gravitational mass $m_G$ at some position is

$$\vec{f} = m_G\vec{g} = -m_G\vec\nabla\Phi,$$

where $\vec{g}$ is the gravitational field derived from the gravitational potential $\Phi$ at that position. In turn, the gravitational potential is determined by Poisson's equation:

$$\nabla^2\Phi = 4\pi G\rho, \tag{7.1}$$

where $\rho$ is the gravitational matter density and $G$ is Newton's gravitational constant. This is the field equation of Newtonian gravity.

It is clear from (7.1) that Newtonian gravity is not consistent with special

relativity. There is no explicit time dependence, which means that the potential $\Phi$ (and hence the gravitational force on a particle) responds instantaneously to a disturbance in the matter density $\rho$; this violates the special-relativistic requirement that signals cannot propagate faster than $c$. We might try to remedy this by noting


that the Laplacian operator $\nabla^2$ in (7.1) is equivalent to minus the d'Alembertian operator $\Box^2$ in the limit $c \to \infty$, and thus postulate the modified field equation

$$\Box^2\Phi = -4\pi G\rho.$$

However, this equation does not yield a consistent relativistic theory. It is still

not Lorentz covariant, since the matter density $\rho$ does not transform as a Lorentz

scalar. We shall discuss the transformation properties of the matter density later.

In addition to the incompatibility of Newtonian gravity with special relativity,

there is a second fundamental difference between the electromagnetic and gravitational forces. The equation of motion of a particle of inertial mass $m_I$ in a gravitational field is given by

$$\frac{d^2\vec{x}}{dt^2} = -\frac{m_G}{m_I}\vec\nabla\Phi. \tag{7.2}$$

It is a well-established experimental fact, however, that the ratio $m_G/m_I$ appearing in the equation of motion is the same for all particles. By an appropriate choice of units one may thus arrange for this ratio to equal unity. In contrast, the ratio $q/m_I$ occurring in the equation of motion of a charged particle in an electromagnetic field is not the same for all particles. From (7.2), we thus see that the trajectory

through space of a particle in a gravitational field is independent of the nature of

the particle.
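The contrast between the two ratios can be illustrated with a few lines of arithmetic. This is only a sketch under stated assumptions: uniform fields with illustrative strengths, $m_G = m_I$ for both particles, and rounded CODATA values for the electron and proton masses and the elementary charge; the helper `acceleration` is a hypothetical name.

```python
# Sketch: Newtonian acceleration a = (coupling / m_I) * field for a force
# f = coupling * field, where the coupling is m_G for gravity and q for
# an electric field. Field strengths below are illustrative only.

def acceleration(coupling, inertial_mass, field_strength):
    """Return (coupling / inertial_mass) * field_strength."""
    return (coupling / inertial_mass) * field_strength


M_ELECTRON = 9.1093837e-31   # kg (rounded CODATA value)
M_PROTON = 1.67262192e-27    # kg (rounded CODATA value)
Q_E = 1.602176634e-19        # C, elementary charge

g_field = -9.81   # m/s^2, uniform gravitational field
E_field = 1.0e3   # V/m, uniform electric field

# Gravity: the coupling m_G is taken equal to m_I, so the ratio is unity
a_grav_electron = acceleration(M_ELECTRON, M_ELECTRON, g_field)
a_grav_proton = acceleration(M_PROTON, M_PROTON, g_field)

# Electromagnetism: the coupling is the charge, and q/m_I differs widely
a_em_electron = acceleration(-Q_E, M_ELECTRON, E_field)
a_em_proton = acceleration(Q_E, M_PROTON, E_field)

ratio = abs(a_em_electron / a_em_proton)   # equals m_p / m_e
```

Both particles share the same gravitational acceleration, whereas their accelerations in the electric field differ in sign and by the proton-to-electron mass ratio.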

This equivalence of the gravitational and inertial masses (which allows us to refer simply to 'the mass') is a truly remarkable coincidence in the Newtonian theory. In this theory, there is no a priori reason why the quantity that determines the magnitude of the gravitational force on the particle should equal the quantity that determines the particle's 'resistance' to an applied force in general. It appears as an isolated experimental result, which has since been verified to an accuracy of at least one part in $10^{11}$ (by Dicke and co-workers).

7.2 The equivalence principle

The equality of the gravitational and inertial masses of a particle led Einstein

to his classic 'elevator' thought experiment. Consider an observer in a freely falling elevator (i.e. after the lift cable has been cut). Objects released from rest relative to the elevator cabin remain floating 'weightless' in the cabin.

A projectile shot from one side of the elevator to the other appears to move

in a straight line at constant velocity, rather than in the usual curved trajectory.

All this follows from the fact that the acceleration of any particle relative to


the elevator is zero: the particle and the elevator cabin have the same acceleration relative to the Earth as a result of the equivalence of gravitational and inertial mass.

All these observations would hold exactly if the gravitational field of the Earth

were truly uniform. Of course, the gravitational field of the Earth is not uniform

but acts radially inwards towards its centre of mass, with a strength proportional

to $1/r^2$. Thus, if the elevator were left to free-fall for a long time or if it were very large (i.e. a significant fraction of the Earth's radius), two particles released from

rest near the walls of the elevator would gradually drift inwards, since they would

both be falling along radial lines towards the centre of the Earth (see Figure 7.1).

Furthermore, as a result of the varying strength of the gravitational field, particles

released from rest near the floor of the elevator would gradually drift downwards

whereas those near the ceiling would drift upwards. The observer in the elevator would be experiencing the tidal forces that result from the residual inhomogeneity in the strength and direction of the gravitational field once the main acceleration has been subtracted. It should always be remembered that these tidal forces can never be completely abolished in an elevator (laboratory) of finite, i.e. non-zero, size.
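The size of these residual accelerations can be estimated with a short calculation. The sketch below assumes a Newtonian point-mass field for the Earth, approximate values for $G$, the Earth's mass and its radius, and an elevator whose centre sits at the Earth's surface; the function name `g_vector` and the offset `d` are hypothetical choices. It compares the field at the centre of the cabin with the field one metre to the side and one metre above.

```python
import math

G = 6.674e-11        # m^3 kg^-1 s^-2 (approximate)
M_EARTH = 5.972e24   # kg (approximate)
R_EARTH = 6.371e6    # m (approximate mean radius)


def g_vector(x, z):
    """Newtonian field (g_x, g_z) at (x, z) of a point mass at the origin."""
    r = math.hypot(x, z)
    k = -G * M_EARTH / r ** 3
    return (k * x, k * z)


d = 1.0  # offset from the cabin centre, in metres

centre = g_vector(0.0, R_EARTH)
side = g_vector(d, R_EARTH)            # displaced horizontally
ceiling = g_vector(0.0, R_EARTH + d)   # displaced vertically upwards

# Accelerations relative to the freely falling cabin centre
tidal_side_x = side[0] - centre[0]         # negative: drifts inwards
tidal_ceiling_z = ceiling[1] - centre[1]   # positive: drifts upwards

# Leading-order estimates: -G M d / r^3 sideways and +2 G M d / r^3 vertically
scale = G * M_EARTH / R_EARTH ** 3
```

With these numbers `scale` is of order $10^{-6}\ \mathrm{s^{-2}}$, so over a metre the relative accelerations are of order $10^{-6}\ \mathrm{m\,s^{-2}}$: small, but non-zero for any cabin of finite size.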

Nevertheless, provided that we consider the elevator cabin over a short time

period and that it is spatially small, then a freely falling elevator (which may

have $(x, y, z)$ coordinates marked on its walls and an elevator clock measuring time $t$) resembles a Cartesian inertial frame of reference, and therefore the laws of special relativity hold inside the elevator.¹ These observations lead to

The equivalence principle: In a freely falling (non-rotating) laboratory occupying a small region of spacetime, the laws of physics are those of special relativity.²

7.3 Gravity as spacetime curvature

These observations led Einstein to make a profound proposal that simultaneously

provides for a relativistic description of gravity and incorporates in a natural way

the equivalence principle (and consequently the equivalence of gravitational and

inertial mass). Einstein's proposal was that gravity should no longer be regarded

as a force in the conventional sense but rather as a manifestation of the curvature

of the spacetime, this curvature being induced by the presence of matter. This is

the central idea underpinning the theory of general relativity.

¹ The elevator cabin must not only occupy a small region of spacetime but also be non-rotating with respect to distant matter in the universe. This statement is related to Mach's principle.

² This is in fact a statement of the strong equivalence principle, since it refers to all the laws of physics. The more modest weak equivalence principle refers only to the trajectories of freely falling particles.


Figure 7.1 An elevator in free-fall towards the Earth.

If gravity is regarded as a manifestation of the curvature of spacetime itself, and not as the action of some 4-force $\boldsymbol{f}$ defined on the manifold, then the equation of motion of a particle moving only under the influence of gravity must be that of a 'free' particle in the curved spacetime, i.e.

$$\frac{d\boldsymbol{p}}{d\tau} = \boldsymbol{0},$$

where $\boldsymbol{p}$ is the particle's 4-momentum and $\tau$ is the proper time measured along the particle's worldline. Thus, the worldline of a particle freely falling under gravity is a geodesic in the curved spacetime.

The equivalence principle restricts the possible geometry of the curved spacetime to pseudo-Riemannian, as follows. The mathematical meaning of the equivalence principle is that it requires that at any event $P$ in the spacetime manifold we must be able to define a coordinate system $X^\mu$ such that, in the local neighbourhood of $P$, the line element of spacetime takes the form

$$ds^2 \approx \eta_{\mu\nu}\,dX^\mu dX^\nu,$$

where exact equality holds at the event $P$. From the geodesic equation (as shown in Chapter 5), in such a coordinate system the path of a 'free' particle, i.e. one moving only under the influence of gravity, in the vicinity of the event $P$ is given by

$$\frac{d^2X^i}{dT^2} \approx 0,$$

where $i = 1, 2, 3$ and we have denoted $X^0$ by $cT$ (once again the equality in the above equations holds exactly at $P$). Thus, in the vicinity of $P$ the coordinates $X^\mu$ define a local Cartesian inertial frame (like our small elevator considered over a short time interval), in which the laws of special relativity hold locally. In order


that we can construct such a system, spacetime must be a pseudo-Riemannian manifold (which is curved and four-dimensional). For such a manifold, in some arbitrary coordinate system $x^\mu$ the line element takes the general form

$$ds^2 = g_{\mu\nu}\,dx^\mu dx^\nu.$$

7.4 Local inertial coordinates

The curvature of spacetime means that it is not possible to find coordinates in which the metric $g_{\mu\nu} = \eta_{\mu\nu}$ at all points in the manifold. Thus, it is not possible to define global Cartesian inertial frames as we could in the pseudo-Euclidean Minkowski spacetime. Instead, we are forced to use arbitrary coordinate systems $x^\mu$ to label events in spacetime, and these coordinates often do not have simple physical meanings. It is often the case that $x^0$ is a timelike coordinate and the $x^i$ ($i = 1, 2, 3$) are spacelike (i.e. the tangent vector to the $x^0$ coordinate curve is timelike at all points, and similarly the tangent vectors to the $x^i$ coordinate curves are always spacelike). This allocation of coordinates is not necessary, however, and it is sometimes useful to define null coordinates. In any case, the arbitrary coordinates $x^\mu$ need not have any direct physical interpretation.

Nevertheless, as demanded by the equivalence principle, problems of physical

meaning can always be overcome by transforming, at any event P in the curved

spacetime, to a local inertial coordinate system $X^\mu$, which, in a limited region of spacetime about $P$, corresponds to a freely falling, non-rotating, Cartesian frame over a short time interval. Mathematically, this corresponds to constructing about the event $P$ a coordinate system $X^\mu$ such that

$$g_{\mu\nu}(P) = \eta_{\mu\nu} \qquad\text{and}\qquad (\partial_\sigma g_{\mu\nu})_P = 0. \tag{7.3}$$

This also means that $\Gamma^\mu{}_{\nu\sigma}(P) = 0$ and that the coordinate basis vectors at the event $P$ form an orthonormal set, i.e.

$$\boldsymbol{e}_\mu(P)\cdot\boldsymbol{e}_\nu(P) = \eta_{\mu\nu}. \tag{7.4}$$

There are in fact an infinite number of local inertial coordinate systems at P, all

of which are related to one another by Lorentz transformations. In other words,

if a coordinate system $X^\mu$ satisfies the conditions (7.3), and hence the condition

(7.4), then so too will the coordinate system