The symbols | | are used to exclude unwanted indices from the (anti-)symmetrisation implied by ( ) and [ ].

4.4 The metric tensor

The most important tensor that one can define on a manifold is the metric tensor

g. This defines a linear map of two vectors into the number that is their inner

product, i.e.

$$\mathbf{g}(\mathbf{u}, \mathbf{v}) \equiv \mathbf{u} \cdot \mathbf{v}. \tag{4.2}$$

From this definition, it is clear that g is a symmetric second-rank tensor. Its

covariant and contravariant components are given by

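$$g_{ab} = \mathbf{g}(\mathbf{e}_a, \mathbf{e}_b) = \mathbf{e}_a \cdot \mathbf{e}_b \quad\text{and}\quad g^{ab} = \mathbf{g}(\mathbf{e}^a, \mathbf{e}^b) = \mathbf{e}^a \cdot \mathbf{e}^b,$$

which, from (4.2), clearly match our earlier definitions. As we showed in Chapter 3, the matrix $[g^{ab}]$ containing the contravariant components of the metric tensor is the inverse of the matrix $[g_{ab}]$ that contains its covariant components.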

The mixed components of g are given by

$$\mathbf{g}(\mathbf{e}^b, \mathbf{e}_a) = \mathbf{g}(\mathbf{e}_a, \mathbf{e}^b) = \delta^b_a,$$

where the last equality is a result of the reciprocity relation between basis vectors

and their duals.


4.5 Raising and lowering tensor indices

The contravariant and covariant components of the metric tensor can be used

for raising and lowering general tensor indices, just as they are used for vector

indices. As we have seen, when a tensor acts on different combinations of basis

and dual basis vectors it yields different components. Consider, for example, a

third-rank tensor t. Its covariant components are given by

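$$\mathbf{t}(\mathbf{e}_a, \mathbf{e}_b, \mathbf{e}_c) = t_{abc}, \tag{4.3}$$

whereas one possible set of mixed components of the tensor is given by

$$\mathbf{t}(\mathbf{e}_a, \mathbf{e}_b, \mathbf{e}^c) = t_{ab}{}^c.$$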

As we stated earlier, in general these two sets of components will differ, since the basis and dual basis vectors are related by the metric: $\mathbf{e}_c = g_{cd}\,\mathbf{e}^d$. Thus, for example,

$$t_{abc} = g_{cd}\,t_{ab}{}^d.$$

In a similar way we can raise or lower more than one index at a time. For example,

$$t^a{}_{bc} = g^{ad} g_{ce}\,t_{db}{}^e.$$
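The index gymnastics above can be sketched numerically. The following is a minimal illustration, not part of the original text: it assumes a two-dimensional plane in polar coordinates evaluated at r = 2 (so that the metric is diagonal), and the component values in `t_mixed` as well as the helper names `lower_last`, `raise_first` and `raise_last` are hypothetical choices made only for this example.

```python
# Raising and lowering the indices of a rank-3 tensor with the metric.
# Assumption: plane polar coordinates (r, theta) at r = 2, so g_ab = diag(1, r^2).
N = 2
r = 2.0
g_lo = [[1.0, 0.0], [0.0, r * r]]           # covariant components g_ab
g_up = [[1.0, 0.0], [0.0, 1.0 / (r * r)]]   # contravariant components g^ab

# Hypothetical mixed components t_ab^d, stored as t_mixed[a][b][d].
t_mixed = [[[float(1 + a + 2 * b + 4 * d) for d in range(N)]
            for b in range(N)] for a in range(N)]


def lower_last(t):
    """Return t_abc = g_cd t_ab^d."""
    return [[[sum(g_lo[c][d] * t[a][b][d] for d in range(N))
              for c in range(N)] for b in range(N)] for a in range(N)]


def raise_first(t):
    """Return t^a_bc = g^ad t_dbc."""
    return [[[sum(g_up[a][d] * t[d][b][c] for d in range(N))
              for c in range(N)] for b in range(N)] for a in range(N)]


def raise_last(t):
    """Return t_ab^c = g^cd t_abd."""
    return [[[sum(g_up[c][d] * t[a][b][d] for d in range(N))
              for c in range(N)] for b in range(N)] for a in range(N)]


t_cov = lower_last(t_mixed)        # t_abc = g_cd t_ab^d
t_up_first = raise_first(t_cov)    # t^a_bc = g^ad g_ce t_db^e
round_trip = raise_last(t_cov)     # raising the index again recovers t_ab^c
```

Because $[g^{ab}]$ is the inverse of $[g_{ab}]$, lowering an index and then raising it again returns the original components, which is what `round_trip` checks.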

4.6 Mapping tensors into tensors

Tensors can be thought of not just as maps between vectors and real numbers

but also as maps between tensors and other tensors. Consider, for example, a

third-rank tensor t, but let us not ‘fill’ all of its argument ‘slots’ with vectors. If, for instance, we fill just its last slot with some fixed vector u, we have the object

$$\mathbf{t}(\cdot, \cdot, \mathbf{u}). \tag{4.4}$$

What sort of object is this? Well, it is clear that, if we supply two further vectors to

this object, we will obtain a real number. Thus the object (4.4) is itself a second-

rank tensor, which we could denote by s (say). Thus the third-rank tensor t has

‘mapped’ the vector u into the second-rank tensor s. The covariant components (say) of s are given by

$$s_{ab} \equiv \mathbf{s}(\mathbf{e}_a, \mathbf{e}_b) = \mathbf{t}(\mathbf{e}_a, \mathbf{e}_b, \mathbf{u}) = t_{abc} u^c,$$

where, in the last slot, we have expressed $\mathbf{u}$ as $u^c \mathbf{e}_c$. By expressing this vector as $u_c \mathbf{e}^c$ instead, we obtain the equivalent expression $s_{ab} = t_{ab}{}^c u_c$.

As a further example of mapping between tensors, let us fill both the first and

last slots of t with fixed vectors v and u respectively to obtain the object

$$\mathbf{t}(\mathbf{v}, \cdot, \mathbf{u}).$$

98 Tensor calculus on manifolds

Clearly, this object is a first-rank tensor (or vector), which we denote by w. Thus

the third-rank tensor t has mapped the two vectors v and u into the vector w. The

covariant components (say) of w are

$$w_b = \mathbf{w}(\mathbf{e}_b) = \mathbf{t}(\mathbf{v}, \mathbf{e}_b, \mathbf{u}),$$

which can be expressed in several equivalent ways, i.e.

$$w_b = t_{abc} v^a u^c = t^a{}_{bc} v_a u^c = t_{ab}{}^c v^a u_c = t^a{}_b{}^c v_a u_c.$$

The number of free indices in such expressions is the rank of the resulting tensor.
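As a rough illustration of filling argument slots, the sketch below contracts a rank-3 component array with one or two vectors. The component values and the function names `fill_last_slot` and `fill_first_and_last` are invented for this example; only the index expressions $s_{ab} = t_{abc} u^c$ and $w_b = t_{abc} v^a u^c$ come from the text.

```python
# Filling the slots of a rank-3 tensor with fixed vectors (illustrative values).
N = 3

# Hypothetical covariant components t_abc, stored as t[a][b][c].
t = [[[a + 10 * b + 100 * c for c in range(N)]
      for b in range(N)] for a in range(N)]
u = [1, 0, 2]   # contravariant components u^c
v = [0, 1, 1]   # contravariant components v^a


def fill_last_slot(t, u):
    """Return s_ab = t_abc u^c (two free indices, so a rank-2 object)."""
    return [[sum(t[a][b][c] * u[c] for c in range(N))
             for b in range(N)] for a in range(N)]


def fill_first_and_last(t, v, u):
    """Return w_b = t_abc v^a u^c (one free index, so a rank-1 object)."""
    return [sum(t[a][b][c] * v[a] * u[c]
                for a in range(N) for c in range(N))
            for b in range(N)]


s = fill_last_slot(t, u)
w = fill_first_and_last(t, v, u)

# Filling the remaining first slot of s with v gives the same w.
w_from_s = [sum(s[a][b] * v[a] for a in range(N)) for b in range(N)]
```

Counting the free indices left after each contraction gives the rank of the result, as stated above.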

4.7 Elementary operations with tensors

Tensor calculus is concerned with tensorial operations, that is, operations on

tensors which result in quantities that are still tensors. We now consider some

elementary tensorial operations.

Addition (and subtraction)

It is clear from the definition of a tensor that the sum and difference of two tensors

of rank N are both themselves tensors of rank N . For example, the covariant

components (say) of the sum s and difference d of two rank-2 tensors are given

straightforwardly by

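$$\begin{aligned}
s_{ab} &= \mathbf{s}(\mathbf{e}_a, \mathbf{e}_b) = \mathbf{t}(\mathbf{e}_a, \mathbf{e}_b) + \mathbf{r}(\mathbf{e}_a, \mathbf{e}_b) = t_{ab} + r_{ab}, \\
d_{ab} &= \mathbf{d}(\mathbf{e}_a, \mathbf{e}_b) = \mathbf{t}(\mathbf{e}_a, \mathbf{e}_b) - \mathbf{r}(\mathbf{e}_a, \mathbf{e}_b) = t_{ab} - r_{ab}.
\end{aligned} \tag{4.5}$$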

Multiplication by a scalar

If $\mathbf{t}$ is a rank-$N$ tensor then so too is $\alpha\mathbf{t}$, where $\alpha$ is some arbitrary real constant. Clearly, its components are all multiplied by $\alpha$.

Outer product

The outer or tensor product of two tensors produces a tensor of higher rank. The

simplest example of an outer product is that of two vectors. This is defined as the

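rank-2 tensor, denoted by $\mathbf{u} \otimes \mathbf{v}$, such that

$$(\mathbf{u} \otimes \mathbf{v})(\mathbf{p}, \mathbf{q}) \equiv \mathbf{u}(\mathbf{p})\,\mathbf{v}(\mathbf{q}),$$

where $\mathbf{p}$ and $\mathbf{q}$ are arbitrary vector arguments (this notation is not to be confused with the vector product $\mathbf{u} \times \mathbf{v}$ of two vectors, which is itself a vector). Note that the outer product is not, in general, commutative, so that $\mathbf{u} \otimes \mathbf{v}$ and $\mathbf{v} \otimes \mathbf{u}$ are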


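different rank-2 tensors. The covariant components (say) of $\mathbf{u} \otimes \mathbf{v}$ in some basis are given by

$$(\mathbf{u} \otimes \mathbf{v})(\mathbf{e}_a, \mathbf{e}_b) \equiv \mathbf{u}(\mathbf{e}_a)\,\mathbf{v}(\mathbf{e}_b) = u_a v_b.$$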

The outer product of higher-rank tensors is a simple generalisation of the outer

product of two vectors. For example, the outer product of a rank-2 tensor t with

a rank-1 tensor s is defined by

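$$(\mathbf{t} \otimes \mathbf{s})(\mathbf{p}, \mathbf{q}, \mathbf{r}) \equiv \mathbf{t}(\mathbf{p}, \mathbf{q})\,\mathbf{s}(\mathbf{r}).$$

This is a rank-3 tensor, which we could call $\mathbf{h}$. The mixed components, for instance, of this tensor are given by

$$h^a{}_{bc} = \mathbf{t}(\mathbf{e}^a, \mathbf{e}_b)\,\mathbf{s}(\mathbf{e}_c) = t^a{}_b s_c. \tag{4.6}$$

In general, the outer product of an $N$th-rank tensor with an $M$th-rank tensor will produce an $(N + M)$th-rank tensor.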

Contraction (and inner product)

The contraction of a tensor is performed by summing over the basis and dual

basis vectors in two of its vector arguments, and it results in a tensor of lower

rank. Let us take as an example a rank-3 tensor h and consider the quantity

$$\mathbf{q}(\cdot) = \mathbf{h}(\mathbf{e}^a, \cdot, \mathbf{e}_a).$$

This is clearly a rank-1 tensor with covariant components (say) given by

$$q_b = \mathbf{h}(\mathbf{e}^a, \mathbf{e}_b, \mathbf{e}_a) = h^a{}_{ba}. \tag{4.7}$$

Thus in terms of tensor components, contraction amounts to setting a subscript equal to a superscript and summing, as the summation convention requires. In general, performing a single contraction on an $N$th-rank tensor will produce a tensor of rank $N - 2$.

Contraction may be combined with tensor multiplication to obtain the inner

product of two tensors. For example, if $h^a{}_{bc}$ were in fact given by (4.6), then (4.7) could be written as

$$q_b = \mathbf{t}(\mathbf{e}^a, \mathbf{e}_b)\,\mathbf{s}(\mathbf{e}_a) = t^a{}_b s_a,$$

which is the inner product of the tensors $\mathbf{t}$ and $\mathbf{s}$. Alternatively, one could view the $q_b$ as a contraction of the rank-3 tensor having components $t^a{}_b s_c$, which is the outer product $\mathbf{t} \otimes \mathbf{s}$.

If two tensors $\mathbf{t}$ and $\mathbf{s}$ are rank 2 or lower then we can denote their inner product unambiguously by $\mathbf{t} \cdot \mathbf{s}$. Note, however, that in general such an inner product is not commutative. For example, if $\mathbf{t}$ is a rank-2 tensor and $\mathbf{s}$ is rank 1 then the contravariant components (say) of the vectors $\mathbf{t} \cdot \mathbf{s}$ and $\mathbf{s} \cdot \mathbf{t}$ are respectively

$$t^{ab} s_b \quad\text{and}\quad t^{ab} s_a.$$

Clearly, the ‘dot’ notation for the inner product becomes ambiguous if either tensor is rank 3 or higher, since there is then a choice concerning which indices to contract.
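The relationship between the outer product, contraction and the two inner products can be sketched with small component arrays. The numbers below are arbitrary illustrative values and the function name `outer` is a choice made for this example only.

```python
# Outer product followed by contraction, for small hypothetical component arrays.
N = 2
t = [[1, 2], [3, 4]]   # contravariant components t^ab (illustrative values)
s = [5, 6]             # covariant components s_a (illustrative values)


def outer(t, s):
    """Outer product: h^ab_c = t^ab s_c, stored as h[a][b][c]."""
    return [[[t[a][b] * s[c] for c in range(N)]
             for b in range(N)] for a in range(N)]


h = outer(t, s)

# Contract the second index with the third: (t . s)^a = h^ab_b = t^ab s_b.
t_dot_s = [sum(h[a][b][b] for b in range(N)) for a in range(N)]

# Contract the first index with the third: (s . t)^b = h^ab_a = t^ab s_a.
s_dot_t = [sum(h[a][b][a] for a in range(N)) for b in range(N)]
```

Here `t_dot_s` evaluates to `[17, 39]` while `s_dot_t` evaluates to `[23, 34]`, so the two inner products differ unless the components $t^{ab}$ happen to be symmetric.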

4.8 Tensors as geometrical objects

We have seen that a rank-1 tensor $\mathbf{t}(\cdot)$ can be identified as a vector. The covariant and contravariant components of this vector in some basis are given by

$$\mathbf{t}(\mathbf{e}_a) = t_a \quad\text{and}\quad \mathbf{t}(\mathbf{e}^a) = t^a.$$

We are used to thinking of a vector $\mathbf{t}$ as a geometrical object which can be made up from a linear combination of the basis vectors,

$$\mathbf{t} = t^a \mathbf{e}_a = t_a \mathbf{e}^a. \tag{4.8}$$

Tensors of higher rank are generalisations of the concept of a vector and can

also be regarded as geometrical entities. In a particular basis, a general tensor

can be expressed as a linear combination of a tensor basis made up from the basis vectors and their duals.

Let us consider the outer product $\mathbf{e}_a \otimes \mathbf{e}_b$ of two basis vectors of some coordinate system. The contravariant components of this rank-2 tensor in this basis are very simple,

$$(\mathbf{e}_a \otimes \mathbf{e}_b)(\mathbf{e}^c, \mathbf{e}^d) = \mathbf{e}_a(\mathbf{e}^c)\,\mathbf{e}_b(\mathbf{e}^d) = \delta^c_a \delta^d_b.$$

Now suppose that we have some general rank-2 tensor $\mathbf{t}$, whose contravariant components in our basis are $t^{ab}$. Let us consider the quantity $t^{ab}\,\mathbf{e}_a \otimes \mathbf{e}_b$. This is a sum of rank-2 tensors, which must therefore also be a rank-2 tensor (see above). If we consider its action on two dual basis vectors, we find

$$t^{ab}(\mathbf{e}_a \otimes \mathbf{e}_b)(\mathbf{e}^c, \mathbf{e}^d) = t^{ab}\delta^c_a\delta^d_b = t^{cd},$$

and the $t^{cd}$ are simply the contravariant components of $\mathbf{t}$. Thus, in an analogous way to the vector in (4.8), we may express the rank-2 tensor $\mathbf{t}$ as a linear combination of basis tensors,

$$\mathbf{t} = t^{ab}\,\mathbf{e}_a \otimes \mathbf{e}_b.$$


By considering different tensor bases, constructed from other combinations of the

basis and dual basis vectors, we can also write t in several different ways:

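$$\mathbf{t} = t_{ab}\,\mathbf{e}^a \otimes \mathbf{e}^b = t^a{}_b\,\mathbf{e}_a \otimes \mathbf{e}^b = t_a{}^b\,\mathbf{e}^a \otimes \mathbf{e}_b.$$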

This idea is extended straightforwardly to higher-rank tensors.

4.9 Tensors and coordinate transformations

The description of tensors as geometrical objects lends itself naturally to a discussion of the behaviour of tensor components under a coordinate transformation $x^a \to x'^a$ on the manifold. As shown in Chapter 3, there is a simple relationship between the coordinate basis vectors $\mathbf{e}_a$ associated with the coordinate system $x^a$ and the coordinate basis vectors $\mathbf{e}'_a$ associated with a new system of coordinates $x'^a$. We found that at any point $P$ the two sets of coordinate basis vectors are related by

$$\mathbf{e}'_a = \frac{\partial x^b}{\partial x'^a}\,\mathbf{e}_b, \tag{4.9}$$

where the partial derivative is evaluated at the point $P$. A similar relationship holds between the two sets of dual basis vectors:

$$\mathbf{e}'^a = \frac{\partial x'^a}{\partial x^b}\,\mathbf{e}^b. \tag{4.10}$$

Using (4.9) and (4.10), we can now calculate how the components of any general

tensor must transform under the coordinate transformation.

As shown in Chapter 3, the contravariant components of a vector t in the new

coordinate basis are given by

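$$t'^a = \mathbf{t}(\mathbf{e}'^a) = \frac{\partial x'^a}{\partial x^b}\,\mathbf{t}(\mathbf{e}^b) = \frac{\partial x'^a}{\partial x^b}\,t^b.$$

Similarly, the covariant components of $\mathbf{t}$ are given by

$$t'_a = \mathbf{t}(\mathbf{e}'_a) = \frac{\partial x^b}{\partial x'^a}\,\mathbf{t}(\mathbf{e}_b) = \frac{\partial x^b}{\partial x'^a}\,t_b.$$

It is important to remember that the unprimed and primed components describe the same vector $\mathbf{t}$ in terms of different basis vectors, i.e. $\mathbf{t} = t^a \mathbf{e}_a = t'^a \mathbf{e}'_a$. The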

vector t is a geometric entity that does not depend on the choice of coordinate

system.


The transformation properties of the components of higher-rank tensors may

be found in a similar way. For example, if t is a second-rank tensor then

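$$\begin{aligned}
t'_{ab} &= \frac{\partial x^c}{\partial x'^a}\frac{\partial x^d}{\partial x'^b}\,t_{cd}, \\
t'^{ab} &= \frac{\partial x'^a}{\partial x^c}\frac{\partial x'^b}{\partial x^d}\,t^{cd}, \\
t'_a{}^b &= \frac{\partial x^c}{\partial x'^a}\frac{\partial x'^b}{\partial x^d}\,t_c{}^d.
\end{aligned} \tag{4.11}$$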

Once again, these components describe the same tensor (which is a geometric

entity) in terms of different bases. For example,

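$$\mathbf{t} = t^{ab}\,\mathbf{e}_a \otimes \mathbf{e}_b = t'^{ab}\,\mathbf{e}'_a \otimes \mathbf{e}'_b.$$

In general, when transforming the components of a tensor of arbitrary rank, each superscript inherits a transformation ‘matrix’ $\partial x'^a/\partial x^c$ and each subscript a transformation matrix $\partial x^c/\partial x'^a$. Thus, for example,

$$t'_{ab}{}^c = \frac{\partial x^d}{\partial x'^a}\frac{\partial x^e}{\partial x'^b}\frac{\partial x'^c}{\partial x^f}\,t_{de}{}^f. \tag{4.12}$$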

Indeed, the basic requirement for a set of quantities to be the components of a

tensor is that they transform in such a way under a change of coordinates. We

shall return to this point later.
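The transformation laws can be checked on a concrete case. The sketch below is an illustration under stated assumptions, not a construction taken from the text: it takes the unprimed coordinates to be Cartesian $(x, y)$ in the plane and the primed coordinates to be plane polars $(r, \theta)$, with an arbitrarily chosen point and vector, and all variable names are hypothetical.

```python
import math

# Assumed example: unprimed coordinates (x, y), primed coordinates (r, theta).
r, theta = 2.0, math.pi / 6      # the point P, chosen arbitrarily
cos_t, sin_t = math.cos(theta), math.sin(theta)

# dx_dxp[c][a] = d x^c / d x'^a, from x = r cos(theta), y = r sin(theta).
dx_dxp = [[cos_t, -r * sin_t],
          [sin_t, r * cos_t]]
# dxp_dx[a][c] = d x'^a / d x^c, the inverse of the matrix above.
dxp_dx = [[cos_t, sin_t],
          [-sin_t / r, cos_t / r]]

g_cart = [[1.0, 0.0], [0.0, 1.0]]   # Cartesian metric components g_cd
v_cart = [3.0, -1.0]                # Cartesian vector components v^c (illustrative)

# Each subscript picks up a factor d x^c / d x'^a.
g_polar = [[sum(dx_dxp[c][a] * dx_dxp[d][b] * g_cart[c][d]
                for c in range(2) for d in range(2))
            for b in range(2)] for a in range(2)]

# Each superscript picks up a factor d x'^a / d x^c.
v_polar = [sum(dxp_dx[a][c] * v_cart[c] for c in range(2)) for a in range(2)]

# The scalar g_ab v^a v^b has no free indices and should not change.
norm_cart = sum(g_cart[a][b] * v_cart[a] * v_cart[b]
                for a in range(2) for b in range(2))
norm_polar = sum(g_polar[a][b] * v_polar[a] * v_polar[b]
                 for a in range(2) for b in range(2))
```

Up to rounding, `g_polar` comes out as diag(1, r²), the familiar polar form of the flat metric, and the two norms agree, as expected for a quantity with no free indices.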

4.10 Tensor equations

Given a coordinate system (and hence a coordinate basis and its dual), it is

convenient to work in terms of the components of a tensor t in this system rather

than with the geometrical entity t itself. Therefore, from here onwards we shall

adopt a much-used convention, which is to confuse a tensor with its components.

This allows us to refer simply to the tensor $t_{ab}{}^c$, rather than the tensor with components $t_{ab}{}^c$.

We now come to the reason why tensors are important in mathematical physics.

Let us illustrate this by way of an example. Suppose we find that in one particular

coordinate system two tensors are equal, for example,

$$t_{ab} = s_{ab}. \tag{4.13}$$

Let us multiply both sides by $\partial x^a/\partial x'^c$ and $\partial x^b/\partial x'^d$ and take the implied summations to obtain

$$\frac{\partial x^a}{\partial x'^c}\frac{\partial x^b}{\partial x'^d}\,t_{ab} = \frac{\partial x^a}{\partial x'^c}\frac{\partial x^b}{\partial x'^d}\,s_{ab}.$$

Since $t_{ab}$ and $s_{ab}$ are both covariant components of tensors of rank 2, this equation can be restated as $t'_{cd} = s'_{cd}$. In other words, the equation (4.13) holds in any

other coordinate system. In short, a tensor equation which holds in one coordi-

nate system necessarily holds in all coordinate systems. Put another way, tensor

equations are coordinate independent, which is in fact obvious from the geomet-

ric approach we have adopted since the outset. One particularly useful fact that

emerges clearly from this discussion, and the transformation law (4.12), is that if

all the components of a tensor are zero in one coordinate system then they vanish

in all coordinate systems. This is useful in proving many tensor relations.

4.11 The quotient theorem

Not all objects with indices are the components of a tensor. An important example is provided by the connection coefficients $\Gamma^a{}_{bc}$, which vanish in a locally Cartesian coordinate system but not in other coordinate systems. Moreover, in Chapter 3 we derived the transformation properties of $\Gamma^a{}_{bc}$ and found that these were not of the form (4.12).

As mentioned above, the fundamental requirement that a set of quantities form

the components of a tensor is that they obey a transformation law of the kind

(4.12) under a change of coordinates. The quotient theorem provides a means of

establishing this requirement in a particular case without having to demonstrate

explicitly that the transformation law holds. It states that if a set of quantities

when contracted with a tensor produces another tensor then the original set of

quantities is also a tensor. Rather than give a general statement of the theorem

and its proof, which tend to become obscured by a mass of indices, we shall give

an example that illustrates the gist of the theorem.

In an $N$-dimensional manifold, suppose that with each system of coordinates about a point $P$ there are associated $N^3$ numbers $t^a{}_{bc}$ and it is known that, for arbitrary contravariant vector components $v^a$, the $N^2$ numbers $t^a{}_{bc} v^c$ transform as the components of a rank-2 tensor at $P$ under a change of coordinates. This means that

$$t'^a{}_{bc}\,v'^c = \frac{\partial x'^a}{\partial x^d}\frac{\partial x^e}{\partial x'^b}\,t^d{}_{ef}\,v^f, \tag{4.14}$$

where the $t'^a{}_{bc}$ are the corresponding $N^3$ numbers associated with the primed coordinate system. Then we may deduce that the $t^a{}_{bc}$ are the components of a rank-3 tensor, as follows. Since $v^f = (\partial x^f/\partial x'^c)\,v'^c$, equation (4.14) yields

$$t'^a{}_{bc}\,v'^c = \frac{\partial x'^a}{\partial x^d}\frac{\partial x^e}{\partial x'^b}\,t^d{}_{ef}\,\frac{\partial x^f}{\partial x'^c}\,v'^c,$$

which, on rearrangement, gives

$$\left(t'^a{}_{bc} - \frac{\partial x'^a}{\partial x^d}\frac{\partial x^e}{\partial x'^b}\frac{\partial x^f}{\partial x'^c}\,t^d{}_{ef}\right) v'^c = 0.$$

This holds for arbitrary vector components $v'^c$, so the expression in parentheses must vanish identically. Thus

$$t'^a{}_{bc} = \frac{\partial x'^a}{\partial x^d}\frac{\partial x^e}{\partial x'^b}\frac{\partial x^f}{\partial x'^c}\,t^d{}_{ef},$$

and therefore the $t^a{}_{bc}$ must be the components of a third-rank tensor.

Thus the gist of the quotient theorem is that if a set of numbers displays tensor characteristics when some of their indices are ‘killed off’ by summation with the components of an arbitrary tensor then the original numbers are the components of a tensor.

4.12 Covariant derivative of a tensor

It is straightforward to show that in an arbitrary coordinate system (unlike in local

Cartesian coordinates) the differentiation of the components of a general tensor,

other than a scalar, with respect to the coordinates does not in general result in

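the components of another tensor. For example, consider the derivative of the contravariant components $v^a$ of a vector. Under a change of coordinates we have

$$\begin{aligned}
\frac{\partial v'^a}{\partial x'^b} &= \frac{\partial x^c}{\partial x'^b}\frac{\partial v'^a}{\partial x^c} \\
&= \frac{\partial x^c}{\partial x'^b}\frac{\partial}{\partial x^c}\left(\frac{\partial x'^a}{\partial x^d}\,v^d\right) \\
&= \frac{\partial x^c}{\partial x'^b}\frac{\partial x'^a}{\partial x^d}\frac{\partial v^d}{\partial x^c} + \frac{\partial x^c}{\partial x'^b}\frac{\partial^2 x'^a}{\partial x^c\,\partial x^d}\,v^d.
\end{aligned} \tag{4.15}$$

The presence of the second term on the right-hand side of (4.15) shows that the derivatives $\partial v^a/\partial x^b$ do not form the components of a second-order tensor. This term arises because the ‘transformation matrix’ $\partial x'^a/\partial x^b$ changes with position in the manifold (this is not true in local Cartesian coordinates, for which the second term vanishes).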

To avoid this difficulty, in Chapter 3 we introduced the covariant derivative of a vector,

$$\nabla_b v^a = \partial_b v^a + \Gamma^a{}_{cb}\,v^c,$$

in terms of which we may write $\partial_b \mathbf{v} = (\nabla_b v^a)\,\mathbf{e}_a$. Using the transformation properties of the connection, derived in Chapter 3, it is straightforward to show that the $\nabla_b v^a$ are the (mixed) components of a rank-2 tensor, which is in fact clear from their definition. We denote this rank-2 tensor by $\nabla\mathbf{v}$, which is formally the outer product of the vector differential operator $\nabla$ with the vector $\mathbf{v}$, although it is usual to omit the symbol $\otimes$ in outer products containing $\nabla$. In a given basis we have $\nabla = \mathbf{e}^a \partial_a$, so we may write, for example,

$$\nabla\mathbf{v} = (\mathbf{e}^a \partial_a) \otimes (v^b \mathbf{e}_b) = \mathbf{e}^a \otimes \partial_a(v^b \mathbf{e}_b) = (\nabla_a v^b)\,\mathbf{e}^a \otimes \mathbf{e}_b.$$

Similarly, the $\nabla_b v_a$ form the covariant components of this tensor, i.e. $\nabla\mathbf{v} = (\nabla_a v_b)\,\mathbf{e}^a \otimes \mathbf{e}^b$. Indeed, it is easy to check that $\nabla_b v^a$ and $\nabla_b v_a$ satisfy the required transformation laws for being the components of a tensor.

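We can extend the idea of the covariant derivative to higher-rank tensors. For example, let us consider an arbitrary rank-2 tensor $\mathbf{t}$ and derive the form of the covariant derivative $\nabla_c t^{ab}$ of its contravariant components. Expressing $\mathbf{t}$ in terms of its contravariant components, we have

$$\partial_c \mathbf{t} = \partial_c(t^{ab}\,\mathbf{e}_a \otimes \mathbf{e}_b) = (\partial_c t^{ab})\,\mathbf{e}_a \otimes \mathbf{e}_b + t^{ab}\,(\partial_c \mathbf{e}_a) \otimes \mathbf{e}_b + t^{ab}\,\mathbf{e}_a \otimes (\partial_c \mathbf{e}_b).$$

We can rewrite the derivatives of the basis vectors in terms of connection coefficients to obtain

$$\partial_c \mathbf{t} = (\partial_c t^{ab})\,\mathbf{e}_a \otimes \mathbf{e}_b + t^{ab}\,\Gamma^d{}_{ac}\,\mathbf{e}_d \otimes \mathbf{e}_b + t^{ab}\,\Gamma^d{}_{bc}\,\mathbf{e}_a \otimes \mathbf{e}_d.$$

Interchanging the dummy indices $a$ and $d$ in the second term on the right-hand side and $b$ and $d$ in the third term, this becomes

$$\partial_c \mathbf{t} = \left(\partial_c t^{ab} + \Gamma^a{}_{dc}\,t^{db} + \Gamma^b{}_{dc}\,t^{ad}\right)\mathbf{e}_a \otimes \mathbf{e}_b,$$

where the expression in parentheses is the required covariant derivative,

$$\nabla_c t^{ab} = \partial_c t^{ab} + \Gamma^a{}_{dc}\,t^{db} + \Gamma^b{}_{dc}\,t^{ad}. \tag{4.16}$$

Using (4.16), the derivative of the tensor $\mathbf{t}$ with respect to $x^c$ can be written in terms of its contravariant components as

$$\partial_c \mathbf{t} = (\nabla_c t^{ab})\,\mathbf{e}_a \otimes \mathbf{e}_b.$$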


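Similar results may be obtained for the covariant derivatives of the mixed and covariant components of the second-order tensor $\mathbf{t}$. Collecting these results together, we have

$$\begin{aligned}
\nabla_c t^{ab} &= \partial_c t^{ab} + \Gamma^a{}_{dc}\,t^{db} + \Gamma^b{}_{dc}\,t^{ad}, \\
\nabla_c t^a{}_b &= \partial_c t^a{}_b + \Gamma^a{}_{dc}\,t^d{}_b - \Gamma^d{}_{bc}\,t^a{}_d, \\
\nabla_c t_{ab} &= \partial_c t_{ab} - \Gamma^d{}_{ac}\,t_{db} - \Gamma^d{}_{bc}\,t_{ad}.
\end{aligned} \tag{4.17}$$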

The positions of the indices in these expressions are once again very systematic.

The last index on each connection coefficient matches that on the covariant

derivative, and the remaining indices can only be logically arranged in one way.

For each contravariant index (superscript) on the left-hand side we add a term on

the right-hand side containing a Christoffel symbol with a plus sign, and for every

covariant index (subscript) we add a corresponding term with a minus sign. This

is extended straightforwardly to tensors with an arbitrary number of contravariant

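and covariant indices. We note that the quantities $\nabla_c t^{ab}$, $\nabla_c t^a{}_b$ and $\nabla_c t_{ab}$ are the components of the same third-order tensor $\nabla\mathbf{t}$ with respect to different tensor bases, i.e.

$$\nabla\mathbf{t} = (\nabla_c t^{ab})\,\mathbf{e}^c \otimes \mathbf{e}_a \otimes \mathbf{e}_b = (\nabla_c t^a{}_b)\,\mathbf{e}^c \otimes \mathbf{e}_a \otimes \mathbf{e}^b = (\nabla_c t_{ab})\,\mathbf{e}^c \otimes \mathbf{e}^a \otimes \mathbf{e}^b.$$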

One particularly important result is that the covariant derivative of the metric tensor $\mathbf{g}$ is identically zero at all points in a manifold, i.e.

$$\nabla\mathbf{g} = 0.$$

Alternatively, we can write this in terms of the components in any basis as

$$\nabla_c g_{ab} = 0 \quad\text{and}\quad \nabla_c g^{ab} = 0. \tag{4.18}$$

This result follows immediately from comparing, for example, the third result in (4.17) with our expression (3.20), derived in Chapter 3, for the partial derivative of the metric in terms of the affine connection. We note, in particular, that the expression (3.20) holds even in a manifold with non-zero torsion, and therefore so too must the result (4.18).¹

¹ In fact, for a general manifold with non-zero torsion, it is not necessary that (4.18) holds since one can, in principle, define the affine connection and the metric independently. In arriving at our earlier expression (3.20), we had in fact already assumed implicitly that the affine connection was metric-compatible, in which case (4.18) holds automatically. This topic is, however, beyond the scope of our discussion.

The result (4.18) has an important consequence, which considerably simplifies tensor manipulations. This is that we can interchange the order of raising or lowering an index and performing covariant differentiation without affecting the result. For example, consider the contravariant components $t^{ab}$ of some rank-2 tensor. Using (4.18), we can write, for example,

$$\nabla_c t^{ab} = \nabla_c(g^{bd}\,t^a{}_d) = (\nabla_c g^{bd})\,t^a{}_d + g^{bd}\,\nabla_c t^a{}_d = g^{bd}\,\nabla_c t^a{}_d.$$

We also note that the covariant derivative obeys the standard rule for the differ-

entiation of a product.
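The vanishing of the covariant derivative of the metric can be checked explicitly in a simple case. The sketch below makes two assumptions that go beyond this section: it uses plane polar coordinates $(r, \theta)$ with $g_{ab} = \mathrm{diag}(1, r^2)$, and it computes the connection from the standard torsion-free, metric-compatible formula $\Gamma^a{}_{bc} = \tfrac{1}{2} g^{ad}(\partial_b g_{dc} + \partial_c g_{bd} - \partial_d g_{bc})$. The function and variable names are hypothetical.

```python
# Check nabla_c g_ab = 0 in plane polar coordinates (r, theta) at one point.
# Assumption: torsion-free metric connection, g_ab = diag(1, r^2).
N = 2
r = 1.5   # sample point, chosen arbitrarily

g = [[1.0, 0.0], [0.0, r * r]]
g_inv = [[1.0, 0.0], [0.0, 1.0 / (r * r)]]

# dg[c][a][b] = d g_ab / d x^c; the only non-zero entry is d g_(theta theta)/dr = 2r.
dg = [[[0.0] * N for _ in range(N)] for _ in range(N)]
dg[0][1][1] = 2.0 * r


def christoffel(g_inv, dg):
    """Return Gamma[a][b][c] = Gamma^a_bc = 1/2 g^ad (d_b g_dc + d_c g_bd - d_d g_bc)."""
    return [[[0.5 * sum(g_inv[a][d] * (dg[b][d][c] + dg[c][b][d] - dg[d][b][c])
                        for d in range(N))
              for c in range(N)] for b in range(N)] for a in range(N)]


def covariant_derivative_of_metric(g, dg, gamma):
    """Return nabla_g[c][a][b] = d_c g_ab - Gamma^d_ac g_db - Gamma^d_bc g_ad."""
    return [[[dg[c][a][b]
              - sum(gamma[d][a][c] * g[d][b] for d in range(N))
              - sum(gamma[d][b][c] * g[a][d] for d in range(N))
              for b in range(N)] for a in range(N)] for c in range(N)]


gamma = christoffel(g_inv, dg)
nabla_g = covariant_derivative_of_metric(g, dg, gamma)
largest = max(abs(nabla_g[c][a][b])
              for c in range(N) for a in range(N) for b in range(N))
```

The only non-zero connection coefficients in this example are $\Gamma^r{}_{\theta\theta} = -r$ and $\Gamma^\theta{}_{r\theta} = \Gamma^\theta{}_{\theta r} = 1/r$, and `largest` is zero up to rounding, in line with the third expression in (4.17) applied to the metric.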

4.13 Intrinsic derivative of a tensor along a curve

In Chapter 3 we encountered vector fields that are defined only on some subspace of the manifold, an extreme example being when the vector field $\mathbf{v}(u)$ is defined only along some curve $x^a(u)$ in the manifold (as for the spin $\mathbf{s}$ of a single particle along its worldline in spacetime). In a similar way a tensor field $\mathbf{t}(u)$ could be defined only along some curve $\mathcal{C}$. We now consider how to calculate the derivative of such a tensor with respect to the parameter $u$ along the curve.

Let us begin by expressing the tensor at any point along the curve in terms of its contravariant components (say),

$$\mathbf{t}(u) = t^{ab}(u)\,\mathbf{e}_a(u) \otimes \mathbf{e}_b(u),$$

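where the $\mathbf{e}_a(u)$ are the coordinate basis vectors at the point on the curve corresponding to the parameter value $u$. Thus, the derivative of $\mathbf{t}$ along the curve is given by

$$\frac{d\mathbf{t}}{du} = \frac{dt^{ab}}{du}\,\mathbf{e}_a \otimes \mathbf{e}_b + t^{ab}\,\frac{d\mathbf{e}_a}{du} \otimes \mathbf{e}_b + t^{ab}\,\mathbf{e}_a \otimes \frac{d\mathbf{e}_b}{du}.$$

Using the chain rule to rewrite the derivatives of the basis vectors, we obtain

$$\frac{d\mathbf{t}}{du} = \frac{dt^{ab}}{du}\,\mathbf{e}_a \otimes \mathbf{e}_b + t^{ab}\,\frac{dx^c}{du}\frac{\partial \mathbf{e}_a}{\partial x^c} \otimes \mathbf{e}_b + t^{ab}\,\mathbf{e}_a \otimes \frac{dx^c}{du}\frac{\partial \mathbf{e}_b}{\partial x^c}.$$

Finally, by writing the partial derivatives of the basis vectors in terms of the connection and relabelling indices, we find that

$$\frac{d\mathbf{t}}{du} = \left(\frac{dt^{ab}}{du} + \Gamma^a{}_{dc}\,t^{db}\,\frac{dx^c}{du} + \Gamma^b{}_{dc}\,t^{ad}\,\frac{dx^c}{du}\right)\mathbf{e}_a \otimes \mathbf{e}_b. \tag{4.19}$$

The term in brackets is called the intrinsic (or absolute) derivative of the components $t^{ab}$ along the curve and is denoted

$$\frac{Dt^{ab}}{Du} = \frac{dt^{ab}}{du} + \Gamma^a{}_{dc}\,t^{db}\,\frac{dx^c}{du} + \Gamma^b{}_{dc}\,t^{ad}\,\frac{dx^c}{du}.$$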


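Similar results may be obtained for the covariant and mixed components of the tensor $\mathbf{t}$. For example, the derivative of $\mathbf{t}$ along the curve may be written

$$\frac{d\mathbf{t}}{du} = \frac{Dt^{ab}}{Du}\,\mathbf{e}_a \otimes \mathbf{e}_b = \frac{Dt^a{}_b}{Du}\,\mathbf{e}_a \otimes \mathbf{e}^b = \frac{Dt_{ab}}{Du}\,\mathbf{e}^a \otimes \mathbf{e}^b.$$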

Clearly, the method can be extended easily to higher-rank tensors.

In a similar way to vectors, a tensor $\mathbf{t}$ is said to be parallel-transported along a curve if $d\mathbf{t}/du = 0$ or, equivalently, in terms of its components, if for example $Dt^{ab}/Du = 0$.

Following our discussion of the intrinsic derivative of a vector in Chapter 3, a convenient way to remember the form of the intrinsic derivative is to pretend that the tensor $\mathbf{t}$ is in fact defined throughout (some region of) the manifold, i.e. not only along the curve $\mathcal{C}$. If this were the case then we could differentiate $\mathbf{t}$ with respect to the coordinates $x^a$. Thus we could write

$$\frac{dt^{ab}}{du} = \frac{\partial t^{ab}}{\partial x^c}\frac{dx^c}{du}.$$

Substituting this into (4.19), we could then factor out $dx^c/du$ and recognise the other factor as the covariant derivative $\nabla_c t^{ab}$. Thus we could write

$$\frac{Dt^{ab}}{Du} = (\nabla_c t^{ab})\,\frac{dx^c}{du}, \tag{4.20}$$

with similar expressions for the intrinsic derivatives of its other components. It

must be remembered, however, that if t is only defined along the curve then

formally (4.20) is not defined and acts merely as an aide-memoire.
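The intrinsic derivative and the parallel-transport condition can be illustrated with a case where the answer is known in advance. The sketch below is an assumed example, not taken from the text: the curve is the circle $r = R$, $\theta = u$ in the flat plane described by polar coordinates, and the tensor is $t^{ab} = v^a v^b$, where $v^a$ are the polar components of the constant Cartesian unit vector along the $x$-axis. Since this tensor is constant in Cartesian terms, $Dt^{ab}/Du$ should vanish. All names are hypothetical.

```python
import math

# Assumed example: the circle r = R, theta = u in the flat plane, polar coordinates.
N = 2
R = 2.0


def christoffel(r):
    """Gamma[a][b][c] = Gamma^a_bc for plane polar coordinates (r, theta)."""
    gamma = [[[0.0] * N for _ in range(N)] for _ in range(N)]
    gamma[0][1][1] = -r
    gamma[1][0][1] = gamma[1][1][0] = 1.0 / r
    return gamma


def v(u):
    """Polar components (v^r, v^theta) of the constant Cartesian unit vector along x."""
    return [math.cos(u), -math.sin(u) / R]


def dv_du(u):
    """Ordinary derivative of the components above with respect to u."""
    return [-math.sin(u), -math.cos(u) / R]


def intrinsic_derivative(u):
    """Dt^ab/Du = dt^ab/du + Gamma^a_dc t^db dx^c/du + Gamma^b_dc t^ad dx^c/du."""
    gamma = christoffel(R)
    dx_du = [0.0, 1.0]                      # (dr/du, dtheta/du) on the circle
    va, dva = v(u), dv_du(u)
    t = [[va[a] * va[b] for b in range(N)] for a in range(N)]
    dt = [[dva[a] * va[b] + va[a] * dva[b] for b in range(N)] for a in range(N)]
    return [[dt[a][b]
             + sum(gamma[a][d][c] * t[d][b] * dx_du[c]
                   for d in range(N) for c in range(N))
             + sum(gamma[b][d][c] * t[a][d] * dx_du[c]
                   for d in range(N) for c in range(N))
             for b in range(N)] for a in range(N)]


Dt = intrinsic_derivative(0.7)
largest = max(abs(Dt[a][b]) for a in range(N) for b in range(N))
```

Although the ordinary derivatives $dt^{ab}/du$ are non-zero here, the connection terms cancel them, so `largest` is zero up to rounding and the tensor is parallel-transported along the circle.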

Exercises

4.1 If $\mathbf{t}$ is a rank-2 tensor, show that

$$\mathbf{t}(\mathbf{u} + \mathbf{v}, \mathbf{w} + \mathbf{z}) = t_{ab}(u^a + v^a)(w^b + z^b).$$

4.2 If $s^{ab} = s^{ba}$ and $t_{ab} = -t_{ba}$ are the components of a symmetric and an antisymmetric tensor respectively, show that $s^{ab} t_{ab} = 0$.

4.3 If $t_{ab}$ are the components of an antisymmetric tensor and $v_a$ the components of a vector, show that

$$v_{[a} t_{bc]} = \tfrac{1}{3}\left(v_a t_{bc} + v_c t_{ab} + v_b t_{ca}\right).$$

4.4 If $t_{ab}$ are the components of a symmetric tensor and $v_a$ the components of a vector, show that if

$$v_a t_{bc} + v_c t_{ab} + v_b t_{ca} = 0$$

then either $t_{ab} = 0$ or $v_a = 0$.

4.5 If the tensor $t_{abcd}$ satisfies $t_{abcd}\,v^a w^b v^c w^d = 0$ for arbitrary vectors $v^a$ and $w^a$, show that

$$t_{abcd} + t_{cdab} + t_{cbad} + t_{adcb} = 0.$$

4.6 Consider the infinitesimal coordinate transformation

$$x'^a = x^a + \epsilon\,v^a(x),$$

where $v^a(x)$ is a vector field and $\epsilon$ is a small scalar quantity. Show that, to first order in $\epsilon$,

$$g'_{ab}(x') = g_{ab}(x) - \epsilon\left(g_{ac}\,\partial_b v^c + g_{cb}\,\partial_a v^c\right).$$

4.7 By investigating their transformation properties, show that $\nabla_b v^a$ are the mixed components of a rank-2 tensor.

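4.8 If $v_a$ are the covariant components of a vector and $A_{ab}$ are the components of an antisymmetric rank-2 tensor, show that

$$\nabla_a v_b - \nabla_b v_a = \partial_a v_b - \partial_b v_a,$$

$$\nabla_a A_{bc} + \nabla_c A_{ab} + \nabla_b A_{ca} = \partial_a A_{bc} + \partial_c A_{ab} + \partial_b A_{ca}.$$

Determine the symmetry properties of the rank-3 tensor

$$B_{abc} = \partial_a A_{bc} + \partial_c A_{ab} + \partial_b A_{ca}.$$

4.9 Show that covariant differentiation obeys the usual product rule, e.g.

$$\nabla_a(A_{bc} B^{cd}) = (\nabla_a A_{bc})\,B^{cd} + A_{bc}\,(\nabla_a B^{cd}).$$

Hint: Use local Cartesian coordinates.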

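4.10 For a general rank-2 tensor $T^{ab}$, show that the covariant divergence is given by

$$\nabla_a T^{ab} = \frac{1}{\sqrt{|g|}}\,\partial_a\!\left(\sqrt{|g|}\,T^{ab}\right) + \Gamma^b{}_{ca}\,T^{ac}.$$

Show further that if $A^{ab} = -A^{ba}$ are the components of an antisymmetric rank-2 tensor then

$$\nabla_a A^{ab} = \frac{1}{\sqrt{|g|}}\,\partial_a\!\left(\sqrt{|g|}\,A^{ab}\right).$$

Hence show that if the antisymmetric tensor field $A^{ab}$ vanishes on a hypersurface $S$ that bounds a region $V$ of an $N$-dimensional manifold then

$$\int_V \nabla_a A^{ab}\,\sqrt{-g}\;d^N x = 0.$$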

4.11 Any coordinate transformation $x^a \to x'^a$ under which the metric is form invariant, i.e. such that

$$g'_{ab}(x) = g_{ab}(x),$$

is called an isometry (note that the argument is the same on both sides of the above equation). Show that the infinitesimal coordinate transformation in Exercise 4.6 is an isometry, to first order in $\epsilon$, provided that $v^a$ satisfies

$$g_{ac}\,\partial_b v^c + g_{cb}\,\partial_a v^c + v^c\,\partial_c g_{ab} = 0.$$

Show further that this expression can be written as

$$\nabla_a v_b + \nabla_b v_a = 0.$$

This is Killing's equation and any vector satisfying it is known as a Killing vector of the metric $g_{ab}$. Show that if $v^a$ and $w^a$ are both Killing vectors then so too is any linear combination $\alpha v^a + \beta w^a$, where $\alpha$ and $\beta$ are constants.

5 Special relativity revisited

Now that we have the machinery of tensor calculus in place, let us return to special

relativity and consider how to express this theory in a more formal manner.

5.1 Minkowski spacetime in Cartesian coordinates

In the language of Chapter 2, the Minkowski spacetime of special relativity

is a fixed four-dimensional pseudo-Euclidean manifold. As such, there exists a

privileged class of Cartesian coordinate systems $(t, x, y, z)$ covering the whole manifold, so that at every point (or event) the squared line element takes the form

$$ds^2 = c^2\,d\tau^2 = c^2\,dt^2 - dx^2 - dy^2 - dz^2,$$

where we have taken the opportunity to define the proper time interval $d\tau^2 = ds^2/c^2$. It is convenient to introduce the indexed coordinates $x^\mu$ ($\mu = 0, 1, 2, 3$),¹ so that

$$x^0 \equiv ct, \quad x^1 \equiv x, \quad x^2 \equiv y, \quad x^3 \equiv z,$$

and to write the line element as

$$ds^2 = \eta_{\mu\nu}\,dx^\mu dx^\nu,$$

where the $\eta_{\mu\nu}$ are the covariant components of the metric tensor and are given by

$$[\eta_{\mu\nu}] = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix}. \tag{5.1}$$

From now on we will often use the shorthand notation $[\eta_{\mu\nu}] = \mathrm{diag}(1, -1, -1, -1)$. It is clear that the contravariant components of the metric are identical, i.e. $[\eta^{\mu\nu}] = \mathrm{diag}(1, -1, -1, -1)$. With this definition of the metric, Minkowski spacetime has a signature of $-2$.² We also note that, since the metric coefficients are constant, the connection vanishes everywhere in this coordinate system.

¹ It is conventional to use Greek indices when discussing four-dimensional spacetimes rather than the Latin indices $a, b, c$, etc. from the start of the alphabet, which are used for abstract $N$-dimensional manifolds. Moreover, in relativity theory, it is more common for a Greek index to run from 0 to 3 than from 1 to 4 (although the latter usage is found in some textbooks). Also, it is conventional to use Latin letters from the middle of the alphabet, such as $i, j, k$, etc., for indices that run from 1 to 3.

5.2 Lorentz transformations

Cartesian coordinates, which we are using in the context of special relativity, have a direct physical interpretation and correspond to distances and times measured by an observer at rest in some inertial frame $S$ that is labelled using three-dimensional Cartesian coordinates³ (remember that, in Chapter 1, we defined an inertial frame as one in which a free particle moves in a straight line with fixed speed). Transforming to a different Cartesian inertial frame corresponds to performing a coordinate transformation on the Minkowski spacetime to a new system $x'^\mu$. Since we require that the new coordinate system $x'^\mu$ also corresponds to a Cartesian inertial frame, the (squared) line element $ds^2$ must take the same form in these primed coordinates as it did in the unprimed coordinates, i.e.

$$ds^2 = \eta_{\mu\nu}\,dx^\mu dx^\nu = \eta_{\mu\nu}\,dx'^\mu dx'^\nu.$$

In other words the metric in the new coordinates must also be given by (5.1). From the transformation properties of a second-rank tensor, this means that the transformation $x^\mu \to x'^\mu$ must satisfy

$$\eta_{\mu\nu} = \frac{\partial x'^\rho}{\partial x^\mu}\frac{\partial x'^\sigma}{\partial x^\nu}\,\eta_{\rho\sigma}, \tag{5.2}$$

which is the necessary and sufficient condition that a transformation $x^\mu \to x'^\mu$ is a Lorentz transformation between two Cartesian inertial coordinate systems.

² Note that some relativists use an alternative, but equivalent, definition $[\eta_{\mu\nu}] = \mathrm{diag}(-1, 1, 1, 1)$, in which the signature is $+2$.

³ We shall prove this shortly.

From (5.2), we see that the elements of the transformation matrix must be constants. Thus the transformation between two inertial coordinate systems must be linear, i.e.

$$x'^\mu = \Lambda^\mu{}_\nu\,x^\nu + a^\mu, \tag{5.3}$$

where the $\Lambda^\mu{}_\nu$ and $a^\mu$ are constants. This has the form of a general inhomogeneous Lorentz transformation (or Poincaré transformation). We will generally take the (unimportant) constants $a^\mu$ to be zero, in which case (5.3) reduces to a normal, homogeneous, Lorentz transformation. As discussed in Chapter 1, the constants in the transformation matrix depend upon the relative speed and orientation of the two inertial frames. If the unprimed and primed coordinates correspond to inertial frames $S$ and $S'$ in standard configuration, with $S'$ moving at a speed $v$ relative to $S$, then the transformation matrix can be written in two equivalent forms:

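$$[\Lambda^\mu{}_\nu] = \left[\frac{\partial x'^\mu}{\partial x^\nu}\right] = \begin{pmatrix} \gamma & -\gamma\beta & 0 & 0 \\ -\gamma\beta & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} = \begin{pmatrix} \cosh\psi & -\sinh\psi & 0 & 0 \\ -\sinh\psi & \cosh\psi & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}, \tag{5.4}$$

where $\beta = v/c$, $\gamma = (1 - \beta^2)^{-1/2}$ and the rapidity $\psi$ is defined by $\psi = \tanh^{-1}\beta$. Clearly, if the axes of $S$ and $S'$ are rotated with respect to one another then the transformation is more complicated.

The transformation inverse to (5.4) is clearly obtained by putting $v \to -v$ (or equivalently $\psi \to -\psi$). In general, the inverse transformation matrix is denoted by

$$\Lambda_\mu{}^\nu = \frac{\partial x^\nu}{\partial x'^\mu}$$

and may be calculated from the forward transform using the index-raising and index-lowering properties of the metric, i.e.

$$\Lambda_\mu{}^\nu = \eta_{\mu\rho}\,\eta^{\nu\sigma}\,\Lambda^\rho{}_\sigma.$$

That this is indeed the required inverse may be shown using the condition (5.2), which gives

$$\Lambda_\mu{}^\nu \Lambda^\mu{}_\rho = \eta_{\mu\sigma}\,\eta^{\nu\tau}\,\Lambda^\sigma{}_\tau \Lambda^\mu{}_\rho = \eta^{\nu\tau}\eta_{\tau\rho} = \delta^\nu_\rho.$$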
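The properties of the boost matrix can be checked numerically. The following is a minimal sketch for a boost along the $x$-axis with an arbitrarily chosen $\beta = 0.6$; the helper names (`boost`, `matmul`, `transpose`, `max_abs_diff`) are hypothetical, and the matrix forms used in the comments are simply the index expressions above written as matrix products.

```python
import math

# Minkowski metric [eta_mu_nu] = diag(1, -1, -1, -1); its inverse has the same entries.
ETA = [[1.0, 0.0, 0.0, 0.0],
       [0.0, -1.0, 0.0, 0.0],
       [0.0, 0.0, -1.0, 0.0],
       [0.0, 0.0, 0.0, -1.0]]
IDENTITY = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]


def boost(beta):
    """Matrix [Lambda^mu_nu] for standard configuration with speed v = beta * c."""
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    return [[gamma, -gamma * beta, 0.0, 0.0],
            [-gamma * beta, gamma, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]


def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def transpose(A):
    return [[A[j][i] for j in range(4)] for i in range(4)]


def max_abs_diff(A, B):
    return max(abs(A[i][j] - B[i][j]) for i in range(4) for j in range(4))


L = boost(0.6)   # illustrative value; gamma = 1.25

# Condition (5.2) in matrix form: Lambda^T eta Lambda = eta.
metric_check = matmul(transpose(L), matmul(ETA, L))

# Inverse obtained by raising and lowering indices: in matrix form, eta Lambda^T eta.
L_inv = matmul(ETA, matmul(transpose(L), ETA))
product = matmul(L_inv, L)

# Rapidity form of gamma: cosh(psi) with psi = artanh(beta).
gamma_from_rapidity = math.cosh(math.atanh(0.6))
```

Up to rounding, `metric_check` reproduces the metric, `product` is the identity matrix, and `L_inv` coincides with `boost(-0.6)`, in agreement with the statement that the inverse follows from $v \to -v$.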

5.3 Cartesian basis vectors

Figure 5.1 shows the coordinate curves for two systems of coordinates $x^\mu$ and $x'^\mu$, corresponding to Cartesian inertial frames $S$ and $S'$ in standard configuration (with the 2- and 3-directions suppressed). In any coordinate system the coordinate basis vectors are tangents to the coordinate curves; these are shown for $S$ and $S'$ in Figure 5.1 (in this diagram, null vectors would lie at 45 degrees to the vertical axis). In general, the two sets of basis vectors are related by

$$\mathbf{e}'_\mu = \Lambda_\mu{}^\nu\,\mathbf{e}_\nu \quad\text{and}\quad \mathbf{e}_\mu = \Lambda^\nu{}_\mu\,\mathbf{e}'_\nu,$$

which tells us how to draw one set of basis vectors in terms of the other set.

Figure 5.1 The coordinate curves (dotted lines) for two systems of coordinates $x^\mu$ and $x'^\mu$, corresponding to Cartesian inertial frames $S$ and $S'$ in standard configuration. The coordinate basis vectors for each system are also shown. The 2- and 3-directions are suppressed, and null vectors would lie at 45 degrees to the vertical axis.

The two sets of basis vectors satisfy

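$$
\mathbf{e}_{\mu}\cdot\mathbf{e}_{\nu} = \mathbf{e}'_{\mu}\cdot\mathbf{e}'_{\nu} = \eta_{\mu\nu}
$$

and so both sets form an orthonormal basis at each point in the pseudo-Euclidean Minkowski spacetime. As drawn in Figure 5.1 the vectors $\mathbf{e}_{\mu}$ appear mutually perpendicular, but the $\mathbf{e}'_{\mu}$ do not. This is an artefact of representing a pseudo-Euclidean space on a Euclidean piece of paper. As we shall see, the notion of an orthonormal set of basis vectors at any point in the spacetime is of fundamental importance for our description of observers.

We can also define dual basis vectors for each system as

$$
\mathbf{e}^{\mu} = \eta^{\mu\nu}\,\mathbf{e}_{\nu}
\qquad\text{and}\qquad
\mathbf{e}'^{\mu} = \eta^{\mu\nu}\,\mathbf{e}'_{\nu}
$$

These vectors also form orthonormal sets, since

$$
\mathbf{e}^{\mu}\cdot\mathbf{e}^{\nu} = \mathbf{e}'^{\mu}\cdot\mathbf{e}'^{\nu} = \eta^{\mu\nu}
$$

and the components $\eta^{\mu\nu}$ are identical to the components $\eta_{\mu\nu}$.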
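The remark that the primed basis vectors are orthonormal, even though they do not look perpendicular on paper, can be illustrated with a short Python sketch. It writes the S′ basis vectors in terms of their components in S for a standard-configuration boost and evaluates both the Minkowski and the ordinary Euclidean inner products. The signature $(+,-,-,-)$, the value $\beta = 0.6$ and the function names are assumptions made for this example.

```python
import math

ETA_DIAG = (1.0, -1.0, -1.0, -1.0)  # Minkowski metric, signature (+, -, -, -)

def minkowski_dot(a, b):
    return sum(g * x * y for g, x, y in zip(ETA_DIAG, a, b))

def euclidean_dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def primed_basis_in_S(beta):
    """Components in S of the basis vectors of S' (columns of the inverse boost)."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    return [
        (gamma, gamma * beta, 0.0, 0.0),  # e'_0
        (gamma * beta, gamma, 0.0, 0.0),  # e'_1
        (0.0, 0.0, 1.0, 0.0),             # e'_2
        (0.0, 0.0, 0.0, 1.0),             # e'_3
    ]

basis = primed_basis_in_S(0.6)

# Gram matrix of Minkowski inner products: should reproduce diag(1, -1, -1, -1).
gram = [[minkowski_dot(a, b) for b in basis] for a in basis]

# Euclidean inner product of e'_0 and e'_1: non-zero, so they look non-perpendicular.
paper_dot = euclidean_dot(basis[0], basis[1])
print(gram, paper_dot)
```

The Gram matrix is the Minkowski metric, while the Euclidean product of $\mathbf{e}'_0$ and $\mathbf{e}'_1$ equals $2\gamma^2\beta \neq 0$, which is the "artefact" visible in the diagram.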


5.4 Four-vectors and the lightcone

As in any manifold, we can define vectors at any point P in Minkowski spacetime

(and thus vector fields).4 In relativity, vectors defined on a four-dimensional

spacetime manifold are called 4-vectors. These 4-vectors are geometrical entities

in spacetime, which can be defined without any reference to a basis (or coordinate

system). Nevertheless, in a particular coordinate system, we can write a general

4-vector v at P in terms of the coordinate basis vectors at P:

$$
\mathbf{v} = v^{\mu}\,\mathbf{e}_{\mu}
$$

Let us assume for the moment that we are using Cartesian coordinates $x^\mu$ corresponding to some inertial frame S. At each point P in spacetime we have a constant set of orthonormal basis vectors $\mathbf{e}_\mu$. The square of the length of a vector $\mathbf{v}$ at a point P (which is a coordinate-independent quantity) is then given by

$$
\mathbf{v}\cdot\mathbf{v} = v_{\mu}v^{\mu} = \eta_{\mu\nu}\,v^{\mu}v^{\nu}
$$

We have that

for $v_\mu v^\mu > 0$ the vector is timelike (5.5)

for $v_\mu v^\mu = 0$ the vector is null (5.6)

for $v_\mu v^\mu < 0$ the vector is spacelike (5.7)

Figure 5.2 The lightcone at some point P in Minkowski spacetime (with one spatial dimension suppressed). The figure shows the basis vectors $\mathbf{e}_0$, $\mathbf{e}_1$ and $\mathbf{e}_2$ together with a future-pointing timelike vector, a future-pointing null vector, a spacelike vector and a past-pointing timelike vector.

4 In fact, since Minkowski spacetime is pseudo-Euclidean, the tangent space $T_P$ at any point P coincides with the manifold itself. Thus, in this special case, we are not restricted to local vectors and can reinstate the notions of position vector and of the displacement vector between arbitrary points in the manifold.


Thus, as we would expect, the coordinate basis vector $\mathbf{e}_0$, which has components (1, 0, 0, 0), is timelike. Similarly, the basis vectors $\mathbf{e}_i$ ($i = 1, 2, 3$) are spacelike. Moreover, for any timelike or null vector $\mathbf{v}$, if $\mathbf{v}\cdot\mathbf{e}_0 > 0$ then $\mathbf{v}$ is called future-pointing whereas if $\mathbf{v}\cdot\mathbf{e}_0 < 0$ then $\mathbf{v}$ is past-pointing.

At any point P in the Minkowski spacetime, the set of all null vectors at P

forms the lightcone or null-cone. The structure of the lightcone is illustrated in

Figure 5.2, with one spatial dimension suppressed.
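The classification (5.5)–(5.7), together with the future- and past-pointing criterion, can be summarised in a few lines of Python. This is only a sketch: the function name `classify`, the tolerance `tol` used to decide when a floating-point number counts as zero, and the sample vectors are all our own assumptions, and the signature is taken to be $(+,-,-,-)$.

```python
ETA_DIAG = (1.0, -1.0, -1.0, -1.0)  # Minkowski metric, signature (+, -, -, -)
E0 = (1.0, 0.0, 0.0, 0.0)           # components of the basis vector e_0

def minkowski_dot(a, b):
    return sum(g * x * y for g, x, y in zip(ETA_DIAG, a, b))

def classify(v, tol=1e-12):
    """Classify a 4-vector from its components in a Cartesian inertial frame."""
    length_sq = minkowski_dot(v, v)
    if length_sq > tol:
        kind = "timelike"
    elif length_sq < -tol:
        kind = "spacelike"
    else:
        kind = "null"
    if kind == "spacelike":
        return kind                  # no time orientation is assigned
    time_part = minkowski_dot(v, E0)
    if time_part > 0:
        return "future-pointing " + kind
    if time_part < 0:
        return "past-pointing " + kind
    return kind                      # only reached for the zero vector

print(classify((1.0, 0.0, 0.0, 0.0)))
print(classify((1.0, 1.0, 0.0, 0.0)))
print(classify((0.0, 1.0, 0.0, 0.0)))
print(classify((-2.0, 1.0, 0.0, 0.0)))
```

The four sample vectors fall into the four categories labelled in Figure 5.2.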

5.5 Four-vectors and Lorentz transformations

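Suppose that the Cartesian coordinates $x^\mu$ and $x'^\mu$ correspond to inertial frames S and S′. Thus, at each point P in the Minkowski spacetime we have two sets of (constant) basis vectors $\mathbf{e}_\mu$ and $\mathbf{e}'_\mu$, and a general 4-vector $\mathbf{v}$ defined at P can be expressed in terms of either set:

$$
\mathbf{v} = v^{\mu}\,\mathbf{e}_{\mu} = v'^{\mu}\,\mathbf{e}'_{\mu}
$$

Thus, the components in the two bases are related by

$$
\begin{aligned}
v'^{\mu} &= \mathbf{v}\cdot\mathbf{e}'^{\mu} = \Lambda^{\mu'}{}_{\nu}\,v^{\nu} \\
v^{\mu} &= \mathbf{v}\cdot\mathbf{e}^{\mu} = \Lambda^{\mu}{}_{\nu'}\,v'^{\nu}
\end{aligned}
\tag{5.8}
$$

where $\Lambda^{\mu'}{}_{\nu}$ is the Lorentz transformation linking the coordinates $x^\mu$ and $x'^\mu$. Let us now consider some examples of physical 4-vectors and investigate the physical consequences of these transformations.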
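To make (5.8) concrete, the sketch below transforms the components of an arbitrary 4-vector with a standard-configuration boost, transforms them back with the inverse, and confirms that the squared length is the same in both frames. The sample components, the value $\beta = 0.6$ and the helper names are illustrative assumptions.

```python
import math

def boost(beta):
    """Standard-configuration boost matrix for relative speed beta = v/c."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    return [
        [gamma, -gamma * beta, 0.0, 0.0],
        [-gamma * beta, gamma, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, 0.0, 1.0],
    ]

def apply(lam, v):
    """Matrix-vector product: new components = lam acting on old components."""
    return [sum(lam[i][j] * v[j] for j in range(4)) for i in range(4)]

def minkowski_dot(a, b):
    return a[0] * b[0] - a[1] * b[1] - a[2] * b[2] - a[3] * b[3]

v = [2.0, 1.0, 0.5, -1.0]                 # components in S (arbitrary example)
v_primed = apply(boost(0.6), v)           # components in S'
v_back = apply(boost(-0.6), v_primed)     # inverse transformation, v -> -v

print(v_primed)
print(minkowski_dot(v, v), minkowski_dot(v_primed, v_primed))
```

The components change from frame to frame, but $\mathbf{v}\cdot\mathbf{v}$ does not, as expected for a coordinate-independent quantity.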

5.6 Four-velocity

A particularly important 4-vector is the 4-velocity of a (massive) particle (or

observer). As discussed in Chapter 1, the trajectory of a particle describes a curve

or worldline in spacetime. We could parameterise this curve in any way we

wish, but for massive particles it is usual to parameterise it using the proper time $\tau$ measured by the particle. The 4-velocity $\mathbf{u}$ of the particle at any event is then the tangent vector to the worldline at that event. For a massive particle, $\mathbf{u}$ is a future-pointing timelike vector. The length of this tangent vector (which is defined independently of any coordinate system) is constant along the worldline, since (as shown in Chapter 3)

$$
\mathbf{u}\cdot\mathbf{u} = \left(\frac{ds}{d\tau}\right)^{2} = c^{2}
\tag{5.9}
$$

Since $\tau$ is proportional to the interval $s$ along the worldline, it is an affine parameter (see Chapter 3).

Figure 5.3 The 4-velocity at events along the worldlines of a particle travelling at uniform speed in S (solid line) and a particle accelerating with respect to S (broken line). The axes are $x^0$ and $x^1$.

Suppose that we label spacetime with some Cartesian coordinate system corresponding to an inertial frame S. We can then write the worldline of a particle in this coordinate system as $x^\mu = x^\mu(\tau)$. Figure 5.3 shows the 4-velocity at two events on the worldline of a particle moving at uniform velocity in the frame S.

In this case the direction of the 4-velocity is also constant along the worldline.

The figure also shows the 4-velocity at two events on the worldline of a particle

that is accelerating (back and forth) with respect to the frame S. Clearly, in this

case, the direction of the 4-velocity changes along the worldline.

The (contravariant) components of the 4-velocity in the frame S are given by

$$
u^{\mu} = \mathbf{u}\cdot\mathbf{e}^{\mu} = \frac{dx^{\mu}}{d\tau}
\tag{5.10}
$$

Setting $x^0 = ct$ for the moment, and noting that $d\tau = dt/\gamma_u$, we can write these components as

$$
[u^{\mu}] = \gamma_u\left(c,\ \frac{dx^{1}}{dt},\ \frac{dx^{2}}{dt},\ \frac{dx^{3}}{dt}\right) = \gamma_u\,(c,\ \vec{u})
\tag{5.11}
$$

where in the last line (with a slight abuse of notation) we have introduced the relative 3-vector $\vec{u} = (u^1, u^2, u^3)$, which is the familiar (three-dimensional) velocity vector of the particle as measured by an observer at rest in S.
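A small numerical check of (5.9) and (5.11): given a 3-velocity, the sketch below assembles the components $\gamma_u(c, \vec{u})$ and verifies that the squared length equals $c^2$. The function names and the sample velocity are assumptions for illustration; $c$ is given its SI value.

```python
import math

C = 299_792_458.0  # speed of light in m/s

def lorentz_factor(u3, c=C):
    """gamma_u = (1 - |u|^2 / c^2)^(-1/2) for a 3-velocity u3."""
    speed_sq = sum(x * x for x in u3)
    if speed_sq >= c * c:
        raise ValueError("a massive particle must move slower than c")
    return 1.0 / math.sqrt(1.0 - speed_sq / (c * c))

def four_velocity(u3, c=C):
    """Components gamma_u * (c, u^1, u^2, u^3) of the 4-velocity."""
    g = lorentz_factor(u3, c)
    return [g * c] + [g * x for x in u3]

def minkowski_dot(a, b):
    return a[0] * b[0] - a[1] * b[1] - a[2] * b[2] - a[3] * b[3]

u4 = four_velocity((0.3 * C, 0.4 * C, 0.0))   # particle moving at 0.5c
norm_sq = minkowski_dot(u4, u4)
print(norm_sq / C**2)
```

The printed ratio is 1 to rounding error, whatever sub-luminal 3-velocity is supplied.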


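In some other inertial frame S′, the components of the 4-velocity of the particle are

$$
u'^{\mu} = \mathbf{u}\cdot\mathbf{e}'^{\mu} = \Lambda^{\mu'}{}_{\nu}\,u^{\nu}
$$

Writing this out in full for the case where S and S′ are in standard configuration with relative speed $v$, we obtain

$$
\begin{pmatrix}
\gamma_{u'}c \\ \gamma_{u'}u'^{1} \\ \gamma_{u'}u'^{2} \\ \gamma_{u'}u'^{3}
\end{pmatrix}
=
\begin{pmatrix}
\gamma_v & -\beta\gamma_v & 0 & 0 \\
-\beta\gamma_v & \gamma_v & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{pmatrix}
\begin{pmatrix}
\gamma_u c \\ \gamma_u u^{1} \\ \gamma_u u^{2} \\ \gamma_u u^{3}
\end{pmatrix}
$$

This is equivalent to four equations. From the first, we find that

$$
\frac{\gamma_{u'}}{\gamma_u} = \gamma_v\left(1 - \frac{u^{1}v}{c^{2}}\right)
$$

and from the others we obtain the 3-velocity addition law in special relativity,

$$
u'^{1} = \frac{u^{1} - v}{1 - u^{1}v/c^{2}}, \qquad
u'^{2} = \frac{u^{2}}{\gamma_v\left(1 - u^{1}v/c^{2}\right)}, \qquad
u'^{3} = \frac{u^{3}}{\gamma_v\left(1 - u^{1}v/c^{2}\right)}
$$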

Note that this approach has allowed us to derive the 3-velocity addition law in an

almost trivial way.
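The derivation can be mirrored numerically: boost the 4-velocity components, divide the spatial components by the time component to recover the 3-velocity in S′, and compare with the closed-form addition law. The sketch below works in units with $c = 1$ by default; the function names and sample values are hypothetical.

```python
import math

def gamma(speed, c=1.0):
    return 1.0 / math.sqrt(1.0 - (speed / c) ** 2)

def four_velocity(u3, c=1.0):
    g = gamma(math.sqrt(sum(x * x for x in u3)), c)
    return [g * c, g * u3[0], g * u3[1], g * u3[2]]

def boost_x(four_vec, v, c=1.0):
    """Apply the standard-configuration boost with relative speed v."""
    g, beta = gamma(v, c), v / c
    t, x, y, z = four_vec
    return [g * (t - beta * x), g * (x - beta * t), y, z]

def transform_velocity(u3, v, c=1.0):
    """3-velocity in S' obtained by boosting the 4-velocity."""
    t, x, y, z = boost_x(four_velocity(u3, c), v, c)
    return (c * x / t, c * y / t, c * z / t)   # u'^i = c u'^i_(4) / u'^0_(4)

def addition_law(u3, v, c=1.0):
    """Closed-form 3-velocity addition law for comparison."""
    g = gamma(v, c)
    denom = 1.0 - u3[0] * v / c**2
    return ((u3[0] - v) / denom, u3[1] / (g * denom), u3[2] / (g * denom))

u3 = (0.5, 0.3, 0.0)   # velocity in S, in units of c
v = 0.6                # speed of S' relative to S
from_four_vector = transform_velocity(u3, v)
from_formula = addition_law(u3, v)
print(from_four_vector)
print(from_formula)
```

The two results coincide, which is just the statement that the addition law is the component form of the 4-velocity transformation.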

5.7 Four-momentum of a massive particle

The 4-momentum of a massive particle of rest mass m0 is defined in terms of its

four-velocity u by

$$
\mathbf{p} = m_0\,\mathbf{u}
$$

At any point P along the particle's worldline the square of the length of this vector is

$$
\mathbf{p}\cdot\mathbf{p} = m_0\mathbf{u}\cdot m_0\mathbf{u} = m_0^{2}c^{2}
\tag{5.12}
$$

In Cartesian coordinates $x^\mu$ corresponding to some inertial frame S, the components of the 4-momentum are simply $p^\mu = \mathbf{p}\cdot\mathbf{e}^\mu$. According to convention we write

$$
[p^{\mu}] = (E/c,\ p^{1},\ p^{2},\ p^{3}) = (E/c,\ \vec{p})
\tag{5.13}
$$

where $E$ is the energy of the particle as measured in the frame S and $\vec{p}$ is its 3-momentum measured in S. Comparing (5.13) with (5.11) we see that, in special relativity,

$$
E = \gamma_u m_0 c^{2}
\tag{5.14}
$$

$$
\vec{p} = \gamma_u m_0\,\vec{u}
\tag{5.15}
$$

In the frame S, the squared length of the 4-momentum is given by $p_\mu p^\mu$. Thus, from (5.13) and (5.12), we find that

$$
E^{2} - p^{2}c^{2} = m_0^{2}c^{4}
$$

where $p^2 = \vec{p}\cdot\vec{p}$. This is the well-known energy–momentum invariant.
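As a sanity check on (5.14), (5.15) and the invariant, the following sketch computes $E$ and $\vec{p}$ for a given rest mass and 3-velocity and evaluates $E^2 - p^2c^2$. It uses units with $c = 1$ by default, and the values $m_0 = 2$ and $u = 0.6c$ are arbitrary illustrative choices.

```python
import math

def energy_and_momentum(m0, u3, c=1.0):
    """Return (E, p3) with E = gamma_u m0 c^2 and p3 = gamma_u m0 u3."""
    speed_sq = sum(x * x for x in u3)
    g = 1.0 / math.sqrt(1.0 - speed_sq / c**2)
    energy = g * m0 * c**2
    momentum = tuple(g * m0 * x for x in u3)
    return energy, momentum

def invariant(energy, momentum, c=1.0):
    """E^2 - p^2 c^2, which should equal m0^2 c^4 in every inertial frame."""
    return energy**2 - sum(x * x for x in momentum) * c**2

m0 = 2.0
E, p3 = energy_and_momentum(m0, (0.6, 0.0, 0.0))
print(E, p3, invariant(E, p3))
```

Here $\gamma_u = 1.25$, so $E = 2.5$ and $p = 1.5$, and the invariant evaluates to $m_0^2 c^4 = 4$.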

5.8 Four-momentum of a photon

The above discussion concerned particles of non-zero rest mass, which move

at speeds less than c. We now consider particles such as photons and perhaps

neutrinos, which move at the speed of light. The worldline of a massless particle is

a null curve, along which $d\tau = 0$. Thus, we cannot parameterise such a worldline using the proper time $\tau$. Nevertheless, there are many other parameters that we can use. For example, in an inertial frame, a photon travelling in the positive $x$-direction will describe the path $x = ct$. This could be written parametrically as

$$
x^{\mu} = u^{\mu}\sigma
\tag{5.16}
$$

where $\sigma$ is the parameter and $[u^\mu] = (1, 1, 0, 0)$. Using (3.43), the tangent vector to the worldline is then

$$
\mathbf{u} = \frac{dx^{\mu}}{d\sigma}\,\mathbf{e}_{\mu} = u^{\mu}\,\mathbf{e}_{\mu}
$$

Since the worldline is a null curve, we have

$$
\mathbf{u}\cdot\mathbf{u} = 0
\tag{5.17}
$$

in contrast with (5.9). Moreover, with this choice of parameter we see that

$$
\frac{d\mathbf{u}}{d\sigma} = \mathbf{0}
\tag{5.18}
$$

which is the equation of motion for a photon. We note that although this has been derived using the fact that the Cartesian basis vectors $\mathbf{e}_\mu$ do not change with position, it is a vector equation and therefore will hold in any basis (i.e. any coordinate system).


Our choice of parameterisation in (5.16) may appear somewhat arbitrary.

Indeed, it is true that there exists an unlimited number of parameterisations that

could be used. For example, suppose that we replaced $\sigma$ by $\sigma^2$ (say). As the new

parameter varies between $-\infty$ and $\infty$, the same worldline $x = ct$ would be

described in the spacetime. Since this is a null curve, the condition (5.17) would

continue to be true (as may be verified explicitly). In the new parameterisation,

however, the equation of motion (5.18) would not still hold. The special class of

parameters for which the equation of motion has the simple form (5.18) is the

class of affine parameters (as discussed in Section 3.16). Since one is always free

to choose such a parameter, we will assume from here on that equation (5.18) is

satisfied.

So far, we have not mentioned the frequency (or energy) of the photon, which characterises it in much the same way as the rest mass $m_0$ characterises a massive particle. Clearly, the tangent vector $\mathbf{u}$ can be multiplied by any scalar constant and will still satisfy the equations (5.17) and (5.18). The 4-momentum of a photon is therefore defined as

$$
\mathbf{p} = \alpha\,\mathbf{u}
$$

for a constant $\alpha$ chosen such that, in an arbitrary inertial frame S, the components of $\mathbf{p}$ are

$$
[p^{\mu}] = (E/c,\ \vec{p})
$$

where $E$ is the energy of the photon as measured in S and $\vec{p}$ is its 3-momentum in S. From (5.17) we thus have $E = pc$.

For photons, it is also common to introduce the 4-wavevector $\mathbf{k}$, which is related to the four-momentum by $\mathbf{p} = \hbar\mathbf{k}$. Thus, in the frame S, the 4-wavevector has components given by

$$
[k^{\mu}] = (2\pi/\lambda,\ \vec{k})
$$

where $\lambda$ is the wavelength of the photon as measured in S, $\vec{k} = (2\pi/\lambda)\,\vec{n}$, and $\vec{n}$ is a unit 3-vector in the direction of propagation.

5.9 The Doppler effect and relativistic aberration

An example of the usefulness of the 4-vector approach (and particularly the photon

4-wavevector) is provided by the Doppler effect. Suppose that an observer is at

rest in some Cartesian inertial frame S defined by the coordinates $x^\mu$ in spacetime. Let us also suppose that a source of radiation is moving relative to S with a speed $v$ in the positive $x^1$-direction and that at some event P the observer receives a photon of wavelength $\lambda$ in a direction that makes an angle $\theta$ with the positive $x^1$-direction. Thus, at the event P the components $k^\mu = \mathbf{k}\cdot\mathbf{e}^\mu$ of the photon's 4-wavevector in this coordinate system are

$$
[k^{\mu}] = \frac{2\pi}{\lambda}\,(1,\ \cos\theta,\ \sin\theta,\ 0)
$$

The photon observed at the event P must have been emitted by the source at some other event Q (say). However, the equation of motion of a photon implies that its 4-momentum $\mathbf{p}$, and hence its 4-wavevector $\mathbf{k}$, is constant along its worldline. Thus the photon's 4-wavevector $\mathbf{k}$ at the event Q is the same as that at the event P.

Let us denote the Cartesian inertial frame in which the radiation source is at rest by S′ (whose spatial axes are assumed not to be rotated with respect to those of S); this frame is represented by the coordinates $x'^\mu$ in spacetime. Thus, at the event Q the components in S′ of the photon's 4-wavevector are given by $k'^\mu = \mathbf{k}\cdot\mathbf{e}'^\mu$ and read

$$
k'^{\mu} = \Lambda^{\mu'}{}_{\nu}\,k^{\nu}
\tag{5.19}
$$

where $\Lambda^{\mu'}{}_{\nu}$ is given by (5.4).

We denote these components in S′ by

$$
[k'^{\mu}] = \frac{2\pi}{\lambda'}\,(1,\ \cos\theta',\ \sin\theta',\ 0)
$$

The zeroth component of (5.19) yields the ratio of the proper wavelength $\lambda'$ and the observed wavelength $\lambda$:

$$
\frac{\lambda}{\lambda'} = \gamma\,(1 - \beta\cos\theta)
$$

This equation contains all the familiar Doppler effect results as special cases. If $\theta = 0$, the source must be approaching the observer along the negative $x^1$-axis. If $\theta = \pi$, the source is receding from the observer along the positive $x^1$-axis. Finally, if $\theta = \pm\pi/2$ we obtain the transverse Doppler effect. Similarly, from the 1- and 2-components of (5.19) we obtain immediately

$$
\tan\theta' = \frac{\tan\theta}{\gamma\left[1 - (v/c)\sec\theta\right]}
$$

which is a version of the relativistic aberration formula.
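Both results follow from a single boost of the 4-wavevector, which the sketch below carries out numerically. It assumes, as in the text, that the source frame S′ is in standard configuration with S, writes $\beta = v/c$, and uses an arbitrary wavelength unit; the function names are our own.

```python
import math

def wavevector(wavelength, theta):
    """Components (2 pi / lambda) * (1, cos theta, sin theta, 0) in S."""
    scale = 2.0 * math.pi / wavelength
    return [scale, scale * math.cos(theta), scale * math.sin(theta), 0.0]

def boost_x(k, beta):
    """Standard-configuration boost of the components k with beta = v/c."""
    g = 1.0 / math.sqrt(1.0 - beta**2)
    return [g * (k[0] - beta * k[1]), g * (k[1] - beta * k[0]), k[2], k[3]]

def source_frame(wavelength, theta, beta):
    """Proper wavelength and propagation angle measured in the source frame S'."""
    k_primed = boost_x(wavevector(wavelength, theta), beta)
    wavelength_primed = 2.0 * math.pi / k_primed[0]
    theta_primed = math.atan2(k_primed[2], k_primed[1])
    return wavelength_primed, theta_primed

beta = 0.6
lam, theta = 500.0, math.pi / 3.0            # observed wavelength and angle in S
lam_p, theta_p = source_frame(lam, theta, beta)
g = 1.0 / math.sqrt(1.0 - beta**2)

doppler_ratio = lam / lam_p                                  # from the boost
doppler_formula = g * (1.0 - beta * math.cos(theta))         # closed form
aberration_formula = math.tan(theta) / (g * (1.0 - beta / math.cos(theta)))
print(doppler_ratio, doppler_formula)
print(math.tan(theta_p), aberration_formula)
```

For $\theta = 0$ the same function gives $\lambda/\lambda' = \sqrt{(1-\beta)/(1+\beta)}$ (a blueshift for an approaching source), and for $\theta = \pi/2$ it gives the transverse factor $\gamma$.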


5.10 Relativistic mechanics

In relativistic mechanics, the equation of motion of a massive particle is given by

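$$
\frac{d\mathbf{p}}{d\tau} = \mathbf{f}
$$

where $\mathbf{f}$ is the 4-force. In some Cartesian inertial frame S (for which the basis vectors are constant throughout the spacetime) the components $f^\mu$ of the 4-force are given by the familiar expression

$$
f^{\mu} = \mathbf{e}^{\mu}\cdot\frac{d\mathbf{p}}{d\tau} = \mathbf{e}^{\mu}\cdot\frac{d}{d\tau}\left(p^{\nu}\mathbf{e}_{\nu}\right) = \frac{dp^{\nu}}{d\tau}\,\delta^{\mu}_{\nu} = \frac{dp^{\mu}}{d\tau}
$$

where we have used the fact that $\mathbf{e}^\mu$ and $\mathbf{e}_\nu$ are reciprocal sets of vectors. Noting that $d\tau = dt/\gamma_u$, we may write

$$
[f^{\mu}] = \gamma_u\,\frac{d}{dt}\left(\frac{E}{c},\ \vec{p}\right) = \gamma_u\left(\frac{\vec{f}\cdot\vec{u}}{c},\ \vec{f}\right)
$$

where in the last equality we have introduced the familiar 3-force $\vec{f}$ as measured in the frame S, and $\vec{u}$ is the 3-velocity in this frame. Writing the components in this way, the time and space parts of the equation of motion in S are (as required)

$$
\frac{1}{\gamma_u}\frac{dE}{d\tau} = \frac{dE}{dt} = \vec{f}\cdot\vec{u}
\tag{5.20}
$$

$$
\frac{1}{\gamma_u}\frac{d\vec{p}}{d\tau} = \frac{d\vec{p}}{dt} = \vec{f}
\tag{5.21}
$$

where $E$ and $\vec{p}$ are given by (5.14) and (5.15) respectively.

There is, however, a certain rarely discussed subtlety in relativistic mechanics. Let us consider the scalar product $\mathbf{u}\cdot\mathbf{f}$, which is of course invariant under coordinate transformations. This is given by

$$
\begin{aligned}
\mathbf{u}\cdot\mathbf{f} = \mathbf{u}\cdot\frac{d\mathbf{p}}{d\tau}
&= \mathbf{u}\cdot\left(\frac{dm_0}{d\tau}\,\mathbf{u} + m_0\,\frac{d\mathbf{u}}{d\tau}\right) \\
&= c^{2}\,\frac{dm_0}{d\tau} + m_0\,\mathbf{u}\cdot\frac{d\mathbf{u}}{d\tau} \\
&= c^{2}\,\frac{dm_0}{d\tau}
\end{aligned}
$$

where we have (twice) used the fact that $\mathbf{u}\cdot\mathbf{u} = c^2$. Thus, we see that in special relativity the action of a force can alter the rest mass of a particle! A force that preserves the rest mass is called a pure force and must satisfy $\mathbf{u}\cdot\mathbf{f} = 0$.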
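The pure-force condition can be illustrated numerically. The sketch below builds a 4-force from a 3-force using the time component $\gamma_u\,\vec{f}\cdot\vec{u}/c$ quoted above and evaluates $dm_0/d\tau = \mathbf{u}\cdot\mathbf{f}/c^2$; it then does the same for a hypothetical 4-force that has only a time component in the particle's rest frame. Units with $c = 1$, the sample vectors and the function names are all assumptions for this example.

```python
import math

def minkowski_dot(a, b):
    return a[0] * b[0] - a[1] * b[1] - a[2] * b[2] - a[3] * b[3]

def lorentz_factor(u3, c=1.0):
    return 1.0 / math.sqrt(1.0 - sum(x * x for x in u3) / c**2)

def four_velocity(u3, c=1.0):
    g = lorentz_factor(u3, c)
    return [g * c] + [g * x for x in u3]

def four_force_from_three_force(f3, u3, c=1.0):
    """Components gamma_u * (f.u / c, f) built from the 3-force f3."""
    g = lorentz_factor(u3, c)
    power = sum(f * u for f, u in zip(f3, u3))   # f . u, the rate of doing work
    return [g * power / c] + [g * f for f in f3]

def rest_mass_rate(f4, u4, c=1.0):
    """dm0/dtau = (u . f) / c^2; zero for a pure force."""
    return minkowski_dot(u4, f4) / c**2

u3 = (0.3, 0.4, 0.0)
f3 = (1.0, -2.0, 0.5)
pure_rate = rest_mass_rate(four_force_from_three_force(f3, u3), four_velocity(u3))

# Hypothetical non-pure 4-force: only a time component in the rest frame.
heating_rate = rest_mass_rate([0.2, 0.0, 0.0, 0.0], four_velocity((0.0, 0.0, 0.0)))
print(pure_rate, heating_rate)
```

The first rate vanishes to rounding error, while the second is non-zero: such a force changes the rest mass without accelerating the particle.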


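If so desired, one can also introduce the 4-acceleration of a particle, $\mathbf{a} = d\mathbf{u}/d\tau$, in terms of which a pure 4-force takes the familiar form $\mathbf{f} = m_0\mathbf{a}$. In some Cartesian inertial frame S, the components of the 4-acceleration are

$$
\begin{aligned}
[a^{\mu}] = \left[\frac{du^{\mu}}{d\tau}\right]
&= \gamma_u\,\frac{d}{dt}\left(\gamma_u c,\ \gamma_u\vec{u}\right)
= \gamma_u\left(c\,\frac{d\gamma_u}{dt},\ \frac{d\gamma_u}{dt}\,\vec{u} + \gamma_u\,\frac{d\vec{u}}{dt}\right) \\
&= \gamma_u\left(c\,\frac{d\gamma_u}{dt},\ \gamma_u\,\vec{a} + \frac{d\gamma_u}{dt}\,\vec{u}\right)
\end{aligned}
$$

where $\vec{a} = d\vec{u}/dt$ is the 3-acceleration in the frame S.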

5.11 Free particles

We now come to a very important observation concerning relativistic mechanics.

In the absence of any forces, the equation of motion of a massive particle is

$$
\frac{d\mathbf{p}}{d\tau} = \mathbf{0}
\tag{5.22}
$$

where the proper time $\tau$ is an affine parameter along the particle's worldline. Similarly, the equation of motion of a photon is

$$
\frac{d\mathbf{p}}{d\sigma} = \mathbf{0}
\tag{5.23}
$$

where $\sigma$ is some affine parameter along the photon's worldline. However, in each

case the 4-momentum p at some point on the worldline is simply a fixed multiple

of the tangent vector to the worldline at that point. Thus, equations (5.22) and

(5.23) say that tangent vectors to the worldlines of free particles and of photons

form a parallel field of vectors along the worldline. From Chapter 2 we know

that this is the definition of an affinely parameterised geodesic. Thus, in special

relativity the worldlines of free particles and photons are respectively non-null

and null geodesics in Minkowski spacetime.

5.12 Relativistic collisions and Compton scattering

We note from (5.22) and (5.23) that the conservation of energy and momentum

for a free particle or photon is represented by the single equation p = constant.

We can, of course, add the 4-momenta of different particles. Thus for a system of $n$ interacting particles ($i = 1, 2, \ldots, n$) with no external forces, we have $\sum_{i=1}^{n}\mathbf{p}_i = \text{constant}$, which is very useful in relativistic-collision calculations.

Figure 5.4 The Compton effect. The panels show the photon and the electron in the $(x^1, x^2)$-plane, with the angles $\theta$ and $\phi$ marked.

An important example of a relativistic collision is Compton scattering, in which