

The symbols | | are used to exclude unwanted indices from the
(anti-)symmetrisation implied by ( ) and [ ].

4.4 The metric tensor
The most important tensor that one can define on a manifold is the metric tensor
g. This defines a linear map of two vectors into the number that is their inner
product, i.e.

$$\mathbf{g}(\mathbf{u}, \mathbf{v}) \equiv \mathbf{u} \cdot \mathbf{v} \qquad (4.2)$$

From this definition, it is clear that g is a symmetric second-rank tensor. Its
covariant and contravariant components are given by

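$$g_{ab} = \mathbf{g}(\mathbf{e}_a, \mathbf{e}_b) = \mathbf{e}_a \cdot \mathbf{e}_b, \qquad g^{ab} = \mathbf{g}(\mathbf{e}^a, \mathbf{e}^b) = \mathbf{e}^a \cdot \mathbf{e}^b$$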

which, from (4.2), clearly match our earlier definitions. As we showed in
Chapter 3, the matrix $[g^{ab}]$ containing the contravariant components of the
metric tensor is the inverse of the matrix $[g_{ab}]$ that contains its covariant
components. The mixed components of g are given by

$$\mathbf{g}(\mathbf{e}^b, \mathbf{e}_a) = \mathbf{g}(\mathbf{e}_a, \mathbf{e}^b) = \delta_a^b$$

where the last equality is a result of the reciprocity relation between basis vectors
and their duals.
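
As a minimal numerical sketch of these relations, we can take an oblique basis of the Euclidean plane and check that the matrix of $g^{ab}$ inverts that of $g_{ab}$ and that the mixed components reduce to the Kronecker delta. The particular basis vectors and the helper names (`dot`, `inverse_2x2`) are assumptions chosen for illustration only; the dual basis is built from $\mathbf{e}^a = g^{ab}\mathbf{e}_b$.

```python
# Oblique basis of the Euclidean plane (an assumed example, not from the text).
basis = [(1.0, 0.0), (1.0, 1.0)]  # Cartesian components of e_1 and e_2


def dot(u, v):
    """Euclidean inner product of two plane vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))


def inverse_2x2(m):
    """Inverse of a 2x2 matrix stored as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


# Covariant components g_ab = e_a . e_b
g_lower = [[dot(ea, eb) for eb in basis] for ea in basis]
# Contravariant components g^ab: the inverse matrix
g_upper = inverse_2x2(g_lower)
# Dual basis vectors e^a = g^ab e_b
dual = [
    tuple(sum(g_upper[a][b] * basis[b][i] for b in range(2)) for i in range(2))
    for a in range(2)
]
# Mixed components e^a . e_b, which should equal the Kronecker delta
mixed = [[dot(dual[a], basis[b]) for b in range(2)] for a in range(2)]

print(g_lower)
print(g_upper)
print(mixed)
```

For an orthonormal basis all three matrices would coincide with the identity; the oblique basis makes the distinction between $g_{ab}$ and $g^{ab}$ visible.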

4.5 Raising and lowering tensor indices
The contravariant and covariant components of the metric tensor can be used
for raising and lowering general tensor indices, just as they are used for vector
indices. As we have seen, when a tensor acts on different combinations of basis
and dual basis vectors it yields different components. Consider, for example, a
third-rank tensor t. Its covariant components are given by
$$\mathbf{t}(\mathbf{e}_a, \mathbf{e}_b, \mathbf{e}_c) = t_{abc} \qquad (4.3)$$
whereas one possible set of mixed components of the tensor is given by
$$\mathbf{t}(\mathbf{e}_a, \mathbf{e}_b, \mathbf{e}^c) = t_{ab}{}^{c}$$
As we stated earlier, in general these two sets of components will differ, since
the basis and dual basis vectors are related by the metric: $\mathbf{e}_c = g_{cd}\,\mathbf{e}^d$.
Thus, for example,
$$t_{abc} = g_{cd}\, t_{ab}{}^{d}$$
In a similar way we can raise or lower more than one index at a time. For example,
$$t^{a}{}_{bc} = g^{ad} g_{ce}\, t_{db}{}^{e}$$
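
The index gymnastics above can be sketched numerically. The following assumes a two-dimensional manifold with an illustrative metric and arbitrary mixed components $t_{ab}{}^{d}$; the function names `lower_last` and `raise_last` are hypothetical helpers, not notation from the text. Lowering the last index and then raising it again should return the original components.

```python
N = 2
# Assumed metric components at a point: g_lower is g_ab, g_upper is its inverse g^ab.
g_lower = [[1, 1], [1, 2]]
g_upper = [[2, -1], [-1, 1]]


def lower_last(t_mixed):
    """t_abc = g_cd t_ab^d, summing over the repeated index d."""
    return [
        [
            [sum(g_lower[c][d] * t_mixed[a][b][d] for d in range(N)) for c in range(N)]
            for b in range(N)
        ]
        for a in range(N)
    ]


def raise_last(t_cov):
    """t_ab^c = g^cd t_abd, summing over the repeated index d."""
    return [
        [
            [sum(g_upper[c][d] * t_cov[a][b][d] for d in range(N)) for c in range(N)]
            for b in range(N)
        ]
        for a in range(N)
    ]


# Arbitrary illustrative mixed components t_ab^d, indexed as t_mixed[a][b][d]
t_mixed = [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]
t_cov = lower_last(t_mixed)

print(t_cov)
print(raise_last(t_cov) == t_mixed)
```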

4.6 Mapping tensors into tensors
Tensors can be thought of not just as maps between vectors and real numbers
but also as maps between tensors and other tensors. Consider, for example, a
third-rank tensor t, but let us not 'fill' all of its argument 'slots' with vectors. If,
for instance, we fill just its last slot with some fixed vector u, we have the object
$$\mathbf{t}(\cdot, \cdot, \mathbf{u}) \qquad (4.4)$$
What sort of object is this? Well, it is clear that, if we supply two further vectors to
this object, we will obtain a real number. Thus the object (4.4) is itself a
second-rank tensor, which we could denote by s (say). Thus the third-rank tensor t has
'mapped' the vector u into the second-rank tensor s. The covariant components
(say) of s are given by
$$s_{ab} \equiv \mathbf{s}(\mathbf{e}_a, \mathbf{e}_b) = \mathbf{t}(\mathbf{e}_a, \mathbf{e}_b, \mathbf{u}) = t_{abc}\, u^c$$
where, in the last slot, we have expressed u as $u^c \mathbf{e}_c$. By expressing this vector as
$u_c \mathbf{e}^c$ instead, we obtain the equivalent expression $s_{ab} = t_{ab}{}^{c}\, u_c$.
As a further example of mapping between tensors, let us fill both the first and
last slots of t with fixed vectors v and u respectively to obtain the object
$$\mathbf{t}(\mathbf{v}, \cdot\,, \mathbf{u}) \qquad (4.5)$$

Clearly, this object is a first-rank tensor (or vector), which we denote by w. Thus
the third-rank tensor t has mapped the two vectors v and u into the vector w. The
covariant components (say) of w are
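$$w_b = \mathbf{w}(\mathbf{e}_b) = \mathbf{t}(\mathbf{v}, \mathbf{e}_b, \mathbf{u})$$
which can be expressed in several equivalent ways, i.e.
$$w_b = t_{abc}\, v^a u^c = t^{a}{}_{bc}\, v_a u^c = t_{ab}{}^{c}\, v^a u_c = t^{a}{}_{b}{}^{c}\, v_a u_c$$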
The number of free indices in such expressions is the rank of the resulting tensor.
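
A short sketch of these two mappings in components, assuming a two-dimensional manifold and arbitrary illustrative values for $t_{abc}$, $u^c$ and $v^a$ (none of which come from the text):

```python
N = 2
# Illustrative covariant components t_abc, indexed as t[a][b][c]
t = [[[3, 5], [7, 11]], [[11, 17], [15, 23]]]
u = [1, 2]   # contravariant components u^c
v = [2, -1]  # contravariant components v^a

# Filling the last slot with u leaves two free indices: s_ab = t_abc u^c
s = [[sum(t[a][b][c] * u[c] for c in range(N)) for b in range(N)] for a in range(N)]

# Filling the first and last slots leaves one free index: w_b = t_abc v^a u^c
w = [
    sum(t[a][b][c] * v[a] * u[c] for a in range(N) for c in range(N))
    for b in range(N)
]

print(s)  # rank-2 object: two free indices
print(w)  # rank-1 object: one free index
```

The nesting depth of each result matches the number of free indices, which is the rank of the resulting tensor.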

4.7 Elementary operations with tensors
Tensor calculus is concerned with tensorial operations, that is, operations on
tensors which result in quantities that are still tensors. We now consider some
elementary tensorial operations.

Addition (and subtraction)
It is clear from the definition of a tensor that the sum and difference of two tensors
of rank N are both themselves tensors of rank N . For example, the covariant
components (say) of the sum s and difference d of two rank-2 tensors are given
straightforwardly by

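$$s_{ab} = \mathbf{s}(\mathbf{e}_a, \mathbf{e}_b) = \mathbf{t}(\mathbf{e}_a, \mathbf{e}_b) + \mathbf{r}(\mathbf{e}_a, \mathbf{e}_b) = t_{ab} + r_{ab}$$
$$d_{ab} = \mathbf{d}(\mathbf{e}_a, \mathbf{e}_b) = \mathbf{t}(\mathbf{e}_a, \mathbf{e}_b) - \mathbf{r}(\mathbf{e}_a, \mathbf{e}_b) = t_{ab} - r_{ab}$$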

Multiplication by a scalar
If t is a rank-N tensor then so too is $\alpha\mathbf{t}$, where $\alpha$ is some arbitrary real constant.
Clearly, its components are all multiplied by $\alpha$.

Outer product
The outer or tensor product of two tensors produces a tensor of higher rank. The
simplest example of an outer product is that of two vectors. This is defined as the
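rank-2 tensor, denoted by $\mathbf{u} \otimes \mathbf{v}$, such that

$$(\mathbf{u} \otimes \mathbf{v})(\mathbf{p}, \mathbf{q}) \equiv \mathbf{u}(\mathbf{p})\, \mathbf{v}(\mathbf{q})$$

where p and q are arbitrary vector arguments (this notation is not to be confused
with the vector product $\mathbf{u} \times \mathbf{v}$ of two vectors, which is itself a vector). Note that
the outer product is not, in general, commutative, so that $\mathbf{u} \otimes \mathbf{v}$ and $\mathbf{v} \otimes \mathbf{u}$ are
different rank-2 tensors. The covariant components (say) of $\mathbf{u} \otimes \mathbf{v}$ in some basis
are given by
$$(\mathbf{u} \otimes \mathbf{v})(\mathbf{e}_a, \mathbf{e}_b) \equiv \mathbf{u}(\mathbf{e}_a)\, \mathbf{v}(\mathbf{e}_b) = u_a v_b$$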
The outer product of higher-rank tensors is a simple generalisation of the outer
product of two vectors. For example, the outer product of a rank-2 tensor t with
a rank-1 tensor s is defined by

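$$(\mathbf{t} \otimes \mathbf{s})(\mathbf{p}, \mathbf{q}, \mathbf{r}) \equiv \mathbf{t}(\mathbf{p}, \mathbf{q})\, \mathbf{s}(\mathbf{r})$$

This is a rank-3 tensor, which we could call h. The mixed components, for
instance, of this tensor are given by

$$h^{a}{}_{bc} = \mathbf{t}(\mathbf{e}^a, \mathbf{e}_b)\, \mathbf{s}(\mathbf{e}_c) = t^{a}{}_{b}\, s_c \qquad (4.6)$$

In general, the outer product of an Nth-rank tensor with an Mth-rank tensor will
produce an (N + M)th-rank tensor.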

Contraction (and inner product)
The contraction of a tensor is performed by summing over the basis and dual
basis vectors in two of its vector arguments, and it results in a tensor of lower
rank. Let us take as an example a rank-3 tensor h and consider the quantity

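$$\mathbf{q}(\cdot) = \mathbf{h}(\mathbf{e}^a, \cdot\,, \mathbf{e}_a)$$

This is clearly a rank-1 tensor with covariant components (say) given by

$$q_b = \mathbf{h}(\mathbf{e}^a, \mathbf{e}_b, \mathbf{e}_a) = h^{a}{}_{ba} \qquad (4.7)$$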

Thus in terms of tensor components, contraction amounts to setting a subscript
equal to a superscript and summing, as the summation convention requires. In
general, performing a single contraction on an N th-rank tensor will produce a
tensor of rank $N - 2$.
Contraction may be combined with tensor multiplication to obtain the inner
product of two tensors. For example, if $h^{a}{}_{bc}$ were in fact given by (4.6), then
(4.7) could be written as

$$q_b = \mathbf{t}(\mathbf{e}^a, \mathbf{e}_b)\, \mathbf{s}(\mathbf{e}_a) = t^{a}{}_{b}\, s_a$$

which is the inner product of the tensors t and s. Alternatively, one could view
the $q_b$ as a contraction of the rank-3 tensor having components $t^{a}{}_{b}\, s_c$, which is
the outer product $\mathbf{t} \otimes \mathbf{s}$.
If two tensors t and s are rank 2 or lower then we can denote their inner product
unambiguously by $\mathbf{t} \cdot \mathbf{s}$. Note, however, that in general such an inner product is
not commutative. For example, if t is a rank-2 tensor and s is rank 1 then the
contravariant components (say) of the vectors $\mathbf{t} \cdot \mathbf{s}$ and $\mathbf{s} \cdot \mathbf{t}$ are respectively

$$t^{ab} s_b \quad \text{and} \quad s_a t^{ab}$$

Clearly, the 'dot' notation for the inner product becomes ambiguous if either
tensor is rank 3 or higher, since there is then a choice concerning which indices
to contract.
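
The outer product, contraction and inner product described in this section can be sketched with nested lists. The component values and the helper name `outer` are assumptions made for illustration; for simplicity the same array of numbers is used for the components of t in every index position.

```python
N = 2


def outer(u, v):
    """Components of the outer product: (u (x) v)_ab = u_a v_b."""
    return [[ui * vj for vj in v] for ui in u]


# The outer product of two vectors is not commutative.
u = [1, 2]
v = [3, 4]
uv = outer(u, v)
vu = outer(v, u)

# Outer product of a rank-2 tensor with a rank-1 tensor, h[a][b][c] = t[a][b] s[c],
# followed by contraction of the first and last indices: q_b = h^a_ba.
t = [[1, 2], [3, 4]]  # first index labels the row
s = [5, 6]
h = [[[t[a][b] * s[c] for c in range(N)] for b in range(N)] for a in range(N)]
q = [sum(h[a][b][a] for a in range(N)) for b in range(N)]

# The two possible inner products of t with s contract different slots of t.
t_dot_s = [sum(t[a][b] * s[b] for b in range(N)) for a in range(N)]
s_dot_t = [sum(s[a] * t[a][b] for a in range(N)) for b in range(N)]

print(uv, vu)
print(q, t_dot_s, s_dot_t)
```

Here `q` coincides with `s_dot_t`, since contracting the outer product over the first and last indices is the same operation as contracting s with the first slot of t.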

4.8 Tensors as geometrical objects
We have seen that a rank-1 tensor $\mathbf{t}(\cdot)$ can be identified as a vector. The covariant
and contravariant components of this vector in some basis are given by

$$\mathbf{t}(\mathbf{e}_a) = t_a, \qquad \mathbf{t}(\mathbf{e}^a) = t^a$$

We are used to thinking of a vector t as a geometrical object which can be made
up from a linear combination of the basis vectors,

$$\mathbf{t} = t^a \mathbf{e}_a = t_a \mathbf{e}^a \qquad (4.8)$$

Tensors of higher rank are generalisations of the concept of a vector and can
also be regarded as geometrical entities. In a particular basis, a general tensor
can be expressed as a linear combination of a tensor basis made up from the basis
vectors and their duals.
Let us consider the outer product $\mathbf{e}_a \otimes \mathbf{e}_b$ of two basis vectors of some coordinate
system. The contravariant components of this rank-2 tensor in this basis are very
simple:
$$(\mathbf{e}_a \otimes \mathbf{e}_b)(\mathbf{e}^c, \mathbf{e}^d) = \mathbf{e}_a(\mathbf{e}^c)\, \mathbf{e}_b(\mathbf{e}^d) = \delta_a^c\, \delta_b^d$$

Now suppose that we have some general rank-2 tensor t, whose contravariant
components in our basis are $t^{ab}$. Let us consider the quantity $t^{ab}\, \mathbf{e}_a \otimes \mathbf{e}_b$. This is
a sum of rank-2 tensors, which must therefore also be a rank-2 tensor (see above).
If we consider its action on two basis vectors, we find

$$t^{ab} (\mathbf{e}_a \otimes \mathbf{e}_b)(\mathbf{e}^c, \mathbf{e}^d) = t^{ab}\, \delta_a^c\, \delta_b^d = t^{cd}$$

and the $t^{cd}$ are simply the contravariant components of t. Thus, in an analogous way
to the vector in (4.8), we may express the rank-2 tensor t as a linear combination
of basis tensors,

$$\mathbf{t} = t^{ab}\, \mathbf{e}_a \otimes \mathbf{e}_b$$

By considering different tensor bases, constructed from other combinations of the
basis and dual basis vectors, we can also write t in several different ways:

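$$\mathbf{t} = t^{ab}\, \mathbf{e}_a \otimes \mathbf{e}_b = t^{a}{}_{b}\, \mathbf{e}_a \otimes \mathbf{e}^b = t_{a}{}^{b}\, \mathbf{e}^a \otimes \mathbf{e}_b$$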

This idea is extended straightforwardly to higher-rank tensors.

4.9 Tensors and coordinate transformations
The description of tensors as geometrical objects lends itself naturally to a
discussion of the behaviour of tensor components under a coordinate
transformation $x^a \to x'^a$ on the manifold. As shown in Chapter 3, there is a simple
relationship between the coordinate basis vectors $\mathbf{e}_a$ associated with the
coordinate system $x^a$ and the coordinate basis vectors $\mathbf{e}'_a$ associated with a new system
of coordinates $x'^a$. We found that at any point P the two sets of coordinate basis
vectors are related by
$$\mathbf{e}'_a = \frac{\partial x^b}{\partial x'^a}\, \mathbf{e}_b \qquad (4.9)$$
where the partial derivative is evaluated at the point P. A similar relationship
holds between the two sets of dual basis vectors:
$$\mathbf{e}'^a = \frac{\partial x'^a}{\partial x^b}\, \mathbf{e}^b \qquad (4.10)$$
Using (4.9) and (4.10), we can now calculate how the components of any general
tensor must transform under the coordinate transformation.
As shown in Chapter 3, the contravariant components of a vector t in the new
coordinate basis are given by

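$$t'^a = \mathbf{t}(\mathbf{e}'^a) = \frac{\partial x'^a}{\partial x^b}\, \mathbf{t}(\mathbf{e}^b) = \frac{\partial x'^a}{\partial x^b}\, t^b$$
Similarly, the covariant components of t are given by

$$t'_a = \mathbf{t}(\mathbf{e}'_a) = \frac{\partial x^b}{\partial x'^a}\, \mathbf{t}(\mathbf{e}_b) = \frac{\partial x^b}{\partial x'^a}\, t_b$$
It is important to remember that the unprimed and primed components describe
the same vector t in terms of different basis vectors, i.e. $\mathbf{t} = t^a \mathbf{e}_a = t'^a \mathbf{e}'_a$. The
vector t is a geometric entity that does not depend on the choice of coordinate
system.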

The transformation properties of the components of higher-rank tensors may
be found in a similar way. For example, if t is a second-rank tensor then

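$$t'_{ab} = \frac{\partial x^c}{\partial x'^a} \frac{\partial x^d}{\partial x'^b}\, t_{cd}$$
$$t'^{ab} = \frac{\partial x'^a}{\partial x^c} \frac{\partial x'^b}{\partial x^d}\, t^{cd} \qquad (4.11)$$
$$t'_{a}{}^{b} = \frac{\partial x^c}{\partial x'^a} \frac{\partial x'^b}{\partial x^d}\, t_{c}{}^{d}$$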

Once again, these components describe the same tensor (which is a geometric
entity) in terms of different bases. For example,

$$\mathbf{t} = t^{ab}\, \mathbf{e}_a \otimes \mathbf{e}_b = t'^{ab}\, \mathbf{e}'_a \otimes \mathbf{e}'_b$$

In general, when transforming the components of a tensor of arbitrary rank,
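each superscript inherits a transformation 'matrix' $\partial x'^a / \partial x^c$ and each subscript a
transformation matrix $\partial x^c / \partial x'^a$. Thus, for example,

$$t'_{ab}{}^{c} = \frac{\partial x^d}{\partial x'^a} \frac{\partial x^e}{\partial x'^b} \frac{\partial x'^c}{\partial x^f}\, t_{de}{}^{f} \qquad (4.12)$$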

Indeed, the basic requirement for a set of quantities to be the components of a
tensor is that they transform in such a way under a change of coordinates. We
shall return to this point later.
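
As a concrete check of the covariant transformation law in (4.11), the following sketch transforms the Euclidean metric of the plane from Cartesian coordinates (x, y) to plane polar coordinates (r, θ). The choice of coordinate systems, the sample point and the function names are assumptions made for illustration; the expected result is the familiar polar metric diag(1, r²).

```python
import math


def jacobian_polar(r, theta):
    """J[c][a] = dx^c/dx'^a for x = (x, y), x' = (r, theta), x = r cos(theta), y = r sin(theta)."""
    return [
        [math.cos(theta), -r * math.sin(theta)],
        [math.sin(theta), r * math.cos(theta)],
    ]


def transform_covariant(t, jac):
    """Apply t'_ab = (dx^c/dx'^a)(dx^d/dx'^b) t_cd to rank-2 covariant components."""
    n = len(t)
    return [
        [
            sum(jac[c][a] * jac[d][b] * t[c][d] for c in range(n) for d in range(n))
            for b in range(n)
        ]
        for a in range(n)
    ]


g_cartesian = [[1.0, 0.0], [0.0, 1.0]]
r, theta = 2.0, 0.3  # arbitrary sample point
g_polar = transform_covariant(g_cartesian, jacobian_polar(r, theta))

print(g_polar)  # approximately [[1, 0], [0, r**2]]
```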

4.10 Tensor equations
Given a coordinate system (and hence a coordinate basis and its dual), it is
convenient to work in terms of the components of a tensor t in this system rather
than with the geometrical entity t itself. Therefore, from here onwards we shall
adopt a much-used convention, which is to confuse a tensor with its components.
This allows us to refer simply to the tensor $t_{ab}{}^{c}$, rather than the tensor with
components $t_{ab}{}^{c}$.
We now come to the reason why tensors are important in mathematical physics.
Let us illustrate this by way of an example. Suppose we find that in one particular
coordinate system two tensors are equal, for example,

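$$t_{ab} = s_{ab} \qquad (4.13)$$

Let us multiply both sides by $\partial x^a / \partial x'^c$ and $\partial x^b / \partial x'^d$ and take the implied
summations to obtain
$$\frac{\partial x^a}{\partial x'^c} \frac{\partial x^b}{\partial x'^d}\, t_{ab} = \frac{\partial x^a}{\partial x'^c} \frac{\partial x^b}{\partial x'^d}\, s_{ab}$$
Since $t_{ab}$ and $s_{ab}$ are both covariant components of tensors of rank 2, this equation
can be restated as $t'_{cd} = s'_{cd}$. In other words, the equation (4.13) holds in any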
other coordinate system. In short, a tensor equation which holds in one coordi-
nate system necessarily holds in all coordinate systems. Put another way, tensor
equations are coordinate independent, which is in fact obvious from the geomet-
ric approach we have adopted since the outset. One particularly useful fact that
emerges clearly from this discussion, and the transformation law (4.12), is that if
all the components of a tensor are zero in one coordinate system then they vanish
in all coordinate systems. This is useful in proving many tensor relations.

4.11 The quotient theorem
Not all objects with indices are the components of a tensor. An important example
is provided by the connection coefficients $\Gamma^{a}{}_{bc}$, which vanish in a locally Cartesian
coordinate system but not in other coordinate systems. Moreover, in Chapter 3
we derived the transformation properties of $\Gamma^{a}{}_{bc}$ and found that these were not of
the form (4.12).
As mentioned above, the fundamental requirement that a set of quantities form
the components of a tensor is that they obey a transformation law of the kind
(4.12) under a change of coordinates. The quotient theorem provides a means of
establishing this requirement in a particular case without having to demonstrate
explicitly that the transformation law holds. It states that if a set of quantities
when contracted with a tensor produces another tensor then the original set of
quantities is also a tensor. Rather than give a general statement of the theorem
and its proof, which tend to become obscured by a mass of indices, we shall give
an example that illustrates the gist of the theorem.
In an N-dimensional manifold, suppose that with each system of coordinates
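about a point P there are associated $N^3$ numbers $t^{a}{}_{bc}$ and it is known that, for
arbitrary contravariant vector components $v^a$, the $N^2$ numbers $t^{a}{}_{bc} v^c$ transform as the
components of a rank-2 tensor at P under a change of coordinates. This means that
$$t'^{a}{}_{bc}\, v'^c = \frac{\partial x'^a}{\partial x^d} \frac{\partial x^e}{\partial x'^b}\, t^{d}{}_{ef}\, v^f \qquad (4.14)$$
where the $t'^{a}{}_{bc}$ are the corresponding $N^3$ numbers associated with the primed
coordinate system. Then we may deduce that the $t^{a}{}_{bc}$ are the components of a
rank-3 tensor, as follows. Since $v^f = (\partial x^f / \partial x'^c)\, v'^c$, equation (4.14) yields
$$t'^{a}{}_{bc}\, v'^c = \frac{\partial x'^a}{\partial x^d} \frac{\partial x^e}{\partial x'^b} \frac{\partial x^f}{\partial x'^c}\, t^{d}{}_{ef}\, v'^c$$
which, on rearrangement, gives
$$\left( t'^{a}{}_{bc} - \frac{\partial x'^a}{\partial x^d} \frac{\partial x^e}{\partial x'^b} \frac{\partial x^f}{\partial x'^c}\, t^{d}{}_{ef} \right) v'^c = 0$$
This holds for arbitrary vector components $v'^c$, so the expression in parentheses
must vanish identically. Thus
$$t'^{a}{}_{bc} = \frac{\partial x'^a}{\partial x^d} \frac{\partial x^e}{\partial x'^b} \frac{\partial x^f}{\partial x'^c}\, t^{d}{}_{ef}$$
and therefore the $t^{a}{}_{bc}$ must be the components of a third-rank tensor.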
Thus the gist of the quotient theorem is that if a set of numbers displays tensor
characteristics when some of their indices are 'killed off' by summation with the
components of an arbitrary tensor then the original numbers are the components
of a tensor.

4.12 Covariant derivative of a tensor
It is straightforward to show that in an arbitrary coordinate system (unlike in local
Cartesian coordinates) the differentiation of the components of a general tensor,
other than a scalar, with respect to the coordinates does not in general result in
the components of another tensor. For example, consider the derivative of the
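contravariant components $v^a$ of a vector. Under a change of coordinates we have
$$\begin{aligned}
\frac{\partial v'^a}{\partial x'^b} &= \frac{\partial x^c}{\partial x'^b} \frac{\partial v'^a}{\partial x^c} \\
&= \frac{\partial x^c}{\partial x'^b} \frac{\partial}{\partial x^c} \left( \frac{\partial x'^a}{\partial x^d}\, v^d \right) \\
&= \frac{\partial x^c}{\partial x'^b} \frac{\partial x'^a}{\partial x^d} \frac{\partial v^d}{\partial x^c} + \frac{\partial x^c}{\partial x'^b} \frac{\partial^2 x'^a}{\partial x^c\, \partial x^d}\, v^d \qquad (4.15)
\end{aligned}$$
The presence of the second term on the right-hand side of (4.15) shows that the
derivatives $\partial v^a / \partial x^b$ do not form the components of a second-order tensor. This
term arises because the 'transformation matrix' $[\partial x'^a / \partial x^b]$ changes with position
in the manifold (this is not true in local Cartesian coordinates, for which the
second term vanishes).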
To avoid this difficulty, in Chapter 3 we introduced the covariant derivative of
a vector,
$$\nabla_b v^a = \partial_b v^a + \Gamma^{a}{}_{cb}\, v^c$$

in terms of which we may write $\partial_b \mathbf{v} = (\nabla_b v^a)\, \mathbf{e}_a$. Using the transformation
properties of the connection, derived in Chapter 3, it is straightforward to show that
the $\nabla_b v^a$ are the (mixed) components of a rank-2 tensor, which is in fact clear
from their definition. We denote this rank-2 tensor by $\nabla\mathbf{v}$, which is formally the
outer product of the vector differential operator $\nabla$ with the vector v, although it
is usual to omit the symbol $\otimes$ in outer products containing $\nabla$. In a given basis we
have $\nabla = \mathbf{e}^a \partial_a$, so we may write, for example,

$$\nabla\mathbf{v} = (\mathbf{e}^a \partial_a) \otimes (v^b \mathbf{e}_b) = \mathbf{e}^a \otimes \partial_a (v^b \mathbf{e}_b) = (\nabla_a v^b)\, \mathbf{e}^a \otimes \mathbf{e}_b$$

Similarly, the $\nabla_b v_a$ form the covariant components of this tensor, i.e.
$\nabla\mathbf{v} = (\nabla_a v_b)\, \mathbf{e}^a \otimes \mathbf{e}^b$. Indeed, it is easy to check that $\nabla_b v^a$ and $\nabla_b v_a$ satisfy the required
transformation laws for being the components of a tensor.
We can extend the idea of the covariant derivative to higher-rank tensors. For
example, let us consider an arbitrary rank-2 tensor t and derive the form of the
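covariant derivative $\nabla_c t^{ab}$ of its contravariant components. Expressing t in terms
of its contravariant components, we have

$$\partial_c \mathbf{t} = \partial_c (t^{ab}\, \mathbf{e}_a \otimes \mathbf{e}_b) = (\partial_c t^{ab})\, \mathbf{e}_a \otimes \mathbf{e}_b + t^{ab} (\partial_c \mathbf{e}_a) \otimes \mathbf{e}_b + t^{ab}\, \mathbf{e}_a \otimes (\partial_c \mathbf{e}_b)$$

We can rewrite the derivatives of the basis vectors in terms of connection
coefficients to obtain

$$\partial_c \mathbf{t} = (\partial_c t^{ab})\, \mathbf{e}_a \otimes \mathbf{e}_b + t^{ab}\, \Gamma^{d}{}_{ac}\, \mathbf{e}_d \otimes \mathbf{e}_b + t^{ab}\, \Gamma^{d}{}_{bc}\, \mathbf{e}_a \otimes \mathbf{e}_d$$

Interchanging the dummy indices a and d in the second term on the right-hand
side and b and d in the third term, this becomes

$$\partial_c \mathbf{t} = \left( \partial_c t^{ab} + \Gamma^{a}{}_{dc}\, t^{db} + \Gamma^{b}{}_{dc}\, t^{ad} \right) \mathbf{e}_a \otimes \mathbf{e}_b$$

where the expression in parentheses is the required covariant derivative,

$$\nabla_c t^{ab} = \partial_c t^{ab} + \Gamma^{a}{}_{dc}\, t^{db} + \Gamma^{b}{}_{dc}\, t^{ad} \qquad (4.16)$$

Using (4.16), the derivative of the tensor t with respect to $x^c$ can be written in
terms of its contravariant components as

$$\partial_c \mathbf{t} = (\nabla_c t^{ab})\, \mathbf{e}_a \otimes \mathbf{e}_b$$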

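Similar results may be obtained for the covariant derivatives of the mixed
and covariant components of the second-order tensor t. Collecting these results
together, we have

$$\nabla_c t^{ab} = \partial_c t^{ab} + \Gamma^{a}{}_{dc}\, t^{db} + \Gamma^{b}{}_{dc}\, t^{ad}$$
$$\nabla_c t^{a}{}_{b} = \partial_c t^{a}{}_{b} + \Gamma^{a}{}_{dc}\, t^{d}{}_{b} - \Gamma^{d}{}_{bc}\, t^{a}{}_{d} \qquad (4.17)$$
$$\nabla_c t_{ab} = \partial_c t_{ab} - \Gamma^{d}{}_{ac}\, t_{db} - \Gamma^{d}{}_{bc}\, t_{ad}$$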

The positions of the indices in these expressions are once again very systematic.
The last index on each connection coefficient matches that on the covariant
derivative, and the remaining indices can only be logically arranged in one way.
For each contravariant index (superscript) on the left-hand side we add a term on
the right-hand side containing a Christoffel symbol with a plus sign, and for every
covariant index (subscript) we add a corresponding term with a minus sign. This
is extended straightforwardly to tensors with an arbitrary number of contravariant
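and covariant indices. We note that the quantities $\nabla_c t^{ab}$, $\nabla_c t^{a}{}_{b}$ and $\nabla_c t_{ab}$ are the
components of the same third-order tensor $\nabla\mathbf{t}$ with respect to different tensor
bases, i.e.
$$\nabla\mathbf{t} = (\nabla_c t^{ab})\, \mathbf{e}^c \otimes \mathbf{e}_a \otimes \mathbf{e}_b = (\nabla_c t^{a}{}_{b})\, \mathbf{e}^c \otimes \mathbf{e}_a \otimes \mathbf{e}^b = (\nabla_c t_{ab})\, \mathbf{e}^c \otimes \mathbf{e}^a \otimes \mathbf{e}^b$$

One particularly important result is that the covariant derivative of the metric
tensor g is identically zero at all points in a manifold, i.e.

$$\nabla\mathbf{g} = 0$$

Alternatively, we can write this in terms of the components in any basis as

$$\nabla_c g_{ab} = 0 \quad \text{and} \quad \nabla_c g^{ab} = 0 \qquad (4.18)$$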

This result follows immediately from comparing, for example, the third result in
(4.17) with our expression (3.20), derived in Chapter 3, for the partial derivative
of the metric in terms of the affine connection. We note, in particular, that the
expression (3.20) holds even in a manifold with non-zero torsion, and therefore
so too must the result (4.18).[1]

[1] In fact, for a general manifold with non-zero torsion, it is not necessary that (4.18) holds since one can,
in principle, define the affine connection and the metric independently. In arriving at our earlier expression
(3.20), we had in fact already assumed implicitly that the affine connection was metric-compatible, in which
case (4.18) holds automatically. This topic is, however, beyond the scope of our discussion.

The result (4.18) has an important consequence, which considerably simplifies
tensor manipulations. This is that we can interchange the order of raising or
lowering an index and performing covariant differentiation without affecting the
result. For example, consider the contravariant components $t^{ab}$ of some rank-2
tensor. Using (4.18), we can write, for example,

$$\nabla_c t^{ab} = \nabla_c (g^{bd}\, t^{a}{}_{d}) = (\nabla_c g^{bd})\, t^{a}{}_{d} + g^{bd}\, \nabla_c t^{a}{}_{d} = g^{bd}\, \nabla_c t^{a}{}_{d}$$

We also note that the covariant derivative obeys the standard rule for the differ-
entiation of a product.
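
The vanishing of $\nabla_c g_{ab}$ can be checked numerically in a simple case. The sketch below assumes plane polar coordinates (r, θ), for which the metric is diag(1, r²) and the only non-zero connection coefficients are $\Gamma^{r}{}_{\theta\theta} = -r$ and $\Gamma^{\theta}{}_{r\theta} = \Gamma^{\theta}{}_{\theta r} = 1/r$; these standard values, the sample radius and the function name are assumptions not taken from the text. It then evaluates the third formula in (4.17) for every index combination.

```python
N = 2     # index 0 labels r, index 1 labels theta
r = 2.0   # arbitrary sample radius

# Metric components g_ab in plane polar coordinates
g = [[1.0, 0.0], [0.0, r**2]]
# Partial derivatives, stored as dg[c][a][b] = d g_ab / d x^c (only d g_theta,theta / dr is non-zero)
dg = [[[0.0, 0.0], [0.0, 2.0 * r]], [[0.0, 0.0], [0.0, 0.0]]]
# Connection coefficients, stored as gamma[d][a][c] = Gamma^d_ac
gamma = [
    [[0.0, 0.0], [0.0, -r]],            # Gamma^r_ac
    [[0.0, 1.0 / r], [1.0 / r, 0.0]],   # Gamma^theta_ac
]


def cov_deriv_metric(c, a, b):
    """nabla_c g_ab = d_c g_ab - Gamma^d_ac g_db - Gamma^d_bc g_ad."""
    return (
        dg[c][a][b]
        - sum(gamma[d][a][c] * g[d][b] for d in range(N))
        - sum(gamma[d][b][c] * g[a][d] for d in range(N))
    )


components = [
    cov_deriv_metric(c, a, b)
    for c in range(N)
    for a in range(N)
    for b in range(N)
]
print(components)  # every entry should vanish
```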

4.13 Intrinsic derivative of a tensor along a curve
In Chapter 3 we encountered vector fields that are defined only on some subspace
of the manifold, an extreme example being when the vector field $\mathbf{v}(u)$ is defined
only along some curve $x^a(u)$ in the manifold (as for the spin s of a single
particle along its worldline in spacetime). In a similar way a tensor field $\mathbf{t}(u)$
could be defined only along some curve $\mathcal{C}$. We now consider how to calculate
the derivative of such a tensor with respect to the parameter u along the curve.
Let us begin by expressing the tensor at any point along the curve in terms of
its contravariant components (say),

$$\mathbf{t}(u) = t^{ab}(u)\, \mathbf{e}_a(u) \otimes \mathbf{e}_b(u)$$

where the $\mathbf{e}_a(u)$ are the coordinate basis vectors at the point on the curve
corresponding to the parameter value u. Thus, the derivative of t along the curve is
given by
$$\frac{d\mathbf{t}}{du} = \frac{dt^{ab}}{du}\, \mathbf{e}_a \otimes \mathbf{e}_b + t^{ab}\, \frac{d\mathbf{e}_a}{du} \otimes \mathbf{e}_b + t^{ab}\, \mathbf{e}_a \otimes \frac{d\mathbf{e}_b}{du}$$
Using the chain rule to rewrite the derivatives of the basis vectors, we obtain
$$\frac{d\mathbf{t}}{du} = \frac{dt^{ab}}{du}\, \mathbf{e}_a \otimes \mathbf{e}_b + t^{ab}\, \frac{dx^c}{du} \frac{\partial \mathbf{e}_a}{\partial x^c} \otimes \mathbf{e}_b + t^{ab}\, \mathbf{e}_a \otimes \frac{dx^c}{du} \frac{\partial \mathbf{e}_b}{\partial x^c}$$
Finally, by writing the partial derivatives of the basis vectors in terms of the
connection and relabelling indices, we find that
$$\frac{d\mathbf{t}}{du} = \left( \frac{dt^{ab}}{du} + \Gamma^{a}{}_{dc}\, t^{db}\, \frac{dx^c}{du} + \Gamma^{b}{}_{dc}\, t^{ad}\, \frac{dx^c}{du} \right) \mathbf{e}_a \otimes \mathbf{e}_b$$
The term in brackets is called the intrinsic (or absolute) derivative of the
components $t^{ab}$ along the curve and is denoted

$$\frac{Dt^{ab}}{Du} \equiv \frac{dt^{ab}}{du} + \Gamma^{a}{}_{dc}\, t^{db}\, \frac{dx^c}{du} + \Gamma^{b}{}_{dc}\, t^{ad}\, \frac{dx^c}{du} \qquad (4.19)$$

Similar results may be obtained for the covariant and mixed components of the
tensor t. For example, the derivative of t along the curve may be written
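$$\frac{d\mathbf{t}}{du} = \frac{Dt^{ab}}{Du}\, \mathbf{e}_a \otimes \mathbf{e}_b = \frac{Dt_{ab}}{Du}\, \mathbf{e}^a \otimes \mathbf{e}^b = \frac{Dt^{a}{}_{b}}{Du}\, \mathbf{e}_a \otimes \mathbf{e}^b$$
Clearly, the method can be extended easily to higher-rank tensors.
In a similar way to vectors, a tensor t is said to be parallel-transported along a
curve if $d\mathbf{t}/du = 0$ or, equivalently, in terms of its components, if for example
$Dt^{ab}/Du = 0$.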
Following our discussion of the intrinsic derivative of a vector in Chapter 3, a
convenient way to remember the form of the intrinsic derivative is to pretend that
the tensor t is in fact defined throughout (some region of) the manifold, i.e. not
only along the curve $\mathcal{C}$. If this were the case then we could differentiate t with
respect to the coordinates $x^a$. Thus we could write
$$\frac{dt^{ab}}{du} = \frac{\partial t^{ab}}{\partial x^c} \frac{dx^c}{du}$$
Substituting this into (4.19), we could then factor out $dx^c/du$ and recognise the
other factor as the covariant derivative $\nabla_c t^{ab}$. Thus we could write
$$\frac{Dt^{ab}}{Du} = (\nabla_c t^{ab})\, \frac{dx^c}{du} \qquad (4.20)$$
with similar expressions for the intrinsic derivatives of its other components. It
must be remembered, however, that if t is only defined along the curve $\mathcal{C}$ then
formally (4.20) is not defined and acts merely as an aide-memoire.

Exercises

4.1 If t is a rank-2 tensor, show that

$$\mathbf{t}(\mathbf{u} + \mathbf{v}, \mathbf{w} + \mathbf{z}) = t_{ab}\, (u^a + v^a)(w^b + z^b)$$

4.2 If $s^{ab} = s^{ba}$ and $t_{ab} = -t_{ba}$ are the components of a symmetric and an antisymmetric
tensor respectively, show that $s^{ab} t_{ab} = 0$.
4.3 If $t_{ab}$ are the components of an antisymmetric tensor and $v_a$ the components of a
vector, show that
$$v_{[a} t_{bc]} = \tfrac{1}{3} \left( v_a t_{bc} + v_c t_{ab} + v_b t_{ca} \right)$$

4.4 If $t_{ab}$ are the components of a symmetric tensor and $v_a$ the components of a vector,
show that if
$$v_a t_{bc} + v_c t_{ab} + v_b t_{ca} = 0$$

then either $t_{ab} = 0$ or $v_a = 0$.

4.5 If the tensor $t_{abcd}$ satisfies $t_{abcd}\, v^a w^b v^c w^d = 0$ for arbitrary vectors $v^a$ and $w^a$, show
that
$$t_{abcd} + t_{cdab} + t_{cbad} + t_{adcb} = 0$$

4.6 Consider the infinitesimal coordinate transformation

$$x'^a = x^a + \epsilon\, v^a(x)$$

where $v^a(x)$ is a vector field and $\epsilon$ is a small scalar quantity. Show that, to first
order in $\epsilon$,
$$g'_{ab}(x') = g_{ab}(x) - \epsilon \left[ g_{ac}\, \partial_b v^c + g_{cb}\, \partial_a v^c \right]$$

4.7 By investigating their transformation properties, show that $\nabla_b v^a$ are the mixed
components of a rank-2 tensor.
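4.8 If $v_a$ are the covariant components of a vector and $A_{ab}$ are the components of an
antisymmetric rank-2 tensor, show that

$$\nabla_a v_b - \nabla_b v_a = \partial_a v_b - \partial_b v_a$$

$$\nabla_a A_{bc} + \nabla_c A_{ab} + \nabla_b A_{ca} = \partial_a A_{bc} + \partial_c A_{ab} + \partial_b A_{ca}$$

Determine the symmetry properties of the rank-3 tensor

$$B_{abc} = \partial_a A_{bc} + \partial_c A_{ab} + \partial_b A_{ca}$$

4.9 Show that covariant differentiation obeys the usual product rule, e.g.

$$\nabla_a (A_{bc} B^{cd}) = (\nabla_a A_{bc})\, B^{cd} + A_{bc}\, (\nabla_a B^{cd})$$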

Hint: Use local Cartesian coordinates.
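4.10 For a general rank-2 tensor $T^{ab}$, show that the covariant divergence is given by
$$\nabla_a T^{ab} = \frac{1}{\sqrt{|g|}}\, \partial_a \left( \sqrt{|g|}\, T^{ab} \right) + \Gamma^{b}{}_{ca}\, T^{ac}$$

Show further that if $A^{ab} = -A^{ba}$ are the components of an antisymmetric rank-2
tensor then
$$\nabla_a A^{ab} = \frac{1}{\sqrt{|g|}}\, \partial_a \left( \sqrt{|g|}\, A^{ab} \right)$$

Hence show that if the antisymmetric tensor field $A^{ab}$ vanishes on a hypersurface
S that bounds a region V of an N-dimensional manifold then

$$\int_V \nabla_a A^{ab}\, \sqrt{-g}\; d^N x = 0$$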

4.11 Any coordinate transformation $x^a \to x'^a$ under which the metric is form invariant,
i.e. such that
$$g'_{ab}(x) = g_{ab}(x)$$

is called an isometry (note that the argument is the same on both sides of the above
equation). Show that the infinitesimal coordinate transformation in Exercise 4.6 is
an isometry, to first order in $\epsilon$, provided that $v^a$ satisfies

$$g_{ac}\, \partial_b v^c + g_{cb}\, \partial_a v^c + v^c\, \partial_c g_{ab} = 0$$

Show further that this expression can be written as

$$\nabla_a v_b + \nabla_b v_a = 0$$

This is Killing's equation and any vector satisfying it is known as a Killing vector
of the metric $g_{ab}$. Show that if $v^a$ and $w^a$ are both Killing vectors then so too is any
linear combination $\alpha v^a + \beta w^a$, where $\alpha$ and $\beta$ are constants.

5 Special relativity revisited

Now that we have the machinery of tensor calculus in place, let us return to special
relativity and consider how to express this theory in a more formal manner.

5.1 Minkowski spacetime in Cartesian coordinates
In the language of Chapter 2, the Minkowski spacetime of special relativity
is a fixed four-dimensional pseudo-Euclidean manifold. As such, there exists a
privileged class of Cartesian coordinate systems (t, x, y, z) covering the whole
manifold, so that at every point (or event) the squared line element takes the form

$$ds^2 = c^2\, d\tau^2 = c^2\, dt^2 - dx^2 - dy^2 - dz^2$$

where we have taken the opportunity to define the proper time interval
$d\tau^2 = ds^2/c^2$. It is convenient to introduce the indexed coordinates $x^\mu$ ($\mu = 0, 1, 2, 3$),[1]
so that

$$x^0 \equiv ct, \qquad x^1 \equiv x, \qquad x^2 \equiv y, \qquad x^3 \equiv z$$

and to write the line element as

$$ds^2 = \eta_{\mu\nu}\, dx^\mu\, dx^\nu$$

where the $\eta_{\mu\nu}$ are the covariant components of the metric tensor and are
given by

$$[\eta_{\mu\nu}] = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix} \qquad (5.1)$$

From now on we will often use the shorthand notation $[\eta_{\mu\nu}] = \mathrm{diag}(1, -1, -1, -1)$.
It is clear that the contravariant components of the metric are identical, i.e.
$[\eta^{\mu\nu}] = \mathrm{diag}(1, -1, -1, -1)$. With this definition of the metric, Minkowski
spacetime has a signature of $-2$.[2] We also note that, since the metric coefficients
are constant, the connection vanishes everywhere in this coordinate system.

[1] It is conventional to use Greek indices when discussing four-dimensional spacetimes rather than the Latin
indices a, b, c etc. from the start of the alphabet, which are used for abstract N-dimensional manifolds.
Moreover, in relativity theory, it is more common for a Greek index to run from 0 to 3 than from 1 to 4
(although the latter usage is found in some textbooks). Also, it is conventional to use Latin letters from the
middle of the alphabet, such as i, j, k etc., for indices that run from 1 to 3.

5.2 Lorentz transformations
Cartesian coordinates, which we are using in the context of special relativity, have
a direct physical interpretation and correspond to distances and times measured
by an observer at rest in some inertial frame S that is labelled using
three-dimensional Cartesian coordinates[3] (remember that, in Chapter 1, we defined
an inertial frame as one in which a free particle moves in a straight line with
fixed speed). Transforming to a different Cartesian inertial frame corresponds to
performing a coordinate transformation on the Minkowski spacetime to a new
system $x'^\mu$. Since we require that the new coordinate system $x'^\mu$ also corresponds
to a Cartesian inertial frame, the (squared) line element $ds^2$ must take the same
form in these primed coordinates as it did in the unprimed coordinates, i.e.
$$ds^2 = \eta_{\mu\nu}\, dx^\mu\, dx^\nu = \eta_{\mu\nu}\, dx'^\mu\, dx'^\nu$$
In other words the metric in the new coordinates must also be given by (5.1).
From the transformation properties of a second-rank tensor, this means that the
transformation $x^\mu \to x'^\mu$ must satisfy

$$\eta_{\mu\nu} = \frac{\partial x'^\rho}{\partial x^\mu} \frac{\partial x'^\sigma}{\partial x^\nu}\, \eta_{\rho\sigma} \qquad (5.2)$$

which is the necessary and sufficient condition that a transformation $x^\mu \to x'^\mu$
is a Lorentz transformation between two Cartesian inertial coordinate systems.
From (5.2), we see that the elements of the transformation matrix must be
constants. Thus the transformation between two inertial coordinate systems must
be linear, i.e.
$$x'^\mu = \Lambda^{\mu}{}_{\nu}\, x^\nu + a^\mu \qquad (5.3)$$

where the $\Lambda^{\mu}{}_{\nu}$ and $a^\mu$ are constants. This has the form of a general inhomogeneous
Lorentz transformation (or Poincaré transformation). We will generally take the
(unimportant) constants $a^\mu$ to be zero, in which case (5.3) reduces to a normal,
homogeneous, Lorentz transformation. As discussed in Chapter 1, the constants
in the transformation matrix depend upon the relative speed and orientation
of the two inertial frames. If the unprimed and primed coordinates correspond
to inertial frames S and S' in standard configuration, with S' moving at a speed
v relative to S, then the transformation matrix can be written in two equivalent
ways:
$$[\Lambda^{\mu}{}_{\nu}] = \begin{pmatrix} \gamma & -\gamma\beta & 0 & 0 \\ -\gamma\beta & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} = \begin{pmatrix} \cosh\psi & -\sinh\psi & 0 & 0 \\ -\sinh\psi & \cosh\psi & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \qquad (5.4)$$

where $\beta = v/c$, $\gamma = (1 - \beta^2)^{-1/2}$ and the rapidity $\psi$ is defined by $\psi = \tanh^{-1}\beta$.
Clearly, if the axes of S and S' are rotated with respect to one another then the
transformation is more complicated.

[2] Note that some relativists use an alternative, but equivalent, definition
$[\eta_{\mu\nu}] = \mathrm{diag}(-1, 1, 1, 1)$, in which the signature is +2.
[3] We shall prove this shortly.
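The transformation inverse to (5.4) is clearly obtained by putting $v \to -v$ (or
equivalently $\psi \to -\psi$). In general, the inverse transformation matrix is denoted by
$[\Lambda_{\mu}{}^{\nu}]$ and may be calculated from the forward transform using the index-raising and
index-lowering properties of the metric, i.e.

$$\Lambda_{\mu}{}^{\nu} = \eta_{\mu\rho}\, \eta^{\nu\sigma}\, \Lambda^{\rho}{}_{\sigma}$$

That this is indeed the required inverse may be shown using the condition (5.2),
which gives
$$\Lambda_{\mu}{}^{\nu}\, \Lambda^{\mu}{}_{\tau} = \eta_{\mu\rho}\, \eta^{\nu\sigma}\, \Lambda^{\rho}{}_{\sigma}\, \Lambda^{\mu}{}_{\tau} = \eta^{\nu\sigma}\, \eta_{\sigma\tau} = \delta^{\nu}_{\tau}$$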
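
The properties of the boost matrix (5.4) can be verified numerically. The sketch below assumes a sample speed β = 0.6 and uses hypothetical helper names (`boost`, `boost_rapidity`, `inverse_via_metric` and so on); it checks that the two forms in (5.4) agree, that the condition (5.2) holds in the matrix form $\Lambda^{T}\eta\Lambda = \eta$, and that raising and lowering indices with the metric yields the inverse, which in matrix form is $\eta\Lambda^{T}\eta$.

```python
import math

# Minkowski metric diag(1, -1, -1, -1), as in (5.1)
ETA = [[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]


def boost(beta):
    """Boost matrix in the (gamma, beta) form of (5.4)."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    return [
        [gamma, -gamma * beta, 0.0, 0.0],
        [-gamma * beta, gamma, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, 0.0, 1.0],
    ]


def boost_rapidity(psi):
    """The same matrix written in terms of the rapidity psi."""
    ch, sh = math.cosh(psi), math.sinh(psi)
    return [
        [ch, -sh, 0.0, 0.0],
        [-sh, ch, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, 0.0, 1.0],
    ]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)] for i in range(4)]


def transpose(m):
    return [[m[j][i] for j in range(4)] for i in range(4)]


def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(4) for j in range(4))


def inverse_via_metric(lam):
    """Inverse obtained by raising and lowering indices: eta . Lambda^T . eta."""
    return matmul(ETA, matmul(transpose(lam), ETA))


beta = 0.6  # assumed sample value of v/c
lam = boost(beta)
identity = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]

print(close(lam, boost_rapidity(math.atanh(beta))))          # two forms of (5.4) agree
print(close(matmul(transpose(lam), matmul(ETA, lam)), ETA))  # condition (5.2)
print(close(matmul(inverse_via_metric(lam), lam), identity)) # inverse property
print(close(inverse_via_metric(lam), boost(-beta)))          # inverse is the boost with v -> -v
```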

5.3 Cartesian basis vectors
Figure 5.1 shows the coordinate curves for two systems of coordinates $x^a$ and
$x'^a$, corresponding to Cartesian inertial frames S and S' in standard configuration
(with the 2- and 3-directions suppressed). In any coordinate system the coordinate
basis vectors are tangents to the coordinate curves; these are shown for S and S'
in Figure 5.1 (in this diagram, null vectors would lie at 45 degrees to the vertical
axis).

Figure 5.1 The coordinate curves (dotted lines) for two systems of coordinates
$x^a$ and $x'^a$, corresponding to Cartesian inertial frames S and S' in standard
configuration. The coordinate basis vectors for each system are also shown. The
2- and 3-directions are suppressed, and null vectors would lie at 45 degrees to
the vertical axis.

In general, the two sets of basis vectors are related by

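$$\mathbf{e}'_\mu = \Lambda_{\mu}{}^{\nu}\, \mathbf{e}_\nu, \qquad \mathbf{e}_\mu = \Lambda^{\nu}{}_{\mu}\, \mathbf{e}'_\nu$$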

which tells us how to draw one set of basis vectors in terms of the other set.
The two sets of basis vectors satisfy

$$\mathbf{e}_\mu \cdot \mathbf{e}_\nu = \mathbf{e}'_\mu \cdot \mathbf{e}'_\nu = \eta_{\mu\nu}$$

and so both sets form an orthonormal basis at each point in the pseudo-Euclidean
Minkowski spacetime. As drawn in Figure 5.1 the vectors $\mathbf{e}_\mu$ appear mutually
perpendicular, but the $\mathbf{e}'_\mu$ do not. This is an artefact of representing a pseudo-
Euclidean space on a Euclidean piece of paper. As we shall see, the notion of an
orthonormal set of basis vectors at any point in the spacetime is of fundamental
importance for our description of observers.
We can also define dual basis vectors for each system as
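$$\mathbf{e}^\mu = \eta^{\mu\nu}\, \mathbf{e}_\nu, \qquad \mathbf{e}'^\mu = \eta^{\mu\nu}\, \mathbf{e}'_\nu$$
These vectors also form orthonormal sets, since
$$\mathbf{e}^\mu \cdot \mathbf{e}^\nu = \mathbf{e}'^\mu \cdot \mathbf{e}'^\nu = \eta^{\mu\nu}$$
and the components $\eta^{\mu\nu}$ are identical to the components $\eta_{\mu\nu}$.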

5.4 Four-vectors and the lightcone
As in any manifold, we can define vectors at any point P in Minkowski spacetime
(and thus vector fields).[4] In relativity, vectors defined on a four-dimensional
spacetime manifold are called 4-vectors. These 4-vectors are geometrical entities
in spacetime, which can be defined without any reference to a basis (or coordinate
system). Nevertheless, in a particular coordinate system, we can write a general
4-vector v at P in terms of the coordinate basis vectors at P:

$$\mathbf{v} = v^\mu\, \mathbf{e}_\mu$$

Let us assume for the moment that we are using Cartesian coordinates $x^\mu$
corresponding to some inertial frame S. At each point P in spacetime we have a
constant set of orthonormal basis vectors $\mathbf{e}_\mu$. The square of the length of a vector
v at a point P (which is a coordinate-independent quantity) is then given by
$$\mathbf{v} \cdot \mathbf{v} = \eta_{\mu\nu}\, v^\mu v^\nu = v_\mu v^\mu$$
We have that
for $v_\mu v^\mu > 0$ the vector is timelike (5.5)
for $v_\mu v^\mu = 0$ the vector is null (5.6)
for $v_\mu v^\mu < 0$ the vector is spacelike (5.7)


Figure 5.2 The lightcone at some point $P$ in Minkowski spacetime (with one
spatial dimension suppressed). Labelled in the figure: a future-pointing timelike
vector, a future-pointing null vector, a spacelike vector and a past-pointing
timelike vector.

4 In fact, since Minkowski spacetime is pseudo-Euclidean, the tangent space $T_P$ at any point $P$ coincides with
the manifold itself. Thus, in this special case, we are not restricted to local vectors and can reinstate the notions
of position vector and of the displacement vector between arbitrary points in the manifold.

Thus, as we would expect, the coordinate basis vector $\mathbf{e}_0$, which has components
(1, 0, 0, 0), is timelike. Similarly, the basis vectors $\mathbf{e}_i$ ($i = 1, 2, 3$) are spacelike.
Moreover, for any timelike or null vector $\mathbf{v}$, if $\mathbf{v}\cdot\mathbf{e}_0 > 0$ then $\mathbf{v}$ is called
future-pointing, whereas if $\mathbf{v}\cdot\mathbf{e}_0 < 0$ then $\mathbf{v}$ is past-pointing.
At any point $P$ in the Minkowski spacetime, the set of all null vectors at $P$
forms the lightcone or null-cone. The structure of the lightcone is illustrated in
Figure 5.2, with one spatial dimension suppressed.
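The classification (5.5)-(5.7) and the future/past distinction can be checked numerically. The following is a minimal Python sketch, assuming the signature $(+,-,-,-)$ used in this chapter; the function names and the rounding tolerance `tol` are our own illustrative choices and do not come from the text.

```python
# Metric signature (+, -, -, -): eta_{mu nu} = diag(1, -1, -1, -1).
ETA_DIAG = (1.0, -1.0, -1.0, -1.0)


def minkowski_dot(a, b):
    """Return eta_{mu nu} a^mu b^nu for contravariant components a and b."""
    return sum(eta * x * y for eta, x, y in zip(ETA_DIAG, a, b))


def classify(v, tol=1e-12):
    """Classify v by the sign of v.v; tol is an assumed rounding tolerance."""
    length_sq = minkowski_dot(v, v)
    if abs(length_sq) <= tol:
        return "null"
    return "timelike" if length_sq > 0 else "spacelike"


def time_orientation(v):
    """For timelike or null v, compare v.e_0 with zero, where e_0 = (1, 0, 0, 0)."""
    if classify(v) == "spacelike":
        return "not defined for spacelike vectors"
    projection = minkowski_dot(v, (1.0, 0.0, 0.0, 0.0))
    if projection > 0:
        return "future-pointing"
    if projection < 0:
        return "past-pointing"
    return "zero vector"


print(classify((1.0, 0.0, 0.0, 0.0)))           # the basis vector e_0
print(classify((1.0, 1.0, 0.0, 0.0)))           # a vector on the lightcone
print(classify((0.0, 1.0, 0.0, 0.0)))           # the basis vector e_1
print(time_orientation((-2.0, 1.0, 0.0, 0.0)))  # timelike with negative v.e_0
```

Comparing against a tolerance rather than exact zero is a practical choice for floating-point components; exact arithmetic would not need it.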

5.5 Four-vectors and Lorentz transformations
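Suppose that the Cartesian coordinates $x^\mu$ and $x'^\mu$ correspond to inertial frames
$S$ and $S'$. Thus, at each point $P$ in the Minkowski spacetime we have two sets of
(constant) basis vectors $\mathbf{e}_\mu$ and $\mathbf{e}'_\mu$, and a general 4-vector $\mathbf{v}$ defined at $P$ can be
expressed in terms of either set:

\[ \mathbf{v} = v^\mu\,\mathbf{e}_\mu = v'^\mu\,\mathbf{e}'_\mu. \]

Thus, the components in the two bases are related by

\[ v'^\mu = \mathbf{v}\cdot\mathbf{e}'^\mu = \Lambda^\mu{}_\nu\, v^\nu, \]
\[ v^\mu = \mathbf{v}\cdot\mathbf{e}^\mu = \Lambda_\nu{}^\mu\, v'^\nu, \]

where $\Lambda^\mu{}_\nu$ is the Lorentz transformation linking the coordinates $x^\mu$ and $x'^\mu$,
and $\Lambda_\nu{}^\mu$ denotes its inverse. Let
us now consider some examples of physical 4-vectors and investigate the physical
consequences of these transformations.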
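As a concrete illustration of these transformation rules, the sketch below boosts the components of a 4-vector and confirms that its squared length is unchanged. It assumes that $S$ and $S'$ are in standard configuration, so that the transformation is the usual boost along the 1-direction; the helper names and the sample components are hypothetical.

```python
import math

ETA_DIAG = (1.0, -1.0, -1.0, -1.0)  # signature (+, -, -, -)


def boost_matrix(beta):
    """Boost along the 1-direction with speed beta = v/c (standard configuration)."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    return [
        [gamma, -gamma * beta, 0.0, 0.0],
        [-gamma * beta, gamma, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, 0.0, 1.0],
    ]


def transform(matrix, v):
    """Return the components matrix^mu_nu v^nu."""
    return [sum(matrix[m][n] * v[n] for n in range(4)) for m in range(4)]


def minkowski_dot(a, b):
    """Return eta_{mu nu} a^mu b^nu."""
    return sum(eta * x * y for eta, x, y in zip(ETA_DIAG, a, b))


v = [3.0, 1.0, 2.0, -0.5]                        # components in S (sample values)
v_prime = transform(boost_matrix(0.6), v)        # components in S'
v_back = transform(boost_matrix(-0.6), v_prime)  # inverse boost returns to S

print(minkowski_dot(v, v), minkowski_dot(v_prime, v_prime))
```

For a boost in standard configuration the inverse transformation is obtained simply by reversing the sign of the relative speed, which is what `v_back` uses.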

5.6 Four-velocity
A particularly important 4-vector is the 4-velocity of a (massive) particle (or
observer). As discussed in Chapter 1, the trajectory of a particle describes a curve
or worldline in spacetime. We could parameterise this curve in any way we
wish, but for massive particles it is usual to parameterise it using the proper time
$\tau$ measured by the particle. The 4-velocity $\mathbf{u}$ of the particle at any event is then
the tangent vector to the worldline at that event. For a massive particle, $\mathbf{u}$ is a
future-pointing timelike vector. The length of this tangent vector (which is defined
independently of any coordinate system) is constant along the worldline, since
(as shown in Chapter 3)

\[ \mathbf{u}\cdot\mathbf{u} = \eta_{\mu\nu}\,\frac{dx^\mu}{d\tau}\frac{dx^\nu}{d\tau} = c^2. \tag{5.9} \]



Figure 5.3 The 4-velocity at events along the worldlines of a particle travelling
at uniform speed in S (solid line) and a particle accelerating with respect to S
(broken line).

Since $\tau$ is proportional to the interval $s$ along the worldline, it is an affine parameter
(see Chapter 3).
Suppose that we label spacetime with some Cartesian coordinate system
corresponding to an inertial frame $S$. We can then write the worldline of a particle
in this coordinate system as $x^\mu = x^\mu(\tau)$. Figure 5.3 shows the 4-velocity at two
events on the worldline of a particle moving at uniform velocity in the frame $S$.
In this case the direction of the 4-velocity is also constant along the worldline.
The figure also shows the 4-velocity at two events on the worldline of a particle
that is accelerating (back and forth) with respect to the frame $S$. Clearly, in this
case, the direction of the 4-velocity changes along the worldline.
The (contravariant) components of the 4-velocity in the frame $S$ are given by

\[ u^\mu = \mathbf{u}\cdot\mathbf{e}^\mu = \frac{dx^\mu}{d\tau}. \tag{5.10} \]

Setting $x^0 = ct$ for the moment, and noting that $d\tau = dt/\gamma_u$, we can write these
components as

\[ [u^\mu] = \gamma_u\left(c,\ \frac{dx^1}{dt},\ \frac{dx^2}{dt},\ \frac{dx^3}{dt}\right) = \gamma_u\,(c,\ \vec{u}), \tag{5.11} \]

where in the last equality (with a slight abuse of notation) we have introduced the
relative 3-vector $\vec{u} = (u^1, u^2, u^3)$, which is the familiar (three-dimensional) velocity
vector of the particle as measured by an observer at rest in $S$.
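The components (5.11) make it easy to confirm (5.9) for any 3-velocity. The sketch below is only a numerical check, assuming the usual Lorentz factor $\gamma_u = (1 - |\vec{u}|^2/c^2)^{-1/2}$; the function names are illustrative.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def lorentz_factor(speed):
    """gamma_u = 1 / sqrt(1 - u^2/c^2) for a speed below C."""
    return 1.0 / math.sqrt(1.0 - (speed / C) ** 2)


def four_velocity(u3):
    """Components gamma_u * (c, u^1, u^2, u^3) of the 4-velocity, as in (5.11)."""
    speed = math.sqrt(sum(component**2 for component in u3))
    gamma_u = lorentz_factor(speed)
    return (gamma_u * C,) + tuple(gamma_u * component for component in u3)


def length_squared(v):
    """eta_{mu nu} v^mu v^nu with signature (+, -, -, -)."""
    return v[0] ** 2 - v[1] ** 2 - v[2] ** 2 - v[3] ** 2


u = four_velocity((0.6 * C, 0.2 * C, 0.0))
print(length_squared(u) / C**2)  # should be 1 up to rounding error
```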

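In some other inertial frame $S'$, the components of the 4-velocity of the
particle are

\[ u'^\mu = \mathbf{u}\cdot\mathbf{e}'^\mu = \Lambda^\mu{}_\nu\, u^\nu. \]

Writing this out in full for the case where $S$ and $S'$ are in standard configuration
with relative speed $v$, we obtain

\[
\begin{pmatrix} \gamma_{u'} c \\ \gamma_{u'} u'^1 \\ \gamma_{u'} u'^2 \\ \gamma_{u'} u'^3 \end{pmatrix}
=
\begin{pmatrix}
\gamma_v & -\gamma_v v/c & 0 & 0 \\
-\gamma_v v/c & \gamma_v & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{pmatrix}
\begin{pmatrix} \gamma_u c \\ \gamma_u u^1 \\ \gamma_u u^2 \\ \gamma_u u^3 \end{pmatrix}.
\]

This is equivalent to four equations. From the first, we find that

\[ \gamma_{u'} = \gamma_v \gamma_u \left(1 - \frac{u^1 v}{c^2}\right), \]

and from the others we obtain the 3-velocity addition law in special relativity,

\[ u'^1 = \frac{u^1 - v}{1 - u^1 v/c^2}, \qquad
   u'^2 = \frac{u^2}{\gamma_v\,(1 - u^1 v/c^2)}, \qquad
   u'^3 = \frac{u^3}{\gamma_v\,(1 - u^1 v/c^2)}. \]

Note that this approach has allowed us to derive the 3-velocity addition law in an
almost trivial way.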
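The equivalence between boosting the 4-velocity and applying the 3-velocity addition law can be confirmed numerically. The following sketch assumes standard configuration and units with $c = 1$ by default; both function names are hypothetical.

```python
import math


def lorentz_factor(speed, c=1.0):
    """gamma = 1 / sqrt(1 - speed^2/c^2)."""
    return 1.0 / math.sqrt(1.0 - (speed / c) ** 2)


def velocity_from_boosted_four_velocity(u3, v, c=1.0):
    """Boost gamma_u (c, u) along the 1-direction, then divide out gamma_u'."""
    speed = math.sqrt(sum(x * x for x in u3))
    g_u = lorentz_factor(speed, c)
    g_v = lorentz_factor(v, c)
    u4 = (g_u * c, g_u * u3[0], g_u * u3[1], g_u * u3[2])
    u4_prime = (
        g_v * (u4[0] - (v / c) * u4[1]),
        g_v * (u4[1] - (v / c) * u4[0]),
        u4[2],
        u4[3],
    )
    g_u_prime = u4_prime[0] / c  # zeroth component is gamma_u' * c
    return tuple(component / g_u_prime for component in u4_prime[1:])


def addition_law(u3, v, c=1.0):
    """3-velocity addition law for frames in standard configuration."""
    g_v = lorentz_factor(v, c)
    denominator = 1.0 - u3[0] * v / c**2
    return (
        (u3[0] - v) / denominator,
        u3[1] / (g_v * denominator),
        u3[2] / (g_v * denominator),
    )


u = (0.5, 0.3, 0.1)
print(velocity_from_boosted_four_velocity(u, 0.6))
print(addition_law(u, 0.6))
```

The two printed triples should agree to within rounding error, since the second function is just the first written out component by component.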

5.7 Four-momentum of a massive particle
The 4-momentum of a massive particle of rest mass $m_0$ is defined in terms of its
4-velocity $\mathbf{u}$ by

\[ \mathbf{p} = m_0\,\mathbf{u}. \]

At any point $P$ along the particle's worldline the square of the length of this
vector is

\[ \mathbf{p}\cdot\mathbf{p} = m_0\mathbf{u}\cdot m_0\mathbf{u} = m_0^2 c^2. \tag{5.12} \]

In Cartesian coordinates $x^\mu$ corresponding to some inertial frame $S$, the
components of the 4-momentum are simply $p^\mu = \mathbf{p}\cdot\mathbf{e}^\mu$. According to convention we write

\[ [p^\mu] = (E/c,\ p^1,\ p^2,\ p^3) = (E/c,\ \vec{p}), \tag{5.13} \]

where $E$ is the energy of the particle as measured in the frame $S$ and $\vec{p}$ is its
3-momentum measured in $S$. Comparing (5.13) with (5.11) we see that, in special
relativity,

\[ E = \gamma_u m_0 c^2, \tag{5.14} \]

\[ \vec{p} = \gamma_u m_0\, \vec{u}. \tag{5.15} \]

In the frame $S$, the squared length of the 4-momentum is given by $p_\mu p^\mu$. Thus,
from (5.13) and (5.12), we find that

\[ E^2 - p^2 c^2 = m_0^2 c^4, \]

where $p^2 = \vec{p}\cdot\vec{p}$. This is the well-known energy–momentum invariant.
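A short numerical check of the energy–momentum invariant may be helpful. The sketch below evaluates (5.14) and (5.15) for a sample particle in units with $c = 1$; the function name and the sample values are illustrative only.

```python
import math


def energy_and_momentum(m0, u3, c=1.0):
    """Return (E, p) with E = gamma_u m0 c^2 and p = gamma_u m0 u."""
    speed_sq = sum(x * x for x in u3)
    gamma_u = 1.0 / math.sqrt(1.0 - speed_sq / c**2)
    energy = gamma_u * m0 * c**2
    momentum = tuple(gamma_u * m0 * x for x in u3)
    return energy, momentum


m0 = 2.0
E, p = energy_and_momentum(m0, (0.6, 0.0, 0.0))
p_squared = sum(x * x for x in p)

# E^2 - p^2 c^2 should equal m0^2 c^4 (here c = 1).
print(E**2 - p_squared, m0**2)
```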

5.8 Four-momentum of a photon
The above discussion concerned particles of non-zero rest mass, which move
at speeds less than $c$. We now consider particles such as photons and perhaps
neutrinos, which move at the speed of light. The worldline of a massless particle is
a null curve, along which $d\tau = 0$. Thus, we cannot parameterise such a worldline
using the proper time $\tau$. Nevertheless, there are many other parameters that we
can use. For example, in an inertial frame, a photon travelling in the positive
$x$-direction will describe the path $x = ct$. This could be written parametrically as

\[ x^\mu = u^\mu \sigma, \tag{5.16} \]

where $\sigma$ is the parameter and $[u^\mu] = (1, 1, 0, 0)$. Using (3.43), the tangent vector
to the worldline is then

\[ \mathbf{u} = \frac{dx^\mu}{d\sigma}\,\mathbf{e}_\mu = u^\mu\,\mathbf{e}_\mu. \]

Since the worldline is a null curve, we have

\[ \mathbf{u}\cdot\mathbf{u} = 0, \tag{5.17} \]

in contrast with (5.9). Moreover, with this choice of parameter we see that

\[ \frac{d\mathbf{u}}{d\sigma} = 0, \tag{5.18} \]

which is the equation of motion for a photon. We note that although this has
been derived using the fact that the Cartesian basis vectors $\mathbf{e}_\mu$ do not change
with position, it is a vector equation and therefore will hold in any basis (i.e. any
coordinate system).

Our choice of parameterisation in (5.16) may appear somewhat arbitrary.
Indeed, it is true that there exists an unlimited number of parameterisations that
could be used. For example, suppose that we replaced $\sigma$ by $\sigma^2$ (say). As the new
parameter varies between $-\infty$ and $\infty$, the same worldline $x = ct$ would be
described in the spacetime. Since this is a null curve, the condition (5.17) would
continue to be true (as may be verified explicitly). In the new parameterisation,
however, the equation of motion (5.18) would no longer hold. The special class of
parameters for which the equation of motion has the simple form (5.18) is the
class of affine parameters (as discussed in Section 3.16). Since one is always free
to choose such a parameter, we will assume from here on that equation (5.18) is
satisfied.
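To see why only a restricted class of parameters preserves (5.18), it may help to work through a general reparameterisation. In the sketch below, the new parameter $\lambda$, the function $f$, the tangent vector $\mathbf{t}$ and the constants $a$, $b$ are our own notation, introduced purely for illustration.

```latex
% Assume the affine parameter is written as \sigma = f(\lambda), with f'(\lambda) \neq 0.
\begin{align*}
  \mathbf{t} &\equiv \frac{dx^\mu}{d\lambda}\,\mathbf{e}_\mu
              = f'(\lambda)\,\frac{dx^\mu}{d\sigma}\,\mathbf{e}_\mu
              = f'(\lambda)\,\mathbf{u}, \\
  \mathbf{t}\cdot\mathbf{t} &= [f'(\lambda)]^2\,\mathbf{u}\cdot\mathbf{u} = 0, \\
  \frac{d\mathbf{t}}{d\lambda} &= f''(\lambda)\,\mathbf{u}
              + [f'(\lambda)]^2\,\frac{d\mathbf{u}}{d\sigma}
              = \frac{f''(\lambda)}{f'(\lambda)}\,\mathbf{t}.
\end{align*}
% The null condition survives any reparameterisation, but the right-hand side of the
% last line vanishes only if f'' = 0, i.e. \sigma = a\lambda + b with constant a \neq 0 and b.
```

Thus the tangent vector remains null in every parameterisation, whereas the simple form of the equation of motion is retained only under linear changes of parameter.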
So far, we have not mentioned the frequency (or energy) of the photon, which
characterises it in much the same way as the rest mass $m_0$ characterises a massive
particle. Clearly, the tangent vector $\mathbf{u}$ can be multiplied by any scalar constant and
will still satisfy the equations (5.17) and (5.18). The 4-momentum of a photon is
therefore defined as

\[ \mathbf{p} = \beta\,\mathbf{u} \]

for a constant $\beta$ chosen such that, in an arbitrary inertial frame $S$, the components
of $\mathbf{p}$ are

\[ [p^\mu] = (E/c,\ \vec{p}), \]

where $E$ is the energy of the photon as measured in $S$ and $\vec{p}$ is its 3-momentum
in $S$. From (5.17) we thus have $E = pc$.
For photons, it is also common to introduce the 4-wavevector $\mathbf{k}$, which is related
to the 4-momentum by $\mathbf{p} = \hbar\mathbf{k}$. Thus, in the frame $S$, the 4-wavevector has
components given by

\[ [k^\mu] = (2\pi/\lambda,\ \vec{k}), \]

where $\lambda$ is the wavelength of the photon as measured in $S$, $\vec{k} = (2\pi/\lambda)\,\vec{n}$ and
$\vec{n}$ is a unit 3-vector in the direction of propagation.

5.9 The Doppler effect and relativistic aberration
An example of the usefulness of the 4-vector approach (and particularly the photon
4-wavevector) is provided by the Doppler effect. Suppose that an observer is at
rest in some Cartesian inertial frame $S$ defined by the coordinates $x^\mu$ in spacetime.
Let us also suppose that a source of radiation is moving relative to $S$ with a speed
$v$ in the positive $x^1$-direction and that at some event $P$ the observer receives a
photon of wavelength $\lambda$ in a direction that makes an angle $\theta$ with the positive

$x^1$-direction. Thus, at the event $P$ the components $k^\mu = \mathbf{k}\cdot\mathbf{e}^\mu$ of the photon's
4-wavevector in this coordinate system are

\[ [k^\mu] = \frac{2\pi}{\lambda}\,(1,\ \cos\theta,\ \sin\theta,\ 0). \]

The photon observed at the event $P$ must have been emitted by the source at some
other event $Q$ (say). However, the equation of motion of a photon implies that
its 4-momentum $\mathbf{p}$, and hence its 4-wavevector $\mathbf{k}$, is constant along its worldline.
Thus the photon's 4-wavevector $\mathbf{k}$ at the event $Q$ is the same as that at the
event $P$.
Let us denote the Cartesian inertial frame in which the radiation source is at
rest by $S'$ (whose spatial axes are assumed not to be rotated with respect to those
of $S$); this frame is represented by the coordinates $x'^\mu$ in spacetime. Thus, at
the event $Q$ the components in $S'$ of the photon's 4-wavevector are given by
$k'^\mu = \mathbf{k}\cdot\mathbf{e}'^\mu$ and read

\[ k'^\mu = \Lambda^\mu{}_\nu\, k^\nu, \tag{5.19} \]

where $\Lambda^\mu{}_\nu$ is given by (5.4).
We denote these components in $S'$ by

\[ [k'^\mu] = \frac{2\pi}{\lambda'}\,(1,\ \cos\theta',\ \sin\theta',\ 0). \]

The zeroth component of (5.19) relates the observed wavelength $\lambda$ to the proper
wavelength $\lambda'$:

\[ \frac{\lambda}{\lambda'} = \gamma_v\left(1 - \frac{v}{c}\cos\theta\right). \]

This equation contains all the familiar Doppler effect results as special cases. If
$\theta = 0$, the source must be approaching the observer along the negative $x^1$-axis.
If $\theta = \pi$, the source is receding from the observer along the positive $x^1$-axis.
Finally, if $\theta = \pm\pi/2$ we obtain the transverse Doppler effect. Similarly, from the
1- and 2-components of (5.19) we obtain immediately

\[ \tan\theta' = \frac{\tan\theta}{\gamma_v\,[1 - (v/c)\sec\theta]}, \]

which is a version of the relativistic aberration formula.
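Both results can be checked by boosting the 4-wavevector directly. The following Python sketch assumes that $S$ and $S'$ are in standard configuration and writes `beta` for $v/c$; the function names and the sample values of $\lambda$, $\theta$ and `beta` are illustrative.

```python
import math


def boost_along_x(k, beta):
    """Components of k in S', obtained by a boost of speed beta = v/c along x^1."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    return (
        gamma * (k[0] - beta * k[1]),
        gamma * (k[1] - beta * k[0]),
        k[2],
        k[3],
    )


def source_frame_wavelength_and_angle(wavelength, theta, beta):
    """Return (lambda', theta') in the source rest frame S' from observed values in S."""
    scale = 2.0 * math.pi / wavelength
    k = (scale, scale * math.cos(theta), scale * math.sin(theta), 0.0)
    k_prime = boost_along_x(k, beta)
    wavelength_prime = 2.0 * math.pi / k_prime[0]
    theta_prime = math.atan2(k_prime[2], k_prime[1])
    return wavelength_prime, theta_prime


wavelength, theta, beta = 500e-9, 1.0, 0.5
gamma_v = 1.0 / math.sqrt(1.0 - beta**2)
wavelength_prime, theta_prime = source_frame_wavelength_and_angle(wavelength, theta, beta)

# Compare with the closed-form Doppler and aberration expressions.
print(wavelength / wavelength_prime, gamma_v * (1.0 - beta * math.cos(theta)))
print(math.tan(theta_prime), math.tan(theta) / (gamma_v * (1.0 - beta / math.cos(theta))))
```

Each printed pair should agree to rounding error. Using `atan2` rather than a plain arctangent keeps the quadrant of $\theta'$ correct when the photon is seen travelling backwards in $S'$.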

5.10 Relativistic mechanics
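In relativistic mechanics, the equation of motion of a massive particle is given by

\[ \frac{d\mathbf{p}}{d\tau} = \mathbf{f}, \]

where $\mathbf{f}$ is the 4-force. In some Cartesian inertial frame $S$ (for which the basis
vectors are constant throughout the spacetime) the components $f^\mu$ of the 4-force
are given by the familiar expression

\[ f^\mu = \mathbf{e}^\mu\cdot\frac{d\mathbf{p}}{d\tau}
         = \mathbf{e}^\mu\cdot\frac{d}{d\tau}\left(p^\nu\mathbf{e}_\nu\right)
         = \frac{dp^\nu}{d\tau}\,\delta^\mu_\nu
         = \frac{dp^\mu}{d\tau}, \]

where we have used the fact that $\mathbf{e}^\mu$ and $\mathbf{e}_\nu$ are reciprocal sets of vectors. Noting
that $d\tau = dt/\gamma_u$, we may write

\[ [f^\mu] = \gamma_u\,\frac{d}{dt}\left(\frac{E}{c},\ \vec{p}\right)
           = \gamma_u\left(\frac{\vec{f}\cdot\vec{u}}{c},\ \vec{f}\right), \]

where in the last equality we have introduced the familiar 3-force $\vec{f}$ as measured
in the frame $S$, and $\vec{u}$ is the 3-velocity in this frame. Writing the
components in this way, the time and space parts of the equation of motion in $S$ are
(as required)

\[ \frac{1}{\gamma_u}\frac{dE}{d\tau} = \frac{dE}{dt} = \vec{f}\cdot\vec{u}, \tag{5.20} \]

\[ \frac{1}{\gamma_u}\frac{d\vec{p}}{d\tau} = \frac{d\vec{p}}{dt} = \vec{f}, \tag{5.21} \]

where $E$ and $\vec{p}$ are given by (5.14) and (5.15) respectively.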
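There is, however, a certain rarely discussed subtlety in relativistic
mechanics. Let us consider the scalar product $\mathbf{u}\cdot\mathbf{f}$, which is of course invariant under
coordinate transformations. This is given by

\[ \mathbf{u}\cdot\mathbf{f} = \mathbf{u}\cdot\frac{d\mathbf{p}}{d\tau}
   = \mathbf{u}\cdot\left(\frac{dm_0}{d\tau}\,\mathbf{u} + m_0\frac{d\mathbf{u}}{d\tau}\right)
   = c^2\frac{dm_0}{d\tau} + m_0\,\mathbf{u}\cdot\frac{d\mathbf{u}}{d\tau}
   = c^2\frac{dm_0}{d\tau}, \]

where we have (twice) used the fact that $\mathbf{u}\cdot\mathbf{u} = c^2$: once directly, and once
through its derivative, $\mathbf{u}\cdot d\mathbf{u}/d\tau = \tfrac{1}{2}\,d(\mathbf{u}\cdot\mathbf{u})/d\tau = 0$. Thus, we see that in special
relativity the action of a force can alter the rest mass of a particle! A force that
preserves the rest mass is called a pure force and must satisfy $\mathbf{u}\cdot\mathbf{f} = 0$.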

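If so desired, one can also introduce the 4-acceleration of a particle, $\mathbf{a} = d\mathbf{u}/d\tau$,
in terms of which a pure 4-force takes the familiar form $\mathbf{f} = m_0\mathbf{a}$. In some
Cartesian inertial frame $S$, the components of the 4-acceleration are

\[ [a^\mu] = \left[\frac{du^\mu}{d\tau}\right]
           = \gamma_u\,\frac{d}{dt}\left(\gamma_u c,\ \gamma_u\vec{u}\right)
           = \gamma_u\left(c\,\frac{d\gamma_u}{dt},\ \gamma_u\frac{d\vec{u}}{dt} + \vec{u}\,\frac{d\gamma_u}{dt}\right)
           = \gamma_u\left(c\,\frac{d\gamma_u}{dt},\ \gamma_u\vec{a} + \vec{u}\,\frac{d\gamma_u}{dt}\right), \]

where $\vec{a} = d\vec{u}/dt$ is the 3-acceleration in the frame $S$.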
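Since the length of the 4-velocity is constant, the 4-acceleration must be orthogonal to it. The sketch below checks this numerically from the component expressions. It relies on the standard identity $d\gamma_u/dt = \gamma_u^3\,(\vec{u}\cdot\vec{a})/c^2$, which is not derived in the text above and is assumed here; the function names and sample values are illustrative.

```python
import math


def dot3(a, b):
    """Ordinary Euclidean dot product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))


def four_velocity(u3, c=1.0):
    """Components gamma_u * (c, u)."""
    gamma_u = 1.0 / math.sqrt(1.0 - dot3(u3, u3) / c**2)
    return (gamma_u * c,) + tuple(gamma_u * x for x in u3)


def four_acceleration(u3, a3, c=1.0):
    """Components gamma_u * (c dgamma/dt, gamma_u a + u dgamma/dt)."""
    gamma_u = 1.0 / math.sqrt(1.0 - dot3(u3, u3) / c**2)
    dgamma_dt = gamma_u**3 * dot3(u3, a3) / c**2  # assumed standard identity
    time_part = gamma_u * c * dgamma_dt
    space_part = tuple(gamma_u * (gamma_u * a + dgamma_dt * u) for u, a in zip(u3, a3))
    return (time_part,) + space_part


def minkowski_dot(a, b):
    """eta_{mu nu} a^mu b^nu with signature (+, -, -, -)."""
    return a[0] * b[0] - a[1] * b[1] - a[2] * b[2] - a[3] * b[3]


u3 = (0.3, 0.4, 0.0)    # 3-velocity in units with c = 1
a3 = (1.0, -2.0, 0.5)   # 3-acceleration (sample values)
u4 = four_velocity(u3)
a4 = four_acceleration(u3, a3)
print(minkowski_dot(u4, a4))  # should vanish up to rounding error
```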

5.11 Free particles
We now come to a very important observation concerning relativistic mechanics.
In the absence of any forces, the equation of motion of a massive particle is

\[ \frac{d\mathbf{p}}{d\tau} = 0, \tag{5.22} \]

where the proper time $\tau$ is an affine parameter along the particle's worldline.
Similarly, the equation of motion of a photon is

\[ \frac{d\mathbf{p}}{d\sigma} = 0, \tag{5.23} \]

where $\sigma$ is some affine parameter along the photon's worldline. However, in each
case the 4-momentum $\mathbf{p}$ at some point on the worldline is simply a fixed multiple
of the tangent vector to the worldline at that point. Thus, equations (5.22) and
(5.23) say that tangent vectors to the worldlines of free particles and of photons
form a parallel field of vectors along the worldline. From Chapter 2 we know
that this is the definition of an affinely parameterised geodesic. Thus, in special
relativity the worldlines of free particles and photons are respectively non-null
and null geodesics in Minkowski spacetime.

5.12 Relativistic collisions and Compton scattering
We note from (5.22) and (5.23) that the conservation of energy and momentum
for a free particle or photon is represented by the single equation $\mathbf{p} = \text{constant}$.
We can, of course, add the 4-momenta of different particles. Thus for a system of
$n$ interacting particles ($i = 1, 2, \ldots, n$) with no external forces, we have
$\sum_{i=1}^{n} \mathbf{p}_i = \text{constant}$, which is very useful in relativistic-collision calculations.
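In collision calculations one typically sums the 4-momenta of the particles and works with the length of the total, which is the same in every inertial frame. The following sketch illustrates this for two massive particles in units with $c = 1$; the function names, masses and velocities are hypothetical sample choices.

```python
import math


def four_momentum(m0, u3, c=1.0):
    """Components (E/c, p) = gamma_u m0 (c, u) of a massive particle."""
    speed_sq = sum(x * x for x in u3)
    gamma_u = 1.0 / math.sqrt(1.0 - speed_sq / c**2)
    return (gamma_u * m0 * c,) + tuple(gamma_u * m0 * x for x in u3)


def total_four_momentum(momenta):
    """Component-wise sum of a list of 4-momenta."""
    return tuple(sum(components) for components in zip(*momenta))


def invariant_mass(p, c=1.0):
    """sqrt(p.p)/c for a timelike total 4-momentum p."""
    length_sq = p[0] ** 2 - p[1] ** 2 - p[2] ** 2 - p[3] ** 2
    return math.sqrt(length_sq) / c


def boost_x(p, beta):
    """Components of p in a frame moving with speed beta = v/c along x^1."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    return (gamma * (p[0] - beta * p[1]), gamma * (p[1] - beta * p[0]), p[2], p[3])


# Two unit-mass particles approaching head-on at speed 0.6 in the frame S.
particles = [
    four_momentum(1.0, (0.6, 0.0, 0.0)),
    four_momentum(1.0, (-0.6, 0.0, 0.0)),
]
total = total_four_momentum(particles)
total_boosted = total_four_momentum([boost_x(p, 0.3) for p in particles])

print(invariant_mass(total), invariant_mass(total_boosted))
```

The invariant mass of the pair exceeds the sum of the rest masses because it includes the kinetic energy available in the zero-momentum frame.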

Figure 5.4 The Compton effect. (Two panels, each with axes $x^1$ and $x^2$.)

An important example of a relativistic collision is Compton scattering, in which

