. 14
( 16)



0° 0°

Figure 14.13. Indications of true polar wander derived from palaeomagnetic observations.
[Courtesy of J. Besse, V. Courtillot.]

anomalies, effectively the age of the geoid, is due to mantle convection and re¬‚ects
the time it takes for slabs to sink through the mantle, as explained by Davies
(1984). A formal correspondence between large-scale mantle heterogeneity and
the time-integrated history of subduction was made by Richards & Engebretson
(1992), and a quantitative model of mantle heterogeneity derived from subduction
history was later developed by Ricard et al. (1993).
We can infer also that the mantle™s circulation pattern must vary in time. In
other words, as can be seen from the boundary layer analysis in Section 14.1.1,
long-term changes in plate motion on time scales of the order of 10“100 Myr
must be matched by long term changes in the convective forces of the mantle.
Independent observational evidence for time variations in mantle circulation comes
in the form of palaeomagnetic studies of the secular motion of the Earth™s rotation
axis relative to the surface. This true polar wander accommodates changes in
the Earth™s moment of inertia due to internal mass redistributions as the mantle
convects, and is required to conserve angular momentum. The well known true
polar wander model of Besse & Courtillot (2002), shown in Figure 14.13, suggests
secular motion of the Earth™s rotation axis of the order of 10 degrees over the past
100 Myr.
The fact that mantle circulation varies in time means that convectively generated
dynamic topography is a transient feature. While the existence of dynamic
topography as a result of vertical stresses associated with mantle convection is
commonly accepted, its magnitude is poorly understood, being overshadowed
entirely by the ¬rst-order topographic signal of the continent“ocean contrast of
approximately 4 km. Buoyant mantle upwellings, such as hot spots, produce
14.4 Circulation of the mantle 359

Figure 14.14. Residual topography beneath the Earth™s oceans with the removal of the
effect of features due to the lithosphere; the variations in topography are of the order of 500
m. [Courtesy of M. Sandiford.]

topographic highs, while downwellings go along with depressions of the surface.
We can estimate the range of dynamic topography once the dominant shallow,
isostatic mass anomalies in the crust and lithosphere (i.e., the thickness variations
in the crust and lithosphere, sediment loading and age related thermal subsidence)
are removed from the observed topography. A map obtained this way for the
oceans is shown in Figure 14.14, revealing that large areas in the western Paci¬c are
anomalously shallow for their age. Much of this area lies above mantle regions with
rather slow seismic wavespeeds (see Figure 11.23) that are presumably buoyant.
Furthermore the western Paci¬c has been strongly affected by hot spot activity.
Shallow ocean ¬‚oor in this region is thus not entirely surprising, and should not be
taken as evidence against the half-space cooling model presented in Section 12.2.1.
The dynamic topography of the sea ¬‚oor along the global mid-ocean ridge
system is particularly informative. Since ridges by de¬nition represent ocean ¬‚oor
of the same (zero) age, their depth below sea level should be the same. Yet
prominent variations in the height of ridge crests occur, for example, at Iceland
(above sea level) and south of Australia. The pronounced Australia“Antarctic
discordance coincides with mantle down¬‚ow over an old sinking slab, and suggests
a signal from dynamic topography of the order of several hundred meters. On a
regional scale Lithgow-Bertelloni & Silver (1998) con¬rm this inference for the
African plate, showing that the anomalously high stand of southern Africa (by a
kilometre or so) correlates with lower mantle heterogeneity and the geoid. Husson
(2006) provides a discussion of the timing and magnitude of dynamic topography
over retreating subduction zones. The presence of dynamic topography means that
the various continents should differ in their relative sea level curves as recorded
in sediments on continental platforms, a view that is still somewhat alien to the
360 Mantle Convection

Spreading half-rate [mm/yr]
0 20 40 60 80 100

Figure 14.15. Ocean half spreading rates for the last 100 Myr.

conventional sedimentological perspective that such changes re¬‚ect only the global
eustatic signal of the oceans.
Plate motions change on a variety of time scales. While long-term variations
on the order of 10“100 Myr are clearly driven by mantle-related processes, there
are quite rapid plate movement changes superimposed on this which occur over
periods of 10 Myr or less. A well known feature is the Paci¬c plate motion change
recorded in the Hawaiian“Emperor bend, which is commonly believed to re¬‚ect
the reorientation of the Paci¬c plate to more westward motion around 40 Myr ago.
A recent compilation of oceanic spreading half rates over the past 100 Myr is
shown in Figure 14.15, which documents frequent short-term velocity variations
for all major plates during the Mesozoic and Cenozoic. An important new source
of information on plate motion has come from geodetic plate motion models, and
a detailed comparison of such models with those derived from palaeomagnetic
reconstructions shows that while the rates and direction of plate movement
averaged over several years agree well with those deduced from averages over
millions of years, there are also small but important differences. For example,
present-day Nazca“South America motion from the REVEL model (Sella et al.,
2002) derived from geodetic measurements is slower than predicted by the 3
million year average in NUVEL-1A (De Mets et al., 1994), as shown in Figure
14.16; the light arrows from REVEL are somewhat shorter than the dark arrows
representing NUVEL-1A. Such rapid variations cannot arise from changes in
convective processes in the mantle, which operate on a longer time scale of the
order of 100“150 Myr. One must thus appeal to shallow forces in the lithosphere
operating along plate boundaries. Examples include the initiation or termination of
14.4 Circulation of the mantle 361

subduction zones, the loss of a ridge due to subduction, variations in plate coupling
along subduction zones due to variations in sediment thickness, or the rise of
high topography associated with mountain building along convergent margins. For
example, Iaffaldano et al. (2006) explain the rapid reduction of plate convergence
between the Nazca and South American plates as associated with the recent
uplift of the Andes. Short-term variations in plate boundary forces especially
along convergent margins may therefore play an important role in modulating the
short-term evolution of plate motion.

Figure 14.16. Comparison of the plate motion patterns for the NUVEL-1A model of
DeMets et al. (1994) shown by dark arrows, with the REVEL model of Sella et al. (2002)
derived just from recent space geodetic methods shown with light arrows.

14.4.3 Mantle circulation models
We have seen that a careful analysis of past plate motion models provides important
information about structure and dynamics of the mantle. One can exploit this
insight in a formal way through mantle circulation modelling. The circulation
models solve the Navier“Stokes equations in the Stokes limit (14.1.19), that is
they tie past plate movement directly to the history of deep mantle ¬‚ow through
the elliptic nature of the momentum balance. The circulation models are also
required to satisfy the energy equation (14.1.11). This combination of ¬‚ow and
temperature requirements gives rise to the development of thermal boundary layers
at the surface and the core“mantle boundary. Thus the temperature heterogeneity of
circulation models is also tied to past plate motion, primarily through the location
of cold downwellings along current and past subduction zones.
In mantle circulation modelling one faces a fundamental hurdle: namely it is
impossible to know the initial state from which to start the circulation sometime in
the past. Missing initial condition information precludes a deterministic approach
to modelling the history of mantle ¬‚ow. The initial condition problem is shared
with circulation models of the ocean and the atmosphere. The dif¬culty re¬‚ects the
fact that typically there are far more degrees of freedom in the model than available
362 Mantle Convection

D" layer
lower mantle
upper mantle
whole mantle



0 100 200 300 400 500
Time [Myr]

Figure 14.17. The correlation functions for temperature before and after a change of plate
motion in a two-stage convection model.

observations. Ocean and atmospheric modellers overcome this shortcoming by
a method known as data assimilation; through a variety of techniques computed
values from the model are replaced by observations whenever possible. In this
way the ¬‚ow and dynamics in a model can be adapted to the observations at hand
provided the assimilation period is long enough.
Plate motion histories take the form of data in a mantle circulation model. The
plate motions enter the calculation as a set of time-dependent boundary conditions
on the surface velocity ¬eld, which is required to match plate geometries and
velocities at any given time. In other words, when the models are integrated
forward in time from some assumed initial state, the computed surface velocities
are replaced by velocities corresponding to the plate motion at a particular time in
the past. As a result mantle circulation models assimilate past plate motions, and
we need to ensure that plate motion histories are imposed for a suf¬ciently long
time that the memory of initial conditions is lost.
Figure 14.17, based on Bunge et al. (1998), illustrates how initial condition
information is lost in a mantle circulation model. The results are taken from a
whole mantle convection model similar to the one shown in Figure 14.4. The
scenario includes compressible ¬‚ow, with a constant-temperature core supplying
about 20% of the mantle heat ¬‚ux (the remainder coming from internal heating),
and a factor of 40 increase in mantle viscosity through the transition zone. A regime
of mid-Cretaceous plate motions is imposed at ¬rst, and when quasi-steady-state
is reached a sudden switch is made to present-day plate motion. The adjustment
time between the two plate motion regimes can be monitored through the global
14.4 Circulation of the mantle 363

correlation between model temperatures before and after the plate motion change.
The four curves in Figure 14.17 show the correlations of the temperature patterns
for the entire mantle, the upper mantle, the lower mantle, and the core“mantle
boundary (D ) region; all correlations begin at unity. The whole mantle curve levels
off at a correlation of about 0.2 after an elapsed time of about 175 Myr. The other
curves behave similarly, with the most rapid adjustment occurring, not surprisingly,
in the upper mantle, where plate-related ¬‚ow is most directly felt, and the slowest
in the D layer. The results of Figure 14.17 suggest the following conclusions:
(a) the adjustment time is comparable to the vertical transit time of these models,
which is controlled by the radial viscosity structure and is of the order of
150“200 Myr.
(b) the adjustment time is also comparable to the time over which global plate
reconstructions are feasible (about 150 Myr), which is no coincidence: old
sea¬‚oor represents, roughly, the vertical transit time in the mantle, and a faster
mantle overturn would merely serve to lower the mean age of the lithosphere.
(c) the imposition of plate motion models for the past 150 Myr is not suf¬cient
to overcome initial condition information in the deeper mantle and near the
core“mantle boundary, where the memory to earlier ¬‚ow regimes is retained
for closer to 200 Myr. For the mantle this observation is supported by seismic
models showing slabs of early Jurassic or late Permian age, but not older, buried
beneath Siberia (e.g., van der Voo et al., 1999).
Heterogeneity pattern
Seismic tomography provides us with increasingly clear images of mantle
heterogeneity, which can, in principle, be compared with heterogeneity structure
predicted from circulation modelling. Figure 14.18 shows thermal heterogeneity
in two standard circulation models of whole mantle ¬‚ow from Bunge et al. (2002).
The only difference between the models lies in their amount of bottom heating,
which is kept at less than 1% of the global surface heat ¬‚ux in model one (at the
top), while model two (below) has 35% core heating re¬‚ecting the uncertainty in
the range of core heat ¬‚ux that we have discussed before.
We represent compressible ¬‚ow through the anelastic approximation, with a
thermodynamic background con¬guration based on a Murnaghan equation of state.
The reference density increases radially from 3500 kg m’3 at the surface to 5568
kg m’3 at the core“mantle boundary, in compliance with the Preliminary Reference
Earth Model (PREM). The thermal expansivity decreases from a surface value of
4.0—10’5 K’1 to 1.2—10’5 K’1 at the core“mantle boundary. The heat capacity is
kept constant at 1.1—103 J kg’1 K’1 , and thermal conductivity at 6.0 W m’1 K’1 .
Internal heating at a rate of 6.0—10’12 W kg’1 is applied, comparable to a value
suggested by chondrite meteorites. The upper mantle viscosity of 8.0—1021
Pa s exceeds the value inferred from post-glacial rebound by about an order of
magnitude. We must make this choice in order to reduce the thermal Rayleigh
number and thereby lower the computational burden. The viscosity increases from
364 Mantle Convection

without core heating

2800 km

1500 km

670 km
+400 K -400 K

with core heating

2800 km

1500 km

670 km

Figure 14.18. Mantle circulation models calculated with and without core heating shown
as cross-sections at 2800 km, 1500 km and 670 km depth, as well as in a spherical view.

the upper to the lower mantle by a factor of 40. A mechanically strong lithosphere
results from raising the viscosity in the upper 100 km of the model by a factor of
100 relative to the upper mantle. The convective vigour measured by the Rayleigh
number for internal heating RaH is 108 based on the upper mantle viscosity.
We have seen before that one of the most uncertain assumptions in these models
is the initial state. We adopt the following choice: we approximate the unknown
mid-Cretaceous mantle heterogeneity as a quasi-steady-state of mantle ¬‚ow with
plate motions corresponding to the mid-Cretaceous (119 Myr ago), that is we apply
mid-Cretaceous plate motion for all earth history (4.5 billion years) prior to that
time. We account for the reduced convective vigour in these models by scaling
the root mean square (RMS) plate velocities such that convective motion is neither
increased nor reduced by the assimilated velocities, i.e., we scale the RMS plate
velocities to match the RMS surface velocity of convection with a free slip surface.
In this way the model P´ clet number remains unchanged.
There is a dominant heterogeneity in the circulation models associated with
past subduction (Figure 14.18). In the transition zone (670 km depth slice) cold
downwellings beneath North and South America correspond to subduction of the
14.4 Circulation of the mantle 365

Farallon and Nazca plates. Likewise the cold anomalies under India and the
Western Paci¬c arise from Cenozoic subduction of the Indian and Paci¬c plates
beneath Eurasia. The model without strong core heating lacks hot active mantle
upwellings, and the temperatures are nearly uniform away from the downwellings,
whereas the core heated model (35%) shows additional hot upwellings, or plumes.
These are located under the Paci¬c and the Atlantic. The heterogeneity otherwise
is similar to the internally heated case.
The picture of slab-dominated mantle heterogeneity does not differ for the
cross-section much deeper in the mantle (1500 km depth slice). Cold anomalies
persist under eastern North America, the Caribbean, the northernmost part of
South America, and the Alpine“Himalayan mountain belt due to past subduction.
There are also two strong hot thermal anomalies in the core heated model in the
mid-Atlantic and the Eastern Paci¬c region.
A general observation we can draw from Figure 14.18 is that the location of
cold downwellings agrees reasonably well with fast seismic velocity anomalies
imaged by tomography, while agreement in the location of hot upwellings is poor.
The difference is particularly striking for the strong low seismic velocity anomaly
imaged by many tomographic studies under Africa. Its counterpart in the core
heated circulation model is located some 5000 km further west in the Mid and
South Atlantic.
Heterogeneity near the core“mantle boundary in the two circulation models
differs through the existence of a hot thermal boundary layer. The strength of
thermal heterogeneity is low in the internally heated model, whilst there are
prominent hot upwellings in the model with core heating. The heterogeneity pattern
of the models at the core“mantle boundary does not resemble seismic images from
the region just above the core“mantle boundary. Rather it is an artefact of the model
initialisation, made evident upon comparison with the 120 Ma mid-Cretaceous
plate reconstruction (Figure 14.19b), which we used to initialise the model. We
note that heterogeneity at the core“mantle boundary corresponds closely to the
location of spreading and subduction margins. In other words, the deep mantle and
core-mantle boundary heterogeneity in a circulation model is entirely controlled
by the assumed initial condition. The arti¬cial nature of heterogeneity near
the core“mantle boundary is also evident from the recent circulation model by
McNamara & Zhong (2005) which resembles the heterogeneity structure in Figure
14.18 closely.

Adjoint mantle circulation models
We have seen that it is impossible to overcome the initial condition problem with
circulation models running forward in time from some unknown state in the past.
However, the general character of the mantle convection equations suggests a
different approach to get around the problem. There is an absence of inertial effects
in the momentum balance of the mantle, which means that the equation of motion
is instantaneous and does not involve time. Time enters mantle convection only
366 Mantle Convection

through the energy equation (14.1.11), where irreversible processes are associated
with heat diffusion. We recall that heat diffusion in the mantle is small, and that it
can be neglected outside of thermal boundary layers. It is then feasible to reverse
time in the energy equation provided we assume zero heat diffusion. In that case
one may start from a model of present mantle heterogeneity and step convection
backward in time.
There is an error associated with ignoring thermal diffusion effects, and we
may quantify this error by estimating the length scales of diffusive and advective
mantle heat transport. Thermal disturbances diffuse in the mantle by about 100
km - the thickness of the thermal boundary layer - over 100 Myr. Over the same
time the disturbances travel by advection over distances exceeding 1000 - 5000
km. The error we make then in ignoring thermal diffusion is reasonably small. In
practice one will not want to set the thermal diffusion term to exactly zero in the
time-reversed energy equation. Rather, for reasons of numerical stability one will
choose a very small negative diffusion coef¬cient, so that heat diffuses still from
hot to cold regions when we step back in time.
The calculations of Steinberger & O™Connell (1997) illustrate the capabilities of
this approach. They consider the advection of a mantle density ¬eld back in time;
the density variations are inferred from seismic tomography as discussed in Section
11.4. The changes of large-scale mantle density obtained in this way predict
temporal variations in Earth™s rotation axis which are in reasonable agreement with
paleomagnetically derived inferences of true polar wander for the past 60 Myrs.
There are other geologic observables, such as dynamic topography and relative hot
spot motion that one can infer as well, and Steinberger and colleagues have done
so in subsequent work.
There is a formal way to infer mantle ¬‚ow back in time that requires the
formulation of an inverse problem. We seek optimal initial conditions that
minimise, in a weighted least squares sense, the difference between what a
mantle convection model predicts as heterogeneity structure in the mantle and the
heterogeneity that is actually inferred from, say tomography. In general, this class
of problems is known as history matching, for example in hydrology. It is also often
referred to as variational data-assimilation, for example in meteorology, meaning
that model parameters are inferred from a variational principle.
In mantle circulation modelling the images of seismic tomography take on the
form of data and the assimilation is posed as an inverse problem with the initial
conditions as the variables to be determined. The mis¬t between current day
structure and the predictions for speci¬c initial conditions is speci¬ed through a
cost function F. The necessary condition for a minimum of F, that the variation
∇F = 0, produces a set of equations which involve the usual mantle convection
equations coupled to the corresponding adjoint equations. The adjoint equations
are nearly identical to the forward model except for forcing terms and a change of
sign from positive to negative in the diffusion operator. Reversing the sign of the
14.4 Circulation of the mantle 367

Mantle Temperature
Initial Condition 100 Myr later

First Trial

Best Trial

Figure 14.19. The re¬nement of the estimate of the initial conditions in the mantle, 100
Myr ago using the adjoint method. The ¬rst trial is a simple strati¬ed model, but after 100
iterations a complex initial state is achieved.

diffusion operator makes the adjoint equations unconditionally stable to integration
backwards in time.
There are a number of ways that can be used to derive adjoint equations for the
initial condition problem. Here we illustrate the basic approach to which further
complexities can be added (Bunge et al., 2003). We start with the de¬nition of
the cost function F to measure the mis¬t between predicted and actually observed
heterogeneity for a speci¬c initial condition. We take the general form

F = FI + Feq, (14.4.2)

where FI depends on the initial temperature distribution TI(x), and Feq allows for
contributions from the model equations used to map the initial conditions to a ¬nal
state that can be compared with tomography. The component FI is given by

d3x d3x j(x)Wj(x, x )j(x ),
FI = (14.4.3)
368 Mantle Convection

Mantle Temperature [reference model]
(a) Assumed initial conditions (b) 100 Myr later

Assimilated Plate Motions





(c) Cretaceous to initial (d) Present day from (a) to (b)

Figure 14.20. Reference model calculated using data assimilation, the initial state (a) was
created by imposing the Cretaceous plate motion shown in (c) up to 100 Myr ago and then
switching to the modern plate motion (d) to give a ¬nal convective pattern (b) that is much
more like the modern Earth.

with a dependence on the initial temperature distribution through the residual j(x),

j(x) = T (x, t0) ’ TI(x), (14.4.4)

i.e., the difference between the trial initial condition for temperature TI(x) and
the optimal condition T (x, t0), integrated over the model space V with appropriate
weighting functions Wj.
There are two components to the model error Feq. Such errors can arise
from the numerical inaccuracies of a convection code, i.e., the computational
approximations imposed in solutions of the Navier“Stokes equations, or can re¬‚ect
errors in the model equations themselves. For instance, the heat source term h in the
energy equation is often taken as constant in mantle circulation modelling, although
a full treatment requires time-dependence h(t). Model errors can be included
collectively as non-zero terms in the convection equations, and we de¬ne residuals
for divergence (D), momentum (M) and temperature (˜). For an incompressible
¬‚ow, the convection equations take the form

∇ · v = D(x, t), (14.4.5)
∇(·∇v) + Ra ”θ ^r ’ ∇p = M(x, t),
e (14.4.6)
+ v · ∇θ ’ ∇2θ ’ h = ˜(x, t). (14.4.7)
14.4 Circulation of the mantle 369

We have here adopted a non-dimensionalisation based on the thermal diffusion
time, so that the momentum equation involves the Rayleigh number Ra and the
contrasts in non-dimensional temperature ”θ.
The model error functional Feq is constructed as a weighted integral of the
squared model residuals over the volume V and the entire time interval that
separates the initial condition and ¬nal state, the assimilation period I:

d3x dt d3x
Feq = D(x, t)WD(x, t, x , t )D(x , t )
+ MT (x, t)WM (x, t, x , t )M(x , t )
+ ˜(x, t)W˜(x, t, x , t )˜(x , t ) . (14.4.8)

Considerable simpli¬cation of the cost function Feq can be achieved by
introduction of adjoint variables, χ(x, t), φ(x, t) and „(x, t) as weighted integrals
over the model errors:

d3x WD(x, t, x , t )D(x , t ),
χ(x, t) = dt (14.4.9)

d3x WM (x, t, x , t )M(x , t ),
φ(x, t) = dt (14.4.10)

d3x W˜(x, t, x , t )˜(x , t ).
„(x, t) = dt (14.4.11)
The adjoint variables have no particular physical meaning, although one sometimes
refers to them loosely as adjoint pressure χ(x, t), velocity φ(x, t) and temperature
„(x, t) by analogy with the familar forward variables.
In terms of the adjoint variables the full cost function F takes the form

d3x [∇ · v] χ(x, t) + [∇(·∇v) + Ra (θ ’ θ) ^r ’ ∇p]T φ(x, t)
F= dt e
+ [‚tθ + v · ∇θ ’ ∇2θ ’ h] „(x, t)

d3x [θ(x, t0) ’ θI(x)] d3x Wj(x, x )[θ(x, t0) ’ θI(x)],
+ (14.4.12)
where θ is the mean non-dimensional temperature.
A sure approach to ¬nding the ¬rst variation of F is through a classical
perturbation procedure. We introduce a small scalar quantity and consider the
functions θ(x, t) + θ and v(x, t) + v as arguments for F, where θ and v satisfy
all required initial and boundary conditions. Taking the derivative of F with respect
to and then looking at the special case = 0 we arrive at the ¬rst variation of F:

d3x [∇ · δv] χ(x, t) + [∇(·∇v) + Ra δθ ^r ’ ∇δp]T φ(x, t)
δF = dt e
+ [‚t(δθ) + δv · ∇θ + v · ∇δθ ’ ∇2δθ] „(x, t)

d3x δθ(x, t0) d3x Wj(x, x )[θ(x, t0) ’ θI(x)].
+ (14.4.13)
370 Mantle Convection

We take an important step in the derivation of the adjoint equations when we carry
out integration by parts in both space and time on the elements of δF in (14.4.12).
Our aim is to isolate the terms involving δv, δθ and then rearrange the differential
operators to act upon the adjoint variables.
A necessary condition of the mis¬t function F is that the ¬rst variation δF be
zero at the extremum. This criterion requires that the forward and adjoint variables
satisfy a set of adjoint equations and boundary conditions coupled to the forward
model through the model error terms (Bunge et al., 2003):
∇ · φ = 0, (14.4.14)
∇ · (·∇φ) + „∇θ = ∇χ, (14.4.15)
’ ∇ · („v) + Ra ^r · φ = ∇2„ + δ(x, t ’ t1)[θ(x, t0) ’ θI(x)], (14.4.16)
’ e
where δ(x) is the Dirac delta function, and v, θ are the velocity and temperature
¬elds that couple the adjoint equations to the forward model.
The diffusion term and the time derivative in the adjoint energy equation
(14.4.16) have changed sign from positive to negative so the equation can be
stepped backwards in time. Both of these changes arise from the action of the
integration by parts on the elements of the mis¬t function F. The adjoint variables
have to satisfy the following set of boundary conditions, for the full assimilation
period I. At the Earth™s surface S and the inner boundary ‚C,
φ = 0, x ∈ S, = 0, x ∈ ‚C, (14.4.17)
since the plate velocities are speci¬ed at the surface. Over both boundaries (‚V)
we require
φ · n = 0, χ = 0, „ = 0. (14.4.18)
There is also a ¬nal time condition for the adjoint system,
„(x, t1) = 0. (14.4.19)
The adjoint equations are integrated backwards in time from t1 to t0 from the ¬nal
time condition (14.4.19) on the adjoint variable „, which has the role of coupling
the adjoint and forward sets of equations.
Coupling between the forward equations and the adjoint equations can be
ignored if we can assume that our mantle convection model captures all relevant
physical processes, and the numerical error is insigni¬cant. This simpli¬cation,
which makes the computation far more tractable, amounts to taking the limit as
the relevant model weights in the cost function become in¬nite. It is equivalent to
taking the model as a constraint in the minimisation of F.
The physical signi¬cance of the adjoint equation is readily apparent. Instead of
advecting mantle density heterogeneity directly back in time, the adjoint equations
advect heterogeneity differences, that is the mis¬t between predicted and current
14.5 Mantle rheology 371

heterogeneity, into the past. Advection follows the ¬‚owlines of the forward
calculation, so that adjoint modelling is not unlike one style of tomographic inverse
problem in seismology, where travel time residuals are projected back along the ray
path. Numerically, adjoint modelling takes the form of an iterative procedure. One
starts by solving the forward equations from a ¬rst trial for the initial condition.
In the next step the mis¬t between predicted and observed heterogeneity is carried
back in time through the adjoint equations along the lines of ¬‚ow computed in the
forward run. The trial initial condition is then updated by adding the heterogeneity
residual, and the procedure repeats itself. As is customary in inverse problems,
one must consider issues of suitable damping and regularisation. The close
correspondence of forward and adjoint equations makes it possible for essentially
the same computer code to be used to solve the forward and the adjoint system.
Figure 14.19 represents a modelling experiment to test the adjoint procedure.
The ¬gure shows the progress of the iterative optimisation scheme through
illustrations of the ¬rst-trial initial condition state of the temperature ¬eld and
that achieved after 100 iterations. The initial condition assumption is that of a
simple strati¬ed mantle which nevertheless evolves over 100 Myr to produce a
more complex structure. Through repeated re¬nements of the initial condition
temperature distribution by the adjoint method we arrive at a best-trial estimate
of the initial condition which may be compared with the correct initial condition
for this case in Figure 14.20.

14.5 Mantle rheology
In Chapter 9 we encountered the wide range of mechanisms that control the
mechanical properties of silicate minerals through, e.g., diffusion and dislocation
creep. Each of these mechanisms has an exponential dependence on temperature
through the Arrhenius factor (9.3.3)“(9.3.5), so that apparent viscosity can alter
by many orders of magnitude for a few hundred degrees temperature change. The
inverse dependence on temperature as T ’1 in the Arrhenius exponent for all the
different mechanisms renders viscosity most sensitive to temperature ¬‚uctuations
at lower temperatures, so that the cold, upper thermal boundary layer of convection
can acquire very high viscosity, or strength, akin to plates.
The viscosity of the lithosphere is around 1025 Pa s and may drop to as low as
1017 Pa s in the asthenosphere before recovering to 1021 Pa s in the lower part of
the upper mantle (see Chapter 13). The viscosity may change by as much as 7
orders of magnitude in the top 200 km of the mantle. The low viscosity of the
asthenosphere has a signi¬cant effect on the patterns of mantle ¬‚ow (as discussed
in Section 13.1.5), and encourages the formation of convection cells with large
lateral extent compared with their depth.
Plate tectonics is associated with strong lateral reductions in effective viscosity
that concentrate deformation into narrow regions, referred to as zones of shear
localisation. These regions cannot arise within linear rheologies such as elasticity,
viscoelasticity or Newtonian viscous ¬‚ow, where an increase in deformation (or rate
372 Mantle Convection

of deformation) goes along with greater resistance to deformation rather than a loss
of strength. Plate-like behaviour in convection models therefore requires nonlinear
rheologies and feedback mechanisms, so that loss of strength is controlled by
processes that are themselves a function of deformation.

14.5.1 Temperature dependence
We start our consideration of the in¬‚uence of rheology with the effects of
temperature on viscosity. The main role of temperature dependent viscosity when
put into mantle convection models is to stiffen the upper thermal boundary layer
and to make it stronger than the mantle beneath. A straightforward application of
laboratory values for the activation energy (Karato & Wu, 1993) yields viscosities
suf¬ciently large to inhibit motion of the underlying ¬‚uid, which convects almost as
if it were in a state of isoviscous convection with an imposed no-slip top boundary
At the same time there is a strong asymmetry in the horizontally averaged
temperature pro¬le between the top and bottom of the ¬‚uid, in response to the
inherent strength and associated large temperature drop across the cold upper
boundary layer. The interior temperature of the ¬‚uid layer is strongly hotter than
the average temperature of the top surface. The large temperature contrast across
this surface facilitates small-scale convection, so that the ¬‚ow removes heat from
beneath the boundary layer with no need for the entire boundary layer to return
into the mantle. The resultant stagnant lid regime provides a good description for
the tectonic style of Mars, Mercury, the Moon and perhaps Venus, where the entire
planetary surface is made up from a single plate. However, this convective style is
a poor description of plate tectonics on Earth.
Tozer (1972) identi¬ed an additional role of temperature by considering the
thermal evolution of terrestrial planets. If viscosity were too high for convection
to occur, then a planet would heat up until the viscosity reduced suf¬ciently for
convection to begin. All larger terrestrial bodies in the solar system msut therefore
be in a state of convection due to the strong temperature dependence of the silicates
making up their mantles, and their internal temperatures must be in a state of self-
regulation. The long-term evolution of the plate-mantle system must therefore
take account of the sensitivity of heat ¬‚ow to internal temperature via viscosity
variability. Thus in the early Earth, when temperatures were higher, the vigour
of convection can be expected to have been higher, and the circulation time for a
particle through the mantle somewhat shorter than at present.
Temperature-dependent viscosity provides a feedback mechanism capable of
producing shear localisation, from the coupling of viscous heating and viscosity.
The feedback works in such a way that shear zones undergo dissipative heating
and associated weakening of the material. The weak zones then focus deformation,
leading to a further rise in temperature and weakening of the material. Although
simple, this mechanism is unlikely to account for the generation of plate tectonics,
partly because thermal diffusion is too ef¬cient to maintain narrow hot zones
14.5 Mantle rheology 373

over extended times, and also because observations suggest that the majority of
lithospheric faults are characterised by rather low heat ¬‚ow.

14.5.2 Strain dependence
A more effective feedback between rheology and ¬‚ow than that of temperature-
dependent viscosity involves the non-linear dependence of stress upon strain rate.
In Chapter 9 we encountered the general relation between stress and strain rate for
silicate minerals in (9.3.3), and this power-law rheology is one of the simplest that
produces shear localisation. In this rheology the strain rate is proportional to stress
raised to some power n > 1. The resultant non-Newtonian ¬‚ow behaviour is typical
for silicate polycrystals at high temperature and low stress (with T approaching the
melting temperature and σ in the range of 10“100 MPa), and it is likely that much
of the upper mantle deforms by dislocation creep with experimentally determined
creep laws having power-law exponents in the range 2“5.
The feedback mechanism of power-law rheology works such that rapidly
deforming regions soften, which facilitates further deformation in such zones.
From (9.3.5) we recall the apparent viscosity at constant strain rate, which we write
here as:

σ( ™ ) 1
∝ ™ n ’1.
·™ = (14.5.1)

Thus in the limit of large n the effective viscosity scales inversely with strain
rate. This class of non-Newtonian rheology has been implemented into numerous
two-dimensional convection models (commonly with n = 3), but it produces a
poor plate-like behaviour even when one takes large values (n > 7) in the exponent
of the power law.
A considerable improvement in shear localisation is obtained if we invoke an
exotic rheology by setting the power law exponent to a value of n = ’1. The
effective viscosity then scales inversely with the square of the strain rate. This
rheology, which is not observed in the laboratory and lacks a clear physical
interpretation, is known variably as pseudo-stick-slip, self-lubrication or self-
weakening. Models with this rheology produce narrow regions of deformation and
behaviour with a strong plate-like component, as one would expect. This type of
rheology is dif¬cult to implement numerically due to the strong reduction in local
viscosity. Bercovici & Karato (2002) provide a good overview of this class of
We have seen that the coldest parts of the lithosphere are in the brittle regime in
which deformation occurs through slip on faults. This behaviour can be represented
in the continuum limit by modelling temperature-dependent viscosity combined
with a plastic yield stress. The temperature dependence causes the cold upper
boundary layer (lithosphere) to be strong, whilst the yielding allows the boundary
374 Mantle Convection

Figure 14.21. Three-dimensional spherical convection model with temperature dependent
viscosity and yield stress; grid resolution is about 50 km. The contours are for the logarithm
of viscosity, with the darkest hues corresponding to 1021 Pa s and mid-grey to 3—1023 Pa s.
High-strain rate zones (lighter tones) separating plates show reduced viscosity due to plastic
yielding. A zone of reduced viscosity by a factor of 1000 lies beneath the lithosphere from
140 to 400 km depth, and is visible as the nearly black zone. The yield stress is set at 140

to fail locally in regions of high stress. With the use of this style of rheology Moresi
& Solomatov (1998) identi¬ed three styles of convection:
(1) the stagnant lid regime, which we encountered before, in which the cold upper
boundary later is immobile,
(2) the mobile lid regime at low yield stress in which ubiquitous plastic failure of
the lithosphere yield a highly mobile upper boundary layer, and
(3) a transitional sluggish lid regime that switches episodically between the frozen
and mobile regimes.
Regime 1 applies to Mars and the Moon, while regime 3 may describe the
distributed deformation and episodic overturn of the lithosphere on Venus.
An example of this coupled temperature and yield-stress rheology is shown in
Figure 14.21 from Richards et al. (2001). The rheology produces a reasonably
good plate-like behaviour when used in conjunction with an imposed low-viscosity
There is abundant evidence that the lithosphere is pervasively fractured and that
faults play a fundamental role in controlling the strength of the lithosphere. This
result lies at the heart of mantle convection models that incorporate mobile faults,
14.6 Coupled lithosphere“mantle convection models 375

Resistive Stress [MPa]
0 100 200 300 400


Depth [km]


Friction = 0.60
Friction = 0.15
Friction = 0.01


Figure 14.22. Yield strength envelopes for varying coef¬cients of friction corresponding
to faulted and non-faulted material.

e.g., Zhong & Gurnis (1996). The strength envelope of the lithosphere is controlled
by brittle failure down to considerable depth, especially if one assumes that faults
are weak (Figure 14.22). This means that the coef¬cient of friction for fault failure
is much lower than the value (M1 = 0.8) that would be taken for Byerlee™s law for
unfaulted material (see Section 12.2.4). The reduced coef¬cient of friction agrees
with observations of stress indicators and low heat ¬‚ow that are characteristics of
many major fault systems.
It is only with the use of such exotic, self-lubricating rheologies that a
single continuum treatment can be used to generate realistic plate-like behaviour.
However, we already know that there is a rheologically distinct lithosphere, and
that plate boundaries represent features that must cut through the entire lithosphere
zone with different plates on either side. Lithospheric rheology is suf¬ciently
different from the rest of the mantle that, at our present state of understanding,
it is appropriate to treat it separately.

14.6 Coupled lithosphere“mantle convection models
The distinction between the lithosphere and the rest of the mantle is largely
rheological, as we have seen in Chapter 13. Although we can treat mantle
circulation through a viscous ¬‚ow, we need to take account of more complex
properties and shorter length scales in the lithosphere. Some aspects of
lithosphere behaviour, particularly in subduction, can be captured by a visco-plastic
formulation, but a full description requires a combination of rheological processes
whose computational requirements are rather different from those for the ¬‚uid
However, there is an effective way to link lithospheric modelling with mantle
376 Mantle Convection

Figure 14.23. Lithospheric models can be linked to mantle convection through the impo-
sition of suitable driving conditions at plate boundaries and the base of the lithosphere.

circulation through the driving stresses at the base of the lithosphere (Figure 14.23).
Iaffaldano et al. (2006) have shown how a global thin-plate representation of
the lithosphere (SHELLS, Kong & Bird, 1995) can be coupled to a full mantle
circulation model. The thin-plate approach reduces the lithospheric problem
from three to two dimensions by employing the concepts of isostasy and vertical
integration of lithospheric strength (cf. Section 12.6). In the SHELLS model
the lithosphere is treated by a ¬nite element method with a representation of
rock strength employing frictional sliding on faults in the shallow brittle zone
and dislocation creep in the ductile zone at depth. The major plate boundaries
are introduced by faults, represented through ¬nite element interfaces, with dips
controlled by seismological results.
Iaffaldano et al. (2006) have considered the impact of raising the topography
of the Andes on the global mantle ¬‚ow. The extra load results in an increase
in frictional forces on the plate boundary by about 1—1013 N m’1 , which is
comparable to the shear stresses in the mantle. The increased elevation of the
mountains leads to a slowing of the rate of convergence of the Nazca plate and
South America. This result is consistent with the decline from a convergence rate
around 100 yr’1 at 10 Ma, estimated from plate reconstructions (Gordon & Jurdy,
1986), to about 70 mm yr’1 in the geodetic model of Sella et al. (2002) as illustrated
14.7 Thermochemical convection 377

in Figure 14.16. The in¬‚uence on mantle ¬‚ow is signi¬cant, but largely con¬ned
to the zones beneath the Nazca and South America plates. These results indicate
that the raising of topography has the potential of consuming a signi¬cant amount
of the driving forces available from the mantle.
The work of Iaffaldano et al. (2006) combines two different styles of well-
established modelling, and demonstrates the potential of a coupled lithosphere“
mantle approach. The restriction to a uniform-thickness thin plate means that a
common treatment is applied to both oceanic and continental environments and
that subduction is taken up by ¬‚ow in the mantle rather than direct feed from the
lithosphere above. We can envisage more complex three-dimensional lithosphere
models coming into play as computational resources improve. Such coupled
calculations allow the separation of the main ¬‚ow and represent an excellent
test-bed for concept testing, without the need for the incorporation of ¬ne-scale
complex rheology in three-dimensional global mantle circulation models.

14.7 Thermochemical convection
In addition to the dynamical state of the mantle, we would like to know how
chemical components are distributed in the mantle and the way in which this
distribution has evolved.
A number of classes of questions can be posed that require somewhat different
treatment. The simplest approach is the use of passive tracers in a computational
convection scheme, so that the evolution of the mantle from some initial
con¬guration can be tracked and mixing patterns determined. Such studies are
consistent with geochemical estimates of a mantle residence time of 1“2 Gyr. The
fate of a particular component, e.g., the former oceanic crust, can be followed with
the use of tracers whose properties, such as density, are modi¬ed by the current
thermal state and passage through phase transitions.
A similar concept can be used to associate tracers with the concentration of
chemical species through local segregation and assimilation processes governed
by appropriate partition coef¬cients. Most studies, as in the work of Tackley &
Xie (2002), have used laboratory estimates of volumetric partition coef¬cients.
The nature of mantle mixing means that surface melt extraction is likely to
be more important than volumetric effects, and these would be described by
somewhat different partition coef¬cients. In addition the partition coef¬cients can
be in¬‚uenced by conditions such as the oxidation state that are dif¬cult to capture
in a numerical code.
Over time most of the mantle is well stirred and initially contiguous volumes
tend to be spun into thin ribbons. Although some pockets of material with high
viscosity may be less well mixed, it is dif¬cult to ¬nd conditions suitable for
the preservation of large volumes of relatively undisturbed mantle as envisaged
in the geochemical concept of a primitive mantle reservoir that has preserved the
geochemical abundances for the different elements imposed at the time of the
formation of the Earth.
378 Mantle Convection

A full treatment of thermochemical issues requires keeping track of both major
element chemistry and the minor elements that are important for their geochemical
signatures. Tackley & Xie (2002) employed an initial state with two components
˜basalt™ and ˜harzburgite™ with their own distinct densities. The ˜basalt/eclogite™
tracers, initially set at 25% of the total, are allowed to melt with a change in
composition as the mantle convection system evolves. Once differentiation takes
the composition of a tracer to the ˜harzburgite™ end-member no further melting is
The melting process needs to be assessed in each grid cell for each time-step
by comparing the calculated temperature with a solidus temperature that depends
on pressure and composition derived from experimental results. Suf¬cient melt is
extracted from the tracer to return the temperature to the solidus. The tracer is then
assigned the composition of the residuum. The melt component is immediately
transferred to the surface, under the assumption that melt percolation through the
mantle and eruption occur on time scales short compared with the time-step for
the mantle convection run. The melt forms ˜crust™ with compaction of the material
beneath. In the process of melting trace elements (e.g., Pb, U, He, Ar, K, Th) are
partitioned between the melt and residuum, ensuring mass conservation. The noble
gases are assumed to largely escape to the atmosphere from the melt crust.
In addition to assumptions about the initial state of the major elements, a starting
distribution of trace elements has to be imposed (e.g., allowing for continental crust
differentiation). The complex tracer system provides a useful way of investigating
the dependence of a simpli¬ed geochemical system on different assumptions about
the initial state. Many aspects of the results resemble the geochemical signatures
determined from the limited available samples of melt products at the Earth™s
Because of the level of computational resources needed for such a
thermochemical convection scheme, two-dimensional simulations have often been
used, such as a cylindrical geometry approximating a cross-section through the
Earth (e.g., Tackley & Xie, 2002).
A major complication in the treatment of thermochemical issues is the rather
different scales associated with the main ¬‚ow patterns and the ¬lamentary
structures that may carry speci¬c thermochemical signatures. A similar scenario is
encountered in oceanographic models where ¬ne-scale structures, such as eddies
and jets, are very important to detailed weather patterns, whilst the climate is
determined by the main ¬‚ow. New styles of oceanographic calculations are being
developed where the representation of the model is adaptive to the ¬‚ow, and so
computational effort is concentrated in regions of rapid spatial variation. Similar
concepts would be attractive for the thermochemical scenario, since they would
allow a more direct representation of melt processes and melt transport without the
need for parameterisation of sub-grid scale features.
The Core and the Earth™s Dynamo

Passage from the Earth™s mantle to the core marks a transition from a silicate
mineralogy to a liquid metallic region dominated by iron. The principal evidence
for composition comes from shock-wave equation of state measurements (Section
9.4), reinforced by studies of meteorites. Some lighter component than iron is
required in the outer core to match seismological estimates of density. The density
de¬cit is of the order of 10% compared with a liquid iron“nickel alloy. O™Neill &
Palme (1998) provide a careful discussion of the merits of the main candidates for
a light element component: oxygen, silicon and sulphur. No single element is able
to produce the desired density variation and meet geochemical constraints, but a
mixture of several elements (Si, O, S, C) might be appropriate. The seismological
results are compatible with an adiabatic state through most of the outer core with a
change of density of about 10% from the top of the core to the inner core boundary.
The outer core contains 30% of the mass of the Earth and behaves as a ¬‚uid on
seismic time scales, preventing the passage of seismic shear waves. Slower motions
in the liquid core are inferred to be the origin of the Earth™s magnetic ¬eld through
some class of dynamo action. The effective mixing through the convective motion
in the core is responsible for the adiabatic density pro¬le with radius. In contrast,
the inner core appears to be solid, with a large density jump from the outer core
to the inner core associated with a nearly pure iron composition. As mentioned
in Chapter 1, the properties of the inner core are rather complex, with evidence
for seismic anisotropy that is somewhat weaker near the surface. A signi¬cant
dif¬culty is that uncertainty as to the behaviour of other parts of the Earth tends to
be transferred via different observations to the properties of the inner core, which
is only 5% of the total mass.
The arguments for a solid inner core come most forcibly from the frequencies
of free oscillation of the Earth (Masters & Shearer, 1990). A number of efforts
have been made to identify seismic phases with a shear leg (J) in the inner core.
The most promising result from Duess et al. (2000) uses stacking methods and
synthetic seismogram analysis to identify a likely SKJKP phase consistent with a
mean shear wavespeed of 3.6 km/s.
The boundary between the inner and outer cores is suf¬ciently sharp to re¬‚ect

380 The Core and the Earth™s Dynamo

high-frequency seismic energy (PKiKP). Gradients in seismic properties below
the boundary have suggested the possibility of some ¬‚uid percolation in the outer
region as progressive freezing takes place.
The complex interaction of ¬‚uid ¬‚ow and electromagnetic effects in the core
sustains a geodynamo that generates the internal magnetic ¬eld of the Earth.
This magnetic ¬eld has to penetrate the much more poorly conducting silicate
mantle to reach the surface, and in consequence short-wavelength components are
Sources of energy to power the geodynamo can come from gravitational energy
released by compositional convection due to expulsion of the lighter component as
the inner core solidi¬es and from the cooling and contraction of the core as heat is
extracted, latent heat release will accompany the solidi¬cation and there may also
be internal radioactive heat sources (such as 40 K). The idea that the precessional
motion of the Earth powers the dynamo has strong advocates, but is not generally
accepted as a likely mechanism.
Computations of a magnetohydrodynamic dynamo for the core have yet to reach
a fully realistic Earth-like state. Nevertheless some simulations display many
features of the Earth™s magnetic ¬eld, including polarity reversals in the ¬eld; a
comprehensive review is provided by Kono & Roberts (2002).

15.1 The magnetic ¬eld at the surface and at the top of the core
Most of the magnetic ¬eld measured at the Earth™s surface comes from sources
within the Earth, but there are also external in¬‚uences from the ionosphere. We
can build a model of the contribution to the magnetic ¬eld from internal sources in
terms of a magnetic potential •m with a spherical harmonic expansion as
re l+1
Yl (θ, φ)Gm,
•m(r) = re (15.1.1)
l m=’l
where, as in (11.3.38), Yl is a normalised spherical harmonic. The magnetic ¬eld
Bm above the core is derived from the gradient of •m under the assumption that
the conductivity of the mantle is negligible,

∇2•m ≈ 0.
Bm = ’∇•m, with (15.1.2)

At the core“mantle boundary all the components of the magnetic ¬eld must be
continuous, so that the ¬eld on the underside of the boundary will satisfy

B = Bm at r = rc . (15.1.3)

Inside the core the requirement that ∇ · B = 0, from Maxwell™s equations, can
be met by representing the solenoidal ¬eld B in terms of toroidal and poloidal

B(x, t) = ∇ — [T (r, t)x] + ∇ — ∇ — [S(r, t)x], (15.1.4)
15.1 The magnetic ¬eld at the surface and at the top of the core 381

where T is the de¬ning scalar for the toroidal contribution and S that for the poloidal
part. In spherical polar coordinates, the components of the magnetic ¬eld are then
B = Br, Bθ, Bφ
1 2 1‚ ‚ 1‚ 1 ‚‚ 1‚
= L S, (rS) + T, (rS) ’ T . (15.1.5)
r r ‚θ ‚r sin θ ‚φ r sin θ ‚φ ‚r r ‚θ
where L2 is the angular momentum operator,
1 ‚2
‚ 2‚ 1‚ ‚
2 2 2
L= r ’r ∇ =’ sin θ ’2 , (15.1.6)
sin θ ‚φ2
‚r ‚r sin θ ‚θ ‚θ
so that L2 = ’r2 ∇2 , where ∇H is the horizontal derivative operator.
We can convert the poloidal and toroidal representation for the magnetic ¬eld
into an expansion in terms of spherical harmonics, introduced in (11.3.38). Thus,
for the poloidal contribution we write
Sm(r, t)Yl (θ, φ).
S(r, t) = (15.1.7)
l m=’l
The spherical harmonics Yl are eigenfunctions of the angular momentum operator
L2Yl (θ, φ) = l(l + 1)Yl (θ, φ) = Ł2Yl (θ, φ).
m m m
Further, from (15.1.5) we can cast the magnetic ¬eld in terms of the vector spherical
harmonics introduced in (11.3.41) as
Ł2 m Ł‚
Sl (r, t) Pm(θ, φ) + rSm(r, t) Bm(θ, φ)
B(x, t) = l l l
r r ‚r
l m=’l

+ŁTl (r, t) Cm(θ, φ) ,

with the poloidal contribution in the core speci¬ed by Sm and the toroidal by Tl .
The magnetic ¬eld in the mantle is purely poloidal and so at the core“mantle
boundary r = rc we require the toroidal component to vanish
T (r, t) = 0, Tl = 0 at r = rc . (15.1.10)
The continuity condition (15.1.3) for the radial component Br leads to
lm re
Sl (rc ) = Gm , (15.1.11)
rc rc
and, by eliminating Gm between the radial and θ components, we require

(rSm) + lSm = 0, at r = rc . (15.1.12)
‚r l l

We have therefore a way of transferring information on the magnetic ¬eld at the
surface into the poloidal part of the magnetic ¬eld in the core.
382 The Core and the Earth™s Dynamo

Figure 15.1. Radial magnetic ¬eld at the Earth™s surface in 1980. [Courtesy of A. Jackson.]

Satellite observations such as those from the CHAMP and Oersted satellites,
speci¬cally designed to make magnetic ¬eld measurements have allowed the
construction of maps of the large scale magnetic ¬eld at the Earth™s surface (see
Figure 10.1). The dominant component of the internal ¬eld is dipolar and it is
this property that is used in the standard compass. However, there are signi¬cant
non-dipole components.
The spectrum of the surface magnetic ¬eld shows a sharp decline with the
angular order of the spherical harmonics l out to order 14, corresponding to a
wavelength of the surface of around 3000 km. For short wavelengths, and thus
larger angular orders, the spectrum shows a slight increase. This is interpreted as
being the result of contributions from the lithosphere, both remanent magnetism
from rocks that have cooled in the magnetic ¬eld prevailing at their time of
formation and ¬elds induced by the main core ¬eld.
The contribution to the surface magnetic ¬eld displayed in Figure 15.1 for
angular orders l up to 12 is consistent with an origin at the core“mantle boundary.
We can therefore extrapolate the magnetic ¬eld back to this boundary using the
potential function •m from (15.1.1). From (15.1.11) the coef¬cients of the
expansion of the magnetic potential at the core“mantle boundary are ampli¬ed by
a factor (re /rc )l+2 compared with Gm. Since re is 6371 km and rc is about 2890
km, re /rc ≈ 2.2 and hence there is a substantial enhancement of higher angular
orders at the top of the core relative to the surface. The result is that the ¬eld at the
core“mantle boundary appears to have much more power at shorter wavelengths
(Figure 15.2) than at the surface. In fact what is happening is that the higher
angular orders generated in the core are severely damped in passage through the
nearly insulating mantle and so make little contribution at the surface.
Direct use of (15.1.12) to create ¬eld maps has a tendency to produce unstable
results. The problem of the creation of the magnetic map at the core surface is
15.1 The magnetic ¬eld at the surface and at the top of the core 383

Figure 15.2. Magnetic ¬eld at the core“mantle boundary reconstructed from the data in
¬gure 15.1, the projections of the continents are indicated for reference. [Courtesy of A.

frequently formulated as constraints on the coef¬cients Sm, with a solution through
the minimisation of a mean-square error.
Although the general pattern of the magnetic ¬eld at the core“mantle boundary
remains similar over time there is evidence for secular variation in time. For
example, the patches of magnetic ¬eld in the Atlantic Ocean near the equator
have been moving steadily westwards over the period for which measurements are
available - producing an apparent westward drift of the north magnetic pole for
stations in the northern hemisphere. Other features of the ¬eld are more static, such
as the ¬eld concentrations at higher latitudes (around ±70—¦ ).
The toroidal component of the core magnetic ¬eld does not escape from the core,
and as a result auxiliary assumptions have to be invoked to deduce the ¬‚ow ¬eld in
the core from the time variation of the magnetic ¬eld at the surface, based on the
very high conductivity of the core ¬‚uid. It is necessary to assume that the ¬‚ow is
large-scale and invoke some approximations as to the relation of the magnetic ¬eld
and the ¬‚ow.
The frozen ¬‚ux approximation discussed in Section 8.3.6 reduces to a relation
for the time derivative of the radial component of the magnetic ¬eld Br in terms of
the horizontal derivative of the product of the velocity ¬eld v and Br,

Br = ’∇H · (vBr). (15.1.13)
This assumption is not suf¬cient to determine the ¬‚ow ¬eld uniquely. The
ambiguity can be reduced by assuming that the horizontal Lorentz forces from the
magnetic ¬eld play a minor role in the dynamics of the ¬‚ow at the core surface. This
assumption of ˜tangential geostrophy™ leads to a further equation for the horizontal
derivative of v,
∇H · (v cos ‘) = 0, (15.1.14)
384 The Core and the Earth™s Dynamo

20 km/yr

Figure 15.3. Flow at the top of the core deduced from the radial component magnetic ¬eld
at the core“mantle boundary and its rate of change. [Courtesy of A. Jackson.]

where ‘ is the colatitude. With observations of Br and its time derivative from
secular variation the ¬‚ow ¬eld v can be estimated from (15.1.13), (15.1.14) with
suitable smoothness assumptions (see, e.g., Jackson, 1997).
The model displayed in Figure 15.3 represents the ¬‚ow ¬eld just below the
core“mantle boundary deduced from the magnetic ¬eld and secular variation
information. The scale of the ¬‚ow is much faster than those of plate tectonics,
with typical velocities of 20 km/yr; this is ¬ve orders of magnitude faster than plate
motions of cm/yr.

15.2 Convection and dynamo action
The time scales of the variations in the Earth™s internal magnetic ¬eld are of the
order of decades, although the core is subjected to the daily rotations of the Earth.
To provide a description of the coupled magnetic and ¬‚uid behaviour that can give
rise to dynamo action, we need therefore to combine the treatment of a rapidly
rotating ¬‚uid from Section 7.6 and the description of magnetic ¬‚uid dynamics from
Section 8.3.6.

15.2.1 Basic equations
The equation of mass conservation, (8.1.1), reduces in the case of an incompressible
medium (Dρ/Dt ≡ 0) to the vanishing of the divergence of the velocity ¬eld

∇ · v = 0. (15.2.1)
15.2 Convection and dynamo action 385

The slightly less restrictive requirement that ‚ρ/‚t = 0 provides the anelastic
approximation (cf. Section 14.1.2):
∇ · (ρv) = 0. (15.2.2)
In each case the removal of the time derivative of density suppresses the possibility
of acoustic disturbances with much faster time scales than the rest of the motion.
The equation of motion for a rotating ¬‚uid takes the form (7.6.3)
‚v ∇p 1
’ © — (© — x) ’ 2© — v + ν∇2v + F,
+ (v · ∇)v = ’ (15.2.3)
‚t ρ ρ
where ν = ·/ρ is the kinematic viscosity.
The force term F in the core arises from gravitational effects and the magnetic
F = ρg + μ0j — B; (15.2.4)
following the treatment in Section 8.3.6 we neglect contributions from
the displacement current ‚D/‚t and thereby suppress electromagnetic wave
phenomena as well as acoustic waves. In this magnetohydrodynamic approxim-
ation, the current j is given by
j= ∇ — B. (15.2.5)
Thus from (8.3.49) the core force term becomes
F = ρg + (∇ — B) — B = ’ρ∇• + (B · ∇)B ’ 1 ∇B2, (15.2.6)

where we have introduced the gravitational potential •.
As in Section 7.6 we can represent the centrifugal force as a gradient
© — (© — x) = ’∇( 1 ρ|© — x|2), (15.2.7)

to absorb the in¬‚uence of the centrifugal force. Thus we can write the equation of
motion as
‚v ∇p
’ ∇( 1 ρ|© — x|2) ’ 2© — v + ν∇2v
+ (v · ∇)v = ’ 2
‚t ρ
’ρ∇• + (B · ∇)B ’ 1 ∇B2. (15.2.8)

When the density is constant, or very slowly varying in space, the various gradient
terms can be combined in terms of an augmented pressure
P = p ’ 1 ρ|© — x|2 ’ ρ•, (15.2.9)

with a consequent form of the Navier“Stokes equation for the rapidly rotating
magnetic ¬‚uid
‚v ∇P 1
’ 2© — v + ν∇2v + (∇ — B) — B.
+ (v · ∇)v = ’ (15.2.10)
‚t ρ ρ
386 The Core and the Earth™s Dynamo

From Maxwell™s equations (8.3.1), the magnetic ¬eld has to satisfy the
requirement of vanishing divergence
∇ · B = 0, (15.2.11)
and the induction equation (8.3.48)
‚B 1
= ∇ — (v — B) + (15.2.12)
‚t σμ0
where we have assumed that spatial variation in the electrical conductivity σ can
be neglected. The magnetic diffusivity » = (μ0σ)˜
For the toroidal and poloidal representation for the core magnetic ¬eld
introduced in (15.1.4), the current is to be found from the curl of the magnetic
¬eld, ∇ — B, which has the representation
∇ — B = ∇ — [’∇2S x] + ∇ — ∇ — [T x]. (15.2.13)
Thus the toroidal current system determines the poloidal magnetic ¬eld and vice
In the highly conducting metallic ¬‚uid of the outer core we will have a small
amount of Joule heating (j2/σ) and also some viscous heating effects. We can
extract an equation for the evolution of temperature from (8.3.53) in a similar form
to (7.1.15):
‚ ‚ ‚ ‚
ρ (CpT ) + vk (CpT ) = h + k T
‚t ‚xk ‚xk ‚xk
^^ (∇ — B)2,
+ 2·DijDij + (15.2.14)
Here k is the thermal conductivity, Cp is the speci¬c heat at constant pressure and
h is the heat generation per unit mass due to radioactivity; · is the viscosity, Dij is
the deviatoric strain-rate tensor and the last term on the right hand side is the Joule
heating contribution. The dominant thermal behaviour is described by the thermal
diffusivity (7.1.17), κH = k/ρCp.
The effects of compositional buoyancy in the core can be captured with a simple
model of a binary alloy of iron and a lighter constituent with mass fraction ζ. The
evolution of the mass fraction is described by a diffusion equation similar to that
for temperature

ζ + v · ∇ζ = hζ + κζ ∇2ζ, (15.2.15)
where we have made the simplifying assumption of uniform diffusivity κζ. hζ
is a source term that represents the in¬‚ux of the lighter component into the ¬‚uid
outer core as it is frozen out from the growth of the inner core. The molecular
diffusivity κζ is very small and so direct transport of the light component will
be rather slow. However, we can regard (15.2.14), (15.2.15) as summarising the
in¬‚uence of small-scale motions for which the diffusivities will be enhanced. Such
15.2 Convection and dynamo action 387

small-scale motions are not required to be isotropic in character, and hence the
diffusivities should strictly be tensor quantities (Braginsky & Roberts, 1995).

Reference state
To complete the picture we need to understand the relation of the density to
temperature and composition. We envisage a reference state in which the density
has an adiabatic radial pro¬le in hydrostatic equilibrium with no magnetic ¬eld, so
’∇P0(r) + ρ0(r)g(r) = 0. (15.2.16)
The variations in density associated with deviations from the equilibrium
temperature T0(r) and compositional ζ0(r) pro¬les due to convective ¬‚ow can then
be approximated by
ρ = ρ0(r) 1 ’ ±th [T ’ T0(r)] + ±ζ[ζ ’ ζ0(r)] , (15.2.17)
where the thermal expansivity ±th and the in¬‚uence of composition ±ζ are
determined by compositional effects
1 ‚ρ 1 ‚ρ
±th = ; ±ζ = . (15.2.18)
ρ ‚T ρ ‚ζ
p,ζ p,T

15.2.2 Boundary conditions
The ¬‚uid outer core lies in the region between the solid inner core at radius ri and
the core“mantle boundary at rc . From the discussion in Section 8.2 the interface
condition between the viscous ¬‚uid and the solids requires no slip as in (8.2.4).
If the radial boundaries are stationary in the rotating reference frame, the ¬‚uid
velocity must vanish at ri , rc :
v = 0, r = ri , rc . (15.2.19)
However, if there is differential rotation between the inner and outer core then the
condition at the inner core boundary needs to be modi¬ed to
v = ωi — x, r = ri , (15.2.20)
where ωi is the angular velocity of the inner core relative to the outer core (and the
The temperature at the inner core boundary will be dictated by the freezing of the
iron alloy, whereas, at the core“mantle boundary we need continuity of radial heat
¬‚ux, which is expected to translate into an approximately constant temperature.
If the heat ¬‚ux at the core“mantle boundary is not uniform, the equi-temperature
surfaces no longer coincide with radii. The state of zero motion, v = 0, does not
then satisfy the equation of motion and so the ¬‚uid must move. The resultant ¬‚ow
is termed free convection.
388 The Core and the Earth™s Dynamo

The constant temperature condition at the inner core boundary is accompanied
by a release of latent heat in freezing and a ¬‚ux of the light constituent. There will
be no mass fraction ¬‚ux into the mantle, so
(∇ζ)r = 0, r = rc . (15.2.21)
All components of the magnetic ¬eld are continuous at both the inner core
boundary and the core“mantle boundary. The mantle has a much lower electrical
conductivity than the metallic core, and to a ¬rst approximation can be treated as
an insulator with no sources of magnetic ¬eld. The magnetic ¬eld in the mantle
Bm can then be determined from
∇2•m = 0,
Bm = ’∇•m, (15.2.22)
with a harmonic solution that vanishes as r ’ ∞, as in (15.1.1). All the
components of the magnetic ¬eld should match at the boundary:
B = Bm, r = rc . (15.2.23)
There is likely to be limited electromagnetic coupling between the mantle and the
outer core, concentrated in the D region, which will slightly modify the form of
The inner core is expected to have a similar conductivity to that of the outer
core and so we need to impose the condition that the horizontal components of the
electric ¬eld should be continuous across r = ri (8.3.14).

15.2.3 Interaction of the ¬‚ow with the magnetic ¬eld
In order for a dynamo to function there must be suf¬cient energy generated to
overcome dissipation. We need therefore to examine the way in which energy is
distributed between the ¬‚ow and the magnetic ¬eld. From (15.2.10) we can extract
the rate of change of the kinetic energy, by taking the scalar product with v, as
ρv = v · (j — B) ’ v · (ρ∇• + ∇P), (15.2.24)
Dt 2
The term v · (j — B) represents the rate at which the magnetic energy is converted
into kinetic energy through the agency of the Lorentz force. The corresponding rate
of change of magnetic energy, comes from (15.2.12)
2 J2
D 1B
= ’v · (j — B) ’ . (15.2.25)

Dt σ˜
In a dynamo the rate of growth of magnetic ¬eld energy from ’v · (j — B) over the
whole core needs to exceed the integrated Joule heating J2/σ from the degradation
of magnetic energy by ohmic resistance. Local balance is not essential, but
suf¬cient of the core has to be building magnetic energy to overcome the total
ohmic dissipation.
Lines of magnetic force will tend to be carried with the ¬‚ow ¬eld in the
electrically conducting ¬‚uid in the core. An azimuthal shear with rate ωs will tend
15.2 Convection and dynamo action 389

to shear the lines of forces in the radial and latitudinal directions (BM) to create a
component in the azimuthal direction (Bφ). This ω-effect creates a zonal ¬eld Bφ


. 14
( 16)