Since this must hold for every r, we will need

A(0) = 0; B(0) = 0.

Given the guess (19.272), the derivatives that appear in (19.271) are

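$$
\begin{aligned}
\frac{1}{P}\frac{\partial P}{\partial r} &= -B(N) \\
\frac{1}{P}\frac{\partial^2 P}{\partial r^2} &= B(N)^2 \\
\frac{1}{P}\frac{\partial P}{\partial N} &= A'(N) - B'(N)\,r.
\end{aligned}
$$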

Substituting these derivatives in (19.271),

$$
-B(N)\phi(\bar{r} - r) + \frac{1}{2}B(N)^2\sigma_r^2 - A'(N) + B'(N)\,r - r = -B(N)\sigma_r\sigma_\Lambda .
$$

This equation has to hold for every r, so the terms multiplying r and the constant terms must

separately be zero.

$$
\begin{aligned}
A'(N) &= \frac{1}{2}B(N)^2\sigma_r^2 - \left(\phi\bar{r} - \sigma_r\sigma_\Lambda\right)B(N) \\
B'(N) &= 1 - B(N)\phi.
\end{aligned}
\tag{19.275}
$$

CHAPTER 19 TERM STRUCTURE OF INTEREST RATES

We can solve this pair of ordinary differential equations by simple integration. The second one is

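$$
\begin{aligned}
\frac{dB}{dN} &= 1 - \phi B \\
\int \frac{dB}{1 - \phi B} &= \int dN \\
-\frac{1}{\phi}\ln\left(1 - \phi B\right) &= N
\end{aligned}
$$

and hence

$$
B(N) = \frac{1}{\phi}\left(1 - e^{-\phi N}\right). \tag{19.276}
$$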

Note B(0) = 0 so we did not need a constant in the integration.

We solve the first equation in (19.275) by simply integrating it, and choosing the constant to set $A(0) = 0$. Here we go.

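$$
\begin{aligned}
A'(N) &= \frac{1}{2}B(N)^2\sigma_r^2 - \left(\phi\bar{r} - \sigma_r\sigma_\Lambda\right)B(N) \\
A(N) &= \frac{\sigma_r^2}{2}\int B(N)^2\,dN - \left(\phi\bar{r} - \sigma_r\sigma_\Lambda\right)\int B(N)\,dN + C \\
A(N) &= \frac{\sigma_r^2}{2\phi^2}\int\left(1 - 2e^{-\phi N} + e^{-2\phi N}\right)dN - \left(\bar{r} - \frac{\sigma_r\sigma_\Lambda}{\phi}\right)\int\left(1 - e^{-\phi N}\right)dN + C \\
A(N) &= \frac{\sigma_r^2}{2\phi^2}\left(N + \frac{2e^{-\phi N}}{\phi} - \frac{e^{-2\phi N}}{2\phi}\right) - \left(\bar{r} - \frac{\sigma_r\sigma_\Lambda}{\phi}\right)\left(N + \frac{e^{-\phi N}}{\phi}\right) + C
\end{aligned}
$$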

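We pick the constant of integration to give $A(0) = 0$. You can do this explicitly, or figure out directly that the result is achieved by subtracting one from the $e^{-\phi N}$ and $e^{-2\phi N}$ terms,

$$
A(N) = \frac{\sigma_r^2}{2\phi^2}\left(N + \frac{2\left(e^{-\phi N} - 1\right)}{\phi} - \frac{e^{-2\phi N} - 1}{2\phi}\right) - \left(\bar{r} - \frac{\sigma_r\sigma_\Lambda}{\phi}\right)\left(N + \frac{e^{-\phi N} - 1}{\phi}\right).
$$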

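Now, we just have to make it pretty. I'm aiming for the form given in (19.274). Note

$$
\begin{aligned}
B(N)^2 &= \frac{1}{\phi^2}\left(1 - 2e^{-\phi N} + e^{-2\phi N}\right) \\
\phi B(N)^2 &= 2\,\frac{1 - e^{-\phi N}}{\phi} + \frac{e^{-2\phi N} - 1}{\phi} \\
\phi B(N)^2 - 2B(N) &= \frac{e^{-2\phi N} - 1}{\phi}.
\end{aligned}
$$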


SECTION 19.5 THREE LINEAR TERM STRUCTURE MODELS

Then

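$$
\begin{aligned}
A(N) &= \frac{\sigma_r^2}{2\phi^2}\left(N - 2B(N) - \frac{\phi}{2}B(N)^2 + B(N)\right) - \left(\bar{r} - \frac{\sigma_r\sigma_\Lambda}{\phi}\right)\left(N - B(N)\right) \\
A(N) &= -\frac{\sigma_r^2}{4\phi}B(N)^2 - \left(\bar{r} - \frac{\sigma_r\sigma_\Lambda}{\phi} - \frac{\sigma_r^2}{2\phi^2}\right)\left(N - B(N)\right).
\end{aligned}
$$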

We're done.
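As a sanity check on the algebra, the closed forms can be compared with a direct numerical integration of the two ordinary differential equations in (19.275). The sketch below is only illustrative: the function names, the fixed-step Runge-Kutta integrator and the parameter values used to exercise it are assumptions made for the example, not part of the text.

```python
import math

def vasicek_closed_form(N, phi, rbar, sigma_r, sigma_L):
    """A(N) and B(N) from the closed-form expressions derived above."""
    B = (1.0 - math.exp(-phi * N)) / phi
    A = (-(sigma_r ** 2) / (4.0 * phi) * B ** 2
         - (rbar - sigma_r * sigma_L / phi - sigma_r ** 2 / (2.0 * phi ** 2)) * (N - B))
    return A, B

def vasicek_ode(N, phi, rbar, sigma_r, sigma_L, steps=2000):
    """Integrate (19.275) with classical RK4, starting from A(0) = B(0) = 0."""
    def rhs(A, B):
        dA = 0.5 * B ** 2 * sigma_r ** 2 - (phi * rbar - sigma_r * sigma_L) * B
        dB = 1.0 - phi * B
        return dA, dB

    h = N / steps
    A = B = 0.0
    for _ in range(steps):
        k1 = rhs(A, B)
        k2 = rhs(A + 0.5 * h * k1[0], B + 0.5 * h * k1[1])
        k3 = rhs(A + 0.5 * h * k2[0], B + 0.5 * h * k2[1])
        k4 = rhs(A + h * k3[0], B + h * k3[1])
        A += h / 6.0 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        B += h / 6.0 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
    return A, B

# Hypothetical parameter values, chosen only to exercise the formulas.
params = dict(phi=0.2, rbar=0.05, sigma_r=0.02, sigma_L=0.3)
A_cf, B_cf = vasicek_closed_form(10.0, **params)
A_num, B_num = vasicek_ode(10.0, **params)
```

The two pairs should agree to within the integrator's truncation error; a disagreement would point to a slip in the "make it pretty" steps rather than in the differential equations themselves.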

19.5.2 Vasicek model by expectation

What if we solve the discount rate forward and take an expectation instead? The Vasicek

model is simple enough that we can follow this approach as well, and get the same analytic

solution. The same methods work for the other models, but the algebra gets steadily worse.

The model is

$$
\frac{d\Lambda}{\Lambda} = -r\,dt - \sigma_\Lambda\,dz \tag{19.277}
$$

$$
dr = \phi(\bar{r} - r)\,dt + \sigma_r\,dz. \tag{19.278}
$$

The bond price is

$$
P_0^{(N)} = E_0\left(\frac{\Lambda_N}{\Lambda_0}\right). \tag{19.279}
$$

I use 0 and N rather than t and t + N to save a little bit on notation.

To find the expectation in (19.279), we have to solve the system (19.277)-(19.278) forward. The steps are simple, though the algebra is a bit daunting. First, we solve $r$ forward. Then, we solve $\Lambda$ forward. $\ln\Lambda_t$ turns out to be conditionally normal, so the expectation in (19.279) is the expectation of a lognormal. Collecting terms in the resulting expectation that depend on $r_0$ as the $B(N)$ term, and the constant term as the $A(N)$ term, we find the same solution as (19.273)-(19.274).

The interest rate is just an AR(1). By analogy with a discrete time AR(1) you can guess

that its solution is

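$$
r_t = \int_{s=0}^{t} e^{-\phi(t-s)}\sigma_r\,dz_s + e^{-\phi t}r_0 + \left(1 - e^{-\phi t}\right)\bar{r}. \tag{19.280}
$$

To derive this solution, define $\tilde{r}$ by

$$
\tilde{r}_t = e^{\phi t}\left(r_t - \bar{r}\right).
$$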


Then,

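$$
\begin{aligned}
d\tilde{r}_t &= \phi\tilde{r}_t\,dt + e^{\phi t}\,dr_t \\
d\tilde{r}_t &= \phi\tilde{r}_t\,dt + e^{\phi t}\phi(\bar{r} - r_t)\,dt + e^{\phi t}\sigma_r\,dz_t \\
d\tilde{r}_t &= \phi\tilde{r}_t\,dt - e^{\phi t}\phi e^{-\phi t}\tilde{r}_t\,dt + e^{\phi t}\sigma_r\,dz_t \\
d\tilde{r}_t &= e^{\phi t}\sigma_r\,dz_t .
\end{aligned}
$$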

This equation is easy to solve,

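$$
\begin{aligned}
\tilde{r}_t - \tilde{r}_0 &= \sigma_r\int_{s=0}^{t} e^{\phi s}\,dz_s \\
e^{\phi t}(r_t - \bar{r}) - (r_0 - \bar{r}) &= \sigma_r\int_{s=0}^{t} e^{\phi s}\,dz_s \\
r_t - \bar{r} &= e^{-\phi t}(r_0 - \bar{r}) + \sigma_r\int_{s=0}^{t} e^{-\phi(t-s)}\,dz_s .
\end{aligned}
$$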

And we have (19.280).
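The AR(1) analogy can be made concrete. From (19.280), the conditional mean of $r_t$ is $e^{-\phi t}r_0 + (1 - e^{-\phi t})\bar{r}$ and the conditional variance is $\sigma_r^2\int_0^t e^{-2\phi(t-s)}ds = \sigma_r^2(1 - e^{-2\phi t})/(2\phi)$. The sketch below, with hypothetical function names and parameter values, checks that iterating the implied discrete-time AR(1) over short steps reproduces these continuous-time moments exactly.

```python
import math

def vasicek_moments(t, r0, phi, rbar, sigma_r):
    """Conditional mean and variance of r_t implied by (19.280)."""
    decay = math.exp(-phi * t)
    mean = decay * r0 + (1.0 - decay) * rbar
    var = sigma_r ** 2 * (1.0 - math.exp(-2.0 * phi * t)) / (2.0 * phi)
    return mean, var

def ar1_iterate(n, dt, r0, phi, rbar, sigma_r):
    """Iterate the discrete AR(1) with rho = exp(-phi*dt) for n steps.

    Each step is r' = rbar + rho*(r - rbar) + shock, where the shock
    variance is the one-step variance from vasicek_moments.
    """
    rho = math.exp(-phi * dt)
    shock_var = sigma_r ** 2 * (1.0 - rho ** 2) / (2.0 * phi)
    mean, var = r0, 0.0
    for _ in range(n):
        mean = rbar + rho * (mean - rbar)
        var = rho ** 2 * var + shock_var
    return mean, var

# Hypothetical values: ten years in monthly steps.
direct = vasicek_moments(10.0, r0=0.03, phi=0.2, rbar=0.05, sigma_r=0.02)
stepped = ar1_iterate(120, 1.0 / 12.0, r0=0.03, phi=0.2, rbar=0.05, sigma_r=0.02)
```

The agreement is exact up to rounding, not an approximation that improves with smaller steps, because the solution (19.280) is itself an exact AR(1) at any sampling interval.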

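Now, we solve the discount factor process forward. It isn't pretty, but it is straightforward.

$$
\begin{aligned}
d\ln\Lambda_t &= \frac{d\Lambda}{\Lambda} - \frac{1}{2}\frac{d\Lambda^2}{\Lambda^2} = -\left(r_t + \frac{1}{2}\sigma_\Lambda^2\right)dt - \sigma_\Lambda\,dz_t \\
\ln\Lambda_t - \ln\Lambda_0 &= -\int_{s=0}^{t}\left(r_s + \frac{1}{2}\sigma_\Lambda^2\right)ds - \sigma_\Lambda\int_{s=0}^{t}dz_s .
\end{aligned}
$$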

Plugging in the interest rate solution (19.280),

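$$
\ln\Lambda_t - \ln\Lambda_0 = -\int_{s=0}^{t}\left[\left(\int_{u=0}^{s} e^{-\phi(s-u)}\sigma_r\,dz_u\right) + e^{-\phi s}(r_0 - \bar{r}) + \bar{r} + \frac{1}{2}\sigma_\Lambda^2\right]ds - \sigma_\Lambda\int_{s=0}^{t}dz_s
$$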

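Interchanging the order of the first integral, evaluating the easy $ds$ integrals and rearranging,

$$
\begin{aligned}
\ln\Lambda_t - \ln\Lambda_0 &= -\sigma_\Lambda\int_{s=0}^{t}dz_s - \sigma_r\int_{u=0}^{t}\left[\int_{s=u}^{t} e^{-\phi(s-u)}\,ds\right]dz_u - \left[\left(\bar{r} + \frac{1}{2}\sigma_\Lambda^2\right)t + (r_0 - \bar{r})\int_{s=0}^{t} e^{-\phi s}\,ds\right] \\
&= -\int_{u=0}^{t}\left[\sigma_\Lambda + \frac{\sigma_r}{\phi}\left(1 - e^{-\phi(t-u)}\right)\right]dz_u - \left(\bar{r} + \frac{1}{2}\sigma_\Lambda^2\right)t - (r_0 - \bar{r})\frac{1 - e^{-\phi t}}{\phi}.
\end{aligned}
\tag{19.281}
$$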

The first integral has a deterministic function of time $u$. This gives rise to a normally distributed random variable; it's just a weighted sum of independent normals $dz_u$:

$$
\int_{u=0}^{t} f(u)\,dz_u \sim N\left(0, \int_{u=0}^{t} f^2(u)\,du\right).
$$


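Thus, $\ln\Lambda_t - \ln\Lambda_0$ is normally distributed with mean given by the second set of terms in (19.281) and variance

$$
\begin{aligned}
\mathrm{var}_0\left(\ln\Lambda_t - \ln\Lambda_0\right) &= \int_{u=0}^{t}\left[\sigma_\Lambda + \frac{\sigma_r}{\phi}\left(1 - e^{-\phi(t-u)}\right)\right]^2 du \\
&= \int_{u=0}^{t}\left[\left(\sigma_\Lambda + \frac{\sigma_r}{\phi}\right)^2 - 2\frac{\sigma_r}{\phi}\left(\sigma_\Lambda + \frac{\sigma_r}{\phi}\right)e^{-\phi(t-u)} + \frac{\sigma_r^2}{\phi^2}e^{-2\phi(t-u)}\right]du \\
&= \left(\sigma_\Lambda + \frac{\sigma_r}{\phi}\right)^2 t - 2\frac{\sigma_r}{\phi^2}\left(\sigma_\Lambda + \frac{\sigma_r}{\phi}\right)\left(1 - e^{-\phi t}\right) + \frac{\sigma_r^2}{2\phi^3}\left(1 - e^{-2\phi t}\right).
\end{aligned}
\tag{19.282}
$$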

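Since we have the distribution of $\Lambda_N$ we are ready to take the expectation.

$$
\ln P(N, 0) = \ln E_0\left(e^{\ln\Lambda_N - \ln\Lambda_0}\right) = E_0\left(\ln\Lambda_N - \ln\Lambda_0\right) + \frac{1}{2}\sigma_0^2\left(\ln\Lambda_N - \ln\Lambda_0\right).
$$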

Plugging in the mean from (19.281) and the variance from (19.282)

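$$
\ln P_0^{(N)} = -\left[\left(\bar{r} + \frac{1}{2}\sigma_\Lambda^2\right)N + (r_0 - \bar{r})\frac{1 - e^{-\phi N}}{\phi}\right] \tag{19.283}
$$

$$
\qquad + \frac{1}{2}\left(\frac{\sigma_r}{\phi} + \sigma_\Lambda\right)^2 N - \frac{\sigma_r}{\phi^2}\left(\frac{\sigma_r}{\phi} + \sigma_\Lambda\right)\left(1 - e^{-\phi N}\right) + \frac{\sigma_r^2}{4\phi^3}\left(1 - e^{-2\phi N}\right). \tag{19.284}
$$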

All that remains is to make it pretty. To compare it with our previous result, we want to express it in the form $\ln P(N, r_0) = A(N) - B(N)r_0$. The coefficient on $r_0$ in (19.283) gives

$$
B(N) = \frac{1 - e^{-\phi N}}{\phi}, \tag{19.285}
$$

the same expression we derived from the partial differential equation.

To simplify the constant term, recall that (19.285) implies

$$
\frac{1 - e^{-2\phi N}}{\phi} = -\phi B(N)^2 + 2B(N).
$$


Thus, the constant term (the terms that do not multiply r0 ) in (19.283) is

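$$
\begin{aligned}
A(N) &= -\left[\left(\bar{r} + \frac{1}{2}\sigma_\Lambda^2\right)N - \bar{r}\,\frac{1 - e^{-\phi N}}{\phi}\right] \\
&\quad + \frac{1}{2}\left(\frac{\sigma_r}{\phi} + \sigma_\Lambda\right)^2 N - \frac{\sigma_r}{\phi^2}\left(\frac{\sigma_r}{\phi} + \sigma_\Lambda\right)\left(1 - e^{-\phi N}\right) + \frac{\sigma_r^2}{4\phi^3}\left(1 - e^{-2\phi N}\right) \\
A(N) &= -\left[\left(\bar{r} + \frac{1}{2}\sigma_\Lambda^2\right)N - \bar{r}B(N)\right] \\
&\quad + \frac{1}{2}\left(\frac{\sigma_r}{\phi} + \sigma_\Lambda\right)^2 N - \frac{\sigma_r}{\phi}\left(\frac{\sigma_r}{\phi} + \sigma_\Lambda\right)B(N) - \frac{\sigma_r^2}{4\phi^2}\left(\phi B(N)^2 - 2B(N)\right) \\
A(N) &= \left(\frac{1}{2}\frac{\sigma_r^2}{\phi^2} + \frac{\sigma_r\sigma_\Lambda}{\phi} - \bar{r}\right)\left(N - B(N)\right) - \frac{\sigma_r^2}{4\phi^2}\,\phi B(N)^2 .
\end{aligned}
$$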

Again, this is the same expression we derived from the partial differential equation.
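The equivalence of the two routes is easy to confirm numerically. The sketch below evaluates the log price once as the lognormal expectation (mean from (19.281) plus half the variance from (19.282)) and once as $A(N) - B(N)r_0$. Function names and the parameter values are assumptions made for the illustration.

```python
import math

def log_price_by_expectation(N, r0, phi, rbar, sigma_r, sigma_L):
    """ln P = E_0(ln Lambda_N - ln Lambda_0) + 0.5 * var_0(ln Lambda_N - ln Lambda_0)."""
    decay = (1.0 - math.exp(-phi * N)) / phi
    mean = -((rbar + 0.5 * sigma_L ** 2) * N + (r0 - rbar) * decay)
    k = sigma_L + sigma_r / phi
    var = (k ** 2 * N
           - 2.0 * (sigma_r / phi ** 2) * k * (1.0 - math.exp(-phi * N))
           + sigma_r ** 2 / (2.0 * phi ** 3) * (1.0 - math.exp(-2.0 * phi * N)))
    return mean + 0.5 * var

def log_price_affine(N, r0, phi, rbar, sigma_r, sigma_L):
    """ln P = A(N) - B(N) r0 with the simplified A(N) and B(N)."""
    B = (1.0 - math.exp(-phi * N)) / phi
    A = ((0.5 * sigma_r ** 2 / phi ** 2 + sigma_r * sigma_L / phi - rbar) * (N - B)
         - sigma_r ** 2 / (4.0 * phi) * B ** 2)
    return A - B * r0

# Hypothetical parameter values for a five-year bond.
args = dict(N=5.0, r0=0.03, phi=0.2, rbar=0.05, sigma_r=0.02, sigma_L=0.3)
lp_expectation = log_price_by_expectation(**args)
lp_affine = log_price_affine(**args)
```

Note how the $\frac{1}{2}\sigma_\Lambda^2 N$ term in the mean cancels against part of the variance; only the cross term $\sigma_r\sigma_\Lambda/\phi$ survives, which is where the market price of risk enters bond prices.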

This integration is usually expressed under the risk-neutral measure. If we write the risk-neutral process

$$
\begin{aligned}
\frac{d\Lambda}{\Lambda} &= -r\,dt \\
dr &= \left[\phi(\bar{r} - r) - \sigma_r\sigma_\Lambda\right]dt + \sigma_r\,dz,
\end{aligned}
$$

then the bond price is

$$
P_0^{(N)} = E\,e^{-\int_{s=0}^{N} r_s\,ds}.
$$

The result is the same, of course.

19.5.3 Cox Ingersoll Ross Model

For the Cox-Ingersoll-Ross (1985) model

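$$
\begin{aligned}
\frac{d\Lambda}{\Lambda} &= -r\,dt - \sigma_\Lambda\sqrt{r}\,dz \\
dr &= \phi(\bar{r} - r)\,dt + \sigma_r\sqrt{r}\,dz
\end{aligned}
$$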

our differential equation (19.269) becomes

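$$
\frac{\partial P}{\partial r}\phi(\bar{r} - r) + \frac{1}{2}\frac{\partial^2 P}{\partial r^2}\sigma_r^2 r - \frac{\partial P}{\partial N} - rP = \frac{\partial P}{\partial r}\sigma_r\sigma_\Lambda r. \tag{19.286}
$$

Guess again that log prices are a linear function of the short rate,

$$
P(N, r) = e^{A(N) - B(N)r}. \tag{19.287}
$$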


Substituting the derivatives of (19.287) into (19.286),

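$$
-B(N)\phi(\bar{r} - r) + \frac{1}{2}B(N)^2\sigma_r^2 r - A'(N) + B'(N)\,r - r = -B(N)\sigma_r\sigma_\Lambda r.
$$

Again, the coefficients on the constant and on the terms in $r$ must separately be zero,

$$
\begin{aligned}
B'(N) &= 1 - \frac{1}{2}\sigma_r^2 B(N)^2 - \left(\sigma_r\sigma_\Lambda + \phi\right)B(N) \\
A'(N) &= -B(N)\phi\bar{r}.
\end{aligned}
\tag{19.288}
$$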

The ordinary differential equations (19.288) are quite similar to the Vasicek case, (19.275).

However, now the variance terms multiply an r, so the B(N ) differential equation has the

extra B(N)2 term. We can still solve both differential equations, though the algebra is a little

bit more complicated. The result is

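$$
\begin{aligned}
B(N) &= \frac{2\left(e^{\gamma N} - 1\right)}{(\gamma + \phi + \sigma_r\sigma_\Lambda)\left(e^{\gamma N} - 1\right) + 2\gamma} \\
A(N) &= \frac{\phi\bar{r}}{\sigma_r^2}\left(2\ln\left(\frac{2\gamma}{\psi\left(e^{\gamma N} - 1\right) + 2\gamma}\right) + \psi N\right)
\end{aligned}
$$

where

$$
\begin{aligned}
\gamma &= \sqrt{(\phi + \sigma_r\sigma_\Lambda)^2 + 2\sigma_r^2} \\
\psi &= \phi + \sigma_\Lambda\sigma_r + \gamma .
\end{aligned}
$$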

The CIR model can also be solved by expectation. In fact, this is how Cox, Ingersoll and Ross (1985) actually solve it; their marginal value of wealth $J_W$ is the same thing as the discount factor. However, where the interest rate in the Vasicek model was a simple conditional normal, the interest rate now has a non-central $\chi^2$ distribution, so taking the integral is a little messier.
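The CIR closed form can be checked the same way as the Vasicek one, by integrating the ordinary differential equations (19.288) numerically from $A(0) = 0$, $B(0) = 0$. In the sketch below the closed form is written with $B(N) > 0$, which is what $B(0) = 0$ and $B'(0) = 1$ require. The function names, the fixed-step Runge-Kutta scheme and the parameter values are assumptions for the illustration.

```python
import math

def cir_closed_form(N, phi, rbar, sigma_r, sigma_L):
    """Closed-form A(N), B(N) for the CIR model, with B(N) > 0 for N > 0."""
    kappa = phi + sigma_r * sigma_L
    gamma = math.sqrt(kappa ** 2 + 2.0 * sigma_r ** 2)
    psi = kappa + gamma
    denom = psi * (math.exp(gamma * N) - 1.0) + 2.0 * gamma
    B = 2.0 * (math.exp(gamma * N) - 1.0) / denom
    A = phi * rbar / sigma_r ** 2 * (2.0 * math.log(2.0 * gamma / denom) + psi * N)
    return A, B

def cir_ode(N, phi, rbar, sigma_r, sigma_L, steps=4000):
    """Integrate (19.288) with classical RK4 from A(0) = B(0) = 0."""
    def rhs(A, B):
        dB = 1.0 - 0.5 * sigma_r ** 2 * B ** 2 - (sigma_r * sigma_L + phi) * B
        dA = -B * phi * rbar
        return dA, dB

    h = N / steps
    A = B = 0.0
    for _ in range(steps):
        k1 = rhs(A, B)
        k2 = rhs(A + 0.5 * h * k1[0], B + 0.5 * h * k1[1])
        k3 = rhs(A + 0.5 * h * k2[0], B + 0.5 * h * k2[1])
        k4 = rhs(A + h * k3[0], B + h * k3[1])
        A += h / 6.0 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        B += h / 6.0 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
    return A, B

# Hypothetical parameter values.
cir_params = dict(phi=0.3, rbar=0.05, sigma_r=0.1, sigma_L=0.5)
A_cf, B_cf = cir_closed_form(10.0, **cir_params)
A_num, B_num = cir_ode(10.0, **cir_params)
```

The extra $B(N)^2$ term makes the $B$ equation a Riccati equation, which is why the solution involves $\gamma$ rather than $\phi$ alone.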

19.5.4 Multifactor affine models

The Vasicek and CIR models are special cases of the affine class of term structure models (Duffie and Kan 1996, Dai and Singleton 1999). These models allow multiple factors, meaning all bond yields are not just a function of the short rate. Affine models maintain the convenient form that log bond prices are linear functions of the state variables. This means that we can take $K$ bond yields themselves as the state variables, and the yields will reveal anything of interest in the hidden state variables. The short rate and its volatility will be forecast by lagged short rates but also by lagged long rates or interest rate spreads. My presentation and notation are similar to Dai and Singleton's, but as usual I add the discount factor explicitly.


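Here is the affine model setup:

$$
dy = \phi(\bar{y} - y)\,dt + \Sigma\,dw \tag{19.289}
$$

$$
r = \delta_0 + \delta' y \tag{19.290}
$$

$$
\frac{d\Lambda}{\Lambda} = -r\,dt - b_\Lambda'\,dw \tag{19.291}
$$

$$
dw_i = \sqrt{\alpha_i + \beta_i' y}\;dz_i; \quad E(dz_i\,dz_j) = 0. \tag{19.292}
$$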

Equation (19.289) describes the evolution of the state variables. In the end, yields will be linear functions of the state variables, so we can take the state variables to be yields; thus we use the letter $y$. $y$ denotes a $K$-dimensional vector of state variables. $\phi$ is now a $K \times K$ matrix, $\bar{y}$ is a $K$-dimensional vector, and $\Sigma$ is a $K \times K$ matrix. Equation (19.290) describes the mean of the discount factor or short rate as a linear function of the state variables. Equation (19.291) is the discount factor. $b_\Lambda$ is a $K$-dimensional vector that describes how the discount factor responds to the $K$ shocks. The more $\Lambda$ responds to a shock, the higher the market price of risk of that shock. Equation (19.292) describes the shocks $dw$. The functional form nests the CIR square root type models if $\alpha_i = 0$ and the Vasicek type Gaussian process if $\beta_i = 0$. You can't pick $\alpha_i$ and $\beta_i$ arbitrarily, as you have to make sure that $\alpha_i + \beta_i' y > 0$ for all values of $y$ that the process can attain. Dai and Singleton characterize this “admissibility” criterion.

We find bond prices in the affine setup following exactly the same steps as for the Vasicek and CIR models. Again, we guess that log prices are linear functions of the state variables $y$,

$$
P(N, y) = e^{A(N) - B(N)' y}.
$$

We apply Ito's lemma to this guess, and substitute in the basic bond pricing equation (19.268). We obtain ordinary differential equations that $A(N)$ and $B(N)$ must satisfy,

$$
\frac{\partial B(N)}{\partial N} = -\phi' B(N) - \sum_i\left(B(N)_i\,b_{\Lambda i} + \frac{1}{2}\left[\Sigma' B(N)\right]_i^2\right)\beta_i + \delta \tag{19.293}
$$

$$
\frac{\partial A(N)}{\partial N} = \sum_i\left(B(N)_i\,b_{\Lambda i} + \frac{1}{2}\left[\Sigma' B(N)\right]_i^2\right)\alpha_i - B(N)'\phi\bar{y} - \delta_0 . \tag{19.294}
$$

I use the notation $[x]_i$ to denote the $i$th element of a vector $x$. As with the CIR and Vasicek models, these are ordinary differential equations that can be solved by integration starting with $A(0) = 0$, $B(0) = 0$. While they do not always have analytical solutions, they are quick to solve numerically, much quicker than solving a partial differential equation.
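To illustrate how little work the numerical route takes, here is a minimal pure-Python sketch that integrates (19.293) and (19.294), exactly as they are printed above, with a fixed-step Runge-Kutta scheme. The dictionary layout, the function names and the parameter values are all assumptions made for the example. The only check performed is the one-factor Gaussian case with $b_\Lambda = 0$, where the result should match the Vasicek closed form with $\sigma_\Lambda = 0$.

```python
import math

def affine_rhs(B, m):
    """Right-hand sides of (19.294) and (19.293) for a model dict m."""
    K = len(B)
    # [Sigma' B]_i and [phi' B]_i
    SB = [sum(m["Sigma"][j][i] * B[j] for j in range(K)) for i in range(K)]
    phiT_B = [sum(m["phi"][j][i] * B[j] for j in range(K)) for i in range(K)]
    # Common bracket: B_i b_Lambda_i + 0.5 [Sigma' B]_i^2
    g = [B[i] * m["b_Lambda"][i] + 0.5 * SB[i] ** 2 for i in range(K)]
    dB = [-phiT_B[k] - sum(g[i] * m["beta"][i][k] for i in range(K)) + m["delta"][k]
          for k in range(K)]
    B_phi_ybar = sum(B[i] * m["phi"][i][j] * m["ybar"][j]
                     for i in range(K) for j in range(K))
    dA = sum(g[i] * m["alpha"][i] for i in range(K)) - B_phi_ybar - m["delta0"]
    return [dA] + dB

def solve_affine(N, m, steps=2000):
    """Integrate from A(0) = 0, B(0) = 0 with classical RK4; returns (A, B)."""
    K = len(m["delta"])
    state = [0.0] * (K + 1)  # [A, B_1, ..., B_K]
    f = lambda s: affine_rhs(s[1:], m)
    h = N / steps
    for _ in range(steps):
        k1 = f(state)
        k2 = f([s + 0.5 * h * k for s, k in zip(state, k1)])
        k3 = f([s + 0.5 * h * k for s, k in zip(state, k2)])
        k4 = f([s + h * k for s, k in zip(state, k3)])
        state = [s + h / 6.0 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4)]
    return state[0], state[1:]

def bond_price(N, y, m):
    """P(N, y) = exp(A(N) - B(N)'y)."""
    A, B = solve_affine(N, m)
    return math.exp(A - sum(Bi * yi for Bi, yi in zip(B, y)))

# Hypothetical one-factor Gaussian model: alpha = 1, beta = 0, b_Lambda = 0.
gaussian = dict(phi=[[0.2]], ybar=[0.05], Sigma=[[0.02]], b_Lambda=[0.0],
                alpha=[1.0], beta=[[0.0]], delta0=0.0, delta=[1.0])
A_num, B_num = solve_affine(10.0, gaussian)
```

With more factors the same routine applies unchanged; only the lists in the model dictionary grow. A production version would more likely use an adaptive solver from a numerical library.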

Derivation

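To derive (19.294) and (19.293), we start with the basic bond pricing equation (19.268), which I repeat here,

$$
E_t\left(\frac{dP}{P}\right) - \left(\frac{1}{P}\frac{\partial P}{\partial N} + r\right)dt = -E_t\left(\frac{dP}{P}\frac{d\Lambda}{\Lambda}\right). \tag{19.295}
$$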


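We need $dP/P$,

$$
\frac{dP}{P} = \frac{1}{P}\frac{\partial P}{\partial y'}\,dy + \frac{1}{2}\,dy'\,\frac{1}{P}\frac{\partial^2 P}{\partial y\,\partial y'}\,dy.
$$

The derivatives are

$$
\begin{aligned}
\frac{1}{P}\frac{\partial P}{\partial y} &= -B(N) \\
\frac{1}{P}\frac{\partial^2 P}{\partial y\,\partial y'} &= B(N)B(N)' \\
\frac{1}{P}\frac{\partial P}{\partial N} &= \frac{\partial A(N)}{\partial N} - \frac{\partial B(N)'}{\partial N}\,y.
\end{aligned}
$$

Thus, the first term in (19.295) is

$$
E_t\left(\frac{dP}{P}\right) = -B(N)'\phi(\bar{y} - y)\,dt + \frac{1}{2}E_t\left(dw'\,\Sigma' B(N)B(N)'\Sigma\,dw\right).
$$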

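$E_t(dw_i\,dw_j) = 0$ for $i \neq j$, which allows us to simplify the last term. If $w_1 w_2 = 0$, then,

$$
w'bb'w = \begin{bmatrix} w_1 & w_2 \end{bmatrix}\begin{bmatrix} b_1 b_1 & b_1 b_2 \\ b_2 b_1 & b_2 b_2 \end{bmatrix}\begin{bmatrix} w_1 \\ w_2 \end{bmatrix} = b_1^2 w_1^2 + b_2^2 w_2^2 = \sum_i b_i^2 w_i^2 .
$$

Applying the same algebra to our case,

$$
E_t\left(dw'\,\Sigma' B(N)B(N)'\Sigma\,dw\right) = \sum_i \left[\Sigma' B(N)\right]_i^2 dw_i^2 = \sum_i \left[\Sigma' B(N)\right]_i^2\left(\alpha_i + \beta_i' y\right)dt.
$$

I use the notation $[x]_i$ to denote the $i$th element of the $K$-dimensional vector $x$. In sum, we have

$$
E_t\left(\frac{dP}{P}\right) = -B(N)'\phi(\bar{y} - y)\,dt + \frac{1}{2}\sum_i \left[\Sigma' B(N)\right]_i^2\left(\alpha_i + \beta_i' y\right)dt. \tag{19.296}
$$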

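The right hand side term in (19.295) is

$$
-E_t\left(\frac{dP}{P}\frac{d\Lambda}{\Lambda}\right) = -B(N)'\,dw\,dw'\,b_\Lambda .
$$

$dw\,dw'$ is a diagonal matrix with elements $\left(\alpha_i + \beta_i' y\right)$. Thus,

$$
-E_t\left(\frac{dP}{P}\frac{d\Lambda}{\Lambda}\right) = -\sum_i B(N)_i\,b_{\Lambda i}\left(\alpha_i + \beta_i' y\right). \tag{19.297}
$$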


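Now, substituting (19.296) and (19.297) in (19.295), along with the easier $\partial P/\partial N$ central term, we get

$$
-B(N)'\phi(\bar{y} - y) + \frac{1}{2}\sum_i \left[\Sigma' B(N)\right]_i^2\left(\alpha_i + \beta_i' y\right) - \left(\frac{\partial A(N)}{\partial N} - \frac{\partial B(N)'}{\partial N}\,y + \delta_0 + \delta' y\right) = -\sum_i B(N)_i\,b_{\Lambda i}\left(\alpha_i + \beta_i' y\right).
$$

Once again, the terms on the constant and each $y_i$ must separately be zero. The constant term:

$$
\begin{aligned}
-B(N)'\phi\bar{y} + \frac{1}{2}\sum_i \left[\Sigma' B(N)\right]_i^2\alpha_i - \frac{\partial A(N)}{\partial N} - \delta_0 &= -\sum_i B(N)_i\,b_{\Lambda i}\,\alpha_i \\
\frac{\partial A(N)}{\partial N} &= \sum_i\left(B(N)_i\,b_{\Lambda i} + \frac{1}{2}\left[\Sigma' B(N)\right]_i^2\right)\alpha_i - B(N)'\phi\bar{y} - \delta_0 .
\end{aligned}
$$

The terms multiplying $y$:

$$
B(N)'\phi y + \frac{1}{2}\sum_i \left[\Sigma' B(N)\right]_i^2\beta_i' y + \frac{\partial B(N)'}{\partial N}\,y - \delta' y = -\sum_i B(N)_i\,b_{\Lambda i}\,\beta_i' y .
$$

Taking the transpose and solving,

$$
\frac{\partial B(N)}{\partial N} = -\phi' B(N) - \sum_i\left(B(N)_i\,b_{\Lambda i} + \frac{1}{2}\left[\Sigma' B(N)\right]_i^2\right)\beta_i + \delta .
$$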

19.6 Bibliography and comments

The choice of discrete vs. continuous time is really one of convenience. Campbell, Lo and MacKinlay (1997) give a discrete-time treatment, showing that bond prices are linear functions of the state variables even in a discrete-time two-parameter square root model.

Models also don't have to be affine. Constantinides (1992) is a nice discrete time model; its discount factor is driven by the squared value of AR(1) state variables. It gives closed form solutions for bond prices. The bond prices are not linear functions of the state variables, but it is the existence of closed forms rather than linearity that makes affine models so attractive. It allows for both signs of the term premium, as we seem to see in the data.

So far most of the term structure literature has emphasized the risk-neutral probabilities, rarely making any reference to the separation between drifts and market prices of risk. This was not a serious shortcoming for option pricing, for which modeling the volatilities is much more important than modeling the drifts, or for drawing smooth yield curves across maturities. However, it makes the models unsuitable for bond portfolio analysis and other uses. Many models imply high and time-varying market prices of risk or conditional Sharpe ratios. Recently, Duffee (1999) and Duarte (2000) have started the important task of specifying term structure models that fit the empirical facts about expected returns. In particular, they try to fit the Fama-Bliss (1986) and Campbell and Shiller (1991) regressions that relate expected returns to the slope of the term structure (see Chapter 20), while maintaining the tractability of affine models.

Term structure models used in finance amount to regressions of interest rates on lagged interest rates. Macroeconomists also run regressions of interest rates on a wide variety of variables, including lagged interest rates, but also lagged inflation, output, unemployment, exchange rates, and so forth. They often interpret these equations as the Federal Reserve's policy-making rule for setting short rates as a function of macroeconomic conditions. This interpretation is particularly clear in the Taylor rule literature (Taylor 1999) and monetary VAR literature; see Christiano, Eichenbaum and Evans (1999) and Cochrane (1994) for surveys. Someone, it would seem, is missing important right hand variables.

The criticism of finance models is stinging when we only use the short rate as a state variable. Multifactor models are more subtle. If any variable forecasts future interest rates, then it becomes a state variable, and it should be revealed by bond yields. Thus, bond yields should completely drive out any other macroeconomic state variables as interest rate forecasters. They don't, which is an interesting observation.

In addition, there is an extensive literature that studies yields from a purely statistical point

of view, Gallant and Tauchen (1997) for example, and a literature that studies high frequency

behavior in the federal funds market, for example Hamilton (1996).

Obviously, these three literatures need to become integrated. Balduzzi, Bertola and Foresi (1996) consider a model based on the Federal Funds target, and Piazzesi (2000) integrated a careful specification of high-frequency moves in the Federal funds rate into a term structure model.

The models studied here are all based on diffusions with rather slow-moving state variables. These models generate one-day ahead densities that are almost exactly normal. In fact, as Johannes (2000) points out, one-day ahead densities in the data have much fatter tails than normal distributions predict. This behavior could be modeled by fast-moving state variables. However, it is more natural to think of this behavior as generated by a jump process, and Johannes nicely fits a combined jump-diffusion for yields. This specification can change pricing and hedging characteristics of term structure models significantly.

All of the term structure models in this chapter describe many bond yields as a function

of a few state variables. This is a reasonable approximation to the data. Almost all of the

variance of yields can be described in terms of a few factors, typically a “level” “slope”

and “hump” factor. Knez, Litterman and Scheinkman (1994) make the point with a formal

maximum likelihood factor analysis, but you can see the point with a simple eigenvalue

decomposition of log yields.


                       Maturity
  σ         1       2       3       4       5
 6.36     0.45    0.45    0.45    0.44    0.44    “Level”
 0.61    -0.75   -0.21    0.12    0.36    0.50    “Slope”
 0.10     0.47   -0.62   -0.41    0.11    0.46    “Hump”
 0.08     0.10   -0.49    0.39    0.55   -0.55
 0.07     0.07   -0.36    0.68   -0.60    0.21

Eigenvalue decomposition of the covariance matrix of zero coupon bond yields, 1952-1997. The first column gives the square root of the eigenvalues. The columns marked 1-5 give the eigenvectors corresponding to 1-5 year zero coupon bond yields. I decomposed the covariance matrix as $\Sigma = Q\Lambda Q'$; $\sigma^2$ gives the diagonal entries in $\Lambda$ and the rest of the table gives the entries of $Q$. With this decomposition, we can say that bond yields are generated by $y = Q\Lambda^{1/2}\varepsilon$; $E(\varepsilon\varepsilon') = I$, thus $Q$ gives “loadings” on the shocks $\varepsilon$.
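The "level" row can be understood with a toy calculation. If yields at five maturities are all highly and equally correlated, the dominant eigenvector of their covariance matrix loads equally on every maturity, with each loading equal to $1/\sqrt{5} \approx 0.447$, close to the 0.45 and 0.44 entries in the table. The sketch below extracts only this first factor by power iteration, a simpler technique than a full eigenvalue decomposition. The covariance matrix is hypothetical (equal variances of 9 and correlations of 0.95), not the 1952-1997 data.

```python
import math

def dominant_factor(cov, iters=200):
    """Largest eigenvalue's square root and its unit eigenvector, by power iteration."""
    n = len(cov)
    q = [1.0] + [0.0] * (n - 1)  # any start not orthogonal to the answer works
    for _ in range(iters):
        v = [sum(cov[i][j] * q[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in v))
        q = [x / norm for x in v]
    eigenvalue = sum(q[i] * cov[i][j] * q[j] for i in range(n) for j in range(n))
    return math.sqrt(eigenvalue), q

# Hypothetical covariance matrix: equal variances, equal correlations.
variance, corr = 9.0, 0.95
cov = [[variance if i == j else corr * variance for j in range(5)] for i in range(5)]
sigma, loadings = dominant_factor(cov)
```

Departures from equal loadings in real data, and the slope and hump rows, come from correlations that decline with the distance between maturities, which this equicorrelated example deliberately leaves out.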

Not only is the variance of yields well described by a factor model, but the information in current yields about future yields (the expected changes in yields and the conditional volatility of yields) is well captured by one level and a few spreads as well.

It is a good approximation, but it is an approximation. Actual bond prices do not exactly follow any smooth yield curve, and the covariance matrix of actual bond yields does not have an exact $K$ factor structure: the remaining eigenvalues are not zero. Hence you cannot estimate a term structure model directly by maximum likelihood; you either have to estimate the models by GMM, forcing the estimate to ignore the stochastic singularity, or you have to add distasteful measurement errors.

As always, the importance of an approximation depends on how you use the model. If you take the model literally, a bond whose price deviates by one basis point is an arbitrage opportunity. In fact, it is at best a good Sharpe ratio, but a $K$ factor model will not tell you how good; it won't quantify the risk involved in using the model for trading purposes. Hedging strategies calculated from $K$-factor models may be sensitive to small deviations as well.

One solution has been to pick different parameters at each point in time (Ho and Lee

1986). This approach is useful for derivative pricing, but is obviously not a satisfactory

solution. Models in which the whole yield curve is a state variable, Kennedy (1994), Santa

Clara and Sornette (1999), are another interesting response to the problem, and potentially

provide a realistic description of the data.

The market price of interest rate risk reflects the market price of real interest rate changes and the market price of inflation, or whatever real factors are correlated with inflation. The relative contributions of inflation and real rates to interest rate changes are very important for the nature of the risks that bondholders face. For example, if real rates are constant and nominal rates change on inflation news, then short term bonds are the safest real long term investment. If inflation is constant and nominal rates change on real rate news, then long term bonds are the safest long term investment. The data seem to suggest a change in regime between the 1970s and 1990s: in the 1970s, most interest rate changes were due to inflation, while the opposite seems true now. Despite all these provocative thoughts, though, little empirical work has been done that usefully separates interest rate risk premia into real and inflation premium components. Buraschi and Jiltsov (1999) is one recent effort in this direction, but a lot more remains to be done.

19.7 Problems

1. Complete the proof that each of the three statements of the expectations hypothesis

implies the other. Is this also true if we add a constant risk premium? Are the risk premia

in each of the three statements of the yield curve of the same sign?

2. Under the expectations hypothesis, if long-term yields are higher than short term yields,

does this mean that future long term rates should go up, down, or stay the same? (Hint: a

plot of the expected log bond prices over time will really help here.)

3. Start by assuming risk neutrality, $E\left(HPR^{(N)}_{t+1}\right) = Y^{(1)}_t$ for all maturities $N$. Try to derive the other representations of the expectations hypothesis. Now you see why we specify that the expected log returns are equal.

4. Look at (19.266) and show that adding orthogonal dw to the discount factor has no effect

on bond pricing formulas.

5. Look at (19.266) and show that $P = e^{-rT}$ if interest rates are constant, i.e. if $d\Lambda/\Lambda = -r\,dt + \sigma_\Lambda\,dz$.


PART IV

Empirical survey


This part is a brief attempt to survey some of the central empirical issues that have driven recent thinking about financial economics, and which are driving the development of our theoretical understanding of the nature of risk and risk premia.

This part draws heavily on two previous review articles, Cochrane (1998), (1999a) and on Cochrane and Hansen (1992). Fama's (1970) and (1991) efficient market reviews are classic and detailed reviews of much of the underlying empirical literature, focusing on cross-sectional questions. Campbell (1999, 2000) and Kocherlakota (1996) are good surveys of the equity premium literature.


Chapter 20. Expected returns in the

time-series and cross-section

The first revolution in finance started the modern field. Peaking in the early 1970s, this revolution established the CAPM, random walk, efficient markets, portfolio based view of the world. The pillars of this view are:

1. The CAPM is a good measure of risk and thus a good explanation why some stocks,

portfolios, strategies or funds (assets, generically) earn higher average returns than

others.

2. Returns are unpredictable. In particular,

(a) Stock returns are close to unpredictable. Prices are close to random walks; expected returns do not vary greatly through time. “Technical analysis” that tries to divine future returns from past price and volume data is nearly useless. Any apparent predictability is either a statistical artifact which will quickly vanish out of sample, or cannot be exploited after transactions costs. The near unpredictability of stock returns is simply stated, but its implications are many and subtle. (Malkiel 1990 is a classic and easily readable introduction.) It also remains widely ignored, and therefore is the source of lots of wasted trading activity.

(b) Bond returns are nearly unpredictable. This is the expectations model of the term structure. If long term bond yields are higher than short term yields (if the yield curve is upward sloping), this does not mean that expected long-term bond returns are any higher than those on short term bonds. Rather, it means that short term interest rates are expected to rise in the future, so you expect to earn about the same amount on short term or long term bonds at any horizon.

(c) Foreign exchange bets are not predictable. If a country has higher interest rates

than are available in the U.S. for bonds of a similar risk class, its exchange rate is

expected to depreciate. After you convert your investment back to dollars, you

expect to make the same amount of money holding foreign or domestic bonds.

(d) Stock market volatility does not change much through time. Not only are returns

close to unpredictable, they are nearly identically distributed as well.

3. Professional managers do not reliably outperform simple indices and passive portfolios

once one corrects for risk (beta). While some do better than the market in any given year,

some do worse, and the outcomes look very much like good and bad luck. Managers

who do well in one year are not more likely to do better than average the next year. The

average actively-managed fund does about 1% worse than the market index. The more

actively a fund trades, the lower returns to investors.

Together, these views reflected a guiding principle that asset markets are, to a good approximation, informationally efficient (Fama 1970, 1991). This statement means that market prices already contain most information about fundamental value. Informational efficiency in turn derives from competition. The business of discovering information about the value of traded assets is extremely competitive, so there are no easy quick profits to be made, just as there are none in any other well-established and competitive industry. The only way to earn large returns is by taking on additional risk.

These statements are not doctrinaire beliefs. Rather, they summarize the findings of a quarter-century of extensive and careful empirical work. However, every single one of them has now been extensively revised by a new generation of empirical research. Now, it seems that:

1. There are assets, portfolios, funds, and strategies whose average returns cannot be

explained by their market betas. Multifactor models dominate the empirical description,

performance attribution, and explanation of average returns.

2. Returns are predictable. In particular,

(a) Variables including the dividend/price ratio and term premium can in fact predict

substantial amounts of stock return variation. This phenomenon occurs over

business-cycle and longer horizons. Daily, weekly and monthly stock returns are

still close to unpredictable, and “technical” systems for predicting such movements

are still close to useless after transactions costs.

(b) Bond returns are predictable. Though the expectations model works well in the long run, a steeply upward sloping yield curve means that expected returns on long term bonds are higher than on short term bonds for the next year.

(c) Foreign exchange returns are predictable. If you buy bonds in a country whose interest rates are unusually higher than those in the U.S., you expect a greater return, even after converting back to dollars.

(d) Stock market volatility does in fact change through time. Conditional second moments vary through time as well as first moments. Means and variances do not seem to move in lockstep, so conditional Sharpe ratios vary through time.

3. Some funds seem to outperform simple indices, even after controlling for risk through

market betas. Fund returns are also slightly predictable: past winning funds seem to do

better in the future, and past losing funds seem to do worse than average in the future. For

a while, this seemed to indicate that there is some persistent skill in active management.

However, we now see that multifactor performance attribution models explain most fund

persistence: funds earn persistent returns by following fairly mechanical “styles,” not by

persistent skill at stock selection (Carhart 1997).

Again, these views summarize a large body of empirical work. The strength and interpretation of many results are hotly debated.

This new view of the facts need not overturn the view that markets are reasonably competitive and therefore reasonably efficient. It does substantially enlarge our view of what activities provide rewards for holding risks, and it challenges our economic understanding of those risk premia. As of the early 1970s, asset pricing theory anticipated the possibility and even probability that expected returns should vary over time and that covariances past market betas would be important for understanding cross-sectional variation in expected returns. What took another 15 to 20 years was to see how important these long-anticipated theoretical possibilities are in the data.

20.1 Time-series predictability

I start by looking at patterns in expected returns over time in large market indices, and then

look at patterns in expected returns across stocks.

20.1.1 Stocks

Dividend-price ratios forecast excess returns on stocks. Regression coefficients and $R^2$ rise with the forecast horizon. This is a result of the fact that the forecasting variable is persistent.

Table 1 gives a simple example of market return predictability. “Low” prices relative to dividends forecast higher subsequent returns. The one-year horizon $R^2$ of 0.15 is not particularly remarkable. However, at longer and longer horizons larger and larger fractions of return variation are forecastable. At a 5 year horizon 60% of the variation in stock returns is forecastable ahead of time from the price/dividend ratio! (Fama and French 1988.)

Horizon k     R_{t→t+k} = a + b(D_t/P_t)      D_{t+k}/D_t = a + b(D_t/P_t)
(years)         b      σ(b)     R²              b      σ(b)     R²
   1           5.3     (2.0)    0.15           2.0     (1.1)    0.06
   2           10      (3.1)    0.23           2.5     (2.1)    0.06
   3           15      (4.0)    0.37           2.4     (2.1)    0.06
   5           33      (5.8)    0.60           4.7     (2.4)    0.12

Table 1. OLS regressions of percent excess returns (value weighted NYSE - treasury bill rate) and real dividend growth on the percent VW dividend/price ratio. R_{t→t+k} indicates the k year return. Standard errors in parentheses use GMM to correct for heteroskedasticity and serial correlation. Sample 1947-1996.
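The link between persistence and rising long-horizon coefficients can be seen in a stylized calculation. Suppose one-period returns obey $r_{t+1} = b\,x_t + \varepsilon_{t+1}$ and the forecasting variable follows $x_{t+1} = \rho\,x_t + \delta_{t+1}$. Then the coefficient in a regression of the summed $k$-period return on $x_t$ is $b(1 + \rho + \dots + \rho^{k-1}) = b(1 - \rho^k)/(1 - \rho)$. The sketch below evaluates this; the two-equation system, the function name and the value $\rho = 0.9$ are assumptions for illustration and are not estimates from the table, whose returns compound rather than add.

```python
def long_horizon_coefficient(b, rho, k):
    """Coefficient of the k-period summed return on x_t.

    Assumes r_{t+1} = b * x_t + eps_{t+1} and x_{t+1} = rho * x_t + delta_{t+1},
    so each additional period adds b * rho**j to the coefficient.
    """
    return b * sum(rho ** j for j in range(k))

# One-year coefficient taken from Table 1; rho = 0.9 is a hypothetical persistence.
coeffs = {k: long_horizon_coefficient(5.3, 0.9, k) for k in (1, 2, 3, 5)}
```

With $\rho$ near one the coefficient grows almost linearly in the horizon, while with $\rho = 0$ it would stay at $b$ for every $k$: the long-horizon results add no independent information, they reflect the same one-period predictability cumulated over a slow-moving forecasting variable.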

One can object to dividends as the divisor for prices. However, ratios formed with just about any sensible divisor work about as well, including earnings, book value, and moving averages of past prices.

Many other variables forecast excess returns, including the term spread between long and short term bonds, the default spread, the level of the T-bill rate (Fama and French 1989), the detrended T-bill rate, and the earnings/dividend ratio (Lamont 1998). Macro variables forecast stock returns as well, including the investment/capital ratio (Cochrane 1991) and the consumption/wealth ratio (Lettau and Ludvigson 2000).

Most of these variables are correlated with each other and correlated with or forecast

business cycles. This fact suggests a natural explanation, emphasized by Fama and French

(1999): Expected returns vary over business cycles; it takes a higher risk premium to get

people to hold stocks at the bottom of a recession. When expected returns go up, prices go

down. We see the low prices, followed by the higher returns expected and required by the

market. (Regressions do not have to have causes on the right and effects on the left. You run

regressions with the variable orthogonal to the error on the right, and that is the case here

since the error is a forecasting error. This is like a regression of actual weather on a weather

forecast.)

Table LL, adapted from Lettau and Ludvigson (2000), compares several of these variables.
At a one-year horizon, both the consumption/wealth ratio and the detrended T-bill rate forecast
returns, with $R^2$ of 0.18 and 0.10 respectively. At the one-year horizon, these variables are
more important than the dividend/price and dividend/earnings ratios, and their presence cuts
the dividend ratio coefficients in half. However, the d/p and d/e ratios are slower moving
than the T-bill rate and consumption/wealth ratio. They track decade-to-decade movements
more than business cycle movements. This means that their importance builds with horizon.
By six years, the bulk of the return forecastability again comes from the dividend ratios, and
it is their turn to cut down the cay and T-bill regression coefficients. The late 90s have not
much affected the cay and d/e forecasts, while they have substantially cut down dividend yield
forecastability.

Horizon (years)   cay      d − p    d − e    rrel     R^2
1                 6.7                                 0.18
1                          0.14     0.08              0.04
1                                            -4.5     0.10
1                 5.4      0.07     -0.05    -3.8     0.23
6                 12.4                                0.16
6                          0.95     0.68              0.39
6                                            -5.10    0.03
6                 5.9      0.89     0.65     1.36     0.42

Table LL. Long-horizon return forecasts. The return variable is log excess returns on the
S&P composite index. cay is Lettau and Ludvigson's consumption to wealth ratio. d − p
is the log dividend yield and d − e is the log dividend/earnings ratio. rrel is a detrended short term
interest rate. Sample 1952:4-1998:3. Source: Lettau and Ludvigson (2000) Table 5.

I emphasize that excess returns are forecastable. We have to understand this as time-variation
in the reward for risk, not time-varying interest rates. One naturally slips into
non-risk explanations for price variation; for example, that the current stock market boom is
due to life-cycle savings of the baby boomers. A factor like this does not reference risks; it


CHAPTER 20 EXPECTED RETURNS IN THE TIME-SERIES AND CROSS-SECTION

predicts that interest rates should move just as much as stock returns.

Persistent d/p; Long horizons are not a separate phenomenon

The results at different horizons are not separate facts, but reflections of a single underlying
phenomenon. If daily returns are very slightly predictable by a slow-moving variable,
that predictability adds up over long horizons. For example, you can predict that the temperature
in Chicago will rise about 1/3 degree per day in the springtime. This forecast explains
very little of the day to day variation in temperature, but tracks almost all of the rise in
temperature from January to July. Thus, the $R^2$ rises with horizon.

Thus, a central fact driving the predictability of returns is that the dividend price ratio
is very persistent. Figure 37 plots the d/p ratio and you can see directly that it is extremely
slow-moving. Below, I will estimate an AR(1) coefficient around 0.9 in annual data.

Figure 37.

To see more precisely how the results at various horizons are linked, and how they result

from the persistence of the d/p ratio, suppose that we forecast returns with a forecasting

variable x, according to

$$ r_{t+1} = a x_t + \mu_{t+1} \tag{20.298} $$

$$ x_{t+1} = \rho x_t + \delta_{t+1}. \tag{20.299} $$

(Obviously, you demean the variables or put constants in the regressions.) Small values of
a and $R^2$ in (20.298) and a large coefficient ρ in (20.299) imply mathematically that the
long-horizon regression has a large regression coefficient and large $R^2$. To see this, write

$$ r_{t+1} + r_{t+2} = a(1+\rho)x_t + a\delta_{t+1} + \mu_{t+1} + \mu_{t+2} $$

$$ r_{t+1} + r_{t+2} + r_{t+3} = a(1+\rho+\rho^2)x_t + a(1+\rho)\delta_{t+1} + a\delta_{t+2} + \mu_{t+1} + \mu_{t+2} + \mu_{t+3}. $$

You can see that with ρ near one, the coefficients increase with horizon, almost linearly at
first and then at a declining rate. The $R^2$ are a little messier to work out, but also rise with
horizon.
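To make the horizon effect concrete, here is a minimal Python sketch of the algebra above. Beyond (20.298)-(20.299), it assumes that the shocks µ and δ are uncorrelated with each other and over time. That assumption is not in the text and is made only to keep the variance formula short. The function name `long_horizon` and the parameter values are purely illustrative.

```python
def long_horizon(a, rho, sigma_mu, sigma_delta, k):
    """Coefficient and R^2 of the k-period return on x_t, implied by
    r_{t+1} = a x_t + mu_{t+1},  x_{t+1} = rho x_t + delta_{t+1}.

    Assumes |rho| < 1 and that mu and delta are mutually and serially
    uncorrelated (an illustrative assumption, not from the text).
    """
    # Coefficient on x_t: a (1 + rho + ... + rho^(k-1))
    b_k = a * (1 - rho**k) / (1 - rho)
    # Unconditional variance of the AR(1) forecasting variable
    var_x = sigma_delta**2 / (1 - rho**2)
    explained = b_k**2 * var_x
    # delta_{t+i} enters the k-period sum with weight a (1 + ... + rho^(k-i-1))
    err = k * sigma_mu**2 + sum(
        (a * (1 - rho**(k - i)) / (1 - rho))**2 * sigma_delta**2
        for i in range(1, k)
    )
    return b_k, explained / (explained + err)


for k in (1, 2, 3, 5):
    b_k, r2 = long_horizon(a=0.1, rho=0.9, sigma_mu=0.2, sigma_delta=0.25, k=k)
    print(f"k={k}: b={b_k:.3f}, R2={r2:.3f}")
```

With these made-up numbers the one-period $R^2$ is small, yet both the coefficient and the $R^2$ rise as k goes from 1 to 5.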

The numerator in the long-horizon regression coefficient is

$$ E\left[(r_{t+1} + r_{t+2} + \dots + r_{t+k})\, x_t\right] \tag{20.300} $$

where the symbols represent deviations from their means. With stationary r and x,
$E(r_{t+j} x_t) = E(r_{t+1} x_{t-j+1})$, so this is the same moment as

$$ E\left[r_{t+1}(x_t + x_{t-1} + x_{t-2} + \dots)\right], \tag{20.301} $$

the numerator of a regression coefficient of one year returns on many lags of price dividend
ratios. Of course, if you run a multiple regression of returns on lags of p/d, you quickly find
that most lags past the first do not help the forecast power. (That statement would be exact in
the AR(1) example.)

This observation shows once again that one-year and multi-year forecastability are two
sides of the same coin. It also suggests that on a purely statistical basis, there will not be a
huge difference between one-year return forecasts and multi-year return forecasts (correcting the
latter for the serial correlation of the error term due to overlap). Hodrick (1991) comes to this
conclusion in a careful Monte Carlo experiment, comparing moments of the form (20.300),
(20.301) and $E(r_{t+1} x_t)$. The multi-year regressions, or the implied multi-year regressions
from one-year forecasts with a slow-moving right hand variable, are thus mostly useful for
illustrating the dramatic economic implications of forecastability, rather than as clever
statistical tools that enhance power and allow us to distinguish previously foggy hypotheses.

The slow movement of the price-dividend ratio means that on a purely statistical basis,
return forecastability is a very open question. What we really know (see Figure 37) is that
low prices relative to dividends and earnings in the 50s preceded the boom market of the
early 60s; that the high price-dividend ratios of the mid-60s preceded the poor returns of
the 70s; that the low price ratios of the mid-70s preceded the current boom. We really
have three postwar data points; a once per generation change in expected returns. In
addition, the last half of the 1990s has seen a historically unprecedented rise in stock prices
and price/dividend ratios (or any other ratio). This rise has cut the postwar return forecasting
regression coefficient in half. On the other hand, another crash or even just a decade of poor
returns will restore the regression. Data back to the 1600s show the same pattern, but we are
often uncomfortable making inferences from centuries-old data.


20.1.2 Volatility

Price dividend ratios can only move at all if they forecast future returns, if they forecast
future dividend growth, or if there is a bubble (if the price-dividend ratio is nonstationary and
is expected to grow explosively). In the data, most variation in price-dividend ratios results
from varying expected returns. “Excess volatility,” relative to constant discount rate present
value models, is thus exactly the same phenomenon as forecastable long-horizon returns.

I also derive the very useful price-dividend and return linearizations. Ignoring constants

(means),

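$$ p_t - d_t = E_t \sum_{j=1}^{\infty} \rho^{j-1}\left(\Delta d_{t+j} - r_{t+j}\right) $$

$$ r_t - E_{t-1} r_t = (E_t - E_{t-1})\left[\sum_{j=0}^{\infty} \rho^{j} \Delta d_{t+j} - \sum_{j=1}^{\infty} \rho^{j} r_{t+j}\right] $$

$$ r_{t+1} = \Delta d_{t+1} - \rho\,(d_{t+1} - p_{t+1}) + (d_t - p_t). $$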

The volatility test literature starting with Shiller (1981) and LeRoy and Porter (1981)
(see Cochrane 1991 for a review) started out trying to make a completely different point.
Predictability seems like a sideshow. The stunning fact about the stock market is its
extraordinary volatility. On a typical day, the value of the U.S. capital stock changes by a full
percentage point, and days of 2 or 3 percentage point changes are not uncommon. In a typical
year it changes by 16 percentage points, and 30 percentage point changes are not uncommon.
Worse, most of that volatility seems not to be accompanied by any important news about
future returns and discount rates. 30% of the capital stock of the United States vanished in
a year and nobody noticed? Surely, this observation shows directly that markets are “not
efficient” (that prices do not correspond to the value of capital) without worrying about
predictability?

It turns out, however, that “excess volatility” is exactly the same thing as return predictability.
Any story you tell about prices that are “too high” or “too low” necessarily implies that
subsequent returns will be too low or too high as prices rebound to their correct levels.

When prices are high relative to dividends (or earnings, cashflow, book value or some
other divisor), one of three things must be true: 1) Investors expect dividends to rise in the
future. 2) Investors expect returns to be low in the future. Future cashflows are discounted
at a lower than usual rate, leading to higher prices. 3) Investors expect prices to rise forever,
giving an adequate return even if there are no dividends. This statement is not a theory, it is an
identity: If the price-dividend ratio is high, either dividends must rise, prices must decline, or
the price-dividend ratio must grow explosively. The open question is, which option holds for


our stock market? Are prices high now because investors expect future earnings, dividends

etc. to rise, because they expect low returns in the future, or because they expect prices to go

on rising forever?

Historically, we find that virtually all variation in price-dividend ratios has reflected varying
expected excess returns.

Exact present value identity

To document this statement, we need to relate current prices to future dividends and re-

turns. Start with the identity

$$ 1 = R_{t+1}^{-1} R_{t+1} = R_{t+1}^{-1}\,\frac{P_{t+1} + D_{t+1}}{P_t} \tag{20.302} $$

and hence

$$ \frac{P_t}{D_t} = R_{t+1}^{-1}\left(1 + \frac{P_{t+1}}{D_{t+1}}\right)\frac{D_{t+1}}{D_t}. $$

We can iterate this identity forward and take conditional expectations to obtain the identity

$$ \frac{P_t}{D_t} = E_t \sum_{j=1}^{\infty}\left(\prod_{k=1}^{j} R_{t+k}^{-1}\,\Delta D_{t+k}\right) \tag{20.303} $$

where $\Delta D_t \equiv D_t/D_{t-1}$. (We could iterate (20.302) forward to

$$ P_t = \sum_{j=1}^{\infty}\left(\prod_{k=1}^{j} R_{t+k}^{-1}\right) D_{t+j}, $$

but prices are not stationary, so we can't find the variance of prices from a time-series average.
Much of the early volatility test literature concerned stationarity problems. Equation (20.303)
also requires a limiting condition that the price dividend ratio cannot explode faster than
returns, $\lim_{j\to\infty} E_t\left(\prod_{k=1}^{j} R_{t+k}^{-1}\,\Delta D_{t+k}\right) P_{t+j}/D_{t+j} = 0$.
I come back to this condition below.)

Equation (20.303) shows that high prices must, mechanically, come from high future dividend
growth or low future returns.
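As a quick numerical check on this mechanical statement, the sketch below evaluates a truncated version of the sum of discounted dividend growth along one known path, so the conditional expectation is trivial. With a constant gross return R and constant gross dividend growth G < R, the sum is a geometric series equal to G/(R − G). The function name `pd_ratio` and all numbers are illustrative assumptions, not taken from the text.

```python
def pd_ratio(gross_returns, div_growth):
    """Truncated sum over j of prod_{k<=j} (Delta D_{t+k} / R_{t+k})."""
    total, discount = 0.0, 1.0
    for big_r, big_g in zip(gross_returns, div_growth):
        discount *= big_g / big_r   # running product of R^{-1} * Delta D
        total += discount
    return total


horizon = 2000  # long enough that the truncated tail is negligible here
base = pd_ratio([1.10] * horizon, [1.05] * horizon)
low_return = pd_ratio([1.09] * horizon, [1.05] * horizon)
high_growth = pd_ratio([1.10] * horizon, [1.06] * horizon)
print(base, low_return, high_growth)
```

Lowering the return or raising dividend growth each raises the price-dividend ratio relative to the base case.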

Approximate identity

The nonlinearity of (20.303) makes it hard to handle, and means that we cannot use simple
time-series tools. You can linearize (20.303) directly with a Taylor expansion. (Cochrane
1991 takes this approach.) Campbell and Shiller (1988) approximate the one period return
identity before iterating, which is algebraically simpler and is the most popular linearization.

Start again from the obvious,

$$ 1 = R_{t+1}^{-1} R_{t+1} = R_{t+1}^{-1}\,\frac{P_{t+1} + D_{t+1}}{P_t}. $$


Multiplying both sides by $P_t/D_t$ and massaging the result,

$$ \frac{P_t}{D_t} = R_{t+1}^{-1}\left(1 + \frac{P_{t+1}}{D_{t+1}}\right)\frac{D_{t+1}}{D_t}. $$

Taking logs, and with lowercase letters denoting logs of uppercase letters,

$$ p_t - d_t = -r_{t+1} + \Delta d_{t+1} + \ln\left(1 + e^{p_{t+1} - d_{t+1}}\right) $$

Taking a Taylor expansion of the last term about a point $P/D = e^{p-d}$,

$$ p_t - d_t = -r_{t+1} + \Delta d_{t+1} + \ln\left(1 + \frac{P}{D}\right) + \frac{P/D}{1 + P/D}\left[p_{t+1} - d_{t+1} - (p - d)\right] $$

$$ p_t - d_t = -r_{t+1} + \Delta d_{t+1} + k + \rho\,(p_{t+1} - d_{t+1}). \tag{20.304} $$

Since the average dividend yield is about 4% and average price/dividend ratio is about 25, ρ
is a number very near one. I will use ρ = 0.96 for calculations,

$$ \rho = \frac{P/D}{1 + P/D} = \frac{1}{1 + D/P} \approx 1 - D/P = 0.96. $$

Without the constant k, the equation can also apply to deviations from means or any other

point.
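The following sketch checks numerically how close the one-period approximation is to the exact log return. It expands around P/D = 25, and it takes the constant implied by the Taylor expansion, k = ln(1 + P/D) − ρ ln(P/D). The function names, prices and dividends are made up for illustration.

```python
import math


def linearization(pd_bar):
    """Return (rho, k) for an expansion of ln(1 + P/D) around pd_bar."""
    rho = pd_bar / (1 + pd_bar)
    k = math.log(1 + pd_bar) - rho * math.log(pd_bar)
    return rho, k


def exact_log_return(p0, p1, d1):
    """Exact log return ln((P1 + D1) / P0)."""
    return math.log((p1 + d1) / p0)


def approx_log_return(p0, d0, p1, d1, pd_bar=25.0):
    """Linearized return: Delta d + k + rho (p1 - d1) - (p0 - d0)."""
    rho, k = linearization(pd_bar)
    delta_d = math.log(d1 / d0)
    return delta_d + k + rho * math.log(p1 / d1) - math.log(p0 / d0)


rho, k = linearization(25.0)
exact = exact_log_return(p0=100.0, p1=110.0, d1=4.2)
approx = approx_log_return(p0=100.0, d0=4.0, p1=110.0, d1=4.2)
print(rho, k, exact, approx)
```

At the expansion point itself (price-dividend ratio of 25 in both periods) the two returns coincide. Away from it the gap is second order in the deviation of p − d from its expansion point.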

Now, iterating forward is easy, and results in the approximate identity

$$ p_t - d_t = \text{const.} + \sum_{j=1}^{\infty} \rho^{j-1}\left(\Delta d_{t+j} - r_{t+j}\right). \tag{20.305} $$

(Again, we need a condition that $p_t - d_t$ does not explode faster than $\rho^{-t}$,
$\lim_{j\to\infty} \rho^{j}(p_{t+j} - d_{t+j}) = 0$. I return to this condition below.)

Since (20.305) holds ex-post, we can take conditional expectations and relate price-dividend
ratios to ex-ante dividend growth and return forecasts

$$ p_t - d_t = \text{const.} + E_t \sum_{j=1}^{\infty} \rho^{j-1}\left(\Delta d_{t+j} - r_{t+j}\right). \tag{20.306} $$

Now it is really easy to see that a high price-dividend ratio must be followed by high dividend

growth ∆d, or low returns r. Which is it?

Decomposing the variance of price-dividend ratios

To address this issue, equation (20.305) implies

$$ \mathrm{var}(p_t - d_t) = \mathrm{cov}\left(p_t - d_t,\ \sum_{j=1}^{\infty} \rho^{j-1} \Delta d_{t+j}\right) - \mathrm{cov}\left(p_t - d_t,\ \sum_{j=1}^{\infty} \rho^{j-1} r_{t+j}\right) \tag{20.307} $$

In words, price-dividend ratios can only vary if they forecast changing dividend growth or
if they forecast changing returns. (To derive (20.307) from (20.305), multiply both sides by
$(p_t - d_t) - E(p_t - d_t)$ and take expectations.) Notice that both terms on the right hand side
of (20.307) are the numerators of exponentially weighted long-run regression coefficients.
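The sample analogue of (20.307) is easy to compute. The sketch below is one way to do it, truncating the sums at a finite horizon of 15 periods. The artificial data are generated so that the linearized identity holds exactly, with AR(1) expected returns and unforecastable dividend growth. All names and parameter values are assumptions for illustration, not estimates.

```python
import random


def cov(u, v):
    """Sample covariance (divides by n)."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / len(u)


def pd_variance_shares(log_pd, dd, r, rho=0.96, horizon=15):
    """Percent of var(p - d) matched by dividend growth and return forecasts.

    log_pd[t] is p_t - d_t; dd[t] and r[t] are log dividend growth and
    log return over the period ending at t.
    """
    n = len(log_pd) - horizon                 # dates with a full forward window
    w = [rho**j for j in range(horizon)]      # rho^(j-1) for j = 1..horizon

    def forward_sum(series, t):
        return sum(w[j] * series[t + 1 + j] for j in range(horizon))

    x = log_pd[:n]
    sum_dd = [forward_sum(dd, t) for t in range(n)]
    sum_r = [forward_sum(r, t) for t in range(n)]
    var = cov(x, x)
    return 100 * cov(x, sum_dd) / var, -100 * cov(x, sum_r) / var


# Artificial data: expected return x_t is AR(1), dividend growth is i.i.d.
random.seed(0)
T, rho, phi = 3000, 0.96, 0.9
x = [0.0]
for _ in range(T - 1):
    x.append(phi * x[-1] + 0.02 * random.gauss(0, 1))
dd = [0.05 * random.gauss(0, 1) for _ in range(T)]
log_pd = [-xt / (1 - rho * phi) for xt in x]  # p_t - d_t implied by the identity
r = [0.0] + [dd[t] + rho * log_pd[t] - log_pd[t - 1] for t in range(1, T)]

div_share, ret_share = pd_variance_shares(log_pd, dd, r, rho=rho)
print(div_share, ret_share)
```

Because the sums are truncated, the two shares need not add to exactly 100; the remainder is the covariance of $p_t - d_t$ with $\rho^{15}(p_{t+15} - d_{t+15})$, again as a percent of the variance.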

This is a powerful equation. At first glance, it would seem a reasonable approximation
that returns are unforecastable (the “random walk” hypothesis) and that dividend growth is
not forecastable either. But if this were the case, the price/dividend ratio would have to be a
constant. Thus the fact that the price/dividend ratio varies at all means that either dividend
growth or returns must be forecastable; that is, the world is not i.i.d.

At a simple level, Table 1 includes regressions of long-horizon dividend growth on
dividend/price ratios to match the return regressions. The coefficients in the dividend growth
case are much smaller, typically one standard error from zero, and the $R^2$ are tiny. Worse,
the signs are wrong. To the extent that a high price-dividend ratio forecasts any change in
dividends, it seems to forecast a small decline in dividends!

Having seen equation (20.307), one is hungry for estimates. Table 2 presents some, taken
from Cochrane (1991b). As one might suspect from Table 1, Table 2 shows that in the past
almost all variation in price-dividend ratios is due to changing return forecasts.

The elements do not have to be between 0 and 100%. For example, the real entries of -34 and
138 occur because high prices seem to forecast lower real dividend growth (though this number is not
statistically significant). Therefore they must and do forecast really low returns, and returns
must account for more than 100% of price-dividend variation.

               Dividends   Returns
Real              -34        138
  std. error       10         32
Nominal            30         85
  std. error       41         19

Table 2. Variance decomposition of value-weighted NYSE price-dividend ratio.
Table entries are the percent of the variance of the price-dividend ratio attributable
to dividend and return forecasts,
$100 \times \mathrm{cov}\left(p_t - d_t, \sum_{j=1}^{15} \rho^{j-1} \Delta d_{t+j}\right)/\mathrm{var}(p_t - d_t)$,
and similarly for returns.

This observation solidifies one's belief in price-dividend ratio forecasts of returns. Yes,
the statistical evidence that price-dividend ratios forecast returns is weak, and many return
forecasting variables have been tried and discarded, so selection bias is a big worry in
forecasting regressions. But the price-dividend ratio (or price-earnings, market to book, etc.) has a
special status since it must forecast something. To believe that the price-dividend ratio is
stationary and varies, but does not forecast returns, you have to believe that the price-dividend
ratio does forecast dividends. Given this choice and Table 1, it seems a much firmer
conclusion that it forecasts returns.


It is nonetheless an uncomfortable fact that almost all variation in price-dividend ratios
is due to variation in expected excess returns. How nice it would be if high prices reflected
expectations of higher future cashflows. Alas, that seems not to be the case. If not, it would
be nice if high prices reflected lower interest rates. Again, that seems not to be the case. High
prices reflect low risk premia, lower expected excess returns.

Campbell's return decomposition.

Campbell (1991) provides a similar decomposition for unexpected returns,

$$ r_t - E_{t-1} r_t = (E_t - E_{t-1})\left[\sum_{j=0}^{\infty} \rho^{j} \Delta d_{t+j} - \sum_{j=1}^{\infty} \rho^{j} r_{t+j}\right]. \tag{20.308} $$

A positive shock to returns must come from a positive shock to forecast dividend growth, or
from a negative shock to forecast returns.

Since a positive shock to time t dividends is directly paid as a return (the first sum starts
at j = 0), Campbell finds some fraction of return variation is due to current dividends.
However, once again, the bulk of index return variation comes from shocks to future returns,
i.e. discount rates.

To derive (20.308), start with the approximate identity (20.305), and move it back one

period

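$$ p_{t-1} - d_{t-1} = \text{const.} + \sum_{j=0}^{\infty} \rho^{j}\left(\Delta d_{t+j} - r_{t+j}\right). $$

Now take innovations of both sides,

$$ 0 = (E_t - E_{t-1}) \sum_{j=0}^{\infty} \rho^{j}\left(\Delta d_{t+j} - r_{t+j}\right). $$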

Pulling rt over to the left hand side, you obtain (20.308). (Problem 3 at the end of the chapter

guides you through an alternative and more constructive derivation.)
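Spelling out that last step as a worked equation: the j = 0 term of the return sum is $r_t$ itself, and since $r_t$ is known at time t, its innovation is just the unexpected return. Separating that term from the sum and moving it to the left hand side gives the decomposition.

```latex
\begin{align*}
0 &= (E_t - E_{t-1})\left[\sum_{j=0}^{\infty}\rho^{j}\Delta d_{t+j}
      - r_t - \sum_{j=1}^{\infty}\rho^{j} r_{t+j}\right], \\
(E_t - E_{t-1})\, r_t &= r_t - E_{t-1} r_t, \\
r_t - E_{t-1} r_t &= (E_t - E_{t-1})\left[\sum_{j=0}^{\infty}\rho^{j}\Delta d_{t+j}
      - \sum_{j=1}^{\infty}\rho^{j} r_{t+j}\right].
\end{align*}
```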

Cross-section

So far, we have concentrated on the index. One can apply the same analysis to firms.
What causes the variation in price-dividend ratios, or, better, book/market ratios (since
dividends can be zero) across firms, or over time for a given firm? Vuolteenaho (2000) applies
the same sort of analysis to individual stock data. He finds that as much as half of the
variation in individual firm book/market ratios reflects expectations of future cashflows. Much
of the expected cashflow variation is idiosyncratic, while the expected return variation is
common, which is why variation in the index book/market ratio, like variation in the index
dividend/price ratio, is almost all due to varying expected excess returns.

Bubbles


In deriving the exact and linearized present value identities, I assumed an extra condition

that the price-dividend ratio does not explode. Without that condition, and taking expectations

of both sides, the exact identity reads

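$$ \frac{P_t}{D_t} = E_t \sum_{j=1}^{\infty}\left(\prod_{k=1}^{j} R_{t+k}^{-1}\,\Delta D_{t+k}\right) + \lim_{j\to\infty} E_t\left(\prod_{k=1}^{j} R_{t+k}^{-1}\,\Delta D_{t+k}\right)\frac{P_{t+j}}{D_{t+j}} \tag{20.309} $$

and the linearized identity reads

$$ p_t - d_t = \text{const.} + E_t \sum_{j=1}^{\infty} \rho^{j-1}\left(\Delta d_{t+j} - r_{t+j}\right) + E_t \lim_{j\to\infty} \rho^{j}\,(p_{t+j} - d_{t+j}). \tag{20.310} $$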