


$x_{t+1}$. But there are times when we want to relate a price to the entire cash flow stream, rather
than just to one dividend and next period's price.
The most straightforward way to do this is to write out a longer term objective,

$$E_t \sum_{j=0}^{\infty} \beta^j u(c_{t+j}).$$

Now suppose an investor can purchase a stream $\{d_{t+j}\}$ at price $p_t$. As with the two-period
model, his first order condition gives us the pricing formula directly,

$$p_t = E_t \sum_{j=1}^{\infty} \beta^j \frac{u'(c_{t+j})}{u'(c_t)} d_{t+j} = E_t \sum_{j=1}^{\infty} m_{t,t+j}\, d_{t+j}. \tag{1.23}$$

You can see that if this equation holds at time $t$ and time $t+1$, then we can derive the
two-period version

$$p_t = E_t[m_{t+1}(p_{t+1} + d_{t+1})]. \tag{1.24}$$

Thus, the infinite period and two period models are equivalent.
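
A quick numerical check can make this equivalence concrete. The sketch below is only an illustration under assumptions of my own: no uncertainty (so expectations drop out), power utility, made-up growth rates, and dividends that stop after a finite horizon. The names `consumption`, `dividend`, `discount_factor` and `price` are hypothetical helpers, not notation from the text. It computes the present value at dates 0 and 1, counting dividends from the period after purchase, and compares $p_0$ with $m_{0,1}(p_1 + d_1)$.

```python
# Illustrative check with assumed numbers: deterministic consumption and dividends,
# power utility u'(c) = c^(-gamma), dividends stop after a finite horizon T.
BETA, GAMMA, T = 0.95, 2.0, 50

def consumption(t):
    return 1.02 ** t                       # assumed 2% consumption growth

def dividend(t):
    return 1.01 ** t if t <= T else 0.0    # assumed 1% dividend growth, zero after T

def discount_factor(t, j):
    """m_{t,t+j} = beta^j u'(c_{t+j}) / u'(c_t)."""
    return BETA ** j * (consumption(t + j) / consumption(t)) ** (-GAMMA)

def price(t):
    """Present value p_t = sum over j >= 1 of m_{t,t+j} d_{t+j} (no uncertainty here)."""
    return sum(discount_factor(t, j) * dividend(t + j) for j in range(1, T - t + 1))

lhs = price(0)                                           # long-horizon formula
rhs = discount_factor(0, 1) * (price(1) + dividend(1))   # two-period formula
print(lhs, rhs)
```

The two printed numbers agree up to rounding, because $m_{0,1}\, m_{1,1+j} = m_{0,1+j}$: discounting next period's price one more period reproduces the long sum.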
(Going in the other direction is a little tougher. If you chain together (1.24), you get (1.23)
plus an extra term. To get (1.23) you also need the "transversality condition"
$\lim_{j\to\infty} E_t[m_{t,t+j}\, p_{t+j}] = 0$. This is an extra first order condition of the infinite period investor, which is not present
with overlapping generations of two-period investors. It rules out "bubbles" in which prices
grow so fast that people will buy now just to resell at higher prices later, even if there are no
dividends.)

From (1.23) we can write a risk-adjustment to prices, as we did with one period payoffs,

$$p_t = \sum_{j=1}^{\infty} \frac{E_t d_{t+j}}{R^f_{t,t+j}} + \sum_{j=1}^{\infty} \mathrm{cov}_t(d_{t+j}, m_{t,t+j})$$

where $R^f_{t,t+j} \equiv E_t(m_{t,t+j})^{-1}$ is the $j$ period interest rate. Again, assets whose dividend
streams covary negatively with marginal utility, and positively with consumption, have lower
prices, since holding those assets gives the investor a more volatile consumption stream. (It
is common instead to write prices as a discounted value using a risk adjusted discount factor,
e.g. $p_t = \sum_{j=1}^{\infty} E_t d_{t+j}/R_{t,t+j}$, but this approach is difficult to use correctly for multiperiod
problems, especially when expected returns can vary over time.)
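
To see the risk adjustment at work for a single horizon $j$, here is a minimal sketch with three hypothetical states. The probabilities, discount factor values and dividends are invented for illustration, and `mean` and `cov` are small helpers of my own. It checks that $E(md)$ splits into the riskfree present value of the expected dividend plus the covariance term.

```python
# Hypothetical three-state example for one horizon j (all numbers are assumptions).
probs = [0.3, 0.4, 0.3]
m = [1.10, 0.95, 0.80]     # discount factor m_{t,t+j}: high in bad states
d = [0.50, 1.00, 1.50]     # dividend d_{t+j}: low in bad states

def mean(x):
    return sum(p * xi for p, xi in zip(probs, x))

def cov(x, y):
    ex, ey = mean(x), mean(y)
    return sum(p * (xi - ex) * (yi - ey) for p, xi, yi in zip(probs, x, y))

value = mean([mi * di for mi, di in zip(m, d)])   # E(m d)
riskfree = 1.0 / mean(m)                          # R^f = 1 / E(m)
risk_neutral_part = mean(d) / riskfree            # E(d) / R^f
risk_adjustment = cov(d, m)                       # negative in this example

print(value, risk_neutral_part + risk_adjustment)
```

Because the dividend is low exactly when the discount factor is high, the covariance is negative and the claim is worth less than its expected dividend discounted at the riskfree rate.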

1.5 Discount factors in continuous time

Continuous time versions of the basic pricing equations.


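$$
\begin{array}{ll}
\text{Discrete} & \text{Continuous} \\[4pt]
p_t = E_t \sum_{j=1}^{\infty} \beta^j \frac{u'(c_{t+j})}{u'(c_t)} D_{t+j} & p_t u'(c_t) = E_t \int_{s=0}^{\infty} e^{-\delta s} u'(c_{t+s}) D_{t+s}\, ds \\[4pt]
m_{t+1} = \beta \frac{u'(c_{t+1})}{u'(c_t)} & \Lambda_t = e^{-\delta t} u'(c_t) \\[4pt]
p = E(mx) & 0 = \Lambda D\, dt + E_t[d(\Lambda p)] \\[4pt]
E(R) = R^f - R^f \mathrm{cov}(m, R) & E_t\left(\frac{dp}{p}\right) + \frac{D}{p}\, dt = r^f_t\, dt - E_t\left[\frac{d\Lambda}{\Lambda} \frac{dp}{p}\right]
\end{array}
$$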

It is often convenient to express asset pricing ideas in the language of continuous time
stochastic differential equations rather than discrete time stochastic difference equations as
I have done so far. The appendix contains a brief introduction to continuous time processes
that covers what you need to know for this book. Even if you want to end up with a discrete
time representation, manipulations are often easier in continuous time. For example, relating
interest rates and Sharpe ratios to consumption growth in the last section required a clumsy
lognormal approximation; you'll see the same sort of thing done much more cleanly in this
section.

The choice of discrete vs. continuous time is one of modeling convenience. The richness
of the theory of continuous time processes often allows one to obtain analytical results that
would be unavailable in discrete time. On the other hand, in the complexity of most practical
situations, one often ends up resorting to numerical simulation of a discretized model anyway.
In those cases, it might be clearer to start with a discrete model. But I emphasize this is all a
choice of language. One should become familiar enough with discrete as well as continuous
time representations of the same ideas to pick the representation that is most convenient for
a particular application.

First, we need to think about how to model securities, in place of price $p_t$ and one-period
payoff $x_{t+1}$. Let a generic security have price $p_t$ at any moment in time, and let it pay
dividends at the rate $D_t\,dt$. (I will continue to denote functions of time as $p_t$ rather than $p(t)$
to maintain continuity with the discrete-time treatment, and I will drop the time subscripts
where they are obvious, e.g. $dp$ in place of $dp_t$. In an interval $dt$, the security pays dividends
$D_t\,dt$. I use capital $D$ for dividends to distinguish them from the differential operator $d$.)
The instantaneous total return is

$$\frac{dp_t}{p_t} + \frac{D_t}{p_t}\,dt.$$

We model the price of risky assets as diffusions, for example

$$\frac{dp_t}{p_t} = \mu(\cdot)\,dt + \sigma(\cdot)\,dz.$$

(I will reserve the notation $dz$ for increments to a standard Brownian motion, e.g.
$z_{t+\Delta} - z_t \sim N(0, \Delta)$. I use the notation $(\cdot)$ to indicate that the drift and diffusion can be functions
of state variables. I limit the discussion to diffusion processes, with no jumps.) What's nice about
this diffusion model is that the increments $dz$ are normal; the dependence of $\mu$ and $\sigma$ on state
variables means that the finite time distribution of prices $f(p_{t+\Delta}|I_t)$ need not be normal.
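
For readers who want to experiment, the following is a rough sketch of how such a diffusion can be simulated on a discrete grid with an Euler scheme. It is my own illustration rather than a method from the text: the function name `simulate_price`, the drift and volatility functions, and all numbers are assumptions. Each step draws a normal increment with variance equal to the step length, and lets the volatility depend on the current price.

```python
import math
import random

def simulate_price(p0, mu, sigma, horizon, steps, rng):
    """Euler scheme for dp/p = mu(p) dt + sigma(p) dz; returns the whole path."""
    dt = horizon / steps
    path = [p0]
    for _ in range(steps):
        p = path[-1]
        dz = rng.gauss(0.0, math.sqrt(dt))   # increment z_{t+dt} - z_t ~ N(0, dt)
        path.append(p + p * (mu(p) * dt + sigma(p) * dz))
    return path

rng = random.Random(0)
# Assumed state dependence: volatility is higher when the price is low.
path = simulate_price(
    1.0,
    mu=lambda p: 0.06,
    sigma=lambda p: 0.1 + 0.1 / (1.0 + p * p),
    horizon=1.0,
    steps=250,
    rng=rng,
)
print(len(path), path[-1])
```

Each increment is normal given the current state, but since the volatility changes along the path, the distribution of the end-of-year price across many simulated paths is generally not normal.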
We can think of a riskfree security as one that has a constant price equal to one and pays
the riskfree rate as a dividend,

$$p = 1; \quad D_t = r^f_t, \tag{1.25}$$

or as a security that pays no dividend but whose price climbs deterministically at a rate

$$\frac{dp_t}{p_t} = r^f_t\,dt. \tag{1.26}$$

Next, we need to express the first order conditions in continuous time. The utility function is

$$U(\{c_t\}) = E \int_{t=0}^{\infty} e^{-\delta t} u(c_t)\,dt.$$

Suppose the investor can buy a security whose price is $p_t$ and that pays a dividend stream $D_t$.
As we did in deriving the present value price relation in discrete time, the first order condition
for this problem gives us the infinite period version of the basic pricing equation right away,¹

$$p_t u'(c_t) = E_t \int_{s=0}^{\infty} e^{-\delta s} u'(c_{t+s}) D_{t+s}\,ds. \tag{1.27}$$

This equation is an obvious continuous time analogue to

$$p_t = E_t \sum_{j=1}^{\infty} \beta^j \frac{u'(c_{t+j})}{u'(c_t)} D_{t+j}.$$

It turns out that dividing by $u'(c_t)$ is not a good idea in continuous time, since the ratio
$u'(c_{t+\Delta})/u'(c_t)$ isn't well behaved for small time intervals. Instead, we keep track of the
level of marginal utility. Therefore, define the "discount factor" in continuous time as

$$\Lambda_t \equiv e^{-\delta t} u'(c_t).$$

Then we can write the pricing equation as

$$p_t \Lambda_t = E_t \int_{s=0}^{\infty} \Lambda_{t+s} D_{t+s}\,ds. \tag{1.28}$$

¹ One unit of the security pays the dividend stream $D_t$, i.e. $D_t\,dt$ units of the numeraire consumption good in a
time interval $dt$. The security costs $p_t$ units of the consumption good. The investor can finance the purchase of $\xi$
units of the security by reducing consumption from $e_t$ to $c_t = e_t - \xi p_t/dt$ during time interval $dt$. The loss in
utility from doing so is $u'(c_t)(e_t - c_t)\,dt = u'(c_t)\xi p_t$. The gain is the right hand side of (1.27) multiplied by $\xi$.


(Some people like to define $\Lambda_t = u'(c_t)$, in which case you keep the $e^{-\delta t}$ in the equation, or
to scale $\Lambda_t$ by the riskfree rate, in which case you get an extra $e^{-\int_{\tau=0}^{s} r^f_{t+\tau}\,d\tau}$ in the equation.
The latter procedure makes it look like a risk-neutral or present-value formula valuation.)
The analogue to the one period pricing equation $p = E(mx)$ is

$$0 = \Lambda D\,dt + E_t[d(\Lambda p)]. \tag{1.29}$$

To derive this fundamental equation, take the difference of equation (1.28) at $t$ and $t+\Delta$.
(Or, start directly with the first order condition for buying the security at $t$ and selling it at
$t+\Delta$.)

$$p_t \Lambda_t = E_t \int_{s=0}^{\Delta} \Lambda_{t+s} D_{t+s}\,ds + E_t[\Lambda_{t+\Delta}\, p_{t+\Delta}].$$

For $\Delta$ small the term in the integral can be approximated,

$$p_t \Lambda_t \approx \Lambda_t D_t \Delta + E_t[\Lambda_{t+\Delta}\, p_{t+\Delta}].$$

We want to get to $d$ something, so introduce differences by writing

$$p_t \Lambda_t \approx \Lambda_t D_t \Delta + E_t[\Lambda_t p_t + (\Lambda_{t+\Delta}\, p_{t+\Delta} - \Lambda_t p_t)].$$

Canceling $p_t \Lambda_t$,

$$0 \approx \Lambda_t D_t \Delta + E_t(\Lambda_{t+\Delta}\, p_{t+\Delta} - \Lambda_t p_t).$$

Taking the limit as $\Delta \to 0$,

$$0 = \Lambda_t D_t\,dt + E_t[d(\Lambda_t p_t)]$$

or, dropping time subscripts, equation (1.29).
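
As a sanity check on (1.29), consider a special case that is my own illustration rather than part of the derivation: no uncertainty, constant consumption $c$, and a constant dividend rate $D$. Then $\Lambda_t = e^{-\delta t} u'(c)$, and the present value relation (1.28) gives the price directly:

$$
\begin{aligned}
p_t \Lambda_t &= \int_{s=0}^{\infty} e^{-\delta (t+s)} u'(c)\, D\,ds = \Lambda_t \frac{D}{\delta}
\quad\Longrightarrow\quad p_t = \frac{D}{\delta}, \\
d(\Lambda_t p_t) &= \frac{D}{\delta}\, d\Lambda_t = \frac{D}{\delta}\,(-\delta \Lambda_t\,dt) = -\Lambda_t D\,dt .
\end{aligned}
$$

So $0 = \Lambda_t D\,dt + d(\Lambda_t p_t)$ holds exactly in this case: the marginal utility-weighted price declines at just the rate needed to offset the dividend flow.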
Equation (1.29) looks different than $p = E(mx)$ because there is no price on the left
hand side; we are used to thinking of the one period pricing equation as determining price at
$t$ given other things, including price at $t+1$. But price at $t$ is really here, of course, as you
can see from equation (1.30) or (1.31). It is just easier to express the difference in price over
time rather than price today on the left and payoff (including price tomorrow) on the right.
With no dividends and constant $\Lambda$, $0 = E_t(dp_t) = E_t(p_{t+\Delta} - p_t)$ says that price should
follow a martingale. Thus, $E_t[d(\Lambda p)] = 0$ means that marginal utility-weighted price should
follow a martingale, and (1.29) adjusts for dividends. Thus, it's the same as the two-period
equation (1.24), $p_t = E_t[m_{t+1}(p_{t+1} + d_{t+1})]$, that we derived in discrete time.
Since we will write down price processes for $dp$ and discount factor processes for $d\Lambda$, and
to interpret (1.29) in terms of expected returns, it is often convenient to break up the $d(\Lambda_t p_t)$
term using Ito's lemma:

$$d(\Lambda p) = p\,d\Lambda + \Lambda\,dp + dp\,d\Lambda. \tag{1.32}$$

Using the expanded version (1.32) in the basic equation (1.29), and dividing by $p\Lambda$ to make
it pretty, we obtain an equivalent, slightly less compact but slightly more intuitive version,

$$0 = \frac{D}{p}\,dt + E_t\left[\frac{d\Lambda}{\Lambda} + \frac{dp}{p} + \frac{d\Lambda}{\Lambda}\frac{dp}{p}\right]. \tag{1.33}$$

(This formula only works when both $\Lambda$ and $p$ can never be zero. That is often enough the case
that this formula is useful. If not, multiply through by $\Lambda$ and $p$ and keep them in numerators.)
Applying the basic pricing equations (1.29) or (1.33) to a riskfree rate, defined as (1.25)
or (1.26), we obtain

$$r^f_t\,dt = -E_t\left(\frac{d\Lambda_t}{\Lambda_t}\right). \tag{1.34}$$

This equation is the obvious continuous time equivalent to

$$R^f_t = \frac{1}{E_t(m_{t+1})}.$$

If a riskfree rate is not traded, we can use (1.34) to define a shadow riskfree rate or zero-beta
rate.
With this interpretation, we can rearrange equation (1.33) as

$$E_t\left(\frac{dp_t}{p_t}\right) + \frac{D_t}{p_t}\,dt = r^f_t\,dt - E_t\left[\frac{d\Lambda_t}{\Lambda_t}\frac{dp_t}{p_t}\right]. \tag{1.35}$$

This is the obvious continuous-time analogue to

$$E(R) = R^f - R^f \mathrm{cov}(m, R). \tag{1.36}$$

The last term in (1.35) is the covariance of the return with the discount factor or marginal
utility. Since means are order $dt$, there is no difference between covariance and second moment
in the last term of (1.35). The interest rate component of the last term of (1.36) naturally
vanishes as the time interval gets short.
Ito's lemma makes many transformations simple in continuous time. For example, the
nonlinear transformation between consumption and the discount factor led us to some tricky
approximations in discrete time. This transformation is easy in continuous time (diffusions
are locally normal, so it's really the same trick). With $\Lambda_t = e^{-\delta t} u'(c_t)$ we have

$$d\Lambda_t = -\delta e^{-\delta t} u'(c_t)\,dt + e^{-\delta t} u''(c_t)\,dc_t + \frac{1}{2} e^{-\delta t} u'''(c_t)\,dc_t^2$$

$$\frac{d\Lambda_t}{\Lambda_t} = -\delta\,dt + \frac{c_t u''(c_t)}{u'(c_t)} \frac{dc_t}{c_t} + \frac{1}{2} \frac{c_t^2 u'''(c_t)}{u'(c_t)} \frac{dc_t^2}{c_t^2}. \tag{1.37}$$

Denote the local curvature and third derivative of the utility function as

$$\gamma_t = -\frac{c_t u''(c_t)}{u'(c_t)}, \qquad \eta_t = \frac{c_t^2 u'''(c_t)}{u'(c_t)}.$$

(For power utility, the former is the power coefficient $\gamma$ and the latter is $\eta_t = \gamma(\gamma + 1)$.)
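
The power utility claim is a one-line check. Starting from $u'(c) = c^{-\gamma}$ and differentiating twice,

$$
\begin{aligned}
u''(c) &= -\gamma\, c^{-\gamma - 1}, & \qquad -\frac{c\,u''(c)}{u'(c)} &= \gamma, \\
u'''(c) &= \gamma(\gamma + 1)\, c^{-\gamma - 2}, & \qquad \frac{c^2 u'''(c)}{u'(c)} &= \gamma(\gamma + 1),
\end{aligned}
$$

so for power utility neither the curvature nor the third-derivative coefficient depends on the level of consumption.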
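Using this formula we can quickly redo the relationship between interest rates and consumption
growth, equation (1.7),

$$r^f_t = -\frac{1}{dt} E_t\left(\frac{d\Lambda_t}{\Lambda_t}\right) = \delta + \gamma_t \frac{1}{dt} E_t\left(\frac{dc_t}{c_t}\right) - \frac{1}{2} \eta_t \frac{1}{dt} E_t\left(\frac{dc_t^2}{c_t^2}\right).$$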
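
To get a feel for the magnitudes, here is a small sketch that evaluates this interest rate formula. It assumes power utility (so the curvature is $\gamma$ and the third-derivative coefficient is $\gamma(\gamma+1)$) and a consumption process $dc/c = \mu_c\,dt + \sigma_c\,dz$, so that the two expectations per unit time are $\mu_c$ and $\sigma_c^2$. The function name `riskfree_rate` and the parameter values are my own choices for illustration.

```python
def riskfree_rate(delta, gamma, mu_c, sigma_c):
    """r^f = delta + gamma * mu_c - 0.5 * gamma * (gamma + 1) * sigma_c**2.

    Assumes power utility and dc/c = mu_c dt + sigma_c dz, so that
    E(dc/c)/dt = mu_c and E(dc^2/c^2)/dt = sigma_c**2.
    """
    third_derivative_coef = gamma * (gamma + 1.0)
    return delta + gamma * mu_c - 0.5 * third_derivative_coef * sigma_c ** 2

# Assumed annual numbers: 1% impatience, 2% mean growth, 2% consumption volatility.
r = riskfree_rate(delta=0.01, gamma=2.0, mu_c=0.02, sigma_c=0.02)
print(r)
```

Higher impatience and higher expected growth raise the rate, while consumption volatility lowers it through the last term, which is usually described as a precautionary savings effect.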

We can also easily express asset prices in terms of consumption risk rather than discount
factor risk, as in equation (1.16). Using (1.37) in (1.35),

$$E_t\left(\frac{dp_t}{p_t}\right) + \frac{D_t}{p_t}\,dt - r^f_t\,dt = \gamma E_t\left(\frac{dc_t}{c_t}\frac{dp_t}{p_t}\right). \tag{1.38}$$

Thus, assets whose returns covary more strongly with consumption get higher mean returns,
and the constant relating covariance to mean return is the utility curvature coefficient $\gamma$.
Since correlations are less than one, equation (1.38) implies that Sharpe ratios are related
to utility curvature and consumption volatility directly; we don't need the ugly lognormal
facts and an approximation that we needed in (1.20). Using $\mu_p \equiv E_t(dp_t/p_t)$;
$\sigma_p^2 = E_t[(dp_t/p_t)^2]$; $\sigma_c^2 = E_t[(dc_t/c_t)^2]$,

$$\frac{\mu_p + \frac{D_t}{p_t}\,dt - r^f_t\,dt}{\sigma_p} \le \gamma \sigma_c.$$
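
The bound is easy to explore numerically. In the sketch below, which uses invented annual numbers and helper names of my own (`risk_premium`, `sharpe_ratio`, `sharpe_bound`), the expected excess return is $\gamma$ times the covariance of returns with consumption growth, written as correlation times the two volatilities. Dividing by return volatility gives a Sharpe ratio that reaches $\gamma\sigma_c$ only when the correlation is one.

```python
def risk_premium(gamma, corr, sigma_c, sigma_p):
    """Expected excess return per unit time: gamma * corr * sigma_c * sigma_p."""
    return gamma * corr * sigma_c * sigma_p

def sharpe_ratio(gamma, corr, sigma_c, sigma_p):
    return risk_premium(gamma, corr, sigma_c, sigma_p) / sigma_p

def sharpe_bound(gamma, sigma_c):
    """Upper bound gamma * sigma_c, reached only when the correlation is one."""
    return gamma * sigma_c

# Assumed annual numbers, for illustration only.
gamma, sigma_c, sigma_p = 10.0, 0.02, 0.16
for corr in (0.2, 0.6, 1.0):
    print(corr, sharpe_ratio(gamma, corr, sigma_c, sigma_p), sharpe_bound(gamma, sigma_c))
```

Read the other way, an observed Sharpe ratio divided by consumption volatility gives the smallest curvature coefficient consistent with the bound.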

1.6 Problems

1. (a) The absolute risk aversion coefficient is

$$-\frac{u''(c)}{u'(c)}.$$

We scale by $u'(c)$ because expected utility is only defined up to linear
transformations ($a + bu(c)$ gives the same predictions as $u(c)$) and this measure of
the second derivative is invariant to linear transformations. Show that the utility
function with constant absolute risk aversion is

$$u(c) = -e^{-\alpha c}.$$

(b) The coefficient of relative risk aversion in a one-period model (i.e. when
consumption equals wealth) is defined as

$$rra = -\frac{c\,u''(c)}{u'(c)}.$$

For power utility $u'(c) = c^{-\gamma}$, show that the risk aversion coefficient equals the power.
(c) The elasticity of intertemporal substitution is defined as

$$\xi^I \equiv -\frac{c_2/c_1\; d(c_1/c_2)}{dR/R}.$$

Show that with power utility $u'(c) = c^{-\gamma}$, the intertemporal substitution elasticity is
equal to $1/\gamma$.
2. Show that the "idiosyncratic risk" line in Figure 1 is horizontal.
3. (a) Suppose you have a mean-variance efficient return $R^{mv}$ and the risk free rate.
Using the fact that $R^{mv}$ is perfectly correlated with the discount factor, construct a
discount factor $m$ in terms of $R^f$ and $R^{mv}$, with no free parameters. (The constants
in $m = a + bR^{mv}$ will be functions of things like $E(R^{mv})$.)
(b) Using this result, and the beta model in terms of $m$, show that expected returns can
be described in a single-beta representation using any mean-variance efficient
return (except the riskfree rate),

$$E(R^i) = R^f + \beta_{i,mv}\left[E(R^{mv}) - R^f\right].$$

4. Can the "Sharpe ratio" between two risky assets exceed the slope of the mean-variance
frontier? I.e. if $R^{mv}$ is on the frontier, is it possible that

$$\frac{E(R^i) - E(R^j)}{\sigma(R^i - R^j)} > \frac{E(R^{mv}) - R^f}{\sigma(R^{mv})}\,?$$
5. Show that if consumption growth is lognormal, then

$$\left|\frac{E(R^{mv}) - R^f}{\sigma(R^{mv})}\right| = \frac{\sigma\left[(c_{t+1}/c_t)^{-\gamma}\right]}{E\left[(c_{t+1}/c_t)^{-\gamma}\right]} = \sqrt{e^{\gamma^2 \sigma^2(\Delta \ln c_{t+1})} - 1} \approx \gamma \sigma(\Delta \ln c).$$

(Start with $\sigma^2(x) = E(x^2) - E(x)^2$ and the lognormal property $E(e^z) = e^{E z + \frac{1}{2}\sigma^2(z)}$.)
6. There are assets with mean return equal to the riskfree rate, but substantial standard
deviation of returns. Long term bonds are pretty close examples. Why would anyone
hold such an asset?
7. The first order conditions for an infinitely lived consumer who can buy an asset with
dividend stream $\{d_t\}$ are

$$p_t = E_t \sum_{j=1}^{\infty} \beta^j \frac{u'(c_{t+j})}{u'(c_t)} d_{t+j}. \tag{1.39}$$

The first order conditions for buying a security with price $p_t$ and payoff
$x_{t+1} = d_{t+1} + p_{t+1}$ are

$$p_t = E_t\left[\beta \frac{u'(c_{t+1})}{u'(c_t)} (p_{t+1} + d_{t+1})\right]. \tag{1.40}$$

(a) Derive (1.40) from (1.39).
(b) Derive (1.39) from (1.40). You need an extra condition. Show that this extra
condition is a first order condition for maximization. To do this, think about what
strategy the consumer could follow to improve utility if the condition did not hold.
8. Suppose a consumer has a utility function that includes leisure. (This could also be a
second good, or a good produced in another country.) Using the continuous time setup,
show that expected returns will now depend on two covariances, the covariance of returns
with leisure and the covariance of returns with consumption, so long as leisure enters
non-separably, i.e. $u(c, l)$ cannot be written $v(c) + w(l)$. (This is a three line problem,
but you need to apply Ito's lemma to $\Lambda$.)
9. From

$$1 = E(mR)$$

show that the negative of the mean log discount factor must be larger than any mean
log return,

$$-E(\ln m) > E(\ln R).$$

How is it possible that $E(\ln R)$ is bounded? What about returns of the form
$R = (1 - \alpha)R^f + \alpha R^m$ for arbitrarily large $\alpha$? (Hint: start by assuming $m$ and $R$
are lognormal. Then see if you can generalize the results using Jensen's inequality,
$E(f(x)) > f(E(x))$ for $f$ convex. The return that solves $\max_R E(\ln R)$ is known as the
growth optimal portfolio.)

Chapter 2. Applying the basic model

2.1 Assumptions and applicability

Writing p = E(mx), we do not assume

1. Markets are complete, or there is a representative investor
2. Asset returns or payoffs are normally distributed (no options), or independent over time.
3. Two period investors, quadratic utility, or separable utility
4. Investors have no human capital or labor income
5. The market has reached equilibrium, or individuals have bought all the securities they
want to.

All of these assumptions come later, in various special cases, but we haven't made them
yet. We do assume that the investor can consider a small marginal investment or disinvestment.

The theory of asset pricing contains lots of assumptions to derive analytically convenient
special cases and empirically useful representations. In writing $p = E(mx)$ or
$p\,u'(c_t) = E_t[\beta u'(c_{t+1}) x_{t+1}]$ we have not made most of these assumptions.
We have not assumed complete markets or a representative investor. These equations
apply to each individual investor, for each asset to which he has access, independently of
the presence or absence of other investors or other assets. Complete markets/representative
agent assumptions are used if one wants to use aggregate consumption data in $u'(c_t)$, or other
specializations and simplifications of the model.
We have not said anything about payoff or return distributions. In particular, we have
not assumed that returns are normally distributed or that utility is quadratic. The basic
pricing equation should hold for any asset, stock, bond, option, real investment opportunity,
etc., and any monotone and concave utility function. In particular, it is often thought that
mean-variance analysis and beta pricing models require these kinds of limiting assumptions
or quadratic utility, but that is not the case. A mean-variance efficient return carries all pricing
information no matter what the distribution of payoffs, utility function, etc.
This is not a "two-period model." The fundamental pricing equation holds for any two
periods of a multi-period model, as we have seen. Really, everything involves conditional
moments, so we have not assumed i.i.d. returns over time.
I have written things down in terms of a time- and state-separable utility function and I
have extensively used the convenient power utility example. Nothing important lies in either
choice. Just interpret $u'(c_t)$ as the partial derivative of a general utility function with respect
to consumption at time t. State- or time-nonseparable utility (habit persistence, durability)
complicates the relation between the discount factor and real variables, but does not change
p = E(mx) or any of the basic structure. We will look at several examples below.
We do not assume that investors have no non-marketable human capital, or no outside
sources of income. The first order conditions for purchase of an asset relative to consumption
hold no matter what else is in the budget constraint. By contrast, the portfolio approach to
asset pricing as in the CAPM and ICAPM relies heavily on the assumption that the investor
has no non-asset income, and we will study these special cases below. For example, leisure
in the utility function just means that $u'(c, l)$ may depend on $l$ as well as $c$.
We don't even really need the assumption (yet) that the market is "in equilibrium," that
the investor has bought all of the asset that he wants to, or even that he can buy the asset at all.
We can interpret $p = E(mx)$ as giving us the value, or willingness to pay for, a small amount
of a payoff $x_{t+1}$ that the investor does not yet have. Here's why: If the investor had a little $\xi$
more of the payoff $x_{t+1}$ at time $t+1$, his utility $u(c_t) + \beta E_t u(c_{t+1})$ would increase by

$$\beta E_t[u(c_{t+1} + \xi x_{t+1}) - u(c_{t+1})] = \beta E_t\left[u'(c_{t+1}) x_{t+1} \xi + \frac{1}{2} u''(c_{t+1}) (x_{t+1} \xi)^2 + \ldots\right].$$

If $\xi$ is small, only the first term on the right matters. If the investor has to give up a small
amount of money $v_t \xi$ at time $t$, that loss lowers his utility by

$$u(c_t) - u(c_t - v_t \xi) = u'(c_t) v_t \xi - \frac{1}{2} u''(c_t) (v_t \xi)^2 + \ldots$$

Again, for small $\xi$, only the first term matters. Therefore, in order to receive the small extra
payoff $\xi x_{t+1}$, the investor is willing to pay the small amount $v_t \xi$ where

$$v_t = E_t\left[\beta \frac{u'(c_{t+1})}{u'(c_t)} x_{t+1}\right].$$

If this private valuation is higher than the market value $p_t$, and if the investor can buy
some more of the asset, he will. As he buys more, his consumption will change; it will be
higher in states where $x_{t+1}$ is higher, driving down $u'(c_{t+1})$ in those states, until the value
to the investor has declined to equal the market value. Thus, after an investor has reached
his optimal portfolio, the market value should obey the basic pricing equation as well, using
post-trade or equilibrium consumption. But the formula can also be applied to generate
the marginal private valuation, using pre-trade consumption, or to value a potential, not yet
traded security.
We have calculated the value of a "small" or marginal portfolio change for the investor.
For some investment projects, an investor cannot take a small ("diversified") position. For example,
a venture capitalist or entrepreneur must usually take all or nothing of a project with
payoff stream $\{x_t\}$. Then the value of a project not already taken, $E \sum_j \beta^j u(c_{t+j} + x_{t+j})$,
might be substantially different from its marginal counterpart, $E \sum_j \beta^j u'(c_{t+j}) x_{t+j}$. Once
the project is taken of course, $c_{t+j} + x_{t+j}$ becomes $c_{t+j}$, so the marginal valuation still applies
to the ex-post consumption stream. Analysts often forget this point and apply marginal
(diversified) valuation models such as the CAPM to projects that must be bought in discrete
chunks. Also, we have abstracted from short sales and bid/ask spreads; this modification
changes $p = E(mx)$ from an equality to a set of inequalities.
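
The difference between marginal and all-or-nothing valuation can be seen in a small example. The sketch below is an illustration under assumptions of my own: a two-state world, power utility with $\gamma = 2$, and invented numbers. The helpers `marginal_value` and `exact_value_per_unit` are hypothetical names. The first applies the first order formula at pre-trade consumption; the second finds the payment per unit that leaves total utility unchanged when the investor takes $\xi$ units at once.

```python
BETA, GAMMA = 0.95, 2.0
C0 = 1.0                  # consumption today
PROBS = [0.5, 0.5]        # two equally likely states tomorrow
C1 = [0.9, 1.1]           # pre-trade consumption tomorrow
X = [0.5, 1.5]            # payoff of the new security in each state

def u(c):
    return c ** (1.0 - GAMMA) / (1.0 - GAMMA)

def u_inverse(y):
    return ((1.0 - GAMMA) * y) ** (1.0 / (1.0 - GAMMA))

def marginal_value():
    """v = E[beta u'(c1)/u'(c0) x], the first-order willingness to pay per unit."""
    return sum(p * BETA * (c1 / C0) ** (-GAMMA) * x for p, c1, x in zip(PROBS, C1, X))

def exact_value_per_unit(xi):
    """Payment per unit that leaves total utility unchanged when buying xi units."""
    gain = BETA * sum(p * (u(c1 + xi * x) - u(c1)) for p, c1, x in zip(PROBS, C1, X))
    payment = C0 - u_inverse(u(C0) - gain)
    return payment / xi

print(marginal_value())
for xi in (1e-4, 0.1, 1.0):
    print(xi, exact_value_per_unit(xi))
```

For a tiny position the exact value per unit is close to the marginal value; for a large position it is lower, since concave utility makes each extra unit of payoff worth less and each extra unit of payment cost more.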

2.2 General Equilibrium

Asset returns and consumption: which is the chicken and which is the egg? The exogenous
return model, the endowment economy model, and the argument that it doesn't matter for
studying p = E(mx).

So far, we have not said where the joint statistical properties of the payoff $x_{t+1}$ and
marginal utility $m_{t+1}$ or consumption $c_{t+1}$ come from. We have also not said anything
about the fundamental exogenous shocks that drive the economy. The basic pricing equation
p = E(mx) tells us only what the price should be, given the joint distribution of consumption
(marginal utility, discount factor) and the asset payoff.
There is nothing that stops us from writing the basic pricing equation as

$$u'(c_t) = E_t[\beta u'(c_{t+1}) x_{t+1}/p_t].$$

We can think of this equation as determining today's consumption given asset prices and
payoffs, rather than determining today's asset price in terms of consumption and payoffs.
Thinking about the basic first order condition in this way gives the permanent income model
of consumption.
Which is the chicken and which is the egg? Which variable is exogenous and which is endogenous?
The answer is, neither, and for many purposes, it doesn't matter. The first order
conditions characterize any equilibrium; if you happen to know E(mx), you can use them to
determine p; if you happen to know p, you can use them to determine consumption and sav-
ings decisions. For most asset pricing applications we are interested in understanding a wide
cross-section of assets. Thus, it is interesting to contrast the cross-sectional variation in their
prices (expected returns) with cross-sectional variation in their second moments (betas) with
a single discount factor. In most applications, the discount factor is a function of aggregate
variables (market return, aggregate consumption), so it is plausible to hold the properties of the
discount factor constant as we study one individual asset after another. Permanent income
studies typically dramatically restrict the number of assets under consideration, often to just
an interest rate, and study the time-series evolution of aggregate or individual consumption.
Nonetheless, it is an obvious next step to complete the solution of our model economy; to
find $c$ and $p$ in terms of truly exogenous forces. The results will of course depend on what the
rest of the economy looks like, in particular the production or intertemporal transformation
technology and the set of markets.
Figure 2 shows one possibility for a general equilibrium. Suppose that the production
technologies are linear: the real, physical rate of return (the rate of intertemporal transfor-
mation) is not affected by how much is invested. Now consumption must adjust to these
technologically given rates of return. If the rates of return on the intertemporal technologies
were to change, the consumption process would have to change as well. This is, implicitly,
how the permanent income model works. This is how many finance theories such as the
CAPM and ICAPM and the Cox, Ingersoll and Ross (1986) model of the term structure work
as well. These models specify the return process, and then solve the consumer's portfolio and
consumption rules.




Figure 2. Consumption adjusts when the rate of return is determined by a linear technology.

Figure 3 shows another extreme possibility for the production technology. This is an
"endowment economy." Nondurable consumption appears (or is produced by labor) every
period. There is nothing anyone can do to save, store, invest or otherwise transform consumption
goods this period to consumption goods next period. Hence, asset prices must
adjust until people are just happy consuming the endowment process. In this case consumption
is exogenous and asset prices adjust. Lucas (1978) and Mehra and Prescott (1985) are
two very famous applications of this sort of "endowment economy."

Figure 3. Asset prices adjust to consumption in an endowment economy.

Which of these possibilities is correct? Well, neither, of course. The real economy and all
serious general equilibrium models look something like figure 4: one can save or transform
consumption from one date to the next, but at a decreasing rate. As investment increases,
rates of return decline.

Figure 4. General equilibrium. The solid lines represent the indifference curve and production
possibility set. The dashed straight line represents the equilibrium rate of return.
The dashed box represents an endowment economy that predicts the same consumption-asset
return process.


Does this observation invalidate any modeling we do with the linear technology (CAPM,
CIR, permanent income) model, or the endowment economy model? No. Start at the equilibrium
in figure 4. Suppose we model this economy as a linear technology, but we happen to
choose for the rate of return on the linear technologies exactly the same stochastic process for
returns that emerges from the general equilibrium. The resulting joint consumption-asset return
process is exactly the same as in the original general equilibrium! Similarly, suppose we
model this economy as an endowment economy, but we happen to choose for the endowment
process exactly the stochastic process for consumption that emerges from the equilibrium
with a concave technology. Again, the joint consumption-asset return process is exactly the
same.
Therefore, there is nothing wrong in adopting one of the following strategies for empirical
work:

1. Form a statistical model of bond and stock returns, solve the optimal consumption-
portfolio decision. Use the equilibrium consumption values in p = E(mx).
2. Form a statistical model of the consumption process, calculate asset prices and returns
directly from the basic pricing equation p = E(mx).
3. Form a completely correct general equilibrium model, including the production
technology, utility function and specification of the market structure. Derive the
equilibrium consumption and asset price process, including p = E(mx) as one of the
equilibrium conditions.

If the statistical models for consumption and/or asset returns are right, i.e. if they coincide
with the equilibrium consumption or return process generated by the true economy, either of
the first two approaches will give correct predictions for the joint consumption-asset return
process.
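
A deliberately simple example may help fix the idea that the first two strategies describe the same equilibrium. The sketch below assumes no uncertainty and power utility, with parameter values and function names that are my own. Taking the return as given, the first order condition $u'(c_t) = \beta u'(c_{t+1}) R$ pins down consumption growth; taking that consumption growth as an endowment, the same condition returns the original interest rate.

```python
BETA, GAMMA = 0.95, 2.0

def consumption_growth_given_return(gross_return):
    """Linear technology view: R is given, the first order condition pins down growth."""
    return (BETA * gross_return) ** (1.0 / GAMMA)

def return_given_consumption_growth(growth):
    """Endowment economy view: growth is given, the same condition pins down R."""
    return 1.0 / (BETA * growth ** (-GAMMA))

R = 1.08                                         # assumed technological return
g = consumption_growth_given_return(R)           # returns -> consumption
R_implied = return_given_consumption_growth(g)   # consumption -> returns
print(g, R_implied)
```

Both directions use one equation, so they cannot disagree about the joint behavior of consumption and returns; they differ only in which variable the modeler chooses to specify first.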
As we will see, most finance models, developed from the 1950s through the early 1970s,
take the return process as given, implicitly assuming linear technologies. The endowment
economy approach, introduced by Lucas (1978), is a breakthrough because it turns out to be
much easier. It is much easier to evaluate $p = E(mx)$ for fixed $m$ than it is to solve joint
consumption-portfolio problems for given asset returns, all to derive the equilibrium consumption
process. To solve a consumption-portfolio problem we have to model the investor's
entire environment: we have to specify all the assets to which he has access, what his labor
income process looks like (or wage rate process, and include a labor supply decision).
Once we model the consumption stream directly, we can look at each asset in isolation, and
the actual computation is almost trivial. This breakthrough accounts for the unusual struc-
ture of the presentation in this book. It is traditional to start with an extensive study of
consumption-portfolio problems. But by modeling consumption directly, we have been able
to study pricing directly, and portfolio problems are an interesting side trip which we can

Most uses of $p = E(mx)$ do not require us to take any stand on exogeneity or endogeneity,
or general equilibrium. This is a condition that must hold for any asset, for any
production technology. Having a taste of the extra assumptions required for a general equilibrium
model, you can now appreciate why people stop short of full solutions when they can
address an application using only the first order conditions, using knowledge of $E(mx)$ to
make a prediction about p.
It is enormously tempting to slide into an interpretation that $E(mx)$ determines $p$. We routinely
think of betas and factor risk prices (components of $E(mx)$) as determining expected
returns. For example, we routinely say things like "the expected return of a stock increased
because the firm took on riskier projects, thereby increasing its $\beta$." But the whole consumption
process, discount factor, and factor risk premia change when the production technology
changes. Similarly, we are on thin ice if we say anything about the effects of policy interven-
tions, new markets and so on. The equilibrium consumption or asset return process one has
modeled statistically may change in response to such changes in structure. For such ques-
tions one really needs to start thinking in general equilibrium terms. It may help to remember
that there is an army of permanent-income macroeconomists who make precisely the oppo-
site assumption, taking our asset return processes as exogenous and studying (endogenous)
consumption and savings decisions.

2.3 Consumption-based model in practice

The consumption-based model is, in principle, a complete answer to all asset pricing
questions, but works poorly in practice. This observation motivates other asset pricing models.

The model I have sketched so far can, in principle, give a complete answer to all the
questions of the theory of valuation. It can be applied to any security (bonds, stocks, options,
futures, etc.) or to any uncertain cash flow. All we need is a functional form for utility,
numerical values for the parameters, and a statistical model for the conditional distribution of
consumption and payoffs.
To be specific, consider the standard power utility function

$$u'(c) = c^{-\gamma}. \tag{2.41}$$

Then, excess returns should obey

$$0 = E_t\left[\beta \left(\frac{c_{t+1}}{c_t}\right)^{-\gamma} R^e_{t+1}\right]. \tag{2.42}$$

Taking unconditional expectations and applying the covariance decomposition, expected excess
returns should follow

$$E(R^e_{t+1}) = -R^f \mathrm{cov}\left[\beta \left(\frac{c_{t+1}}{c_t}\right)^{-\gamma}, R^e_{t+1}\right]. \tag{2.43}$$

Given a value for $\gamma$, and data on consumption and returns, one can easily estimate the mean
and covariance on the right hand side, and check whether actual expected returns are, in fact,
in accordance with the formula.
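
As a rough sketch of that check, the code below computes the sample analogue of the right hand side of (2.43) and compares it with the sample mean excess return. Everything specific here is an assumption for illustration: the six observations are made up (they are not data), $\beta = 0.98$ and $\gamma = 5$ are arbitrary, and `predicted_excess_return` is a hypothetical helper. The riskfree rate is taken to be $1/E(m)$.

```python
def mean(xs):
    return sum(xs) / len(xs)

def cov(xs, ys):
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)

def predicted_excess_return(cons_growth, excess_returns, beta, gamma):
    """Sample analogue of -R^f cov(m, R^e), with m = beta * growth**(-gamma), R^f = 1/E(m)."""
    m = [beta * g ** (-gamma) for g in cons_growth]
    return -cov(m, excess_returns) / mean(m)

# Made-up numbers for illustration only, not data.
growth = [1.03, 0.99, 1.02, 1.01, 0.98, 1.04]      # gross consumption growth
excess = [0.12, -0.08, 0.05, 0.02, -0.10, 0.15]    # excess returns

predicted = predicted_excess_return(growth, excess, beta=0.98, gamma=5.0)
actual = mean(excess)
pricing_error = actual - predicted
print(actual, predicted, pricing_error)
```

The pricing error computed this way equals the sample mean of $m R^e$ divided by the sample mean of $m$, so asking whether it is zero is the same as asking whether the moment condition (2.42) holds in the sample.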
Similarly, the present value formula is

$$p_t = E_t \sum_{j=1}^{\infty} \beta^j \left(\frac{c_{t+j}}{c_t}\right)^{-\gamma} d_{t+j}.$$

Given data on consumption and dividends or another stream of payoffs, we can estimate the
right hand side and check it against prices on the left.
Bonds and options do not require separate valuation theories. For example, an $N$-period
default-free nominal discount bond (a U.S. Treasury strip) is a claim to one dollar at time
$t+N$. Its price should be

$$p_t = E_t\left(\beta^N \left(\frac{c_{t+N}}{c_t}\right)^{-\gamma} \frac{\Pi_t}{\Pi_{t+N}}\, 1\right)$$

where $\Pi$ = price level (\$/good). A European option is a claim to the payoff $\max(S_{t+T} - K, 0)$,
where $S_{t+T}$ = stock price at time $t+T$, $K$ = strike price. The option price should be

$$p_t = E_t\left[\beta^T \left(\frac{c_{t+T}}{c_t}\right)^{-\gamma} \max(S_{t+T} - K, 0)\right].$$

Again, we can use data on consumption, prices and payoffs to check these predictions.
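
To show that the same discount factor handles both payoffs, here is a minimal sketch with three hypothetical states one period ahead. The state probabilities, consumption growth, price level ratios and stock prices are all invented, and `discount_factor`, `nominal_bond_price` and `call_price` are names of my own. Following the option formula as written above, the option payoff is discounted without a price level adjustment.

```python
BETA, GAMMA, T = 0.98, 2.0, 1

# Hypothetical states at t + T: (probability, c_{t+T}/c_t, Pi_t/Pi_{t+T}, S_{t+T})
STATES = [
    (0.25, 0.97, 0.99, 80.0),
    (0.50, 1.02, 0.98, 105.0),
    (0.25, 1.05, 0.96, 130.0),
]

def discount_factor(growth):
    """beta^T (c_{t+T}/c_t)^(-gamma) for power utility."""
    return BETA ** T * growth ** (-GAMMA)

def nominal_bond_price():
    """Price of one dollar delivered at t + T."""
    return sum(p * discount_factor(g) * infl for p, g, infl, _ in STATES)

def call_price(strike):
    """Price of the payoff max(S_{t+T} - K, 0)."""
    return sum(p * discount_factor(g) * max(s - strike, 0.0) for p, g, _, s in STATES)

print(nominal_bond_price())
print(call_price(100.0), call_price(110.0))
```

Nothing option-specific enters the calculation; the nonlinearity of the payoff is handled entirely by taking the expectation state by state.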
Unfortunately, the above specification of the consumption-based model does not work
very well. To give a flavor of some of the problems, Figure 5 presents the mean excess
returns on the ten size-ranked portfolios of NYSE stocks vs. the predictions (the right hand
side of (2.43)) of the consumption-based model. I picked the utility curvature parameter
$\gamma = 241$ to make the picture look as good as possible. (The section on GMM estimation
below goes into detail on how to do this. The Figure presents the first-stage GMM estimate.)
As you can see, the model isn't hopeless: there is some correlation between sample average
returns and the consumption-based model predictions. But the model does not do very well.
The pricing error (actual expected return minus predicted expected return) for each portfolio is of
the same order of magnitude as the spread in expected returns across the portfolios.


Figure 5. Mean excess returns of 10 CRSP size portfolios vs. predictions of the power
utility consumption-based model. The predictions are generated by $-R^f \operatorname{cov}(m, R^i)$ with
$m = \beta(c_{t+1}/c_t)^{-\gamma}$. β = 0.98 and γ = 241 are picked by first-stage GMM to minimize the
sum of squared pricing errors (deviation from the 45° line). Source: Cochrane (1996).

2.4 Alternative asset pricing models: Overview

I motivate exploration of different utility functions, general equilibrium models, and linear
factor models such as the CAPM, APT and ICAPM as approaches to circumvent the empirical
difficulties of the consumption-based model.

The poor empirical performance of the consumption-based model motivates a search for
alternative asset pricing models, that is, alternative functions m = f(data). All asset pricing
models amount to different functions for m. I give here a bare sketch of some of the different
approaches; we study each in detail in later chapters.
1) Different utility functions. Perhaps the problem with the consumption-based model is
simply the functional form we chose for utility. The natural response is to try different utility
functions. Which variables determine marginal utility is a far more important question than
the functional form. Perhaps the stock of durable goods influences the marginal utility of
nondurable goods; perhaps leisure or yesterday's consumption affect today's marginal utility.
These possibilities are all instances of nonseparabilities. One can also try to use micro data
on individual consumption of stockholders rather than aggregate consumption. Aggregation
of heterogeneous investors can make variables such as the cross-sectional variance of income
appear in aggregate marginal utility.
2) General equilibrium models. Perhaps the problem is simply with the consumption data.
General equilibrium models deliver equilibrium decision rules linking consumption to other
variables, such as income, investment, etc. Substituting the decision rules ct = f(yt , it , . . . )
in the consumption-based model, we can link asset prices to other, hopefully better-measured
macroeconomic aggregates.
In addition, true general equilibrium models completely describe the economy, including
the stochastic process followed by all variables. They can answer questions such as why is
the covariance (beta) of an asset payoff x with the discount factor m the value that it is, rather
than take this covariance as a primitive. They can in principle answer structural questions,
such as how asset prices might be affected by different government policies. Neither kind of
question can be answered by just manipulating investor first order conditions.
3) Factor pricing models. Another sensible response to bad consumption data is to model
marginal utility in terms of other variables directly. Factor pricing models follow this
approach. They just specify that the discount factor is a linear function of a set of proxies,

\[ m_{t+1} = a + b_A f^A_{t+1} + b_B f^B_{t+1} + \cdots \]

where $f^i$ are factors and $a$, $b_i$ are parameters. (This is a different sense of the use of the word
"factor" than "discount factor." I didn't invent the confusing terminology.) By and large, the
factors are just selected as plausible proxies for marginal utility; events that describe whether
typical investors are happy or unhappy. Among others, the Capital Asset Pricing Model
(CAPM) is the model

\[ m_{t+1} = a + b R^W_{t+1} \]

where $R^W$ is the rate of return on a claim to total wealth, often proxied by a broad-based
portfolio such as the value-weighted NYSE portfolio. The Arbitrage Pricing Theory (APT)
uses returns on broad-based portfolios derived from a factor analysis of the return covariance
matrix. The Intertemporal Capital Asset Pricing Model (ICAPM) suggests macroeconomic
variables such as GNP and inflation and variables that forecast macroeconomic variables or
asset returns as factors. Term structure models such as the Cox-Ingersoll-Ross model specify
that the discount factor is a function of a few term structure variables, for example the short
rate of interest and a few interest rate spreads.
Many factor pricing models are derived as general equilibrium models with linear tech-
nologies and no labor income; thus they also fall into the general idea of using general equi-
librium relations (from, admittedly, very stylized general equilibrium models) to substitute
out for consumption.
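
As a concrete illustration of a linear factor model, the sketch below pins down a and b in the CAPM discount factor m = a + b R^W by requiring that m price the riskfree rate and the wealth portfolio itself, i.e. E(m) R^f = 1 and E(m R^W) = 1. This is one simple way to choose the parameters, assumed here for the example; the function name and the return data are hypothetical.

```python
import numpy as np

def capm_discount_factor(wealth_ret, rf):
    """Solve for (a, b) in m = a + b * R^W from two pricing conditions.

    E(m) * rf   = 1  ->  a          + b * E(R^W)     = 1 / rf
    E(m * R^W)  = 1  ->  a * E(R^W) + b * E(R^W ** 2) = 1
    """
    rw = np.asarray(wealth_ret, dtype=float)
    lhs = np.array([[1.0, rw.mean()],
                    [rw.mean(), np.mean(rw ** 2)]])
    rhs = np.array([1.0 / rf, 1.0])
    a, b = np.linalg.solve(lhs, rhs)
    return a, b

# Hypothetical gross wealth-portfolio returns, equally likely (assumption)
rw = np.array([0.85, 1.00, 1.10, 1.25])
rf = 1.02
a, b = capm_discount_factor(rw, rf)
m = a + b * rw
print(a, b)
```

With an expected wealth return above the riskfree rate, b comes out negative: the discount factor is high when the wealth portfolio does badly, as a proxy for marginal utility should be.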
4) Arbitrage or near-arbitrage pricing. The mere existence of a representation p =
E(mx) and the fact that marginal utility is positive m ≥ 0 (these facts are discussed in
the next chapter) can often be used to deduce prices of one payoff in terms of the prices of
other payoffs. The Black-Scholes option pricing model is the paradigm of this approach:
since the option payoff can be replicated by a portfolio of stock and bond, any m that prices
the stock and bond gives the price for the option. Recently, there have been several sug-
gestions on how to use this idea in more general circumstances by using very weak further
restrictions on m, and we will study these suggestions in Chapter 17.
We return to a more detailed derivation and discussion of these alternative models of the
discount factor m below. First, and with this brief overview in mind, we look at p = E(mx)
and what the discount factor m represents in a little more detail.

2.5 Problems

1. The representative consumer maximizes a CRRA utility function,

\[ E_t \sum_{j=0}^{\infty} \beta^j c_{t+j}^{1-\gamma}. \]

Consumption is given by an endowment stream.
(a) Show that with log utility, the price/consumption ratio of the consumption stream is
constant, no matter what the distribution of consumption growth.
(b) Suppose there is news at time t that future consumption will be higher. For
γ < 1, γ = 1, and γ > 1, evaluate the effect of this news on the price. Make sense of
your results. (Note: there is a real-world interpretation here. It's often regarded as a
puzzle that the market declines on good economic news. This is attributed to an
expectation by the market that the Fed will respond to such news by raising interest
rates. Note that γ > 1 in this problem gives a completely real and frictionless
interpretation to this phenomenon! I thank Pete Hecht for this nice problem.)
2. The linear quadratic permanent income model is a very useful general equilibrium model
that we can solve in closed form. It specifies a production technology rather than fixed
endowments, and it easily allows aggregation of disparate consumers. (Hansen 1987 is a
wonderful exposition of what one can do with this setup.)
The consumer maximizes

\[ E \sum_{t=0}^{\infty} \beta^t \left(-\frac{1}{2}\right)(c_t - c^*)^2 \]

subject to a linear technology

\[ k_{t+1} = (1 + r)k_t + i_t \]
\[ i_t = e_t - c_t. \]

et is an exogenous endowment or labor income stream. Assume β = 1/(1 + r); the
discount rate equals the interest rate or marginal productivity of capital.


(a) Show that optimal consumption follows

\[ c_t = r k_t + r\beta \sum_{j=0}^{\infty} \beta^j E_t e_{t+j} \tag{2.46} \]

\[ c_t = c_{t-1} + (E_t - E_{t-1})\, r\beta \sum_{j=0}^{\infty} \beta^j e_{t+j} \tag{2.47} \]

i.e., consumption equals permanent income, precisely defined, and consumption
follows a random walk whose innovations are equal to innovations in permanent
income.
(b) Assume that the endowment et follows an AR(1)

\[ e_t = \rho e_{t-1} + \varepsilon_t \]

and specialize (2.46) and (2.47). Calculate and interpret the result for ρ = 1 and
ρ = 0. (The result looks like a “consumption function” relating consumption to
capital and current income, except that the slope of that function depends on
the persistence of income shocks. Transitory shocks will have little effect on
consumption, and permanent shocks a larger effect.)
(c) Calculate the one period interest rate (it should come out to r of course) and the
price of a claim to the consumption stream. e and k are the only state variables, so
the price should be a function of e and k. Interpret the time-variation in the price of
the consumption stream. (This consumer gets more risk averse as consumption rises
to $c^*$. $c^*$ is the bliss point, so at the bliss point there is no average return that can
compensate the consumer for greater risk.)
3. Consider again CRRA utility,

\[ E_t \sum_{j=0}^{\infty} \beta^j c_{t+j}^{1-\gamma}. \]

Consumption growth follows a two-state Markov process. The states are
$\Delta c_t = c_t/c_{t-1} = h, l$, and a 2×2 matrix π governs the set of transition probabilities, i.e.
$\mathrm{pr}(\Delta c_{t+1} = h \mid \Delta c_t = l) = \pi_{l \to h}$. (This is the Mehra-Prescott 1986 model, but it will be
faster to do it than to look it up. It is a useful and simple endowment economy.)
(a) Find the riskfree rate (price of a certain real payoff of one) in this economy. This
price is generated by

\[ p^b_t = E_t(m_{t,t+1} \cdot 1). \]

You are looking for two values, the price in the l state and the price in the h state.
(b) Find the price of the consumption stream (the price at t of $\{c_{t+1}, c_{t+2}, \ldots\}$). To do
this, guess that the price/consumption ratio must be a function of state (h, l), and find
that function. From

\[ p^c_t = E_t\left[m_{t,t+1}\left(p^c_{t+1} + c_{t+1}\right)\right] \]

find a recursive relation for $p^c_t/c_t$, and hence find the two values of $p^c_t/c_t$, one for the
h state and one for the l state.
(c) Pick β = 0.99 and try γ = 0.5, 5. (Try more if you feel like it.) Calibrate
the consumption process to have a 1% mean and 1% standard deviation, and
consumption growth uncorrelated over time. Calculate prices and returns in each
state.
(d) Now introduce serial correlation in consumption growth with γ = 5. (You can do
this by adding weight to the diagonal entries of the transition matrix π.) What effect
does this have on the model?

Chapter 3. Contingent Claims Markets
Our first task is to understand the p = E(mx) representation a little more deeply. In this
chapter I introduce a very simple market structure, contingent claims. This leads us to an
inner product interpretation of p = E(mx) which allows an intuitive visual representation
of most of the theorems. We see that discount factors exist, are positive, and the pricing
function is linear, just starting from prices and payoffs in a complete market, without any
utility functions. The next chapter shows that these properties can be built up in incomplete
markets as well.

3.1 Contingent claims

I describe contingent claims. I interpret the stochastic discount factor m as contingent
claims prices divided by probabilities, and p = E(mx) as a bundling of contingent claims.

Suppose that one of S possible states of nature can occur tomorrow, i.e. specialize to a
finite-dimensional state space. Denote the individual states by s. For example, we might have
S = 2 and s = rain or s = shine.
A contingent claim is a security that pays one dollar (or one unit of the consumption
good) in one state s only tomorrow. pc(s) is the price today of the contingent claim. I write
pc to specify that it is the price of a contingent claim and (s) to denote in which state s the
claim pays off.
In a complete market investors can buy any contingent claim. They don't necessarily have
to be faced with explicit contingent claims; they just need enough other securities to span
or synthesize all contingent claims. For example, if the possible states of nature are (rain,
shine), one can span or synthesize any contingent claim, or any portfolio of contingent
claims, by forming portfolios of a security that pays 2 dollars if it rains and one if
it shines, or x1 = (2, 1), and a riskfree security whose payoff pattern is x2 = (1, 1).
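
The spanning claim in this example can be verified directly. The sketch below solves for the portfolio of x1 = (2, 1) and x2 = (1, 1) that replicates each contingent claim; the helper name `replicate` is hypothetical.

```python
import numpy as np

# Rows are securities, columns are states (rain, shine)
payoffs = np.array([[2.0, 1.0],    # x1: pays 2 if rain, 1 if shine
                    [1.0, 1.0]])   # x2: riskfree, pays 1 in both states

def replicate(target):
    """Portfolio weights w on (x1, x2) such that w @ payoffs equals target."""
    return np.linalg.solve(payoffs.T, np.asarray(target, dtype=float))

rain_claim = replicate([1.0, 0.0])    # buy one x1, short one x2
shine_claim = replicate([0.0, 1.0])   # short one x1, buy two x2
print(rain_claim, shine_claim)
```

Because the payoff matrix is invertible, any payoff pattern across the two states can be synthesized the same way, which is what it means for these two securities to complete the market.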
Now, we are on a hunt for discount factors, and the central point is:
If there are complete contingent claims, a discount factor exists, and it is equal to the
contingent claim price divided by probabilities.
Let x(s) denote an asset's payoff in state of nature s. We can think of the asset as a
bundle of contingent claims: x(1) contingent claims to state 1, x(2) claims to state 2, etc.
The asset's price must then equal the value of the contingent claims of which it is a bundle,

\[ p(x) = \sum_s pc(s)\,x(s). \]


I denote the price p(x) to emphasize that it is the price of the payoff x. Where the payoff
in question is clear, I suppress the (x). I like to think of equation (3.48) as a happy-meal
theorem: the price of a happy meal (in a frictionless market) should be the same as the price
of one hamburger, one small fries, one small drink and the toy.
It is easier to take expectations rather than sum over states. To this end, multiply and
divide the bundling equation (3.48) by probabilities,

\[ p(x) = \sum_s \pi(s)\left(\frac{pc(s)}{\pi(s)}\right)x(s) \]

where π(s) is the probability that state s occurs. Then define m as the ratio of contingent
claim price to probability

\[ m(s) = \frac{pc(s)}{\pi(s)}. \]

Now we can write the bundling equation as an expectation,

\[ p = \sum_s \pi(s)\,m(s)\,x(s) = E(mx). \]

Thus, in a complete market, the stochastic discount factor m in p = E(mx) exists, and it
is just a set of contingent claims prices, scaled by probabilities. As a result of this
interpretation, the combination of discount factor and probability is sometimes called a state-price
density.
The multiplication and division by probabilities seems very artificial in this finite-state
context. In general, we posit states of nature ω that can take continuous (uncountably infinite)
values in a space Ω. In this case, the sums become integrals, and we have to use some measure
to integrate over Ω. Thus, scaling contingent claims prices by some probability-like object is
unavoidable.
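
A short numerical sketch of the bundling argument, assuming a hypothetical two-state market (the probabilities and contingent claim prices below are invented): the price computed as a sum of contingent claim values equals the price computed as E(mx).

```python
import numpy as np

def discount_factor(pc, pi):
    """m(s) = pc(s) / pi(s): contingent claim price scaled by probability."""
    return np.asarray(pc, dtype=float) / np.asarray(pi, dtype=float)

# Hypothetical numbers for states (rain, shine)
pi = np.array([0.3, 0.7])      # probabilities
pc = np.array([0.35, 0.60])    # contingent claim prices
x = np.array([2.0, 1.0])       # asset payoff in each state

m = discount_factor(pc, pi)
price_bundle = float(pc @ x)                    # sum over s of pc(s) x(s)
price_expectation = float(np.sum(pi * m * x))   # E(m x)
print(price_bundle, price_expectation)
```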

3.2 Risk neutral probabilities

I interpret the discount factor m as a transformation to risk-neutral probabilities such that
$p = E^*(x)/R^f$.

Another common transformation of p = E(mx) results in "risk-neutral" probabilities. Define

\[ \pi^*(s) \equiv R^f m(s)\pi(s) = R^f pc(s) \]

where

\[ R^f \equiv 1\Big/\sum_s pc(s) = 1/E(m). \]

The π*(s) are positive, less than or equal to one and sum to one, so they are a legitimate set
of probabilities. Then we can rewrite the asset pricing formula as

\[ p(x) = \sum_s pc(s)\,x(s) = \frac{1}{R^f}\sum_s \pi^*(s)\,x(s) = \frac{E^*(x)}{R^f}. \]

I use the notation $E^*$ to remind us that the expectation uses the risk neutral probabilities π*
instead of the real probabilities π.
Thus, we can think of asset pricing as if agents are all risk neutral, but with probabilities
π* in the place of the true probabilities π. The probabilities π* give greater weight to states
with higher than average marginal utility m.
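
The same kind of hypothetical two-state numbers illustrate the risk-neutral transformation. In this sketch (function name and inputs are assumptions), the π* sum to one, reproduce the asset price as E*(x)/R^f, and overweight the state in which m is above its mean.

```python
import numpy as np

def risk_neutral_probabilities(pc):
    """Return (pi_star, rf) with rf = 1 / sum(pc) and pi_star(s) = rf * pc(s)."""
    pc = np.asarray(pc, dtype=float)
    rf = 1.0 / pc.sum()
    return rf * pc, rf

# Hypothetical numbers for states (rain, shine)
pi = np.array([0.3, 0.7])      # true probabilities
pc = np.array([0.35, 0.60])    # contingent claim prices
x = np.array([2.0, 1.0])       # asset payoff

pi_star, rf = risk_neutral_probabilities(pc)
m = pc / pi
price = float(pi_star @ x) / rf    # E*(x) / R^f
print(pi_star, rf, price)
```

Here the rain state has m above E(m), and its risk-neutral probability exceeds its true probability of 0.3.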
There is something very deep in this idea: risk aversion is equivalent to paying more
attention to unpleasant states, relative to their actual probability of occurrence. People who
report high subjective probabilities of unpleasant events like plane crashes may not have
irrational expectations, they may simply be reporting the risk neutral probabilities or the
product m × π. This product is after all the most important piece of information for many
decisions: pay a lot of attention to contingencies that are either highly probable or that are
improbable but have disastrous consequences.
The transformation from actual to risk-neutral probabilities is given by

\[ \pi^*(s) = \frac{m(s)}{E(m)}\,\pi(s). \]

We can also think of the discount factor m as the derivative or change of measure from the
real probabilities π to the risk-neutral probabilities π*. The risk-neutral probability
representation of asset pricing is quite common, especially in derivative pricing where the results are
independent of risk adjustments.
The risk-neutral representation is particularly popular in continuous time diffusion pro-
cesses, because we can adjust only the means, leaving the covariances alone. In discrete time,
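changing the probabilities typically changes first and second moments. Suppose we start with
a process for prices and discount factor

\[ \frac{dp}{p} = \mu^p\,dt + \sigma^p\,dz \]

\[ \frac{d\Lambda}{\Lambda} = \mu^\Lambda\,dt + \sigma^\Lambda\,dz. \]

The discount factor prices the assets,

\[ E_t\left(\frac{dp}{p}\right) + \frac{D}{p}\,dt - r^f\,dt = -E_t\left(\frac{d\Lambda}{\Lambda}\frac{dp}{p}\right) = -\sigma^p\sigma^\Lambda\,dt. \]

In the "risk-neutral measure" we just increase the drift of each price process by its covariance
with the discount factor, and write a risk-neutral discount factor,

\[ \frac{dp}{p} = \left(\mu^p + \sigma^p\sigma^\Lambda\right)dt + \sigma^p\,dz = \mu^{p*}\,dt + \sigma^p\,dz \]

\[ \frac{d\Lambda}{\Lambda} = \mu^\Lambda\,dt. \]

Under this new set of probabilities, we can just write,

\[ E^*_t\left(\frac{dp}{p}\right) + \frac{D}{p}\,dt - r^f\,dt = 0 \]

with $E^*_t(dp/p) = \mu^{p*}\,dt$.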

3.3 Investors again

We look at the investor's first order conditions in a contingent claims market. The marginal
rate of substitution equals the discount factor and the contingent claim price ratio.

Though the focus of this chapter is on how to do without utility functions, it's worth
looking at the investor's first order conditions again in the contingent claim context. The
investor starts with a pile of initial wealth y and a state-contingent income y(s). He purchases
contingent claims to each possible state in the second period. His problem is then

\[ \max_{\{c,\,c(s)\}}\; u(c) + \sum_s \beta\pi(s)\,u[c(s)] \quad \text{s.t.} \quad c + \sum_s pc(s)\,c(s) = y + \sum_s pc(s)\,y(s). \]

Introducing a Lagrange multiplier λ on the budget constraint, the first order conditions are

\[ u'(c) = \lambda \]

\[ \beta\pi(s)\,u'[c(s)] = \lambda\,pc(s). \]

Eliminating the Lagrange multiplier λ,

\[ pc(s) = \beta\pi(s)\frac{u'[c(s)]}{u'(c)} \]

or

\[ m(s) = \frac{pc(s)}{\pi(s)} = \beta\frac{u'[c(s)]}{u'(c)}. \]

Coupled with p = E(mx), we obtain the consumption-based model again.
The investor's first order conditions say that the marginal rate of substitution between states
tomorrow equals the relevant price ratio,

\[ \frac{m(s_1)}{m(s_2)} = \frac{u'[c(s_1)]}{u'[c(s_2)]}. \]

$m(s_1)/m(s_2)$ gives the rate at which the investor can give up consumption in state 2 in return
for consumption in state 1 through purchase and sales of contingent claims. $u'[c(s_1)]/u'[c(s_2)]$
gives the rate at which the investor is willing to make this substitution. At an optimum, the
marginal rate of substitution should equal the price ratio, as usual in economics.
We learn that the discount factor m is the marginal rate of substitution between date and
state contingent commodities. That's why it, like c(s), is a random variable. Also, scaling
contingent claims prices by probabilities gives marginal utility, and so is not so artificial as it
may have seemed above.
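
The first order conditions can be turned around to back out contingent claim prices from an observed consumption choice. The sketch below assumes power utility, u'(c) = c^(-γ), and invented numbers for consumption and probabilities; the function name is hypothetical.

```python
import numpy as np

def contingent_claim_prices(probs, c_today, c_tomorrow, beta, gamma):
    """pc(s) = beta * pi(s) * u'(c(s)) / u'(c), assuming u'(c) = c^(-gamma)."""
    probs = np.asarray(probs, dtype=float)
    c_tomorrow = np.asarray(c_tomorrow, dtype=float)
    m = beta * (c_tomorrow / c_today) ** (-gamma)   # m(s) = beta u'(c(s)) / u'(c)
    return probs * m, m

# Hypothetical choice: consumption 1.0 today, 0.9 or 1.1 tomorrow
probs = [0.3, 0.7]
pc, m = contingent_claim_prices(probs, 1.0, [0.9, 1.1], beta=0.95, gamma=2.0)

# The ratio of discount factors equals the marginal rate of substitution
mrs = (0.9 / 1.1) ** (-2.0)
print(pc, m, m[0] / m[1], mrs)
```

The low-consumption state has the higher m, so a claim to that state is expensive relative to its probability.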
Figure 6 gives the economics behind this approach to asset pricing. We observe the
investor's choice of date or state-contingent consumption. Once we know his utility function,
we can calculate the contingent claim prices that must have led to the observed consumption
choice, from the derivatives of the utility function.

[Figure 6: horizontal axis is state 1, or date 1; vertical axis is state 2, or date 2; an indifference curve passes through the consumption point (c1, c2).]

Figure 6. Indifference curve and contingent claim prices

The relevant probabilities are the investor's subjective probabilities over the various states.
Asset prices are set, after all, by investors' demands for assets, and those demands are set
by investors' subjective evaluations of the probabilities of various events. We often assume
rational expectations, namely that subjective probabilities are equal to objective frequencies.
But this is an additional assumption that we may not always want to make.

3.4 Risk sharing

Risk sharing: In complete markets, consumption moves together. Only aggregate risk
matters for security markets.

We deduced that the marginal rate of substitution for any individual investor equals the
contingent claim price ratio. But the prices are the same for all investors. Therefore, marginal
utility growth should be the same for all investors

\[ \beta^i \frac{u'(c^i_{t+1})}{u'(c^i_t)} = \beta^j \frac{u'(c^j_{t+1})}{u'(c^j_t)} \tag{49} \]

where i and j refer to different investors. If investors have the same homothetic utility
function (for example, power utility), then consumption itself should move in lockstep,

\[ \frac{c^i_{t+1}}{c^i_t} = \frac{c^j_{t+1}}{c^j_t}. \]

More generally, shocks to consumption are perfectly correlated across individuals.
This is so radical, it's easy to misread it at first glance. It doesn't say that expected
consumption growth should be equal; it says that consumption growth should be equal ex
post. If my consumption goes up 10%, yours goes up exactly 10% as well, and so does
everyone else's. In a complete contingent claims market, all investors share all risks, so
when any shock hits, it hits us all equally (after insurance payments). It doesn't say the
consumption level is the same; this is risk-sharing, not socialism. The rich have higher
levels of consumption, but rich and poor share the shocks equally.
This risk sharing is Pareto-optimal. Suppose a social planner wished to maximize
everyone's utility given the available resources. For example, with two investors i and j, he would
solve

\[ \max\; \lambda_i E\sum_t \beta^t u(c^i_t) + \lambda_j E\sum_t \beta^t u(c^j_t) \quad \text{s.t.} \quad c^i_t + c^j_t = c^a_t \]

where $c^a$ is the total amount available and $\lambda_i$ and $\lambda_j$ are i and j's relative weights in the
planner's objective. The first order condition for this problem is

\[ \lambda_i u'(c^i_t) = \lambda_j u'(c^j_t) \]

and hence the same risk sharing that we see in a complete market, equation (3.49).
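
A small sketch of the planner's solution, assuming power utility u'(c) = c^(-γ) and a common β for both investors (the function name, weights and aggregate consumption path are hypothetical). With these assumptions the first order condition gives each investor a constant share of the aggregate, so the levels differ but growth rates coincide.

```python
import numpy as np

def planner_allocation(total, weight_i, weight_j, gamma):
    """Solve weight_i * c_i^(-gamma) = weight_j * c_j^(-gamma), c_i + c_j = total."""
    total = np.asarray(total, dtype=float)
    scale_i = weight_i ** (1.0 / gamma)
    scale_j = weight_j ** (1.0 / gamma)
    share_i = scale_i / (scale_i + scale_j)   # constant share of the aggregate
    c_i = share_i * total
    return c_i, total - c_i

# Hypothetical aggregate consumption path and planner weights
aggregate = np.array([100.0, 110.0, 99.0])
c_i, c_j = planner_allocation(aggregate, weight_i=2.0, weight_j=1.0, gamma=3.0)

growth_i = c_i[1:] / c_i[:-1]
growth_j = c_j[1:] / c_j[:-1]
print(c_i, c_j, growth_i, growth_j)
```

Investor i, with the larger weight, consumes more in every period, yet both investors' consumption rises 10% and then falls 10% along with the aggregate.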
This simple fact has profound implications. First, it shows you why only aggregate shocks
should matter for risk prices. Any idiosyncratic income risk will be equally shared, and so
1/N of it becomes an aggregate shock. Then the stochastic discount factors m that determine
asset prices are no longer affected by truly idiosyncratic risks. Much of this sense that only
aggregate shocks matter stays with us in incomplete markets as well.
Obviously, the real economy does not yet have complete markets or full risk sharing:
individual consumptions do not move in lockstep. However, this observation tells us much
about the function of securities markets. Security markets (state-contingent claims) bring
individual consumptions closer together by allowing people to share some risks. In addition,
better risk sharing is much of the force behind financial innovation. Many successful new
securities can be understood as devices to more widely share risks.

3.5 State diagram and price function

I introduce the state space diagram and inner product representation for prices, p(x) =
E(mx) = m · x.
p(x) = E(mx) implies p(x) is a linear function.

Think of the contingent claims price pc and asset payoffs x as vectors in $R^S$, where each
element gives the price or payoff to the corresponding state,

\[ pc = \begin{bmatrix} pc(1) & pc(2) & \cdots & pc(S) \end{bmatrix}', \]

\[ x = \begin{bmatrix} x(1) & x(2) & \cdots & x(S) \end{bmatrix}'. \]

Figure 7 is a graph of these vectors in $R^S$. Next, I deduce the geometry of Figure 7.

