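Now,

$$
\begin{aligned}
\operatorname{cov}(R^i, R^{mv}) &= \operatorname{cov}\left[(R^* + wR^{e*}),\ (R^* + w^i R^{e*})\right] \\
&= \operatorname{var}(R^*) + w w^i \operatorname{var}(R^{e*}) - (w + w^i)E(R^*)E(R^{e*}) \\
&= \operatorname{var}(R^*) - wE(R^*)E(R^{e*}) + w^i\left[w \operatorname{var}(R^{e*}) - E(R^*)E(R^{e*})\right]
\end{aligned}
$$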

Thus, cov(R^i, R^mv) and E(R^i) are both linear functions of w^i. We can solve
cov(R^i, R^mv) for w^i, plug into the expression for E(R^i) and we're done.
To do this, of course, we must be able to solve cov(R^i, R^mv) for w^i. This requires

$$
w \neq \frac{E(R^*)E(R^{e*})}{\operatorname{var}(R^{e*})} = \frac{E(R^*)E(R^{e*})}{E(R^{e*2}) - E(R^{e*})^2} = \frac{E(R^*)}{1 - E(R^{e*})} \tag{107}
$$

which is the condition for the minimum variance return. ∎
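
The final substitution can be written out explicitly. The following is only a sketch: the shorthand symbols A and B are introduced here for illustration and do not appear in the text, and the step uses E(R^i) = E(R*) + w^i E(R^e*), the linearity of E(R^i) in w^i noted above.

```latex
\begin{aligned}
A &\equiv \operatorname{var}(R^*) - wE(R^*)E(R^{e*}), \qquad
B \equiv w\operatorname{var}(R^{e*}) - E(R^*)E(R^{e*}), \\
\operatorname{cov}(R^i, R^{mv}) &= A + w^i B
\quad\Longrightarrow\quad
w^i = \frac{\operatorname{cov}(R^i, R^{mv}) - A}{B}, \\
E(R^i) &= E(R^*) + w^i E(R^{e*})
= \left[E(R^*) - \frac{A}{B}E(R^{e*})\right] + \frac{E(R^{e*})}{B}\operatorname{cov}(R^i, R^{mv}).
\end{aligned}
```

Writing cov(R^i, R^mv) as the beta of R^i on R^mv times var(R^mv) turns the last line into a single-beta representation. The division by B is legitimate exactly when B is nonzero, which is condition (107).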

6.7 Problems

1. In the argument that R^mv on the mean variance frontier, R^mv = R* + wR^e*, implies a
discount factor m = a + bR^mv, do we have to rule out the case of risk neutrality? (Hint:
What is R^e* when the economy is risk-neutral?)
2. If you use factor mimicking portfolios as in (6.93), you know that the predictions for
expected returns are the same as they are if you use the factors themselves. Are the α*,
λ*, and β* for the factor mimicking portfolio representation the same as the original α,
λ, and β of the factor pricing model?
3. Suppose the CAPM is true, m = a − bR^m prices a set of assets, and there is a risk-free
rate R^f. Find R* in terms of the moments of R^m, R^f.
4. If you express the mean-variance frontier as a linear combination of factor-mimicking
portfolios from a factor model, do the relative weights of the various factor portfolios in
the mean-variance efficient return change as you sweep out the frontier, or do they stay
the same?
5. For an arbitrary mean-variance efficient return of the form R* + wR^e*, find its zero-beta
return and zero-beta rate. Show that your rate reduces to the riskfree rate when there is
one.
6. When the economy is risk neutral, and if there is no risk-free rate, show that the
zero-beta, minimum-variance, and constant-mimicking portfolio returns are again all
equivalent, though not equal to the risk-free rate. (In this case, the mean-variance frontier
is just the minimum-variance point.)

Chapter 7. Implications of existence and
equivalence theorems

Existence of a discount factor means p = E(mx) is innocuous, and all content flows from
the discount factor model.
The theorems apply to sample moments too; the dangers of fishing up ex-post or sample
mean-variance efficient portfolios.
Sources of discipline in factor fishing expeditions.
The joint hypothesis problem. How efficiency tests are the same as tests of economic
discount factor models.
Factors vs. their mimicking portfolios.
Testing the number of factors.
Plotting contingent claims on the axis vs. mean and variance.

The theorems on the existence of a discount factor, and the equivalence between the p =
E(mx), expected return - beta, and mean-variance views of asset pricing have important
implications for how we approach and evaluate empirical work.
The equivalence theorems are obviously important, especially to the theme of this book,
to show that the choice of discount factor language versus expected return-beta language
or mean-variance frontier is entirely one of convenience. Nothing in the more traditional
statements is lost.

p = E(mx) is innocuous
Before Roll (1976), expected return-beta representations had been derived in the context
of special and explicit economic models, especially the CAPM. In empirical work, the
success of any expected return-beta model seemed like a vindication of the whole structure.
The fact that, for example, one might use the NYSE value-weighted index portfolio in place
of the return on total wealth predicted by the CAPM seemed like a minor issue of empirical
implementation.
When Roll showed that mean-variance efficiency implies a single beta representation,
all that changed. Some single beta representation always exists, since there is some
mean-variance efficient return. The asset pricing model only serves to predict that a particular
return (say, the "market return") will be mean-variance efficient. Thus, if one wants to "test
the CAPM" it becomes much more important to be choosy about the reference portfolio, to
guard against stumbling on something that happens to be mean-variance efficient and hence
prices assets by construction.

This insight led naturally to the use of broader wealth indices (Stambaugh 1982) in the
reference portfolio to provide a more grounded test of the CAPM. However, this approach
has not caught on. Stocks are priced with stock factors, bonds with bond factors, and so on.
More recently, stocks sorted on size, book/market, and past performance characteristics are
priced by portfolios sorted on those characteristics. Part of the reason for this is that the betas
are small; stocks and bonds are not highly correlated so risk premia from one source of betas
have small impacts on another set of average returns. Larger measures of wealth including
human capital and real estate do not come with high frequency price data, so adding them to
a wealth portfolio has little effect on betas.
The good news in this existence theorem is that you can always start by writing an
expected return-beta model, knowing that you have imposed almost no structure in doing so.
The bad news is that you haven't gotten very far. All the economic, statistical and predictive
content comes in picking the factors.
The theorem that, from the law of one price, there exists some discount factor m such
that p = E(mx) is just an updated restatement of Roll's theorem. The content is all in
m = f(data), not in p = E(mx). Again, an asset pricing framework that initially seemed to
require a lot of completely unbelievable structure (the representative consumer
consumption-based model in complete frictionless markets) turns out to require (almost) no
structure at all. Again, the good news is that you can always start by writing p = E(mx), and
need not suffer criticism about hidden contingent claim or representative consumer assumptions
in so doing. The bad news is that you haven't gotten very far by writing p = E(mx), as all the
economic, statistical and predictive content comes in picking the discount factor model
m = f(data).

Ex-ante and ex-post.
I have been deliberately vague about the probabilities underlying expectations and other
moments in the theorems. The fact is, the theorems hold for any set of probabilities [4]. Thus,
the existence and equivalence theorems work equally well ex-ante as ex-post: E(mx), β, E(R)
and so forth can refer to agents' subjective probability distributions, objective population
probabilities, or to the moments realized in a given sample.
Thus, if the law of one price holds in a sample, one may form an x* from sample moments
that satisfies p(x) = E(x* x), exactly, in that sample, where p(x) refers to observed prices
and E(x* x) refers to the sample average. Equivalently, if the sample covariance matrix of
a set of returns is nonsingular, there exists an ex-post mean-variance efficient portfolio for
which sample average returns line up exactly with sample regression betas.
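
The in-sample claim can be checked numerically. The sketch below is only an illustration: the simulated returns, the sample size, and the variable names are assumptions made here, not anything specified in the text. It builds x* from the sample second-moment matrix of gross returns (each with price 1) and confirms that the sample average of x* times each return equals that price.

```python
import numpy as np

rng = np.random.default_rng(0)
T, N = 120, 5                      # hypothetical sample: 120 periods, 5 assets
# Simulated gross returns; the distribution is an arbitrary assumption.
R = 1.01 + 0.05 * rng.standard_normal((T, N))

p = np.ones(N)                     # every gross return has price 1
S = R.T @ R / T                    # sample second-moment matrix E(xx')
w = np.linalg.solve(S, p)          # portfolio weights E(xx')^{-1} p
x_star = R @ w                     # x* = p' E(xx')^{-1} x, one value per period

# Sample pricing: the sample mean of x* R^i should equal the price 1.
sample_prices = (x_star[:, None] * R).mean(axis=0)
pricing_errors = sample_prices - p
print(np.max(np.abs(pricing_errors)))   # zero up to floating-point rounding
```

The result holds for any data with a nonsingular sample second-moment matrix, which is the sense in which in-sample pricing success carries no content by itself.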
This observation points to a great danger in the widespread exercise of searching for and
statistically evaluating ad-hoc asset pricing models. Such models are guaranteed empirical
success in a sample if one places little enough structure on what is included in the discount
factor function. The only reason the model doesn't work perfectly is the restrictions the
researcher has imposed on the number or identity of the factors included in m, or the parameters
of the function relating the factors to m. Since these restrictions are the entire content of the
model, they had better be interesting, carefully described and well motivated!

[4] Precisely, any set of probabilities that agree on impossible (zero-probability) events.

Obviously, this is typically not the case or I wouldn't be making such a fuss about it. Most
empirical asset pricing research posits an ad-hoc pond of factors, fishes around a bit in that
set, and reports statistical measures that show "success," in that the model is not statistically
rejected in pricing an ad-hoc set of portfolios. The set of discount factors is usually not large
enough to give the zero pricing errors we know are possible, yet the boundaries are not clearly
defined.

Discipline
What is wrong, you might ask, with finding an ex-post efficient portfolio or x* that prices
assets by construction? Perhaps the lesson we should learn from the existence theorems is to
forget about economics, the CAPM, marginal utility and all that, and simply price assets with
ex-post mean variance efficient portfolios that we know set pricing errors to zero!
The mistake is that a portfolio that is ex-post efficient in one sample, and hence prices
all assets in that sample, is unlikely to be mean-variance efficient, ex-ante or ex-post, in the
next sample, and hence is likely to do a poor job of pricing assets in the future. Similarly,
the portfolio x* = p'E(xx')^{-1}x (using the sample second moment matrix) that is a discount
factor by construction in one sample is unlikely to be a discount factor in the next sample;
the required portfolio weights p'E(xx')^{-1} change, often drastically, from sample to sample.
For example, suppose the CAPM is true, the market portfolio is ex-ante mean-variance
efficient, and sets pricing errors to zero if you use true or subjective probabilities. Nonetheless,
the market portfolio is unlikely to be ex-post mean-variance efficient in any given sample. In
any sample, there will be lucky winners and unlucky losers. An ex-post mean variance
efficient portfolio will be a Monday-morning quarterback; it will tell you to put large weights
on assets that happened to be lucky in a given sample, but are no more likely than indicated
by their betas to generate high returns in the future. "Oh, if I had only bought Microsoft in
1982..." is not a useful guide to forming a mean-variance efficient portfolio today. (In fact,
mean-reversion in the market and book/market effects in individual stocks suggest that if
anything, assets with unusually good returns in the past are likely to do poorly in the future!)
The only solution is to impose some kind of discipline in order to avoid dredging up
spuriously good in-sample pricing.
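
A small simulation illustrates the instability. It is a sketch under strong assumptions chosen here for convenience (i.i.d. normal returns with identical moments for every asset, two independent samples of 60 periods, hypothetical helper names): weights fitted in the first sample price that sample exactly, but neither the weights nor the zero pricing errors carry over to the second sample.

```python
import numpy as np

def xstar_weights(R):
    """Weights E(xx')^{-1} p for gross returns R (T x N), with prices p = 1."""
    S = R.T @ R / R.shape[0]
    return np.linalg.solve(S, np.ones(R.shape[1]))

def pricing_errors(R, w):
    """Sample mean of x* R^i minus the price 1, where x* = R @ w."""
    x_star = R @ w
    return (x_star[:, None] * R).mean(axis=0) - 1.0

rng = np.random.default_rng(1)
# Two independent samples from the same (assumed) return distribution.
first = 1.01 + 0.05 * rng.standard_normal((60, 10))
second = 1.01 + 0.05 * rng.standard_normal((60, 10))

w_first = xstar_weights(first)
w_second = xstar_weights(second)

in_sample = np.abs(pricing_errors(first, w_first)).max()
out_of_sample = np.abs(pricing_errors(second, w_first)).max()
weight_shift = np.abs(w_first - w_second).max()
print(in_sample, out_of_sample, weight_shift)
```

Because every simulated asset has the same population moments, any difference between the two sets of weights is pure sampling luck, which is the Monday-morning-quarterback effect described above.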
The situation is the same as in traditional regression analysis. Regressions are used to
forecast or to explain a variable y by other variables x in a regression y = x'β + ε. By
blindly including right hand variables, one can produce models with arbitrarily good
statistical measures of fit. But this kind of model is typically unstable out of sample or otherwise
useless for explanation or forecasting. One has to carefully and thoughtfully limit the search
for right hand variables x to produce good models.
What makes for an interesting set of restrictions? Econometricians wrestling with
y = x'β + ε face the same question; the standard answers
are 1) use economic theory to carefully specify the right hand side and 2) use a battery of
cross-sample and out-of-sample stability checks.
Alas, this advice is hard to follow. Economic theory is usually either silent on what
variables to put on the right hand side of a regression, or allows a huge range of variables.
The same is true in finance. "What are the fundamental risk factors?" is still an unanswered
question. At the same time one can appeal to the APT and ICAPM to justify the inclusion of
just about any desirable factor (Fama 1991 calls these theories a "fishing license"). Thus, you
will grow old waiting for theorists to provide useful answers to this kind of question.
Following the purely statistical advice, the battery of cross-sample and out-of-sample
tests often reveals the model is unstable, and needs to be changed. Once it is changed, there
is no more out-of-sample left to check it. Furthermore, even if one researcher is pure enough
to follow the methodology of classical statistics, and wait 50 years for another fresh sample
to be available before contemplating another model, his competitors and journal editors are
unlikely to be so patient. In practice, then, out of sample validation is not a strong guard
against fishing.
Nonetheless, these are the only standards we have to guard against fishing. In my opinion,
the best hope for finding pricing factors that are robust out of sample and across different
markets is to try to understand the fundamental macroeconomic sources of risk. By this I
mean tying asset prices to macroeconomic events, in the way the ill-fated consumption based
model does via m_{t+1} = βu'(c_{t+1})/u'(c_t). The difficulties of the consumption-based model
have made this approach lose favor in recent years. However, the alternative approach is also
running into trouble, in that the number and identity of empirically-determined risk factors do
not seem stable. Every time a new anomaly or data set pops up, a new set of ad-hoc factors
gets created to explain them!
In any case, one should always ask of a factor model, "what is the compelling economic
story that restricts the range of factors used?" and/or "what statistical restraints are used
to keep from discovering ex-post mean variance efficient portfolios, or to ensure that the
results will be robust across samples?" The existence theorems tell us that the answers to these
questions are the only content of the exercise. If the purpose of the model is not just to predict
asset prices but also to explain them, this puts an additional burden on economic motivation
of the risk factors.
There is a natural resistance to such discipline built in to our current statistical
methodology for evaluating models (and papers). When the last author fished around and produced
an ad-hoc factor pricing model that generates 1% average pricing errors, it is awfully hard
to argue that your economically motivated
factor pricing model is better despite 2% average pricing errors. Your model may really be
better and will therefore continue to do well out of sample when the fished model falls by
the wayside of financial fashion, but it is hard to get past statistical measures of in-sample fit.
One hungers for a formal measurement of the number of hurdles imposed on a factor fishing
expedition, like the degrees of freedom correction in the adjusted R^2. Absent a numerical
correction, we have to use judgment to scale back apparent statistical successes by the amount
of economic and statistical fishing that produced them.


Mimicking portfolios
The theorem x* = proj(m|X) also has interesting implications for empirical work. The
pricing implications of any model can be equivalently represented by its factor-mimicking
portfolio. If there is any measurement error in a set of economic variables driving m, the
factor-mimicking portfolios for the true m will price assets better than an estimate of m that
uses the measured macroeconomic variables.
Thus, it is probably not a good idea to evaluate economically interesting models with
statistical horse races against models that use portfolio returns as factors. Economically
interesting models, even if true and perfectly measured, will just equal the performance of their
own factor-mimicking portfolios, even in large samples. They will always lose in sample
against ad-hoc factor models that find nearly ex-post efficient portfolios.
This said, there is an important place for models that use returns as factors. After we
have found the underlying true macro factors, practitioners will be well advised to look at
the factor-mimicking portfolio on a day-by-day basis. Good data on the factor-mimicking
portfolios will be available on a minute-by-minute basis. For many purposes, one does not
have to understand the economic content of a model.
But this fact does not tell us to circumvent the process of understanding the true
macroeconomic factors by simply fishing for factor-mimicking portfolios. The experience of
practitioners who use factor models seems to bear out this advice. Large commercial factor models
resulting from extensive statistical analysis (otherwise known as fishing) perform poorly out
of sample, as revealed by the fact that the factors and loadings (β) change all the time.
Also, models specified with economic fundamentals will always seem to do poorly in
a given sample against ad-hoc variables (especially if one fishes an ex-post mean-variance
efficient portfolio out of the latter!). But what other source of discipline do we have?

Irrationality and Joint Hypothesis
Finance contains a long history of fighting about "rationality" vs. "irrationality" and
"efficiency" vs. "inefficiency" of asset markets. The results of many empirical asset pricing
papers are sold as evidence that markets are "inefficient" or that investors are "irrational." For
example, the crash of October 1987, and various puzzles such as the small-firm, book/market,
seasonal effects or long-term predictability have all been sold this way.
However, none of these puzzles documents an arbitrage opportunity [5]. Therefore, we
know that there is a "rational model" (a stochastic discount factor, an efficient portfolio to use
in a single-beta representation) that rationalizes them all. And we can confidently predict
this situation to continue; real arbitrage opportunities do not last long! Fama (1970) contains
a famous statement of the same point. Fama emphasized that any test of "efficiency" is a joint
test of efficiency and a "model of market equilibrium." Translated, that means an asset pricing
model, or a model of m.

[5] The closed-end fund puzzle comes closest since it documents an apparent violation of the
law of one price. However, you can't costlessly short closed end funds, and we have ignored
short sales constraints so far.

But surely markets can be "irrational" or "inefficient" without requiring arbitrage
opportunities? Yes, they can, if (and only if) the discount factors that generate asset prices are
disconnected from marginal rates of substitution or transformation in the real economy. But
now we are right back to specifying and testing economic models of the discount factor! At
best, an asset pricing puzzle might be so severe that we can show that the required discount
factors are completely "unreasonable" (by some standard) measures of real marginal rates of
substitution and/or transformation, but we still have to say something about what a reasonable
marginal rate looks like.
In sum, the existence theorems mean that there are no quick proofs of "rationality" or
"irrationality." The only game in town for the purpose of explaining asset prices is thinking
about economic models of the discount factor.

The number of factors.
Many asset pricing tests focus on the number of factors required to price a cross-section
of assets. The equivalence theorems imply that this is a silly question. A linear factor model
m = b'f, or its equivalent expected return/beta model E(R^i) = α + β'_{if} λ_f, is not a unique
representation. In particular, given any multiple-factor or multiple-beta representation we
can easily find a single-beta representation. The single factor m = b'f will price assets
just as well as the original factors f, as will x* = proj(b'f | X) or the corresponding
R*. All three options give rise to single-beta models with exactly the same pricing ability as
the multiple factor model. We can also easily find equivalent representations with different
numbers (greater than one) of factors. For example, write

$$
m = a + b_1 f_1 + b_2 f_2 + b_3 f_3 = a + b_1 f_1 + b_2\left(f_2 + \frac{b_3}{b_2} f_3\right) = a + b_1 f_1 + b_2 \hat{f}_2
$$

to reduce a "three factor" model to a "two factor" model. In the ICAPM language,
consumption itself could serve as a single state variable, in place of the S state variables
presumed to drive it.
There are times when one is interested in a multiple factor representation. Sometimes the
factors have an economic interpretation that is lost on taking a linear combination. But the
pure number of pricing factors is not a meaningful question.
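
The regrouping argument is easy to verify numerically. In the sketch below the factors, coefficients, and test payoffs are all made-up numbers chosen for illustration (with a nonzero second coefficient so the regrouping is defined); it checks that the "three factor", "two factor", and single-factor versions of m are the same random variable and therefore assign identical sample prices.

```python
import numpy as np

rng = np.random.default_rng(2)
T = 200
f = rng.standard_normal((T, 3))          # three hypothetical factors
a, b = 1.0, np.array([0.5, -0.3, 0.2])   # made-up coefficients, b[1] != 0

m_three = a + f @ b                      # m = a + b1 f1 + b2 f2 + b3 f3

# Fold f3 into f2: f2_hat = f2 + (b3 / b2) f3.
f2_hat = f[:, 1] + (b[2] / b[1]) * f[:, 2]
m_two = a + b[0] * f[:, 0] + b[1] * f2_hat

# Go all the way to one factor: the linear combination b'f itself.
single = f @ b
m_one = a + single

# Any payoffs are priced identically because the discount factors coincide.
x = 1.0 + 0.1 * rng.standard_normal((T, 4))   # arbitrary test payoffs
prices_three = (m_three[:, None] * x).mean(axis=0)
prices_two = (m_two[:, None] * x).mean(axis=0)
prices_one = (m_one[:, None] * x).mean(axis=0)
print(prices_three, prices_two, prices_one)
```

Nothing about pricing changes across the three versions; only the labels on the factors do, which is why the count of factors alone is not informative.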

Discount factors vs. mean, variance and beta.
The point of the previous chapter was to show how the discount factor, mean-variance,
and expected return- beta models are all equivalent representations of asset pricing. It seems
a good moment to contrast them as well; to understand why the mean-variance and beta
language developed first, and to think about why the discount factor language seems to be
taking over.
Asset pricing started by putting mean and variance of returns on the axes, rather than
payoff in state 1, payoff in state 2, etc. as we do now. The early asset pricing theorists posed
the question just right: they wanted to treat assets in the apples-and-oranges, indifference
curve and budget set framework of macroeconomics. The problem was, what labels to put
on the axis? Clearly, "IBM stock" and "GM stock" is not a good idea; investors do not
value securities per se, but value some aspects of the stream of random cash flows that those
securities give rise to.
Their brilliant insight was to put the mean and variance of the portfolio return on the axis;
to treat these as "hedonics" by which investors valued their portfolios. Investors plausibly
want more mean and less variance. They gave investors "utility functions" defined over this
mean and variance, just as standard utility functions are defined over apples and oranges. The
mean-variance frontier is the "budget set."
With this focus on portfolio mean and variance, the next step was to realize that each
security's mean return measures its contribution to the portfolio mean, and that regression
betas on the overall portfolio give each security's contribution to the portfolio variance. The
mean-return vs. beta description for each security followed naturally.
In a deep sense, the transition from mean-variance frontiers and beta models to discount
factors represents the realization that putting consumption in state 1 and consumption in
state 2 on the axes (specifying preferences and budget constraints over state-contingent
consumption) is a much more natural mapping of standard microeconomics into finance
than putting mean, variance, etc. on the axes. If for no other reason, the contingent claim
budget constraints are linear, while the mean-variance frontier is not. Thus, I think, the focus
on means and variance, the mean-variance frontier and expected return/beta models is all
due to an accident of history, that the early asset pricing theorists happened to put mean and
variance on the axes rather than state contingent consumption.
Well, here we are, why prefer one language over another? The discount factor language
has an advantage for its simplicity, generality, mathematical convenience, and elegance.
These virtues are to some extent in the eye of the beholder, but to this beholder, it is
inspiring to be able to start every asset pricing calculation with one equation, p = E(mx).
This equation covers all assets, including bonds, options, and real investment opportunities,
while the expected return/beta formulation is not useful or very cumbersome in the latter
applications. Thus, it has seemed that there are several different asset pricing theories: expected
return/beta for stocks, yield-curve models for bonds, arbitrage models for options. In fact all
three are just cases of p = E(mx). As a particular example, arbitrage, in the precise sense
of positive payoffs with negative prices, has not entered the equivalence discussion at all. I
don't know of any way to cleanly graft absence of arbitrage on to expected return/beta
models. You have to tack it on after the fact: "by the way, make sure that every portfolio with
positive payoffs has a positive price." It is trivially easy to graft it on to a discount factor
model: just add m > 0.
The discount factor and state space language also makes it easier to think about different
horizons and the present value statement of models. p = E(mx) generalizes quickly to
p_t = E_t Σ_j m_{t,t+j} x_{t+j}, while returns have to be chained together to think about multiperiod
models. Papers are still written arguing about geometric vs. arithmetic average returns for
multiperiod discounting.

The choice of language is not about normality or return distributions. There is a lot of
confusion about where return distribution assumptions show up in finance. I have made no
distributional assumptions in any of the discussion so far. Second moments as in betas and
the variance of the mean-variance frontier show up because p = E(mx) involves a second
moment. One does not need to assume normality to talk about the mean-variance frontier.
Returns on the mean-variance frontier price other assets even when returns are not normally
distributed.

Chapter 8. Conditioning information
The asset pricing theory I have sketched so far really describes prices at time t in terms of
conditional moments. The investor's first order conditions are

$$
p_t u'(c_t) = \beta E_t\left[u'(c_{t+1}) x_{t+1}\right]
$$

where E_t means expectation conditional on the investor's time t information. Sensibly, the
price at time t should be higher if there is information at time t that the discounted payoff is
likely to be higher than usual at time t + 1. The basic asset pricing equation should be

$$
p_t = E_t(m_{t+1} x_{t+1}).
$$

(Conditional expectation can also be written

$$
p_t = E\left[m_{t+1} x_{t+1} \mid I_t\right]
$$

when it is important to specify the information set I_t.)
If payoffs and discount factors were independent and identically distributed (i.i.d.) over
time, then conditional expectations would be the same as unconditional expectations and
we would not have to worry about the distinction between the two concepts. But stock
price/dividend ratios, bond and option prices all change over time, which must reflect
changing conditional moments of something on the right hand side.
One approach is to specify and estimate explicit statistical models of conditional
distributions of asset payoffs and discount factor variables (e.g. consumption growth). This approach
is sometimes used, and is useful in some applications, but it is usually cumbersome. As we
make the conditional mean, variance, covariance, and other parameters of the distribution of
(say) N returns depend flexibly on M information variables, the number of required
parameters can quickly exceed the number of observations.
More importantly, this explicit approach typically requires us to assume that investors use
the same model of conditioning information that we do. We obviously don't even observe all
the conditioning information used by economic agents, and we can't include even a fraction
of observed conditioning information in our models. The basic feature and beauty of asset
prices (like all prices) is that they summarize an enormous amount of information that only
individuals see. The events that make the price of IBM stock change by a dollar, like the
events that make the price of tomatoes change by 10 cents, are inherently unobservable to
economists or would-be social planners (Hayek 1945). Whenever possible, our treatment of
conditioning information should allow agents to see more than we do.
If we don't want to model conditional distributions explicitly, and if we want to avoid
assuming that investors only see the variables that we include in an empirical investigation, we
eventually have to think about unconditional moments, or at least moments conditioned on
less information than agents see. Unconditional implications are also interesting in and of
themselves. For example, we may be interested in finding out why the unconditional mean
returns on some stock portfolios are higher than others, even if every agent fundamentally
seeks high conditional mean returns. Most statistical estimation essentially amounts to
characterizing unconditional means, as we will see in the chapter on GMM. Thus, rather than
model conditional distributions, this chapter focuses on what implications for unconditional
moments we can derive from the conditional theory.

8.1 Scaled payoffs

$$
p_t = E_t(m_{t+1} x_{t+1}) \;\Rightarrow\; E(p_t z_t) = E(m_{t+1} x_{t+1} z_t)
$$

One can incorporate conditioning information by adding scaled payoffs and doing everything
unconditionally. I interpret scaled returns as payoffs to managed portfolios.

8.1.1 Conditioning down

The unconditional implications of any pricing model are pretty easy to state. From

$$
p_t = E_t(m_{t+1} x_{t+1})
$$

we can take unconditional expectations to obtain [6]

$$
E(p_t) = E(m_{t+1} x_{t+1}). \tag{108}
$$

Thus, if we just interpret p to stand for E(p_t), everything we have done above applies
to unconditional moments. In the same way, we can also condition down from agents' fine
information sets to coarser sets that we observe,

$$
\begin{aligned}
p_t = E(m_{t+1} x_{t+1} \mid \Omega) \;&\Rightarrow\; E(p_t \mid I \subset \Omega) = E(m_{t+1} x_{t+1} \mid I \subset \Omega) \\
&\Rightarrow\; p_t = E(m_{t+1} x_{t+1} \mid I_t \subset \Omega_t) \quad \text{if } p_t \in I_t.
\end{aligned}
$$

[6] We need a small technical assumption that the unconditional moment or moment conditioned
on a coarser information set exists. For example, if X and Y are normal (0, 1), then
E(X/Y | Y) = 0 but E(X/Y) is not finite.

In making the above statements I used the law of iterated expectations, which is important
enough to highlight it. This law states that if you take an expected value using less
information of an expected value that is formed on more information, you get back the expected value
using less information. Your best forecast today of your best forecast tomorrow is the same
as your best forecast today. In various useful guises,

$$
E(E_t(x)) = E(x),
$$

$$
E_{t-1}(E_t(x_{t+1})) = E_{t-1}(x_{t+1}),
$$

$$
E\left[E(x \mid \Omega) \mid I \subset \Omega\right] = E\left[x \mid I\right].
$$
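
The law of iterated expectations holds exactly for sample averages when conditional expectations are taken as group means. The sketch below uses invented discrete signals as a stand-in for information sets (a fine set formed from two signals and a coarse set using only the first); the helper `cond_mean` and all numbers are assumptions for illustration.

```python
import numpy as np

def cond_mean(x, labels):
    """Replace each observation by the mean of x within its label group."""
    out = np.empty_like(x)
    for k in np.unique(labels):
        out[labels == k] = x[labels == k].mean()
    return out

rng = np.random.default_rng(3)
n = 1000
s1 = rng.integers(0, 2, size=n)          # coarse signal (information set I)
s2 = rng.integers(0, 3, size=n)          # extra signal seen only by agents
fine = s1 * 3 + s2                       # fine information set (Omega)
x = s1 + 0.5 * s2 + rng.standard_normal(n)

e_fine = cond_mean(x, fine)              # E(x | Omega)
e_coarse = cond_mean(x, s1)              # E(x | I)

# E[E(x | Omega)] = E(x), and E[E(x | Omega) | I] = E(x | I).
print(e_fine.mean(), x.mean())
print(np.abs(cond_mean(e_fine, s1) - e_coarse).max())
```

Averaging the fine forecast within each coarse group recovers the coarse forecast, which is the "best forecast today of your best forecast tomorrow" statement in sample form.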

8.1.2 Instruments and managed portfolios

We can do more than just condition down. Suppose we multiply the payoff and price by an
instrument zt observed at time t. Then,

$$
z_t p_t = E_t(m_{t+1} x_{t+1} z_t)
$$

and, taking unconditional expectations,

$$
E(p_t z_t) = E(m_{t+1} x_{t+1} z_t). \tag{109}
$$

This is an additional implication of the conditional model, not captured by just condition-
ing down as in (8.108). This trick originates from the GMM method of estimating asset
pricing models, discussed below. The word instruments for the z variables comes from the
instrumental variables estimation heritage of GMM.
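
A two-state example shows why the scaled condition has extra bite. The numbers below are hypothetical: the instrument z takes two equally likely values, and a candidate discount factor misprices a return conditionally (too low in one state, too high in the other) in a way that averages out unconditionally.

```python
import numpy as np

# Two equally likely values of the time-t instrument z (hypothetical numbers).
z = np.array([1.0, 2.0])
prob = np.array([0.5, 0.5])

# A candidate discount factor whose conditional pricing of a return is wrong
# in each state: E_t(m R) is 0.9 when z = 1 and 1.1 when z = 2, not 1.
cond_E_mR = np.array([0.9, 1.1])

# Conditioning down: E(m R) versus the unconditional price E(1) = 1.
uncond_error = prob @ cond_E_mR - 1.0

# Scaling by z: E(m R z) versus the expected price E(z).
scaled_error = prob @ (cond_E_mR * z) - prob @ z
print(uncond_error, scaled_error)
```

The unconditional error is zero, so conditioning down alone would not reject this discount factor, while the scaled condition leaves an error of 0.05 because z is correlated with the conditional mispricing.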
To think about equation (8.109), group (x_{t+1} z_t). Call this product a payoff x = x_{t+1} z_t,
with price p = E(p_t z_t). Then (8.109) reads

$$
p = E(mx)
$$

once again. Rather than thinking about (8.109) as an instrumental variables estimate of a
conditional model, we can think of it as a price and a payoff, and apply all the asset pricing
theory directly.
This interpretation is not as artificial as it sounds. z_t x_{t+1} are the payoffs to managed
portfolios. An investor who observes z_t can, rather than "buy and hold," invest in an asset
according to the value of z_t. For example, if a high value of z_t forecasts that asset returns are
likely to be high the next period, the investor might buy more of the asset when z_t is high and
vice-versa. If the investor follows a linear rule, he puts z_t p_t dollars into the asset each period
and receives z_t x_{t+1} dollars the next period.
This all sounds new and different, but practically every test uses managed portfolios.
For example, the size, beta, industry, book/market and so forth portfolios of stocks are all
managed portfolios, since their composition changes every year in response to conditioning
information: the size, beta, etc. of the individual stocks. This idea is also closely related
to the deep idea of dynamic spanning. Markets that are apparently very incomplete can in
reality provide many more state-contingencies through dynamic (conditioned on information)
trading strategies.
Equation (8.109) offers a very simple view of how to incorporate the extra information
in conditioning information: Add managed portfolio payoffs, and proceed with unconditional
moments as if conditioning information didn't exist!
Linearity is not important. If the investor wanted to place, say, $2 + 3z^2$ dollars in the
asset, we could capture this desire with an instrument $z_2 = 2 + 3z^2$. Nonlinear (measurable)
transformations of time-$t$ random variables are again random variables.
We can thus incorporate conditioning information while still looking at unconditional
moments instead of conditional moments, without any of the statistical machinery of explicit
models with time-varying moments. The only subtleties are: 1) The set of asset payoffs
expands dramatically, since we can consider all managed portfolios as well as basic assets,
potentially multiplying every asset return by every information variable. 2) Expected prices
of managed portfolios show up for $p$ instead of just $p = 0$ and $p = 1$ if we started with basic
asset returns and excess returns.

8.2 Sufficiency of adding scaled returns

Checking the expected price of all managed portfolios is, in principle, sufficient to check
all the implications of conditioning information.

$$E(z_t) = E(m_{t+1} R_{t+1} z_t) \;\; \forall z_t \in I_t \;\Rightarrow\; 1 = E(m_{t+1} R_{t+1} \mid I_t)$$

$$E(p_t) = E(m_{t+1} x_{t+1}) \;\; \forall x_{t+1} \in X_{t+1} \;\Rightarrow\; p_t = E(m_{t+1} x_{t+1} \mid I_t)$$

We have shown that we can derive some extra implications from the presence of
conditioning information by adding scaled returns. But does this exhaust the implications of
conditioning information? Are we missing something important by relying on this trick?
The answer is, in principle, no.
I rely on the following mathematical fact: The conditional expectation of a variable $y_{t+1}$
given an information set $I_t$, $E(y_{t+1} \mid I_t)$, is equal to a regression forecast of $y_{t+1}$ using every
random variable $z_t \in I_t$. Now, "every random variable" means every variable and every nonlinear
(measurable) transformation of every variable, so there are a lot of variables in this regression!
(The word projection and the notation $\mathrm{proj}(y_{t+1} \mid z_t)$ are used to distinguish the best forecast
of $y_{t+1}$ using only linear combinations of $z_t$ from the conditional expectation.) Applying this fact to our
case, let $y_{t+1} = m_{t+1} R_{t+1} - 1$. Then $E[(m_{t+1} R_{t+1} - 1) z_t] = 0$ for every $z_t \in I_t$ implies
$1 = E(m_{t+1} R_{t+1} \mid I_t)$. Thus, no implications are lost in principle by looking at scaled
returns.
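When the information set is generated by a finite number of states, "every variable in $I_t$" reduces to one indicator instrument per state. The sketch below is a hypothetical illustration with made-up conditional pricing errors: a model can satisfy the unscaled moment and still fail once returns are scaled by an indicator instrument.

```python
import numpy as np

# Hypothetical setup: two equally likely time-t information states.
pi = np.array([0.5, 0.5])

# Assumed conditional pricing errors E(m R - 1 | state) of some candidate model.
# They average to zero, so the unscaled moment E(m R - 1) = 0 is satisfied.
cond_err = np.array([0.05, -0.05])

unscaled = (pi * cond_err).sum()          # E(m R - 1)

# Indicator instruments: z^(k) equals 1 in state k and 0 otherwise.
indicators = np.eye(2)
scaled = indicators @ (pi * cond_err)     # E[(m R - 1) z^(k)] for k = 0, 1

# Dividing by the state probabilities recovers the conditional errors,
# so checking every indicator-scaled moment checks the conditional model.
recovered = scaled / pi

print(unscaled, scaled, recovered)
```

With continuous conditioning variables there is no finite set of indicators, which is why the text speaks of all measurable transformations; the finite-state case is only meant to show the mechanism.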

Another way of looking at the same idea is that $R_{t+1} z_t$ is a payoff available
at time $t+1$. Thus, the space of all payoffs $X_{t+1}$ should be understood to include the
time-$(t+1)$ payoffs you can generate with a basis set of assets $R_{t+1}$ and all dynamic strategies
that use information in the set $I_t$. With that definition of the space $X_{t+1}$ we can write the
sufficiency of scaled returns with the more general second equality above.
"All linear and nonlinear transformations of all variables observed at time $t$" sounds like a
lot of instruments, and it is. But there is a practical limit to the number of instruments $z_t$ one
needs to scale by, since only variables that forecast returns or $m$ (or their higher moments
and co-moments) add any information.
Since adding instruments is the same thing as including potential managed portfolios,
thoughtfully choosing a few instruments is the same thing as the thoughtful choice of a few
assets or portfolios that one makes in any test of an asset pricing model. Even when evaluating
completely unconditional asset pricing models, one always forms portfolios and omits many
possible assets from analysis. Few studies, in fact, go beyond checking whether a model
correctly prices 10-25 stock portfolios and a few bond portfolios. Implicitly, one feels that
the chosen payoffs do a pretty good job of spanning the set of available risk-loadings (mean
returns) and hence that adding additional assets will not affect the results. Nonetheless, since
data are easily available on all 2000 or so NYSE stocks, plus AMEX and NASDAQ stocks, to
say nothing of government and corporate bonds, returns of mutual funds, foreign exchange,
foreign equities, real investment opportunities, etc., the use of a few portfolios means that a
tremendous number of potential asset payoffs are left out in an ad-hoc manner.
In a similar manner, if one had a small set of instruments that capture all the predictability
of discounted returns $m_{t+1} R_{t+1}$, then there would be no need to add more instruments.
Thus, we carefully but arbitrarily select a few instruments that we think do a good job of
characterizing the conditional distribution of returns. Exclusion of potential instruments is
exactly the same thing as exclusion of assets. It is no better founded, but the fact that it is a
common sin may lead one to worry less about it.
There is nothing special about unscaled returns, and no economic reason to place them
above scaled returns. A mutual fund might come into being that follows the managed
portfolio strategy and then its unscaled returns would be the same as an original scaled return.
Models that cannot price scaled returns are no more interesting than models that can only
price (say) stocks with first letter A through L. (There may be econometric reasons to trust
results for nonscaled returns a bit more, but we haven't gotten to statistical issues yet.)
Of course, the other way to incorporate conditioning information is by constructing
explicit parametric models of conditional distributions. With this procedure one can in fact
check all of a model's implications about conditional moments. However, the parametric
model may be incorrect, or may not reflect some variable used by investors. Including
instruments may not be as efficient, but it is still consistent if the parametric model is incorrect.
The wrong parametric model of conditional distributions may lead to inconsistent estimates.
In addition, one avoids estimating nuisance parameters of the parametric distribution model.

8.3 Conditional and unconditional models

A conditional factor model does not imply a fixed-weight or unconditional factor model:
$m_{t+1} = b_t' f_{t+1}$, $p_t = E_t(m_{t+1} x_{t+1})$ does not imply that $\exists\, b$ s.t. $m_{t+1} = b' f_{t+1}$,
$E(p_t) = E(m_{t+1} x_{t+1})$.
$E_t(R_{t+1}) = \beta_t' \lambda_t$ does not imply $E(R_{t+1}) = \beta' \lambda$.
Conditional mean-variance efficiency does not imply unconditional mean-variance
efficiency.
The converse statements are true, if managed portfolios are included.

For explicit discount factor models (models whose parameters are constant over time),
the fact that one looks at conditional vs. unconditional implications makes no difference to
the statement of the model.

$$p_t = E_t(m_{t+1} x_{t+1}) \;\Rightarrow\; E(p_t) = E(m_{t+1} x_{t+1})$$

and that's it. Examples include the consumption-based model with power utility,
$m_{t+1} = \beta (c_{t+1}/c_t)^{-\gamma}$, and the log utility CAPM, $m_{t+1} = 1/R^W_{t+1}$.

However, linear factor models include parameters that may vary over time and as
functions of conditioning information. In these cases the transition from conditional to
unconditional moments is much more subtle. We cannot easily condition down the model at the same
time as the prices and payoffs.

8.3.1 Conditional vs. unconditional factor models in discount factor language

As an example, consider the CAPM

$$m = a - bR^W$$

where $R^W$ is the return on the market or wealth portfolio. We can find $a$ and $b$ from the
condition that this model correctly price any two returns, for example $R^W$ itself and a
risk-free rate:

$$\left\{ \begin{array}{l} 1 = E_t(m_{t+1} R^W_{t+1}) \\[1ex] 1 = E_t(m_{t+1}) R^f_t \end{array} \right. \;\Rightarrow\; \left\{ \begin{array}{l} a = \dfrac{1}{R^f_t} + b\, E_t(R^W_{t+1}) \\[2ex] b = \dfrac{E_t(R^W_{t+1}) - R^f_t}{R^f_t \, \sigma_t^2(R^W_{t+1})} \end{array} \right. . \tag{110}$$

As you can see, $b > 0$ and $a > 0$: to make a payoff proportional to the minimum
second-moment return (on the inefficient part of the mean-variance frontier) we need a portfolio long
the risk free rate and short the market $R^W$.
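The algebra behind (8.110) is short. As a sketch, using only the two pricing conditions, the form $m_{t+1} = a - bR^W_{t+1}$, and the definition $\sigma_t^2(R^W_{t+1}) = E_t[(R^W_{t+1})^2] - E_t(R^W_{t+1})^2$:

```latex
\begin{align*}
1 = E_t(m_{t+1})\, R^f_t
  &\;\Rightarrow\; a - b\,E_t(R^W_{t+1}) = \frac{1}{R^f_t}
   \;\Rightarrow\; a = \frac{1}{R^f_t} + b\,E_t(R^W_{t+1}), \\[1ex]
1 = E_t(m_{t+1} R^W_{t+1})
  &= a\,E_t(R^W_{t+1}) - b\,E_t\!\big[(R^W_{t+1})^2\big] \\
  &= \frac{E_t(R^W_{t+1})}{R^f_t}
     - b\,\Big\{ E_t\!\big[(R^W_{t+1})^2\big] - E_t(R^W_{t+1})^2 \Big\} \\
  &= \frac{E_t(R^W_{t+1})}{R^f_t} - b\,\sigma_t^2(R^W_{t+1}) \\[1ex]
  &\;\Rightarrow\; b = \frac{E_t(R^W_{t+1}) - R^f_t}{R^f_t\,\sigma_t^2(R^W_{t+1})}.
\end{align*}
```

The sign of $b$ is therefore the sign of the conditional market premium $E_t(R^W_{t+1}) - R^f_t$, and $a > 0$ follows whenever $b > 0$ and returns are positive on average.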

More importantly for our current purposes, $a$ and $b$ vary over time, as $E_t(R^W_{t+1})$, $\sigma_t^2(R^W_{t+1})$,
and $R^f_t$ vary over time. If it is to price assets conditionally, the CAPM must be a linear factor
model with time-varying weights, of the form

$$m_{t+1} = a_t - b_t R^W_{t+1}.$$

This fact means that we can no longer transparently condition down. The statement that

$$1 = E_t\left[(a_t + b_t R^W_{t+1}) R_{t+1}\right]$$

does not imply that we can find constants $a$ and $b$ so that

$$1 = E\left[(a + b R^W_{t+1}) R_{t+1}\right].$$

Just try it. Taking unconditional expectations,

$$1 = E\left[(a_t + b_t R^W_{t+1}) R_{t+1}\right] = E\left[a_t R_{t+1} + b_t R^W_{t+1} R_{t+1}\right]$$

$$= E(a_t)E(R_{t+1}) + E(b_t)E(R^W_{t+1} R_{t+1}) + \mathrm{cov}(a_t, R_{t+1}) + \mathrm{cov}(b_t, R^W_{t+1} R_{t+1}).$$

Thus, the unconditional model

$$1 = E\left[\left(E(a_t) + E(b_t) R^W_{t+1}\right) R_{t+1}\right]$$

only holds if the covariance terms above happen to be zero. Since $a_t$ and $b_t$ are formed from
conditional moments of returns, the covariances will not, in general, be zero.
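A small numerical sketch can make the role of the covariance terms visible. Everything below is hypothetical: a two-regime economy with made-up numbers, weights computed from (8.110) with the sign convention $m_{t+1} = a_t - b_t R^W_{t+1}$, and the market return itself used as the test asset. It only shows that the fixed-weight model built from $E(a_t)$ and $E(b_t)$ misprices; it does not search over all possible constants.

```python
import numpy as np

# Hypothetical two-regime economy (all numbers are assumptions for illustration).
pi = np.array([0.5, 0.5])                 # probability of each time-t regime
Rf = np.array([1.02, 1.01])               # conditionally risk-free rate in each regime
RW = np.array([[0.95, 1.15],              # market return outcomes in regime 0
               [0.80, 1.40]])             # market return outcomes in regime 1
q = np.array([[0.5, 0.5],                 # conditional outcome probabilities
              [0.5, 0.5]])

# Conditional mean and variance of the market return, regime by regime.
ERW = (q * RW).sum(axis=1)
varRW = (q * RW**2).sum(axis=1) - ERW**2

# Time-varying weights from (8.110), with m_{t+1} = a_t - b_t * RW_{t+1}.
b_t = (ERW - Rf) / (Rf * varRW)
a_t = 1.0 / Rf + b_t * ERW

# Conditional pricing of the market: E_t[(a_t - b_t RW) RW] in each regime.
cond_price = (q * (a_t[:, None] - b_t[:, None] * RW) * RW).sum(axis=1)

# Fixed-weight model using the average weights E(a_t) and E(b_t).
a_bar, b_bar = (pi * a_t).sum(), (pi * b_t).sum()
joint = pi[:, None] * q
uncond_price = (joint * (a_bar - b_bar * RW) * RW).sum()

print(cond_price)      # both entries equal 1 up to rounding
print(uncond_price)    # differs from 1: the covariance terms are not zero
```

In this example the weights and the conditional moments of the market move together across regimes, which is exactly the situation in which the covariance terms do not vanish.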
On the other hand, suppose it is true that $a_t$ and $b_t$ are constant over time. Then

$$1 = E_t\left[(a + b R^W_{t+1}) R_{t+1}\right]$$

does imply

$$1 = E\left[(a + b R^W_{t+1}) R_{t+1}\right],$$

just like any other constant-parameter factor pricing model. Furthermore, the latter
unconditional model implies the former conditional model, if the latter holds for all managed
portfolios.

8.3.2 Conditional vs. unconditional in an expected return / beta model

To put the same observation in beta-pricing language,

$$E_t(R^i) = R^f_t + \beta_t \lambda_t \tag{111}$$

does not imply that

$$E(R^i) = \alpha + \beta\lambda \tag{112}$$

The reason is that $\beta_t$ and $\beta$ represent conditional and unconditional regression coefficients
respectively.
Again, if returns and factors are i.i.d., the unconditional model can go through. In that
case, $\mathrm{cov}(\cdot) = \mathrm{cov}_t(\cdot)$, $\mathrm{var}(\cdot) = \mathrm{var}_t(\cdot)$, so the unconditional regression beta is the same
as the conditional regression beta, $\beta = \beta_t$. Then, we can take expectations of (8.111) to get
(8.112), with $\lambda = E(\lambda_t)$. But to condition down in this way, the covariance and variance must
each be constant over time. It is not enough that their ratio, the conditional beta, is constant.
If $\mathrm{cov}_t$ and $\mathrm{var}_t$ change over time, then the unconditional regression beta, $\beta = \mathrm{cov}/\mathrm{var}$, is
not in general equal to the average conditional regression beta, $E(\beta_t)$ or $E(\mathrm{cov}_t/\mathrm{var}_t)$. Some models
specify that $\mathrm{cov}_t$ and $\mathrm{var}_t$ vary over time, but $\mathrm{cov}_t/\mathrm{var}_t$ is a constant. This specification still
does not imply that the unconditional regression beta $\beta \equiv \mathrm{cov}/\mathrm{var}$ is equal to the constant
$\mathrm{cov}_t/\mathrm{var}_t$. Similarly, it is not enough that $\lambda$ be constant, since $E(\beta_t) \neq \beta$. The betas must
be regression coefficients, not just numbers.
If the betas do not vary over time, the $\lambda_t$ may still vary and $\lambda = E(\lambda_t)$.
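The gap between a constant conditional beta and the unconditional regression beta is easy to reproduce numerically. The sketch below is a hypothetical two-regime example with made-up numbers; in it the conditional beta equals `beta_c` in both regimes, and the unconditional beta differs because the conditional means of the return and the factor move together across regimes.

```python
import numpy as np

# Hypothetical two-regime example (all numbers are assumptions for illustration).
# In regime s the factor is f = mu_f[s] +/- sd_f[s] with equal probability, and
# the return is R = c[s] + beta_c * f, so the conditional beta is beta_c in both regimes.
pi = np.array([0.5, 0.5])
mu_f = np.array([0.00, 0.05])
sd_f = np.array([0.10, 0.20])
c = np.array([0.00, 0.10])
beta_c = 1.0

# Enumerate the four (regime, outcome) pairs with their probabilities.
prob = np.repeat(pi, 2) * 0.5
regime = np.repeat([0, 1], 2)
shock = np.tile([-1.0, 1.0], 2)
f = mu_f[regime] + shock * sd_f[regime]
R = c[regime] + beta_c * f

def beta(prob, R, f):
    """Regression beta cov(R, f) / var(f) under the given (renormalized) probabilities."""
    prob = prob / prob.sum()
    Ef, ER = (prob * f).sum(), (prob * R).sum()
    cov = (prob * (R - ER) * (f - Ef)).sum()
    var = (prob * (f - Ef) ** 2).sum()
    return cov / var

cond_betas = [beta(prob[regime == s], R[regime == s], f[regime == s]) for s in (0, 1)]
uncond_beta = beta(prob, R, f)

print(cond_betas)     # both equal beta_c up to rounding
print(uncond_beta)    # not equal to beta_c
```

Setting `c` and `mu_f` to constants across regimes removes the discrepancy in this particular setup, which is one way to see that time variation in conditional means is what drives it here.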

8.3.3 A precise statement

Let's formalize these observations somewhat. Let $X$ denote the space of all portfolios of the
primitive assets, including managed portfolios in which the weights may depend on
conditioning information, i.e. scaled returns.
A conditional factor pricing model is a model $m_{t+1} = a_t + b_t' f_{t+1}$ that satisfies
$p_t = E_t(m_{t+1} x_{t+1})$ for all $x_{t+1} \in X$.
An unconditional factor pricing model is a model $m_{t+1} = a + b' f_{t+1}$ that satisfies
$E(p_t) = E(m_{t+1} x_{t+1})$ for all $x_{t+1} \in X$. It might be more appropriately called a fixed-weight
factor pricing model.
Given these definitions it's almost trivial that the unconditional model is just a special
case of the conditional model, one that happens to have fixed weights. Thus, a conditional
factor model does not imply an unconditional factor model (because the weights may vary)
but an unconditional factor model does imply a conditional factor model.
There is one important subtlety. The payoff space $X$ is common, and contains all managed
portfolios in both cases. The payoff space for the unconditional factor pricing model is not
just fixed combinations of a set of basis assets. For example, we might simply check that
the static (constant $a$, $b$) CAPM captures the unconditional mean returns of a set of assets. If
this model does not also price those assets scaled by instruments, then it is not a conditional
model, or, as I argued above, really a valid factor pricing model at all.
Of course, everything applies for the relation between a conditional factor pricing model
using a fine information set (like investors' information sets) and conditional factor pricing
models using coarser information sets (like ours). If you think a set of factors prices assets
with respect to investors' information, that does not mean the same set of factors prices assets
with respect to our, coarser, information sets.

8.3.4 Mean-variance frontiers

Define the conditional mean-variance frontier as the set of returns that minimize $\mathrm{var}_t(R_{t+1})$
given $E_t(R_{t+1})$. (This definition includes the lower segment as usual.) Define the
unconditional mean-variance frontier as the set of returns including managed portfolio returns that
minimize $\mathrm{var}(R_{t+1})$ given $E(R_{t+1})$. These two frontiers are related by:
If a return is on the unconditional mean-variance frontier, it is on the conditional
mean-variance frontier.

However,
If a return is on the conditional mean-variance frontier, it need not be on the
unconditional mean-variance frontier.

These statements are exactly the opposite of what you first expect from the language. The
law of iterated expectations $E(E_t(x)) = E(x)$ leads you to expect that "conditional" should
imply "unconditional." But we are studying the conditional vs. unconditional mean-variance
frontier, not raw conditional and unconditional expectations, and it turns out that exactly the
opposite words apply. Of course "unconditional" can also mean "conditional on a coarser
information set."
Again, keep in mind that the unconditional mean-variance frontier includes returns on
managed portfolios. This definition is eminently reasonable. If you're trying to minimize
variance for given mean, why tie your hands to fixed-weight portfolios? Equivalently, why
not allow yourself to include in your portfolio the returns of mutual funds whose advisers
promise the ability to adjust portfolios based on conditioning information?
You could form a mean-variance frontier of fixed-weight portfolios of a basis set of assets,
and this is what many people often mean by "unconditional mean-variance frontier." The
return on the true unconditional mean-variance frontier will, in general, include some managed
portfolio returns, and so will lie outside this mean-variance frontier of fixed-weight portfolios.
Conversely, a return on the fixed-weight portfolio MVF is, in general, not on the
unconditional or conditional mean-variance frontier. All we know is that the fixed-weight frontier lies
inside the other two. It may touch, but it need not. This is not to say the fixed-weight
unconditional frontier is uninteresting. For example, returns on this frontier will price fixed-weight
portfolios of the basis assets. The point is that this frontier has no connection to the other two
frontiers. In particular, a conditionally mean-variance efficient return (conditional CAPM)
need not unconditionally price the fixed-weight portfolios.

I offer several ways to see this important statement.

Using the connection to factor models

We have seen that the conditional CAPM $m_{t+1} = a_t - b_t R^W_{t+1}$ does not imply an
unconditional CAPM $m_{t+1} = a - b R^W_{t+1}$. We have seen that the existence of such a conditional
factor model is equivalent to the statement that the return $R^W_{t+1}$ lies on the conditional
mean-variance frontier, and the existence of an unconditional factor model $m_{t+1} = a - b R^W_{t+1}$ is
equivalent to the statement that $R^W$ is on the unconditional mean-variance frontier. Then,
from the "trivial" fact that an unconditional factor model is a special case of a conditional
one, we know that $R^W$ on the unconditional frontier implies $R^W$ on the conditional frontier
but not vice-versa.

Using the orthogonal decomposition

We can see the relation between conditional and unconditional mean-variance frontiers
using the orthogonal decomposition characterization of mean-variance efficiency given above.
This beautiful proof is the main point of Hansen and Richard (1987).
By the law of iterated expectations, $x^*$ and $R^*$ generate expected prices and $R^{e*}$ generates
unconditional means as well as conditional means:

$$E\left[p = E_t(x^* x)\right] \;\Rightarrow\; E(p) = E(x^* x)$$

$$E\left[E_t(R^{*2}) = E_t(R^* R)\right] \;\Rightarrow\; E(R^{*2}) = E(R^* R)$$

$$E\left[E_t(R^{e*} R^e) = E_t(R^e)\right] \;\Rightarrow\; E(R^{e*} R^e) = E(R^e)$$

This fact is subtle and important. For example, starting with $x^* = p_t' E_t(x_{t+1} x_{t+1}')^{-1} x_{t+1}$,
you might think we need a different $x^*$, $R^*$, $R^{e*}$ to represent expected prices and
unconditional means, using unconditional probabilities to define inner products. The three lines
above show that this is not the case. The same old $x^*$, $R^*$, $R^{e*}$ represent conditional as well
as unconditional prices and means.
Recall that a return is mean-variance efficient if and only if it is of the form

$$R^{mv} = R^* + wR^{e*}.$$

Thus, $R^{mv}$ is conditionally mean-variance efficient if $w$ is any number in the time-$t$
information set,

$$\text{conditional frontier:} \quad R^{mv}_{t+1} = R^*_{t+1} + w_t R^{e*}_{t+1},$$

and $R^{mv}$ is unconditionally mean-variance efficient if $w$ is any constant,

$$\text{unconditional frontier:} \quad R^{mv}_{t+1} = R^*_{t+1} + w R^{e*}_{t+1}.$$

Constants are in the time-$t$ information set; time-$t$ random variables are not necessarily constant.
Thus unconditional efficiency (including managed portfolios) implies conditional efficiency
but not vice versa. As with the factor models, once you see the decomposition, it is a trivial
argument about whether a weight is constant or time-varying.

Brute force and examples.

If you're still puzzled, an additional argument by brute force may be helpful.
If a return is on the unconditional MVF it must be on the conditional MVF at each date.
If not, you could improve the unconditional mean-variance trade-off by moving to the
conditional MVF at each date. Minimizing unconditional variance given mean is the same as
minimizing unconditional second moment given mean,

$$\min E(R^2) \;\; \text{s.t.} \;\; E(R) = \mu$$

Writing the unconditional moment in terms of conditional moments, the problem is

$$\min E\left[E_t(R^2)\right] \;\; \text{s.t.} \;\; E\left[E_t(R)\right] = \mu$$

Now, suppose you could lower $E_t(R^2)$ at one date $t$ without affecting $E_t(R)$ at that date.
This change would lower the objective, without changing the constraint. Thus, you should
have done it: you should have picked returns on the conditional mean-variance frontiers.
It almost seems that reversing the argument we can show that conditional efficiency
implies unconditional efficiency, but it doesn't. Just because you have minimized $E_t(R^2)$ for
given value of $E_t(R)$ at each date $t$ does not imply that you have minimized $E(R^2)$ for a
given value of $E(R)$. In showing that unconditional efficiency implies conditional efficiency
we held fixed $E_t(R)$ at each date at $\mu$, and showed it is a good idea to minimize $\sigma_t(R)$. In
trying to go backwards, the problem is that a given value of $E(R)$ does not specify what
$E_t(R)$ should be at each date. We can increase $E_t(R)$ in one conditioning information set
and decrease it in another, leaving the return on the conditional MVF.
Figure 22 presents an example. Return B is conditionally mean-variance efficient. It also
has zero unconditional variance, so it is the unconditionally mean-variance efficient return at
the expected return shown. Return A is on the conditional mean-variance frontiers, and has
the same unconditional expected return as B. But return A has some unconditional variance,
and so is inside the unconditional mean-variance frontier.
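One way to see where return A's unconditional variance comes from is the standard variance decomposition (a general identity, stated here as a reminder rather than quoted from the text):

```latex
\operatorname{var}(R_{t+1})
  \;=\; E\big[\operatorname{var}_t(R_{t+1})\big]
  \;+\; \operatorname{var}\big[E_t(R_{t+1})\big].
```

Choosing a return on the conditional frontier at each date controls only the first term for a given $E_t(R_{t+1})$. A return whose conditional mean differs across information sets picks up the second term as well, even if it is conditionally efficient in every information set.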
As a second example, the riskfree rate is only on the unconditional mean-variance frontier
if it is a constant. Remember the expression (6.95) for the risk free rate,

$$R^f = R^* + R^f R^{e*}.$$

The unconditional mean-variance frontier is $R^* + wR^{e*}$ with $w$ a constant. Thus, the riskfree
rate is only unconditionally mean-variance efficient if it is a constant.

[Figure 22 (image not recovered). Axes: $E_t(R)$ versus $\sigma_t(R)$. Labels: Info. set 1, Info. set 2, returns A and B.]

Figure 22. Return A is on the conditional mean-variance frontiers but not on the
unconditional mean-variance frontier.

8.3.5 Implications: Hansen-Richard Critique.

Many models, such as the CAPM, imply a conditional linear factor model
$m_{t+1} = a_t + b_t' f_{t+1}$. These theorems show that such a model does not imply an unconditional model.
Equivalently, if the model predicts that the market portfolio is conditionally mean-variance
efficient, this does not imply that the market is unconditionally mean-variance efficient. We
often test the CAPM by seeing if it explains the average returns of some portfolios or
(equivalently) if the market is on the unconditional mean-variance frontier. The CAPM may quite
well be true (conditionally) and fail these tests; many assets may do better in terms of
unconditional mean vs. unconditional variance.
The situation is even worse than these comments suggest, and is not repaired by simple
inclusion of some conditioning information. Models such as the CAPM imply a conditional
linear factor model with respect to investors' information sets. However, the best we can hope
to do is to test implications conditioned down on variables that we can observe and include
in a test. Thus, a conditional linear factor model is not testable!
I like to call this observation the "Hansen-Richard critique" by analogy to the "Roll
Critique." Roll pointed out, among other things, that the wealth portfolio might not be
observable, making tests of the CAPM impossible. Hansen and Richard point out that the
conditioning information of agents might not be observable, and that one cannot omit it in testing a
conditional model. Thus, even if the wealth portfolio were observable, the fact that we cannot
observe agents' information sets dooms tests of the CAPM.

8.4 Scaled factors: a partial solution

You can expand the set of factors to test conditional factor pricing models

$$\text{factors} = f_{t+1} \otimes z_t$$

The problem is that the parameters of the factor pricing model $m_{t+1} = a_t + b_t f_{t+1}$ may
vary over time. A partial solution is to model the dependence of parameters $a_t$ and $b_t$ on
variables in the time-$t$ information set; let $a_t = a(z_t)$, $b_t = b(z_t)$ where $z_t$ is a vector of
variables observed at time $t$ (including a constant). In particular, why not try linear models

$$a_t = a' z_t, \quad b_t = b' z_t$$

Linearity is not restrictive: $z_t^2$ is just another instrument. The only criticism one can make
is that some instrument $z_{jt}$ is important for capturing the variation in $a_t$ and $b_t$, and was
omitted. For instruments on which we have data, we can meet this objection by trying $z_{jt}$
and seeing whether it does, in fact, enter significantly. However, for instruments $z_t$ that are
observed by agents but not by us, this criticism remains valid.
Linear discount factor models lead to a nice interpretation as scaled factors, in the same
way that linearly managed portfolios are scaled returns. With a single factor and instrument,
write

$$m_{t+1} = a(z_t) + b(z_t) f_{t+1} \tag{113}$$

$$= a_0 + a_1 z_t + (b_0 + b_1 z_t) f_{t+1}$$

$$= a_0 + a_1 z_t + b_0 f_{t+1} + b_1 (z_t f_{t+1}). \tag{114}$$

Thus, in place of the one-factor model with time-varying coefficients (8.113), we have a
three-factor model $(z_t, f_{t+1}, z_t f_{t+1})$ with fixed coefficients, (8.114).
Since the coefficients are now fixed, we can use the scaled-factor model with
unconditional moments.

$$p_t = E_t\left[(a_0 + a_1 z_t + b_0 f_{t+1} + b_1 (z_t f_{t+1}))\, x_{t+1}\right] \;\Rightarrow$$

$$E(p_t) = E\left[(a_0 + a_1 z_t + b_0 f_{t+1} + b_1 (z_t f_{t+1}))\, x_{t+1}\right]$$
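The equivalence between the time-varying one-factor model (8.113) and the fixed-coefficient three-factor model (8.114) is a pure rearrangement, which a few lines of array code can confirm. The series and coefficient values below are made-up assumptions for illustration only.

```python
import numpy as np

# Hypothetical series and coefficients (assumed values for illustration only).
z = np.array([0.8, 1.1, 0.9, 1.3])        # instrument z_t
f = np.array([0.02, -0.01, 0.03, 0.00])   # factor f_{t+1}
a0, a1, b0, b1 = 1.0, -0.2, -2.0, 0.5

# One-factor model with time-varying coefficients, as in (8.113).
a_t = a0 + a1 * z
b_t = b0 + b1 * z
m_conditional = a_t + b_t * f

# Fixed-coefficient model in the factors (1, z_t, f_{t+1}, z_t * f_{t+1}), as in (8.114).
factors = np.column_stack([np.ones_like(z), z, f, z * f])
coeffs = np.array([a0, a1, b0, b1])
m_scaled = factors @ coeffs

print(np.allclose(m_conditional, m_scaled))
```

Building the `factors` matrix once is convenient in practice: any routine written for a fixed-weight linear factor model can then be applied to the scaled factors unchanged.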

For example, in standard derivations of CAPM, the market (wealth portfolio) return is
conditionally mean-variance efficient; investors want to hold portfolios on the conditional
mean-variance frontier; conditionally expected returns follow a conditional single-beta
representation, or the discount factor $m$ follows a conditional linear factor model

$$m_{t+1} = a_t - b_t R^W_{t+1}$$

as we saw above.
But none of these statements mean that we can use the CAPM unconditionally. Rather
than throw up our hands, we can add some scaled factors. Thus, if, say, the dividend/price
ratio and term premium do a pretty good job of summarizing variation in conditional moments,
the conditional CAPM implies an unconditional, five-factor (plus constant) model. The
factors are a constant, the market return, the dividend/price ratio, the term premium, and the
market return times the dividend/price ratio and the term premium.
The unconditional pricing implications of such a five-factor model could, of course, be
summarized by a single-$\beta$ representation. (See the caustic comments in the section on
implications and equivalence.) The reference portfolio would not be the market portfolio, of
course, but a mimicking portfolio of the five factors. However, the single mimicking
portfolio would not be easily interpretable in terms of a single factor conditional model and two
instruments. In this case, it might be more interesting to look at a multiple-$\beta$ or
multiple-factor representation.
If we have many factors $f$ and many instruments $z$, we should in principle multiply every
factor by every instrument,

$$m = b_1 f_1 + b_2 f_1 z_1 + b_3 f_1 z_2 + \ldots + b_{N+1} f_2 + b_{N+2} f_2 z_1 + b_{N+3} f_2 z_2 + \ldots$$

This operation can be compactly summarized with the Kronecker product notation, $a \otimes b$,
which means "multiply every element in vector $a$ by every element in vector $b$," or

$$m_{t+1} = b'(f_{t+1} \otimes z_t).$$
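As a hedged illustration of the Kronecker notation, NumPy's `np.kron` produces exactly this "every factor times every instrument" ordering for one-dimensional arrays. The factor values, instrument values, and coefficient vector `b` below are made-up assumptions.

```python
import numpy as np

# Hypothetical values: two factors and an instrument vector whose first entry is a constant.
f = np.array([0.03, -0.01])          # (f1, f2) at time t+1
z = np.array([1.0, 0.5, 2.0])        # (1, z1, z2) at time t

# np.kron multiplies every element of f by every element of z, in the
# order f1*1, f1*z1, f1*z2, f2*1, f2*z1, f2*z2.
scaled_factors = np.kron(f, z)

# A discount factor m = b'(f kron z) for an assumed coefficient vector b.
b = np.array([1.0, 0.2, -0.3, 0.5, 0.0, 0.1])
m = b @ scaled_factors

print(scaled_factors)
print(m)
```

Including the constant as the first instrument is what makes the unscaled factors $f_1$ and $f_2$ appear in the list alongside their scaled versions.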

8.5 Summary

When you first think about it, conditioning information sounds scary: how do we account for
time-varying expected returns, betas, factor risk premia, variances, covariances, etc.?
However, the methods outlined in this chapter allow a very simple and beautiful solution to the
problems raised by conditioning information. To express the conditional implications of a
given model, all you have to do is include some scaled or managed portfolio returns, and then
pretend you never heard about conditioning information.
Some factor models are conditional models, and have coefficients that are functions of
investors' information sets. In general, there is no way to test such models, but if you are
willing to assume that the relevant conditioning information is well summarized by a few
variables, then you can just add new factors, equal to the old factors scaled by the conditioning
variables, and again forget that you ever heard about conditioning information.
You may want to remember conditioning information as a diagnostic and in economic
interpretation of the results. It may be interesting to take estimates of a many factor model,
$m_{t+1} = a_0 + a_1 z_t + b_0 f_{t+1} + b_1 z_t f_{t+1}$, and see what they say about the implied conditional
model, $m_{t+1} = (a_0 + a_1 z_t) + (b_0 + b_1 z_t) f_{t+1}$. You may want to make plots of conditional
$b$s, betas, factor risk premia, expected returns, etc. But you don't have to worry about it in
estimation and testing.

8.6 Problems

1. If there is a risk free asset, is it on the a) conditional b) unconditional c) both
mean-variance frontier?
2. If there is a conditionally riskfree asset (a claim to 1 is traded at each date), does this
mean that there is an unconditionally risk free asset? (Define the latter first!) How about
vice versa?
3. Suppose you took the unconditional population moments $E(R)$, $E(RR')$ of assets
returns and constructed the mean-variance frontier. Does this frontier correspond to the
conditional or the unconditional MV frontier, or neither? What is the key assumption

Chapter 9. Factor pricing models
In Chapter 2, I noted that the consumption-based model, while a complete answer to most
asset pricing questions in principle, does not (yet) work well in practice. This observation
motivates efforts to tie the discount factor m to other data. Linear factor pricing models are
the most popular models of this sort in finance. They dominate discrete time empirical work.
Factor pricing models replace the consumption-based expression for marginal utility
growth with a linear model of the form

$$m_{t+1} = a + b' f_{t+1}$$

$a$ and $b$ are free parameters. This specification is equivalent to a multiple-beta model

$$E(R_{t+1}) = \alpha + \beta' \lambda$$

where $\beta$ are multiple regression coefficients of returns $R$ on the factors $f$. Here, $\alpha$ and $\lambda$ are
the free parameters.
The big question is, what should one use for factors $f_{t+1}$? Factor pricing models look for
variables that are good proxies for aggregate marginal utility growth, i.e., variables for which

$$\beta \frac{u'(c_{t+1})}{u'(c_t)} \approx a + b' f_{t+1} \tag{115}$$

is a sensible and economically interpretable approximation.
More directly and interpretably, the essence of asset pricing is that there are special states
of the world in which investors are especially concerned that their portfolios not do badly.
They are willing to trade off some overall performance (average return) to make sure that
portfolios do not do badly in these particular states of nature. The factors are variables that
indicate that these "bad states" have occurred.
The factors that result from this search are and should be intuitively sensible. In any
sensible economic model, as well as in the data, consumption is related to returns on
broad-based portfolios, to interest rates, to growth in GNP, investment, or other macroeconomic
variables, and to returns on production processes. All of these variables measure "wealth"
or the state of the economy. Consumption is and should be high in "good times" and low in
"bad times."
Furthermore, consumption and marginal utility respond to news: if a change in some
variable today signals high income in the future, then consumption rises now, by permanent
income logic. This fact opens the door to forecasting variables: any variable that forecasts
asset returns ("changes in the investment opportunity set") or macroeconomic variables is a
candidate factor. Variables such as the term premium, dividend/price ratio, stock returns, etc.
can be defended as pricing factors on this logic. Though they themselves are not measures of
aggregate good or bad times, they forecast such times.

Should factors be independent over time? The answer is, sort of. If there is a constant
real interest rate, then marginal utility growth should be unpredictable. ("Consumption is a
random walk" in the quadratic utility permanent income model.) To see this, just look at the
first order condition with a constant interest rate,

$$u'(c_t) = \beta R^f E_t\left[u'(c_{t+1})\right]$$

or in a more time-series notation,

$$\frac{u'(c_{t+1})}{u'(c_t)} = \frac{1}{\beta R^f} + \varepsilon_{t+1}; \quad E_t(\varepsilon_{t+1}) = 0.$$

The real risk free rate is not constant, but it does not vary a lot, especially compared to
asset returns. Measured consumption growth is not exactly unpredictable but it is the least
predictable macroeconomic time series, especially if one accounts properly for temporal
aggregation (consumption data are quarterly averages). Thus, factors that proxy for marginal
utility growth, though they don't have to be totally unpredictable, should not be highly
predictable. If one chooses highly predictable factors, the model will counterfactually predict
large interest rate variation.
In practice, this consideration means that one should choose the right units: Use GNP
growth rather than level, portfolio returns rather than prices or price/dividend ratios, etc.
However, unless one wants to impose an exactly constant risk free rate, one does not have to
filter or prewhiten factors to make them exactly unpredictable.
This view of factors as intuitively motivated proxies for marginal utility growth is
sufficient to carry the reader through current empirical tests of factor models. The extra constraints
of a formal exposition of theory in this part have not yet constrained the factor-fishing
expedition.
The precise derivations all proceed in the way I have motivated factor models: One writes
down a general equilibrium model, in particular a specification of the production technology
by which real investment today results in real output tomorrow. This general equilibrium
produces relations that express the determinants of consumption from exogenous variables,
and relations linking consumption and other endogenous variables; equations of the form
$c_t = g(f_t)$. One then uses this kind of equation to substitute out for consumption in the basic
first order conditions.
The formal derivations accomplish two things: they determine one particular list of factors
that can proxy for marginal utility growth, and they prove that the relation should be linear.
Some assumptions can often be substituted for others in the quest for these two features of a
factor pricing model.
This is a point worth remembering: all factor models are derived as specializations of the
consumption-based model. Many authors of factor model papers disparage the consumption-
based model, forgetting that their factor model is the consumption-based model plus extra
assumptions that allow one to proxy for marginal utility growth from some other variables.

My presentation follows Constantinides’ (1989) derivation of traditional models as instances
of the consumption-based model in this regard.
Above, I argued that clear economic foundation was important for factor models, since it
is the only guard against fishing. Alas, we discover here that the current state of factor pricing
models is not a particularly good guard against fishing. One can call for better theories or
derivations, more carefully aimed at limiting the list of potential factors and describing the
fundamental macroeconomic sources of risk, and thus providing more discipline for empirical
work. The best minds in finance have been working on this problem for 40 years though, so
a ready solution is not immediately in sight. On the other hand, we will see that even current
theory can provide much more discipline than is commonly imposed in empirical work. For
example, the derivations of the CAPM and ICAPM do leave predictions for the risk free rate
and for factor risk premia that are often ignored. The ICAPM gives tighter restrictions on
state variables than are commonly checked: “State variables” do have to forecast something!
We also see how special and unrealistic are the general equilibrium setups necessary to derive
popular specifications such as CAPM and ICAPM. This observation motivates a more serious
look at real general equilibrium models below.

9.1 Capital Asset Pricing Model (CAPM)

The CAPM is the model $m = a + bR^W$; $R^W$ = wealth portfolio return. I derive it from
the consumption based model by 1) Two period quadratic utility; 2) Two periods, exponential
utility and normal returns; 3) Infinite horizon, quadratic utility and i.i.d. returns; 4) Log utility
and normally distributed returns.

The CAPM is the first, most famous and (so far) most widely used model in asset pricing.
It ties the discount factor $m$ to the return on the “wealth portfolio.” The function is linear,

$$m_{t+1} = a + bR^W_{t+1}.$$

$a$ and $b$ are free parameters. One can find theoretical values for the parameters $a$ and $b$ by
requiring the discount factor $m$ to price any two assets, such as the wealth portfolio return
and risk-free rate, $1 = E(mR^W)$ and $1 = E(m)R^f$. (As an example, we did this in equation
(8.110) above.) In empirical applications, we can also pick $a$ and $b$ to “best” price larger
cross-sections of assets. We do not have good data on, or even a good empirical definition
for, the return on total wealth. It is conventional to proxy $R^W$ by the return on a broad-based
stock portfolio such as the value- or equally-weighted NYSE, S&P500, etc.
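
The two pricing conditions $1 = E(mR^W)$ and $1 = E(m)R^f$ are linear in $a$ and $b$, so they can be solved directly. The sketch below is only an illustration: the function name and the numerical moments (mean and variance of the wealth return, and the risk-free rate) are made-up assumptions, not values from the text.

```python
def capm_discount_factor(mean_rw, var_rw, rf):
    """Solve 1 = E(m R^W) and 1 = E(m) R^f for (a, b) in m = a + b R^W.

    The two conditions read
        a + b E(R^W)            = 1 / R^f
        a E(R^W) + b E(R^W ^2)  = 1
    with E(R^W ^2) = var(R^W) + E(R^W)^2.
    """
    b = (rf - mean_rw) / (rf * var_rw)
    a = 1.0 / rf - b * mean_rw
    return a, b


# Hypothetical moments, chosen only for illustration.
mean_rw, var_rw, rf = 1.08, 0.04, 1.02
a, b = capm_discount_factor(mean_rw, var_rw, rf)

# Check that the resulting m prices both assets.
second_moment = var_rw + mean_rw ** 2
price_rf = (a + b * mean_rw) * rf          # E(m) R^f
price_rw = a * mean_rw + b * second_moment  # E(m R^W)
print(a, b, price_rf, price_rw)
```

With a positive expected excess return on the wealth portfolio, $b$ comes out negative: the discount factor is low when the wealth return is high.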
The CAPM is of course most frequently stated in equivalent expected return / beta language,

$$E(R^i) = \alpha + \beta_{i,R^W}\,[E(R^W) - \alpha].$$

This section briefly describes some classic derivations of the CAPM. Again, we need
to find assumptions that defend which factors proxy for marginal utility ($R^W$ here), and
assumptions to defend the linearity between $m$ and the factor.
I present several derivations of the same model. Many of these derivations use classic
modeling assumptions which are important in their own right. This is also an interesting place
in which to see that various sets of assumptions can often be used to get to the same place.
The CAPM is often criticized for one or another assumption. By seeing several derivations,
we can see how one assumption can be traded for another. For example, the CAPM does not
in fact require normal distributions, if one is willing to swallow quadratic utility instead.

9.1.1 Two-period quadratic utility

Two period investors with no labor income and quadratic utility imply the CAPM.

Investors have quadratic preferences and only live two periods,

$$U(c_t, c_{t+1}) = -\frac{1}{2}(c_t - c^*)^2 - \frac{1}{2}\beta E[(c_{t+1} - c^*)^2]. \tag{116}$$
Their marginal rate of substitution is thus

$$m_{t+1} = \beta\frac{u'(c_{t+1})}{u'(c_t)} = \beta\frac{(c_{t+1} - c^*)}{(c_t - c^*)}.$$

The quadratic utility assumption means marginal utility is linear in consumption. Thus, we
have achieved the first target of the derivation, linearity.
Investors are born with wealth $W_t$ in the first period and earn no labor income. They
can invest in lots of assets with prices $p^i_t$ and payoffs $x^i_{t+1}$, or, to keep the notation simple,
returns $R^i_{t+1}$. They choose how much to consume at the two dates, $c_t$ and $c_{t+1}$, and the
portfolio weights $\alpha_i$ for their investment portfolio. Thus, the budget constraint is

$$
\begin{aligned}
c_{t+1} &= W_{t+1} \\
W_{t+1} &= R^W_{t+1}(W_t - c_t) \\
R^W &= \sum_{i=1}^{N} \alpha_i R^i; \quad \sum_{i=1}^{N} \alpha_i = 1.
\end{aligned}
\tag{117}
$$

$R^W$ is the rate of return on total wealth.

The two-period assumption means that investors consume everything in the second
period, by constraint (9.117). This fact allows us to substitute wealth and the return on wealth
for consumption, achieving the second goal of the derivation, naming the factor that proxies
for consumption or marginal utility:

$$m_{t+1} = \beta\frac{R^W_{t+1}(W_t - c_t) - c^*}{c_t - c^*} = \frac{-\beta c^*}{c_t - c^*} + \frac{\beta(W_t - c_t)}{c_t - c^*}R^W_{t+1}$$

i.e.

$$m_{t+1} = a_t + b_t R^W_{t+1}.$$
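
The substitution can be checked numerically. The following is a minimal sketch: the parameter values (discount factor, bliss point, wealth, first-period consumption) are illustrative assumptions, and first-period consumption is simply taken as given rather than chosen optimally. It compares the discount factor computed directly from marginal utilities with the linear form in the wealth return.

```python
# Illustrative (assumed) parameter values, not taken from the text.
beta = 0.95      # subjective discount factor
c_star = 200.0   # bliss point c*
W_t = 100.0      # initial wealth
c_t = 40.0       # first-period consumption, taken as given here


def m_direct(R_w):
    """m = beta (c_{t+1} - c*) / (c_t - c*), with c_{t+1} = R^W (W_t - c_t)."""
    c_next = R_w * (W_t - c_t)
    return beta * (c_next - c_star) / (c_t - c_star)


# Coefficients of the linear form m = a_t + b_t R^W.
a_t = -beta * c_star / (c_t - c_star)
b_t = beta * (W_t - c_t) / (c_t - c_star)


def m_linear(R_w):
    """Discount factor written as a linear function of the wealth return."""
    return a_t + b_t * R_w


wealth_returns = [0.8, 0.95, 1.0, 1.1, 1.3]
max_gap = max(abs(m_direct(R) - m_linear(R)) for R in wealth_returns)
print(a_t, b_t, max_gap)
```

Because consumption is below the bliss point in this example, $b_t$ is negative: a high wealth return means high consumption and low marginal utility.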

9.1.2 Exponential utility, normal distributions

$u(c) = -e^{-\alpha c}$ and a normally distributed set of returns also produce the CAPM.

The combination of exponential utility and normal distributions is another set of
assumptions that deliver the CAPM in a one or two period model. This structure has a particularly
convenient analytical form. Since it gives rise to linear demand curves, it is very widely
used in models that complicate the trading structure, by introducing incomplete markets or
asymmetric information.
I present a model with consumption only in the last period. (You can do the quadratic
utility model of the last section this way as well.) Utility is

$$E[u(c)] = E\left[-e^{-\alpha c}\right].$$
$\alpha$ is known as the coefficient of absolute risk aversion. If consumption is normally distributed,
we have

$$Eu(c) = -e^{-\alpha E(c) + \frac{\alpha^2}{2}\sigma^2(c)}.$$
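
The closed form for expected exponential utility under normality can be verified by brute-force integration. This is a rough sketch only: the function names, the midpoint-rule integrator, and the values of risk aversion and the mean and standard deviation of consumption are all assumptions made for illustration.

```python
import math


def expected_exp_utility_numeric(alpha, mean_c, sd_c, n=20000, width=10.0):
    """Midpoint-rule integral of -exp(-alpha c) against a normal density for c."""
    lo = mean_c - width * sd_c
    hi = mean_c + width * sd_c
    step = (hi - lo) / n
    total = 0.0
    for k in range(n):
        c = lo + (k + 0.5) * step
        density = math.exp(-0.5 * ((c - mean_c) / sd_c) ** 2) / (
            sd_c * math.sqrt(2.0 * math.pi)
        )
        total += -math.exp(-alpha * c) * density * step
    return total


def expected_exp_utility_closed_form(alpha, mean_c, sd_c):
    """-exp(-alpha E(c) + (alpha^2 / 2) sigma^2(c))."""
    return -math.exp(-alpha * mean_c + 0.5 * alpha ** 2 * sd_c ** 2)


# Hypothetical values for risk aversion and the distribution of consumption.
alpha, mean_c, sd_c = 2.0, 1.0, 0.3
numeric = expected_exp_utility_numeric(alpha, mean_c, sd_c)
closed = expected_exp_utility_closed_form(alpha, mean_c, sd_c)
print(numeric, closed)
```

The two numbers should agree to many decimal places, which shows that only the mean and variance of consumption matter for this investor.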

Suppose this investor has initial wealth $W$ which can be split between a riskfree asset
paying $R^f$ and a set of risky assets paying return $R$. Let $y$ denote the amount of this wealth
$W$ (amount, not fraction) invested in each risky security, and $y^f$ the amount invested in the
riskfree asset. Then, the budget constraint is

$$
\begin{aligned}
c &= y^f R^f + y'R \\
W &= y^f + y'1
\end{aligned}
$$

Plugging the first constraint into the utility function we obtain

$$Eu(c) = -e^{-\alpha[y^f R^f + y'E(R)] + \frac{\alpha^2}{2}y'\Sigma y}, \tag{118}$$

where $\Sigma$ denotes the covariance matrix of the risky returns $R$.

As with quadratic utility, the two-period model is what allows us to set consumption to wealth
and then substitute the return on the wealth portfolio for consumption growth in the discount
factor.
Maximizing (9.118) with respect to $y$, $y^f$, we obtain the first order condition describing
the optimal amount to be invested in the risky asset,

$$y = \Sigma^{-1}\frac{E(R) - R^f}{\alpha}.$$
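
The intermediate step can be sketched as follows, assuming we use the second budget constraint to eliminate $y^f = W - y'1$ (with $1$ a vector of ones, and with $R^f$ subtracted from each element of $E(R)$). Since $-e^{x}$ is decreasing in $x$, maximizing expected utility is the same as maximizing

$$\alpha\left[W R^f + y'\big(E(R) - R^f\big)\right] - \frac{\alpha^2}{2}\,y'\Sigma y,$$

and setting the derivative with respect to $y$ to zero gives

$$\alpha\big(E(R) - R^f\big) - \alpha^2\,\Sigma y = 0 .$$

Wealth $W$ enters only through the constant term $W R^f$, so it drops out of this condition.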
Sensibly, the investor invests more in risky assets if their expected return is higher, less if his
risk aversion coefficient is higher, and less if the assets are riskier. Notice that total wealth
does not appear in this expression. With this setup, the amount invested in risky assets is
independent of the level of wealth. This is why we say that this investor has absolute rather
than relative (to wealth) risk aversion. Note also that these “demands” for the
risky assets are linear in expected returns, which is a very convenient property.
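
A small numerical illustration of these demands, as a sketch under assumed numbers: two risky assets with made-up expected returns and covariance matrix, and helper functions (`invert_2x2`, `mat_vec`, `risky_demand`) that are hypothetical names introduced here. Wealth is deliberately not an argument of the demand function, since it does not appear in the first order condition.

```python
def invert_2x2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def mat_vec(m, v):
    """Matrix-vector product for nested-list matrices."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in m]


def risky_demand(expected_returns, rf, cov, alpha):
    """y = Sigma^{-1} (E(R) - R^f) / alpha: amounts (not fractions) per risky asset."""
    excess = [er - rf for er in expected_returns]
    return [x / alpha for x in mat_vec(invert_2x2(cov), excess)]


# Hypothetical inputs for two risky assets.
expected_returns = [1.08, 1.12]
rf = 1.02
cov = [[0.04, 0.01], [0.01, 0.09]]

y_low = risky_demand(expected_returns, rf, cov, alpha=2.0)
y_high = risky_demand(expected_returns, rf, cov, alpha=4.0)

# Multiplying back, alpha * Sigma * y should recover the expected excess returns.
implied_excess = [2.0 * x for x in mat_vec(cov, y_low)]
print(y_low, y_high, implied_excess)
```

Doubling risk aversion halves the amount held in every risky asset, and the demands scale linearly with expected excess returns.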
Inverting the first order conditions, we obtain

$$E(R) - R^f = \alpha\Sigma y = \alpha\,\mathrm{cov}(R, R^m). \tag{119}$$
