

Chapter 2

Some Basic Theory of

Introduction to Pricing: Single Period Models

Let us begin with a very simple example designed to illustrate the no-arbitrage
approach to pricing derivatives. Consider a stock whose price at present is $s.
Over a given period, the stock may move either up or down: up to a value su,
where u > 1, with probability p, or down to the value sd, where d < 1, with
probability 1 − p. In this model, these are the only moves possible for the stock
in a single period. Over a longer period, of course, many other values are
possible. In this market, we also assume that there is a so-called risk-free bond
available returning a guaranteed rate of r per period. Such a bond cannot
default; there is no random mechanism governing its return which is known
upon purchase. An investment of $1 at the beginning of the period returns a
guaranteed $(1 + r) at the end. Then a portfolio purchased at the beginning
of a period consisting of y stocks and x bonds will return at the end of the
period an amount $x(1 + r) + ysZ, where Z is a random variable taking
values u or d with probabilities p and 1 − p respectively. We permit owning
a negative amount of a stock or bond, corresponding to shorting or borrowing
the corresponding asset for immediate sale.
An ambitious investor might seek a portfolio whose initial cost is zero (i.e.
x + ys = 0) such that the return is never negative and is strictly positive with
positive probability. Such a strategy is called an arbitrage. This means that the
investor is able to achieve a positive probability of future profits with no
down-side risk with a net investment of $0. In mathematical terms, the investor
seeks a point (x, y) such that x + ys = 0 (net cost of the portfolio is zero) and

x(1 + r) + ysu ≥ 0,

x(1 + r) + ysd ≥ 0

with at least one of the two inequalities strict (so there is never a loss and a
non-zero chance of a positive return). Alternatively, is there a point on the line
\( y = -\frac{1}{s}x \) which lies above both of the two lines

\[ y = -\frac{1+r}{su}\,x, \qquad y = -\frac{1+r}{sd}\,x \]

and strictly above one of them? Since all three lines pass through the origin,
we need only compare the slopes; an arbitrage will NOT be possible if

\[ -\frac{1+r}{sd} \le -\frac{1}{s} \le -\frac{1+r}{su} \qquad (2.1) \]

and otherwise there is a point (x, y) permitting an arbitrage. The condition for
no arbitrage (2.1) reduces to

\[ \frac{d}{1+r} \le 1 \le \frac{u}{1+r}. \]

So the condition for no arbitrage demands that (1 + r − u) and (1 + r − d)
have opposite signs, i.e. d ≤ (1 + r) ≤ u. Unless this occurs, the stock always
has either better or worse returns than the bond, which makes no sense in a
free market where both are traded without compulsion. Under a no-arbitrage
assumption, since d ≤ (1 + r) ≤ u, the bond payoff is a convex combination or
a weighted average of the two possible stock payoffs; i.e. there are probabilities
0 ≤ q ≤ 1 and (1 − q) such that (1 + r) = qu + (1 − q)d. In fact it is easy to
solve this equation to determine the values of q and 1 − q:

\[ q = \frac{(1+r) - d}{u - d} \quad\text{and}\quad 1 - q = \frac{u - (1+r)}{u - d}. \]

Denote by Q the probability distribution which puts probabilities q and 1 − q
on these points su, sd. Then if S1 is the value of the stock at the end of the
period, note that

\[ \frac{1}{1+r}E_Q(S_1) = \frac{1}{1+r}\,(qsu + (1-q)sd) = \frac{1}{1+r}\,s(1+r) = s \]

where EQ denotes the expectation assuming that Q describes the probabilities
of the two outcomes.
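
The calculation above can be checked numerically. The following is a minimal
sketch, not part of the original text: the function name and the numbers for
s, u, d and r are hypothetical and chosen only for illustration. It tests the
no-arbitrage condition d ≤ 1 + r ≤ u, solves for q, and confirms that the
discounted expected value of S1 under Q returns the present price s.

```python
def risk_neutral_prob(u, d, r):
    """Return q with (1 + r) = q*u + (1 - q)*d, or None when d <= 1 + r <= u fails."""
    if u <= d or not (d <= 1 + r <= u):
        return None  # an arbitrage exists (or the model is degenerate)
    return ((1 + r) - d) / (u - d)


# Hypothetical inputs, chosen only for illustration.
s, u, d, r = 100.0, 1.2, 0.9, 0.05

q = risk_neutral_prob(u, d, r)
# Discounted Q-expectation of S1; it should reproduce today's price s.
discounted_value = (q * s * u + (1 - q) * s * d) / (1 + r)
print(q, discounted_value)

# A bond that beats the stock in both states violates d <= 1 + r <= u.
print(risk_neutral_prob(u, d, 0.25))
```

Note that the actual probability p never enters the computation.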
In other words, if there is to be no arbitrage, there exists a probability
measure Q such that the expected future value of the stock S1, discounted
to the present using the return from a risk-free bond, is exactly the present value
of the stock. The measure Q is called the risk-neutral measure and the
probabilities that it assigns to the possible outcomes of S1 are not necessarily
those that determine the future behaviour of the stock. The risk-neutral measure
embodies both the current consensus beliefs in the future value of the stock and
the consensus investors' attitude to risk avoidance. It is not usually true that
\( \frac{1}{1+r}E_P(S_1) = s \) with P denoting the actual probability distribution describing

the future probabilities of the stock. Indeed it is highly unlikely that an investor
would wish to purchase a risky stock if he or she could achieve exactly the same
expected return with no risk at all using a bond. We generally expect that
to make a risky investment attractive, its expected return should be greater
than that of a risk-free investment. Notice in this example that the risk-neutral
measure Q did not use the probabilities p and 1 − p that the stock would go
up or down, and this seems contrary to intuition. Surely if a stock is more likely
to go up, then a call option on the stock should be valued higher!
Let us suppose for example that we have a friend willing, in a private
transaction with us, to buy or sell a stock at a price determined from their
subjectively assigned distribution P, different from Q. The friend believes that
the stock is presently worth

\[ \frac{1}{1+r}E_P S_1 = \frac{psu + (1-p)sd}{1+r} \ne s \quad\text{since } p \ne q. \]

Such a friend offers their assets as a sacrifice to the gods of arbitrage. If the
friend's assessed price is greater than the current market price, we can buy on
the open market and sell to the friend. Otherwise, one can do the reverse.
Either way one is enriched monetarily (and perhaps impoverished socially)!
So why should we use the Q measure to determine the price of a given asset
in a market (assuming, of course, there is a risk-neutral Q measure and we are
able to determine it)? Not because it precisely describes the future behaviour
of the stock, but because if we use any other distribution, we offer an intelligent
investor (there are many!) an arbitrage opportunity, or an opportunity to make
money at no risk and at our expense.
Derivatives are investments which derive their value from that of a
corresponding asset, such as a stock. A European call option is an option which
permits you (but does not compel you) to purchase the stock at a fixed future
date (the maturity date) for a given predetermined price (the exercise price
of the option). For example a call option with exercise price $10 on a stock
whose future value is denoted S1 is worth on expiry S1 − 10 if S1 > 10 but
nothing at all if S1 ≤ 10. The difference S1 − 10 between the value of the stock
on expiry and the exercise price of the option is your profit if you exercise the
option, purchasing the stock for $10 and selling it on the open market at $S1.
However, if S1 < 10, there is no point in exercising your option as you are
not compelled to do so and your return is $0. In general, your payoff from
purchasing the option is a simple function of the future price of the stock, such as
V(S1) = max(S1 − 10, 0). We denote this by (S1 − 10)^+. The future value of
the option is a random variable but it derives its value from that of the stock,
hence it is called a derivative and the stock is the underlying.

A function of the stock price V(S1) which may represent the return from a
portfolio of stocks and derivatives is called a contingent claim. V(S1)
represents the payoff to an investor from a certain financial instrument or
derivative when the stock price at the end of the period is S1. In our simple
binomial example above, the random variable takes only two possible values V(su)
and V(sd). We will show that there is a portfolio, called a replicating portfolio,
consisting of an investment solely in the above stock and bond which reproduces
these values V(su) and V(sd) exactly. We can determine the corresponding
weights on the bond and stocks (x, y) simply by solving the two equations in
two unknowns

x(1 + r) + ysu = V (su)

x(1 + r) + ysd = V (sd)

Solving: \( y^* = \frac{V(su) - V(sd)}{su - sd} \) and \( x^* = \frac{V(su) - y^* su}{1+r} \). By buying y* units of
stock and x* units of bond, we are able to replicate the contingent claim V(S1)
exactly, i.e. produce a portfolio of stocks and bonds with exactly the same
return as the contingent claim. So in this case at least, there can be only one
possible present value for the contingent claim and that is the present value
of the replicating portfolio x* + y*s. If the market placed any other value
on the contingent claim, then a trader could guarantee a positive return by a
simple trade, shorting the contingent claim and buying the equivalent portfolio
or buying the contingent claim and shorting the replicating portfolio. Thus this
is the only price that precludes an arbitrage opportunity. There is a simpler

expression for the current price of the contingent claim in this case: Note that

\[
\begin{aligned}
\frac{1}{1+r}E_Q V(S_1) &= \frac{1}{1+r}\,(qV(su) + (1-q)V(sd)) \\
&= \frac{1}{1+r}\left(\frac{1+r-d}{u-d}\,V(su) + \frac{u-(1+r)}{u-d}\,V(sd)\right) \\
&= x^* + y^* s.
\end{aligned}
\]

In words, the discounted expected value of the contingent claim is equal to
the no-arbitrage price of the derivative where the expectation is taken using the
Q-measure. Indeed any contingent claim that is attainable must have its price
determined in this way. While we have developed this only in an extremely
simple case, it extends much more generally.
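
As a check on the replication argument, the sketch below solves the two
equations for (x*, y*) and compares the cost x* + y*s of the replicating
portfolio with the discounted Q-expectation of the payoff. It is an
illustration under assumed inputs: the function names, the call payoff with
strike 10 and the values of s, u, d, r are hypothetical.

```python
def replicate(s, u, d, r, payoff):
    """Solve x(1+r) + y*s*u = V(su), x(1+r) + y*s*d = V(sd) for (x, y)."""
    v_up, v_down = payoff(s * u), payoff(s * d)
    y = (v_up - v_down) / (s * u - s * d)   # units of stock
    x = (v_up - y * s * u) / (1 + r)        # units of bond
    return x, y


def call_payoff(strike):
    """Payoff (S1 - strike)^+ of a European call."""
    return lambda s1: max(s1 - strike, 0.0)


# Hypothetical inputs, chosen only for illustration.
s, u, d, r = 10.0, 1.2, 0.9, 0.05
payoff = call_payoff(10.0)

x_star, y_star = replicate(s, u, d, r, payoff)
replication_cost = x_star + y_star * s

# The same price as a discounted expectation under Q.
q = ((1 + r) - d) / (u - d)
risk_neutral_price = (q * payoff(s * u) + (1 - q) * payoff(s * d)) / (1 + r)

print(x_star, y_star, replication_cost, risk_neutral_price)
```

A negative x* corresponds to borrowing at the risk-free rate to finance the
stock position.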
Suppose we have a total of N risky assets whose prices at times t = 0, 1,
are given by \( (S_0^j, S_1^j) \), j = 1, 2, ..., N. We denote by S0, S1 the column
vectors of initial and final prices

\[
S_0 = \begin{pmatrix} S_0^1 \\ S_0^2 \\ \vdots \\ S_0^N \end{pmatrix}, \qquad
S_1 = \begin{pmatrix} S_1^1 \\ S_1^2 \\ \vdots \\ S_1^N \end{pmatrix}
\]

where at time 0, S0 is known and S1 is random. Assume also there is a riskless
asset (a bond) paying interest rate r over one unit of time. Suppose we borrow
money (this is the same as shorting bonds) at the risk-free rate to buy wj units
of stock j at time 0 for a total cost of \( \sum_j w_j S_0^j \). The value of this portfolio at
time t = 1 is \( T(w) = \sum_j w_j (S_1^j - (1+r)S_0^j) \). If there are weights wj so that
this sum is always non-negative, and P(T(w) > 0) > 0, then this is an arbitrage
opportunity. Similarly, by replacing the weights wj by their negatives −wj,
there is an arbitrage opportunity if for some weights the sum is non-positive
and negative with positive probability. In summary, there are no arbitrage
opportunities if for all weights wj, P(T(w) > 0) > 0 and P(T(w) < 0) > 0, so
T(w) takes both positive and negative values. We assume that the moment
generating function \( M(w) = E[\exp(\sum_j w_j (S_1^j - (1+r)S_0^j))] \) exists and is an
analytic function of w. Roughly, the condition that the moment generating function
is analytic assures that we can expand the function in a series expansion in w.
This is the case, for example, if the values of S1, S0 are bounded. The following
theorem provides a general proof, due to Chris Rogers, of the equivalence of the
no-arbitrage condition and the existence of an equivalent measure Q. Refer to
the appendix for the technical definitions of an equivalent probability measure
and the existence and properties of a moment generating function M(w).

Theorem 2 A necessary and sufficient condition that there be no arbitrage
opportunities is that there exists a measure Q equivalent to P such that
\( \frac{1}{1+r}E_Q(S_1^j) = S_0^j \) for all j = 1, ..., N.

Proof. Define \( M(w) = E\exp(T(w)) = E[\exp(\sum_j w_j (S_1^j - (1+r)S_0^j))] \) and
consider the problem
\[ \min_w \ln(M(w)). \]

The no-arbitrage condition implies that for each j there exists ε > 0 such that

\[ P[S_1^j - (1+r)S_0^j > \varepsilon] > 0 \]

and therefore as wj → ∞ while the other weights wk, k ≠ j remain fixed,

\[ M(w) = E\Big[\exp\Big(\sum_k w_k (S_1^k - (1+r)S_0^k)\Big)\Big] > C \exp(w_j \varepsilon)\, P[S_1^j - (1+r)S_0^j > \varepsilon] \to \infty \quad\text{as } w_j \to \infty. \]

Similarly, M(w) → ∞ as wj → −∞. From the properties of a moment
generating function (see the appendix) M(w) is convex, continuous, analytic and
M(0) = 1. Therefore the function M(w) has a minimum w* satisfying

\[ \frac{\partial M(w)}{\partial w_j} = 0 \qquad (2.3) \]

or

\[ E[S_1^j \exp(T(w))] = (1+r)S_0^j\, E[\exp(T(w))], \]

\[ S_0^j = \frac{E[\exp(T(w))\,S_1^j]}{(1+r)\,E[\exp(T(w))]}. \]
Define a distribution or probability measure Q as follows; for any event A,

\[ Q(A) = \frac{E_P[I_A \exp(w'S_1)]}{E_P[\exp(w'S_1)]}, \]

where w = w* is the minimizing weight vector and \( w'S_1 = \sum_j w_j S_1^j \).
The Radon-Nikodym derivative (see the appendix) is

\[ \frac{dQ}{dP} = \frac{\exp(w'S_1)}{E_P[\exp(w'S_1)]}. \]

Since ∞ > dQ/dP > 0, the measure Q is equivalent to the original probability
measure P (in the intuitive sense that it has the same support). When we calculate
expected values under this new measure, note that for each j,

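\[
E_Q(S_1^j) = E_P\Big[\frac{dQ}{dP}\,S_1^j\Big]
= \frac{E_P[S_1^j \exp(w'S_1)]}{E_P[\exp(w'S_1)]}
= (1+r)S_0^j,
\]

where the last step uses (2.3), since exp(T(w)) and exp(w'S1) differ only by the
constant factor exp(−(1 + r)w'S0), which cancels from the ratio. Hence

\[ S_0^j = \frac{1}{1+r}E_Q(S_1^j). \]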
Therefore, the current price of each stock is the discounted expected value of the
future price under this “risk-neutral” measure Q.
Conversely if

\[ \frac{1}{1+r}E_Q(S_1^j) = S_0^j, \text{ for all } j \qquad (2.4) \]

holds for some measure Q equivalent to P, then EQ[T(w)] = 0 for all w and this
implies that the random variable T(w) is either identically 0 or admits both
positive and negative values. Therefore the existence of the measure Q satisfying
(2.4) implies that there are no arbitrage opportunities.
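
The construction in the proof can be tried on a small example. The sketch
below is an illustration only: it assumes a single stock with finitely many
outcomes, and the function name and all numbers are hypothetical. It
minimizes ln M(w) by bisection on its derivative (a simple stand-in for any
convex minimizer), builds Q by exponential tilting, and checks that
E_Q(S1) = (1 + r)S0. Tilting by exp(w(S1 − (1 + r)S0)) gives the same Q as
tilting by exp(wS1), because the constant factor cancels on normalization.

```python
import math


def tilt_to_risk_neutral(outcomes, probs, s0, r):
    """Return (w*, Q) with Q_i proportional to P_i * exp(w* * x_i), x_i = s1_i - (1+r)*s0."""
    excess = [s1 - (1 + r) * s0 for s1 in outcomes]
    if not (min(excess) < 0 < max(excess)):
        raise ValueError("arbitrage: excess return never changes sign")

    def slope(w):
        # d/dw ln M(w) = E_P[X e^{wX}] / E_P[e^{wX}], which is increasing in w.
        weights = [p * math.exp(w * x) for p, x in zip(probs, excess)]
        return sum(wt * x for wt, x in zip(weights, excess)) / sum(weights)

    # Bracket the root of the slope, then bisect.
    lo, hi = -1.0, 1.0
    while slope(lo) > 0:
        lo *= 2
    while slope(hi) < 0:
        hi *= 2
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if slope(mid) > 0:
            hi = mid
        else:
            lo = mid
    w_star = 0.5 * (lo + hi)

    weights = [p * math.exp(w_star * x) for p, x in zip(probs, excess)]
    total = sum(weights)
    return w_star, [wt / total for wt in weights]


# Hypothetical three-point distribution for S1 under P.
outcomes, probs = [80.0, 100.0, 130.0], [0.2, 0.3, 0.5]
s0, r = 100.0, 0.05

w_star, q_probs = tilt_to_risk_neutral(outcomes, probs, s0, r)
expected_s1_under_q = sum(q * s1 for q, s1 in zip(q_probs, outcomes))
print(w_star, q_probs, expected_s1_under_q, (1 + r) * s0)
```

Every outcome with positive P-probability keeps positive Q-probability, which
is the sense in which Q is equivalent to P.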

The so-called risk-neutral measure Q is constructed to minimize the
cross-entropy between Q and P subject to the constraints EQ(S1 − (1 + r)S0) = 0,
where cross-entropy is defined in Section 1.5. If there are N possible values of the
random variables S1 and S0 then (2.3) consists of N equations in N unknowns
and so it is reasonable to expect a unique solution. In this case, the Q measure
is unique and we call the market complete.
The theory of pricing derivatives in a complete market is rooted in a rather
trivial observation because in a complete market, the derivative can be replicated
with a portfolio of other marketable securities. If we can reproduce exactly the
same (random) returns as the derivative provides using a linear combination of
other marketable securities (which have prices assigned by the market) then the
derivative must have the same price as the linear combination of other securities.
Any other price would provide arbitrage opportunities.
Of course in the real world, there are costs associated with trading, these
costs usually related to a bid-ask spread. There is essentially a different price for
buying a security and for selling it. The argument above assumes a frictionless
market with no trading costs, with borrowing any amount at the risk-free bond
rate possible, and a completely liquid market- any amount of any security can be
bought or sold. Moreover it is usually assumed that the market is complete and
it is questionable whether complete markets exist. For example if a derivative
security can be perfectly replicated using other marketable instruments, then
what is the purpose of the derivative security in the market? All models,
excepting those on Fashion File, have deficiencies and critics. The merit of the
frictionless trading assumption is that it provides an accurate approximation
to increasingly liquid real-world markets. Like all useful models, this permits
tentative conclusions that should be subject to constant study and improvement.

Multiperiod Models.

When an asset price evolves over time, the investor normally makes decisions
about the investment at various periods during its life. Such decisions are made
with the benefit of current information, and this information, whether used
or not, includes the price of the asset and any related assets at all previous
time periods, beginning at some time t = 0 when we began observation of the
process. We denote this information available for use at time t as Ht. Formally,
Ht is what is called a sigma-field (see the appendix) generated by the past, and
there are two fundamental properties of this sigma-field that we will use. The
first is that the sigma-fields increase over time. In other words, our information
about this and related processes increases over time because we have observed
more of the relevant history. In the mathematical model, we do not “forget”
relevant information: this model fits better the behaviour of youthful traders
than aging professors. The second property of Ht is that it includes the value
of the asset price Sτ at all times τ ≤ t. In measure-theoretic language, St
is adapted to or measurable with respect to Ht. Now the analysis above shows
that when our investment life began at time t = 0 and we were planning for the
next period of time, absence of arbitrage implies a risk-neutral measure Q such
that \( E_Q(\frac{1}{1+r}S_1) = S_0 \). Imagine now that we are in a similar position at time
t, planning our investment for the next unit time. All expected values should
be taken in the light of our current knowledge, i.e. given the information Ht.
An identical analysis to that above shows that under the risk neutral measure
Q, if St represents the price of the stock after t periods, and rt the risk-free
one-period interest rate offered at that time, then

\[ E_Q\Big(\frac{S_{t+1}}{1+r_t} \,\Big|\, H_t\Big) = S_t. \qquad (2.5) \]

Suppose we let Bt be the value of $1 invested at time t = 0 after a total
of t periods. Then B1 = (1 + r0), B2 = (1 + r0)(1 + r1), and in general
\( B_t = (1+r_0)(1+r_1)\cdots(1+r_{t-1}) \). Since the interest rate per period is announced
at the beginning of this period, the value Bt is known at time t − 1. If you
owe exactly $1.00 payable at time t, then to cover this debt you should have an
investment at time t = 0 of $EQ(1/Bt), which we might call the present value
of the promise. In general, at time t, the present value of a certain amount
$VT promised at time T (i.e. the present value or the value discounted to the
present of this payment) is

\[ E_Q\Big(\frac{B_t}{B_T}\,V_T \,\Big|\, H_t\Big). \]
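Now suppose we divide (2.5) above by Bt. We obtain

\[
E_Q\Big(\frac{S_{t+1}}{B_{t+1}} \,\Big|\, H_t\Big)
= E_Q\Big(\frac{1}{B_t(1+r_t)}\,S_{t+1} \,\Big|\, H_t\Big)
= \frac{1}{B_t}\,E_Q\Big(\frac{1}{1+r_t}\,S_{t+1} \,\Big|\, H_t\Big)
= \frac{S_t}{B_t}. \qquad (2.6)
\]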
Notice that we are able to take the divisor Bt outside the expectation since Bt
is known at time t (in the language of Appendix 1, Bt is measurable with
respect to Ht). This equation (2.6) describes an elegant mathematical property
shared by all marketable securities in a complete market. Under the risk-neutral
measure, the discounted price Yt = St/Bt forms a martingale. A martingale
is a process Yt for which the expectation of a future value given the present is
equal to the present, i.e.

\[ E(Y_{t+1}|H_t) = Y_t \quad\text{for all } t. \qquad (2.7) \]

Properties of a martingale are given in the appendix and it is easy to show that
for such a process, when T > t,

\[ E(Y_T|H_t) = E[\ldots E[E(Y_T|H_{T-1})|H_{T-2}]\ldots|H_t] = Y_t. \qquad (2.8) \]
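
Property (2.8) can be verified directly in the simplest multiperiod setting. The
sketch below is an illustration under stated assumptions: it takes the binomial
stock model with a constant interest rate r, so that Bt = (1 + r)^t, and the
function name and numbers are hypothetical. It computes E_Q(ST/BT | St = s) by
summing over the number of up-moves and compares the result with s/Bt.

```python
from math import comb


def expected_discounted_price(s_t, t, T, u, d, r):
    """E_Q[S_T / B_T | S_t = s_t] in a binomial tree with constant rate r."""
    q = ((1 + r) - d) / (u - d)          # risk-neutral probability of an up-move
    n = T - t                            # number of remaining periods
    expected_s_T = sum(
        comb(n, k) * q**k * (1 - q)**(n - k) * s_t * u**k * d**(n - k)
        for k in range(n + 1)
    )
    return expected_s_T / (1 + r)**T     # B_T = (1 + r)^T


# Hypothetical inputs, chosen only for illustration.
s_t, t, T = 120.0, 1, 4
u, d, r = 1.2, 0.9, 0.05

lhs = expected_discounted_price(s_t, t, T, u, d, r)
rhs = s_t / (1 + r)**t                   # Y_t = S_t / B_t
print(lhs, rhs)
```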

A martingale is a fair game in a world with no inflation, no need to consume
and no mortality. Your future fortune if you play the game is a random
variable whose expectation, given everything you know at present, is your present
fortune.
Thus, under a risk-neutral measure Q in a complete market, all marketable
securities discounted to the present form martingales. For this reason, we often
refer to the risk-neutral measure as a martingale measure. The fact that prices of
marketable commodities must be martingales under the risk neutral measure has
many consequences for the canny investor. Suppose, for example, you believe
that you are able to model the history of the price process nearly perfectly, and
it tells you that the price of a share of XXX computer systems increases on
average 20% per year. Should you use this P-measure, even if you are confident
it is absolutely correct, in pricing a call option on XXX computer systems with
maturity one year from now? If you do so, you are offering some arbitrager
another free lunch at your expense. The measure Q,
not the measure P, determines derivative prices in a no-arbitrage market. This
also means that there is no advantage, when pricing derivatives, in using some
elaborate statistical method to estimate the expected rate of return because this
is a property of P not Q.

What have we discovered? In general, prices in a market are determined as
expected values, but expected values with respect to the measure Q. This is true
in any complete market, regardless of the number of assets traded in the market.
For any future time T > t, and for any derivative defined on the traded assets
in a market whose value at time t is given by Vt, \( E_Q[\frac{B_t}{B_T}V_T|H_t] = V_t \) = the
market price of the derivative at time t. So in theory, determining a reasonable
price of a derivative should be a simple task, one that could be easily handled
by simulation. Suppose we wish to determine a suitable price for a derivative
whose value is determined by some stock price process St. Suppose that at
time T > t, the value of the derivative is a simple function of the stock price at
that time, VT = V(ST). We may simply generate many simulations of the future
value of the stock and corresponding value of the derivative ST, V(ST) given the
current store of information Ht. These simulations must be conducted under the
measure Q. In order to determine a fair price for the derivative, we then average
the discounted values of the derivatives, discounted to the present, over all the
simulations. The catch is that the Q measure is often neither obvious from
the present market prices nor statistically estimable from its past. It is given
implicitly by the fact that the expected value of the discounted future value of
traded assets must produce the present market price. In other words, a first
step in valuing any asset is to determine a measure Q for which this holds. Now
in some simple models involving a single stock, this is fairly simple, and there
is a unique such measure Q. This is the case, for example, for the stock model
above in which the stock moves in simple steps, either increasing or decreasing
at each step. But as the number of traded assets increases, and as the number
of possible jumps per period changes, a measure Q which completely describes
the stock dynamics and which has the necessary properties for a risk neutral
measure becomes potentially much more complicated, as the following example
shows.
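
The simulation recipe described above (simulate under Q, then average the
discounted payoffs) can be sketched for the one case treated so far in which
Q is known explicitly, the binomial stock model. This is a minimal
illustration under assumed inputs, not a general pricing routine: the function
names, the seed, the number of paths and all market parameters are
hypothetical, and the interest rate is taken to be constant.

```python
import random
from math import comb


def mc_price(s0, u, d, r, steps, payoff, n_paths, seed=0):
    """Average discounted payoff over paths simulated under the measure Q."""
    rng = random.Random(seed)
    q = ((1 + r) - d) / (u - d)              # Q-probability of an up-move
    total = 0.0
    for _ in range(n_paths):
        s = s0
        for _ in range(steps):
            s *= u if rng.random() < q else d
        total += payoff(s)
    return total / n_paths / (1 + r)**steps


def exact_price(s0, u, d, r, steps, payoff):
    """The same Q-expectation computed by summing over the binomial tree."""
    q = ((1 + r) - d) / (u - d)
    expected = sum(
        comb(steps, k) * q**k * (1 - q)**(steps - k) * payoff(s0 * u**k * d**(steps - k))
        for k in range(steps + 1)
    )
    return expected / (1 + r)**steps


# Hypothetical two-period call option with strike 100.
s0, u, d, r, steps = 100.0, 1.2, 0.9, 0.05, 2
call = lambda s: max(s - 100.0, 0.0)

estimate = mc_price(s0, u, d, r, steps, call, n_paths=20000)
exact = exact_price(s0, u, d, r, steps, call)
print(estimate, exact)
```

The simulated estimate agrees with the tree value only up to Monte Carlo error,
which shrinks like the inverse square root of the number of paths.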

Solving for the Q Measure.

Let us consider the following simple example. Over each period, a stock price
provides a return greater than, less than, or the same as that of a risk free
investment like a bond. Assume for simplicity that the stock changes by the
factor u(1 + r) (greater), (1 + r) (the same) or d(1 + r) (less), where
u > 1 > d = 1/u. The Q probability of increases and decreases is unknown, and may vary
from one period to the next. Over two periods, the possible paths executed by
this stock price process are displayed below assuming that the stock begins at
time t = 0 with price S0 = 1.


Figure 2.1: A Trinomial Tree for Stock Prices

In general in such a tree there are three branches from each of the nodes
at times t = 0, 1 and there are a total of 1 + 3 = 4 such nodes. Thus, even
if we assume that probabilities of up and down movements do not depend on
how the process arrived at a given node, there is a total of 3 × 4 = 12 unknown
parameters. Of course there are constraints; for example the sum of the three
probabilities on branches exiting a given node must add to one and the price
process must form a martingale. For each of the four nodes, this provides two
constraints for a total of 8 constraints, leaving 4 parameters to be estimated.
We would need the market price of 4 different derivatives or other contingent
claims to be able to generate 4 equations in these 4 unknowns and solve for
them. Provided we are able to obtain prices of four such derivatives, then we
can solve these equations. If we denote the risk-neutral probability of 'up' at
each of the four nodes by p1, p2, p3, p4 then the conditional distribution of St+1
given St = s is:

Stock value    su(1 + r)    s(1 + r)                               sd(1 + r)
Probability    pi           1 − ((u − d)/(1 − d)) pi = 1 − k pi    ((u − 1)/(1 − d)) pi = c pi

where k = (u − d)/(1 − d) and c = (u − 1)/(1 − d).
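
The two constraints at each node can be confirmed numerically. The sketch
below is illustrative: the function name and the value chosen for the 'up'
probability are hypothetical, while u and r match the special case considered
in the text. It builds the three branch probabilities from a given pi and
checks that they sum to one and that the expected growth factor equals 1 + r,
which is the one-step martingale condition for the discounted price.

```python
def trinomial_probs(p_up, u):
    """Branch probabilities (up, same, down) implied by p_up when d = 1/u."""
    d = 1.0 / u
    k = (u - d) / (1 - d)
    c = (u - 1) / (1 - d)
    p_same, p_down = 1 - k * p_up, c * p_up
    if min(p_up, p_same, p_down) < 0:
        raise ValueError("p_up must lie between 0 and 1/k")
    return p_up, p_same, p_down


u, r = 1.089, 0.01
p_up = 0.3   # hypothetical value of the 'up' probability at one node

probs = trinomial_probs(p_up, u)
growth_factors = (u * (1 + r), 1 + r, (1 / u) * (1 + r))
expected_growth = sum(p * f for p, f in zip(probs, growth_factors))
print(probs, sum(probs), expected_growth)
```

The check on the sign of the probabilities shows that pi cannot exceed 1/k.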

Consider the following special case, with the risk-free interest rate per period
r, u = 1.089, S0 = $1.00. We also assume that we are given the price of four
call options expiring at time T = 2. The possible values of the price at time
T = 2 corresponding to two steps up, one step up and one constant, one up
one down, etc. are the values of S(T ) in the set

{1.1859, 1.0890, 1.0000, 0.9183, 0.8432}.

Now consider a “call option” on this stock expiring at time T = 2 with strike
price K. Such an option has value at time t = 2 equal to (S2 − K) if this is
positive, or zero otherwise. For brevity we denote this by (S2 − K)^+. The
present value of the option is EQ(S2 − K)^+ discounted to the present, where
K is the exercise price of the option and S2 is the price of the stock at time 2.
Thus the price of the call option at time 0 is given by

\[ V_0 = E_Q(S_2 - K)^+/(1+r)^2 \]

Assuming interest rate r = 1% per period, suppose we have market prices of four
call options with the same expiry and different exercise prices in the following
table:

K = Exercise Price    T = Maturity    V0 = Call Option Price
0.867                 2               0.154
0.969                 2               0.0675
1.071                 2               0.0155
1.173                 2               0.0016

If we can observe the prices of these options only, then the equations to be
solved for the probabilities associated with the measure Q equate the observed
price of the options to their theoretical price V0 = E(S2 − K)^+/(1 + r)^2.

\[
\begin{aligned}
0.0016 &= (1.186 - 1.173)\,p_1 p_2 \\
0.0155 &= [(1.186 - 1.071)\,p_1 p_2 + (1.089 - 1.071)\{p_1(1 - kp_2) + (1 - kp_1)p_2\}] \\
0.0675 &= [0.217\,p_1 p_2 + 0.12\{p_1(1 - kp_2) + (1 - kp_1)p_2\} \\
       &\qquad + 0.031\{(1 - kp_1)(1 - kp_2) + cp_1 p_2 + cp_1 p_4\}] \\
0.154  &= [0.319\,p_1 p_2 + 0.222\{p_1(1 - kp_2) + (1 - kp_1)p_2\} \\
       &\qquad + 0.133\{(1 - kp_1)(1 - kp_2) + cp_1 p_2 + cp_1 p_4\} \\
       &\qquad + 0.051\{cp_1(1 - kp_4) + (1 - kp_1)cp_3\}].
\end{aligned}
\]

While it is not too difficult to solve this system in this case, one can see that
with more branches and more derivatives, this non-linear system of equations
becomes difficult very quickly. What do we do if we observe market prices for
only two derivatives defined on this stock, and only two parameters can be
obtained from the market information? This is an example of what is called
an incomplete market, a market in which the risk neutral distribution is not
uniquely specified by market information. In general when we have fewer
equations than parameters in a model, there are really only two choices:
(a) Simplify the model so that the number of unknown parameters and the
number of equations match.
(b) Determine additional natural criteria or constraints that the parameters
must satisfy.
In this case, for example, one might prefer a model in which the probability
of a step up or down depends on the time, but not on the current price of the
stock. This assumption would force p2 = p3 = p4 and simplify the
system of equations above. For example using only the prices of the first two
derivatives, we obtain equations, which, when solved, determine the probabilities
on the other branches as well.

\[
\begin{aligned}
0.0016 &= (1.186 - 1.173)\,p_1 p_2 \\
0.0155 &= [(1.186 - 1.071)\,p_1 p_2 + (1.089 - 1.071)\{p_1(1 - kp_2) + (1 - kp_1)p_2\}]
\end{aligned}
\]
This example reflects a basic problem which occurs often when we build a
reasonable and flexible model in finance. Frequently there are more parameters
than there are marketable securities from which we can estimate these
parameters. It is quite common to react by simplifying the model. For example, it
is for this reason that binomial trees (with only two branches emanating from
each node) are often preferred to the trinomial tree example we use above, even
though they provide a worse approximation to the actual distribution of stock
prices.

In general if there are n different securities (excluding derivatives whose value
is a function of one or more of these) and if each security can take any one of m
different values, then there are a total of m^n possible states of nature at time
t = 1. The Q measure must assign a probability to each of them. This results in
a total of m^n unknown probability values, which, of course, must add to one, and
result in the right expectation for each of n marketable securities. To uniquely
determine Q we would require a total of m^n − n − 1 equations or m^n − n − 1
different derivatives. For example for m = 10, n = 100, approximately one with
a hundred zeros, a prohibitive number, are required to uniquely determine Q.
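
The count in this example is easy to reproduce with exact integer arithmetic.
The following lines are a small illustrative check; the variable names are
not from the text.

```python
m, n = 10, 100                        # values per security, number of securities

states = m ** n                       # possible states of nature at time t = 1
equations_needed = states - n - 1     # after the sum-to-one and n price constraints

# Number of digits after the leading 1 in the state count.
print(len(str(states)) - 1)
```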
In a complete market, Q is uniquely determined by marketed securities, but
in effect no real market can be complete. In real markets, one asset is not
perfectly replicated by a combination of other assets because there is no value
in duplication. Even when an asset is a derivative whose value is determined by
another marketed security, together with interest rates and volatilities, markets
rarely permit exact replication. The most we can probably hope for in practice
is to find a model or measure Q in a subclass of measures with desirable features
under which

\[ E_Q\Big[\frac{B_t}{B_T}\,V(S_T) \,\Big|\, H_t\Big] \approx V_t \quad\text{for all marketable } V. \qquad (2.9) \]

Even if we had equalities in (2.9), this would represent typically fewer
equations than the number of unknown Q probabilities so some simplification of the
model is required before settling on a measure Q. One could, at one's peril,
ignore the fact that certain factors in the market depend on others. Similar
stocks behave similarly, and none may be actually independent. Can we, with
any reasonable level of confidence, accurately predict the effect that a lowering
of interest rates will have on a given bank stock? Perhaps the best model
for the future behaviour of most processes is the past, except that as we have
seen the historical distribution of stocks does not generally produce a risk-neutral
measure. Even if historical information provided a flawless guide to the future,
there is too little of it to accurately estimate the large number of parameters
required for a simulation of a market of reasonable size. Some simplification of
the model is clearly necessary. Are some baskets of stocks independent of other
combinations? What independence can we reasonably assume over time?
As a first step in simplifying a model, consider some of the common measures
of behaviour. Stocks can go up, or down. The drift of a stock is a tendency in
one or other of these two directions. But a stock can also go up and down by a
lot or a little. The measure of this, the variance or variability in the stock returns,
is called the volatility of the stock. Our model should have as ingredients these
two quantities. It should also have as much dependence over time and among
different asset prices as we have evidence to support.

Determining the Process Bt .

We have seen in the last section that given the Q or risk-neutral measure, we can,
at least in theory, determine the price of a derivative if we are given the price Bt
of a risk-free investment at time t (in finance such a yardstick for measuring and
discounting prices is often called a “numeraire”). Unfortunately no completely
liquid risk-free investment is traded on the open market. There are government
treasury bills which, depending on the government, one might wish to assume
are almost risk-free, and there are government bonds, usually with longer terms,
which complicate matters by making coupon payments periodically. The question
dealt with in this section is whether we can estimate or approximate a
risk-free process Bt given information on the prices of these bonds. There are
typically too few genuinely risk-free bonds to get a detailed picture of the process
Bs, s > 0. We might use government bonds for this purpose, but are these
genuinely risk-free? Might not the additional use of bonds issued by other large
corporations provide a more detailed picture of the bank account process Bs?

Can we incorporate information on bond prices from lower grade debt? To
do so, we need a simple model linking the debt rating of a given bond and the
probability of default and payoff to the bond-holders in the event of default. To
begin with, let us assume that a given basket of companies, say those with a
common debt rating from one of the major bond rating organisations, have a
common distribution of default time. The thesis of this section is that even if
no totally risk-free investment existed, we might still be able to use bond prices
to estimate what interest rate such an investment would offer.
We begin with what we know. Presumably we know the current prices of
marketable securities. This may include prices of certain low-risk bonds with
face value F, the value of the bond on maturity at time T. Typically such a bond
pays certain payments of value dt at certain times t < T and then the face value
of the bond F at maturity time T, unless the bond issuer defaults. Let us assume
for simplicity that the current time is 0. The current bond prices P0 provide some
information on Bt as well as the possibility of default. Suppose we let τ denote
the random time at which default or bankruptcy would occur. Assume that the
effect of possible default is to render the payments at various times random so
for example dt is paid provided that default has not yet occurred, i.e. if τ > t,
and similarly the payment on maturity is the face value of the bond F if default
has not yet occurred and if it has, some fraction of the face value pF is paid.
When a real bond defaults, the payout to bondholders is a complicated function
of the hierarchy of the bond and may occur before maturity, but we choose this
model with payout at maturity in any case for simplicity. Then the current
price of the bond is the expected discounted value of all future payments, so

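\[
\begin{aligned}
P_0 &= E_Q\Big( \sum_{\{s;\,0<s<T\}} d_s \frac{1}{B_s} I(\tau > s) + \frac{pF}{B_T} I(\tau \le T) + \frac{F}{B_T} I(\tau > T) \Big) \\
    &= \sum_{\{s;\,0<s<T\}} d_s E_Q[B_s^{-1} I(\tau > s)] + F E_Q[B_T^{-1}(p + (1-p) I(\tau > T))].
\end{aligned}
\]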

The bank account process \(B_t\) that we considered is the compounded value at
time \(t\) of an investment of $1 deposited at time 0. This value might be random
but the interest rate is declared at the beginning of each period so, for example,
\(B_t\) is completely determined at time \(t-1\). In measure-theoretic language, \(B_t\)
is \(H_{t-1}\)-measurable for each \(t\). With \(Q\) the risk-neutral distribution,

\[
P_0 = E_Q\Big\{ \sum_{\{s;\,0<s<T\}} d_s B_s^{-1} Q(\tau > s \mid H_{s-1}) + F B_T^{-1}\big(p + (1-p)\, Q(\tau > T \mid H_{T-1})\big) \Big\}.
\]

This takes a form very similar to the price of a bond which does not default but
with a different bank account process. Suppose we define a new bank account
process \(\tilde{B}_s\), equivalent in expectation to the risk-free account, but that only
pays if default does not occur in the interval. Such a process must satisfy

\[
E_Q(\tilde{B}_s I(\tau > s) \mid H_{s-1}) = B_s.
\]

From this we see that the process \(\tilde{B}_s\) is defined by

\[
\tilde{B}_s = \frac{B_s}{Q[\tau > s \mid H_{s-1}]} \quad \text{on the set } Q[\tau > s \mid H_{s-1}] > 0.
\]

In terms of this new bank account process, the price of the bond can be rewritten

\[
P_0 = E_Q\Big\{ \sum_{\{s;\,0<s<T\}} d_s \tilde{B}_s^{-1} + (1-p) F \tilde{B}_T^{-1} + p F B_T^{-1} \Big\}.
\]

If we subtract from the current bond price the present value of the guaranteed
payment of \(pF\), the result is

\[
P_0 - pF\, E_Q(B_T^{-1}) = E_Q\Big\{ \sum_{\{s;\,0<s<T\}} d_s \tilde{B}_s^{-1} + (1-p) F \tilde{B}_T^{-1} \Big\}.
\]

This equation has a simple interpretation. The left side is the price of the
bond reduced by the present value of the guaranteed payment \(pF\) on maturity.
The right hand side is the current value of a risk-free bond paying the same
dividends, with interest rates increased by replacing \(B_s\) by \(\tilde{B}_s\) and with face
value \(F(1-p)\), all discounted to the present using the bank account process
\(\tilde{B}_s\). In words, to value a defaultable bond, augment the interest rate using the
probability of default in intervals, change the face value to the potential loss of
face value on default and then add the present value of the guaranteed payment
on maturity.
Typically we might expect to be able to obtain prices of a variety of bonds
issued by one firm, or by firms with similar credit ratings. If we are willing to
assume that such firms share the same conditional distribution of default time
\(Q[\tau > s \mid H_{s-1}]\) then they must all share the same process \(\tilde{B}_s\) and so each
observed bond price \(P_0\) leads to an equation of the form

\[
P_0 = \sum_{\{s;\,0<s<T\}} d_s \tilde{v}_s + (1-p) F \tilde{v}_T + p F v_T
\]

in the unknowns \(\tilde{v}_s = E_Q(\tilde{B}_s^{-1})\), \(s \le T\), and \(v_T = E_Q(B_T^{-1})\). If we assume
that the coupon dates of the bonds match, then \(k\) bonds of a given maturity
\(T\) and credit rating will allow us to estimate the \(k\) unknown values of \(\tilde{v}_s\). Since
the term \(v_T\) is included in all bonds, it can be estimated from all of the bond
prices, but most accurately from bonds with very low risk.
Unfortunately, this model still has too many unknown parameters to be
generally useful. We now consider a particular case that is considerably simpler.
While it seems unreasonable to assume that default of a bond or bankruptcy
of a firm is unrelated to interest rates, one might suppose some simple model
which allows a form of dependence. For most firms, one might expect that
the probability of surviving another unit of time is negatively associated with the
interest rate. For example we might suppose that the probability of default in
the next time interval conditional on surviving to the present is a function of
the current interest rate, for example

\[
h_t = Q(\tau = t \mid \tau \ge t, r_t) = \frac{a + (b-1) r_t}{1 + a + b r_t}.
\]

The quantity \(h_t\) is a more natural measure of the risk at time \(t\) than are
other measures of the distribution of \(\tau\), and the function \(h_t\) is called the hazard
function. If the constant \(b > 1+a\), then the “hazard” \(h_t\) increases with increasing
interest rates; if \(b < 1+a\) it decreases. In case the default is independent of the
interest rates, we may put \(b = 1 + a\), in which case the hazard is \(a/(1 + a)\). Then
on the set \([\tau \ge s]\)

\[
\tilde{B}_s = \frac{1 + r_s}{1 - h_s}\, \tilde{B}_{s-1} = (1 + a + b r_s)\, \tilde{B}_{s-1}
\]

which means that the bond is priced using a similar bank account process but
one for which the effective interest rate is not \(r_s\) but \(a + b r_s\). The difference
\(a + (b-1) r_s\) between the effective interest rate and \(r_s\) is usually referred to as
the spread and this model justifies using a linear function to model this spread.
Now suppose that default is assumed independent of the past history of interest
rates under the risk-neutral measure \(Q\). In this case, \(b = 1 + a\) and the spread
is \(a(1 + r_s) \approx a \approx a/(1 + a)\) provided both \(a\) and \(r_s\) are small. So in this case
the spread gives an approximate risk-neutral probability of default in a given
time interval, conditional on survival to that time.
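
The relation between the hazard, the effective rate and the spread can be checked numerically. The following is only a minimal sketch: the function names and the parameter values are illustrative assumptions, not part of the model's specification.

```python
def hazard(r, a, b):
    """One-period default hazard h = (a + (b - 1) r) / (1 + a + b r)."""
    return (a + (b - 1.0) * r) / (1.0 + a + b * r)


def effective_rate(r, a, b):
    """Effective one-period rate implied by the growth factor (1 + r) / (1 - h)."""
    return (1.0 + r) / (1.0 - hazard(r, a, b)) - 1.0


# Illustrative (assumed) values for the interest rate and the two constants.
r, a, b = 0.03, 0.02, 1.5
h = hazard(r, a, b)
rate = effective_rate(r, a, b)   # should agree with a + b * r
spread = rate - r                # should agree with a + (b - 1) * r
print(h, rate, spread)
```

With b = 1 + a the hazard no longer depends on r and reduces to a/(1 + a), which can be confirmed by calling `hazard` with several values of `r`.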

We might hope that the probabilities of default are very small and follow a
relatively simple pattern. If the pattern is not perfect, then little harm results
provided that indeed the default probabilities are small. Suppose for example
that the time of default follows a geometric distribution so that the hazard is
constant, \(h_t = h = a/(1 + a)\). Then

\[
\tilde{B}_s = (1 + a)^s B_s \quad \text{for } s > 0.
\]

\(\tilde{B}_s\) grows faster than \(B_s\) and it grows even faster as the probability of default \(h\)
increases. The effective interest rate on this account is approximately \(a\) units
per period higher.

Given only three bond prices with the same default characteristics, for example,
and assuming constant interest rates so that \(B_s = (1 + r)^s\), we may solve
for the values of the three unknown parameters \((r, a, p)\) from equations of the form

\[
P_0 - pF(1 + r)^{-T} = \sum_{\{s;\,0<s<T\}} (1 + a + r + ar)^{-s} d_s + (1-p) F (1 + a + r + ar)^{-T}.
\]

Each bond contributes one such equation, so market prices for a minimum of three
different bonds are needed to solve for the unknowns \((r, a, p)\).
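
To make the pricing equation concrete, the sketch below evaluates it for a single bond and compares the result with the expected discounted payments computed directly from the geometric survival probabilities \(Q(\tau > s) = (1+a)^{-s}\). The function names, the coupon schedule and the parameter values are hypothetical, chosen only for illustration; solving for \((r, a, p)\) from three observed prices would additionally require a numerical root finder, which is not shown.

```python
def defaultable_bond_price(r, a, p, face, maturity, coupons):
    """Price from the closed form with constant rate r and hazard a / (1 + a).

    coupons maps each payment time s (0 < s < maturity) to the payment d_s.
    """
    risky = (1.0 + a) * (1.0 + r)                 # equals 1 + a + r + a*r
    coupon_part = sum(d * risky ** (-s) for s, d in coupons.items())
    loss_part = (1.0 - p) * face * risky ** (-maturity)
    guaranteed_part = p * face * (1.0 + r) ** (-maturity)
    return coupon_part + loss_part + guaranteed_part


def price_by_expectation(r, a, p, face, maturity, coupons):
    """Same price as an expected discounted payoff, using Q(tau > s) = (1 + a)^(-s)."""
    coupon_part = sum(
        d * (1.0 + r) ** (-s) * (1.0 + a) ** (-s) for s, d in coupons.items()
    )
    survive_to_maturity = (1.0 + a) ** (-maturity)
    at_maturity = face * (1.0 + r) ** (-maturity) * (
        p + (1.0 - p) * survive_to_maturity
    )
    return coupon_part + at_maturity


# Hypothetical bond: face value 100, maturity 5, coupons of 5 at times 1..4.
coupons = {1: 5.0, 2: 5.0, 3: 5.0, 4: 5.0}
args = dict(r=0.03, a=0.02, p=0.4, face=100.0, maturity=5, coupons=coupons)
p_closed = defaultable_bond_price(**args)
p_expect = price_by_expectation(**args)
print(p_closed, p_expect)
```

Setting `a=0` removes the default risk, and the price then no longer depends on the recovery fraction `p`.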

Minimum Variance Portfolios and the Capital Asset Pricing Model

Let us begin by building a model for portfolios of securities that captures many
of the features of market movements. We assume that by using the methods of
the previous section and the prices of low-risk bonds, we are able to determine
the value \(B_t\) of a risk-free investment at time \(t\) in the future. Normally these
values might be used to discount future stock prices to the present. However
for much of this section we will consider only a single period and the analysis
will be essentially the same with or without this discounting.
Suppose we have a number \(n\) of potential investments or securities, each
risky in the sense that prices at future dates are random. Suppose we denote
the price of these securities at time \(t\) by \(S_i(t)\), \(i = 1, 2, ..., n\). There is a better
measure of the value of an investment than the price of a security or even the
change in the price of a security \(S_i(t) - S_i(t-1)\) over a period, because neither
reflects the cost of our initial investment. A common measure of investments
that allows us to recover prices, but is more stable over time and between securities,
is the return. For a security that has prices \(S_i(t)\) and \(S_i(t+1)\) at times \(t\) and
\(t+1\), we define the return \(R_i(t+1)\) on the security over this time interval by

\[
R_i(t+1) = \frac{S_i(t+1) - S_i(t)}{S_i(t)}.
\]

For example a stock that moved in price from $10 per share to $11 per share
over a period of time corresponds to a return of 10%. Returns can be measured
in units that are easily understood (for example 5% or 10% per unit time) and
are independent of the amount invested. Obviously the $1 profit obtained on
the above stock could just as easily have been obtained by purchasing 10 shares of a
stock whose value per share changed from $1.00 to $1.10 in the same period
of time, and the return in both cases is 10%. Given a sequence of returns and
the initial value of a stock \(S_i(0)\), it is easy to obtain the stock price at time \(t\):

\[
S_i(t) = S_i(0)(1 + R_i(1))(1 + R_i(2)) \cdots (1 + R_i(t)) = S_i(0) \prod_{s=1}^{t} (1 + R_i(s)).
\]
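
The product formula is easy to sketch in code. The helper names and the sequence of returns below are hypothetical and serve only to illustrate how prices are rebuilt from an initial price and per-period returns.

```python
def price_path(s0, returns):
    """Prices S(0), S(1), ..., S(t) implied by an initial price and per-period returns."""
    prices = [s0]
    for r in returns:
        prices.append(prices[-1] * (1.0 + r))
    return prices


def compound_return(returns):
    """Total return over the whole horizon: the product of (1 + R(s)), minus 1."""
    total = 1.0
    for r in returns:
        total *= 1.0 + r
    return total - 1.0


returns = [0.05, -0.02, 0.10]        # hypothetical per-period returns
prices = price_path(10.0, returns)   # starts at an assumed initial price of 10
total = compound_return(returns)
print(prices, total)
```

The final price equals the initial price times one plus the compounded return, which differs from what simply summing the per-period returns would suggest.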

Returns are not added over time; they are multiplied as above. A 10% return
followed by a 20% return is not a 30% return but a return equal to
(1 + 0.1)(1 + 0.2) - 1 = 0.32, or 32%. When we buy a portfolio of stocks, the individual stock returns
combine in a simple fashion to give the return on the whole portfolio. For
example suppose that we wish to invest a total amount \(I(t)\) at time \(t\). The
amounts will change from period to period because we may wish to reinvest
gains or withdraw sums from the account. Suppose the proportion of our total
investment in stock \(i\) at time \(t\) is \(w_i(t)\) so that the amount invested in stock \(i\) is
\(w_i(t)I(t)\). Note that since the \(w_i(t)\) are proportions, \(\sum_{i=1}^n w_i(t) = 1\). What is the
return on this investment over the time interval from \(t\) to \(t + 1\)? The amount
\(w_i(t)I(t)\) buys \(w_i(t)I(t)/S_i(t)\) shares of stock \(i\), so at the end of
this period of time, the value of our investment is

\[
I(t) \sum_{i=1}^n w_i(t) \frac{S_i(t+1)}{S_i(t)}.
\]

If we now subtract the value invested at the beginning of the period and divide
by the value at the beginning, we obtain

\[
\frac{I(t) \sum_{i=1}^n w_i(t)\, S_i(t+1)/S_i(t) - I(t)}{I(t)}
= \sum_{i=1}^n w_i(t) \frac{S_i(t+1) - S_i(t)}{S_i(t)}
= \sum_{i=1}^n w_i(t) R_i(t+1)
\]

which is just a weighted average of the individual stock returns. Note that it
does not depend on the initial price of the stocks or the total amount that we
invested at time t. The advantage in using returns instead of stock prices to
assess investments is that the return of a portfolio over a period is a value-
weighted average of the returns of the individual investments.
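
The claim that the portfolio return is the weighted average of the individual returns, whatever the amount invested, can be checked with a small numerical sketch. The weights, prices and function names below are made up for illustration.

```python
def portfolio_return(weights, prices_start, prices_end, invested=1.0):
    """Return computed from money amounts: buy shares, revalue, compare with the outlay."""
    shares = [w * invested / s0 for w, s0 in zip(weights, prices_start)]
    value_end = sum(n * s1 for n, s1 in zip(shares, prices_end))
    return (value_end - invested) / invested


def weighted_average_return(weights, prices_start, prices_end):
    """Weighted average of the individual stock returns."""
    returns = [(s1 - s0) / s0 for s0, s1 in zip(prices_start, prices_end)]
    return sum(w * r for w, r in zip(weights, returns))


weights = [0.5, 0.3, 0.2]            # hypothetical proportions, summing to one
p0 = [10.0, 1.0, 50.0]               # hypothetical prices at time t
p1 = [11.0, 1.10, 48.0]              # hypothetical prices at time t + 1
direct = portfolio_return(weights, p0, p1, invested=1000.0)
averaged = weighted_average_return(weights, p0, p1)
print(direct, averaged)
```

Changing `invested` leaves `direct` unchanged, in line with the remark that the return does not depend on the total amount invested.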

When time is measured continuously, we might consider defining returns by
using the definition above for a period of length \(h\) and then reducing \(h\). In other
words we could define the instantaneous returns process as

\[
\lim_{h \to 0} \frac{S_i(t+h) - S_i(t)}{S_i(t)}.
\]

In most cases, the returns over shorter and shorter periods are smaller and
smaller, and approach the limit zero so some renormalization is required above.
It seems more sensible to consider returns per unit time and then take a limit

\[
R_i(t) = \lim_{h \to 0} \frac{S_i(t+h) - S_i(t)}{h S_i(t)}.
\]

Notice that by the definition of the derivative of a logarithm and assuming that
this derivative is well-defined,

\[
\frac{d \ln(S_i(t))}{dt} = \frac{1}{S_i(t)} \frac{d}{dt} S_i(t)
= \lim_{h \to 0} \frac{S_i(t+h) - S_i(t)}{h S_i(t)}
= R_i(t).
\]

In continuous time, if the stock price process \(S_i(t)\) is differentiable, the natural
definition of the returns process is the derivative of the logarithm of the stock
price. This definition needs some adjustment later because the most common
continuous time models for asset prices do not result in a differentiable process
\(S_i(t)\). The solution we will use then will be to adopt a new concept of an integral
and recast the above in terms of this integral.
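
As a quick check of this definition, consider a purely hypothetical deterministic price path that grows exponentially at a constant rate \(c\) (an assumption made here only for illustration). The un-normalised return over a period of length \(h\) vanishes as \(h \to 0\), while the return per unit time converges to \(c\), which is also the derivative of the logarithm of the price:

```latex
\[
S_i(t) = S_i(0)\, e^{ct}
\quad\Longrightarrow\quad
\frac{S_i(t+h) - S_i(t)}{S_i(t)} = e^{ch} - 1 \to 0,
\qquad
\frac{S_i(t+h) - S_i(t)}{h\, S_i(t)} = \frac{e^{ch} - 1}{h} \to c
\quad (h \to 0),
\]
\[
\frac{d \ln(S_i(t))}{dt} = \frac{d}{dt}\big(\ln S_i(0) + ct\big) = c = R_i(t).
\]
```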

The Capital Asset Pricing Model (CAPM)

We now consider a simplified model for building a portfolio based on quite basic
properties of the potential investments. Let us begin by assuming a single period
so that we are planning at time \(t = 0\) investments over a period ending at time
\(t = 1\). We also assume that investors are interested in only two characteristics of
a potential investment, the expected value and the variance of the return over
this period. We have seen that the return of a portfolio is the value-weighted
average of the returns of the individual investments so let us denote the return
on stock \(i\) by

\[
R_i = \frac{S_i(1) - S_i(0)}{S_i(0)},
\]

and define \(\mu_i = E(R_i)\) and \(w_i\) the proportion of our total investment in stock \(i\)
at the beginning of the period. For brevity of notation, let \(R\), \(w\) and \(\mu\) denote
the column vectors

\[
R = \begin{pmatrix} R_1 \\ R_2 \\ \vdots \\ R_n \end{pmatrix}, \quad
w = \begin{pmatrix} w_1 \\ w_2 \\ \vdots \\ w_n \end{pmatrix}, \quad
\mu = \begin{pmatrix} \mu_1 \\ \mu_2 \\ \vdots \\ \mu_n \end{pmatrix}.
\]

Then the return on the portfolio is \(\sum_i w_i R_i\) or in matrix notation \(w'R\). Let us
suppose that the covariance matrix of returns is the \(n \times n\) matrix \(\Sigma\) so that

\[
\mathrm{cov}(R_i, R_j) = \Sigma_{ij}.
\]

We will frequently use the following properties of expected value and covariance.

Lemma 3 Suppose

\[
R = \begin{pmatrix} R_1 \\ R_2 \\ \vdots \\ R_n \end{pmatrix}
\]

is a column vector of random variables \(R_i\) with \(E(R_i) = \mu_i\), \(i = 1, ..., n\) and
suppose \(R\) has covariance matrix \(\Sigma\). Suppose \(A\) is a non-random vector or matrix
with exactly \(n\) columns so that \(AR\) is a vector of random variables. Then \(AR\)
has mean \(A\mu\) and covariance matrix \(A \Sigma A'\).

Then it is easy to see that the expected return from the portfolio with weights
\(w_i\) is \(\sum_i w_i E(R_i) = \sum_i w_i \mu_i = w'\mu\) and the variance is

\[
\mathrm{var}(w'R) = w' \Sigma w.
\]
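
A minimal sketch of these two formulas for a two-stock portfolio follows. The mean vector, covariance matrix and weights are invented for illustration, and the helper names are not taken from the text.

```python
def portfolio_mean(w, mu):
    """Expected portfolio return w' mu."""
    return sum(wi * mi for wi, mi in zip(w, mu))


def portfolio_variance(w, cov):
    """Portfolio variance w' Sigma w, written out as a double sum."""
    n = len(w)
    return sum(w[i] * cov[i][j] * w[j] for i in range(n) for j in range(n))


mu = [0.08, 0.12]                    # hypothetical expected returns
cov = [[0.04, 0.01],
       [0.01, 0.09]]                 # hypothetical covariance matrix
w = [0.6, 0.4]                       # weights summing to one
mean = portfolio_mean(w, mu)
var = portfolio_variance(w, cov)
print(mean, var, var ** 0.5)
```

In this example the portfolio standard deviation is smaller than that of either stock held alone, which is the diversification effect that motivates the rest of the section.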

We will need to assume that the covariance matrix \(\Sigma\) is non-singular, that
is it has a matrix inverse \(\Sigma^{-1}\). This means, at least for the present, that our
model covers only risky stocks for which the variance of returns is positive. If
a risk-free investment is available (for example a secure bond whose return is
known exactly in advance), this will be handled later.
In the Capital Asset Pricing model it is assumed at the outset that investors
concentrate on two measures of return from a portfolio, the expected value and
standard deviation. These expected values and variances are computed under
the real-world probability distribution P, not under some risk-neutral Q measure.
Clearly investors prefer high expected return, wherever possible, associated with
small standard deviation of return. As a first step in this direction suppose we
plot the standard deviation and expected return for the \(n\) stocks, i.e. the \(n\)
points \(\{(\sigma_i, \mu_i), i = 1, 2, ..., n\}\) where \(\mu_i = E(R_i)\) and \(\sigma_i = \sqrt{\mathrm{var}(R_i)} = \sqrt{\Sigma_{ii}}\).
These \(n\) points do not make up the set of all achievable values of mean and
standard deviation of return, since we are able to construct a portfolio with a certain
proportion of our wealth \(w_i\) invested in stock \(i\). In fact the set of possible points
consists of

\[
\Big\{ \big(\sqrt{w' \Sigma w},\, w'\mu\big) \text{ as the vector } w \text{ ranges over all possible weights such that } \sum_i w_i = 1 \Big\}.
\]

The resulting set has a boundary as in Figure 2.2.




[Figure 2.2: The Efficient Frontier. Horizontal axis: σ = standard deviation of return; vertical axis: η = mean return. The curve is labelled "Efficient Frontier" and the point (σ_g, η_g) is marked.]

Exactly what form this figure takes depends in part on the assumptions applied
to the weights. Since they represent the proportion of our total investment
in each of \(n\) stocks they must add to one. Negative weights correspond to selling
short one stock so as to be able to invest more in another, and we may assume
no limit on our ability to do so. In this case the only constraint on \(w\) is the
constraint \(\sum_i w_i = 1\). With this constraint alone, we can determine the boundary
of the admissible set by fixing the vertical component (the mean return) of
a portfolio at some value, say \(\eta\), and then finding the minimum possible standard
deviation corresponding to that mean. This allows us to determine the leading
edge or left boundary of the region. The optimisation problem is as follows:

\[
\min_w \sqrt{w' \Sigma w}
\]

subject to the two constraints on the weights

\[
w' \mathbf{1} = 1, \qquad w'\mu = \eta,
\]

where 1 is the column vector of n ones. Since we will often make use of the
method of Lagrange multipliers for constrained problems such as this one, we
interject a lemma justifying the method. For details, consult Apostol (1973),
Section 13.7 or any advanced calculus text.

Lemma 4 Consider the optimisation problem

\[
\min\{f(w);\ w \in \mathbb{R}^n\} \text{ subject to } p \text{ constraints} \tag{2.10}
\]

of the form \(g_1(w) = 0, g_2(w) = 0, ..., g_p(w) = 0\).

Then provided the functions \(f, g_1, ..., g_p\) are continuously differentiable, a necessary
condition for a solution to (2.10) is that there is a solution in the \(n + p\)
variables \((w_1, ..., w_n, \lambda_1, ..., \lambda_p)\) of the equations

\[
\frac{\partial}{\partial w_i}\{f(w) + \lambda_1 g_1(w) + ... + \lambda_p g_p(w)\} = 0, \quad i = 1, 2, ..., n
\]
\[
\frac{\partial}{\partial \lambda_j}\{f(w) + \lambda_1 g_1(w) + ... + \lambda_p g_p(w)\} = 0, \quad j = 1, 2, ..., p.
\]

The constants \(\lambda_j\) are called the Lagrange multipliers and the function that
is differentiated, \(\{f(w) + \lambda_1 g_1(w) + ... + \lambda_p g_p(w)\}\), is the Lagrangian.
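Let us return to our original minimization problem with one small simplification.
Since minimizing \(\sqrt{w' \Sigma w}\) results in the same weight vector \(w\) as does
minimizing \(w' \Sigma w\) we choose the latter as our objective function.

We introduce Lagrange multipliers \(\lambda_1, \lambda_2\) and we wish to solve

\[
\frac{\partial}{\partial w_i}\{w' \Sigma w + \lambda_1 (w'\mathbf{1} - 1) + \lambda_2 (w'\mu - \eta)\} = 0, \quad i = 1, 2, ..., n
\]
\[
\frac{\partial}{\partial \lambda_j}\{w' \Sigma w + \lambda_1 (w'\mathbf{1} - 1) + \lambda_2 (w'\mu - \eta)\} = 0, \quad j = 1, 2.
\]

The solution is obtained from the simple differentiation rules

\[
\frac{\partial}{\partial w}\, w' \Sigma w = 2 \Sigma w \quad \text{and} \quad \frac{\partial}{\partial w}\, \mu' w = \mu
\]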

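and is, after absorbing constant factors into the multipliers, of the form

\[
w = \lambda_1 \Sigma^{-1} \mathbf{1} + \lambda_2 \Sigma^{-1} \mu
\]

with the Lagrange multipliers \(\lambda_1, \lambda_2\) chosen to satisfy the two constraints, i.e.

\[
\lambda_1 \mathbf{1}' \Sigma^{-1} \mathbf{1} + \lambda_2 \mathbf{1}' \Sigma^{-1} \mu = 1
\]
\[
\lambda_1 \mu' \Sigma^{-1} \mathbf{1} + \lambda_2 \mu' \Sigma^{-1} \mu = \eta.
\]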

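Suppose we define an \(n \times 2\) matrix \(M\) with columns \(\mathbf{1}\) and \(\mu\),

\[
M = [\, \mathbf{1} \ \ \mu \,]
\]

and the \(2 \times 2\) matrix \(A = (M' \Sigma^{-1} M)^{-1}\), then the Lagrange multipliers are
given by the vector

\[
\lambda = \begin{pmatrix} \lambda_1 \\ \lambda_2 \end{pmatrix} = A \begin{bmatrix} 1 \\ \eta \end{bmatrix}
\]

and the weights by the vector

\[
w = \Sigma^{-1} M A \begin{bmatrix} 1 \\ \eta \end{bmatrix}. \tag{2.11}
\]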

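We are now in a position to identify the boundary or the curve in Figure 2.2.
As the mean of the portfolio \(\eta\) changes, the point takes the form \((\sqrt{w' \Sigma w}, \eta)\)
with \(w\) given by (2.11). Notice that

\[
\begin{aligned}
w' \Sigma w &= [\, 1 \ \ \eta \,]\, A' M' \Sigma^{-1} \Sigma \Sigma^{-1} M A \begin{bmatrix} 1 \\ \eta \end{bmatrix} \\
&= [\, 1 \ \ \eta \,]\, A' M' \Sigma^{-1} M A \begin{bmatrix} 1 \\ \eta \end{bmatrix} \\
&= [\, 1 \ \ \eta \,]\, A \begin{bmatrix} 1 \\ \eta \end{bmatrix} \\
&= A_{11} + 2 A_{12} \eta + A_{22} \eta^2.
\end{aligned}
\]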
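
The construction of the boundary can be sketched numerically. The code below is only an illustration under stated assumptions: the three-stock mean vector and covariance matrix are invented, the helper names are hypothetical, and a small Gauss-Jordan routine stands in for whatever linear-algebra library one would normally use. It computes the weights of (2.11) for a target mean and compares the resulting variance with the quadratic in \(\eta\) derived above.

```python
def solve(matrix, rhs):
    """Solve matrix x = rhs by Gauss-Jordan elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda k: abs(aug[k][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(n):
            if row != col:
                factor = aug[row][col] / aug[col][col]
                aug[row] = [x - factor * y for x, y in zip(aug[row], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def frontier_weights(cov, mu, eta):
    """Weights w = Sigma^{-1} M A [1, eta]' for target mean eta, and the 2x2 matrix A."""
    ones = [1.0] * len(mu)
    s_one = solve(cov, ones)                      # Sigma^{-1} 1
    s_mu = solve(cov, mu)                         # Sigma^{-1} mu
    a, b, c = dot(ones, s_one), dot(ones, s_mu), dot(mu, s_mu)
    det = a * c - b * b                           # determinant of M' Sigma^{-1} M
    A = [[c / det, -b / det], [-b / det, a / det]]
    lam1 = A[0][0] + A[0][1] * eta
    lam2 = A[1][0] + A[1][1] * eta
    w = [lam1 * x + lam2 * y for x, y in zip(s_one, s_mu)]
    return w, A


mu = [0.05, 0.10, 0.15]                           # hypothetical expected returns
cov = [[0.04, 0.01, 0.00],
       [0.01, 0.09, 0.02],
       [0.00, 0.02, 0.16]]                        # hypothetical covariance matrix
eta = 0.12                                        # target mean return
w, A = frontier_weights(cov, mu, eta)
variance = sum(w[i] * cov[i][j] * w[j] for i in range(3) for j in range(3))
quadratic = A[0][0] + 2 * A[0][1] * eta + A[1][1] * eta ** 2
print(w, variance, quadratic)
```

Repeating the calculation over a grid of `eta` values and plotting the square root of the variance against `eta` traces out the boundary curve.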

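Therefore a point on the boundary \((\sigma, \eta) = (\sqrt{w' \Sigma w}, \eta)\) satisfies

\[
\sigma^2 - A_{22} \eta^2 - 2 A_{12} \eta - A_{11} = 0
\]

or

\[
\sigma^2 = A_{22} \eta^2 + 2 A_{12} \eta + A_{11} = \sigma_g^2 + A_{22} (\eta - \eta_g)^2
\]

where

\[
\eta_g = -\frac{A_{12}}{A_{22}} = \frac{\mathbf{1}' \Sigma^{-1} \mu}{\mathbf{1}' \Sigma^{-1} \mathbf{1}}, \tag{2.12}
\]
\[
\sigma_g^2 = A_{11} - \frac{A_{12}^2}{A_{22}} = \frac{A_{11} A_{22} - A_{12}^2}{A_{22}} = \frac{1}{\mathbf{1}' \Sigma^{-1} \mathbf{1}},
\]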

and the point \((\sigma_g, \eta_g)\) represents the point in the region corresponding to the
minimum possible standard deviation over all portfolios. This is the most
conservative investment portfolio available with this class of securities. What
weights do we need to put on the individual stocks to achieve this conservative
portfolio? It is easy to see that the weight vector is given by

\[
w_g' = \frac{\mathbf{1}' \Sigma^{-1}}{\mathbf{1}' \Sigma^{-1} \mathbf{1}}
\]

and since the quantity \(\mathbf{1}' \Sigma^{-1} \mathbf{1}\) in the denominator is just a scale factor to ensure
that the weights add to one, the amount invested in stock \(i\) is proportional to
the sum of the elements of the \(i\)'th row of the inverse covariance matrix \(\Sigma^{-1}\).
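
As a small worked illustration, take the hypothetical case of two uncorrelated stocks with standard deviations \(\sigma_1\) and \(\sigma_2\) (an assumption made here only to keep the inverse matrix simple). The row sums of the inverse covariance matrix are then the reciprocal variances, so the less volatile stock receives the larger weight:

```latex
\[
\Sigma = \begin{pmatrix} \sigma_1^2 & 0 \\ 0 & \sigma_2^2 \end{pmatrix},
\qquad
\Sigma^{-1} \mathbf{1} = \begin{pmatrix} 1/\sigma_1^2 \\ 1/\sigma_2^2 \end{pmatrix},
\qquad
\mathbf{1}' \Sigma^{-1} \mathbf{1} = \frac{\sigma_1^2 + \sigma_2^2}{\sigma_1^2 \sigma_2^2},
\]
\[
w_g = \frac{1}{\sigma_1^2 + \sigma_2^2} \begin{pmatrix} \sigma_2^2 \\ \sigma_1^2 \end{pmatrix},
\qquad
\sigma_g^2 = \frac{1}{\mathbf{1}' \Sigma^{-1} \mathbf{1}} = \frac{\sigma_1^2 \sigma_2^2}{\sigma_1^2 + \sigma_2^2}.
\]
% With sigma_1 = 0.2 and sigma_2 = 0.3: w_g is approximately (0.692, 0.308)
% and sigma_g is approximately 0.166, below both individual standard deviations.
```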
An equation of the form

\[
\sigma^2 - A_{22} (\eta - \eta_g)^2 = \sigma_g^2
\]

represents a hyperbola since \(A_{22} > 0\). Of course investors are presumed to prefer
higher returns for a given value of the standard deviation of a portfolio so it is
only the upper boundary of this curve in Figure 2.2 that is efficient in the sense
that there is no portfolio that is strictly better (better in the sense of higher
return combined with standard deviation that is not larger).
Now let us return to a portfolio whose standard deviation and mean return
lie on the efficient frontier. Let us call these efficient portfolios. It turns out
that any portfolio on this efficient frontier has the same covariance with the
minimum variance portfolio \(w_g'R\) derived above.

Proposition 5 Every efficient portfolio has the same covariance \(\dfrac{1}{\mathbf{1}' \Sigma^{-1} \mathbf{1}}\) with
the conservative portfolio \(w_g'R\).

Proof. We noted before that such a portfolio has mean return \(\eta\) and standard
deviation \(\sigma\) which satisfy the relation

\[
\sigma^2 - A_{22} \eta^2 - 2 A_{12} \eta - A_{11} = 0.
\]

Moreover the weights for this portfolio are described by

\[
w = \Sigma^{-1} M A \begin{bmatrix} 1 \\ \eta \end{bmatrix} \tag{2.15}
\]

so the return from this portfolio can be written as

\[
w'R = [\, 1 \ \ \eta \,]\, A M' \Sigma^{-1} R.
\]

It is interesting to observe that the covariance of returns between this efficient
portfolio and the conservative portfolio \(w_g'R\) is given by
