$R$ has probability density function

$$f_R(r) = r e^{-r^2/2}, \qquad \text{for } r > 0,$$

and $\Theta$ has probability density function $f_\Theta(\theta) = \frac{1}{2\pi}$ for $0 < \theta < 2\pi$. Since $r = r(x, y) = \sqrt{x^2 + y^2}$ and $\theta(x, y) = \arctan(y/x)$, the Jacobian of the transformation is
$$\left|\frac{\partial(r,\theta)}{\partial(x,y)}\right| = \begin{vmatrix} \dfrac{\partial r}{\partial x} & \dfrac{\partial r}{\partial y} \\[6pt] \dfrac{\partial \theta}{\partial x} & \dfrac{\partial \theta}{\partial y} \end{vmatrix} = \begin{vmatrix} \dfrac{x}{\sqrt{x^2+y^2}} & \dfrac{y}{\sqrt{x^2+y^2}} \\[6pt] \dfrac{-y}{x^2+y^2} & \dfrac{x}{x^2+y^2} \end{vmatrix} = \frac{1}{\sqrt{x^2+y^2}}.$$
Consequently the joint probability density function of $(X, Y)$ is given by

$$f_\Theta(\arctan(y/x))\, f_R\!\left(\sqrt{x^2+y^2}\right) \left|\frac{\partial(r,\theta)}{\partial(x,y)}\right| = \frac{1}{2\pi}\sqrt{x^2+y^2}\, e^{-(x^2+y^2)/2} \times \frac{1}{\sqrt{x^2+y^2}}$$
$$= \frac{1}{2\pi} e^{-(x^2+y^2)/2} = \frac{1}{\sqrt{2\pi}} e^{-x^2/2} \cdot \frac{1}{\sqrt{2\pi}} e^{-y^2/2},$$

and this is the joint probability density function of two independent standard normal random variables.
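The construction in the proof translates directly into a generator: draw $R = \sqrt{-2\log U_1}$ and $\Theta = 2\pi U_2$ from two uniforms and set $X = R\cos\Theta$, $Y = R\sin\Theta$. A minimal sketch in Python (illustrative only; the book's own examples use Matlab, and the function names here are ours):

```python
import math
import random

def box_muller(u1, u2):
    """Map two independent uniform(0,1) variates to two independent
    standard normals: R = sqrt(-2 ln u1), theta = 2*pi*u2."""
    r = math.sqrt(-2.0 * math.log(u1))
    theta = 2.0 * math.pi * u2
    return r * math.cos(theta), r * math.sin(theta)

def normal_pairs(n, rng=random.random):
    """Generate n pairs of independent standard normal variates."""
    return [box_muller(rng(), rng()) for _ in range(n)]
```

The sample mean and variance of a large batch should be close to 0 and 1 respectively.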
GENERATING RANDOM NUMBERS FROM NON-UNIFORM CONTINUOUS DISTRIBUTIONS 131

The tails of the distribution of the pseudo-random numbers produced by the Box-Muller method are quite sensitive to the granularity of the uniform generator. For this reason, although Box-Muller is the simplest normal generator, it is not the method of choice in most software. A related alternative algorithm for generating standard normal variates is the Marsaglia polar method. This is a modification of the Box-Muller generator designed to avoid the calculation of sin or cos. Here we generate a point $(Z_1, Z_2)$ from the uniform distribution on the unit circle by rejection, generating the point initially from the square $-1 \le z_1 \le 1$, $-1 \le z_2 \le 1$ and accepting it when it falls in the unit circle, that is, when $z_1^2 + z_2^2 \le 1$. Now suppose that the point $(Z_1, Z_2)$ is uniformly distributed inside the unit circle. Then for $r > 0$,
$$P\left[\sqrt{-2\log(Z_1^2 + Z_2^2)} \le r\right] = P\left[Z_1^2 + Z_2^2 \ge e^{-r^2/2}\right] = 1 - \frac{\text{area of a circle of radius } e^{-r^2/4}}{\text{area of a circle of radius } 1} = 1 - e^{-r^2/2}.$$


This is exactly the same cumulative distribution function as that of the random variable $R$ in Theorem 21. It follows that we can replace $R^2$ by $-2\log(Z_1^2 + Z_2^2)$. Similarly, if $(Z_1, Z_2)$ is uniformly distributed inside the unit circle, then the angle subtended at the origin by a line to the point $(Z_1, Z_2)$ is uniformly distributed on $[0, 2\pi]$, and so we can replace $\cos\Theta$ and $\sin\Theta$ by $Z_1/\sqrt{Z_1^2+Z_2^2}$ and $Z_2/\sqrt{Z_1^2+Z_2^2}$ respectively. The following theorem is therefore proved.


Theorem 23 If the point $(Z_1, Z_2)$ is uniformly distributed in the unit circle $Z_1^2 + Z_2^2 \le 1$, then the pair of random variables defined by

$$X = \sqrt{-2\log(Z_1^2+Z_2^2)}\; \frac{Z_1}{\sqrt{Z_1^2+Z_2^2}}$$
$$Y = \sqrt{-2\log(Z_1^2+Z_2^2)}\; \frac{Z_2}{\sqrt{Z_1^2+Z_2^2}}$$

are independent standard normal variables.
132 CHAPTER 3. BASIC MONTE CARLO METHODS




Figure 3.10: Marsaglia's Method for Generating Normal Random Numbers



If we use acceptance-rejection to generate uniform random variables $Z_1, Z_2$ inside the unit circle, the probability that a point generated inside the square falls inside the unit circle is $\pi/4$, so that on average around $4/\pi \approx 1.27$ pairs of uniforms are needed to generate a pair of normal variates.

The speed of the Marsaglia polar algorithm compared to that of the Box-Muller algorithm depends on the relative speeds of generating uniform variates versus the sine and cosine transformations. The Box-Muller and Marsaglia polar methods are illustrated in Figure 3.10.
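The polar method just described can be sketched as follows (Python for illustration; the rejection step and the replacement of cos and sin follow Theorem 23, but the code itself is our own):

```python
import math
import random

def marsaglia_polar(rng=random.random):
    """One pair of independent standard normals, avoiding sin and cos.

    (z1, z2) is drawn uniformly from the square [-1, 1] x [-1, 1] and
    accepted when it falls inside the unit circle (probability pi/4);
    then sqrt(-2 ln s)/sqrt(s), with s = z1^2 + z2^2, supplies both the
    radius and the cos/sin factors of the Box-Muller method at once."""
    while True:
        z1 = 2.0 * rng() - 1.0
        z2 = 2.0 * rng() - 1.0
        s = z1 * z1 + z2 * z2
        if 0.0 < s <= 1.0:
            factor = math.sqrt(-2.0 * math.log(s) / s)
            return z1 * factor, z2 * factor
```

No trigonometric call appears: the accepted point supplies the direction and $s$ supplies the radius.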

Unfortunately the speed of these normal generators is not the only consideration. If we run a linear congruential generator through a full period we have seen that the points lie on a lattice, doing a reasonable job of filling the two-dimensional rectangle. Transformations like (3.10) are highly non-linear functions of $(U, V)$, stretching the space in some places and compressing it in others. It would not be too surprising if, when we apply this transformation to
Figure 3.11: Box-Muller transformation applied to the output of $x_n = 97x_{n-1} \bmod 2^{17}$
our points on a lattice, they do not provide the same kind of uniform coverage of the space. In Figure 3.11 we see that the lattice structure in the output from the linear congruential generator results in an interesting but alarmingly non-normal pattern, particularly sparse in the tails of the distribution. Indeed, if we use the full-period generator $x_n = 16807x_{n-1} \bmod (2^{31} - 1)$, the smallest possible value generated for $y$ is around $-4.476$, although in theory there should be around 8,000 normal variates generated below this.
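The figure of "around 8,000" can be checked directly: over a full period of $2^{31} - 2$ variates, the expected number of draws below $-4.476$ from a generator with perfect normal tails is $(2^{31}-2)\,\Phi(-4.476)$. A quick sketch, expressing $\Phi$ through the complementary error function:

```python
import math

def std_normal_cdf(x):
    """Phi(x) via the complementary error function."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))

def expected_tail_count(period, cutoff):
    """Expected number of variates below `cutoff` among `period` draws,
    if the generator had perfect normal tails."""
    return period * std_normal_cdf(cutoff)
```

Here `expected_tail_count(2**31 - 2, -4.476)` is roughly $8 \times 10^3$, consistent with the claim in the text.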

The normal random number generator in Matlab is called normrnd, or, for standard normals, randn. For example normrnd(µ, σ, m, n) generates an m × n matrix of pseudo-independent normal variates with mean µ and standard deviation σ, and randn(m,n) generates an m × n matrix of standard normal random numbers. A more precise algorithm is to use inverse transform and a highly refined rational approximation to the normal inverse cumulative distribution function available from P.J. Acklam (2003). The Matlab implementation of this inverse c.d.f. is called ltqnorm and, after application of a refinement, achieves full machine precision. In R or Splus, the normal random number generator is called rnorm. The inverse random number function in Excel has been problematic in many versions. These problems appear to have been largely corrected in Excel 2002, although there is still significant error (roughly in the third decimal) in the estimation of lower and upper tail quantiles. The following table provides a comparison of the normsinv function in Excel and the Matlab inverse normal norminv. The "exact" values agree with the values generated by Matlab norminv to the number of decimals shown.

p          Excel 2002     Exact
10^{-1}    -1.281551939   -1.281551566
10^{-2}    -2.326347      -2.326347874
10^{-3}    -3.090252582   -3.090232306
10^{-4}    -3.719090272   -3.719016485
10^{-5}    -4.265043367   -4.264890794
10^{-6}    -4.753672555   -4.753424309
10^{-7}    -5.199691841   -5.199337582
10^{-8}    -5.612467211   -5.612001244
10^{-9}    -5.998387182   -5.997807015
10^{-10}   -6.362035677   -6.361340902

The Lognormal Distribution

If $Z$ is a normal random variable with mean $\mu$ and variance $\sigma^2$, then we say that the distribution of $X = e^Z$ is lognormal with mean $E(X) = \eta = \exp\{\mu + \sigma^2/2\} > 0$ and parameter $\sigma > 0$. Because a lognormal random variable is obtained by exponentiating a normal random variable it is strictly positive, making it a reasonable candidate for modelling quantities such as stock prices, exchange rates, and lifetimes, though in a fool's paradise in which stock prices and lifetimes are never zero. To determine the lognormal probability density function, notice that

$$P[X \le x] = P[e^Z \le x] = P[Z \le \ln(x)] = \Phi\!\left(\frac{\ln(x) - \mu}{\sigma}\right), \quad \text{with } \Phi \text{ the standard normal c.d.f.,}$$

and differentiating to obtain the probability density function $g(x|\eta, \sigma)$ of $X$, we obtain

$$g(x|\eta,\sigma) = \frac{d}{dx}\Phi\!\left(\frac{\ln(x)-\mu}{\sigma}\right) = \frac{1}{x\sigma\sqrt{2\pi}} \exp\{-(\ln(x)-\mu)^2/2\sigma^2\} = \frac{1}{x\sigma\sqrt{2\pi}} \exp\{-(\ln(x)-\ln(\eta)+\sigma^2/2)^2/2\sigma^2\}.$$

A random variable with a lognormal distribution is easily generated by generating an appropriate normal random variable $Z$ and then exponentiating. We may use either the parameter $\mu$, the mean of the random variable $Z$ in the exponent, or the parameter $\eta$, the expected value of the lognormal. The relationship is not as simple as a naive first impression might indicate, since

$$E(e^Z) \ne e^{E(Z)}.$$

Now is a good time to become accustomed to this correction factor of $\sigma^2/2$ in the exponent,

$$\eta = E(e^Z) = e^{E(Z)+\sigma^2/2} = e^{\mu+\sigma^2/2} \quad \text{or} \quad E(e^{Z-\mu-\sigma^2/2}) = 1,$$

since a similar factor appears throughout the study of stochastic integrals and mathematical finance. Since the lognormal distribution is the one most often used in models of stock prices, it is worth here recording some of its conditional moments used in the valuation of options. In particular if $X$ has a lognormal distribution with mean $\eta = e^{\mu+\sigma^2/2}$ and volatility parameter $\sigma$, then for any $p$ and $l > 0$,
$$E[X^p I(X > l)] = \frac{1}{\sigma\sqrt{2\pi}} \int_l^\infty x^{p-1} \exp\{-(\ln(x)-\mu)^2/2\sigma^2\}\, dx$$
$$= \frac{1}{\sigma\sqrt{2\pi}} \int_{\ln(l)}^\infty e^{zp} \exp\{-(z-\mu)^2/2\sigma^2\}\, dz$$
$$= \frac{1}{\sigma\sqrt{2\pi}}\, e^{p\mu+p^2\sigma^2/2} \int_{\ln(l)}^\infty \exp\{-(z-\xi)^2/2\sigma^2\}\, dz \quad \text{where } \xi = \mu + \sigma^2 p$$
$$= e^{p\mu+p^2\sigma^2/2}\, \Phi\!\left(\frac{\xi - \ln(l)}{\sigma}\right)$$
$$= \eta^p \exp\left\{-\frac{\sigma^2}{2} p(1-p)\right\} \Phi\!\left(\sigma^{-1}\ln(\eta/l) + \sigma\left(p - \frac{1}{2}\right)\right) \tag{3.11}$$

where $\Phi$ is the standard normal cumulative distribution function.
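Formula (3.11) is easy to check numerically by comparing it with a Monte Carlo average of $X^p I(X>l)$. The following Python sketch does this (illustrative only; the function names are ours, not from the text):

```python
import math
import random

def lognormal_partial_moment(eta, sigma, p, l):
    """Closed form (3.11): E[X^p I(X > l)] for lognormal X with mean
    eta = exp(mu + sigma^2/2) and volatility parameter sigma."""
    phi = lambda t: 0.5 * math.erfc(-t / math.sqrt(2.0))
    return (eta ** p
            * math.exp(-0.5 * sigma ** 2 * p * (1.0 - p))
            * phi(math.log(eta / l) / sigma + sigma * (p - 0.5)))

def mc_partial_moment(eta, sigma, p, l, n, rng):
    """Monte Carlo estimate of the same conditional moment."""
    mu = math.log(eta) - 0.5 * sigma ** 2
    total = 0.0
    for _ in range(n):
        x = math.exp(rng.gauss(mu, sigma))
        if x > l:
            total += x ** p
    return total / n
```

With $p = 0$ the formula reduces to $P(X > l)$, which provides a further sanity check.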


Application: A Discrete Time Black-Scholes Model

Suppose that a stock price $S_t$, $t = 1, 2, 3, \ldots$ is generated from an independent sequence of returns $Z_1, Z_2, \ldots$ over non-overlapping time intervals. If the value of the stock at the end of day $t = 0$ is $S_0$, and the return on day 1 is $Z_1$, then the value of the stock at the end of day 1 is $S_1 = S_0 e^{Z_1}$. There is some justice in the use of the term "return" for $Z_1$ since for small values of $Z_1$, $S_0 e^{Z_1} \simeq S_0(1 + Z_1)$ and so $Z_1$ is roughly $(S_1 - S_0)/S_0$. Assume similarly that the stock at the end of day $i$ has value $S_i = S_{i-1}\exp(Z_i)$. In general for a total of $j$ such periods (suppose there are $N$ such periods in a year) we assume that $S_j = S_0 \exp\{\sum_{i=1}^j Z_i\}$ for independent random variables $Z_i$ all having the same normal distribution. Note that in this model the returns over non-overlapping periods of time are independent. Denote $\mathrm{var}(Z_i) = \sigma^2/N$ so that

$$\mathrm{var}\left(\sum_{i=1}^N Z_i\right) = \sigma^2$$

represents the squared annual volatility parameter of the stock returns. Assume that the annual interest rate on a risk-free bond is $r$ so that the interest rate per period is $r/N$.

Recall that the risk-neutral measure $Q$ is a measure under which the stock price, discounted to the present, forms a martingale. In general there may be many such measures, but in this case there is only one under which the stock price process has a similar lognormal representation $S_j = S_0 \exp\{\sum_{i=1}^j Z_i\}$ for independent normal random variables $Z_i$. Of course under the risk-neutral measure, the normal random variables $Z_i$ may have a different mean. Full justification of this model and the uniqueness of the risk-neutral distribution really relies on the continuous-time version of the Black-Scholes model described in Section 2.6. Note that if the process

$$e^{-rj/N} S_j = S_0 \exp\left\{\sum_{i=1}^j \left(Z_i - \frac{r}{N}\right)\right\}$$

is to form a martingale under $Q$, it is necessary that

$$E_Q[S_{j+1}|H_j] = S_j, \quad \text{or} \quad E_Q\left[S_j \exp\left\{Z_{j+1} - \frac{r}{N}\right\}\Big|H_j\right] = S_j E_Q\left[\exp\left\{Z_{j+1} - \frac{r}{N}\right\}\right] = S_j,$$

and so $\exp\{Z_{j+1} - \frac{r}{N}\}$ must have a lognormal distribution with expected value 1. Recall that, from the properties of the lognormal distribution,

$$E_Q\left[\exp\left\{Z_{t+1} - \frac{r}{N}\right\}\right] = \exp\left\{E_Q(Z_{t+1}) - \frac{r}{N} + \frac{\sigma^2}{2N}\right\}$$

since $\mathrm{var}_Q(Z_{t+1}) = \frac{\sigma^2}{N}$. In other words, for each $i$ the expected value of $Z_i$ is, under $Q$, equal to $\frac{r}{N} - \frac{\sigma^2}{2N}$. So under $Q$, $S_j$ has a lognormal distribution with mean

$$S_0 e^{rj/N}$$

and volatility parameter $\sigma\sqrt{j/N}$.
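A quick numerical check of this drift: with $Z_i \sim N(r/N - \sigma^2/2N,\ \sigma^2/N)$, the per-period discounted growth factor $\exp\{Z_i - r/N\}$ should average to 1. A Python sketch (names assumed, not from the text):

```python
import math
import random

def discounted_growth_mean(r, sigma, N, n, rng):
    """Average of exp(Z - r/N) over n draws of
    Z ~ N(r/N - sigma^2/(2N), sigma^2/N); under the risk-neutral drift
    the exact expectation is 1."""
    s = sigma / math.sqrt(N)
    mn = r / N - 0.5 * s * s
    total = 0.0
    for _ in range(n):
        total += math.exp(rng.gauss(mn, s) - r / N)
    return total / n
```

Any other choice of mean for $Z_i$ would push this average systematically away from 1, breaking the martingale property.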
Rather than use the Black-Scholes formula of Section 2.6, we could price a call option with maturity $j = NT$ periods from now by generating the random path $S_i$, $i = 1, 2, \ldots, j$ using the lognormal distribution for $S_j$ and then averaging the discounted returns. The value at time $j = 0$ of a call option with exercise price $K$ is an average of simulated values of

$$e^{-rj/N}(S_j - K)^+ = e^{-rj/N}\left(S_0 \exp\left\{\sum_{i=1}^j Z_i\right\} - K\right)^+,$$

with the simulations conducted under the risk-neutral measure $Q$ with initial stock price the current price $S_0$. Thus the random variables $Z_i$ are independent $N(\frac{r}{N} - \frac{\sigma^2}{2N}, \frac{\sigma^2}{N})$. The following Matlab function simulates the stock price over the whole period until maturity and then values a European call option on the stock by averaging the discounted returns.

Example 24 (simulating the return from a call option)

Consider simulating a call option on a stock whose current value is $S_0 = \$1.00$. The option expires in $j$ days and the strike price is $K = \$1.00$. We assume a constant spot (annual) interest rate $r$, and the stock price follows a lognormal distribution with annual volatility parameter $\sigma$. The following Matlab function provides a simple simulation and graph of the path of the stock over the life of the option and then outputs the discounted payoff from the option.


function z=plotlogn(r,sigma,T,K)
% outputs the discounted simulated return on expiry of a call option
% (per dollar pv of stock).
% Expiry = T years from now (T = j/N).
% Current stock price = $1 (= S0), r = annual spot interest rate,
% sigma = annual volatility (= sigma), K = strike price.
N=250;            % N is the assumed number of business days in a year
j=N*T;            % the number of days to expiry
s=sigma/sqrt(N);  % s is the volatility per period
mn=r/N-s^2/2;     % mn = mean of the normal increments per period
y=exp(cumsum(normrnd(mn,s,j,1)));
y=[1 y'];         % the value of the stock at times 0,...,j
x=(0:j)/N;        % the time points i/N
plot(x,y,'-',x,K*ones(1,j+1),'y')
xlabel('time (in years)')
ylabel('value of stock')
title('SIMULATED RETURN FROM CALL OPTION')
z=exp(-r*T)*max(y(j+1)-K,0);  % payoff from option discounted to present

Figure 3.12: One simulation of the return from a call option with strike price $1.00


Figure 3.12 resulted from one simulation run with r = .05, j = 63 (about 3
months), σ = .20, K = 1.

The return on this run was the discounted difference between the terminal value of the stock and the strike price, or around 0.113. We may repeat this many times, averaging the discounted returns to estimate the present value of the option.

For example, to value an at-the-money call option with exercise price = the initial price of the stock = \$1, 5% annual interest rate, 20% annual volatility and maturity 0.25 years from the present, we ran this function 100 times and averaged the returns to estimate the option price as 0.044978. If we repeat the identical statement, the output is different, for example option val = 0.049117, because each is an average obtained from only 100 simulations. Averaging over more simulations would result in greater precision, but this function is not written with computational efficiency in mind. We will provide more efficient simulations for this problem later. For the moment we can compare the price of this option as determined by simulation with the exact price according to the Black-Scholes formula. This formula was developed in Section 2.6. The price of a call option at time $t = 0$ is given by

$$V(S_0, T) = S_0\Phi(d_1) - Ke^{-rT/N}\Phi(d_2)$$

where

$$d_1 = \frac{\log(S_0/K) + (r + \frac{\sigma^2}{2})T/N}{\sigma\sqrt{T/N}} \quad \text{and} \quad d_2 = \frac{\log(S_0/K) + (r - \frac{\sigma^2}{2})T/N}{\sigma\sqrt{T/N}}$$

(with $T$ here measured in periods, so that $T/N$ is the maturity in years). The Matlab function which evaluates this is the function blsprice which gives, in this example, an exact price on entering [CALL,PUT] = BLSPRICE(1,1,.05,63/250,.2,0), which returns the value CALL = 0.0464. With these parameters, 4.6 cents on the dollar allows us to lock in any anticipated profit on the price of a stock (or commodity if the lognormal model fits) for a period of about three months. The fact that this can be done cheaply and with ease is part of the explanation for the popularity of derivatives as tools for hedging.
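The same comparison can be sketched outside Matlab. The following illustrative Python functions (our own names, not from the text) reproduce the closed-form price and the Monte Carlo average of discounted payoffs; only the terminal price is simulated, since the payoff depends only on $S_T$:

```python
import math
import random

def bls_call(S0, K, r, T, sigma):
    """Black-Scholes price of a European call, maturity T in years."""
    phi = lambda t: 0.5 * math.erfc(-t / math.sqrt(2.0))
    d1 = (math.log(S0 / K) + (r + 0.5 * sigma ** 2) * T) / (sigma * math.sqrt(T))
    d2 = d1 - sigma * math.sqrt(T)
    return S0 * phi(d1) - K * math.exp(-r * T) * phi(d2)

def mc_call(S0, K, r, T, sigma, n, rng):
    """Monte Carlo price: average of discounted payoffs under Q."""
    drift = (r - 0.5 * sigma ** 2) * T   # risk-neutral drift over [0, T]
    vol = sigma * math.sqrt(T)
    disc = math.exp(-r * T)
    total = 0.0
    for _ in range(n):
        st = S0 * math.exp(rng.gauss(drift, vol))
        total += disc * max(st - K, 0.0)
    return total / n
```

With the parameters above, `bls_call(1, 1, 0.05, 63/250, 0.2)` is about 0.046, matching the CALL value quoted from blsprice, and the Monte Carlo average converges to it as the number of simulations grows.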




Figure 3.13: Comparison between the Lognormal and the Gamma densities



Algorithms for Generating the Gamma and Beta Distributions

We turn now to algorithms for generating the Gamma distribution with density

$$f(x|a, b) = \frac{x^{a-1}e^{-x/b}}{\Gamma(a)b^a}, \quad \text{for } x > 0,\ a > 0,\ b > 0. \tag{3.12}$$

The exponential distribution ($a = 1$) and the chi-squared (corresponding to $a = \nu/2$, $b = 2$, for $\nu$ integer) are special cases of the Gamma distribution. The gamma family of distributions permits a wide variety of shapes of density functions and is a reasonable alternative to the lognormal model for positive quantities such as asset prices. In fact for certain parameter values the gamma density function is very close to the lognormal. Consider for example a typical lognormal random variable with mean $\eta = 1.1$ and volatility $\sigma = 0.40$.
The probability density functions can be quite close, as in Figure 3.13. Of course the lognormal, unlike the gamma distribution, has the additional attractive feature that a product of independent lognormal random variables also has a lognormal distribution.
Another common distribution closely related to the gamma is the Beta distribution with probability density function defined for parameters $a, b > 0$,

$$f(x) = \frac{\Gamma(a+b)}{\Gamma(a)\Gamma(b)}\, x^{a-1}(1-x)^{b-1}, \quad 0 \le x \le 1. \tag{3.13}$$

The beta density obtains, for example, as the distribution of order statistics in a sample of independent uniform $[0,1]$ variates. This is easy to see. For example if $U_1, \ldots, U_n$ are independent uniform random variables on the interval $[0,1]$ and if $U_{(k)}$ denotes the $k$'th smallest of these $n$ values, then

$$P[U_{(k)} < x] = P[\text{there are } k \text{ or more values less than } x] = \sum_{j=k}^{n} \binom{n}{j} x^j (1-x)^{n-j}.$$

Differentiating, we find the probability density function of $U_{(k)}$ to be

$$\frac{d}{dx}\sum_{j=k}^{n}\binom{n}{j} x^j(1-x)^{n-j} = \sum_{j=k}^{n}\binom{n}{j}\left\{jx^{j-1}(1-x)^{n-j} - (n-j)x^j(1-x)^{n-j-1}\right\}$$
$$= k\binom{n}{k} x^{k-1}(1-x)^{n-k} = \frac{\Gamma(n+1)}{\Gamma(k)\Gamma(n-k+1)}\, x^{k-1}(1-x)^{n-k}$$

(the sum telescopes), and this is the beta density with parameters $a = k$, $b = n - k + 1$. Order statistics from a uniform sample therefore have a beta distribution, with the $k$'th order statistic having the Beta$(k, n-k+1)$ distribution. This means
that order statistics from more general continuous distributions can be easily generated using the inverse transform and a beta random variable. For example suppose we wish to simulate the largest observation in a normal$(\mu, \sigma^2)$ sample of size 100. Rather than generate a sample of 100 normal observations and take the largest, we can simulate the value of the largest uniform order statistic $U_{(100)} \sim$ Beta$(100, 1)$, and then $\mu + \sigma\Phi^{-1}(U_{(100)})$ (with $\Phi^{-1}$ the standard normal inverse cumulative distribution function) is the required simulated value. This may be used to render simulations connected with risk management more efficient.
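This device is easy to sketch in Python (illustrative only; `NormalDist` supplies $\Phi^{-1}$). The largest of $n$ uniforms has c.d.f. $x^n$, so a single uniform $V$ gives the Beta$(n, 1)$ variate $U_{(n)} = V^{1/n}$ by inverse transform:

```python
import random
from statistics import NormalDist

def simulate_normal_max(mu, sigma, n, rng):
    """Largest observation of a normal(mu, sigma^2) sample of size n,
    simulated from a single uniform: U_(n) = V^(1/n) has the Beta(n, 1)
    distribution of the largest uniform order statistic."""
    u_max = rng.random() ** (1.0 / n)
    return mu + sigma * NormalDist().inv_cdf(u_max)
```

One call replaces generating the whole sample of $n$ normals and taking the maximum.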

The following result lists some important relationships between the Gamma and Beta distributions. For example it allows us to generate a Beta random variable from two independent Gamma random variables.

Theorem 25 (Gamma distribution) If $X_1, X_2$ are independent Gamma$(a_1, b)$ and Gamma$(a_2, b)$ random variables, then $Z = \frac{X_1}{X_1+X_2}$ and $Y = X_1 + X_2$ are independent random variables with Beta$(a_1, a_2)$ and Gamma$(a_1+a_2, b)$ distributions respectively. Conversely, if $(Z, Y)$ are independent variates with the Beta$(a_1, a_2)$ and the Gamma$(a_1+a_2, b)$ distributions respectively, then $X_1 = YZ$ and $X_2 = Y(1-Z)$ are independent and have the Gamma$(a_1, b)$ and Gamma$(a_2, b)$ distributions respectively.

Proof. Assume that $X_1, X_2$ are independent Gamma$(a_1, b)$ and Gamma$(a_2, b)$ variates. Then their joint probability density function is

$$f_{X_1X_2}(x_1, x_2) = \frac{1}{\Gamma(a_1)\Gamma(a_2)b^{a_1+a_2}}\, x_1^{a_1-1} x_2^{a_2-1} e^{-(x_1+x_2)/b}, \quad \text{for } x_1 > 0,\ x_2 > 0.$$

Consider the change of variables $x_1(z, y) = zy$, $x_2(z, y) = (1-z)y$. Then the Jacobian of this transformation is given by

$$\begin{vmatrix} \dfrac{\partial x_1}{\partial z} & \dfrac{\partial x_1}{\partial y} \\[6pt] \dfrac{\partial x_2}{\partial z} & \dfrac{\partial x_2}{\partial y} \end{vmatrix} = \begin{vmatrix} y & z \\ -y & 1-z \end{vmatrix} = y.$$

Therefore the joint probability density function of $(Z, Y)$ is given by

$$f_{Z,Y}(z, y) = f_{X_1X_2}(zy, (1-z)y)\left|\begin{matrix} \frac{\partial x_1}{\partial z} & \frac{\partial x_1}{\partial y} \\ \frac{\partial x_2}{\partial z} & \frac{\partial x_2}{\partial y} \end{matrix}\right| = \frac{1}{\Gamma(a_1)\Gamma(a_2)b^{a_1+a_2}}\, z^{a_1-1}(1-z)^{a_2-1} y^{a_1+a_2-1} e^{-y/b}$$
$$= \frac{\Gamma(a_1+a_2)}{\Gamma(a_1)\Gamma(a_2)}\, z^{a_1-1}(1-z)^{a_2-1} \times \frac{1}{\Gamma(a_1+a_2)b^{a_1+a_2}}\, y^{a_1+a_2-1} e^{-y/b}, \quad \text{for } 0 < z < 1,\ y > 0,$$

and this is the product of two probability density functions: the Beta$(a_1, a_2)$ density for $Z$ and the Gamma$(a_1+a_2, b)$ probability density function for $Y$. The converse holds similarly.

This result is a basis for generating gamma variates with integer values of the parameter $a$ (sometimes referred to as the shape parameter). According to the theorem, if $a$ is an integer and we sum $a$ independent Gamma$(1, b)$ random variables, the resultant sum has a Gamma$(a, b)$ distribution. Notice that $-b\log(U_i)$ for a uniform$[0,1]$ random variable $U_i$ is an exponential, or Gamma$(1, b)$, random variable. Thus $-b\log(\prod_{i=1}^n U_i)$ generates a Gamma$(n, b)$ variate for independent uniform $U_i$. The computation required for this algorithm, however, increases linearly in the parameter $a = n$, and therefore alternatives are required, especially for large $a$. Observe that the scale parameter $b$ is easily handled in general: simply generate a random variable with scale parameter 1 and then multiply by $b$. Most algorithms below, therefore, are only indicated for $b = 1$.
For large $a$, Cheng (1977) uses acceptance-rejection from a density of the form

$$g(x) = \frac{\lambda\mu x^{\lambda-1}}{(\mu + x^\lambda)^2}, \quad x > 0, \tag{3.14}$$

called the Burr XII distribution. The two parameters $\mu$ and $\lambda$ of this density ($\mu$ is not the mean) are chosen so that it is as close as possible to the gamma distribution. We can generate a random variable from (3.14) by inverse transform as $G^{-1}(U) = \left(\frac{\mu U}{1-U}\right)^{1/\lambda}$.
A much simpler function for dominating the gamma densities is a minor extension of that proposed by Ahrens and Dieter (1974). It corresponds to using as a dominating probability density function

$$g(x) = \begin{cases} \dfrac{x^{a-1}}{k^{a-1}\left(\frac{k}{a} + e^{-k}\right)}, & 0 \le x \le k,\\[10pt] \dfrac{k^{a-1}e^{-x}}{k^{a-1}\left(\frac{k}{a} + e^{-k}\right)}, & x > k. \end{cases} \tag{3.15}$$


Other distributions that have been used as dominating functions for the Gamma are the Cauchy (Ahrens and Dieter), the Laplace (Tadakamalla), the exponential (Fishman), the Weibull, the relocated and scaled t distribution with 2 degrees of freedom (Best), a combination of a normal density (left part) and an exponential density (right part) (Ahrens and Dieter), and a mixture of two Erlang distributions (Gamma with integral shape parameter $a$).

Best's algorithm generates a Student's $t_2$ variate as

$$Y = \frac{2(U - 1/2)}{\sqrt{2U(1-U)}} \tag{3.16}$$

where $U \sim U[0,1]$. Then $Y$ has the Student's t distribution with 2 degrees of freedom, having probability density function

$$g(y) = \frac{1}{(2 + y^2)^{3/2}}. \tag{3.17}$$

We then generate a random variable $X = (a-1) + Y\sqrt{3a/2 - 3/8}$ and apply a rejection step to $X$ to produce a Gamma random variable. See Devroye (p. 408) for details.
Most of the above algorithms are reasonably efficient only for $a > 1$, with the one main exception being the combination of a power of $x$ and an exponential density suggested by Ahrens and Dieter above. Cheng and Feast (1979) also suggest a ratio-of-uniforms algorithm for the gamma distribution, $a > 1$.

A final fast and simple procedure for generating a gamma variate with $a > 1$ is due to Marsaglia and Tsang (2000) and generates a gamma variate as the cube of a suitably scaled normal. Given a fast generator of the normal to machine precision, this is a highly efficient rejection technique. We put $d = a - \frac{1}{3}$ and generate a standard normal random variable $X$ and a uniform variate $U$ until, with $V = (1 + \frac{X}{\sqrt{9d}})^3$, the following inequality holds:

$$\ln(U) < \frac{X^2}{2} + d - dV + d\ln(V).$$

When this inequality is satisfied, we accept the value $d \times V$ as obtained from the Gamma$(a, 1)$ distribution. As usual, multiplication by $b$ results in a Gamma$(a, b)$ random variable. The efficiency of this algorithm appears to be very high (above 96% for $a > 1$).
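The Marsaglia-Tsang rejection loop can be sketched as follows (Python for illustration; the guard against a non-positive cube is standard practice, but the code itself is our own):

```python
import math
import random

def gamma_marsaglia_tsang(a, b, rng):
    """Gamma(a, b) variate for a > 1 (Marsaglia-Tsang): accept d*V,
    with V the cube of a scaled normal, when
    ln(U) < X^2/2 + d - d*V + d*ln(V)."""
    if a <= 1.0:
        raise ValueError("this sketch assumes a > 1")
    d = a - 1.0 / 3.0
    c = 1.0 / math.sqrt(9.0 * d)
    while True:
        x = rng.gauss(0.0, 1.0)
        v = (1.0 + c * x) ** 3
        if v <= 0.0:
            continue  # the cube must be positive before taking ln(V)
        u = rng.random()
        if math.log(u) < 0.5 * x * x + d - d * v + d * math.log(v):
            return b * d * v
```

In line with the quoted efficiency, the loop rarely iterates more than once for moderate $a$.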
In the case $0 < a < 1$, Stuart's theorem below allows us to modify a Gamma variate with $a > 1$ to one with $a < 1$. We leave the proof of the theorem as an exercise.

Theorem 26 (Stuart) Suppose $U$ is uniform$[0,1]$ and $X$ is Gamma$(a+1, 1)$, independent of $U$. Then $XU^{1/a}$ has a Gamma$(a, 1)$ distribution.

The Matlab function gamrnd uses Best's algorithm and acceptance-rejection for $\alpha > 1$. For $\alpha < 1$, it uses Johnk's generator, which is based on the following theorem.


Theorem 27 (Johnk) Let $U$ and $V$ be independent uniform$[0,1]$ random variables. Then the conditional distribution of

$$X = \frac{U^{1/\alpha}}{U^{1/\alpha} + V^{1/(1-\alpha)}}$$

given that the denominator satisfies $U^{1/\alpha} + V^{1/(1-\alpha)} < 1$ is Beta$(\alpha, 1-\alpha)$.

Multiplying this beta random variable by an independent exponential(1) variate results in a Gamma$(\alpha, 1)$ random variable.
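A sketch of Johnk's generator for $0 < \alpha < 1$, combining the rejection step of Theorem 27 with the independent exponential(1) multiplier (Python for illustration, names assumed):

```python
import math
import random

def johnk_gamma(alpha, rng):
    """Gamma(alpha, 1) variate for 0 < alpha < 1: a Beta(alpha, 1-alpha)
    variate from Johnk's rejection step, multiplied by an independent
    exponential(1) variate."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("this sketch assumes 0 < alpha < 1")
    while True:
        x = rng.random() ** (1.0 / alpha)
        y = rng.random() ** (1.0 / (1.0 - alpha))
        if 0.0 < x + y <= 1.0:
            break
    beta = x / (x + y)                       # Beta(alpha, 1 - alpha)
    return beta * -math.log(rng.random())    # times exponential(1)
```

The rejection step is cheap here because for $\alpha < 1$ the acceptance probability $\Gamma(\alpha+1)\Gamma(2-\alpha)/\Gamma(2)$ stays bounded away from zero.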
Toward generating the beta distribution, use of Theorem 25 and the variable $Z = \frac{X_1}{X_1+X_2}$ with $X_1, X_2$ independent gamma variates is one method of using a gamma generator to produce beta variates, and this is highly competitive as long as the gamma generator is reasonably fast. The Matlab generator is betarnd(a,b,1,n). Alternatives are, as with the gamma density, rejection from a Burr XII density (Cheng, 1978) and use of the following theorem as a generator (due to Johnk). This is a more general version of the theorem above.


Theorem 28 (Beta distribution) Suppose $U, V$ are independent uniform$[0,1]$ variates. Then the conditional distribution of

$$X = \frac{U^{1/a}}{U^{1/a} + V^{1/b}} \tag{3.18}$$

given that $U^{1/a} + V^{1/b} \le 1$ is Beta$(a, b)$. Similarly the conditional distribution of $U^{1/a}$ given that $U^{1/a} + V^{1/b} \le 1$ is Beta$(a+1, b)$.

Proof. Define a change of variables

$$X = \frac{U^{1/a}}{U^{1/a} + V^{1/b}}, \quad Y = U^{1/a} + V^{1/b},$$

or $U = (YX)^a$ and $V = [(1-X)Y]^b$, so that the joint probability density function of $(X, Y)$ is given by

$$f_{X,Y}(x, y) = f_{U,V}\left((yx)^a, [(1-x)y]^b\right)\left|\begin{matrix} \frac{\partial u}{\partial x} & \frac{\partial u}{\partial y} \\ \frac{\partial v}{\partial x} & \frac{\partial v}{\partial y} \end{matrix}\right| = ab\, y^{a+b-1} x^{a-1}(1-x)^{b-1}$$

provided either ($0 < x < 1$ and $y < 1$) or ($1 - \frac{1}{y} < x < \frac{1}{y}$ and $1 < y < 2$).

Notice that in the case $y < 1$, the range of values of $x$ is the unit interval and does not depend on $y$, and so the conditional probability density function of $X$ given $Y = y$ is a constant times $x^{a-1}(1-x)^{b-1}$, i.e. is the Beta$(a, b)$ probability density function. The rest of the proof is similar.
A generator exploiting this theorem produces pairs $(U, V)$ until the condition is satisfied and then transforms to the variable $X$. However, the probability that the condition is satisfied is $\frac{\Gamma(a+1)\Gamma(b+1)}{\Gamma(a+b+1)}$, which is close to 0 unless $a, b$ are small, so this procedure should be used only for small values of both parameters. Theorems 25 and 28 together provide an algorithm for generating Gamma variates with non-integral $a$ from variates with integral ones. For example if $X$ is Gamma$(4, 1)$ and $Z$ is an independent Beta$(3.4, 0.6)$, then $XZ$ is Gamma$(3.4, 1)$.

There are various other continuous distributions commonly associated with statistical problems. For example the Student's t-distribution with $\nu$ degrees of freedom is defined as a ratio $Z\sqrt{\frac{\nu}{X}}$ where $Z$ is standard normal and $X$ is Gamma$(\frac{\nu}{2}, 2)$. Alternatively, we may use $\sqrt{\nu}\,\frac{X - 1/2}{\sqrt{X(1-X)}}$ where $X$ is generated as a symmetric Beta$(\nu/2, \nu/2)$ variate.

Example 29 (some alternatives to lognormal distribution)

The assumption that stock prices, interest rates, or exchange rates follow a lognormal distribution is a common exercise in wishful thinking. The lognormal distribution provides a crude approximation to many financial time series, but other less theoretically convenient families of distributions sometimes provide a better approximation. There are many possible alternatives, including the Student's t distribution and the stable family of distributions discussed later. Suppose, for the present, we modify the usual normal assumption for stock returns slightly by assuming that the log of the stock price has a distribution "close" to the normal but with somewhat more weight in the tails of the distribution. Specifically, assume that under the $Q$ measure, $S_T = S_0 \exp\{\mu + cX\}$ where $X$ has cumulative distribution function $F(x)$. Some constraint must be placed on the constant $c$ if we are to compare the resulting prices with the Black-Scholes model, and it is natural to require that both models have identical volatility, or identical variance of returns. Since the variance of the return in the Black-Scholes model over a period of length $T$ is $\sigma^2 T$, where $\sigma$ is the annual volatility, we therefore require that

$$\mathrm{var}(cX) = \sigma^2 T \quad \text{or} \quad c = \sqrt{\frac{\sigma^2 T}{\mathrm{var}(X)}}.$$

The remaining constraint, required of all option pricing measures, is the martingale constraint: the discounted asset price is a martingale, and in consequence

$$e^{-rT} E_Q S_T = S_0. \tag{3.19}$$

Letting the moment generating function of $X$ be

$$m(s) = E e^{sX},$$

the constraint (3.19) becomes

$$e^{\mu - rT} m(c) = 1$$

and solving for $\mu$, we obtain

$$\mu = rT - \ln(m(c)).$$

Provided that we can generate from the cumulative distribution function of $X$, the price of a call option with strike price $K$ under this returns distribution can be estimated from $N$ simulations by the average discounted return from $N$ options,

$$e^{-rT}\frac{1}{N}\sum_{i=1}^N (S_{Ti} - K)^+ = e^{-rT}\frac{1}{N}\sum_{i=1}^N (S_0 e^{\mu + cX_i} - K)^+ = e^{-rT}\frac{1}{N}\sum_{i=1}^N (S_0 e^{rT - \ln(m(c)) + cX_i} - K)^+$$
$$= \frac{1}{N}\sum_{i=1}^N \left(S_0\frac{e^{cX_i}}{m(c)} - e^{-rT}K\right)^+.$$

A more precise calculation is the difference between the option price in this case and the comparable case of normally distributed returns. Suppose we use inverse transform together with a uniform$[0,1]$ variate to generate both the random variable $X_i = F^{-1}(U_i)$ and the corresponding normal return $Z_i = rT + \sigma\sqrt{T}\,\Phi^{-1}(U_i)$. Then the difference is estimated by

$$\text{option price under } F - \text{option price under } \Phi \simeq \frac{1}{N}\sum_{i=1}^N \left\{\left(S_0\frac{e^{cF^{-1}(U_i)}}{m(c)} - e^{-rT}K\right)^+ - \left(S_0 e^{\sigma\sqrt{T}\Phi^{-1}(U_i) - \sigma^2 T/2} - e^{-rT}K\right)^+\right\}.$$

If necessary, in case the moment generating function of $X$ is unknown, we can estimate it and the variance of $X$ using sample analogues over a large number $N$ of simulations. In this case $c$ is estimated by

$$\sqrt{\frac{\sigma^2 T}{\widehat{\mathrm{var}}(X)}}$$

with $\widehat{\mathrm{var}}$ representing the sample variance, and $m(c)$ estimated by

$$\frac{1}{N}\sum_{i=1}^N e^{cF^{-1}(U_i)}.$$
To consider a specific example, the logistic$(0, 0.522)$ distribution is close to the normal, except with slightly more weight in the tails. The scale parameter in this case was chosen so that the logistic has approximate unit variance. The cumulative distribution function is $F(x) = \frac{1}{1+\exp\{-x/b\}}$ and its inverse is $X = b\ln(U/(1-U))$. The moment generating function is $m(s) = \Gamma(1-bs)\Gamma(1+bs)$, $s < 1/b$. The following function was used to compare the price of a call option when stock returns have the logistic distribution (i.e. stock prices have the "loglogistic" distribution) with the prices in the Black-Scholes model.
function [re,op1,opbs]=diffoptionprice(n,So,strike,r,sigma,T)
% estimates the relative error in the BS option price and price under
% the logistic returns distribution. Runs n simulations.
u=rand(1,n);
x=log(u./(1-u));   % generates standard logistic
z=sigma*sqrt(T)*norminv(u)-sigma^2*T/2;
c=sigma*sqrt(T/var(x));
mc=mean(exp(c*x));
re=[]; op1=[]; opbs=[];
for i=1:length(strike)
  op1=[op1 mean(max(exp(c*x)*So/mc-exp(-r*T)*strike(i),0))];   % price under F
  opbs=[opbs mean(max(So*exp(z)-exp(-r*T)*strike(i),0))];      % price under BS
end
dif=op1-opbs;
re=[re dif./(dif+BLSPRICE(So,strike,r,T,sigma,0))];
plot(strike/So,re)
xlabel('Strike price/initial price')
ylabel('relative error in Black Scholes formula')




The relative error in the Black-Scholes formula obtained from a simulation of 100,000 is graphed in Figure 3.14. The logistic distribution differs only slightly

Figure 3.14: Relative Error in the Black-Scholes price when asset prices are loglogistic, σ = .4, T = .75, r = .05

from the standard normal, and the primary difference is in the larger kurtosis or weight in the tails. Indeed virtually any large financial data set will differ from the normal in this fashion; there may be some skewness in the distribution but there is often substantial kurtosis. How much difference does this slightly increased weight in the tails make in the price of an option? Note that the Black-Scholes formula overprices all of the options considered, by up to around 3%. The differences are quite small, however, and the Black-Scholes formula appears considerably robust, at least for this type of departure in the distribution of stock prices.


A change in the single line x=log(u./(1-u)) in the above function permits
revising the returns distribution to another alternative. For example we might
choose the double exponential or Laplace density

f(x) = (1/2) exp(−|x|)

for returns, by replacing this line by

x=(u<.5).*log(2*u)-(u>.5).*log(2*(1-u));

The resulting Figure 3.15 shows a similar behaviour but a more substantial
pricing error, in this case nearly 10% for an at-the-money option.
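The Laplace inverse transform works piecewise: x = log(2u) on the left half of the distribution and x = −log(2(1 − u)) on the right. A quick Python check (illustrative; np.where plays the role of the Matlab logical masks):

```python
import numpy as np

rng = np.random.default_rng(2)
u = rng.uniform(size=200_000)
# Inverse CDF of the standard Laplace density f(x) = exp(-|x|)/2:
x = np.where(u < 0.5, np.log(2 * u), -np.log(2 * (1 - u)))

# Standard Laplace: mean 0, variance 2, median 0.
print(abs(x.mean()) < 0.05)
print(abs(x.var() - 2.0) < 0.1)
print(abs((x < 0).mean() - 0.5) < 0.01)
```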




Figure 3.15: Relative pricing error in the Black-Scholes formula when returns
follow the Laplace distribution



Another possible distribution of stock returns, which can be used to introduce
some skewness to the returns distribution, is the loggamma or extreme value
distribution, whose probability density function takes the form

f(x) = (1/Γ(a)) exp{−e^(x−c) + (x − c)a},   −∞ < x < ∞.

We can generate such a distribution as follows. Suppose Y is a random
variable with gamma(a, e^c) distribution and probability density function

g(y) = (y^(a−1) e^(−ca) / Γ(a)) e^(−y e^(−c))

and define X = ln(Y). Then X has probability density function

f(x) = g(e^x) |de^x/dx| = (1/Γ(a)) exp{x(a − 1) − ca − e^(x−c)} e^x
     = (1/Γ(a)) exp{−e^(x−c) + (x − c)a},   −∞ < x < ∞.
As an example, in Figure 3.16 we plot this density in the case a = 2, c = 0. This
distribution is negatively skewed, a typical characteristic of risk-neutral distri-
butions of returns. The large left tail in the risk-neutral distribution of returns
reflects the fact that investors have an aversion to large losses and consequently
the risk-neutral distribution inflates the left tail.
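Generation via X = ln(Y) is straightforward to check numerically. The Python sketch below (illustrative) samples Y from a gamma distribution, takes logarithms, and verifies the known moments E[ln Y] = ψ(a) + c and Var[ln Y] = ψ′(a), together with the negative skewness noted above:

```python
import numpy as np
from scipy.special import digamma, polygamma

rng = np.random.default_rng(3)
a, c = 2.0, 0.0
y = rng.gamma(shape=a, scale=np.exp(c), size=400_000)  # Y ~ gamma(a, e^c)
x = np.log(y)                                          # X = ln(Y), loggamma

# For a = 2, c = 0: E[X] = psi(2) ~ 0.4228, Var[X] = psi'(2) ~ 0.6449.
print(abs(x.mean() - digamma(a)) < 0.01)
print(abs(x.var() - polygamma(1, a)) < 0.01)
print(((x - x.mean())**3).mean() < 0)   # negatively skewed
```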




Figure 3.16: The probability density function e^(−e^x + 2x)




Introducing location and scale parameters c and ν, the probability density
function of X = (ln(Y) + c)/ν, where Y has a Gamma(2,1) distribution, is

f(x) = ν exp{−e^(νx−c) + 2(νx − c)}.

The mean is approximately 0 and the variance approximately σ² when we choose
c = −.42278 and ν = .80308/σ, and so this distribution is analogous to the

standard normal. However, the skewness is −0.78 and this negative skewness is
more typical of risk-neutral distributions of stock returns. We might ask whether
the Black-Scholes formula is as robust to the introduction of skewness in the
returns distribution as it is to the somewhat heavier tails of the logistic distri-
bution. For comparison with the Black-Scholes model we permitted adding a
constant and multiplying the returns by a constant which, in this case, is equiv-
alent to assuming under the risk-neutral distribution that

S_T = S_0 e^α Y^ν,   where Y is Gamma(2,1)

and the constants α and ν are chosen so that the martingale condition is
satisfied and the variance of returns matches that in the lognormal case. With
some integration we can show that this results in the equations

α = −ln(E(Y^ν)) = −ln(Γ(2 + ν))

ν² var(ln(Y)) = ν² ψ′(2) = σ² T

where ψ′(α) is the trigamma function, defined as the second derivative of
ln(Γ(α)) and evaluated fairly easily using the series ψ′(α) = Σ_{k=0}^∞ 1/(k + α)².
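The series for ψ′(α) does converge slowly on its own, but a simple integral-tail correction makes it practical; a short check in Python (the function name and tail correction are illustrative, not from the text):

```python
import math

def trigamma(alpha, terms=200_000):
    """Trigamma via the series sum_{k>=0} 1/(k+alpha)^2,
    plus an integral-tail correction ~ 1/(terms+alpha)."""
    s = sum(1.0 / (k + alpha) ** 2 for k in range(terms))
    return s + 1.0 / (terms + alpha)

# Known values: psi'(1) = pi^2/6 and psi'(2) = pi^2/6 - 1 ~ 0.6449.
print(abs(trigamma(1) - math.pi**2 / 6) < 1e-9)
print(abs(trigamma(2) - 0.64493) < 1e-5)
```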

For the special cases required here, ψ′(2) ≈ .6449, so ν ≈ σ√T/.8031 and
α = −log(Γ(2 + σ√T/.8031)). Once again, replacing the one line marked with
a * in the function diffoptionprice by x=log(gaminv(u,2,1)); permits determining
the relative error in the Black-Scholes formula. There is a more significant pric-
ing error in the Black-Scholes formula now, more typical of the relative pricing
error that is observed in practice. Although the graph can be shifted and tilted
somewhat by choosing different variance parameters, the shape appears to be a
consequence of assuming a symmetric normal distribution for returns when the
actual risk-neutral distribution is skewed. It should be noted that the practice
of obtaining implied volatility parameters from options with similar strike prices
and maturities is a partial, though not a complete, remedy to the substantial
pricing errors caused by using a formula derived from a frequently ill-fitting
Black-Scholes model.

Figure 3.17: Relative error in the Black-Scholes formula when asset returns
follow the extreme value distribution


The Symmetric Stable Laws

A final family of distributions of increasing importance in modelling is the stable
family of distributions. A cumulative distribution function F is stable if, when-
ever two independent random variables X1 and X2 each have cumulative dis-
tribution function F(x), the sum X1 + X2 also has cumulative distribution
function F after a change of location and scale. More generally, the cumulative
distribution function F of independent random variables X1, X2 is said to be
stable if for each pair of constants a and b, there exist constants c and m such
that

(a X1 + b X2 − m)/c

has the same cumulative distribution function F. A stable random variable X
is most easily characterized through its characteristic function
E e^(iuX) = exp(iuθ − |u|^α c^α (1 − iβ(sign u) tan(πα/2)))   for α ≠ 1
E e^(iuX) = exp(iuθ − |u| c (1 + iβ(sign u)(2/π) ln|u|))      if α = 1


where i is the complex number with i² = −1, θ is a location parameter of the
distribution, and c is a scale parameter. The parameter 0 < α ≤ 2 is the index
of the stable distribution and governs the tail behaviour, and β ∈ [−1, 1] governs
the skewness of the distribution. In the case β = 0, we obtain the symmetric
stable family of distributions, all unimodal densities, symmetric about their
mode, and roughly similar in shape to the normal or Cauchy distributions (both
special cases). They are of considerable importance in finance as an alternative
to the normal distribution, in part because they tend to fit observations better in
the tail of the distribution than does the normal, and in part because they enjoy
theoretical properties similar to those of the normal family: sums of independent
stable random variables are stable. Unfortunately, this is a more complicated
family of densities to work with; neither the density function nor the cumulative

distribution function can be expressed in a simple closed form. Both require a
series expansion. The parameter 0 < α ≤ 2 indicates what moments exist.
Except in the special case α = 2 (the normal distribution) or the case β = −1,
moments of order less than α exist while moments of order α or more do not.
This is easily seen because the tail behaviour is, when α < 2,

lim_{x→∞} x^α P[X > x] = K_α ((1 + β)/2) c^α
lim_{x→∞} x^α P[X < −x] = K_α ((1 − β)/2) c^α

for a constant K_α depending only on α. Of course, for the normal distribution,
moments of all orders exist. The stable laws are useful for modelling in situ-
ations in which variates are thought to be approximately normalized sums of
independent identically distributed random variables. To determine robustness
against heavy-tailed departures from the normal distribution, tests and estima-
tors can be computed with data simulated from a symmetric stable law with
α near 2. The probability density function does not have a simple closed form
except in the cases α = 1 (Cauchy) and α = 2 (normal), but can be expressed
as a series expansion of the form

f_c(x) = (1/(παc)) Σ_{k=0}^∞ (−1)^k (Γ((2k + 1)/α)/(2k)!) (x/c)^(2k)

where c is the scale parameter (and we have assumed the mode is at 0). Espe-
cially for large values of x, this series converges extremely slowly. However,
Small (2003) suggests using an Euler transformation to accelerate the conver-
gence of this series, and this appears to provide enough of an improvement in
the convergence to meet a region in which a similar tail formula (valid for large
x) provides a good approximation. According to Chambers, Mallows and Stuck
(1976), when 1 < α < 2, such a variate can be generated as

X = c sin(αU) [cos(U(1 − α))/E]^((1/α)−1) (cos U)^(−1/α)        (3.20)

where U is uniform on [−π/2, π/2] and E, a standard exponential variate, is
independent of U. The case α = 1 gives X = tan(U), the Cauchy. It is easy to see that the

Cauchy distribution can also be obtained by taking the ratio of two independent
standard normal random variables, so tan(U) may be replaced by Z1/Z2 for
independent standard normal random variables Z1, Z2 produced by Marsaglia's
polar algorithm. Equivalently, we generate X = V1/V2 where Vi ∼ U[−1, 1],
conditional on V1² + V2² ≤ 1, to produce a standard Cauchy variate X.
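This ratio construction is easy to verify against the Cauchy CDF F(x) = 1/2 + arctan(x)/π. The Python sketch below (function name and rejection-batch sizing are illustrative) generates (V1, V2) uniformly on the unit disk by rejection from the square and checks two quantiles:

```python
import numpy as np

def cauchy_polar(n, rng=None):
    """Standard Cauchy variates as V1/V2 with (V1, V2) uniform
    on the unit disk, generated by rejection from the square."""
    rng = rng or np.random.default_rng(4)
    out = np.empty(0)
    while out.size < n:
        v = rng.uniform(-1, 1, size=(2, 2 * n))
        keep = v[0]**2 + v[1]**2 <= 1          # accept points inside the disk
        out = np.concatenate([out, v[0, keep] / v[1, keep]])
    return out[:n]

x = cauchy_polar(200_000)
# Cauchy CDF: F(0) = 1/2 and F(1) = 3/4.
print(abs(np.median(x)) < 0.02)
print(abs((x < 1).mean() - 0.75) < 0.01)
```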


Example: Stable random walk.

A stable random walk may be used to model a stock price, but the closest analogy
to the Black-Scholes model would be a logstable process St under which the
distribution of ln(St) has a symmetric stable distribution. Unfortunately, this
specification renders impotent many of our tools of analysis, since except in
the case α = 2 or the case β = −1, such a stock price process St has no
finite moments at all. Nevertheless, we may attempt to fit stable laws to the
distribution of ln(St) for a variety of stocks, and except in the extreme tails,
symmetric stable laws with index α ≈ 1.7 often provide a reasonably good fit.
To see what such a returns process looks like, we generate a random walk with
10,000 time steps where each increment is an independent symmetric stable
random variable with parameter α = 1.7. The following Matlab function was
used:
function s=stabrnd(a,n)
u=(unifrnd(0,1,n,1)*pi)-.5*pi;
e=exprnd(1,n,1);
s=sin(a*u).*(cos((1-a)*u)./e).^(1/a-1).*(cos(u)).^(-1/a);

Then the command

plot(1:10000, cumsum(stabrnd(1.7,10000)));

resulted in Figure 3.18. Note the occasional very large jumps which dominate
the history of the process up to that point, typical of random walks generated
from stable distributions with α < 2.


Figure 3.18: A Symmetric Stable Random Walk with index α = 1.7
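The same generator translates directly to Python; the sketch below (an illustrative port of the Matlab stabrnd, with c = 1) implements the Chambers-Mallows-Stuck formula (3.20) and confirms the heavy tails: for α = 1.7, P[X > 5] is orders of magnitude larger than the normal tail probability P[Z > 5] ≈ 3 × 10⁻⁷.

```python
import numpy as np

def stabrnd(a, n, seed=0):
    """Symmetric stable variates with index a and scale c = 1,
    via the Chambers-Mallows-Stuck formula (3.20)."""
    rng = np.random.default_rng(seed)
    u = rng.uniform(-np.pi / 2, np.pi / 2, size=n)   # U ~ Uniform[-pi/2, pi/2]
    e = rng.exponential(1.0, size=n)                 # E ~ standard exponential
    return (np.sin(a * u)
            * (np.cos((1 - a) * u) / e) ** (1 / a - 1)
            * np.cos(u) ** (-1 / a))

x = stabrnd(1.7, 500_000)
print(abs(np.median(x)) < 0.05)   # symmetric about 0
print((x > 5).mean() > 1e-4)      # tail far heavier than the normal tail
```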



The Normal Inverse Gaussian Distribution

There is a very substantial body of literature indicating that the normal
distribution assumption for returns is a poor fit to data, in part because the
observed area in the tails of the distribution is much greater than the normal
distribution permits. One possible remedy is to assume an alternative distribu-
tion for these returns which, like the normal distribution, is infinitely divisible,
but which has more area in the tails. A good fit to some stock and interest-rate
data has been achieved using the Normal Inverse Gaussian (NIG) distribution
(see for example Prause, 1999). To motivate this family of distributions, let us
suppose that stock returns follow a Brownian motion process, but with respect
to a random time scale possibly dependent on volume traded and other exter-
nal factors independent of the Brownian motion itself. After one day, say, the
return on the stock is the value of the Brownian motion process at a random
time, τ, independent of the Brownian motion. Assume that this random time

has the Inverse Gaussian distribution having probability density function

g(t) = (θ/(c√(2πt³))) exp{−(θ − t)²/(2c²t)}        (3.21)

for parameters θ > 0, c > 0. This is the distribution of a first-passage time
for Brownian motion. In particular, consider a Brownian motion process B(t)
having drift 1 and diffusion coefficient c. Such a process is the solution to the
stochastic differential equation

dB(t) = dt + c dW(t),   B(0) = 0.

Then the first passage of the Brownian motion to the level θ is T = inf{t; B(t) =
θ} and this random variable has probability density function (3.21). The mean
of such a random variable is θ and its variance is θc². These can be obtained
from the moment generating function of the distribution with probability density
function (3.21),

g*(s) = exp{−θ(−1 + √(1 − 2sc²))/c²}.
Expanding this locally around c = 0 we obtain

g*(s) = exp{θs + (1/2)θs²c² + O(c⁴)}

and by comparing this with the moment generating function of the normal
distribution, as c → 0, the distribution of

(T − θ)/(c√θ)

approaches the standard normal or, more loosely, the distribution (3.21) ap-
proaches Normal(θ, θc²).
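These moment facts can be checked numerically. The Python sketch below uses scipy's invgauss; the translation of (3.21) into scipy's (mu, scale) parameterization (mu = c²/θ, scale = θ²/c², so that the mean is θ and the shape is θ²/c²) is our own mapping, not from the text:

```python
import numpy as np
from scipy import stats

theta, c = 1.0, 0.1   # illustrative parameter values

# Density (3.21) as a scipy frozen distribution:
T = stats.invgauss(mu=c**2 / theta, scale=theta**2 / c**2)

print(abs(T.mean() - theta) < 1e-10)          # mean = theta
print(abs(T.var() - theta * c**2) < 1e-10)    # variance = theta*c^2

# For small c, (T - theta)/(c*sqrt(theta)) is approximately N(0,1):
z = (T.rvs(size=200_000, random_state=5) - theta) / (c * np.sqrt(theta))
print(abs(z.mean()) < 0.01 and abs(z.std() - 1) < 0.02)
```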


Lemma 30 Suppose X(t) is a Brownian motion process with drift β and dif-
fusion coefficient 1, hence satisfying

dX_t = β dt + dW_t,   X(0) = µ.

Suppose a random variable T has probability density function (3.21) and is in-
dependent of X_t. Then the probability density function of the randomly stopped
Brownian motion process is given by

f(x; α, β, δ, µ) = (αδ/π) exp(δ√(α² − β²) + β(x − µ)) K₁(α√(δ² + (x − µ)²)) / √(δ² + (x − µ)²)     (3.22)

with

δ = θ/c   and   α = √(β² + 1/c²)

and the function K_λ(x) is the modified Bessel function of the second kind, de-
fined by

K_λ(x) = (1/2) ∫₀^∞ y^(λ−1) exp(−(x/2)(y + y^(−1))) dy,   for x > 0.
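As a numerical sanity check on (3.22) (our own verification sketch, not part of the text), we can implement the density with scipy's Bessel function K₁ and compare it with simulation of the randomly stopped process X = µ + βT + √T Z, where T is inverse Gaussian with θ = δ/√(α² − β²) and c = 1/√(α² − β²); the scipy invgauss mapping (mu = 1/(δγ), scale = δ²) is an assumption to verify against the scipy documentation:

```python
import numpy as np
from scipy import stats
from scipy.special import k1   # modified Bessel function K_1

alpha, beta, delta, mu = 2.0, 0.5, 1.0, 0.0   # illustrative parameters
gamma = np.sqrt(alpha**2 - beta**2)

def nig_pdf(x):
    """The NIG density (3.22)."""
    s = np.sqrt(delta**2 + (x - mu)**2)
    return (alpha * delta / np.pi
            * np.exp(delta * gamma + beta * (x - mu)) * k1(alpha * s) / s)

# The density integrates to 1 (Riemann sum; tails beyond [-15, 15] are negligible):
xs = np.linspace(-15, 15, 60_001)
mass = nig_pdf(xs).sum() * (xs[1] - xs[0])
print(abs(mass - 1) < 1e-4)

# Mixture simulation: X = mu + beta*T + sqrt(T)*Z, T inverse Gaussian.
T = stats.invgauss(mu=1 / (delta * gamma), scale=delta**2).rvs(400_000, random_state=7)
Z = np.random.default_rng(7).standard_normal(T.size)
x_sim = mu + beta * T + np.sqrt(T) * Z
print(abs(x_sim.mean() - (mu + beta * delta / gamma)) < 0.02)   # NIG mean
```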

Proof. The distribution of the randomly stopped variable X(T) is the same
as that of the random variable

X = µ + βT + √T Z

where Z is N(0, 1), independent of T. Conditional on the value of T, the prob-
ability density function of X is

f(x|T) = (1/√(2πT)) exp(−(x − µ − βT)²/(2T))

and so the unconditional density of X is given by

∫₀^∞ (1/√(2πt)) exp(−(x − µ − βt)²/(2t)) (θ/(c√(2πt³))) exp(−(θ − t)²/(2c²t)) dt

= (θ/(2πc)) ∫₀^∞ t^(−2) exp(−(x − µ − βt)²/(2t) − (θ − t)²/(2c²t)) dt

= (θ/(2πc)) ∫₀^∞ t^(−2) exp(−(1/(2t))((x − µ)² + θ²/c²) + β(x − µ) + θ/c² − (t/2)(β² + 1/c²)) dt

= (θ/(2πc)) exp(β(x − µ) + θ/c²) ∫₀^∞ t^(−2) exp(−(1/(2t))((x − µ)² + θ²/c²) − (t/2)(β² + 1/c²)) dt
