= re’r /2

, for r > 0.

1

and ˜ has probability density function f˜ (θ) = 2π for 0 < θ < 2π. Since

p

r = r(x, y) = x2 + y 2 and θ(x, y) = arctan(y/x), the Jacobian of the trans-

formation is

¯ ¯

¯ ‚r ‚r ¯

¯ ¯

‚(r, θ)

| = ¯ ‚x ‚y ¯

| ¯ ‚θ ‚θ ¯

‚(x, y) ¯ ‚x ‚y ¯

¯ ¯

¯ ¯

¯√x ¯

y

√2 2

¯ ¯

2 2

= ¯ x +y x +y

¯

¯ ¯

¯ x2’y 2 x

¯

x2 +y 2

+y

1

=p

x2 + y 2

Consequently the joint probability density function of (X, Y ) is given by

p p

‚(r, θ) 1 1

2 2

— x2 + y 2 e’(x +y )/2 — p

f˜ (arctan(y/x))fR ( x2 + y 2 )| |=

‚(x, y) 2π x2 + y 2

1 ’(x2 +y2 )/2

= e

2π

1 1

2 2

= √ e’x /2 √ e’y /2

2π 2π

and this is joint probability density function of two independent standard nor-

mal random variables.

GENERATING RANDOM NUMBERS FROM NON-UNIFORM CONTINUOUS DISTRIBUTIONS131

The tails of the distribution of the pseudo-random numbers produced by the

Box-Muller method are quite sensitive to the granularity of the uniform gener-

ator. For this reason although the Box-Muller is the simplest normal generator

it is not the method of choice in most software. A related alternative algorithm

for generating standard normal variates is the Marsaglia polar method. This

is a modi¬cation of the Box-Muller generator designed to avoid the calculation

of sin or cos. Here we generate a point (Z1 , Z2 )from the uniform distribution

on the unit circle by rejection, generating the point initially from the square

’1 · z1 · 1, ’1 · z2 · 1 and accepting it when it falls in the unit circle or

2 2

if z1 + z2 · 1. Now suppose that the points (Z1 , Z2 ) is uniformly distributed

inside the unit circle. Then for r > 0,

q

P [ ’2 log(Z1 + Z2 ) · r] = P [Z1 + Z2 ≥ exp(’r2 /2)]

2 2

2 2

1 ’ area of a circle of radius exp(’r2 /2)

area of a circle of radius 1

2

= 1 ’ e’r /2

.

This is exactly the same cumulative distribution function as that of the random

variable R in Theorem 21. It follows that we can replace R2 by ’2log(Z1 +Z2 ).

2 2

Similarly, if (Z1 , Z2 ) is uniformly distributed inside the unit circle then the

angle subtended at the origin by a line to the point (X, Y ) is random and

Z1

uniformly[0, 2π] distributed and so we can replace cos ˜, and sin ˜ by √ 2 2

Z1 +Z2

Z2

and √ respectively. The following theorem is therefore proved.

2 2

Z1 +Z2

Theorem 23 If the point (Z1 , Z2 ) is uniformly distributed in the unit circle

2 2

Z1 + Z2 · 1, then the pair of random variables de¬ned by

q

Z1

2 2

X = ’2log(Z1 + Z2 ) p 2 2

Z1 + Z2

q

Z2

2 2

Y = ’2log(Z1 + Z2 ) p 2 2

Z1 + Z2

are independent standard normal variables.

132 CHAPTER 3. BASIC MONTE CARLO METHODS

Figure 3.10: Marsaglia™s Method for Generating Normal Random Numbers

If we use acceptance-rejection to generate uniform random variables Z1 , Z2

inside the unit circle, the probability that a point generated inside the square

falls inside the unit circle is π/4,so that on average around 4/π ≈ 1.27 pairs of

uniforms are needed to generate a pair of normal variates.

The speed of the Marsaglia polar algorithm compared to that of the Box-

Muller algorithm depends on the relative speeds of generating uniform variates

versus the sine and cosine transformations. The Box-Muller and Marsaglia polar

method are illustrated in Figure 3.10:

Unfortunately the speed of these normal generators is not the only con-

sideration. If we run a linear congruential generator through a full period we

have seen that the points lie on a lattice, doing a reasonable job of ¬lling the

two dimensional rectangle. Transformations like (3.10) are highly non-linear

functions of (U, V ) stretching the space in some places and compressing it in

others. It would not be too surprising if, when we apply this transformation to

GENERATING RANDOM NUMBERS FROM NON-UNIFORM CONTINUOUS DISTRIBUTIONS133

Figure 3.11: Box Muller transformation applied to the output to xn =

97xn’1 mod 217

our points on a lattice, they do not provide the same kind of uniform coverage

of the space. In Figure 3.11 we see that the lattice structure in the output

from the linear congruential generator results in an interesting but alarmingly

non-normal pattern, particularly sparse in the tails of the distribution. Indeed,

if we use the full-period generator xn = 16807xn’1 mod (231 ’ 1) the smallest

possible value generated for y is around ’4.476 although in theory there should

be around 8,000 normal variates generated below this.

The normal random number generator in Matlab is called normrnd or for

standard normal randn. For example normrnd(µ, σ, m, n) generates a matrix of

m — n pseudo-independent normal variates with mean µ and standard devia-

tion σ and rand(m,n) generates an m — n matrix of standard normal random

numbers. A more precise algorithm is to use inverse transform and a highly

re¬ned rational approximation to normal inverse cumulative distribution func-

134 CHAPTER 3. BASIC MONTE CARLO METHODS

tion available from P.J. Acklam (2003). The Matlab implementation of this

inverse c.d.f. is called ltqnorm after application of a re¬nement, achieves full

machine precision. In R or Splus, the normal random number generator is called

rnorm. The inverse random number function in Excel has been problematic in

many versions. These problems appear to have been largely corrected in Excel

2002, although there is still signi¬cant error (roughly in the third decimal) in

the estimation of lower and upper tail quantiles. The following table provides

a comparison of the normsinv function in Excel and the Matlab inverse nor-

mal norminv. The “exact” values agree with the values generated by Matlab

norminv to the number of decimals shown.

p Excel 2002 Exact

10’1 -1.281551939 -1.281551566

10’2 -2.326347 -2.326347874

10’3 -3.090252582 -3.090232306

10’4 -3.719090272 -3.719016485

10’5 -4.265043367 -4.264890794

10’6 -4.753672555 -4.753424309

10’7 -5.199691841 -5.199337582

10’8 -5.612467211 -5.612001244

10’9 -5.998387182 -5.997807015

10’10 -6.362035677 -6.361340902

The Lognormal Distribution

If Z is a normal random variable with mean µ and variance σ 2 , then we say that

the distribution of X = eZ is lognormal with mean E(X) = · = exp{µ+σ2 /2} >

0 and parameter σ > 0. Because a lognormal random variable is obtained by

exponentiating a normal random variable it is strictly positive, making it a

reasonable candidate for modelling quantities such as stock prices, exchange

rates, lifetimes, though in a fools paradise in which stock prices and lifetimes

GENERATING RANDOM NUMBERS FROM NON-UNIFORM CONTINUOUS DISTRIBUTIONS135

are never zero. To determine the lognormal probability density function, notice

that

P (X · x] = P [eZ · x]

= P [Z · ln(x)]

ln(x) ’ µ

) with ¦ the standard normal c.d.f.

= ¦(

σ

and di¬erentiating to obtain the probability density function g(x|·, σ) of X, we

obtain

ln(x) ’ µ

d

g(x|·, σ) = ¦( )

dx σ

1

√ exp{’(ln(x) ’ µ)2 /2σ 2 }

=

xσ 2π

1

√ exp{’(ln(x) ’ ln(·) + σ 2 /2)2 /2σ 2 }

=

xσ 2π

A random variable with a lognormal distribution is easily generated by gen-

erating an appropriate normal random variable Z and then exponentiating. We

may use either the parameter µ, the mean of the random variable Z in the expo-

nent or the parameter ·, the expected value of the lognormal. The relationship

is not as simple as a naive ¬rst impression might indicate since

E(eZ ) 6= eE(Z) .

Now is a good time to accommodate to this correction factor of σ 2 /2 in the

exponent

2 2

· = E(eZ ) = eE(Z)+σ /2

= eµ+σ /2

or,

2

E(eZ’µ’σ /2

)=1

since a similar factor appears throughout the study of stochastic integrals and

mathematical ¬nance. Since the lognormal distribution is the one most often

used in models of stock prices, it is worth here recording some of its conditional

moments used in the valuation of options. In particular if X has a lognormal

136 CHAPTER 3. BASIC MONTE CARLO METHODS

2

distribution with mean · = eµ+σ /2

and volatility parameter σ, then for any p

and l > 0,

Z∞

1

E[X p I(X > l)] = √ xp’1 exp{’(ln(x) ’ µ)2 /2σ2 }dx

σ 2π l

Z∞

1

ezp exp{’(z ’ µ)2 /2σ2 }dz

=√

σ 2π ln(l)

pµ+p2 σ 2 /2 Z ∞

1

exp{’(z ’ ξ)2 /2σ 2 }dz where ξ = µ + σ 2 p

=√e

σ 2π ln(l)

ξ ’ ln(l)

22

= epµ+p σ /2 ¦( )

σ

σ2 1

= · exp{’ p(1 ’ p)}¦(σ’1 ln(·/l) + σ(p ’ ))

p

(3.11)

2 2

where ¦ is the standard normal cumulative distribution function.

Application: A Discrete Time Black-Scholes Model

Suppose that a stock price St , t = 1, 2, 3, ... is generated from an independent

sequence of returns Z1 , Z2 over non-overlapping time intervals. If the value of

the stock at the end of day t = 0 is S0 , and the return on day 1 is Z1 then the

value of the stock at the end of day 1 is S1 = S0 eZ1 . There is some justice in the

use of the term “return” for Z1 since for small values Z1 , S0 eZ1 ' S0 (1 + Z1 )

S1 ’S0

and so Z1 is roughly Assume similarly that the stock at the end of day

S1 .

i has value Si = Si’1 exp(Zi ). In general for a total of j such periods (suppose

Pj

there are n such periods in a year) we assume that Sj = S0 exp{ i=1 Zi } for

independent random variables Zi all have the same normal distribution. Note

that in this model the returns over non-overlapping independent periods of time

are independent. Denote var(Zi ) = σ2 /N so that

N

X

Zi ) = σ2

var(

i=1

represents the squared annual volatility parameter of the stock returns. Assume

that the annual interest rate on a risk-free bond is r so that the interest rate

per period is r/N .

GENERATING RANDOM NUMBERS FROM NON-UNIFORM CONTINUOUS DISTRIBUTIONS137

Recall that the risk-neutral measure Q is a measure under which the stock

price, discounted to the present, forms a martingale. In general there may be

many such measures but in this case there is only one under which the stock

P

price process has a similar lognormal representation Sj = S0 exp{ j Zi } for

i=1

independent normal random variables Zi . Of course under the risk neutral

measure, the normal random variables Zi may have a di¬erent mean. Full

justi¬cation of this model and the uniqueness of the risk-neutral distribution

really relies on the continuous time version of the Black Scholes described in

Section 2.6. Note that if the process

j

X r

’rt/N

Sj = S0 exp{ (Zi ’ )}

e

N

i=1

is to form a martingale under Q, it is necessary that

EQ [Sj+1 |Ht ] = Sj or

r r

EQ [Sj exp{Zj+1 ’ }|Hj ] = Sj EQ [exp{Zj+1 ’ }]

N N

= Sj

r

and so exp{Zj+1 ’ N} must have a lognormal distribution with expected value

1. Recall that, from the properties of the lognormal distribution,

σ2

r r

EQ [exp{Zt+1 ’ }] = exp{EQ (Zt+1 ) ’ }

+

N N 2N

σ2

since varQ (Zt+1 ) = In other words, for each i the expected value of Zi is,

N.

σ2

r

’

under Q, equal to 2N . So under Q, Sj has a lognormal distribution with

N

mean

S0 erj/N

p

and volatility parameter σ j/N .

Rather than use the Black-Scholes formula of Section 2.6, we could price a

call option with maturity j = N T periods from now by generating the random

path Si , i = 1, 2, ...j using the lognormal distribution for Sj and then averaging

138 CHAPTER 3. BASIC MONTE CARLO METHODS

the returns discounted to the present. The value at time j = 0 of a call option

with exercise price K is an average of simulated values of

T

X

’rj/N + ’rj/N

Zi } ’ K)+ ,

(Sj ’ K) = e

e (S0 exp{

i=1

with the simulations conducted under the risk-neutral measure Q with initial

stock price the current price S0 . Thus the random variables Zi are independent

σ2 σ2

r

N(N ’ The following Matlab function simulates the stock price over

2N , N ).

the whole period until maturity and then values a European call option on the

stock by averaging the discounted returns.

Example 24 (simulating the return from a call option)

Consider simulating a call option on a stock whose current value is S0 =

$1.00. The option expires in j days and the strike price is K = $1.00. We

assume constant spot (annual) interest rate r and the stock price follows a

lognormal distribution with annual volatility parameter σ. The following Matlab

function provides a simple simulation and graph of the path of the stock over

the life of the option and then outputs the discounted payo¬ from the option.

function z=plotlogn(r,sigma,T, K)

% outputs the discounted simulated return on expiry of a call option (per dollar

pv of stock).

% Expiry =T years from now, (T = j/N )

% current stock price=$1. (= S0 ), r = annual spot interest rate, sigma=annual

volatility (=σ),

% K= strike price.

N=250 ; % N is the assumed number of business days in a

year.

j=N*T; % the number of days to expiry

s = sigma/sqrt(N); % s is volatility per period

GENERATING RANDOM NUMBERS FROM NON-UNIFORM CONTINUOUS DISTRIBUTIONS139

Figure 3.12: One simulation of the return from a call option with strike price

$1.00

mn = r/N - s^2/2; % mn= mean of the normal increments per period

y=exp(cumsum(normrnd(mn,s,j,1)));

y=[1 y™]; % the value of the stock at times 0,...,

% the time points i

x = (0:j)/N;

plot(x,y,™-™,x,K*ones(1,j+1),™y™)

xlabel(™time (in years)™)

ylabel(™value of stock™)

title(™SIMULATED RETURN FROM CALL OPTION™)

z = exp(-r*T)*max(y(j+1)-K, 0); % payo¬ from option discounted to

present

Figure 3.12 resulted from one simulation run with r = .05, j = 63 (about 3

months), σ = .20, K = 1.

140 CHAPTER 3. BASIC MONTE CARLO METHODS

The return on this run was the discounted di¬erence between the terminal

value of the stock and the strike price or around 0.113. We may repeat this

many times, averaging the discounted returns to estimate the present value of

the option.

For example to value an at the money call option with exercise price=the

initial price of the stock=$1, 5% annual interest rate, 20% annual volatility

and maturity 0.25 years from the present, we ran this function 100 times and

averaged the returns to estimate the option price as 0.044978. If we repeat the

identical statement, the output is di¬erent, for example option val= 0.049117

because each is an average obtained from only 100 simulations. Averaging over

more simulations would result in greater precision, but this function is not

written with computational e¬ciency in mind. We will provide more e¬cient

simulations for this problem later. For the moment we can compare the price of

this option as determined by simulation with the exact price according to the

Black-Scholes formula. This formula was developed in Section 2.6. The price of

a call option at time t = 0 given by

V (ST , T ) = ST ¦(d1 ) ’ Ke’rT /N ¦(d2 )

where

σ2 σ2

log(ST /K) + (r ’

log(ST /K) + (r + 2 )T /N 2 )T /N

p p

and d2 =

d1 =

σ T /N σ T /N

and the Matlab function which evaluates this is the function blsprice which gives,

in this example, and exact price on entering [CALL,PUT] =BLSPRICE(1,1,.05,63/250,.2,0)

which returns the value CALL=0.0464. With these parameters, 4.6 cents on

the dollar allows us to lock in any anticipated pro¬t on the price of a stock (or

commodity if the lognormal model ¬ts) for a period of about three months. The

fact that this can be done cheaply and with ease is part of the explanation for

the popularity of derivatives as tools for hedging.

GENERATING RANDOM NUMBERS FROM NON-UNIFORM CONTINUOUS DISTRIBUTIONS141

Figure 3.13: Comparison between the Lognormal and the Gamma densities

Algorithms for Generating the Gamma and Beta Distribu-

tions

We turn now to algorithms for generating the Gamma distribution with density

xa’1 e’x/b

, for x > 0, a > 0, b > 0. (3.12)

f (x|a, b) =

“(a)ba

The exponential distribution (a = 1) and the chi-squared (corresponding to

a = ν/2, b = 2, for ν integer) are special cases of the Gamma distribution.

The gamma family of distributions permits a wide variety of shapes of density

functions and is a reasonable alternative to the lognormal model for positive

quantities such as asset prices. In fact for certain parameter values the gamma

density function is very close to the lognormal. Consider for example a typical

lognormal random variable with mean · = 1.1 and volatility σ = 0.40.

The probability density functions can be quite close as in Figure ??. Of

course the lognormal, unlike the gamma distribution, has the additional attrac-

tive feature that a product of independent lognormal random variables also has

142 CHAPTER 3. BASIC MONTE CARLO METHODS

a lognormal distribution.

Another common distribution closely related to the gamma is the Beta dis-

tribution with probability density function de¬ned for parameters a, b > 0,

“(a + b) a’1

x (1 ’ x)b’1 , 0 · x · 1. (3.13)

f (x) =

“(a)“(b)

The beta density obtains for example as the distribution of order statistics in

a sample from independent uniform [0, 1] variates. This is easy to see. For

example if U1 , ..., Un are independent uniform random variables on the interval

[0, 1] and if U(k) denotes the k 0 th largest of these n values, then

P [U(k) < x] = P [there are k or more values less than x]

X µn¶

n

xj (1 ’ x)n’j .

=

j

j=k

Di¬erentiating we ¬nd the probability density function of U(k) to be

nµ¶ X µn¶

n

dX n j n’j

{jxj’1 (1 ’ x)n’j + (n ’ j)xj (1 ’ x)n’j’1 }

x (1 ’ x) =

dx j j

j=k j=k

µ¶

n k’1

(1 ’ x)n’k

=k x

k

“(n + 1)

xk’1 (1 ’ x)n’k

=

“(k)“(n ’ k + 1)

and this is the beta density with parameters a = k ’ 1, b = n ’ k + 1. Order

statistics from a Uniform sample therefore have a beta distribution with the

k™th order statistic having the Beta(k ’ 1, n ’ k + 1) distribution. This means

that order statistics from more general continuous distributions can be easily

generated using the inverse transform and a beta random variable. For example

suppose we wish to simulate the largest observation in a normal(µ, σ 2 ) sample of

size 100. Rather than generate a sample of 100 normal observations and take the

largest, we can simulate the value of the largest uniform order statistic U(100) ∼

Beta(99, 1) and then µ + σ¦’1 (U(100) ) (with ¦’1 the standard normal inverse

cumulative distribution function) is the required simulated value. This may be

used to render simulations connected with risk management more e¬cient.

GENERATING RANDOM NUMBERS FROM NON-UNIFORM CONTINUOUS DISTRIBUTIONS143

The following result lists some important relationships between the Gamma

and Beta distributions. For example it allows us to generate a Beta random

variable from two independent Gamma random variables.

Theorem 25 (Gamma distribution) If X1 , X2 are independent Gamma (a1 , b)

X1

and Gamma (a2 , b) random variables, then Z = and Y = X1 + X2

X1 +X2

are independent random variables with Beta (a1 , a2 ) and Gamma (a1 + a2 , b)

distributions respectively. Conversely, if (Z, Y ) are independent variates with

Beta (a1 , a2 ) and the Gamma (a1 + a2 , b) distributions respectively, then X1 =

Y Z, and X2 = Y (1 ’ Z) are independent and have the Gamma (a1 , b) and

Gamma (a2 , b) distributions respectively.

Proof. Assume that X1 , X2 are independent Gamma (a1 , b) and Gamma

(a2 , b) variates. Then their joint probability density function is

1

xa1 ’1 xa2 ’1 e’(x1 +x2 )/b , for x1 > 0, x2 > 0.

fX1 X2 (x1 , x2 ) =

“(a1 )“(a2 ) 1 2

Consider the change of variables x1 (z, y) = zy, x2 (z, y) = (1 ’ z)y. Then the

Jacobian of this transformation is given by

¯ ¯¯ ¯

¯ ‚x1 ‚x1 ¯ ¯ ¯

¯ ‚z ¯ ¯y z¯

¯ ‚y ¯ ¯ ¯

¯ ‚x2 ‚x2 ¯ = ¯ ¯

¯ ‚z ¯ ¯ ’y 1’z ¯

‚y

= y.

Therefore the joint probability density function of (z, y) is given by

¯ ¯

¯ ‚x1 ‚x1 ¯

¯ ¯

fz,y (z, y) = fX1 X2 (zy, (1 ’ z)y) ¯ ‚z ‚y ¯

¯ ‚x2 ‚x2 ¯

¯ ‚z ¯

‚y

1

z a1 ’1 (1 ’ z)a2 ’1 y a1 +a2 ’1 e’y/b , for 0 < z < 1, y > 0

=

“(a1 )“(a2 )

“(a1 + a2 ) a1 ’1 1

(1 ’ z)a2 ’1 — y a1 +a2 ’1 e’y/b , for 0 < z < 1, y > 0

= z

“(a1 )“(a2 ) “(a1 + a2 )

and this is the product of two probability density functions, the Beta(a1 , a2 )

density for Z and the Gamma( a1 + a2 , b) probability density function for Y.

The converse holds similarly.

144 CHAPTER 3. BASIC MONTE CARLO METHODS

This result is a basis for generating gamma variates with integer value of the

parameter a (sometimes referred to as the shape parameter). According to the

theorem, if a is integer and we sum a independent Gamma(1,b) random vari-

ables the resultant sum has a Gamma(a, b) distribution. Notice that ’b log(Ui )

for uniform[0, 1] random variable Ui is an exponential or a Gamma(1, b) random

Qn

variable. Thus ’b log( i=1 Ui ) generates a gamma (n, b) variate for independent

uniform Ui . The computation required for this algorithm, however, increases

linearly in the parameter a = n, and therefore alternatives are required, es-

pecially for large a. Observe that the scale parameter b is easily handled in

general: simply generate a random variable with scale parameter 1 and then

multiply by b. Most algorithms below, therefore, are only indicated for b = 1.

For large a Cheng (1977) uses acceptance-rejection from a density of the

form

x»’1

(3.14)

g(x) = »µ dx , x > 0

(µ + x» )2

called the Burr XII distribution. The two parameters µ and » of this den-

sity (µ is not the mean) are chosen so that it is as close as possible to the

gamma distribution. We can generate a random variable from (3.14) by inverse

µU

transform as G’1 (U ) = { 1’U }1/» .

A much simpler function for dominating the gamma densities is a minor

extension of that proposed by Ahrens and Dieter (1974). It corresponds to

using as a dominating probability density function

§

⎪

⎨ a’1 kxa’1 0·x·k

k ( a +exp(’k))

(3.15)

g(x) = ,x > k

⎪ a’1 ’x

© a’1k k e x>k

k ( +exp(’k))

a

Other distributions that have been used as dominating functions for the

Gamma are the Cauchy (Ahrens and Dieter), the Laplace (Tadakamalla), the

exponential (Fishman), the Weibull, the relocated and scaled t distribution

with 2 degrees of freedom (Best), a combination of normal density (left part)

and exponential density (right part) (Ahrens and Dieter), and a mixture of two

GENERATING RANDOM NUMBERS FROM NON-UNIFORM CONTINUOUS DISTRIBUTIONS145

Erlang distributions (Gamma with integral shape parameter ±).

Best™s algorithm generates a Student™s t2 variate as

√

2(U ’ 1/2)

Y= p (3.16)

U (1 ’ U

where U ∼ U [0, 1]. Then Y has the Students t distribution with 2 degrees of

freedom having probability density function

1

(3.17)

g(y) = .

(2 + y 2 )3/2

p

We then generate a random variable X = (a ’ 1) + Y 3a/2 ’ 3/8 and apply

a rejection step to X to produce a Gamma random variable. See Devroye (p.

408) for details.

Most of the above algorithms are reasonably e¬cient only for a > 1 with

the one main exception being the combination of power of x and exponential

density suggested by Ahrens and Dieter above. Cheng and Feast (1979) also

suggest a ratio of uniforms algorithm for the gamma distribution, a > 1.

A ¬nal fast and simple procedure for generating a gamma variate with a > 1

is due to Marsaglia and Tsang (2000) and generates a gamma variate as the cube

of a suitably scaled normal. Given a fast generator of the Normal to machine

1

precision, this is a highly e¬cient rejection technique. We put d = a ’ and

3

generate a standard normal random variable X and a uniform variate U until,

X

√ )3 ,

with V = (1 + the following inequality holds:

9d

X2

+ d ’ dV + d ln(V ).

ln(U ) <

2

When this inequality is satis¬ed, we accept the value d — V as obtained

from the Gamma(a, 1) distribution. As usual multiplication by b results in

a Gamma(a, b) random variable. The e¬ciency of this algorithm appears to be

very high (above 96% for a > 1).

In the case 0 < a < 1, Stuart™s theorem below allows us to modify a Gamma

variate with a > 1 to one with a < 1. We leave the proof of the theorem as an

exercise.

146 CHAPTER 3. BASIC MONTE CARLO METHODS

Theorem 26 (Stuart) Suppose U is uniform [0, 1] and X is Gamma (a + 1, 1)

independent of U . Then XU 1/a has a gamma (a, 1) distribution

The Matlab function gamrnd uses Best™s algorithm and acceptance rejection

for ± > 1. For ± < 1, it uses Johnk™s generator, which is based on the following

theorem.

Theorem 27 (Johnk) Let U and V be independent Uniform[0,1] random

variables. Then the conditional distribution of

U 1/±

X = 1/±

+ V 1/(1’±)

U

given that the denominator U 1/± + V 1/(1’±) < 1 is Beta(±, 1 ’ ±).

Multiplying this beta random variable by an independent exponential (1)

results in a Gamma(±, 1) random variable.

Toward generating the beta distribution, use of Theorem 24 and the variable

X1

with X1 , X2 independent gamma variates is one method of using

Z= X1 +X2

a gamma generator to produce beta variates, and this is highly competitive as

long as the gamma generator is reasonably fast. The MATLAB generator is

betarnd(a,b,1,n) Alternatives are, as with the gamma density, rejection from a

Burr XII density (Cheng, 1978) and use of the following theorem as a generator

(due to Johnk). This a more general version of the theorem above.

Theorem 28 (Beta distribution)

Suppose U, V are independent uniform[0, 1] variates. Then the conditional

distribution of

U 1/a

(3.18)

X=

U 1/a + V 1/b

given that U 1/a + V 1/b · 1 is Beta (a, b). Similarly the conditional distribution

of U 1/a given that U 1/a + V 1/b · 1 is Beta (a + 1, b).

GENERATING RANDOM NUMBERS FROM NON-UNIFORM CONTINUOUS DISTRIBUTIONS147

Proof. De¬ne a change of variables

U 1/a

, Y =U 1/a + V 1/b

X = 1/a 1/b

U +V

or U = (Y X)a and V = [(1 ’ X)Y ]b

so that the joint probability density function of (X, Y ) is given by

¯ ¯

¯ ‚u ‚u ¯

¯ ¯

fX,Y (x, y) = fU,V ((yx)a , [(1 ’ x)y]b ) ¯ ‚x ‚y ¯

¯ ‚v ‚v ¯

¯ ‚x ‚y ¯

= aby a+b’1 xa’1 (1 ’ x)b’1

1 1

provided either (0 < x < 1 and y < 1) or (1 ’ and 1 < y < 2).

<x<

y y

Notice that in the case y < 1, the range of values of x is the unit interval

and does not depend on y and so the conditional probability density function of

X given Y = y is a constant times xa’1 (1 ’ x)b’1 , i.e. is the Beta(a, b)

probability density function. The rest of the proof is similar.

A generator exploiting this theorem produces pairs (U, V ) until the condi-

tion is satis¬ed and then transforms to the variable X. However, the probability

“(a+1)“(b+1)

that the condition is satis¬ed is which is close to 0 unless a, b are

“(a+b+1)

small, so this procedure should be used only for small values of both parame-

ters. Theorems 24 and 25 together provide an algorithm for generating Gamma

variates with non-integral a from variates with integral ones. For example if X

is Gamma(4, 1)and Z is independent Beta (3.4, .6)then XZ is Gamma (3.4, 1).

There are various other continuous distributions commonly associated with

statistical problems. For example the Student™s t-distribution with ν degrees

q

2ν

of freedom is de¬ned as a ratio X Z where Z is standard normal and X is

√

gamma ( ν , 2). Alternatively, we may use ν √X’1/2 where X is generated as

2 X(1’X)

a symmetric beta(ν/2, ν/2) variate.

Example 29 (some alternatives to lognormal distribution)

The assumption that stock prices, interest rates, or exchange rates follow a

lognormal distribution is a common exercise in wishful thinking. The lognormal

148 CHAPTER 3. BASIC MONTE CARLO METHODS

distribution provides a crude approximation to many ¬nancial time series, but

other less theoretically convenient families of distributions sometimes provide a

better approximation. There are many possible alternatives, including the stu-

dents t distribution and the stable family of distributions discussed later. Sup-

pose, for the present, we modify the usual normal assumption for stock returns

slightly by assuming that the log of the stock price has a distribution “close”

to the normal but with somewhat more weight in the tails of the distribution.

Speci¬cally assume that under the Q measure, ST = S0 exp{µ + cX} where X

has cumulative distribution function F (x). Some constraint is to be placed on

the constant c if we are to compare the resulting prices with the Black-Scholes

model and it is natural to require that both models have identical volatility,

or identical variance of returns. Since the variance of the return in the Black

Scholes model over a period of length T is σ 2 T where σ is the annual volatility,

we therefore require that

s

σ2T

var(cX) = σ 2 T or c = .

var(X)

The remaining constraint is required of all option pricing measures is the martin-

gale constraint and this implies that the discounted asset price is a martingale,

and in consequence

e’rT EQ ST = S0 . (3.19)

Letting the moment generating function of X be

m(s) = EesX ,

the constraint (3.19) becomes

eµ’rT m(c) = 1

and solving for µ, we obtain

µ = rT ’ ln(m(c)).

GENERATING RANDOM NUMBERS FROM NON-UNIFORM CONTINUOUS DISTRIBUTIONS149

Provided that we can generate from the cumulative distribution function of X,

the price of a call option with strike price K under this returns distribution

can be estimated from N simulations by the average discounted return from N

options,

N N

1X X

’rT 1

’rT +

(S0 eµ+cXi ’ K)+

(ST i ’ K) = e

e

N i=1 N i=1

N

1X

’rT

(S0 erT ’ln(m(c))+cXi ’ K)+

=e

N i=1

N

1X ecXi

’ e’rT K)+

= (S0

N i=1 m(c)

A more precise calculation is the di¬erence between the option price in this

case and the comparable case of normally distributed returns. Suppose we use

inverse transform together with a uniform[0,1] variate to generate both the

random variable Xi = F ’1 (Ui ) and the corresponding normal return Zi =

√

rT + σ T ¦’1 (Ui ). Then the di¬erence is estimated by

option price under F ’ option price under ¦

N

1X

’1

ecF (Ui ) √ ’1 2

’ e’rT K)+ ’ (S0 eσ T ¦ (Ui )’σ T /2 ’ e’rT K)+ }

' {(S0

N i=1 m(c)

If necessary, in case the moment generating function of X is unknown, we can

estimate it and the variance of X using sample analogues over a large number

N of simulations. In this case c is estimated by

s

σ2 T

d

v ar(X)

d

with v ar representing the sample variance and m(c) estimated by

N

1 X cF ’1 (Ui )

e .

N i=1

To consider a speci¬c example, the logistic(0, 0.522) distribution is close to the

normal, except with slightly more weight in the tails. The scale parameter in

150 CHAPTER 3. BASIC MONTE CARLO METHODS

this case was chosen so that the logistic has approximate unit variance. The

1

cumulative distribution function is F (x) = and its inverse is X =

1+exp{’x/b}

b ln(U/(1 ’ U )). The moment generating function is m(s) = “(1 ’ bs)“(1 +

bs), s < 1/b. The following function was used to compare the price of a call

option when stock returns have the logistic distribution(i.e. stock prices have

the “loglogistic” distribution) with the prices in the Black-Scholes model.

function [re,op1,opbs]=di¬optionprice(n,So,strike,r,sigma,T)

%estimates the relative error in the BS option price and price under

% logistic returns distribution . Runs n simulations.

u=rand(1,n);

x=log(u./(1-u)); % generates standard

logistic*

z=sigma*sqrt(T)*norminv(u)-sigma^2*T/2;

c=sigma*sqrt(T/var(x));

mc=mean(exp(c*x));

re=[]; op1=[]; opbs=[];

for i=1:length(strike)

op1=[op1 mean(max(exp(c*x)*So/mc-exp(-r*T)*strike(i),0))]; % price under F

opbs=[opbs mean(max(So*exp(z)-exp(-r*T)*strike(i),0))]; % price under BS

end

dif=op1-opbs;

re=[re dif./(dif+BLSPRICE(So,strike,r,T,sigma,0))];

plot(strike/So,re)

xlabel(™Strike price/initial price™)

ylabel(™relative error in Black Scholes formula™)

The relative error in the Black-Scholes formula obtained from a simulation of

100,000 is graphed in Figure 3.14. The logistic distribution di¬ers only slightly

GENERATING RANDOM NUMBERS FROM NON-UNIFORM CONTINUOUS DISTRIBUTIONS151

Figure 3.14: Relative Error in Black-Scholes price when asset prices are loglo-

gistic, σ = .4, T = .75, r = .05

from the standard normal, and the primary di¬erence is in the larger kurtosis

or weight in the tails. Indeed virtually any large ¬nancial data set will di¬er

from the normal in this fashion; there may be some skewness in the distribution

but there is often substantial kurtosis. How much di¬erence does this slightly

increased weight in the tails make in the price of an option? Note that the

Black-Scholes formula overprices all of the options considered by up to around

3%. The di¬erences are quite small, however and there seems to be considerable

robustness to the Black-Scholes formula at least for this type of departure in

the distribution of stock prices.

A change in the single line x=log(u./(1-u)) in the above function permits

revising the returns distribution to another alternative. For example we might

152 CHAPTER 3. BASIC MONTE CARLO METHODS

choose the double exponential or Laplace density

1

f (x) = exp(’|x|)

2

for returns, by replacing this line by x = (u < .5) log(2 — u) ’ (u > .5) log(2 — (1 ’

u)). The resulting Figure 3.15 shows a similar behaviour but more substantial

pricing error, in this case nearly 10% for an at-the-money option.

Figure 3.15: Relative pricing error in Black Scholes formula when returns follow

the Laplace distribution

Another possible distribution of stock returns which can be used to introduce

some skewness to the returns distribution is the loggamma or extreme value

distribution whose probability density function takes the form

1

exp{’e(x’c) + (x ’ c)a}, ’∞ < x < ∞.

f (x) =

“(a)

We can generate such a distribution as follows. Suppose Y is a random

variable with gamma(a, ec ) distribution and probability density function

y a’1 e’ca ’ye’c

g(y) = e .

“(a)

GENERATING RANDOM NUMBERS FROM NON-UNIFORM CONTINUOUS DISTRIBUTIONS153

and de¬ne X = ln(Y ). Then X has probability density function

dx 1

f (x) = g(ex )| exp{(x(a ’ 1) ’ ca ’ ex’c }ex

e |=

dx “(a)

1

exp{’ex’c + (x ’ c)a}, ’∞ < x < ∞.

=

“(a)

As an example in Figure 3.16 we plot this density in the case a = 2, c = 0 This

distribution is negatively skewed, a typical characteristic of risk-neutral distri-

butions of returns. The large left tail in the risk-neutral distribution of returns

re¬‚ects the fact that investors have an aversion to large losses and consequently

the risk-neutral distribution in¬‚ates the left tail.

x

Figure 3.16: The probability density function e’e +2x

Introducing a scale parameter ν, the probability density function of ν ln(Y ) =

ln(Y ν ) where Y has a Gamma(2,1) distribution is

(νx’c)

f (x) = νe’e +2(νx’c)

.

The mean is approximately 0 and variance approximately σ 2 when we choose

c = ’.42278 and ν = .80308/σ and so this distribution is analogous to the

154 CHAPTER 3. BASIC MONTE CARLO METHODS

standard normal. However, the skewness is ’0.78 and this negative skewness is

more typical of risk neutral distributions of stock returns. We might ask whether

the Black-Scholes formula is as robust to the introduction of skewness in the

returns distribution as to the somewhat heavier tails of the logistic distribution.

For comparison with the Black-Scholes model we permitted adding a constant

and multiplying the returns by a constant which, in this case, is equivalent to

assuming under the risk neutral distribution that

ST = S0 e± Y ν , Y is Gamma(2,1)

where the constants ± and ν are chosen so that the martingale condition is

satis¬ed and the variance of returns matches that in the lognormal case. With

some integration we can show that this results in the equations

± = ’ ln(E(Y ν )) = ’ ln(“(2 + ν))

ν 2 var(ln(Y )) = ν 2 ψ 0 (2) = σ 2 T

where ψ 0 (±) is the trigamma function de¬ned as the second derivative of

P∞ 1

ln(“(±)), and evaluated fairly easily using the series ψ 0 (±) = k=0 (k+±)2 .

√

For the special cases required here, ψ 0 (2) ≈ .6449 so ν ≈ σ T /.8031 and

√

± = ’ log(“(2 + σ T /.8031)). Once again replacing the one line marked with

a * in the function di¬optionprice by x=log(gaminf(u,2,1); permits determining

the relative error in the Black-Scholes formula. There is a more signi¬cant pric-

ing error in the Black-Scholes formula now, more typical of the relative pricing

error that is observed in practice. Although the graph can be shifted and tilted

somewhat by choosing di¬erent variance parameters, the shape appears to be a

consequence of assuming a symmetric normal distribution for returns when the

actual risk-neutral distribution is skewed. It should be noted that the practice

of obtaining implied volatility parameters from options with similar strike prices

and maturities is a partial, though not a compete, remedy to the substantial

pricing errors caused by using a formula derived from a frequently ill-¬tting

GENERATING RANDOM NUMBERS FROM NON-UNIFORM CONTINUOUS DISTRIBUTIONS155

Figure 3.17: Relative Error in Black-Scholes formula when Asset returns follow

extreme value

156 CHAPTER 3. BASIC MONTE CARLO METHODS

Black_Scholes model.

The Symmetric Stable Laws

A ¬nal family of distributions of increasing importance in modelling is the stable

family of distributions. The stable cumulative distribution functions F are

such that if two random variables X1 and X2 are independent with cumulative

distribution function F (x) then so too does the sum X1 + X2 after a change

in location and scale. More generally the cumulative distribution function F

of independent random variables X1 , X2 is said to be stable if for each pair of

constants a and b, there exist constants c and m such that

a X1 + b X2 ’ m

c

has the same cumulative distribution function F. A stable random variable X

is most easily characterized through its characteristic function

§

⎨ exp(iuθ ’ |u| ± c± (1 ’ iβ(sign u) tan π± ) for ± 6= 1

2

iuX

Ee =

© exp(iuθ ’ |u|c(1 + iβ(sign u) ln |u|) 2 ) if ±=1

π

where i is the complex number i2 = ’1, θ is a location parameter of the

distribution, and c is a scale parameter. The parameter 0 < ± · 2 is the index

of the stable distribution and governs the tail behavior and β ∈ [’1, 1] governs

the skewness of the distribution. In the case β = 0, we obtain the symmetric

stable family of distributions, all unimodal densities, symmetric about their

mode, and roughly similar in shape to the normal or Cauchy distribution (both

special cases). They are of considerable importance in ¬nance as an alternative

to the normal distribution, in part because they tend to ¬t observations better in

the tail of the distribution than does the normal, and in part because they enjoy

theoretical properties similar to those of the normal family: sums of independent

stable random variables are stable. Unfortunately, this is a more complicated

family of densities to work with; neither the density function nor the cumulative

GENERATING RANDOM NUMBERS FROM NON-UNIFORM CONTINUOUS DISTRIBUTIONS157

distribution function can be expressed in a simple closed form. Both require a

series expansion. The parameter 0 < ± · 2 indicates what moments exist.

Except in the special case ± = 2 (the normal distribution) or the case β = ’1,

moments of order less than ± exist while moments of order ± or more do not.

This is easily seen because the tail behaviour is, when ± < 2,

1+β ±

lim x± P [X > x] = K± c

2

x’∞

1’β ±

lim x± P [X < ’x] = K± c

2

x’∞

for constant K± depending only on ±. Of course, for the normal distribution,

moments of all orders exist. The stable laws are useful for modelling in situ-

ations in which variates are thought to be approximately normalized sums of

independent identically distributed random variables. To determine robustness

against heavy-tailed departures from the normal distribution, tests and estima-

tors can be computed with data simulated from a symmetric stable law with

± near 2. The probability density function does not have a simple closed form

except in the case ± = 1 (Cauchy) and ± = 2 (Normal) but can be expressed

as a series expansion of the form

¡ ¢

∞

1X “ 2k+1 x k

(’1)k ±

fc (x) = ()

π±c (2k)! c

k=0

where c is the scale parameter (and we have assumed the mode is at 0). Espe-

cially for large values of x, this probability density function converges extremely

slowly. However, Small (2003) suggests using an Euler transformation to accel-

erate the convergence of this series, and this appears to provide enough of an

improvement in the convergence to meet a region in which a similar tail formula

(valid for large x) provides a good approximation. According to Chambers,

Mallows and Stuck, (1976), when 1 < ± < 2, such a variate can be generated as

¸ 1 ’1

cos(U (1 ’ ±)) ±

(cos U )’1/± (3.20)

X = c sin(±U )

E

where U is uniform [’π/2, π/2] and E, standard exponential are independent.

The case ± = 1 and X = tan(U ) is the Cauchy. It is easy to see that the

158 CHAPTER 3. BASIC MONTE CARLO METHODS

Cauchy distribution can also be obtained by taking the ratio of two independent

standard normal random variables and tan(U ) may be replaced by Z1 /Z2 for

independent standard normal random variables Z1 , Z2 produced by Marsaglia™s

polar algorithm. Equivalently, we generate X = V1 /V2 where Vi ∼ U [’1, 1]

conditional on V12 + V22 · 1 to produce a standard Cauchy variate X.

Example: Stable random walk.

A stable random walk may be used to model a stock price but the closest analogy

to the Black Scholes model would be a logstable process St under which the

distribution of ln(St ) has a symmetric stable distribution. Unfortunately, this

speci¬cation renders impotent many of our tools of analysis, since except in

the case ± = 2 or the case β = ’1, such a stock price process St has no

¬nite moments at all. Nevertheless, we may attempt to ¬t stable laws to the

distribution of ln(St ) for a variety of stocks and except in the extreme tails,

symmetric stable laws with index ± ' 1.7 often provide a reasonably good ¬t.

To see what such a returns process looks like, we generate a random walk with

10,000 time steps where each increment is distributed as independent stable

random variables having parameter 1.7. The following Matlab function was

used

function s=stabrnd(a,n)

u=(unifrnd(0,1,n,1)*pi)-.5*pi;

e = exprnd(1,n,1);

s=sin(a*u).*(cos((1-a)*u)./e).^(1/a-1).*(cos(u)).^(-1/a)

Then the command

plot(1:10000, cumsum(stabrnd(1.7,10000)));

resulted in the Figure 3.18. Note the occasional very large jump(s) which dom-

inates the history of the process up to that point, typical of random walks

generated from the stable distributions with ± < 2.

GENERATING RANDOM NUMBERS FROM NON-UNIFORM CONTINUOUS DISTRIBUTIONS159

250

200

150

100

50

0

-50

-100

-150

-200

-250

0 1000 2000 3000 4000 5000 6000 7000 8000 9000 10000

Figure 3.18: A Symmetric Stable Random Walk with index ± = 1.7

The Normal Inverse Gaussian Distribution

There is a very substantial body of literature that indicates that the normal

distribution assumption for returns is a poor ¬t to data, in part because the

observed area in the tails of the distribution is much greater than the normal

distribution permits. One possible remedy is to assume an alternative distribu-

tion for these returns which, like the normal distribution, is in¬nitely divisible,

but which has more area in the tails. A good ¬t to some stock and interest rate

data has been achieved using the Normal Inverse Gaussian (NIG) distribution

(see for example Prausse, 1999). To motivate this family of distributions, let us

suppose that stock returns follow a Brownian motion process but with respect

to a random time scale possibly dependent on volume traded and other exter-

nal factors independent of the Brownian motion itself. After one day, say, the

return on the stock is the value of the Brownian motion process at a random

time, „, independent of the Brownian motion. Assume that this random time

160 CHAPTER 3. BASIC MONTE CARLO METHODS

has the Inverse Gaussian distribution having probability density function

(θ ’ t)2

θ

g(t) = √ } (3.21)

exp{’

2c2 t

c 2πt3

for parameters θ > 0, c > 0. This is the distribution of a ¬rst passage time

for Brownian motion. In particular consider a Brownian motion process B(t)

having drift 1 and di¬usion coe¬cient c. Such a process is the solution to the

stochastic di¬erential equation

dB(t) = dt + cdW (t), B(0) = 0.

Then the ¬rst passage of the Brownian motion to the level θ is T = inf(t; B(t) =

θ} and this random variable has probability density function (3.21). The mean

of such a random variable is θ and with variance θc2 . These can be obtained

from the moment generating function of the distribution with probability density

function (3.21),

√

’1 + 1 ’ 2sc

g — (s) = exp{’θ( )}.

2

c

Expanding this locally around c = 0 we obtain

1

g — (s) = exp{θs + θs2 c2 + O(c4 )}

2

and by comparing this with the moment generating function of the normal

distribution, as c ’ 0, the distribution of

T ’θ

√

cθ

approaches the standard normal or, more loosely, the distribution (3.21) ap-

proaches Normal(θ, θc2 ).

Lemma 30 Suppose X(t) is a Brownian motion process with drift β and dif-

fusion coe¬cient 1, hence satisfying

dXt = βdt + dWt , X(0) = µ.

GENERATING RANDOM NUMBERS FROM NON-UNIFORM CONTINUOUS DISTRIBUTIONS161

Suppose a random variable T has probability density function (3.21) and is in-

dependent of Xt . Then the probability density function of the randomly stopped

Brownian motion process is given by

p

p K1 (± δ 2 + (x ’ µ)2 )

±δ

exp(δ ±2 ’ β 2 + β(x ’ µ)) p (3.22)

f (x; ±, β, δ, µ) =

π δ 2 + (x ’ µ)2

with r

θ 1

β2 +

δ = , and ± =

c2

c

and the function K» (x) is the modi¬ed Bessel function of the second kind de¬ned

by

Z ∞

1 x

y »’1 exp(’ (y + y ’1 ))dy, for x > 0.

K» (x) =

2 2

0

Proof. The distribution of the randomly stopped variable X(T ) is the same

as that of the random variable

√

X = µ + βT + TZ

where Z is N (0, 1) independent of T. Conditional on the value of T the prob-

ability density function of X is

1 1

exp(’ (x ’ µ ’ βT )2 )

f (x|T ) = √

2T

2πT

and so the unconditional distribution of X is given by

Z ∞

(θ ’ t)2

1 1 θ

2

√ exp(’ (x ’ µ ’ βt) ) √ exp(’ )dt

2c2 t

2t c 2πt3

2πt

0

Z∞

(θ ’ t)2

θ 1

’2 2

t exp(’ (x ’ µ ’ βt) ’

= )dt

2c2 t

2πc 0 2t

Z∞

θ 1 θ t 1

t’2 exp(’ (x2 ’ 2xµ + µ2 + θ2 ) + (β(x ’ µ) + 2 ) ’ (β 2 + 2 ))dt

=

2πc 0 2t c 2 c

Z∞

θ θ 1 t 1

t’2 exp(’ ((x ’ µ)2 + θ2 ) ’ (β 2 + 2 ))dt

exp(β(x ’ µ) + 2 )

=