
V(S_T) = (1/1000) [(22 − 22)^+ + (20 − 22)^+ + (23 − 22)^+ + ....]
       = (1/1000) [0 + 0 + 1 + ....]

The law of large numbers assures us that, for a large number of simulations n, the average V(S_T) will approximate the true expectation E_Q[V(S_T)]. Now while it would be foolish to use simulation in a simple problem like this, there are many models in which it is much easier to randomly generate values of the process S_T than it is to establish its exact distribution. In such cases, simulation is the method of choice.

Randomly generating a value of S_T for the discrete distribution above is easy, provided that we can produce independent uniform random numbers on a computer. For example, if we were able to generate a random number Y_i which has a uniform distribution on the integers {0, 1, 2, ..., 15}, then we could define S_T for the i'th simulation as follows:

If Y_i is in set   {0}   {1, 2, 3, 4}   {5, 6, 7, 8, 9, 10}   {11, 12, 13, 14}   {15}
define S_T =        20        21                22                   23           24
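A minimal Python sketch of this scheme (the function names are our own, and the payoff (S_T − 22)^+ is taken from the numerical example at the start of the section):

```python
import random

def st_from_y(y):
    """Map a uniform integer y in {0,...,15} to a value of S_T according to
    the table: {0}->20, {1..4}->21, {5..10}->22, {11..14}->23, {15}->24."""
    if y == 0:
        return 20
    elif y <= 4:
        return 21
    elif y <= 10:
        return 22
    elif y <= 14:
        return 23
    else:
        return 24

def mc_estimate(n=1000, strike=22):
    """Average the payoff V(S_T) = (S_T - strike)^+ over n simulations."""
    total = 0.0
    for _ in range(n):
        total += max(st_from_y(random.randrange(16)) - strike, 0)
    return total / n
```

With these probabilities the true expectation is (4 × 1 + 1 × 2)/16 = 0.375, which the simulation average approaches as n grows.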

Of course, to get a reasonably accurate estimate of the price of a complex derivative may well require a large number of simulations, but this is decreasingly a problem with increasingly fast computer processors. The first ingredient in a simulation is the stream of uniform random numbers Y_i used above. In practice all other distributions are generated by processing discrete uniform random numbers. Their generation is discussed in the next section.

Uniform Random Number Generation

The first requirement of a stochastic model is the ability to generate "random" variables or something resembling them. Early such generators attached to computers exploited physical phenomena, such as the least significant digits in an accurate measure of time or the amount of background cosmic radiation, as the basis for such a generator, but these suffer from a number of disadvantages. They may well be "random" in some more general sense than are the pseudo-random number generators that are presently used, but their properties are difficult to establish, and the sequences are impossible to reproduce. The ability to reproduce a sequence of random numbers is important for debugging a simulation program and for reducing its variance.

It is quite remarkable that some very simple recursion formulae define sequences that behave like sequences of independent random numbers and appear to more or less obey the major laws of probability such as the law of large numbers, the central limit theorem, the Glivenko-Cantelli theorem, etc. Although computer-generated pseudo-random numbers have become more and more like independent random variables as the knowledge of these generators grows, the main limit theorems in probability such as the law of large numbers and the central limit theorem still do not have versions which directly apply to dependent sequences such as those output by a random number generator. The fact that certain pseudo-random sequences appear to share the properties of independent sequences is still a matter of observation rather than proof, indicating that many results in probability hold under much more general circumstances than the relatively restrictive conditions under which these theorems have so far been proven. One would intuitively expect an enormous difference between the behaviour of independent random variables X_n and a deterministic (i.e. non-random) sequence satisfying a recursion of the form x_n = g(x_{n-1}) for a simple function g. Surprisingly, for many carefully selected such functions g it is quite difficult to determine the difference between such a sequence and an independent sequence. Of course, numbers generated from a simple recursion such as this are neither random, nor are x_{n-1} and x_n independent. We sometimes draw attention to this by referring to such a sequence as pseudo-random numbers.

100 CHAPTER 3. BASIC MONTE CARLO METHODS

While they are in no case independent, we will nevertheless attempt to find simple functions g which provide behaviour similar to that of independent uniform random numbers. The search for a satisfactory random number generator is largely a search for a suitable function g, possibly depending on more than one of the earlier terms of the sequence, which imitates in many different respects the behaviour of independent observations with a specified distribution.

Definition: reduction modulo m. For positive integers a and m, the value a mod m is the remainder (between 0 and m − 1) obtained when a is divided by m. So, for example, 7 mod 3 = 1 since 7 = 2 × 3 + 1.

The single most common class of random number generators is of the form

x_n := (a x_{n-1} + c) mod m

for given integers a, c, and m which we select in advance. This generator is initiated with a "seed" x_0 and then run to produce a whole sequence of values. When c = 0, these generators are referred to as multiplicative congruential generators, and in general as mixed or linear congruential generators. The "seed", x_0, is usually updated by the generator with each call to it. There are two common choices of m, either m prime or m = 2^k for some k (usually 31 for 32-bit machines).

Example: Mixed congruential generator

Define x_n = (5x_{n-1} + 3) mod 8 and the seed x_0 = 3. Note that by this recursion

x_1 = (5 × 3 + 3) mod 8 = 18 mod 8 = 2
x_2 = 13 mod 8 = 5
x_3 = 28 mod 8 = 4

and x_4, x_5, x_6, x_7, x_8 = 7, 6, 1, 0, 3 respectively, and after this point (for n > 8) the recursion will simply repeat again the pattern already established: 3, 2, 5, 4, 7, 6, 1, 0, 3, 2, 5, 4, .......
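The recursion above is easy to reproduce in Python (a sketch; the function name lcg mirrors the Matlab code given later in this section):

```python
def lcg(x0, a, c, m, n):
    """Return [x0, x1, ..., xn] for the linear congruential generator
    x_k = (a * x_{k-1} + c) mod m."""
    xs = [x0]
    for _ in range(n):
        xs.append((a * xs[-1] + c) % m)
    return xs

cycle = lcg(3, 5, 3, 8, 8)   # [3, 2, 5, 4, 7, 6, 1, 0, 3]
```

The ninth value returns to the seed 3, after which the pattern of eight values repeats.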

The above repetition is inevitable for a linear congruential generator. There

are at most m possible numbers after reduction mod m and once we arrive

back at the seed the sequence is destined to repeat itself. In the example

above, the sequence cycles after 8 numbers. The length of one cycle, before the

sequence begins to repeat itself again, is called the period of the generator. For a

mixed generator, the period must be less than or equal to m. For multiplicative

generators, the period is shorter, and often considerably shorter.

Multiplicative Generators.

For multiplicative generators, c = 0. Consider for example the generator x_n = 5x_{n-1} mod 8 with x_0 = 3. This produces the sequence 3, 7, 3, 7, .... In this case, the period is only 2, but for general m, it is clear that the maximum possible period is m − 1 because the generator takes values in the set {1, ..., m − 1}. The generator cannot generate the value 0, because if it did, all subsequent values generated would be identically 0. Therefore the maximum possible period corresponds to a cycle through the non-zero integers exactly once. But in the example above with m = 2^k, the period is far from attaining its theoretical maximum, m − 1. The following theorem shows that the period of a multiplicative generator is maximal when m is a prime number and a satisfies some conditions.

Theorem 14 (period of multiplicative generator)

If m is prime, the multiplicative congruential generator x_n = a x_{n-1} (mod m), a ≠ 0, has maximal period m − 1 if and only if a^i ≠ 1 (mod m) for all i = 1, 2, ..., m − 2.

If m is a prime, and if the condition a^{m-1} = 1 (mod m) and a^i ≠ 1 (mod m) for all i < m − 1 holds, we say that a is a primitive root of m, which means that the powers of a generate all of the possible elements of the multiplicative group of integers mod m. Consider the multiplicative congruential generator x_n = 2x_{n-1} mod 11. It is easy to check that 2^i mod 11 = 2, 4, 8, 5, 10, 9, 7, 3, 6, 1 as i = 1, 2, ..., 10. Since i = m − 1 is the first value for which 2^i mod 11 = 1, 2 is a primitive root of 11 and this is a maximal period generator having period 10. When m = 11, only the values a = 2, 6, 7, 8 are primitive roots and produce full period (10) generators.
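The primitive-root condition is easy to check by brute force for small prime m. The following Python sketch (a helper of our own, not from the text) confirms that 2, 6, 7 and 8 are the primitive roots of 11:

```python
def is_primitive_root(a, m):
    """For prime m, check whether a is a primitive root of m, i.e. whether
    the powers a, a^2, ..., a^{m-1} (mod m) run through all of 1, ..., m-1."""
    seen = set()
    x = 1
    for _ in range(m - 1):
        x = (x * a) % m
        seen.add(x)
    return len(seen) == m - 1

roots = [a for a in range(1, 11) if is_primitive_root(a, 11)]   # [2, 6, 7, 8]
```

This brute-force check takes m − 1 multiplications, so it is only practical for small moduli; for large prime m one instead factors m − 1 and tests a^{(m-1)/q} for each prime factor q.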

One of the more common moduli on 32-bit machines is the Mersenne prime m = 2^31 − 1. In this case, the following values of a (among many others) all produce full period generators:

a = 7, 16807, 39373, 48271, 69621, 630360016, 742938285, 950706376, 1226874159, 62089911, 1343714438.

Let us suppose now that m is prime and a_2 is the multiplicative inverse (mod m) of a_1, by which we mean (a_1 a_2) mod m = 1. When m is prime, the set of integers {0, 1, 2, ..., m − 1} together with the operations of addition and multiplication mod m forms what is called a finite field. This is a finite set of elements together with operations of addition and multiplication such as those we enjoy in the real number system. For example, for integers x_1, a_1, a_2 ∈ {0, 1, 2, ..., m − 1}, the product of a_1 and x_1 can be defined as (a_1 x_1) mod m = x_2, say. Just as non-zero numbers in the real number system have multiplicative inverses, so too do non-zero elements of this field. Suppose for example a_2 is the multiplicative inverse of a_1, so that a_2 a_1 mod m = 1. If we now multiply x_2 by a_2 we have

(a_2 x_2) mod m = (a_2 a_1 x_1) mod m = x_1, since a_2 a_1 mod m = 1.

This shows that x_1 = (a_2 x_2) mod m is equivalent to x_2 = (a_1 x_1) mod m. In other words, using a_2, the multiplicative inverse of a_1 mod m, the multiplicative generator with multiplier a_2 generates exactly the same sequence as that with multiplier a_1, except in reverse order. Of course if a is a primitive root of m, then so is its multiplicative inverse.
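The reverse-order property can be illustrated numerically with the m = 11 example; this is a Python sketch (pow(a, -1, m) computes the modular inverse in Python 3.8+):

```python
def mult_gen(a, m, x0, n):
    """Return [x0, x1, ..., xn] for x_k = a * x_{k-1} mod m."""
    xs = [x0]
    for _ in range(n):
        xs.append(xs[-1] * a % m)
    return xs

m, a1 = 11, 2
a2 = pow(a1, -1, m)                        # inverse of 2 mod 11, namely 6
forward = mult_gen(a1, m, 1, 9)            # 1, 2, 4, 8, 5, 10, 9, 7, 3, 6
backward = mult_gen(a2, m, forward[-1], 9) # the same values in reverse order
```

Starting the inverse generator from the last value of the forward sequence retraces that sequence exactly, backwards.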

Theorem 15 (period of multiplicative generators with m = 2^k)

If m = 2^k with k ≥ 3, and if a mod 8 = 3 or 5 and x_0 is odd, then the multiplicative congruential generator has maximal period 2^{k-2}.

For the proof of these results, see Ripley (1987), Chapter 2. The following simple Matlab code allows us to compare linear congruential generators with small values of m. It generates a total of n such values for user-defined a, c, m, and seed x0. The efficient implementation of a generator for large values of m depends very much on the architecture of the computer. We normally choose m to be close to the machine precision (e.g. 2^32 for a 32-bit machine).

function x=lcg(x0,a,c,m,n)
% returns the seed x0 followed by n values of the generator x=(a*x+c) mod m
y=x0; x=x0;
for i=1:n
    y=rem(a*y+c,m);
    x=[x y];
end

The period of a linear congruential generator varies both with the multiplier a and the constant c. For example, consider the generator

x_n = (a x_{n-1} + 1) mod 2^10

for various multipliers a. When we use an even multiplier such as a = 2, 4, ... (using seed 1), we end up with a sequence that eventually locks into a specific value. For example, with a = 8 we obtain the sequence 1, 9, 73, 585, 585, ...., never changing beyond that point. The periods for odd multipliers are listed below (all started with seed 1):

a       1     3    5     7    9     11   13    15   17    19   21    23   25
Period  1024  512  1024  256  1024  512  1024  128  1024  512  1024  256  1024

The astute reader will notice that the only full-period multipliers a are those for which a − 1 is a multiple of 4. This is a special case of the following theorem.


Theorem 16 (period of mixed or linear congruential generators)

The mixed congruential generator

x_n = (a x_{n-1} + c) mod m          (3.1)

has full period m if and only if

(i) c and m are relatively prime;
(ii) each prime factor of m is also a factor of a − 1;
(iii) if 4 divides m, it also divides a − 1.

When m is prime, (ii) together with the assumption that a < m implies that m must divide a − 1, which forces a = 1. So for prime m the only full-period generators correspond to a = 1. Prime numbers m are desirable for long periods in the case of multiplicative generators, but in the case of mixed congruential generators, only the trivial one x_n = (x_{n-1} + c) (mod m) has maximal period m when m is prime. This covers the popular Mersenne prime m = 2^31 − 1. For the generators x_n = (a x_{n-1} + c) mod 2^k, where m = 2^k, k ≥ 2, the condition for full period 2^k requires that c is odd and a = 4j + 1 for some integer j.

Some of the linear or multiplicative generators which have been suggested are the following:

m            a               c
2^31 − 1     7^5 = 16807     0    Lewis, Goodman, Miller (1969), IBM
2^31 − 1     630360016       0    Fishman (Simscript II)
2^31 − 1     742938285       0    Fishman and Moore
2^31         65539           0    RANDU
2^32         69069           1    Super-Duper (Marsaglia)
2^32         3934873077      0    Fishman and Moore
2^32         3141592653      1    DERIVE
2^32         663608941       0    Ahrens (C-RAND)
2^32         134775813       1    Turbo-Pascal, Version 7 (period = 2^32)
2^35         5^13            0    APPLE
10^12 − 11   427419669081    0    MAPLE
2^59         13^13           0    NAG
2^61 − 1     2^20 − 2^19     0    Wu (1997)

Table 3.1: Some Suggested Linear and Multiplicative Random Number Generators

Other Random Number Generators.

A generalization of the linear congruential generators uses a k-dimensional vector X, specifically when we wish to generate correlation among the components of X. Suppose the components of X are to be integers between 0 and m − 1, where m is a power of a prime number. If A is an arbitrary k × k matrix with integer elements in the range {0, 1, ..., m − 1}, then we begin with a vector-valued seed X_0 and a constant vector C and define recursively

X_n := (A X_{n-1} + C) mod m

Such generators are more common when C is the zero vector, and are then called matrix multiplicative congruential generators. A related idea is to use a higher-order recursion like

x_n = (a_1 x_{n-1} + a_2 x_{n-2} + ... + a_k x_{n-k}) mod m,          (3.2)

called a multiple recursive generator. L'Ecuyer (1996, 1999) combines a number of such generators in order to achieve a period around 2^319 and good uniformity properties. When a recursion such as (3.2) with m = 2 is used to generate pseudo-random bits {0, 1}, and these bits are then mapped into uniform (0,1) numbers, it is called a Tausworthe or feedback shift register generator. The coefficients a_i are determined from primitive polynomials over the Galois field.

In some cases, the uniform random number generators in proprietary packages such as Splus and Matlab are not completely described in the package documentation. This is a further recommendation of the transparency of packages like R. Evidently in Splus, a multiplicative congruential generator is used, and then the sequence is "shuffled" using a shift-register generator (a special case of the matrix congruential generator described above). This secondary processing of the sequence can increase the period, but it is not always clear what other effects it has. In general, shuffling is conducted according to the following steps:

1. Generate a sequence of pseudo-random numbers x_i using x_{i+1} = a_1 x_i (mod m_1).

2. For fixed k put (T_1, ..., T_k) = (x_1, ..., x_k).

3. Generate, using a different generator, a sequence y_{i+1} = a_2 y_i (mod m_2).

4. Output the random number T_I where I = ⌈y_i k/m_2⌉.

5. Increment i, replace T_I by the next value of x, and return to step 3.

One generator is used to produce the x sequence, the numbers needed to fill the k holes. The other generator is then used to select which hole to draw the next number from, i.e. to "shuffle" the x sequence.

Example: A shuffled generator

Consider a generator described by the above steps with k = 4, x_{n+1} = (5x_n) (mod 19) and y_{n+1} = (5y_n) (mod 29):

x_n = 3  15  18  14  13  8  2  ...
y_n = 3  15  17  27  19  8  11  ...

We start by filling four pigeon-holes with the numbers produced by the first generator, so that (T_1, ..., T_4) = (3, 15, 18, 14). Then we use the second generator to select a random index I telling us which pigeon-hole to draw the next number from. Since these holes are numbered 1 through 4, we use I = ⌈4 × 3/29⌉ = 1, so the first number in our random sequence is drawn from box 1, i.e. z_1 = T_1 = 3. This element T_1 is now replaced by 13, the next number in the x sequence. Proceeding in this way, the next index is I = ⌈4 × 15/29⌉ = 3, and so the next number drawn is z_2 = T_3 = 18. Of course, when we have finished generating the values z_1, z_2, ..., all of which lie between 1 and m_1 − 1 = 18, we will usually transform them in the usual way (e.g. z_i/m_1) to produce something approximating continuous uniform random numbers on [0,1]. When m_1 is large, it is reasonable to expect the values z_i/m_1 to be approximately continuous and uniform on the interval [0, 1]. One advantage of shuffling is that the period of the generator is usually greatly extended. Whereas the original x sequence had period 9 in this example, the shuffled generator has a larger period of around 126.
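The shuffling steps can be sketched in Python; run with the parameters of this example (k = 4, a_1 = 5, m_1 = 19, a_2 = 5, m_2 = 29, both seeds 3), it reproduces z_1 = 3 and z_2 = 18:

```python
import math

def shuffled_generator(x0, a1, m1, y0, a2, m2, k, n):
    """Shuffle the sequence x_{i+1} = a1*x_i mod m1 by drawing from k
    pigeon-holes, with the hole index I = ceil(y*k/m2) chosen by a second
    generator y_{i+1} = a2*y_i mod m2."""
    x, y = x0, y0
    T = []
    for _ in range(k):               # fill the k pigeon-holes from the x-sequence
        T.append(x)
        x = a1 * x % m1
    out = []
    for _ in range(n):
        I = math.ceil(y * k / m2)    # select a hole using the y-sequence
        out.append(T[I - 1])
        T[I - 1] = x                 # refill the emptied hole with the next x
        x = a1 * x % m1
        y = a2 * y % m2
    return out

z = shuffled_generator(3, 5, 19, 3, 5, 29, k=4, n=2)   # [3, 18]
```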

There is another approach, summing pseudo-random numbers, which is also used to extend the period of a generator. This is based on the following theorem (see L'Ecuyer (1988)). For further discussion of the effect of taking linear combinations of the output from two or more random number generators, see Fishman (1995, Section 7.13).

Theorem 17 (summing mod m)

If X is a random variable uniform on the integers {0, ..., m − 1} and if Y is any integer-valued random variable independent of X, then the random variable W = (X + Y ) (mod m) is uniform on the integers {0, ..., m − 1}.

Theorem 18 (period of generators summed mod m_1)

If x_{i+1} = a_1 x_i mod m_1 has period m_1 − 1 and y_{i+1} = a_2 y_i mod m_2 has period m_2 − 1, then (x_i + y_i) (mod m_1) has period equal to the least common multiple of m_1 − 1 and m_2 − 1.

Example: summing two generators

If x_{i+1} = 16807 x_i mod (2^31 − 1) and y_{i+1} = 40692 y_i mod (2^31 − 249), then the period of (x_i + y_i) mod (2^31 − 1) is

lcm(2^31 − 2, 2^31 − 250) = (2^31 − 2)(2^31 − 250)/(2 × 31) ≈ 7.4 × 10^16.

This is much greater than the period of either of the two constituent generators.
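The period claimed here can be checked directly from Theorem 18 (a short Python sketch):

```python
from math import gcd

m1, m2 = 2**31 - 1, 2**31 - 249
p1, p2 = m1 - 1, m2 - 1              # periods of the two constituent generators
period = p1 * p2 // gcd(p1, p2)      # lcm(p1, p2): period of the summed generator
# gcd(p1, p2) = 62 = 2 * 31, so period = p1*p2/(2*31), approximately 7.4e16
```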

Other generators.

One such generator, the "Mersenne Twister" of Matsumoto and Nishimura (1998), has been implemented in R and has a period of 2^19937 − 1. Others use a non-linear function g in the recursion x_{n+1} = g(x_n) (mod m) to replace a linear one. For example, we might define x_{n+1} = x_n^2 (mod m) (called a quadratic residue generator) or x_{n+1} = g(x_n) mod m for some other non-linear function g. Typically the function g is designed to result in large values and thus more or less random low-order bits. Inversive congruential generators generate x_{n+1} using the (mod m) inverse of x_n.

Other generators which have been implemented in R include the Wichmann-Hill (1982, 1984) generator, which uses three multiplicative generators with prime moduli 30269, 30307, 30323 and has a period of (1/4)(30268 × 30306 × 30322). The outputs from these three generators are converted to [0, 1] and then summed mod 1. This is similar to the idea of Theorem 17, but the addition takes place after the output is converted to [0,1]. See Applied Statistics (1984), 33, 123. Also implemented are Marsaglia's Multicarry generator, which has a period of more than 2^60 and reportedly passed all tests (according to Marsaglia), Marsaglia's "Super-Duper", a linear congruential generator listed in Table 3.1, and two generators developed by Knuth (1997, 2002), the Knuth-TAOCP and Knuth-TAOCP-2002.

Conversion to Uniform (0, 1) generators:

In general, random integers should be mapped into the unit interval in such a way that the values 0 and 1, each of which has probability 0 for a continuous distribution, are avoided. For a multiplicative generator, since values lie between 1 and m − 1, we may divide the random number by m. For a linear congruential generator taking possible values x ∈ {0, 1, ..., m − 1}, it is suggested that we use (x + 0.5)/m.

Apparent Randomness of Pseudo-Random Number Generators

Knowing whether a sequence behaves in all respects like independent uniform random variables is, for the statistician, pretty close to knowing the meaning of life. At the very least, in order that one of the above generators be a reasonable approximation to independent uniform variates, it should satisfy a number of statistical tests. Suppose we reduce the uniform numbers on {0, 1, ..., m − 1} to values approximately uniformly distributed on the unit interval [0, 1] as described above, either by dividing through by m or using (x + 0.5)/m. There are many tests that can be applied to determine whether the hypothesis of independent uniform variates is credible (not, of course, whether the hypothesis is true. We know by the nature of all of these pseudo-random number generators that it is not!).


Runs Test

We wish to test the hypothesis H_0 that a sequence {U_i, i = 1, 2, ..., n} consists of n independent identically distributed random variables, under the assumption that they have a continuous distribution. The runs test measures runs, either in the original sequence or in its differences. For example, suppose we denote a positive difference between consecutive elements of the sequence by + and a negative difference by −. Then we may regard a sequence of the form .21, .24, .34, .37, .41, .49, .56, .51, .21, .25, .28, .56, .92, .96 as unlikely under independence, because the corresponding differences + + + + + + − − + + + + + have too few "runs" (the number of runs here is R = 3). Under the assumption that the sequence {U_i, i = 1, 2, ..., n} is independent and continuous, it is possible to show that E(R) = (2n − 1)/3 and var(R) = (3n − 5)/18. The proof of this result is a problem at the end of this chapter. We may also approximate the distribution of R with the normal distribution for n ≥ 25. A test at a 0.1% level of significance is therefore: reject the hypothesis of independence if

|R − (2n − 1)/3| / sqrt((3n − 5)/18) > 3.29,

where 3.29 is the corresponding N(0, 1) quantile. A more powerful test based on runs compares the numbers of runs of various lengths (in this case one run up of length 7, one run down of length 3, and one run up of length 6) with their theoretical distribution.
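A Python sketch of the runs test (our own implementation of the recipe above); applied to the 14 numbers in the example, it gives R = 3 and rejects independence at the 0.1% level:

```python
import math

def runs_test(u, quantile=3.29):
    """Count runs in the sign pattern of successive differences and compare
    R to its null mean (2n-1)/3 and variance (3n-5)/18."""
    n = len(u)
    signs = [u[i + 1] > u[i] for i in range(n - 1)]
    R = 1 + sum(signs[i] != signs[i + 1] for i in range(len(signs) - 1))
    z = (R - (2 * n - 1) / 3) / math.sqrt((3 * n - 5) / 18)
    return R, z, abs(z) > quantile

u = [.21, .24, .34, .37, .41, .49, .56, .51, .21, .25, .28, .56, .92, .96]
R, z, reject = runs_test(u)   # R = 3, and |z| exceeds 3.29
```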

Another test of independence is the serial correlation test. The runs test above is one way of checking that the pairs (U_n, U_{n+1}) are approximately uniformly distributed on the unit square. This could obviously be generalized to pairs like (U_i, U_{i+j}). One could also use the sample correlation or covariance as the basis for such a test. For example, for j ≥ 0,

C_j = (1/n)(U_1 U_{1+j} + U_2 U_{2+j} + ... + U_{n-j} U_n + U_{n+1-j} U_1 + .... + U_n U_j)          (3.3)

The test may be based on the normal approximation to the distribution of C_j with mean E(C_0) = 1/3 and E(C_j) = 1/4 for j ≥ 1. Also

var(C_j) = 4/(45n)     for j = 0
         = 13/(144n)   for j ≥ 1, j ≠ n/2
         = 7/(72n)     for j = n/2

Such a test, again at a 0.1% level, will take the form: reject the hypothesis of independent uniform variates if

|C_j − 1/4| / sqrt(13/(144n)) > 3.29

for a particular preselected value of j (usually chosen to be small, such as j = 1, ..., 10).

Chi-squared test.

The chi-squared test can be applied to the sequence in any dimension, for example k = 2. Suppose we have used a generator to produce a sequence of uniform (0, 1) variables U_j, j = 1, 2, ..., 2n, and then, for a partition {A_i; i = 1, ..., K} of the unit square, we count N_i, the number of pairs of the form (U_{2j-1}, U_{2j}) ∈ A_i. See for example the points plotted in Figure 3.1. Clearly this count should be related to the area or probability P(A_i) of the set A_i. Pearson's chi-squared statistic is

χ² = Σ_{i=1}^{K} [N_i − nP(A_i)]² / (nP(A_i))          (3.4)

which should be compared with a chi-squared distribution with K − 1 degrees of freedom, one less than the number of sets in the partition. Observed values of the statistic that are unusually large for this distribution should lead to rejection of the uniformity hypothesis. The partition usually consists of squares of identical area but could, in general, be of arbitrary shape.
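A Python sketch of the chi-squared test on pairs, using a grid of equal squares as the partition (the grid is our choice; the text allows arbitrary partitions):

```python
def chi_squared_pairs(u, K=16):
    """Pearson chi-squared statistic (3.4) for the pairs (U_{2j-1}, U_{2j})
    on a sqrt(K) x sqrt(K) grid of equal-area cells of the unit square."""
    side = int(K ** 0.5)
    n = len(u) // 2                       # number of pairs
    counts = [0] * K
    for j in range(n):
        ix = min(int(u[2 * j] * side), side - 1)
        iy = min(int(u[2 * j + 1] * side), side - 1)
        counts[ix * side + iy] += 1
    expected = n / K                      # n * P(A_i) for equal-area cells
    return sum((N - expected) ** 2 / expected for N in counts)

import random
random.seed(1)
u = [random.random() for _ in range(2000)]
chi2 = chi_squared_pairs(u)   # compare with chi-squared quantiles, K - 1 = 15 d.f.
```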

Figure 3.1: The Chi-squared Test (pairs (U_{2j-1}, U_{2j}) plotted in the unit square, partitioned into sets A_1, A_2, A_3, A_4)

Spectral Test

Consecutive values plotted as pairs (x_n, x_{n+1}), when generated from a multiplicative congruential generator x_{n+1} = a x_n (mod m), fall on a lattice. A lattice is a set of points of the form t_1 e_1 + t_2 e_2, where t_1, t_2 range over all integers and e_1, e_2 are vectors (here two-dimensional vectors, since we are viewing these points in pairs of consecutive values (x_n, x_{n+1})) called the "basis" for the lattice. A given lattice, however, has many possible different bases, and in order to analyze the lattice structure, we need to isolate the most "natural" basis, e.g. the one that we tend to see in viewing a lattice in two dimensions. Consider, for example, the lattice formed by the generator x_n = 23x_{n-1} mod 97. A plot of adjacent pairs (x_n, x_{n+1}) is given in Figure 3.2. For basis vectors we could use e_1 = (1, 23) and e_2 = (4, −6), or we could replace e_1 by (5, 18) or (9, 13), etc. Beginning at an arbitrary point O on the lattice as origin (in this case, since the original point (0,0) is on the lattice, we will leave it unchanged), we choose an unambiguous


Figure 3.2: The Spectral Test (adjacent pairs (x_n, x_{n+1}) from the generator x_n = 23x_{n-1} mod 97, with basis vectors e_1, e_2 and origin O marked)

definition of e_1 to be the shortest vector in the lattice, and then define e_2 as the shortest vector in the lattice which is not of the form t e_1 for integer t. Such a basis will be called a natural basis. The best generators are those for which the cells in the lattice generated by the two basis vectors e_1, e_2, or the parallelograms with sides parallel to e_1, e_2, are as close as possible to squares, so that e_1 and e_2 are approximately the same length. As we change the multiplier a in such a way that the random number generator still has period ≈ m, there are roughly m points in a region with area approximately m^2, and so the area of a parallelogram with sides e_1 and e_2 is approximately a constant (m) whatever the multiplier a. In other words, a longer e_1 is associated with a shorter vector e_2, and therefore for an ideal generator the two vectors are of reasonably similar length. A poor generator corresponds to a basis with e_2 much longer than e_1. The spectral test statistic ν is the renormalized length of the first basis vector ||e_1||. The extension to a lattice in k dimensions is done similarly. All linear

Figure 3.3: Lattice Structure of Uniform Random Numbers generated from RANDU

congruential random number generators result in points which, when plotted as consecutive k-tuples, lie on a lattice. In general, for k consecutive points, the spectral test statistic is equal to min (b_1^2 + b_2^2 + ... + b_k^2)^{1/2} under the constraint b_1 + b_2 a + ... + b_k a^{k-1} = mq, q ≠ 0. Large values of the statistic indicate that the generator is adequate, and Knuth suggests as a minimum threshold the value π^{-1/2} [(k/2)! m/10]^{1/k}.

One of the generators that fails the spectral test most spectacularly with k = 3 is the generator RANDU, x_{n+1} = 65539 x_n (mod 2^31). This was used commonly in simulations until the 1980's and is now notorious for the fact that a small number of hyperplanes fit through all of the points (see Marsaglia, 1968). For RANDU, successive triplets remain on the plane x_n = 6x_{n-1} − 9x_{n-2} (mod 2^31). This may be seen by rotating the 3-dimensional graph of the sequence of triplets of the form {(x_{n-2}, x_{n-1}, x_n); n = 2, 3, 4, ..., N} as in Figure 3.3.
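This plane identity is exact and can be verified in a few lines of Python, since 65539 = 2^16 + 3 gives 65539^2 ≡ 6 × 65539 − 9 (mod 2^31):

```python
def randu(x0, n):
    """RANDU: x_{n+1} = 65539 * x_n mod 2^31 (x0 odd)."""
    xs = [x0]
    for _ in range(n):
        xs.append(65539 * xs[-1] % 2**31)
    return xs

xs = randu(1, 1000)
# every consecutive triplet lies on the plane x_n = 6x_{n-1} - 9x_{n-2} (mod 2^31)
on_plane = all((xs[i] - 6 * xs[i - 1] + 9 * xs[i - 2]) % 2**31 == 0
               for i in range(2, len(xs)))
```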

As another example, in Figure 3.4 we plot 5000 consecutive triplets from a linear congruential random number generator with a = 383, c = 263, m = 10,000.

Figure 3.4: The values (x_i, x_{i+1}, x_{i+2}) generated by the linear congruential generator x_{n+1} = (383x_n + 263) (mod 10000)

Linear planes are evident from some angles in this view, but not from others. In many problems, particularly ones in which random numbers are processed in groups of three or more, this phenomenon can lead to highly misleading results. The spectral test is the most widely used test which attempts to insure against lattice structure. Table 3.2 below is taken from Fishman (1996) and gives some values of the spectral test statistic for some linear congruential random number generators in dimensions k ≤ 7.

m          a            c    k=2    k=3    k=4    k=5    k=6    k=7
2^31 − 1   7^5          0    0.34   0.44   0.58   0.74   0.65   0.57
2^31 − 1   630360016    0    0.82   0.43   0.78   0.80   0.57   0.68
2^31 − 1   742938285    0    0.87   0.86   0.86   0.83   0.83   0.62
2^31       65539        0    0.93   0.01   0.06   0.16   0.29   0.45
2^32       69069        0    0.46   0.31   0.46   0.55   0.38   0.50
2^32       3934873077   0    0.87   0.83   0.83   0.84   0.82   0.72
2^32       663608941    0    0.88   0.60   0.80   0.64   0.68   0.61
2^35       5^13         0    0.47   0.37   0.64   0.61   0.74   0.68
2^59       13^13        0    0.84   0.73   0.74   0.58   0.64   0.52

TABLE 3.2. Selected Spectral Test Statistics

The unacceptably small values for RANDU in the cases k = 3 and k = 4 stand out. On the basis of these values of the spectral test, the multiplicative generators

x_{n+1} = 742938285 x_n (mod 2^31 − 1)
x_{n+1} = 3934873077 x_n (mod 2^32)

seem to be recommended since their test statistics are all reasonably large for

k = 2, ..., 7.

Generating Random Numbers from Non-Uniform Continuous Distributions

By far the simplest and most common method for generating non-uniform variates is based on the inverse cumulative distribution function. For an arbitrary cumulative distribution function F(x), define F^{-1}(y) = min{x; F(x) ≥ y}. This defines a pseudo-inverse function which is a real inverse (i.e. F(F^{-1}(y)) = F^{-1}(F(y)) = y) only in the case that the cumulative distribution function is continuous and strictly increasing. However, in the general case of a possibly discontinuous non-decreasing cumulative distribution function, the function continues to enjoy some of the properties of an inverse. Notice that F^{-1}(F(x)) ≤ x and F(F^{-1}(y)) ≥ y, but F^{-1}(F(F^{-1}(y))) = F^{-1}(y) and F(F^{-1}(F(x))) = F(x). In the general case, when this pseudo-inverse is easily obtained, we may use the following to generate a random variable with cumulative distribution function F(x).

Theorem 19 (inverse transform) If F is an arbitrary cumulative distribution function and U is uniform[0, 1], then X = F^{-1}(U) has cumulative distribution function F(x).

Proof. The proof is a simple consequence of the fact that

[U < F(x)] ⊂ [X ≤ x] ⊂ [U ≤ F(x)] for all x,          (3.5)

evident from Figure 3.5. Taking probabilities throughout (3.5), and using the continuity of the distribution of U so that P[U = F(x)] = 0, we obtain

F(x) ≤ P[X ≤ x] ≤ F(x).

Examples of Inverse Transform

Exponential (Īø)

This distribution, a special case of the gamma distributions, is common in most

applications of probability. For example in risk management, it is common to

model the time between defaults on a contract as exponential (so the default

times follow a Poisson process). In this case the probability density function is

118 CHAPTER 3. BASIC MONTE CARLO METHODS

Figure 3.5: The Inverse Transform generator

f(x) = (1/θ)e^{−x/θ}, x ≥ 0, and f(x) = 0 for x < 0. The cumulative distribution function is F(x) = 1 − e^{−x/θ}, x ≥ 0. Then taking its inverse,

X = −θ ln(1 − U), or equivalently
X = −θ ln U, since U and 1 − U have the same distribution.

In Matlab, the exponential random number generator is called exprnd and in Splus or R it is rexp.
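For illustration, this inverse transform takes a single line of code. The following sketch is in Python rather than the Matlab or R routines just named, and the function name exprnd_inv is ours, not a library routine.

```python
import math
import random

def exprnd_inv(theta, n):
    """Generate n exponential(theta) variates by inverse transform:
    X = -theta * ln(1 - U), with U uniform on [0, 1)."""
    return [-theta * math.log(1.0 - random.random()) for _ in range(n)]
```

Using 1 − U inside the logarithm avoids log(0), since random.random() returns values in [0, 1).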

Cauchy (a, b)

This distribution is a member of the stable family of distributions, which we discuss later. It is similar to the normal but substantially more peaked in the center and with more area in the extreme tails of the distribution. The probability density function is

f(x) = b / (π(b² + (x − a)²)),   −∞ < x < ∞.

See the comparison of the probability density functions in Figure 3.6. Here

we have chosen the second (scale) parameter b for the Cauchy so that the two

densities would match at the point x = a = 0.


Figure 3.6: The Normal and the Cauchy Probability Density Functions

The cumulative distribution function is F(x) = 1/2 + (1/π) arctan((x − a)/b). Then the inverse transform generator is, for U uniform on [0, 1],

X = a + b tan{π(U − 1/2)}, or equivalently X = a + b/tan(πU),

where the second expression follows on replacing U by 1 − U (which has the same distribution) and using tan(π(1/2 − x)) = (tan πx)^{−1}.
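A Python sketch of the Cauchy generator (the function name is ours, not a library routine):

```python
import math
import random

def cauchy_rnd(a, b, n):
    """Generate n Cauchy(a, b) variates via X = a + b*tan(pi*(U - 1/2))."""
    return [a + b * math.tan(math.pi * (random.random() - 0.5)) for _ in range(n)]
```

Since the Cauchy distribution has no mean, any sanity check should be based on the sample median, which estimates a, rather than on the sample average.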

Geometric (p)

This is a discrete distribution which describes the number of (independent) trials

necessary to achieve a single success when the probability of a success on each

trial is p. The probability function is

f(x) = p(1 − p)^{x−1}, x = 1, 2, 3, ....

and the cumulative distribution function is

F(x) = P[X ≤ x] = 1 − (1 − p)^{[x]}, x ≥ 0,


where [x] denotes the integer part of x. To invert the cumulative distribution

function of a discrete distribution like this one, we need to refer to a graph of the

cumulative distribution function analogous to Figure 3.5. We wish to output

an integer value of x which satisļ¬es the inequalities

F(x − 1) < U ≤ F(x).

Solving these inequalities for integer x, we obtain

1 − (1 − p)^{x−1} < U ≤ 1 − (1 − p)^x
(1 − p)^{x−1} > 1 − U ≥ (1 − p)^x
(x − 1) ln(1 − p) > ln(1 − U) ≥ x ln(1 − p)
(x − 1) < ln(1 − U)/ln(1 − p) ≤ x.

Note that the direction of the inequality changed each time we multiplied or divided by a negative quantity. We should therefore choose X to be the smallest integer greater than or equal to ln(1 − U)/ln(1 − p), or equivalently

X = 1 + [ln(1 − U)/ln(1 − p)]   or   X = 1 + [−E/ln(1 − p)],

where we write −ln(1 − U) = E, an exponential(1) random variable. In Matlab, the geometric random number generator is called geornd and in R or Splus it is called rgeom.
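The closed form for X translates directly into code; here is a Python sketch (function name ours):

```python
import math
import random

def geom_rnd(p, n):
    """Generate n geometric(p) variates on {1, 2, 3, ...} via
    X = 1 + [ln(1 - U)/ln(1 - p)], where [.] is the integer part."""
    return [1 + int(math.log(1.0 - random.random()) / math.log(1.0 - p))
            for _ in range(n)]
```

Both logarithms are negative, so the ratio is nonnegative and Python's int() truncation coincides with the integer part [·].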

Pareto (a, b)

This is one of the simpler families of distributions used in econometrics for

modeling quantities with lower bound b.

F(x) = 1 − (b/x)^a, for x ≥ b > 0.

Then the probability density function is

f(x) = ab^a / x^{a+1}, x ≥ b,


and the mean is E(X) = ab/(a − 1) for a > 1. The inverse transform in this case results in

X = b/(1 − U)^{1/a}   or   X = b/U^{1/a}.

The special case b = 1 is often considered, in which case the cumulative distribution function takes the form

F(x) = 1 − 1/x^a

and the inverse is

X = (1 − U)^{−1/a}.
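In Python, inverting F(x) = 1 − (b/x)^a gives X = b/U^{1/a}; a sketch (function name ours):

```python
import random

def pareto_rnd(a, b, n):
    """Generate n Pareto(a, b) variates via X = b / U**(1/a),
    using 1 - U (also uniform) so the denominator is never zero."""
    return [b / (1.0 - random.random()) ** (1.0 / a) for _ in range(n)]
```

Every variate is at least the lower bound b, and for a > 1 the sample mean estimates ab/(a − 1).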

Logistic

This is again a distribution with shape similar to the normal, but closer to it than is the Cauchy. Indeed, as can be seen in Figure 3.7, the two densities are almost

indistinguishable, except that the logistic is very slightly more peaked in the

center and has slightly more weight in the tails. Again in this graph, parameters

have been chosen so that the densities match at the center.

The logistic cumulative distribution function is

F(x) = 1 / (1 + exp{−(x − a)/b}),

and on taking its inverse, the logistic generator is

X = a + b ln(U/(1 ā’ U )).
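A Python sketch of the logistic generator (function name ours):

```python
import math
import random

def logistic_rnd(a, b, n):
    """Generate n logistic(a, b) variates via X = a + b*ln(U/(1 - U))."""
    out = []
    for _ in range(n):
        u = random.random()
        while u == 0.0:          # avoid log(0); random() lies in [0, 1)
            u = random.random()
        out.append(a + b * math.log(u / (1.0 - u)))
    return out
```

The sample median estimates the location parameter a.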

Extreme Value

This is one of three possible distributions for modelling extreme statistics such as

the largest observation in a very large random sample. As a result it is relevant

to risk management. The cumulative distribution function is for parameters

−∞ < a < ∞ and b > 0,

F(x) = 1 − exp{−exp((x − a)/b)}.


Figure 3.7: Comparison of the Standard Normal and Logistic(0.625) Probability

density functions.

The corresponding inverse is

X = a + b ln(−ln(1 − U)), or equivalently X = a + b ln(−ln U).
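A Python sketch of this generator (function name ours), using X = a + b ln(−ln U), which follows on solving U = F(X) for X and replacing 1 − U by U:

```python
import math
import random

def extreme_value_rnd(a, b, n):
    """Generate n variates with F(x) = 1 - exp(-exp((x - a)/b))
    via X = a + b*ln(-ln(U)), with U uniform on (0, 1)."""
    out = []
    for _ in range(n):
        u = random.random()
        while u == 0.0:          # avoid log(0)
            u = random.random()
        out.append(a + b * math.log(-math.log(u)))
    return out
```

A convenient check: setting F(m) = 1/2 gives the closed-form median m = a + b ln(ln 2).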

Weibull Distribution

In this case the parameters a, b are both positive and the cumulative distribution

function is

F(x) = 1 − exp{−ax^b} for x ≥ 0.

The corresponding probability density function is

f(x) = abx^{b−1} exp{−ax^b}.

Then using inverse transform we may generate X as

X = {−ln(1 − U)/a}^{1/b}.
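A Python sketch of this Weibull generator (function name ours; note that it uses the parameterization F(x) = 1 − exp{−ax^b} adopted here, which differs from some libraries):

```python
import math
import random

def weib_rnd(a, b, n):
    """Generate n Weibull variates with F(x) = 1 - exp(-a*x**b)
    via X = (-ln(1 - U)/a)**(1/b)."""
    return [(-math.log(1.0 - random.random()) / a) ** (1.0 / b)
            for _ in range(n)]
```

Under this parameterization the median is (ln 2 / a)^{1/b}, which gives a quick distributional check.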


Student's t.

The Student t distribution is used to construct conļ¬dence intervals and tests for

the mean of normal populations. It also serves as a wider-tailed alternative to

the normal, useful for modelling returns which have moderately large outliers.

The probability density function takes the form

f(x) = [Γ((v + 1)/2) / (√(vπ) Γ(v/2))] (1 + x²/v)^{−(v+1)/2},   −∞ < x < ∞.

The case v = 1 corresponds to the Cauchy distribution. There are specialized methods of generating random variables with the Student t distribution, to which we will return later. In Matlab, the Student's t generator is called trnd. In general, trnd(v,m,n) generates an m × n matrix of Student's t random variables having v degrees of freedom.

The generators of certain distributions are as described below. In each case

a vector of length n with the associated parameter values is generated.

DISTRIBUTION   R and SPLUS          MATLAB
normal         rnorm(n, µ, σ)       normrnd(µ, σ, 1, n), or randn(1, n) if µ = 0, σ = 1
Student's t    rt(n, ν)             trnd(ν, 1, n)
exponential    rexp(n, λ)           exprnd(λ, 1, n)
uniform        runif(n, a, b)       unifrnd(a, b, 1, n), or rand(1, n) if a = 0, b = 1
Weibull        rweibull(n, a, b)    weibrnd(a, b, 1, n)
gamma          rgamma(n, a, b)      gamrnd(a, b, 1, n)
Cauchy         rcauchy(n, a, b)     a + b*trnd(1, 1, n)
binomial       rbinom(n, m, p)      binornd(m, p, 1, n)
Poisson        rpois(n, λ)          poissrnd(λ, 1, n)

TABLE 3.3: Some Random Number Generators in R, SPLUS and MATLAB

Inversion performs reasonably well for any distribution for which both the cumulative distribution function and its inverse can be found in closed form and computed reasonably efficiently. This includes the Weibull, the logistic distribution, and most discrete distributions with a small number of possible values. However, for other distributions such as the Normal, Student's t, the chi-squared, or the Poisson or Binomial with large parameter values, other more specialized methods are usually used, some of which we discuss later.

When the cumulative distribution function is known but not easily inverted, we might attempt to invert it by numerical methods. For example, using the Newton-Raphson method, we would iterate until convergence the equation

X = X − (F(X) − U)/f(X)    (3.6)

with f(X) = F′(X), beginning with a good approximation to X. For example we might choose the initial value of X = X(U) by using an easily inverted approximation to the true function F(X). The disadvantage of this approach is that each X generated requires an iterative solution to an equation, and this is computationally very expensive.
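To make the iteration (3.6) concrete, the following Python sketch inverts the standard normal cumulative distribution function, which has no closed-form inverse but can be evaluated through the error function. The function names are ours, and the crude starting value X = 0 stands in for the "easily inverted approximation" mentioned above.

```python
import math

def norm_cdf(x):
    """Standard normal cumulative distribution function F(x)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def norm_pdf(x):
    """Standard normal density f(x) = F'(x)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def newton_inverse(u, tol=1e-12, max_iter=100):
    """Solve F(X) = u by iterating X = X - (F(X) - u)/f(X), cf. (3.6)."""
    x = 0.0
    for _ in range(max_iter):
        step = (norm_cdf(x) - u) / norm_pdf(x)
        x -= step
        if abs(step) < tol:
            break
    return x
```

Each generated variate costs several evaluations of F and f, which is exactly the computational expense referred to above.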

The Acceptance-Rejection Method

Suppose F(x) is a cumulative distribution function and f(x) is the corresponding probability density function. In this case F is continuous and strictly increasing wherever f is positive and so it has a well-defined inverse F^{-1}. Consider the transformation of a point (u, v) in the unit square defined by

x(u, v) = F^{-1}(u)
y(u, v) = v f(F^{-1}(u)) = v f(x)

for 0 < u < 1, 0 < v < 1.

This maps a random point (U, V ) uniformly distributed on the unit square into

a point (X, Y ) uniformly distributed under the graph of the probability density


f. The fact that X has cumulative distribution function F follows from its definition as X = F^{-1}(U) and the inverse transform theorem. By the definition of Y = V f(X), with V uniform on [0, 1], we see that the conditional distribution of Y given the value of X is uniform on the interval [0, f(X)]. Suppose we seek a random number generator for the distribution of X but we are unable to easily invert the cumulative distribution function. We can nevertheless use the result

that the point (X, Y ) is uniform under the density as the basis for one of the

simplest yet most useful methods of generating non-uniform variates, the rejection or acceptance-rejection method. It is based on the following very simple

relationship governing random points under probability density functions.

Theorem 20 (Acceptance-Rejection) (X, Y ) is uniformly distributed in the

region between the probability density function y = f (x) and the axis y = 0 if

and only if the marginal distribution of X has density f (x) and the conditional

distribution of Y given X is uniform on [0, f (X)].

Proof. If a point (X, Y) is uniformly distributed under the graph of f(x), notice that the probability P[a < X < b] is proportional to the area under the graph between vertical lines at x = a and x = b. In other words, P[a < X < b] is proportional to ∫_a^b f(x)dx. This implies that f(x) is proportional to the probability density function of X, and provided that ∫_{−∞}^{∞} f(x)dx = 1, f(x) is the probability density function of X. The converse and the rest of the proof are similar.

Even if the scaling constant for a probability density function is unavailable, in other words if we know f(x) only up to some unknown scale multiple, we can still use Theorem 19 to generate a random variable with probability density f, because the X coordinate of a random point uniformly distributed under the graph of a constant × f(x) is the same as that of a random point uniformly distributed under the graph of f(x). The acceptance-rejection method works as follows. We wish to generate a random variable from the probability density function f(x).


We need the following ingredients:

• A probability density function g(x) with the properties that

1. the corresponding cumulative distribution function G(x) = ∫_{−∞}^{x} g(z)dz is easily inverted to obtain G^{-1}(u);

2. sup{f(x)/g(x); −∞ < x < ∞} < ∞.    (3.7)

For reasonable efficiency we would like the supremum in (3.7) to be as close as possible to one (it is always greater than or equal to one).

The condition (3.7) allows us to find a constant c > 1 such that f(x) ≤ cg(x) for all x. Suppose we are able to generate a point (X, Y) uniformly distributed under the graph of cg(x). This is easy to do using Theorem 19. Indeed we can define X = G^{-1}(U) and Y = V × cg(X), where U and V are independent U[0, 1]. Can we now find a point (X, Y) which is uniformly distributed under the graph of f(x)? Since this is a subset of the original region, this is easy. We simply test the point we have already generated to see if it is in this smaller region and, if so, we use it. If not, we start over, generating a new pair (X, Y) and repeating this until the condition Y ≤ f(X) is eventually satisfied (see Figure 3.8). The simplest version of this algorithm corresponds to the case when g(x) is a uniform density on an interval [a, b]. In algorithmic form, the acceptance-rejection method is:

1. Generate a random variable X = G^{-1}(U), where U is uniform on [0, 1].

2. Generate an independent V ~ U[0, 1].

3. If V ≤ f(X)/(cg(X)), then return X and exit.

4. ELSE go to step 1.


Figure 3.8: The Acceptance-Rejection Method

The rejection method is useful if the density g is considerably simpler than f, both to evaluate and to generate from, and if the constant c is close to 1. The number of iterations through the above loop until we exit at step 3 has a geometric distribution with parameter p = 1/c and mean c, so when c is large the rejection method is not very effective.
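As a minimal illustration of the algorithm, the Python sketch below generates from f(x) = 6x(1 − x) on [0, 1] (a Beta(2,2) density, chosen by us as an example) with g uniform on [0, 1], so that c = sup f/g = 1.5 and on average one candidate in three is rejected:

```python
import random

def beta22_rnd(n):
    """Acceptance-rejection for f(x) = 6x(1 - x) on [0, 1]:
    g uniform on [0, 1], c = 1.5, accept X when V <= f(X)/(c*g(X))."""
    c = 1.5
    out = []
    while len(out) < n:
        x = random.random()                  # step 1: X = G^{-1}(U), G uniform
        v = random.random()                  # step 2: independent V ~ U[0, 1]
        if v <= 6.0 * x * (1.0 - x) / c:     # step 3: accept and keep X
            out.append(x)
        # otherwise reject and return to step 1
    return out
```

The expected number of candidate pairs per accepted variate is c = 1.5, consistent with the geometric iteration count noted above.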

Most schemes for generating non-uniform variates are based on a transformation of uniform variates, with or without some rejection step. The rejection algorithm is a special case. Suppose, for example, that T = (u(x, y), v(x, y)) is a one-to-one area-preserving transformation of the region −∞ < x < ∞, 0 < y < f(x), into a subset A of a square in R², as shown in Figure 3.9.

Notice that any such transformation deļ¬nes a random number generator for

the density f (x). We need only generate a point (U, V ) uniformly distributed

in the set A by acceptance-rejection and then apply the inverse transformation T^{-1} to this point, defining (X, Y) = T^{-1}(U, V). Since the transformation is

area-preserving, the point (X, Y ) is uniformly distributed under the probability

density function f (x) and so the ļ¬rst coordinate X will then have density f . We

can think of inversion as a mapping on [0, 1] and acceptance-rejection algorithms


Figure 3.9: T(x, y) is an area-preserving invertible map from the region under the graph of f into the set A, a subset of a rectangle.

as area-preserving mappings on [0, 1]².

The most common distribution required for simulations in finance and elsewhere is the normal distribution. The following theorem provides the simple

connections between the normal distribution in Cartesian and in polar coordi-

nates.

Theorem 21 If (X, Y) are independent standard normal variates, then expressed in polar coordinates,

(R, Θ) = (√(X² + Y²), arctan(Y/X))    (3.8)

are independent random variables. R = √(X² + Y²) has the distribution of the square root of a chi-squared(2) or exponential(2) variable. Θ = arctan(Y/X) has the uniform distribution on [0, 2π].

It is easy to show that if (X, Y) are independent standard normal variates, then √(X² + Y²) has the distribution of the square root of a chi-squared(2) (i.e. exponential(2)) variable and arctan(Y/X) is uniform on [0, 2π]. The proof of this result is left as a problem.

This observation is the basis of two related popular normal pseudo-random number generators. The Box-Muller algorithm uses two uniform[0, 1] variates U, V to generate R and Θ with the above distributions as

R = {−2 ln(U)}^{1/2},   Θ = 2πV    (3.9)

and then defines two independent normal(0,1) variates as

(X, Y) = R(cos Θ, sin Θ).    (3.10)

Note that normal variates must be generated in pairs, which makes simulations involving an even number of normal variates convenient. If an odd number is required, we generate one more than needed and discard one.
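A Python sketch of (3.9) and (3.10) (the function name is ours; 1 − U is used inside the logarithm so its argument stays positive):

```python
import math
import random

def box_muller(n_pairs):
    """Generate 2*n_pairs independent normal(0, 1) variates via
    R = sqrt(-2 ln U), Theta = 2*pi*V, (X, Y) = R*(cos Theta, sin Theta)."""
    out = []
    for _ in range(n_pairs):
        r = math.sqrt(-2.0 * math.log(1.0 - random.random()))
        theta = 2.0 * math.pi * random.random()
        out.append(r * math.cos(theta))
        out.append(r * math.sin(theta))
    return out
```

Each pass through the loop consumes one (U, V) pair and produces the pair of variates (3.10), illustrating why the variates come two at a time.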

Theorem 22 (Box-Muller Normal Random Number Generator)

Suppose (R, Θ) are independent random variables such that R² has an exponential distribution with mean 2 and Θ has a uniform[0, 2π] distribution. Then (X, Y) = (R cos Θ, R sin Θ) is distributed as a pair of independent normal variates.

Proof. Since R² has an exponential distribution, R has probability density function

f_R(r) = (d/dr) P[R ≤ r]
       = (d/dr) P[R² ≤ r²]
       = (d/dr) (1 − e^{−r²/2})
       = r e^{−r²/2}, for r > 0,

and Θ has probability density function f_Θ(θ) = 1/(2π) for 0 < θ < 2π. Since r = r(x, y) = √(x² + y²) and θ(x, y) = arctan(y/x), the Jacobian of the transformation is

|∂(r, θ)/∂(x, y)| = | det( ∂r/∂x   ∂r/∂y ; ∂θ/∂x   ∂θ/∂y ) |

                  = | det( x/√(x² + y²)   y/√(x² + y²) ; −y/(x² + y²)   x/(x² + y²) ) |

                  = 1/√(x² + y²)
