<< стр. 10(всего 11)СОДЕРЖАНИЕ >>

the expectation conditional (as always) on the value of the open O. The eп¬Ђect
of a knock-out lower barrier at Oeв€’a is essentially the same but with b replaced
by a, namely
a(a + ln(C/O))
E[П€(C)(1 в€’ exp{в€’2 })].
Пѓ2T

In the next section we consider the eп¬Ђect of two barriers, both an upper and a
lower barrier.

One Process, Two barriers.

We have discussed a simple device above for generating jointly the high and the
close or the low and the close of a process given the value of the open. The joint
distribution of H, L, C given the value of O or the distribution of C in the case
of upper and lower barriers is more problematic. Consider a single factor model
and two barriers- an upper and a lower barrier. Note that the high and the
low in any given interval is dependent, but if we simulate a path in relatively
short segments, by п¬Ѓrst generating n increments and then generating the highs
and lows within each increment, then there is an extremely low probability
that the high and low of the process will both lie in the same short increment.
For example for a Brownian motion with the time interval partitioned into 5
equal subintervals, the probability that the high and low both occur in the
same increment is less than around 0.011 whatever the drift. If we increase the
number of subintervals to 10, this is around 0.0008. This indicates that provided
we are willing to simulate highs, lows and close in ten subintervals, pretending
SIMULATING BARRIER AND LOOKBACK OPTIONS 287

that within subintervals the highs and lows are conditionally independent, the
error in our approximation is very small.

An alternative, more computationally intensive, is to diп¬Ђerentiate the inп¬Ѓnite
series expression for the probability P (H В· b, L в‰Ґ a, C = u|O = 0] A п¬Ѓrst step
in this direction is the the following result, obtained from the reп¬‚ection principle
with two barriers.

Theorem 47 For a Brownian motion process

dSt = Вµdt + dWt , S0 = 0

deп¬Ѓned on [0, 1] and for в€’a < u < b,

P (L < в€’a or H > b|C = u)
в€ћ
1X
[П†{2n(a + b) + u} + П†{2n(a + b) в€’ 2a в€’ u}
=
П†(u) n=1

+ П†{в€’2n(b + a) + u} + П†{2n(b + a) + 2a + u}]

where П† is the N (0, 1) probability density function.

Proof. The proof is a well-known application of the reп¬‚ection principal.
It is suп¬ѓcient to prove the result in the case Вµ = 0 since the conditional
distribution of L, H given C does not depend on Вµ (A statistician would say
that C is a suп¬ѓcient statistic for the drift parameter). Denote the following
paths determined by their behaviour on 0 < t < 1. All paths are assumed to
end at C = u.
288 CHAPTER 5. SIMULATING THE VALUE OF OPTIONS

= H > b (path goes above b)
A+1
path goes above b and then falls below в€’a
=
A+2
goes above b then falls below в€’a then rises above b
=
A+3
etc.
L < в€’a
=
Aв€’1
path falls below в€’a then rises above b
=
Aв€’2
falls below в€’a then rises above b then falls below в€’a
=
Aв€’3
etc.
For an arbitrary event A, denote by P (A|u) probability of the event conditional
on C = u. Then according to the reп¬‚ection principal the probability that the
Brownian motion leaves the interval [в€’a, b] is given from an inclusion-exclusion
argument by

P (A+1 |u) в€’ P (A+2 |u) + P (A+3 |u) в€’ В· В· В· (5.30)

+P (Aв€’1 |u) в€’ P (Aв€’2 |u) + P (Aв€’3 |u) В· В· В·

This can be veriп¬Ѓed by considering the paths in Figure 5.7. (It should be noted
here that, as in our application of the reп¬‚ection principle in the one-barrier case,
the reп¬‚ection principle allows us to show that the number of paths in two sets is
the same, and this really only translates to probability in the case of a discrete
sample space, for example a simple random walk that jumps up or down by a
п¬Ѓxed amount in discrete time steps. This result for Brownian motion obtains if
we take a limit over a sequence of simple random walks approaching a Brownian
motion process.)
Note that

П†(2b в€’ u)
P (A+1 |u) =
П†(u)
П†{2n(a + b) + u}
P (A+2n |u) =
П†(u)
П†{2n(a + b) в€’ 2a в€’ u}
P (A+(2nв€’1) |u) =
П†(u)
SIMULATING BARRIER AND LOOKBACK OPTIONS 289

Figure 5.7: The Reп¬‚ection principle with Two Barriers

and

П†(в€’2a в€’ u)
P (Aв€’1 |u) =
П†(u)
П†{в€’2n(b + a) + u}
P (Aв€’2n |u) =
П†(u)
П†{2n(b + a) + 2a + u}
P (Aв€’(2n+1) |u) = .
П†(u)

The result then obtains from substitution in (5.30).

As a consequence of this result we can obtain an expression for P (a < L В·
H < b, u < C < v) (see also Billingsley, (1968), p. 79) for a Brownian motion
on [0, 1] with zero drift:

P (a, b, u, v) = P (a < L В· H < b, u < C < v)
в€ћ
X
О¦[v + 2k(b в€’ a)] в€’ О¦[u + 2k(b в€’ a)]
=
k=в€’в€ћ
в€ћ
X
в€’ О¦[2b в€’ u + 2k(b в€’ a)] в€’ О¦[2b в€’ v + 2k(b в€’ a)]. (5.31)
k=в€’в€ћ

where О¦ is the standard normal cumulative distribution function. From (5.31)
we derive the joint density of (L, H, C) by taking the limit P (a, b, u, u + Оґ)/Оґ as
290 CHAPTER 5. SIMULATING THE VALUE OF OPTIONS

Оґ в†’ 0, and taking partial derivatives with respect to a and b:
в€ћ
X
k 2 П†00 [u + 2k(b в€’ a)] в€’ k(1 + k)П†00 [2b в€’ u + 2k(b в€’ a)]
f (a, b, u) = 4
k=в€’в€ћ
в€ћ
X
k П† [u + 2k(b в€’ a)] в€’ k(1 + k)П†00 [2b в€’ u + 2k(b в€’ a)]
2 00
=4
k=1

+ k2 П†00 [u в€’ 2k(b в€’ a)] + k(1 в€’ k)П†00 [2b в€’ u в€’ 2k(b в€’ a)] (5.32)

for a < u < b.
From this it is easy to see that the conditional cumulative distribution func-
tion of L given C = u, H = b is given by on a В· u В· b (where в€’2П†0 (2b в€’ u)
is the joint p.d.f. of H, C) by
в€‚2
в€‚bв€‚v P (a, b, u, v)| v=u
(5.33)
F (a|b, u) = 1 +
2П†0 (2b в€’ u)
в€ћ
X
в€’1
{в€’kП†0 [u + 2k(b в€’ a)] + (1 + k)П†0 [2b в€’ u + 2k(b в€’ a)]
=0
П† (2b в€’ u)
k=1

+ kП†0 [u в€’ 2k(b в€’ a)] + (1 в€’ k)П†0 [2b в€’ u в€’ 2k(b в€’ a)]}

This allows us to simulate both the high and the low, given the open and the
close by п¬Ѓrst simulating the high and the close using в€’2П†0 (2b в€’ u) as the joint
p.d.f. of (H, C) and then simulating the low by inverse transform from the
cumulative distribution function of the form (5.33).

Survivorship Bias

It is quite common for retrospective studies in п¬Ѓnance, medicine and to be
subject to what is often called вЂњsurvivorship biasвЂќ. This is a bias due to the
fact that only those members of a population that remained in a given class
(for example the survivors) remain in the sampling frame for the duration of
the study. In general, if we ignore the вЂњdrop-outsвЂќ from the study, we do so
at risk of introducing substantial bias in our conclusions, and this bias is the
survivorship bias.
SURVIVORSHIP BIAS 291

Suppose for example we have hired a stable of portfolio managers for a large
pension plan. These managers have a responsibility for a given portfolio over
a period of time during which their performance is essentially under continuous
review and they are subject to one of several possible decisions. If returns below
a given threshhold, they are deemed unsatisfactory and п¬Ѓred or converted to
another line of work. Those with exemplary performance are promoted, usually
to an administrative position with little direct п¬Ѓnancial management. And those
between these two вЂњabsorbingвЂќ barriers are retained. After a period of time,
T, an amibitious graduate of an unnamed Ivey league school working out of
head oп¬ѓce wishes to compare performance of those still employed managing
portfolios. How are should the performance measures reп¬‚ect the п¬Ѓltering of
those with unusually good or unusually bad performance? This is an example
of a process with upper and lower absorbing barriers, and it is quite likely
that the actual value of these barriers diп¬Ђers from one employee to another, for
example the son-in-law of the CEO has a substantially diп¬Ђerent barriers than the
math graduate fresh out of UW. However, let us ignore this diп¬Ђerence, at least
for the present, and concentrate on a diп¬Ђerence that is much harder to ignore
in the real world, the diп¬Ђerence between the volatility parameters of portfolios,
possibly in diп¬Ђerent sectors of the market, controlled by diп¬Ђerent managers.
For example suppose two managers were responsible for funds that began and
ended the year at the same level and had approximately the same value for the
lower barrier as in Table 5.2. For each the value of the volatility parameter
Пѓ was estimated using individual historical volatilities and correlations of the
component investments.

Portfolio Open price Close Price Lower Barrier Volatility

56 5
1 40 30 .5
8

56 1
2 40 30 .2
4

Table 5.3
292 CHAPTER 5. SIMULATING THE VALUE OF OPTIONS

Suppose these portfolios (or their managers) have been selected retrospec-
tively from a list of вЂњsurvivorsвЂќ which is such that the low of the portfolio value
never crossed a barrier at l = Oeв€’a (bankruptcy of fund or termination or
demotion of manager, for example) and the high never crossed an upper barrier
at h = Oeb . However, for the moment let us assume that the upper barrier is so
high that its inп¬‚uence can be neglected, so that the only absorbtion with any
substantial probability is at the lower barrier. We interested in the estimate of
return from the two portfolios, and a preliminary estimate indicates a continu-
ously compounded rate of return from portfolio 1 of R1 = ln(56.625/40) = 35%
and from portfolio two of R2 = ln(56.25/40) = 34%. Is this diп¬Ђerence signiп¬Ѓcant
and are these returns reasonably accurate in view of the survivorship bias?
We assume a geometric Brownian motion for both portfolios,

(5.34)
dSt = ВµSt dt + ПѓSt dWt ,

and deп¬Ѓne O = S(0), C = S(T ),

H = max S(t), L = min S(t)
0tT 0tT

with parameters Вµ, Пѓ possibly diп¬Ђerent.
In this case it is quite easy to determine the expected return or the value of
any performance measure dependent on C conditional on survival, since this is
essentially the same as a problem already discussed, the valuation of a barrier
option. According to (5.27), the probability that a given Brownian motion
process having open 0 and close c strikes a barrier placed at l < min(0, c) is

zl
}
exp{в€’2
Пѓ2 T

with

zl = l(l в€’ c).

Converting this statement to the Geometric Brownian motion (5.34), the prob-
ability that a geometric Brownian motion process with open O and close c
SURVIVORSHIP BIAS 293

breaches a lower barrier at l is

zl
P [L В· l|O, C] = exp{в€’2 }
Пѓ2T

with

zl = ln(O/l) ln(C/l) = a(a + ln(C/O)).

Of course the probability that a particular path with this pair of values (O, C)
is a вЂњsurvivorвЂќ is 1 minus this or

zl
1 в€’ exp{в€’2 }. (5.35)
Пѓ2 T

When we observe the returns or the closing prices C of survivors only, the results
have been п¬Ѓltered with probability (5.35). In other words if the probability
density function of C without any barriers at all is f (c) (in our case this is a
lognormal density with parameters that depend on Вµ and Пѓ) then the density
function of C of the survivors in the presence of a lower barrier is proportional
to

ln(O/l) ln(c/l)
f (c)[1 в€’ exp{в€’2 }]
Пѓ2 T
l 2 ln(O/l) 2a
= f (c)(1 в€’ ( )О» ), with О» = = 2 > 0.
2T
c Пѓ ПѓT

It is interesting to note the eп¬Ђect of this adjustment on the moments of C for
various values of the parameters. For example consider the expected value of C
conditional on survival
Rв€ћ l
cf (c)(1 в€’ ( c )О» )dc
l
E(C|L в‰Ґ l] = R в€ћ l
f (c)(1 в€’ ( c )О» )dc
l
E[CI(C в‰Ґ l)] в€’ lО» E[C 1в€’О» I(C в‰Ґ l)]
(5.36)
=
P [C в‰Ґ l] в€’ lО» E[C в€’О» I(C в‰Ґ l)]

and this is easy to evaluate in the case of interest in which C has a lognormal
distribution. In fact the same kind of calculation is used in the development of
the Black-Scholes formula. In our case C = exp(Z) where Z is N (ВµT, Пѓ 2 T )
294 CHAPTER 5. SIMULATING THE VALUE OF OPTIONS

and so for any p and l > 0, we have from (3.11), using the fact that E(C|O) =
O exp{ВµT + Пѓ2 T /2}, (and assuming O is п¬Ѓxed),
в€љ
1
E[C p I(C > l)] = Op exp{pВµT + p2 Пѓ 2 T /2}О¦( в€љ (a + ВµT ) + Пѓ T p)
ПѓT

To keep things slightly less combersome, let us assume that we observe the
geometric Brownian motion for a period of T = 1. Then (5.36) results in
2 2 2
1 1
OeВµ+Пѓ /2
О¦( Пѓ (a + Вµ) + Пѓ) в€’ Oeв€’aО»+(1в€’О»)Вµ+(1в€’О») Пѓ /2 О¦( Пѓ (a + Вµ) + Пѓ(1 в€’ О»))
1 1
О¦( Пѓ (a + Вµ)) в€’ eв€’О»aв€’О»Вµ+О»2 Пѓ2 /2 О¦( Пѓ (a + Вµ) в€’ ПѓО»)

Let there be no bones about it. At п¬Ѓrst blush this is still a truly ugly and
opaque formula. We can attempt to beautify it by re-expressing it in terms more
like those in the Black-Scholes formula, putting

1 1
(Вµ в€’ a), and d2 (0) = (a + Вµ),
d2 (О») =
Пѓ Пѓ
d1 (О») = d2 (О») + Пѓ, d1 (0) = d2 (0) + Пѓ.

These are analogous to the values of d1, d2 in the Black-Scholes formula in the
case О» = 0. Then
2 2 2
eВµ+Пѓ /2
О¦(d1 (0)) в€’ eв€’О»a+(1в€’О»)Вµ+(1в€’О») Пѓ /2 О¦(d1 (О»))
E[C|L в‰Ґ l] = O (5.37)
.
О¦(d2 (0)) в€’ eв€’О»aв€’О»Вµ+О»2 Пѓ2 /2 О¦(d2 (О»))

What is interesting is how this conditional expectation, the expected close for
the survivors, behaves as a function of the volatility parameter Пѓ. Although this
is a rather complicated looking formula, we can get a simpler picture (Figure
5.8) using a graph with the drift parameter Вµ chosen so that E(C) = 56.25 is
held п¬Ѓxed. We assume a = в€’ ln(30/40) (consistent with Table 5.2)and vary
the value of Пѓ over a reasonable range from Пѓ = 0.1 (a very stable investment)
through Пѓ = .8 (a highly volatile investment). In Figure 5.8 notice that for small
volatility, e.g. for Пѓ В· 0.2, the conditional expectation E[C|L в‰Ґ 30] remains
close to its unconditional value E(C) but for Пѓ в‰Ґ 0.3 it increases almost linearly
in Пѓ to around 100 for Пѓ = 0.8. The intuitive reason for this dramatic increase is
quite simple. For large values of Пѓ the process п¬‚uctuates more, and only those
SURVIVORSHIP BIAS 295

Figure 5.8: E[C|L в‰Ґ 30] for various values of (Вµ, Пѓ) chosen such that E(C) =
56.25.

paths with very large values of C have abeen able to avoid the absorbing barrier
at l = 30. Two comparable portfolios with unconditional return about 40%
will show radically diп¬Ђerent apparent returns in the presence of an absorbing
barrier. If Пѓ = 20% then the survivorвЂ™s return will still average around 40%,
but if Пѓ = 0.8, the survivorвЂ™s returns average close to 150%. The practical
implications are compelling. If there is any form of survivorship bias (as there
usually is), no measure of performance should be applied to the returns from
diп¬Ђerent investments, managers, or portfolios without an adjustment for the
risk or volatility.

In the light of this discussion we can return to the comparison of the two
portfolios in Table 5.3. Evidently there is little bias in the estimate of returns
for portfolio 2, since in this case the volatility is small Пѓ = 0.2. However there
is very substantial bias associated with the estimate for portfolio 1, Пѓ = 0.5.
In fact if we repeat the graph of Figure 5.8 assuming that the unconditional
return is around 8% we discover that E[C|L в‰Ґ 30] is very close to 56 5 when
8

Пѓ = 0.5 indicating that this is a more reasonable estimator of the performance
of portfolio 1.
296 CHAPTER 5. SIMULATING THE VALUE OF OPTIONS

Figure 5.9: The Eп¬Ђect of Surivorship bias for a Brownian Motion

For a Brownian motion process it is easy to demonstrate graphically the
nature of the surivorship bias. In Figure 5.9, the points under the graph of
the probability density which are shaded correspond to those which whose low
fell below the absorbing barrier at l = 30. The points in the unshaded region
correspond to the survivors. The expected value of the return conditional on
survival is the mean return (x-cooredinate of the center of mass) of those points
chosen uniformly under the density but above the lower curve, in the region
labelled вЂњsurvivorsвЂќ. Note that if the mean Вµ of the unconditional density
approaches the barrier (here at 30) , this region approaches a narrow band
along the top of the curve and to the right of 30. Similarly if the unconditional
standard deviation or volatility increases, the unshaded region stretches out to
the right in a narrow band and the conditional mean increases.

We arrive at the following seemingly paradoxical conclusions which make it
imperative to adjust for survivorship bias: the conditional mean, conditional on
survivorship, may increase as the volatility increases even if the unconditional
SURVIVORSHIP BIAS 297

mean decreases.
Let us return to the problem with both an upper and lower barrier and
consider the distribution of returns conditional on the low never passing a barrier
Oeв€’a and the high never crossing a barrier at Oeb ( representing a fund buyout,
recruitment of manager by competitor or promotion of fund manager to Vice
President). It is common in process control to have an upper and lower barrier
and to intervene if either is crossed, so we might wish to study those processes
for which no intervention was required. Similarly, in a retrospective study we
may only be able to determine the trajectory of a particle which has not left
a given region and been lost to us. Again as an example, we use the following
data on two portfolio managers, both observed conditional on survival, for a
period of one year.

Portfolio Open price Close Price Lower Barrier Upper Barrier Volatility

56 5
1 40 30 100 .5
8

56 1
2 40 30 100 .2
4

If П† denotes the standard normal p.d.f., then the conditional probability
density function of ln(C/O) given that Oeв€’a < L < H < Oeb is proportional
uв€’Вµ
1
to where, as before
Пѓ П†( Пѓ )w(u)

2 2 2 2
w(u) = 1 в€’ eв€’2b(bв€’u)/Пѓ + eв€’2(a+b)(a+bв€’u)/Пѓ в€’ eв€’2a(a+u)/Пѓ + eв€’2(a+b)(a+b+u)/Пѓ в€’ E(W ),
в€’ ln(L)
ln(H) b a
where W = I[f rac1( ], and
)> ] + I[f rac1( )>
a+b a+b a+b a+b
a = в€’ln(30/40).
b = ln(100/40),

The expected return conditional on survival when the drift is Вµ is given by
Z b
uв€’Вµ
1
E(ln(C/O)|30 < L < H < 100) = uw(u)П†( )du.
Пѓ Пѓ
в€’a
298 CHAPTER 5. SIMULATING THE VALUE OF OPTIONS

where w(u) is the weight function above. Therefore a moment estimator of the
drift for the two portfolios is determined by setting this expected return equal
to the observed return, and solving for Вµi the equation
Z
1b u в€’ Вµi
uw(u)П†( )du = Ri , i = 1, 2.
Пѓi в€’a Пѓi
The solution is, for portfolio 1, Вµ1 = 0 and for portfolio 2, Вµ2 = 0.3. Thus the
observed values of C are completely consistent with a drift of 30% per annum
for portfolio 2 and a zero drift for portfolio 1. The bias again very strongly
eп¬Ђects the portfolio with the greater volatility and estimators of drift should
account for this substantial bias. Ignoring the survivorship bias has led in the
mutual funds.

Problems

1. If the values of dj are equally spaced, i.e. if dj = jв€†, j = ..., в€’2, в€’1, 0, 1, ...and
with S0 = 0, ST = C and M = max(S0 , ST ), show that
P [C > u and Cв€’M is even]
в€†
E[H|C = u] = M + в€† .
P [C = u]

2. Let W (t) be a standard Brownian motion on [0, 1] with W0 = 0. Deп¬Ѓne
C = W (1) and H = max{W (t); 0 В· t В· 1}. Show that the joint probabil-
ity density function of (C, H) is given by

f (c, h) = 2П†(c)(2h в€’ c)eв€’2h(hв€’c) , for h > max(0, c)

where П†(c) is the standard normal probability density function.

3. Use the results of Problem 2 to show that the joint probability density
function of the random variables

Y = exp{в€’(2H в€’ C)2 /2}

and C is a uniform density on the region {(x, y); y < exp(x2 /2)}.
PROBLEMS 299

4. Let X(t) be a Brownian motion on [0, 1], i.e. Xt satisп¬Ѓes

dXt = Вµdt + ПѓdWt , and X0 = 0.

Deп¬Ѓne C = X(1) and H = max{X(t); 0 В· t В· 1}. Find the joint proba-
bility density function of (C, H).
300 CHAPTER 5. SIMULATING THE VALUE OF OPTIONS
Chapter 6

Quasi- Monte Carlo
Multiple Integration

Introduction

In some sense, this chapter п¬Ѓts within Chapter 4 on variance reduction; in
some sense it is stratiп¬Ѓcation run wild. Quasi-Monte Carlo methods are purely
deterministic, numerical analytic methods in the sense that they do not even
attempt to emulate the behaviour of independent uniform random variables, but
rather cover the space in d dimensions with fewer gaps than independent random
variables would normally admit. Although these methods are particularly when
evaluating integrals in moderate dimensions, we return brieп¬‚y to the problem
of evaluating a one-dimension integral of the form

Z 1
f (x)dx.
0

The simplest numerical approximation to this integral consists of choosing a
point xj in the interval [ N , j+1 ], j = 0, 1, ..., N в€’ 1, perhaps the midpoint of the
j
N

301
302 CHAPTER 6. QUASI- MONTE CARLO MULTIPLE INTEGRATION

interval, and then evaluating the average
N в€’1
1X
(6.1)
f (xj ).
N j=0

If the function f has one continuous derivative, such a numerical method with
N equally or approximately equally spaced points will have bias that approaches
0 at the rate 1/N because, putting M = sup{|f 0 (z)|; 0 < z < 1},
Z (j+1)/N
1 1
f (x)dx в€’ f (xj ) В· 2 M (6.2)
N N
j/N

and so summing both sides over j gives
Z Nв€’1
1X
1
1
| f (x)dx в€’ f (xj )| В· M.
N j=0 N
0

We will refer to the error in the numerical integral in this case
Z N в€’1
1X
1
ОµN = | f (x)dx в€’ f (xj )|
N j=0
0

as O(N в€’1 ) which means that the sequence of errors ОµN satisп¬Ѓes

lim sup N в€’1 ОµN < в€ћ
N в†’в€ћ

or intuitively that the errors are bounded by a constant times N в€’1 .
If the function f is known to have bounded derivatives of second or third
order, then integrals can be approximated to an even higher degree of precision.
For example various numerical quadrature formulae permit approximating an
R1
integral of the form 0 f (x)w(x)dx with a weighted average of N points
N
X
(6.3)
wj f (xj )
j=1

in such a way that if f (x) is a polynomial of degree 2N в€’ 1 or less, the approx-
imation is exact. Here the function w(x) is typically some density such as the
uniform, exponential or normal density and the optimal placement of the points
xj as well as the weights wj depends on w(x). Of course a smooth function
INTRODUCTION 303

can be closely approximated with a polynomial of high degree and so numerical
quadrature formulae of the form (6.3) permit approximating a one-dimension
integral arbitrarily closely provided that the function is suп¬ѓciently smooth, i.e.
it has bounded derivatives of suп¬ѓciently high order. We should note that in
this case, the weights wj and the points xj are both deterministic. By contrast,
the Monte Carlo integral
N
1X
b
ОёMC = f (Ui )
N i=1
with N points places these points at random or pseudo-random locations, has
q
b
zero bias but the standard deviation of the estimator var(ОёMC ) is a constant
в€љ
multiple of 1/ N . The Central Limit theorem assures us that
Z1
b
N 1/2 (ОёMC в€’ f (x)dx)
0

converges to a normal distribution which means that the error is order (in prob-
ability) N в€’1/2 . Note that there is a change in our measure of the size of an
error, since only the variance or standard deviation of a given term in the se-
quence of errors is bounded, not the whole sequence of errors ОµN . In particular
b
if a pseudo-random estimator Оё satisп¬Ѓes
Z1
b f (x)dx)2 = O(N в€’2k )
E(Оё в€’
0

then we say that the error is OP (N в€’k ) where OP denotes вЂњorder in probabilityвЂќ.
This is clearly a weaker notion than O(N в€’k ). Even the simplest numerical
integral (6.1) has a faster rate of convergence then that of the Monte Carlo
integral with or without use of the variance reduction techniques of Chapter 4.
This is a large part of the reason numerical integration is usually preferred to
Monte Carlo methods in one dimension, at least for smooth functions, but it
also indicates that for regular integrands, there is room for improvement over
Monte Carlo in higher dimensions as well.
The situation changes in 2 dimensions. Suppose we wish to distribute N
points over a uniform lattice in some region such as the unit square. One
304 CHAPTER 6. QUASI- MONTE CARLO MULTIPLE INTEGRATION

possible placement is to points of the form
в€љ
i j
( в€љ , в€љ ), i, j = 1, 2, ... N
N N
в€љ
assuming for convenience of notation that N is integer. The distance between
в€љ
adjacent points is of order 1/ N and by an argument akin to (6.2), the bias in
в€љ
a numerical integral is order 1/ N . This is the now same order as the standard
deviation of a Monte Carlo integral, indicating that the latter is already, in two
dimensions, competitive. When the dimension s в‰Ґ 3, a similar calculation shows
that the standard deviation of the Monte-Carlo method is strictly smaller order
than the error of a numerical integral with weights at lattice points. Essentially,
the placement of points on a lattice for evaluating a dв€’dimensional integral is far
from optimal when d в‰Ґ 2. Indeed various deterministic alternatives called quasi-
random samples provide substantially better estimators especially for smooth
functions of several variables. Quasi-random samples are analogous to equally
spaced points in one dimension and are discussed at length by Niederreiter
(1978), where it is shown that for suп¬ѓciently smooth functions, one can achieve
rates of convergence close to the rate 1/N for the one-dimensional case.
We have seen a number of methods designed to reduce the dimensionality
of the problem. Perhaps the most important of these is conditioning, which can
reduce an dв€’dimensional integral to a one-dimensional one. In the multidimen-
sional case, variance reduction has an increased importance because of the high
variability induced by the dimensionality of crude methods. The other vari-
ance reduction techniques such as regression and stratiп¬Ѓcation carry over to the
multivariable problem with little change, except for the increased complexity of
determining a reasonable stratiп¬Ѓcation in such problems.

Errors in numerical Integration

We consider the problem of numerical integration in d dimensions. For d = 1
classical integration methods, like the trapezoidal rule, are weighted averages of
INTRODUCTION 305

the value of the function at equally spaced points;

Z m
X
1
n
f (u)du в‰€ (6.4)
wn f ( ),
m
0 n=0

where w0 = wm = 1/(2m), and wn = 1/m for 1 В· n В· m в€’ 1. The trape-
zoidal rule is exact for any function that is linear (or piecewise linear between
grid-points) and so we can assess the error of integration by using a linear ap-
proximation through the points ( m , f ( m )) and ( j+1 , f ( j+1 )). Assume
j j
m m

j j+1
<x< .
m m

If the function has a continuous second derivative, we have by TaylorвЂ™s Theorem
that the diп¬Ђerence between the function and its linear interpolant is of order
j2
O(x в€’ i.e.
m) ,

j j j+1 j j
) в€’ f ( )] + O(x в€’ )2 .
) + (x в€’ )m[f (
f (x) = f (
m m m m m
j j+1
Integrating both sides between and notice that
m,
m

Z
f ( j+1 ) + f ( m )
j
(j+1)/m
j j j+1 j m
{f ( ) + (x в€’ )m[f ( ) в€’ f ( )]}dx =
m m m m 2m
j/m

is the area of the trapezoid and the error in the approximation is
Z (j+1)/m
j2
) ) = O(mв€’3 ).
(x в€’
O(
m
j/m

Adding these errors of approximation over the m trapezoids gives O(mв€’2 ). Con-
sequently, the error in the trapezoidal rule approximation is O(mв€’2 ), provided
that f has a continuous second derivative on [0, 1].
We now consider the multidimensional case, d в‰Ґ 2. Suppose we evaluate
the function at all of the (m + 1)d points of the form ( n1 , . . . , ns ) and use this
m m

to approximate the integral. The classical numerical integration methods use a
Cartesian product of one-dimensional integration rules. For example, the d-fold
Cartesian product of the trapezoidal rule is
306 CHAPTER 6. QUASI- MONTE CARLO MULTIPLE INTEGRATION

Z m m
X X n1 ns
f (u)du в‰€ В·В·В· wn1 В· В· В· wns f ( (6.5)
, . . . , ),
m m
[0,1]d n1 =0 ns =0

where [0, 1]d is the closed s-dimensional unit cube and the wn are as before.
The total number of nodes is N = (m + 1)s . From the previous error bound it
follows that the error is O(mв€’2 ), provided that the second partial derivatives of
f are continuous on [0, 1]d . We know that the error cannot be smaller because
when the function depends on only one variable and is constant in the others,
the one-dimensional result is a special case. In terms of the number N of nodes
or function evaluations, since m = O(N 1/d ), the error is O(N в€’2/d ), which,
with increasing dimension d, changes dramatically. For example if we required
N = 100 nodes to achieve a required precision in the case d = 1, to achieve the
same precision for a d = 5 dimensional integral using this approach we would
need to evaluate the function at a total of 100d = 1010 = ten billion nodes. As
the dimension increases, the number of function evaluations or computation
required for a п¬Ѓxed precision increases exponentially. This phenomena is often
called the вЂњcurse of dimensionalityвЂќ, exorcised in part at least by quasi or regular
Monte Carlo methods.

The ordinary Monte Carlo method based on simple random sampling is free
of the curse of dimensionality. By the central limit theorem, even a crude Monte
Carlo estimate for numerical integration yields a probabilistic error bound of the
form OP (N в€’1/2 ) in terms of the number N of nodes (or function evaluations)
and this holds under a very weak regularity condition on the function f . The
remarkable feature here is that this order of magnitude does not depend on the
dimension d. This is true even if the integration domain is complicated. Note
however that the deп¬Ѓnition of вЂњ OвЂќ has changed from one that essentially con-
siders the worst case scenario to OP which measures the average or probabilistic
behaviour of the error.

Some of the oft-cited deп¬Ѓciencies of the Monte Carlo method limiting its
THEORY OF LOW DISCREPANCY SEQUENCES 307

usefulness are:

1. There are only probabilistic error bounds (there is no guarantee that the
expected accuracy is achieved in a particular case -an alternative approach
would optimize the вЂњworst-caseвЂќ behaviour);

2. Regularity of the integrand is not exploited even when it is available. The
probabilistic error bound OP (N в€’1/2 ) holds under a very weak regularity
condition but no extra beneп¬Ѓt is derived from any additional regularity or
smoothness of the integrand. For example the estimator is no more precise
if we know that the function f has several continuous derivatives. In cases
when we do not know whether the integrand is smooth or diп¬Ђerentiable,
it may be preferable to use Monte Carlo since it performs reasonably well
without this assumption.

3. Genuine Monte Carlo is not feasible anyway since generating truly in-
dependent random numbers is virtually impossible. In practice we use
pseudo-random numbers to approximate independence.

Theory of Low discrepancy sequences

The quasi-Monte Carlo method places attention on the objective, approximating
an integral, rather than attempting to imitate the behaviour of independent
uniform random variates. Quasi-random sequences of low discrepancy sequences
would fail all of the tests applied to a pseudo-random number generate except
those testing for uniformity of the marginal distribution because the sequence
is, by construction, autocorrelated. Our objective is to approximate an integral
using a average of the function at N points, and we may adjust the points so that
the approximation is more accurate. Ideally we would prefer these sequences to
be self-avoiding, so that as the sequence is generated, holes are п¬Ѓlled. As usual
308 CHAPTER 6. QUASI- MONTE CARLO MULTIPLE INTEGRATION

we will approximate the integral with an average;
Z N
1X
f (u)du в‰€ (6.6)
f (xn ).
N n=1
d
[0,1]

Quasi Monte-Carlo is able to achieve a deterministic error bound O((logN )d /N )
for suitably chosen sets of nodes and for integrands with a relatively low degree
of regularity, much better than the rate O(N в€’1/2 ) achieved by Monte Carlo
methods. Even smaller error bounds can be achieved for suп¬ѓciently regular
integrands. There are several algorithms or quasi-Monte-Carlo sequences which
give rise to this level of accuracy.

Suppose, as with a crude Monte Carlo estimate, we approximate the integral
with (6.6) with x1 , . . . , xN в€€ [0, 1]d . The sequence x1 , . . . , xN ,... is determinis-
tic (as indeed are the pseudo-random sequences we used for Crude Monte-Carlo),
but they are now chosen so as to guarantee a small error. Points are chosen so
as to achieve the maximal degree of uniformity or a low degree of discrepancy
with a uniform distribution. A п¬Ѓrst requirement for a low discrepancy sequence
is that we obtain convergence of the sequence of averages so that:
Z
N
1X
lim f (xn ) = f (u)du,
Nв†’в€ћ N [0,1]d
n=1

and this should hold for a reasonably large class of integrands. This suggests
that the most desirable sequences of nodes x1 , . . . , xN are вЂњevenly distributedвЂќ
over [0, 1]d . Various notions of discrepancy have been considered as quantitative
measures for the deviation from the uniform distribution but we will introduce
only one here, the so-called вЂњstar-discrepancyвЂќ. The star discrepancy is perhaps
the more natural one in statistics, since it measures the maximum diп¬Ђerence be-
tween the empirical cumulative distribution function of the points {x1 , . . . , xN }
and the uniform distribution of measure on the unit cube. Suppose we construct

N
X
bN (x) = 1 I(xn В· x),
F
N n=1
THEORY OF LOW DISCREPANCY SEQUENCES 309

the empirical cumulative distribution function of the points x1 , . . . , xN , and
compare it with

F (x) =F (x1 , ...xd ) = min(1, x1 x2 ...xd ) if all xi в‰Ґ 0

the theoretical uniform distribution on [0, 1]d . While any measure of the dif-
ference could be used, the star discrepancy is simply the Kolmogorov-Smirnov
distance between these two cumulative distribution functions
# of points in B
b
в€—
DN = sup | FN (x) в€’ F (x)| = sup | в€’ О»(B)|,
N
x B

where the supremum is taken over all rectangles B of the form [0, x1 ] Г— [0, x2 ] Г—
... Г— [0, xd ] and where О»(B) denotes the Lebesgue measure of B in Rd .
It makes intuitive sense that we should choose points {x1 , . . . , xN } such
that the discrepancy is small for each N. This intuition is supported by a
large number of theoretical results, at least in the case of smooth integrands
with smooth partial derivatives. The smoothness is measured using V (f ), a
вЂњtotal variationвЂќ in the sense of Hardy and Krause, intuitively the length of the
monotone segments of f. For a one dimensional function with a continuous п¬Ѓrst
derivative it is simply
Z 1
|f 0 (x)|dx.
V (f ) =
0

In higher dimensions, the Hardy Krause variation may be deп¬Ѓned in terms of
the integral of partial derivatives;

Deп¬Ѓnition 48 Hardy and Krause Total Variation
If f is suп¬ѓciently diп¬Ђerentiable then the variation of f on [0, 1]d in the sense
of Hardy and Krause is
s
X X
V (k) (f ; i1 , . . . , ik ), (6.7)
V (f ) =
k=1 1 i1 <В·В·В·<ik s

where
ВЇ ВЇ
Z Z
ВЇ ВЇ
1 1
в€‚sf
ВЇ ВЇ
(k)
В·В·В· dxi1 В· В· В· dxik . (6.8)
V (f ; i1 , . . . , ik ) = ВЇ в€‚xi В· В· В· в€‚xi ВЇ
0 0 k xj =1,j6=i1 ,...,ik
1
310 CHAPTER 6. QUASI- MONTE CARLO MULTIPLE INTEGRATION

The precision in our approximation to an integral as an average of function
values is closely related to the discrepancy measure as the following result shows.
Indeed the mean of the function values diп¬Ђers from the integral of the function
by an error which is bounded by the product of the discrepancy of the sequence
and the measure V (f ) of smoothness of the function.

Theorem 49 (Koksma-Hlawka inequality)
If f has bounded variation V (f ) on [0, 1]d in the sense of Hardy and Krause,
then, for any x1 , . . . , xN в€€ [0, 1]d , we have
Z
N
1X в€—
| f (xn ) в€’ f (u)du| В· V (f )DN . (6.9)
N n=1 s
I

We do not normally use this inequality as it stands since the evaluation of the
error bound on the right hand side requires determining V (f ), typically a very
diп¬ѓcult task. However this bound allows a separation between the regularity
properties of the integrand and the degree of uniformity of the sequence. We can
guarantee a reasonable approximation for any function f with bounded total
в€—
variation V (f ) by ensuring that the discrepancy of the sequence DN is small.
For this reason, the discrepancy is central to quasi-Monte Carlo integration.
Sequences with small star discrepancy are called low-discrepancy sequences. In
fact since a variety of sequences exist with discrepancy of order

(log N )d
N

as N в†’ в€ћ, the term вЂњlow-discrepancyвЂќ is often reserved for these.

Examples of low discrepancy sequences

Van der Corput Sequence.

In the one dimensional case the best rate of convergence is O(N в€’1 log N ), N в‰Ґ 2.
It is achieved, for example, by the van der Corput sequence, obtained by
EXAMPLES OF LOW DISCREPANCY SEQUENCES 311

reversing the digits in the representation of some sequence of integers in a given
base. Consider one-dimensional case d = 1 and base b = 2. Take the base b
representation of the sequence of natural numbers;

1, 10, 11, 100, 101, 110, 111, 1000, 1001, 1010, 1011, 1100, 1101, ...

P
and then map these into the unit interval [0, 1] so that the integer t ak bk
k=0
Pt
is mapped into the point k=0 ak bв€’kв€’1 . These binary digits are mapped into
(0,1) in the following three steps;

1. Write n using its binary expansion. e.g. 13 = 1(8) + 1(4) + 0(2) + 1(1)
becomes 1101.

2. Reverse the order of the digits. e.g. 1101 becomes 1011.

3. Determine the number that this is the binary decimal expansion for. e.g.
1011 = 1( 1 ) + 0( 1 ) + 1( 1 ) + 1( 16 ) =
1 11
16 .
2 4 8

Thus 1 generates 1/2, 10 generates 0( 1 ) + 1( 1 ), 11 generates 1( 1 ) + 1( 1 )
2 4 2 4

and the sequence of positive integers generates the points. The intervals are
recursively split in half in the sequence 1/2, 1/4, 3/4, 1/8, 5/8, 3/8, 7/8, ... and
the points are fairly evenly spaced for any value for the number of nodes N ,
and perfectly spaced if N is of the form 2k в€’ 1. The star discrepancy of this
sequence is
log N
в€—
DN = O( )
N
which matches the best that is attained for inп¬Ѓnite sequences.

The Halton Sequence

This is simply the multivariate extension of the Van der Corput sequence. In
higher dimensions, say in d dimensions, we choose d distinct primes, b1 , b2 , ...bd
(usually the smallest primes) and generate, from the same integer m , the d
components of the vector using the method described for the Van der Corput
312 CHAPTER 6. QUASI- MONTE CARLO MULTIPLE INTEGRATION

sequence. For example, we consider the case d = 3 and use bases b1 = 2,
b2 = 3, b3 = 5 because these are the smallest three prime numbers. The п¬Ѓrst
few vectors , ( 1 , 1 , 1 ), ( 1 , 2 , 2 ), ( 3 , 1 , 3 ), ...are generated in the table below.
235 435 495

repres п¬Ѓrst repres. second repres third
m
base 2 component base 3 comp base 5 comp
1 1 1/2 1 1/3 1 1/5
2 10 1/4 2 2/3 2 2/5
3 11 3/4 10 1/9 3 3/5
4 100 1/8 11 4/9 4 4/5
5 101 5/8 12 7/9 10 1/25
6 110 3/8 20 2/9 11 6/25
7 111 7/8 21 5/9 12 11/25
9 1000 1/16 22 8/9 13 16/25
10 1001 9/16 100 1/27 14 21/25

Figure 6.1 provides a plot of the п¬Ѓrst 500 points in the above Halton sequence
of dimension 3.
There appears to be greater uniformity than a sequence of random points
would have. Some patterns are discernible on the two dimensional plot of the
п¬Ѓrst 100 points, for example see Figures 6.2 and 6.3.
These п¬Ѓgures can be compared with the plot of 100 pairs of independent
uniform random numbers in Figure 6.4, which seems to show more clustering
and more holes in the point cloud.
These points were generated with the following function for producing the
Halton sequence.
function x=halton(n,s)

%x has dimension n by s and is the п¬Ѓrst n terms of the halton sequence of

%dimension s.

p=primes(s*6); p=p(1:s); x=[];
EXAMPLES OF LOW DISCREPANCY SEQUENCES 313

1

0.8

0.6

0.4

0.2

0
1
0.8 1
0.6 0.8
0.6
0.4
0.4
0.2
0.2
0 0

Figure 6.1: 500 points from a Halton seqnece of dimension 3

for i=1:s

x=[x (corput(n,p(i)))вЂ™];

end

function x=corput(n,b)

% converts integers 1:n to from van der corput number with base b

m=п¬‚oor(log(n)/log(b));

n=1:n; A=[];

for i=0:m

a=rem(n,b); n=(n-a)/b;

A=[A ;a];

end

x=((1./bвЂ™).^(1:(m+1)))*A;

The Halton sequence is a genuine low discrepancy sequence in the sense that

(log N )d
в€—
DN = O( )
N
314 CHAPTER 6. QUASI- MONTE CARLO MULTIPLE INTEGRATION

Halton sequence of dimension 3
1

0.9

0.8

0.7

0.6
second coordinate

0.5

0.4

0.3

0.2

0.1

0
0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1
first coordinate

Figure 6.2: The п¬Ѓrst and second coordinate of 100 points from the Halton
sequence of dimension 3

and the coverage of the unit cube is reasonably uniform for small dimensions.
Unfortunately the notation O() hides a constant multiple, one which, in this
case, depends on the dimension d. Roughly (Niedereiter, 1992), this constant
is asymptotic to dd which grows extremely fast in d. This is one indicator that
for large d, the uniformity of the points degrades rapidly, largely because the
relative sparseness of the primes means that the d0 th prime is very large for d
large. This results in larger holes or gaps in that component of the vector than
we would like. This is evident for example in Figure6.5 where we plot the last
two coordinates of the Halton sequence of dimension 15.

The performance of the Halton sequence is considerably enhanced by per-
muting the coeп¬ѓcients ak prior to mapping into the unit interval as is done by
the Faure sequence.
EXAMPLES OF LOW DISCREPANCY SEQUENCES 315

Figure 6.3: The second and third coordinate of 100 points from the Halton
sequence of dimension 3

Faure Sequence

The Faure sequence is similar to the Halton sequence in that each dimension is
a permutation of a van der Corput sequence; however, the same prime is used
as the base b for each of the components of the vector, and is usually chosen to
be the smallest prime greater than or equal to the dimension (Fox, 1996).
In the Van der Corput sequence we wrote the natural numbers in the form
Pt P
ak bk which was then mapped into the point t ak bв€’kв€’1 in the unit
k=0 k=0

interval. For the Faure sequence we use the same construction but we use
diп¬Ђerent permutations of the coeп¬ѓcients ak for each of the coordinates. In
particular in order to generate the iвЂ™th coordinate we generate the point
t
X
ck bв€’kв€’1
k=0
316 CHAPTER 6. QUASI- MONTE CARLO MULTIPLE INTEGRATION

Figure 6.4: 100 independent U [0, 1] pairs

where
X ВµmВ¶
t
(i в€’ 1)mв€’k am mod b
ck =
k
m=k

Notice that only the last t в€’ k + 1 values of ai are used to generate ck . For
example consider the case d = 2, b = 2. Then the п¬Ѓrst 10 Faure numbers are

0 1/2 1/4 3/4 1/8 5/8 3/8 7/8 1/16 9/16
0 1/2 3/4 1/4 5/8 1/8 3/8 7/8 15/16 7/16

The п¬Ѓrst row corresponds to the Van der Corput numbers and the second row
of obtained from the п¬Ѓrst by permuting the values with the same denominator.
The Faure sequence has better regularity properties than does the Halton
sequence above particularly in high dimensions. However the diп¬Ђerences are by
no means evident from a graph when the dimension is moderate. For example
we plot in Figure 6.6 the 14вЂ™th and 15вЂ™th coordinates of 1000 points from the
Faure sequence of dimension d = 15 for comparison with Figure 6.5.
EXAMPLES OF LOW DISCREPANCY SEQUENCES 317

Figure 6.5: The 14вЂ™th and 15вЂ™th coordinates of the п¬Ѓrst 1000 of a Halton sequence
d = 15

Other suggestions for permuting the digits in a Halton sequence include
using only every l0 th term in the sequence so as to destroy the cycle.

In practice, in order to determine the eп¬Ђect of using one of these low dis-
crepancy sequences we need only substitute such a sequence for the vector of
independent uniform random numbers used by a simulation. For example if we
wished to simulate a process for 10 time periods, then value a call option and
average the results, we could replace the 10 independent uniform random num-
bers that we used to generate one path by an element of the Halton sequence
with d = 10.

Suppose we return brieп¬‚y to the call option example treated in Chapter 3.
The true value of this call option was around 0.4615 according to the Black-
Scholes formula. If however we substitute the Van der Corput sequence for the
sequence of uniform random numbers,
318 CHAPTER 6. QUASI- MONTE CARLO MULTIPLE INTEGRATION

Figure 6.6: The last two coordinates of the п¬Ѓrst 1000 Faure points of dimension
d = 15.

mean(fn(corput(100000,2)))

we obtain an estimate of 0.4614 very close to the correct value. I cannot
compare these estimators using the notion of eп¬ѓciency that we used there, how-
ever, because these low-discrepancy sequences are not random and do not even
attempt to emulate random numbers. Though unable to compare performance
with the variance of an estimator, we can look at the Mean squared error (see
for example Figure 6.8). which shows a faster rate of convergence for Quasi
Monte Carlo equivalent to variance reduction in excess of 100). Galanti & Jung
(1997), report that the Faure sequence suп¬Ђers from the problem of start-up
EXAMPLES OF LOW DISCREPANCY SEQUENCES 319

and especially in high-dimensions and the Faure numbers can exhibit clustering
about zero. In order to reduce this problem, Faure suggests discarding the п¬Ѓrst
b4 в€’ 1 points.

Sobol Sequence

The Sobol sequence is generated using a set of so-called direction numbers
mi
= 1, 2, where the mi are odd positive integers less than 2i . The
vi = 2i , i

values of mi are chosen to satisfy a recurrence relation using the coeп¬ѓcients of
a primitive polynomial in the Galois Field of order 2. A primitive polynomial is
irreducible (i.e. cannot be factored into polynomials of smaller degree) and does
not divide the polynomial xr + 1 for r < 2p в€’ 1. For example the polynomial
x2 + x + 1 has no non-trival factors over the Galois Field of order 2 and it does
divide x3 + 1 but not xr + 1 for r < 3. Corresponding to a primitive polynomial

z p + c1 z pв€’1 + ...cpв€’1 z + cp

is the recursion

mi = 2c1 miв€’1 + 22 c2 miв€’2 + ... + 2p cp miв€’p

where the addition is carried out using binary arithmetic. For the Sobol se-
quence, we then replace the binary digit ak by ak vk .
In the case d = 2, the п¬Ѓrst 10 Sobol numbers are, using irreducible polyno-
mials x + 1 and x3 + x + 1

0 1/2 1/4 3/4 3/8 7/8 1/8 5/8 5/16 13/16
0 1/2 1/4 3/4 1/8 5/8 3/8 7/8 11/16 3/16

Again we plot the last two coordinates for the п¬Ѓrst 1000 points from a Sobol
sequence of dimension d = 15 in Figure 6.7 for comparison with Figures 6.5 and
6.6.
320 CHAPTER 6. QUASI- MONTE CARLO MULTIPLE INTEGRATION

Figure 6.7: The last two coordinates of the п¬Ѓrst 1000 points from a Sobel se-
quence of dimension 15

Although there is a great deal of literature espousing the use of one quasi-
Monte Carlo sequence over another, most results from a particular application
 << стр. 10(всего 11)СОДЕРЖАНИЕ >>