$$\widehat{P_j}(\theta) = \widehat{\alpha}_{jk} + \widehat{\beta}_{jk}'(\theta - \theta_k) \qquad (7.102)$$
This same linear regression provides an estimator of the gradient at $\theta_k$,
$$\widehat{\frac{\partial}{\partial\theta}P_j(\theta_k)} = \widehat{\beta}_{jk}. \qquad (7.103)$$
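As an illustration of the local regression behind (7.102) and (7.103), the sketch below fits a linear model to Monte Carlo prices simulated at points scattered around $\theta_k$; here `simulate_price` is a hypothetical user-supplied pricer for option $j$, so this is a schematic rather than a prescribed implementation.

```python
import numpy as np

def local_linear_fit(simulate_price, theta_k, n_points=20, radius=0.05, seed=0):
    """Fit P_j(theta) ~ alpha + beta'(theta - theta_k) to noisy simulated prices.

    simulate_price(theta) is assumed to return one Monte Carlo price estimate at
    the parameter vector theta.  Returns (alpha_hat, beta_hat, var_alpha_hat)."""
    rng = np.random.default_rng(seed)
    theta_k = np.asarray(theta_k, dtype=float)
    p = theta_k.size
    thetas = theta_k + rng.uniform(-radius, radius, size=(n_points, p))
    prices = np.array([simulate_price(th) for th in thetas])
    # design matrix: an intercept column and the displacements theta - theta_k
    X = np.column_stack([np.ones(n_points), thetas - theta_k])
    coef, res, *_ = np.linalg.lstsq(X, prices, rcond=None)
    alpha_hat, beta_hat = coef[0], coef[1:]
    # crude variance estimate of the intercept from the residual variance
    dof = max(n_points - (p + 1), 1)
    sigma2 = float(res[0]) / dof if res.size else float(np.var(prices - X @ coef))
    var_alpha = sigma2 * np.linalg.inv(X.T @ X)[0, 0]
    return alpha_hat, beta_hat, var_alpha
```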
Unfortunately, simulations at every value of $\theta_k$ provide estimators of the same quantity $P_j(\theta)$, and some of these estimators are clearly more precise than others, so we may wish to combine them into a weighted combination of the values $\widehat{\alpha}_{jk} + \widehat{\beta}_{jk}'(\theta - \theta_k)$, $k = 1, 2, \ldots, K$. The ideal weights would be inversely proportional to the variances of the estimators themselves if the linear model for $P_j(\theta)$ were a perfect fit. However, given that it usually is not, some form of distance between the parameter value $\theta_k$ at which the simulations were conducted and the parameter at which we wish to extrapolate should also be included in the weights. We suggest weights which approximate
$$c_{jk}(\theta) \propto \frac{H(\theta - \theta_k)}{\mathrm{var}\bigl(\widehat{P_j}(\theta_k)\bigr)} \qquad (7.104)$$

with the “kernel” H(t) having the property that it is maximized when t = 0
and decreases to zero as |t| increases. A simpler example of a kernel function
H(t) is the Gaussian kernel,
$$H(t) = \exp\Bigl\{-c\sum_{i=1}^{p} t_i^2\Bigr\}$$

or the Cauchy kernel

$$H(t) = \frac{1}{1 + c\sum_{i=1}^{p} t_i^2}$$

for some positive parameter $c$ governing the window width or the amount of smoothing applied to the observations. The symbol "$\propto$" indicates that the weights are proportional to the values on the right-hand side, but renormalized so that
$$\sum_{k=1}^{K} c_{jk}(\theta) = 1.$$
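As a small illustration of how such weights might be computed (the kernel constant $c$ and the variance estimates here are placeholder inputs the user would supply), a Python sketch is:

```python
import numpy as np

def gaussian_kernel(t, c=1.0):
    # H(t) = exp(-c * sum_i t_i^2)
    return np.exp(-c * np.sum(np.atleast_1d(t) ** 2))

def normalized_weights(theta, theta_ks, var_hats, c=1.0):
    """Weights c_{jk}(theta) proportional to H(theta - theta_k) / var_hat_k,
    renormalized to sum to one over k, as in (7.104)."""
    raw = np.array([gaussian_kernel(np.asarray(theta) - np.asarray(tk), c) / v
                    for tk, v in zip(theta_ks, var_hats)])
    return raw / raw.sum()
```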

Then our estimator of $P_j(\theta)$ is

$$\widehat{P_j}(\theta) = \sum_{k=1}^{K} c_{jk}(\theta)\bigl[\widehat{\alpha}_{jk} + \widehat{\beta}_{jk}'(\theta - \theta_k)\bigr]. \qquad (7.105)$$


Similarly, the gradients $\frac{\partial}{\partial\theta}P_j(\theta)$ can be estimated by a weighted average of the individual estimators $\widehat{\beta}_{jk}$, leading to the estimator

$$\widehat{\frac{\partial}{\partial\theta}P_j(\theta)} = \sum_{k=1}^{K} d_k(\theta)\,\widehat{\beta}_{jk}, \qquad (7.106)$$

with
$$d_k(\theta) \propto H(\theta - \theta_k).$$
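Continuing the sketch above, and assuming the local fits $(\widehat{\alpha}_{jk}, \widehat{\beta}_{jk})$ and the normalized weights are already available, the combined estimators (7.105) and (7.106) are simple weighted sums:

```python
import numpy as np

def combined_estimates(theta, theta_ks, alpha_hats, beta_hats, c_weights, d_weights):
    """Combine the K local fits into the estimators (7.105) and (7.106)."""
    theta = np.asarray(theta, dtype=float)
    price_hat = 0.0                      # estimate of P_j(theta), eq. (7.105)
    grad_hat = np.zeros_like(theta)      # estimate of its gradient, eq. (7.106)
    for k, tk in enumerate(theta_ks):
        tk = np.asarray(tk, dtype=float)
        bk = np.asarray(beta_hats[k], dtype=float)
        price_hat += c_weights[k] * (alpha_hats[k] + bk @ (theta - tk))
        grad_hat += d_weights[k] * bk
    return price_hat, grad_hat
```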
If we now substitute (7.105) and (7.106) in (7.101), we obtain an estimator $\widehat{\mathrm{grad}}(\theta)$ of the gradient. A steepest descent algorithm would consist of updating the current estimate $\theta^{(m)}$ of the optimal parameter (on the $m$th step) in the direction opposite to the gradient vector:

$$\theta^{(m+1)} = \theta^{(m)} - \delta\,\frac{\widehat{\mathrm{grad}}(\theta^{(m)})}{\bigl\|\widehat{\mathrm{grad}}(\theta^{(m)})\bigr\|}$$

where $\delta$ is the step size and $\|\cdot\|$ denotes the length of the vector. We will adjust the step size from time to time so that it eventually converges at the asymptotically optimal rate of $1/n$. This is the case if, at each step, we retain the original step size $\delta^{(m)}$ provided that a new simulation at $\theta^{(m+1)}$ shows a decrease in the objective function (7.100); if it does not (this will happen at random when we are close enough to the minimum to overstep it), we reduce the step size according to

$$\delta^{(m+1)} = \delta^{(m)}\bigl(1 - \delta^{(m)}\bigr). \qquad (7.107)$$
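As a quick check on the claim that (7.107) produces the asymptotically optimal decay, the recursion can be iterated directly; the starting value 0.5 below is arbitrary:

```python
# iterate delta^(m+1) = delta^(m) * (1 - delta^(m)); the step size decays roughly like 1/m
delta = 0.5
for m in range(100):
    delta *= 1 - delta
print(delta)   # roughly 0.009 after 100 reductions, i.e. of order 1/m
```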

Then the suggested algorithm is as follows (a schematic code sketch appears after the list): we begin with a small number $K$ of simulations at arbitrary points $\theta_1, \ldots, \theta_K$.

1. Use simulations at $\theta_1, \ldots, \theta_K$ to estimate the option values $\widehat{\alpha}_{jk} = \widehat{P_j}(\theta_k)$ and the gradients $\widehat{\beta}_{jk} = \widehat{\frac{\partial}{\partial\theta}P_j(\theta_k)}$, as well as crude estimators of $\mathrm{var}\bigl(\widehat{P_j}(\theta_k)\bigr)$. Begin with a step size $\delta^{(0)}$.

2. Use these estimators to obtain the weights $c_{jk}(\theta)$ from (7.104) and $d_k(\theta)$, as well as the estimators (7.105) and (7.106).

3. Estimate the direction of the gradient vector and compute the steepest descent step

$$\delta\,\frac{\widehat{\mathrm{grad}}(\theta^{(m)})}{\bigl\|\widehat{\mathrm{grad}}(\theta^{(m)})\bigr\|}.$$

4. Set $\theta_{K+1} = \theta^{(m)} - \delta\,\widehat{\mathrm{grad}}(\theta^{(m)})\big/\bigl\|\widehat{\mathrm{grad}}(\theta^{(m)})\bigr\|$, the result of this steepest descent step.

5. Conduct simulations at this new parameter θK+1 .

6. With $K$ replaced by $K + 1$, repeat steps 1-4 until the objective function
$$\sum_{j=1}^{J} w_j\bigl(\widehat{P_j}(\theta) - P_j\bigr)^2$$
no longer changes significantly or we have done a maximum number of iterations.

7. On termination choose the value of $\theta^{(m)}$, $m = 1, 2, \ldots, K$, which minimizes $\sum_{j=1}^{J} w_j\bigl(\widehat{P_j}(\theta^{(m)}) - P_j\bigr)^2$.
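The following Python sketch assembles steps 1-7 under simplifying assumptions: `simulate_prices(theta)` is a hypothetical function returning Monte Carlo estimates of all $J$ model prices at $\theta$, and the kernel constant, local-fit radius and variance estimates are crude placeholders, so this illustrates the control flow rather than a recommended implementation.

```python
import numpy as np

def calibrate(simulate_prices, market_prices, theta_inits, weights=None,
              delta0=0.1, c=50.0, n_local=15, radius=0.02, max_iter=40, seed=0):
    """Schematic version of the calibration algorithm of Section 7.11.

    simulate_prices(theta): hypothetical user-supplied routine returning a vector
    of J Monte Carlo price estimates at the parameter vector theta.
    market_prices: the observed prices P_1, ..., P_J."""
    rng = np.random.default_rng(seed)
    P = np.asarray(market_prices, dtype=float)
    J = P.size
    w = np.ones(J) if weights is None else np.asarray(weights, dtype=float)
    thetas = [np.asarray(t, dtype=float) for t in theta_inits]   # the K design points
    p = thetas[0].size
    delta = delta0

    def local_fit(theta_k):
        # step 1: regress simulated prices on (theta - theta_k) near theta_k
        pts = theta_k + rng.uniform(-radius, radius, size=(n_local, p))
        Y = np.array([simulate_prices(t) for t in pts])            # n_local x J
        X = np.column_stack([np.ones(n_local), pts - theta_k])
        coef, *_ = np.linalg.lstsq(X, Y, rcond=None)               # (p+1) x J
        var_a = (Y - X @ coef).var(axis=0) / n_local               # crude intercept variances
        return coef[0], coef[1:].T, var_a                          # alphas (J,), betas (J, p)

    fits = [local_fit(t) for t in thetas]

    def kernel(t):
        return np.exp(-c * np.sum(t ** 2))

    def price_hat(theta, j):
        # step 2 and (7.105): kernel/variance weighted combination of the local fits
        raw = np.array([kernel(theta - tk) / max(fits[k][2][j], 1e-12)
                        for k, tk in enumerate(thetas)])
        cjk = raw / raw.sum()
        return sum(cjk[k] * (fits[k][0][j] + fits[k][1][j] @ (theta - tk))
                   for k, tk in enumerate(thetas))

    def grad_hat(theta):
        # (7.106) with (7.101): gradient of the weighted sum of squared pricing errors
        raw = np.array([kernel(theta - tk) for tk in thetas])
        d = raw / raw.sum()
        g = np.zeros(p)
        for j in range(J):
            dPj = sum(d[k] * fits[k][1][j] for k in range(len(thetas)))
            g += 2.0 * w[j] * (price_hat(theta, j) - P[j]) * dPj
        return g

    def objective(theta):
        return sum(w[j] * (price_hat(theta, j) - P[j]) ** 2 for j in range(J))

    theta = thetas[-1].copy()
    for _ in range(max_iter):
        g = grad_hat(theta)
        new_theta = theta - delta * g / max(np.linalg.norm(g), 1e-12)  # steps 3-4
        thetas.append(new_theta)                                       # step 5
        fits.append(local_fit(new_theta))
        if objective(new_theta) >= objective(theta):                   # step 6 / (7.107)
            delta *= 1 - delta
        theta = new_theta
    # step 7: return the visited parameter value with the smallest objective
    return min(thetas, key=objective)
```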



It is well-known that for functions that are non-random, steepest descent is not a particularly efficient minimization routine, because it can bounce back and forth across a valley, and that methods like Newton's method, which are based on a quadratic approximation to the function, tend to be more efficient. For example, setting the gradient (7.101) equal to zero, with the estimators (7.105, 7.106) replacing their true values, results in the equation

$$\sum_{j=1}^{J} w_j\Bigl(\,\sum_{k=1}^{K} c_{jk}(\theta^{(m+1)})\bigl[\widehat{\alpha}_{jk} - P_j + \widehat{\beta}_{jk}'(\theta^{(m+1)} - \theta_k)\bigr]\Bigr)\Bigl(\,\sum_{k=1}^{K} d_k(\theta^{(m+1)})\,\widehat{\beta}_{jk}\Bigr) = 0,$$

which we might wish to solve for the next parameter value $\theta^{(m+1)}$. It appears that the success of such an algorithm depends on how precise our gradient estimators $\widehat{\beta}_{jk}$ are, and since these are in general quite noisy, the algorithm relies too heavily on them for determining both the direction and the distance of travel.
The solution θ(m+1) = θ can be expressed in familiar least squares terms as
The solution $\theta^{(m+1)} = \theta$ can be expressed in familiar least squares terms as

$$\theta = \Bigl[\,\sum_{j=1}^{J} w_j\,\widehat{\frac{\partial}{\partial\theta}P_j(\theta)}\sum_{k=1}^{K} c_{jk}(\theta)\,\widehat{\beta}_{jk}'\Bigr]^{-1}\sum_{j=1}^{J} w_j\,\widehat{\frac{\partial}{\partial\theta}P_j(\theta)}\sum_{k=1}^{K} c_{jk}(\theta)\bigl[P_j - \widehat{\alpha}_{jk} + \widehat{\beta}_{jk}'\theta_k\bigr], \qquad (7.108)$$
except that the presence of $\theta$ on both sides of (7.108) means that we should solve this equation iteratively, substituting an old value of $\theta$ on the right-hand side. This solution (7.108) can also be regarded as the estimator in a weighted linear regression if we define the vector of responses $Y$ as the $JK$ values of $P_j - \widehat{\alpha}_{jk} + \widehat{\beta}_{jk}'\theta_k$ arranged as a column vector, the $JK \times JK$ weight matrix as $\Omega_{jl,jk} = w_j\, d_l(\theta)\, c_{jk}(\theta)$, and the matrix $X$ to be the $JK$ vectors $\widehat{\beta}_{jk}'$ stacked to form a $JK \times p$ matrix. In this case (7.108) can be re-expressed in the more familiar form
$$\theta = (X'\Omega X)^{-1}(X'\Omega Y).$$
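A single pass of the fixed-point iteration for (7.108) is then an ordinary weighted least squares solve; a minimal sketch, assuming $Y$, $X$ and $\Omega$ have been assembled as just described, is:

```python
import numpy as np

def wls_step(Y, X, Omega):
    """One pass of (7.108): theta = (X' Omega X)^{-1} X' Omega Y.

    Y     : length-JK vector of responses P_j - alpha_jk + beta_jk' theta_k
    X     : JK x p matrix whose rows are the slope estimates beta_jk'
    Omega : JK x JK weight matrix built from w_j, d(theta) and c_jk(theta)."""
    XtO = X.T @ Omega
    return np.linalg.solve(XtO @ X, XtO @ Y)
```

Because the weights in $\Omega$ themselves depend on $\theta$, this solve would be repeated, rebuilding $\Omega$ at the previous iterate, until the solution settles.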

7.11.1 Example: Calibrating the Normal Inverse Gaussian model

We return to the problem of calibrating the Normal Inverse Gaussian model to the option prices listed in Table 6.1 (the initial value of the S&P 500 was 116.84):

Exercise price K        950    1005   1050   1100   1125   1150   1175   1200   1225   1250
Option price P_O(K)     173.1  124.8  88.5   53.7   39.3   27.3   17.85  10.95  6.2    3.25

Table 6.1. Price of SPX Call Options
We fit the Normal Inverse Gaussian distribution to historical prices in Section 3.4, where we obtained the parameter values
$$\widehat{\alpha} = 95.23,\quad \widehat{\beta} = -4.72,\quad \widehat{\delta} = 0.016,\quad \widehat{\mu} = 0.0009,$$
but of course these are just estimates of the parameters for the historical distribution and are not necessarily the appropriate parameters for a risk-neutral distribution. We ran the algorithm above for 40 iterations beginning with
$$\widehat{\alpha} = 3.8010,\quad \widehat{\beta} = 2.4105,\quad \widehat{\delta} = 0.4591,\quad \widehat{\mu} = 0.0009.$$
In Figure 7.19 we compare the errors in the option pricing formulas for the NIG and the Black-Scholes models, calibrated using the algorithm of the last section. Notice that there is again no evidence from this graph that the NIG model with constant parameters fits the risk-neutral distribution better than does the Black-Scholes model with constant volatility parameter. We saw a similar comparison in Chapter 3 when we compared the two volatility smiles in Figures 7.1 and 7.2.

[Figure 7.19 is a scatter plot of the empirical pricing errors against the option price for the two models; legend: NIG process, Black Scholes; vertical axis: Empirical Error in Formula for Option Price; horizontal axis: Option Price.]

Figure 7.19: The error in pricing options on the S&P 500 using the NIG model and the Black-Scholes model.



7.12 Problems
1. Assume that $X$ has a normal$(\theta, 1)$ distribution and $T(X) = X + bX^2 + cX^3$. Show that we can estimate
$$\frac{\partial}{\partial\theta}E_\theta T(X) = 1 + 2b\theta + 3c(1 + \theta^2)$$
by randomly sampling independent values $X_i$, $i = 1, 2, \ldots, n$, and using the estimator $\frac{1}{n}\sum_{i=1}^{n} T(X_i)(X_i - \theta)$. How would the variance of this estimator compare with the variance of the alternative estimator $\frac{1}{n}\sum_{i=1}^{n} T'(X_i)$? How do they compare if $T$ is close to being a linear function, i.e. if $b$ and $c$ are small? A simulation sketch comparing the two estimators is given below.
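A brief simulation sketch for comparing the two estimators (the parameter values below are chosen arbitrarily for illustration):

```python
import numpy as np

theta, b, c, n = 1.0, 0.1, 0.05, 100_000
rng = np.random.default_rng(1)
X = rng.normal(theta, 1.0, n)

T = lambda x: x + b * x**2 + c * x**3
Tprime = lambda x: 1 + 2 * b * x + 3 * c * x**2

score_terms = T(X) * (X - theta)      # score-function estimator terms
pathwise_terms = Tprime(X)            # derivative (pathwise) estimator terms
true_value = 1 + 2 * b * theta + 3 * c * (1 + theta**2)

print(true_value, score_terms.mean(), pathwise_terms.mean())
print(score_terms.var() / n, pathwise_terms.var() / n)   # variances of the two sample means
```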

2. Using the Black-Scholes formula for the price of a call option, verify the following formulae for the Greeks. In each case use simulation to verify the formula in the case $T = 0.25$, $\sigma = 0.3$, $r = 0.05$, $K = S_0 = 10$; a sketch of one such check, for part (a), follows the list.

(a) (delta) $\dfrac{\partial V}{\partial S} = \Phi(d_1)$

(b) (gamma) $\dfrac{\partial^2 V}{\partial S^2} = \dfrac{\varphi(d_1)}{S\sigma\sqrt{T-t}}$

(c) (rho) $\dfrac{\partial V}{\partial r} = K(T-t)e^{-r(T-t)}\Phi(d_2)$

(d) (theta) $\dfrac{\partial V}{\partial t} = -\dfrac{S\sigma\varphi(d_1)}{2\sqrt{T-t}} - rKe^{-r(T-t)}\Phi(d_2)$

(e) (vega) $\dfrac{\partial V}{\partial \sigma} = S\varphi(d_1)\sqrt{T-t}$
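One possible way to carry out the simulation check for part (a) is a common-random-numbers finite difference; the other Greeks can be verified in the same way. The step size h and sample size are illustrative choices:

```python
import numpy as np
from math import log, sqrt, erf

T, sigma, r, K, S0, n = 0.25, 0.3, 0.05, 10.0, 10.0, 200_000
Phi = lambda x: 0.5 * (1 + erf(x / sqrt(2)))       # standard normal cdf

Z = np.random.default_rng(2).normal(size=n)        # common random numbers

def discounted_payoffs(S):
    ST = S * np.exp((r - 0.5 * sigma**2) * T + sigma * sqrt(T) * Z)
    return np.exp(-r * T) * np.maximum(ST - K, 0.0)

h = 0.01
delta_sim = (discounted_payoffs(S0 + h) - discounted_payoffs(S0 - h)).mean() / (2 * h)
d1 = (log(S0 / K) + (r + 0.5 * sigma**2) * T) / (sigma * sqrt(T))
print(delta_sim, Phi(d1))    # the simulated slope should be close to Phi(d1)
```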

3. The rho of an option is the derivative of the option price with respect to
the interest rate parameter r. What is the value of ρ for a call option with
S0 = 10, strike K = 10, r = 0.05, T = 0.25 and σ = 0.2? Use a simulation
to estimate this slope and determine the variance of your estimator. Try
using (i) independent simulations at two points and (ii) common random
numbers. What can you say about the variances of your estimators?

4. Consider the estimator given by (7.65) when θ = 1. For what values of the importance sampling parameter ψ is the variance of the estimator (7.65) finite?
5. Use the score function method (7.71) and a simulation to estimate the sensitivity of the value of a digital option to the parameters $S_0$, $r$, $\sigma$ in the Black-Scholes model. Specifically, suppose the discounted payoff from the option is
$$e^{-rT}\,1(S_T > K)$$
where $\ln(S_T)$ has a Normal$\bigl(\ln(S_0) + rT - \sigma^2 T/2,\ \sigma^2 T\bigr)$ distribution and $S_0 = K = 10$, $r = 0.05$, $T = 0.25$. Compare your estimated values with the true values given in Problem 2. A starting sketch for the score-function estimator is given below.
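A possible starting point for the score-function estimator of the sensitivity with respect to $S_0$; the value of $\sigma$ is not stated in the problem and is taken from Problem 2 for illustration, and the sensitivities in $r$ and $\sigma$ follow by differentiating the log-density in those parameters instead:

```python
import numpy as np
from math import log, sqrt, exp, pi

S0, K, r, sigma, T, n = 10.0, 10.0, 0.05, 0.3, 0.25, 500_000   # sigma borrowed from Problem 2
rng = np.random.default_rng(3)

mu = log(S0) + (r - 0.5 * sigma**2) * T            # mean of ln(S_T)
lnST = rng.normal(mu, sigma * sqrt(T), n)
payoff = np.exp(-r * T) * (lnST > log(K))

# d/dS0 of the log-density of ln(S_T) is (ln S_T - mu) / (sigma^2 T) * (1 / S0)
score = (lnST - mu) / (sigma**2 * T) / S0
estimate = (payoff * score).mean()                 # score-function estimate of d(price)/dS0

# closed-form digital-option delta for comparison: e^{-rT} phi(d2) / (S0 sigma sqrt(T))
d2 = (log(S0 / K) + (r - 0.5 * sigma**2) * T) / (sigma * sqrt(T))
print(estimate, exp(-r * T) * exp(-d2**2 / 2) / sqrt(2 * pi) / (S0 * sigma * sqrt(T)))
```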
6. Use numerical quadrature corresponding to the 4th degree Hermite polynomial to approximate the integral
$$\int_{-\infty}^{\infty} e^x \varphi(x)\,dx$$
where $\varphi(x)$ is the standard normal probability density function. Compare the numerical approximation with the true value of the integral. A short sketch of the quadrature is given below.
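A sketch of the mechanics: numpy's Gauss-Hermite nodes integrate against the weight $e^{-u^2}$, so a change of variables $x = \sqrt{2}\,u$ converts the rule to an integral against the standard normal density:

```python
import numpy as np

# 4-point Gauss-Hermite rule: exact for polynomials of degree <= 7 against e^{-u^2}
u, w = np.polynomial.hermite.hermgauss(4)

# with x = sqrt(2) u, the integral of e^x phi(x) dx becomes (1/sqrt(pi)) sum_i w_i e^{sqrt(2) u_i}
approx = np.sum(w * np.exp(np.sqrt(2.0) * u)) / np.sqrt(np.pi)
print(approx, np.exp(0.5))    # the true value is E[e^X] = e^{1/2} for X ~ N(0, 1)
```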
7. Use simulation to calibrate the volatility parameter of the Black Scholes
model to the S&P500 option prices in Table 6.1 so that the sum of squares
of the pricing errors is minimized (note we wish to choose a volatility
parameter that does not depend on the strike price). Check your answer against the results of a numerical minimization in which the options are priced using the Black-Scholes formula.
