
The estimator of Gallant and Tauchen (1996) circumvents the need to evaluate the binding function by using the score vector ∂f_a(y_t|Y^{t−1}; π)/∂π (score generator) to define the matching conditions. If the auxiliary model f_a(y_t|Y^{t−1}; π) is chosen flexibly with a suitable non-parametric interpretation, then the estimator achieves the asymptotic efficiency of maximum likelihood and has good power properties for detecting misspecification (Gallant and Long, 1997; Tauchen, 1997), hence the term Efficient Method of Moments (EMM). EMM delivers consistent estimates of the structural parameter vector under weak conditions on the choice of the auxiliary model. However, extrapolating from the generalised method of moments evidence, it is natural to conjecture that the quality of inference may hinge on how well the auxiliary model approximates the salient features of the observed data. This intuition is formalised by Gallant and Long (1997), who show that a judicious selection of the auxiliary model, ensuring that the quasi-scores asymptotically span the true score vector, will result in full asymptotic efficiency.19

17 For all these methods, it is necessary to recycle the random numbers used in the calculation when θ changes, in order to obtain good numerical and statistical properties of the estimators based on these simulations.
18 Another possible auxiliary model is an ARMA(p, q) on the logarithms of the squared data (see Monfardini (1998)).
19 In fact, as the score generator approaches the true conditional density, the estimated covariance matrix for the structural parameter approaches that of maximum likelihood. This result embodies one of the main advantages of EMM, since it prescribes a systematic approach to the derivation of efficient moment conditions for estimation in a general parametric setting.

Andersen et al. (1999) perform an extensive Monte Carlo study of EMM estimation of a stochastic volatility model. They examine the sensitivity to the choice of auxiliary model using ARCH, GARCH and EGARCH models for the score, as well as non-parametric extensions. EMM efficiency approaches that of maximum likelihood as the sample size grows; inference is sensitive to the choice of auxiliary model in small samples, but robust in larger ones.20
The indirect inference theory, however, crucially depends on the correct specification assumption concerning the structural model. There is now an emerging literature (see, for example, Dridi and Renault (2000) and Dridi (2000)) which focuses on procedures more robust to the structural model specification. In particular, Dridi and Renault (2000) propose an extension of the indirect inference methodology to semiparametric settings and show how semiparametric indirect inference works on basic examples using SV models.


8.4.3.2.2 Importance sampling

A more direct way of performing inference is to compute the likelihood by integrating out the latent h_t process. As previously seen, the integral (8.8) has no closed form and has to be computed numerically. However, the likelihood function naturally appears as the expectation of the function ∏_{t=1}^T f(y_t|Y^{t−1}, H^t; θ) with respect to the p.d.f. P defined by21 ∏_{t=1}^T f(h_t|Y^{t−1}, H^{t−1}; θ), from which it is easy to draw recursively. Therefore, an unbiased simulator of the whole likelihood function ℓ_T(θ) is ∏_{t=1}^T f(y_t|Y^{t−1}, ⁿH̃^t; θ), where the paths ⁿH̃^t are recursively drawn from the auxiliary p.d.f. P. The likelihood is then approximated by the empirical mean:

    (1/N) ∑_{n=1}^N ∏_{t=1}^T f(y_t|Y^{t−1}, ⁿH̃^t; θ)


and this simulated likelihood can be numerically maximised. However, this basic simulator may be very slow, in the sense that it may have a very large variance, so some acceleration technique is needed. One solution is to consider the general method of importance sampling based on a sequence of conditional p.d.f.s q(h_t|Y^T, H^{t−1}; θ). Let us denote this probability distribution by Q and the corresponding expectation by E_Q. We have:

    ℓ_T(θ) = E_P[ ∏_{t=1}^T f(y_t|Y^{t−1}, H^t; θ) ]
           = E_Q[ ∏_{t=1}^T f(y_t|Y^{t−1}, H^t; θ) f(h_t|Y^{t−1}, H^{t−1}; θ) / q(h_t|Y^T, H^{t−1}; θ) ]


20 Care must be taken, however, to avoid over-parameterisation of the auxiliary model, as convergence problems may arise if the quasi-score is extended to the point where it begins to fit the purely idiosyncratic noise in the data.
21 It is important to note that this p.d.f. is neither f(H^T; θ), except when y_t does not cause h_t, nor f(H^T|Y^T; θ).

T (θ )
Therefore, an unbiased simulator of is:

˜ ˜ ˜
T
f (yt |Y t’1 , n H t ; θ )f (n ht |Y t’1 , n H t’1 ; θ )
EQ
˜ ˜
q(n ht |Y T , n H t’1 ; θ )
t=1

˜
where n H T is drawn in Q. The problem is then how to choose the importance func-
tion: the natural answer is by reducing the Monte Carlo variance. It is easy to calculate
the theoretical optimal choice f (H T |Y T ; θ ) = T f (ht |Y T , H t’1 ) (i.e. the smoothing
t=1
density of ht ), for which one simulation is suf¬cient, but it is clearly not computable.
One can then either use the smoothing density of an approximating model, or fix a parametric family of importance functions and choose the member that minimises the Monte Carlo variance (possibly computed in an approximate way). For the SV model (8.2), the first solution is proposed by Sandmann and Koopman (1998), who use the linearised version (8.6) as the approximating model. In the spirit of the second solution, Danielsson and Richard (1993) propose a sequentially optimised importance sampling, which Danielsson (1994) applies to the SV model.22 In both cases, the simulated maximum likelihood estimates of the model parameters are obtained by numerical optimisation of the logarithm of the simulated likelihood.23
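To fix ideas, the following Python sketch (hypothetical; the chapter's own routines are written in Ox) implements the basic, high-variance simulator for the log-normal SV model (8.2). Fixing the seed across calls implements the recycling of the random numbers for different parameter values recommended in footnotes 17 and 23.

import numpy as np

def sv_simulated_loglik(y, alpha, beta, sigma_eta, N=1000, seed=0):
    # Basic simulated likelihood of y_t = eps_t exp(h_t/2),
    # h_t = alpha + beta h_{t-1} + eta_t: draw N latent paths from the
    # transition density P itself and average the path likelihoods.
    rng = np.random.default_rng(seed)   # same seed for every theta
    T = len(y)
    h = np.empty((N, T))
    h[:, 0] = alpha / (1 - beta) + np.sqrt(sigma_eta**2 / (1 - beta**2)) \
              * rng.standard_normal(N)  # stationary initial distribution
    for t in range(1, T):
        h[:, t] = alpha + beta * h[:, t - 1] + sigma_eta * rng.standard_normal(N)
    # log prod_t f(y_t|h_t), a product of N(0, exp(h_t)) densities
    logf = -0.5 * (np.log(2 * np.pi) + h + y**2 * np.exp(-h)).sum(axis=1)
    m = logf.max()                      # log-sum-exp for numerical stability
    return m + np.log(np.exp(logf - m).mean())

An importance sampler would replace these draws by draws from Q and reweight each path likelihood by the ratio f/q, as in the formula above.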

8.4.3.2.3 Bayesian approach
In the Bayesian setting, there are also serious difficulties in estimating the SV model. In general, the posterior density f(θ|Y^T) and the posterior expectation of θ cannot be computed in closed form. Again, this complex setting requires a simulation-based approach. The data augmentation principle, which treats the latent variables h_t as nuisance parameters, combined with Gibbs sampling (Gelfand and Smith, 1990), makes it possible, by iterating simulations from f(H^T|Y^T, θ) (data augmentation step) and f(θ|Y^T, H^T) (parameter simulation step), to simulate from the joint posterior distribution f(H^T, θ|Y^T), to derive the distribution of interest as the marginal distribution of θ and to approximate the posterior expectation by a sample average. When the conditional distributions cannot be directly simulated, the corresponding steps in the Gibbs algorithm are replaced by Metropolis–Hastings steps.24 Moreover, the prior modelling of the parameters is usually quasi-non-informative.
One way of considering this approach is to regard it as an empirical Bayes procedure,
reporting the mean of the posterior distributions as an estimator of θ . This is the approach
followed by Jacquier et al. (1994) who show that empirical Bayes outperforms QML and
GMM in the SV case.
In Jacquier et al. (1994) the posterior distribution of the parameters was sampled by MCMC methods using a one-move approach (i.e. the latent variables h_t were sampled one at a time from f(h_t|Y^T, H^{−t}, α, β, σ²_η), where H^{−t} denotes all the elements of H^T excluding h_t). Although this algorithm is conceptually simple, it is not particularly efficient from a simulation perspective, as is shown by Kim et al. (1998), who develop an alternative,

22 The details will not be dealt with here as they are quite involved, even for the simplest model.
23 As for non-efficient methods, numerical and statistical accuracy is obtained by recycling the random numbers used in the calculation for each parameter value.
24 Such hybrid algorithms are validated in Tierney (1994).

more efficient, multi-move MCMC algorithm. The efficiency gain in the Kim et al. (1998) algorithm arises from sampling H^T jointly, in one block, conditioned on everything else in the model. Finally, Chib et al. (2002) develop efficient Markov Chain Monte Carlo algorithms for estimating generalised SV models featuring heavy-tailed Student-t distributions, exogenous variables in the observation and volatility equations, and a jump component in the observation equation (see Section 8.5.1).
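As an illustration of the one-move scheme, the sketch below (hypothetical Python, not part of the chapter's code; the parameters are held fixed for brevity) updates each interior h_t with a Metropolis–Hastings step whose proposal is the conditional prior f(h_t|h_{t−1}, h_{t+1}) implied by the AR(1) volatility equation, so that the acceptance probability reduces to a ratio of measurement densities f(y_t|h_t).

import numpy as np

def single_move_sweep(h, y, alpha, beta, sig_eta, rng):
    # One sweep of the one-move sampler for the log-normal SV model.
    # Proposal: h_t | h_{t-1}, h_{t+1} ~ N(mu_t, v); the prior cancels in
    # the MH ratio, leaving only f(y_t|h_t) = N(0, exp(h_t)) evaluations.
    v = sig_eta**2 / (1 + beta**2)
    for t in range(1, len(y) - 1):      # endpoints omitted for brevity
        mu = (alpha * (1 - beta) + beta * (h[t - 1] + h[t + 1])) / (1 + beta**2)
        prop = mu + np.sqrt(v) * rng.standard_normal()
        log_acc = -0.5 * ((prop - h[t]) + y[t]**2 * (np.exp(-prop) - np.exp(-h[t])))
        if np.log(rng.uniform()) < log_acc:
            h[t] = prop
    return h

In the full algorithm such a sweep alternates with draws of (α, β, σ²_η) from their conditional posteriors; the multi-move samplers of Kim et al. (1998) replace it with a joint draw of the whole block H^T.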


8.4.3.2.4 An MCMC approach to maximum likelihood estimation

Although the Bayesian approach is straightforward to state and computationally attractive, it requires the elicitation of a prior, which is often regarded by some econometricians as difficult in dynamic models. Even if this is not an insurmountable problem, alternatives are available which allow us to perform maximum likelihood estimation using MCMC methods.
The first possibility is the Simulated Expectation Maximisation (SEM) algorithm proposed by Shephard (1993). The EM algorithm exploits the following decomposition of the log-likelihood function:

    log f(Y^T; θ) = log f(Y^T, H^T; θ) − log f(H^T|Y^T; θ)
                  = E[log f(Y^T, H^T; θ)|Y^T] − E[log f(H^T|Y^T; θ)|Y^T]

and iterates:

    θ^{i+1} = arg max_θ E_{θ^i}[log f(Y^T, H^T; θ)|Y^T]

This is an increasing algorithm, so the sequence θ^i converges to the ML estimator. The problem is that, although log f(Y^T, H^T; θ) has in general a closed form, the same is not true for its conditional expectation. In the SEM algorithm this expectation is replaced by an approximation based on simulations. The problem is thus to be able to draw from the conditional distribution of H^T given Y^T and θ. Shephard (1993), in the context of a non-linear state space model, uses the Hastings–Metropolis algorithm to solve this problem, and applies it to the SV model.
Another possible approach is the Simulated Likelihood Ratio (SLR) method proposed by Billio et al. (1998). The general principle is:

    f(Y^T; θ)/f(Y^T; θ̄) = E_θ̄[ f(Y^T, H^T; θ)/f(Y^T, H^T; θ̄) | Y^T ]    (8.9)

where θ̄ is an arbitrary fixed value of the parameters. Obviously:

    arg max_θ f(Y^T; θ) = arg max_θ f(Y^T; θ)/f(Y^T; θ̄)

and with ⁿH̃^T, n = 1, 2, . . . , N, simulated paths from the conditional distribution f(H^T|Y^T; θ̄), the SLR method amounts to maximising:

    (1/N) ∑_{n=1}^N f(ⁿH̃^T, Y^T; θ)/f(ⁿH̃^T, Y^T; θ̄)

with respect to θ. The method can be implemented by simulating the conditional distribution25 f(H^T|Y^T; θ̄). As already noted, it is impossible to simulate this distribution directly, thus a Hastings–Metropolis approach is suggested.

Contrary to the SEM approach, the SLR method allows for the computation of the likelihood surface and hence of likelihood ratio test statistics. It needs only one optimisation run rather than a sequence of optimisations; moreover, since the simulated paths can be stored, only one simulation run is required. Finally, as the simulation is made for only one value of the parameters, the objective function is smooth with respect to θ, even if the simulations involve rejection methods.

Billio et al. (1998) also apply the SLR method to the SV model (8.2).
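A minimal sketch of the SLR objective under these assumptions (Python; the stored paths h_paths are supposed to have been drawn once from f(H^T|Y^T; θ̄), e.g. by a Hastings–Metropolis sampler, and joint_logpdf is a hypothetical routine returning log f(H^T, Y^T; θ)):

import numpy as np

def slr_objective(theta, theta_bar, y, h_paths, joint_logpdf):
    # Monte Carlo approximation of f(Y;theta)/f(Y;theta_bar), eq. (8.9):
    # average the joint-density ratios over the stored latent paths.
    log_ratios = np.array([joint_logpdf(h, y, theta) - joint_logpdf(h, y, theta_bar)
                           for h in h_paths])
    m = log_ratios.max()                # log-sum-exp for stability
    return m + np.log(np.exp(log_ratios - m).mean())

Because h_paths is simulated and stored once, for the single value θ̄, maximising this function over θ requires no further simulation, which is why the objective is smooth in θ.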


8.5 EXTENSIONS OF SV MODELS
The basic SV models can be generalised in a number of directions. Straightforward generalisations might allow ε_t to have heavy-tailed Student-t distributions and allow exogenous variables in the observation and volatility equations.
Moreover, the ARCH in mean model of Engle et al. (1987) can be extended to the SV framework by specifying y_t = µ_0 + µ_1 exp(h_t) + ε_t exp(h_t/2). This model allows y_t to be moderately serially correlated, but in the discrete SV model the Hamilton filter no longer works, because (y_t, h_t) is no longer jointly Markovian.


8.5.1 Extensions of continuous SV models

In the context of continuous SV models, Harvey et al. (1994) concentrated their attention on models based on Student-t errors; Mahieu and Schotman (1998) analysed the possibility of using a mixture distribution. Jacquier et al. (1995) have computed the posterior density of the parameters of a Student-t-based SV model. This particular type of model can in fact be viewed as an Euler discretisation of a Student-t-based Lévy process, but with additional stochastic volatility effects; further articles are available in the (continuous-time) mathematical options and risk assessment literature.26 Building on the work of Kim et al. (1998), Chib et al. (2002) develop efficient Markov Chain Monte Carlo algorithms for estimating these models. They also consider a second type of model which contains a jump component27 in the observation equation to allow for large, transient movements.

25 The resulting Monte Carlo approximation of (8.9) may be good only locally around θ̄, and so Geyer (1996) suggests updating θ̄ to the maximiser of the Monte Carlo likelihood and repeating the Monte Carlo procedure using the new θ̄. By updating θ̄ a few times, one should obtain better approximations of the relative likelihood function near the true maximum likelihood estimate.
26 Leading references include Eberlein (2001) and Eberlein and Prause (2001). The extension to allow for stochastic volatility effects is discussed in Eberlein and Prause (2001) and Eberlein et al. (2001).
27 Jump models are quite popular in continuous-time models of financial asset pricing. See, for example, Merton (1976), Ball and Torous (1985), Bates (1996), Duffie et al. (2000).

Moreover, a natural framework for extending continuous SV models might be based on adapting the Gaussian state space so that:

    y_t = ε_t exp(z_t′h_t/2)
    h_t = T_t h_{t−1} + η_t,    η_t ∼ N(0, H_t)

and then on allowing h_t to follow a more complicated ARMA process. Another simple example would be:

    z_t = (1 1)′,    h_t = [ β 0 ; 0 1 ] h_{t−1} + η_t,    η_t ∼ N(0, [ σ²_{η1} 0 ; 0 σ²_{η2} ])

Now, the second component of h_t is a random walk, allowing the permanent level of the volatility to change slowly. This is analogous to the Engle and Lee (1992) decomposition of shocks into permanent and transitory components. A model along the same lines has been suggested by Harvey and Shephard (1993), who allow (ignoring the cyclical AR(1) component):

    z_t = (1 0)′,    T_t = [ 1 1 ; 0 1 ],    H_t = [ 0 0 ; 0 σ²_{η2} ]

This uses the Kitagawa and Gersch (1984) smooth trend model in the SV context, which in turn is close to putting a cubic spline through the data. This may provide a good summary of historical levels of volatility, but it could be poor as a vehicle for forecasting, as confidence intervals for forecasted volatilities h_{t+r} may grow very quickly with r.
Another suggestion is to allow h_t to be a fractional process, giving the long-memory SV model. For financial time series, there is strong evidence that the effect of a shock to volatility persists (i.e. is not absorbed) for a large number of periods (see e.g. Andersen and Bollerslev (1997), Lobato and Savin (1998), Harvey (1998), Bollerslev and Jubinski (1999), Bollerslev and Mikkelsen (1999), Bollerslev and Wright (2000) and Ray and Tsay (2000)), thus the concept of long memory seems suitable and has been suggested by Breidt et al. (1998). A covariance stationary time series y_t has long memory if:

    ∑_{r=0}^∞ |Cov(y_t, y_{t−r})| = ∞

with Var(y_t) < ∞. Basically, this says that the autocovariances do decay as the lag increases, but very slowly, usually hyperbolically.
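As a quick numerical illustration of this definition (a hypothetical sketch, not from the chapter), one can simulate a fractionally integrated series by truncating its MA(∞) representation and observe that the sample autocovariances decay, but only hyperbolically:

import numpy as np

def arfima_0d0(T, d, trunc=2000, seed=0):
    # ARFIMA(0,d,0) via the truncated MA(inf) weights:
    # psi_0 = 1, psi_j = psi_{j-1} (j - 1 + d) / j
    rng = np.random.default_rng(seed)
    psi = np.ones(trunc)
    for j in range(1, trunc):
        psi[j] = psi[j - 1] * (j - 1 + d) / j
    eps = rng.standard_normal(T + trunc)
    return np.convolve(eps, psi, mode="valid")[:T]

x = arfima_0d0(20000, d=0.3)
for lag in (1, 10, 100, 1000):
    print(lag, np.cov(x[lag:], x[:-lag])[0, 1])  # still sizeable at lag 1000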
Currently there exist four approaches to estimating the long-memory SV model: the quasi-maximum likelihood estimator of Breidt et al. (1998), the GMM approach of Wright (1999), the widely used semiparametric log-periodogram estimator of Geweke and Porter-Hudak (1983) (see e.g. Andersen and Bollerslev (1997), Ray and Tsay (2000), Wright (2000), Deo and Hurvich (2001) and the recent developments of Hurvich and Ray (2001), Hurvich et al. (2001)) and the Bayesian estimator based on the Markov Chain Monte Carlo sampler (Chan and Petris, 1999) or, alternatively, on the wavelet representation of the log-squared returns (Jensen, 1999, 2000, 2001).
The quasi-MLE of the long-memory SV model is known to be strongly consistent, but requires the order of the short-memory autoregressive and moving average parameters to be correctly identified, as does the GMM estimator. The difference between the
quasi-MLE and GMM is that, when the fractional order of integration is smaller than 1/4, the asymptotic properties beyond consistency are known for the GMM estimator. Unlike the quasi-MLE of a short-memory stochastic volatility model, whose asymptotic properties are known (Harvey et al., 1994; Ruiz, 1994), such asymptotic properties are not yet known for the quasi-MLE of the long-memory SV model. However, in simulation experiments, Wright (1999) finds that neither estimator's finite-sample properties dominate the other's; the GMM estimator of the long-memory parameter generally produces smaller standard errors but with a significant downward bias. From these simulations Wright (1999) urges the development of alternative estimators of the long-memory SV model that are more efficient and less biased. In fact, even if Deo and Hurvich (2001) find the asymptotic properties of the log-periodogram estimator of volatility to be similar to those proved by Robinson (1995) for the same estimator in the mean, correct inference about the degree of long memory relies on the number of Fourier frequencies in the regression growing at a rate that depends on the value of the unknown long-memory parameter. It thus seems that neither the quasi-MLE nor the log-periodogram estimator of the long-memory volatility model lends itself nicely to the construction of confidence intervals or hypothesis tests for the long-memory parameter estimate.
Finally, it could be useful to allow the SV model to capture the non-symmetric response to shocks. This feature can be modelled by allowing ε_{t−1} and η_t to be correlated. If ε_{t−1} and η_t are negatively correlated and ε_{t−1} > 0, then y_{t−1} > 0 and h_t is likely to fall. Hence, a large effect of y²_{t−1} on the estimated h_t will be accentuated by a negative sign of y_{t−1}, while it will be partially ameliorated by a positive sign. This correlation was suggested by Hull and White (1987) and estimated using GMM by Melino and Turnbull (1990) and Scott (1991). A simple quasi-maximum likelihood estimator has been proposed by Harvey and Shephard (1996). Jacquier et al. (1995) have extended their single-move MCMC sampler to estimate this effect.


8.5.2 Extensions of discrete SV models
In the discrete case, the basic model might be extended by considering different Markov chains, which allow the decomposition of shocks into permanent and transitory components as in the continuous case:

    y_t = ε_t exp(z_t′h_t/2)
    h_t = µ + TS_t    (8.10)

where S_t represents a vector of Markov chains. However, the vector of Markov chains can easily be represented by a single Markov chain with a sufficient number of states, and then the model (8.10) formally reduces to the basic model (8.3).
Finally, the two SV models can be combined by allowing the continuous latent volatility to be governed by a first-order Markov chain. In that case, estimation is very difficult. So et al. (1998) therefore propose Bayesian estimators constructed by Gibbs sampling.


8.6 MULTIVARIATE MODELS
Most macroeconomics and finance is about how variables interact, thus a multivariate approach is very important. For multivariate stochastic volatility models this means that it is essential to capture changing cross-covariance patterns. Multivariate modelling of covariances is rather new and difficult, because it is afflicted by extreme problems of lack of parsimony. From a modelling point of view, multivariate SV models are easier to extend than ARCH models, but the estimation problem remains.

8.6.1 Multivariate continuous SV models
Some multivariate continuous SV models are easy to state. Harvey et al. (1994) applied quasi-likelihood Kalman filtering techniques to:

    y_it = ε_it exp(h_it/2),    i = 1, . . . , M,    ε_t = (ε_1t, . . . , ε_Mt)′ ∼ NIID(0, Σ_ε)    (8.11)

where Σ_ε is a correlation matrix and h_t = (h_1t, . . . , h_Mt)′ a multivariate random walk, although more complicated linear dynamics could be handled. The approach again relies on linearising, this time with loss of information, by writing log y²_it = h_it + log ε²_it. The vector of the log ε²_it is iid, with all means equal to −1.27 and a covariance matrix which is a known function of Σ_ε. Consequently, Σ_ε and the parameters indexing the dynamics of h_t can be estimated.
It is worthwhile pointing out two aspects of this model. If rank constraints are imposed on h_t, common trends and cycles will be allowed into the process describing the volatility. Furthermore, the model is similar to Bollerslev's (1990) model, which is characterised by constant conditional correlation. Hence the model is better described as one of changing variances rather than of changing correlations. Consequently, it fails to represent important features of the data and so is of limited interest.
Perhaps a more attractive multivariate SV model can be obtained by introducing factors. The simplest one-factor model is:

    y_t = λf_t + w_t,    w_t ∼ NIID(0, Σ_w)
    f_t = ε_t exp(h_t/2),    h_t = βh_{t−1} + η_t,    η_t ∼ NIID(0, σ²_η)

where y_t is perturbed by w_t and explained by the scaled univariate SV model f_t. Typically Σ_w will be assumed diagonal, perhaps driven by independent SV models.
The lack of an obvious linearising transformation for these models prevents us from effectively using Kalman filtering methods. MCMC methods do not suffer this drawback and are explored in Jacquier et al. (1995) and Pitt and Shephard (1999b).

8.6.2 Multivariate discrete SV models
The multivariate extension of the discrete stochastic volatility model (8.3) is easy to state. We can consider the multivariate framework (8.11) and allow each component of h_t = (h_1t, . . . , h_Mt)′ to follow a two-state Markov chain, i.e.

    y_it = ε_it exp(h_it/2),    i = 1, . . . , M,    ε_t = (ε_1t, . . . , ε_Mt)′ ∼ NIID(0, Σ_ε)
    h_it = α + βs_it

In order to apply the Hamilton filter and to obtain the likelihood function, it is useful to define a new Markov chain S_t with 2^M states, which represents the M Markov chains governing the dynamics of h_t.

If the M Markov chains are independent, the transition probabilities of S_t are simply obtained by multiplying the probabilities that drive the different Markov chains. Accordingly, the transition probability matrix will be Q = P_1 ⊗ P_2 ⊗ · · · ⊗ P_M, where ⊗ indicates the Kronecker product and P_i the transition matrix of s_it, i = 1, 2, . . . , M. In that case, the number of states rises exponentially with the dimension of h_t, but the number of parameters describing the Markov chains grows linearly with M and is 2M.
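For instance, under independence the composite transition matrix can be assembled with a few lines of numpy (a hypothetical sketch; the chapter writes its transition matrices with columns summing to one, a property preserved by the Kronecker product):

import numpy as np

def composite_transition(Ps):
    # Transition matrix of the composite chain S_t built from
    # independent chains with transition matrices P_1, ..., P_M.
    Q = np.array([[1.0]])
    for P in Ps:
        Q = np.kron(Q, P)
    return Q

P1 = np.array([[0.99, 0.04], [0.01, 0.96]])
P2 = np.array([[0.95, 0.10], [0.05, 0.90]])
Q = composite_transition([P1, P2])
print(Q.shape, Q.sum(axis=0))   # (4, 4), columns sum to one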
A more general specification does not make any a priori assumptions about the relations between the different Markov chains. The transition probabilities of the composite Markov chain S_t are then given by:

    q_ij = P(S_t = j|S_{t−1} = i),    i, j = 1, 2, . . . , 2^M

which requires 2^M(2^M − 1) parameters. To appreciate the dimension of the problem, with M = 2 (and two states per chain), the independent case requires four parameters, while the general specification requires 12 parameters.
Clearly the general specification quickly becomes unfeasible but, in some applications, the independent case is not useful for understanding the causality between the volatilities of different assets. Billio (2002a) proposes considering several correlated cases, with a number of parameters comprised between 2M and 2^M(2^M − 1), by exploiting the concept of Granger causality.
As for the continuous SV model, a more interesting multivariate extension can be obtained by introducing a latent factor structure where the latent factors are characterised by discrete stochastic volatility. Unfortunately, in that case the joint process of the observable variable y_t and of the latent Markov chains is no longer Markovian, and the Hamilton filter no longer works. For the estimation it is thus necessary to use some approximation or simulation-based methods (see Billio and Monfort (1998), Kim and Nelson (1999)).


8.7 EMPIRICAL APPLICATIONS
To provide simple illustrations of the usefulness of SV models, the two basic models are
estimated and their output is used to develop standard option pricing and to calculate the
VaR of an asset or a portfolio.


8.7.1 The Volatility program

There do not exist statistical packages to easily and directly estimate28 SV models, and thus the necessary routines have been developed with Ox (version 3.20), a programming

28 Linear state space models can be estimated with the Kalman filter in EViews, with the Gauss package FANPAC or the Ox package SSFPack (see also STAMP). Thus the linearised version (8.6) could be estimated with a quasi-maximum likelihood approach. For switching regime models, see also MSVAR, an Ox package developed by H.M. Krolzig and designed for the econometric modelling of univariate and multiple time series subject to shifts in regime (http://www.economics.ox.ac.uk/research/hendry/krolzig/).

language created mainly by Jurgen A. Doornik.29 These routines can also be used within the package GiveWin.
The files required for running the Volatility program30 are "volatilitymain.ox", the main program file, "volatility.oxo", a compiled file containing the definition of the functions, and the header file "volatility.h", containing the lists of global variables and functions. In Ox or GiveWin it is sufficient to load the main program file, to select the appropriate options and then to run the program (for the details of the commands and options see the enclosed readme.txt file).
Depending on which commands are commented out (// in front of the command) the program can:

• Estimate a basic continuous or discrete SV model on a user-provided series;
• Simulate a basic continuous or discrete SV model;
• Estimate a basic continuous or discrete model on a user-provided series and then simulate an alternative path with the estimated parameters.

It should be stressed that GiveWin is not needed to estimate the models but only to display graphs. The program can easily be used with the freeware version of Ox in conjunction with any text editor; however, we recommend OxEdit since it integrates with Ox. Both packages can be downloaded from Doornik's website (see footnote 29). All the graphic windows presented in this chapter are taken from GiveWin.
The first line of code in Figure 8.2, just before the "main" command, imports the "volatility.oxo" file, which contains the functions, and recalls the Database class and other Ox packages such as the graphic, probabilistic and maximisation ones.31
The program is then organised as follows:

• In a first step the time series of interest is loaded from an Excel spreadsheet. The user has to indicate the exact path of the file to be loaded (other data formats can be used, see the readme.txt file for additional details). Data passed to the program must be the price levels; the necessary transformations are carried out directly by the estimation routines.
• In a second step the model is chosen, estimated and, if desired, simulated.
• Finally, the outputs of the model are printed and graphed to the screen and saved.

29 Ox is an object-oriented matrix programming language with a comprehensive mathematical and statistical function library, whose major features are speed, an extensive library and well-designed syntax, leading to programs which are easy to maintain. In particular, this program takes advantage of the concept of class: it is possible to create new classes based on existing ones and to use their functions, therefore avoiding the need to rewrite them for the new class. In our case, the program is built on the Database class, which is the class designed for handling databases, samples, names of variables, etc. The Database class is used as a starting point for the more specific class of Stochastic Volatility Model: the functions to estimate and simulate the models and to store the results are totally rewritten, while the functions that manage the time series are part of the Database class. More information on Ox can be found at http://www.nuff.ox.ac.uk/users/doornik/. See also Doornik (2001).
30 Updated versions of the program will be available for download at the address www.greta.it (under the Working Papers section). The package is free of charge for academic and research purposes. For commercial use, please contact the author (mgobbo@greta.it).
31 Of course, it is possible to modify the program by adding functions belonging to the loaded Database class or to other classes. In this case, the class containing the desired functions must be loaded by adding a line of code (#import ". . .") before the "main" command.




Figure 8.2 The main program "volatilitymain.ox" loaded with GiveWin with the full list of commands


The variables available in the Excel file "series.xls" are the daily stock indexes analysed in Section 8.1.1, i.e. the FTSE100, the CAC40 and the MIB30 indexes. In the example developed in this chapter attention will be focused on the modelling of the FTSE100 index. In the first part of the program, these variables are loaded in Excel format. Then the full sample of FTSE100 is selected in order to start the analysis and perform the estimation. The command "Estimate" is quite complex and requires inputs from the user: the type of model, the choice of the initialisation of the parameter values, and the values themselves if the user wants to define them (see Figure 8.3 and the readme.txt file).


8.7.1.1 Estimation

The package allows the analysis of the two basic models, i.e. the log-normal SV model (8.2), called ARSV in the program, which is estimated by quasi-maximum likelihood with the Kalman filter, and the two-regime switching model (8.3), called SRSV, which is estimated by maximum likelihood with the Hamilton filter.
The first model is:

    y_t = µ + ε_t exp(h_t/2)
    h_t = α + βh_{t−1} + η_t

with ε_t and η_t independent Gaussian white noises with variances 1 and σ²_η, respectively. The volatility equation is characterised by the constant parameter α, the autoregressive parameter β and the variance σ²_η of the volatility noise. The mean is either imposed equal to zero or estimated with the empirical mean of the series (see equation (8.12)).
268 Applied Quantitative Methods for Trading and Investment




Figure 8.3 The "Load", "Select" and "Estimate" commands

Since the specification of the conditional volatility is an autoregressive process of order one, the stationarity condition is |β| < 1. Moreover, the volatility σ_η must be strictly positive. In the estimation procedure the following logistic and logarithmic reparameterisations:

    β = 2 exp(b)/(1 + exp(b)) − 1,    σ_η = exp(s_η)

have been considered in order to satisfy the above constraints.
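In code, the mapping from the unconstrained parameters (b, s_η) used by the optimiser back to (β, σ_η) is simply (a small Python sketch of the same transformations):

import numpy as np

def to_constrained(b, s_eta):
    # logistic map to (-1, 1) and log map to (0, inf)
    beta = 2.0 * np.exp(b) / (1.0 + np.exp(b)) - 1.0
    sigma_eta = np.exp(s_eta)
    return beta, sigma_eta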
The second model is a particular specification of the regime switching model introduced by Hamilton. Precisely, the distribution of the returns is described by two regimes with the same mean but different variances, and by a constant transition matrix:

    y_t = µ + σ_0 ε_t  if s_t = 0
          µ + σ_1 ε_t  if s_t = 1

    P = [ p_00      1 − p_11 ]
        [ 1 − p_00      p_11 ]

where32 s_t is a two-state Markov chain independent of ε_t, which is a Gaussian white noise with unit variance. The parameters of this model are the mean µ, the low and high standard deviations σ_0, σ_1 and the transition probabilities p_00, p_11 (also called regime persistence probabilities). As for the log-normal SV model, the logarithmic and logistic transformations ensure the positiveness of the volatilities and constrain the transition probabilities to assume values in the (0,1) interval.
Before starting the estimation it is necessary to transform the raw time series, expressed in levels, into logarithmic returns33 and to set the starting values of the parameters in

32 According to the model (8.3), σ_0 = exp(α/2) and σ_1 = exp((α + β)/2).
33 In the Excel file "series.xls" there are 899 observations of the daily stock indexes analysed in Section 8.1.1. In the estimation we therefore consider 898 daily return observations.

the maximisation algorithm.34 Moreover, for the log-normal SV model the returns are transformed as follows:

    y*_t = log((y_t − ȳ)²) + 1.27    (8.12)

where ȳ is the empirical mean. Thus, for the log-normal SV model the mean is not estimated but is simply set equal to the empirical mean.
While these transformations are performed automatically by the estimation procedure, the setting of the starting parameter values requires a choice by the user from the following options.

• Random initialisation: a range of possible values of the parameters is fixed, where necessary, and a value is randomly extracted. This method is useful when the user has no idea about the possible values of the parameters but wants to investigate the parameter space more thoroughly. The drawback of this option is that the optimisation algorithm may be quite time-consuming, because it needs more iterations to converge; moreover, the probability that it does not converge to the global maximum increases, so several optimisation runs (with different random starting values) may be required.
• Data-driven initialisation: the starting values of the parameters are calculated from the time series analysed. For example, the sample mean is used as an approximation of the mean of the switching regime model, and the empirical variance multiplied by appropriate factors is used for the high and low variances. This alternative helps the user to speed up convergence even if he has no opinion on the possible values of the parameters.
• User initialisation: the starting values of the parameters are directly inserted by the user.

In the example, the data-driven initialisation has been selected.
During the estimation it is possible to control each step of the algorithm through the command MaxControl (see the readme.txt file for more information). The estimation output is then given by the estimated values of the parameters, their standard errors and the corresponding significance test statistics.35
Figure 8.4 shows the final output of the log-normal SV model for the FTSE100 index. In this example the numerical optimisation ends after 76 iterations, which take 8.32 seconds,36 and the log-likelihood37 is −1153.7. The volatility of the FTSE100 index is very persistent; in fact the autoregressive coefficient of the volatility equation (β) is equal to 0.956. In practice, for financial time series this coefficient is very often bigger than 0.9.
Figure 8.5 exemplifies the graphic output, which consists of the estimated volatility for the FTSE100 index along with the historical return series. The estimated volatility is obtained by using the Kalman smoother ĥ_{t/T} = E(h_t|Y*^T), which is however not immediately useful. In fact, we are interested in E(σ_t|Y^T) = E(exp(h_t/2)|Y^T), but E(exp(h_t/2)|Y^T) ≠ exp(E(h_t/2|Y^T)). Thus, we consider a first-order Taylor expansion
34 Recall that data transformations are carried out directly by the programs, which require price levels as input.
35 The standard errors are calculated following Ruiz (1994) for the log-normal SV model and as the inverse of the information matrix for the switching regime model. In both cases the z-statistics asymptotically follow an N(0,1) distribution.
36 On a Pentium III 933 MHz.
37 For the log-normal SV model the log-likelihood is computed with the Kalman filter for the transformed series y*_t, see equation (8.12).




Figure 8.4 Estimation output of the log-normal SV model for the FTSE100 index


Figure 8.5 Historical returns and estimated volatility for the FTSE100 index obtained with the
log-normal SV model

of exp(h_t/2) around ĥ_{t/T} and, computing the conditional mean, we estimate the volatility in the following way:

    σ̂_{t/T} = E[exp(h_t/2)|Y^T] ≅ exp(ĥ_{t/T}/2) + (1/8) exp(ĥ_{t/T}/2) Q_{t/T}

where Q_{t/T} denotes the smoothed variance of h_t.

This computation is performed directly by the program and Figure 8.5 presents the esti-
mated volatility.
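Given the smoothed means ĥ_{t/T} and smoothed variances Q_{t/T} produced by a Kalman smoother run on the linearised model (8.6), the correction amounts to one line (a hypothetical Python sketch, algebraically equal to exp(ĥ_{t/T}/2)(1 + Q_{t/T}/8)):

import numpy as np

def smoothed_volatility(h_hat, Q):
    # sigma_hat_{t/T} = exp(h_hat/2) + (1/8) exp(h_hat/2) Q
    return np.exp(h_hat / 2.0) * (1.0 + Q / 8.0)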
Figure 8.6 shows the final output38 of the switching regime model for the FTSE100 index. In this case the numerical optimisation ends after 26 iterations, which take 7.65 seconds, and the log-likelihood39 is −2680.18. For this model we can judge the persistence



Figure 8.6 Estimation output of the switching regime model for the FTSE100 index

38 It is important to underline that the z-statistics for the transition probabilities are not useful for testing p_ii = 0 or p_ii = 1, i = 0, 1. In fact, these tests are not standard since they imply testing for the presence of two regimes (see Davies (1977, 1987) and Hansen (1992, 1996)).
39 In this case the log-likelihood is computed with the Hamilton filter for the return series and thus it is not directly comparable with the log-likelihood of the log-normal SV model.

of the volatility by the values taken by the transition (or persistence) probabilities p_00, p_11. They are very high (0.99 and 0.96), confirming the high persistence of the volatility of the FTSE100 index. Moreover, the levels of the high and low volatility are perfectly in line with the values of the volatility estimated with the log-normal SV model.
In Figure 8.7 the graphic output of the switching regime model is presented. It consists of the historical return series, the weighted or estimated volatility and the estimated switches between regimes.40
To estimate the volatility we consider the output of the Kim smoother. Since σ_t = exp(α/2)(1 − s_t) + exp((α + β)/2)s_t = σ_0(1 − s_t) + σ_1 s_t, we can compute:

    σ̂_{t/T} = E(σ_t|Y^T) = σ_0 P(s_t = 0|Y^T) + σ_1 P(s_t = 1|Y^T)    (8.13)

where

    P(s_t = 0|Y^T) = P(h_t = α|Y^T)  and  P(s_t = 1|Y^T) = P(h_t = α + β|Y^T)
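Equivalently, in code (a hypothetical Python sketch, taking as input the smoothed probabilities of the high volatility regime produced by the Kim smoother):

import numpy as np

def weighted_volatility(p_high, alpha, beta):
    # eq. (8.13): sigma0 P(s_t=0|Y) + sigma1 P(s_t=1|Y),
    # with sigma0 = exp(alpha/2) and sigma1 = exp((alpha+beta)/2)
    sigma0 = np.exp(alpha / 2.0)
    sigma1 = np.exp((alpha + beta) / 2.0)
    return sigma0 * (1.0 - p_high) + sigma1 * p_high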

Finally, it is possible to save the estimated volatility. Since the visualisation of the graphs is possible only with the commercial version of Ox, if this is not available the program allows only the saving of the estimated volatility series. The "SeriesEst" command allows the saving of the following series in Excel format:41 historical returns,

Figure 8.7 Historical returns, weighted volatility and estimated switches between regimes for
the FTSE100 index obtained with the regime switching model

40 The regime is 0 if P(h_t = α|Y^T) ≥ 0.5 and 1 otherwise.
41 In the output file, for the ARSV model, Var1 indicates the historical returns and Var2 the estimated volatilities. For the SRSV, Var1 indicates the historical returns, Var2 the estimated volatilities, Var3 and Var4 the smoothed and filtered probabilities of the high volatility regime and Var5 the regime shifts.




Figure 8.8 The "Simulation", "SeriesEst", "SeriesSim" and "Graph" commands


estimated volatilities and for the switching regime model the smoothed and ¬ltered prob-
abilities of the high volatility regime and the regime shifts. Moreover, the graphs can
be directly saved in a postscript format with the “Graph” command. In both cases the
user should provide a path including the name and extension of the destination ¬le. The
“Graph” command includes also an additional control variable to choose whether or not
to plot the series (see Figure 8.8).


8.7.1.2 Simulation

The Volatility program also allows simulation of both models. The "Simulation" command gives the possibility to choose the type of model, the values of the parameters and the length of the simulated series. If the user wants to simulate the model characterised by the parameters just estimated, the last input of the "Simulation" command must be set to 0; otherwise it has to be replaced by the column vector of the desired parameters.
The graphic output of the simulation is composed of the simulated series and their volatilities. A final possibility is to plot both the estimation and simulation phases (see Figures 8.9 and 8.10). In particular, for the switching regime model the program plots the simulated volatility, which jumps between the low and high levels, and the smoothed (weighted) simulated volatility, which is computed in the same way as the estimated volatility (see equation (8.13)).


Figure 8.9 Estimation and simulation graphic output of the log-normal SV model


0.05 historical returns weighted volatility

0.020
0.00
0.015

’0.05
0 200 400 600 800 0 200 400 600 800

1.0 simulated returns
regime shifts
0.05

0.5 0.00

’0.05

0 500 1000 1500 2000 2500
0 200 400 600 800

1.0 simulated regime shifts
weighted simulated volatility


0.020
0.5
0.015


0 500 1000 1500 2000 2500 0 500 1000 1500 2000 2500

Figure 8.10 Estimation and simulation graphic output of the switching regime model

Finally, it is possible to save the simulated volatility. The "SeriesSim" command allows the saving of the following series in Excel format:42 simulated returns, simulated volatilities and, for the switching regime model, the smoothed probabilities of the high volatility regime and the regime shifts.

8.7.1.3 Forecasting
The final output given by the Volatility program is the forecasted volatility for the following period. The Kalman and Hamilton filters also give the prediction density of h_{T+1}, so it is possible to forecast the next value of the volatility.
For the log-normal SV model, we consider a first-order Taylor expansion of exp(h_t/2) around ĥ_{T+1/T} and, by taking the conditional expectation, we forecast the volatility in the following way:

    σ̂_{T+1/T} = E[exp(h_{T+1}/2)|Y^T] ≅ exp(ĥ_{T+1/T}/2) + (1/8) exp(ĥ_{T+1/T}/2) Q_{T+1/T}

With regard to the switching regime model, since σ_t = σ_0(1 − s_t) + σ_1 s_t, we can forecast the volatility as follows:

    σ̂_{T+1/T} = E(σ_{T+1}|Y^T) = σ_0 P(s_{T+1} = 0|Y^T) + σ_1 P(s_{T+1} = 1|Y^T)

where

    P(s_{T+1} = 0|Y^T) = P(h_{T+1} = α|Y^T),    P(s_{T+1} = 1|Y^T) = P(h_{T+1} = α + β|Y^T)

are the prediction probabilities43 obtained with the last iteration of the Hamilton filter.
The forecasted volatility is evidenced in the output and is saved as the last value of the estimated volatility.44
Let us now consider some practical utilisations of the estimated volatilities.

8.7.2 Option pricing
As seen in Section 8.1.2, the option price in the Black and Scholes framework can be expressed as a conditional expectation given the current price of the underlying asset:

    C_t^BS = exp(−rτ) E_{S_{t+τ}|S_t}[max(S_{t+τ} − K, 0)]

42 For the ARSV model, Var1 indicates the simulated returns and Var2 the simulated volatilities. For the SRSV, Var1 indicates the simulated returns, Var2 the simulated volatilities, Var3 the smoothed probabilities of the high volatility regime and Var4 the regime shifts.
43 It is possible to obtain the prediction probabilities by multiplying the transition matrix P by the filtered probabilities (which are saved in the estimation output file as Var4), i.e.

    [ P(s_{t+1} = 0|Y^t) ]     [ P(s_t = 0|Y^t) ]
    [ P(s_{t+1} = 1|Y^t) ] = P [ P(s_t = 1|Y^t) ]

44 In the estimation output file, in the last row the non-zero value for the estimated volatility is the forecasted volatility, while the non-zero value for the filtered probability is the prediction probability of the high volatility regime (see footnote 43). All the other variables are set equal to zero.

where the dynamics of the asset are described by a geometric diffusion process and the expectation is taken with respect to the risk-neutral probability measure.
Since the Black and Scholes formula can be expressed as a function of the volatility only, great effort has been devoted to modelling its behaviour. While Black and Scholes assume that it is constant over the life of the option, a series of models proposed in the late 1980s supposes that it varies through time in a deterministic or stochastic way, in an attempt to capture the empirical features of option prices. In fact, an analysis of the volatility implied in market option prices (the so-called implied volatility) highlights that the volatility is neither constant through time nor independent of the strike price (the so-called "smile" and "sneer" effects) (see Rubinstein (1985)).
A very simple approach consists of using the volatility estimated with stochastic volatility models as input to the Black and Scholes formula. In that case, it is sufficient to use the forecasted volatility σ̂_{t/t−1} as the volatility parameter in the formula to obtain the option price:

    C_t = C_t^BS(σ̂_{t/t−1})
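In code, this plug-in pricing is immediate once the standard Black and Scholes call formula is available (a hypothetical Python sketch; the forecasted volatility and the time to maturity τ must be expressed in the same time unit, e.g. trading days):

import numpy as np
from scipy.stats import norm

def bs_call(S, K, r, sigma, tau):
    # standard Black and Scholes European call price
    d1 = (np.log(S / K) + (r + 0.5 * sigma**2) * tau) / (sigma * np.sqrt(tau))
    d2 = d1 - sigma * np.sqrt(tau)
    return S * norm.cdf(d1) - K * np.exp(-r * tau) * norm.cdf(d2)

# plug-in price with the SV forecast, e.g. daily quantities:
# C_t = bs_call(S_t, K, r_daily, sigma_forecast, tau_days)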

In the following we review some stochastic volatility models for option pricing, which consider the volatility as an exogenous stochastic process.
The path-breaking work on stochastic volatility models applied to option pricing is the paper by Hull and White (1987). The authors assume that both the underlying security S and the variance σ² follow a geometric diffusion process:

    dS = µS dt + σS dz
    dσ² = φσ² dt + ξσ² dω

where the correlation ρ between the two Brownian motions dz, dω is a constant with modulus less than one. Hull and White take ρ ≡ 0. Scott (1987) considers the case in which the volatility follows an Ornstein–Uhlenbeck process and also imposes the restriction ρ ≡ 0. Finally, Heston (1993) proposes the familiar mean-reverting square root process for the volatility:

    dS = µS dt + σS dz
    dσ² = γ(φ − σ²) dt + ξσ dω

where φ is the long-run average variance, and he allows the correlation ρ to be different from zero.
where φ is the long-run average variance and he takes the assumption ρ = 0.
Introducing stochastic volatility into the stochastic differential equation of the underlying asset creates several complications. A dynamic portfolio with only one option and one underlying asset is no longer sufficient to create a riskless investment strategy, since the stochastic differential equation for the option contains two sources of uncertainty. Unfortunately, it is impossible to eliminate the volatility market risk premium and correlation parameters from the partial differential equation using only one option and one underlying asset. Moreover, these parameters are difficult to estimate45 and extensive use of numerical techniques is required to solve the two-dimensional partial differential equation.

45 An exception occurs when the volatility is a deterministic function of the asset price or time. In this case it is possible to easily find a solution to the partial differential equation.

In the Hull and White (1987) formula, the option price is determined assuming that the volatility market risk premium is zero and that there is zero correlation between the two Brownian motions describing the underlying asset and the volatility, i.e. the volatility is uncorrelated with the asset price. With these assumptions and using a risk-neutral valuation procedure, they show that the price of an option with stochastic volatility is the Black and Scholes price integrated over the distribution of the mean volatility:

    C_t^HW = ∫ C_t^BS(σ̄²) g(σ̄²|σ_t²) dσ̄²

where

    σ̄² = (1/τ) ∫_t^{t+τ} σ²(u) du

and g(σ̄²|σ_t²) is the conditional probability density of the mean variance σ̄² over the period τ.
In the more general case of non-zero correlation, the framework becomes more complex, allowing only numerical solutions.
It can be observed that continuous time stochastic volatility provides an attractive and intuitive explanation for observed volatility patterns and for observed biases in implied volatility. Precisely, smiles, skews, and upward and downward term structures of implied volatility arise naturally from a stochastic volatility model. However, the fact that stochastic volatility models fit empirical patterns does not mean that those models are correct, and the biases in market prices may be the result of other factors not considered here, such as liquidity problems.
A work related to that of Hull and White is Naik (1993). While in the Hull and White specification the volatility follows a continuous diffusion process, Naik analyses the case where the instantaneous variance of an asset is subject to random discontinuous shifts. In particular, the volatility is described by a right-continuous Markov chain process: it remains in the same state for a random amount of time and then shifts to another state with transition probabilities determined by a matrix. In the case of a two-state volatility process the transition matrix is simply:

    P = [ p_00      1 − p_11 ]
        [ 1 − p_00      p_11 ]

Assuming that the underlying process is continuous, that the risk of a shift in volatility is diversifiable, and therefore not priced, and that the two processes are uncorrelated, the option valuation equation can be expressed in closed form as follows:

    C_t^N(σ_1) = ∫_0^τ C_t^BS(σ̄²(x)) g(x|σ_1) dx

where σ_1 indicates the high volatility level, σ̄²(x) = [σ_1²x + σ_0²(τ − x)]/τ, 0 ≤ x ≤ τ, and g(x|σ_1) denotes the unconditional density of the time spent by the volatility process in the high volatility state, given the current high volatility state. In the same way it is possible to determine the option price conditional on the current low volatility state (C_t^N(σ_0), with σ_0 the low volatility level).

As in the Hull and White model, the option price is the expectation of the Black and Scholes formula computed for the average future volatility, given the current state. Since two regimes are considered, the final option price can be obtained as a weighted average of the two conditional values C_t^N(σ_1) and C_t^N(σ_0).
This analysis can be extended by considering multiple states and correlation between changes of the underlying and shifts in volatility, but unfortunately in these cases the option price can be obtained only via numerical methods. In this kind of procedure a discrete time Markov chain is used as an approximation of the volatility process.
We briefly present two examples of this approach, due to Billio and Pelizzon (1997) and Bollen (1998) (see also Bollen et al. (2000)). Both these works are based on the hypothesis that the returns of the underlying asset follow a switching regime model. The distribution is characterised by a mixture of distributions with different variances, where the weights depend on a hidden Markov chain process which represents possible different volatility regimes of the market. To obtain the numerical solution, the basic idea is to approximate the distribution through a multinomial approach, considering a binomial tree for each of the two distributions characterised by different variances.
Following Cox et al. (1979), the future value of the underlying process can be expressed as:

    S_{t+Δt} = uS_t  with probability p
               dS_t  with probability 1 − p

One possible specification of the parameters that guarantees asymptotic normality and convergence to the desired mean and variance of the continuously compounded returns is u = exp(√(σ²Δt + r²Δt²)), d = u⁻¹ and p = (e^{rΔt} − d)/(u − d). In this case the process is fully characterised by the parameter representing the variance σ², since the step size Δt and the risk-free interest rate r are given. Once the variance is estimated it is possible to construct a tree to represent the possible future paths of the variable and hence the distribution of returns (at maturity).
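In code, the single-regime tree and its backward induction are a few lines (a hypothetical Python sketch using the parameterisation of the text):

import numpy as np

def binomial_call(S0, K, r, sigma, n, dt):
    # u = exp(sqrt(sigma^2 dt + r^2 dt^2)), d = 1/u, p = (e^{r dt} - d)/(u - d)
    u = np.exp(np.sqrt(sigma**2 * dt + r**2 * dt**2))
    d = 1.0 / u
    p = (np.exp(r * dt) - d) / (u - d)
    j = np.arange(n + 1)                              # number of up moves
    payoff = np.maximum(S0 * u**j * d**(n - j) - K, 0.0)
    for _ in range(n):                                # backward induction
        payoff = np.exp(-r * dt) * (p * payoff[1:] + (1 - p) * payoff[:-1])
    return payoff[0]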
If a regime switching model is considered, the distribution of the returns is simply a mixture of the distributions characterising each state. Therefore a discrete process can be used to approximate the continuous time process, and hence the distribution of returns, in each state. In this case, two binomial distributions are used as an approximation of the mixture of the two distributions:

    S_{t+Δt} = u_1 S_t | s_t = 1  with probability p_11
               u_0 S_t | s_t = 0  with probability p_00
               d_0 S_t | s_t = 0  with probability 1 − p_00
               d_1 S_t | s_t = 1  with probability 1 − p_11

where s_t denotes the regime.
In this case it is necessary to invoke the risk-neutral hypothesis to determine the values of the parameters, which are u_1 = exp(√(σ_1²Δt + r²Δt²)), d_1 = u_1⁻¹, p_11 = (e^{rΔt} − d_1)/(u_1 − d_1) for the high volatility regime and u_0 = exp(√(σ_0²Δt + r²Δt²)), d_0 = u_0⁻¹ and p_00 = (e^{rΔt} − d_0)/(u_0 − d_0) for the low volatility regime. This model is usually called a quadrinomial tree (or lattice). The inner two branches correspond to the low volatility regime while the outer ones correspond to the high volatility regime. Each set of probabilities (p_ii, 1 − p_ii) must be interpreted as the branch probabilities conditional on regime i, with i = 0, 1.

Although the quadrinomial tree represents both distributions accurately, its branches do not recombine efficiently, exhibiting an exponential growth of the computational time as the number of steps increases.
Bollen (1998) proposes a method to overcome this problem and develops a pentanomial tree. The definition of the parameters is modified to yield both the possibility to recombine the tree and the convergence of the discrete process to the mixture of distributions. This is obtained by approximating each regime density by a trinomial distribution instead of a binomial one. The modified tree has five evenly spaced branches, because the step sizes of the two regimes are in the ratio 1:2 (u_1 = exp(√(σ_1²Δt + r²Δt²)), u_0 = exp(½√(σ_1²Δt + r²Δt²))) and the middle branch is shared by the two regimes:

    S_{t+Δt} = u_1 S_t | s_t = 1  with probability p_{1,u}
               u_0 S_t | s_t = 0  with probability p_{0,u}
               S_t               with probability p_m
               d_0 S_t | s_t = 0  with probability p_{0,d}
               d_1 S_t | s_t = 1  with probability p_{1,d}

where p_{s_t,u}, p_{s_t,d} are the probabilities of going up and down conditional on the regime s_t and p_m is the probability of remaining at the same price level S_t.
Once the tree is generated, the option value is calculated working backward from the terminal values, i.e. the payoffs, to the valuation time. In the regime switching approaches, for simplicity two conditional option values are calculated at each step, where the conditioning information is the regime at the valuation time t:

    C_t(s_t = 0) = [p_00 C_{t+1}(s_{t+1} = 0) + (1 − p_00) C_{t+1}(s_{t+1} = 1)] e^{−rΔt}
    C_t(s_t = 1) = [(1 − p_11) C_{t+1}(s_{t+1} = 0) + p_11 C_{t+1}(s_{t+1} = 1)] e^{−rΔt}

At the valuation time the value of the option is obtained as a weighted average of the two conditional option values, where the weights depend on the knowledge of the current regime. If the regime is unknown, the unconditional probabilities:

    (1 − p_11)/(2 − p_00 − p_11),    (1 − p_00)/(2 − p_00 − p_11)

are used.
Let us take an example of the pentanomial approach by considering the parameters estimated in Section 8.7.1.1.
We deal with a European call option on the FTSE100 index quoted at LIFFE on 22 August 2000 (when the FTSE100 index quoted 6584.82), with maturity June 2001 and strike price 5900. The risk-free interest rate r is approximated by the three-month Libor46 and the time to maturity in terms of trading days is 213. In this example Δt = 1.

46 London Interbank Offered Rate.

Taking the estimated volatilities of Section 8.7.1.1, the parameters of the pentanomial model can be computed as follows (writing δ = √(σ_1²Δt + r²Δt²)):

    u_1 = e^δ,    u_0 = e^{δ/2},    d_0 = e^{−δ/2},    d_1 = e^{−δ}

    p_{1,u} = (e^{rΔt} − e^{−δ})/(e^δ − e^{−δ}),    p_{1,d} = 1 − p_{1,u}

    p_m = 1 − 4(σ_0²Δt + r²Δt²)/(σ_1²Δt + r²Δt²)

    p_{0,u} = [e^{rΔt} − e^{−δ/2} − p_m(1 − e^{−δ/2})]/(e^{δ/2} − e^{−δ/2}),    p_{0,d} = 1 − p_{0,u} − p_m
Once the tree is generated, the payoffs are calculated with the usual procedure (see Appendix A for details). The conditional option values C_t(s_t = 0), C_t(s_t = 1) are obtained using the estimated transition probabilities, while the final value of the option is a weighted average where the weights are the unconditional probabilities:

    (1 − p_11)/(2 − p_00 − p_11),    (1 − p_00)/(2 − p_00 − p_11)
The pentanomial option value is therefore 1055.14, while the market value was 1047.5.
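The valuation can be reproduced with the self-contained Python sketch below (hypothetical code, not the chapter's program; it assumes Δt = 1 as in the example, a European payoff, and the convention that the regime at time t governs the move from t to t + 1):

import numpy as np

def pentanomial_call(S0, K, r, sig0, sig1, p00, p11, n, dt=1.0):
    # Pentanomial lattice: log-price grid with spacing delta/2; regime 1
    # moves +/-2 grid steps (binomial), regime 0 moves +1/0/-1 (trinomial).
    delta = np.sqrt(sig1**2 * dt + r**2 * dt**2)
    disc = np.exp(-r * dt)
    p1u = (np.exp(r * dt) - np.exp(-delta)) / (np.exp(delta) - np.exp(-delta))
    pm = 1.0 - 4.0 * (sig0**2 * dt + r**2 * dt**2) / (sig1**2 * dt + r**2 * dt**2)
    p0u = (np.exp(r * dt) - np.exp(-delta / 2) - pm * (1 - np.exp(-delta / 2))) \
          / (np.exp(delta / 2) - np.exp(-delta / 2))
    p0d, p1d = 1.0 - p0u - pm, 1.0 - p1u
    levels = np.arange(-2 * n, 2 * n + 1)             # 4n+1 nodes at maturity
    V0 = np.maximum(S0 * np.exp(levels * delta / 2) - K, 0.0)  # payoff, s=0
    V1 = V0.copy()                                             # payoff, s=1
    for k in range(n, 0, -1):
        idx = np.arange(-2 * (k - 1), 2 * (k - 1) + 1) + 2 * n
        e0 = lambda V: p0u * V[idx + 1] + pm * V[idx] + p0d * V[idx - 1]
        e1 = lambda V: p1u * V[idx + 2] + p1d * V[idx - 2]
        c0 = disc * (p00 * e0(V0) + (1 - p00) * e0(V1))   # C_t(s_t = 0)
        c1 = disc * ((1 - p11) * e1(V0) + p11 * e1(V1))   # C_t(s_t = 1)
        V0, V1 = np.zeros(4 * n + 1), np.zeros(4 * n + 1)
        V0[idx], V1[idx] = c0, c1
    w0 = (1 - p11) / (2 - p00 - p11)                  # unconditional P(s=0)
    return w0 * V0[2 * n] + (1 - w0) * V1[2 * n]

With the chapter's inputs, a call such as pentanomial_call(6584.82, 5900, r, sig0, sig1, 0.99, 0.96, n=213), with r and the two volatilities expressed per trading day, should come out close to the 1055.14 reported above, up to the exact parameter estimates and day-count conventions used.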


8.7.3 Value at risk

VaR is a very intuitive measure to evaluate market risk because it indicates the maximum
potential loss at a given level of confidence (a) for a portfolio of financial assets over a
specified time horizon (h).
In practice, the value of a portfolio is expressed as a function of K risk factors,

$$x_\tau = \sum_{i=1}^{N} w_{i,t} P_{i,\tau}(\chi_{1,\tau}, \ldots, \chi_{K,\tau})$$

The factors influencing the portfolio value are usually identified with some market
variables such as interest rates, exchange rates or stock indexes. If their distribution is
known in a closed form, we need to estimate the distribution of the future value of the
portfolio conditional on the available information, and the VaR is then the solution to:

$$a = \int_{-\infty}^{\mathrm{VaR}(h,a)} f(x_{t+h})\, dx$$

Different parametric models can be used to forecast the portfolio return distribution. The
simplest way to calculate VaR assumes that the risk factor returns follow a multivariate
normal distribution conditional on the available information. If the portfolio return
depends linearly on them, its probability distribution is also normal and the VaR is
simply the relevant quantile of this analytic distribution. If the linear assumption is
inappropriate, the portfolio return can be approximated as a quadratic function of the
risk factor returns.
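As a simple illustration, a minimal sketch of this delta-normal computation; the covariance numbers and weights are made up for the example:

```python
import numpy as np
from scipy.stats import norm

def delta_normal_var(weights, cov, a=0.05, h=1):
    # St. dev. of the linear portfolio return, scaled to the h-period horizon
    sigma_p = np.sqrt(weights @ cov @ weights * h)
    return -norm.ppf(a) * sigma_p          # VaR reported as a positive loss

cov = np.array([[1.0e-4, 2.0e-5],          # illustrative daily factor covariances
                [2.0e-5, 2.5e-4]])
print(delta_normal_var(np.array([0.6, 0.4]), cov, a=0.01))
```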
An alternative way to handle the non-linearity is to use Monte Carlo simulation. The
idea is to repeatedly simulate the random processes governing the risk factors. Each
simulation gives a possible value for the portfolio at the end of the time horizon. If
enough of these simulations are considered, it is possible to infer the VaR as the relevant
quantile of the simulated distribution.
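A corresponding Monte Carlo sketch; `revalue` stands for any (possibly non-linear) mapping from simulated factor returns to the portfolio value change, and is an assumption of this example:

```python
import numpy as np

rng = np.random.default_rng(seed=42)

def mc_var(revalue, mu, cov, a=0.05, n_sims=100_000):
    # Simulate the joint factor returns, revalue the portfolio on each draw,
    # and read the VaR off the simulated loss distribution
    factors = rng.multivariate_normal(mu, cov, size=n_sims)
    pnl = np.array([revalue(f) for f in factors])
    return -np.quantile(pnl, a)

# Example: linear portfolio, so this agrees closely with the delta-normal VaR
w = np.array([0.6, 0.4])
print(mc_var(lambda f: f @ w, mu=np.zeros(2),
             cov=np.diag([1e-4, 2.5e-4]), a=0.01))
```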
Since market risk factors usually have fatter tails than the normal distribution, it is also
possible to use historical simulation rather than a parametric approach. The idea behind
this technique is to use the historical distribution of returns on the assets in the portfolio
to simulate the portfolio's VaR, under the hypothetical assumption that we held this
portfolio over the period covered by our historical data set. Historical simulation thus
involves collecting asset returns over some observation period and applying the weights
of the current portfolio to compute the hypothetical returns we would have earned had we
held the current portfolio throughout that period. It is then assumed that this historical
distribution of returns is a good proxy for the portfolio return distribution over the next
holding period, and VaR is calculated as the relevant quantile of this distribution.
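In code, the historical simulation estimate reduces to a quantile of re-weighted past returns; the array shapes are assumptions of this sketch:

```python
import numpy as np

def historical_var(asset_returns, weights, a=0.05):
    # asset_returns: (T, N) matrix of past returns; weights: current portfolio.
    # Hypothetical P&L had today's portfolio been held over the sample period.
    hypothetical = asset_returns @ weights
    return -np.quantile(hypothetical, a)
```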
The advantage of the parametric approach is that the factor variance–covariance matrix
can be updated using a general model of changing or stochastic volatility. The main dis-
advantage is that the factor returns are usually assumed to be conditionally normal, losing
the possibility of taking non-linear dependence among them into account. Historical sim-
ulation has the advantage of reflecting the historical multivariate probability distribution
of the risk factor returns, avoiding ad hoc assumptions. However, the method suffers
serious drawbacks. Its main disadvantage is that it does not incorporate volatility updat-
ing. Moreover, extreme quantiles are difficult to estimate, as extrapolation beyond past
observations is impossible. Finally, quantile estimates tend to be very volatile whenever
a large observation enters the sample and the database is not sufficiently large.
The ability of the parametric approach to update the volatility suggests the simplest
use of the SV models for the VaR computation. Having chosen the asset or portfolio
distribution (usually the normal one), it is possible to use the forecasted volatility to
characterise the future return distribution. Thus, $\hat{\sigma}_{T+1/T}$ can be used to calculate the VaR
over the next period.
A different approach using the SV model is to devolatilise the observed returns series
and to revolatilise it with an appropriate forecasted value, obtained with a particular model
of changing volatility. This approach is considered in several recent works (Barone-Adesi
et al., 1998; Hull and White, 1998) and is a way of combining different methods and
partially overcoming the drawbacks of each.
To make the historical simulation consistent with empirical findings, the log-normal
SV model and the regime switching model may be considered to describe the volatility
behaviour. Past returns are standardised by the estimated volatility to obtain standardised
residuals. Statistical tests can confirm that these standardised residuals behave approxi-
mately as an iid series which exhibits heavy tails. Historical simulation can then be used.
Finally, to adjust them to the current market conditions, the randomly selected standardised
residuals are multiplied by the forecasted volatility obtained with the SV model.
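A minimal sketch of this devolatilise/revolatilise scheme; `sigma_hist` (the in-sample volatility path) and `sigma_forecast` are assumed to come from an already-fitted SV or regime switching model, which is not reproduced here:

```python
import numpy as np

rng = np.random.default_rng(seed=42)

def filtered_hist_var(returns, sigma_hist, sigma_forecast, a=0.05, n_sims=100_000):
    std_resid = returns / sigma_hist              # devolatilise: ~iid residuals
    draws = rng.choice(std_resid, size=n_sims)    # historical simulation
    simulated = draws * sigma_forecast            # revolatilise to current market
    return -np.quantile(simulated, a)
```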
Table 8.2  VaR at different confidence levels for the FTSE100 index return

Confidence level    Log-normal SV model    Regime switching model
0.1                 2.5442                 2.7503
0.05                3.9298                 3.5874
0.01                5.3417                 4.6502

For example, Table 8.2 shows the results obtained with the FTSE100 index return
by considering 1 000 000 historical simulations47 (see Appendix B for details). Clearly,
this approach allows a wide range of stochastic and changing volatility models, such as
ARCH–GARCH models, to be considered. Moreover, it must be pointed out that, instead
of using historical simulation, an appropriate standard distribution can also be considered
to model the transformed returns, and then several probability distributions can be assumed
for the unconditional returns (McNeil and Frey, 2000; Eberlein et al., 2001).
Another example of a parametric approach to VaR calculation considers the hypothesis
that a regime switching model governs the asset returns (Billio and Pelizzon, 2000):

$$y_t = \begin{cases} \mu + \sigma_0 \varepsilon_t & s_t = 0 \\ \mu + \sigma_1 \varepsilon_t & s_t = 1 \end{cases}$$

where $\varepsilon_t \sim N(0,1)$ is independent of $s_t$.
To calculate the VaR it is necessary to determine the value of the conditional distribution
for which the cumulative density is a, i.e.

$$a = \int_{-\infty}^{\mathrm{VaR}(h,a)} \sum_{s_{t+h}=0,1} P(s_{t+h} \mid I_t)\, f_{s_{t+h}}(y_{t+h} \mid I_t)\, dy$$

where $f_{s_{t+h}}(y_{t+h} \mid I_t)$ is the probability density of $y_{t+h}$ when the regime is $s_{t+h}$, conditional
on the available information set $I_t$ (usually containing past returns), and $P(s_{t+h} \mid I_t)$ is
the prediction probability obtained by the Hamilton filter.
Given the parameters estimated in Section 8.7.1.1 for the switching regime model and
the prediction probabilities at time t + h (obtained by the product of the transition matrix,
for h − 1 steps, and the conditional probabilities $P(s_{t+1} \mid I_t)$ given by the Hamilton filter),
the VaR is the relevant quantile of the mixture distribution.
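A minimal sketch of reading the VaR off this two-component normal mixture by root finding; the parameter values are illustrative placeholders, not the Section 8.7.1.1 estimates:

```python
from scipy.stats import norm
from scipy.optimize import brentq

def mixture_var(mu, sigmas, probs, a=0.05):
    # Mixture CDF: prediction-probability-weighted normal CDFs
    cdf = lambda x: sum(p * norm.cdf(x, mu, s) for p, s in zip(probs, sigmas))
    lo, hi = mu - 10 * max(sigmas), mu + 10 * max(sigmas)
    return -brentq(lambda x: cdf(x) - a, lo, hi)   # quantile, reported as a loss

# Illustrative values: P(s_{t+h}=0 | I_t) = 0.7, sigma_0 < sigma_1
print(mixture_var(mu=0.0003, sigmas=[0.008, 0.02], probs=[0.7, 0.3], a=0.01))
```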
The model can be generalised to the case of N risky assets by providing an explicit link
between the return on each asset and the return on the market index, thus explicitly
modelling the correlation between different assets. The Multivariate Switching Regime
Beta Model (Billio and Pelizzon, 2000) is a sort of market model or, better, a single factor
model in the Arbitrage Pricing Theory framework, where the return of a single stock i is
characterised by the regime switching of the market index and the regime switching of
the specific risk of the asset. It can be written as:
$$\begin{cases} y_{mt} = \mu_m(s_t) + \sigma_m(s_t)\,\varepsilon_t & \varepsilon_t \sim \mathrm{IIN}(0,1) \\ y_{1t} = \mu_1(s_{1t}) + \beta_1(s_t, s_{1t})\,y_{mt} + \sigma_1(s_{1t})\,\varepsilon_{1t} & \varepsilon_{1t} \sim \mathrm{IIN}(0,1) \\ y_{2t} = \mu_2(s_{2t}) + \beta_2(s_t, s_{2t})\,y_{mt} + \sigma_2(s_{2t})\,\varepsilon_{2t} & \varepsilon_{2t} \sim \mathrm{IIN}(0,1) \\ \quad\vdots \\ y_{Nt} = \mu_N(s_{Nt}) + \beta_N(s_t, s_{Nt})\,y_{mt} + \sigma_N(s_{Nt})\,\varepsilon_{Nt} & \varepsilon_{Nt} \sim \mathrm{IIN}(0,1) \end{cases}$$
47
Each operation takes about 12 seconds.

where $y_{mt}$ is the market return, the market regime variable $s_t$ and the single stock regime
variables $s_{jt}$, $j = 1, \ldots, N$, are independent Markov chains, and $\varepsilon_t$ and $\varepsilon_{jt}$, $j = 1, \ldots, N$, are
independently distributed.
Using this approach it is possible to take into account the correlation between different
assets. In fact, the variance–covariance matrix between the two assets i and j is:

$$\Sigma(s_t, s_{it}, s_{jt}) = \begin{pmatrix} \beta_i^2(s_t, s_{it})\,\sigma_m^2(s_t) + \sigma_i^2(s_{it}) & \beta_i(s_t, s_{it})\,\beta_j(s_t, s_{jt})\,\sigma_m^2(s_t) \\ \beta_i(s_t, s_{it})\,\beta_j(s_t, s_{jt})\,\sigma_m^2(s_t) & \beta_j^2(s_t, s_{jt})\,\sigma_m^2(s_t) + \sigma_j^2(s_{jt}) \end{pmatrix}$$

so the correlation between different assets depends on the extent to which each asset is
linked, through the factor loading β, to the market index.
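A direct transcription of this matrix in Python, with the regime-evaluated betas and sigmas passed in as plain numbers (the names are illustrative):

```python
import numpy as np

def switching_beta_cov(beta_i, beta_j, sigma_m, sigma_i, sigma_j):
    # Conditional covariance matrix of assets i and j, given the regimes
    # (beta/sigma values already evaluated at s_t, s_it, s_jt)
    cross = beta_i * beta_j * sigma_m**2
    return np.array([
        [beta_i**2 * sigma_m**2 + sigma_i**2, cross],
        [cross, beta_j**2 * sigma_m**2 + sigma_j**2],
    ])
```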
To calculate VaR for a portfolio based on N assets it is sufficient to use the approach
presented above. In particular, considering two assets and assuming that the number of
regimes is two for all three Markov chains, we have:

$$a = \int_{-\infty}^{\mathrm{VaR}(h,a)} \sum_{s_{t+h}=0,1}\; \sum_{s_{i,t+h}=0,1}\; \sum_{s_{j,t+h}=0,1} P(s_{t+h}, s_{i,t+h}, s_{j,t+h} \mid I_t)\, f_{s_{t+h}, s_{i,t+h}, s_{j,t+h}}(y \mid I_t)\, dy$$


where $f_{s_{t+h}, s_{i,t+h}, s_{j,t+h}}(y \mid I_t)$ is the probability density of the portfolio return y when the
regimes are $s_{t+h}, s_{i,t+h}, s_{j,t+h}$, conditional on the available information set $I_t$. This dis-
tribution has mean $w'\mu(s_{t+h}, s_{i,t+h}, s_{j,t+h})$ and variance $w'\Sigma(s_{t+h}, s_{i,t+h}, s_{j,t+h})\,w$, where w
is the vector of the percentages of wealth invested in the two assets and $\mu(s_{t+h}, s_{i,t+h}, s_{j,t+h})$
is the vector of risky asset mean returns, i.e.

$$\mu(s_{t+h}, s_{i,t+h}, s_{j,t+h}) = \begin{pmatrix} \mu_i(s_{i,t+h}) + \beta_i(s_{t+h}, s_{i,t+h})\,\mu_m(s_{t+h}) \\ \mu_j(s_{j,t+h}) + \beta_j(s_{t+h}, s_{j,t+h})\,\mu_m(s_{t+h}) \end{pmatrix}$$

The drawback of this approach is that it requires the estimation of a number of parameters
that grows exponentially with the number of assets. In fact, the number of possible regimes
generated by this model is $2^{N+1}$.
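Putting the pieces together, a sketch of the two-asset VaR as the quantile of the eight-component normal mixture. The callables `mu_fn`, `cov_fn` and `prob_fn` stand for the regime-conditional mean vector, covariance matrix (e.g. `switching_beta_cov` above) and joint prediction probability, and are assumptions of this sketch:

```python
import numpy as np
from itertools import product
from scipy.stats import norm
from scipy.optimize import brentq

def portfolio_mixture_var(w, mu_fn, cov_fn, prob_fn, a=0.05):
    comps = []
    for regimes in product((0, 1), repeat=3):   # 2^(N+1) = 8 regime triples
        m = w @ mu_fn(*regimes)                 # conditional portfolio mean
        s = np.sqrt(w @ cov_fn(*regimes) @ w)   # conditional portfolio st. dev.
        comps.append((prob_fn(*regimes), m, s))
    # Mixture CDF over the regime triples; VaR is its a-quantile
    cdf = lambda x: sum(p * norm.cdf(x, m, s) for p, m, s in comps)
    lo = min(m - 10 * s for _, m, s in comps)
    hi = max(m + 10 * s for _, m, s in comps)
    return -brentq(lambda x: cdf(x) - a, lo, hi)
```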
One possible solution is to consider the idiosyncratic risk distributed as $\mathrm{IIN}(0, \sigma_i^2)$
(without a specific Markov chain dependency) and to characterise the systematic risk
with more than one source of risk. This approach is in line with the Arbitrage Pricing
Theory model, where the risky factors are characterised by switching regime processes.
Formally, we can write this model as:

$$F_{jt} = \alpha_j(s_{jt}) + \theta_j(s_{jt})\,\varepsilon_{jt}, \qquad \varepsilon_{jt} \sim \mathrm{IIN}(0,1), \quad j = 1, 2, \ldots, K$$

$$y_{it} = \mu_i + \sum_{j=1}^{K} \beta_{ij}(s_{jt})\,F_{jt} + \sigma_i\,\varepsilon_{it}, \qquad \varepsilon_{it} \sim \mathrm{IIN}(0,1), \quad i = 1, 2, \ldots, N$$

