binding function by using the score vector ∂f_a(y_t | Y^{t−1}; π)/∂π (score generator) to define the matching conditions. If the auxiliary model f_a(y_t | Y^{t−1}; π) is chosen flexibly with a suitable non-parametric interpretation, then the estimator achieves the asymptotic efficiency of maximum likelihood and has good power properties for detecting misspecification (Gallant and Long, 1997; Tauchen, 1997), hence the term Efficient Method of Moments (EMM). EMM delivers consistent estimates of the structural parameter vector under weak conditions on the choice of the auxiliary model. However, extrapolating from the generalised method of moments evidence, it is natural to conjecture that the quality of inference may hinge on how well the auxiliary model approximates the salient features of the observed data. This intuition is formalised by Gallant and Long (1997), who show that a judicious selection of the auxiliary model, ensuring that the quasi-scores asymptotically span the true score vector, will result in full asymptotic efficiency.¹⁹

¹⁷ For all these methods, it is necessary to recycle the random numbers used in the calculation when θ changes, in order to obtain good numerical and statistical properties of the estimators based on these simulations.

¹⁸ Another possible auxiliary model is an ARMA(p, q) on the logarithms of the squared data (see Monfardini (1998)).

¹⁹ In fact, as the score generator approaches the true conditional density, the estimated covariance matrix for the structural parameter approaches that of maximum likelihood. This result embodies one of the main advantages of EMM, since it prescribes a systematic approach to the derivation of efficient moment conditions for estimation in a general parametric setting.

258 Applied Quantitative Methods for Trading and Investment

Andersen et al. (1999) perform an extensive Monte Carlo study of EMM estimation of a stochastic volatility model. They examine the sensitivity to the choice of auxiliary model using ARCH, GARCH and EGARCH models for the score, as well as non-parametric extensions. EMM efficiency approaches that of maximum likelihood for larger sample sizes; inference is sensitive to the choice of auxiliary model in small samples, but robust in larger samples.²⁰

The indirect inference theory, however, crucially depends on the assumption that the structural model is correctly specified. There is now an emerging literature (see, for example, Dridi and Renault (2000) and Dridi (2000)) which focuses on procedures more robust to the structural model specification. In particular, Dridi and Renault (2000) propose an extension of the indirect inference methodology to semiparametric settings and show how semiparametric indirect inference works on basic examples using SV models.

8.4.3.2.2 Importance sampling

A more direct way of performing inference is to compute the likelihood by integrating out the latent h_t process. As previously seen, the integral (8.8) has no closed form and has to be computed numerically. However, the likelihood function naturally appears as the expectation of the function ∏_{t=1}^T f(y_t | Y^{t−1}, H^t; θ) with respect to the p.d.f. P defined by²¹ ∏_{t=1}^T f(h_t | Y^{t−1}, H^{t−1}; θ), from which it is easy to draw recursively. Therefore, an unbiased simulator of the whole likelihood function ℓ_T(θ) is ∏_{t=1}^T f(y_t | Y^{t−1}, ₙH̃^t; θ), where the path ₙH̃^t is recursively drawn from the auxiliary p.d.f. P. The likelihood is then approximated by the empirical mean:

(1/N) ∑_{n=1}^N ∏_{t=1}^T f(y_t | Y^{t−1}, ₙH̃^t; θ)

and this simulated likelihood can be numerically maximised. However, this basic simulator may be very slow, in the sense that it may have a very large variance, and some accelerating technique is then needed. One solution is to consider the general method of importance sampling, based on a sequence of conditional p.d.f.s q(h_t | Y^T, H^{t−1}). Let us denote this probability distribution by Q and the corresponding expectation by E_Q.

We have:

ℓ_T(θ) = E_P [ ∏_{t=1}^T f(y_t | Y^{t−1}, H^t; θ) ]

       = E_Q [ ∏_{t=1}^T f(y_t | Y^{t−1}, H^t; θ) f(h_t | Y^{t−1}, H^{t−1}; θ) / q(h_t | Y^T, H^{t−1}; θ) ]

²⁰ Care must be taken, however, to avoid over-parameterisation of the auxiliary model, as convergence problems may arise if the quasi-score is extended to the point where it begins to fit the purely idiosyncratic noise in the data.

²¹ It is important to note that this p.d.f. is neither f(H^T; θ), except when y_t does not cause h_t, nor f(H^T | Y^T; θ).

Stochastic Volatility Models 259

Therefore, an unbiased simulator of ℓ_T(θ) is:

∏_{t=1}^T f(y_t | Y^{t−1}, ₙH̃^t; θ) f(ₙh̃_t | Y^{t−1}, ₙH̃^{t−1}; θ) / q(ₙh̃_t | Y^T, ₙH̃^{t−1}; θ)

where ₙH̃^T is drawn in Q. The problem is then how to choose the importance function: the natural answer is by reducing the Monte Carlo variance. It is easy to show that the theoretically optimal choice is f(H^T | Y^T; θ) = ∏_{t=1}^T f(h_t | Y^T, H^{t−1}) (i.e. the smoothing density of h_t), for which one simulation is sufficient, but it is clearly not computable. It is then possible to consider the smoothing density of an approximating model, or to fix a parametric family of importance functions, choosing the member that minimises the Monte Carlo variance (possibly computed in an approximate way). For the SV model (8.2), the first solution is proposed by Sandmann and Koopman (1998), using as approximating model the linearised version (8.6). In the spirit of the second solution, Danielsson and Richard (1993) propose a sequentially optimised importance sampling, which Danielsson (1994) applies to the SV model.²² In both cases, the simulated maximum likelihood estimates of the model parameters are obtained by numerical optimisation of the logarithm of the simulated likelihood.²³
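The structure of the basic simulator can be sketched in a few lines. The following is a minimal illustration, not the chapter's code: it draws N volatility paths from the AR(1) transition density of the log-normal SV model (the natural sampler P, so the importance weight reduces to the product of measurement densities) and averages them on the log scale with a log-sum-exp step for numerical safety. The function name and all parameter values are assumptions made for the example.

```python
import numpy as np

def sv_loglik_basic(y, alpha, beta, sigma_eta, n_draws=500, seed=0):
    """Basic simulated log-likelihood for y_t = eps_t*exp(h_t/2),
    h_t = alpha + beta*h_{t-1} + eta_t, drawing h from its own
    transition density; the weight is then prod_t f(y_t | h_t)."""
    rng = np.random.default_rng(seed)
    h = np.full(n_draws, alpha / (1.0 - beta))   # start at the stationary mean
    logw = np.zeros(n_draws)
    for t in range(len(y)):
        h = alpha + beta * h + sigma_eta * rng.standard_normal(n_draws)
        # accumulate log f(y_t | h_t) = log N(0, exp(h_t)) at y_t
        logw += -0.5 * (np.log(2.0 * np.pi) + h + y[t] ** 2 * np.exp(-h))
    m = logw.max()                               # log-sum-exp for stability
    return m + np.log(np.exp(logw - m).mean())
```

Recycling the same seed for every θ, as footnote 23 prescribes, keeps the simulated likelihood smooth in the parameters.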

8.4.3.2.3 Bayesian approach

In the Bayesian setting, there are also serious difficulties in estimating the SV model. In general, the posterior density f(θ | Y^T) and the posterior expectation of θ cannot be computed in closed form. Again, this complex setting requires a simulation-based approach. The data augmentation principle treats the latent variables h_t as nuisance parameters. Combined with Gibbs sampling (Gelfand and Smith, 1990), iterating simulations from f(H^T | Y^T, θ) (data augmentation step) and f(θ | Y^T, H^T) (parameter simulation step), it allows simulation from the joint posterior distribution f(H^T, θ | Y^T), derivation of the distribution of interest as the marginal distribution of θ, and approximation of the posterior expectation by a sample average. When conditional distributions cannot be directly simulated, the corresponding steps in the Gibbs algorithm are replaced by Metropolis–Hastings steps.²⁴ Moreover, the prior modelling of the parameters is usually quasi-non-informative.
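The alternation of the two steps can be sketched as follows. This is a hedged illustration of the general scheme, not the algorithm of any cited paper: α and β are held fixed for brevity, each h_t is proposed from the Gaussian conditional implied by its AR(1) neighbours and accepted against the measurement density (a valid Metropolis step), and σ²_η is drawn from a conjugate inverse-gamma conditional with illustrative hyperparameters.

```python
import numpy as np

def log_meas(h, y):
    # log f(y_t | h_t) for y_t = eps_t * exp(h_t / 2), up to a constant
    return -0.5 * (h + y * y * np.exp(-h))

def single_move_sv(y, alpha, beta, n_iter=100, seed=0):
    """Single-move data augmentation for h, plus a conjugate draw
    for sigma_eta^2; alpha and beta are held fixed for clarity."""
    rng = np.random.default_rng(seed)
    n_obs = len(y)
    h = np.log(y ** 2 + 1e-3)          # crude initialisation of the latent path
    sigma2 = 0.1
    draws = []
    for _ in range(n_iter):
        for t in range(n_obs):         # data augmentation step, one h_t at a time
            if t == 0:                 # stationary prior for the first state
                m1, v1 = alpha / (1 - beta), sigma2 / (1 - beta ** 2)
            else:
                m1, v1 = alpha + beta * h[t - 1], sigma2
            if t < n_obs - 1:          # information coming from h_{t+1}
                m2, v2 = (h[t + 1] - alpha) / beta, sigma2 / beta ** 2
                v = 1.0 / (1.0 / v1 + 1.0 / v2)
                m = v * (m1 / v1 + m2 / v2)
            else:
                m, v = m1, v1
            prop = m + np.sqrt(v) * rng.standard_normal()
            if np.log(rng.uniform()) < log_meas(prop, y[t]) - log_meas(h[t], y[t]):
                h[t] = prop
        # parameter simulation step: inverse-gamma conditional for sigma_eta^2
        resid = h[1:] - alpha - beta * h[:-1]
        shape = 2.5 + 0.5 * (n_obs - 1)
        rate = 0.025 + 0.5 * resid @ resid
        sigma2 = rate / rng.gamma(shape)
        draws.append(sigma2)
    return h, np.array(draws)
```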

One way of considering this approach is to regard it as an empirical Bayes procedure,

reporting the mean of the posterior distributions as an estimator of θ . This is the approach

followed by Jacquier et al. (1994) who show that empirical Bayes outperforms QML and

GMM in the SV case.

In Jacquier et al. (1994) the posterior distribution of the parameters was sampled by MCMC methods using a one-move approach (i.e. the latent variables h_t were sampled one at a time from f(h_t | Y^T, H^{−t}, α, β, σ²_η), where H^{−t} denotes all the elements of H^T excluding h_t). Although this algorithm is conceptually simple, it is not particularly efficient from a simulation perspective, as is shown by Kim et al. (1998), who develop an alternative,

²² The details will not be dealt with here as they are quite involved, even for the simplest model.

²³ As for non-efficient methods, numerical and statistical accuracy is obtained by recycling the random numbers used in the calculation for each parameter value.

²⁴ Such hybrid algorithms are validated in Tierney (1994).


more efficient, multi-move MCMC algorithm. The efficiency gain in the Kim et al. (1998) algorithm arises from sampling H^T jointly, in one block, conditioned on everything else in the model. Finally, Chib et al. (2002) develop efficient Markov Chain Monte Carlo algorithms for estimating generalised SV models defined by heavy-tailed Student-t distributions, exogenous variables in the observation and volatility equations, and a jump component in the observation equation (see Section 8.5.1).

8.4.3.2.4 An MCMC approach to maximum likelihood estimation

Although the Bayesian approach is straightforward to state and computationally attractive, it requires the elicitation of a prior, which some econometricians regard as difficult in dynamic models. Even if this is not an insurmountable problem, alternatives are available which allow us to perform maximum likelihood estimation using MCMC methods.

The first possibility is the Simulated Expectation Maximisation (SEM) algorithm proposed by Shephard (1993). The EM algorithm exploits the following decomposition of the log-likelihood function:

log f(Y^T; θ) = log f(Y^T, H^T; θ) − log f(H^T | Y^T; θ)
             = E[log f(Y^T, H^T; θ) | Y^T] − E[log f(H^T | Y^T; θ) | Y^T]

and iterates:

θ^{i+1} = arg max_θ E_{θ^i}[log f(Y^T, H^T; θ) | Y^T]

This is an increasing algorithm such that the sequence θ^i converges to the ML estimator. The problem is that, although log f(Y^T, H^T; θ) has in general a closed form, the same is not true for its conditional expectation. In the SEM algorithm this expectation is replaced by an approximation based on simulations. Thus, the problem is now to be able to draw from the conditional distribution of H^T given Y^T and θ. Shephard (1993), in the context of a non-linear state space model, uses the Hastings–Metropolis algorithm to solve this problem, and applies it to the SV model.
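For the log-normal SV model the maximisation step of a SEM iteration has a closed form, which the following sketch illustrates. It is an assumption-laden illustration, not Shephard's implementation: the simulated paths would in practice come from a Hastings–Metropolis sampler of H^T given Y^T, and since θ = (α, β, σ²_η) enters only the volatility equation, the M-step reduces to pooled OLS of h_t on (1, h_{t−1}) across the simulated paths.

```python
import numpy as np

def sem_m_step(h_paths):
    """M-step of a SEM iteration for h_t = alpha + beta*h_{t-1} + eta_t:
    pooled OLS of h_t on (1, h_{t-1}) over all N simulated paths (rows
    of h_paths), plus the residual variance for sigma_eta^2."""
    x = h_paths[:, :-1].ravel()
    z = h_paths[:, 1:].ravel()
    X = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(X, z, rcond=None)
    resid = z - X @ coef
    return coef[0], coef[1], resid @ resid / len(z)
```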

Another possible approach is the Simulated Likelihood Ratio (SLR) method proposed

by Billio et al. (1998). The general principle is:

f(Y^T; θ) / f(Y^T; θ̄) = E_θ̄ [ f(Y^T, H^T; θ) / f(Y^T, H^T; θ̄) | Y^T ]   (8.9)

where θ̄ is an arbitrary fixed value of the parameters. Obviously:

arg max_θ f(Y^T; θ) = arg max_θ f(Y^T; θ) / f(Y^T; θ̄)

and with ₙH̃^T, n = 1, 2, . . . , N, simulated paths from the conditional distribution f(H^T | Y^T; θ̄), the SLR method amounts to maximising:

(1/N) ∑_{n=1}^N f(ₙH̃^T, Y^T; θ) / f(ₙH̃^T, Y^T; θ̄)

with respect to θ. The method can be implemented by simulating the conditional distribution²⁵ f(H^T | Y^T; θ̄). As already noted, it is impossible to simulate this distribution directly, thus a Hastings–Metropolis approach is suggested.

Contrary to the SEM approach, the SLR method allows for the computation of the likelihood surface and hence of likelihood ratio test statistics. It needs only one optimisation run rather than a sequence of optimisations; since the simulated paths can be stored, only one simulation run is required. Moreover, as the simulation is made for only one value of the parameter, the objective function is smooth with respect to θ, even if the simulations involve rejection methods.

Billio et al. (1998) apply the SLR method also to the SV model (8.2).
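The SLR objective above is cheap to evaluate once the paths are stored. The sketch below is an illustration under the simplifying assumption that θ = (α, β, σ_η) enters only the volatility equation, so the measurement densities cancel from the ratio f(ₙH̃^T, Y^T; θ)/f(ₙH̃^T, Y^T; θ̄) and only the AR(1) transition terms remain; function names are mine.

```python
import numpy as np

def ar1_logdens(h_paths, alpha, beta, sigma):
    # sum_t log f(h_t | h_{t-1}; theta) for each simulated path (row)
    e = h_paths[:, 1:] - alpha - beta * h_paths[:, :-1]
    k = h_paths.shape[1] - 1
    return -0.5 * (k * np.log(2 * np.pi * sigma ** 2) + (e ** 2).sum(axis=1) / sigma ** 2)

def slr_objective(theta, theta_bar, h_paths):
    """log of (1/N) sum_n f(H_n, Y; theta) / f(H_n, Y; theta_bar),
    with the measurement terms cancelled; evaluated on the SAME stored
    paths for every theta, so the surface is smooth in theta."""
    logratio = ar1_logdens(h_paths, *theta) - ar1_logdens(h_paths, *theta_bar)
    m = logratio.max()
    return m + np.log(np.exp(logratio - m).mean())
```

At θ = θ̄ every ratio equals one, so the objective is exactly zero; maximising it over θ traces out the relative likelihood surface.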

8.5 EXTENSIONS OF SV MODELS

The basic SV models can be generalised in a number of directions. Straightforward generalisations might allow ε_t to have heavy-tailed Student-t distributions and include exogenous variables in the observation and volatility equations.

Moreover, the ARCH-in-mean model of Engle et al. (1987) can be extended to the SV framework by specifying y_t = µ_0 + µ_1 exp(h_t) + ε_t exp(h_t/2). This model allows y_t to be moderately serially correlated, but in the discrete SV model the Hamilton filter no longer works, because (y_t, h_t) are not jointly Markovian.

8.5.1 Extensions of continuous SV models

In the context of continuous SV models, Harvey et al. (1994) concentrated their attention on models based on Student-t errors; Mahieu and Schotman (1998) analysed the possibility of using a mixture distribution. Jacquier et al. (1995) have computed the posterior density of the parameters of a Student-t-based SV model. This particular type of model can in fact be viewed as an Euler discretisation of a Student-t-based Lévy process, but with additional stochastic volatility effects; further articles are available in the (continuous-time) mathematical options and risk assessment literature.²⁶ Building on the work of Kim et al. (1998), Chib et al. (2002) develop efficient Markov Chain Monte Carlo algorithms for estimating these models. They also consider a second type of model which contains a jump component²⁷ in the observation equation to allow for large, transient movements.

²⁵ The resulting Monte Carlo approximation of (8.9) may be good only locally around θ̄, and so Geyer (1996) suggests updating θ̄ to the maximiser of the Monte Carlo likelihood and repeating the Monte Carlo procedure using the new θ̄. By updating θ̄ a few times, one should obtain better approximations of the relative likelihood function near the true maximum likelihood estimate.

²⁶ Leading references include Eberlein (2001) and Eberlein and Prause (2001). The extension to allow for stochastic volatility effects is discussed in Eberlein and Prause (2001) and Eberlein et al. (2001).

²⁷ Jump models are quite popular in continuous-time models of financial asset pricing. See, for example, Merton (1976), Ball and Torous (1985), Bates (1996), Duffie et al. (2000).


Moreover, a natural framework for extending continuous SV models might be based on adapting the Gaussian state space so that:

y_t = ε_t exp(z_t′ h_t / 2)
h_t = T_t h_{t−1} + η_t,   η_t ∼ N(0, H_t)

and then on allowing h_t to follow a more complicated ARMA process. Another simple example would be:

z_t = (1  1)′,   h_t = [ β  0 ; 0  1 ] h_{t−1} + η_t,   η_t ∼ N( 0, [ σ²_η1  0 ; 0  σ²_η2 ] )

Now, the second component of ht is a random walk, allowing the permanent level of the

volatility to slowly change. This is analogous to the Engle and Lee (1992) decomposition

of shocks into permanent and transitory. A model along the same lines has been suggested

by Harvey and Shephard (1993), who allow (ignoring the cyclical AR(1) component):

z_t = (1  0)′,   T_t = [ 1  1 ; 0  1 ],   H_t = [ 0  0 ; 0  σ²_η2 ]

This uses the Kitagawa and Gersch (1984) smooth trend model in the SV context, which in turn is close to putting a cubic spline through the data. This may provide a good summary of historical levels of volatility, but it could be poor as a vehicle for forecasting, as confidence intervals for the forecasted volatilities h_{t+r} may grow very quickly with r.

Another suggestion is to allow h_t to be a fractional process, giving the long-memory SV model. For financial time series, there is strong evidence that the effect of a shock to volatility persists (i.e. is not absorbed) for many periods (see e.g. Andersen and Bollerslev (1997), Lobato and Savin (1998), Harvey (1998), Bollerslev and Jubinski (1999), Bollerslev and Mikkelsen (1999), Bollerslev and Wright (2000) and Ray and Tsay (2000)), thus the concept of long memory seems suitable and has been suggested by Breidt et al. (1998). A covariance stationary time series y_t has long memory if:

∑_{r=0}^∞ |Cov(y_t, y_{t−r})| = ∞

with Var(y_t) < ∞. Basically, this says that the autocovariances do decay as the lag increases, but very slowly, usually hyperbolically.
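The contrast between hyperbolic and geometric decay can be illustrated numerically; the decay rates below are assumptions chosen only for the demonstration, with |γ_r| ∝ r^{2d−1} for a memory parameter 0 < d < 1/2.

```python
import numpy as np

d = 0.4
r = np.arange(1, 1_000_001)
gamma_long = r ** (2 * d - 1)      # hyperbolic decay ~ r**(-0.2): not summable
gamma_short = 0.9 ** r             # AR(1)-type geometric decay: summable

# partial sums of |gamma_r|: the long-memory ones keep growing,
# the short-memory ones converge almost immediately
partial_long = np.cumsum(gamma_long)
partial_short = np.cumsum(gamma_short)
```

The long-memory partial sums behave like N^{2d}/(2d) and diverge, which is exactly the defining condition above.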

Currently there exist four approaches to estimating the long-memory SV model: the quasi-maximum likelihood estimator of Breidt et al. (1998); the GMM approach of Wright (1999); the widely used semiparametric log-periodogram estimator of Geweke and Porter-Hudak (1983) (see e.g. Andersen and Bollerslev (1997), Ray and Tsay (2000), Wright (2000), Deo and Hurvich (2001) and the recent developments of Hurvich and Ray (2001), Hurvich et al. (2001)); and the Bayesian estimator based on the Markov Chain Monte Carlo sampler (Chan and Petris, 1999), possibly using the wavelet representation of the log-squared returns (Jensen, 1999, 2000, 2001).

The quasi-MLE of the long-memory SV model is known to be strongly consistent, but requires the orders of the short-memory autoregressive and moving average components to be correctly identified, as does the GMM estimator. The difference between the


quasi-MLE and GMM is that, when the fractional order of integration is smaller than 1/4, the asymptotic properties in addition to consistency are known for the GMM estimator. Unlike the quasi-MLE of a short-memory stochastic volatility model, whose asymptotic properties are known (Harvey et al., 1994; Ruiz, 1994), these other asymptotic properties are not yet known for the quasi-MLE of the long-memory SV model. However, in simulation experiments, Wright (1999) finds that neither estimator's finite-sample properties dominate the other's; the GMM estimator of the long-memory parameter generally produces smaller standard errors, but with a significant downward bias. From these simulations Wright (1999) urges the development of alternative estimators of the long-memory SV model that are more efficient and less biased. In fact, even if Deo and Hurvich (2001) find the asymptotic properties of the log-periodogram estimator of volatility to be similar to those proved by Robinson (1995) for the same estimator in the mean, correct inference about the degree of long memory relies on the number of Fourier frequencies in the regression growing at a rate that depends on the value of the unknown long-memory parameter. It thus seems that neither the quasi-MLE nor the log-periodogram estimator of the long-memory volatility model lends itself nicely to the construction of confidence intervals or hypothesis testing of the long-memory parameter estimate.

Finally, it could be useful to allow the SV model to capture the non-symmetric response to shocks. This feature can be modelled by allowing ε_{t−1} and η_t to be correlated. If ε_{t−1} and η_t are negatively correlated, and if ε_{t−1} > 0, then y_{t−1} > 0 and h_t is likely to fall. Hence, a large effect of y²_{t−1} on the estimated h_t will be accentuated by a negative sign of y_{t−1}, while its effect will be partially ameliorated by a positive sign. This correlation was suggested by Hull and White (1987) and estimated using GMM by Melino and Turnbull (1990) and Scott (1991). A simple quasi-maximum likelihood estimator has been proposed by Harvey and Shephard (1996). Jacquier et al. (1995) have extended their single-move MCMC sampler to estimate this effect.

8.5.2 Extensions of discrete SV models

In the discrete case, the basic model might be extended by considering different Markov chains, which can allow the decomposition of shocks into permanent and transitory components as in the continuous case:

y_t = ε_t exp(z_t′ h_t / 2)
h_t = µ + T S_t        (8.10)

where S_t represents a vector of Markov chains. However, the vector of Markov chains can easily be represented by a single Markov chain with a sufficient number of states, and then model (8.10) formally reduces to the basic model (8.3).

Finally, the two SV models can be combined by allowing the continuous latent volatility to be governed by a first-order Markov chain. In that case, estimation is very difficult. So et al. (1998) therefore propose Bayesian estimators which are constructed by Gibbs sampling.

8.6 MULTIVARIATE MODELS

Most macroeconomics and finance is about how variables interact, thus a multivariate approach is very important. For multivariate stochastic volatility models this means that it is essential to capture changing cross-covariance patterns. Multivariate modelling of covariance is rather new and difficult because it is afflicted by extreme problems of lack of parsimony. From a modelling point of view, multivariate SV models are easier to extend than ARCH models, but the estimation problem remains.

8.6.1 Multivariate continuous SV models

Some multivariate continuous SV models are easy to state. Harvey et al. (1994) applied quasi-likelihood Kalman filtering techniques to:

y_it = ε_it exp(h_it/2),  i = 1, . . . , M,   ε_t = (ε_1t, . . . , ε_Mt)′ ∼ NIID(0, Σ_ε)   (8.11)

where Σ_ε is a correlation matrix and h_t = (h_1t, . . . , h_Mt)′ a multivariate random walk, although more complicated linear dynamics could be handled. The approach again relies on linearising, this time with loss of information, by writing log y²_it = h_it + log ε²_it. The vector of log ε²_it is iid, all with means −1.27, and a covariance matrix which is a known function of Σ_ε. Consequently, Σ_ε and the parameters indexing the dynamics of h_t can be estimated.

It is worthwhile pointing out two aspects of this model. If rank constraints are imposed on h_t, common trends and cycles will be allowed into the process describing the volatility. Furthermore, the model is similar to Bollerslev's (1990) model, which is characterised by constant conditional correlation. Hence the model is better described as one of changing variances rather than of changing correlation. Consequently, it fails to represent important features of the data and so it is of limited interest.

Perhaps a more attractive multivariate SV model can be obtained by introducing factors. The simplest one-factor model is:

y_t = λf_t + w_t,   w_t ∼ NIID(0, Σ_w)
f_t = ε_t exp(h_t/2),   h_t = βh_{t−1} + η_t,   η_t ∼ NIID(0, σ²_η)

where y_t is perturbed by w_t and explained by the scaled univariate SV model f_t. Typically Σ_w will be assumed diagonal, perhaps driven by independent SV models.
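A short simulation makes the structure of the one-factor model concrete: one latent SV factor drives the comovement of all M series, while the idiosyncratic terms are diagonal. The function name and every parameter value are illustrative choices, not taken from the chapter.

```python
import numpy as np

def simulate_factor_sv(n_obs, lam, beta, sigma_eta, w_sd, seed=0):
    """Simulate y_t = lam*f_t + w_t with f_t = eps_t*exp(h_t/2),
    h_t = beta*h_{t-1} + eta_t, and Sigma_w diagonal (std devs w_sd)."""
    rng = np.random.default_rng(seed)
    lam, w_sd = np.asarray(lam), np.asarray(w_sd)
    h = np.zeros(n_obs)
    for t in range(1, n_obs):                     # latent log-volatility
        h[t] = beta * h[t - 1] + sigma_eta * rng.standard_normal()
    f = np.exp(h / 2.0) * rng.standard_normal(n_obs)   # scalar SV factor
    y = np.outer(f, lam) + rng.standard_normal((n_obs, len(lam))) * w_sd
    return y, f, h
```

Because every series loads on the same f_t, the simulated returns are positively cross-correlated whenever the loadings share a sign, which is the changing cross-covariance feature the section emphasises.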

The lack of an obvious linearising transformation for these models prevents us from effectively using Kalman filtering methods. MCMC methods do not suffer from this drawback and are explored in Jacquier et al. (1995) and Pitt and Shephard (1999b).

8.6.2 Multivariate discrete SV models

The multivariate extension of the discrete stochastic volatility model (8.3) is easy to state. We can consider the multivariate framework (8.11) and allow each component of h_t = (h_1t, . . . , h_Mt)′ to follow a two-state Markov chain, i.e.

y_it = ε_it exp(h_it/2),  i = 1, . . . , M,   ε_t = (ε_1t, . . . , ε_Mt)′ ∼ NIID(0, Σ_ε)
h_it = α + β s_it

In order to apply the Hamilton filter and to obtain the likelihood function, it is useful to define a new Markov chain S_t with 2^M states, which represents the M Markov chains governing the dynamics of h_t.


If the M Markov chains are independent, the transition probabilities of S_t are simply obtained by multiplying the probabilities that drive the different Markov chains. Accordingly, the transition probability matrix will be Q = P_1 ⊗ P_2 ⊗ · · · ⊗ P_M, where ⊗ indicates the Kronecker product and P_i the transition matrix of s_it, i = 1, 2, . . . , M. In that case, the number of states rises exponentially with the dimension of h_t, but the number of parameters describing the Markov chains grows linearly with M and is 2M.
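The Kronecker construction of Q is one line in practice. The two transition matrices below are illustrative; each row of P_i holds the probabilities of moving out of one state of chain i, and the composite chain has 2^M = 4 states.

```python
import numpy as np

# illustrative transition matrices of two independent two-state chains
P1 = np.array([[0.95, 0.05],
               [0.10, 0.90]])
P2 = np.array([[0.80, 0.20],
               [0.30, 0.70]])

# composite chain S_t over the 2**M = 4 joint states:
# Q[(i1,i2),(j1,j2)] = P1[i1,j1] * P2[i2,j2]
Q = np.kron(P1, P2)
```

Each P_i contributes two free parameters (its diagonal), giving the 2M = 4 parameters of the independent case, against the 2^M(2^M − 1) = 12 of an unrestricted Q.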

A more general specification does not make any a priori assumptions about the relations between the different Markov chains. The transition probabilities of the composite Markov chain S_t are then given by:

q_ij = P(S_t = j | S_{t−1} = i),   i, j = 1, 2, . . . , 2^M

which requires 2^M(2^M − 1) parameters. To appreciate the dimension of the problem, with M = 2 (and two states per chain), the independent case requires four parameters, while the general specification requires 12 parameters.

Clearly the general specification quickly becomes unfeasible but, in some applications, the independent case is not useful for understanding the causality between the volatilities of different assets. Billio (2002a) proposes considering several correlated cases, with a number of parameters comprised between 2M and 2^M(2^M − 1), by exploiting the concept of Granger causality.

As for the continuous SV model, a more interesting multivariate extension can be obtained by introducing a latent factor structure where the latent factors are characterised by discrete stochastic volatility. Unfortunately, in that case the joint process of the observable variable y_t and of the latent Markov chains is no longer Markovian, and then the Hamilton filter no longer works. For the estimation it is thus necessary to use some approximation or simulation-based methods (see Billio and Monfort (1998), Kim and Nelson (1999)).

8.7 EMPIRICAL APPLICATIONS

To provide simple illustrations of the usefulness of SV models, the two basic models are

estimated and their output is used to develop standard option pricing and to calculate the

VaR of an asset or a portfolio.

8.7.1 The Volatility program

There do not exist statistical packages to easily and directly estimate²⁸ SV models, and thus the necessary routines have been developed with Ox (version 3.20), a programming

²⁸ Linear state space models can be estimated with the Kalman filter in EViews, with the Gauss package FANPAC or the Ox package SSFPack (see also STAMP). Thus the linearised version (8.6) could be estimated with a quasi-maximum likelihood approach. For the switching regime models, see also MSVAR, an Ox package developed by H.M. Krolzig and designed for the econometric modelling of univariate and multiple time series subject to shifts in regime (http://www.economics.ox.ac.uk/research/hendry/krolzig/).


language created mainly by Jürgen A. Doornik.²⁹ These routines can also be used within the package GiveWin.

The files required for running the Volatility program³⁰ are "volatilitymain.ox", the main program file, "volatility.oxo", a compiled file containing the definition of the functions, and the header file "volatility.h", containing the lists of global variables and functions. In Ox or GiveWin it is sufficient to load the main program file, select the appropriate options and then run the program (for the details of the commands and options see the enclosed readme.txt file).

Depending on which commands are commented out (// in front of the command) the program can:

• Estimate a basic continuous or discrete SV model on a user-provided series;
• Simulate a basic continuous or discrete SV model;
• Estimate a basic continuous or discrete model on a user-provided series and then simulate an alternative path with the estimated parameters.

It should be stressed that GiveWin is not needed to estimate the models but only to display graphs. This program can easily be used with the freeware version of Ox in conjunction with any text editor; however, we recommend the use of OxEdit since it integrates with Ox. Both packages can be downloaded from Doornik's website (see footnote 29). All the graphic windows presented in this chapter are taken from GiveWin.

The first line of code in Figure 8.2, just before the "main" command, imports the "volatility.oxo" file, which contains the functions, and recalls the Database class and other Ox packages such as the graphics, probability and maximisation ones.³¹

The program is then organised as follows:

• In a first step the time series of interest is requested in an Excel spreadsheet. The user has to indicate the exact path of the file to be loaded (other data formats can be used, see the readme.txt file for additional details). Data passed to the program must be the price levels; the necessary transformations are carried out directly by the estimation routines.
• In a second step the model is chosen, estimated and, if desired, simulated.
• Finally, the outputs of the model are printed and graphed to the screen and saved.

²⁹ Ox is an object-oriented matrix programming language with a comprehensive mathematical and statistical function library, whose major features are speed, an extensive library and well-designed syntax, leading to programs which are easy to maintain. In particular, this program takes advantage of the concept of class: it is possible to create new classes based on existing ones and to use their functions, therefore avoiding the need to rewrite them for the new class. In our case, the program is built on the Database class, which is the class designed for handling databases, samples, names of variables, etc. The Database class is used as a starting point for the more specific class of Stochastic Volatility Model: the functions, both to estimate and to simulate the models and to store the results, are totally rewritten, while the functions that manage the time series are part of the Database class. More information on Ox can be found at http://www.nuff.ox.ac.uk/users/doornik/. See also Doornik (2001).

³⁰ Updated versions of the program will be available for download at the address www.greta.it (under the Working Papers section). The package is free of charge for academic and research purposes. For commercial use, please contact the author (mgobbo@greta.it).

³¹ Of course, it is possible to modify the program by adding functions belonging to the loaded Database class or to other classes. In this case, the class containing the desired functions must be loaded by adding a line of code (#import ". . .") before the "main" command.


Figure 8.2 The main program "volatilitymain.ox" loaded with GiveWin with the full list of commands

The available variables in the Excel file "series.xls" are the daily stock indexes analysed in Section 8.1.1, i.e. the FTSE100, the CAC40 and the MIB30 indexes. In the example developed in this chapter attention will be focused on the modelling of the FTSE100 index. In the first part of the program, these variables are loaded in Excel format. The full sample of FTSE100 is then selected in order to start the analysis and perform the estimation. The command "Estimate" is quite complex and requires inputs from the user: the type of model, the choice of the initialisation of the parameter values and their values if the user wants to define them (see Figure 8.3 and the readme.txt file).

8.7.1.1 Estimation

The package allows the analysis of the two basic models, i.e. the log-normal SV model (8.2), called ARSV in the program, which is estimated by quasi-maximum likelihood with the Kalman filter, and the two-regime switching model (8.3), called SRSV, which is estimated by maximum likelihood with the Hamilton filter.

The first model is:

y_t = µ + ε_t exp(h_t/2)
h_t = α + βh_{t−1} + η_t

with ε_t and η_t independent Gaussian white noises, with variances 1 and σ²_η respectively. The volatility equation is characterised by the constant parameter α, the autoregressive parameter β and the variance σ²_η of the volatility noise. The mean is either imposed equal to zero or estimated with the empirical mean of the series (see equation (8.12)).


Figure 8.3 The "Load", "Select" and "Estimate" commands

Since the specification of the conditional volatility is an autoregressive process of order one, the stationarity condition is |β| < 1. Moreover, the volatility σ_η must be strictly positive. In the estimation procedure the following logistic and logarithmic reparameterisations:

β = 2 exp(b)/(1 + exp(b)) − 1,   σ_η = exp(s_η)

have been considered in order to satisfy the above constraints.
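The reparameterisations and the quasi-maximum likelihood recursion can be sketched compactly. The following is an illustration, not the package's Ox implementation: it applies the transformation of equation (8.12), treats log ε²_t + 1.27 as Gaussian noise with variance π²/2 (its true variance, though not its true distribution, which is what makes this a quasi-likelihood), and runs the standard Kalman prediction-error decomposition; the small constant guarding the logarithm and the function names are my additions.

```python
import numpy as np

def to_constrained(b, s_eta):
    """Logistic/log reparameterisation: beta in (-1, 1), sigma_eta > 0."""
    beta = 2.0 * np.exp(b) / (1.0 + np.exp(b)) - 1.0
    return beta, np.exp(s_eta)

def qml_loglik(params, y):
    """Quasi log-likelihood of the linearised model
    y*_t = h_t + xi_t,  h_t = alpha + beta*h_{t-1} + eta_t,
    with xi_t = log eps_t^2 + 1.27 approximated by N(0, pi^2/2)."""
    alpha, b, s_eta = params
    beta, sigma_eta = to_constrained(b, s_eta)
    ystar = np.log((y - y.mean()) ** 2 + 1e-10) + 1.27   # transformation (8.12)
    meas_var = np.pi ** 2 / 2.0
    a = alpha / (1.0 - beta)               # stationary start-up
    P = sigma_eta ** 2 / (1.0 - beta ** 2)
    ll = 0.0
    for obs in ystar:
        F = P + meas_var                   # prediction-error variance
        v = obs - a                        # prediction error
        ll += -0.5 * (np.log(2.0 * np.pi * F) + v * v / F)
        K = P / F                          # Kalman gain
        a_f, P_f = a + K * v, P * (1.0 - K)
        a = alpha + beta * a_f             # state prediction
        P = beta ** 2 * P_f + sigma_eta ** 2
    return ll
```

Maximising qml_loglik over (α, b, s_η) and mapping back through to_constrained gives the ARSV estimates.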

The second model is a particular specification of the regime switching model introduced by Hamilton. Precisely, the distribution of the returns is described by two regimes with the same mean but different variances, and by a constant transition matrix:

y_t = µ + σ_0 ε_t  if s_t = 0
y_t = µ + σ_1 ε_t  if s_t = 1,
P = [ p_00  1 − p_00 ; 1 − p_11  p_11 ]

where³² s_t is a two-state Markov chain independent of ε_t, which is a Gaussian white noise with unit variance. The parameters of this model are the mean µ, the low and high standard deviations σ_0, σ_1 and the transition probabilities p_00, p_11 (also called regime persistence probabilities). As for the log-normal SV model, the logarithm and logistic transformations ensure the positiveness of the volatilities and constrain the transition probabilities to assume values in the (0,1) interval.

Before starting the estimation it is necessary to transform the raw time series, expressed in levels, into logarithmic returns[33] and to set the starting values of the parameters in the maximisation algorithm.[34] Moreover, for the log-normal SV model the returns are modified as follows:

y*t = log(yt − ȳ)² + 1.27      (8.12)

where ȳ is the empirical mean. Thus, for the log-normal SV model the mean is not estimated but is simply set equal to the empirical mean.

[32] According to the model (8.3), σ0 = exp(α/2) and σ1 = exp((α + β)/2).
[33] In the Excel file “series.xls” there are 899 observations of the daily stock indexes analysed in Section 8.1.1. In the estimation we therefore consider 898 daily return observations.
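The transformation in equation (8.12) can be sketched as follows. This is a minimal illustration; in practice a small offset may be needed when a return equals the empirical mean, since the log of zero is undefined:

```python
import math

def transform_returns(returns):
    """Apply the log-squared transformation (8.12): y*_t = log(y_t - ybar)^2 + 1.27."""
    ybar = sum(returns) / len(returns)          # empirical mean of the returns
    return [math.log((y - ybar) ** 2) + 1.27 for y in returns]
```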

While these transformations are automatically done by the estimation procedure, the setting of the starting parameter values requires a choice by the user from the following options.

• Random initialisation: a range of possible values of the parameters is fixed, where necessary, and a value is randomly extracted. This method is useful when the user has no idea about the possible values of the parameters but wants to investigate the parameter space more thoroughly. The drawback of this option is that the optimisation algorithm may be quite time-consuming: it needs more iterations to converge, the probability that it does not converge to the global maximum increases, and several optimisation runs (with different random starting values) may be required.
• Data-driven initialisation: the starting values of the parameters are calculated from the analysed time series. For example, the sample mean is used as an approximation of the mean of the switching regime model, and the empirical variance, multiplied by appropriate factors, is used for the high and low variances. This alternative helps the user to speed up the convergence even if he has no opinion on the possible values of the parameters.
• User initialisation: the starting values of the parameters are directly inserted by the user.

In the example, the data-driven initialisation has been selected.

During the estimation it is possible to control each step of the algorithm through the command MaxControl (see the readme.txt file for more information). The estimation output is then given by the estimated values of the parameters, their standard errors and the corresponding significance test statistics.[35]

Figure 8.4 shows the final output of the log-normal SV model for the FTSE100 index. In this example the numerical optimisation ends after 76 iterations, which take 8.32 seconds,[36] and the log-likelihood[37] is −1153.7. The volatility of the FTSE100 index is very persistent: in fact, the autoregressive coefficient of the volatility equation (β) is equal to 0.956. In practice, for financial time series this coefficient is very often bigger than 0.9.

Figure 8.5 exemplifies the graphic output, which consists of the estimated volatility for the FTSE100 index along with the historical return series. The estimated volatility is obtained by using the Kalman smoother ĥt/T = E(ht | Y^T), which is however not immediately useful. In fact, we are interested in E(σt | Y^T) = E(exp(ht/2) | Y^T), but E(exp(ht/2) | Y^T) ≠ exp(E(ht/2 | Y^T)). Thus, we consider a first-order Taylor expansion

[34] Recall that data transformations are directly carried out by the programs, which require price levels as input.
[35] The standard errors are calculated following Ruiz (1994) for the log-normal SV model and as the inverse of the information matrix for the switching regime model. In both cases the z-statistics asymptotically follow an N(0,1) distribution.
[36] On a Pentium III 933 MHz.
[37] For the log-normal SV model the log-likelihood is computed with the Kalman filter for the transformed series y*t, see equation (8.12).


Figure 8.4 Estimation output of the log-normal SV model for the FTSE100 index


Figure 8.5 Historical returns and estimated volatility for the FTSE100 index obtained with the

log-normal SV model


of exp(ht/2) around ĥt/T, and compute the conditional mean to estimate the volatility in the following way:

σ̂t/T = E[exp(ht/2) | Y^T] ≈ exp(ĥt/T/2) + (1/8) exp(ĥt/T/2) Qt/T

where Qt/T is the smoothed variance of ht.

This computation is performed directly by the program and Figure 8.5 presents the estimated volatility.
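The Taylor-corrected volatility estimate can be sketched as below, assuming Qt/T denotes the smoothed variance of ht returned by the Kalman smoother:

```python
import math

def smoothed_volatility(h_smooth, q_smooth):
    """Volatility estimate exp(h/2) * (1 + Q/8) from the smoothed
    mean (h_smooth) and variance (q_smooth) of the log-volatility h_t."""
    return math.exp(h_smooth / 2.0) * (1.0 + q_smooth / 8.0)
```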

Figure 8.6 shows the final output[38] of the switching regime model for the FTSE100 index. In this case the numerical optimisation ends after 26 iterations, which take 7.65 seconds, and the log-likelihood[39] is −2680.18. For this model we can judge the persistence

Figure 8.6 Estimation output of the switching regime model for the FTSE100 index

[38] It is important to underline that the z-statistics for the transition probabilities are not useful for testing pii = 0, pii = 1, i = 0, 1. In fact, these tests are not standard since they imply testing for the presence of two regimes (see Davies (1977, 1987) and Hansen (1992, 1996)).
[39] In this case the log-likelihood is computed with the Hamilton filter for the return series and thus it is not directly comparable with the log-likelihood of the log-normal SV model.


of the volatility by the value taken by the transition (or persistence) probabilities p00, p11. They are very high (0.99 and 0.96), confirming the high persistence of the volatility of the FTSE100 index. Moreover, the levels of the high and low volatility are perfectly in line with the values of the volatility estimated with the log-normal SV model.

In Figure 8.7 the graphic output of the switching regime model is presented. It consists of the historical return series, the weighted or estimated volatility and the estimated switches between regimes.[40]

To estimate the volatility we consider the output of the Kim smoother. Since σt = exp(α/2)(1 − st) + exp((α + β)/2)st = σ0(1 − st) + σ1 st, we can compute:

σ̂t/T = E(σt | Y^T) = σ0 P(st = 0 | Y^T) + σ1 P(st = 1 | Y^T)      (8.13)

where

P(st = 0 | Y^T) = P(ht = α | Y^T) and P(st = 1 | Y^T) = P(ht = α + β | Y^T)
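Equation (8.13) reduces to a probability-weighted average of the two volatility levels; a minimal sketch:

```python
def weighted_volatility(p_high, sigma0, sigma1):
    """Smoothed volatility (8.13): sigma0 * P(s_t=0|Y^T) + sigma1 * P(s_t=1|Y^T),
    where p_high = P(s_t = 1 | Y^T) comes from the Kim smoother."""
    return sigma0 * (1.0 - p_high) + sigma1 * p_high
```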

Finally, it is possible to save the estimated volatility. Since the visualisation of the graphs is possible only with the commercial version of Ox, if this is not available the program allows only the saving of the estimated volatility series. The “SeriesEst” command allows the saving of the following series in an Excel format:[41] historical returns,


Figure 8.7 Historical returns, weighted volatility and estimated switches between regimes for

the FTSE100 index obtained with the regime switching model

[40] The regime is 0 if P(ht = α | Y^T) ≥ 0.5 and 1 otherwise.
[41] In the output file, for the ARSV model, Var1 indicates the historical returns and Var2 the estimated volatilities. For the SRSV, Var1 indicates the historical returns, Var2 the estimated volatilities, Var3 and Var4 the smoothed and filtered probabilities of the high volatility regime and Var5 the regime shifts.


Figure 8.8 The “Simulation”, “SeriesEst”, “SeriesSim” and “Graph” commands

estimated volatilities and, for the switching regime model, the smoothed and filtered probabilities of the high volatility regime and the regime shifts. Moreover, the graphs can be directly saved in a postscript format with the “Graph” command. In both cases the user should provide a path including the name and extension of the destination file. The “Graph” command also includes an additional control variable to choose whether or not to plot the series (see Figure 8.8).

8.7.1.2 Simulation

The Volatility program also allows simulation of both models. The “Simulation” command gives the possibility to choose the type of model, the values of the parameters and the length of the simulated series. If the user wants to simulate the model characterised by the parameters just estimated, the last input of the “Simulation” command must be set to 0, otherwise it has to be replaced by the column vector of the desired parameters.

The graphic output of the simulation is composed of the simulated series and their volatilities. A final possibility is to plot both the estimation and simulation phases (see Figures 8.9 and 8.10). In particular, for the switching regime model the program plots the simulated volatility, which jumps between the low and high level, and the smoothed (weighted) simulated volatility, which is computed in the same way as the estimated volatility (see equation (8.13)).


[Figure 8.9 panels: historical returns, estimated volatility, simulated returns, simulated volatility.]

Figure 8.9 Estimation and simulation graphic output of the log-normal SV model

[Figure 8.10 panels: historical returns, weighted volatility, regime shifts, simulated returns, simulated regime shifts, weighted simulated volatility.]

Figure 8.10 Estimation and simulation graphic output of the switching regime model


Finally, it is possible to save the simulated volatility. The “SeriesSim” command allows the saving of the following series in an Excel format:[42] simulated returns, simulated volatilities and, for the switching regime model, the smoothed probabilities of the high volatility regime and the regime shifts.

8.7.1.3 Forecasting

The final output given by the Volatility program is the forecasted volatility for the following period. The Kalman and Hamilton filters also give the prediction density of ht+1, so it is possible to forecast the next value of the volatility.

For the log-normal SV model, we consider a first-order Taylor expansion of exp(ht/2) around ĥT+1/T and, by taking the conditional expectation, we forecast the volatility in the following way:

σ̂T+1/T = E[exp(hT+1/2) | Y^T] ≈ exp(ĥT+1/T/2) + (1/8) exp(ĥT+1/T/2) QT+1/T

With regard to the switching regime model, since σt = σ0(1 − st) + σ1 st, we can forecast the volatility as follows:

σ̂T+1/T = E(σT+1 | Y^T) = σ0 P(sT+1 = 0 | Y^T) + σ1 P(sT+1 = 1 | Y^T)

where

P(sT+1 = 0 | Y^T) = P(hT+1 = α | Y^T),   P(sT+1 = 1 | Y^T) = P(hT+1 = α + β | Y^T)

are the prediction probabilities[43] obtained with the last iteration of the Hamilton filter. The forecasted volatility is evidenced in the output and is saved as the last value of the estimated volatility.[44]
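The prediction probabilities are simply the transition matrix applied to the filtered probabilities; a sketch with the matrix laid out as in the text:

```python
def prediction_probabilities(p00, p11, filt0, filt1):
    """One-step prediction: multiply P = [[p00, 1-p11], [1-p00, p11]] by the
    filtered probability vector (filt0, filt1) from the Hamilton filter."""
    pred0 = p00 * filt0 + (1.0 - p11) * filt1
    pred1 = (1.0 - p00) * filt0 + p11 * filt1
    return pred0, pred1
```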

Let us now consider some practical utilisations of the estimated volatilities.

8.7.2 Option pricing

As seen in Section 8.1.2, the option price in the Black and Scholes framework can be expressed as a conditional expectation given the current price of the underlying asset:

Ct^BS = exp(−rτ) E_{St+τ|St}[max(St+τ − K, 0)]

[42] For the ARSV model, Var1 indicates the simulated returns and Var2 the simulated volatilities. For the SRSV, Var1 indicates the simulated returns, Var2 the simulated volatilities, Var3 the smoothed probabilities of the high volatility regime and Var4 the regime shifts.
[43] It is possible to obtain the prediction probabilities by multiplying the transition matrix P by the filtered probabilities (which are saved in the estimation output file as Var4), i.e.

( P(st+1 = 0 | Y^t) )        ( P(st = 0 | Y^t) )
( P(st+1 = 1 | Y^t) )  =  P  ( P(st = 1 | Y^t) )

[44] In the estimation output file, in the last row the non-zero value for the estimated volatility is the forecasted volatility, while the non-zero value for the filtered probability is the prediction probability of the high volatility regime (see footnote 43). All the other variables are set equal to zero.


where the dynamics of the asset are described by a geometric diffusion process and the expectation is taken with respect to the risk-neutral probability measure.

Since the Black and Scholes formula can be expressed as a function only of the volatility, great effort has been made in modelling its behaviour. While Black and Scholes assume that it is constant over the life of the option, a series of models proposed in the late 1980s supposes that it varies through time in a deterministic or stochastic way, in an attempt to capture the empirical features of the option prices. In fact, an analysis of the volatility implied in the market option prices (the so-called implied volatility) highlights that the volatility is neither constant through time nor independent of the strike price (the so-called “smile” and “sneer” effects) (see Rubinstein (1985)).

A very simple approach consists of using the volatility estimated with stochastic volatility models as input to the Black and Scholes formula. In that case, it is sufficient to consider the forecasted volatility σ̂t/t−1 as the volatility parameter in the formula to obtain the option price:

Ct = Ct^BS(σ̂t/t−1)
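Plugging the forecast into the standard Black and Scholes call formula can be sketched as:

```python
import math

def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_call(s, k, r, tau, sigma):
    """Black and Scholes call price with the forecasted volatility plugged in."""
    d1 = (math.log(s / k) + (r + 0.5 * sigma ** 2) * tau) / (sigma * math.sqrt(tau))
    d2 = d1 - sigma * math.sqrt(tau)
    return s * norm_cdf(d1) - k * math.exp(-r * tau) * norm_cdf(d2)
```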

In the following we review some stochastic volatility models for option pricing, which consider the volatility as an exogenous stochastic process.

The path-breaking work on stochastic volatility models applied to option pricing is the paper by Hull and White (1987). The authors assume that both the underlying security S and the variance σ² follow a geometric diffusion process:

dS = µS dt + σS dz
dσ² = φσ² dt + ξσ² dω

where the correlation ρ between the two Brownian motions dz, dω is a constant with modulus less than one. Hull and White take ρ ≡ 0. Scott (1987) considers the case in which the volatility follows an Ornstein–Uhlenbeck process and also imposes the restriction ρ ≡ 0. Finally, Heston (1993) proposes the familiar mean-reverting square root process for the volatility:

dS = µS dt + σS dz
dσ² = γ(φ − σ²) dt + ξσ dω

where φ is the long-run average variance and he allows for ρ ≠ 0.

Introducing stochastic volatility into the definition of the stochastic differential equation of the underlying asset creates several complications. A dynamic portfolio with only one option and one underlying asset is not sufficient to create a riskless investment strategy. The problem arises since the stochastic differential equation for the option contains two sources of uncertainty. Unfortunately, it is impossible to eliminate the volatility market risk premium and correlation parameters from the partial differential equation using only one option and one underlying asset. Moreover, these parameters are difficult to estimate[45] and extensive use of numerical techniques is required to solve the two-dimensional partial differential equation.

[45] An exception occurs when the volatility is a deterministic function of the asset price or time. In this case it is possible to easily find a solution to the partial differential equation.


In the Hull and White (1987) formula, the option price is determined assuming that the volatility market risk premium is zero and there is zero correlation between the two Brownian motions describing the underlying asset and the volatility, i.e. the volatility is uncorrelated with the asset price. With these assumptions and using a risk-neutral valuation procedure, they show that the price of an option with stochastic volatility is the Black and Scholes price integrated over the distribution of the mean volatility:

Ct^HW = ∫ Ct^BS(σ̄²) g(σ̄² | σt²) dσ̄²

where

σ̄² = (1/τ) ∫_t^{t+τ} σ²(u) du

and g(σ̄² | σt²) is the conditional probability density of the mean variance σ̄² over the period τ.

In the more general case of non-zero correlation, the framework becomes more complex,

allowing only numerical solutions.

It can be observed that continuous time stochastic volatility provides an attractive and intuitive explanation for observed volatility patterns and for observed biases in implied volatility. Precisely, smiles, skews, and upward and downward term structures of implied volatility arise naturally from a stochastic volatility model. However, the fact that stochastic volatility models fit empirical patterns does not mean that those models are correct, and the biases in market prices may be the result of other factors not considered, such as liquidity problems.

A work related to that of Hull and White is Naik (1993). While in the Hull and White specification the volatility follows a continuous diffusion process, Naik analyses the case where the instantaneous variance of an asset is subject to random discontinuous shifts. In particular, the volatility is described with a right-continuous Markov chain process: it remains in the same state for a random amount of time and then shifts to another state with transition probabilities determined by a matrix. In the case of a two-state volatility process the transition matrix is simply:

P = ( p00      1 − p11 )
    ( 1 − p00      p11 )

Assuming that the underlying process is continuous, that the risk of a shift in volatility is diversifiable, and therefore not priced, and that the two processes are uncorrelated, the option valuation equation can be expressed in a closed form as follows:

Ct^N(σ1) = ∫_0^τ Ct^BS(σ̄²(x)) g(x | σ1) dx

where σ1 indicates the high volatility level, σ̄²(x) = [σ1²x + σ0²(τ − x)]/τ, 0 ≤ x ≤ τ, and g(x | σ1) denotes the unconditional density of the time spent by the volatility process in the high volatility state, given the current high volatility state. In the same way it is possible to determine the option price conditional on the current low volatility state (Ct^N(σ0) with σ0 the low volatility level).


As in the Hull and White model, the option price is the expectation of the Black and Scholes formula computed for the average future volatility, given the current state. Since two regimes are considered, the final option price can be obtained by a weighted average of the two conditional values Ct^N(σ1) and Ct^N(σ0).

This analysis can be extended by considering multiple states and correlation between

changes of the underlying and shifts in volatility, but unfortunately in these cases the

option price can be obtained only via numerical methods. In this kind of procedure a

discrete time Markov chain is used as an approximation of the volatility process.

We briefly present two examples of this approach, due to Billio and Pelizzon (1997) and Bollen (1998) (see also Bollen et al. (2000)). Both these works are based on the hypothesis that the returns of the underlying asset follow a switching regime model. The distribution is characterised by a mixture of distributions with different variance, where the weights depend on a hidden Markov chain process which represents possible different volatility regimes of the market. To obtain the numerical solution, the basic idea is to approximate the distribution through a multinomial approach, considering a binomial tree for each of the two distributions characterised by different variance.

Following Cox et al. (1979), the future value of the underlying process can be expressed as:

St+Δt = uSt with probability p
        dSt with probability 1 − p

One possible specification of the parameters that guarantees asymptotic normality and convergence to the desired mean and variance of the continuously compounded returns is u = exp(√(σ²Δt + r²Δt²)), d = u⁻¹ and p = (e^{rΔt} − d)/(u − d). In this case the process is fully characterised by the parameter representing the variance σ², since the step size Δt and the risk-free interest rate r are given. Once the variance is estimated it is possible to construct a tree to represent the possible future paths of the variable and hence the distribution of returns (at maturity).
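Under these definitions the binomial parameters can be computed as follows:

```python
import math

def crr_parameters(sigma2, r, dt):
    """Binomial parameters: u = exp(sqrt(sigma^2*dt + r^2*dt^2)), d = 1/u,
    and the risk-neutral probability p = (e^{r*dt} - d)/(u - d)."""
    u = math.exp(math.sqrt(sigma2 * dt + (r * dt) ** 2))
    d = 1.0 / u
    p = (math.exp(r * dt) - d) / (u - d)
    return u, d, p
```

Note that the martingale property p·u + (1 − p)·d = e^{rΔt} holds by construction.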

If a regime switching model is considered, the distribution of the returns is simply a mixture of the distributions characterising each state. Therefore a discrete process can be used to approximate the continuous time process, and hence the distribution of returns, in each state. In this case, two binomial distributions are used as an approximation of the mixture of the two distributions:

St+Δt = u1 St | st = 1 with probability p11
        u0 St | st = 0 with probability p00
        d0 St | st = 0 with probability 1 − p00
        d1 St | st = 1 with probability 1 − p11

where st denotes the regime.

In this case it is necessary to invoke the risk-neutral hypothesis to determine the values of the parameters, which are u1 = exp(√(σ1²Δt + r²Δt²)), d1 = u1⁻¹, p11 = (e^{rΔt} − d1)/(u1 − d1) for the high volatility regime and u0 = exp(√(σ0²Δt + r²Δt²)), d0 = u0⁻¹ and p00 = (e^{rΔt} − d0)/(u0 − d0) for the low volatility regime. This model is usually called a quadrinomial tree (or lattice). The inner two branches correspond to the low volatility regime while the outer ones correspond to the high volatility regime. Each set of probabilities (pii, 1 − pii) must be interpreted as the branch probability conditional on the regime i, with i = 0, 1.


Although the quadrinomial tree represents both distributions accurately, its branches do not recombine efficiently, exhibiting an exponential growth of the computational time as the number of steps increases.

Bollen (1998) proposes a method to overcome this problem and develops a pentanomial tree. The definition of the parameters is modified to yield both the possibility to recombine the tree and the convergence of the discrete process to the mixture of distributions. This is obtained by approximating each regime density by a trinomial distribution instead of the binomial one. The modified tree has five evenly spaced branches, because the step sizes of the two regimes are in the ratio 1:2 (u1 = exp(√(σ1²Δt + r²Δt²)), u0 = exp(½√(σ1²Δt + r²Δt²))) and the middle branch is shared by the two regimes:

St+Δt = u1 St | st = 1 with probability p1,u
        u0 St | st = 0 with probability p0,u
        St             with probability pm
        d0 St | st = 0 with probability p0,d
        d1 St | st = 1 with probability p1,d

where pst,u and pst,d are the probabilities to go up and down conditional on the regime st, and pm is the probability to remain at the same price level St.

Once the tree is generated, the option value is calculated operating backward from the terminal values, i.e. the payoffs, to the valuation time. In the regime switching approaches, for simplicity two conditional option values are calculated at each step, where the conditioning information is the regime at the valuation time t:

Ct(st = 0) = [p00 Ct+1(st+1 = 0) + (1 − p00)Ct+1(st+1 = 1)]e^{−rΔt}
Ct(st = 1) = [(1 − p11)Ct+1(st+1 = 0) + p11 Ct+1(st+1 = 1)]e^{−rΔt}

At the valuation time the value of the option is obtained as a weighted average of the two conditional option values, where the weights depend on the knowledge of the current regime. If the regime is unknown, the unconditional probabilities:

(1 − p11)/(2 − p00 − p11) and (1 − p00)/(2 − p00 − p11)

are used.
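The regime-mixing step of the backward induction and the final unconditional weighting can be sketched as below; the expectation over the price branches at each node is omitted for brevity, so only the regime dimension is shown:

```python
import math

def step_back(c0_next, c1_next, p00, p11, r, dt):
    """Discount the two conditional option values one step, mixing regimes
    with the transition probabilities, as in the two equations above."""
    disc = math.exp(-r * dt)
    c0 = (p00 * c0_next + (1.0 - p00) * c1_next) * disc
    c1 = ((1.0 - p11) * c0_next + p11 * c1_next) * disc
    return c0, c1

def unconditional_value(c0, c1, p00, p11):
    """Weight the conditional values by the unconditional regime probabilities."""
    w0 = (1.0 - p11) / (2.0 - p00 - p11)
    w1 = (1.0 - p00) / (2.0 - p00 - p11)
    return w0 * c0 + w1 * c1
```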

Let us take an example of the pentanomial approach by considering the parameters estimated in Section 8.7.1.1.

We deal with a European call option on the FTSE100 index quoted at LIFFE on 22 August 2000 (when the FTSE100 index quoted 6584.82), with maturity June 2001 and strike price 5900. The risk-free interest rate r is approximated with three-month Libor[46] and the time to maturity in terms of trading days is 213. In this example Δt = 1.

[46] London Interbank Offered Rate.


Taking the estimated volatilities of Section 8.7.1.1, the parameters of the pentanomial model can be computed as follows:

u1 = exp(√(σ1²Δt + r²Δt²)),    d1 = u1⁻¹
u0 = exp(½√(σ1²Δt + r²Δt²)),   d0 = u0⁻¹
pm = 1 − 4(σ0²Δt + r²Δt²)/(σ1²Δt + r²Δt²)
p1,u = (e^{rΔt} − d1)/(u1 − d1),    p1,d = 1 − p1,u
p0,u = (e^{rΔt} − d0 − pm(1 − d0))/(u0 − d0),    p0,d = 1 − p0,u − pm

Once the tree is generated, the payoffs are calculated with the usual procedure (see Appendix A for details). The conditional option values Ct(st = 0), Ct(st = 1) are obtained using the estimated transition probabilities, while the final value of the option is a weighted average where the weights are the unconditional probabilities:

(1 − p11)/(2 − p00 − p11) and (1 − p00)/(2 − p00 − p11)

The pentanomial option value is therefore 1055.14, while the market value was 1047.5.

8.7.3 Value at risk

VaR is a very intuitive measure to evaluate market risk because it indicates the maximum potential loss at a given level of confidence (a) for a portfolio of financial assets over a specified time horizon (h).

In practice, the value of a portfolio is expressed as a function of K risk factors, xτ = Σ_{i=1}^{N} wi,t Pi,τ(χ1,τ, . . . , χK,τ). The factors influencing the portfolio value are usually identified with some market variables such as interest rates, exchange rates or stock indexes. If their distribution is known in a closed form, we need to estimate the distribution of the future value of the portfolio conditional on the available information and the VaR is then the solution to:

a = ∫_{−∞}^{VaR(h,a)} f(xt+h) dx

Different parametric models can be used to forecast the portfolio return distribution. The simplest way to calculate VaR involves assuming that the risk factor returns follow a multivariate normal distribution conditional on the available information. If the portfolio return is linearly dependent on them, its probability distribution is also normal and the VaR is simply the quantile of this analytic distribution. If the linear assumption is inappropriate, the portfolio return can be approximated as a quadratic function of the risk factor returns.

An alternative way to handle the non-linearity is to use Monte Carlo simulation. The

idea is to simulate repeatedly the random processes governing the risk factors. Each

simulation gives us a possible value for the portfolio at the end of our time horizon. If

enough of these simulations are considered, it is possible to infer the VaR, as the relevant

quantile of the simulated distribution.

Since market risk factors usually have fatter tails than the normal distribution, it is also possible to use historical simulation rather than a parametric approach. The idea behind this technique is to use the historical distribution of returns to the assets in the portfolio to simulate the portfolio's VaR, on the hypothetical assumption that we held this portfolio over the period of time covered by our historical data set. Thus, the historical simulation involves collecting historic asset returns over some observation period and using the weights of the current portfolio to simulate the hypothetical returns we would have had if we had held our current portfolio over the observation period. It is then assumed that this historical distribution of returns is also a good proxy for the portfolio return distribution it will face over the next holding period, and VaR is calculated as the relevant quantile of this distribution.

The advantage of the parametric approach is that the factors' variance–covariance matrix can be updated using a general model of changing or stochastic volatility. The main disadvantage is that the factor returns are usually assumed to be conditionally normal, losing the possibility to take into account non-linear correlations among them. Historical simulation has the advantage of reflecting the historical multivariate probability distribution of the risk factor returns, avoiding ad hoc assumptions. However, the method suffers a serious drawback: it does not incorporate volatility updating. Moreover, extreme quantiles are difficult to estimate, as extrapolation beyond past observations is impossible. Finally, quantile estimates tend to be very volatile whenever a large observation enters the sample and the database is not sufficiently large.

The advantage of the parametric approach to update the volatility suggests the simplest utilisation of the SV models for the VaR computation. Having chosen the asset or portfolio distribution (usually the normal one), it is possible to use the forecasted volatility to characterise the future return distribution. Thus, σ̂T+1/T can be used to calculate the VaR over the next period.
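Under conditional normality this amounts to scaling the relevant normal quantile by the forecasted volatility; a dependency-free sketch, with the quantile found by bisection:

```python
import math

def parametric_var(sigma_forecast, a):
    """One-period VaR under N(0, sigma^2): minus the a-quantile, reported as a loss."""
    lo, hi = -10.0, 10.0
    for _ in range(200):                       # bisection for the standard normal quantile
        mid = (lo + hi) / 2.0
        if 0.5 * (1.0 + math.erf(mid / math.sqrt(2.0))) < a:
            lo = mid
        else:
            hi = mid
    return -lo * sigma_forecast
```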

A different approach using the SV model is to devolatilise the observed returns series and to revolatilise it with an appropriate forecasted value, obtained with a particular model of changing volatility. This approach is considered in several recent works (Barone-Adesi et al., 1998; Hull and White, 1998) and is a way of combining different methods and partially overcoming the drawbacks of each.

To make the historical simulation consistent with empirical findings, the log-normal SV model and the regime switching model may be considered to describe the volatility behaviour. Past returns are standardised by the estimated volatility to obtain standardised residuals. Statistical tests can confirm that these standardised residuals behave approximately as an iid series which exhibits heavy tails. Historical simulation can then be used. Finally, to adjust them to the current market conditions, the randomly selected standardised residuals are multiplied by the forecasted volatility obtained with the SV model.
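The devolatilise/revolatilise scheme can be sketched as below; `fhs_var` is a hypothetical helper name and the resampling is a plain bootstrap of the standardised residuals:

```python
import random

def fhs_var(returns, sigma_hist, sigma_forecast, a, n_sim=100000, seed=0):
    """Filtered historical simulation VaR: standardise past returns by their
    estimated volatilities, resample the residuals, rescale by the forecast."""
    z = [r / s for r, s in zip(returns, sigma_hist)]     # standardised residuals
    rng = random.Random(seed)
    sims = sorted(rng.choice(z) * sigma_forecast for _ in range(n_sim))
    return -sims[int(a * n_sim)]                          # loss at confidence level a
```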


Table 8.2 VaR at different confidence levels for the FTSE100 index return

Confidence level    Log-normal SV model    Regime switching model
0.1                 2.5442                 2.7503
0.05                3.9298                 3.5874
0.01                5.3417                 4.6502

For example, Table 8.2 shows the results obtained with the FTSE100 index return by considering 1 000 000 historical simulations[47] (see Appendix B for details). Clearly, this approach allows a wide range of stochastic and changing volatility models, such as ARCH–GARCH models, to be considered. Moreover, it must be pointed out that instead of using historical simulation, an appropriate standard distribution can also be considered to model the transformed returns, and then several probability distributions can be assumed for the unconditional returns (McNeil and Frey, 2000; Eberlein et al., 2001).

Another example of a parametric approach to VaR calculation considers the hypothesis that a regime switching model governs the asset returns (Billio and Pelizzon, 2000):

yt = µ + σ0 εt if st = 0
     µ + σ1 εt if st = 1

where εt ∼ N(0,1) is independent of st.

To calculate the VaR it is necessary to determine the value of the conditional distribution for which the cumulative density is a, i.e.

a = ∫_{−∞}^{VaR(h,a)} Σ_{st+h=0,1} P(st+h | It) f_{st+h}(yt+h | It) dy

where f_{st+h}(yt+h | It) is the probability density of yt+h when the regime is st+h, conditional on the available information set It (usually containing past returns), and P(st+h | It) is the prediction probability obtained by the Hamilton filter.

Given the parameters estimated in Section 8.7.1.1 for the switching regime model and the prediction probabilities at time t + h (obtained by applying the transition matrix h − 1 times to the conditional probabilities P(st+1 | It) given by the Hamilton filter), the VaR is the relevant quantile of the mixture distribution.
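The resulting VaR is the a-quantile of a two-component Gaussian mixture, which has no closed form but is easily found numerically by inverting the mixture CDF. A sketch, with hypothetical parameter values and prediction probabilities standing in for the estimates of Section 8.7.1.1:

```python
from statistics import NormalDist

# Hypothetical regime parameters (mu, sigma) and h-step prediction
# probabilities P(s_{t+h} = k | I_t) from the Hamilton filter.
mu = [0.0005, -0.001]
sigma = [0.008, 0.020]
p = [0.85, 0.15]

def mixture_cdf(x):
    """CDF of the Gaussian mixture describing y_{t+h}."""
    return sum(pk * NormalDist(m, s).cdf(x) for pk, m, s in zip(p, mu, sigma))

def mixture_quantile(a, lo=-1.0, hi=1.0, tol=1e-10):
    """a-quantile of the mixture, found by bisection on the monotone CDF."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mixture_cdf(mid) < a:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

for a in (0.10, 0.05, 0.01):
    print(f"VaR at level {a:.2f}: {-mixture_quantile(a):.4f}")
```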

The model can be generalised to the case of N risky assets by providing an explicit link between the return on each asset and the return on the market index, thus explicitly modelling the correlation between different assets. The Multivariate Switching Regime Beta Model (Billio and Pelizzon, 2000) is a sort of market model, or better, a single-factor model in the Arbitrage Pricing Theory framework, in which the return of a single stock i is characterised by the regime switching of the market index and by the regime switching of the specific risk of the asset. It can be written as:

    ymt = µm(st) + σm(st) εt                          εt ∼ IIN(0,1)
    y1t = µ1(s1t) + β1(st, s1t) ymt + σ1(s1t) ε1t     ε1t ∼ IIN(0,1)
    y2t = µ2(s2t) + β2(st, s2t) ymt + σ2(s2t) ε2t     ε2t ∼ IIN(0,1)
    ...
    yNt = µN(sNt) + βN(st, sNt) ymt + σN(sNt) εNt     εNt ∼ IIN(0,1)

47

Each operation takes about 12 seconds.


where ymt is the market return, the market regime variable st and the single-stock regime variables sjt, j = 1, ..., N, are independent Markov chains, and εt and εjt, j = 1, ..., N, are independently distributed.

Using this approach it is possible to take into account the correlation between different assets. In fact, the variance-covariance matrix of the two assets i and j is:

    Σ(st, sit, sjt) = [ βi²(st, sit)σm²(st) + σi²(sit)       βi(st, sit)βj(st, sjt)σm²(st)   ]
                      [ βi(st, sit)βj(st, sjt)σm²(st)        βj²(st, sjt)σm²(st) + σj²(sjt)  ]

then the correlation between different assets depends on the extent to which each asset is

linked, through the factor loading β, to the market index.
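As an illustration, the regime-conditional covariance matrix and the implied correlation can be computed directly from this formula; the parameter values below are purely hypothetical:

```python
import numpy as np

# Hypothetical regime-dependent parameters: market volatility sigma_m(s_t),
# betas beta(s_t, s_it) and idiosyncratic volatilities sigma(s_it).
sigma_m = {0: 0.01, 1: 0.03}
beta_i = {(0, 0): 0.8, (0, 1): 1.1, (1, 0): 0.9, (1, 1): 1.3}
beta_j = {(0, 0): 0.6, (0, 1): 0.9, (1, 0): 0.7, (1, 1): 1.2}
sigma_i = {0: 0.012, 1: 0.025}
sigma_j = {0: 0.010, 1: 0.030}

def cov_matrix(st, sit, sjt):
    """Variance-covariance matrix of assets i and j given the three regimes."""
    vm = sigma_m[st] ** 2
    bi, bj = beta_i[(st, sit)], beta_j[(st, sjt)]
    return np.array([[bi ** 2 * vm + sigma_i[sit] ** 2, bi * bj * vm],
                     [bi * bj * vm, bj ** 2 * vm + sigma_j[sjt] ** 2]])

S = cov_matrix(1, 0, 1)
corr = S[0, 1] / np.sqrt(S[0, 0] * S[1, 1])
print(f"implied correlation in regimes (1, 0, 1): {corr:.3f}")
```

The correlation is driven entirely by the shared market factor: assets with larger betas in the prevailing market regime are more strongly correlated.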

To calculate VaR for a portfolio based on N assets it is sufficient to use the approach presented above. In particular, considering two assets and assuming that the number of regimes is two for all three Markov chains, we have:

    a = Σ_{st+h = 0,1} Σ_{si,t+h = 0,1} Σ_{sj,t+h = 0,1} P(st+h, si,t+h, sj,t+h | It) ∫_{−∞}^{VaR(h,a)} f_{st+h, si,t+h, sj,t+h}(y | It) dy

where f_{st+h, si,t+h, sj,t+h}(y | It) is the probability density of the portfolio return y when the regimes are st+h, si,t+h, sj,t+h, conditional on the available information set It. This distribution has mean w′µ(st+h, si,t+h, sj,t+h) and variance w′Σ(st+h, si,t+h, sj,t+h)w, where w is the vector of the percentages of wealth invested in the two assets and µ(st+h, si,t+h, sj,t+h) is the vector of risky asset mean returns, i.e.

    µ(st+h, si,t+h, sj,t+h) = [ µi(si,t+h) + βi(st+h, si,t+h) µm(st+h) ]
                              [ µj(sj,t+h) + βj(st+h, sj,t+h) µm(st+h) ]
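Putting the pieces together, the portfolio VaR is the a-quantile of a mixture of 2³ = 8 Gaussians, one for each regime combination; since the three chains are independent, the joint prediction probability factorises into the product of the marginal ones. A sketch with hypothetical parameters:

```python
from itertools import product
from statistics import NormalDist

# Hypothetical parameters of the two-asset switching beta model and
# independent prediction probabilities for the three Markov chains.
mu_m = {0: 0.0006, 1: -0.002}
sigma_m = {0: 0.01, 1: 0.03}
beta = {"i": {(0, 0): 0.8, (0, 1): 1.1, (1, 0): 0.9, (1, 1): 1.3},
        "j": {(0, 0): 0.6, (0, 1): 0.9, (1, 0): 0.7, (1, 1): 1.2}}
mu_a = {"i": {0: 0.0004, 1: -0.001}, "j": {0: 0.0003, 1: -0.0015}}
sig_a = {"i": {0: 0.012, 1: 0.025}, "j": {0: 0.010, 1: 0.030}}
p_m, p_i, p_j = {0: 0.9, 1: 0.1}, {0: 0.8, 1: 0.2}, {0: 0.85, 1: 0.15}
w = (0.5, 0.5)                                # portfolio weights

def mixture_cdf(x):
    """CDF of the portfolio return: mixture over the 8 regime combinations."""
    total = 0.0
    for st, si, sj in product((0, 1), repeat=3):
        prob = p_m[st] * p_i[si] * p_j[sj]    # chains are independent
        bi, bj = beta["i"][(st, si)], beta["j"][(st, sj)]
        # Portfolio mean w'mu and variance w'Sigma w for this combination.
        mean = (w[0] * (mu_a["i"][si] + bi * mu_m[st])
                + w[1] * (mu_a["j"][sj] + bj * mu_m[st]))
        vm = sigma_m[st] ** 2
        var = (w[0] ** 2 * (bi ** 2 * vm + sig_a["i"][si] ** 2)
               + w[1] ** 2 * (bj ** 2 * vm + sig_a["j"][sj] ** 2)
               + 2 * w[0] * w[1] * bi * bj * vm)
        total += prob * NormalDist(mean, var ** 0.5).cdf(x)
    return total

# a-quantile of the mixture by bisection on the monotone CDF.
a, lo, hi = 0.05, -1.0, 1.0
while hi - lo > 1e-10:
    mid = 0.5 * (lo + hi)
    lo, hi = (mid, hi) if mixture_cdf(mid) < a else (lo, mid)
print(f"5% portfolio VaR: {-lo:.4f}")
```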

The drawback of this approach is that it requires the estimation of a number of parameters that grows exponentially with the number of assets. In fact, the number of possible regimes generated by this model is 2^(N+1).

One possible solution is to consider the idiosyncratic risk distributed as IIN(0, σi²) (without a specific Markov chain dependency) and to characterise the systematic risk with more than one source of risk. This approach is in line with the Arbitrage Pricing Theory model, where the risky factors are characterised by switching regime processes. Formally, we can write this model as:

    Fjt = αj(sjt) + θj(sjt) εjt                       εjt ∼ IIN(0,1),   j = 1, 2, ..., K
    yit = µi + Σ_{j=1}^{K} βij(sjt) Fjt + σi εit      εit ∼ IIN(0,1),   i = 1, 2, ..., N
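A brief simulation sketch (all parameter values hypothetical) illustrates the structure: the joint regime space is driven by the K factor chains only, so it has 2^K states regardless of the number of assets N:

```python
import numpy as np

rng = np.random.default_rng(1)
K, N, T = 2, 4, 250        # factors, assets, periods (illustrative sizes)

# Hypothetical regime-dependent factor parameters alpha_j(s), theta_j(s),
# regime-dependent loadings beta_ij(s_jt) and constant idiosyncratic risk.
alpha = np.array([[0.001, -0.002], [0.0005, -0.001]])   # K x 2
theta = np.array([[0.010, 0.030], [0.008, 0.020]])      # K x 2
beta = rng.normal(1.0, 0.3, size=(N, K, 2))             # N x K x 2
mu = np.zeros(N)
sigma_idio = np.full(N, 0.01)   # IIN(0, sigma_i^2), no Markov dependency

s = np.zeros(K, dtype=int)      # current regimes of the K factor chains
returns = np.empty((T, N))
for t in range(T):
    # Each chain independently stays in its current regime with probability 0.95.
    s = np.where(rng.random(K) > 0.95, 1 - s, s)
    F = alpha[np.arange(K), s] + theta[np.arange(K), s] * rng.standard_normal(K)
    returns[t] = mu + beta[:, np.arange(K), s] @ F + sigma_idio * rng.standard_normal(N)

print(returns.shape)            # (250, 4): T x N panel of simulated returns
```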