

and the transactions data consist of $\{\Delta t_i, N_i, D_i, S_i\}$ for the $i$th price change. The
PCD model is concerned with the joint analysis of $(\Delta t_i, N_i, D_i, S_i)$.

Remark: Focusing on transactions associated with a price change can reduce
the sample size dramatically. For example, consider the intraday data of IBM stock
from November 1, 1990 to January 31, 1991. There were 60,265 intraday trades, but
only 19,022 of them resulted in a price change. In addition, there is no diurnal pattern
in time durations between price changes.

To illustrate the relationship among the price movements of all transactions and
those of transactions associated with a price change, we consider the intraday trad-
ings of IBM stock on November 21, 1990. There were 726 transactions on that day
during the normal trading hours, but only 195 trades resulted in a price change. Fig-
ure 5.14 shows the time plot of the price series for both cases. As expected, the price
series are the same.
The PCD model decomposes the joint distribution of $(\Delta t_i, N_i, D_i, S_i)$ given $F_{i-1}$
as

$$f(\Delta t_i, N_i, D_i, S_i \mid F_{i-1}) = f(S_i \mid D_i, N_i, \Delta t_i, F_{i-1})\, f(D_i \mid N_i, \Delta t_i, F_{i-1})\, f(N_i \mid \Delta t_i, F_{i-1})\, f(\Delta t_i \mid F_{i-1}). \tag{5.47}$$

This partition enables us to specify suitable econometric models for the conditional
distributions and, hence, to simplify the modeling task. There are many ways to
specify models for the conditional distributions. A proper specification might depend
on the asset under study. Here we employ the specifications used by McCulloch and
Tsay (2000), who use generalized linear models for the discrete-valued variables and
a time series model for the continuous variable $\ln(\Delta t_i)$.
For the time duration between price changes, we use the model

$$\ln(\Delta t_i) = \beta_0 + \beta_1 \ln(\Delta t_{i-1}) + \beta_2 S_{i-1} + \sigma \epsilon_i, \tag{5.48}$$

where $\sigma$ is a positive number and $\{\epsilon_i\}$ is a sequence of iid $N(0, 1)$ random variables.
This is a multiple linear regression model with lagged variables. Other explanatory
variables can be added if necessary. The log transformation is used to ensure the
positiveness of time duration.
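
As a rough illustration of Eq. (5.48), the following Python sketch simulates durations from the log-linear model. It is only a sketch under stated assumptions: the function name `simulate_durations`, the starting duration `dt0`, and the constant size series are hypothetical, and the coefficient values are borrowed from the fitted model of Example 5.5 purely for illustration.

```python
import math
import random

def simulate_durations(n, beta0, beta1, beta2, sigma, sizes, dt0=1.0, seed=0):
    """Simulate n durations from Eq. (5.48).

    sizes[i] plays the role of S_i; dt0 is an assumed starting duration.
    """
    rng = random.Random(seed)
    durations = [dt0]
    for i in range(1, n):
        eps = rng.gauss(0.0, 1.0)  # iid N(0, 1) innovation
        log_dt = (beta0 + beta1 * math.log(durations[i - 1])
                  + beta2 * sizes[i - 1] + sigma * eps)
        durations.append(math.exp(log_dt))  # exponentiating keeps durations positive
    return durations

# Illustrative run: every price change is assumed to be one tick in size.
sizes = [1] * 500
durations = simulate_durations(500, 4.023, 0.032, -0.025, 1.403, sizes)
```

Because the model is specified for the logarithm of the duration, every simulated duration is positive regardless of the sign of the innovation.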
209
THE PCD MODEL


Figure 5.14. Time plots of the intraday transaction prices of IBM stock on November 21,
1990: (a) all transactions, and (b) transactions that resulted in a price change. Both panels
plot price (vertical axis) against time in seconds (horizontal axis).



The conditional model for $N_i$ is further partitioned into two parts because empirical
data suggest a concentration of $N_i$ at 0. The first part of the model for $N_i$ is the
logit model

$$p(N_i = 0 \mid \Delta t_i, F_{i-1}) = \mathrm{logit}[\alpha_0 + \alpha_1 \ln(\Delta t_i)], \tag{5.49}$$

where $\mathrm{logit}(x) = \exp(x)/[1 + \exp(x)]$, whereas the second part of the model is

$$N_i \mid (N_i > 0, \Delta t_i, F_{i-1}) \sim 1 + g(\lambda_i), \quad \lambda_i = \frac{\exp[\gamma_0 + \gamma_1 \ln(\Delta t_i)]}{1 + \exp[\gamma_0 + \gamma_1 \ln(\Delta t_i)]}, \tag{5.50}$$

where $\sim$ means “is distributed as,” and $g(\lambda)$ denotes a geometric distribution with
parameter $\lambda$, which is in the interval (0, 1).
The model for direction $D_i$ is

$$D_i \mid (N_i, \Delta t_i, F_{i-1}) = \mathrm{sign}(\mu_i + \sigma_i \epsilon), \tag{5.51}$$

where $\epsilon$ is a $N(0, 1)$ random variable, and
210 HIGH-FREQUENCY DATA


$$\mu_i = \omega_0 + \omega_1 D_{i-1} + \omega_2 \ln(\Delta t_i)$$
$$\ln(\sigma_i) = \beta \left| \sum_{j=1}^{4} D_{i-j} \right| = \beta\, |D_{i-1} + D_{i-2} + D_{i-3} + D_{i-4}|.$$


In other words, $D_i$ is governed by the sign of a normal random variable with mean $\mu_i$
and variance $\sigma_i^2$. A special characteristic of the prior model is the function for $\ln(\sigma_i)$.
For intraday transactions, a key feature is the price reversal between consecutive
price changes. This feature is modeled by the dependence of $D_i$ on $D_{i-1}$ in the
mean equation with a negative $\omega_1$ parameter. However, there exists an occasional local
trend in the price movement. The previous variance equation allows for such a local
trend by increasing the uncertainty in the direction of price movement when the past
data showed evidence of a local trend. For a normal distribution with a fixed mean,
increasing its variance moves the probabilities of a positive and of a negative draw
toward one half. This in turn increases the chance for a sequence of all positive or all
negative draws. Such a sequence produces a local trend in price movement.
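
The effect of the variance equation can be checked numerically. Under Eq. (5.51), the probability of an upward move is $P(\mu_i + \sigma_i \epsilon > 0) = \Phi(\mu_i/\sigma_i)$, where $\Phi$ is the standard normal CDF. The sketch below is illustrative only: the function names are hypothetical, and the coefficient values and the log duration of 4.0 used in the usage lines are assumptions (the coefficients are borrowed from the fitted model of Example 5.5).

```python
import math

def prob_up(mu, sigma):
    """P(mu + sigma * eps > 0) for eps ~ N(0, 1), that is, Phi(mu / sigma)."""
    return 0.5 * (1.0 + math.erf(mu / (sigma * math.sqrt(2.0))))

def direction_prob_up(d_lags, log_dt, omega0, omega1, omega2, beta):
    """Probability that D_i = 1 given the four most recent directions.

    d_lags = [D_{i-1}, D_{i-2}, D_{i-3}, D_{i-4}], each equal to +1 or -1.
    """
    mu = omega0 + omega1 * d_lags[0] + omega2 * log_dt
    sigma = math.exp(beta * abs(sum(d_lags[:4])))
    return prob_up(mu, sigma)

# Four consecutive up-moves (a local trend) versus alternating moves.
p_trend = direction_prob_up([1, 1, 1, 1], 4.0, 0.049, -0.840, -0.004, 0.244)
p_alternating = direction_prob_up([1, -1, 1, -1], 4.0, 0.049, -0.840, -0.004, 0.244)
```

In both cases the last move was upward, so the negative $\omega_1$ makes a further upward move less likely than a reversal. After a run of four upward moves, however, the larger $\sigma_i$ pulls that probability closer to one half, which is how the model leaves room for a local trend to continue.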
To allow for different dynamics between positive and negative price movements,
we use different models for the size of a price change. Specifically, we have

$$S_i \mid (D_i = -1, N_i, \Delta t_i, F_{i-1}) \sim p(\lambda_{d,i}) + 1, \quad \text{with} \tag{5.52}$$
$$\ln(\lambda_{d,i}) = \eta_{d,0} + \eta_{d,1} N_i + \eta_{d,2} \ln(\Delta t_i) + \eta_{d,3} S_{i-1}$$
$$S_i \mid (D_i = 1, N_i, \Delta t_i, F_{i-1}) \sim p(\lambda_{u,i}) + 1, \quad \text{with} \tag{5.53}$$
$$\ln(\lambda_{u,i}) = \eta_{u,0} + \eta_{u,1} N_i + \eta_{u,2} \ln(\Delta t_i) + \eta_{u,3} S_{i-1},$$

where $p(\lambda)$ denotes a Poisson distribution with parameter $\lambda$, and 1 is added to the
size because the minimum size is 1 tick when there is a price change.
The specified models in Eqs. (5.48)–(5.53) can be estimated jointly by either the
maximum likelihood method or the Markov chain Monte Carlo methods. Based
on Eq. (5.47), the models consist of six conditional models that can be estimated
separately.

Example 5.5. Consider the intraday transactions of IBM stock on November
21, 1990. There are 194 price changes within the normal trading hours. Figure 5.15
shows the histograms of $\ln(\Delta t_i)$, $N_i$, $D_i$, and $S_i$. The data for $D_i$ are about equally
distributed between “upward” and “downward” movements. Only a few transactions
resulted in a price change of more than 1 tick; as a matter of fact, there were seven
changes with two ticks and one change with three ticks. Using Markov chain Monte
Carlo (MCMC) methods (see Chapter 10), we obtained the following models for the
data. The reported estimates and their standard deviations are the posterior means
and standard deviations of MCMC draws with 9500 iterations. The model for the
time duration between price changes is

$$\ln(\Delta t_i) = 4.023 + 0.032 \ln(\Delta t_{i-1}) - 0.025 S_{i-1} + 1.403 \epsilon_i,$$

Figure 5.15. Histograms of intraday transactions data for IBM stock on November 21, 1990:
(a) log durations between price changes, (b) direction of price movement, (c) size of price
change measured in ticks, and (d) number of trades without a price change.



where standard deviations of the coefficients are 0.415, 0.073, 0.384, and 0.073,
respectively. The fitted model indicates that there was no dynamic dependence in the
time duration. For the $N_i$ variable, we have

$$\Pr(N_i > 0 \mid \Delta t_i, F_{i-1}) = \mathrm{logit}[-0.637 + 1.740 \ln(\Delta t_i)],$$

where standard deviations of the estimates are 0.238 and 0.248, respectively. Thus,
as expected, the number of trades with no price change in the time interval $(t_{i-1}, t_i)$
depends positively on the length of the interval. The magnitude of $N_i$ when it is
positive is

$$N_i \mid (N_i > 0, \Delta t_i, F_{i-1}) \sim 1 + g(\lambda_i), \quad \lambda_i = \frac{\exp[0.178 - 0.910 \ln(\Delta t_i)]}{1 + \exp[0.178 - 0.910 \ln(\Delta t_i)]},$$

where standard deviations of the estimates are 0.246 and 0.138, respectively. The
negative and significant coefficient of $\ln(\Delta t_i)$ means that $N_i$ is positively related to
the length of the duration $\Delta t_i$ because a large $\ln(\Delta t_i)$ implies a small $\lambda_i$, which
in turn implies higher probabilities for larger $N_i$; see the geometric distribution in
Eq. (5.27).
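
To make the last argument concrete, the sketch below evaluates the two fitted equations for $N_i$ at a short and a long duration. It is a sketch under an explicit assumption: Eq. (5.27) is not reproduced in this section, so the code assumes the geometric probability function $P(g = m) = \lambda(1-\lambda)^m$, $m = 0, 1, \ldots$, whose mean is $(1-\lambda)/\lambda$. The function names are hypothetical.

```python
import math

def logistic(x):
    """The function written as logit(x) = exp(x) / [1 + exp(x)] in the text."""
    return math.exp(x) / (1.0 + math.exp(x))

def prob_n_positive(dt):
    """Fitted Pr(N_i > 0) for a duration of dt seconds."""
    return logistic(-0.637 + 1.740 * math.log(dt))

def geometric_lambda(dt):
    """Fitted geometric parameter lambda_i for a duration of dt seconds."""
    return logistic(0.178 - 0.910 * math.log(dt))

def expected_n_given_positive(dt):
    """E(N_i | N_i > 0) = 1 + (1 - lambda)/lambda, ASSUMING the pmf lambda(1-lambda)^m."""
    lam = geometric_lambda(dt)
    return 1.0 + (1.0 - lam) / lam

short_dt, long_dt = 10.0, 100.0
summary = [(dt, prob_n_positive(dt), geometric_lambda(dt), expected_n_given_positive(dt))
           for dt in (short_dt, long_dt)]
```

Under that assumption, the longer duration gives both a higher probability that $N_i$ is positive and a smaller $\lambda_i$, hence a larger expected number of trades without a price change.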


The fitted model for $D_i$ is

$$\mu_i = 0.049 - 0.840 D_{i-1} - 0.004 \ln(\Delta t_i)$$
$$\ln(\sigma_i) = 0.244\, |D_{i-1} + D_{i-2} + D_{i-3} + D_{i-4}|,$$

where standard deviations of the parameters in the mean equation are 0.129, 0.132,
and 0.082, respectively, whereas that for the parameter in the variance equation is
0.182. The price reversal is clearly shown by the highly significant negative coefficient
of $D_{i-1}$. The marginally significant parameter in the variance equation is
exactly as expected. Finally, the fitted models for the size of a price change are

$$\ln(\lambda_{d,i}) = 1.024 - 0.327 N_i + 0.412 \ln(\Delta t_i) - 4.474 S_{i-1}$$
$$\ln(\lambda_{u,i}) = -3.683 - 1.542 N_i + 0.419 \ln(\Delta t_i) + 0.921 S_{i-1},$$

where standard deviations of the parameters for the “down size” are 3.350, 0.319,
0.599, and 3.188, respectively, whereas those for the “up size” are 1.734, 0.976,
0.453, and 1.459. The interesting estimates of the prior two equations are the negative
estimates of the coefficient of $N_i$. A large $N_i$ means there were more transactions in
the time interval $(t_{i-1}, t_i)$ with no price change. This can be taken as evidence of no
new information available in the time interval $(t_{i-1}, t_i)$. Consequently, the size for
the price change at $t_i$ should be small. A small $\lambda_{u,i}$ or $\lambda_{d,i}$ for a Poisson distribution
gives precisely that.
In summary, although a sample of 194 observations in a given day may not
contain sufficient information about the trading dynamic of IBM stock, the fitted
models appear to provide some sensible results. McCulloch and Tsay (2000) extend
the PCD model to a hierarchical framework to handle all the data of the 63 trading
days between November 1, 1990 and January 31, 1991. Many of the parameter
estimates become significant in this extended sample, which has more than 19,000
observations. For example, the overall estimate of the coefficient of $\ln(\Delta t_{i-1})$ in the
model for time duration ranges from 0.04 to 0.1, which is small, but significant.
Finally, using transactions data to test microstructure theory often requires a careful
specification of the variables used. It also requires a deep understanding of the
way in which the market operates and the data are collected. However, the ideas behind the
econometric models discussed in this chapter are useful and widely applicable in the
analysis of high-frequency data.



APPENDIX A. REVIEW OF SOME PROBABILITY DISTRIBUTIONS

Exponential distribution
A random variable $X$ has an exponential distribution with parameter $\beta > 0$ if its
probability density function (pdf) is given by

$$f(x \mid \beta) = \begin{cases} \dfrac{1}{\beta} e^{-x/\beta} & \text{if } x \ge 0 \\ 0 & \text{otherwise.} \end{cases}$$

Denoting such a distribution by $X \sim \exp(\beta)$, we have $E(X) = \beta$ and $\mathrm{Var}(X) = \beta^2$.
The cumulative distribution function (CDF) of $X$ is

$$F(x \mid \beta) = \begin{cases} 0 & \text{if } x < 0 \\ 1 - e^{-x/\beta} & \text{if } x \ge 0. \end{cases}$$

When $\beta = 1$, $X$ is said to have a standard exponential distribution.

Gamma function
For $\kappa > 0$, the gamma function $\Gamma(\kappa)$ is defined by

$$\Gamma(\kappa) = \int_0^\infty x^{\kappa-1} e^{-x}\,dx.$$

The most important properties of the gamma function are:

1. For any $\kappa > 1$, $\Gamma(\kappa) = (\kappa - 1)\Gamma(\kappa - 1)$.
2. For any positive integer $m$, $\Gamma(m) = (m - 1)!$.
3. $\Gamma(\tfrac{1}{2}) = \sqrt{\pi}$.

The integration

$$\Gamma(y \mid \kappa) = \int_0^y x^{\kappa-1} e^{-x}\,dx, \quad y > 0,$$

is an incomplete gamma function. Its values have been tabulated in the literature.
Computer programs are now available to evaluate the incomplete gamma function.
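
As one example of such a computation, the sketch below evaluates the incomplete gamma function $\Gamma(y \mid \kappa)$ defined above by the standard power series $y^{\kappa} e^{-y} \sum_{n \ge 0} y^n / [\kappa(\kappa+1)\cdots(\kappa+n)]$. This is a minimal sketch rather than a production routine: the function name, the tolerance, and the term limit are our own choices, and the series is slow for very large $y$.

```python
import math

def lower_incomplete_gamma(kappa, y, tol=1e-12, max_terms=10_000):
    """Evaluate the integral of x^(kappa-1) e^(-x) over [0, y] by a power series."""
    if kappa <= 0 or y < 0:
        raise ValueError("requires kappa > 0 and y >= 0")
    if y == 0:
        return 0.0
    term = 1.0 / kappa          # n = 0 term of the series
    total = term
    n = 0
    while abs(term) > tol * abs(total) and n < max_terms:
        n += 1
        term *= y / (kappa + n)  # ratio of consecutive series terms
        total += term
    return math.exp(kappa * math.log(y) - y) * total

# For kappa = 1 the integral has the closed form 1 - exp(-y).
check_closed_form = lower_incomplete_gamma(1.0, 2.0) - (1.0 - math.exp(-2.0))
# For large y the incomplete integral approaches the complete gamma function.
check_limit = lower_incomplete_gamma(2.5, 50.0) / math.gamma(2.5)
```

The two checks compare the series with cases whose values are known: the closed form for $\kappa = 1$ and the limit $\Gamma(\kappa)$ as $y$ grows.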

Gamma distribution
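A random variable $X$ has a Gamma distribution with parameters $\kappa$ and $\beta$ ($\kappa > 0$,
$\beta > 0$) if its pdf is given by

$$f(x \mid \kappa, \beta) = \begin{cases} \dfrac{1}{\beta^{\kappa} \Gamma(\kappa)} x^{\kappa-1} e^{-x/\beta} & \text{if } x \ge 0 \\ 0 & \text{otherwise.} \end{cases}$$

By changing variable $y = x/\beta$, one can easily obtain the moments of $X$:

$$E(X^m) = \int_0^\infty x^m f(x \mid \kappa, \beta)\,dx = \frac{1}{\beta^{\kappa} \Gamma(\kappa)} \int_0^\infty x^{\kappa+m-1} e^{-x/\beta}\,dx = \frac{\beta^m}{\Gamma(\kappa)} \int_0^\infty y^{\kappa+m-1} e^{-y}\,dy = \frac{\beta^m \Gamma(\kappa + m)}{\Gamma(\kappa)}.$$

In particular, the mean and variance of $X$ are $E(X) = \kappa\beta$ and $\mathrm{Var}(X) = \kappa\beta^2$. When
$\beta = 1$, the distribution is called a standard Gamma distribution with parameter $\kappa$.
We use the notation $G \sim \mathrm{Gamma}(\kappa)$ to denote that $G$ follows a standard Gamma
distribution with parameter $\kappa$. The moments of $G$ are

$$E(G^m) = \frac{\Gamma(\kappa + m)}{\Gamma(\kappa)}, \quad m > 0. \tag{5.54}$$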

Weibull distribution
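A random variable $X$ has a Weibull distribution with parameters $\alpha$ and $\beta$ ($\alpha > 0$,
$\beta > 0$) if its pdf is given by

$$f(x \mid \alpha, \beta) = \begin{cases} \dfrac{\alpha}{\beta^{\alpha}} x^{\alpha-1} e^{-(x/\beta)^{\alpha}} & \text{if } x \ge 0 \\ 0 & \text{if } x < 0, \end{cases}$$

where $\beta$ and $\alpha$ are the scale and shape parameters of the distribution. The mean and
variance of $X$ are

$$E(X) = \beta\,\Gamma\!\left(1 + \frac{1}{\alpha}\right), \quad \mathrm{Var}(X) = \beta^2 \left\{ \Gamma\!\left(1 + \frac{2}{\alpha}\right) - \left[\Gamma\!\left(1 + \frac{1}{\alpha}\right)\right]^2 \right\}$$

and the CDF of $X$ is

$$F(x \mid \alpha, \beta) = \begin{cases} 0 & \text{if } x < 0 \\ 1 - e^{-(x/\beta)^{\alpha}} & \text{if } x \ge 0. \end{cases}$$

When $\alpha = 1$, the Weibull distribution reduces to an exponential distribution.
Define $Y = X/[\beta\,\Gamma(1 + \frac{1}{\alpha})]$. We have $E(Y) = 1$ and the pdf of $Y$ is

$$f(y \mid \alpha) = \begin{cases} \alpha \left[\Gamma\!\left(1 + \dfrac{1}{\alpha}\right)\right]^{\alpha} y^{\alpha-1} \exp\left\{ -\left[\Gamma\!\left(1 + \dfrac{1}{\alpha}\right) y\right]^{\alpha} \right\} & \text{if } y \ge 0 \\ 0 & \text{otherwise,} \end{cases} \tag{5.55}$$

where the scale parameter $\beta$ disappears due to standardization. The CDF of the
standardized Weibull distribution is

$$F(y \mid \alpha) = \begin{cases} 0 & \text{if } y < 0 \\ 1 - \exp\left\{ -\left[\Gamma\!\left(1 + \dfrac{1}{\alpha}\right) y\right]^{\alpha} \right\} & \text{if } y > 0, \end{cases}$$

and we have $E(Y) = 1$ and $\mathrm{Var}(Y) = \Gamma(1 + \frac{2}{\alpha})/[\Gamma(1 + \frac{1}{\alpha})]^2 - 1$. For a duration
model with Weibull innovations, the prior pdf is used in the maximum likelihood
estimation.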
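
The standardized pdf in Eq. (5.55) is easy to evaluate on the log scale, which is the form needed in a likelihood. The following sketch is an illustration only; the helper names, the midpoint integration rule, and the choice $\alpha = 1.5$ are assumptions made for the numerical check that the density integrates to one and has unit mean.

```python
import math

def std_weibull_logpdf(y, alpha):
    """Log of the standardized Weibull pdf in Eq. (5.55); requires y > 0."""
    log_c = math.lgamma(1.0 + 1.0 / alpha)   # ln Gamma(1 + 1/alpha)
    return (math.log(alpha) + alpha * log_c
            + (alpha - 1.0) * math.log(y)
            - (math.exp(log_c) * y) ** alpha)

def midpoint_integral(func, lower, upper, steps):
    """Crude midpoint-rule integral, adequate for this smooth integrand."""
    width = (upper - lower) / steps
    return sum(func(lower + (k + 0.5) * width) for k in range(steps)) * width

alpha = 1.5
total_mass = midpoint_integral(
    lambda y: math.exp(std_weibull_logpdf(y, alpha)), 0.0, 20.0, 20000)
mean = midpoint_integral(
    lambda y: y * math.exp(std_weibull_logpdf(y, alpha)), 0.0, 20.0, 20000)
```

Both `total_mass` and `mean` should be close to one, up to integration error, reflecting the standardization $E(Y) = 1$.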


Generalized Gamma distribution
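A random variable $X$ has a generalized Gamma distribution with parameters $\alpha$, $\beta$, $\kappa$
($\alpha > 0$, $\beta > 0$, and $\kappa > 0$) if its pdf is given by

$$f(x \mid \alpha, \beta, \kappa) = \begin{cases} \dfrac{\alpha x^{\kappa\alpha-1}}{\beta^{\kappa\alpha} \Gamma(\kappa)} \exp\left[ -\left(\dfrac{x}{\beta}\right)^{\alpha} \right] & \text{if } x \ge 0 \\ 0 & \text{otherwise,} \end{cases}$$

where $\beta$ is a scale parameter, and $\alpha$ and $\kappa$ are shape parameters. This distribution
can be written as

$$G = \left(\frac{X}{\beta}\right)^{\alpha},$$

where $G$ is a standard Gamma random variable with parameter $\kappa$. The pdf of $X$ can
be obtained from that of $G$ by the technique of changing variables. Similarly, the
moments of $X$ can be obtained from that of $G$ in Eq. (5.54) by

$$E(X^m) = E[(\beta G^{1/\alpha})^m] = \beta^m E(G^{m/\alpha}) = \beta^m \frac{\Gamma(\kappa + \frac{m}{\alpha})}{\Gamma(\kappa)} = \frac{\beta^m \Gamma(\kappa + \frac{m}{\alpha})}{\Gamma(\kappa)}.$$

When $\kappa = 1$, the generalized Gamma distribution reduces to that of a Weibull
distribution. Thus, the exponential and Weibull distributions are special cases of the
generalized Gamma distribution.
The expectation of a generalized Gamma distribution is $E(X) = \beta\,\Gamma(\kappa + \frac{1}{\alpha})/\Gamma(\kappa)$.
In duration models, we need a distribution with unit expectation. Therefore,
defining a random variable $Y = \lambda X/\beta$, where $\lambda = \Gamma(\kappa)/\Gamma(\kappa + \frac{1}{\alpha})$, we have
$E(Y) = 1$ and the pdf of $Y$ is

$$f(y \mid \alpha, \kappa) = \begin{cases} \dfrac{\alpha y^{\kappa\alpha-1}}{\lambda^{\kappa\alpha} \Gamma(\kappa)} \exp\left[ -\left(\dfrac{y}{\lambda}\right)^{\alpha} \right] & \text{if } y > 0 \\ 0 & \text{otherwise,} \end{cases} \tag{5.56}$$

where again the scale parameter $\beta$ disappears and $\lambda = \Gamma(\kappa)/\Gamma(\kappa + \frac{1}{\alpha})$.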
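
The standardized density in Eq. (5.56) can be coded in the same way as the Weibull case. The sketch below is illustrative; the function name is hypothetical, and the log-gamma function is used so that $\lambda$ can be formed without overflow for large $\kappa$.

```python
import math

def std_gengamma_logpdf(y, alpha, kappa):
    """Log of the standardized generalized Gamma pdf in Eq. (5.56); requires y > 0."""
    log_gamma_kappa = math.lgamma(kappa)
    # lambda = Gamma(kappa) / Gamma(kappa + 1/alpha), formed on the log scale
    lam = math.exp(log_gamma_kappa - math.lgamma(kappa + 1.0 / alpha))
    return (math.log(alpha)
            + (kappa * alpha - 1.0) * math.log(y)
            - kappa * alpha * math.log(lam)
            - log_gamma_kappa
            - (y / lam) ** alpha)

# With kappa = 1 and alpha = 1 the density is the standard exponential, exp(-y).
exponential_case = std_gengamma_logpdf(2.0, 1.0, 1.0)
```

Setting $\kappa = 1$ recovers the standardized Weibull density of Eq. (5.55), and setting $\alpha = \kappa = 1$ recovers the standard exponential density, which gives a quick consistency check on any implementation.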




APPENDIX B. HAZARD FUNCTION

A useful concept in modeling duration is the hazard function implied by a distribution
function. For a random variable $X$, the survival function is defined as

$$S(x) \equiv P(X > x) = 1 - P(X \le x) = 1 - \mathrm{CDF}(x), \quad x > 0,$$

which gives the probability that a subject, which follows the distribution of $X$,
survives at the time $x$. The hazard function (or intensity function) of $X$ is then defined
by

$$h(x) = \frac{f(x)}{S(x)} \tag{5.57}$$

where $f(\cdot)$ and $S(\cdot)$ are the pdf and survival function of $X$, respectively.

Example 5.6. For the Weibull distribution with parameters $\alpha$ and $\beta$, the survival
function and hazard function are:

$$S(x \mid \alpha, \beta) = \exp\left[ -\left(\frac{x}{\beta}\right)^{\alpha} \right], \quad h(x \mid \alpha, \beta) = \frac{\alpha}{\beta^{\alpha}} x^{\alpha-1}, \quad x > 0.$$

In particular, when $\alpha = 1$, we have $h(x \mid \beta) = 1/\beta$. Therefore, for an exponential
distribution, the hazard function is constant. For a Weibull distribution, the hazard is
a monotone function. If $\alpha > 1$, then the hazard function is monotonically increasing.
If $\alpha < 1$, the hazard function is monotonically decreasing. For the generalized
Gamma distribution, the survival function and, hence, the hazard function involve the
incomplete Gamma function. Yet the hazard function may exhibit various patterns,
including U shape or inverted U shape. Thus, the generalized Gamma distribution
provides a flexible approach to modeling the duration of stock transactions.
For the standardized Weibull distribution, the survival and hazard functions are

$$S(y \mid \alpha) = \exp\left\{ -\left[\Gamma\!\left(1 + \frac{1}{\alpha}\right) y\right]^{\alpha} \right\}, \quad h(y \mid \alpha) = \alpha \left[\Gamma\!\left(1 + \frac{1}{\alpha}\right)\right]^{\alpha} y^{\alpha-1}, \quad y > 0.$$
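
The monotonicity of the Weibull hazard can be verified directly from the definition in Eq. (5.57). The sketch below computes the hazard as the ratio of the pdf to the survival function and compares it with the closed form of Example 5.6; the function names and the parameter values in the usage lines are illustrative assumptions.

```python
import math

def weibull_pdf(x, alpha, beta):
    """Weibull pdf for x > 0."""
    return (alpha / beta ** alpha) * x ** (alpha - 1.0) * math.exp(-(x / beta) ** alpha)

def weibull_survival(x, alpha, beta):
    """Weibull survival function S(x) = P(X > x) for x > 0."""
    return math.exp(-(x / beta) ** alpha)

def hazard_from_definition(x, alpha, beta):
    """h(x) = f(x) / S(x), Eq. (5.57)."""
    return weibull_pdf(x, alpha, beta) / weibull_survival(x, alpha, beta)

def weibull_hazard(x, alpha, beta):
    """Closed-form Weibull hazard from Example 5.6."""
    return (alpha / beta ** alpha) * x ** (alpha - 1.0)

grid = [0.5, 1.0, 2.0, 4.0]
increasing = [weibull_hazard(x, 1.5, 2.0) for x in grid]   # alpha > 1
decreasing = [weibull_hazard(x, 0.7, 2.0) for x in grid]   # alpha < 1
constant = [weibull_hazard(x, 1.0, 2.0) for x in grid]     # alpha = 1 gives 1/beta
```

The three lists show the increasing, decreasing, and constant hazards that correspond to $\alpha > 1$, $\alpha < 1$, and $\alpha = 1$, respectively.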


APPENDIX C. SOME RATS PROGRAMS FOR DURATION MODELS

The data used are adjusted time durations of intraday transactions of IBM stock from
November 1 to November 9, 1990. The file name is “ibm1to5.dat” and it has 3534
observations.


A. Program for Estimating a WACD(1, 1) Model
all 0 3534:1
open data ibm1to5.dat
data(org=obs) / x r1
set psi = 1.0
nonlin a0 a1 b1 al
frml gvar = a0+a1*x(t-1)+b1*psi(t-1)
frml gma = %LNGAMMA(1.0+1.0/al)
frml gln =al*gma(t)+log(al)-log(x(t)) $
+al*log(x(t)/(psi(t)=gvar(t)))-(exp(gma(t))*x(t)/psi(t))**al


smpl 2 3534
compute a0 = 0.2, a1 = 0.1, b1 = 0.1, al = 0.8
maximize(method=bhhh,recursive,iterations=150) gln
set fv = gvar(t)
set resid = x(t)/fv(t)
set residsq = resid(t)*resid(t)
cor(qstats,number=20,span=10) resid
cor(qstats,number=20,span=10) residsq
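
For readers without access to RATS, the log likelihood evaluated by the preceding program can be sketched in Python. This is our own transcription of the formulas `gvar` and `gln` under stated assumptions, not part of the original programs: the function name `wacd_loglik` is hypothetical, the conditional duration is started at 1.0 as in `set psi = 1.0`, and no optimizer is included, so the BHHH maximization step is not reproduced.

```python
import math

def wacd_loglik(x, a0, a1, b1, al, psi0=1.0):
    """Log likelihood of a WACD(1,1) model for adjusted durations x[0], x[1], ...

    psi_t = a0 + a1 * x_{t-1} + b1 * psi_{t-1}, with standardized Weibull
    innovations of shape al. The sum starts at the second observation,
    mirroring 'smpl 2 3534' in the RATS program.
    """
    gma = math.lgamma(1.0 + 1.0 / al)       # ln Gamma(1 + 1/al)
    psi = psi0
    loglik = 0.0
    for t in range(1, len(x)):
        psi = a0 + a1 * x[t - 1] + b1 * psi  # conditional expected duration
        ratio = x[t] / psi
        loglik += (al * gma + math.log(al) - math.log(x[t])
                   + al * math.log(ratio)
                   - (math.exp(gma) * ratio) ** al)
    return loglik

# Degenerate check: with al = 1, a1 = b1 = 0 and a0 = 1 the model is iid
# standard exponential, so the log likelihood is minus the sum of x[1:].
example_value = wacd_loglik([1.0, 0.5, 2.0], 1.0, 0.0, 0.0, 1.0)
```

The returned value could be passed to any general-purpose numerical optimizer, with the constraint that the parameters keep every `psi` positive.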



B. Program for Estimating a GACD(1, 1) Model
all 0 3534:1
open data ibm1to5.dat
data(org=obs) / x r1
set psi = 1.0
nonlin a0 a1 b1 al ka
frml cv = a0+a1*x(t-1)+b1*psi(t-1)
frml gma = %LNGAMMA(ka)
frml lam = exp(gma(t))/exp(%LNGAMMA(ka+(1.0/al)))
frml xlam = x(t)/(lam(t)*(psi(t)=cv(t)))
frml gln =-gma(t)+log(al/x(t))+ka*al*log(xlam(t))-(xlam(t))**al
smpl 2 3534
compute a0 = 0.238, a1 = 0.075, b1 = 0.857, al = 0.5, ka = 4.0
nlpar(criterion=value,cvcrit=0.00001)
maximize(method=bhhh,recursive,iterations=150) gln
set fv = cv(t)
set resid = x(t)/fv(t)
set residsq = resid(t)*resid(t)
cor(qstats,number=20,span=10) resid
cor(qstats,number=20,span=10) residsq



C. Program for Estimating a TAR-WACD(1, 1) Model. The threshold 3.79 is
prespecified.
all 0 3534:1
open data ibm1to5.dat
data(org=obs) / x rt
set psi = 1.0
nonlin a1 a2 al b0 b2 bl
frml u = ((x(t-1)-3.79)/abs(x(t-1)-3.79)+1.0)/2.0
frml cp1 = a1*x(t-1)+a2*psi(t-1)
frml gma1 = %LNGAMMA(1.0+1.0/al)
frml cp2 = b0+b2*psi(t-1)
frml gma2 = %LNGAMMA(1.0+1.0/bl)
frml cp = cp1(t)*(1-u(t))+cp2(t)*u(t)
frml gln1 =al*gma1(t)+log(al)-log(x(t)) $
+al*log(x(t)/(psi(t)=cp(t)))-(exp(gma1(t))*x(t)/psi(t))**al
frml gln2 =bl*gma2(t)+log(bl)-log(x(t)) $
+bl*log(x(t)/(psi(t)=cp(t)))-(exp(gma2(t))*x(t)/psi(t))**bl
frml gln = gln1(t)*(1-u(t))+gln2(t)*u(t)
smpl 2 3534
compute a1 = 0.2, a2 = 0.85, al = 0.9


compute b0 = 1.8, b2 = 0.5, bl = 0.8
maximize(method=bhhh,recursive,iterations=150) gln
set fv = cp(t)
set resid = x(t)/fv(t)
set residsq = resid(t)*resid(t)
cor(qstats,number=20,span=10) resid
cor(qstats,number=20,span=10) residsq




EXERCISES

1. Let $r_t$ be the log return of an asset at time $t$. Assume that $\{r_t\}$ is a Gaussian white
noise series with mean 0.05 and variance 1.5. Suppose that the probability of a
trade at each time point is 40% and is independent of $r_t$. Denote the observed
return by $r_t^o$. Is $r_t^o$ serially correlated? If yes, calculate the first three lags of
autocorrelations of $r_t^o$.
2. Let $P_t$ be the observed market price of an asset, which is related to the fundamental
value of the asset $P_t^*$ via Eq. (5.9). Assume that $\Delta P_t^* = P_t^* - P_{t-1}^*$ forms
a Gaussian white noise series with mean zero and variance 1.0. Suppose that the
bid-ask spread is two ticks. What is the lag-1 autocorrelation of the price change
series $\Delta P_t = P_t - P_{t-1}$ when the tick size is \$1/8? What is the lag-1
autocorrelation of the price change when the tick size is \$1/16?
3. The file “ibm-d2-dur.dat” contains the adjusted durations between trades of IBM
stock on November 2, 1990. The file has three columns consisting of day, time of
trade measured in seconds from midnight, and adjusted durations.
(a) Build an EACD model for the adjusted duration and check the fitted model.
(b) Build a WACD model for the adjusted duration and check the fitted model.
(c) Build a GACD model for the adjusted duration and check the fitted model.
(d) Compare the prior three duration models.
4. The file “mmm9912-dtp.dat” contains the transactions data of the stock of 3M
Company in December 1999. There are three columns: day of the month, time
of transaction in seconds from midnight, and transaction price. Transactions that
occurred after 4:00 pm Eastern time are excluded.
(a) Is there a diurnal pattern in 3M stock trading? You may construct a time series
$n_t$, which denotes the number of trades in each 5-minute time interval, to answer
this question.
(b) Use the price series to confirm the existence of bid-ask bounce in intraday
trading of 3M stock.
(c) Tabulate the frequencies of price change in multiples of tick size \$1/16. You
may combine changes with 5 ticks or more into a category and those with −5
ticks or beyond into another category.
5. Consider again the transactions data of 3M stock in December 1999.
(a) Use the data to construct an intraday 5-minute log return series. Use the simple
average of all transaction prices within a 5-minute interval as the stock
price for the interval. Is the series serially correlated? You may use Ljung–Box
statistics to test the hypothesis with the first 10 lags of the sample
autocorrelation function.
(b) There are seventy-seven 5-minute returns in a normal trading day. Some
researchers suggest that the sum of squares of the intraday 5-minute returns
can be used as a measure of daily volatility. Apply this approach and calculate
the daily volatility of the log return of 3M stock in December 1999. Discuss
the validity of such a procedure to estimate daily volatility.
6. The file “mmm9912-adur.dat” contains an adjusted intraday trading duration of
3M stock in December 1999. There are thirty-nine 10-minute time intervals in
a trading day. Let $d_i$ be the average of all log durations for the $i$th 10-minute
interval across all trading days in December 1999. Define an adjusted duration as
$t_j / \exp(d_i)$, where $j$ is in the $i$th 10-minute interval. Note that more sophisticated
methods can be used to adjust the diurnal pattern of trading duration. Here we
simply use a local average.
(a) Is there a diurnal pattern in the adjusted duration series? Why?
(b) Build a duration model for the adjusted series using exponential innovations.
Check the fitted model.
(c) Build a duration model for the adjusted series using Weibull innovations.
Check the fitted model.
(d) Build a duration model for the adjusted series using generalized Gamma
innovations. Check the fitted model.
(e) Compare and comment on the three duration models built before.


REFERENCES

Campbell, J. Y., Lo, A. W., and MacKinlay, A. C. (1997), The Econometrics of Financial
Markets, Princeton University Press: New Jersey.
Cho, D., Russell, J. R., Tiao, G. C., and Tsay, R. S. (2000), “The magnet effect of price limits:
Evidence from high frequency data on Taiwan stock exchange,” Working paper, Graduate
School of Business, University of Chicago.
Engle, R. F., and Russell, J. R. (1998), “Autoregressive conditional duration: A new model for
irregularly spaced transaction data,” Econometrica, 66, 1127–1162.
Ghysels, E. (2000), “Some econometric recipes for high-frequency data cooking,” Journal of
Business and Economic Statistics, 18, 154–163.
Hasbrouck, J. (1992), Using the TORQ database, Stern School of Business, New York
University.
Hasbrouck, J. (1999), “The dynamics of discrete bid and ask quotes,” Journal of Finance, 54,
2109–2142.
Hausman, J., Lo, A., and MacKinlay, C. (1992), “An ordered probit analysis of transaction
stock prices,” Journal of Financial Economics, 31, 319–379.
Lo, A., and MacKinlay, A. C. (1990), “An econometric analysis of nonsynchronous trading,”
Journal of Econometrics, 45, 181–212.
McCulloch, R. E., and Tsay, R. S. (2000), “Nonlinearity in high frequency data and hierarchical
models,” Working paper, Graduate School of Business, University of Chicago.
Roll, R. (1984), “A simple implicit measure of the effective bid-ask spread in an efficient
market,” Journal of Finance, 39, 1127–1140.
Rydberg, T. H., and Shephard, N. (1998), “Dynamics of trade-by-trade price movements:
decomposition and models,” Working paper, Nuffield College, Oxford University.
Stoll, H., and Whaley, R. (1990), “Stock market structure and volatility,” Review of Financial
Studies, 3, 37–71.
Wood, R. A. (2000), “Market microstructure research databases: History and projections,”
Journal of Business & Economic Statistics, 18, 140–145.
Zhang, M. Y., Russell, J. R., and Tsay, R. S. (2001), “A nonlinear autoregressive conditional
duration model with applications to financial transaction data,” Journal of Econometrics
(to appear).
Zhang, M. Y., Russell, J. R., and Tsay, R. S. (2001b), “Determinants of bid and ask quotes
and implications for the cost of trading,” Working paper, Graduate School of Business,
University of Chicago.
6
Switching Regime Volatility:
An Empirical Evaluation

BRUNO B. ROCHE AND MICHAEL ROCKINGER


ABSTRACT
Markov switching models are one possible method to account for volatility clustering.
This chapter aims at describing, in a pedagogical fashion, how to estimate a univariate
switching model for daily foreign exchange returns which are assumed to be drawn
in a Markovian way from alternative Gaussian distributions with different means and
variances. An application shows that the US dollar/Deutsche Mark exchange rate can be
modelled as a mixture of normal distributions with changes in volatility, but not in mean,
where regimes with high and low volatility alternate. The usefulness of this methodology
is demonstrated in a real life application, i.e. through the performance comparison of
simple hedging strategies.

6.1 INTRODUCTION
Volatility clustering is a well-known and well-documented feature of financial market
rates of return. The seminal approach proposed by Engle (1982), with the ARCH model,
followed several years later by Bollerslev (1986), with the GARCH models, led to a huge
literature on this subject in the last decade. This very successful approach assumes that
volatility changes over time in an autoregressive fashion. There are several excellent books
and surveys dealing with this subject. To quote a few, Bollerslev et al. (1992, 1993), Bera
and Higgins (1993), Engle (1995) and Gouriéroux (1997) provide a broad overview of the
theoretical developments, the generalisation of the models and the application to specific
markets. ARCH models provide a parsimonious description for volatility clustering where
volatility is assumed to be a deterministic function of past observations.
However, ARCH models struggle to account for the stylised fact that volatility can
exhibit discrete, abrupt and somewhat persistent changes. In the late 1980s, Hamilton
(1989) proposed an alternative methodology, the Markovian switching model, which
encountered great success. Although initiated by Quandt (1958) and Goldfeld and Quandt
(1973, 1975) to provide a description of markets in disequilibrium, this approach did
not attract much interest until the works of Hamilton (1989) on business cycle
modelling, and of Engel and Hamilton (1990) on exchange rates. The main feature of
this approach is that it involves multiple structures and allows returns to be drawn from
distinct distributions.
Applied Quantitative Methods for Trading and Investment. Edited by C.L. Dunis, J. Laws and P. Naïm
© 2003 John Wiley & Sons, Ltd ISBN: 0-470-84885-5

The change of regime between the distributions is determined in a Markovian manner.
It is driven by an unobservable state variable that follows a first-order Markov chain which
can take values of {0, 1}. The value of that variable is dependent upon its past values.
The switching mechanism thus enables complex dynamic structures to be captured and
allows for frequent changes at random times. In that way, a structure may persist for a
period of time and then be replaced by another structure after a switch occurs.
This methodology is nowadays very popular in the field of nonlinear time series models
and it has found a wide range of applications in the analysis of financial time
series. Although the original Markov switching models focused on the modelling of the
first moment with application to economic and financial time series, see e.g. Hamilton
(1988, 1989), Engel and Hamilton (1990), Lam (1990), Goodwin (1993), Engel (1994),
Kim and Nelson (1998), among others, a growing body of literature is developing with
regard to the application of this technique and its variants to volatility modelling.
Examples include Hamilton and Lin (1996), Dueker (1997) and Ramchand
and Susmel (1998). Gray (1996) models switches in interest rates. Chesnay and Jondeau
(2001) model switches of multivariate dependency.
In this chapter we present the switching methodology in a pedagogical framework
and in a way that may be useful for the financial empiricist. In Section 6.2 the notations
and the switching model are introduced. In Section 6.3 we develop the maximum
likelihood estimation methodology and show how the switching model can be estimated.
This methodology is applied in Section 6.4 to the US dollar (USD)/Deutsche Mark (DEM)
exchange rate for the period 1 March 1995 to 1 March 1999. In that section it is shown how
estimation results are to be interpreted, and how endogenously detected changes between
states can improve the performances of simple real life hedging strategies. Section 6.5
concludes and hints at further lines of research.

6.2 THE MODEL
We assume, in this chapter, that foreign exchange returns$^1$ are a mixture of normal
distributions. This means that returns are drawn from a normal distribution where the mean
and variance can take different values depending on the “state” a given return belongs to.
Since there is pervasive evidence that variance is persistent, yet little is known about its
mean, it is useful to consider the more restrictive mixture model where just the variance
can switch. This leads us to introduce the following model, based on Hamilton (1994):

$$R_t = \mu + [\sigma_1 S_t + \sigma_0 (1 - S_t)]\,\epsilon_t$$

where $\epsilon_t$ are independent and identically distributed normal innovations with mean 0 and
variance 1. $S_t$ is a Markov chain with values 0 and 1, and with transition probabilities
$p = [p_{00}, p_{01}, p_{10}, p_{11}]'$ such that:

$$\Pr[S_t = 1 \mid S_{t-1} = 1] = p_{11} \qquad \Pr[S_t = 0 \mid S_{t-1} = 1] = p_{01}$$
$$\Pr[S_t = 1 \mid S_{t-1} = 0] = p_{10} \qquad \Pr[S_t = 0 \mid S_{t-1} = 0] = p_{00}$$

where $p_{11} + p_{01} = 1$ and $p_{10} + p_{00} = 1$.
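
To get a feel for the model, it can help to simulate it. The following Python sketch draws a state path from the two-state Markov chain and then draws returns with the regime-specific standard deviation. All names, the fixed initial state, and the parameter values in the usage line are hypothetical choices for illustration, not estimates from this chapter.

```python
import random

def simulate_switching_returns(n, mu, sigma0, sigma1, p00, p11, s1=0, seed=0):
    """Simulate n returns R_t = mu + [sigma1*S_t + sigma0*(1 - S_t)] * eps_t.

    p00 and p11 are the probabilities of staying in state 0 and state 1;
    s1 is the (assumed known) state of the first observation.
    """
    rng = random.Random(seed)
    state = s1
    states, returns = [], []
    for t in range(n):
        if t > 0:
            stay = p11 if state == 1 else p00
            if rng.random() >= stay:          # leave the current regime
                state = 1 - state
        sigma = sigma1 if state == 1 else sigma0
        returns.append(mu + sigma * rng.gauss(0.0, 1.0))
        states.append(state)
    return returns, states

# Illustrative values: a persistent low-volatility and high-volatility regime.
returns, states = simulate_switching_returns(5000, 0.0, 0.5, 1.5, 0.95, 0.90)
```

With staying probabilities close to one, the simulated series shows long calm stretches interrupted by bursts of large returns, which is the volatility clustering that the model is meant to capture.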

$^1$ Returns are calculated as the difference in the natural logarithm of the exchange rate value $S_t$ for two
consecutive observations: $R_t = 100[\ln(S_t) - \ln(S_{t-1})]$. This corresponds to continuously compounded returns.
Let

$$\rho_j = \Pr[S_1 = j] \quad \forall j$$

be the unconditional probability of being in a certain state at time 1 and let $\rho = [\rho_0, \rho_1]'$.
If $\lambda$ designates the vector of all remaining parameters, $\lambda = [\mu, \sigma_1, \sigma_0]'$, then we can
define $\theta = [\lambda', p', \rho']'$, the vector of all parameters.$^2$
In the following we use the notation

$$\underline{R}_t = [R_t, R_{t-1}, \ldots, R_1]$$

to designate the vector of realisations of past returns.
It is also useful to introduce the density of $R_t$ conditional on regime $S_t$: $f(R_t \mid S_t)$. For
the model considered, in the case where $\epsilon_t$ is normally distributed, this density can be
written as:$^3$

$$f(R_t \mid S_t; \theta) = \frac{1}{\sqrt{2\pi}\,[\sigma_1 S_t + \sigma_0 (1 - S_t)]} \exp\left\{ -\frac{1}{2} \left[ \frac{R_t - \mu}{\sigma_1 S_t + \sigma_0 (1 - S_t)} \right]^2 \right\} \tag{6.1}$$

This illustrates that for a given parameter vector $\theta$ and for a given state $S_t$, the density
of returns can be written in a straightforward manner. Expression (6.1) shows that
the conditional density depends only on the current regime $S_t$ and not on past ones. It
should also be noted that, due to the Markovian character of the regimes, the information
contained in $\underline{R}_{t-1}$ is summarised in $S_t$.

6.3 MAXIMUM LIKELIHOOD ESTIMATION
The likelihood is

L = f (R T ; θ ) = f (RT |R T ’1 ; θ )f (RT ’1 |R T ’2 ; θ ) · · · f (R2 |R 1 ; θ )f (R1 ; θ ) (6.2)

and we wish to obtain the maximum likelihood estimate

θ ∈ arg max(θ ) ln f (RT ; θ )

In order to apply a maximum likelihood procedure on (6.2), it is necessary to introduce
the states St so that expression (6.1) can be used. To see how this can be done suppose
that θ is given and consider a typical element of the likelihood which can be developed
by using obvious probabilistic rules:
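$$f(R_t \mid \underline{R}_{t-1}; \theta) \equiv \frac{f(\underline{R}_t; \theta)}{f(\underline{R}_{t-1}; \theta)}$$

$$f(R_t \mid \underline{R}_{t-1}; \theta) = \frac{\sum_{S_t=0}^{1} f(\underline{R}_t, S_t; \theta)}{f(\underline{R}_{t-1}; \theta)}$$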

² Notice that there is a link between $\rho$ and $p$, as we will see later on.
³ Notice that densities (associated with continuous random variables) are written as $f(\cdot)$ and probabilities (associated with discrete random variables) as $\Pr[\cdot, \cdot]$.
196 Applied Quantitative Methods for Trading and Investment
$$f(R_t \mid \underline{R}_{t-1}; \theta) = \frac{\sum_{S_t=0}^{1} f(R_t \mid \underline{R}_{t-1}, S_t; \theta)\, f(\underline{R}_{t-1}, S_t; \theta)}{f(\underline{R}_{t-1}; \theta)}$$

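Moreover:

$$f(R_t \mid \underline{R}_{t-1}; \theta) = \sum_{S_t=0}^{1} f(R_t \mid S_t; \theta)\, \Pr[S_t \mid \underline{R}_{t-1}; \theta] \qquad (6.3)$$

where the last equality follows from (i) the Markovian character of the problem, whereby the knowledge of $S_t$ summarises the entire history $\underline{R}_{t-1}$ so that $f(R_t \mid \underline{R}_{t-1}, S_t; \theta) = f(R_t \mid S_t; \theta)$, and (ii) $f(\underline{R}_{t-1}, S_t; \theta)/f(\underline{R}_{t-1}; \theta) = \Pr[S_t \mid \underline{R}_{t-1}; \theta]$.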
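We also have:

$$\begin{aligned}
\Pr[S_t \mid \underline{R}_{t-1}; \theta] &= \sum_{S_{t-1}=0}^{1} \frac{\Pr[S_t, S_{t-1}, \underline{R}_{t-1}; \theta]}{f(\underline{R}_{t-1}; \theta)} \\
&= \sum_{S_{t-1}=0}^{1} \frac{\Pr[S_t \mid S_{t-1}, \underline{R}_{t-1}; \theta]\, \Pr[S_{t-1}, \underline{R}_{t-1}; \theta]}{f(\underline{R}_{t-1}; \theta)} \qquad (6.4) \\
&= \sum_{S_{t-1}=0}^{1} \Pr[S_t \mid S_{t-1}; \theta]\, \Pr[S_{t-1} \mid \underline{R}_{t-1}; \theta]
\end{aligned}$$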


The last equality follows from the fact that $\Pr[S_t \mid S_{t-1}, \underline{R}_{t-1}; \theta] = \Pr[S_t \mid S_{t-1}; \theta]$ by the assumption that states evolve according to a first-order Markov process.
Using Bayes' formula it follows that:

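$$\begin{aligned}
\Pr[S_{t-1} \mid \underline{R}_{t-1}; \theta] &= \frac{\Pr[S_{t-1}, \underline{R}_{t-1}; \theta]}{f(\underline{R}_{t-1}; \theta)} \\
&= \frac{f(R_{t-1}, S_{t-1}, \underline{R}_{t-2}; \theta)}{\sum_{S_{t-1}=0}^{1} f(\underline{R}_{t-1}, S_{t-1}; \theta)} \qquad (6.5) \\
&= \frac{f(R_{t-1} \mid S_{t-1}; \theta)\, \Pr[S_{t-1} \mid \underline{R}_{t-2}; \theta]}{\sum_{S_{t-1}=0}^{1} f(R_{t-1} \mid S_{t-1}; \theta)\, \Pr[S_{t-1} \mid \underline{R}_{t-2}; \theta]}
\end{aligned}$$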


Hence, at time $t-1$, $f(R_{t-1} \mid S_{t-1}; \theta)$, which is defined in equation (6.1), shows up in a natural fashion. If we assume that we know $\Pr[S_{t-1} \mid \underline{R}_{t-2}; \theta]$, then it becomes possible, using equation (6.5), to compute $\Pr[S_{t-1} \mid \underline{R}_{t-1}; \theta]$ and, from equation (6.4), to derive the conditional probability of $S_t$ given $\underline{R}_{t-1}$. $\Pr[S_t \mid \underline{R}_{t-1}; \theta]$ can therefore be computed, for all $t$, in a recursive fashion.
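The recursion can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the authors' Gauss program: the names `normal_pdf` and `hamilton_filter` are hypothetical, the starting probability `rho0` (for state 0) is passed in by the caller, and the transition convention is $p_{ij} = \Pr[S_t = i \mid S_{t-1} = j]$ as above.

```python
import math


def normal_pdf(r, mu, sigma):
    """Normal density, i.e. equation (6.1) for one fixed regime volatility."""
    z = (r - mu) / sigma
    return math.exp(-0.5 * z * z) / (math.sqrt(2.0 * math.pi) * sigma)


def hamilton_filter(returns, mu, sigma0, sigma1, p00, p11, rho0):
    """Return (log-likelihood, predicted Pr[S_t=0 | past], filtered Pr[S_t=0 | current])."""
    pred0 = rho0  # Pr[S_1 = 0], the starting value of the recursion
    loglik, predicted, filtered = 0.0, [], []
    for r in returns:
        # Equation (6.3): weight each regime density by its predicted probability.
        f0 = normal_pdf(r, mu, sigma0) * pred0
        f1 = normal_pdf(r, mu, sigma1) * (1.0 - pred0)
        dens = f0 + f1
        loglik += math.log(dens)
        # Equation (6.5): Bayes update gives the filtered probability.
        filt0 = f0 / dens
        predicted.append(pred0)
        filtered.append(filt0)
        # Equation (6.4): propagate one step ahead with the transition probabilities.
        pred0 = p00 * filt0 + (1.0 - p11) * (1.0 - filt0)
    return loglik, predicted, filtered
```

Summing the logs of the one-step densities gives the log of the likelihood (6.2), so the same loop yields both the objective function and the state probabilities.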
The starting value for the probabilities $\Pr[S_1 = j \mid \underline{R}_0; \theta] = \Pr[S_1 = j; \theta] = \rho_j$ can be either estimated directly as additional parameters in the maximum likelihood estimation,
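or approximated by the steady state probabilities which have to verify

$$\Pr[S_1 = i; \theta] = \sum_{j=0}^{1} \Pr[S_1 = j; \theta]\, p_{ij}$$

$$\Rightarrow \quad \Pr[S_t = 1; \theta] = \frac{1 - p_{00}}{2 - p_{11} - p_{00}} \quad \text{and} \quad \Pr[S_t = 0; \theta] = \frac{1 - p_{11}}{2 - p_{11} - p_{00}} \qquad (6.6)$$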
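The step from the stationarity condition to the closed form (6.6) can be written out as follows. The shorthand $\pi_i = \Pr[S_t = i; \theta]$ is introduced here only for brevity and does not appear in the original text.

```latex
\begin{aligned}
\pi_1 &= p_{10}\,\pi_0 + p_{11}\,\pi_1,
  \qquad \pi_0 = 1 - \pi_1, \quad p_{10} = 1 - p_{00} \\
\pi_1 (1 - p_{11}) &= (1 - p_{00})(1 - \pi_1) \\
\pi_1 (2 - p_{11} - p_{00}) &= 1 - p_{00}
  \;\Longrightarrow\;
  \pi_1 = \frac{1 - p_{00}}{2 - p_{11} - p_{00}}, \qquad
  \pi_0 = 1 - \pi_1 = \frac{1 - p_{11}}{2 - p_{11} - p_{00}}
\end{aligned}
```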

One realises, at this stage, that the likelihood for a given $\theta$ can be obtained by iterating on equation (6.3), which involves the computation of (6.4). As a by-product, the computation of (6.4) involves (6.5), which gives the filtered probabilities of being in a given state conditional on all currently available information, $\Pr[S_t \mid \underline{R}_t; \theta]$. Forecasts of states can also be easily obtained by iterating on the transition probabilities.
Using standard numerical methods, this procedure allows for a fast computation of the estimates.
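Forecasting the state by iterating on the transition probabilities might look like the sketch below. The function name `forecast_state_prob` is hypothetical, and the starting point is assumed to be a filtered probability of state 0.

```python
def forecast_state_prob(prob0, p00, p11, steps):
    """Probability of state 0 `steps` periods ahead, starting from prob0."""
    for _ in range(steps):
        # Stay in state 0 with probability p00, or arrive from state 1 with 1 - p11.
        prob0 = p00 * prob0 + (1.0 - p11) * (1.0 - prob0)
    return prob0
```

As the horizon grows, the forecast converges to the steady state probability of state 0 given in (6.6), whatever the starting value.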

6.4 AN APPLICATION TO FOREIGN EXCHANGE RATES
Before we develop in detail one application of the switching model to the foreign exchange
markets, we wish to start this section with a brief overview of the functioning of these
markets as they offer notable features.

6.4.1 Features of the foreign exchange interbank market
It is interesting to note that, in contrast to other exchange markets, the interbank foreign exchange (also called forex) market has no geographical limitations, since currencies are traded all over the world, and no trading-hours scheme, since currencies are traded around the clock. It is truly a 24-hours-a-day, 7-days-a-week market.
Another notable feature is that, again in contrast to other exchange markets, forex traders negotiate deals and agree transactions over the telephone, with trading prices and volumes not being known to third parties. The tick quotes are provided by market-makers and conveyed to the data subscribers' terminals. They are meant to be indicative, providing a general indication of where an exchange rate stands at a given time. Though not necessarily representing the actual rate at which transactions really take place, these indicative quotes are regarded as fairly accurate and as matching the true prices experienced in the market. Moreover, in order to avoid dealing with the bid–ask bounce inherent to most high-frequency data (see, for instance, chapter 3 in Campbell et al. (1997)), the estimation of the switching model uses the bid series only, which is generally regarded as a more consistent set of observations.
In the following we will use, as an illustration, the USD/DEM exchange rate. The tick-by-tick quotes have been supplied by Reuters via Olsen & Associates. We will use daily quotes, which are arbitrarily taken on each working day at 10pm GMT (corresponding approximately to the closing of North American markets). We obviously could have used a different time of the day and/or different frequencies.
It is interesting to note that, in this high-frequency dataset, there are significant intraday, intraweek and intrayear seasonal patterns (see Figures 6.1 and 6.2), explained respectively by the time zone effect (in the case of the USD/DEM rate, the European and the US time zones are the most active ones), the low activity exhibited during weekends and some universal public holidays (e.g. Christmas, New Year). Some other factors such as the release of economic indicators by, amongst others, central banks may also induce seasonality in foreign exchange markets. Seasonalities are also investigated by Guillaume et al. (1997).

Figure 6.1 Intraday pattern USD/DEM (average number of transactions per hour). [Plot not reproduced; horizontal axis: Hour (GMT), 0 to 23; vertical axis: 0 to 900.]

Figure 6.2 Intraweek pattern USD/DEM (average number of transactions per day of the week). [Plot not reproduced; horizontal axis: Sun to Fri; vertical axis: 0 to 8000.]

Further descriptions of questions related to intraday data in forex markets can be found in Baillie and Bollerslev (1989), Goodhart and Figliuoli (1991), Müller et al. (1997) and Schnidrig and Würtz (1995), among others.

6.4.2 Descriptive statistics
In our empirical application of switching models, as previously said, we will use daily
observations of the USD/DEM exchange rate. We obtain these by sampling from the
tick-by-tick data, extracting those recorded at 10pm GMT, from October 1995 to October
1998, corresponding to 775 observations overall.
Table 6.1 displays the basic descriptive statistics for the USD/DEM foreign exchange
returns, for the tick-by-tick and daily data.
There is an enormous difference between the first two moments of the series, confirming the dramatic effect of time aggregation (see Ghysels et al. (1998)). As indicated in Table 6.1, daily returns are negatively skewed, but not significantly so. This suggests that there are some large negative values, but not enough of them to be statistically meaningful.
The series exhibits leptokurtosis (which means that the distribution of the data has thicker tails than a normal one) and heteroskedasticity (or volatility clustering), as shown by the Ljung–Box test on the squared returns. The latter is a well-known feature of financial rates of return: large price changes in magnitude, irrespective of sign, are likely to be followed by large price movements; small price changes in magnitude are likely to be followed by small price movements. Finally, the assumption of normality of the data is rejected at the 1% level of significance by both the Jarque–Bera and Kolmogorov–Smirnov tests (Table 6.1).
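For readers who want to reproduce this kind of diagnostic, the two statistics can be sketched with the standard textbook formulas. The text does not spell out the exact estimators behind Table 6.1, so the moment definitions below (biased sample moments) and the function names are assumptions, and the output need not match the table to the last digit.

```python
def jarque_bera(x):
    """Jarque-Bera statistic: n/6 * (skewness^2 + excess kurtosis^2 / 4)."""
    n = len(x)
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n
    m3 = sum((v - mean) ** 3 for v in x) / n
    m4 = sum((v - mean) ** 4 for v in x) / n
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2
    return n / 6.0 * (skew ** 2 + (kurt - 3.0) ** 2 / 4.0)


def ljung_box(x, lags):
    """Ljung-Box Q statistic: n(n+2) * sum over k of rho_k^2 / (n - k)."""
    n = len(x)
    mean = sum(x) / n
    denom = sum((v - mean) ** 2 for v in x)
    q = 0.0
    for k in range(1, lags + 1):
        # Sample autocorrelation at lag k.
        rho = sum((x[t] - mean) * (x[t - k] - mean) for t in range(k, n)) / denom
        q += rho ** 2 / (n - k)
    return n * (n + 2) * q
```

Applying `ljung_box` to the squared returns rather than the returns themselves gives the test for volatility clustering; with 20 lags the statistic is compared with the 5% critical value of 31.4 quoted in Table 6.1.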
It is well known from the literature on mixture of distributions that a mixture of normal
distributions can be leptokurtic. This suggests that daily USD/DEM exchange rate returns
are good candidates for being explained by switching among distributions.
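The leptokurtosis of a normal mixture is easy to check numerically. The sketch below assumes two normal components with a common mean and different standard deviations; the function name and the weights used are illustrative only.

```python
def mixture_kurtosis(w0, sigma0, sigma1):
    """Kurtosis of a two-component normal mixture with a common mean."""
    w1 = 1.0 - w0
    second = w0 * sigma0 ** 2 + w1 * sigma1 ** 2          # variance of the mixture
    fourth = 3.0 * (w0 * sigma0 ** 4 + w1 * sigma1 ** 4)  # fourth central moment
    return fourth / second ** 2
```

With equal volatilities the result is 3, the normal value; as soon as the two volatilities differ, the kurtosis exceeds 3. For example, equal weights with standard deviations 2 and 1 give 4.08.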

Table 6.1 Descriptive statistics of the daily and tick-by-tick returns

               No. of observations   Mean      Variance   Skewness   Kurtosis   Acf(1)    Acf(2)
Tick-by-tick   5 586 417             2.8E-08   4.17E-08   -0.09      13.34      -46.3%    0.4%
Daily          775                   2.0E-04   2.84E-05   -0.23      4.17

Ljung–Box (20 lags), critical value at 5% = 31.4
Daily returns            25.33
Squared daily returns    43.82*

Normality tests of daily returns
Jarque–Bera              58.68*
Kolmogorov–Smirnov       0.0607*

* Denotes parameter estimates statistically significant at the 1% level.

6.4.3 Model empirical results

We obtain the estimates for the Markov switching model in a recursive way via maximum likelihood, under normality for the errors and supposing two volatility regimes.
We use the Gauss software to estimate the model parameters. The program is made up of six parts which are fully described in the Appendix. While the first three sections deal with data loading, preparation and the inclusion of the relevant libraries, sections four and five compute the maximum likelihood estimates of the model's parameters. The software offers several optimisation algorithms; we choose the algorithm proposed by Berndt, Hall, Hall and Hausman (BHHH). The last section computes the filtered probabilities as described in equation (6.5) and the smoothed probabilities (i.e. the probabilities of being in a given state conditional on all currently available information at $t-1$: $\Pr[S_t \mid \underline{R}_{t-1}; \theta]$). Full details of the program are given in Appendix A.
Table 6.2 shows the estimates for the USD/DEM model. All coefficients are significant at the 5% level. The probability of staying in the higher volatility regime (i.e. $S_t = 0$) is 0.8049, which means that, on average, it lasts for about five days ($1/(1 - 0.8049) = 5.13$; see also Hamilton (1989)).
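The five-day figure comes from the mean of a geometric distribution: a regime that persists with probability $p$ each day ends on day $k$ with probability $p^{k-1}(1-p)$. The same arithmetic applied to the $p_{11}$ estimate reported in Table 6.2 gives the expected length of the low volatility regime; that second number is a worked illustration and is not stated in the original text.

```latex
\begin{aligned}
E[\text{duration of regime } 0]
  &= \sum_{k=1}^{\infty} k\, p_{00}^{\,k-1} (1 - p_{00})
   = \frac{1}{1 - p_{00}}
   = \frac{1}{1 - 0.8049} \approx 5.13 \text{ days} \\
E[\text{duration of regime } 1]
  &= \frac{1}{1 - p_{11}}
   = \frac{1}{1 - 0.6348} \approx 2.74 \text{ days}
\end{aligned}
```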


6.4.4 Model evaluation strategy

Model evaluation is carried out in two ways. Firstly, we test whether the model residuals are normal and uncorrelated; we also test whether standardised returns follow a normal distribution. This approach provides a common ground for statistically assessing the model performance. Our evaluation criterion consists of a thorough analysis of the residuals (i.e. the analysis and testing of the normality assumptions). For the latter, to make things easily reproducible, we have used the two common Jarque–Bera and Kolmogorov–Smirnov tests. Secondly, our evaluation also comprises checking the switching volatility model through its performance in a close-to-real-life hedging strategy.


6.4.5 Residuals analysis

We carry out a brief analysis of the residuals of the computed model. Strictly speaking, the term “residuals” is used here for the series of standardised returns (i.e. the returns series divided by the forecast volatilities). If the volatility captures the fluctuations of the market well, and the model's assumptions are valid, such residuals are expected to be normal.

Table 6.2 Markov switching model: empirical results

        Value     Std. error   t-Statistics   Pr(>t)
µ       0.0345    0.0171        2.023         0.0215
σ0      0.6351    0.0281       22.603         0.0000
σ1      0.2486    0.0313        7.934         0.0000
p00     0.8049    0.0620       12.976         0.0000
p11     0.6348    0.0927        6.848         0.0000
Table 6.3 Model residuals – basic statistics and normality tests

                          Daily returns   Model residuals
Mean                       0.0043          0.0485
Std. dev.                  0.9944          1.0804
Skewness                  -0.2455         -0.1344
Exc. kurtosis              1.2764          1.2228
Sample size                775

Ljung–Box (20 lags), critical value at 5% = 31.4
Std. residuals             25.33           23.6
Squared std. residuals     43.82*          16.9

Normality tests
Jarque–Bera                58.68*          49.18*
Kolmogorov–Smirnov         0.061*          0.057*

* Denotes parameter estimates statistically significant at the 1% level.


Figure 6.3 Probability plot of the Markov switching model. [Plot not reproduced; horizontal axis: quantiles of standard normal, -3 to 3; vertical axis: switching residuals, -4 to 4.]


Table 6.3 presents the basic summary statistics and normality tests for the standardised
log-returns and the standardised residuals/returns computed from the model. Figure 6.3
shows the normal score plot for the standardised returns from the model.
Here the normal score plot is used to assess whether the standardised residuals data
have a Gaussian distribution. If that is the case, then the plot will be approximately a
straight line. The extreme points have more variability than points towards the centre. A
plot that is bent down on the left and bent up on the right means that the data have longer
tails than the Gaussian.
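The points of a normal score plot can be generated with the standard library alone. The sketch below assumes the plotting positions $(i - 0.5)/n$, which is one common convention and not necessarily the one used for Figure 6.3; the function name is illustrative.

```python
from statistics import NormalDist


def normal_scores(residuals):
    """Pair each ordered residual with the matching standard normal quantile."""
    n = len(residuals)
    ordered = sorted(residuals)
    std_normal = NormalDist()  # mean 0, standard deviation 1
    # enumerate starts at 0, so (i + 0.5) / n equals (rank - 0.5) / n for rank = i + 1.
    return [(std_normal.inv_cdf((i + 0.5) / n), value)
            for i, value in enumerate(ordered)]
```

Plotting the second element of each pair against the first gives the normal score plot; points falling close to a straight line are consistent with Gaussian residuals.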
The striking feature is that the model captures the heteroskedasticity of the underlying time series fairly well: the Ljung–Box statistic on the squared residuals falls below its critical value (Table 6.3), so the standardised residuals can be regarded as homoskedastic.
Having said that, the switching model residuals do not follow a normal distribution. Both the Jarque–Bera and the Kolmogorov–Smirnov normality tests enable us to reject the hypothesis that the residuals follow a normal distribution. Although this does not invalidate the switching model, it highlights the fact that nonlinearities still exist in the residuals that the switching model did not manage to capture.


6.4.6 Model evaluation with a simple hedging strategy

In this section we show how filtered volatility estimates can be combined with technical trend-following systems to improve the performance of these systems.
The negative relationship between the performance of trend-following trading systems and the level of volatility in foreign exchange markets is a well-known empirical finding. In other words, trending periods in the forex markets tend to occur in relatively quiet (i.e. low volatility) periods.
We here compare hedging strategies using trend-following systems with similar systems combined with Markov switching filtered volatility.


6.4.6.1 Trend-following moving average models

As described by Müller (1995), trend-following systems based on moving average models are well-known technical solutions, easy to use, and widely applied for actively hedging foreign exchange rates.
The moving average (MA) is a useful tool to summarise the past behaviour of a time series at any given point in time. In the following example, MAs are used in the form of momenta, that is, the difference between the current time series value and an MA. MAs can be defined with different weighting functions of their summation. The choice of the weighting function has a key influence on the success of the MA in its application.
Among the MAs, the exponentially weighted moving average (EMA) plays an important role. Its weighting function declines exponentially with the time distance of the past observations from now. The sequential computation of EMAs along a time series is simple as it relies upon a recursion formula. For time series with a strong random element, however, the steep shape of the exponential function gives heavy weights to the very recent past and hence to short-term noise in the time series. This is a reason why other MA weighting functions have been found worthy of interest in empirical applications. The following subsection presents two families of MA weighting functions. Both families can be developed with repeated applications of the MA/EMA operator.

6.4.6.2 Moving average definitions
A moving average of the time series $x$ is a weighted average of the series of elements of the past up to now:

$$\mathrm{MA}_{x,w;n} \equiv \frac{\sum_{j=-\infty}^{n} w_{n-j}\, x_j}{\sum_{j=-\infty}^{n} w_{n-j}} \qquad (6.7)$$


where $w_k$ is a series of weights independent of $n$.
A fundamental property of a moving average is its range $r$ (or centre of gravity of the weighting function $w_k$):

$$r \equiv \frac{\sum_{k=0}^{\infty} w_k\, k}{\sum_{k=0}^{\infty} w_k} \qquad (6.8)$$

The range r of a discrete time series is in units of the time series index, but unlike this
integer index it can be any positive real number.
EMAs have the following declining weights:

$$w_k \equiv \left( \frac{r}{r+1} \right)^{k} \qquad (6.9)$$

where $r$ is the centre of gravity. In the case of daily time series, $r = (d-1)/2$ ($d$ = number of days in moving average).
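That the parameter $r$ in (6.9) really is the centre of gravity defined in (6.8) can be verified with two geometric series. The shorthand $q = r/(r+1)$ is introduced here for the derivation only.

```latex
\begin{aligned}
\sum_{k=0}^{\infty} q^{k} &= \frac{1}{1-q}, \qquad
\sum_{k=0}^{\infty} k\, q^{k} = \frac{q}{(1-q)^{2}}, \qquad q = \frac{r}{r+1} \\
\frac{\sum_{k=0}^{\infty} w_k\, k}{\sum_{k=0}^{\infty} w_k}
  &= \frac{q}{1-q}
   = \frac{r/(r+1)}{1/(r+1)}
   = r
\end{aligned}
```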
An EMA can be computed by a recursion formula. If its value $\mathrm{EMA}_x(r, t_{n-1})$ at the previous series element $x_{n-1}$ is known, one can easily compute the value at $t_n$:

$$\mathrm{EMA}_x(r, t_n) = \mu\, \mathrm{EMA}_x(r, t_{n-1}) + (1 - \mu)\, x_n \quad \text{with} \quad \mu = \frac{r}{r+1} \qquad (6.10)$$

or expressed in number of days in the moving average:

$$\mu = 1 - \frac{2}{d+1} \quad \text{i.e.} \quad r = \frac{d-1}{2} \qquad (6.11)$$

The recursion needs an initial value to start with. There is usually no information before the first series element $x_1$, which is the natural choice for this initialisation:

$$\mathrm{EMA}_x(r, t_1) = x_1 \qquad (6.12)$$

The error made with this initialisation declines with the factor $[r/(r+1)]^{n-1}$.
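Equations (6.10) to (6.12) translate into a short loop. The function name `ema` and the use of a plain Python list are illustrative choices.

```python
def ema(series, days):
    """Exponentially weighted moving average of a list of observations."""
    mu = 1.0 - 2.0 / (days + 1.0)  # equation (6.11)
    values = [series[0]]           # equation (6.12): start at the first element
    for x in series[1:]:
        # Equation (6.10): blend the previous EMA with the new observation.
        values.append(mu * values[-1] + (1.0 - mu) * x)
    return values
```

For a 3-day EMA the weight is 0.5, so `ema([1.0, 2.0], 3)` gives `[1.0, 1.5]`.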

In many applications, the EMA will be used neither at $t_1$ nor in the initial phase after $t_1$, which is called the build-up time. After the build-up time, when the EMA is used, one expects to be (almost) free of initialisation errors.

6.4.6.3 Trading rules with EMA trading models
MA models summarise the past behaviour of a time series at any given point in time. This information is used to identify trends in financial markets in order to subsequently take positions according to the following rule:

If $x(t_n) > \mathrm{EMA}_x(r, t_n)$: go long (or stay long)
If $x(t_n) < \mathrm{EMA}_x(r, t_n)$: go short (or stay short)

with $x(t_n)$ the spot exchange rate at time $t_n$. Commission costs of 0.00025 DEM are charged for each trade. A trade is defined as switching from a long to a short position or vice versa.
An EMA model is defined by its type (i.e. MA or EMA) and, in the latter case, its centre of gravity (or number of MA days). There are obviously an infinite number of different combinations. Our analysis here is arbitrarily limited to one EMA type (i.e. 50 days or $\mu = 0.96$).
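The trading rule can be sketched as follows. The function names are hypothetical, positions are coded as +1 (long) and -1 (short), and the handling of an exact tie between price and EMA (keep the previous position) is an assumption, since the rule in the text does not cover that case.

```python
def ema_positions(prices, days):
    """Position per observation: +1 long, -1 short, 0 before the first signal."""
    mu = 1.0 - 2.0 / (days + 1.0)
    ema = prices[0]  # initialise the EMA at the first price
    position, positions = 0, []
    for n, x in enumerate(prices):
        if n > 0:
            ema = mu * ema + (1.0 - mu) * x
        if x > ema:
            position = 1
        elif x < ema:
            position = -1
        # If x equals the EMA, the previous position is kept (assumption).
        positions.append(position)
    return positions


def count_trades(positions):
    """A trade is a switch from long to short or vice versa."""
    held = [p for p in positions if p != 0]
    return sum(1 for a, b in zip(held, held[1:]) if a != b)
```

Multiplying the number of trades by the commission per trade gives the total cost to be deducted from the gross profit of the strategy.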

6.4.6.4 Trading rules of EMA trading models with volatility filters
It is a well-known empirical finding that trend-following systems tend to perform poorly when markets become volatile. One possible explanation lies in the fact that, most of the time, high volatility periods correlate with periods when prices change direction.
The previous rules are thus combined with the following new rules:

If volatility > volatility threshold T , then reverse position
(i.e. go long if current position is short and vice versa)
If volatility < volatility threshold T , keep position as indicated
by the underlying MA model
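One way to read these rules is that the filter reverses the position indicated by the underlying MA model whenever volatility is above the threshold. The sketch below implements that reading; the function name, the +1/-1 coding of positions and the treatment of volatility exactly equal to the threshold (position kept) are assumptions.

```python
def apply_volatility_filter(base_positions, volatilities, threshold):
    """Reverse the MA-model position whenever volatility exceeds the threshold."""
    filtered = []
    for position, vol in zip(base_positions, volatilities):
        # High volatility: take the opposite side; otherwise follow the MA model.
        filtered.append(-position if vol > threshold else position)
    return filtered
```

For example, base positions `[1, 1, -1]` with volatilities `[0.2, 0.7, 0.7]` and a threshold of 0.5 become `[1, -1, 1]`.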

6.4.6.5 Trading results
Table 6.4 shows the numerical results. Strategies based on the EMA model without volatility filtering lead to approximately zero profit (if commission costs are not included), which shows the inability of the EMA model considered here to detect profitable trends. The volatility filter trading strategy based on filtered probabilities, on the other hand, considerably improves the performance of the model. It is interesting to note that, despite the higher frequency of the trades (253 vs. 65), the performance after transaction costs is still significantly higher. Lastly, in terms of the risk/reward ratio, the latter model achieves a remarkable performance, although the average profit per annum is only moderate.
The file “Hedging System Simulation.xls” on the CD-Rom contains the raw forex data as well as the Markovian switching volatility and the moving average computations. The file is split into three worksheets.
Table 6.4 Summary results of trading strategies (including transaction costs)

                              Without volatility filter   With volatility filter
Total profit (%)              -4.9%                       15.4%
Average profit per annum      -1.6%                       5.1%
Maximum drawdown (a)          19.4%                       10.4%
Number of trades              65                          253
Risk/reward ratio (b)         -0.08                       0.49

(a) Drawdowns are defined as the difference between the maximum profit potentially realised to date and the profit (or loss) realised to date. The maximum drawdown is therefore the maximum value of the drawdowns observed in history. It is an estimate of the maximum loss the strategy would have incurred.
(b) Defined as the ratio average profit per annum/maximum drawdown.
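The two risk measures in Table 6.4 can be sketched as below. The function names are illustrative. The ratio is computed as average profit per annum divided by maximum drawdown, because that is the convention that reproduces the figures in the table (5.1/10.4 ≈ 0.49 and -1.6/19.4 ≈ -0.08).

```python
def max_drawdown(cumulative_profit):
    """Largest gap between the running peak of the profit curve and its current value."""
    peak = cumulative_profit[0]
    worst = 0.0
    for value in cumulative_profit:
        peak = max(peak, value)           # maximum profit realised to date
        worst = max(worst, peak - value)  # drawdown at this point in time
    return worst


def risk_reward(avg_profit_per_annum, maximum_drawdown):
    """Average profit per annum per unit of maximum drawdown."""
    return avg_profit_per_annum / maximum_drawdown
```

For the cumulative profit path `[0, 5, 3, 8, 2, 6]` the maximum drawdown is 6, the fall from the peak of 8 to the subsequent value of 2.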


The first worksheet (“USDDEM EMA System”) contains the basic computation of the EMA model and the trading model simulation. Columns A and B contain the date and the mid forex rate respectively. Column C contains the EMA computation. The EMA centre of gravity is input in cell C3. Column E contains the switching volatility figures. The computation of these figures is given in the third worksheet (“Backup”, column N); the formula is based on the filtered probabilities, i.e. $\Pr[S_t = 0 \mid \underline{R}_t]\sigma_0 + \Pr[S_t = 1 \mid \underline{R}_t]\sigma_1$.
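Outside the spreadsheet, the same probability-weighted volatility can be written as a one-line function. The name `switching_volatility` is hypothetical; the inputs are the filtered probability of state 0 and the two regime volatilities.

```python
def switching_volatility(prob_state0, sigma0, sigma1):
    """Filtered volatility: regime volatilities weighted by the filtered probabilities."""
    return prob_state0 * sigma0 + (1.0 - prob_state0) * sigma1
```

With the estimates of Table 6.2 (σ0 = 0.6351, σ1 = 0.2486), the figure ranges from 0.2486 when the filter is certain of the low volatility regime to 0.6351 when it is certain of the high volatility regime.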
