
2 ∂S

where we set the drift equal to r, which is extracted from MD*BASE and corresponds to the time to maturity used in the simulation, and N** is the

204 9 Trading on Deviations of Implied and Historical Densities

number of days to maturity. The first derivative of σ(·) is approximated by:

$$\frac{\partial \sigma}{\partial S}\left(S_{(i-1)/N^{**}}\right) = \frac{\sigma\left(S_{(i-1)/N^{**}}\right) - \sigma\left(S_{(i-1)/N^{**}} - \Delta S\right)}{\Delta S},$$

where ΔS is 1/2 of the width of the bingrid on which the diffusion function is estimated. Finally, the estimated diffusion function is linearly extrapolated at both ends of the bingrid to accommodate potential outliers.
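The backward difference and the linear extrapolation described above can be sketched as follows. This is a minimal illustration, not the XploRe implementation: the bin grid, the diffusion values and the function names are hypothetical, and the diffusion function is assumed to be piecewise linear between grid points.

```python
import numpy as np

def linear_extrapolating_interp(x, grid, values):
    """Piecewise-linear interpolation, extended linearly beyond both grid ends."""
    x = np.asarray(x, dtype=float)
    y = np.interp(x, grid, values)  # np.interp alone would clamp at the ends
    left_slope = (values[1] - values[0]) / (grid[1] - grid[0])
    right_slope = (values[-1] - values[-2]) / (grid[-1] - grid[-2])
    y = np.where(x < grid[0], values[0] + left_slope * (x - grid[0]), y)
    y = np.where(x > grid[-1], values[-1] + right_slope * (x - grid[-1]), y)
    return y

def sigma_derivative(s, grid, sigma_values):
    """Backward difference [sigma(s) - sigma(s - dS)] / dS, dS = half the bin width."""
    d_s = 0.5 * (grid[1] - grid[0])
    sigma_now = linear_extrapolating_interp(s, grid, sigma_values)
    sigma_back = linear_extrapolating_interp(np.asarray(s) - d_s, grid, sigma_values)
    return (sigma_now - sigma_back) / d_s

# Hypothetical inputs: an equidistant bin grid and a linear diffusion function.
grid = np.linspace(3000.0, 4000.0, 11)
sigma_values = 0.2 * grid
slopes = sigma_derivative(np.array([2900.0, 3500.0, 4100.0]), grid, sigma_values)
```

For a linear diffusion function the approximation returns the same slope inside and outside the grid, which is what the extrapolation is meant to guarantee for outlying simulated index values.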

With these ingredients we start the simulation with index value S_0 = 3328.41 (Monday, April 21, 1997), time to maturity τ = 88/360 and r = 3.23. The expiration date is Friday, July 18, 1997. From these simulated index values we calculate annualized log-returns which we take as input of the nonparametric density estimation (see equation (9.5)). The XploRe quantlet denxest accomplishes the estimation of the time series density by means of the Gaussian kernel function:

$$K(u) = \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{1}{2}u^{2}\right).$$

The bandwidth h_MC is computed by the XploRe quantlet denrot, which applies Silverman's rule of thumb.
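As a rough illustration of this step, the following sketch estimates a density with the Gaussian kernel and a rule-of-thumb bandwidth. It does not reproduce denxest or denrot: the bandwidth variant 1.06 · sd · n^(−1/5) is an assumption (the quantlet may use a different constant or a robust scale estimate), and the sample is synthetic stand-in data.

```python
import numpy as np

def gaussian_kernel(u):
    """Gaussian kernel K(u) = exp(-u^2 / 2) / sqrt(2 * pi)."""
    return np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)

def rule_of_thumb_bandwidth(x):
    """Assumed variant of Silverman's rule: 1.06 * sd * n^(-1/5)."""
    x = np.asarray(x, dtype=float)
    return 1.06 * x.std(ddof=1) * x.size ** (-0.2)

def kernel_density(x, grid, h):
    """Kernel density estimate of the sample x, evaluated on grid."""
    x = np.asarray(x, dtype=float)
    grid = np.asarray(grid, dtype=float)
    u = (grid[:, None] - x[None, :]) / h
    return gaussian_kernel(u).mean(axis=1) / h

# Synthetic stand-in for the simulated log-returns (not the data of the text).
rng = np.random.default_rng(0)
sample = rng.normal(loc=0.0, scale=0.2, size=2_000)
h_mc = rule_of_thumb_bandwidth(sample)
grid = np.linspace(-1.5, 1.5, 601)
density = kernel_density(sample, grid, h_mc)
```

A quick sanity check for such an estimate is that it integrates to approximately one over a grid that covers the sample.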

First of all, we calculate the optimum bandwidth h_MC given the vector of 10,000 simulated index values. Then we search for the bandwidth h'_MC which implies a variance of g* closest to the variance of f* (while still remaining within 0.5 to 5 times h_MC). We stop the search if var(g*) is within a range of 5% of var(f*). Next, we translate g* such that its mean matches the futures price F. Finally, we transform this density over DAX index values S_T into a density g*' over log-returns u_T. Since

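$$P(S_T < x) = P\left(\ln\frac{S_T}{S_t} < \ln\frac{x}{S_t}\right) = P(u_T < u),$$

where x = S_t e^u, we have

$$P(S_T \in [x, x + \Delta x]) = P(u_T \in [u, u + \Delta u])$$

and

$$P(S_T \in [x, x + \Delta x]) \approx g^{*}(x)\,\Delta x, \qquad P(u_T \in [u, u + \Delta u]) \approx g^{*\prime}(u)\,\Delta u.$$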

Therefore, we have as well (see Härdle and Simar (2002) for density transformation techniques)

$$g^{*\prime}(u) \approx g^{*}(S_t e^{u})\,\frac{\Delta(S_t e^{u})}{\Delta u} \approx g^{*}(S_t e^{u})\,S_t e^{u}.$$
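The change of variables from index levels to log-returns can be checked numerically. The sketch below applies g*'(u) = g*(S_t e^u) S_t e^u to a lognormal level density, for which the log-return density is known to be normal. The parameters m and s and the function names are hypothetical and chosen only for illustration.

```python
import numpy as np

def to_log_return_density(level_density, s_t, u):
    """Density over log-returns u implied by a density over index levels.

    Uses g'(u) = g(S_t * exp(u)) * S_t * exp(u).
    """
    x = s_t * np.exp(u)
    return level_density(x) * x

# Hypothetical parameters: log-returns with mean m and standard deviation s.
s_t, m, s = 3328.41, 0.01, 0.10

def lognormal_level_density(x):
    """Density of S_T = S_t * exp(u) when u is normal with mean m and sd s."""
    z = (np.log(x / s_t) - m) / s
    return np.exp(-0.5 * z**2) / (x * s * np.sqrt(2.0 * np.pi))

u = np.linspace(-0.5, 0.5, 1001)
g_u = to_log_return_density(lognormal_level_density, s_t, u)
normal_u = np.exp(-0.5 * ((u - m) / s) ** 2) / (s * np.sqrt(2.0 * np.pi))
```

The transformed density coincides with the normal density of the log-returns, as the factor S_t e^u exactly cancels the 1/x term of the lognormal density.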

To simplify notation, we will denote both densities by g*. Figure 9.2 displays the resulting time series density over log-returns on Friday, April 18, 1997. Proceeding in the same way for all 30 periods beginning in April 1997 and ending in September 1999, we obtain the time series of the 3 month 'forward' skewness and kurtosis values of g* shown in Figures 9.3 and 9.4. The figures reveal that the time series distribution is systematically slightly negatively skewed, with skewness very close to zero. As far as kurtosis is concerned, we can extract from Figure 9.4 that it is systematically smaller than, but nevertheless very close to, 3. Additionally, all time series density plots looked like the one shown in Figure 9.2.
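Skewness and kurtosis of a density given on a grid can be obtained from its standardized third and fourth central moments. The following sketch assumes a uniform grid and uses a simple Riemann sum; the function name and the test density (a normal density with standard deviation 0.1) are illustrative assumptions, not part of the original study.

```python
import numpy as np

def moments_from_density(u, g):
    """Mean, variance, skewness and kurtosis of a density g on a uniform grid u."""
    du = u[1] - u[0]
    w = g * du
    w = w / w.sum()  # renormalise to absorb discretisation error
    mean = np.sum(w * u)
    var = np.sum(w * (u - mean) ** 2)
    skew = np.sum(w * (u - mean) ** 3) / var**1.5
    kurt = np.sum(w * (u - mean) ** 4) / var**2
    return mean, var, skew, kurt

# Illustrative input: a normal density, for which skewness is 0 and kurtosis is 3.
u = np.linspace(-1.0, 1.0, 4001)
g = np.exp(-0.5 * (u / 0.1) ** 2) / (0.1 * np.sqrt(2.0 * np.pi))
mean, var, skew, kurt = moments_from_density(u, g)
```

Applied to an estimated density, values of the kurtosis below 3 correspond to the platykurtic shape reported for the time series density.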

9.4 Comparison of Implied and Historical SPD

At this point it is time to compare implied and historical SPDs. Since by construction expectation and variance are adjusted, we focus the comparison on skewness and kurtosis. Starting with skewness, we can extract from Figure 9.3 that, except for one period, the IBT implied SPD is systematically more negatively skewed than the time series SPD, a fact that is quite similar to what Aït-Sahalia, Wang and Yared (2000) already found for the S&P 500. The 3 month IBT implied SPD for Friday, September 17, 1999 is slightly positively skewed. This may be due to the fact that in the months preceding June 1999, the month in which the 3 month implied SPD was estimated, the DAX index stayed within a quite narrow horizontal range of index values after a substantial downturn in the 3rd quarter of 1998 (see Figure 9.11), and agents therefore possibly believed that index prices lower than the average were more likely to occur. However, this is the only case where skew(f*) > skew(g*).

Figure 9.3. Comparison of Skewness time series for 30 periods. [Plot "Skewness Comparison": skewness (-1.5 to 0.0) against time (07/18/97 to 12/17/99); TS = thin line, IBT = thick line.]

Figure 9.4. Comparison of Kurtosis time series for 30 periods. [Plot "Kurtosis Comparison": kurtosis (0.0 to 6.0) against time (07/18/97 to 12/17/99); TS = thin line, IBT = thick line.]

The kurtosis time series reveals a pattern similar to that of the skewness time series. Except for one period, the IBT SPD has systematically more kurtosis than the time series SPD. Again this feature is in line with what Aït-Sahalia, Wang and Yared (2000) found for the S&P 500. The 3 month IBT implied SPD for Friday, October 16, 1998 has a slightly smaller kurtosis than the time series SPD. That is, investors assigned less probability mass to high and low index prices. Note that the implied SPD was estimated in July 1998 after a period of 8 months of booming asset prices (see Figure 9.11). It is understandable that in such an environment high index prices seemed less likely to occur. Since low index prices seemed unlikely as well, agents apparently expected the DAX to move rather sideways.

9.5 Skewness Trades

In the previous section we learned that the implied and the time series SPDs reveal differences in skewness and kurtosis. In the following two sections, we investigate how to profit from this knowledge. In general, we are interested in which option to buy or to sell on the day on which both densities were estimated. We consider exclusively European call or put options.

Following Aït-Sahalia, Wang and Yared (2000), all strategies are designed such that we do not change the resulting portfolio until maturity, i.e. we keep all options until they expire. We use the following terms for moneyness, which we define as K/(S_t e^{(T−t)r}):

       Moneyness(FOTM Put)  < 0.90
0.90 ≤ Moneyness(NOTM Put)  < 0.95
0.95 ≤ Moneyness(ATM Put)   < 1.00
1.00 ≤ Moneyness(ATM Call)  < 1.05
1.05 ≤ Moneyness(NOTM Call) < 1.10
1.10 ≤ Moneyness(FOTM Call)

Table 9.1. Definitions of moneyness regions.

where FOTM, NOTM, ATM stand for far out-of-the-money, near out-of-the-money and at-the-money respectively.
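The moneyness regions of Table 9.1 translate directly into a small classification routine. The sketch below is only an illustration: the function names are hypothetical, and the interest rate is entered as the decimal 0.0323 on the assumption that the rate r = 3.23 quoted earlier is in percent.

```python
import math

def moneyness(strike, spot, rate, tau):
    """Moneyness K / (S_t * exp(r * tau)), i.e. strike relative to the forward."""
    return strike / (spot * math.exp(rate * tau))

def moneyness_region(m):
    """Region label following the boundaries of Table 9.1."""
    if m < 0.90:
        return "FOTM Put"
    if m < 0.95:
        return "NOTM Put"
    if m < 1.00:
        return "ATM Put"
    if m < 1.05:
        return "ATM Call"
    if m < 1.10:
        return "NOTM Call"
    return "FOTM Call"

# Assumed inputs: index level and maturity from the text, rate taken as 3.23%.
spot, rate, tau = 3328.41, 0.0323, 88 / 360
forward = spot * math.exp(rate * tau)
region_of_forward_strike = moneyness_region(moneyness(forward, spot, rate, tau))
```

Because every lower boundary is inclusive, a strike exactly at the forward price falls into the ATM Call region.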

A skewness trading strategy is supposed to exploit differences in skewness of two distributions by buying options in the range of strike prices where they are underpriced and selling options in the range of strike prices where they are overpriced. More specifically, if the implied SPD f* is less skewed (for example more negatively skewed) than the time series SPD g*, i.e. skew(f*) < skew(g*), we sell the whole range of strikes of OTM puts and buy the whole range of strikes of OTM calls (S1 trade). Conversely, if the implied SPD is more skewed, i.e. skew(f*) > skew(g*), we initiate the S2 trade by buying the whole range of strikes of OTM puts and selling the whole range of strikes of OTM calls. In both cases we keep the options until expiration.

Skewness s is a measure of asymmetry of a probability distribution. While s = 0 for a distribution symmetric around its mean, for an asymmetric distribution s > 0 indicates more weight to the left of the mean. Recall from option pricing theory the pricing equation for a European call option, Franke, Härdle and Hafner (2001):

$$C(S_t, K, r, T-t) = e^{-r(T-t)} \int_{0}^{\infty} \max(S_T - K, 0)\, f^{*}(S_T)\, dS_T, \qquad (9.6)$$

where f* is the implied SPD, we see that when the two SPDs are such that skew(f*) < skew(g*), agents apparently assign a lower probability to high outcomes of the underlying than would be justified by the time series density, see Figure 7.13. Since for call options only the right 'tail' of the support determines the theoretical price, the latter is smaller than the price implied by equation (9.6) using the time series density. That is, we buy underpriced calls. The same reasoning applies to European put options. Looking at the pricing equation for such an option:

$$P(S_t, K, r, T-t) = e^{-r(T-t)} \int_{0}^{\infty} \max(K - S_T, 0)\, f^{*}(S_T)\, dS_T, \qquad (9.7)$$

we conclude that prices implied by this pricing equation using f* are higher than the prices using the time series density. That is, we sell puts.
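The pricing equations (9.6) and (9.7) can be evaluated numerically for any density given on a grid. The sketch below is an illustration under stated assumptions: instead of an estimated SPD it uses a risk-neutral lognormal density as a stand-in for f*, so that the result can be compared with the Black-Scholes call price; all parameter values and function names are hypothetical.

```python
import numpy as np
from math import erf, exp, log, sqrt

def price_from_spd(payoff, density, grid, rate, tau):
    """Discounted expected payoff under a density given on a uniform grid."""
    ds = grid[1] - grid[0]
    return exp(-rate * tau) * np.sum(payoff(grid) * density(grid)) * ds

# Hypothetical market parameters (not taken from the text).
s_t, rate, tau, vol, strike = 100.0, 0.03, 0.25, 0.2, 105.0

def lognormal_spd(s):
    """Stand-in for f*: risk-neutral lognormal density of S_T."""
    mu = log(s_t) + (rate - 0.5 * vol**2) * tau
    z = (np.log(s) - mu) / (vol * sqrt(tau))
    return np.exp(-0.5 * z**2) / (s * vol * sqrt(2.0 * np.pi * tau))

grid = np.linspace(1.0, 400.0, 400_000)
call = price_from_spd(lambda s: np.maximum(s - strike, 0.0), lognormal_spd, grid, rate, tau)
put = price_from_spd(lambda s: np.maximum(strike - s, 0.0), lognormal_spd, grid, rate, tau)

def norm_cdf(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

# Black-Scholes call price for comparison.
d1 = (log(s_t / strike) + (rate + 0.5 * vol**2) * tau) / (vol * sqrt(tau))
d2 = d1 - vol * sqrt(tau)
bs_call = s_t * norm_cdf(d1) - strike * exp(-rate * tau) * norm_cdf(d2)
```

Replacing the stand-in density by f* or g* yields the two sets of prices whose difference motivates buying or selling in a given moneyness region.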

Since we hold all options until expiration, and since options are not always available for all strikes, we investigate the payoff profile at expiration of this strategy for two compositions of the portfolio. To get an idea of the exposure at maturity, let us begin with a simplified portfolio consisting of one short position in a put option with moneyness of 0.95 and one long position in a call option with moneyness of 1.05. To simplify further, we assume that the futures price F is equal to 100 EUR. Thus, the portfolio has a payoff which is increasing in S_T, the price of the underlying at maturity. For S_T < 95 EUR the payoff is negative and for S_T > 105 EUR it is positive.

However, in the application we encounter portfolios containing several long/short calls/puts with increasing/decreasing strikes as indicated in Table 9.2.

Figure 9.5. S1 trade payoff at maturity of portfolio detailed in Table 9.2. [Plot "Payoff of S1 Trade: OTM": payoff (about -50 to 50) against the underlying (85 to 115).]

Figure 9.5 shows the payoff of a portfolio of 10 short puts with strikes ranging from 86 EUR to 95 EUR and of 10 long calls with strikes from 105 EUR to 114 EUR; the futures price is still assumed to be 100 EUR. The payoff is still increasing in S_T but it is concave in the left tail and convex in the right tail. This is due to the fact that beyond S_T = 106 EUR, for example, our portfolio contains two call options which are in the money (strikes 105 and 106 EUR) instead of only one as in the portfolio considered above, so that the payoff grows twice as fast. Beyond S_T = 107 EUR the payoff is influenced by three ITM calls and grows three times as fast as in the situation before, etc. In a similar way we can explain the slower increase in the left tail. To sum up, this trading rule has a favorable payoff profile in a bull market where the underlying is increasing, but in bear markets it possibly generates negative cash flows. Buying (selling) two or more calls (puts) at the same strike would change the payoff profile in a similar way, leading to a faster increase (slower decrease) with every call (put) bought (sold).
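The payoff profiles of the two S1 portfolios can be reproduced with a few lines of code. The sketch assumes, as in the text, a futures price of 100 EUR so that moneyness times 100 gives the strike; the helper names and the position encoding (quantity, type, strike) are illustrative choices.

```python
import numpy as np

def option_payoff(s_mat, strike, kind):
    """Payoff at expiration of one long European option ('call' or 'put')."""
    if kind == "call":
        return np.maximum(s_mat - strike, 0.0)
    return np.maximum(strike - s_mat, 0.0)

def portfolio_payoff(s_mat, positions):
    """Total payoff; positions are (quantity, kind, strike), quantity < 0 is short."""
    s_mat = np.asarray(s_mat, dtype=float)
    total = np.zeros_like(s_mat)
    for quantity, kind, strike in positions:
        total = total + quantity * option_payoff(s_mat, strike, kind)
    return total

# Simplified S1 portfolio: short one 95 put, long one 105 call.
s1_simple = [(-1, "put", 95.0), (+1, "call", 105.0)]
# OTM-S1 portfolio: short puts with strikes 86..95, long calls with strikes 105..114.
s1_otm = [(-1, "put", float(k)) for k in range(86, 96)]
s1_otm += [(+1, "call", float(k)) for k in range(105, 115)]

s_grid = np.arange(85.0, 116.0)  # underlying at maturity, 85, 86, ..., 115
payoff_simple = portfolio_payoff(s_grid, s1_simple)
payoff_otm = portfolio_payoff(s_grid, s1_otm)
```

Plotting payoff_otm against s_grid gives a profile that is flat between 95 and 105, concave to the left and convex to the right, since each additional strike that moves into the money adds one unit to the slope.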

The S2 strategy payoff behaves in the opposite way. The same reasoning can be applied to explain its payoff profile. In contrast to the S1 trade, the S2 trade is favorable in a falling market.

             S1           OTM-S1
             Moneyness    Moneyness
short put    0.95         0.86 - 0.95
long call    1.05         1.05 - 1.14

Table 9.2. Portfolios of skewness trades.

9.5.1 Performance

Given the skewness values for the implied SPD and the time series SPD, we now have a look at the performance of the skewness trades. Performance is measured in net EUR cash flows, i.e. the sum of the cash flows generated at initiation in t = 0 and at expiration in t = T. We ignore any interest between these two dates. Using EUREX settlement prices of 3 month DAX puts and calls, we initiated the S1 strategy on the Monday immediately following the 3rd Friday of each month, beginning in April 1997 and ending in September 1999. January, February and March 1997 drop out due to the time series density estimation for the 3rd Friday of April 1997. October, November and December 1999 drop out since we look 3 months forward. The cash flow at initiation stems from the inflow generated by the written options, the outflow generated by the bought options and hypothetical 5% transaction costs on prices of bought and sold options. Since all options are kept in the portfolio until maturity (time to expiration is approximately 3 months, more precisely τ = TTM/360), the cash flow in t = T is composed of the sum of the inner values of the options in the portfolio.
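The cash flow accounting described above can be sketched as follows. The code is an illustration under assumptions: transaction costs are taken to be 5% of the premium of every option traded, whether bought or sold, and the premiums in the example are hypothetical numbers, not EUREX settlement prices.

```python
def intrinsic_value(s_mat, kind, strike):
    """Inner value of one option at expiration."""
    if kind == "call":
        return max(s_mat - strike, 0.0)
    return max(strike - s_mat, 0.0)

def net_cash_flow(positions, s_mat, cost_rate=0.05):
    """Cash flows at t = 0 and t = T and their sum.

    positions: (quantity, kind, strike, premium); quantity < 0 means written.
    Interest between the two dates is ignored, as in the text.
    """
    # Written options bring in the premium, bought options cost the premium;
    # transaction costs are charged on both (assumed convention).
    cf_0 = sum(-q * p - cost_rate * abs(q) * p for q, _, _, p in positions)
    cf_T = sum(q * intrinsic_value(s_mat, kind, k) for q, kind, k, _ in positions)
    return cf_0, cf_T, cf_0 + cf_T

# Hypothetical premiums, for illustration only.
example = [(-1, "put", 95.0, 2.0), (+1, "call", 105.0, 1.5)]
cf_0, cf_T, net = net_cash_flow(example, s_mat=110.0)
```

In this example the written put yields 2.0 less 0.1 costs, the bought call costs 1.5 plus 0.075 costs, and at expiration only the call has a positive inner value.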

Figure 9.6 shows the EUR cash flows at initiation, at expiration and the resulting net cash flow for each portfolio. The sum of all cash flows, the total net cash flow, is strongly positive (9855.50 EUR). Note that the net cash flow (blue bar) is always positive except for the portfolios initiated in June 1998 and in September 1998, where we incur heavy losses compared to the gains in the other periods. In other words, this strategy would have procured moderate gains 28 times and large negative cash flows twice. As Figure 9.5 suggests, this strategy is exposed to a directional risk, a feature that appears in December 1997 and June 1998 where large payoffs at expiration (positive and negative) occur. Indeed, the period of November and December 1997 was a turning point of the DAX and the beginning of an 8 month bull market, explaining the large payoff in March 1998 of the portfolio initiated in December 1997. The same argument explains the large negative payoff of the portfolio set up in June 1998 and expiring in September 1998 (refer to Figure 9.11). Another point to note is that there is a zero cash flow at expiration in 24 periods. Periods with a zero cash flow both at initiation and at expiration are those in which no portfolio was set up (there was no OTM option in the database).

Figure 9.6. Performance of S1 trade with 5% transaction costs. The first (red), second (magenta) and the third bar (blue) show for each period the cash flow in t = 0, in t = T and the net cash flow respectively. Cash flows are measured in EUR. XFGSpdTradeSkew.xpl [Plot "Performance S1": cash flow in EUR (-5000 to 5000) against time (07/97 to 10/99).]

Since there is only one period (June 1999) in which the implied SPD is more skewed than the time series SPD, a comparison of the S1 trade with and without knowledge of the two SPDs is not informative. A comparison of the skewness measures would have filtered out exactly one positive net cash flow, namely the cash flow generated by a portfolio set up in June 1999. But to what extent this is significant is uncertain. For the same reason the S2 trade has no great informational content. Applied to real data it would have procured a negative total net cash flow. Actually, a portfolio would have been set up only in June 1999. While the S1 trade performance was independent of the knowledge of the implied and the time series SPDs, the S2 trade performance changed significantly when it was applied in each period (without knowing both SPDs). The cash flow profile seemed to be the inverse of Figure 9.6, indicating that, should there be an options mispricing, it would probably be in the sense that the implied SPD is more negatively skewed than the time series SPD.

9.6 Kurtosis Trades

A kurtosis trading strategy is supposed to exploit differences in kurtosis of two distributions by buying options in the range of strike prices where they are underpriced and selling options in the range of strike prices where they are overpriced. More specifically, if the implied SPD f* has more kurtosis than the time series SPD g*, i.e. kurt(f*) > kurt(g*), we sell the whole range of strikes of FOTM puts, buy the whole range of strikes of NOTM puts, sell the whole range of strikes of ATM puts and calls, buy the whole range of strikes of NOTM calls and sell the whole range of strikes of FOTM calls (K1 trade). Conversely, if the implied SPD has less kurtosis than the time series density g*, i.e. kurt(f*) < kurt(g*), we initiate the K2 trade by buying the whole range of strikes of FOTM puts, selling the whole range of strikes of NOTM puts, buying the whole range of strikes of ATM puts and calls, selling the whole range of strikes of NOTM calls and buying the whole range of strikes of FOTM calls. In both cases we keep the options until expiration.

Kurtosis κ measures the fatness of the tails of a distribution. For a normal distribution we have κ = 3. A distribution with κ > 3 is said to be leptokurtic and has fatter tails than the normal distribution. In general, the bigger κ is, the fatter the tails are. Again we consider the option pricing formulae (9.6) and (9.7) and reason as above, using the probability mass to determine the moneyness regions where we buy or sell options. See Figure 7.14 for a situation in which the implied density has more kurtosis than the time series density, triggering a K1 trade.

To form an idea of the K1 strategy's exposure at maturity we start once again with a simplified portfolio containing two short puts with moneyness 0.90 and 1.00, one long put with moneyness 0.95 (see Table 9.3), two short calls with moneyness 1.00 and 1.10 and one long call with moneyness 1.05. Figure 9.7 reveals that this portfolio inevitably leads to a negative payoff at maturity regardless of the movement of the underlying.

Should we be able to buy the whole range of strikes as the K1 trading rule suggests (the portfolio FOTM-NOTM-ATM-K1 given in Table 9.3), we get a payoff profile (Figure 9.8) which is quite similar to the one from Figure 9.7. In fact, the payoff function looks like the 'smooth' version of Figure 9.7.

Figure 9.7. Kurtosis trade 1 payoff at maturity of portfolio detailed in Table 9.3. [Plot "Payoff of K1 Trade": payoff (-10 to 0) against the underlying (85 to 115).]

Figure 9.8. K1 trade payoff at maturity of portfolio detailed in Table 9.3. [Plot "Payoff of K1 Trade: FOTM-NOTM-ATM": payoff (-40 to 0) against the underlying (85 to 115).]

Changing the number of long puts and calls in the NOTM regions can produce a positive payoff. Setting up the portfolio NOTM-K1 given in Table 9.3 results in the payoff function shown in Figure 9.9. It is quite intuitive that the more long positions the portfolio contains, the more positive the payoff will be. Conversely, if we added FOTM short puts and calls to that portfolio, the payoff would decrease in the FOTM regions.

In conclusion, the payoff function can have quite different shapes depending heavily on the specific options in the portfolio. If it is possible to implement the K1 trading rule as proposed, the payoff is negative. But it may happen that the payoff function is positive if more NOTM options (long positions) are available than FOTM or ATM options (short positions).

Figure 9.9. K1 trade payoff at maturity of portfolio detailed in Table 9.3. [Plot "Payoff of K1 Trade: NOTM": payoff (-5 to 20) against the underlying (85 to 115).]
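The three kurtosis portfolios of Table 9.3 can be compared with a short payoff calculation. As before, this is a sketch under the assumption of a futures price of 100 EUR, with one option per integer strike inside each moneyness range; the helper names and the position encoding are illustrative.

```python
import numpy as np

def portfolio_payoff(s_mat, positions):
    """Payoff at expiration; positions are (quantity, kind, strike)."""
    s_mat = np.asarray(s_mat, dtype=float)
    total = np.zeros_like(s_mat)
    for quantity, kind, strike in positions:
        if kind == "call":
            total = total + quantity * np.maximum(s_mat - strike, 0.0)
        else:
            total = total + quantity * np.maximum(strike - s_mat, 0.0)
    return total

def legs(quantity, kind, first, last):
    """One option per integer strike from first to last (inclusive)."""
    return [(quantity, kind, float(k)) for k in range(first, last + 1)]

# Columns of Table 9.3 with strikes equal to 100 times moneyness.
k1_simple = [(-1, "put", 90.0), (+1, "put", 95.0), (-1, "put", 100.0),
             (-1, "call", 100.0), (+1, "call", 105.0), (-1, "call", 110.0)]
k1_full = (legs(-1, "put", 86, 90) + legs(+1, "put", 91, 95) + legs(-1, "put", 96, 100)
           + legs(-1, "call", 100, 104) + legs(+1, "call", 105, 109)
           + legs(-1, "call", 110, 114))
k1_notm = ([(-1, "put", 90.0)] + legs(+1, "put", 91, 95) + [(-1, "put", 100.0)]
           + [(-1, "call", 100.0)] + legs(+1, "call", 105, 109) + [(-1, "call", 110.0)])

s_grid = np.arange(85.0, 116.0)  # underlying at maturity, 85, 86, ..., 115
payoff_simple = portfolio_payoff(s_grid, k1_simple)
payoff_full = portfolio_payoff(s_grid, k1_full)
payoff_notm = portfolio_payoff(s_grid, k1_notm)
```

On this grid the first two portfolios never pay out a positive amount, whereas the NOTM-K1 portfolio, which holds more long than short positions, becomes positive in both tails.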

             K1           FOTM-NOTM-ATM-K1    NOTM-K1
             Moneyness    Moneyness           Moneyness
short put    0.90         0.86 - 0.90         0.90
long put     0.95         0.91 - 0.95         0.91 - 0.95
short put    1.00         0.96 - 1.00         1.00
short call   1.00         1.00 - 1.04         1.00
long call    1.05         1.05 - 1.09         1.05 - 1.09
short call   1.10         1.10 - 1.14         1.10

Table 9.3. Portfolios of kurtosis trades.

9.6.1 Performance

To investigate the performance of the kurtosis trades, K1 and K2, we proceed in the same way as for the skewness trade. The total net EUR cash flow of the K1 trade, applied when kurt(f*) > kurt(g*), is strongly positive (10,915.77 EUR). As the payoff profiles from Figures 9.7 and 9.8 already suggested, all portfolios generate negative cash flows at expiration (see magenta bar in Figure 9.10). In contrast to that, the cash flow at initiation in t = 0 is always positive. Given the positive total net cash flow, we can state that the K1 trade earns its profit in t = 0. Looking at the DAX evolution shown in Figure 9.11, we understand why the payoff of the portfolios set up in the months of April 1997, May 1997 and in the months from November 1997 to June 1998 is relatively more negative than for the portfolios of June 1997 to October 1997 and November 1998 to June 1999. The reason is that the DAX is moving up or down for the former months and stays within an almost horizontal range of quotes for the latter months (see the payoff profile depicted in Figure 9.8). In July 1998 no portfolio was set up since kurt(f*) < kurt(g*).

Figure 9.10. Performance of K1 trade with 5% transaction costs. The first (red), second (magenta) and the third bar (blue) show for each period the cash flow in t = 0, in t = T and the net cash flow respectively. Cash flows are measured in EUR. XFGSpdTradeKurt.xpl [Plot "Performance K1": cash flow in EUR (-5000 to 5000) against time (07/97 to 10/99).]

What would have happened if we had implemented the K1 trade without knowing both SPDs? Again, the answer to this question can only be tentative due to the rare occurrence of periods in which kurt(f*) < kurt(g*). In contrast to the S1 trade, the density comparison would have filtered out a strongly negative net cash flow that would have been generated by a portfolio set up in July 1998. But the significance of this feature is again uncertain.

About the K2 trade it can only be said that without an SPD comparison it would have procured heavy losses. The K2 trade applied as proposed cannot be evaluated completely since there was only one period in which kurt(f*) < kurt(g*).

Figure 9.11. Evolution of DAX from January 1997 to December 1999. [Plot "DAX 1997-1999": DAX (3000 to 7000) against time (1/97 to 10/99).]

9.7 A Word of Caution

Interpreting the implied SPD as the SPD used by investors to price options and the historical density as the 'real' underlying's SPD, and assuming that no agent but one knows the underlying's SPD, one should expect this agent to make higher profits than all others due to its superior knowledge. That is why exploiting deviations of implied and historical density appears to be very promising at first glance. Of course, if all market agents knew the underlying's SPD, f* would be equal to g*. In view of the high net cash flows generated by both skewness and kurtosis trades of type 1, it seems that not all agents are aware of discrepancies in the third and fourth moment of both densities. However, the strategies seem to be exposed to a substantial directional risk. Even if the dataset contained bearish and bullish market phases, both trades have to be tested on more extensive data. Considering the current political and economic developments, it is not clear how these trades will perform being exposed to 'peso risks'. Given that profits stem from highly positive cash flows at portfolio initiation, i.e. profits result from possibly mispriced options, who knows how the pricing behavior of agents changes, and how agents assign probabilities to future values of the underlying?

We measured performance in net EUR cash flows. This approach does not take risk into account as does, for example, the Sharpe ratio, which is a measure of the risk adjusted return of an investment. But to compute a return an initial investment has to be made. However, in the simulation above, some portfolios generated positive payoffs both at initiation and at maturity. It is a challenge for future research to find a way to adjust for risk in such situations.

The SPD comparison yielded the same result for each period but one. The implied SPD f* was in all but one period more negatively skewed than the time series SPD g*. While g* was in all periods platykurtic, f* was in all but one period leptokurtic. In this period the kurtosis of g* was slightly greater than that of f*. Therefore, there was no alternating use of type 1 and type 2 trades. But in more turbulent market environments such an approach might prove useful. The procedure could be extended and fine tuned by applying a density distance measure as in Aït-Sahalia, Wang and Yared (2000) to give a signal when to set up a portfolio either of type 1 or of type 2. Furthermore, it is tempting to modify the time series density estimation method such that the Monte Carlo paths are simulated drawing random numbers not from a normal distribution but from the distribution of the residuals resulting from the nonparametric estimation of σ_FZ(•), Härdle and Yatchew (2001).

Bibliography

Aït-Sahalia, Y., Wang, Y. and Yared, F. (2001). Do Option Markets correctly Price the Probabilities of Movement of the Underlying Asset?, Journal of Econometrics 102: 67–110.

Barle, S. and Cakici, N. (1998). How to Grow a Smiling Tree, The Journal of Financial Engineering 7: 127–146.

Black, F. and Scholes, M. (1973). The Pricing of Options and Corporate Liabilities, Journal of Political Economy 81: 637–659.

Blaskowitz, O. (2001). Trading on Deviations of Implied and Historical Density, Diploma Thesis, Humboldt-Universität zu Berlin.

Breeden, D. and Litzenberger, R. (1978). Prices of State Contingent Claims Implicit in Option Prices, Journal of Business 51(4): 621–651.

Cox, J., Ross, S. and Rubinstein, M. (1979). Option Pricing: A Simplified Approach, Journal of Financial Economics 7: 229–263.

Derman, E. and Kani, I. (1994). The Volatility Smile and Its Implied Tree, http://www.gs.com/qs/

Dupire, B. (1994). Pricing with a Smile, Risk 7: 18–20.

Florens-Zmirou, D. (1993). On Estimating the Diffusion Coefficient from Discrete Observations, Journal of Applied Probability 30: 790–804.

Franke, J., Härdle, W. and Hafner, C. (2001). Einführung in die Statistik der Finanzmärkte, Springer Verlag, Heidelberg.

Härdle, W. and Simar, L. (2002). Applied Multivariate Statistical Analysis, Springer Verlag, Heidelberg.

Härdle, W. and Tsybakov, A. (1995). Local Polynomial Estimators of the Volatility Function in Nonparametric Autoregression, Sonderforschungsbereich 373 Discussion Paper, Humboldt-Universität zu Berlin.

Härdle, W. and Yatchew, A. (2001). Dynamic Nonparametric State Price Density Estimation using Constrained Least Squares and the Bootstrap, Sonderforschungsbereich 373 Discussion Paper, Humboldt-Universität zu Berlin.

Härdle, W. and Zheng, J. (2001). How Precise Are Price Distributions Predicted by Implied Binomial Trees?, Sonderforschungsbereich 373 Discussion Paper, Humboldt-Universität zu Berlin.

Jackwerth, J.C. (1999). Option Implied Risk Neutral Distributions and Implied Binomial Trees: A Literature Review, The Journal of Derivatives Winter: 66–82.

Kloeden, P., Platen, E. and Schurz, H. (1994). Numerical Solution of SDE Through Computer Experiments, Springer Verlag, Heidelberg.

Rubinstein, M. (1994). Implied Binomial Trees, Journal of Finance 49: 771–818.

Part IV

Econometrics

10 Multivariate Volatility Models

Matthias R. Fengler and Helmut Herwartz

Multivariate volatility models are widely used in finance to capture both volatility clustering and contemporaneous correlation of asset return vectors. Here we focus on multivariate GARCH models. In this common model class it is assumed that the covariance of the error distribution follows a time dependent process conditional on information which is generated by the history of the process. To provide a particular example, we consider a system of exchange rates of two currencies measured against the US Dollar (USD), namely the Deutsche Mark (DEM) and the British Pound Sterling (GBP). For this process we compare the dynamic properties of the bivariate model with univariate GARCH specifications where cross sectional dependencies are ignored. Moreover, we illustrate the scope of the bivariate model by ex-ante forecasts of bivariate exchange rate densities.

10.1 Introduction

Volatility clustering, i.e. positive correlation of price variations observed on speculative markets, motivated the introduction of autoregressive conditionally heteroskedastic (ARCH) processes by Engle (1982) and its popular generalizations by Bollerslev (1986) (Generalized ARCH, GARCH) and Nelson (1991) (exponential GARCH, EGARCH). Being univariate in nature, however, such models neglect a further stylized fact of empirical price variations, namely contemporaneous cross correlation, e.g. over a set of assets, stock market indices, or exchange rates.

Cross section relationships are often implied by economic theory. Interest rate parities, for instance, provide a close relation between domestic and foreign bond rates. Assuming absence of arbitrage, the so-called triangular equation formalizes the equality of an exchange rate between two currencies on the one hand and an implied rate constructed via exchange rates measured towards a third currency on the other. Furthermore, stock prices of firms acting on the same market often show similar patterns in the wake of news that is important for the entire market (Hafner and Herwartz, 1998). Similarly, analyzing global volatility transmission, Engle, Ito and Lin (1990) and Hamao, Masulis and Ng (1990) found evidence in favor of volatility spillovers between the world's major trading areas occurring in the sequence of floor trading hours. From this point of view, when modeling time varying volatilities, a multivariate model appears to be a natural framework to take cross sectional information into account. Moreover, the covariance between financial assets is of essential importance in finance. Effectively, many problems in financial practice like portfolio optimization, hedging strategies, or Value-at-Risk evaluation require multivariate volatility measures (Bollerslev et al., 1988; Cecchetti, Cumby and Figlewski, 1988).

10.1.1 Model specifications

Let ε_t = (ε_1t, ε_2t, ..., ε_Nt)^⊤ denote an N-dimensional error process, which is either directly observed or estimated from a multivariate regression model. The process ε_t follows a multivariate GARCH process if it has the representation

$$\varepsilon_t = \Sigma_t^{1/2} \xi_t, \qquad (10.1)$$

where Σ_t is measurable with respect to information generated up to time t − 1, denoted by the filtration F_{t−1}. By assumption the N components of ξ_t follow a multivariate Gaussian distribution with mean zero and covariance matrix equal to the identity matrix.

The conditional covariance matrix, Σ_t = E[ε_t ε_t^⊤ | F_{t−1}], has typical elements σ_ij with σ_ii, i = 1, ..., N, denoting conditional variances and off-diagonal elements σ_ij, i, j = 1, ..., N, i ≠ j, denoting conditional covariances. To make the specification in (10.1) feasible a parametric description relating Σ_t to F_{t−1} is necessary. In a multivariate setting, however, dependencies of the second order moments in Σ_t on F_{t−1} easily become computationally intractable for practical purposes.

Let vech(A) denote the half-vectorization operator stacking the elements of a quadratic (N × N)-matrix A from the main diagonal downwards in a $\frac{1}{2}N(N+1)$ dimensional column vector. Within the so-called vec-representation of the GARCH(p, q) model $\Sigma_t$ is specified as follows:

$$\mathrm{vech}(\Sigma_t) = c + \sum_{i=1}^{q} \tilde{A}_i \, \mathrm{vech}(\varepsilon_{t-i}\varepsilon_{t-i}^\top) + \sum_{i=1}^{p} \tilde{G}_i \, \mathrm{vech}(\Sigma_{t-i}). \tag{10.2}$$

In (10.2) the matrices $\tilde{A}_i$ and $\tilde{G}_i$ each contain $\{N(N+1)/2\}^2$ elements. Deterministic covariance components are collected in c, a column vector of dimension $N(N+1)/2$. We consider in the following the case p = q = 1 since in applied work the GARCH(1,1) model has turned out to be particularly useful to describe a wide variety of financial market data (Bollerslev, Engle and Nelson, 1994).
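The half-vectorization used in (10.2) can be illustrated with a few lines of Python. This is only a sketch: the function name `vech` mirrors the operator in the text, while the NumPy implementation and the example matrix are assumptions made for illustration, not part of XploRe.

```python
import numpy as np

def vech(a):
    """Stack the elements of a square matrix from the main diagonal
    downwards, column by column (half-vectorization)."""
    a = np.asarray(a)
    n = a.shape[0]
    # column j contributes its entries from row j (the diagonal) to the bottom
    return np.concatenate([a[j:, j] for j in range(n)])

# hypothetical symmetric 2 x 2 covariance matrix
sigma = np.array([[4.0, 1.0],
                  [1.0, 9.0]])
print(vech(sigma))  # [4. 1. 9.]
```

For N = 2 the result has N(N+1)/2 = 3 elements: the two variances and the single covariance.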

On the one hand the vec-model in (10.2) allows for a very general dynamic structure of the multivariate volatility process. On the other hand this specification suffers from high dimensionality of the relevant parameter space, which makes it almost intractable for empirical work. In addition, it might be cumbersome in applied work to restrict the admissible parameter space such that the implied matrices $\Sigma_t$, $t = 1, \ldots, T$, are positive definite. These issues motivated a considerable variety of competing multivariate GARCH specifications.

Prominent proposals reducing the dimensionality of (10.2) are the constant correlation model (Bollerslev, 1990) and the diagonal model (Bollerslev et al., 1988). Specifying diagonal elements of $\Sigma_t$, both of these approaches assume the absence of cross equation dynamics, i.e. the only dynamics are

$$\sigma_{ii,t} = c_{ii} + a_i \varepsilon_{i,t-1}^2 + g_i \sigma_{ii,t-1}, \quad i = 1, \ldots, N. \tag{10.3}$$

To determine off-diagonal elements of $\Sigma_t$ Bollerslev (1990) proposes a constant contemporaneous correlation,

$$\sigma_{ij,t} = \rho_{ij} \sqrt{\sigma_{ii,t}\, \sigma_{jj,t}}, \quad i, j = 1, \ldots, N, \tag{10.4}$$

whereas Bollerslev et al. (1988) introduce an ARMA-type dynamic structure as in (10.3) for $\sigma_{ij,t}$ as well, i.e.

$$\sigma_{ij,t} = c_{ij} + a_{ij} \varepsilon_{i,t-1} \varepsilon_{j,t-1} + g_{ij} \sigma_{ij,t-1}, \quad i, j = 1, \ldots, N. \tag{10.5}$$

For the bivariate case (N = 2) with p = q = 1 the constant correlation model contains only 7 parameters compared to 21 parameters encountered in the full model (10.2). The diagonal model is specified with 9 parameters. The price that both models pay for parsimony is in ruling out cross equation dynamics as allowed in the general vec-model. Positive definiteness of $\Sigma_t$ is easily guaranteed for the constant correlation model ($|\rho_{ij}| < 1$), whereas the diagonal model requires more complicated restrictions to provide positive definite covariance matrices.
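The parameter counts quoted above follow directly from the dimensions of the objects in (10.2) to (10.5). The following sketch reproduces them for p = q = 1; the function name `n_params` and the dictionary keys are hypothetical labels chosen for this illustration.

```python
def n_params(N):
    """Number of parameters of GARCH(1,1)-type models for an N-dimensional process."""
    n = N * (N + 1) // 2  # distinct elements of the symmetric matrix Sigma_t
    return {
        # vec-model (10.2): vector c plus two n x n matrices
        "vec": n + 2 * n * n,
        # constant correlation: (c_ii, a_i, g_i) per variance plus one rho_ij per pair
        "constant correlation": 3 * N + N * (N - 1) // 2,
        # diagonal model (10.5): (c_ij, a_ij, g_ij) for each distinct element
        "diagonal": 3 * n,
    }

print(n_params(2))  # {'vec': 21, 'constant correlation': 7, 'diagonal': 9}
```

Calling `n_params(3)` shows how quickly the vec-model grows: 78 parameters in the trivariate case against 18 for the diagonal model.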

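The so-called BEKK-model (named after Baba, Engle, Kraft and Kroner, 1990) provides a richer dynamic structure compared to both restricted processes mentioned before. Defining N × N matrices $A_{ik}$ and $G_{ik}$ and an upper triangular matrix $C_0$, the BEKK-model reads in a general version as follows:

$$\Sigma_t = C_0^\top C_0 + \sum_{k=1}^{K}\sum_{i=1}^{q} A_{ik}^\top \varepsilon_{t-i}\varepsilon_{t-i}^\top A_{ik} + \sum_{k=1}^{K}\sum_{i=1}^{p} G_{ik}^\top \Sigma_{t-i} G_{ik}. \tag{10.6}$$

If K = q = p = 1 and N = 2, the model in (10.6) contains 11 parameters and implies the following dynamic model for typical elements of $\Sigma_t$:

$$\begin{aligned}
\sigma_{11,t} &= c_{11} + a_{11}^2 \varepsilon_{1,t-1}^2 + 2 a_{11} a_{21} \varepsilon_{1,t-1}\varepsilon_{2,t-1} + a_{21}^2 \varepsilon_{2,t-1}^2 \\
&\quad + g_{11}^2 \sigma_{11,t-1} + 2 g_{11} g_{21} \sigma_{21,t-1} + g_{21}^2 \sigma_{22,t-1}, \\
\sigma_{21,t} &= c_{21} + a_{11} a_{12} \varepsilon_{1,t-1}^2 + (a_{21} a_{12} + a_{11} a_{22}) \varepsilon_{1,t-1}\varepsilon_{2,t-1} + a_{21} a_{22} \varepsilon_{2,t-1}^2 \\
&\quad + g_{11} g_{12} \sigma_{11,t-1} + (g_{21} g_{12} + g_{11} g_{22}) \sigma_{12,t-1} + g_{21} g_{22} \sigma_{22,t-1}, \\
\sigma_{22,t} &= c_{22} + a_{12}^2 \varepsilon_{1,t-1}^2 + 2 a_{12} a_{22} \varepsilon_{1,t-1}\varepsilon_{2,t-1} + a_{22}^2 \varepsilon_{2,t-1}^2 \\
&\quad + g_{12}^2 \sigma_{11,t-1} + 2 g_{12} g_{22} \sigma_{21,t-1} + g_{22}^2 \sigma_{22,t-1}.
\end{aligned}$$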

Compared to the general vec-model (10.2), the BEKK specification economizes on the number of parameters by imposing restrictions within and across equations. Since $A_{ik}$ and $G_{ik}$ are not required to be diagonal, the BEKK-model is convenient to allow for cross dynamics of conditional covariances. The parameter K governs to which extent the general representation in (10.2) can be approximated by a BEKK-type model. In the following we assume K = 1. Note that in the bivariate case with K = p = q = 1 the BEKK-model contains 11 parameters. If K = 1 the matrices $A_{11}$ and $-A_{11}$ imply the same conditional covariances. Thus, for uniqueness of the BEKK-representation $a_{11} > 0$ and $g_{11} > 0$ is assumed. Note that the right hand side of (10.6) involves only quadratic terms and, hence, given convenient initial conditions, $\Sigma_t$ is positive definite under the weak (sufficient) condition that at least one of the matrices $C_0$ or $G_{ik}$ has full rank (Engle and Kroner, 1995).
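A single step of the recursion (10.6) with K = p = q = 1 can be written out as follows. This is a minimal sketch, assuming NumPy; the function name `bekk_step` and all numerical values are hypothetical and serve only to show that the quadratic form keeps the result symmetric and positive definite.

```python
import numpy as np

def bekk_step(C0, A, G, eps_prev, sigma_prev):
    """One step of the BEKK recursion (10.6) for K = p = q = 1."""
    e = np.asarray(eps_prev, dtype=float).reshape(-1, 1)  # column vector eps_{t-1}
    return C0.T @ C0 + A.T @ (e @ e.T) @ A + G.T @ sigma_prev @ G

# illustrative (made-up) parameter values; C0 is upper triangular with full rank
C0 = np.array([[0.5, 0.2],
               [0.0, 0.4]])
A = np.array([[0.30, -0.05],
              [-0.05, 0.30]])
G = np.array([[0.90, 0.02],
              [0.02, 0.90]])

sigma_next = bekk_step(C0, A, G, eps_prev=[1.0, -0.5], sigma_prev=np.eye(2))
print(sigma_next)
print(np.linalg.eigvalsh(sigma_next))  # both eigenvalues are positive
```

The (1,1) element of `sigma_next` agrees with the scalar equation for $\sigma_{11,t}$ implied by (10.6), where $c_{11}$ is the (1,1) element of $C_0^\top C_0$.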

10.1.2 Estimation of the BEKK-model

As in the univariate case the parameters of a multivariate GARCH model are estimated by maximum likelihood (ML), optimizing numerically the Gaussian log-likelihood function.

With f denoting the multivariate normal density, the contribution of a single observation, $l_t$, to the log-likelihood of a sample is given as:

$$l_t = \ln\{f(\varepsilon_t \mid \mathcal{F}_{t-1})\} = -\frac{N}{2}\ln(2\pi) - \frac{1}{2}\ln(|\Sigma_t|) - \frac{1}{2}\varepsilon_t^\top \Sigma_t^{-1} \varepsilon_t.$$

Maximizing the log-likelihood, $l = \sum_{t=1}^{T} l_t$, requires nonlinear maximization methods. Involving only first order derivatives, the algorithm introduced by Berndt, Hall, Hall, and Hausman (1974) is easily implemented and particularly useful for the estimation of multivariate GARCH processes.
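The contribution $l_t$ can be evaluated numerically as sketched below. The function name `loglik_contribution` is hypothetical, and the use of NumPy's `slogdet` and `solve` (instead of an explicit determinant and inverse) is a design choice of this illustration, not something prescribed by the text.

```python
import numpy as np

def loglik_contribution(eps_t, sigma_t):
    """Gaussian log-likelihood contribution l_t of one observation eps_t
    with conditional covariance matrix sigma_t."""
    eps_t = np.asarray(eps_t, dtype=float)
    sigma_t = np.asarray(sigma_t, dtype=float)
    N = eps_t.shape[0]
    _, logdet = np.linalg.slogdet(sigma_t)           # ln|Sigma_t|
    quad = eps_t @ np.linalg.solve(sigma_t, eps_t)   # eps_t' Sigma_t^{-1} eps_t
    return -0.5 * (N * np.log(2.0 * np.pi) + logdet + quad)

# with Sigma_t = I and eps_t = 0 only the constant -N/2 ln(2 pi) remains
print(loglik_contribution([0.0, 0.0], np.eye(2)))
```

Summing this quantity over t = 1, ..., T gives the sample log-likelihood l that the optimizer works on.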

If the actual error distribution differs from the multivariate normal, maximizing the Gaussian log-likelihood has become popular as Quasi ML (QML) estimation. In the multivariate framework, results for the asymptotic properties of the (Q)ML-estimator have been derived recently. Jeantheau (1998) proves the QML-estimator to be consistent under the main assumption that the considered multivariate process is strictly stationary and ergodic. Further assuming finiteness of moments of $\varepsilon_t$ up to order eight, Comte and Lieberman (2000) derive asymptotic normality of the QML-estimator. The asymptotic distribution of the rescaled QML-estimator is analogous to the univariate case and discussed in Bollerslev and Wooldridge (1992).

10.2 An empirical illustration

10.2.1 Data description

We analyze daily quotes of two European currencies measured against the USD,

namely the DEM and the GBP. The sample period is December 31, 1979 to

April 1, 1994, covering T = 3720 observations. Note that a subperiod of our

sample has already been investigated by Bollerslev and Engle (1993) discussing

common features of volatility processes.

The data is provided in fx. The first column contains DEM/USD and the second GBP/USD. In XploRe a preliminary statistical analysis is easily done by the summarize command. Before inspecting the summary statistics, we load the data, $R_t$, and take log differences, $\varepsilon_t = \ln(R_t) - \ln(R_{t-1})$. XFGmvol01.xpl produces the following table:

[2,] " Minimum Maximum Mean Median Std.Error"

[3,] "-----------------------------------------------------------"

[4,] "DEM/USD -0.040125 0.031874 -4.7184e-06 0 0.0070936"

[5,] "GBP/USD -0.046682 0.038665 0.00011003 0 0.0069721"

XFGmvol01.xpl

Evidently, the empirical means of both processes are very close to zero (-4.72e-06 and 1.10e-04, respectively). Also minimum, maximum and standard errors are of similar size. First differences of the respective log exchange rates are shown in Figure 10.1. As is apparent from Figure 10.1, variations of exchange rate returns exhibit an autoregressive pattern: Large returns in foreign exchange markets are followed by large returns of either sign. This is most obvious in periods of excessive returns. Note that these volatility clusters tend to coincide in both series. It is precisely this observation that justifies a multivariate GARCH specification.

10.2.2 Estimating bivariate GARCH

{coeff, likest} = bigarch(theta,et)

estimates a bivariate GARCH model

The quantlet bigarch provides a fast algorithm to estimate the BEKK representation of a bivariate GARCH(1,1) model. QML-estimation is implemented by means of the BHHH-algorithm which minimizes the negative Gaussian log-likelihood function. The algorithm employs analytical first order derivatives of the log-likelihood function (Lütkepohl, 1996) with respect to the 11-dimensional vector of parameters containing the elements of $C_0$, $A_{11}$ and $G_{11}$ as given in (10.6).

Figure 10.1. Foreign exchange rate data: returns. Panels: DEM/USD and GBP/USD, each showing returns against time, 1980 to 1994.

XFGmvol01.xpl

The standard call is

{coeff, likest}=bigarch(theta, et),

where as input parameters we have initial values theta for the iteration algorithm and the data set, e.g. financial returns, stored in et. The estimation output is the vector coeff containing the stacked elements of the parameter matrices $C_0$, $A_{11}$ and $G_{11}$ in (10.6) after numerical optimization of the Gaussian log-likelihood function. Being an iterative procedure, the algorithm requires suitable initial parameters theta. For the diagonal elements of the matrices $A_{11}$ and $G_{11}$ values around 0.3 and 0.9 appear reasonable, since in univariate GARCH(1,1) models parameter estimates for $a_1$ and $g_1$ in (10.3) often take values around $0.3^2 = 0.09$ and $0.81 = 0.9^2$. There is no clear guidance on how to determine initial values for the off-diagonal elements of $A_{11}$ or $G_{11}$. Therefore it might be reasonable to try alternative initializations of these parameters. Given an initialization of $A_{11}$ and $G_{11}$, the starting values for the elements in $C_0$ are immediately determined by the algorithm assuming the unconditional covariance of $\varepsilon_t$ to exist (Engle and Kroner, 1995).
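One way such starting values for $C_0$ could be obtained is sketched below. It rests on an assumption that is not spelled out in the text: taking unconditional expectations in the BEKK(1,1) recursion under covariance stationarity gives $\bar\Sigma = C_0^\top C_0 + A_{11}^\top \bar\Sigma A_{11} + G_{11}^\top \bar\Sigma G_{11}$, so that $C_0^\top C_0$ can be solved for and factorized. Whether bigarch proceeds exactly this way is not confirmed by the source; the function name `initial_c0` and the numbers are hypothetical.

```python
import numpy as np

def initial_c0(sigma_bar, A, G):
    """Upper triangular C0 with C0'C0 = sigma_bar - A' sigma_bar A - G' sigma_bar G.

    sigma_bar is an estimate of the unconditional covariance of eps_t.
    np.linalg.cholesky raises LinAlgError if the implied matrix is not
    positive definite, i.e. if the initialization of A and G is not admissible.
    """
    M = sigma_bar - A.T @ sigma_bar @ A - G.T @ sigma_bar @ G
    L = np.linalg.cholesky(M)  # lower triangular factor, L L' = M
    return L.T                 # upper triangular, so that C0' C0 = M

# hypothetical inputs: sample covariance and the initial values suggested in the text
sigma_bar = np.array([[5e-5, 2e-5],
                      [2e-5, 5e-5]])
A_init = np.diag([0.3, 0.3])
G_init = np.diag([0.9, 0.9])
C0_init = initial_c0(sigma_bar, A_init, G_init)
print(C0_init)
```

With these diagonal initial values, $C_0^\top C_0$ equals $(1 - 0.09 - 0.81)\,\bar\Sigma = 0.1\,\bar\Sigma$.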

Given our example under investigation the bivariate GARCH estimation yields

as output:

Contents of coeff

[ 1,] 0.0011516

[ 2,] 0.00031009

[ 3,] 0.00075685

[ 4,] 0.28185

[ 5,] -0.057194

[ 6,] -0.050449

[ 7,] 0.29344

[ 8,] 0.93878

[ 9,] 0.025117

[10,] 0.027503

[11,] 0.9391

Contents of likest

[1,] -28599

XFGmvol02.xpl

The last number is the obtained minimum of the negative log-likelihood function. The vector coeff given first contains as first three elements the parameters of the upper triangular matrix $C_0$, the following four belong to the ARCH ($A_{11}$) and the last four to the GARCH parameters ($G_{11}$), i.e. for our model

$$\Sigma_t = C_0^\top C_0 + A_{11}^\top \varepsilon_{t-1}\varepsilon_{t-1}^\top A_{11} + G_{11}^\top \Sigma_{t-1} G_{11} \tag{10.7}$$

stated again for convenience, we find the matrices $C_0$, $A_{11}$, $G_{11}$ to be:

$$C_0 = 10^{-3}\begin{pmatrix} 1.15 & 0.31 \\ 0 & 0.76 \end{pmatrix}, \quad
A_{11} = \begin{pmatrix} 0.282 & -0.050 \\ -0.057 & 0.293 \end{pmatrix}, \quad
G_{11} = \begin{pmatrix} 0.939 & 0.028 \\ 0.025 & 0.939 \end{pmatrix}. \tag{10.8}$$
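The mapping from the stacked vector coeff to the matrices in (10.8) can be reproduced in a few lines. This sketch assumes the ordering implied by comparing the printed output with (10.8): three elements of the upper triangular $C_0$, then $A_{11}$ and $G_{11}$ stacked column by column. The function name `unpack_coeff` is hypothetical.

```python
import numpy as np

# estimates as printed in "Contents of coeff"
coeff = np.array([0.0011516, 0.00031009, 0.00075685,
                  0.28185, -0.057194, -0.050449, 0.29344,
                  0.93878, 0.025117, 0.027503, 0.9391])

def unpack_coeff(coeff):
    """Split the 11-dimensional parameter vector into C0, A11 and G11."""
    C0 = np.array([[coeff[0], coeff[1]],
                   [0.0,      coeff[2]]])        # upper triangular
    A = coeff[3:7].reshape(2, 2, order="F")      # column-major stacking
    G = coeff[7:11].reshape(2, 2, order="F")
    return C0, A, G

C0, A11, G11 = unpack_coeff(coeff)
print(np.round(1e3 * C0, 2))
print(np.round(A11, 3))
print(np.round(G11, 3))
```

Rounded to the precision used in (10.8), the three printed matrices agree with the ones stated there.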

10.2.3 Estimating the (co)variance processes

The (co)variance is obtained by sequentially calculating the difference equation (10.7) where we use the estimator for the unconditional covariance matrix as initial value ($\Sigma_0 = E^\top E / T$). Here, the T × 2 matrix E contains log-differences of our foreign exchange rate data. Estimating the covariance process is also accomplished in the quantlet XFGmvol02.xpl and additionally provided in sigmaprocess.
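The sequential calculation can be sketched as follows. This is not the sigmaprocess quantlet itself but a minimal NumPy illustration of the recursion (10.7); the function name `sigma_process` is hypothetical, and because the fx data are not reproduced here, simulated Gaussian noise stands in for the return matrix E.

```python
import numpy as np

def sigma_process(E, C0, A, G):
    """Conditional covariance matrices implied by (10.7) for returns E (T x N).

    sigmas[0] is the starting value E'E/T; sigmas[t] is obtained from
    sigmas[t-1] and the t-th row of E (1-based), i.e. it is the covariance
    matrix for the observation following that row.
    """
    T, N = E.shape
    CC = C0.T @ C0
    sigmas = np.empty((T + 1, N, N))
    sigmas[0] = E.T @ E / T
    for t in range(1, T + 1):
        e = E[t - 1].reshape(N, 1)  # previous observation as column vector
        sigmas[t] = CC + A.T @ (e @ e.T) @ A + G.T @ sigmas[t - 1] @ G
    return sigmas

# rounded estimates from (10.8)
C0 = 1e-3 * np.array([[1.15, 0.31], [0.0, 0.76]])
A11 = np.array([[0.282, -0.050], [-0.057, 0.293]])
G11 = np.array([[0.939, 0.028], [0.025, 0.939]])

# placeholder data (assumption): i.i.d. noise with a return-like scale
E = 0.007 * np.random.default_rng(0).standard_normal((500, 2))
sigmas = sigma_process(E, C0, A11, G11)
print(sigmas.shape)  # (501, 2, 2)
```

The series `sigmas[:, 0, 0]`, `sigmas[:, 0, 1]` and `sigmas[:, 1, 1]` correspond to the two variance processes and the covariance process.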

We display the estimated variance and covariance processes in Figure 10.2. The

upper and the lower panel of Figure 10.2 show the variances of the DEM/USD

and GBP/USD returns respectively, whereas in the middle panel we see the co-

variance process. Except for a very short period in the beginning of our sample

the covariance is positive and of non-negligible size throughout. This is evi-

dence for cross sectional dependencies in currency markets which we mentioned

earlier to motivate multivariate GARCH models.

Instead of estimating the realized path of variances as shown above,

we could also use the estimated parameters to simulate volatility paths

( XFGmvol03.xpl).

Figure 10.2. Estimated variance and covariance processes, $10^5 \hat\Sigma_t$. Panels: DEM/USD (Sigma11), Covariance (Sigma12) and GBP/USD (Sigma22), each plotted against time, 1980 to 1994.

XFGmvol02.xpl

Figure 10.3. Simulated variance and covariance processes, both bivariate (blue) and univariate case (green), $10^5 \hat\Sigma_t$. Panels: DEM/USD - Simulation (Sigma11), Covariance (Sigma12) and GBP/USD - Simulation (Sigma22), each plotted against time, 0 to 3000.

XFGmvol03.xpl

For this at each point in time an observation $\varepsilon_t$ is drawn from a multivariate normal distribution with variance $\Sigma_t$. Given these observations, $\Sigma_t$ is updated according to (10.7). Then, a new residual is drawn with covariance $\Sigma_{t+1}$. We apply this procedure for T = 3000. The results, displayed in the upper three panels of Figure 10.3, show a similar pattern as the original process given in Figure 10.2. For the lower two panels we generate two variance processes from the same residuals $\xi_t$. In this case, however, we set off-diagonal parameters in $A_{11}$ and $G_{11}$ to zero to illustrate how the unrestricted BEKK model incorporates cross equation dynamics. As can be seen, both approaches are convenient to capture volatility clustering. Depending on the particular state of the system, spillover effects operating through conditional covariances, however, have a considerable impact on the magnitude of conditional volatility.
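The simulation scheme just described might be sketched like this. It is not the code of XFGmvol03.xpl: the function name `simulate_bekk`, the starting value `sigma0`, the seed, and the use of a Cholesky factor as the square root $\Sigma_t^{1/2}$ in (10.1) are all assumptions of this illustration.

```python
import numpy as np

def simulate_bekk(C0, A, G, sigma0, xi):
    """Simulate eps_t = Sigma_t^{1/2} xi_t with Sigma_t updated by (10.7).

    xi is a T x N array of standard normal residuals, so that the same
    residuals can be reused for different parameter settings.
    """
    T, N = xi.shape
    CC = C0.T @ C0
    sigma = np.array(sigma0, dtype=float)
    eps = np.empty((T, N))
    sigmas = np.empty((T, N, N))
    for t in range(T):
        sigmas[t] = sigma
        eps[t] = np.linalg.cholesky(sigma) @ xi[t]  # draw with covariance Sigma_t
        e = eps[t].reshape(N, 1)
        sigma = CC + A.T @ (e @ e.T) @ A + G.T @ sigma @ G  # gives Sigma_{t+1}
    return eps, sigmas

# rounded estimates from (10.8)
C0 = 1e-3 * np.array([[1.15, 0.31], [0.0, 0.76]])
A11 = np.array([[0.282, -0.050], [-0.057, 0.293]])
G11 = np.array([[0.939, 0.028], [0.025, 0.939]])
sigma0 = 5e-5 * np.eye(2)  # hypothetical starting value
xi = np.random.default_rng(1).standard_normal((3000, 2))

# unrestricted BEKK versus the version with off-diagonal parameters set to zero
eps_biv, sig_biv = simulate_bekk(C0, A11, G11, sigma0, xi)
eps_uni, sig_uni = simulate_bekk(C0, np.diag(np.diag(A11)), np.diag(np.diag(G11)), sigma0, xi)
print(sig_biv[:, 0, 0].mean(), sig_uni[:, 0, 0].mean())
```

Passing the residuals `xi` explicitly makes both runs use the same shocks, so differences between `sig_biv` and `sig_uni` come only from the cross equation parameters.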

10.3 Forecasting exchange rate densities

The preceding section illustrated how the GARCH model may be employed effectively to describe empirical price variations of foreign exchange rates. For practical purposes, as for instance scenario analysis, VaR estimation (Chapter 1), option pricing (Chapter 16), one is often interested in the future joint density of a set of asset prices. Continuing the comparison of the univariate and bivariate approach to model volatility dynamics of exchange rates it is thus natural to investigate the properties of these specifications in terms of forecasting performance.

We implement an iterative forecasting scheme along the following lines: Given the estimated univariate and bivariate volatility models and the corresponding information sets $\mathcal{F}_{t-1}$, $t = 1, \ldots, T-5$ (Figure 10.2), we employ the identified data generating processes to simulate one-week-ahead forecasts of both exchange rates. To get a reliable estimate of the future density we set the number of simulations to 50000 for each initial scenario. This procedure yields two bivariate samples of future exchange rates, one simulated under bivariate, the other one simulated under univariate GARCH assumptions.

A review of the current state of evaluating competing density forecasts is offered by Tay and Wallis (2000). Adopting a Bayesian perspective, the common approach is to compare the expected loss of actions evaluated under alternative density forecasts. In our pure time series framework, however, a particular action is hardly available for forecast density comparisons. Alternatively one could concentrate on statistics directly derived from the simulated densities, such as first and second order moments or even quantiles. Due to the multivariate nature of the time series under consideration it is a nontrivial issue to rank alternative density forecasts in terms of these statistics. Therefore, we regard a particular volatility model to be superior to another if it provides a higher simulated density estimate of the actual bivariate future exchange rate. This is accomplished by evaluating both densities, obtained from a bivariate kernel estimation, at the actually realized exchange rate. Since the latter comparison might suffer from different unconditional variances under univariate and multivariate volatility, the two simulated densities were rescaled to have identical variance. Performing the latter forecasting exercises iteratively over 3714 time points we can test if the bivariate volatility model outperforms the univariate one.

Time window J     Success ratio SR_J
1980-1981         0.744
1982-1983         0.757
1984-1985         0.793
1986-1987         0.788
1988-1989         0.806
1990-1991         0.807
1992-1994/4       0.856

Table 10.1. Time varying frequencies of the bivariate GARCH model outperforming the univariate one in terms of one-week-ahead forecasts (success ratio)

To formalize the latter ideas we define a success ratio $SR_J$ as

$$SR_J = \frac{1}{|J|} \sum_{t \in J} \mathbf{1}\{\hat f_{biv}(R_{t+5}) > \hat f_{uni}(R_{t+5})\}, \tag{10.9}$$

where J denotes a time window containing |J| observations and $\mathbf{1}$ an indicator function. $\hat f_{biv}(R_{t+5})$ and $\hat f_{uni}(R_{t+5})$ are the estimated densities of future exchange rates, which are simulated by the bivariate and univariate GARCH processes, respectively, and which are evaluated at the actual exchange rate levels $R_{t+5}$. The simulations are performed in XFGmvol04.xpl.
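Once both density estimates have been evaluated at the realized exchange rates, the success ratio (10.9) is a simple relative frequency. The following sketch assumes that these density values are already available as two sequences; the function names and the four example values are made up for illustration and are not results from the chapter.

```python
def success_ratio(f_biv, f_uni):
    """Share of time points at which the bivariate density estimate
    exceeds the univariate one, as in (10.9)."""
    wins = [fb > fu for fb, fu in zip(f_biv, f_uni)]
    return sum(wins) / len(wins)

def rolling_success_ratio(f_biv, f_uni, width):
    """Success ratio over overlapping windows of `width` consecutive time points."""
    wins = [fb > fu for fb, fu in zip(f_biv, f_uni)]
    return [sum(wins[i:i + width]) / width for i in range(len(wins) - width + 1)]

# made-up density values at four realized observations
f_biv = [0.9, 0.4, 0.7, 0.8]
f_uni = [0.5, 0.6, 0.3, 0.2]
print(success_ratio(f_biv, f_uni))             # 0.75
print(rolling_success_ratio(f_biv, f_uni, 2))  # [0.5, 0.5, 1.0]
```

The rolling variant corresponds to evaluating $SR_J$ over overlapping windows J of fixed length.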

Our results show that the bivariate model indeed outperforms the univariate one when both likelihoods are compared under the actual realizations of the exchange rate process. In 81.6% of all cases across the sample period, $SR_J = 0.816$, $J = \{t : t = 1, \ldots, T-5\}$, the bivariate model provides a better forecast.

Figure 10.4. Estimated covariance process from the bivariate GARCH model ($10^4 \hat\sigma_{12}$, blue) and success ratio over overlapping time intervals with window length 80 days (red). Panel: Covariance and success ratio, plotted against time, 1980 to 1994.

This is highly significant. In Table 10.1 we show that the overall superiority of the bivariate volatility approach is confirmed when considering subsamples of two years' length. A priori one may expect the bivariate model to outperform the univariate one the larger (in absolute value) the covariance between both return processes is. To verify this argument we display in Figure 10.4 the empirical covariance estimates from Figure 10.2 jointly with the success ratio evaluated over overlapping time intervals of length |J| = 80.

As is apparent from Figure 10.4 there is a close co-movement between the success ratio and the general trend of the covariance process, which confirms our expectations: the forecasting power of the bivariate GARCH model is particularly strong in periods where the DEM/USD and GBP/USD exchange rate returns exhibit a high covariance. For completeness it is worthwhile to mention that similar results are obtained if the window width is varied over reasonable choices of |J| ranging from 40 to 150.

With respect to financial practice and research we take our results as strong support for a multivariate approach towards asset price modeling. Whenever contemporaneous correlation across markets matters, the system approach offers essential advantages. To name a few areas of interest, multivariate volatility models are supposed to yield useful insights for risk management, scenario analysis and option pricing.

Bibliography

Baba, Y., Engle, R.F., Kraft, D.F., and Kroner, K.F. (1990). Multivariate Simultaneous Generalized ARCH, mimeo, Department of Economics, University of California, San Diego.

Berndt, E.K., Hall, B.H., Hall, R.E., and Hausman, J.A. (1974). Estimation and Inference in Nonlinear Structural Models, Annals of Economic and Social Measurement 3/4: 653-665.

Bollerslev, T. (1986). Generalized Autoregressive Conditional Heteroscedasticity, Journal of Econometrics 31: 307-327.

Bollerslev, T. (1990). Modeling the Coherence in Short-Run Nominal Exchange Rates: A Multivariate Generalized ARCH Approach, Review of Economics and Statistics 72: 498-505.

Bollerslev, T. and Engle, R.F. (1993). Common Persistence in Conditional Variances, Econometrica 61: 167-186.

Bollerslev, T., Engle, R.F. and Nelson, D.B. (1994). GARCH Models, in: Engle, R.F., and McFadden, D.L. (eds.) Handbook of Econometrics, Vol. 4, Elsevier, Amsterdam, 2961-3038.

Bollerslev, T., Engle, R.F. and Wooldridge, J.M. (1988). A Capital Asset Pricing Model with Time-Varying Covariances, Journal of Political Economy 96: 116-131.

Bollerslev, T. and Wooldridge, J.M. (1992). Quasi-Maximum Likelihood Estimation and Inference in Dynamic Models with Time-Varying Covariances, Econometric Reviews 11: 143-172.

Cecchetti, S.G., Cumby, R.E. and Figlewski, S. (1988). Estimation of the Optimal Futures Hedge, Review of Economics and Statistics 70: 623-630.

Comte, F. and Lieberman, O. (2000). Asymptotic Theory for Multivariate GARCH Processes, Manuscript, Universities Paris 6 and Paris 7.

Engle, R.F. (1982). Autoregressive Conditional Heteroscedasticity with Estimates of the Variance of UK Inflation, Econometrica 50: 987-1008.

Engle, R.F., Ito, T. and Lin, W.L. (1990). Meteor Showers or Heat Waves? Heteroskedastic Intra-Daily Volatility in the Foreign Exchange Market, Econometrica 58: 525-542.

Engle, R.F. and Kroner, K.F. (1995). Multivariate Simultaneous Generalized ARCH, Econometric Theory 11: 122-150.

Hafner, C.M. and Herwartz, H. (1998). Structural Analysis of Portfolio Risk using Beta Impulse Response Functions, Statistica Neerlandica 52: 336-355.

Hamao, Y., Masulis, R.W. and Ng, V.K. (1990). Correlations in Price Changes and Volatility across International Stock Markets, Review of Financial Studies 3: 281-307.

Jeantheau, T. (1998). Strong Consistency of Estimators for Multivariate ARCH Models, Econometric Theory 14: 70-86.

Lütkepohl, H. (1996). Handbook of Matrices, Wiley, Chichester.

Nelson, D.B. (1991). Conditional Heteroskedasticity in Asset Returns: A New Approach, Econometrica 59: 347-370.

Tay, A. and Wallis, K. (2000). Density forecasting: A Survey, Journal of Forecasting 19: 235-254.

11 Statistical Process Control

Sven Knoth

Statistical Process Control (SPC) is the misleading title of the area of statistics which is concerned with the statistical monitoring of sequentially observed data. Together with the theory of sampling plans, capability analysis and similar topics it forms the field of Statistical Quality Control. SPC started in the 1930s with the pioneering work of Shewhart (1931). Then, SPC became very popular with the introduction of new quality policies in the industries of Japan and of the USA. Nowadays, SPC methods are considered not only in industrial statistics. In finance, medicine, environmental statistics, and in other fields of application practitioners and statisticians use and investigate SPC methods.

An SPC scheme (in industry mostly called a control chart) is a sequential scheme for detecting the so-called change point in the sequence of observed data. Here, we consider the simplest case. All observations $X_1, X_2, \ldots$ are independent and normally distributed with known variance $\sigma^2$. Up to an unknown time point $m-1$ the expectation of the $X_i$ is equal to $\mu_0$; starting with the change point $m$ the expectation is switched to $\mu_1 \neq \mu_0$. While both expectation values are known, the change point m is unknown. Now, based on the sequentially observed data the SPC scheme has to detect whether a change occurred.

SPC schemes can be described by a stopping time L (known as the run length) which is adapted to the sequence of sigma algebras $\mathcal{F}_n = \mathcal{F}(X_1, X_2, \ldots, X_n)$. The performance or power of these schemes is usually measured by the Average Run Length (ARL), the expectation of L. The ARL denotes the average number of observations until the SPC scheme signals. We distinguish false alarms (the scheme signals before m, i.e. before the change actually took place) and right ones. A suitable scheme provides large ARLs for $m = \infty$ and small ARLs for $m = 1$. In case of $1 < m < \infty$ one has to consider further performance measures. In the case of the oldest schemes, the Shewhart charts, the typical inference characteristics like the error probabilities were used first.

The chapter is organized as follows. In Section 11.1 the charts under consideration are introduced and their graphical representation is demonstrated. In Section 11.2 the most popular chart characteristics are described. First, characteristics such as the ARL and the Average Delay (AD) are defined. These performance measures are used for the setup of the applied SPC scheme. Then, the three subsections of Section 11.2 are concerned with the usage of the SPC routines for determination of the ARL, the AD, and the probability mass function (PMF) of the run length. In Section 11.3 some results of two papers are reproduced with the corresponding XploRe quantlets.

11.1 Control Charts

Recall that the data $X_1, X_2, \ldots$ follow the change point model

$$\begin{cases} X_t \sim N(\mu_0, \sigma^2), & t = 1, 2, \ldots, m-1, \\ X_t \sim N(\mu_1 \neq \mu_0, \sigma^2), & t = m, m+1, \ldots \end{cases} \tag{11.1}$$

The observations are independent and the time point m is unknown. The control chart (the SPC scheme) corresponds to a stopping time L. Here we consider three different schemes: the Shewhart chart, EWMA and CUSUM schemes. There are one- and two-sided versions. The related stopping times in the one-sided upper versions are:

1. The Shewhart chart introduced by Shewhart (1931)

$$L^{Shewhart} = \inf\left\{ t \in \mathbb{N} : Z_t = \frac{X_t - \mu_0}{\sigma} > c_1 \right\} \tag{11.2}$$

with the design parameter $c_1$ called critical value.

2. The EWMA scheme (exponentially weighted moving average) initially presented by Roberts (1959)

$$L^{EWMA} = \inf\left\{ t \in \mathbb{N} : Z_t^{EWMA} > c_2 \sqrt{\lambda/(2-\lambda)} \right\}, \tag{11.3}$$

$$Z_0^{EWMA} = z_0 = 0, \qquad Z_t^{EWMA} = (1-\lambda)\, Z_{t-1}^{EWMA} + \lambda\, \frac{X_t - \mu_0}{\sigma}, \quad t = 1, 2, \ldots \tag{11.4}$$

with the smoothing value $\lambda$ and the critical value $c_2$. The smaller $\lambda$ the faster EWMA detects small $\mu_1 - \mu_0 > 0$.

3. The CUSUM scheme (cumulative sum) introduced by Page (1954)

$$L^{CUSUM} = \inf\left\{ t \in \mathbb{N} : Z_t^{CUSUM} > c_3 \right\}, \tag{11.5}$$

$$Z_0^{CUSUM} = z_0 = 0, \qquad Z_t^{CUSUM} = \max\left\{ 0,\; Z_{t-1}^{CUSUM} + \frac{X_t - \mu_0}{\sigma} - k \right\}, \quad t = 1, 2, \ldots \tag{11.6}$$

with the reference value k and the critical value $c_3$ (known as decision interval). For fastest detection of $\mu_1 - \mu_0$, with the normalized data used in (11.6), CUSUM has to be set up with $k = (\mu_1 - \mu_0)/(2\sigma)$.
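The three one-sided upper stopping rules (11.2) to (11.6) can be sketched in plain Python as follows. The function names, the convention of returning None when no signal occurs within the available data, and the example sequence are assumptions of this illustration; the input `z` is taken to hold the normalized observations $(X_t - \mu_0)/\sigma$.

```python
import math

def shewhart_run_length(z, c1):
    """First t (1-based) with z_t > c1, see (11.2); None if the chart never signals."""
    for t, zt in enumerate(z, start=1):
        if zt > c1:
            return t
    return None

def ewma_run_length(z, lam, c2, z0=0.0):
    """First t with the EWMA statistic above c2 * sqrt(lam / (2 - lam)), see (11.3)-(11.4)."""
    limit = c2 * math.sqrt(lam / (2.0 - lam))
    stat = z0
    for t, zt in enumerate(z, start=1):
        stat = (1.0 - lam) * stat + lam * zt
        if stat > limit:
            return t
    return None

def cusum_run_length(z, k, c3, z0=0.0):
    """First t with the CUSUM statistic above c3, see (11.5)-(11.6)."""
    stat = z0
    for t, zt in enumerate(z, start=1):
        stat = max(0.0, stat + zt - k)
        if stat > c3:
            return t
    return None

# made-up normalized observations with an upward drift
z = [0.2, -0.1, 0.5, 1.5, 2.0, 2.5, 3.5]
print(ewma_run_length(z, lam=0.5, c2=2.0),
      cusum_run_length(z, k=0.5, c3=3.0),
      shewhart_run_length(z, c1=3.0))  # 5 6 7
```

In this example the two schemes with memory signal earlier than the Shewhart chart, which reacts only once a single observation exceeds its critical value.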

The above notation uses normalized data. Thus, it is not important whether $X_t$ is a single observation or a sample statistic such as the empirical mean.

Note that for using one-sided lower schemes one has to apply the upper schemes to the data multiplied by -1. A slight modification of one-sided Shewhart and EWMA charts leads to their two-sided versions. One has to replace in the comparison of chart statistic and threshold the original statistic $Z_t$ and $Z_t^{EWMA}$ by their absolute value. The two-sided versions of these schemes