

Table 3.7 Granger causality tests

Inflation does not Granger cause z2t
Variable    F-Statistic    Significance
πt          5.8

z2t does not Granger cause inflation
Variable    F-Statistic    Significance
z2t         6.12

Means rejection at 5% level.
Modelling the Term Structure of Interest Rates 125

Figure 3.17 Impulse-response functions and two standard error bands

Overall, considering the results for the factor loadings and the relationship between
the latent factors and the economic variables, the short-term interest rates are mostly
driven by the real interest rates. In parallel, inflation expectations exert the most
important influence on the long-term rates, as is usually assumed. Similar results
concerning the information content of the German term structure, particularly at longer
maturities, regarding future changes in the inflation rate were obtained in several previous
papers, namely Schich (1996), Gerlach (1995) and Mishkin (1991), using different samples and
testing procedures.⁵³

The Mishkin (1991) and Jorion and Mishkin (1991) results on Germany are contradictory as, according
to Mishkin (1991), the short end of the term structure does not contain information on future inflation for all
OECD countries studied, except for France, the UK and Germany. Conversely, Jorion and Mishkin (1991)
conclude that the predictive power of the shorter rates about future inflation is low in the USA, Germany and
Fama (1990) and Mishkin (1990a,b) present identical conclusions concerning the information content of
US term structure regarding future inflation and state that the US dollar short rates have information content
regarding future real interest rates and the longer rates contain information on inflation expectations. Mishkin
(1990b) also concludes that for several countries the information on inflation expectations is weaker than for
the United States. Mehra (1997) presents evidence of a cointegration relation between the nominal yield on a
10-year Treasury bond and the actual US inflation rate. Koedijk and Kool (1995), Mishkin (1991), and Jorion
and Mishkin (1991) supply some evidence on the information content of the term structure concerning inflation
rate in several countries.
126 Applied Quantitative Methods for Trading and Investment

Fleming and Remolona (1998) also found, using high frequency data, that macroeconomic
announcements of the CPI and the PPI affect mostly the long end of the term structure of
interest rates. Nevertheless, it is also important to bear in mind that, according
to Schich (1999), the information content of the term structure on future inflation is
time-varying and depends on the country considered.
The relationship between the factors and the variables referred to is consistent with the
time-series properties of the factors and those variables. Actually, as observed in Clarida
et al. (1998), the I(1) hypothesis is rejected in Dickey–Fuller tests for the German inflation
rate and short-term interest rate, while, as previously stated, the factors are stationary.

The identification of the factors that determine the time-series and cross-section behaviour
of the term structure of interest rates is one of the most challenging research topics in
finance. In this chapter, it was shown that a two-factor constant volatility model describes
quite well the dynamics and shape of the German yield curve between 1986 and 1998.
The data supports the expectations theory with constant term premiums and thus the
term premium structure can be calculated and short-term interest rate expectations derived
from the adjusted forward rate curve. The estimates obtained for the term premium curve
are not inconsistent with the figures usually conjectured. Nevertheless, poorer results are
obtained if the second factor is directly linked to the inflation rate, given that restrictions
on the behaviour of that factor are imposed, generating less plausible shapes and figures
for the term premium curve.
We identified within the sample two periods of poorer model performance, both related
to world-wide gyrations in bond markets (Spring 1994 and 1998), which were characterised
by sharp changes in long-term interest rates while short-term rates remained
As to the evolution of bond yields in Germany during 1998, it seems that it is more
the (low) level of inflation expectations as compared to the level of real interest rates that
underlies the dynamics of the yield curve during that year. However, there still remains
substantial volatility in long bond yields to be explained. This could be related to the
spillover effects of international bond market developments on the German bond market
in the aftermath of the Russian and Asian crises.
It was also shown that one of those factors seems to be related to the ex-ante real
interest rate, while a second factor is linked to inflation expectations. This conclusion is
much in accordance with the empirical literature on the subject and is a relevant result
for modelling the yield curve using information on macroeconomic variables.
Therefore, modelling the yield curve behaviour, namely for VaR purposes, seems to be
reasonably approached by simulations of (ex-post) real interest rates and lagged inflation
rate. In addition, the results obtained suggest that a central bank has a decisive role
concerning the bond market moves, given that it influences both the short and the long
ends of the yield curve, respectively by influencing the real interest rate and the inflation
expectations. Accordingly, the second factor may also be used as an indicator of monetary
policy credibility.

Aït-Sahalia, Y. (1996), “Testing Continuous-Time Models of the Spot Interest Rate”, Review of
Financial Studies, 9, 427–470.
Babbs, S. H. and K. B. Nowman (1998), “An Application of Generalized Vasicek Term Structure
Models to the UK Gilt-edged Market: a Kalman Filtering analysis”, Applied Financial Economics,
8, 637–644.
Backus, D., S. Foresi, A. Mozumdar and L. Wu (1997), “Predictable Changes in Yields and
Forward Rates”, mimeo.
Backus, D., S. Foresi and C. Telmer (1998), “Discrete-Time Models of Bond Pricing”, NBER
working paper no. 6736.
Balduzzi, P., S. R. Das, S. Foresi and R. Sundaram (1996), “A Simple Approach to Three Factor
Affine Term Structure Models”, Journal of Fixed Income, 6 December, 43–53.
Bliss, R. (1997), “Movements in the Term Structure of Interest Rates”, Federal Reserve Bank of
Atlanta, Economic Review, Fourth Quarter.
Bolder, D. J. (2001), “Affine Term-Structure Models: Theory and Implementation”, Bank of Canada,
working paper 2001-15.
Breeden, D. T. (1979), “An Intertemporal Asset Pricing Model with Stochastic Consumption and
Investment Opportunities”, Journal of Financial Economics, 7, 265–296.
Buhler, W., M. Uhrig-Homburg, U. Walter and T. Weber (1999), “An Empirical Comparison of
Forward-Rate and Spot-Rate Models for Valuing Interest-Rate Options”, The Journal of Finance,
LIV, 1, February.
Campbell, J. Y. (1995), “Some Lessons from the Yield Curve”, Journal of Economic Perspectives,
9, 3, 129–152.
Campbell, J. Y., A. W. Lo and A. C. MacKinlay (1997), The Econometrics of Financial Markets,
Princeton University Press, Princeton, NJ.
Cassola, N. and J. B. Luís (1996), “The Term Structure of Interest Rates: a Comparison of
Alternative Estimation Methods with an Application to Portugal”, Banco de Portugal, working paper
no. 17/96, October 1996.
Cassola, N. and J. B. Luís (2003), “A Two-Factor Model of the German Term Structure of Interest
Rates”, Applied Financial Economics, forthcoming.
Chen, R. and L. Scott (1993a), “Maximum Likelihood Estimations for a Multi-Factor Equilibrium
Model of the Term Structure of Interest Rates”, Journal of Fixed Income, 3, 14–31.
Chen, R. and L. Scott (1993b), “Multi-Factor Cox–Ingersoll–Ross Models of the Term Structure:
Estimates and Test from a Kalman Filter”, working paper, University of Georgia.
Clarida, R. and M. Gertler (1997), “How the Bundesbank Conducts Monetary Policy”, in Romer,
C. D. and D. H. Romer (eds), Reducing Inflation: Motivation and Strategy, NBER Studies in
Business Cycles, Vol. 30.
Clarida, R., J. Gali and M. Gertler (1998), “Monetary Policy Rules in Practice – Some International
Evidence”, European Economic Review, 42, 1033–1067.
Cox, J., J. Ingersoll and S. Ross (1985a), “A Theory of the Term Structure of Interest Rates”,
Econometrica, 53, 385–407.
Cox, J., J. Ingersoll and S. Ross (1985b), “An Intertemporal General Equilibrium Model of Asset
Prices”, Econometrica, 53, 363–384.
Deacon, M. and A. Derry (1994), “Deriving Estimates of Inflation Expectations from the Prices of
UK Government Bonds”, Bank of England, working paper 23.
De Jong, F. (1997), “Time-Series and Cross-section Information in Affine Term Structure Models”,
Center for Economic Research.
Doan, T. A. (1995), RATS 4.0 User’s Manual, Estima, Evanston, IL.
Duffie, D. and R. Kan (1996), “A Yield Factor Model of Interest Rates”, Mathematical Finance,
6, 379–406.
Fama, E. F. (1975), “Short Term Interest Rates as Predictors of Inflation”, American Economic
Review, 65, 269–282.
Fama, E. F. (1990), “Term Structure Forecasts of Interest Rates, Inflation and Real Returns”,
Journal of Monetary Economics, 25, 59–76.
Fleming, M. J. and E. M. Remolona (1998), “The Term Structure of Announcement Effects”,
Fung, B. S. C., S. Mitnick and E. Remolona (1999), “Uncovering Inflation Expectations and Risk
Premiums from Internationally Integrated Financial Markets”, Bank of Canada, working paper
Gerlach, S. (1995), “The Information Content of the Term Structure: Evidence for Germany”, BIS
working paper no. 29, September.
Geyer, A. L. J. and S. Pichler (1996), “A State-Space Approach to Estimate and Test Multifactor
Cox–Ingersoll–Ross Models of the Term Structure”, mimeo.
Gong, F. F. and E. M. Remolona (1997a), “A Three-factor Econometric Model of the US Term
Structure”, FRBNY Staff Reports, 19, January.
Gong, F. F. and E. M. Remolona (1997b), “Inflation Risk in the U.S. Yield Curve: The Usefulness
of Indexed Bonds”, Federal Reserve Bank, New York, June.
Gong, F. F. and E. M. Remolona (1997c), “Two Factors Along the Yield Curve”, The Manchester
School Supplement, pp. 1–31.
Hamilton, J. D. (1994), Time Series Analysis, Princeton University Press, Princeton, NJ.
Jorion, P. and F. Mishkin (1991), “A Multicountry Comparison of Term-structure Forecasts at Long
Horizons”, Journal of Financial Economics, 29, 59–80.
Koedijk, K. G. and C. J. M. Kool (1995), “Future Inflation and the Information in International
Term Structures”, Empirical Economics, 20, 217–242.
Litterman, R. and J. Scheinkman (1991), “Common Factors Affecting Bond Returns”, Journal of
Fixed Income, 1, June, 49–53.
Luís, J. B. (2001), Essays on Extracting Information from Financial Asset Prices, PhD thesis,
University of York.
Mehra, Y. P. (1997), “The Bond Rate and Actual Future Inflation”, Federal Reserve Bank of
Richmond, working paper 97-3, March.
Mishkin, F. (1990a), “What Does the Term Structure Tell Us About Future Inflation”, Journal of
Monetary Economics, 25, 77–95.
Mishkin, F. (1990b), “The Information in the Longer Maturity Term Structure About Future
Inflation”, Quarterly Journal of Economics, 55, 815–828.
Mishkin, F. (1991), “A Multi-country Study of the Information in the Shorter Maturity Term
Structure About Future Inflation”, Journal of International Money and Finance, 10, 2–22.
Nelson, C. R. and A. F. Siegel (1987), “Parsimonious Modelling of Yield Curves”, Journal of
Business, 60, 4.
Remolona, E., M. R. Wickens and F. F. Gong (1998), “What was the Market’s View of U.K.
Monetary Policy? Estimating Inflation Risk and Expected Inflation with Indexed Bonds”, FRBNY
Staff Reports, 57, December.
Ross, S. A. (1976), “The Arbitrage Theory of Capital Asset Pricing”, Journal of Economic Theory,
13, 341–360.
Schich, S. T. (1996), “Alternative Specifications of the German Term Structure and its Information
Content Regarding Inflation”, Deutsche Bundesbank, D.P. 8/96.
Schich, S. T. (1999), “What the Yield Curves Say About Inflation: Does It Change Over Time?”,
OECD Economic Department Working Papers, No. 227.
Svensson, L. E. O. (1994), “Estimating and Interpreting Forward Interest Rates: Sweden 1992–4”,
CEPR Discussion Paper Series, No. 1051.
Vasicek, O. (1977), “An Equilibrium Characterisation of the Term Structure”, Journal of Financial
Economics, 5, 177–188.
Zin, S. (1997), “Discussion of Evans and Marshall”, Carnegie-Rochester Conference on Public
Policy, November.
Forecasting and Trading Currency Volatility:
An Application of Recurrent Neural
Regression and Model Combination


In this chapter, we examine the use of nonparametric Neural Network Regression (NNR)
and Recurrent Neural Network (RNN) regression models for forecasting and trading
currency volatility, with an application to the GBP/USD and USD/JPY exchange rates. The
results of both the NNR and RNN models are benchmarked against the simpler GARCH
alternative and implied volatility. Two simple model combinations are also analysed.
The intuitively appealing idea of developing a nonlinear nonparametric approach to
forecast FX volatility, identify mispriced options and subsequently develop a trading
strategy based upon this process is implemented for the first time on a comprehensive
basis. Using daily data from December 1993 through April 1999, we develop alternative
FX volatility forecasting models. These models are then tested out-of-sample over the
period April 1999–May 2000, not only in terms of forecasting accuracy, but also in terms
of trading efficiency. In order to do so, we apply a realistic volatility trading strategy using
FX option straddles once mispriced options have been identified.
Allowing for transaction costs, most trading strategies retained produce positive returns.
RNN models appear as the best single modelling approach yet, somewhat surprisingly,
a model combination which has the best overall performance in terms of forecasting
accuracy fails to improve the RNN-based volatility trading results.
Another conclusion from our results is that, for the period and currencies considered,
the currency option market was inefficient and/or the pricing formulae applied by market
participants were inadequate.

Exchange rate volatility has been a constant feature of the International Monetary System
ever since the breakdown of the Bretton Woods system of fixed parities in 1971–73. Not
surprisingly, in the wake of the growing use of derivatives in other financial markets, and

This chapter previously appeared under the same title in the Journal of Forecasting, 21, 317–354 (2002).
© John Wiley & Sons, Ltd. Reproduced with permission.

Applied Quantitative Methods for Trading and Investment. Edited by C.L. Dunis, J. Laws and P. Naïm
© 2003 John Wiley & Sons, Ltd ISBN: 0-470-84885-5

following the extension of the seminal work of Black–Scholes (1973) to foreign exchange
by Garman–Kohlhagen (1983), currency options have become an ever more popular way
to hedge foreign exchange exposures and/or speculate in the currency markets.
In the context of this wide use of currency options by market participants, having the
best volatility prediction has become ever more crucial. Indeed, the only unknown variable
in the Garman–Kohlhagen pricing formula is precisely the future foreign exchange rate
volatility during the life of the option. With an “accurate” volatility estimate and knowing
the other variables (strike level, current level of the exchange rate, interest rates on both
currencies and maturity of the option), it is possible to derive the theoretical arbitrage-free
price of the option. Precisely because there will never be such a thing as unanimous agreement
on the future volatility estimate, market participants with a better view/forecast of the
evolution of volatility will have an edge over their competitors.
In a rational market, the equilibrium price of an option will be affected by changes
in volatility. The higher the volatility perceived by market participants, the higher the
option’s price. Higher volatility implies a greater possible dispersion of the foreign
exchange rate at expiry: all other things being equal, the option holder has logically an
asset with a greater chance of a more profitable exercise. In practice, those investors/market
participants who can reliably predict volatility should be able to control better the financial
risks associated with their option positions and, at the same time, profit from their
superior forecasting ability.
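The dependence of the option price on volatility described above can be checked numerically. The following is a minimal sketch of the standard Garman–Kohlhagen formula for European currency options, not code from the chapter: the function names, the use of continuously compounded rates and the input values are all assumptions made for illustration.

```python
import math

def norm_cdf(x: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def garman_kohlhagen(spot, strike, r_dom, r_for, vol, maturity, call=True):
    """European FX option price in domestic currency per unit of foreign currency.

    spot, strike : domestic units per one unit of foreign currency
    r_dom, r_for : continuously compounded domestic and foreign interest rates
    vol          : annualised volatility (0.10 means 10%)
    maturity     : time to expiry in years
    """
    sd = vol * math.sqrt(maturity)
    d1 = (math.log(spot / strike) + (r_dom - r_for + 0.5 * vol ** 2) * maturity) / sd
    d2 = d1 - sd
    df_dom = math.exp(-r_dom * maturity)  # domestic discount factor
    df_for = math.exp(-r_for * maturity)  # foreign discount factor
    if call:
        return spot * df_for * norm_cdf(d1) - strike * df_dom * norm_cdf(d2)
    return strike * df_dom * norm_cdf(-d2) - spot * df_for * norm_cdf(-d1)

# Hypothetical at-the-money 1-month option priced at two volatility levels
low = garman_kohlhagen(1.60, 1.60, 0.05, 0.06, 0.08, 21 / 252)
high = garman_kohlhagen(1.60, 1.60, 0.05, 0.06, 0.12, 21 / 252)
print(low < high)  # the higher volatility input gives the higher premium
```

With every other input held fixed, the premium rises with the volatility input, which is why a better volatility forecast translates directly into a view on whether a quoted option is cheap or expensive.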
There is a wealth of articles on predicting volatility in the foreign exchange market:
for instance, Baillie and Bollerslev (1990) used ARIMA and GARCH models to describe
the volatility on hourly data; West and Cho (1995) analysed the predictive ability of
GARCH, AR and nonparametric models on weekly data; Jorion (1995) examined the
predictive power of implied standard deviation as a volatility forecasting tool with daily
data; and Dunis et al. (2001b) measured, using daily data, both the 1-month and 3-month
forecasting ability of 13 different volatility models including AR, GARCH, stochastic
variance and model combinations, with and without implied volatility added as an
extra explanatory variable.
Nevertheless, with the exception of Engle et al. (1993), Dunis and Gavridis (1997)
and, more recently, Laws and Gidman (2000), these papers evaluate the out-of-sample
forecasting performance of their models using traditional statistical accuracy criteria, such
as root mean squared error, mean absolute error, mean absolute percentage error,
Theil-U statistic and correct directional change prediction. Investors and market participants
however have trading performance as their ultimate goal and will select a forecasting
model based on financial criteria rather than on some statistical criterion such as root
mean squared error minimisation. Yet, as mentioned above, seldom has recently published
research applied any financial utility criterion in assessing the out-of-sample performance
of volatility models.
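The statistical accuracy criteria listed above can be written down compactly. The sketch below is illustrative only: the function names are hypothetical, Theil's U is given in its bounded form (several definitions are in use, and the one applied in the chapter is not stated here), and the directional-change measure assumes that the forecast change is taken relative to the last observed value.

```python
import math

def rmse(actual, forecast):
    """Root mean squared error."""
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual))

def mae(actual, forecast):
    """Mean absolute error."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)

def mape(actual, forecast):
    """Mean absolute percentage error, in percent (actual values must be nonzero)."""
    return 100.0 * sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / len(actual)

def theil_u(actual, forecast):
    """Theil's U in its bounded [0, 1] form; 0 indicates a perfect forecast."""
    n = len(actual)
    scale = (math.sqrt(sum(a * a for a in actual) / n)
             + math.sqrt(sum(f * f for f in forecast) / n))
    return rmse(actual, forecast) / scale

def correct_directional_change(actual, forecast):
    """Percentage of periods in which the forecast change has the sign of the actual change."""
    hits = 0
    for t in range(1, len(actual)):
        actual_change = actual[t] - actual[t - 1]
        forecast_change = forecast[t] - actual[t - 1]
        if actual_change * forecast_change > 0:
            hits += 1
    return 100.0 * hits / (len(actual) - 1)
```

None of these measures says anything about the profit or loss of a strategy built on the forecasts, which is the gap that a trading-based evaluation is meant to fill.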
Over the past few years, Neural Network Regression (NNR) has been widely advocated
as a new alternative modelling technology to more traditional econometric and statistical
approaches, claiming increasing success in the fields of economic and financial forecasting.
This has resulted in many publications comparing neural networks and traditional
forecasting approaches. In the case of foreign exchange markets, it is worth pointing out
that most of the published research has focused on exchange rate forecasting rather than
on currency volatility forecasts. However, financial criteria, such as Sharpe ratio,
profitability, return on equity, maximum drawdown, etc., have been widely used to measure
Forecasting and Trading Currency Volatility 131

and quantify the out-of-sample forecasting performance. Dunis (1996) investigated the
application of NNR to intraday foreign exchange forecasting and his results were evaluated
by means of a trading strategy. Kuan and Liu (1995) proposed two-step Recurrent
Neural Network (RNN) models to forecast exchange rates and their results were evaluated
using traditional statistical accuracy criteria. Tenti (1996) applied RNNs to predict the
USD/DEM exchange rate, devising a trading strategy to assess his results, while Franses
and Van Homelen (1998) used NNR models to predict four daily exchange rate returns
relative to the Dutch guilder, using directional accuracy to assess out-of-sample forecasting
accuracy. Overall, however, it seems that neural network research applied to exchange
rates has so far seldom been devoted to FX volatility forecasting.
Accordingly, the rationale for this chapter is to investigate the predictive power of
alternative nonparametric forecasting models of foreign exchange volatility, both from
a statistical and an economic point of view. We examine the use of NNR and RNN
regression models for forecasting and trading currency volatility, with an application to
the GBP/USD and USD/JPY exchange rates. The results of the NNR and RNN models are
benchmarked against the simpler GARCH(1,1) alternative, implied volatility and model
combinations: in terms of model combination, a simple average combination and the
Granger–Ramanathan (1984) optimal weighting regression-based approach are employed
and their results investigated.
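The two combination schemes named above can be sketched as follows. This is an illustration under stated assumptions rather than the chapter's implementation: the regression variant shown is an unconstrained ordinary least squares fit of realised values on a constant and the individual forecasts, the exact specification retained in the chapter is not given in this passage, and all function names are hypothetical.

```python
def solve_linear(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def simple_average(forecasts):
    """Equal-weight combination; forecasts is a list of equally long series."""
    return [sum(vals) / len(vals) for vals in zip(*forecasts)]

def regression_weights(actual, forecasts):
    """OLS of actual on a constant and the individual forecasts (in-sample).

    Returns [intercept, w_1, ..., w_k].
    """
    rows = [[1.0] + list(vals) for vals in zip(*forecasts)]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, actual)) for i in range(k)]
    return solve_linear(xtx, xty)

def combine(weights, forecasts):
    """Apply estimated weights to (possibly out-of-sample) individual forecasts."""
    return [weights[0] + sum(w * f for w, f in zip(weights[1:], vals))
            for vals in zip(*forecasts)]
```

In use, the weights would be estimated on the in-sample period and then applied unchanged to the out-of-sample forecasts, so that the combination uses no information unavailable at forecast time.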
Using daily data from December 1993 through April 1999, we develop alternative FX
volatility forecasting models. These models are then tested out-of-sample over the period
April 1999–May 2000, not only in terms of forecasting accuracy, but also in terms of
trading efficiency. In order to do so, we apply a realistic volatility trading strategy using
FX option straddles once mispriced options have been identified.
Allowing for transaction costs, most trading strategies retained produce positive returns.
RNN models appear as the best single modelling approach but model combinations,
despite their superior performance in terms of forecasting accuracy, fail to produce
superior trading strategies.
Another conclusion from our results is that, for the period and currencies considered,
the currency option market was inefficient and/or the pricing formulae applied by market
participants were inadequate.
Overall, we depart from existing work in several respects.
Firstly, we develop alternative nonparametric FX volatility models, applying in particular
an RNN architecture with a loop back from the output layer implying an error
feedback mechanism, i.e. we apply a nonlinear error-correction modelling approach to
FX volatility.
Secondly, we apply our nonparametric models to FX volatility, something that has not
been done so far. A recent development in the literature has been the application of
nonparametric time series modelling approaches to volatility forecasts. Gaussian kernel
regression is an example, as in West and Cho (1995). Neural networks have also been
found useful in modelling the properties of nonlinear time series. As mentioned above,
although there are quite a few articles on applications of NNR models to foreign exchange, stock
and commodity markets,¹ there are rather few concerning financial markets volatility

¹ For NNR applications to commodity forecasting, see, for instance, Ntungo and Boyd (1998) and Trippi and
Turban (1993). For applications to the stock market, see, amongst others, Deboeck (1994) and Leung et al.

forecasting in general.² It seems therefore that, as an alternative technique to more
traditional statistical forecasting methods, NNR models need further investigation to check
whether or not they can add value in the field of foreign exchange volatility forecasting.
Finally, unlike previous work, we do not limit ourselves to forecasting accuracy but
extend the analysis to the all-important trading efficiency, taking advantage of the fact
that there exists a large and liquid FX implied volatility market that enables us to apply
sophisticated volatility trading strategies.
The chapter is organised as follows. Section 4.2 describes our exchange rate and
volatility data. Section 4.3 briefly presents the GARCH(1,1) model and gives the corresponding
21-day volatility forecasts. Section 4.4 provides a detailed overview and explains the
procedures and methods used in applying the NNR and RNN modelling procedure to our
financial time series, and it presents the 21-day volatility forecasts obtained with these
methods. Section 4.5 briefly describes the model combinations retained and assesses the
21-day out-of-sample forecasts using traditional statistical accuracy criteria. Section 4.6
introduces the volatility trading strategy using FX option straddles that we follow once
mispriced options have been identified through the use of our most successful volatility
forecasting models. We present detailed trading results allowing for transaction costs and
discuss their implications, particularly in terms of a qualified assessment of the efficiency
of the currency options market. Finally, Section 4.7 provides some concluding comments
and suggestions for further work.

The motivation for this research implies that the success or failure to develop profitable
volatility trading strategies clearly depends on the ability to generate accurate volatility
forecasts and thus to implement adequate volatility modelling procedures.
Numerous studies have documented the fact that logarithmic returns of exchange rate
time series exhibit “volatility clustering” properties, that is periods of large volatility
tend to cluster together followed by periods of relatively lower volatility (see, amongst
others, Baillie and Bollerslev (1990), Kroner et al. (1995) and Jorion (1997)). Volatility
forecasting crucially depends on identifying the typical characteristics of volatility within
the restricted sample period selected and then projecting them over the forecasting period.
We present in turn the two databanks we have used for this study and the modifications
to the original series we have made where appropriate.

4.2.1 The exchange rate series databank and historical volatility

The return series we use for the GBP/USD and USD/JPY exchange rates were extracted
from a historical exchange rate database provided by Datastream. Logarithmic returns,
defined as $\log(S_t / S_{t-1})$, are calculated for each exchange rate on a daily frequency basis.
We multiply these returns by 100, so that we end up with percentage changes in the
exchange rates considered, i.e. $s_t = 100 \log(S_t / S_{t-1})$.
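As a small illustration of this transformation, the sketch below computes percentage logarithmic returns from a list of daily closing rates. The function name and the sample quotes are hypothetical and are not taken from the Datastream series used in the chapter.

```python
import math

def pct_log_returns(prices):
    """Return s_t = 100 * log(S_t / S_{t-1}) for consecutive spot rates."""
    return [100.0 * math.log(curr / prev) for prev, curr in zip(prices, prices[1:])]

spots = [1.6000, 1.6080, 1.5999]  # hypothetical GBP/USD daily closes
returns = pct_log_returns(spots)  # one return fewer than there are prices
```

A convenient property of logarithmic returns is that they add up over time: the sum of the daily values equals the percentage log change over the whole period.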

² Even though there are no NNR applications yet to foreign exchange volatility forecasting, some researchers
have used NNR models to measure the stock market volatility (see, for instance, Donaldson and Kamstra (1997)
and Bartlmae and Rauscher (2000)).

Our exchange rate databank spans from 31 December 1993 to 9 May 2000, giving
us 1610 observations per exchange rate.³ This databank was divided into two separate
sets with the first 1329 observations from 31 December 1993 to 9 April 1999 defined as
our in-sample testing period and the remaining 280 observations from 12 April 1999 to
9 May 2000 being used for out-of-sample forecasting and validation.
In line with the findings of many earlier studies on exchange rate changes (see, amongst
others, Engle and Bollerslev (1986), Baillie and Bollerslev (1989), Hsieh (1989), West
and Cho (1995)), the descriptive statistics of our currency returns (not reported here in
order to conserve space) clearly show that they are nonnormally distributed and heavily
fat-tailed. They also show that mean returns are not statistically different from zero.
Further standard tests of autocorrelation, nonstationarity and heteroskedasticity show that
logarithmic returns are all stationary and heteroskedastic. Whereas there is no evidence
of autocorrelation for the GBP/USD return series, some autocorrelation is detected at the
10% significance level for USD/JPY returns.
The fact that our currency returns have zero unconditional mean enables us to use
squared returns as a measure of their variance and absolute returns as a measure of their
standard deviation or volatility.⁴ The standard tests of autocorrelation, nonstationarity and
heteroskedasticity (again not reported here in order to conserve space) show that squared
and absolute currency returns series for the in-sample period are all nonnormally
distributed, stationary, autocorrelated and heteroskedastic (except USD/JPY squared returns
which were found to be homoskedastic).
Still, as we are interested in analysing alternative volatility forecasting models and
whether they can add value in terms of forecasting realised currency volatility, we must
adjust our statistical computation of volatility to take into account the fact that, even if
it is only the matter of a constant, in currency options markets, volatility is quoted in
annualised terms. As we wish to focus on 1-month volatility forecasts and related trading
strategies, taking, as is usual practice, a 252-trading day year (and consequently a 21-
trading day month), we compute the 1-month volatility as the moving annualised standard
deviation of our logarithmic returns and end up with the following historical volatility
measures for the 1-month horizon:

$$\sigma_t = \frac{1}{21} \sum_{t-20}^{t} \left( \sqrt{252} \times |s_t| \right)$$

where $|s_t|$ is the absolute currency return.⁵ The value $\sigma_t$ is the realised 1-month exchange
rate volatility that we are interested in forecasting as accurately as possible, in order to
see if it is possible to find any mispriced option that we could possibly take advantage of.
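The moving 21-day measure defined above can be computed along the following lines. This is a minimal sketch assuming the annualisation factor is the square root of 252 applied to each absolute daily percentage return; the function name and default arguments are illustrative choices, not the authors' code.

```python
import math

def realised_volatility(returns, window=21, trading_days=252):
    """Moving annualised volatility from daily percentage log returns.

    Each output value averages sqrt(trading_days) * |s| over the last
    `window` returns, so the first value is available once `window`
    returns have been observed.
    """
    scale = math.sqrt(trading_days)
    out = []
    for end in range(window, len(returns) + 1):
        chunk = returns[end - window:end]
        out.append(sum(scale * abs(s) for s in chunk) / window)
    return out
```

Because the first value needs 21 prior returns, some data before the start of the sample are required to obtain a volatility figure for the first sample date.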
The descriptive statistics of both historical volatility series (again not reported here
in order to conserve space) show that they are nonnormally distributed and fat-tailed.
Further statistical tests of autocorrelation, heteroskedasticity and nonstationarity show
that they exhibit strong autocorrelation but that they are stationary in levels. Whereas

³ Actually, we used exchange rate data from 01/11/1993 to 09/05/2000, the data during the period 01/11/1993
to 31/12/1993 being used for the “pre-calculation” of the 21-day realised historical volatility.
⁴ Although the unconditional mean is zero, it is of course possible that the conditional mean may vary over
⁵ The use of absolute returns (rather than their squared value) is justified by the fact that with zero unconditional
mean, averaging absolute returns gives a measure of standard deviation.

GBP/USD historical volatility is heteroskedastic, USD/JPY realised volatility was found
to be homoskedastic.
Having presented our exchange rate series databank and explained how we compute our
historical volatilities from these original series (so that they are in a format comparable
to that which prevails in the currency options market), we now turn our attention to the
implied volatility databank that we have used.

4.2.2 The implied volatility series databank
Volatility has now become an observable and traded quantity in financial markets, and
particularly so in the currency markets. So far, most studies dealing with implied volatilities
have used volatilities backed out from historical premium data on traded options rather
than over-the-counter (OTC) volatility data (see, amongst others, Latane and Rendleman
(1976), Chiras and Manaster (1978), Lamoureux and Lastrapes (1993), Kroner et al.
(1995) and Xu and Taylor (1996)).
As underlined by Dunis et al. (2000), the problem in using exchange data is that call
and put prices are only available for given strike levels and fixed maturity dates. The
corresponding implied volatility series must therefore be backed out using a specific
option pricing model. This procedure generates two sorts of potential biases: material
errors or mismatches can affect the variables that are needed for the solving of the
pricing model, e.g. the forward points or the spot rate, and, more importantly, the very
specification of the pricing model that is chosen can have a crucial impact on the final
“backed out” implied volatility series.
This is the reason why, in this chapter, we use data directly observable on the marketplace.
This original approach seems further warranted by current market practice whereby
brokers and market makers in currency options deal in fact in volatility terms and not in
option premium terms any more.⁶ The volatility time series we use for the two exchange
rates selected, GBP/USD and USD/JPY, were extracted from a market quoted implied
volatilities database provided by Chemical Bank for data until end-1996, and updated from
Reuters “Ric” codes subsequently. These at-the-money forward, market-quoted volatilities
are in fact obtained from brokers by Reuters on a daily basis, at the close of business
in London.
These implied volatility series are nonnormally distributed and fat-tailed. Further sta-
tistical tests of autocorrelation and heteroskedasticity (again not reported here in order
to conserve space) show that they exhibit strong autocorrelation and heteroskedasticity.
Unit root tests show that, at the 1-month horizon, both GBP/USD and USD/JPY implied
volatilities are stationary at the 5% significance level.
Certainly, as noted by Dunis et al. (2001b) and confirmed in Tables A4.1 and A4.3 in
Appendix A for the GBP/USD and USD/JPY, an interesting feature is that the mean level
of implied volatilities stands well above average historical volatility levels.7 This tendency
of the currency options market to overestimate actual volatility is further documented

The market data that we use are at-the-money forward volatilities, as the use of either in-the-money or
out-of-the-money volatilities would introduce a significant bias in our analysis due to the so-called “smile effect”, i.e.
the fact that volatility is “priced” higher for strike levels which are not at-the-money. It should be made clear
that these implied volatilities are not simply backed out of an option pricing model but are instead directly
quoted from brokers. Due to arbitrage they cannot diverge too far from the theoretical level.
As noted by Dunis et al. (2001b), a possible explanation for implied volatility being higher than its historical
counterpart may be due to the fact that market makers are generally options sellers (whereas end users are

by Figures A4.1 and A4.2 which show 1-month actual and implied volatilities for the
GBP/USD and USD/JPY exchange rates. These two charts also clearly show that, for
each exchange rate concerned, actual and implied volatilities are moving rather closely
together, which is further confirmed by Tables A4.2 and A4.4 for both GBP/USD and
USD/JPY volatilities.

4.3.1 The choice of the benchmark model
As the GARCH model originally devised by Bollerslev (1986) and Taylor (1986) is well
documented in the literature, we just present it very briefly, as it has now become widely
used, in various forms, by both academics and practitioners to model conditional variance.
We therefore do not intend to review its many different variants as this would be outside
the scope of this chapter. Besides, there is a wide consensus, certainly among market
practitioners, but among many researchers as well that, when variants of the standard
GARCH (1,1) model do provide an improvement, it is only marginal most of the time.
Consequently, for this chapter, we choose to estimate a GARCH (1,1) model for both
the GBP/USD and USD/JPY exchange rates as it embodies a compact representation
and serves well our purpose of finding an adequate benchmark for the more complex
NNR models.
In its simple GARCH (1,1) form, the GARCH model basically states that the conditional
variance of asset returns in any given period depends upon a constant, the previous
period's squared random component of the return and the previous period's variance.
In other words, if we denote by $\sigma_t^2$ the conditional variance of the return at time $t$ and
$\varepsilon_{t-1}^2$ the squared random component of the return in the previous period, for a standard
GARCH (1,1) process, we have:

$$\sigma_t^2 = \omega + \alpha\varepsilon_{t-1}^2 + \beta\sigma_{t-1}^2 \qquad (4.1)$$

Equation (4.1) yields immediately the 1-step ahead volatility forecast and, using recursive
substitution, Engle and Bollerslev (1986) and Baillie and Bollerslev (1992) give the n-step
ahead forecast for a GARCH (1,1) process:

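$$\sigma_{t+n}^2 = \omega\left[1 + (\alpha + \beta) + \cdots + (\alpha + \beta)^{n-2}\right] + (\alpha + \beta)^{n-1}\left(\omega + \alpha\varepsilon_t^2 + \beta\sigma_t^2\right) \qquad (4.2)$$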

This is the formula that we use to compute our GARCH (1,1) n-step ahead out-of-
sample forecast.
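
As a rough illustration of the recursive substitution behind the n-step ahead forecast, the sketch below iterates the conditional variance forward one period at a time: the first step applies the GARCH (1,1) equation directly, and each further step replaces the unknown squared shock by its expected value, i.e. the forecast variance itself. The function name and the inputs for the last squared shock and variance are hypothetical; only the three coefficients are taken from the GBP/USD estimates reported in equation (4.3).

```python
def garch11_forecast(omega, alpha, beta, eps2_t, sigma2_t, n):
    """Return the n-step ahead conditional variance forecast of a GARCH(1,1)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    # 1-step ahead forecast: direct application of the GARCH(1,1) equation.
    forecast = omega + alpha * eps2_t + beta * sigma2_t
    # Further steps: the expected squared shock equals the forecast variance,
    # so each step maps sigma2 -> omega + (alpha + beta) * sigma2.
    for _ in range(n - 1):
        forecast = omega + (alpha + beta) * forecast
    return forecast


# Coefficients as reported for GBP/USD; the last squared shock and variance
# below are made-up illustrative inputs, not values from the chapter.
omega, alpha, beta = 0.0021625, 0.032119, 0.95864
one_step = garch11_forecast(omega, alpha, beta, eps2_t=0.30, sigma2_t=0.25, n=1)
far_step = garch11_forecast(omega, alpha, beta, eps2_t=0.30, sigma2_t=0.25, n=21)
long_run = omega / (1.0 - alpha - beta)  # unconditional variance
```

Because $\alpha + \beta$ is below one here, the forecasts drift geometrically towards `long_run` as the horizon grows, which is one way of reading the inertia of GARCH forecasts discussed later in the chapter.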

4.3.2 The GARCH (1,1) volatility forecasts
If many researchers have noted that no alternative GARCH specification could consistently
outperform the standard GARCH (1,1) model, some such as Bollerslev (1987), Baillie

more often option buyers): there is probably a tendency among option writers to include a “risk premium”
when pricing volatility. Kroner et al. (1995) suggest another two reasons: (i) the fact that if interest rates are
stochastic, then the implied volatility will capture both asset price volatility and interest rate volatility, thus
skewing implied volatility upwards, and (ii) the fact that if volatility is stochastic but the option pricing formula
is constant, then this additional source of volatility will be picked up by the implied volatility.

and Bollerslev (1989) and Hsieh (1989), amongst others, point out that the Student-t
distribution fits the daily exchange rate logarithmic returns better than conditional
normality, as the former is characterised by fatter tails. We thus generate GARCH (1,1) 1-step
ahead forecasts with the Student-t distribution assumption.8 We give our results for the
GBP/USD exchange rate:

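$$\log(S_t/S_{t-1}) = \varepsilon_t$$
$$\varepsilon_t \mid \Phi_{t-1} \sim N(0, \sigma_t^2)$$
$$\sigma_t^2 = \underset{(0.0015222)}{0.0021625} + \underset{(0.010135)}{0.032119}\,\varepsilon_{t-1}^2 + \underset{(0.013969)}{0.95864}\,\sigma_{t-1}^2 \qquad (4.3)$$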

where the figures in parentheses are asymptotic standard errors. The t-values for $\alpha$ and
$\beta$ are highly significant and show strong evidence that $\sigma_t^2$ varies with $\varepsilon_{t-1}^2$ and $\sigma_{t-1}^2$. The
coefficients also have the expected sign. Additionally, the conventional Wald statistic for
testing the joint hypothesis that $\alpha = \beta = 0$ clearly rejects the null, suggesting a significant
GARCH effect.
The parameters in equation (4.3) were used to estimate the 21-day ahead volatility
forecast for the GBP/USD exchange rate: using the 1-step ahead GARCH (1,1) coefficients,
the conditional 21-day volatility forecast was generated each day according to
equation (4.2) above. The same procedure was followed for the USD/JPY exchange rate
volatility (see Appendix B4.3).
Figure 4.1 displays the GARCH (1,1) 21-day volatility forecasts for the GBP/USD
exchange rate both in- and out-of-sample (the last 280 observations, from 12/04/1999 to

Figure 4.1 GBP/USD GARCH (1,1) volatility forecast (%) [series plotted: GARCH Vol. Forecast and Realised Vol.]

Actually, we modelled conditional volatility with both the normal and the t-distribution. The results are only
slightly different. However, both the Akaike and the Schwarz Bayesian criteria tend to favour the t-distribution.
We therefore selected the results from the t-distribution for further tests (see Appendix B for the GBP/USD
detailed results).

09/05/2000). It is clear that, overall, the GARCH model fits the realised volatility rather
well during the in-sample period. However, during the out-of-sample period, the GARCH
forecasts are quite disappointing. The USD/JPY out-of-sample GARCH (1,1) forecasts
suffer from a similar inertia (see Figure C4.1 in Appendix C).
In summary, if the GARCH (1,1) model can account for some statistical properties of
daily exchange rate returns such as leptokurtosis and conditional heteroskedasticity, its
ability to accurately predict volatility, despite its wide use among market professionals,
is more debatable. In any case, as mentioned above, we only intend to use our GARCH
(1,1) volatility forecasts as a benchmark for the nonlinear nonparametric neural network
models we intend to apply and test whether NNR/RNN models can produce a substantial
improvement in the out-of-sample performance of our volatility forecasts.

4.4.1 NNR modelling

Over the past few years, it has been argued that new technologies and quantitative systems
based on the fact that most financial time series contain nonlinearities have made
traditional forecasting methods only second best. NNR models, in particular, have been
applied with increasing success to economic and financial forecasting and would constitute
the state of the art in forecasting methods (see, for instance, Zhang et al. (1998)).
It is clearly beyond the scope of this chapter to give a complete overview of artificial
neural networks, their biological foundation and their many architectures and poten-
tial applications (for more details, see, amongst others, Simpson (1990) and Hassoun
For our purpose, let it suffice to say that NNR models are a tool for determining
the relative importance of an input (or a combination of inputs) for predicting a given
outcome. They are a class of models made up of layers of elementary processing units,
called neurons or nodes, which elaborate information by means of a nonlinear transfer
function. Most of the computing takes place in these processing units.
The input signals come from an input vector $A = (x^{[1]}, x^{[2]}, \ldots, x^{[n]})$ where $x^{[i]}$ is
the activity level of the $i$th input. A series of weight vectors $W_j = (w_{1j}, w_{2j}, \ldots, w_{nj})$
is associated with the input vector so that the weight $w_{ij}$ represents the strength of the
connection between the input $x^{[i]}$ and the processing unit $b_j$. Each node may additionally
have a bias input $\theta_j$ modulated with the weight $w_{0j}$ associated with the inputs. The
total input of the node $b_j$ is formally the dot product between the input vector $A$ and the
weight vector $W_j$, minus the weighted input bias. It is then passed through a nonlinear
transfer function to produce the output value of the processing unit $b_j$:

$$b_j = f\left(\sum_{i=1}^{n} x^{[i]} w_{ij} - w_{0j}\theta_j\right) = f(X_j) \qquad (4.4)$$

In this chapter, we use exclusively the multilayer perceptron, a multilayer feedforward network trained by
error backpropagation.

In this chapter, we have used the sigmoid function as activation function:10

$$f(X_j) = \frac{1}{1 + e^{-X_j}} \qquad (4.5)$$
Figure 4.2 allows one to visualise a single output NNR model with one hidden layer and
two hidden nodes, i.e. a model similar to those we developed for the GBP/USD and the
USD/JPY volatility forecasts. The NNR model inputs at time $t$ are $x_t^{[i]}$ ($i = 1, 2, \ldots, 5$).
The hidden nodes outputs at time $t$ are $h_t^{[j]}$ ($j = 1, 2$) and the NNR model output at time
$t$ is $\tilde{y}_t$, whereas the actual output is $y_t$.
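
To make the mechanics of such a network concrete, the following minimal sketch computes the forward pass of a single output model with five inputs, one hidden layer and two hidden nodes, applying the sigmoid transfer function to the weighted sum of inputs minus the weighted bias. All weights and inputs are hypothetical, the bias input is assumed equal to one, and applying the sigmoid at the output node as well is an assumption made for the illustration.

```python
import math


def sigmoid(x):
    """Logistic transfer function, with output in the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def node_output(inputs, weights, bias_weight, bias_input=1.0):
    """Output of one processing unit: f(sum_i x_i * w_i - w_0 * theta)."""
    total = sum(x * w for x, w in zip(inputs, weights)) - bias_weight * bias_input
    return sigmoid(total)


def nnr_forward(inputs, hidden_weights, hidden_bias, output_weights, output_bias):
    """Forward pass of a single-output network with one hidden layer."""
    hidden = [node_output(inputs, w, b) for w, b in zip(hidden_weights, hidden_bias)]
    return node_output(hidden, output_weights, output_bias)


# Hypothetical weights for a 5-2-1 network; inputs are assumed to be
# already normalised to the (0, 1) range.
x_t = [0.2, 0.4, 0.6, 0.8, 0.5]
hidden_w = [[0.1, -0.2, 0.3, 0.0, 0.5], [-0.4, 0.2, 0.1, 0.3, -0.1]]
hidden_b = [0.05, -0.05]
y_hat = nnr_forward(x_t, hidden_w, hidden_b, [0.7, -0.3], 0.1)
```

Training would then consist of adjusting these weights so as to reduce the gap between `y_hat` and the actual output, which the sketch deliberately leaves out.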
At the beginning, the modelling process is initialised with random values for the
weights. The output value of the processing unit bj is then passed on to the single
output node of the output layer. The NNR error, i.e. the difference between the NNR
forecast and the actual value, is analysed through the root mean squared error. The latter
is systematically minimised by adjusting the weights according to the level of its deriva-
tive with respect to these weights. The adjustment obviously takes place in the direction
that reduces the error.
As can be expected, NNR models with two hidden layers are more complex. In general,
they are better suited for discontinuous functions; they tend to have better generalisation
capabilities but are also much harder to train. In summary, NNR model results depend
crucially on the choice of the number of hidden layers, the number of nodes and the type
of nonlinear transfer function retained.
In fact, the use of NNR models further enlarges the forecaster's toolbox of available
techniques by adding models where no specific functional form is a priori assumed.11
Following Cybenko (1989) and Hornik et al. (1989), it can be demonstrated that specific
NNR models, if their hidden layer is sufficiently large, can approximate any continuous

Figure 4.2 Single output NNR model [diagram: inputs $x_t^{[1]}, \ldots, x_t^{[5]}$ feeding the hidden nodes $h_t^{[j]}$ and a single output]

Other alternatives include the hyperbolic tangent, the bilogistic sigmoid, etc. A linear activation function is
also a possibility, in which case the NNR model will be linear. Note that our choice of a sigmoid implies
variations in the interval ]0, +1[. Input data are thus normalised in the same range in order to present the
learning algorithm with compatible values and avoid saturation problems.
Strictly speaking, the use of an NNR model implies assuming a functional form, namely that of the trans-
fer function.

function.12 Furthermore, it can be shown that NNR models are equivalent to nonlinear
nonparametric models, i.e. models where no decisive assumption about the generating
process must be made in advance (see Cheng and Titterington (1994)).
Kouam et al. (1992) have shown that most forecasting models (ARMA models, bilin-
ear models, autoregressive models with thresholds, nonparametric models with kernel
regression, etc.) are embedded in NNR models. They show that each modelling procedure
can in fact be written in the form of a network of neurons.
Theoretically, the advantage of NNR models over other forecasting methods can there-
fore be summarised as follows: as, in practice, the “best” model for a given problem
cannot be determined, it is best to resort to a modelling strategy which is a generalisation
of a large number of models, rather than to impose a priori a given model specification.
This has triggered an ever-increasing interest for applications to financial markets (see,
for instance, Trippi and Turban (1993), Deboeck (1994), Rehkugler and Zimmermann
(1994), Refenes (1995) and Dunis (1996)).
Comparing NNR models with traditional econometric methods for foreign exchange rate
forecasting has been the topic of several recent papers: Kuan and Liu (1995), Swanson
and White (1995) and Gençay (1996) show that NNR models can describe in-sample
data rather well and that they also generate “good” out-of-sample forecasts. Forecasting
accuracy is usually defined in terms of small mean squared prediction error or in terms of
directional accuracy of the forecasts. However, as mentioned already, there are still very
few studies concerned with financial assets volatility forecasting.

4.4.2 RNN modelling
RNN models were introduced by Elman (1990). Their only difference from “regular”
NNR models is that they include a loop back from one layer, either the output or the
intermediate layer, to the input layer. Depending on whether the loop back comes from
the intermediate or the output layer, either the preceding values of the hidden nodes or the
output error will be used as inputs in the next period. This feature, which seems welcome
in the case of a forecasting exercise, comes at a cost: RNN models will require more
connections than their NNR counterparts, thus accentuating a certain lack of transparency
which is sometimes used to criticise these modelling approaches.
Using our previous notation and assuming the output layer is the one looped back, the
RNN model output at time $t$ depends on the inputs at time $t$ and on the output at time
$t - 1$:13

$$\tilde{y}_t = F(x_t, \tilde{y}_{t-1}) \qquad (4.6)$$

There is no theoretical answer as to whether one should preferably loop back the intermedi-
ate or the output layer. This is mostly an empirical question. Nevertheless, as looping back
the output layer implies an error feedback mechanism, such RNN models can successfully
be used for nonlinear error-correction modelling, as advocated by Burgess and Refenes

This very feature also explains why it is so difficult to use NNR models, as one may in fact end up fitting
the noise in the data rather than the underlying statistical process.
With a loop back from the intermediate layer, the RNN output at time $t$ depends on the inputs at time $t$
and on the intermediate nodes at time $t - 1$. Besides, the intermediate nodes at time $t$ depend on the inputs
at time $t$ and on the hidden layer at time $t - 1$. Using our notation, we have therefore: $y_t = F(x_t, h_{t-1})$ and
$h_t = G(x_t, h_{t-1})$.

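Figure 4.3 Single output RNN model [diagram: inputs $x_t^{[1]}$, $x_t^{[2]}$, $x_t^{[3]}$ and the fed-back error $r_t^{[1]} = y_{t-1} - \tilde{y}_{t-1}$ feeding the hidden nodes and a single output]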

(1996). This is why we choose this particular architecture as an alternative modelling
strategy for the GBP/USD and the USD/JPY volatility forecasts. Our choice seems fur-
ther warranted by claims from Kuan and Liu (1995) and Tenti (1996) that RNN models
are superior to NNR models when modelling exchange rates.
Figure 4.3 allows one to visualise a single output RNN model with one hidden layer
and two hidden nodes, again a model similar to those developed for the GBP/USD and
the USD/JPY volatility forecasts.
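
The error feedback of such a network can be sketched as follows: at each date the previous period's forecast error is appended to the regular inputs before the forward pass, and the new error is stored for the next period. This is only an illustration under simplifying assumptions: the weights and data are hypothetical, bias terms are omitted for brevity, and the error is set to zero before the first observation.

```python
import math


def sigmoid(x):
    """Logistic transfer function."""
    return 1.0 / (1.0 + math.exp(-x))


def rnn_run(series_inputs, actuals, hidden_weights, output_weights):
    """Run a one-hidden-layer network whose extra input is last period's error.

    Each hidden weight vector has len(x_t) + 1 entries; the last entry applies
    to the fed-back error r_t = y_{t-1} - yhat_{t-1}.
    """
    forecasts = []
    prev_error = 0.0  # assumption: no error information before the first period
    for x_t, y_t in zip(series_inputs, actuals):
        augmented = list(x_t) + [prev_error]
        hidden = [sigmoid(sum(a * w for a, w in zip(augmented, ws)))
                  for ws in hidden_weights]
        y_hat = sigmoid(sum(h * w for h, w in zip(hidden, output_weights)))
        forecasts.append(y_hat)
        prev_error = y_t - y_hat  # becomes an input in the next period
    return forecasts


# Hypothetical normalised data: three inputs per period, two hidden nodes.
inputs = [[0.2, 0.5, 0.1], [0.3, 0.4, 0.2], [0.6, 0.1, 0.3]]
actuals = [0.40, 0.50, 0.45]
hidden_w = [[0.5, -0.3, 0.2, 0.8], [-0.2, 0.4, 0.1, -0.6]]
output_w = [0.9, -0.7]
forecasts = rnn_run(inputs, actuals, hidden_w, output_w)
```

Each hidden node carries one more weight than in the feedforward case, which illustrates why RNN models require more connections than their NNR counterparts.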

4.4.3 The NNR/RNN volatility forecasts

Input selection, data scaling and preprocessing
In the absence of an indisputable theory of exchange rate volatility, we assume that a
specific exchange rate volatility can be explained by that rate's recent evolution, volatility
spillovers from other financial markets, and macroeconomic and monetary policy
expectations.
In the circumstances, it seems reasonable to include, as potential inputs, exchange rate
volatilities (including that which is to be modelled), the evolution of important stock and
commodity prices, and, as a measure of macroeconomic and monetary policy expectations,
the evolution of the yield curve.14
As explained above (see footnote 10), all variables were normalised according to our
choice of the sigmoid activation function. They had been previously transformed into
logarithmic returns.15
Starting from a traditional linear correlation analysis, variable selection was achieved
via a forward stepwise neural regression procedure: starting with both lagged historical
and implied volatility levels, other potential input variables were progressively added,
keeping the network architecture constant. If adding a new variable improved the level of
explained variance over the previous “best” model, the pool of explanatory variables was
updated. If there was a failure to improve over the previous “best” model after several

On the use of the yield curve as a predictor of future output growth and inflation, see, amongst others, Fama
(1990) and Ivanova et al. (2000).
Despite some contrary opinions, e.g. Balkin (1999), stationarity remains important if NNR/RNN models are
to be assessed on the basis of the level of explained variance.

attempts, variables in that model were alternated to check whether no better solution could
be achieved. The model chosen ¬nally was then kept for further tests and improvements.
Finally, conforming with standard heuristics, we partitioned our total data set into three
subsets, using roughly 2/3 of the data for training the model, 1/6 for testing and the
remaining 1/6 for validation. This partition in training, test and validation sets is made
in order to control the error and reduce the risk of overfitting. Both the training and
the following test period are used in the model tuning process: the training set is used to
develop the model; the test set measures how well the model interpolates over the training
set and makes it possible to check during the adjustment whether the model remains valid
for the future. As the fine-tuned system is not independent from the test set, the use of
a third validation set which was not involved in the model's tuning is necessary. The
validation set is thus used to estimate the actual performance of the model in a deployed
In our case, the 1329 observations from 31/12/1993 to 09/04/1999 were considered
as the in-sample period for the estimation of our GARCH (1,1) benchmark model. We
therefore retain the first 1049 observations from 31/12/1993 to 13/03/1998 for the training
set and the remainder of the in-sample period is used as test set. The last 280 observations
from 12/04/1999 to 09/05/2000 constitute the validation set and serve as the out-of-sample
forecasting period. This is consistent with the GARCH (1,1) model estimation.

Volatility forecasting results

We used two similar sets of input variables for the GBP/USD and USD/JPY volatilities,
with the same output variable, i.e. the realised 21-day volatility. Input variables
included the lagged actual 21-day realised volatility (Realised21$_{t-21}$), the lagged
implied 21-day volatility (IVOL21$_{t-21}$), lagged absolute logarithmic returns of the
exchange rate ($|r|_{t-i}$, $i = 21, \ldots, 41$) and lagged logarithmic returns of the gold price
(DLGOLD$_{t-i}$, $i = 21, \ldots, 41$) or of the oil price (DLOIL$_{t-i}$, $i = 21, \ldots, 41$), depending
on the currency volatility being modelled.
In terms of the final model selection, Tables D4.1a and D4.1b in Appendix D give the
performance of the best NNR and RNN models over the validation (out-of-sample) data
set for the GBP/USD volatility. For the same input space and architecture (i.e. with only
one hidden layer), RNN models marginally outperform their NNR counterparts in terms
of directional accuracy. This is important as trading profitability crucially depends on
getting the direction of changes right. Tables D4.1a and D4.1b also compare models with
only one hidden layer and models with two hidden layers while keeping the input and
output variables unchanged: despite the fact that the best NNR model is a two-hidden
layer model with respectively ten and five hidden nodes in each of its hidden layers, on
average, NNR/RNN models with a single hidden layer perform marginally better while
at the same time requiring less processing time.
The results of the NNR and RNN models for the USD/JPY volatility over the val-
idation period are given in Tables D4.2a and D4.2b in Appendix D. They are in line
with those for the GBP/USD volatility, with RNN models outperforming their NNR
counterparts and, in that case, the addition of a second hidden layer rather deteriorating
Finally, we selected our two best NNR and RNN models for each volatility, NNR
(44-10-5-1) and RNN (44-1-1) for the GBP/USD and NNR (44-1-1) and RNN (44-5-1)

for the USD/JPY, to compare their out-of-sample forecasting performance with that of
our GARCH (1,1) benchmark model. This evaluation is conducted on both statistical and
financial criteria in the following sections. Yet, one can easily see from Figures E4.1
and E4.2 in Appendix E that, for both the GBP/USD and the USD/JPY volatilities, these
out-of-sample forecasts do not suffer from the same degree of inertia as was the case for
the GARCH (1,1) forecasts.

4.5.1 Model combination

As noted by Dunis et al. (2001a), today most researchers would agree that individual
forecasting models are misspecified in some dimensions and that the identity of the “best”
model changes over time. In this situation, it is likely that a combination of forecasts
will perform better over time than forecasts generated by any individual model that is
kept constant.
Accordingly, we build two rather simple model combinations to add to our three existing
volatility forecasts, the GARCH (1,1), NNR and RNN forecasts.16
The simplest forecast combination method is the simple average of existing forecasts.
As noted by Dunis et al. (2001b), it is often a hard benchmark to beat as other methods,
such as regression-based methods, decision trees, etc., can suffer from a deterioration of
their out-of-sample performance.
We call COM1 the simple average of our GARCH (1,1), NNR and RNN volatility
forecasts with the actual implied volatility (IVOL21). As we know, implied volatility is
itself a popular method to measure market expectations of future volatility.
Another method of combining forecasts suggested by Granger and Ramanathan (1984)
is to regress the in-sample historical 21-day volatility on the set of forecasts to obtain
appropriate weights, and then apply these weights to the out-of-sample forecasts: it is
denoted GR. We follow Granger and Ramanathan's advice to add a constant term and not
to constrain the weights to add to unity. We do not include both NNR and RNN forecasts in
the regression as they can be highly collinear: for the USD/JPY, the correlation coefficient
between both volatility forecasts is 0.984.
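
The two combination schemes can be sketched as follows, assuming the individual forecasts are available as equal-length lists. The simple average needs no estimation, whereas the regression-based weights are obtained by ordinary least squares of the realised series on a constant and the forecasts, without constraining the weights to sum to one. The function names and the small synthetic data set are hypothetical, and the normal equations are solved by plain Gaussian elimination only to keep the sketch free of external libraries.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: forecasts may be collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def regression_weights(actual, forecasts):
    """OLS of actual on a constant and the forecasts (weights unconstrained)."""
    rows = [[1.0] + list(f) for f in zip(*forecasts)]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, actual)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def combine(weights, forecasts_at_t):
    """Apply estimated weights (constant first) to one date's forecasts."""
    return weights[0] + sum(w * f for w, f in zip(weights[1:], forecasts_at_t))


def simple_average(forecasts_at_t):
    """Equal-weight combination of one date's forecasts."""
    return sum(forecasts_at_t) / len(forecasts_at_t)


# Synthetic in-sample data built so that actual = 1 + 0.5 * f1 + 0.25 * f2.
f1 = [10.0, 12.0, 9.0, 11.0, 13.0]
f2 = [8.0, 9.0, 12.0, 10.0, 7.0]
actual = [1.0 + 0.5 * a + 0.25 * b for a, b in zip(f1, f2)]
weights = regression_weights(actual, [f1, f2])
out_of_sample = combine(weights, [11.5, 9.5])
```

With highly collinear forecasts the cross-product matrix becomes close to singular, which is the numerical counterpart of the reason given above for not including two nearly identical forecasts in the same regression.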
We tried several alternative specifications for the Granger–Ramanathan approach. The
parameters were estimated by ordinary least squares over the in-sample data set. Our best
model for the GBP/USD volatility is presented below with t-statistics in parentheses, and
the R-squared and standard error of the regression:

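$$\text{Actual}_{t,21} = \underset{(-6.550)}{-5.7442} + \underset{(7.712)}{0.7382}\,\text{RNN44}_{t,21} + \underset{(8.592)}{0.6750}\,\text{GARCH}(1,1)_{t,21} + \underset{(6.777)}{0.3226}\,\text{IVOL}_{t,21}$$
$$R^2 = 0.2805 \qquad \text{S.E. of regression} = 1.7129 \qquad (4.7a)$$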

More sophisticated combinations are possible, even based on NNR models as in Donaldson and Kamstra
(1996), but this is beyond the scope of this chapter.

For the USD/JPY volatility forecast combination, our best model was obtained using the
NNR forecast rather than the RNN one:

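$$\text{Actual}_{t,21} = \underset{(-7.091)}{-9.4293} + \underset{(7.561)}{1.5913}\,\text{NNR44}_{t,21} + \underset{(0.975)}{0.06164}\,\text{GARCH}(1,1)_{t,21} + \underset{(2.029)}{0.1701}\,\text{IVOL}_{t,21}$$
$$R^2 = 0.4128 \qquad \text{S.E. of regression} = 4.0239 \qquad (4.7b)$$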

As can be seen, the RNN/NNR-based forecast gets the highest weight in both cases,
suggesting that the GR forecast relies more heavily on the RNN/NNR model forecasts
than on the others. Figures F4.1 and F4.2 in Appendix F show that the GR and COM1
forecast combinations, as the NNR and RNN forecasts, do not suffer from the same
inertia as the GARCH (1,1) out-of-sample forecasts do. The Excel file “CombGR JPY”
on the accompanying CD-Rom documents the computation of the two USD/JPY volatility
forecast combinations and that of their forecasting accuracy.
We now have five volatility forecasts on top of the implied volatility “market forecast”
and proceed to test their out-of-sample forecasting accuracy through traditional
statistical criteria.

4.5.2 Out-of-sample forecasting accuracy

As is standard in the economic literature, we compute the Root Mean Squared Error
(RMSE), the Mean Absolute Error (MAE) and Theil U-statistic (Theil-U). These measures
have already been presented in detail by, amongst others, Makridakis et al. (1983),
Pindyck and Rubinfeld (1998) and Theil (1966), respectively. We also compute a “correct
directional change” (CDC) measure which is described below.
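Calling $\sigma_\tau$ the actual volatility and $\hat{\sigma}_\tau$ the forecast volatility at time $\tau$, with a forecast
period going from $t + 1$ to $t + n$, the forecast error statistics are respectively:

$$\text{RMSE} = \sqrt{\frac{1}{n}\sum_{\tau=t+1}^{t+n} (\hat{\sigma}_\tau - \sigma_\tau)^2}$$

$$\text{MAE} = \frac{1}{n}\sum_{\tau=t+1}^{t+n} |\hat{\sigma}_\tau - \sigma_\tau|$$

$$\text{Theil-U} = \frac{\sqrt{\frac{1}{n}\sum_{\tau=t+1}^{t+n} (\hat{\sigma}_\tau - \sigma_\tau)^2}}{\sqrt{\frac{1}{n}\sum_{\tau=t+1}^{t+n} \hat{\sigma}_\tau^2} + \sqrt{\frac{1}{n}\sum_{\tau=t+1}^{t+n} \sigma_\tau^2}}$$

$$\text{CDC} = \frac{100}{n}\sum_{\tau=t+1}^{t+n} D_\tau$$

where $D_\tau = 1$ if $(\sigma_\tau - \sigma_{\tau-1})(\hat{\sigma}_\tau - \sigma_{\tau-1}) > 0$, else $D_\tau = 0$.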
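
The four statistics above translate fairly directly into code. In the minimal sketch below, the function name and the sample figures are hypothetical, and one simplification is made: since the observation preceding the forecast period is not passed in, directional change is only evaluated from the second observation onwards, so the CDC is averaged over n - 1 rather than n changes.

```python
import math


def forecast_accuracy(actual, forecast):
    """Return RMSE, MAE, Theil-U and CDC for paired actual/forecast lists."""
    n = len(actual)
    errors = [f - a for a, f in zip(actual, forecast)]
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mae = sum(abs(e) for e in errors) / n
    theil_u = rmse / (math.sqrt(sum(f * f for f in forecast) / n)
                      + math.sqrt(sum(a * a for a in actual) / n))
    # A hit is counted when the forecast change and the actual change,
    # both measured from the previous actual value, have the same sign.
    hits = sum(1 for i in range(1, n)
               if (actual[i] - actual[i - 1]) * (forecast[i] - actual[i - 1]) > 0)
    cdc = 100.0 * hits / (n - 1)
    return {"RMSE": rmse, "MAE": mae, "Theil-U": theil_u, "CDC": cdc}


# Made-up volatility levels (%), for illustration only.
actual_vol = [10.0, 11.0, 9.0, 12.0]
forecast_vol = [10.5, 10.8, 11.5, 11.0]
stats = forecast_accuracy(actual_vol, forecast_vol)
```

In this example the forecast gets the direction right on two of the three changes, although its largest error occurs precisely on the change it misses, which illustrates how error-based and direction-based criteria can rank forecasts differently.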

The RMSE and the MAE statistics are scale-dependent measures but give us a basis
to compare our volatility forecasts with the realised volatility. The Theil-U statistic is

independent of the scale of the variables and is constructed in such a way that it necessarily
lies between zero and one, with zero indicating a perfect ¬t.
For all three error statistics retained, the lower the output, the better the forecasting
accuracy of the model concerned. However, rather than on securing the lowest statistical
forecast error, the profitability of a trading system critically depends on taking the right
position and therefore getting the direction of changes right. RMSE, MAE and Theil-U
are all important error measures, yet they may not constitute the best criterion from a
profitability point of view. The CDC statistic is used to check whether the direction given
by the forecast is the same as the actual change which has subsequently occurred and,
for this measure, the higher the output the better the forecasting accuracy of the model
concerned. Tables 4.1 and 4.2 compare, for the GBP/USD and the USD/JPY volatility
respectively, our five volatility models and implied volatility in terms of the four accuracy
measures retained.
These results are most interesting. Except for the GARCH (1,1) model (for all criteria
for the USD/JPY volatility and in terms of directional change only for the GBP/USD
volatility), they show that our five volatility forecasting models offer much more precise
indications about future volatility than implied volatilities. This means that our volatility
forecasts may be used to identify mispriced options, and a profitable trading rule can
possibly be established based on the difference between the prevailing implied volatility
and the volatility forecast.
The two NNR/RNN models and the two combination models predict correctly direc-
tional change at least over 57% of the time for the USD/JPY volatility. Furthermore,
for both volatilities, these models outperform the GARCH (1,1) benchmark model on all

Table 4.1 GBP/USD volatility models forecasting accuracy

Model               RMSE    MAE     Theil-U   CDC (%)
IVOL21              1.98    1.63    0.13      49.64
GARCH (1,1)         1.70    1.48    0.12      48.57
NNR (44-10-5-1)     1.69    1.42    0.12      50.00
RNN (44-1-1)        1.50    1.27    0.11      52.86
COM1                1.65    1.41    0.11      65.23
GR                  1.67    1.37    0.12      67.74

Table 4.2 USD/JPY volatility models forecasting accuracy17

Model               RMSE    MAE     Theil-U   CDC (%)
IVOL21              3.04    2.40    0.12      53.21
GARCH (1,1)         4.46    4.14    0.17      52.50
NNR (44-1-1)        2.41    1.88    0.10      59.64
RNN (44-5-1)        2.43    1.85    0.10      59.29
COM1                2.72    2.29    0.11      56.79
GR                  2.70    2.13    0.11      57.86

The computation of the COM1 and GR forecasting accuracy measures is documented in the Excel file
“CombGR’ JPY” on the accompanying CD-Rom.

evaluation criteria. As a group, NNR/RNN models show superior out-of-sample forecast-
ing performance on any statistical evaluation criterion, except directional change for the
GBP/USD volatility for which they are outperformed by model combinations. Within this
latter group, the GR model performance is overall the best in terms of statistical fore-
casting accuracy. The GR model combination provides the best forecast of directional
change, achieving a remarkable directional forecasting accuracy of around 67% for the
GBP/USD volatility.
Still, as noted by Dunis (1996), a good forecast may be a necessary condition, but it is certainly
not a sufficient one, for generating positive trading returns. Prediction accuracy is not
the ultimate goal in itself and should not be used as the main guiding selection criterion
for system traders. In the following section, we therefore use our volatility forecasting
models to identify mispriced foreign exchange options and endeavour to develop profitable
currency volatility trading models.

4.6.1 Volatility trading strategies

Kroner et al. (1995) point out that, since expectations of future volatility play such a
critical role in the determination of option prices, better forecasts of volatility should
lead to a more accurate pricing and should therefore help an option trader to identify
over- or underpriced options. Therefore a profitable trading strategy can be established
based on the difference between the prevailing market implied volatility and the volatility
forecast. Accordingly, Dunis and Gavridis (1997) advocate superimposing a volatility
trading strategy on the volatility forecast.
As mentioned previously, there is a close relationship between volatility and the
option price. An option embedding a high volatility gives the holder a greater chance of
a more profitable exercise. When trading volatility, using at-the-money forward (ATMF)
straddles, i.e. combining an ATMF call with an ATMF put with opposite deltas, results in
taking no forward risk. Furthermore, as noted, amongst others, by Hull (1997), both the
ATMF call and put have the same vega and gamma sensitivity. There is no directional bias.
If a large rise in volatility is predicted, the trader will buy both call and put. Although
this will entail paying two premia, the trader will profit from a subsequent movement
in volatility: if the foreign exchange market moves far enough either up or down, one
of the options will end deeply in-the-money and, when it is sold back to the writing
counterparty, the profit will more than cover the cost of both premia. The other option will
expire worthless. Conversely, if both the call and put expire out-of-the-money following
a period of stability in the foreign exchange market, only the premia will be lost.
If a large drop in volatility is predicted, the trader will sell the straddle and receive the
two option premia. This is a high-risk strategy if his market view is wrong as he might
theoretically suffer unlimited loss, but, if he is right and both options expire worthless,
he will have cashed in both premia.
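The payoff asymmetry described above can be illustrated with a minimal sketch of the profit or loss of a straddle held to expiry. The function name `straddle_pnl`, its parameters and the numbers in the usage note are hypothetical and chosen for illustration only; the sketch ignores discounting and transaction costs.

```python
def straddle_pnl(spot_at_expiry, strike, call_premium, put_premium, long=True):
    """P/L at expiry of one straddle (call plus put at the same strike).

    Illustrative assumption: values are per unit of notional, with no
    discounting and no transaction costs.
    """
    # Exactly one leg can finish in-the-money; the other expires worthless.
    call_payoff = max(spot_at_expiry - strike, 0.0)
    put_payoff = max(strike - spot_at_expiry, 0.0)
    premia = call_premium + put_premium

    # The buyer pays both premia and receives the payoff; the seller is the mirror image.
    long_pnl = call_payoff + put_payoff - premia
    return long_pnl if long else -long_pnl
```

With a hypothetical strike of 1.60 and premia of 0.02 per leg, `straddle_pnl(1.70, 1.60, 0.02, 0.02)` is about 0.06 for the buyer, while `straddle_pnl(1.60, 1.60, 0.02, 0.02)` loses only the 0.04 of premia. The seller's loss, by contrast, grows without bound as the spot rate moves away from the strike.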

4.6.2 The currency volatility trading models

The trading strategy adopted is based on the currency volatility trading model proposed
by Dunis and Gavridis (1997). A long volatility position is initiated by buying the 1-month
ATMF foreign exchange straddle if the 1-month volatility forecast is above the prevailing
1-month implied volatility level by more than a certain threshold used as a confirmation
filter or reliability indicator. Conversely, a short ATMF straddle position is initiated if the
1-month volatility forecast is below the prevailing implied volatility level by more than
the given threshold.
To this effect, the first stage of the currency volatility trading strategy is to band
the volatility predictions into five classes based on the threshold level, as in Dunis (1996),
namely “large up move”, “small up move”, “no change”, “large down move” and “small
down move” (Figure 4.4). The change threshold defining the boundary between small and
large movements was determined as a confirmation filter. Different strategies with filters
ranging from 0.5 to 2.0 were analysed and are reported with our results.
The second stage is to decide the trading entry and exit rules. With our filter rule,
a position is only initiated when the 1-month volatility forecast is above or below the
prevailing 1-month implied volatility level by more than the threshold. That is:
• If Dt > c, then buy the ATMF straddle
• If Dt < −c, then sell the ATMF straddle
where Dt denotes the difference between the 1-month volatility forecast and the prevailing
1-month implied volatility, and c represents the threshold (or filter).
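The entry rule above can be sketched in a few lines. The function name `entry_signal` and the string labels it returns are hypothetical; only the comparison of Dt with the threshold c comes from the text.

```python
def entry_signal(forecast_vol, implied_vol, threshold):
    """Apply the filter rule to one observation.

    Returns "buy" (long the ATMF straddle), "sell" (short the ATMF straddle)
    or "none" when the forecast is within the threshold of implied volatility.
    """
    # Dt: 1-month volatility forecast minus prevailing 1-month implied volatility.
    d_t = forecast_vol - implied_vol

    if d_t > threshold:
        return "buy"
    if d_t < -threshold:
        return "sell"
    return "none"
```

For instance, with a hypothetical forecast of 12.0 vols against an implied volatility of 10.5 and a filter of 1.0, `entry_signal(12.0, 10.5, 1.0)` returns `"buy"`, whereas a forecast of 10.8 would generate no trade.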
In terms of exit rules, our main test is to assume that the straddle is held until expiry
and that no new positions can be initiated until the existing straddle has expired. Because
an option loses time value during its life, this is clearly not an optimal trading
strategy. We therefore also consider the case of American options, which can be exercised
at any time until expiry, and evaluate this second strategy assuming that positions are only
held for five trading days (as opposed to one month).18
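The exit rule, under which no new position is opened while a straddle is still running, can be sketched as a simple scan over the daily series of Dt. The function name `entry_days` is hypothetical, and the sketch assumes that a new position may be opened on the day the previous one expires, which the text does not specify.

```python
def entry_days(d_series, threshold, holding_days=21):
    """Return (day_index, side) for every position opened on the series.

    d_series holds the daily values of Dt (forecast minus implied volatility).
    holding_days=21 mimics the monthly strategy, holding_days=5 the weekly one.
    Assumption: a new position may start on the day the previous one ends.
    """
    entries = []
    next_free_day = 0  # first day on which a new position is allowed

    for day, d_t in enumerate(d_series):
        if day < next_free_day:
            continue  # an existing straddle is still open
        if d_t > threshold:
            side = "long"
        elif d_t < -threshold:
            side = "short"
        else:
            continue  # signal filtered out, stay flat
        entries.append((day, side))
        next_free_day = day + holding_days

    return entries
```

Raising the threshold or lengthening the holding period both reduce the number of trades, since signals arriving while a position is open are simply skipped.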
As in Dunis and Gavridis (1997), profitability is determined by comparing the level
of implied volatility at the inception of the position with the prevailing 1-month realised
historical volatility at maturity.
It is further weighted by the amount of the position taken, itself a function of the
difference between the 1-month volatility forecast and the prevailing 1-month implied
volatility level on the day when the position is initiated: intuitively, it makes sense to
assume that, if we have a “good” model, the larger |Dt|, the more confident we should
be about taking the suggested position and the higher the expected profit. Calling G this
gearing of position, we thus have:19
G = |Dt|/|c| (4.8)
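As a small numerical illustration of equation (4.8), the sketch below computes the gearing for a given signal. The function name `gearing` and the example values are hypothetical.

```python
def gearing(d_t, threshold):
    """Gearing of a position as in equation (4.8): G = |Dt| / |c|."""
    if threshold == 0:
        raise ValueError("the threshold c must be non-zero")
    return abs(d_t) / abs(threshold)
```

With Dt = 1.5 and c = 0.5, `gearing(1.5, 0.5)` gives 3.0. Because a position is only opened when |Dt| exceeds c, the gearing of any trade actually taken is greater than 1.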

[Figure 4.4: axis marked at −C and C (C = change threshold), with regions labelled large down, small down, no change, small up and large up]

Figure 4.4 Volatility forecasts classification

18 For the “weekly” trading strategy, we also considered closing out European options before expiry by taking
the opposite position, unwinding positions at the prevailing implied volatility market rate after five trading
days: this strategy was generally not profitable.
19 Laws and Gidman (2000) adopt a similar strategy with a slightly different definition of the gearing.

Profitability is therefore defined as a volatility net profit (i.e. it is calculated in volatility
points or “vols” as they are called by options traders20). Losses are also defined as
a volatility loss, which implies two further assumptions: when short the straddle, no
stop-loss strategy is actually implemented and the losing trade is closed out at the then
prevailing volatility level (it is thus reasonable to assume that we overestimate potential
losses in a real world environment with proper risk management controls); when long
the straddle, we approximate true losses by the difference between the level of implied
volatility at inception with the prevailing volatility level when closing out the losing trade,
whereas realised losses would only amount to the premium paid at the inception of the
position (here again, we seem to overestimate potential losses). It is further assumed that
volatility profits generated during one period are not reinvested during the next. Finally,
in line with Dunis and Gavridis (1997), transaction costs of 25 bp per trade are included
in our profit and loss computations.
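The volatility profit and loss of a single trade can be sketched as follows. This is an illustration under stated assumptions, not the authors' exact computation: the function name `vol_pnl` is hypothetical, the 25 bp cost is read as 0.25 volatility point per trade, and the cost is deducted before the gearing is applied. The text does not settle either of these two points.

```python
def vol_pnl(side, implied_at_entry, realised_at_exit, d_t, threshold, cost=0.25):
    """Net and geared P/L of one trade, in volatility points ("vols").

    Assumptions (not confirmed by the text): cost=0.25 stands for the 25 bp
    transaction cost, and it is subtracted before gearing is applied.
    """
    if side not in ("long", "short"):
        raise ValueError("side must be 'long' or 'short'")

    # Long volatility gains when realised volatility ends above the implied
    # level paid at inception; short volatility gains in the opposite case.
    raw = realised_at_exit - implied_at_entry
    if side == "short":
        raw = -raw

    net = raw - cost
    g = abs(d_t) / abs(threshold)  # gearing, equation (4.8)
    return net, g * net
```

For a hypothetical long position opened at an implied volatility of 10.0 with Dt = 1.5 and c = 0.5, a realised volatility of 12.0 gives `vol_pnl("long", 10.0, 12.0, 1.5, 0.5)` equal to `(1.75, 5.25)`. Since profits are not reinvested, cumulative figures are plain sums of such per-trade results.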

4.6.3 Trading simulation results
The currency volatility trading strategy was applied from 31 December 1993 to 9 May
2000. Tables 4.3 and 4.4 document our results for the GBP/USD and USD/JPY monthly
trading strategies both for the in-sample period from 31 December 1993 to 9 April 1999
and the out-of-sample period from 12 April 1999 to 9 May 2000. The evaluation discussed
below is focused on out-of-sample performance.
For our trading simulations, four different thresholds ranging from 0.5 to 2.0 and two
different holding periods, i.e. monthly and weekly, have been retained. A higher threshold
level implies requiring a higher degree of reliability in the signals and obviously reduces
the overall number of trades.
The profitability criteria include the cumulative profit and loss with and without gearing,
the total number of trades and the percentage of profitable trades. We also show the average
gearing of the positions for each strategy.
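These criteria can be computed from a list of trades as in the sketch below. The function name `summarise`, the dictionary keys and the input layout (one `(net_pnl, gearing)` pair per trade, in volatility points) are hypothetical, and the sketch assumes that a trade is counted as profitable when its net P/L is strictly positive.

```python
def summarise(trades):
    """Summary statistics for a list of (net_pnl, gearing) pairs.

    Assumption: a trade is "profitable" when its net P/L is strictly positive.
    When no trade was taken, every statistic except the count is None.
    """
    n = len(trades)
    if n == 0:
        return {"total_trades": 0, "pnl_without_gearing": None,
                "pnl_with_gearing": None, "profitable_pct": None,
                "average_gearing": None}

    winners = sum(1 for pnl, _ in trades if pnl > 0)
    return {
        "total_trades": n,
        # Profits are not reinvested, so cumulative P/L is a simple sum.
        "pnl_without_gearing": sum(pnl for pnl, _ in trades),
        "pnl_with_gearing": sum(g * pnl for pnl, g in trades),
        "profitable_pct": 100.0 * winners / n,
        "average_gearing": sum(g for _, g in trades) / n,
    }
```

Calling `summarise([(1.75, 3.0), (-0.5, 1.2), (0.75, 2.0)])` on three hypothetical trades reports three trades, an ungeared cumulative P/L of 2.0 vols and two profitable trades out of three.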
Firstly, we compare the performance of the NNR/RNN models with the benchmark
GARCH (1,1) model. For the GBP/USD monthly volatility trading strategy in Table 4.3,
the GARCH (1,1) model generally produces higher cumulative profits not only in-sample
but also out-of-sample. NNR/RNN models seldom produce a higher percentage of
profitable trades in-sample or out-of-sample, although the geared cumulative return of the
strategy based on the RNN (44-1-1) model is close to that produced with the
benchmark model. With NNR/RNN models predicting directional change more accurately than
the GARCH model, one would have intuitively expected them to show a better trading
performance for the monthly volatility trading strategies.
This expected result is in fact achieved by the USD/JPY monthly volatility trading
strategy, as shown in Table 4.4: NNR/RNN models clearly produce a higher percentage
of profitable trades both in- and out-of-sample, with the best out-of-sample performance
being that based on the RNN (44-5-1) model. On the contrary, the GARCH (1,1)
model-based strategies produce very poor trading results, often recording an overall negative
cumulative profit and loss figure.

20 In market jargon, “vol” refers to both implied volatility and the measurement of volatility in percent per
annum (see, amongst others, Malz (1996)). Monetary returns could only be estimated by comparing the actual
profit/loss of a straddle once closed out or expired against the premium paid/received at inception, an almost
impossible task with OTC options.
Table 4.3 GBP/USD monthly volatility trading strategy

1 Threshold = 0.5 Trading days = 21
In-sample Out-of-sample
Observation (1-1329) (1330-1610)
Period (31/12/1993-09/04/1999) (12/04/1999-09/05/2000)
Models NNR(44-10-5-1) RNN(44-1-1) GARCH(1,1) COM1 GR NNR(44-10-5-1) RNN(44-1-1) GARCH(1,1) COM1 GR
P/L without gearing 58.39% 54.47% 81.75% 54.25% 81.40% 5.77% 5.22% 12.90% 10.35% 11.98%
P/L with gearing 240.44% 355.73% 378.18% 182.38% 210.45% 16.60% 35.30% 42.91% 16.41% 19.44%
Total trades 59 59 61 58 61 12 12 11 10 10
Profitable trades 67.80% 70.00% 77.05% 70.69% 83.61% 50.00% 58.33% 72.73% 70.00% 80.00%
Average gearing 2.83 4.05 3.87 2.35 2.13 1.55 2.17 2.39 1.46 1.44

2 Threshold = 1.0 Trading days = 21
In-sample Out-of-sample
Observation (1-1329) (1330-1610)
Period (31/12/1993-09/04/1999) (12/04/1999-09/05/2000)
Models NNR(44-10-5-1) RNN(44-1-1) GARCH(1,1) COM1 GR NNR(44-10-5-1) RNN(44-1-1) GARCH(1,1) COM1 GR
P/L without gearing 61.48% 57.52% 82.61% 64.50% 60.24% 7.09% 12.74% 12.08% 8.33% 9.65%
P/L with gearing 134.25% 190.80% 116.99% 88.26% 8.65% 20.23% 20.79% 10.53% 11.03%
Total trades 51 51 45 47 7 8 9 5 6
Profitable trades 72.55% 69.09% 80.00% 78.72% 85.71% 87.50% 66.67% 80.00% 83.33%
Average gearing 1.68 2.21 2.08 1.53 1.32 1.18 1.61 1.48 1.19 1.13

3 Threshold = 1.5 Trading days = 21
In-sample Out-of-sample
Observation (1-1329) (1330-1610)
Period (31/12/1993-09/04/1999) (12/04/1999-09/05/2000)
Models NNR(44-10-5-1) RNN(44-1-1) GARCH(1,1) COM1 GR NNR(44-10-5-1) RNN(44-1-1) GARCH(1,1) COM1 GR
P/L without gearing 53.85% 61.13% 66.24% 62.26% 39.22% 9.26% 8.67% 8.93% 10.36% 8.69%
P/L with gearing 74.24% 114.88% 113.04% 80.65% 49.52% 11.16% 11.75% 11.32% 11.00% 9.92%
Total trades 40 40 52 31 24 4 6 6 3 2
Profitable trades 80.00% 71.43% 80.77% 83.87% 83.33% 100.00% 83.33% 83.33% 100% 100.00%
Average gearing 1.28 1.62 1.43 1.24 1.19 1.12 1.31 1.22 1.05 1.14

4 Threshold = 2.0 Trading days = 21
In-sample Out-of-sample
Observation (1-1329) (1330-1610)
Period (31/12/1993-09/04/1999) (12/04/1999-09/05/2000)
Models NNR(44-10-5-1) RNN(44-1-1) GARCH(1,1) COM1 GR NNR(44-10-5-1) RNN(44-1-1) GARCH(1,1) COM1 GR
P/L without gearing 69.03% 63.04% 60.57% 48.35% 20.97% 4.39% 7.80% 10.54% 4.39% -
P/L with gearing 103.33% 98.22% 85.19% 63.05% 24.25% 5.37% 8.71% 11.29% 4.70% -
Total trades 24 33 31 16 8 1 4 4 1 0
Profitable trades 82.76% 78.57% 79.49% 94.12% 88.89% 100.00% 100.00% 100.00% 100% -
Average gearing 1.35 1.40 1.24 1.25 1.16 1.22 1.09 1.06 1.06 -
Note: Cumulative P/L figures are expressed in volatility points.

Secondly, we evaluate the performance of model combinations. The results are quite
disappointing: for both monthly volatility trading strategies, model combinations produce on average
much lower cumulative returns than alternative strategies based on NNR/RNN models for
the USD/JPY volatility and on either the GARCH (1,1) or the RNN (44-1-1) model for
the GBP/USD volatility. As a general rule, the GR combination model fails to clearly
outperform the simple average model combination COM1 during the out-of-sample period,
something already noted by Dunis et al. (2001b).
Overall, with the monthly holding period, RNN model-based strategies show the
strongest out-of-sample trading performance: in terms of geared cumulative profit, they
come first in four out of the eight monthly strategies analysed, and second best in the
remaining four cases. The strategy with the highest return yields a 106.17% cumulative
profit over the out-of-sample period and is achieved for the USD/JPY volatility with the
RNN (44-5-1) model and a filter equal to 0.5.
The results of the weekly trading strategy are presented in Tables G4.1 and G4.2 in
Appendix G. They basically confirm the superior performance achieved through the use
of RNN model-based strategies and the comparatively weak results obtained through the
use of model combination.
Table 4.4 USD/JPY monthly volatility trading strategy

1 Threshold = 0.5 Trading days = 21
In-sample Out-of-sample
Observation (1-1329) (1330-1610)
Period (31/12/1993-09/04/1999) (12/04/1999-09/05/2000)
Models NNR(44-1-1) RNN(44-5-1) GARCH(1,1) COM1 GR NNR(44-1-1) RNN(44-5-1) GARCH(1,1) COM1 GR
P/L without gearing 31.35% 16.42% 19.15% 19.73% 16.21% 20.71% 16.19% 3.21% [displaced negative entries of this row: −26.42%, −8.57%]
P/L with gearing 151.79% 144.52% 6.93% 152.36% 82.92% 63.61% 106.17% 75.78% 11.50%
Total trades 62 62 60 60 61 13 13 12 12 13
Profitable trades 54.84% 51.61% 38.33% 53.33% 60.66% 76.92% 84.62% 50.00% 66.67% 61.54%
Average gearing 3.89 3.98 3.77 2.41 2.36 2.86 3.91 5.73 2.63 1.86

2 Threshold = 1.0 Trading days = 21
In-sample Out-of-sample
Observation (1-1329) (1330-1610)
Period (31/12/1993-09/04/1999) (12/04/1999-09/05/2000)
Models NNR(44-1-1) RNN(44-5-1) GARCH(1,1) COM1 GR NNR(44-1-1) RNN(44-5-1) GARCH(1,1) COM1 GR
P/L without gearing 25.83% 44.59% 40.60% 64.35% 20.70% 21.32% 13.53% 26.96% [displaced negative entries of this row: −9.71%, −1.94%]
P/L with gearing 67.66% 105.01% 76.72% 122.80% 52.81% 45.81% 42.14% 21.41% 46.53%
Total trades 58 58 51 52 12 12 12 11 12
Profitable trades 62.07% 56.90% 58.82% 69.23% 83.33% 83.33% 41.67% 63.64% 83.33%
Average gearing 2.01 2.16 1.97 1.58 1.55 1.97 1.94 3.03 1.41 1.66

3 Threshold = 1.5 Trading days = 21
In-sample Out-of-sample
Observation (1-1329) (1330-1610)
Period (31/12/1993-09/04/1999) (12/04/1999-09/05/2000)
Models NNR(44-1-1) RNN(44-5-1) GARCH(1,1) COM1 GR NNR(44-1-1) RNN(44-5-1) GARCH(1,1) COM1 GR
P/L without gearing 47.07% 23.62% 40.93% 46.63% 84.17% 19.65% 25.54% 1.87% 5.09% 23.80%
P/L with gearing 92.67% 75.69% 86.60% 75.70% 109.49% 40.49% 73.19% 31.31% 10.65% 32.91%
Total trades 51 51 46 37 42 10 10 12 6 10
Profitable trades 64.71% 55.77% 52.17% 59.46% 73.81% 80.00% 80.00% 33.33% 50% 90.00%
Average gearing 1.71 1.72 1.60 1.33 1.35 1.54 1.94 2.21 1.37 1.41

4 Threshold = 2.0 Trading days = 21
In-sample Out-of-sample
Observation (1-1329) (1330-1610)
Period (31/12/1993-09/04/1999) (12/04/1999-09/05/2000)
Models NNR(44-1-1) RNN(44-5-1) GARCH(1,1) COM1 GR NNR(44-1-1) RNN(44-5-1) GARCH(1,1) COM1 GR
P/L without gearing 52.52% 28.98% 39.69% 27.89% 75.99% 34.35% 33.70% 0.11% 7.74% 3.09%
P/L with gearing 202.23% 72.21% 49.77% 35.77% 94.30% 60.33% 64.76% 5.48% [displaced entries of this row: −5.11%, 12.35%]
Total trades 37 37 41 21 26 10 10 12 3 7
Profitable trades 59.46% 61.54% 56.10% 61.90% 80.77% 90.00% 90.00% 33% 100% 71%
Average gearing 1.67 1.59 1.39 1.25 1.26 1.54 1.68 1.54 1.54 1.23
Note: Cumulative P/L figures are expressed in volatility points.

