
sion and generalisation of more established methods. In the case of statistical arbitrage,
cointegration can be thought of as a principled extension of the relative value strategies,
such as “pairs trading”, which are in common use by market practitioners. In the case
of hedging, the use of a cointegration approach can be viewed as extending factor-model
hedging to include situations where the underlying risk factors are not measurable directly,
but are instead manifested implicitly through their effect on asset prices.
The structure of the rest of the chapter is as follows. In Section 2.2 we provide a
more detailed description of the econometric basis of our approach and illustrate the
way in which cointegration models are constructed and how variance ratio tests can be
used as a means of identifying potentially predictable components in asset price dynam-
ics. In Section 2.3 we explain how cointegration can be used to perform implicit factor
hedging. In Section 2.4 we explain how cointegration can be used to construct sophisti-
cated relative-value models as a potential basis for statistical arbitrage trading strategies.
In Section 2.5 we present a controlled simulation in which we show how cointegration
methods can be used to “reverse engineer” certain aspects of the underlying dynamics
of a set of time series. In Section 2.6 we describe the application of cointegration tech-
niques to a particular set of asset prices, namely the daily closing prices of the 50 equities
which constituted the STOXX 50 index as of 4 July 2002; a detailed description of the
methodology is provided along with a discussion of the accompanying spreadsheet which
contains the analysis itself. Finally, Section 2.7 contains a brief discussion of further
practical issues together with a concluding summary of the chapter.

2.2 TIME SERIES MODELLING AND COINTEGRATION
In this section we review alternative methods for representing and modelling time series.
Whilst often overlooked, the choice of problem representation can play a decisive role in
determining the success or failure of any subsequent modelling or forecasting procedure.
In particular, the representation will determine the extent to which the statistical properties
of the data are stable over time, or “stationary”.
Stable statistical properties are important because most types of model are more suited
to tasks of interpolation (queries within the range of past data) rather than extrapolation
(queries outside the range of known data). Where the statistical properties of a system
are “nonstationary”, i.e. changing over time, future queries may lie in regions outside the
known data range, resulting in a degradation in the performance of any associated model.
The most common solution to the problems posed by nonstationarity is to attempt to
identify a representation of the data which minimises these effects. Figure 2.1 illustrates
Cointegration to Hedge and Trade International Equities 43
Figure 2.1 Time series with different characteristics, particularly with regard to stationarity: (top
left) stationary time series; (top right) trend-stationary time series; (bottom left) integrated time
series; (bottom right) cointegrated time series. Each panel plots value ($y_t$) against time ($t$)


different classes of time series from the viewpoint of the transformations that are required
to achieve stationarity.
A naturally stationary series, such as that shown in the top-left chart, is one which has a
stable range of values over time. Such a series can be directly included in a model, either
as a dependent or independent variable, without creating any undue risk of extrapolation.
The top-right chart shows an example of a “trend-stationary” variable; it is stationary
around a known trend which is a deterministic function of time. A stationary representation
of such a variable can be obtained by “de-trending” the variable relative to the underlying
trend. Some economic time series fall into this category.
Series such as that in the bottom-left chart are known as “difference stationary” because
the period-to-period differences in the series are stationary although the series itself is not.
Turning this around, such series can also be viewed as “integrated series”, which represent
the integration (sum) of a stationary time series. Artificial random-walk series and most
asset prices fall into this category, i.e. prices are nonstationary but price differences,
returns, are stationary.
The two series in the bottom-right chart represent a so-called cointegrated set of vari-
ables. Whilst the individual series are nonstationary we can construct a combined series
(in this case the difference between the two) which is stationary. As we shall demonstrate
below, some sets of asset prices exhibit cointegration to a greater or lesser degree, leading
to interesting and valuable opportunities for both trading and hedging the assets within
the set. Another way of looking at cointegration is that we are “de-trending” the series
against each other, rather than against time.
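The four classes of series can be illustrated with a small simulation. The following is a minimal sketch rather than part of the chapter's own analysis: the trend slope of 0.05, the unit noise variance and the sample length are arbitrary assumptions chosen only for illustration. It shows the transformation that recovers a stationary series in each case: de-trending against time, differencing, and taking the spread between two cointegrated series.

```python
import random

rng = random.Random(1)   # fixed seed so the sketch is reproducible
n = 2000                 # illustrative sample length (an assumption)
noise = [rng.gauss(0.0, 1.0) for _ in range(n)]

# Stationary: fluctuates within a stable range.
stationary = noise[:]

# Trend-stationary: stationary noise around a deterministic trend.
trend_stationary = [0.05 * t + noise[t] for t in range(n)]

# Integrated: the running sum of a stationary series (a random walk).
integrated, level = [], 0.0
for shock in noise:
    level += shock
    integrated.append(level)

# Cointegrated pair: a second series sharing the same random-walk component.
partner = [value + rng.gauss(0.0, 1.0) for value in integrated]


def variance(values):
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


# Transformations that recover stationarity in each case.
detrended = [y - 0.05 * t for t, y in enumerate(trend_stationary)]
differenced = [b - a for a, b in zip(integrated, integrated[1:])]
spread = [p - i for p, i in zip(partner, integrated)]

for name, series in [("trend-stationary", trend_stationary),
                     ("integrated", integrated),
                     ("de-trended", detrended),
                     ("differenced", differenced),
                     ("spread", spread)]:
    print(f"{name:>16}: variance = {variance(series):.2f}")
```

The raw trend-stationary and integrated series have variances that depend on the sample length, whereas the three transformed series all have variance close to that of the underlying noise.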
The class into which a time series or set of time series fall, whether stationary, inte-
grated, or cointegrated, has important implications both for the modelling approach which
should be adopted and the nature of any potentially predictable components that the time
series may contain. Details of a wide range of statistical tests, for identifying both the type
of time series (stationary, nonstationary, cointegrated) and the presence of any potentially
predictable component in the time series dynamics, are provided in Burgess (1999). In
this chapter we will concentrate on two main tests: regression-based tests for the presence
of cointegration, and variance ratio tests for the presence of potential predictability.
44 Applied Quantitative Methods for Trading and Investment

The most popular method of testing for cointegration is that introduced by Granger
(1983) and is based upon the concept of a “cointegrating regression”. In this approach a
particular time series (the “target series”) $y_{0,t}$ is regressed upon the remainder of the set
of time series (the “cointegrating series”) $y_{1,t}, \ldots, y_{n,t}$:

$$y_{0,t} = \alpha + \beta_1 y_{1,t} + \beta_2 y_{2,t} + \cdots + \beta_n y_{n,t} + d_t \tag{2.1}$$

If the series are cointegrated then statistical tests will indicate that $d_t$ is stationary and the
parameter vector $\boldsymbol{\alpha} = (1, -\alpha, -\beta_1, -\beta_2, \ldots, -\beta_n)$ is referred to as the cointegrating
vector. Two standard tests recommended by Engle and Granger (1987) are the Dickey-Fuller
(DF) and the Cointegrating Regression Durbin-Watson (CRDW). The Dickey-Fuller test
is described later in this chapter, as part of the controlled simulation in Section 2.5. An
extensive review of approaches to constructing and testing for cointegrating relationships
is contained in Burgess (1999).
Variance ratio tests are a powerful way of testing for potential predictability in time
series dynamics. They are derived from a property of unpredictable series where the
variance of the differences in the series grows linearly with the length of the period over
which they are measured. A simple intuition for this property is presented in Figure 2.2.
In the limiting case where all steps are in the same direction the variance of the series will
grow as a function of time squared; at the other extreme of pure reversion the variance of the
series will be independent of time (and close to zero). A random diffusion will be a weighted
combination of both behaviours and will exhibit variance which grows linearly with time.
This effect has been used as the basis of statistical tests for deviations from random-walk
behaviour by a number of authors starting with Lo and MacKinlay (1988) and Cochrane
(1988). The motivation for testing for deviations from random-walk behaviour is that
they suggest the presence of a potentially predictable component in the dynamics of a
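time series. The $\tau$-period variance ratio is simply the normalised ratio of the variance of
$\tau$-period differences to the variance of single-period differences:

$$VR(\tau) = \frac{\sum_t \left( \Delta^{\tau} y_t - \overline{\Delta^{\tau} y} \right)^2}{\tau \sum_t \left( \Delta y_t - \overline{\Delta y} \right)^2} \tag{2.2}$$

where $\Delta^{\tau} y_t = y_t - y_{t-\tau}$ is the $\tau$-period difference and overbars denote sample means.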


Figure 2.2 The relationship between variance and time for a simple diffusion process. For a
price that moves by $r$ per period, the one-period variance is $r^2$ in every case, whereas the
two-period variance is $(r + r)^2 = 4r^2$ for a perfect trend, $(r - r)^2 = 0$ for perfect reversion,
and $2r^2$ for a random series (a 50/50 mix of trend and reversion)
Figure 2.3 Example time series with different characteristics (left) and their variance ratio
functions (right). Both panels show trending, random walk and mean-reverting series; the left
panel plots value against time and the right panel plots variance ratio (0 to 5) against
period (1 to 10)

By viewing the variance ratio statistics for different periods collectively, we form the
variance ratio function (VRF) of the time series (Burgess, 1999). A positive gradient to
the VRF indicates positive autocorrelation in the time series dynamics and hence trending
behaviour; conversely a negative gradient to the VRF indicates negative autocorrela-
tion and mean-reverting or cyclical behaviour. Figure 2.3 shows examples of time series
with different characteristics, together with their associated VRFs. Further examples are
contained in Burgess (1999).
For the random walk series, the variance grows linearly with the period $\tau$ and hence
the VRF remains close to one. For a trending series the variance grows at a greater than
linear rate and so the VRF rises as the period over which the differences are calculated
increases. Finally, for the mean-reverting series the converse is true: the variance grows
sublinearly and hence the VRF falls below one.
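The behaviour of the VRF for the three types of series can be sketched in a few lines of code. This is an illustrative implementation under stated assumptions, not the chapter's own: the function name `variance_ratio`, the simulated processes (autoregressive coefficients 0.5 and 0.8) and the sample length are all hypothetical choices. The function also divides each sum of squares by its own number of observations, a small normalisation that the definition of the variance ratio leaves implicit.

```python
import random


def variance_ratio(series, tau):
    """Variance of tau-period differences divided by tau times the
    variance of one-period differences."""
    def mean_sq_dev(values):
        mean = sum(values) / len(values)
        return sum((v - mean) ** 2 for v in values) / len(values)

    one_period = [series[t] - series[t - 1] for t in range(1, len(series))]
    tau_period = [series[t] - series[t - tau] for t in range(tau, len(series))]
    return mean_sq_dev(tau_period) / (tau * mean_sq_dev(one_period))


rng = random.Random(0)
random_walk, trending, reverting = [0.0], [0.0], [0.0]
step = 0.0
for _ in range(5000):
    shock = rng.gauss(0.0, 1.0)
    # Random walk: independent steps.
    random_walk.append(random_walk[-1] + shock)
    # Trending: positively autocorrelated steps.
    step = 0.5 * step + shock
    trending.append(trending[-1] + step)
    # Mean reverting: the level is pulled back towards zero.
    reverting.append(0.8 * reverting[-1] + shock)

vr_walk = variance_ratio(random_walk, 10)
vr_trend = variance_ratio(trending, 10)
vr_revert = variance_ratio(reverting, 10)
print(f"VR(10): random walk {vr_walk:.2f}, "
      f"trending {vr_trend:.2f}, mean reverting {vr_revert:.2f}")
```

Evaluating `variance_ratio` over a range of periods, for example 1 to 10, traces out the VRF of each series.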

2.3 IMPLICIT HEDGING OF UNKNOWN COMMON
RISK FACTORS
The relevance of cointegration to hedging is based upon the recognition that much of the
“risk” or stochastic component in asset returns is caused by variations in factors which have
a common effect on many assets. This viewpoint forms the basis of traditional asset pricing
models such as the CAPM (Capital Asset Pricing Model) of Sharpe (1964) and the APT
(Arbitrage Pricing Theory) of Ross (1976). Essentially these pricing models take the form:
$$\Delta y_{i,t} = \alpha_i + \beta_{i,Mkt}\,\Delta Mkt_t + \beta_{i,1}\,\Delta f_{1,t} + \cdots + \beta_{i,n}\,\Delta f_{n,t} + \varepsilon_{i,t} \tag{2.3}$$

This general formulation relates changes in asset prices $\Delta y_t$ to sources of systematic risk
(changes in the market, $\Delta Mkt_t$, and in other economic “risk factors”, $\Delta f_{j,t}$) together with
an idiosyncratic asset-specific component $\varepsilon_{i,t}$.
The presence of market-wide risk factors creates the possibility of hedging or reducing
risk through the construction of appropriate combinations of assets. Consider a portfolio
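consisting of a long (bought) position in an asset $y_1$ and a short (sold) position in an asset
$y_2$. If the asset price dynamics in each case follow a data-generating process of the form
shown in equation (2.3), then the combined returns $\Delta y_{1,t} - \Delta y_{2,t}$ are given by:

$$\begin{aligned}
\Delta y_{1,t} - \Delta y_{2,t} = {} & (\alpha_1 - \alpha_2) \\
& + (\beta_{1,Mkt} - \beta_{2,Mkt})\,\Delta Mkt_t + (\beta_{1,1} - \beta_{2,1})\,\Delta f_{1,t} + \cdots + (\beta_{1,n} - \beta_{2,n})\,\Delta f_{n,t} \\
& + (\varepsilon_{1,t} - \varepsilon_{2,t})
\end{aligned} \tag{2.4}$$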
Figure 2.4 Attribution of price variance across risk factors: whilst the individual assets Y1 and
Y2 are primarily influenced by changes in market-wide risk factors, the price changes of the
“synthetic asset” Y1 − Y2 are largely immunised from such effects. The bar chart shows the
percentage of variance (0% to 100%) for Asset Y1, Asset Y2 and Synthetic Y1 − Y2, attributed
to $\beta_{asset,Mkt}$, $\beta_{asset,1}$, $\beta_{asset,2}$, Idiosyncratic 1 and Idiosyncratic 2, with totals for
variance from market factors and idiosyncratic variance


If the factor exposures are similar, i.e. β1,j ≈ β2,j , then the proportion of variance which
is caused by market-wide factors will be correspondingly reduced. This effect is illustrated
in Figure 2.4.
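The effect can be checked with simple arithmetic. The sketch below uses purely hypothetical exposures and variances (none of the numbers come from the chapter) and assumes that the factors and idiosyncratic terms are mutually uncorrelated, so that variances add. The helper `variance_shares` is likewise an illustrative name.

```python
def variance_shares(betas, factor_vars, idio_var):
    """Split the variance of a price change into the share due to common
    factors and the share due to the idiosyncratic term, assuming all
    sources of risk are uncorrelated."""
    factor_part = sum(b * b * v for b, v in zip(betas, factor_vars))
    total = factor_part + idio_var
    return factor_part / total, idio_var / total


# Hypothetical exposures to (Mkt, f1, f2); factor variances set to 1.
factor_vars = [1.0, 1.0, 1.0]
beta_y1 = [1.0, 0.5, 0.3]
beta_y2 = [0.9, 0.6, 0.2]
idio_var_1 = idio_var_2 = 0.25

share_y1, _ = variance_shares(beta_y1, factor_vars, idio_var_1)
share_y2, _ = variance_shares(beta_y2, factor_vars, idio_var_2)

# Long Y1, short Y2: factor exposures net off, idiosyncratic variances add.
beta_spread = [b1 - b2 for b1, b2 in zip(beta_y1, beta_y2)]
share_spread, idio_share_spread = variance_shares(
    beta_spread, factor_vars, idio_var_1 + idio_var_2)

print(f"factor share of variance: Y1 {share_y1:.1%}, Y2 {share_y2:.1%}, "
      f"Y1 - Y2 {share_spread:.1%}")
```

With these illustrative numbers each asset owes more than 80% of its variance to the common factors, whereas for the long-short combination the figure is below 10%.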
A common approach to hedging is to assume that we can explicitly identify at least
reasonable approximations to the underlying risk factors fj,t and factor sensitivities βi,j
and then to create portfolios in which the combined exposure to the different risk factors
lies within a desired tolerance. However, in cases where this may not be the optimal
approach, cointegration provides an alternative method of implicitly hedging the common
underlying sources of risk.
More specifically, given an asset universe $U_A$ and a particular “target asset”, $T \in U_A$, a
cointegrating regression can be used to create a “synthetic asset” $SA(T)$ which is a linear
combination of assets which exhibits the maximum possible long-term correlation with the
target asset $T$. The coefficients of the linear combination are estimated by regressing the
historical price of $T$ on the historical prices of a set of “constituent” assets $C \subset U_A - T$:

$$SA(T)_t = \sum_{C_i \in C} \beta_i C_{i,t} \quad \text{s.t.} \quad \{\beta_i\} = \arg\min \sum_{t=1,\ldots,n} \Big( T_t - \sum_{C_i \in C} \beta_i C_{i,t} \Big)^2 \tag{2.5}$$

As the aim of the regression is to minimise the squared differences, this is a standard
ordinary least squares (OLS) regression, and the optimal “cointegrating vector” $\beta = (\beta_1, \ldots, \beta_{n_c})^T$
of constituent weights can be calculated directly by:

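$$\beta_{OLS} = (\mathbf{C}^T \mathbf{C})^{-1} \mathbf{C}^T \mathbf{t} \tag{2.6}$$

where $\mathbf{C}$ is the $n \times n_c$ matrix of historical prices of the constituents (one column per
constituent, with $n_c = |C|$) and $\mathbf{t} = (T_1, \ldots, T_n)^T$ is the vector of historical prices of the target asset.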
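The OLS calculation of the constituent weights can be sketched without any external libraries by forming and solving the normal equations. This is a minimal illustration, not the chapter's own implementation: the function names `solve_linear` and `cointegrating_weights` and the toy prices are hypothetical, and in practice a numerical library would normally be used instead of hand-written elimination.

```python
def solve_linear(a, b):
    """Solve the linear system a x = b by Gaussian elimination with
    partial pivoting (a is a square list of lists)."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def cointegrating_weights(target, constituents):
    """OLS weights (no intercept) from the normal equations; constituents
    is a list of price series, each the same length as target."""
    k = len(constituents)
    gram = [[sum(ci * cj for ci, cj in zip(constituents[i], constituents[j]))
             for j in range(k)] for i in range(k)]
    moment = [sum(ci * ti for ci, ti in zip(constituents[i], target))
              for i in range(k)]
    return solve_linear(gram, moment)


# Toy prices: the target is an exact combination of two constituents,
# so the regression should recover the weights 0.4 and 0.7.
c1 = [10.0, 11.0, 12.5, 12.0, 13.0]
c2 = [20.0, 19.0, 21.0, 22.5, 22.0]
target = [0.4 * a + 0.7 * b for a, b in zip(c1, c2)]

weights = cointegrating_weights(target, [c1, c2])
synthetic = [weights[0] * a + weights[1] * b for a, b in zip(c1, c2)]
print("weights:", [round(w, 6) for w in weights])
```

With real prices the target is not an exact combination of the constituents, and the difference between `target` and `synthetic` is the residual of the cointegrating regression.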
The standard properties of the OLS procedure used in regression ensure both that the
synthetic asset will be an unbiased estimator for the target asset, i.e. E[Tt ] = SA(T )t , and
also that the deviation between the two price series will be minimal in a mean-squared-error
sense. The synthetic asset can be considered an optimal statistical hedge for the target
series, given a particular set of constituent assets C.
From an economic perspective the set of constituent assets C act as proxies for the
unobserved common risk factors. In maximising the correlation between the target asset
and the synthetic asset the construction procedure cannot (by definition) account for
the “asset-specific” components of price dynamics, but must instead indirectly optimise
the sensitivities to common sources of economic risk. The synthetic asset represents a
combination which as closely as possible matches the underlying factor exposures of
the target asset without requiring either the risk factors or the exposures to be identified
explicitly. In Section 2.5, this procedure is illustrated in detail by a controlled experiment
in which the cointegration approach is applied to simulated data with known properties.


2.4 RELATIVE VALUE AND STATISTICAL ARBITRAGE
In the previous section we saw that appropriately constructed combinations of prices can
be largely immunised against market-wide sources of risk. Such combinations of assets
are potentially amenable to statistical arbitrage because they represent opportunities to
exploit predictable components in asset-specific price dynamics in a manner which is
(statistically) independent of changes in the level of the market as a whole, or other market-
wide sources of risk. Furthermore, as the asset-specific component of the dynamics is not
directly observable by market participants it is plausible that regularities in the dynamics
may exist from this perspective which have not yet been “arbitraged away” by market
participants.
To motivate the use of statistical arbitrage strategies, we briefly relate the opportunities
they offer to those of more traditional “riskless” arbitrage strategies. The basic concept
of riskless arbitrage is that where the future cash-flows of an asset can be replicated by
a combination of other assets, the price of forming the replicating portfolio should be
approximately the same as the price of the original asset. Thus the no-arbitrage condition
can be represented in a general form as:

$$|\text{payoff}(X_t - SA(X_t))| < \text{Transaction cost} \tag{2.7}$$

where Xt is an arbitrary asset (or combination of assets), SA(Xt ) is a “synthetic asset”
which is constructed to replicate the payoff of Xt and “transaction cost” represents the net
costs involved in constructing (buying) the synthetic asset and selling the “underlying”
Xt (or vice versa). This general relationship forms the basis of the “no-arbitrage” pricing
approach used in the pricing of financial “derivatives” such as options, forwards and
futures.1 From this perspective, the price difference $X_t - SA(X_t)$ can be thought of as
the mispricing between the two (sets of) assets.

1 See Hull (1993) for a good introduction to derivative securities and no-arbitrage relationships.

A specific example of riskless arbitrage is index arbitrage in the UK equities market.
Index arbitrage (see for example Hull (1993)) occurs between the equities constituting a
particular market index, and the associated futures contract on the index itself. Typically
the futures contract $F_t$ will be defined so as to pay a value equal to the level of the index
at some future “expiration date” $T$. Denoting the current (spot) stock prices as $S_t^i$, the
no-arbitrage relationship, specialising the general case in equation (2.7), is given by:

$$\left| F_t - \sum_i w_i S_t^i e^{(r - q_i)(T - t)} \right| < \text{cost} \tag{2.8}$$

where $w_i$ is the weight of stock $i$ in determining the market index, $r$ is the risk-free
interest rate, and $q_i$ is the dividend rate for stock $i$. In the context of equation (2.7) the
weighted combination of constituent equities can be considered as the synthetic asset
which replicates the index futures contract.
When the “basis” $F_t - \sum_i w_i S_t^i e^{(r - q_i)(T - t)}$ exceeds the transaction costs of a particular
trader, the arbitrageur can “lock in” a riskless profit by selling the (overpriced) futures
contract $F_t$ and buying the (underpriced) combination of constituent equities. When the
magnitude of the mispricing between the spot and future grows, there are frequently large
corrections in the basis which are caused by index arbitrage activity, as illustrated in
Figure 2.5 for the UK FTSE 100 index.
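The index arbitrage rule can be sketched as a short calculation: the fair value of the futures contract is the weighted sum of spot prices, each grown at the risk-free rate net of its dividend rate, and a trade is signalled only when the basis exceeds the transaction cost. The function names, the two-stock index and all numerical inputs below are hypothetical and serve only to illustrate the arithmetic.

```python
import math


def fair_futures_value(weights, spots, r, dividend_rates, time_to_expiry):
    """Weighted sum of spot prices, each carried forward at the
    risk-free rate r net of its own dividend rate."""
    return sum(w * s * math.exp((r - q) * time_to_expiry)
               for w, s, q in zip(weights, spots, dividend_rates))


def arbitrage_signal(futures_price, fair_value, cost):
    """Compare the basis (futures minus fair value) with the cost of
    trading both legs."""
    basis = futures_price - fair_value
    if basis > cost:
        return "sell futures, buy stocks"
    if basis < -cost:
        return "buy futures, sell stocks"
    return "no trade"


# Hypothetical two-stock index, three months (0.25 years) to expiry.
weights = [2.0, 3.0]
spots = [100.0, 50.0]
fair = fair_futures_value(weights, spots, r=0.06,
                          dividend_rates=[0.03, 0.02], time_to_expiry=0.25)

print(f"fair value = {fair:.3f}")
print(arbitrage_signal(355.0, fair, cost=1.0))
print(arbitrage_signal(353.5, fair, cost=1.0))
```

In this example a futures price of 355.0 lies more than one unit of cost above fair value and triggers the sale of the futures contract, whereas a price of 353.5 falls inside the no-arbitrage band.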
Many complex arbitrage relationships exist and “riskless” arbitrage is an important
subject in its own right. However, such strategies are inherently self-limiting: as
competition amongst arbitrageurs grows, the magnitude and duration of mispricings decrease.
Furthermore, in practice, even arbitrage which is technically “riskless” will still involve a
certain level of risk due to uncertain future dividend rates $q_i$, trading risks, and so on. From
this perspective the true attraction of index arbitrage strategies lies less in the theoretical
price relationship than in a favourable property of the mispricing dynamics, namely a
tendency for the basis risk to “mean revert” or fluctuate around a stable level.

Figure 2.5 Illustration of index arbitrage opportunities in the UK equity market; the data con-
sists of 3200 prices for the FTSE 100 index (in bold) and the derivative futures contract expiring
Sept. 98; the lower curve shows the so-called “basis”, the deviation from the theoretical fair price
relationship between the two series; the data sample covers the period from 10.40am to 4pm on
15 September 1998; some of the abrupt price shifts will be due to arbitrage activity. Axes: basis,
future − spot (−20 to 100) and index level (5140 to 5280), against time of day
Building upon this insight, the premise of “statistical arbitrage” is that regularities in
combinations of asset prices can be exploited as the basis of profitable trading strategies,
irrespective of the presence or absence of a theoretical fair price relationship between the
set of assets involved.
Whilst clearly subject to a higher degree of risk than “true” arbitrage strategies, statisti-
cal arbitrage opportunities offer the hope of being both more persistent and more prevalent
in the markets. More persistent because risk-free arbitrage opportunities are rapidly elim-
inated by market activity. More prevalent because in principle they may occur between
any set of assets rather than solely in cases where a suitable “risk-free” hedging strategy
can be implemented.
A simple form of statistical arbitrage is “pairs trading”, which is in common use by a
number of market participants, such as hedge funds, proprietary trading desks and other
“risk arbitrageurs”. Pairs trading is based on a relative value analysis of two asset prices.
The two assets might be selected either on the basis of intuition, economic fundamentals,
long-term correlations or simply past experience. A promising candidate for a pairs strat-
egy might look like the example in Figure 2.6, between HSBC and Standard Chartered.
The pairs in Figure 2.6 show a clear similarity to the riskless arbitrage opportunities
shown in Figure 2.5. In both cases the two prices “move together” in the long term,
with temporary deviations from the long-term correlation which exhibit a strong mean-
reversion pattern. Note however that in the “statistical arbitrage” case the magnitude of
the deviations is greater (around ±10% as opposed to <0.5%) and so is the time period
over which the price corrections occur (days or weeks as opposed to seconds or minutes).
Opportunities for pairs trading in this simple form, however, are dependent upon the
existence of similar pairs of assets and thus are naturally limited. By constructing synthetic
“pairs” in the form of appropriate combinations of two or more assets, cointegration
techniques provide a sophisticated and powerful method to generalise the relative value
approach and create a wider range of potential trading opportunities. Once a cointegrating

Figure 2.6 Illustration of potential statistical arbitrage opportunities in the UK equity market;
the chart shows equity prices for Standard Chartered and HSBC, sampled on an hourly basis from
20 August to 30 September 1998. Note the mean-reverting nature of the deviation. Axes: prices
(0 to 700) for STAN and HSBC and deviation (−100 to 200), against date (20/08/98 to 29/09/98)
regression has been performed to estimate the “fair price” relationship between a set of
assets, tools such as variance ratio analysis can be used to detect deterministic components
in the mispricing dynamics that could be used as the basis of a “statarb” strategy.
In this and the previous section we have provided a motivation for the use of co-
integration-based techniques for both hedging and trading. In the following section we
supplement this qualitative motivation with some quantitative results obtained from apply-
ing the techniques in a controlled simulation with known time series dynamics.

2.5 ILLUSTRATION OF COINTEGRATION
IN A CONTROLLED SIMULATION
Now that we have described the rationale for applying cointegration-based techniques
in trading, the next sections provide examples of how these techniques can be used in
practice. In Section 2.6 we will explore the application of cointegration techniques to real
asset prices. But before we do that, this section highlights the way in which the techniques
work by means of an artificial example in which the underlying dynamics of the time series
are controlled. Consider the example of a set of three assets, each following a two-factor
version of the data-generating process shown in equation (2.3). In this controlled example
we specify the factor exposures of three assets X, Y and Z as shown in Table 2.1, i.e. price
changes within the set of three assets X, Y and Z are driven by a total of five factors, two
common risk factors $f_1$ and $f_2$ and three asset-specific components $\varepsilon_1$, $\varepsilon_2$, $\varepsilon_3$. Furthermore
let us specify that $f_1$ and $f_2$ follow random-walk processes whilst the dynamics of the
asset-specific factors contain a mean-reverting component. As discussed in Section 2.4,
these dynamics might also be plausibly the case in reality because predictable effects in
market-wide factors would be easily observed and thus “arbitraged away”, whilst small
predictable components in asset-specific dynamics might be less obvious and hence also
more persistent.
Based on the assumptions described above, let us specify the full dynamics of the
resulting time series by the following equations:

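$$\begin{aligned}
\Delta f_{i,t} &= \eta_{i,t}, & i &= 1, 2, & \eta_{i,t} &\sim N(0, 1) \\
\Delta \varepsilon_{j,t} &= -0.1\,\varepsilon_{j,t} + e_{j,t}, & j &= 1, 2, 3, & e_{j,t} &\sim N(0, 0.25) \\
\Delta X_t &= \Delta f_{1,t} + \Delta f_{2,t} + \Delta \varepsilon_{1,t} \\
\Delta Y_t &= \Delta f_{1,t} + 0.5\,\Delta f_{2,t} + \Delta \varepsilon_{2,t} \\
\Delta Z_t &= 0.5\,\Delta f_{1,t} + \Delta f_{2,t} + \Delta \varepsilon_{3,t}
\end{aligned} \tag{2.9}$$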

Table 2.1 Price sensitivity of three assets X, Y
and Z to changes in common risk factors $f_1$ and
$f_2$ and asset-specific effects $\varepsilon_1$, $\varepsilon_2$, $\varepsilon_3$

Asset    f1     f2     ε1     ε2     ε3

X        1      1      1      0      0
Y        1      0.5    0      1      0
Z        0.5    1      0      0      1
Figure 2.7 Realisation of three simulated asset price series which are driven by two underlying
common factors in addition to asset-specific components. The chart plots X, Y and Z (values
between about −5 and 20) over observations 1 to 501

i.e. the unobserved “factor” dynamics of $f_1$ and $f_2$ are driven by the pure noise terms $\eta_{i,t}$;
the also-unobserved asset-specific dynamics $\varepsilon_{j,t}$ are a combination of noise terms $e_{j,t}$ with
“error correction” mean-reversion terms with parameter $-0.1$; the observed asset dynamics
$X_t$, $Y_t$ and $Z_t$ are determined by their different exposures to the five underlying factors.
The precise “shapes” of the time series will depend on the sampled innovations $\eta_{i,t}$
and $e_{j,t}$. A particular realisation of the asset prices generated by the system is shown in
Figure 2.7 and this is used as the basis of the analysis below. Note that the common
factor exposures create a broad similarity between the observed price movements of the
three assets.
As described in Section 2.3, we estimate the underlying fair price relationship from the
observed data by performing a cointegrating regression. In this case, we arbitrarily select
X as the “target series” and regress on the other two “cointegrating series” Y and Z. The
resulting relationship estimated by the regression is given by:

$$X_t = 0.632\,Y_t + 0.703\,Z_t + m_t \tag{2.10}$$

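Due to sampling error, the estimated relationship differs slightly from the true underlying
relationship $X_t = \tfrac{2}{3} Y_t + \tfrac{2}{3} Z_t + m_t^*$, which would precisely cancel the factor exposures
and leave a pure combination ($m_t^*$) of the asset-specific terms. (Given the exposures in
Table 2.1, weights $\beta_Y$ and $\beta_Z$ cancel both common factors when $\beta_Y + 0.5\beta_Z = 1$ and
$0.5\beta_Y + \beta_Z = 1$, i.e. $\beta_Y = \beta_Z = 2/3$.) However, it is clear that the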
cointegrating regression has been able to construct a combination which largely neutralises
the common risk factors, and that it has done this without any explicit knowledge of (or
even estimation of) the factor exposures shown in Table 2.1. It is because they bypass
the need to estimate explicit factor exposures that we refer to cointegration techniques as
performing “implicit” hedging of market-wide risk factors.
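The experiment can be reproduced in outline with a short simulation. The sketch below is not the chapter's own code and will not reproduce its exact estimates: the random seed is arbitrary, the sample is deliberately longer than the one plotted in the chapter so that sampling error is smaller, and the second argument of N(0, 0.25) is read as a variance (standard deviation 0.5). The regression has no intercept, matching the form of the estimated relationship above, and the two-regressor normal equations are solved in closed form.

```python
import random

rng = random.Random(42)   # arbitrary seed
n_obs = 5000              # assumption: longer sample to reduce sampling error

factors = [0.0, 0.0]         # levels of the common factors f1, f2
specific = [0.0, 0.0, 0.0]   # levels of the asset-specific terms
X, Y, Z = [], [], []
for _ in range(n_obs):
    # Random-walk factors: each change is N(0, 1) noise.
    factors = [f + rng.gauss(0.0, 1.0) for f in factors]
    # Mean-reverting specifics: change = -0.1 * level + noise (std 0.5).
    specific = [e - 0.1 * e + rng.gauss(0.0, 0.5) for e in specific]
    f1, f2 = factors
    X.append(f1 + f2 + specific[0])
    Y.append(f1 + 0.5 * f2 + specific[1])
    Z.append(0.5 * f1 + f2 + specific[2])


def dot(a, b):
    return sum(u * v for u, v in zip(a, b))


def variance(values):
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


# Normal equations for X = beta_y * Y + beta_z * Z + m (no intercept).
syy, szz, syz = dot(Y, Y), dot(Z, Z), dot(Y, Z)
sxy, sxz = dot(X, Y), dot(X, Z)
det = syy * szz - syz ** 2
beta_y = (sxy * szz - sxz * syz) / det
beta_z = (sxz * syy - sxy * syz) / det

mispricing = [x - beta_y * y - beta_z * z for x, y, z in zip(X, Y, Z)]

print(f"beta_Y = {beta_y:.3f}, beta_Z = {beta_z:.3f} (true value 2/3 each)")
print(f"var(X) = {variance(X):.1f}, var(mispricing) = {variance(mispricing):.2f}")
```

The estimated weights should lie close to 2/3, and the variance of the mispricing should be a small fraction of the variance of X, because the random-walk factors have been largely cancelled.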
In this example, the asset-specific dynamics have been constructed so as to be mean
reverting, so the error term of the regression can be considered as a statistical “mispricing”
which represents the temporary deviation from the estimated “fair price” relationship
between the three assets. Unlike the nonstationary asset prices X, Y and Z, the estimated
mispricing mt , which is illustrated in Figure 2.8, can clearly be seen to be mean reverting.
The mean-reverting nature of the mispricing time series, compared to the close to
random-walk behaviour of the original time series X, Y and Z, is highlighted by the
variance ratio profiles shown in Figure 2.9. Whilst the variance ratio for all three original
assets remains close to unity in each case, the variance ratio of the mispricing falls sub-
stantially below one as the period over which the differences are calculated increases. This
indicates that the volatility which is present in the short-term dynamics is not reflected in
Figure 2.8 The estimated “mispricing” time series, $m_t = X_t - (0.632Y_t + 0.703Z_t)$, plotted
over observations 1 to 501 (values between about −2 and 1.5)

Figure 2.9 Variance ratio profiles for the time series X, Y and Z and $m_t = X_t - (0.632Y_t + 0.703Z_t)$.
The chart plots the variance ratio (0 to 1.4) against period (1 to 91) for VR(X), VR(Y), VR(Z)
and VR(Mis)


the long-term volatility, thus providing evidence for a substantial mean-reverting compo-
nent in the mispricing dynamics.
Let us now evaluate the effectiveness of the cointegration procedure at “reverse en-
gineering” the underlying factor dynamics. In attempting to replicate the “target” time
series X the cointegrating regression procedure creates the “synthetic asset” $0.632Y + 0.703Z$
which has similar exposures to the common factors $f_1$ and $f_2$. Thus in the mispricing
time series $X_t - (0.632Y_t + 0.703Z_t)$ the net exposure to the common factors is
close to zero, allowing the mean-reverting asset-specific effects $\varepsilon_1$, $\varepsilon_2$, $\varepsilon_3$ to dominate the
mispricing dynamics. This “statistical hedging” of the common risk factors is quantified
in Table 2.2, which reports the proportion of the variance of each observed time series
which is associated with each of the underlying factors.
This demonstrates that the use of cointegrating regression can immunise against com-
mon underlying factors which are not observed directly but instead proxied by the
observed asset prices. Whilst the variance of changes in the original time series X, Y
and Z is primarily (70-90%) associated with the common risk factors $f_1$ and $f_2$, the
effect of these factors on the mispricing $m$ is minimal (0.2%). Conversely, the relative
effect of the asset-specific factors is greatly magnified, growing from 10-30% in the
original time series to 99.8% in the relative mispricing $m$.
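The near-cancellation of the common factors can be verified directly from the exposures in Table 2.1 and the estimated weights 0.632 and 0.703. The following sketch computes the net exposures of the mispricing and the theoretical variance shares they imply. The variances of the one-period changes are derived here from the stated dynamics (unit variance for the factor changes; a stationary level variance of 0.25/(1 − 0.9²) for each asset-specific term) and are an assumption of this illustration, so the resulting shares are population values and need not match the sample estimates in Table 2.2.

```python
# Factor exposures from Table 2.1, ordered (f1, f2, eps1, eps2, eps3).
exposures = {
    "X": [1.0, 1.0, 1.0, 0.0, 0.0],
    "Y": [1.0, 0.5, 0.0, 1.0, 0.0],
    "Z": [0.5, 1.0, 0.0, 0.0, 1.0],
}
# Estimated cointegrating weights on Y and Z.
w_y, w_z = 0.632, 0.703

# Net exposure of the mispricing X - (w_y * Y + w_z * Z) to each factor.
net = [x - w_y * y - w_z * z
       for x, y, z in zip(exposures["X"], exposures["Y"], exposures["Z"])]

# Theoretical variances of one-period changes (derived, see lead-in):
# factor changes have variance 1; each asset-specific change is
# -0.1 * level + noise, so its variance is 0.01 * Var(level) + 0.25.
var_eps_level = 0.25 / (1.0 - 0.9 ** 2)
var_change = [1.0, 1.0] + [0.01 * var_eps_level + 0.25] * 3

contributions = [b * b * v for b, v in zip(net, var_change)]
total = sum(contributions)
shares = [c / total for c in contributions]
common_share = shares[0] + shares[1]

print("net exposures:", [round(b, 4) for b in net])
print(f"share of variance from common factors: {common_share:.2%}")
```

The net exposures to $f_1$ and $f_2$ are 0.0165 and −0.019, so under these assumptions the common factors account for well under 1% of the variance of changes in the mispricing.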
By magnifying the component of the dynamics which is associated with asset-specific
effects, we would expect to magnify the predictable component which (by construction)
is present in the asset-specific effects but not in the common factors. This effect can be
Table 2.2 Sensitivity of price changes of the original
time series X, Y and Z and the “mispricing” time series
$X_t - (0.632Y_t + 0.703Z_t)$. The table entries show the pro-
portion of the variance of each time series which is asso-
ciated with changes in common risk factors $f_1$ and $f_2$ and
asset-specific effects $\varepsilon_1$, $\varepsilon_2$, $\varepsilon_3$

         X         Y         Z         m

f1       46.1%     64.2%     16.8%     0.2%
f2       41.5%     10.1%     67.8%     0.0%
ε1       11.5%     0.2%      0.0%      54.5%
ε2       0.5%      25.1%     0.1%      23.7%
ε3       0.4%      0.4%      15.3%     21.6%

Total    100.0%    100.0%    100.0%    100.0%


quantified by considering the Dickey-Fuller statistics obtained from simple ECMs of the
time series dynamics:

$$DF(s_t) = \hat{\beta} / \sigma_{\hat{\beta}} \quad \text{from the regression} \quad \Delta s_t = \alpha - \hat{\beta} s_t + \eta_t \tag{2.11}$$

where $\eta_t$ is a noise term, i.e. we regress changes in the time series ($\Delta s_t$) against the level
of the series ($s_t$) and test for a statistically significant error-correcting coefficient $\beta$. The
details of the estimated ECMs for our experiment are presented in Table 2.3.
The DF statistic approximately follows a t-distribution so, roughly speaking, DF values
greater than two indicate significant evidence for a mean-reverting/error-correction effect.
For the underlying (but unobserved) factors, the low DF statistics for $f_1$ and $f_2$ confirm
the lack of predictable components in these common factors, whilst the high DF values for
$\varepsilon_1$, $\varepsilon_2$ and $\varepsilon_3$ (4.908, 5.644 and 4.454 respectively) confirm the highly significant degree
of mean reversion in the asset-specific effects.
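The error-correction regression behind the DF statistic is a simple linear regression and can be sketched as follows. This is an illustration under stated assumptions rather than the chapter's own code: the function name `dickey_fuller`, the seed and the sample length are hypothetical, and the change in the series is taken to be the move from one observation to the next, regressed on the level at the start of that move. One caveat is worth keeping in mind when reading the output: under the null hypothesis of a random walk the statistic follows the Dickey-Fuller distribution rather than a standard t-distribution, and its 5% critical value (roughly 2.9 in magnitude when an intercept is included) is somewhat stricter than the rule of thumb of two.

```python
import math
import random


def dickey_fuller(series):
    """Regress the change in a series on its level (with an intercept) and
    return (beta_hat, standard error, DF statistic), where beta_hat is
    minus the estimated slope."""
    levels = series[:-1]
    changes = [nxt - cur for cur, nxt in zip(series, series[1:])]
    n = len(levels)
    mean_x = sum(levels) / n
    mean_y = sum(changes) / n
    sxx = sum((x - mean_x) ** 2 for x in levels)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(levels, changes))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ssr = sum((y - intercept - slope * x) ** 2
              for x, y in zip(levels, changes))
    std_err = math.sqrt(ssr / (n - 2) / sxx)
    beta_hat = -slope   # the regression is written with slope -beta
    return beta_hat, std_err, beta_hat / std_err


rng = random.Random(7)
walk, reverting = [0.0], [0.0]
for _ in range(2000):
    # Random walk: no error correction.
    walk.append(walk[-1] + rng.gauss(0.0, 1.0))
    # Mean-reverting series with error-correction parameter 0.1.
    reverting.append(reverting[-1] - 0.1 * reverting[-1] + rng.gauss(0.0, 0.5))

beta_w, se_w, df_walk = dickey_fuller(walk)
beta_r, se_r, df_reverting = dickey_fuller(reverting)
print(f"random walk:    beta = {beta_w:.3f}, DF = {df_walk:.2f}")
print(f"mean reverting: beta = {beta_r:.3f}, DF = {df_reverting:.2f}")
```

For the mean-reverting series the estimated coefficient should be close to the true value of 0.1 with a large DF statistic, whereas the random walk should show no significant error correction.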
In the observed series, X, Y and Z, the mean-reverting effect is “watered down” by
the unpredictable factor effects, with the result that the corresponding DF statistics are
small (actually slightly negative) and present no evidence of a predictable component.

Table 2.3 Details of simple error-correction models estimated to quantify the mean-reverting
component in both the unobserved factors and the observed time series. Values marked with an
asterisk (bold in the original) correspond to cases where the estimated mean-reversion
coefficient $\hat{\beta}$ is significant at the 0.1% level. The rows in the table are: estimated reversion
parameter $\hat{\beta}$; standard error of estimate; associated DF statistic (approximately equivalent
to the t-statistic in a standard regression); proportion of variance explained by model ($R^2$)

Factor/asset      f1      f2       ε1       ε2       ε3      X        Y        Z        m

Estimated β̂      0.014   −0.001   0.096*   0.124*   0.079   −0.001   −0.000   −0.001   0.077*
Std. error σ_β̂   0.008   0.001    0.020    0.022    0.018   0.002    0.003    0.001    0.018
DF(s_t)           1.631   −0.580   4.908*   5.644*   4.454   −0.452   −0.004   −0.591   4.398*
R²                0.5%    0%       4.8%     6.2%     4.0%    0%       0%       0%       3.9%

This picture changes dramatically when we look at the constructed “mispricing” m, which
has a high DF statistic of 4.398, almost as high as for the true asset-specific effects.
The actual magnitude (as opposed to statistical significance) of the detected mean-
reversion effect is given by the R² values in the table. The results confirm that the
predictable component of the dynamics is almost as strongly present in the mispricing
time series as in the underlying, but unobserved, asset-specific dynamics themselves. The
magnitude of the deterministic component in the mispricing is 3.9%, which is comparable
to the 4.8%, 6.2% and 4.0% in the true asset-specific dynamics, whereas it is negligible
in the case of the original time series X, Y and Z.
These results from our controlled experiment serve to illustrate the power of the
cointegration approach to remove market-wide risk factors and highlight the asset-specific
components of price dynamics, which in this case were constructed to contain a
mean-reverting effect. However, the qualitative reasoning presented in Sections 2.3 and 2.4,
together with quantitative evidence from other sources, suggests that similar results may be
obtained for real asset prices. In the following section we apply essentially the same
techniques as those used in this controlled experiment to analyse price relationships between
real assets, namely the equities which constitute the European-wide STOXX 50 index.


2.6 APPLICATION TO INTERNATIONAL EQUITIES
In this section we describe an application of the cointegration tools and techniques
described above to data from those international equities which comprised the STOXX
50 index as of 4 July 2002. We describe this analysis with reference to the accompanying
Excel workbook named “equity coint.xls” on the CD-Rom.
The set of equities which constitutes our universe is listed in the first sheet of the
workbook (named “Constituents”). The full set of equities included in the analysis is
listed in Table 2.4.
The second sheet in the workbook is named “Prices” and contains the raw data for
the analysis. This consists of daily closing prices which have been adjusted to remove
the effects of stock splits, dividends and other corporate actions. The time frame for the
analysis is from 14 September 1998 to 3 July 2002, which is the longest period over
which continuous data is available across the whole set of stocks. This comprises almost
4 years of data, giving 993 daily observations.
Note that this data does not provide a true “snapshot” of the European equity mar-
kets due to the complication that the closing times differ across the different national
exchanges. For a practical trading system this would induce serious distortions to our
models, but for our purposes here the close prices serve adequately to illustrate the use
of the tools we have described above.
The third sheet (“Pairs”) contains a simple relative value analysis of a pair of assets
at a time. The sheet also serves to illustrate the data itself and the use of variance ratio
functions to identify the underlying time series dynamics. A screen shot of this worksheet
is shown in Figure 2.10.
Cells D37 and D38 are used to select two equities whose prices we wish to compare.
The equities are selected by entering numbers from 1 to 50 corresponding to the reference
numbers shown in Table 2.4. The example given shows the case of selecting the British
oil-stock BP (BP.L, number 6 in the set) and the French oil-stock Total-Fina (TOTF.PA,
number 27 in the set). The lower chart plots BP against Total and also shows the synthetic
asset which represents the relative return on the two stocks. All three series have been
normalised to represent log price changes since the beginning of the analysis period. A
close-up of the chart is shown in Figure 2.11.

Table 2.4 The list of companies included in the analysis. The 50 stocks correspond to the
constituents of the pan-European STOXX 50 index as of 4 July 2002

Ref.  Name               Symbol      Ref.  Name                     Symbol
1     British Telecom    BT.L        26    Zurich Financial         ZURZn.VX
2     Glaxo Smithkline   GSK.L       27    Total-Fina               TOTF.PA
3     Alcatel            CGEP.PA     28    Suez                     LYOE.PA
4     UBS                UBSZn.VX    29    L'Oreal                  OREP.PA
5     Daimler Chrysler   DCXGn.DE    30    Telecom Italia           TIT.MI
6     BP                 BP.L        31    ENI                      ENI.MI
7     AstraZeneca        AZN.L       32    Eon                      EONG.DE
8     Nokia              NOK1V.HE    33    Siemens                  SIEGn.DE
9     Novartis           NOVZn.VX    34    Deutsche Bank            DBKGn.DE
10    Ericsson           ERICb.ST    35    Generali                 GASI.MI
11    Philips            PHG.AS      36    Deutsche Telekom         DTEGn.DE
12    ING                ING.AS      37    BBVA                     BBVA.MC
13    ABN Amro           AAH.AS      38    Allianz                  ALVG.DE
14    Aegon              AEGN.AS     39    Bayer                    BAYG.DE
15    Unilever           UNc.AS      40    Barclays                 BARC.L
16    Royal Dutch        RD.AS       41    HSBC                     HSBA.L
17    Swiss Re           RUKZn.VX    42    Diageo                   DGE.L
18    Roche              ROCZg.VX    43    Lloyds Bank              LLOY.L
19    Vivendi            EAUG.PA     44    Prudential               PRU.L
20    BSCH               SAN.MC      45    Royal Bank of Scotland   RBOS.L
21    Nestle             NESZn.VX    46    Shell                    SHEL.L
22    Carrefour          CARR.PA     47    Vodafone                 VOD.L
23    BNP-Paribas        BNPP.PA     48    Telefonica               TEF.MC
24    Aviva              AV.L        49    Munich Re                MUVGn.DE
25    AXA                AXAF.PA     50    Credit Suisse            CSGZn.VX

In this case there appears to be a semi-stable equilibrium between the two asset prices.
For long periods of time the relative price tends to fluctuate around an equilibrium or
“fair price” level; however, significant shifts in the relationship also occur, such as the
30% shift in the relative value which occurred during the early part of 2000. The apparent
existence of a relationship between the two price series, together with the instability in
this relationship, serves to illustrate respectively the opportunities and the risks which
arise from a relative value approach to trading.
The top half of the sheet contains a variance ratio analysis of the price dynamics of
the selected equities and the synthetic asset corresponding to their relative prices. The
cells in the range C5:E34 contain array formulae to calculate the n-period variances
for each of the three time series, with n varying between 1 and 30. To the right of
these variances, cells H5:J34 contain the variance ratios, with each n-period variance
normalised by n times the one-period variance. These three functions are plotted in the
chart to the right of the numbers, with the example for BP and Total-Fina shown in
Figure 2.12.
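The variance ratio calculation described above can be sketched as follows. This is an illustration only: the function name `variance_ratios`, the use of overlapping n-period changes and the simulated series are assumptions, and the array formulae in the workbook may differ in detail.

```python
import numpy as np

def variance_ratios(series, max_period=30):
    """VR(n) = Var(n-period change) / (n * Var(1-period change)) for n = 1..max_period."""
    s = np.asarray(series, dtype=float)
    one_period_var = np.var(s[1:] - s[:-1], ddof=1)
    ratios = []
    for n in range(1, max_period + 1):
        n_period_var = np.var(s[n:] - s[:-n], ddof=1)   # overlapping n-period changes
        ratios.append(n_period_var / (n * one_period_var))
    return np.array(ratios)

# Hypothetical series: a random walk and a mean-reverting AR(1) process.
rng = np.random.default_rng(1)
shocks = rng.standard_normal(5000)
random_walk = np.cumsum(shocks)
mean_reverting = np.zeros(5000)
for t in range(1, 5000):
    mean_reverting[t] = 0.9 * mean_reverting[t - 1] + shocks[t]

vr_rw = variance_ratios(random_walk)
vr_mr = variance_ratios(mean_reverting)
print(f"30-period VR, random walk:    {vr_rw[-1]:.2f}")
print(f"30-period VR, mean reverting: {vr_mr[-1]:.2f}")
```

A random walk should give ratios close to one at every horizon, while a mean-reverting series gives a profile that declines as n increases.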




Figure 2.10 The “Pairs” worksheet containing a pairwise relative value analysis, the selected
stocks are BP (number 6) and Total-Fina (number 27)


[Chart: normalised log price changes (vertical axis, −0.30 to 0.70) against date (from
14 September 1998) for BP.L, TOTF.PA and BP.L/TOTF.PA]

Figure 2.11 Relative prices for BP, Total, and the synthetic asset which is the ratio of the two


In this case we see that the variance ratio functions for the two securities show declining
profiles, indicating the presence of reverting components in their time series dynamics. The
mean-reversion tendency is significantly more prominent in the synthetic asset (BP/Total)
than for either of the individual assets, providing further evidence to support the presence
of a potentially predictable component in the relative price dynamics.
Given such evidence of mean-reverting dynamics we could move on to implement
a statistical arbitrage strategy based on the types of trading rules described by Burgess
(1999) and Towers (2000). Note, however, that in this particular case at least some of
this effect will be due to the non-synchronous sampling of the close price in the French
and UK markets. Because of this non-synchronicity a more sophisticated analysis (and
probably additional data) would be needed to evaluate the true magnitude of the mean-
reverting effect in the relative price of these equities and its viability as the basis for a
profitable trading strategy.

[Chart: variance ratio (vertical axis, 0.0 to 1.2) against window length in days for
VR(BP.L), VR(TOTF.PA) and VR(BP.L/TOTF.PA)]

Figure 2.12 Variance ratio functions for BP, Total, and the synthetic asset which is the ratio of
the two

Whilst pairs analysis works well for some equities, it is highly sensitive to the properties
of each asset price and works better for some stocks than for others. Essentially it requires
that for a given equity, there is one (and only one) equity which has similar exposures to
each and every underlying factor. For a given equity there may be zero, one or more than
one closely matching pairs and only in the case of a single matching pair is the simple
approach likely to be close to optimal. These complications mean that pairs analysis is
essentially opportunistic in nature rather than representing a general strategy which can
be applied across a broad asset universe.
Cointegration modelling is essentially an extension of pairs analysis which is designed
to overcome these limitations. Rather than requiring the existence of a single perfect
match we instead create an optimally matching “synthetic asset” in the form of a weighted
combination of one or more assets. The remaining sheets in the workbook demonstrate
the workings and results of this more sophisticated form of relative value modelling.
Firstly, the sheet “CointAnalysis” illustrates the construction of a synthetic asset to
match a chosen “target” asset. A screen shot of this worksheet is shown in Figure 2.13. The
top-left of the worksheet contains various control parameters and diagnostic information.
The chart in the top-centre of the worksheet presents a variance ratio analysis of the
statistical mispricing. The bottom-left chart is a visualisation of the synthetic asset and
the chart in the bottom-right shows the evolution of the various price series over time.




Figure 2.13 The “CointAnalysis” worksheet showing the construction of a synthetic asset to
match asset number 6: British Petroleum (BP.L)


CONTROLS Manual Select X 6 using: 6

RidgeFac 0.01

total 993

Insample 700
outsample 293


Figure 2.14 The controls for the cointegration analysis


The controls for the cointegration analysis are contained in the top-left of the worksheet.
As elsewhere in the workbook, the convention is that user-specified controls are contained
in cells with a black border and yellow background. In this case there are four such cells,
as shown in Figure 2.14.
Firstly, the target series is specified in cell F2, using the reference numbers listed in
Table 2.4. In this case we remain with the same example as before: asset number 6, British
Petroleum, or BP.L for short. Note that in order to allow the generation of automatic tables,
the actual control cell is H2, and cell F2 acts as a kind of manual override.

The second control cell is F4, labelled “RidgeFac”. This represents an important
modification of the basic methodology, which is necessary to avoid the problems caused
by regressing on large numbers of variables. Rather than using a standard regression,
this more practical methodology uses a “ridge regression” in which the resulting parameters
are in some sense “smoothed” or “regularised” and this cell controls the amount of
smoothing (Hoerl and Kennard, 1970a,b).
The final control parameters consist of the number of observations which should be used
to construct the model (the “in-sample” set) and the subsequent number of observations
which should be used to evaluate the model performance (the “out-of-sample” set). Cell
F6 indicates the number of observations available in total, which for this analysis is 993.
Cell F8 is used to specify the number of “in-sample” observations. In this case we use
700 observations, representing approximately two-thirds of the available data. By default,
all of the remaining observations are used to perform the “out-sample” evaluation. This
number can be overridden using cell G9, but in this case is left as the default, giving 293
observations for the out-of-sample results analysis.
The data for the regression is collected on the “CointModel” worksheet. The target
asset is stored in column H of this worksheet; a constant column of ones is placed in
column J; and the 49 cointegrating assets are remapped to the adjacent columns K through
to BG. It is useful to have the 50 independent variables in contiguous columns in order to
simplify the matrix algebra used to compute the solution to the cointegrating regression.
The calculations for the cointegrating regression are performed on the “Workings” sheet.
The worksheet performs a “ridge regression” (Hoerl and Kennard, 1970a,b) in which the
solution is given by $\beta = (C^{T}C + \lambda\sigma I)^{-1}C^{T}t$. The target vector t and data
matrix C (the 49 other asset price series supplemented by a column of ones) are referenced from
the “CointModel” worksheet. The regularisation parameter lambda (λ) is referenced from cell
F4 of the “CointAnalysis” worksheet. The covariance matrix $C^{T}C$ is calculated in cells
G4:BD53. The vector $C^{T}t$ is calculated in BI4:BI53. The enhanced covariance matrix,
$C^{T}C + \lambda\sigma I$, is constructed in cells BM4:DJ53 by re-scaling the diagonal elements of
$C^{T}C$. The inverse of this enhanced matrix is calculated in cells G56:BD105. Finally the
beta parameters are calculated in cells BI56:BI105 by multiplying this inverse by the
vector $C^{T}t$.
With the regularisation parameter set to λ = 0, the solution reduces to the standard
OLS regression: $\beta = (C^{T}C)^{-1}C^{T}t$. Lambda acts as a scaling coefficient for the diagonal
component of the covariance matrix $C^{T}C$, proportionally downweighting the off-diagonal
covariance terms and reducing the apparent correlation between the different series. As
we will see below this has an important effect in stabilising the regression and enabling
us to use 50 regressor variables, more than would normally be practically feasible.
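The matrix steps performed on the “Workings” sheet can be sketched in NumPy. This is a hedged illustration, not a transcription of the workbook: the function name `ridge_weights` and the simulated prices are hypothetical, and the regularisation term is implemented under the assumption that “re-scaling the diagonal elements” means adding λ times the diagonal of the covariance matrix.

```python
import numpy as np

def ridge_weights(C, t, lam):
    """Solve (C^T C + lam * diag(C^T C)) beta = C^T t for beta."""
    C = np.asarray(C, dtype=float)
    t = np.asarray(t, dtype=float)
    gram = C.T @ C
    # Assumed reading of the regularisation term: scale each diagonal element by
    # (1 + lam) and leave the off-diagonal covariance terms unchanged.
    enhanced = gram + lam * np.diag(np.diag(gram))
    return np.linalg.solve(enhanced, C.T @ t)

# Hypothetical data: six price series driven by one common random-walk factor.
rng = np.random.default_rng(2)
n_obs, n_assets = 300, 6
factor = np.cumsum(rng.standard_normal(n_obs))
loadings = rng.uniform(0.5, 1.5, n_assets)
prices = 100 + factor[:, None] * loadings + 2 * rng.standard_normal((n_obs, n_assets))

t = prices[:, 0]                                        # target asset
C = np.column_stack([np.ones(n_obs), prices[:, 1:]])    # constant plus the other assets

beta_ols = ridge_weights(C, t, 0.0)      # lam = 0 reduces to ordinary least squares
beta_ridge = ridge_weights(C, t, 0.01)
print("OLS weights:  ", np.round(beta_ols, 3))
print("ridge weights:", np.round(beta_ridge, 3))
```

Solving the linear system directly avoids forming the explicit inverse that the spreadsheet computes; the resulting weights are the same up to rounding.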
The resulting beta vector is copied across to cells J25:BG25 of the “CointModel” worksheet
and used to construct the synthetic asset. This is calculated as the beta-weighted
average of the 50 constituent assets (including constant term) and is stored in cells
G40:G1032. Note that once the betas have been estimated from the first 700 observations
(in this case), the same weights can be applied to subsequent data to calculate the
values of the synthetic asset during the out-of-sample period. For purposes of visualising
the composition of the synthetic asset, we take the beta vector and multiply through by
the scale of the individual time series. The resulting “effective weights” are illustrated in
the lower left-hand chart on the “CointAnalysis” worksheet which is also reproduced in
Figure 2.15.

[Bar chart: effective weights (vertical axis, −10% to 20%) for the constant term and each of
the 49 constituent stocks in the synthetic asset for BP.L]

Figure 2.15 Effective weights for the synthetic asset for British Petroleum (BP.L), λ = 0.01

In the case of BP, the synthetic asset weights are dominated by other oil stocks,
particularly Royal Dutch/Shell (RD.AS and SHEL.L) and Total-Fina (TOTF.PA). However,
most of the other stocks also have non-zero, though small, weightings, indicating that the
best historical fit to BP price movements is obtained by taking into account a wide range
of other stocks.
Given the target asset and the constructed synthetic asset we can calculate the difference
in price which is equivalent to the residual of the regression. The evolution of this time
series represents the performance of a hedged portfolio with a long position in the target
asset and an offsetting short position in the synthetic asset. If the synthetic asset is a good
hedge for the target, this residual price should have low volatility and remain close to
zero. In order to evaluate the effectiveness of the cointegration procedure we compare this
price residual to that obtained by a simpler procedure, namely hedging with an equally
weighted “market” portfolio. These time series are visualised in the bottom right-hand
chart of the “CointAnalysis” worksheet, which is reproduced in Figure 2.16.
The vertical line divides the time axis into the in-sample and out-of-sample periods.
During the in-sample period we expect the synthetic asset to closely match the target
(BP.L) simply by construction; similarly the corresponding “residual” is stable around
the zero level. Note that the synthetic asset is an average across a number of stocks and
in this case, as would be typical, has a smoother price trajectory than the target asset
itself but on the whole does tend to track the longer term price movements observed in
the target series. The synthetic market price obtained as an unweighted average across
the set of stocks appears to be less successful in following the price of the target asset
and this is also observed in the higher volatility of the corresponding (market) residual.
During the out-of-sample period, the synthetic asset price will only track the target asset
to the extent to which it has a similar exposure to the underlying risk factors which drive
asset prices. In this case the model for BP appears quite successful: the residual remains
in a similar price range as during the in-sample period and seems also to be relatively
stable around the zero level.

[Chart: price (vertical axis, −300 to 800) against date (from 14 September 1998) for the
series BP.L, synthetic, resid, s-mkt and m-resid]

Figure 2.16 Hedged and unhedged time series for the cointegration model for BP

The “CointAnalysis” worksheet also displays some basic measures which quantify
some properties of the synthetic asset and the out-of-sample performance. The values
corresponding to this example are shown in Figure 2.17.

RESULTS
sum             100%
sumabs          166%

RawVar       1294.93
ResVar        423.15
ResMkt       1262.60

Reduction        67%
MktRed            2%
Improve          65%

Figure 2.17 Characteristics of the cointegration model for BP.L

The first two values characterise the makeup of the synthetic asset. The “sum” figure
corresponds to the normalised sum of the asset weights; typically we would expect this
to be close to 100%. The “sumabs” figure corresponds to the normalised sum of the
absolute asset weights; if there are some negative weights, these will typically be offset
by positive weights over and above 100% and the sum of the absolute weights will
reflect this. The figure indicates that the sum of the absolute weights is 166% in this
case, reflecting negative weights totalling about 33% offset by approximately 133%
of positive weights, to give a total of 166%. Cross-checking against the visualisation of
the weights in Figure 2.15, these numbers seem to be reasonable.
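The split between positive and negative weights quoted above follows directly from the two reported totals. As a short worked derivation, writing P for the total of the positive weights and N for the total magnitude of the negative weights (symbols introduced here for illustration only):

```latex
\begin{aligned}
P - N &= 100\% && \text{(``sum'')} \\
P + N &= 166\% && \text{(``sumabs'')} \\
\Rightarrow\quad P &= \tfrac{1}{2}\,(166\% + 100\%) = 133\%, \qquad
N = \tfrac{1}{2}\,(166\% - 100\%) = 33\%
\end{aligned}
```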
The sum of the absolute weights can be quite an important issue, as it provides an
estimate of the quantity of assets we need to buy and sell in order to use the synthetic
asset as a hedge. This measure also highlights the importance of using regularisation. For
instance, with regularisation set to zero, the sum of the absolute weights for the BP.L
synthetic asset becomes 404%, indicating that each unit of BP needs to be hedged against
a long-short combination of equities totalling four times the value invested in BP!
The remaining figures serve to quantify the effectiveness of the synthetic asset at hedging
the volatility in the target asset. These measures are calculated during the out-of-sample
period in order to produce unbiased results. The “RawVar” figure corresponds to the
volatility of the asset, measured in terms of price variance; the “ResVar” is the residual
variance when hedged by the synthetic asset; and “ResMkt” is the residual variance when
hedged against an equally-weighted “market” portfolio. The final three figures represent
the proportional effectiveness of the hedging procedure. Thus in this case, the “Reduction”
of 67% indicates that the synthetic asset hedge removes 67% of the out-of-sample
volatility. The “market” portfolio is not a good hedge in this case, only removing 2% of
the volatility in BP, and thus the cointegration approach improves on the market hedge by
65% of the original volatility. This particular example serves to highlight the potentially
large improvement which can be obtained by replacing market-based hedging with the
cointegration approach, but it is only fair to note that in other cases the market hedge
performs equally well or even better than the cointegration approach. A fairer comparison,
across the whole set of 50 stocks, will be presented towards the end of this section.
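The three percentages are consistent with simple ratios of the reported variances. The sketch below reproduces them from the figures quoted for BP.L; the formulae are inferred from those numbers rather than taken from the workbook, and the function name `hedge_effectiveness` is hypothetical.

```python
def hedge_effectiveness(raw_var, res_var, res_mkt):
    """Return (reduction, market reduction, improvement) as fractions of the raw variance."""
    reduction = 1.0 - res_var / raw_var      # share of variance removed by the synthetic hedge
    mkt_reduction = 1.0 - res_mkt / raw_var  # share removed by the equally-weighted market hedge
    improvement = reduction - mkt_reduction  # gain from the cointegration hedge over the market hedge
    return reduction, mkt_reduction, improvement

# Out-of-sample variances reported for BP.L (RawVar, ResVar, ResMkt).
reduction, mkt_reduction, improvement = hedge_effectiveness(1294.93, 423.15, 1262.60)
print(f"Reduction {reduction:.0%}, MktRed {mkt_reduction:.0%}, Improve {improvement:.0%}")
```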
The final part of the “CointAnalysis” worksheet presents a variance ratio analysis of
the time series dynamics of the hedged portfolio, with the result being shown in the chart
at the top of the worksheet. The calculations underlying this chart are contained in the
top-left corner of the “CointModel” worksheet. The results for the BP.L model are shown
in Figure 2.18.


[Chart: variance ratio (left-hand axis, 0.4 to 1.4) and variance (right-hand axis, 0 to 900)
against length of time period (1 to 30)]

Figure 2.18 Variance ratio analysis of the cointegration model for BP.L

The variance ratio chart clearly indicates the decaying pattern which corresponds to
mean-reverting dynamics. The 30-period variance ratio is just below 0.5, indicating that
the variance computed over 30-day intervals is less than half as high as would be expected,
given the observed 1-day variance. This suggests that, from a relative value perspective,
over 50% of the short-term volatility in BP is essentially spurious price fluctuation which
has a strong tendency to cancel itself out over a longer time-scale. This pattern can also
be observed, though less clearly, from the concave shape of the variance curve itself.
As these figures are out-of-sample results they would suggest the possibility of finding
a suitable statistical arbitrage strategy to exploit this mean-reverting component in the
relative price dynamics of BP against the synthetic asset portfolio. In this case, however,
the same caveat as before applies in that the non-synchronous nature of our close-price
data may overstate the size of the true reversion effect.
Before moving on to consider the performance across our broader universe, let us first
consider the importance of the regularisation parameter. Remember that the results we
have been describing above correspond to a model constructed with λ = 0.01. Let us now
compare this model to the model we obtain by leaving all other parameters the same but
replacing the “RidgeFac” value in cell F4 with 0.1. The composition of the new synthetic
asset is shown in Figure 2.19.
With this higher degree of regularisation, the weights become more uniform. Very few
are now negative and the highest weight (for Shell, SHEL.L) is reduced from approximately
18% to only 6%. The “sum” of the weights falls to 98% and the “sumabs” to
103%. In this case, however, the new synthetic asset is a less effective hedge for price
movements in BP. The residual variance of the hedged portfolio rises to 702.56 (from
the 423 shown in Figure 2.17), and the reduction in variance due to the hedge is now
only 46% (from 67% previously). Thus, in this particular case, increasing the degree of
regularisation has decreased the effectiveness of the synthetic asset as a hedging portfolio.
This should not be surprising, as in the limit we would expect a heavily regularised
synthetic asset to closely match the equally-weighted portfolio, which we know is not a very
good hedge in this case.

[Bar chart: effective weights (vertical axis, −2% to 7%) for the constant term and each of
the 49 constituent stocks in the synthetic asset for BP.L]

Figure 2.19 Effective weights for the synthetic asset for British Petroleum (BP.L), λ = 0.1

It is easy to confirm that moving to the opposite extreme also leads to a performance
degradation: with the ridge factor set to zero, not only does the sum of absolute weights
rise to the unattractive 404% mentioned above, but the residual variance of 611.01 (53%
reduction) is also worse than the 423 (67%) for the intermediate case of λ = 0.01. These
results indicate a pattern which is typical of much statistical modelling: a certain degree
of regularisation tends to be beneficial, but beyond a certain point the smoothing becomes
excessive and begins to degrade the model performance.
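A comparison of this kind can be sketched by fitting on an in-sample window and measuring the residual variance on the remaining observations for several values of the ridge factor. The example below uses simulated prices and hypothetical function names (`fit_ridge`, `sweep`), and it again assumes that the regularisation term is λ times the diagonal of the covariance matrix, so its numbers are not comparable with those quoted for BP.

```python
import numpy as np

def fit_ridge(C, t, lam):
    """Ridge weights under the assumed diagonal re-scaling of C^T C."""
    gram = C.T @ C
    return np.linalg.solve(gram + lam * np.diag(np.diag(gram)), C.T @ t)

def sweep(C, t, n_in, lams):
    """For each lambda, fit on the first n_in rows; return (in-sample, out-of-sample) residual variance."""
    results = {}
    for lam in lams:
        beta = fit_ridge(C[:n_in], t[:n_in], lam)
        resid = t - C @ beta                     # target minus synthetic asset
        results[lam] = (np.var(resid[:n_in]), np.var(resid[n_in:]))
    return results

# Hypothetical universe: ten price series driven by two random-walk factors.
rng = np.random.default_rng(3)
n_obs, n_in, n_assets = 993, 700, 10
factors = np.cumsum(rng.standard_normal((n_obs, 2)), axis=0)
loadings = rng.uniform(0.2, 1.0, (2, n_assets))
prices = 100 + factors @ loadings + rng.standard_normal((n_obs, n_assets))

t = prices[:, 0]
C = np.column_stack([np.ones(n_obs), prices[:, 1:]])
results = sweep(C, t, n_in, [0.0, 0.01, 0.1])
for lam, (var_in, var_out) in results.items():
    print(f"lambda={lam:<5} in-sample var={var_in:8.3f}  out-of-sample var={var_out:8.3f}")
```

In-sample, the unregularised fit always has the lowest residual variance by construction; it is only the out-of-sample column that indicates whether a given degree of smoothing helps.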
Whilst the case of this one model, for BP, is both interesting and illustrative, it is
important to know whether these results are merely a lucky “one off” or whether they
represent an approach which can be applied more generally. For this reason the final
worksheet in the analysis, called “Results Summary”, contains a table of results generated
by taking each of the 50 assets in turn as the “target” asset. A particular sample of these
results is shown in Table 2.5.

Table 2.5 Performance of cointegration model across the universe of equities

Stock VarRed MktRed RelImp HdgeFac In-sample Out-sample SumWts AbsSumWts

BP.L 67% 2% 65% 0.01 700 293 100% 166%
BT.L 47% 31% 16% 0.01 700 293 103% 373%
GSK.L 56% 38% 18% 0.01 700 293 100% 188%
CGEP.PA 82% 58% 24% 0.01 700 293 114% 506%
−3%
UBSZn.VX 46% 49% 0.01 700 293 100% 143%
−1% 46% −47%
DCXGn.DE 0.01 700 293 101% 289%
BP.L 67% 2% 65% 0.01 700 293 100% 166%
AZN.L 54% 37% 17% 0.01 700 293 98% 172%
NOK1V.HE 62% 36% 27% 0.01 700 293 101% 377%
−37% −50%
NOVZn.VX 13% 0.01 700 293 101% 170%
ERICb.ST 67% 42% 25% 0.01 700 293 93% 396%
PHG.AS 31% 31% 0% 0.01 700 293 96% 193%
60% −21%
ING.AS 39% 0.01 700 293 104% 165%
−1%
AAH.AS 39% 40% 0.01 700 293 100% 141%
AEGN.AS 86% 64% 22% 0.01 700 293 101% 230%
−68%
UNc.AS 21% 88% 0.01 700 293 98% 179%
RD.AS 68% 67% 1% 0.01 700 293 102% 160%
RUKZn.VX 69% 62% 8% 0.01 700 293 98% 154%
−60% 46% −105%
ROCZg.VX 0.01 700 293 99% 158%
EAUG.PA 49% 41% 8% 0.01 700 293 105% 231%
SAN.MC 86% 79% 7% 0.01 700 293 101% 117%
−43%
NESZn.VX 32% 75% 0.01 700 293 98% 148%
50% −11%
CARR.PA 39% 0.01 700 293 100% 277%
−42%
BNPP.PA 3% 45% 0.01 700 293 93% 151%
40% −15%
AV.L 26% 0.01 700 293 103% 163%
AXAF.PA 60% 56% 4% 0.01 700 293 103% 154%
ZURZn.VX 75% 64% 11% 0.01 700 293 107% 289%

−6%
TOTF.PA 57% 64% 0.01 700 293 98% 161%
LYOE.PA 53% 29% 24% 0.01 700 293 98% 152%
−140% −161%
OREP.PA 21% 0.01 700 293 97% 130%
−6%
TIT.MI 76% 83% 0.01 700 293 100% 224%
44% −48%
ENI.MI 91% 0.01 700 293 99% 152%
−28% −104%
EONG.DE 75% 0.01 700 293 98% 157%
SIEGn.DE 63% 37% 26% 0.01 700 293 99% 211%
DBKGn.DE 70% 64% 6% 0.01 700 293 101% 196%
−19%
GASI.MI 37% 56% 0.01 700 293 103% 169%
−13%
DTEGn.DE 46% 60% 0.01 700 293 98% 379%
−5%
BBVA.MC 81% 86% 0.01 700 293 102% 154%
ALVG.DE 88% 73% 15% 0.01 700 293 100% 174%
−6%
BAYG.DE 63% 69% 0.01 700 293 106% 185%
71% −26%
BARC.L 97% 0.01 700 293 100% 190%
HSBA.L 71% 44% 27% 0.01 700 293 99% 211%
23% −48%
DGE.L 71% 0.01 700 293 97% 155%
−61% −97%
LLOY.L 36% 0.01 700 293 99% 196%
PRU.L 67% 62% 5% 0.01 700 293 98% 202%
66% −48%
RBOS.L 115% 0.01 700 293 96% 219%
−20%
SHEL.L 33% 53% 0.01 700 293 102% 157%
VOD.L 33% 23% 10% 0.01 700 293 104% 207%
TEF.MC 77% 57% 20% 0.01 700 293 101% 237%
MUVGn.DE 70% 61% 9% 0.01 700 293 99% 228%
−22%
CSGZn.VX 47% 69% 0.01 700 293 102% 147%
Mean 42% 24% 18% 0.01 700 293 100% 206%
Median 53% 43% 12% 0.01 700 293 100% 176%



The metrics in the table are precisely the same as those which are presented for
individual models on the “CointAnalysis” worksheet (and in fact are directly derived from
those values). Whilst the performance varies substantially from one equity to another,
the figures for both mean and median performance confirm the general applicability of
the approach. During what has been a very turbulent time for the equity markets, the
synthetic hedge portfolios manage to reduce the out-of-sample volatility by
42% (mean) or 53% (median); note that the mean performance is more heavily affected