
(e.g., whether A has more effect on G or H); and one statistic that tells you how reliable this relationship is (e.g., whether knowing A gives you much confidence to predict G or H).

Margin note: Two measures of covariation, not just one.

Your first task is to draw a line into each of the eight graphs in Figure 13.1 that best fits the points (do it!), and then to stare at the points and your lines. How steep are your lines (the relationships), and how reliably do the points cluster around your lines? Before reading on, your second task is to write into the eight graphs in Figure 13.1 what your intuition tells you the relation and the reliability are.

Here is what I see. Figure 13.1 seems to show a range of different patterns:


• C and D are strongly related to A. Actually, C is just 2·(A + 3), and D is just −C/4, so these two relationships are also perfectly reliable.

• E and F have no relationship to A. E is random, with no pattern connecting it to A; F is 5 regardless of what A is.

• G and H both tend to be higher when A is higher, but in different ways. The relation between A and G is more reliable, but even a large increase in A predicts a G that is only a little higher. In contrast, A and H covary less reliably, but even a small increase in A predicts a much higher H.

• Figures I and J repeat the G/H pattern, but for negative relations.




Figure 13.1. C Through J as Functions of A

[Eight scatterplots, one for each of the series C, D, E, F, G, H, I, and J (y-axes, roughly −10 to +15), each plotted against A (x-axis, roughly −10 to +30).]

The five observations are marked by circles. Can you draw a well-fitting line? Which series have relations with A? What sign? Which series have reliable relations with A?



Table 13.4. Illustrative Rates of Return Time Series on Nine Assets

Portfolio (or Asset or Security)

Observation     A     C     D      E    F    G     H    I     J
Year 1         −5    −4   +1.0   −10   +5   +2   −10   +5   +12
Year 2         +6   +18   −4.5    −9   +5   +4    −9   +3   +14
Year 3         +3   +12   −3.0    +2   +5   +3   +10   +4   −10
Year 4         −1    +4   −1.0   +12   +5   +2    −8   +5    +9
Year 5         +7   +20   −5.0     0   +5   +4   +12   +3   −10
Mean           +2   +10   −2.5    −1   +5   +3    −1   +4    +3
Var            25   100   6.25    81    0    1   121    1   144
Sdv             5    10    2.5     9    0    1    11    1    12

Which rate of return series (among portfolios C through J) had low and which had high covariation with the rate of return series of Portfolio A?




Side Note: I cheated in not using my eyeballs to draw lines, but in using the technique of "ordinary least squares" line fitting in Figure 13.3 instead. The lines make it even clearer that when A is high, C, G, and H tend to be high, too; but when A is high, D, I, and J tend to be low. And neither E nor F seems to covary with A. (You will get to compute the slope of this line, the "beta," later.)


Margin note: How covariance really works: quadrants above/below the means.

Of course, visual relationships are in the eye of the beholder. You need something more objective that both of us can agree on. Here is how to compute a precise measure. The first step is to determine the mean of each X and each Y variable and mark them into the figure, which is done for you in Figure 13.2. The two means divide your data into four quadrants. Now, intuitively, points in the northeast or southwest quadrants (in white) suggest a positive covariation; points in the northwest or southeast quadrants (in red) suggest a negative covariation. In other words, the idea behind all of the covariation measures is that two series, call them X and Y, are positively correlated

• when X tends to be above its mean, Y also tends to be above its mean (upper-right quadrant); and

• when X tends to be below its mean, Y also tends to be below its mean (lower-left quadrant).

And X and Y are negatively correlated

• when X tends to be above its mean, Y tends to be below its mean (lower-right quadrant); and

• when X tends to be below its mean, Y tends to be above its mean (upper-left quadrant).



Figure 13.2. H as a Function of A

[Scatterplot of H against A, divided into four quadrants by the two means. A below its mean means the X deviation is negative; A above its mean means the X deviation is positive. H below its mean means the Y deviation is negative; H above its mean means the Y deviation is positive. Northeast quadrant: both deviations positive, so their product is positive. Northwest quadrant: Y deviation positive, X deviation negative, so the product is negative. Southwest quadrant: both deviations negative, product positive. Southeast quadrant: product negative; the one point here pulls towards a negative slope.]

Points in the red quadrants pull the overall covariance statistic in a negative direction. Points in the white quadrants pull the overall covariance statistic in a positive direction.




Covariance

Margin note: The main idea is to think about where points lie within the quadrants of the figure. Multiply the deviations from the mean to get the right sign for each data point relative to each quadrant.

How can you make a positive number for every point that is either above both the X and Y means or below both the X and Y means, and a negative number for every point that is above one mean and below the other? Easy! First, you measure each data point in terms of its distance from its mean, so you subtract the mean from each data point, as in Table 13.5. Points in the northeast quadrant are above both means, so both demeaned values are positive. Points in the southwest quadrant are below both means, so both demeaned values are negative. Points in the other two quadrants have one positive and one negative demeaned number. Now, you know that the product of either two positive or two negative numbers is positive, and the product of one positive and one negative number is negative. So, if you multiply the deviations from the means, the product has a positive sign if the point is in the upper-right or lower-left quadrant (the deviations are either both positive or both negative), and a negative sign if it is in the upper-left or lower-right quadrant (one deviation is positive, the other negative). A point with a positive product pulls towards positive covariation, whereas one with a negative product pulls towards negative covariation.
Margin note: The covariance is the average of the multiplied deviations from their means.

You want to know the average pull. This is the covariance: the average of these products of two variables' deviations from their own means. Try it for A and C (again, following the standard method of dividing by N − 1 rather than N):

Cov(A, C) = [(−7)·(−14) + (+4)·(+8) + (+1)·(+2) + (−3)·(−6) + (+5)·(+10)] / 4 = 50

Cov(A, C) = [(a1 − ā)·(c1 − c̄) + ... + (a5 − ā)·(c5 − c̄)] / (N − 1) .        (13.8)
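
To make the mechanics concrete, here is a minimal Python sketch of this computation, using the A and C series from Table 13.4 (the function name cov is just an illustrative choice):

```python
# Series A and C from Table 13.4.
A = [-5, 6, 3, -1, 7]
C = [-4, 18, 12, 4, 20]

def cov(x, y):
    """Sample covariance: average product of deviations from the means,
    dividing by N - 1 as in Formula 13.8."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    return sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / (n - 1)

print(cov(A, C))  # 50.0
```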



Figure 13.3. C Through J as Functions of A, Lines Added

[The eight scatterplots of Figure 13.1, now with best-fitting lines. C and D: perfect covariation. E and F: zero covariation. G and H: positive covariation, with G showing a reliable association and a shallow slope, and H an unreliable association and a steep slope (one point in H "pulls the other way"). I and J: negative covariation, with one point in J again pulling the other way.]

The five observations are marked by circles. The areas north, south, east, and west of the X and Y means are now marked. A cross with arm lengths equal to one standard deviation is also placed on each figure.

Which points push the relationship to be positive, which points push the relationship to be negative?



Table 13.5. Illustrative Rates of Return Time Series on Nine Assets, De-Meaned

Rates of Return on Portfolios (or Assets or Securities)

Observation     A     C     D      E    F    G     H    I     J
Year 1         −7   −14   +3.5    −9    0   −1    −9   +1    +9
Year 2         +4    +8   −2.0    −8    0   +1    −8   −1   +11
Year 3         +1    +2   −0.5    +3    0    0   +11    0   −13
Year 4         −3    −6   +1.5   +13    0   −1    −7   +1    +6
Year 5         +5   +10   −2.5    +1    0   +1   +13   −1   −13
E(r̃)            0     0    0.0     0    0    0     0    0     0
Var(r̃)         25   100   6.25    81    0    1   121    1   144
Sdv(r̃)          5    10    2.5     9    0    1    11    1    12

It will be easier to work with the series from Table 13.4 if you first subtract the mean from each series.




Note how in this A vs. C relationship, each term in the sum is positive, and therefore pulls the average (the covariance) towards being a positive number. You can see this in the figure, because each and every point lies in the two "positivist" quadrants. Repeat this for the remaining series:

Cov(A, D) = [(−7)·(+3.5) + (+4)·(−2) + (+1)·(−0.5) + (−3)·(+1.5) + (+5)·(−2.5)] / 4 = −12.5 ;

Cov(A, E) = [(−7)·(−9) + (+4)·(−8) + (+1)·(+3) + (−3)·(+13) + (+5)·(+1)] / 4 = 0 ;

Cov(A, F) = [(−7)·(0) + (+4)·(0) + (+1)·(0) + (−3)·(0) + (+5)·(0)] / 4 = 0 ;

Cov(A, G) = [(−7)·(−1) + (+4)·(+1) + (+1)·(0) + (−3)·(−1) + (+5)·(+1)] / 4 = 4.75 ;        (13.9)

Cov(A, H) = [(−7)·(−9) + (+4)·(−8) + (+1)·(+11) + (−3)·(−7) + (+5)·(+13)] / 4 = 32 ;

Cov(A, I) = [(−7)·(+1) + (+4)·(−1) + (+1)·(0) + (−3)·(+1) + (+5)·(−1)] / 4 = −4.75 ;

Cov(A, J) = [(−7)·(+9) + (+4)·(+11) + (+1)·(−13) + (−3)·(+6) + (+5)·(−13)] / 4 = −28.75 ;

Cov(X, Y) = [(x1 − x̄)·(y1 − ȳ) + ... + (x5 − x̄)·(y5 − ȳ)] / (N − 1) .
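
If you would rather verify all eight computations at once, a short sketch along the following lines reproduces them; it assumes numpy is available and relies on the fact that np.cov also divides by N − 1:

```python
import numpy as np

# All nine series from Table 13.4.
table = {
    "A": [-5, 6, 3, -1, 7],
    "C": [-4, 18, 12, 4, 20],
    "D": [1.0, -4.5, -3.0, -1.0, -5.0],
    "E": [-10, -9, 2, 12, 0],
    "F": [5, 5, 5, 5, 5],
    "G": [2, 4, 3, 2, 4],
    "H": [-10, -9, 10, -8, 12],
    "I": [5, 3, 4, 5, 3],
    "J": [12, 14, -10, 9, -10],
}
A = np.array(table["A"], dtype=float)
for name in "CDEFGHIJ":
    y = np.array(table[name], dtype=float)
    # np.cov returns a 2x2 matrix; [0, 1] is the off-diagonal covariance term.
    print(f"Cov(A,{name}) = {np.cov(A, y)[0, 1]:.2f}")
# Cov(A,C) = 50.00, Cov(A,D) = -12.50, Cov(A,E) = 0.00, Cov(A,F) = 0.00,
# Cov(A,G) = 4.75, Cov(A,H) = 32.00, Cov(A,I) = -4.75, Cov(A,J) = -28.75
```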

Having computed the covariances, look at Figure 13.3. A and D, A and I, and A and J covary negatively on average; A and C, A and G, and A and H covary positively; and A and E and A and F have zero covariation. Now take another look at the A vs. H graph (also in Figure 13.2): there is one lonely point in the lower-right quadrant, marked with an arrow. It tries to pull the line in a negative direction. In the A vs. H covariance computation, this point is the (+4)·(−8) term, which is the only negative component in the overall sum. If this one point had not been in the data, the association between A and H would have been even more strongly positive than 32.
Margin note: Covariance gives the right sign, but not much more. It is often abbreviated as a sigma with two subscripts.

The covariance tells you the sign (whether a relationship is positive or negative), but its magnitude is difficult to interpret, just as you could not really interpret the magnitude of the variance. Indeed, the covariance not only shares the weird squared-units problem with the variance; the covariance of a variable with itself is the variance! This can be seen from the definitions: both multiply each historical outcome deviation by itself, and then divide by the same number,

Cov(X, X) = [(x1 − x̄)·(x1 − x̄) + ... + (xN − x̄)·(xN − x̄)] / (N − 1)
          = [(x1 − x̄)² + ... + (xN − x̄)²] / (N − 1) = Var(X) .        (13.10)

And, just like the variance is needed to compute the standard deviation, the covariance is needed to compute the next two covariation measures (correlation and beta). The covariance statistic is so important and used so often that the Greek letter sigma (σ) with two subscripts has become the standard abbreviation:

Covariance between X and Y:   Cov(X, Y) ≡ σX,Y
Variance of X:                Var(X) ≡ σX,X = σX² .        (13.11)

These are sigmas with two subscripts. If you use only one subscript, you mean the standard deviation:

Standard Deviation of X:      Sdv(X) ≡ σX .        (13.12)

This is easy to remember if you think of two subscripts as the equivalent of multiplication (squaring), and of one subscript as the equivalent of taking the square root.


Correlation

Margin note: Correlation is closely related to, but easier to interpret than, covariance.

To better interpret the covariance, you need to somehow normalize it. A first normalization of the covariance gives the correlation: it divides the covariance by the standard deviations of both variables. Applying this formula, you can compute

Correlation(A, C) = Cov(A, C) / [Sdv(A) · Sdv(C)] = +50 / (5 · 10) = +1.00 ;

Correlation(A, D) = Cov(A, D) / [Sdv(A) · Sdv(D)] = −12.5 / (5 · 2.5) = −1.00 ;

Correlation(A, E) = Cov(A, E) / [Sdv(A) · Sdv(E)] = 0 / (5 · 9) = ±0.00 ;

Correlation(A, F) = Cov(A, F) / [Sdv(A) · Sdv(F)] = 0 / (5 · 0) = not defined ;

Correlation(A, G) = Cov(A, G) / [Sdv(A) · Sdv(G)] = 4.75 / (5 · 1) = +0.95 ;        (13.13)

Correlation(A, H) = Cov(A, H) / [Sdv(A) · Sdv(H)] = 32 / (5 · 11) = +0.58 ;

Correlation(A, I) = Cov(A, I) / [Sdv(A) · Sdv(I)] = −4.75 / (5 · 1) = −0.95 ;

Correlation(A, J) = Cov(A, J) / [Sdv(A) · Sdv(J)] = −28.75 / (5 · 12) = −0.48 ;

Correlation(X, Y) = Cov(X, Y) / [Sdv(X) · Sdv(Y)] .
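
A quick sketch of the same normalization in Python (assuming numpy; ddof=1 requests the N − 1 divisor used throughout this chapter):

```python
import numpy as np

A = np.array([-5, 6, 3, -1, 7], dtype=float)
H = np.array([-10, -9, 10, -8, 12], dtype=float)

# Covariance divided by both standard deviations, as in Formula 13.13.
corr = np.cov(A, H)[0, 1] / (A.std(ddof=1) * H.std(ddof=1))
print(round(corr, 2))                     # 0.58
print(round(np.corrcoef(A, H)[0, 1], 2))  # same answer via the built-in
```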

The correlation measures the reliability of the relationship between two variables. A higher absolute correlation means more reliability, regardless of the strength of the relationship (slope). The nice thing about the correlation is that it is always between −100% and +100%. Two variables that have a correlation of +100% always move perfectly in the same direction; two variables that have a correlation of −100% always move perfectly in the opposite direction; and two variables that are independent have a correlation of 0%. This makes the correlation very easy to interpret. The correlation is unit-less, regardless of the units of the original variables, and is often abbreviated with the Greek letter rho (ρ). The perfect correlations between A and C and between A and D tell you that all points lie on straight lines. (Verify this visually in Figure 13.3!) The correlation between A and G (95%) and the correlation between A and I (−95%) are almost as strong: the points almost lie on a line. The correlation between A and H and the correlation between A and J are weaker: the points do not seem to lie on a straight line, and knowing A does not permit you to predict H or J perfectly.

Side Note: If two variables always act identically, they have a correlation of 1. Therefore, you can determine the maximum covariance between two variables:

1 = Cov(X̃, Ỹ) / [Sdv(X̃) · Sdv(Ỹ)]  ⟺  Cov(X̃, Ỹ) = Sdv(X̃) · Sdv(Ỹ) .        (13.14)

It is mathematically impossible for the absolute value of the covariance to exceed the product of the two standard deviations.



Beta

Margin note: Beta is yet another covariation measure, and it has an interesting graphical interpretation.

The correlation cannot tell you that A has a more pronounced influence on C than on D: although both correlations are perfect, if A is higher by 1, your prediction of C is higher by 2; but if A is higher by 1, your prediction of D is lower by only 0.5. You need a measure of the slope of the best-fitting line that you would draw through the points. Your second normalization of the covariance does exactly this: it gives you this slope, the beta. It divides the covariance by the variance of the X variable (here, A); that is, instead of dividing by the standard deviation of Y (as in the correlation), it divides a second time by the standard deviation of X:

βC,A = Cov(A, C) / [Sdv(A) · Sdv(A)] = +50 / (5 · 5) = +2.00 ;

βD,A = Cov(A, D) / [Sdv(A) · Sdv(A)] = −12.5 / (5 · 5) = −0.50 ;

βE,A = Cov(A, E) / [Sdv(A) · Sdv(A)] = 0 / (5 · 5) = ±0.00 ;

βF,A = Cov(A, F) / [Sdv(A) · Sdv(A)] = 0 / (5 · 5) = ±0.00 ;

βG,A = Cov(A, G) / [Sdv(A) · Sdv(A)] = 4.75 / (5 · 5) = +0.19 ;        (13.15)

βH,A = Cov(A, H) / [Sdv(A) · Sdv(A)] = 32 / (5 · 5) = +1.28 ;

βI,A = Cov(A, I) / [Sdv(A) · Sdv(A)] = −4.75 / (5 · 5) = −0.19 ;

βJ,A = Cov(A, J) / [Sdv(A) · Sdv(A)] = −28.75 / (5 · 5) = −1.15 ;

βY,X = Cov(X, Y) / [Sdv(X) · Sdv(X)] = Cov(X, Y) / Var(X) .
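
In code, the only change from the correlation is the divisor; a sketch (again assuming numpy):

```python
import numpy as np

A = np.array([-5, 6, 3, -1, 7], dtype=float)      # the X variable
H = np.array([-10, -9, 10, -8, 12], dtype=float)  # the Y variable

# Covariance divided by the variance of the X variable, as in Formula 13.15.
beta_H_A = np.cov(A, H)[0, 1] / np.var(A, ddof=1)
print(beta_H_A)  # 1.28, i.e., 32 / 25
```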

The first subscript on beta denotes the variable on the Y axis, while the second subscript denotes the variable on the X axis. It is the latter that provides the variance (the divisor). Beta got its name from the fact that the most common way to write down the formula for a line is y = α + β·x, and the slope of the best-fitting line is exactly what beta is. Unlike correlations, betas are not limited to any range. Beta values of +1 and −1 denote the diagonal lines; beta values of 0 and ∞ denote the horizontal and the vertical line. Inspection of Figure 13.3 shows that the line for A vs. C has a slope of 2 to 1, while the line for A vs. D has a shallower slope of −1 to 2. This is exactly what beta tells us: βC,A is 2.0, while βD,A is −0.5. Unlike the correlation, beta cannot tell you whether your line fits perfectly or imperfectly. But, unlike the correlation, beta can tell you how much you should change your prediction of Y if the X value changes. And, unlike correlation and covariance, the order of the two variables matters in computing beta. For example, βG,A = 0.19 is not βA,G:

βA,G = Cov(A, G) / [Sdv(G) · Sdv(G)] = 4.75 / (1 · 1) = 4.75 .        (13.16)

Digging Deeper: In a statistical package, beta can be obtained either by computing the covariance and variance and then dividing the two, or by running a linear regression in which the dependent variable is the Y variable and the independent variable is the X variable:

Y = α + β·X + ε .        (13.17)

Both methods yield the same answer.
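
For example, a least-squares line fit in numpy recovers the same slope as the covariance/variance ratio; a minimal sketch:

```python
import numpy as np

A = np.array([-5, 6, 3, -1, 7], dtype=float)
H = np.array([-10, -9, 10, -8, 12], dtype=float)

beta, alpha = np.polyfit(A, H, deg=1)  # slope and intercept of the OLS line
print(round(beta, 2))  # 1.28, identical to Cov(A,H) / Var(A)
```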


Margin note: When it comes to stock returns, you really want to know the future slope, although you usually only have the historical slope.

And, as with all other statistical measures, please keep in mind that you are usually computing a historical beta (line slope), although you are usually really interested in the future beta (line slope)!

Summary of Covariation Measures

Table 13.6 summarizes the three covariation measures.


Table 13.6. Comparison of Covariation Measures

              Computed as        Units      Magnitude                Order of Variables   Measures
Covariance    σX,Y               squared    practically meaningless  irrelevant           No intuition
Correlation   σX,Y / (σX · σY)   unit-less  between −1 and +1        irrelevant           Reliability
beta (Y, X)   σX,Y / σX,X        unit-less  meaningful (slope)       important            Slope
beta (X, Y)   σX,Y / σY,Y        unit-less  meaningful (slope)       important            Slope

All covariation measures share the same sign: if one is positive (negative), so are all the others. Recall that σX,X = σX², which must be positive.



Figure 13.4 summarizes everything that you have learned about the covariation of your series. It plots the data points, the quadrants, the best-fitting lines, and a text description of the three measures of covariation.


13·4.C. Computing Covariation Statistics for the Annual Returns Data

Margin note: Applying the covariance formula to the historical data.

Now back to work! It is time to compute the covariance, correlation, and beta for your three investment choices: the S&P500, IBM, and Sony. Return to the deviations from the means in Table 13.2. As you know, to compute the covariances, you add the products of the demeaned observations and divide by (T − 1). Tedious, but not difficult work:

Cov(r̃S&P500, r̃IBM) = [(+0.162)·(−0.366) + ... + (−0.335)·(−0.511)] / 11 = 0.0330 ;

Cov(r̃S&P500, r̃Sony) = [(+0.162)·(−0.345) + ... + (−0.335)·(−0.323)] / 11 = 0.0477 ;

Cov(r̃IBM, r̃Sony) = [(−0.366)·(−0.345) + ... + (−0.511)·(−0.323)] / 11 = 0.0218 ;

Cov(r̃i, r̃j) = [(r̃i,s=1 − E(r̃i))·(r̃j,s=1 − E(r̃j)) + ... + (r̃i,s=T − E(r̃i))·(r̃j,s=T − E(r̃j))] / (T − 1) .        (13.18)
All three covariances are positive. You know from the discussion on Page 317 that, aside from their signs, covariances are almost impossible to interpret. Therefore, now




Figure 13.4. C Through J as Functions of A, with Lines and Text

[The eight scatterplots once more, each annotated with its three covariation statistics with respect to A:
C: covar = 50, correl = 1, beta = 2; D: covar = −12.5, correl = −1, beta = −0.5 (perfect correlation).
E: covar = 0, correl = 0, beta = 0; F: covar = 0, correl = NA, beta = 0 (zero covariation).
G: covar = 4.75, correl = 0.95, beta = 0.19 (positive covariation; reliable association, shallow slope); H: covar = 32, correl = 0.58, beta = 1.28 (positive covariation; unreliable association, steep slope; one point pulls the other way).
I: covar = −4.75, correl = −0.95, beta = −0.19; J: covar = −28.75, correl = −0.48, beta = −1.15 (negative covariation; one point in J pulls the other way).]

The five observations are marked by circles. The areas north, south, east, and west of the X and Y means are now marked. A cross with arm lengths equal to one standard deviation is also placed on each figure.

compute the correlations, your measure of how well the best-fitting line fits the data. The correlations are the covariances divided by the two standard deviations:

Correlation(r̃S&P500, r̃IBM) = 3.30% / (19.0% · 38.8%) = 44.7% ;

Correlation(r̃S&P500, r̃Sony) = 4.77% / (19.0% · 90.3%) = 27.8% ;        (13.19)

Correlation(r̃IBM, r̃Sony) = 2.18% / (38.8% · 90.3%) = 6.2% ;

Correlation(r̃i, r̃j) = Cov(r̃i, r̃j) / [Sdv(r̃i) · Sdv(r̃j)] .

So the S&P500 has correlated much more with IBM than with Sony (and more than IBM has with Sony). This makes intuitive sense: both the S&P500 and IBM are U.S. investments, while Sony is a stock that trades in an entirely different market.
Margin note: Applying the beta formula to the historical data.

Finally, you might want to compute the beta of r̃Sony with respect to r̃S&P500 (i.e., divide the covariance of r̃Sony with r̃S&P500 by the variance of r̃S&P500), and the beta of r̃IBM with respect to r̃S&P500. Although you should really write βr̃IBM,r̃S&P500, no harm is done if you omit the tildes for convenience when there is no risk of confusion, and just write βIBM,S&P500 instead.

βIBM,S&P500 = 3.30% / (19.0%)² = 0.91 ;

βSony,S&P500 = 4.77% / (19.0%)² = 1.31 ;        (13.20)

βi,j = Cov(r̃i, r̃j) / Var(r̃j) .

Beta is the slope of the best-fitting line when the rate of return on the S&P500 is on the X axis and the rate of return on IBM (or Sony) is on the Y axis. Note that although Sony correlated less with the S&P500 than IBM did, it is Sony that has the steeper slope. Correlation and beta measure different things. The next chapters will elaborate on the importance of beta in finance.
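
If you want to double-check this arithmetic, here is a quick sketch. It uses the rounded covariances and standard deviations quoted above, so the last digits can differ slightly from the text's figures, which are based on unrounded inputs:

```python
# Rounded inputs from Formulas 13.18 and 13.19.
cov_sp_ibm, cov_sp_sony = 0.0330, 0.0477
sdv_sp, sdv_ibm, sdv_sony = 0.190, 0.388, 0.903

print(cov_sp_ibm / (sdv_sp * sdv_ibm))    # ~0.447: Correlation(S&P500, IBM)
print(cov_sp_sony / (sdv_sp * sdv_sony))  # ~0.278: Correlation(S&P500, Sony)
print(cov_sp_ibm / sdv_sp**2)             # ~0.91:  beta of IBM w.r.t. S&P500
print(cov_sp_sony / sdv_sp**2)            # ~1.32:  beta of Sony w.r.t. S&P500
```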



Margin note: The Marquis de Sade would not have been happy. OK, neither would Mary Poppins.

You have now computed all the statistics that this book will use: means, variances, standard deviations, covariances, correlations, and betas. Only modestly painful, I hope. The next chapter will use no new statistics, but it will show how they work in the context of portfolios.

Solve Now!
Q 13.3 What is the correlation of a random variable with itself?

Q 13.4 What is the slope (beta) of a random variable with itself?

Q 13.5 Return to the historical rates of return on the DAX from Question 13.2 (Page 311). Compute the covariances, correlations, and betas of the DAX with respect to each of the other three investment securities.

Q 13.6 Very advanced question: Compute the annual rates of return on a portfolio of 1/3 IBM and 2/3 Sony. Then compute the beta of this portfolio with respect to the S&P500. How does it compare to the beta of IBM with respect to the S&P500 and the beta of Sony with respect to the S&P500?

13·5. Summary

The chapter covered the following major points:

• Finance often uses statistics based on historical rates of return as stand-ins to predict statistics for future rates of return. This is a leap of faith that is often, but not always, correct.

• Tildes denote random variables: a distribution of possible future outcomes. In practice, the distribution is often given by historical data.

• The expected rate of return is a measure of the reward. It is often forecast from the historical mean.

• The standard deviation, and its intermediate input, the variance, are measures of risk. The standard deviation is (practically) the square root of the average squared deviation from the mean.

• Covariation measures how two variables move together. Causality induces covariation, but not vice versa. So two variables can covary even if neither variable is the cause of the other.

• Like the variance, the covariance is difficult to interpret. Thus, the covariance is often only an intermediate number on the way to more intuitive statistics.

• The correlation is the covariance divided by the standard deviations of both variables. It measures how reliable the relationship between two variables is. The order of the variables does not matter. The correlation is always between −1 and +1.

• The beta is the covariance divided by the standard deviation of the variable on the X axis squared (which is the variance). It measures how steep the relationship between two variables is. The order of the variables matters: βA,B is different from βB,A.

13·6. Advanced Appendix: More Statistical Theory

13·6.A. Historical and Future Statistics

Margin note: Physical processes often have known properties. Stock returns do not.

The theory usually assumes that although you do not know the outcome of a random variable, you do know the statistics of the outcomes (such as the mean and standard deviation). That is, you can estimate or judge a random variable's unknown mean, standard deviation, covariance (beta), etc. Alas, while this is easy for the throw of a coin or a die, where you know the physical properties of what determines the random outcome, it is not so easy for stock returns. For example, what is the standard deviation of next month's rate of return on PepsiCo?
Margin note: Use historical statistics as estimators of future statistics.

You just do not have a better alternative than to assume that PepsiCo's returns are manifestations of the same statistical process over time. So, if you want to know the standard deviation of PepsiCo's rate of return next month, you typically must assume that each historical monthly rate of return, at least over the last couple of years, was a draw from the same distribution. Therefore, you can use the historical rates of return, treating each one as an equally likely outcome, to estimate the future standard deviation. Analogously, the mechanics of computing the estimated future standard deviation are exactly the same as those you used to obtain the actual historical standard deviation.

Margin note: This works well for standard deviations and covariation statistics, but not for means.

But using historical statistics and then arguing that they are representative of the future is a bit of a leap. Empirical observation has taught us that doing so works well for standard deviations and covariation measures: the historical statistics obtained from monthly rates of return over the last 3 to 5 years appear to be fairly decent predictors of future betas and standard deviations. Unfortunately, historical mean rates of return are fairly unreliable predictors of future rates of return.

Q 13.7 When predicting the outcome of a die throw, why do you not use historical statistics on die throws as predictors of future die throws?

Q 13.8 Are historical financial securities' mean rates of return good predictors of future mean rates of return?

Q 13.9 Are historical financial securities' standard deviations and correlations of rates of return good predictors of their future equivalents?


13·6.B. Improving Future Estimates from Historical Estimates

Margin note: Extreme outcomes have more of two components: a higher expected outcome and a higher error term (which will not repeat).

The principal remaining problem in the reliability of historical covariance estimates for prediction is what statisticians call "regression to the mean." That is, the most extreme historical estimates are likely caused not only by the underlying true parameters, but even more so by chance. For example, if all securities had a true standard deviation of 30% per annum, over a particular year some might show a standard deviation of 40%, while others might show a standard deviation of 20%. Those with the high 40% historical standard deviations are most likely to see future standard deviations lower than their historical ones (dropping back towards 30%). Those with the low 20% historical standard deviations are most likely to see future standard deviations higher than their historical ones (rising back towards 30%). The same can manifest itself in market-beta estimates. Predicting future betas by running a regression on historical rate of return data is too naïve: a stock that happened to have a really high return on one day will show too high a beta if the overall stock market happened to go up that day, and too low a beta if the overall stock market happened to go down that day. Such extreme observations tend not to repeat under the same market conditions in the future.

Margin note: Shrinkage just reduces the estimate, hoping to adjust for the errors in extremes.

Statisticians handle such problems with a technique called "shrinkage": the historical estimates are reduced ("shrunk") towards a more common mean. Naturally, the exact amount by which historical estimates should be shrunk, and the number towards which they should be shrunk, is a very complex technical problem, and doing it well can make millions of dollars. This book is definitely not able to cover the subject appropriately. Still, reading this book, you might wonder whether there is something both quick-and-dirty and reasonable that you can do to obtain better estimates of future mean returns, future standard deviations, and future betas.
Margin note: Advice: take the average of the market historical statistic and your individual stock historical statistic. It probably predicts better.

The answer is yes. Here is a two-minute, non-formal heuristic estimation job: to predict a portfolio statistic, average the historical statistic of your particular portfolio with the historical statistic of the overall stock market. There are better and more sophisticated methods, but this averaging is likely to predict better than the historical statistic of the particular portfolio by itself. (With more time and statistical expertise, you could use other information, such as beta, the industry historical rate of return, or the average P/E ratio of the portfolio, to produce even better guesstimates of future portfolio behavior.) For example, the market beta of the overall market is 1.0, so my prescription is to average the estimated beta and 1.0. Commercial vendors of market-beta estimates do something similar. Bloomberg computes the weighted average of the estimated market beta and the number one, with weights of 0.67 and 0.33, respectively; Value Line reverses these two weights. Ibbotson Associates, however, does something more sophisticated, shrinking beta not towards one, but towards a "peer group" market beta.
Margin note: An example of shrinking.

Let us apply some shrinking to the statistics in Table 14.7 on Page 353. If you were asked to guesstimate an expected annual rate of return for Wal-Mart over the next year, you would not quote Wal-Mart's historical 31.5% as your estimate of Wal-Mart's future rate of return. Instead, you could quote an average of 31.5% and 6.3% (the historical rate of return on the market from 1997 to 2002), or about 20% per annum. (This assumes that you are not permitted to use more sophisticated models, such as the CAPM.) You would also guesstimate Wal-Mart's risk to be the average of 31.1% and 18.7%, or about 25% per year. Finally, you would guesstimate Wal-Mart's market beta to be about 0.95. The specific market index towards which you shrink matters little (the Dow-Jones 30 or the S&P500), but it does matter that you shrink somehow! An even better target to shrink towards would be the industry average statistics. (Some researchers go as far as to estimate only industry betas, and forgo computing the individual firm beta altogether! This is shrinking to a very large degree.) However, good shrinking targets are beyond the scope of this book. Would you like to bet that the historical statistics are better guesstimates than the shrunk statistics? (If so, feel free to invest your money in Wal-Mart, and deceive yourself that you will likely earn a mean return of 31.5%! Good luck!)
Margin note: What works reasonably well, what does not.

Here is a summary of some recommendations. Based on regressions using five years of historical monthly data, to predict one-year-ahead statistics, you can use reasonable shrinkage methods for large stocks (e.g., members of the S&P500) as follows:

Mean: Nothing works too well (i.e., for predicting the future from the past).

Market-Model Alpha: Nothing works too well.

Market-Model Beta: Average the historical beta with the number "1." For example, if the regression coefficient (covariance/variance) is 4, use a beta of 2.5.

Standard Deviation: Average the historical standard deviation of the stock and the historical standard deviation of the S&P500. Then increase by 30%, because, historically, for unknown reasons, volatility has been increasing.

Recall that the market model is the linear regression in which the X variable is the rate of return on the S&P500, and the Y variable is the rate of return on the stock in which you are interested.
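
A minimal sketch of this kind of shrinkage in Python (the function name and the 50/50 default weights are illustrative choices; the text's own prescription is the simple average):

```python
def shrink(estimate, target=1.0, weight=0.5):
    """Shrink a historical estimate towards a target.
    weight is the weight placed on the historical estimate."""
    return weight * estimate + (1.0 - weight) * target

print(shrink(4.0))               # 2.5: the averaging example from the text
print(shrink(4.0, weight=0.67))  # ~3.0: Bloomberg-style 0.67/0.33 weights
```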

13·6.C. Other Measures of Spread

There are measures of risk other than the variance and standard deviation, but they are obscure enough to deserve your ignorance (at least until an advanced investments course). One such measure is the mean absolute deviation (MAD) from the mean. For the example of a rate of return of either +25% or −25%,

MAD = 1/2 · |−25%| + 1/2 · |+25%| = 1/2 · 25% + 1/2 · 25% = 25% .        (13.21)

In this case, the outcome happens to be the same as the standard deviation, but this is not generally the case. The MAD gives less weight than the standard deviation to observations far from the mean. For example, if you had three returns, −50%, −50%, and +100%, the mean would be 0%, the standard deviation 70.7%, and the MAD 66.7%.
Another measure of risk is the semivariance (SV), which relies only on observations below zero or below the mean. That is, all positive returns (or deviations above the mean) are simply ignored. For the example of +25% or −25%,

SV = 1/2 · (−25%)² + 1/2 · (0) = 1/2 · 0.0625 = 0.03125 .        (13.22)

The idea is that investors fear only realizations that are negative (or below the mean).
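
Both measures are easy to compute; here is a sketch with equally weighted observations (the function names are illustrative):

```python
def mad(returns):
    """Mean absolute deviation from the mean."""
    m = sum(returns) / len(returns)
    return sum(abs(r - m) for r in returns) / len(returns)

def semivariance(returns):
    """Average squared below-mean deviation; above-mean deviations count as 0."""
    m = sum(returns) / len(returns)
    return sum((r - m) ** 2 for r in returns if r < m) / len(returns)

print(mad([-0.25, 0.25]))           # 0.25, as in Formula 13.21
print(semivariance([-0.25, 0.25]))  # 0.03125, as in Formula 13.22
print(mad([-0.5, -0.5, 1.0]))       # ~0.667, the MAD example above
```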
Finally, note that the correlation has another nice interpretation: the correlation squared is the R² in a bivariate OLS regression with a constant.


13·6.D. Translating Mean and Variance Statistics into Probabilities

Margin note: Translating standard deviations into more intuitive risk assessments.

Although you now know enough to compute a measure of risk, you have not yet explored how likely particular outcomes are. For example, if a portfolio's expected rate of return is 12.6% per year and its standard deviation is 22% per year, what is the probability that you will lose money (earn below 0%)? What is the probability that you will earn 15% or more? 20% or more?

Margin note: Stock returns often assume a normal (bell-shaped) distribution.

It turns out that if the underlying distribution looks like a bell curve, and many common portfolio return distributions have this shape, there is an easy procedure to translate the mean and standard deviation into the probability that the return will end up being less than some value x. In fact, this probability correspondence is the only advantage that bell-shaped distributions provide! Everything else works regardless of the actual shape of the distribution.
Margin note: An example: Z-score and probability.

For concreteness' sake, assume you want to determine the probability that the rate of return on this portfolio is less than +5%:

Step 1: Subtract the mean from 5%. In the example, with an expected rate of return of 12.6%, the result is 5% − 12.6% = −7.6%.

Step 2: Divide this number by the standard deviation. In this example, this is −7.6% divided by 22%, which comes to −0.35. This number is called the score, or Z-score.

Step 3: Look up the probability for this score in the cumulative normal distribution table in Table B.1 (Page 796). For the score of −0.35, this probability is about 0.36.

In sum, you have determined that if returns are drawn from a distribution with a mean of 12.6% and a standard deviation of 22%, then the probability of observing a single rate of return of +5% or less is about 36%. It also follows that the probability that a return is greater than +5% must be 100% − 36% = 64%.
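
The table lookup can be replaced by the cumulative normal distribution function; a sketch, assuming scipy is available:

```python
from scipy.stats import norm

mean, sdv = 0.126, 0.22
z = (0.05 - mean) / sdv           # Steps 1 and 2: the Z-score
print(round(z, 2))                # -0.35
print(round(norm.cdf(z), 2))      # 0.36: Prob(return <= +5%), Step 3
print(round(1 - norm.cdf(z), 2))  # 0.64: Prob(return > +5%)
```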
Side Note: In the real world, this works well enough, but not perfectly. So, do not get fooled by theoretical pseudo-accuracy. Anything between 30% and 40% is a reasonable prediction here.
¬le=statistics-g.tex: RP
327
Section 13·6. Advanced Appendix: More Statistical Theory.

Margin note: How well does the Z-score fit the history?

Now recall portfolio P in Table 14.1. P had a mean of 12.6% and a standard deviation of 22%. You have just computed that about one-third of the 12 annual portfolio returns should be below +5%. The years 1991, 1993, 1994, 1995, 1996, 1997, 1998, and 1999 performed better than +5%; 1992, 2000, 2001, and 2002 performed worse. So, as predicted by applying the normal distribution table, about one-third of the annual returns were 5% or less.
Margin note: How likely is it that you will lose money?

A common question is "what is the probability that the return will be negative?" Use the same technique:

Step 1: Subtracting the mean from 0% yields 0.0% − 12.6% = −12.6%.

Step 2: Dividing −12.6% by the standard deviation of 22% yields a score of −0.57.

Step 3: For this score of −0.57, the probability is about 28%.

In words, the probability that the rate of return will be negative is around 25% to 30%, and, therefore, the probability that the return will be positive is around 70% to 75%. The table shows that 4 out of the 12 annual rates of return are negative. This is most likely sampling error: with only 12 annual rates of return, it was impossible for the distribution of the data to follow a bell shape exactly.

Digging Deeper: Many portfolio returns have what is called "fat tails." This means that the probability of extreme outcomes, especially extreme negative outcomes, is often higher than the normal distribution table suggests. For example, if the mean return were 30% (e.g., for a multi-year return) and the standard deviation were 10%, the score for a value of 0 would be −3. The table therefore suggests that the probability of drawing a negative return should be 0.135%, or about once in a thousand periods. Long experience with financial data suggests that this is often much too overconfident for the real world. In some contexts, the true probability of even the most negative possible outcome (−100%) may be as high as 1%, even if the Z-score suggests 0.0001%!




13·6.E. Correlation and Causation

Margin note: Correlation does not imply causation.

A warning: covariation is related to, but not the same as, causation. If one variable causes another, then the two variables will be correlated. But the opposite does not hold: for example, snow and depression are positively correlated, yet neither causes the other. Instead, a third variable (winter) has an influence on both snow and depression.
Solve Now!
Q 13.10 If the mean is 20 and the standard deviation is 15, what is the probability that the value
will turn out to be less than 0?
