a^\top S a = n^{-1} a^\top X^\top H X a
           = n^{-1} (a^\top X^\top H)(H X a)   since H is symmetric and idempotent (H^\top H = H),
           = n^{-1} y^\top y = n^{-1} \sum_{j=1}^{n} y_j^2 \ge 0

for y = HXa. It is well known from the one-dimensional case that n^{-1} \sum_{i=1}^{n} (x_i - \bar{x})^2 as an estimate of the variance exhibits a bias of the order n^{-1} (Breiman, 1973). In the multidimensional case, S_u = \frac{n}{n-1} S is an unbiased estimate of the true covariance. (This will be shown in Example 4.15.)
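As a quick numerical illustration (a sketch with simulated data, not part of the text; all variable names are ours), the biased estimator S = n^{-1} X^\top H X, its unbiased rescaling S_u = \frac{n}{n-1} S, and the nonnegativity of a^\top S a can be checked directly:

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 10, 4
X = rng.normal(size=(n, p))          # simulated (n x p) data matrix

H = np.eye(n) - np.ones((n, n)) / n  # centering matrix H, symmetric and idempotent
S = X.T @ H @ X / n                  # empirical covariance S = n^{-1} X'HX (biased)
S_u = n / (n - 1) * S                # unbiased version S_u = n/(n-1) S

a = rng.normal(size=p)               # arbitrary direction a
quad = a @ S @ a                     # quadratic form a'Sa, nonnegative since S >= 0
```

Note that `np.cov` divides by n − 1 by default, so it reproduces S_u rather than S.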
The sample correlation coefficient between the i-th and j-th variables is r_{X_i X_j}, see (3.8). If D = diag(s_{X_i X_i}), then the correlation matrix is

R = D^{-1/2} S D^{-1/2},   (3.21)

where D^{-1/2} is the diagonal matrix with elements (s_{X_i X_i})^{-1/2} on its main diagonal.
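Formula (3.21) translates into a few lines of code; a minimal sketch with simulated data (the data and names are ours):

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(20, 3))
X[:, 1] += X[:, 0]                        # make two columns correlated

n = X.shape[0]
H = np.eye(n) - np.ones((n, n)) / n       # centering matrix
S = X.T @ H @ X / n                       # empirical covariance matrix
D_inv_sqrt = np.diag(np.diag(S) ** -0.5)  # D^{-1/2}: diagonal of 1/sqrt(s_XiXi)
R = D_inv_sqrt @ S @ D_inv_sqrt           # correlation matrix, formula (3.21)
```

The scaling n^{-1} versus (n − 1)^{-1} cancels in R, so the result matches `np.corrcoef`.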

EXAMPLE 3.8 The empirical covariances are calculated for the pullover data set. The vector of the means of the four variables in the data set is

\bar{x} = (172.7, 104.6, 104.0, 93.8)^\top.

The sample covariance matrix is

S = \begin{pmatrix} 1037.2 & -80.2 & 1430.7 & 271.4 \\ -80.2 & 219.8 & 92.1 & -91.6 \\ 1430.7 & 92.1 & 2624.0 & 210.3 \\ 271.4 & -91.6 & 210.3 & 177.4 \end{pmatrix}.

The unbiased estimate of the variance (n = 10) is equal to

S_u = \frac{10}{9} S = \begin{pmatrix} 1152.5 & -88.9 & 1589.7 & 301.6 \\ -88.9 & 244.3 & 102.3 & -101.8 \\ 1589.7 & 102.3 & 2915.6 & 233.7 \\ 301.6 & -101.8 & 233.7 & 197.1 \end{pmatrix}.
94 3 Moving to Higher Dimensions

The sample correlation matrix is

R = \begin{pmatrix} 1 & -0.17 & 0.87 & 0.63 \\ -0.17 & 1 & 0.12 & -0.46 \\ 0.87 & 0.12 & 1 & 0.31 \\ 0.63 & -0.46 & 0.31 & 1 \end{pmatrix}.

Linear Transformation

In many practical applications we need to study linear transformations of the original data. This motivates the question of how to calculate summary statistics after such linear transformations.
Let A be a (q × p) matrix and consider the transformed data matrix

Y = X A^\top = (y_1, \ldots, y_n)^\top.   (3.22)

The row y_i = (y_{i1}, \ldots, y_{iq})^\top \in \mathbb{R}^q can be viewed as the i-th observation of a q-dimensional random variable Y = AX. In fact we have y_i = A x_i. We immediately obtain the mean and the empirical covariance of the variables (columns) forming the data matrix Y:

\bar{y} = \frac{1}{n} Y^\top 1_n = \frac{1}{n} A X^\top 1_n = A \bar{x}   (3.23)

S_Y = \frac{1}{n} Y^\top H Y = \frac{1}{n} A X^\top H X A^\top = A S_X A^\top.   (3.24)

Note that if the linear transformation is nonhomogeneous, i.e.,

y_i = A x_i + b   where b is (q × 1),

only (3.23) changes: \bar{y} = A\bar{x} + b. Formulas (3.23) and (3.24) are useful in the particular case of q = 1, i.e., y = X a \Leftrightarrow y_i = a^\top x_i, i = 1, \ldots, n:

\bar{y} = a^\top \bar{x}
S_y = a^\top S_X a.
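The case q = 1 is easy to verify numerically; a sketch with simulated data and an arbitrary weight vector a (all names are ours):

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(50, 4))            # data matrix with p = 4 variables
a = np.array([1.0, -2.0, 0.5, 3.0])     # arbitrary weights, q = 1

y = X @ a                               # transformed data, y_i = a'x_i
n = X.shape[0]
H = np.eye(n) - np.ones((n, n)) / n
S_X = X.T @ H @ X / n                   # empirical covariance of X

y_bar = a @ X.mean(axis=0)              # formula (3.23): y_bar = a' x_bar
S_y = a @ S_X @ a                       # formula (3.24): S_y = a' S_X a
```

Since S_X uses the factor n^{-1}, S_y agrees with `y.var()` (NumPy's default `ddof=0`).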

EXAMPLE 3.9 Suppose that X is the pullover data set. The manager wants to compute the mean expenses for advertisement (X_3) and sales assistants (X_4).
Suppose that the sales assistant charges an hourly wage of 10 EUR. Then the shop manager calculates the expenses Y as Y = X_3 + 10 X_4. Formula (3.22) says that this is equivalent to defining the (1 × 4) matrix A as:

A = (0, 0, 1, 10).
Using formulas (3.23) and (3.24), it is now computationally very easy to obtain the sample mean \bar{y} and the sample variance S_y of the overall expenses:

\bar{y} = A\bar{x} = (0, 0, 1, 10) \begin{pmatrix} 172.7 \\ 104.6 \\ 104.0 \\ 93.8 \end{pmatrix} = 1042.0

S_Y = A S_u A^\top = (0, 0, 1, 10) \begin{pmatrix} 1152.5 & -88.9 & 1589.7 & 301.6 \\ -88.9 & 244.3 & 102.3 & -101.8 \\ 1589.7 & 102.3 & 2915.6 & 233.7 \\ 301.6 & -101.8 & 233.7 & 197.1 \end{pmatrix} \begin{pmatrix} 0 \\ 0 \\ 1 \\ 10 \end{pmatrix}

= 2915.6 + 4674 + 19710 = 27299.6.

Mahalanobis Transformation

A special case of this linear transformation is

z_i = S^{-1/2} (x_i - \bar{x}), \quad i = 1, \ldots, n.   (3.25)

Note that for the transformed data matrix Z = (z_1, \ldots, z_n)^\top,

S_Z = n^{-1} Z^\top H Z = I_p.   (3.26)

So the Mahalanobis transformation eliminates the correlation between the variables and standardizes the variance of each variable. If we apply (3.24) using A = S^{-1/2}, we obtain the identity covariance matrix as indicated in (3.26).
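A sketch of the Mahalanobis transformation (3.25), computing the symmetric root S^{-1/2} via a spectral decomposition (simulated data; names are ours):

```python
import numpy as np

rng = np.random.default_rng(3)
A = np.array([[2.0, 0.0, 0.0],
              [1.0, 1.0, 0.0],
              [0.0, 0.5, 3.0]])
X = rng.normal(size=(200, 3)) @ A.T      # simulated data with correlated columns
n, p = X.shape

x_bar = X.mean(axis=0)
Xc = X - x_bar                           # centered data
S = Xc.T @ Xc / n                        # empirical covariance

vals, vecs = np.linalg.eigh(S)           # spectral decomposition of S
S_inv_sqrt = vecs @ np.diag(vals ** -0.5) @ vecs.T  # symmetric S^{-1/2}

Z = Xc @ S_inv_sqrt                      # rows z_i = S^{-1/2}(x_i - x_bar)
S_Z = Z.T @ Z / n                        # equals the identity I_p, cf. (3.26)
```

Because S_inv_sqrt is symmetric, right-multiplying the centered rows is the same as applying S^{-1/2} to each x_i − x̄.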

Summary
→ The center of gravity of a data matrix is given by its mean vector \bar{x} = n^{-1} X^\top 1_n.
→ The dispersion of the observations in a data matrix is given by the empirical covariance matrix S = n^{-1} X^\top H X.
→ The empirical correlation matrix is given by R = D^{-1/2} S D^{-1/2}.
→ A linear transformation Y = X A^\top of a data matrix X has mean A\bar{x} and empirical covariance A S_X A^\top.
→ The Mahalanobis transformation is a linear transformation z_i = S^{-1/2}(x_i - \bar{x}) which gives a standardized, uncorrelated data matrix Z.

3.4 Linear Model for Two Variables

We have looked many times now at downward- and upward-sloping scatterplots. What does the eye define here as slope? Suppose that we can construct a line corresponding to the general direction of the cloud. The sign of the slope of this line would correspond to the upward and downward directions. Call the variable on the vertical axis Y and the one on the horizontal axis X. A slope line is a linear relationship between X and Y:

y_i = \alpha + \beta x_i + \varepsilon_i, \quad i = 1, \ldots, n.   (3.27)

Here, \alpha is the intercept and \beta is the slope of the line. The errors (or deviations from the line) are denoted as \varepsilon_i and are assumed to have zero mean and finite variance \sigma^2. The task of finding (\alpha, \beta) in (3.27) is referred to as a linear adjustment.
In Section 3.6 we shall derive estimators for \alpha and \beta more formally, as well as accurately describe what a "good" estimator is. For now, one may try to find a "good" estimator (\hat{\alpha}, \hat{\beta}) via graphical techniques. A very common numerical and statistical technique is to use the \alpha and \beta that minimize:

(\hat{\alpha}, \hat{\beta}) = \arg\min_{(\alpha, \beta)} \sum_{i=1}^{n} (y_i - \alpha - \beta x_i)^2.   (3.28)

The solutions to this task are the estimators:

\hat{\beta} = \frac{s_{XY}}{s_{XX}}   (3.29)

\hat{\alpha} = \bar{y} - \hat{\beta} \bar{x}.   (3.30)

The variance of \hat{\beta} is:

Var(\hat{\beta}) = \frac{\sigma^2}{n \cdot s_{XX}}.   (3.31)

The standard error (SE) of the estimator is the square root of (3.31),

SE(\hat{\beta}) = \{Var(\hat{\beta})\}^{1/2} = \frac{\sigma}{(n \cdot s_{XX})^{1/2}}.   (3.32)
We can use this formula to test the hypothesis that \beta = 0. In an application the variance \sigma^2 has to be estimated by an estimator \hat{\sigma}^2 that will be given below. Under a normality assumption on the errors, the t-test for the hypothesis \beta = 0 works as follows.
One computes the statistic

t = \frac{\hat{\beta}}{SE(\hat{\beta})}   (3.33)

and rejects the hypothesis at a 5% significance level if |t| \ge t_{0.975; n-2}, the 97.5% quantile of Student's t_{n-2} distribution, which is the critical value for the two-sided test. For n \ge 30, this quantile can be replaced by 1.96, the 97.5% quantile of the normal distribution. An estimator \hat{\sigma}^2 of \sigma^2 will be given in the following.
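A sketch of (3.29)–(3.33) on simulated data, with \sigma^2 replaced by the plug-in estimate \hat{\sigma}^2 = RSS/(n-2) mentioned above (the data and names are ours, not the pullover example):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 40
x = rng.uniform(80.0, 120.0, size=n)
y = 200.0 - 0.5 * x + rng.normal(0.0, 10.0, size=n)  # true slope -0.5

s_xx = ((x - x.mean()) ** 2).mean()                  # s_XX (with factor 1/n)
s_xy = ((x - x.mean()) * (y - y.mean())).mean()      # s_XY (with factor 1/n)

beta_hat = s_xy / s_xx                               # (3.29)
alpha_hat = y.mean() - beta_hat * x.mean()           # (3.30)

resid = y - alpha_hat - beta_hat * x
sigma2_hat = (resid ** 2).sum() / (n - 2)            # RSS/(n-2)
se_beta = np.sqrt(sigma2_hat / (n * s_xx))           # plug-in version of (3.32)
t_stat = beta_hat / se_beta                          # t-statistic (3.33)
```

The least-squares solution agrees with `np.polyfit(x, y, 1)`, which returns the slope and intercept of the same fit.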

[Figure: scatterplot "pullovers data" with fitted regression line; vertical axis: sales, horizontal axis: price.]
Figure 3.5. Regression of sales (X_1) on price (X_2) of pullovers.   MVAregpull.xpl

EXAMPLE 3.10 Let us apply the linear regression model (3.27) to the "classic blue" pullovers. The sales manager believes that there is a strong dependence of the number of sales on the price. He computes the regression line as shown in Figure 3.5.

How good is this fit? This can be judged via goodness-of-fit measures. Define

\hat{y}_i = \hat{\alpha} + \hat{\beta} x_i,   (3.34)

as the predicted value of y as a function of x. With \hat{y} the textile shop manager in the above example can predict sales as a function of prices x. The variation in the response variable is:

n s_{YY} = \sum_{i=1}^{n} (y_i - \bar{y})^2.   (3.35)

The variation explained by the linear regression (3.27) with the predicted values (3.34) is:

\sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2.   (3.36)

The residual sum of squares (RSS), the minimum in (3.28), is given by:

RSS = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2.   (3.37)

An unbiased estimator \hat{\sigma}^2 of \sigma^2 is given by RSS/(n-2).
The following relation holds between (3.35)–(3.37):

\sum_{i=1}^{n} (y_i - \bar{y})^2 = \sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2 + \sum_{i=1}^{n} (y_i - \hat{y}_i)^2,   (3.38)

total variation = explained variation + unexplained variation.
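The decomposition (3.38) can be checked numerically; a sketch on simulated data (names are ours):

```python
import numpy as np

rng = np.random.default_rng(5)
n = 30
x = rng.normal(size=n)
y = 1.0 + 2.0 * x + rng.normal(size=n)

# least-squares fit via (3.29)-(3.30); x.var() uses 1/n, matching s_XX
beta_hat = ((x - x.mean()) * (y - y.mean())).mean() / x.var()
alpha_hat = y.mean() - beta_hat * x.mean()
y_hat = alpha_hat + beta_hat * x                 # fitted values (3.34)

total = ((y - y.mean()) ** 2).sum()              # total variation (3.35)
explained = ((y_hat - y.mean()) ** 2).sum()      # explained variation (3.36)
unexplained = ((y - y_hat) ** 2).sum()           # residual sum of squares (3.37)
```

The identity holds exactly (up to floating-point error) for the least-squares fit, not for an arbitrary line.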

The coefficient of determination is r^2:

r^2 = \frac{\sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2} = \frac{\text{explained variation}}{\text{total variation}}.   (3.39)

The coefficient of determination increases with the proportion of the variation explained by the linear relation (3.27). In the extreme case where r^2 = 1, all of the variation is explained by the linear regression (3.27). The other extreme, r^2 = 0, is where the empirical covariance is s_{XY} = 0. The coefficient of determination can be rewritten as

r^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2}.   (3.40)

From (3.39), it can be seen that in the linear regression (3.27), r^2 = r_{XY}^2 is the square of the correlation between X and Y.
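Both forms (3.39)/(3.40) and the identity r^2 = r_{XY}^2 can be verified on simulated data (a sketch; names are ours):

```python
import numpy as np

rng = np.random.default_rng(6)
n = 25
x = rng.normal(size=n)
y = 3.0 - 1.5 * x + rng.normal(size=n)

# least-squares fit, (3.29)-(3.30)
beta_hat = ((x - x.mean()) * (y - y.mean())).mean() / x.var()
alpha_hat = y.mean() - beta_hat * x.mean()
y_hat = alpha_hat + beta_hat * x

tss = ((y - y.mean()) ** 2).sum()                # total variation
r2_explained = ((y_hat - y.mean()) ** 2).sum() / tss   # (3.39)
r2_residual = 1.0 - ((y - y_hat) ** 2).sum() / tss     # (3.40)
r_xy = np.corrcoef(x, y)[0, 1]                   # sample correlation r_XY
```

The two computations of r^2 coincide, and both equal the squared sample correlation.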

EXAMPLE 3.11 For the above pullover example, we estimate

\hat{\alpha} = 210.774 \quad \text{and} \quad \hat{\beta} = -0.364.

The coefficient of determination is

r^2 = 0.028.

The textile shop manager concludes that sales are not influenced very much by the price (in a linear way).
