<< стр. 9(всего 14)СОДЕРЖАНИЕ >>
are more popular than the one-sided ones. For two-sided CUSUM schemes we
consider a combination of two one-sided schemes, Lucas (1976) or Lucas and
Crosier (1982), and a scheme based on Crosier (1986). Note, that in some
recent papers the same concept of combination of two one-sided schemes is
used for EWMA charts.
Recall, that Shewhart charts are a special case of EWMA schemes (О» = 1).
Therefore, we distinguish 5 SPC schemes вЂ“ ewma1, ewma2, cusum1, cusum2
(two one-sided schemes), and cusumC (CrosierвЂ™s scheme). For the two-sided
EWMA charts the following quantlets are provided in the XploRe quantlib
spc.

SPC quantlets for two-sided EWMA scheme
вЂ“ produces chart п¬Ѓgure
spcewma2
spcewma2arl вЂ“ returns ARL
вЂ“ returns critical value c2
spcewma2c
spcewma2pmf вЂ“ returns probability mass and distribution function
of the run-length for single time points
spcewma2pmfm вЂ“ the same up to a given time point
240 11 Statistical Process Control

By replacing ewma2 by one of the remaining four scheme titles the related
characteristics can be computed.
The quantlets spcewma1,...,spccusumC generate the chart п¬Ѓgure. Here, we ap-
ply the 5 charts to artiп¬Ѓcial data. 100 pseudo random values from a normal
distribution are generated. The п¬Ѓrst 80 values have expectation 0, the next 20
values have expectation 1, i. e. model (11.1) with Вµ0 = 0, Вµ1 = 1, and m = 81.
We start with the two-sided EWMA scheme and set О» = 0.1, i. e. the chart is

Two-sided EWMA chart
lambda = 0.10, in-control ARL = 300 94
1
0.5
Z_t

0
-0.5

0 50 100
t

Figure 11.1. Two-sided EWMA chart XFGewma2fig.xpl

very sensitive to small changes. The critical value c2 (see (11.3)) is computed
to provide an in-control ARL of 300 (see Section 11.2). Thus, the scheme leads
in average after 300 observations to a false alarm.
EWMA
In Figure 11.1 the graph of Zt is plotted against time t = 1, 2, . . . , 100.
Further, the design parameter О», the in-control ARL, and the time of alarm (if
there is one) are printed. One can see, that the above EWMA scheme detects
the change point m = 81 at time point 94, i. e. the delay is equal to 14. The
related average values, i. e. ARL and Average Delay (AD), for Вµ1 = 1 are 9.33
and 9.13, respectively. Thus, the scheme needs here about 5 observations more
than average.
11.1 Control Charts 241

In the same way the remaining four SPC schemes can be plotted. Remark,
that in case of ewma1 one further parameter has to be set. In order to obtain
EWMA
a suitable п¬Ѓgure and an appropriate scheme the EWMA statistic Zt (see
(11.4)) is reп¬‚ected at a pre-speciп¬Ѓed border zreflect в‰¤ 0 (= Вµ0 ), i. e.
EWMA EWMA
} , t = 1, 2, . . .
Zt = max{zreflect, Zt

for an upper EWMA scheme. Otherwise, the statistic is unbounded, which
leads to schemes with poor worst case performance. Further, the methods

One-sided EWMA chart
1

lambda = 0.10, in-control ARL = 300, zreflect = -4.0 93
0.5
Z_t

0
-0.5

0 50 100
t

Figure 11.2. One-sided EWMA chart XFGewma1fig.xpl

used in Section 11.2 for computing the chart characteristics use bounded con-
tinuation regions of the chart. If zreflect is small enough, then the ARL and
the AD (which are not worst case criterions) of the reп¬‚ected scheme are the
same as of the unbounded scheme. Applying the quantlet XFGewma1fig.xpl
with zreflect= в€’4 leads to Figure 11.2. Thereby, zreflect has the same
normalization factor О»/(2 в€’ О») like the critical value c2 (see 2.). The corre-
sponding normalized border is printed as dotted line (see Figure 11.2). The
chart signals one observation earlier than the two-sided version in Figure 11.1.
The related ARL and AD values for Вµ1 = 1 are now 7.88 and 7.87, respectively.
In Figure 11.3 the three diп¬Ђerent CUSUM charts with k = 0.5 are presented.
They signal at the time points 87, 88, and 88 for cusum1, cusum2, and cusumC,
242 11 Statistical Process Control

respectively. For the considered dataset the CUSUM charts are faster be-

One-sided CUSUM chart
k = 0.50, in-control ARL = 300 87

10
Z_t

5
0

0 50 100
t
XFGcusum1fig.xpl

Two-sided CUSUM chart
k = 0.50, in-control ARL = 300 88
10
5
Z_t

0
-5

0 50 100
t
XFGcusum2fig.xpl

CrosierвЂ™s two-sided CUSUM chart
k = 0.50, in-control ARL = 300 88
10
5
Z_t

0

0 50 100
t
XFGcusumCfig.xpl

Figure 11.3. CUSUM charts: one-sided, two-sided, CrosierвЂ™s two-sided
11.2 Chart characteristics 243

cause of their better worst case performance. The observations right before
the change point at m = 81 are smaller than average. Therefore, the EWMA
charts need more time to react to the increased average. The related average
values of the run length, i. e. ARL and AD, are 8.17 and 7.52, 9.52 and 8.82,
9.03 and 8.79 for cusum1, cusum2, and cusumC, respectively.

11.2 Chart characteristics
Consider the change point model (11.1). For п¬Ѓxed m denote Pm (В·) and Em (В·)
the corresponding probability measure and expectation, respectively. Hereby,
m = в€ћ stands for the case of no change, i. e. the so called in-control case. Then
the Average Run Length (ARL) (expectation of the run length L) is deп¬Ѓned
as
Eв€ћ (L) , Вµ = Вµ0
LВµ = . (11.7)
E1 (L) , Вµ = Вµ0
Thus, the ARL denotes the average number of observations until signal for a
sequence with constant expectation. Вµ = Вµ0 or m = в€ћ stands for no change,
Вµ = Вµ0 and m = 1 mark, that just at the п¬Ѓrst time point (or earlier) a change
takes place from Вµ0 to Вµ. Therefore, the ARL evaluates only the special sce-
nario of m = 1 of the SPC scheme. Other measures, which take into account
that usually 1 < m < в€ћ, were introduced by Lorden (1971) and Pollak and
Siegmund (1975), Pollak and Siegmund (1975). Here, we use a performance
measure which was п¬Ѓrstly proposed by Roberts (1959). The so called (condi-
tional) Average Delay (AD, also known as steady-state ARL) is deп¬Ѓned as
(m)
DВµ lim DВµ ,
= (11.8)
mв†’в€ћ
(m)
DВµ = Em L в€’ m + 1|L в‰Ґ m ,
where Вµ is the value of Вµ1 in (11.1), i. e. the expectation after the change.
While LВµ measures the delay for the case m = 1, DВµ determines the delay for
a SPC scheme which ran a long time without signal. Usually, the convergence
(m)
in (11.8) is very fast. For quite small m the diп¬Ђerence between DВµ and DВµ is
very small already. LВµ and DВµ are average values for the random variable L.
Unfortunately, L is characterized by a large standard deviation. Therefore, one
might be interested in the whole distribution of L. Again, we restrict on the
special cases m = 1 and m = в€ћ. We consider the probability mass function
PВµ (L = n) (PMF) and the cumulative distribution function PВµ (L в‰¤ n) (CDF).
Based on the CDF, one is able to compute quantiles of the run length L.
244 11 Statistical Process Control

For normally distributed random variables it is not possible to derive exact
solutions for the above characteristics. There are a couple of approximation
techniques. Besides very rough approximations based on the Wald approxi-
mation known from sequential analysis, Wiener process approximations and
similar methods, three main methods can be distinguished:

1. Markov chain approach due to Brook and Evans (1972): Replacement of
the continuous statistic Zt by a discrete one
2. Quadrature of integral equations which are derived for the ARL, Vance
(1986) and Crowder (1986) and for some eigenfunctions which lead to the
3. Waldmann (1986) approach: Iterative computation of P (L = n) by us-
ing quadrature and exploiting of monotone bounds for the considered
characteristics

Here we use the п¬Ѓrst approach, which has the advantage, that all considered
characteristics can be presented in a straightforward way. Next, the Markov
chain approach is brieп¬‚y described. Roughly speaking, the continuous statistic
Zt is approximated by a discrete Markov chain Mt . The transition Ztв€’1 =
x в†’ Zt = y is approximated by the transition Mtв€’1 = i w в†’ Mt = j w with
x в€€ [i w в€’ w/2, i w + w/2] and y в€€ [j w в€’ w/2, j w + w/2]. That is, given an
integer r the continuation region of the scheme [в€’c, c], [zreflect, c], or [0, c]
is separated into 2 r + 1 or r + 1 intervals of the kind [i w в€’ w/2, i w + w/2]
(one exception is [0, w/2] as the п¬Ѓrst subinterval of [0, c]). Then, the transition
kernel f of Zt is approximated by the discrete kernel of Mt , i. e.
f (x, y) в‰€ P (i w в†’ [j w в€’ w/2, j w + w/2])/w
for all x в€€ [i w в€’ w/2, i w + w/2] and y в€€ [j w в€’ w/2, j w + w/2]. Eventually,
we obtain a Markov chain {Mt } with 2 r + 1 or r + 1 transient states and one
absorbing state. The last one corresponds to the alarm (signal) of the scheme.
Denote by Q = (qij ) the matrix of transition probabilities of the Markov chain
{Mt } on the transient states, 1 a vector of ones, and L = (Li ) the ARL vector.
Li stands for the ARL of a SPC scheme which starts in point i w (corresponds
to z0 ). In the case of a one-sided CUSUM scheme with z0 = 0 [0, w/2]
the value L0 approximates the original ARL. By using L we generalize the
original schemes to schemes with possibly diп¬Ђerent starting values z0 . Now,
the following linear equation system is valid, Brook and Evans (1972):
(I в€’ Q) L = 1 , (11.9)
11.2 Chart characteristics 245

where I denotes the identity matrix. By solving this equation system we get
the ARL vector L and an approximation of the ARL of the considered SPC
scheme. Remark that the larger r the better is the approximation. In the days
of Brook and Evans (1972) the maximal matrix dimension r+1 (they considered
cusum1) was 15 because of the restrictions of the available computing facilities.
Nowadays, one can use dimensions larger than some hundreds. By looking
at diп¬Ђerent r one can п¬Ѓnd a suitable value. The quantlet XFGrarl.xpl
demonstrates this eп¬Ђect for the Brook and Evans (1972) example. 9 diп¬Ђerent
values of r from 5 to 500 are used to approximate the in-control ARL of a
one-sided CUSUM chart with k = 0.5 and c3 = 3 (variance Пѓ 2 = 1). We get

r 5 10 20 30 40 50 100 200 500
L0 113.47 116.63 117.36 117.49 117.54 117.56 117.59 117.59 117.60

XFGrarl.xpl

The true value is 117.59570 (obtainable via a very large r or by using the
quadrature methods with a suitable large number of abscissas). The computa-
tion of the average delay (AD) requires more extensive calculations. For details
see, e. g., Knoth (1998) on CUSUM for Erlang distributed data. Here we apply
the Markov chain approach again, Crosier (1986). Given one of the considered
schemes and normally distributed data, the matrix Q is primitive, i. e. there
exists a power of Q which is positive. Then Q has one single eigenvalue which
is larger in magnitude than the remaining eigenvalues. Denote this eigenvalue
by . The corresponding left eigenvector П€ is strictly positive, i. e.

П€Q = П€ , П€ > 0. (11.10)

It can be shown, Knoth (1998), that the conditional density f (В·|L в‰Ґ m) of
both the continuous statistic Zt and the Markov chain Mt tends for m в†’
в€ћ to the normalized left eigenfunction and eigenvector, respectively, which
correspond to the dominant eigenvalue . Therefore, the approximation of
D = lim Em (L в€’ m + 1|L в‰Ґ m) can be constructed by
mв†’в€ћ

D = (П€ T L)/(П€ T 1) .

Note, that the left eigenvector П€ is computed for the in-control mean Вµ0 , while
the ARL vector L is computed for a speciп¬Ѓc out-of-control mean or Вµ0 again.
If we replace in the above quantlet ( XFGrarl.xpl) the phrase arl by ad, then
246 11 Statistical Process Control

we obtain the following output which demonstrates the eп¬Ђect of the parameter
r again.

r 5 10 20 30 40 50 100 200 500
D0 110.87 114.00 114.72 114.85 114.90 114.92 114.94 114.95 114.95

Fortunately, for smaller values of r than in the ARL case we get good accuracy
already. Note, that in case of cusum2 the value r has to be smaller (less than 30)
than for the other charts, since it is based on the computation of the dominant
eigenvalue of a very large matrix. The approximation in case of combination of
two one-sided schemes needs a twodimensional approximating Markov chain.
For the ARL only exists a more suitable approach. As, e. g., Lucas and Crosier
(1982) shown it is possible to use the following relation between the ARLs of the
one- and the two-sided schemes. Here, the two-sided scheme is a combination
of two symmetric one-sided schemes which both start at z0 = 0. Therefore,
we get a very simple formula for the ARL L of the two-sided scheme and the
ARLs Lupper and Llower of the upper and lower one-sided CUSUM scheme

Lupper В· Llower
L= . (11.11)
Lupper + Llower

Eventually, we consider the distribution function of the run length L itself.
By using the Markov chain approach and denoting with pn the approximated
i
probability of (L > n) for a SPC scheme started in i w, such that pn = (pn ),
i
we obtain
pn = pnв€’1 Q = p0 Qn . (11.12)
The vector p0 is initialized with p0 = 1 for the starting point z0 в€€ [i w в€’
i
0
w/2, i w + w/2] and pj = 0 otherwise. For large n we can replace the above
equation by
pn в‰€ gi n . (11.13)
i

The constant gi is deп¬Ѓned as

gi = П†i /(П†T П€) ,

where П† denotes the right eigenvector of Q, i. e. Q П† = П†. Based on (11.12)
and (11.13) the probability mass and the cumulative distribution function of
11.2 Chart characteristics 247

the run length L can be approximated. (11.12) is used up to a certain n. If the
diп¬Ђerence between (11.12) and (11.13) is smaller than 10в€’9 , then exclusively
(11.13) is exploited. Remark, that the same is valid as for the AD. For the
two-sided CUSUM scheme (cusum2) the parameter r has to be small (в‰¤ 30).

11.2.1 Average Run Length and Critical Values

The spc quantlib provides the quantlets spcewma1arl,...,spccusumCarl for
computing the ARL of the corresponding SPC scheme. All routines need the
actual value of Вµ as a scalar or as a vector of several Вµ, two scheme param-
eters, and the integer r (see the beginning of the section). The XploRe ex-
ample XFGarl.xpl demonstrates all ...arl routines for k = 0.5, О» = 0.1,
zreflect= в€’4, r = 50, c = 3, in-control and out-of-control means Вµ0 = 0 and
Вµ1 = 1, respectively. The next table summarizes the ARL results

chart ewma1 ewma2 cusum1 cusum2 cusumC
L0 1694.0 838.30 117.56 58.780 76.748
L1 11.386 11.386 6.4044 6.4036 6.4716

XFGarl.xpl
Remember that the ARL of the two-sided CUSUM (cusum2) scheme is based on
the one-sided one, i. e. 58.78 = 117.56/2 and 6.4036 = (6.4044В·49716)/(6.4044+
49716) with 49716 = Lв€’1 .
For the setup of the SPC scheme it is usual to give the design parameter О» and
k for EWMA and CUSUM, respectively, and a value Оѕ for the in-control ARL.
Then, the critical value c (c2 or c3 ) is the solution of the equation LВµ0 (c) =
Оѕ. Here, the regula falsi is used with an accuracy of |LВµ0 (c) в€’ Оѕ| < 0.001.
The quantlet XFGc.xpl demonstrates the computation of the critical values
for SPC schemes with in-control ARLs of Оѕ = 300, reference value k = 0.5
(CUSUM), smoothing parameter О» = 0.1 (EWMA), zreflect= в€’4, and the
Markov chain parameter r = 50.

chart ewma1 ewma2 cusum1 cusum2 cusumC
c 2.3081 2.6203 3.8929 4.5695 4.288

XFGc.xpl
248 11 Statistical Process Control

The parameter r = 50 guarantees fast computation and suitable accuracy.
Depending on the power of the computer one can try values of r up to 1000 or
larger (see XFGrarl.xpl in the beginning of the section).

11.2.2 Average Delay

The usage of the routines for computing the Average Delay (AD) is similar
to the ARL routines. Replace only the code arl by ad. Be aware that the
computing time is larger than in case of the ARL, because of the computation
of the dominant eigenvalue. It would be better to choose smaller r, especially in
the case of the two-sided CUSUM. Unfortunately, there is no relation between
the one- and two-sided schemes as for the ARL in (11.11). Therefore, the library
computes the AD for the two-sided CUSUM based on a twodimensional Markov
chain with dimension (r + 1)2 Г— (r + 1)2 . Thus with values of r larger than 30,
the computing time becomes quite large. Here the results follow for the above

chart ewma1 ewma2 cusum1 cusum2 cusumC
D0 1685.8 829.83 114.92 56.047 74.495
D1 11.204 11.168 5.8533 5.8346 6.2858

11.2.3 Probability Mass and Cumulative Distribution
Function

The computation of the probability mass function (PMF) and of the cumulative
distribution function (CDF) is implemented in two diп¬Ђerent types of routines.
The п¬Ѓrst one with the syntax spcchartpmf returns the values of the PMF
P (L = n) and CDF P (L в‰¤ n) at given single points of n, where chart has
to be replaced by ewma1, ..., cusumC. The second one written as spcchartpmfm
computes the whole vectors of the PMF and of the CDF up to a given point
n, i. e. P (L = 1), P (L = 2), . . . , P (L = n) and the similar one of the CDF.
Note, that the same is valid as for the Average Delay (AD). In case of the
two-sided CUSUM scheme the computations are based on a twodimensional
11.2 Chart characteristics 249

Markov chain. A value of parameter r less than 30 would be computing time
friendly.
With the quantlet XFGpmf1.xpl the 5 diп¬Ђerent schemes (r = 50, for cusum2
r = 25) are compared according their in-control PMF and CDF (Вµ = Вµ0 = 0) at
the positions n in {1, 10, 20, 30, 50, 100, 200, 300}. Remark, that the in-control
ARL of all schemes is chosen as 300.

chart ewma1 ewma2 cusum1 cusum2 cusumC
6 В· 10в€’8 2 В· 10в€’9 6 В· 10в€’6 4 В· 10в€’7 2 В· 10в€’6
P (L = 1)
P (L = 10) 0.00318 0.00272 0.00321 0.00307 0.00320
P (L = 20) 0.00332 0.00324 0.00321 0.00325 0.00322
P (L = 30) 0.00315 0.00316 0.00310 0.00314 0.00311
P (L = 50) 0.00292 0.00296 0.00290 0.00294 0.00290
P (L = 100) 0.00246 0.00249 0.00245 0.00248 0.00245
P (L = 200) 0.00175 0.00177 0.00175 0.00176 0.00175
P (L = 300) 0.00125 0.00126 0.00124 0.00125 0.00125
6 В· 10в€’8 2 В· 10в€’9 6 В· 10в€’6 4 В· 10в€’7 2 В· 10в€’6
P (L = 1)
P (L в‰¤ 10) 0.01663 0.01233 0.02012 0.01675 0.01958
P (L в‰¤ 20) 0.05005 0.04372 0.05254 0.04916 0.05202
P (L в‰¤ 30) 0.08228 0.07576 0.08407 0.08109 0.08358
P (L в‰¤ 50) 0.14269 0.13683 0.14402 0.14179 0.14360
P (L в‰¤ 100) 0.27642 0.27242 0.27728 0.27658 0.27700
P (L в‰¤ 200) 0.48452 0.48306 0.48480 0.48597 0.48470
P (L в‰¤ 300) 0.63277 0.63272 0.63272 0.63476 0.63273

XFGpmf1.xpl

A more appropriate, graphical representation provides the quantlet
XFGpmf2.xpl. Figure 11.4 shows the corresponding graphs.
250 11 Statistical Process Control

run length cdf - in control

15
P(L<=n)*E-2
10
5
0

0 10 20 30 40 50
n

run length cdf - out of control
1
P(L<=n)
0.5

ewma2
cusumC
0

0 10 20 30 40 50
n
Figure 11.4. CDF for two-sided EWMA and CrosierвЂ™s CUSUM for
Вµ = 0 (in control) and Вµ = 1 (out of control)
XFGpmf2.xpl
11.3 Comparison with existing methods 251

11.3 Comparison with existing methods

11.3.1 Two-sided EWMA and Lucas/Saccucci

Here, we compare the ARL and AD computations of Lucas and Saccucci (1990)
with XploRe results. In their paper they use as in-control ARL Оѕ = 500. Then
for, e. g., О» = 0.5 and О» = 0.1 the critical values are 3.071 and 2.814, respec-
tively. By using XploRe the related values are 3.0712 and 2.8144, respectively.
It is known, that the smaller О» the worse the accuracy of the Markov chain
approach. Therefore, r is set greater for О» = 0.1 (r = 200) than for О» = 0.5
(r = 50). Table 11.1 shows some results of Lucas and Saccucci (1990) on
ARLs and ADs. Their results are based on the Markov chain approach as
well. However, they used some smaller matrix dimension and п¬Ѓtted a regres-
sion model on r (see Subsection 11.3.2). The corresponding XploRe results

Вµ 0 0.25 0.5 0.75 1 1.5 2 3 4 5
О» = 0.5
LВµ 500 255 88.8 35.9 17.5 6.53 3.63 1.93 1.34 1.07
DВµ 499 254 88.4 35.7 17.3 6.44 3.58 1.91 1.36 1.10
О» = 0.1
LВµ 500 106 31.3 15.9 10.3 6.09 4.36 2.87 2.19 1.94
DВµ 492 104 30.6 15.5 10.1 5.99 4.31 2.85 2.20 1.83

Table 11.1. ARL and AD values from Table 3 of Lucas and Saccucci
(1990)

by using the quantlet XFGlucsac.xpl coincide with the values of Lucas and
Saccucci (1990).

XFGlucsac.xpl

11.3.2 Two-sided CUSUM and Crosier

Crosier (1986) derived a new two-sided CUSUM scheme and compared it with
the established combination of two one-sided schemes. Recall Table 3 of Crosier
(1986), where the ARLs of the new and the old scheme were presented. The
reference value k is equal to 0.5. First, we compare the critical values. By
252 11 Statistical Process Control

Вµ 0 0.25 0.5 0.75 1 1.5 2 2.5 3 4 5
old scheme, h = 4
LВµ 168 74.2 26.6 13.3 8.38 4.74 3.34 2.62 2.19 1.71 1.31
new scheme, h = 3.73
LВµ 168 70.7 25.1 12.5 7.92 4.49 3.17 2.49 2.09 1.60 1.22
old scheme, h = 5
LВµ 465 139 38.0 17.0 10.4 5.75 4.01 3.11 2.57 2.01 1.69
new scheme, h = 4.713
LВµ 465 132 35.9 16.2 9.87 5.47 3.82 2.97 2.46 1.94 1.59

Table 11.2. ARLs from Table 3 of Crosier (1986)

using XploRe ( XFGcrosc.xpl) with r = 100 one gets c = 4.0021 (4), 3.7304
(3.73), 4.9997 (5), 4.7133 (4.713), respectively вЂ“ the original values of Crosier
are written in parentheses. By comparing the results of Table 11.2 with the
results obtainable by the quantlet XFGcrosarl.xpl (r = 100) it turns out,
that again the ARL values coincide with one exception only, namely L1.5 = 4.75
for the old scheme with h = 4.

XFGcrosarl.xpl

Further, we want to compare the results for the Average Delay (AD), which is
called Steady-State ARL in Crosier (1986). In Table 5 of Crosier we п¬Ѓnd the
related results. A slight modiп¬Ѓcation of the above quantlet XFGcrosarl.xpl

Вµ 0 0.25 0.5 0.75 1 1.5 2 2.5 3 4 5
old scheme, h = 4
LВµ 163 71.6 25.2 12.3 7.68 4.31 3.03 2.38 2.00 1.55 1.22
new scheme, h = 3.73
LВµ 164 69.0 24.3 12.1 7.69 4.39 3.12 2.46 2.07 1.60 1.29
old scheme, h = 5
LВµ 459 136 36.4 16.0 9.62 5.28 3.68 2.86 2.38 1.86 1.53
new scheme, h = 4.713
LВµ 460 130 35.1 15.8 9.62 5.36 3.77 2.95 2.45 1.91 1.57

11.4 Real data example вЂ“ monitoring CAPM 253

allows to compute the ADs. Remember, that the computation of the AD for
the two-sided CUSUM scheme is based on a twodimensional Markov chain.
Therefore the parameter r is set to 25 for the scheme called old scheme by
Crosier. The results are summarized in Table 11.4.

Вµ 0 0.25 0.5 0.75 1 1.5 2 2.5 3 4 5
old scheme, h = 4
LВµ 163 71.6 25.2 12.4 7.72 4.33 3.05 2.39 2.01 1.55 1.22
new scheme, h = 3.73
LВµ 165 69.1 24.4 12.2 7.70 4.40 3.12 2.47 2.07 1.60 1.29
old scheme, h = 5
LВµ 455 136 36.4 16.0 9.65 5.30 3.69 2.87 2.38 1.86 1.54
new scheme, h = 4.713
LВµ 460 130 35.1 15.8 9.63 5.37 3.77 2.95 2.45 1.91 1.57

values to Table 11.3 are printed as italics XFGcrosad.xpl

While the ARL values in the paper and computed by XploRe coincide, those
for the AD diп¬Ђer slightly. The most prominent deviation (459 vs. 455) one
observes for the old scheme with h = 5. One further in-control ARL diп¬Ђerence
one notices for the new scheme with h = 3.73. All other diп¬Ђerences are small.
There are diп¬Ђerent sources for the deviations:
T T
1. Crosier computed D(32) = (p32 L)/(p32 1) and not the actual limit D
(see 11.8, 11.10, and 11.12).
2. Crosier used ARL(r) = ARLв€ћ + B/r2 + C/r4 and п¬Ѓtted this model
for r = 8, 9, 10, 12, 15. Then, ARLв€ћ is used as п¬Ѓnal approximation. In
order to get the above D(32) one needs the whole vector L, such that this
approach might be more sensitive to approximation errors than in the
single ARL case.

11.4 Real data example вЂ“ monitoring CAPM
There are diп¬Ђerent ways of applying SPC to п¬Ѓnancial data. Here, we use a
twosided EWMA chart for monitoring the Deutsche Bank (DBK) share. More
254 11 Statistical Process Control

precisely, a capital asset pricing model (CAPM) is п¬Ѓtted for DBK and the DAX
which is used as proxy of the eп¬ѓcient market portfolio. That is, denoting with
rDAX,t and rDBK,t the log returns of the DAX and the DBK, respectively, one
assumes that the following regression model is valid:

rDBK,t = О± + ОІ rDAX,t + Оµt (11.14)

Usually, the parameters of the model are estimated by the ordinary least
squares method. The parameter ОІ is a very popular measure in applied п¬Ѓ-
nance, Elton and Gruber (1991). In order to construct a real portfolio, the
ОІ coeп¬ѓcient is frequently taken into account. Research has therefore concen-
trated on the appropriate estimation of constant and time changing ОІ. In the
context of SPC it is therefore useful to construct monitoring rules which signal
changes in ОІ. Contrary to standard SPC application in industry there is no
obvious state of the process which one can call вЂќin-controlвЂќ, i. e. there is no
target process. Therefore, pre-run time series of both quotes (DBK, DAX)
are exploited for building the in-control state. The daily quotes and log re-
turns, respectively, from january, 6th, 1995 to march, 18th, 1997 (about 450
observations) are used for п¬Ѓtting (11.14):

ANOVA SS df MSS F-test P-value
_________________________________________________________________________
Regression 0.025 1 0.025 445.686 0.0000
Residuals 0.025 448 0.000
Total Variation 0.050 449 0.000

Multiple R = 0.70619
R^2 = 0.49871
Standard Error = 0.00746

PARAMETERS Beta SE StandB t-test P-value
________________________________________________________________________
b[ 0,]= -0.0003 0.0004 -0.0000 -0.789 0.4307
b[ 1,]= 0.8838 0.0419 0.7062 21.111 0.0000

With b[1,] = ОІ = 0.8838 a typical value has been obtained. The R2 =
0.49871 is not very large. However, the simple linear regression is considered
in the sequel. The (empirical) residuals of the above model are correlated (see
Figure 11.5). The SPC application should therefore be performed with the
(standardized) residuals of an AR(1) п¬Ѓt to the regression residuals. For an
application of the XploRe quantlet armacls (quantlib times) the regression
residuals were standardized. By using the conditional least squares method an
11.4 Real data example вЂ“ monitoring CAPM 255

Sample partial autocorrelation function (pacf)
20
15
10
pacf*E-2
5
0
-5
-10

5 10 15 20
lag
Figure 11.5. Partial autocorrelation function of CAPM regression resid-
uals

estimate of Л† = 0.20103 for the AR(1) model

Оµt = Оµtв€’1 + О·t (11.15)

has been obtained. Eventually, by plugging in the estimates of О±, ОІ and ,
and standardizing with the sample standard deviation of the pre-run residuals
series (see (11.15)) one gets a series of uncorrelated data with expectation 0 and
variance 1, i. e. our in-control state. If the п¬Ѓtted model (CAPM with AR(1)
noise) remains valid after the pre-run, the related standardized residuals behave
like in the in-control state. Now, the application of SPC, more precisely of a
twosided EWMA chart, allows to monitor the series in order to get signals, if
the original model was changed. Changes in О± or ОІ in (11.14) or in in (11.15)
or in the residual variance of both models lead to shifts or scale changes in the
empirical residuals series. Hence, the probability of an alarm signaled by the
EWMA chart increases (with one exception only вЂ“ decreased variances). In
this way a possible user of SPC in п¬Ѓnance is able to monitor an estimated and
presumed CAPM.
256 11 Statistical Process Control

In our example we use the parameter О» = 0.2 and an in-control ARL of 500,
such that the critical value is equal to c = 2.9623 (the Markov chain parameter
r was set to 100). Remark, that the computation of c is based on the normality
assumption, which is seldom fulп¬Ѓlled for п¬Ѓnancial data. In our example the
hypothesis of normality is rejected as well with a very small p value (Jarque-
Bera test with quantlet jarber). The estimates of skewness 0.136805 and
kurtosis 6.64844 contradict normality too. The fat tails of the distribution are
a typical pattern of п¬Ѓnancial data. Usually, the fat tails lead to a higher false
alarm rate. However, it would be much more complicated to п¬Ѓt an appropriate
distribution to the residuals and use these results for the вЂќcorrectвЂќ critical
value.
The Figures 11.6 and 11.7 present the EWMA graphs of the pre-run and the
monitoring period (from march, 19th, 1997 to april, 16th, 1999). In the pre-run

Twosided EWMA chart
76 lambda = 0.20, in-control ARL = 499
1
0.5
Z_t
0
-0.5
-1

0 1 2 3 4
t*E2

Figure 11.6. Twosided EWMA chart of the standardized CAPM-AR(1)
residuals for the pre-run period (06/01/95 - 03/18/97)

period the EWMA chart signals 4 times. The п¬Ѓrst 3 alarms seem to be outliers,
while the last points on a longer change. Nevertheless, the chart performs quite
typical for the pre-run period. The п¬Ѓrst signal in the monitoring period was
obtained at the 64th observation (i. e. 06/24/97). Then, we observe more
frequently signals than in the pre-run period, the changes are more persistent
and so one has to assume, that the pre-run model is no longer valid. A new
CAPM has therefore to be п¬Ѓtted and, if necessary, the considered portfolio has
to be reweighted. Naturally, a new pre-run can be used for the new monitoring
11.4 Real data example вЂ“ monitoring CAPM 257

Twosided EWMA chart
64 lambda = 0.20, in-control ARL = 499
2
1
Z_t
0
-1
-2

0 1 2 3 4 5
t*E2

Figure 11.7. Twosided EWMA chart of the standardized CAPM-AR(1)
residuals for the monitoring period (03/19/97 - 04/16/99)

period.
XFGcapmar1.xpl

Bibliography
Brook, D. and Evans, D. A. (1972). An approach to the probability distribution
of cusum run length, Biometrika 59: 539вЂ“548.
Crosier, R. B. (1986). A new two-sided cumulative quality control scheme,
Technometrics 28: 187вЂ“194.

Crowder, S. V. (1986). A simple method for studying run-length distributions
of exponentially weighted moving average charts, Technometrics 29: 401вЂ“
407.
Elton, EJ. and Gruber, MJ. (1991). Modern portfolio theory and investment
analysis, Wiley, 4. edition.
Knoth, S. (1998). Quasi-stationarity of CUSUM schemes for Erlang distribu-
tions, Metrika 48: 31вЂ“48.
258 11 Statistical Process Control

Lorden, G. (1971). Procedures for reacting to a change in distribution, Annals
of Mathematical Statistics 42: 1897вЂ“1908.

Lucas, J. M. (1976). The design and use of V-mask control schemes, Journal
of Quality Technology 8: 1вЂ“12.
Lucas, J. M. and Crosier, R. B. (1982). Fast initial response for cusum quality-
205.
Lucas, J. M. and Saccucci, M. S. (1990). Exponentially weighted moving aver-
age control schemes: properties and enhancements, Technometrics 32: 1вЂ“
12.
Page, E. S. (1954). Continuous inspection schemes, Biometrika 41: 100вЂ“115.
Pollak, M. and Siegmund, D. (1975). Approximations to the expected sample
size of certain sequential tests, Annals of Statistics 3: 1267вЂ“1282.
Pollak, M. and Siegmund, D. (1985). A diп¬Ђusion process and its applications to
detection a change in the drift of brownian motion, Biometrika 72: 267вЂ“
280.
Roberts, S. W. (1959). Control-charts-tests based on geometric moving aver-
ages, Technometrics 1: 239вЂ“250.
Shewhart, W. A. (1931). Economic Control of Quality of Manufactured Product,
D. van Nostrand Company, Inc., Toronto, Canada.
Vance, L. (1986). ARL of cumulative sum control chart for controlling normal
mean, Journal of Quality Technology 18: 189вЂ“193.
Waldmann, K.-H. (1986). Bounds to the distribution of the run length in
general qualityвЂ“control schemes, Statistische Hefte 27: 37вЂ“56.
12 An Empirical Likelihood
Goodness-of-Fit Test for
Diп¬Ђusions
Song Xi Chen, Wolfgang HВЁrdle and Torsten Kleinow
a

The analysis and prediction of diп¬Ђusion processes plays a fundamental role
in the statistical analysis of п¬Ѓnancial markets. The techniques applied rely
on the actual model assumed for the drift and diп¬Ђusion coeп¬ѓcient functions.
Mismodelling these coeп¬ѓcients might result in biased prediction and incorrect
parameter speciп¬Ѓcation. We show in this chapter how the empirical likelihood
technique, Owen (1988) and Owen (1990), may be used to construct test pro-
cedures for the Goodness-of-Fit of a diп¬Ђusion model. The technique is based
on comparison with kernel smoothing estimators. The Goodness-of-Fit test
proposed is based on the asymptotics of the empirical likelihood, which has
two attractive features. One is its automatic consideration of the variation as-
sociated with the nonparametric п¬Ѓt due to the empirical likelihoodвЂ™s ability to
studentize internally. The other one is that the asymptotic distributions of the
test statistic are free of unknown parameters which avoids secondary plug-in
estimation.

12.1 Introduction
Let us assume a strictly stationary one-dimensional diп¬Ђusion Z solving the
stochastic diп¬Ђerential equation (SDE)

dZ(t) = m{Z(t)}dt + v{Z(t)}dW (t) (12.1)

where the driving process W = {W (t), t в€€ [0, в€ћ)} in (12.1) is a standard
Wiener process. In a mathematical п¬Ѓnance setting, Z might be the price process
260 12 An Empirical Likelihood Goodness-of-Fit Test for Diп¬Ђusions

of a stock, a stock market index or any other observable process. For the rest
of the chapter the drift m : R в†’ R, and the diп¬Ђusion coeп¬ѓcient v : R в†’ [0, в€ћ)
in (12.1) are assumed to be suп¬ѓciently smooth, so that a unique solution of
(12.1) exists.
In applications we are mostly interested in the stationary solutions of (12.1).
For the existence of a stationary solution, the drift and the diп¬Ђusion coeп¬ѓcient
must satisfy some conditions, Bibby and SГёrensen (1995). The most important
condition is that the stationary forward Kolmogorov equation

(1/2) v 2 (z)p(z) в€’ m(z)p(z) = 0

has a solution p(z) which is a probability density. If the initial value Z(0) is
distributed in accordance with p0 , and if it is independent of the Wiener process
W (t) in (12.1), then (12.1) deп¬Ѓnes a stationary process. The above condition
holds for the Ornstein-Uhlenbeck process with a normal stationary distribution,
and for the Cox-Ingersoll-Ross process with a О“-distribution. For the statistical
analysis we assume that Z is observed at discrete times ti = iв€†, i = 1, 2, . . . , n,
with a time step size в€† > 0. From these observations we get a time series Z в€†
with certain dynamics speciп¬Ѓed in Section 12.2.
The aim of this chapter is to test a parametric model for the drift function m
against a nonparametric alternative, i.e.

H0 (m) : m(z) = mОё (z) (12.2)

where Оё is an unknown parameter. The test statistic we apply is based on
the empirical likelihood. This concept was introduced by Chen, HВЁrdle and
a
Kleinow (2001) for time series. To apply it in our situation we start with the
discretization of the diп¬Ђusion process Z.

12.2 Discrete Time Approximation of a Diп¬Ђusion
Let us assume that the diп¬Ђusion process Z is observed at discrete times ti =
iв€†, i = 1, 2, . . . , n, with a time step size в€† > 0. Here we suppose that в€† is
small or, more precisely, will tend to zero asymptotically. Under rather weak
assumptions, see Kloeden and Platen (1999), on the functions m and v 2 , it can
be shown that the Euler approximation
t t
в€† в€† в€†
v Z в€† (tis ) dW (s)
Z (t) = Z (0) + m Z (tis ) ds + (12.3)
0 0
12.3 Hypothesis Testing 261

with tis = max{ti , ti в‰¤ s}, converges in a mean square sense to Z as в€† в†’ 0,
i.e.,
lim E( sup |Z в€† (t) в€’ Z(t)|2 ) = 0, T > 0. (12.4)
в€†в†’0 0в‰¤tв‰¤T

From now on, we assume that a discrete time approximation Z в€† exists in the
form of (12.3), and that the property (12.4) holds. For the purposes of this
chapter, в€† will always be considered small enough that one can substitute Z
by Z в€† in our interpretation of the observed data. The increments of the Euler
approximation and so the observed data will have the form
Z в€† (ti+1 ) в€’ Z в€† (ti ) = m Z в€† (ti ) в€† + v Z в€† (ti ) W (ti+1 ) в€’ W (ti ) (12.5)
for i = 0, 1, . . . .. The observations {Z в€† (ti )}, i = 0, 1, . . . n form a time series.
As long as the step size в€† is small enough the concrete choice of в€† does not
matter since all the relevant information about the model is contained in the
drift m and diп¬Ђusion coeп¬ѓcient v.
For the following we introduce the notation
def def
Z в€† (ti ) ,
Xi = X = (X1 , . . . , Xn )
def def
W (ti+1 ) в€’ W (ti ) ,
Оµi = Оµ = (Оµ1 , . . . , Оµn )
def def
Xi+1 в€’ Xi = m Xi в€† + v Xi Оµi ,
Yi = Y = (Y1 , . . . , Yn )(12.6)

We can now apply the empirical likelihood Goodness-of-Fit test for stationary
time series developed by Chen et al. (2001).

12.3 Hypothesis Testing
Suppose (X, Y ) is deп¬Ѓned as in (12.6) and let m(x) = E(Y |X = x) be the
conditional mean function, f be the density of the design points X, and Пѓ 2 (x) =
Var(Y |X = x) be the conditional variance function of Y given X = x в€€ S, a
closed interval S вЉ‚ R. Suppose that {mОё |Оё в€€ О˜} is a parametric model for the
Л†
mean function m and that Оё is an estimator of Оё under this parametric model.
The interest is to test the null hypothesis:
for all x в€€ S
H0 : m(x) = mОё (x)
against a series of local smooth nonparametric alternatives:
H1 : m(x) = mОё (x) + cn в€†n (x),
262 12 An Empirical Likelihood Goodness-of-Fit Test for Diп¬Ђusions

where cn is a non-random sequence tending to zero as n в†’ в€ћ and в€†n (x) is a
sequence of bounded functions.
The problem of testing against a nonparametric alternative is not new for an
independent and identically distributed setting, HВЁrdle and Mammen (1993)
a
and Hart (1997). In a time series context the testing procedure has only been
considered by KreiГџ, Neumann and Yao (1998) as far as we are aware. Also
theoretical results on kernel estimators for time series appeared only very re-
cently, Bosq (1998). This is surprising given the interests in time series for
п¬Ѓnancial engineering.
We require a few assumptions to establish the results in this chapter. These
assumptions are the following:

(i) The kernel K is Lipschitz continuous in [в€’1, 1], that is |K(t1 ) в€’ K(t2 )| в‰¤
C||t1 в€’ t2 || where || В· || is the Euclidean norm, and h = O{nв€’1/5 };
(ii) f , m and Пѓ 2 have continuous derivatives up to the second order in S.
Л†
(iii) Оё is a parametric estimator of Оё within the family of the parametric
model, and
sup |mОё (x) в€’ mОё (x)| = Op (nв€’1/2 ).
Л†
xв€€S

(iv) в€†n (x), the local shift in the alternative H1 , is uniformly bounded with
respect to x and n, and cn = nв€’1/2 hв€’1/4 which is the order of the diп¬Ђer-
ence between H0 and H1 .
(v) The process {(Xi , Yi )} is strictly stationary and О±-mixing, i.e.
def
|P(AB) в€’ P(A)P(B)| в‰¤ aПЃk
О±(k) = sup
в€ћ
i
Aв€€F1 ,Bв€€Fi+k

l
for some a > 0 and ПЃ в€€ [0, 1). Here Fk denotes the Пѓ-algebra of events
generated by {(Xi , Yi ), k в‰¤ i в‰¤ l} for l в‰Ґ k. For an introduction into
О±-mixing processes, see Bosq (1998) or Billingsley (1999). As shown by
Genon-Catalot, Jeantheau and LarВґdo (2000) this assumption is fulп¬Ѓlled
e
if Zt is an О±-mixing process.
(vi) E{exp(a0 |Y1 в€’ m(X1 )|)} < в€ћ for some a0 > 0; The conditional density
of X given Y and the joint conditional density of (X1 , Xl ) given (Y1 , Yl )
are bounded for all l > 1.
12.4 Kernel Estimator 263

Assumptions (i) and (ii) are standard in nonparametric curve estimation and
are satisп¬Ѓed for example for bandwidths selected by cross validation, whereas
(iii) and (iv) are common in nonparametric Goodness-of-Fit tests. Assumption
(v) means the data are weakly dependent. It is satisп¬Ѓed for a wide class of
diп¬Ђusion processes.

12.4 Kernel Estimator
To develop a test about H0 we п¬Ѓrst introduce a nonparametric kernel estimator
for m. For an introduction into kernel estimation see HВЁrdle (1990), Wand and
a
Jones (1995) and (HВЁrdle, MВЁller, Sperlich and Werwatz, 2000). Without loss
a u
of generality we assume that we are only interested in m(x) for x в€€ [0, 1] and
that f (x) в‰Ґ C1 в€Ђx в€€ [0, 1] with a positive constant C1 . If in a particular
problem the data are supported by another closed interval, this problem can
be transformed by rescaling into an equivalent problem with data support [0, 1].
Let K be a bounded probability density function with a compact support on
[в€’1, 1] that satisп¬Ѓes the moment conditions:

u2 K(u)du = ПѓK
2
uK(u)du = 0,

2
where ПѓK is a positive constant. Let h be a positive smoothing bandwidth
which will be used to smooth (X, Y ).
The nonparametric estimator considered is the Nadaraya-Watson (NW) esti-
mator n
Yi Kh (x в€’ Xi )
m(x) = i=1
Л† (12.7)
n
i=1 Kh (x в€’ Xi )

with Kh (u) = hв€’1 K(hв€’1 u). This estimator is calculated in XploRe by the
quantlets regest or regxest.
The parameter estimation of Оё depends on the null hypothesis. We assume
в€љ
here, that the parameter Оё is estimated by a n-consistent estimator. Let

Kh (x в€’ Xi )mОё (Xi )
Л†
mОё (x) =
ЛњЛ† n
i=1 Kh (x в€’ Xi )

be the smoothed parametric model. The test statistic we are going to consider
is based on the diп¬Ђerence between mОё and m, rather than directly between m
ЛњЛ† Л† Л†
264 12 An Empirical Likelihood Goodness-of-Fit Test for Diп¬Ђusions

and mОё , in order to avoid the issue of bias associated with the nonparametric
Л†
п¬Ѓt.
The local linear estimator can be used to replace the NW estimator in estimat-
ing m. However, as we compare m with mОё in formulating the Goodness-of-Fit
Л† ЛњЛ†
test, the possible bias associated with the NW estimator is not an issue here.
In addition, the NW estimator has a simpler analytic form.

12.5 The Empirical Likelihood concept

12.5.1 Introduction into Empirical Likelihood

Let us now as in Owen (1988) and Owen (1990) introduce the empirical likeli-
hood (EL) concept. Suppose a sample (U1 , . . . , Un ) of independent identically
distributed random variables in R1 according to a probability law with un-
known distribution function F and unknown density f . For an observation
(u1 , . . . , un ) of (U1 , . . . , Un ) the likelihood function is given by
n
ВЇ
L(f ) = f (ui ) (12.8)
i=1

The empirical density calculated from the observations (u1 , . . . , un ) is
n
1
def
fn (u) = 1{ui = u} (12.9)
n i=1

where 1 denotes the indicator function. It is easy to see that fn maximizes
ВЇ
L(f ) in the class of all probability density functions.
The objective of the empirical likelihood concept is the construction of tests and
conп¬Ѓdence intervals for a parameter Оё = Оё(F ) of the distribution of Ui . To keep
things simple we illustrate the empirical likelihood method for the expectation
E[Ui ]. The null hypothesis is E[Ui ] = Оё. We can test this assumption based on
the empirical likelihood ratio
ВЇ
L{f (Оё)}
def
R(F ) = (12.10)
ВЇ
L(fn )
ВЇ
where f (Оё) maximizes L(f ) subject to

Ui dF = Оё. (12.11)
12.5 The Empirical Likelihood concept 265

On a heuristic level we can reject the null hypothesis вЂњunder the true distribu-
tion F , U has expectation ОёвЂќ if the ratio R(F ) is small relative to 1, i.e. the
test rejects if R(F ) < r for a certain level r в€€ (0, 1). More precisely, Owen
(1990) proves the following

THEOREM 12.1 Let (U1 , . . . , Un ) be iid one-dimensional random variables
with expectation Оё and variance Пѓ 2 . For a positive r < 1 let

Fn , R(F ) в‰Ґ r
Cr,n = Ui dF F

be the set of all possible expectations of U with respect to distributions F dom-
inated by Fn (F Fn ). Then it follows
lim P[Оё в€€ Cr,n ] = P[П‡2 в‰¤ в€’2 log r] (12.12)
nв†’в€ћ

where П‡2 is a П‡2 -distributed random variable with one degree of freedom.

From Theorem 12.1 it follows directly

в‰¤ r EUi = Оё = P[П‡2 в‰¤ r]
lim P в€’ 2 log max R(F )
nв†’в€ћ {F |F Fn , Ui dF =Оё}

This result suggests therefore to use the log-EL ratio
ВЇ
L{f (Оё)}
в€’2 log = в€’2 log
max R(F ) max ВЇ
Ui dF =Оё} L(fn )
{F |F {F |F
Fn , Ui dF =Оё} Fn ,

as the basic element of a test about a parametric hypothesis for the drift func-
tion of a diп¬Ђusion process.

12.5.2 Empirical Likelihood for Time Series Data

We will now expand the results in Section 12.5.1 to the case of time series data.
For an arbitrary x в€€ [0, 1] and any function Вµ we have
x в€’ Xi
{Yi в€’ Вµ(x)} E[Yi |Xi = x] = Вµ(x) = 0.
EK (12.13)
h
Let pi (x) be nonnegative numbers representing a density for
x в€’ Xi
{Yi в€’ Вµ(x)}
K i = 1, . . . , n
h
266 12 An Empirical Likelihood Goodness-of-Fit Test for Diп¬Ђusions

The empirical likelihood for Вµ(x) is
n
def
L{Вµ(x)} = max pi (x) (12.14)
i=1
n n xв€’Xi
{Yi в€’ Вµ(x)} = 0. The
subject to i=1 pi (x) = 1 and pi (x)K
i=1 h
second condition reп¬‚ects (12.13).
We п¬Ѓnd the maximum by introducing Lagrange multipliers and maximizing the
Lagrangian function
n
L(p, О»1 , О»2 ) = log pi (x)
i=1
n n
x в€’ Xi
в€’О»1 {Yi в€’ Вµ(x)} в€’ О»2 pi (x) в€’ 1
pi (x)K
h
i=1 i=1

The partial derivatives are
x в€’ Xi
в€‚L(p, О»1 , О»2 ) 1
в€’ О»1 K {Yi в€’ Вµ(x)} в€’ О»2 в€Ђi = 1, . . . , n .
=
в€‚pi (x) pi (x) h
With О» = О»1 /О»2 we obtain as a solution to (12.14) the optimal weights
в€’1
x в€’ Xi
в€’1
{Yi в€’ Вµ(x)}
pi (x) = n 1 + О»(x)K (12.15)
h
where О»(x) is the root of
n xв€’Xi
{Yi в€’ Вµ(x)}
K h
= 0. (12.16)
О»(x)K xв€’Xi {Yi в€’ Вµ(x)}
1+ h
i=1

Note, that О»2 = n follows from
n n
x в€’ Xi
{Yi в€’ Вµ(x)} = 1 .
pi (x) + О» pi (x)K
h
i=1 i=1

The maximum empirical likelihood is achieved at pi (x) = nв€’1 corresponding
Л†
to the nonparametric curve estimate Вµ(x) = m(x). For a parameter estimate Оё
Л†
we get the maximum empirical likelihood for the smoothed parametric model
L{mОё (x)}. The log-EL ratio is
ЛњЛ†
L{mОё (x)}
ЛњЛ†
def
= в€’2 log[L{mОё (x)}nn ].
{mОё (x)} = в€’2 log
ЛњЛ† ЛњЛ†
L{m(x)}
Л†
12.5 The Empirical Likelihood concept 267

To study properties of the empirical likelihood based test statistic we need to
evaluate {mОё (x)} at an arbitrary x п¬Ѓrst, which requires the following lemma
ЛњЛ†
on О»(x) that is proved in Chen et al. (2001).

LEMMA 12.1 Under the assumptions (i)-(vi),

sup |О»(x)| = Op {(nh)в€’1/2 log(n)}.
xв€€[0,1]

Let Оі(x) be a random process with x в€€ [0, 1]. Throughout this chapter we use
Лњ
the notation Оі(x) = Op (Оґn ) ( Op (Оґn )) to denote the facts that supxв€€[0,1] |Оі(x)| =
Лњ
Op (Оґn ) (Op (Оґn )) for a sequence Оґn .
j
n
ВЇ xв€’Xi
Let Uj (x) = (nh)в€’1 {Yi в€’ mОё (x)}
K ЛњЛ† for j = 1, 2, . . .. An appli-
i=1 h

cation of the power series expansion of 1/(1 в€’ вЂў) applied to (12.16) and Lemma
12.1 yields
в€ћ
n
x в€’ Xi x в€’ Xi
(в€’О»(x))j K j {Yi в€’ mОё (x)}j = 0.
{Yi в€’ mОё (x)}
K ЛњЛ† ЛњЛ†
h h
i=1 j=0

Inverting the above expansion, we have
ВЇ в€’1 ВЇ
О»(x) = U2 (x)U1 (x) + Op {(nh)в€’1 log2 (n)}.
Лњ (12.17)

From (12.15), Lemma 12.1 and the Taylor expansion of log(1 + вЂў) we get

{mОё (x)} = в€’2 log[L{mОё (x)}nn ]
ЛњЛ† ЛњЛ†
n
x в€’ Xi
{Yj в€’ mОё (x)}]
= 2 log[1 + О»(x)K ЛњЛ†
h
i=1
ВЇ ВЇ
2nhО»(x)U1 в€’ nhО»2 (x)U2 + Op {(nh)в€’1/2 log3 (n)}
Лњ
=
(12.18)

Inserting (12.17) in (12.18) yields
ВЇ в€’1 ВЇ2
{mОё (x)} = nhU2 (x)U1 (x) + Op {(nh)в€’1/2 log3 (n)}.
Лњ
ЛњЛ† (12.19)

For any x в€€ [0, 1], let
1 1
2
в€’ y)dy and b(x; h) = h Kh (x в€’ y)dy
v(x; h) = h Kh (x
0 0
268 12 An Empirical Likelihood Goodness-of-Fit Test for Diп¬Ђusions

be the variance and the bias coeп¬ѓcient functions associated with the NW esti-
mator, respectively, see Wand and Jones (1995). Let
SI,h = {x в€€ [0, 1]| min (|x в€’ 1|, |x|) > h}.
For h в†’ 0, SI,h converges to the set of interior points in [0, 1]. If x в€€ SI,h , we
def
K 2 (x)dx and b(x; h) = 1. Deп¬Ѓne
have v(x; h) =
v(x; h)Пѓ 2 (x)
V (x; h) = .
f (x)b2 (x; h)
Clearly, V (x; h)/(nh) is the asymptotic variance of m(x) when nh в†’ в€ћ which
Л†
is one of the conditions we assumed.
It was shown by Chen et al. (2001), that
n
ВЇ в€’1
Kh (x в€’ Xi ){Yi в€’ mОё (x)}
U1 (x) =n ЛњЛ†
i=1
n
Лњ
= nв€’1 Kh (x в€’ Xi ){Yi в€’ mОё (Xi )} + Op (nв€’1/2 )
i=1
Л† Лњ
= f (x){m(x) в€’ mОё (x)} + Op (nв€’1/2 )
Л† Лњ
Лњ
= f (x)b(x; h){m(x) в€’ mОё (x)} + Op {nв€’1/2 + (nh)в€’1 log2 (n)}.
Л† Лњ
ВЇ
In the same paper it is shown, that condition (iii) entails supxв€€[0,1] |U2 (x) в€’
f (x)v(x; h)Пѓ 2 (x)| = Op (h). These and (12.19) mean that
ВЇ в€’1 ВЇ 2 Лњ
(nh)U2 U1 + Op {(nh)в€’1/2 log3 (n)}
{mОё (x)} =
ЛњЛ†
Лњ
= V в€’1 (x; h){m(x) в€’ mОё (x)}2 + O{(nh)в€’1 h log2 (n)}
Л† Лњ (12.20)
Therefore, {mОё (x)} is asymptotically equivalent to a studentized L2 distance
ЛњЛ†
between mОё (x) and m(x). It is this property that leads us to use {mОё (x)} as
ЛњЛ† Л† ЛњЛ†
the basic building block in the construction of a global test statistic for distinc-
tion between mОё and m in the next section. The use of the empirical likelihood
ЛњЛ† Л†
as a distance measure and its comparison with other distance measures have
been discussed in Owen (1991) and Baggerly (1998).

12.6 Goodness-of-Fit Statistic
To extend the empirical likelihood ratio statistic to a global measure of
Goodness-of-Fit, we choose kn -equally spaced lattice points t1 , t2 , В· В· В· , tkn in
12.6 Goodness-of-Fit Statistic 269

[0, 1] where t1 = 0, tkn = 1 and ti в‰¤ tj for 1 в‰¤ i < j в‰¤ kn . We let kn в†’ в€ћ
and kn /n в†’ 0 as n в†’ в€ћ. This essentially divides [0, 1] into kn small bins of
size (kn )в€’1 . A simple choice is to let kn = [1/(2h)] where [a] is the largest
integer less than a. This choice as justiп¬Ѓed later ensures asymptotic indepen-
dence among {mОё (tj )} at diп¬Ђerent tj s. Bins of diп¬Ђerent size can be adopted to
ЛњЛ†
suit situations where there are areas of low design density. This corresponds to
the use of diп¬Ђerent bandwidth values in adaptive kernel smoothing. The main
results of this chapter is not aп¬Ђected by un-equal bins. For the purpose of easy
presentation, we will treat bins of equal size.
As {mОё (tj )} measures the Goodness-of-Fit at a п¬Ѓxed tj , an empirical likelihood
ЛњЛ†
based statistic that measures the global Goodness-of-Fit is deп¬Ѓned as
kn
def
{mОё (tj )}.
n (mОё )
ЛњЛ† = ЛњЛ†
j=1

The following theorem was proven by Chen et al. (2001).

THEOREM 12.2 Under the assumptions (i) - (vi),

{m(x) в€’ mОё (x)}2
Л† Лњ
в€’1 в€’1
dx + Op {kn log2 (n) + h log2 (n)}
kn n (mОё )
ЛњЛ† = (nh)
V (x)
(12.21)
def
where V (x) = limhв†’0 V (x, h).

HВЁrdle and Mammen (1993) proposed the L2 distance
a

Tn = nh1/2 {m(x) в€’ mОё (x)}2 ПЂ(x)dx
Л† ЛњЛ†

as a measure of Goodness-of-Fit where ПЂ(x) is a given weight function.
в€’1
 << стр. 9(всего 14)СОДЕРЖАНИЕ >>