more.
5. The distribution of positive and negative price changes was approximately
symmetric.

Consider next the number of transactions in a 5-minute time interval. Denote the
series by xt . That is, x1 is the number of IBM transactions from 9:30 am to 9:35 am
on November 1, 1990 Eastern time, x2 is the number of transactions from 9:35 am to
9:40 am, and so on. The time gaps between trading days are ignored. Figure 5.1(a)
shows the time plot of xt , and Figure 5.1(b) the sample ACF of xt for lags 1 to 260. Of
particular interest is the cyclical pattern of the ACF with a periodicity of 78, which
is the number of 5-minute intervals in a trading day. The number of transactions
thus exhibits a daily pattern. To further illustrate the daily trading pattern, Figure 5.2
shows the average number of transactions within 5-minute time intervals over the 63
days. There are 78 such averages. The plot exhibits a "smiling" or "U" shape,
indicating heavier trading at the opening and closing of the market and thinner
trading during the lunch hours.
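The binning and averaging described above can be sketched as follows. This is a minimal illustration, not the procedure used to produce Figures 5.1 and 5.2: the function names, the `(day, seconds_from_midnight)` input format, the toy sample, and the choice to exclude a trade stamped exactly at 4:00 pm are all assumptions.

```python
from collections import defaultdict

OPEN_SEC = 9 * 3600 + 30 * 60    # 9:30 am, in seconds from midnight
CLOSE_SEC = 16 * 3600            # 4:00 pm
BIN_SEC = 300                    # 5-minute intervals
BINS_PER_DAY = (CLOSE_SEC - OPEN_SEC) // BIN_SEC

def five_minute_counts(trades):
    """trades: iterable of (day, seconds_from_midnight) pairs (assumed format).
    Returns {day: [number of trades in each 5-minute interval]}."""
    counts = defaultdict(lambda: [0] * BINS_PER_DAY)
    for day, sec in trades:
        if OPEN_SEC <= sec < CLOSE_SEC:          # normal trading hours only
            counts[day][(sec - OPEN_SEC) // BIN_SEC] += 1
    return dict(counts)

def average_by_bin(counts):
    """Average the counts across days, one value per intraday interval."""
    n_days = len(counts)
    return [sum(day_counts[b] for day_counts in counts.values()) / n_days
            for b in range(BINS_PER_DAY)]

# Made-up sample: two days with a handful of trades each.
toy = [(1, 34200), (1, 34210), (1, 34500), (2, 34250), (2, 57599)]
avg = average_by_bin(five_minute_counts(toy))
```

Concatenating the per-day lists in calendar order gives the series xt; the list returned by `average_by_bin` corresponds to the 78 averages plotted in Figure 5.2.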
Since we focus on transactions that occurred in the normal trading hours of a
trading day, there are 59,838 time intervals in the data. These intervals are called the
intraday durations between trades. For IBM stock, there were 6531 zero time inter-
vals. That is, during the normal trading hours of the 63 trading days from Novem-
ber 1, 1990 to January 31, 1991, multiple transactions in a second occurred 6531
times, which is about 10.91%. Among these multiple transactions, 1002 of them had
Figure 5.1. IBM intraday transactions data from 11/01/90 to 1/31/91: (a) the number of
transactions in 5-minute time intervals, plotted against the interval index, and (b) the
sample ACF of the series in part (a), plotted against the lag.
Figure 5.2. Time plot of the average number of transactions in 5-minute time intervals. There
are 78 observations, averaging over the 63 trading days from 11/01/90 to 1/31/91 for IBM
stock.

184 HIGH-FREQUENCY DATA

Table 5.2. Two-Way Classification of Price Movements in Consecutive Intraday Trades
for IBM Stock. The Price Movements Are Classified Into "Up," "Unchanged," and
"Down." The Data Span Is From 11/01/90 to 1/31/91.

                          ith trade
(i − 1)th trade      "+"      "0"      "−"    Margin
"+"                  441     5498     3948      9887
"0"                 4867    29779     5473     40119
"−"                 4580     4841      410      9831
Margin              9888    40118     9831     59837

different prices, which is about 1.67% of the total number of intraday transactions.
Therefore, multiple transactions (i.e., zero durations) may become an issue in statis-
tical modeling of the time durations between trades.
Table 5.2 provides a two-way classification of price movements. Here price
movements are classified into "up," "unchanged," and "down." We denote them by "+,"
"0," and "−," respectively. The table shows the price movements between two
consecutive trades (i.e., from the (i − 1)th to the ith transaction) in the sample. From the
table, we see that

1. consecutive price increases or decreases are relatively rare, occurring in about
441/59837 = 0.74% and 410/59837 = 0.69% of the cases, respectively;
2. a move from "up" to "unchanged" is slightly more likely than a move from "up"
to "down"; see row 1 of the table;
3. there is a high tendency for the price to remain "unchanged";
4. the probabilities of moving from "down" to "up" or to "unchanged" are about the
same; see row 3.
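The percentages quoted in these observations follow directly from the counts in Table 5.2. The sketch below recomputes them; the dictionary layout and the helper name `row_probs` are illustrative choices, not part of the original analysis.

```python
# Counts from Table 5.2: outer key = movement at the (i-1)th trade,
# inner key = movement at the ith trade.
counts = {
    "+": {"+": 441,  "0": 5498,  "-": 3948},
    "0": {"+": 4867, "0": 29779, "-": 5473},
    "-": {"+": 4580, "0": 4841,  "-": 410},
}
total = sum(sum(row.values()) for row in counts.values())

# Joint relative frequencies of two consecutive moves in the same direction.
joint_up_up = counts["+"]["+"] / total
joint_down_down = counts["-"]["-"] / total

def row_probs(prev):
    """Relative frequencies of the ith movement given the (i-1)th movement."""
    row = counts[prev]
    n = sum(row.values())
    return {move: c / n for move, c in row.items()}

after_up = row_probs("+")      # row 1 of the table
after_down = row_probs("-")    # row 3 of the table
```

Row 1 shows that "unchanged" is more frequent than "down" after an "up" move, and row 3 shows that "up" and "unchanged" are nearly equally frequent after a "down" move.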

The first observation mentioned before is a clear demonstration of bid-ask bounce,
showing price reversals in intraday transactions data. To confirm this phenomenon,
we consider a directional series Di for price movements, where Di assumes the value
+1, 0, −1 for "up," "unchanged," and "down" price movement, respectively, for the
ith transaction. The ACF of {Di} has a single spike at lag 1 with value −0.389, which
is highly significant for a sample size of 59,837 and confirms the price reversal in
consecutive trades.
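The lag-1 value can be checked, approximately, from the counts in Table 5.2 alone. The sketch below computes the Pearson correlation between the directions of consecutive trades, coded as +1, 0, and −1. This is an assumption-laden shortcut: the usual sample ACF uses a single overall mean and variance, so the two quantities agree only approximately.

```python
import math

value = {"+": 1, "0": 0, "-": -1}
# Counts from Table 5.2: outer key = (i-1)th trade, inner key = ith trade.
counts = {
    "+": {"+": 441,  "0": 5498,  "-": 3948},
    "0": {"+": 4867, "0": 29779, "-": 5473},
    "-": {"+": 4580, "0": 4841,  "-": 410},
}
n = sum(c for row in counts.values() for c in row.values())

def expect(f):
    """Sample average of f(D_{i-1}, D_i) over all consecutive pairs."""
    return sum(f(value[a], value[b]) * c
               for a, row in counts.items()
               for b, c in row.items()) / n

mean_prev = expect(lambda x, y: x)
mean_cur = expect(lambda x, y: y)
var_prev = expect(lambda x, y: x * x) - mean_prev ** 2
var_cur = expect(lambda x, y: y * y) - mean_cur ** 2
cov = expect(lambda x, y: x * y) - mean_prev * mean_cur

rho1 = cov / math.sqrt(var_prev * var_cur)   # lag-1 correlation of {D_i}
```

The negative sign comes from the large off-diagonal counts (3948 and 4580), which are the reversals produced by bid-ask bounce.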
As a second illustration, we consider the transactions data of IBM stock in
December 1999 obtained from the TAQ database. The normal trading hours are from
9:30 am to 4:00 pm Eastern time, except for December 31 when the market closed
at 1:00 pm. Compared with the 1990–1991 data, two important changes have
occurred. First, the number of intraday trades has increased sixfold, which
also increased the chance of multiple transactions within a second. The percentage
of trades with zero time duration doubled to 22.98%. At the extreme, there were
EMPIRICAL CHARACTERISTICS

Figure 5.3. IBM transactions data for December 1999. The plot shows the number of
transactions in each trading day, split into regular and after-hours portions, with the
after-hours portion denoting the number of trades with time stamp after 4:00 pm.

42 transactions within a given second that happened twice on December 3, 1999.
Second, the tick size of price movement was \$1/16 = \$0.0625 instead of \$1/8.
The change in tick size should reduce the bid-ask spread. Figure 5.3 shows the daily
number of transactions in the new sample. Figure 5.4(a) shows the time plot of time
durations between trades, measured in seconds, and Figure 5.4(b) is the time plot of
price changes in consecutive intraday trades, measured in multiples of the tick size
of \$1/16. As expected, Figures 5.3 and 5.4(a) show clearly the inverse relationship
between the daily number of transactions and the time interval between trades. Fig-
ure 5.4(b) shows two unusual price movements for IBM stock on December 3, 1999.
They were a drop of 63 ticks followed by an immediate jump of 64 ticks and a drop
of 68 ticks followed immediately by a jump of 68 ticks. Unusual price movements
like these occurred infrequently in intraday transactions.
Focusing on trades recorded within the regular trading hours, we have 61,149
trades out of 133,475 with no price change. This is about 45.8% and substantially
lower than that between November 1990 and January 1991. It seems that reducing
the tick size increased the chance of a price change. Table 5.3 gives the percentages
of trades associated with a price change. The price movements remain approximately
symmetric with respect to zero. Large price movements in intraday tradings are still
relatively rare.

Remark: The record keeping of high-frequency data is often not as good as that
of observations taken at lower frequencies. Data cleaning becomes a necessity in

Figure 5.4. IBM transactions data for December 1999. Part (a) is the time plot of time
durations between trades and part (b) is the time plot of price changes in consecutive trades
measured in multiples of the tick size of \$1/16. Only data in the normal trading hours are included.

high-frequency data analysis. For transactions data, missing observations may hap-
pen in many ways, and the accuracy of the exact transaction time might be question-
able for some trades. For example, recorded trading times may be beyond 4:00 pm
Eastern time even before the opening of after-hours tradings. How to handle these
observations deserves a careful study. A proper method of data cleaning requires a

Table 5.3. Percentages of Intraday Transactions Associated with a Price Change for IBM
Stock Traded in December 1999. The Percentage of Transactions without Price Change
Is 45.8% and the Total Number of Transactions Recorded within the Regular Trading
Hours Is 133,475. The Size Is Measured in Multiples of Tick Size \$1/16.

(a) Upward movements
size              1       2       3       4       5       6       7      >7
percentage    18.03    5.80    1.79    0.66    0.25    0.15    0.09    0.32

(b) Downward movements
size              1       2       3       4       5       6       7      >7
percentage    18.24    5.57    1.79    0.71    0.24    0.17    0.10    0.31

deep understanding of the way by which the market operates. As such, it is important
to specify clearly and precisely the methods used in data cleaning. These methods
must be taken into consideration in making inference.

Again, let $t_i$ be the calendar time, measured in seconds from midnight, when
the ith transaction took place. Let $P_{t_i}$ be the transaction price. The price change
from the (i − 1)th to the ith trade is $y_i \equiv \Delta P_{t_i} = P_{t_i} - P_{t_{i-1}}$ and the time duration
is $\Delta t_i = t_i - t_{i-1}$. Here it is understood that the subscript $i$ in $\Delta t_i$ and $y_i$ denotes the
time sequence of transactions, not the calendar time. In what follows, we consider
models for $y_i$ and $\Delta t_i$ both individually and jointly.
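As a small illustration of these definitions, the sketch below turns a list of trade times and prices into price changes and durations. The function name, the made-up sample values, and the conversion of the price change into ticks (using the \$1/8 tick of the 1990–1991 sample) are assumptions made for the example.

```python
TICK = 1 / 8   # tick size in dollars for the 1990-1991 sample

def changes_and_durations(times, prices, tick=TICK):
    """times: seconds from midnight; prices: dollars; both in trade order.
    Returns (price changes in ticks, durations in seconds) for trades 2..n."""
    y = [round((prices[i] - prices[i - 1]) / tick) for i in range(1, len(prices))]
    dt = [times[i] - times[i - 1] for i in range(1, len(times))]
    return y, dt

# Made-up trades: the first two share a time stamp, giving a zero duration.
times = [34200, 34200, 34215, 34260]
prices = [110.000, 110.125, 110.125, 109.875]
y, dt = changes_and_durations(times, prices)
```

Both output lists are indexed by trade sequence, not by calendar time, which is why zero durations can appear.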

5.4 MODELS FOR PRICE CHANGES

The discreteness and concentration on "no change" make it difficult to model the
intraday price changes. Campbell, Lo, and MacKinlay (1997) discuss several
econometric models that have been proposed in the literature. Here we mention two
models that have the advantage of employing explanatory variables to study the intraday
price movements. The first model is the ordered probit model used by Hausman,
Lo, and MacKinlay (1992) to study the price movements in transactions data. The
second model has been considered recently by McCulloch and Tsay (2000) and is a
simplified version of the model proposed by Rydberg and Shephard (1998); see also
Ghysels (2000).

5.4.1 Ordered Probit Model
Let $y_i^*$ be the unobservable price change of the asset under study (i.e.,
$y_i^* = P_{t_i}^* - P_{t_{i-1}}^*$), where $P_t^*$ is the virtual price of the asset at time $t$. The ordered
probit model assumes that $y_i^*$ is a continuous random variable and follows the model

$$y_i^* = x_i \beta + \epsilon_i, \tag{5.15}$$

where $x_i$ is a $p$-dimensional row vector of explanatory variables available at time
$t_{i-1}$, $\beta$ is a $p \times 1$ parameter vector, $E(\epsilon_i \mid x_i) = 0$, $\mathrm{Var}(\epsilon_i \mid x_i) = \sigma_i^2$, and
$\mathrm{Cov}(\epsilon_i, \epsilon_j) = 0$ for $i \neq j$. The conditional variance $\sigma_i^2$ is assumed to be a
positive function of the explanatory variable $w_i$; that is,

$$\sigma_i^2 = g(w_i), \tag{5.16}$$

where $g(\cdot)$ is a positive function. For financial transactions data, $w_i$ may contain the
time interval $t_i - t_{i-1}$ and some conditional heteroscedastic variables. Typically, one
also assumes that the conditional distribution of $\epsilon_i$ given $x_i$ and $w_i$ is Gaussian.
Suppose that the observed price change $y_i$ may assume $k$ possible values. In
theory, $k$ can be infinite, but countable. In practice, $k$ is finite and may involve
combining several categories into a single value. For example, we have $k = 7$ in Table 5.1,
where the first value "−3 ticks" means that the price change is −3 ticks or lower. We
denote the $k$ possible values as $\{s_1, \ldots, s_k\}$. The ordered probit model postulates the
relationship between $y_i$ and $y_i^*$ as

$$y_i = s_j \quad \text{if } \alpha_{j-1} < y_i^* \le \alpha_j, \qquad j = 1, \ldots, k, \tag{5.17}$$

where the $\alpha_j$ are real numbers satisfying $-\infty = \alpha_0 < \alpha_1 < \cdots < \alpha_{k-1} < \alpha_k = \infty$.
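Under the assumption of conditional Gaussian distribution, we have

$$
\begin{aligned}
P(y_i = s_j \mid x_i, w_i) &= P(\alpha_{j-1} < x_i\beta + \epsilon_i \le \alpha_j \mid x_i, w_i) \\
&= \begin{cases}
P(x_i\beta + \epsilon_i \le \alpha_1 \mid x_i, w_i) & \text{if } j = 1 \\
P(\alpha_{j-1} < x_i\beta + \epsilon_i \le \alpha_j \mid x_i, w_i) & \text{if } j = 2, \ldots, k-1 \\
P(\alpha_{k-1} < x_i\beta + \epsilon_i \mid x_i, w_i) & \text{if } j = k
\end{cases} \\
&= \begin{cases}
\Phi\left[\dfrac{\alpha_1 - x_i\beta}{\sigma_i(w_i)}\right] & \text{if } j = 1 \\
\Phi\left[\dfrac{\alpha_j - x_i\beta}{\sigma_i(w_i)}\right] - \Phi\left[\dfrac{\alpha_{j-1} - x_i\beta}{\sigma_i(w_i)}\right] & \text{if } j = 2, \ldots, k-1 \\
1 - \Phi\left[\dfrac{\alpha_{k-1} - x_i\beta}{\sigma_i(w_i)}\right] & \text{if } j = k,
\end{cases}
\end{aligned}
\tag{5.18}
$$
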
where $\Phi(x)$ is the cumulative distribution function of the standard normal random
variable evaluated at $x$, and we write $\sigma_i(w_i)$ to denote that $\sigma_i^2$ is a positive function
of $w_i$. From the definition, an ordered probit model is driven by an unobservable
continuous random variable. The observed values, which have a natural ordering,
can be regarded as categories representing the underlying process.
The ordered probit model contains parameters $\beta$, $\alpha_i$ ($i = 1, \ldots, k-1$), and those
in the conditional variance function $\sigma_i(w_i)$ in Eq. (5.16). These parameters can be
estimated by the maximum likelihood or Markov chain Monte Carlo methods.
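The category probabilities in Eq. (5.18) are straightforward to evaluate once the linear predictor, the conditional standard deviation, and the boundaries are given. The following is a minimal sketch using only the standard library; the function names are illustrative, the boundary values are the estimates reported in Table 5.4 of Example 5.1, and the choices of a zero linear predictor and unit standard deviation are assumptions made purely for the demonstration.

```python
import math

def std_normal_cdf(x):
    """Phi(x), written in terms of the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def ordered_probit_probs(xb, sigma, alphas):
    """Probabilities of the k ordered categories in Eq. (5.18).

    xb     : linear predictor x_i * beta
    sigma  : conditional standard deviation sigma_i(w_i)
    alphas : interior boundaries alpha_1 < ... < alpha_{k-1}
    """
    cdf = [std_normal_cdf((a - xb) / sigma) for a in alphas]
    bounds = [0.0] + cdf + [1.0]     # Phi at alpha_0 = -inf and alpha_k = +inf
    return [bounds[j] - bounds[j - 1] for j in range(1, len(bounds))]

# Boundary estimates from Table 5.4; xb = 0 and sigma = 1 are illustrative only.
alphas = [-4.67, -4.16, -3.11, -1.34, 1.33, 3.13, 4.21, 4.73]
probs = ordered_probit_probs(xb=0.0, sigma=1.0, alphas=alphas)
```

With eight boundaries there are nine categories, and under these illustrative inputs most of the probability falls in the middle ("no change") category.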

Example 5.1. Hausman, Lo, and MacKinlay (1992) apply the ordered probit
model to the 1988 transactions data of more than 100 stocks. Here we only report
their result for IBM. There are 206,794 trades. The sample mean (standard
deviation) of price change $y_i$, time duration $\Delta t_i$, and bid-ask spread are −0.0010 (0.753),
[...] ticks. The model used has nine categories for price movement, and the functional
specifications are

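$$
\begin{aligned}
x_i\beta = {} & \beta_1 \Delta t_i^* + \sum_{v=1}^{3} \beta_{v+1} y_{i-v} + \sum_{v=1}^{3} \beta_{v+4} SP5_{i-v} + \sum_{v=1}^{3} \beta_{v+7} IBS_{i-v} \\
& + \sum_{v=1}^{3} \beta_{v+10} \left[T_\lambda(V_{i-v}) \times IBS_{i-v}\right]
\end{aligned}
\tag{5.19}
$$

$$\sigma_i^2(w_i) = 1.0 + \gamma_1^2 \Delta t_i^* + \gamma_2^2 AB_{i-1}, \tag{5.20}$$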

where $T_\lambda(V) = (V^\lambda - 1)/\lambda$ is the Box-Cox (1964) transformation of $V$ with
$\lambda \in [0, 1]$ and the explanatory variables are defined by the following:

• $\Delta t_i^* = (t_i - t_{i-1})/100$ is a rescaled time duration between the (i − 1)th and ith
  trades with time measured in seconds.
• $AB_{i-1}$ is the bid-ask spread prevailing at time $t_{i-1}$ in ticks.
• $y_{i-v}$ ($v = 1, 2, 3$) is the lagged value of price change at $t_{i-v}$ in ticks. With
  $k = 9$, the possible values of price changes are {−4, −3, −2, −1, 0, 1, 2, 3, 4}
  in ticks.
• $V_{i-v}$ ($v = 1, 2, 3$) is the lagged value of dollar volume at the (i − v)th
  transaction, defined as the price of the (i − v)th transaction in dollars times the number
  of shares traded (denominated in hundreds of shares). That is, the dollar volume
  is in hundreds of dollars.
• $SP5_{i-v}$ ($v = 1, 2, 3$) is the 5-minute continuously compounded return of the
  Standard and Poor's 500 index futures price for the contract maturing in the
  closest month beyond the month in which transaction (i − v) occurred, where
  the return is computed with the futures price recorded one minute before the
  nearest round minute prior to $t_{i-v}$ and the price recorded 5 minutes before this.
• $IBS_{i-v}$ ($v = 1, 2, 3$) is an indicator variable defined by

$$
IBS_{i-v} = \begin{cases}
1 & \text{if } P_{i-v} > (P_{i-v}^a + P_{i-v}^b)/2 \\
0 & \text{if } P_{i-v} = (P_{i-v}^a + P_{i-v}^b)/2 \\
-1 & \text{if } P_{i-v} < (P_{i-v}^a + P_{i-v}^b)/2,
\end{cases}
$$

  where $P_j^a$ and $P_j^b$ are the ask and bid price at time $t_j$.
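Two of the explanatory variables above are simple transformations that can be sketched directly. The function names and sample quotes below are illustrative, and treating λ = 0 as the logarithm (the usual limiting case of the Box-Cox transformation) is an assumption about how the endpoint of [0, 1] is handled.

```python
import math

def box_cox(v, lam):
    """Box-Cox transformation T_lambda(V) = (V**lambda - 1) / lambda.
    For lambda = 0 the limiting value log(V) is used (assumed convention)."""
    if lam == 0:
        return math.log(v)
    return (v ** lam - 1.0) / lam

def ibs(price, ask, bid):
    """Buy/sell indicator: 1 above the quote midpoint, -1 below, 0 at it."""
    mid = (ask + bid) / 2.0
    if price > mid:
        return 1
    if price < mid:
        return -1
    return 0

# Made-up quotes one tick apart: bid 110, ask 110 1/8.
at_ask = ibs(110.125, ask=110.125, bid=110.0)
at_bid = ibs(110.0, ask=110.125, bid=110.0)
at_mid = ibs(110.0625, ask=110.125, bid=110.0)
```

A trade at the ask is classified as buyer initiated (+1) and a trade at the bid as seller initiated (−1); the interaction term in Eq. (5.19) multiplies this sign by the transformed dollar volume.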

The parameter estimates and their t ratios are given in Table 5.4. All the t ratios
are large except one, indicating that the estimates are highly significant. Such high t
ratios are not surprising as the sample size is large. For the heavily traded IBM stock,
the estimation results suggest the following conclusions:

1. The boundary partitions are not equally spaced, but are almost symmetric with
respect to zero.
2. The transaction duration $\Delta t_i$ affects both the conditional mean and conditional
variance of $y_i$ in Eqs. (5.19) and (5.20).
3. The coefficients of lagged price changes are negative and highly significant,
indicating price reversals.
4. As expected, the bid-ask spread at time $t_{i-1}$ significantly affects the
conditional variance.

Table 5.4. Parameter Estimates of the Ordered-Probit Model in Eq. (5.19) and Eq. (5.20)
for the 1988 Transaction Data of IBM, Where t Denotes the t Ratio.

(a) Boundary partitions of the probit model
Par.        α1        α2        α3        α4        α5        α6        α7        α8
Est.     −4.67     −4.16     −3.11     −1.34      1.33      3.13      4.21      4.73
t       −145.7    −157.8    −171.6    −155.5     154.9     167.8     152.2     138.9

(b) Equation parameters of the probit model
Par.        γ1        γ2   β1: Δti*   β2: yi−1        β3        β4        β5        β6
Est.      0.40      0.52     −0.12     −1.01     −0.53     −0.21      1.12     −0.26
t         15.6      71.1     −11.4    −135.6     −85.0     −47.2      54.2     −12.1

Par.        β7        β8        β9       β10       β11       β12       β13
Est.      0.01     −1.14     −0.37     −0.17      0.12      0.05      0.02
t         0.26     −63.6     −21.6     −10.3      47.4      18.6       7.7

5.4.2 A Decomposition Model
An alternative approach to modeling price change is to decompose it into three
components and use conditional specifications for the components; see Rydberg and
Shephard (1998). The three components are an indicator for price change, the
direction of price movement if there is a change, and the size of price change if a change
occurs. Specifically, the price change at the ith transaction can be written as

$$y_i \equiv P_{t_i} - P_{t_{i-1}} = A_i D_i S_i, \tag{5.21}$$

where $A_i$ is a binary variable defined as

$$
A_i = \begin{cases}
1 & \text{if there is a price change at the ith trade} \\
0 & \text{if price remains the same at the ith trade.}
\end{cases}
\tag{5.22}
$$

$D_i$ is also a discrete variable signifying the direction of the price change if a change
occurs; that is,

$$
D_i \mid (A_i = 1) = \begin{cases}
1 & \text{if price increases at the ith trade} \\
-1 & \text{if price drops at the ith trade,}
\end{cases}
\tag{5.23}
$$

where $D_i \mid (A_i = 1)$ means that $D_i$ is defined under the condition of $A_i = 1$, and $S_i$
is the size of the price change in ticks if there is a change at the ith trade and $S_i = 0$ if
there is no price change at the ith trade. When there is a price change, $S_i$ is a positive
integer-valued random variable.
Note that $D_i$ is not needed when $A_i = 0$, and there is a natural ordering in the
decomposition. $D_i$ is well defined only when $A_i = 1$, and $S_i$ is meaningful when
$A_i = 1$ and $D_i$ is given. Model specification under the decomposition makes use of
the ordering.
Let $F_i$ be the information set available at the ith transaction. Examples of elements
in $F_i$ are $\Delta t_{i-j}$, $A_{i-j}$, $D_{i-j}$, and $S_{i-j}$ for $j \ge 0$. The evolution of price change under
model (5.21) can then be partitioned as

$$
\begin{aligned}
P(y_i \mid F_{i-1}) &= P(A_i D_i S_i \mid F_{i-1}) \\
&= P(S_i \mid D_i, A_i, F_{i-1})\, P(D_i \mid A_i, F_{i-1})\, P(A_i \mid F_{i-1}).
\end{aligned}
\tag{5.24}
$$

Since $A_i$ is a binary variable, it suffices to consider the evolution of the probability
$p_i = P(A_i = 1)$ over time. We assume that

$$\ln\left(\frac{p_i}{1 - p_i}\right) = x_i\beta \quad \text{or} \quad p_i = \frac{e^{x_i\beta}}{1 + e^{x_i\beta}}, \tag{5.25}$$

where $x_i$ is a finite-dimensional vector consisting of elements of $F_{i-1}$ and $\beta$ is a
parameter vector. Conditioned on $A_i = 1$, $D_i$ is also a binary variable, and we use
the following model for $\delta_i = P(D_i = 1 \mid A_i = 1)$,

$$\ln\left(\frac{\delta_i}{1 - \delta_i}\right) = z_i\gamma \quad \text{or} \quad \delta_i = \frac{e^{z_i\gamma}}{1 + e^{z_i\gamma}}, \tag{5.26}$$

where $z_i$ is a finite-dimensional vector consisting of elements of $F_{i-1}$ and $\gamma$ is
a parameter vector. To allow for asymmetry between positive and negative price
changes, we assume that

$$
S_i \mid (D_i, A_i = 1) \sim 1 + \begin{cases}
g(\lambda_{u,i}) & \text{if } D_i = 1,\ A_i = 1 \\
g(\lambda_{d,i}) & \text{if } D_i = -1,\ A_i = 1,
\end{cases}
\tag{5.27}
$$

where $g(\lambda)$ is a geometric distribution with parameter $\lambda$ and the parameters $\lambda_{j,i}$
evolve over time as

$$\ln\left(\frac{\lambda_{j,i}}{1 - \lambda_{j,i}}\right) = w_i\theta_j \quad \text{or} \quad \lambda_{j,i} = \frac{e^{w_i\theta_j}}{1 + e^{w_i\theta_j}}, \qquad j = u, d, \tag{5.28}$$

where $w_i$ is again a finite-dimensional vector of explanatory variables in $F_{i-1}$ and $\theta_j$ is a
parameter vector.
In Eq. (5.27), the probability mass function of a random variable $x$, which follows
the geometric distribution $g(\lambda)$, is

$$p(x = m) = \lambda(1 - \lambda)^m, \qquad m = 0, 1, 2, \ldots.$$

We added 1 to the geometric distribution so that the price change, if it occurs, is at
least 1 tick. In Eq. (5.28), we take the logistic transformation to ensure that
$\lambda_{j,i} \in [0, 1]$.

The previous specification classifies the ith trade, or transaction, into one of three
categories:

1. no price change: $A_i = 0$ and the associated probability is $(1 - p_i)$;
2. a price increase: $A_i = 1$, $D_i = 1$, and the associated probability is $p_i\delta_i$. The
size of the price increase is governed by $1 + g(\lambda_{u,i})$.
3. a price drop: $A_i = 1$, $D_i = -1$, and the associated probability is $p_i(1 - \delta_i)$.
The size of the price drop is governed by $1 + g(\lambda_{d,i})$.

Let $I_i(j)$ for $j = 1, 2, 3$ be the indicator variables of the prior three categories. That
is, $I_i(j) = 1$ if the jth category occurs and $I_i(j) = 0$ otherwise. The log likelihood
function of Eq. (5.24) becomes

$$
\begin{aligned}
\ln[P(y_i \mid F_{i-1})] = {} & I_i(1)\ln(1 - p_i) \\
& + I_i(2)\left[\ln(p_i) + \ln(\delta_i) + \ln(\lambda_{u,i}) + (S_i - 1)\ln(1 - \lambda_{u,i})\right] \\
& + I_i(3)\left[\ln(p_i) + \ln(1 - \delta_i) + \ln(\lambda_{d,i}) + (S_i - 1)\ln(1 - \lambda_{d,i})\right],
\end{aligned}
$$

and the overall log likelihood function is

$$\ln[P(y_1, \ldots, y_n \mid F_0)] = \sum_{i=1}^{n} \ln[P(y_i \mid F_{i-1})], \tag{5.29}$$

which is a function of parameters $\beta$, $\gamma$, $\theta_u$, and $\theta_d$.
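The log likelihood above can be written in a few lines. The sketch below is a minimal illustration rather than an estimation routine: the function names are hypothetical, the per-trade probabilities are passed in directly instead of being computed from explanatory variables, and the parameter values and sample price changes are made up.

```python
import math

def logistic(z):
    """Inverse of the logit link used in Eqs. (5.25), (5.26), and (5.28)."""
    return 1.0 / (1.0 + math.exp(-z))

def ads_log_prob(y, p, delta, lam_u, lam_d):
    """ln P(y_i | F_{i-1}) for a price change of y ticks."""
    if y == 0:                                   # category 1: no change
        return math.log(1.0 - p)
    size = abs(y)
    if y > 0:                                    # category 2: price increase
        direction, lam = math.log(delta), lam_u
    else:                                        # category 3: price drop
        direction, lam = math.log(1.0 - delta), lam_d
    return (math.log(p) + direction
            + math.log(lam) + (size - 1) * math.log(1.0 - lam))

def ads_log_likelihood(ys, probs):
    """Sum of the per-trade terms, as in Eq. (5.29).
    probs[i] = (p_i, delta_i, lam_u_i, lam_d_i)."""
    return sum(ads_log_prob(y, *pr) for y, pr in zip(ys, probs))

# Made-up constant probabilities and a short made-up series of price changes.
p, delta = logistic(-1.0), logistic(0.0)
lam_u, lam_d = logistic(2.0), logistic(1.5)
ys = [0, 1, -1, 0, 2]
ll = ads_log_likelihood(ys, [(p, delta, lam_u, lam_d)] * len(ys))
```

In an actual fit, the four probabilities would be recomputed for each trade from the explanatory variables through the logistic links, and the total would be maximized over the parameter vectors.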

Example 5.2. We illustrate the decomposition model by analyzing the intraday
transactions of IBM stock from November 1, 1990 to January 31, 1991.
The explanatory variables used are

1. $A_{i-1}$: The action indicator of the previous trade (i.e., the (i − 1)th trade within
a trading day).
2. $D_{i-1}$: The direction indicator of the previous trade.
3. $S_{i-1}$: The size of the price change at the previous trade.
4. $V_{i-1}$: The volume of the previous trade, divided by 1000.
5. $\Delta t_{i-1}$: Time duration from the (i − 2)th to (i − 1)th trade.
6. $BA_i$: The bid-ask spread prevailing at the time of transaction.

Because we use lag-1 explanatory variables, the actual sample size is 59,775. It turns
out that $V_{i-1}$, $\Delta t_{i-1}$, and $BA_i$ are not statistically significant for the model
entertained. Thus, only the first three explanatory variables are used. The model employed
is
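
$$
\begin{aligned}
\ln\left(\frac{p_i}{1 - p_i}\right) &= \beta_0 + \beta_1 A_{i-1} \\
\ln\left(\frac{\delta_i}{1 - \delta_i}\right) &= \gamma_0 + \gamma_1 D_{i-1} \\
\ln\left(\frac{\lambda_{u,i}}{1 - \lambda_{u,i}}\right) &= \theta_{u,0} + \theta_{u,1} S_{i-1} \\
\ln\left(\frac{\lambda_{d,i}}{1 - \lambda_{d,i}}\right) &= \theta_{d,0} + \theta_{d,1} S_{i-1}.
\end{aligned}
\tag{5.30}
$$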

The parameter estimates, using the log-likelihood function in Eq. (5.29), are given
in Table 5.5. The estimated simple model shows some dynamic dependence in the
price change. In particular, the trade-by-trade price changes of IBM stock exhibit
some appealing features:

1. The probability of a price change depends on the previous price change.
Specifically, we have

$$P(A_i = 1 \mid A_{i-1} = 0) = 0.258, \qquad P(A_i = 1 \mid A_{i-1} = 1) = 0.476.$$

The result indicates that a price change may occur in clusters and, as expected,
most transactions are without price change. When no price change occurred
at the (i − 1)th trade, only about one in four of the subsequent trades has a
price change. When there is a price change at the (i − 1)th transaction, the
probability of a price change in the ith trade increases to 0.476.
2. The direction of price change is governed by

$$
P(D_i = 1 \mid F_{i-1}, A_i) = \begin{cases}
0.483 & \text{if } D_{i-1} = 0 \text{ (i.e., } A_{i-1} = 0) \\
0.085 & \text{if } D_{i-1} = 1,\ A_i = 1 \\
0.904 & \text{if } D_{i-1} = -1,\ A_i = 1.
\end{cases}
$$

This result says that (a) if no price change occurred at the (i − 1)th trade, then
the chances for a price increase or decrease at the ith trade are about even; and
(b) the probabilities of consecutive price increases or decreases are very low.
The probability of a price increase at the ith trade given that a price change
occurs at the ith trade and there was a price increase at the (i − 1)th trade is
only 8.5%. However, the probability of a price increase is about 90% given
that a price change occurs at the ith trade and there was a price decrease at the
(i − 1)th trade. Consequently, this result shows the effect of bid-ask bounce
and supports price reversals in high-frequency trading.

Table 5.5. Parameter Estimates of the ADS Model in Eq. (5.30) for IBM Intraday
Transactions: 11/01/90 to 1/31/91.

Parameter      β0       β1       γ0       γ1     θu,0     θu,1     θd,0     θd,1
Estimate   −1.057    0.962   −0.067   −2.307    2.235   −0.670    2.085   −0.509
Std.Err.    0.104    0.044    0.023    0.056    0.029    0.050    0.187    0.139

3. There is weak evidence suggesting that big price changes have a higher
probability to be followed by another big price change. Consider the size of a price
increase. We have

$$S_i \mid (D_i = 1) \sim 1 + g(\lambda_{u,i}), \qquad \ln\left(\frac{\lambda_{u,i}}{1 - \lambda_{u,i}}\right) = 2.235 - 0.670 S_{i-1}.$$

Using the probability mass function of a geometric distribution, we obtain that
the probability of a price increase by one tick is 0.827 at the ith trade if the
transaction results in a price increase and $S_{i-1} = 1$. The probability reduces to
0.709 if $S_{i-1} = 2$ and to 0.556 if $S_{i-1} = 3$. Consequently, the probability of
a large $S_i$ increases with $S_{i-1}$ given that there is a price increase at the ith
trade.
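The probabilities quoted in these three observations can be reproduced from the estimates in Table 5.5 by applying the logistic transformation of Eq. (5.30). The sketch below does this; the variable names are illustrative. Because the table reports rounded estimates, the recomputed values may differ from the quoted ones in the third decimal place.

```python
import math

def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))

# Estimates from Table 5.5.
beta0, beta1 = -1.057, 0.962
gamma0, gamma1 = -0.067, -2.307
theta_u0, theta_u1 = 2.235, -0.670

# 1. Probability of a price change given the previous action indicator.
p_after_no_change = logistic(beta0)             # A_{i-1} = 0
p_after_change = logistic(beta0 + beta1)        # A_{i-1} = 1

# 2. Probability of an increase given a change, by previous direction.
delta = {d: logistic(gamma0 + gamma1 * d) for d in (0, 1, -1)}

# 3. Since S_i - 1 follows g(lambda), P(S_i = 1 | increase) = lambda_{u,i}.
p_one_tick = {s: logistic(theta_u0 + theta_u1 * s) for s in (1, 2, 3)}
```

The drop in `p_one_tick` as the previous size grows is what the text describes as weak evidence that large price changes tend to follow large price changes.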

A difference between the ADS and ordered probit models is that the ADS model
does not require any truncation or grouping in the size of a price change.

5.5 DURATION MODELS

Duration models are concerned with time intervals between trades. Longer dura-
tions indicate lack of trading activities, which in turn signify a period of no new
information. The dynamic behavior of durations, thus, contains useful information
about intraday market activities. Using concepts similar to the ARCH models for
volatility, Engle and Russell (1998) propose an autoregressive conditional duration
(ACD) model to describe the evolution of time durations for (heavily traded) stocks.
Zhang, Russell, and Tsay (2001) extend the ACD model to account for nonlinearity
and structural breaks in the data. In this section, we introduce some simple duration
models. As mentioned before, intraday transactions exhibit some diurnal pattern.
Therefore, we focus on the adjusted time duration

$$\Delta t_i^* = \Delta t_i / f(t_i), \tag{5.31}$$

where $f(t_i)$ is a deterministic function consisting of the cyclical component of $\Delta t_i$.
Obviously, $f(t_i)$ depends on the underlying asset and the systematic behavior of the
market. In practice, there are many ways to estimate $f(t_i)$, but no single method
dominates the others in terms of statistical properties. A common approach is to use
smoothing splines. Here we use simple quadratic functions and indicator variables to
take care of the deterministic component of daily trading activities.

For the IBM data employed in the illustration of ADS models, we assume

$$f(t_i) = \exp[d(t_i)], \qquad d(t_i) = \beta_0 + \sum_{j=1}^{7} \beta_j f_j(t_i), \tag{5.32}$$

where

$$
f_1(t_i) = -\left(\frac{t_i - 43200}{14400}\right)^2, \qquad
f_3(t_i) = \begin{cases}
-\left(\dfrac{t_i - 38700}{7500}\right)^2 & \text{if } t_i < 43200 \\
0 & \text{otherwise,}
\end{cases}
$$

$$
f_2(t_i) = -\left(\frac{t_i - 48300}{9300}\right)^2, \qquad
f_4(t_i) = \begin{cases}
-\left(\dfrac{t_i - 48600}{9000}\right)^2 & \text{if } t_i \ge 43200 \\
0 & \text{otherwise,}
\end{cases}
$$

$f_5(t_i)$ and $f_6(t_i)$ are indicator variables for the first and second 5 minutes of market
opening [i.e., $f_5(\cdot) = 1$ if and only if $t_i$ is between 9:30 am and 9:35 am Eastern
Time], and $f_7(t_i)$ is the indicator for the last 30 minutes of daily trading [i.e., $f_7(t_i) =
1$ if and only if the trade occurred between 3:30 pm and 4:00 pm Eastern Time].
Figure 5.5 shows the plot of $f_i(\cdot)$ for $i = 1, \ldots, 4$, where the time scale on the
x-axis is in minutes. Note that $f_3(43{,}200) = f_4(43{,}200)$, where 43,200 corresponds
to 12:00 noon.

Figure 5.5. Quadratic functions used to remove the deterministic component of IBM intraday
trading durations: (a)–(d) are the functions $f_1(\cdot)$ to $f_4(\cdot)$ of Eq. (5.32), respectively.

The coefficients $\beta_j$ of Eq. (5.32) are obtained by the least squares method of the
linear regression

$$\ln(\Delta t_i) = \beta_0 + \sum_{j=1}^{7} \beta_j f_j(t_i) + \epsilon_i.$$

The fitted model is

$$
\begin{aligned}
\ln(\Delta t_i) = {} & 2.555 + 0.159 f_1(t_i) + 0.270 f_2(t_i) + 0.384 f_3(t_i) \\
& + 0.061 f_4(t_i) - 0.611 f_5(t_i) - 0.157 f_6(t_i) + 0.073 f_7(t_i).
\end{aligned}
$$
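The fitted deterministic component can be evaluated directly from the definitions in Eq. (5.32). The sketch below is illustrative: the function names are hypothetical, and whether the endpoints of the indicator windows are inclusive is an assumption, since the text only says "between".

```python
import math

# Times are in seconds from midnight: 34200 = 9:30 am, 43200 = noon, 57600 = 4:00 pm.
def f1(t): return -((t - 43200) / 14400) ** 2
def f2(t): return -((t - 48300) / 9300) ** 2
def f3(t): return -((t - 38700) / 7500) ** 2 if t < 43200 else 0.0
def f4(t): return -((t - 48600) / 9000) ** 2 if t >= 43200 else 0.0
def f5(t): return 1.0 if 34200 <= t < 34500 else 0.0    # first 5 minutes
def f6(t): return 1.0 if 34500 <= t < 34800 else 0.0    # second 5 minutes
def f7(t): return 1.0 if 55800 <= t <= 57600 else 0.0   # last 30 minutes

# Intercept followed by the seven fitted slope coefficients.
COEF = [2.555, 0.159, 0.270, 0.384, 0.061, -0.611, -0.157, 0.073]
BASIS = (f1, f2, f3, f4, f5, f6, f7)

def diurnal_factor(t):
    """f(t) = exp[d(t)], the fitted deterministic component."""
    d = COEF[0] + sum(b * f(t) for b, f in zip(COEF[1:], BASIS))
    return math.exp(d)

def adjusted_duration(dt, t):
    """Adjusted duration of Eq. (5.31): raw duration dt at time t over f(t)."""
    return dt / diurnal_factor(t)
```

Evaluating `diurnal_factor` at the open and at noon gives a smaller value at the open, in line with the shorter durations (heavier trading) at the start of the day.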

Figure 5.6 shows the time plot of average durations in 5-minute time intervals over
the 63 trading days before and after adjusting for the deterministic component. Part
(a) is the average durations of $\Delta t_i$ and, as expected, it exhibits a diurnal pattern. Part
(b) is the average durations of $\Delta t_i^*$ (i.e., after the adjustment), and the diurnal pattern
is largely removed.

Figure 5.6. IBM transactions data from 11/01/90 to 1/31/91: (a) The average durations in
5-minute time intervals, and (b) the average durations in 5-minute time intervals after adjusting
for the deterministic component.

5.5.1 The ACD Model
The autoregressive conditional duration (ACD) model uses the idea of GARCH
models to study the dynamic structure of the adjusted duration $\Delta t_i^*$ of Eq. (5.31). For ease
in notation, we define $x_i = \Delta t_i^*$.
Let $\psi_i = E(x_i \mid F_{i-1})$ be the conditional expectation of the adjusted duration
between the (i − 1)th and ith trades, where $F_{i-1}$ is the information set available at
the (i − 1)th trade. In other words, $\psi_i$ is the expected adjusted duration given $F_{i-1}$.
The basic ACD model is defined as

$$x_i = \psi_i \epsilon_i, \tag{5.33}$$

where $\{\epsilon_i\}$ is a sequence of independent and identically distributed non-negative
random variables such that $E(\epsilon_i) = 1$. In Engle and Russell (1998), $\epsilon_i$ follows a standard
exponential or a standardized Weibull distribution, and $\psi_i$ assumes the form

$$\psi_i = \omega + \sum_{j=1}^{r} \gamma_j x_{i-j} + \sum_{j=1}^{s} \omega_j \psi_{i-j}. \tag{5.34}$$

Such a model is referred to as an ACD(r, s) model. When the distribution of $\epsilon_i$ is
exponential, the resulting model is called an EACD(r, s) model. Similarly, if $\epsilon_i$
follows a Weibull distribution, the model is a WACD(r, s) model. If necessary, readers
are referred to Appendix A for a quick review of exponential and Weibull
distributions.
Similar to GARCH models, the process $\eta_i = x_i - \psi_i$ is a martingale difference
sequence [i.e., $E(\eta_i \mid F_{i-1}) = 0$], and the ACD(r, s) model can be written as

$$x_i = \omega + \sum_{j=1}^{\max(r,s)} (\gamma_j + \omega_j) x_{i-j} - \sum_{j=1}^{s} \omega_j \eta_{i-j} + \eta_i, \tag{5.35}$$

which is in the form of an ARMA process with non-Gaussian innovations. It is
understood here that $\gamma_j = 0$ for $j > r$ and $\omega_j = 0$ for $j > s$. Such a representation can
be used to obtain the basic conditions for weak stationarity of the ACD model. For
instance, taking expectation on both sides of Eq. (5.35) and assuming weak
stationarity, we have

$$E(x_i) = \frac{\omega}{1 - \sum_{j=1}^{\max(r,s)} (\gamma_j + \omega_j)}.$$

Therefore, we assume $\omega > 0$ and $1 > \sum_j (\gamma_j + \omega_j)$ because the expected duration is
positive. As another application of Eq. (5.35), we study properties of the EACD(1, 1)
model.

EACD(1, 1) Model
An EACD(1, 1) model can be written as

$$x_i = \psi_i \epsilon_i, \qquad \psi_i = \omega + \gamma_1 x_{i-1} + \omega_1 \psi_{i-1}, \tag{5.36}$$

where $\epsilon_i$ follows the standard exponential distribution. Using the moments of a
standard exponential distribution in Appendix A, we have $E(\epsilon_i) = 1$, $\mathrm{Var}(\epsilon_i) = 1$, and
$E(\epsilon_i^2) = \mathrm{Var}(\epsilon_i) + [E(\epsilon_i)]^2 = 2$. Assuming that $x_i$ is weakly stationary (i.e., the
first two moments of $x_i$ are time-invariant), we derive the variance of $x_i$. First, taking
expectation of Eq. (5.36), we have

$$E(x_i) = E[E(\psi_i \epsilon_i \mid F_{i-1})] = E(\psi_i), \qquad E(\psi_i) = \omega + \gamma_1 E(x_{i-1}) + \omega_1 E(\psi_{i-1}). \tag{5.37}$$

Under weak stationarity, $E(\psi_i) = E(\psi_{i-1})$ so that Eq. (5.37) gives

$$\mu_x \equiv E(x_i) = E(\psi_i) = \frac{\omega}{1 - \gamma_1 - \omega_1}. \tag{5.38}$$

Next, because $E(\epsilon_i^2) = 2$, we have $E(x_i^2) = E[E(\psi_i^2 \epsilon_i^2 \mid F_{i-1})] = 2E(\psi_i^2)$.
Taking the square of $\psi_i$ in Eq. (5.36) and expectation and using weak stationarity of
$\psi_i$ and $x_i$, we have, after some algebra, that

$$E(\psi_i^2) = \mu_x^2 \times \frac{1 - (\gamma_1 + \omega_1)^2}{1 - 2\gamma_1^2 - \omega_1^2 - 2\gamma_1\omega_1}. \tag{5.39}$$
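The algebra behind Eq. (5.39) can be sketched as follows. The steps use only the EACD(1, 1) recursion in Eq. (5.36), weak stationarity, the independence of $\epsilon_{i-1}$ from $\psi_{i-1}$, and the moments $E(\epsilon_i) = 1$ and $E(\epsilon_i^2) = 2$; the shorthand $a = \gamma_1 + \omega_1$ is introduced here for convenience.

```latex
\begin{align*}
E(\psi_i^2)
  &= \omega^2 + \gamma_1^2 E(x_{i-1}^2) + \omega_1^2 E(\psi_{i-1}^2)
     + 2\omega\gamma_1 E(x_{i-1}) + 2\omega\omega_1 E(\psi_{i-1})
     + 2\gamma_1\omega_1 E(x_{i-1}\psi_{i-1}) \\
  &= \omega^2 + \left(2\gamma_1^2 + \omega_1^2 + 2\gamma_1\omega_1\right) E(\psi_i^2)
     + 2\omega(\gamma_1 + \omega_1)\mu_x ,
\end{align*}
% since E(x_{i-1}^2) = 2E(\psi_{i-1}^2) and
% E(x_{i-1}\psi_{i-1}) = E(\psi_{i-1}^2 \epsilon_{i-1}) = E(\psi_{i-1}^2).
% Substituting \omega = \mu_x (1 - a) with a = \gamma_1 + \omega_1:
\begin{align*}
\left(1 - 2\gamma_1^2 - \omega_1^2 - 2\gamma_1\omega_1\right) E(\psi_i^2)
  &= \mu_x^2 (1 - a)^2 + 2\mu_x^2 a (1 - a) \\
  &= \mu_x^2 (1 - a)(1 + a) = \mu_x^2 \left[1 - (\gamma_1 + \omega_1)^2\right].
\end{align*}
```

Dividing both sides by $1 - 2\gamma_1^2 - \omega_1^2 - 2\gamma_1\omega_1$ gives Eq. (5.39).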

Finally, using $\mathrm{Var}(x_i) = E(x_i^2) - [E(x_i)]^2$ and $E(x_i^2) = 2E(\psi_i^2)$, we have

$$\mathrm{Var}(x_i) = 2E(\psi_i^2) - \mu_x^2 = \mu_x^2 \times \frac{1 - \omega_1^2 - 2\gamma_1\omega_1}{1 - \omega_1^2 - 2\gamma_1\omega_1 - 2\gamma_1^2},$$

where $\mu_x$ is defined in Eq. (5.38). This result shows that, to have time-invariant
unconditional variance, the EACD(1, 1) model in Eq. (5.36) must satisfy
$1 > 2\gamma_1^2 + \omega_1^2 + 2\gamma_1\omega_1$. The variance of a WACD(1, 1) model can be obtained by using the
same techniques and the first two moments of a standardized Weibull distribution.
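The mean and variance formulas above are easy to evaluate numerically. The sketch below is illustrative: the function name is hypothetical, and the parameter values are borrowed from the simulation model in Eq. (5.40), here paired with exponential innovations, which is an assumption made only for this example.

```python
def eacd11_moments(omega, gamma1, omega1):
    """Unconditional mean and variance of a weakly stationary EACD(1,1) model."""
    a = gamma1 + omega1
    denom = 1.0 - 2.0 * gamma1 ** 2 - omega1 ** 2 - 2.0 * gamma1 * omega1
    if not (omega > 0 and a < 1 and denom > 0):
        raise ValueError("parameters violate the mean or variance condition")
    mu = omega / (1.0 - a)                                   # Eq. (5.38)
    var = mu ** 2 * (1.0 - omega1 ** 2 - 2.0 * gamma1 * omega1) / denom
    return mu, var

# Parameter values of Eq. (5.40), used here with exponential innovations.
mu, var = eacd11_moments(omega=0.3, gamma1=0.2, omega1=0.7)
```

For these values the mean is 0.3/0.1 = 3 and the variance is 9 × 0.23/0.15 = 13.8, so the unconditional standard deviation exceeds the mean, as is typical when the persistence γ1 + ω1 is close to 1.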

ACD Models with a Generalized Gamma Distribution
In the statistical literature, the intensity function is often expressed in terms of the hazard
function. As shown in Appendix B, the hazard function of an EACD model is
constant over time and that of a WACD model is a monotone function. These hazard
functions are rather restrictive in application as the intensity function of stock
transactions might not be constant or monotone over time. To increase the flexibility of the
associated hazard function, Zhang, Russell, and Tsay (2001) employ a (standardized)
generalized Gamma distribution for $\epsilon_i$. See Appendix A for some basic properties of
a generalized Gamma distribution. The resulting hazard function may assume
various patterns, including U shape or inverted U shape. We refer to an ACD model with
innovations that follow a generalized Gamma distribution as a GACD(r, s) model.

5.5.2 Simulation
To illustrate ACD processes, we generated 500 observations from the ACD(1, 1)
model

\[
x_i = \psi_i \epsilon_i, \qquad \psi_i = 0.3 + 0.2 x_{i-1} + 0.7 \psi_{i-1} \tag{5.40}
\]

using two different innovational distributions for $\epsilon_i$. In case 1, $\epsilon_i$ is assumed to follow
a standardized Weibull distribution with parameter $\alpha = 1.5$. In case 2, $\epsilon_i$ follows a
(standardized) generalized Gamma distribution with parameters $\kappa = 1.5$ and $\alpha = 0.5$.
Figure 5.7(a) shows the time plot of the WACD(1, 1) series, whereas Figure 5.8(a)
shows the GACD(1, 1) series. Figure 5.9 plots the histograms of both simulated series.

Figure 5.7. A simulated WACD(1, 1) series in Eq. (5.40): (a) the original series, and (b) the
standardized series after estimation. There are 500 observations.
[Plot omitted. Panel (a): "A simulated WACD(1,1) series", vertical axis dur; panel (b): "Standardized series", vertical axis std-dur; horizontal axis 0 to 500.]
Figure 5.8. A simulated GACD(1, 1) series in Eq. (5.40): (a) the original series, and (b) the
standardized series after estimation. There are 500 observations.
[Plot omitted. Panel (a): "A simulated GACD(1,1) series", vertical axis dur; panel (b): "Standardized series", vertical axis std-dur; horizontal axis 0 to 500.]

Figure 5.9. Histograms of simulated duration processes with 500 observations:
(a) WACD(1, 1) model, and (b) GACD(1, 1) model.
[Plot omitted. Panel (a): horizontal axis z, 0 to 10; panel (b): horizontal axis x, 0 to 80.]

Figure 5.10. The sample autocorrelation function of a simulated WACD(1, 1) series with 500
observations: (a) the original series, and (b) the standardized residual series.
[Plot omitted. Panels "Series : x" and "Series : y"; vertical axis ACF; horizontal axis Lag, 0 to 30.]

The difference between the two models is evident. Finally, the sample ACFs of the
two simulated series are shown in Figure 5.10(a) and Figure 5.11(a), respectively.
The serial dependence of the data is clearly seen.
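The simulation described in this subsection can be sketched as follows. This is a minimal illustration rather than the program used for the figures: the helper names, the seed, the burn-in length, and the choice to start the recursion at the unconditional mean are all assumptions. Standardized innovations are obtained by rescaling Weibull and generalized Gamma draws to have unit mean.

```python
import math
import random


def std_weibull_draw(rng, alpha):
    """Return a sampler for a Weibull(shape=alpha) variable rescaled to mean 1."""
    scale = 1.0 / math.gamma(1.0 + 1.0 / alpha)
    # random.weibullvariate takes (scale, shape) in that order.
    return lambda: rng.weibullvariate(scale, alpha)


def std_gen_gamma_draw(rng, kappa, alpha):
    """Return a sampler for a generalized Gamma variable rescaled to mean 1."""
    lam = math.gamma(kappa) / math.gamma(kappa + 1.0 / alpha)
    # If G ~ Gamma(kappa, 1), then lam * G**(1/alpha) is generalized Gamma
    # with scale lam, and its mean is lam * Gamma(kappa + 1/alpha) / Gamma(kappa) = 1.
    return lambda: lam * rng.gammavariate(kappa, 1.0) ** (1.0 / alpha)


def simulate_acd11(n, omega, gamma1, omega1, draw_eps, burn_in=200):
    """Simulate x_i = psi_i * eps_i, psi_i = omega + gamma1*x_{i-1} + omega1*psi_{i-1}."""
    psi = omega / (1.0 - gamma1 - omega1)   # assumption: start at the unconditional mean
    x = psi
    series = []
    for i in range(n + burn_in):
        psi = omega + gamma1 * x + omega1 * psi
        x = psi * draw_eps()
        if i >= burn_in:                    # discard the burn-in observations
            series.append(x)
    return series


rng = random.Random(12345)                  # arbitrary seed, for reproducibility only
wacd = simulate_acd11(500, 0.3, 0.2, 0.7, std_weibull_draw(rng, alpha=1.5))
gacd = simulate_acd11(500, 0.3, 0.2, 0.7, std_gen_gamma_draw(rng, kappa=1.5, alpha=0.5))
```

Because both innovation distributions have unit mean, the two simulated series share the same unconditional mean, 0.3/(1 − 0.2 − 0.7) = 3, and differ only in the shape of the innovation distribution.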

5.5.3 Estimation
For an ACD(r, s) model, let $i_o = \max(r, s)$ and $\mathbf{x}_t = (x_1, \ldots, x_t)'$. The likelihood
function of the durations $x_1, \ldots, x_T$ is

\[
f(\mathbf{x}_T \mid \theta) = \left[ \prod_{i=i_o+1}^{T} f(x_i \mid F_{i-1}, \theta) \right] \times f(\mathbf{x}_{i_o} \mid \theta),
\]

where $\theta$ denotes the vector of model parameters, and T is the sample size. The
marginal probability density function $f(\mathbf{x}_{i_o} \mid \theta)$ of the previous equation is rather
complicated for a general ACD model. Because its impact on the likelihood function
diminishes as the sample size T increases, this marginal density is often ignored,
resulting in the use of the conditional likelihood method. For a WACD model, we use
the probability density function (pdf) of Eq. (5.55) and obtain the conditional log
likelihood function

\[
\ell(\mathbf{x} \mid \theta, \mathbf{x}_{i_o}) = \sum_{i=i_o+1}^{T} \left\{ \alpha \ln\left[\Gamma\left(1 + \frac{1}{\alpha}\right)\right] + \ln\left(\frac{\alpha}{x_i}\right) + \alpha \ln\left(\frac{x_i}{\psi_i}\right) - \left[\frac{\Gamma\left(1 + \frac{1}{\alpha}\right) x_i}{\psi_i}\right]^{\alpha} \right\}, \tag{5.41}
\]

where $\psi_i = \omega + \sum_{j=1}^{r} \gamma_j x_{i-j} + \sum_{j=1}^{s} \omega_j \psi_{i-j}$, $\theta = (\omega, \gamma_1, \ldots, \gamma_r, \omega_1, \ldots, \omega_s, \alpha)'$,
and $\mathbf{x} = (x_{i_o+1}, \ldots, x_T)'$. When $\alpha = 1$, the (conditional) log likelihood function
reduces to that of an EACD(r, s) model.

Figure 5.11. The sample autocorrelation function of a simulated GACD(1, 1) series with 500
observations: (a) the original series, and (b) the standardized residual series.
[Plot omitted. Panels "Series : x" and "Series : y"; vertical axis ACF; horizontal axis Lag, 0 to 25.]
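The conditional log likelihood of Eq. (5.41) can be evaluated directly for the case r = s = 1. The sketch below is illustrative only: the function names and the `psi0` argument are hypothetical, and starting the recursion at the sample mean is an assumption, since the text does not say how the first conditional expectation is initialized.

```python
import math


def acd11_psi(x, omega, gamma1, omega1, psi0=None):
    """Conditional expected durations psi_i from the ACD(1,1) recursion."""
    # Assumption: the recursion starts at the sample mean unless psi0 is given.
    psi = [sum(x) / len(x) if psi0 is None else psi0]
    for i in range(1, len(x)):
        psi.append(omega + gamma1 * x[i - 1] + omega1 * psi[i - 1])
    return psi


def wacd11_loglik(x, omega, gamma1, omega1, alpha):
    """Conditional log likelihood of Eq. (5.41) for r = s = 1, so that i_o = 1."""
    psi = acd11_psi(x, omega, gamma1, omega1)
    g = math.gamma(1.0 + 1.0 / alpha)
    total = 0.0
    # The first observation is conditioned on, so the sum starts at the second.
    for xi, psii in zip(x[1:], psi[1:]):
        total += (alpha * math.log(g) + math.log(alpha / xi)
                  + alpha * math.log(xi / psii) - (g * xi / psii) ** alpha)
    return total
```

In practice this function would be handed to a numerical optimizer with its sign reversed. Setting `alpha=1.0` makes each summand equal to −ln ψ_i − x_i/ψ_i, which is the EACD case.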
For a GACD(r, s) model, the conditional log likelihood function is

\[
\ell(\mathbf{x} \mid \theta, \mathbf{x}_{i_o}) = \sum_{i=i_o+1}^{T} \left[ \ln\left(\frac{\alpha}{\Gamma(\kappa)}\right) + (\kappa\alpha - 1)\ln(x_i) - \kappa\alpha \ln(\lambda\psi_i) - \left(\frac{x_i}{\lambda\psi_i}\right)^{\alpha} \right], \tag{5.42}
\]

where $\lambda = \Gamma(\kappa)/\Gamma(\kappa + \frac{1}{\alpha})$ and the parameter vector $\theta$ now also includes $\kappa$. As
expected, when $\kappa = 1$, $\lambda = 1/\Gamma(1 + \frac{1}{\alpha})$ and the log likelihood function in Eq. (5.42)
reduces to that of a WACD(r, s) model in Eq. (5.41). This log likelihood function
can be rewritten in many ways to simplify the estimation.
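The reduction from Eq. (5.42) to Eq. (5.41) at κ = 1 can be checked numerically. The sketch below codes one summand of each log likelihood; the function names are hypothetical, and the use of `math.lgamma` is one possible way to rewrite the expression for numerical stability, not necessarily the rewriting the text has in mind.

```python
import math


def wacd_logpdf(x, psi, alpha):
    """One summand of Eq. (5.41)."""
    g = math.gamma(1.0 + 1.0 / alpha)
    return (alpha * math.log(g) + math.log(alpha / x)
            + alpha * math.log(x / psi) - (g * x / psi) ** alpha)


def gacd_logpdf(x, psi, alpha, kappa):
    """One summand of Eq. (5.42)."""
    # lgamma avoids overflow in Gamma(kappa + 1/alpha) when alpha is small.
    log_lam = math.lgamma(kappa) - math.lgamma(kappa + 1.0 / alpha)
    log_scale = log_lam + math.log(psi)                  # ln(lambda * psi)
    return (math.log(alpha) - math.lgamma(kappa)
            + (kappa * alpha - 1.0) * math.log(x)
            - kappa * alpha * log_scale
            - math.exp(alpha * (math.log(x) - log_scale)))


# With kappa = 1 the two summands should agree up to rounding error.
diff = gacd_logpdf(2.0, 3.0, 0.879, 1.0) - wacd_logpdf(2.0, 3.0, 0.879)
```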
Under some regularity conditions, the conditional maximum likelihood estimates
are asymptotically normal; see Engle and Russell (1998) and the references therein.
In practice, simulation can be used to obtain finite-sample reference distributions for
the problem of interest once a duration model is specified.

Example 5.3. (Simulated ACD(1,1) series continued) Consider the simulated
WACD(1, 1) and GACD(1, 1) series of Eq. (5.40). We apply the conditional likelihood
method and obtain the results in Table 5.6. The estimates appear to be reasonable.
Let $\hat{\psi}_i$ be the 1-step ahead prediction of $\psi_i$ and $\hat{\epsilon}_i = x_i/\hat{\psi}_i$ be the standardized
series, which can be regarded as standardized residuals of the series. If the model
is adequately specified, $\{\hat{\epsilon}_i\}$ should behave as a sequence of independent and identically
distributed random variables. Figure 5.7(b) and Figure 5.8(b) show the time
plots of $\hat{\epsilon}_i$ for both models. The sample ACFs of $\hat{\epsilon}_i$ for both fitted models are shown in
Figure 5.10(b) and Figure 5.11(b), respectively. It is evident that no significant serial
correlations are found in the $\hat{\epsilon}_i$ series.
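The diagnostic just described, forming standardized residuals and inspecting their sample autocorrelations, can be sketched with the standard library alone. The function names are hypothetical. The Ljung–Box statistic Q(m) = T(T + 2) Σ ρ̂_k² / (T − k) is included as a common portmanteau summary of the first m sample autocorrelations.

```python
def standardized_residuals(x, psi_hat):
    """eps_hat_i = x_i / psi_hat_i for matching sequences of equal length."""
    return [xi / pi for xi, pi in zip(x, psi_hat)]


def sample_acf(z, max_lag):
    """Sample autocorrelations of z at lags 1, ..., max_lag."""
    n = len(z)
    mean = sum(z) / n
    dev = [v - mean for v in z]
    c0 = sum(d * d for d in dev)
    return [sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
            for k in range(1, max_lag + 1)]


def ljung_box(z, m):
    """Ljung-Box statistic Q(m) based on the first m sample autocorrelations."""
    n = len(z)
    rho = sample_acf(z, m)
    return n * (n + 2) * sum(r * r / (n - k) for k, r in enumerate(rho, start=1))
```

Under the null hypothesis of no serial correlation, Q(m) is usually compared with a chi-squared distribution; the appropriate degrees of freedom for residuals of a fitted duration model are not discussed here.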

Example 5.4. As an illustration of duration models, we consider the transaction
durations of IBM stock on five consecutive trading days from November 1 to
November 7, 1990. Focusing on positive transaction durations, we have 3534 observations.
In addition, the data have been adjusted by removing the deterministic component
in Eq. (5.32). That is, we employ 3534 positive adjusted durations as defined
in Eq. (5.31).
Figure 5.12(a) shows the time plot of the adjusted (positive) durations for the first
five trading days of November 1990, and Figure 5.13(a) gives the sample ACF of
the series. There exist some serial correlations in the adjusted durations. We fit a
WACD(1, 1) model to the data and obtain the model

\[
x_i = \psi_i \epsilon_i, \qquad \psi_i = 0.169 + 0.064 x_{i-1} + 0.885 \psi_{i-1}, \tag{5.43}
\]

Table 5.6. Estimation Results for Simulated ACD(1,1) Series with 500 Observations:
(a) for WACD(1,1) Series and (b) for GACD(1,1) Series.

(a) WACD(1,1) model
Parameter     $\omega$    $\gamma_1$    $\omega_1$    $\alpha$
True          0.3         0.2           0.7           1.5
Estimate      0.364       0.100         0.767         1.477
Std Error     (0.139)     (0.025)       (0.060)       (0.052)

(b) GACD(1,1) model
Parameter     $\omega$    $\gamma_1$    $\omega_1$    $\alpha$    $\kappa$
True          0.3         0.2           0.7           0.5         1.5
Estimate      0.401       0.343         0.561         0.436       2.077
Std Error     (0.117)     (0.074)       (0.065)       (0.078)     (0.653)
Figure 5.12. Time plots of durations for IBM stock traded in the first five trading days of
November 1990: (a) the adjusted series, and (b) the normalized innovations of a WACD(1, 1)
model. There are 3534 nonzero durations.
[Plot omitted. Horizontal axis: sequence, 0 to 3000; panel (b) vertical axis: norm-dur.]

where $\{\epsilon_i\}$ is a sequence of independent and identically distributed random variates
that follow the standardized Weibull distribution with parameter $\hat{\alpha} = 0.879\,(0.012)$,
where 0.012 is the estimated standard error. Standard errors of the estimates in
Eq. (5.43) are 0.039, 0.010, and 0.018, respectively. All t ratios of the estimates
are greater than 4.2, indicating that the estimates are significant at the 1% level.
Figure 5.12(b) shows the time plot of $\hat{\epsilon}_i = x_i/\hat{\psi}_i$, and Figure 5.13(b) provides the
sample ACF of $\hat{\epsilon}_i$. The Ljung–Box statistics show Q(10) = 4.96 and Q(20) = 10.75
for the $\hat{\epsilon}_i$ series. Clearly, the standardized innovations have no significant serial
correlations. In fact, the sample autocorrelations of the squared series $\{\hat{\epsilon}_i^2\}$ are also small
with Q(10) = 6.20 and Q(20) = 11.16, further confirming the lack of serial dependence
in the normalized innovations. In addition, the mean and standard deviation of a
standardized Weibull distribution with $\alpha = 0.879$ are 1.00 and 1.14, respectively. These
numbers are close to the sample mean and standard deviation of $\{\hat{\epsilon}_i\}$, which are 1.01
and 1.22, respectively. The fitted model seems adequate.
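The theoretical standard deviation quoted for the standardized Weibull distribution follows from the first two moments of a Weibull variable rescaled to unit mean. A minimal sketch, with a hypothetical function name:

```python
import math


def std_weibull_moments(alpha):
    """Mean and standard deviation of a Weibull(shape=alpha) variable rescaled to mean 1."""
    g1 = math.gamma(1.0 + 1.0 / alpha)   # mean of the unit-scale Weibull
    g2 = math.gamma(1.0 + 2.0 / alpha)   # second moment of the unit-scale Weibull
    mean = 1.0                           # by construction of the standardization
    sd = math.sqrt(g2 / g1**2 - 1.0)
    return mean, sd


mean, sd = std_weibull_moments(0.879)
print(mean, round(sd, 2))
```

For α = 1 the same function returns a standard deviation of 1, the value for the standard exponential distribution of an EACD model.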
Figure 5.13. The sample autocorrelation function of adjusted durations for IBM stock traded
in the first five trading days of November 1990: (a) the adjusted series, and (b) the normalized
innovations for a WACD(1, 1) model.
[Plot omitted. Panels "Series : x" and "Series : epsi"; vertical axis ACF; horizontal axis Lag, 0 to 30.]

In model (5.43), the estimated coefficients show $\hat{\gamma}_1 + \hat{\omega}_1 \approx 0.949$, indicating
persistence in the adjusted durations. By Eq. (5.38), the expected adjusted duration is
0.169/(1 − 0.064 − 0.885) = 3.31 seconds, which is close to the sample mean 3.29
of the adjusted durations. The estimated $\alpha$ of the standardized Weibull distribution
is 0.879, which is less than but close to 1. Thus, the conditional hazard function is
monotonically decreasing at a slow rate.
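The claim about the hazard function can be illustrated with the hazard of a standardized Weibull variable, h(x) = α c^α x^(α−1) with c = Γ(1 + 1/α). This form follows from the Weibull density and survival function. The sketch below evaluates it on an arbitrary grid; the function name and the grid are assumptions made for illustration.

```python
import math


def std_weibull_hazard(x, alpha):
    """Hazard of a Weibull(shape=alpha) variable rescaled to have mean 1."""
    c = math.gamma(1.0 + 1.0 / alpha)
    return alpha * c**alpha * x ** (alpha - 1.0)


grid = [0.5, 1.0, 2.0, 4.0, 8.0]                      # arbitrary evaluation points
hazard = [std_weibull_hazard(x, 0.879) for x in grid]
print([round(h, 3) for h in hazard])
```

Because the exponent α − 1 = −0.121 is close to zero, the hazard falls only mildly over this sixteen-fold range of x. It would be constant for α = 1 and increasing for α > 1.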
If a generalized Gamma distribution function is used for the innovations, then the
fitted GACD(1, 1) model is

\[
x_i = \psi_i \epsilon_i, \qquad \psi_i = 0.141 + 0.063 x_{i-1} + 0.897 \psi_{i-1}, \tag{5.44}
\]

where $\{\epsilon_i\}$ follows a standardized, generalized Gamma distribution in Eq. (5.56)
with parameters $\kappa = 4.248\,(1.046)$ and $\alpha = 0.395\,(0.053)$, where the number in
parentheses denotes the estimated standard error. Standard errors of the three parameters
in Eq. (5.44) are 0.041, 0.010, and 0.019, respectively. All of the estimates are
statistically significant at the 1% level. Again, the normalized innovational process
$\{\hat{\epsilon}_i\}$ and its squared series have no significant serial correlation, where $\hat{\epsilon}_i = x_i/\hat{\psi}_i$
based on model (5.44). Specifically, for the $\hat{\epsilon}_i$ process, we have Q(10) = 4.95 and
Q(20) = 10.28. For the $\hat{\epsilon}_i^2$ series, we have Q(10) = 6.36 and Q(20) = 10.89.
The expected duration of model (5.44) is 3.52, which is slightly greater than that
of the WACD(1, 1) model in Eq. (5.43). Similarly, the persistence parameter $\hat{\gamma}_1 + \hat{\omega}_1$
of model (5.44) is also slightly higher at 0.96.
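The comparison of the two fitted models rests on simple arithmetic with Eq. (5.38), which the following sketch reproduces from the published coefficients. The function name `acd11_summary` is hypothetical.

```python
def acd11_summary(omega, gamma1, omega1):
    """Return (persistence, expected duration) implied by an ACD(1,1) fit."""
    persistence = gamma1 + omega1
    return persistence, omega / (1.0 - persistence)   # Eq. (5.38)


wacd_fit = acd11_summary(0.169, 0.064, 0.885)   # coefficients of Eq. (5.43)
gacd_fit = acd11_summary(0.141, 0.063, 0.897)   # coefficients of Eq. (5.44)
print(wacd_fit, gacd_fit)
```

The rounded coefficients give an expected duration of about 3.525 for model (5.44), consistent with the reported 3.52 up to the rounding of the estimates.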

Remark: Estimation of EACD models can be carried out by using programs for
ARCH models with some minor modification; see Engle and Russell (1998). In this
book, we use either the RATS program or some Fortran programs developed by the
author to estimate the duration models. Limited experience indicates that it is harder
to estimate a GACD model than an EACD or a WACD model. RATS programs used
to estimate WACD and GACD models are given in Appendix C.

5.6 NONLINEAR DURATION MODELS

Nonlinear features are also commonly found in high-frequency data. As an illustration,
we apply some nonlinearity tests discussed in Chapter 4 to the normalized
innovations $\hat{\epsilon}_i$ of the WACD(1, 1) model for the IBM transaction durations in
Example 5.4; see Eq. (5.43). Based on an AR(4) model, the test results are given in
part (a) of Table 5.7. As expected from the model diagnostics of Example 5.4, the
Ori-F test indicates no quadratic nonlinearity in the normalized innovations. However,
the TAR-F test statistics suggest strong nonlinearity.
Based on the test results in Table 5.7, we entertain a threshold duration model
with two regimes for the IBM intraday durations. The threshold variable is $x_{i-1}$ (i.e.,
the lag-1 adjusted duration). The estimated threshold value is 3.79. The fitted threshold
WACD(1, 1) model is $x_i = \psi_i \epsilon_i$, where

\[
\psi_i = \left\{ \begin{array}{lll}
0.020 + 0.257 x_{i-1} + 0.847 \psi_{i-1}, & \epsilon_i \sim w(0.901) & \text{if } x_{i-1} \le 3.79, \\
1.808 + 0.027 x_{i-1} + 0.501 \psi_{i-1}, & \epsilon_i \sim w(0.845) & \text{if } x_{i-1} > 3.79,
\end{array} \right. \tag{5.45}
\]

Table 5.7. Nonlinearity Tests for IBM Transaction Durations from November 1 to
November 7, 1990. Only Intraday Durations Are Used. The Number in Parentheses
for the Tar-F Tests Denotes the Time Delay.

(a) Normalized innovations of a WACD(1,1) model
Type       Ori-F    Tar-F(1)   Tar-F(2)   Tar-F(3)   Tar-F(4)
Test       0.343    3.288      3.142      3.128      0.297
p value    0.969    0.006      0.008      0.008      0.915

(b) Normalized innovations of a threshold WACD(1,1) model
Type       Ori-F    Tar-F(1)   Tar-F(2)   Tar-F(3)   Tar-F(4)
Test       0.163    0.746      1.899      1.752      0.270
p value    0.998    0.589      0.091      0.119      0.929

where $w(\alpha)$ denotes a standardized Weibull distribution with parameter $\alpha$. The numbers
of observations in the two regimes are 2503 and 1030, respectively. In Eq. (5.45),
the standard errors of the parameters for the first regime are 0.043, 0.041, 0.024,
and 0.014, whereas those for the second regime are 0.526, 0.020, 0.147, and 0.020,
respectively.
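Dividing each estimate in Eq. (5.45) by its standard error gives rough t ratios for the two regimes. The sketch below assumes that the standard errors are listed in the order ω, γ1, ω1, α, which the text does not state explicitly, and uses the conventional cutoff 1.96 as an illustrative threshold.

```python
names = ["omega", "gamma1", "omega1", "alpha"]

# (estimates, standard errors) per regime; the parameter order is an assumption.
regimes = {
    "regime 1": ([0.020, 0.257, 0.847, 0.901], [0.043, 0.041, 0.024, 0.014]),
    "regime 2": ([1.808, 0.027, 0.501, 0.845], [0.526, 0.020, 0.147, 0.020]),
}


def t_ratios(estimates, std_errors):
    """Element-wise ratio of estimates to their standard errors."""
    return [est / se for est, se in zip(estimates, std_errors)]


flagged = []
for label, (est, se) in regimes.items():
    for name, t in zip(names, t_ratios(est, se)):
        if abs(t) < 1.96:                 # illustrative 5% two-sided cutoff
            flagged.append((label, name))

print(flagged)
```

Under this ordering, the constant of the first regime and the coefficient of $x_{i-1}$ in the second regime have small t ratios.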
Consider the normalized innovations $\hat{\epsilon}_i = x_i/\hat{\psi}_i$ of the threshold WACD(1, 1)
model in Eq. (5.45). We obtain Q(12) = 9.8 and Q(24) = 23.9 for $\hat{\epsilon}_i$ and Q(12) =
8.0 and Q(24) = 16.7 for $\hat{\epsilon}_i^2$. Thus, there are no significant serial correlations in the
$\hat{\epsilon}_i$ and $\hat{\epsilon}_i^2$ series. Furthermore, applying the same nonlinearity tests as before to this
newly normalized innovational series $\hat{\epsilon}_i$, we detect no nonlinearity; see part (b) of
Table 5.7. Consequently, the two-regime threshold WACD(1, 1) model in Eq. (5.45)
is adequate.
If we classify the two regimes as heavy and thin trading periods, then the threshold
model suggests that the dynamics of the transaction durations
are different between heavy and thin trading periods for IBM stock even after the
adjustment of the diurnal pattern. This is not surprising, as market activities are often
driven by arrivals of news and other information.
The estimated threshold WACD(1, 1) model in Eq. (5.45) contains some insignificant
parameters. We refine the model and obtain the result:

\[
\psi_i = \left\{ \begin{array}{lll}
0.225 x_{i-1} + 0.867 \psi_{i-1}, & \epsilon_i \sim w(0.902) & \text{if } x_{i-1} \le 3.79, \\
1.618 + 0.614 \psi_{i-1}, & \epsilon_i \sim w(0.846) & \text{if } x_{i-1} > 3.79.
\end{array} \right.
\]

All of the estimates of the refined model are highly significant. The Ljung–Box
statistics of the standardized innovations $\hat{\epsilon}_i = x_i/\hat{\psi}_i$ show Q(10) = 5.91 (0.82)
and Q(20) = 16.04 (0.71) and those of $\hat{\epsilon}_i^2$ give Q(10) = 5.35 (0.87) and Q(20) =
15.20 (0.76), where the number in parentheses is the p value. Therefore, the refined
model is adequate. The RATS program used to estimate the prior model is given in
Appendix C.
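The regime-switching recursion of the refined model can be sketched as follows. This is an illustration rather than the RATS program of Appendix C: the function name, the starting value `psi0`, and the three input durations are all hypothetical.

```python
def threshold_psi(x, low, high, threshold, psi0):
    """psi_i for a two-regime threshold ACD(1,1); the regime is chosen by x_{i-1}."""
    psi = [psi0]
    for i in range(1, len(x)):
        # Each regime is a tuple (omega, gamma1, omega1).
        omega, gamma1, omega1 = low if x[i - 1] <= threshold else high
        psi.append(omega + gamma1 * x[i - 1] + omega1 * psi[i - 1])
    return psi


refined_low = (0.0, 0.225, 0.867)    # regime with x_{i-1} <= 3.79
refined_high = (1.618, 0.0, 0.614)   # regime with x_{i-1} > 3.79

# Hypothetical adjusted durations and starting value, for illustration only.
psi = threshold_psi([2.0, 5.0, 1.0], refined_low, refined_high, 3.79, psi0=3.0)
print(psi)
```

In the high regime the lagged duration drops out, so the conditional expected duration depends only on its own past value.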

5.7 BIVARIATE MODELS FOR PRICE CHANGE AND DURATION

In this section, we introduce a model that considers jointly the process of price
change and the associated duration. As mentioned before, many intraday transactions
of a stock result in no price change. Those transactions are highly relevant to trading
intensity, but they do not contain direct information on price movement. Therefore,
to simplify the complexity involved in modeling price change, we focus on transac-
tions that result in a price change and consider a price change and duration (PCD)
model to describe the multivariate dynamics of price change and the associated time
duration.
We continue to use the same notation as before, but the definition is changed to
transactions with a price change. Let $t_i$ be the calendar time of the ith price change
of an asset. As before, $t_i$ is measured in seconds from midnight of a trading day. Let
$P_{t_i}$ be the transaction price when the ith price change occurred and $\Delta t_i = t_i - t_{i-1}$ be
the time duration between price changes. In addition, let $N_i$ be the number of trades
in the time interval $(t_{i-1}, t_i)$ that result in no price change. This new variable is used
to represent trading intensity during a period of no price change. Finally, let $D_i$ be
the direction of the ith price change with $D_i = 1$ when the price goes up and $D_i = -1$
when the price comes down, and let $S_i$ be the size of the ith price change measured
in ticks. Under the new definitions, the price of a stock evolves over time by

\[
P_{t_i} = P_{t_{i-1}} + D_i S_i, \tag{5.46}
\]