Assume the lemma is true for all derivations with fewer than k steps. Assume UV →* W contains k steps. As above, assume the first step is UV → U′V, where U → U′. Note that U′V →* W uses only k − 1 steps. By induction there are derivations U′ →* W1, V →* W2 containing at most k − 1 steps. Therefore U → U′, U′ →* W1, V →* W2 give the required derivations.

One of the results of this lemma is that we can get from UV to W by the derivations

UV →* W1V →* W1W2

where W = W1W2, since if Xα → Xβ is a derivation, then so are XαV → XβV and UXα → UXβ.

Grammars

140

The next lemma shows us that in applying productions, the order in which we apply them is not particularly important. In fact we can derive any word w in the language generated by Γ by replacing the leftmost nonterminal using a production at each step in the derivation. This is called a leftmost derivation of w.

Lemma 4.3 Let w ∈ Γ(L), the language generated by Γ = (N, T, S, P). There exists a leftmost derivation of w.

Proof The proof uses induction on the number of steps n in the derivation. If n = 1, the derivation S → w is obviously a leftmost derivation. If n = k > 1 and S →* w, let the first step be S → UV, where U, V ∈ (N ∪ T)*. By Lemma 4.2, there exist derivations U →* w1 and V →* w2, where w = w1w2. Since both of these derivations contain fewer than k steps, there exist leftmost derivations U →* w1 and V →* w2. Then S → UV →* w1V →* w1w2 is a leftmost derivation of w.

The following lemma shows that if the language L generated by a context-free grammar does not contain the empty word λ, then L can be generated by a context-free grammar which does not contain any productions of the form A → λ, which we shall call a λ production. The only purpose of such a production is to remove A from the string of symbols. For example, if we have productions C → abBaAaaa and A → λ, we can then derive C →* abBaaaa. We could simply remove A → λ and add the production C → abBaaaa. We would have to do this wherever A occurs in a production. For example, if we have the production C → abAaAba and A → λ is removed, we would have to include C → abAaba, C → abaAba, and C → ababa.

Suppose we have productions A → aB, B → C, C → aa, C → λ. If we add the production B → λ, we have created a new λ production. If we just remove C → λ, we can no longer derive a. A nonterminal X is called nilpotent if X →* λ. We solve the problem above by removing all nilpotents and not just those appearing directly in λ productions. Thus we would also add the production A → a above, since B is nilpotent.

In the next lemma it is necessary to be able to determine the nilpotents of a grammar. The following algorithm determines the set 𝒩 of all nilpotents in a grammar by examining the productions.

(1) If A → λ is a production, then A ∈ 𝒩.
(2) If A → A1A2 … An is a production where A1, A2, …, An ∈ 𝒩, then A ∈ 𝒩.
(3) Continue (2) until no new elements are added to 𝒩.
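The fixed-point computation in steps (1)–(3) can be sketched as follows; this is a minimal illustration, assuming productions are given as (head, body) pairs of single-character symbols, with "" standing for λ.

```python
# Sketch of the nilpotent-finding algorithm above. Productions are
# (head, body) pairs; the empty word λ is written "".
def nilpotents(productions):
    nil = {a for a, body in productions if body == ""}   # step (1)
    changed = True
    while changed:                                       # step (3)
        changed = False
        for a, body in productions:
            # step (2): every symbol of the body is already nilpotent
            if a not in nil and body and all(s in nil for s in body):
                nil.add(a)
                changed = True
    return nil

# A -> aB, B -> C, C -> aa, C -> λ  (the example from the text)
prods = [("A", "aB"), ("B", "C"), ("C", "aa"), ("C", "")]
print(sorted(nilpotents(prods)))   # B and C are nilpotent; A is not
```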

4.2 Chomsky normal form and Greibach normal form 141

Lemma 4.4 Let Γ be a grammar such that Γ(L) does not contain the empty word. Form a grammar Γ′ by beginning with the productions in Γ and

(i) removing all λ productions;
(ii) for each production A → w in Γ, where w = w1X1w2X2 … wnXn contains the (not necessarily distinct) nilpotents X1, X2, …, Xn, letting P be the power set of {1, 2, …, n} and, for each p ∈ P, forming the productions A → wp, where wp is the string w with the {Xi : i ∈ p} removed, excluding those which produce λ productions.

The language generated by Γ′ is equal to the language generated by Γ.

Proof Let L be the language generated by Γ = (N, T, S, P), and L′ be the language generated by Γ′ = (N, T, S, P′). The language L′ ⊆ L, since any production in Γ′ which is not in Γ can be replaced with the original productions in Γ used to define it.

To show L ⊆ L′, let w ∈ L. Using induction on the number of steps n in the derivation, we show that if S →* w using productions in P, then S →* w using productions in P′. If n = 1, then S → w is obviously a production in P′ since w ≠ λ. Assume n = k and S →* w is a derivation in Γ containing k steps. Let S → A1A2A3 … Am be the first step of the derivation, where Ai ∈ N ∪ T, so that S → A1A2A3 … Am →* w is the derivation S →* w.

By Lemma 4.2, there exist derivations Ai →* wi in Γ, for 1 ≤ i ≤ m, where w = w1w2 … wm and each derivation has fewer than k steps. By induction, if wi ≠ λ there exists a derivation Ai →* wi in Γ′ for 1 ≤ i ≤ m (note that if Ai is a terminal then Ai = wi and Ai →* wi has 0 steps). Let A′i = Ai if wi ≠ λ and A′i = λ if wi = λ. Then S → A′1A′2A′3 … A′m is a production in Γ′ and

S → A′1A′2A′3 … A′m →* w1A′2A′3 … A′m →* w1w2A′3 … A′m →* ⋯ →* w1w2 … wm

is a derivation in Γ′.

For future reference, we point out that if one follows the proof above carefully, one finds that if Γ(L) had contained the empty word, the only difference between Γ(L) and Γ′(L) would be that Γ(L) contained the empty word, while Γ′(L) does not. The nonterminal S is a nilpotent that does not get removed; however, if S → λ is a production, that production is removed.
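The construction of Lemma 4.4 can be sketched in the same spirit; a minimal illustration, assuming single-character symbols, productions as (head, body) pairs with "" for λ, and a precomputed set nil of nilpotent nonterminals.

```python
from itertools import combinations

# Sketch of the construction in Lemma 4.4: drop λ productions, and for
# every subset of nilpotent occurrences in a body, add the production
# with that subset deleted (unless the result would be a λ production).
def remove_lambda_productions(productions, nil):
    new = set()
    for a, body in productions:
        if body == "":                       # (i) drop λ productions
            continue
        # positions of (not necessarily distinct) nilpotents in body
        pos = [i for i, s in enumerate(body) if s in nil]
        # (ii) for every subset p of those positions, delete {X_i : i in p}
        for r in range(len(pos) + 1):
            for p in combinations(pos, r):
                w = "".join(s for i, s in enumerate(body) if i not in p)
                if w:                        # exclude λ productions
                    new.add((a, w))
    return new

# C -> abAaAba with nilpotent A, as in the text
print(sorted(remove_lambda_productions([("C", "abAaAba"), ("A", "")], {"A"})))
```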

We now start making progress toward Chomsky normal form. We show that given a grammar we can determine another grammar which generates the same language and has no productions of the form A → B, where A, B ∈ N. These productions are called trivial productions since they simply relabel nonterminals. The process is simple. If A → B and B → W, where W ∈ (T ∪ N)*, we remove A → B and include A → W. More generally, if A1 → A2 → A3 → ⋯ → Am → B, where each step is a trivial production and B → w, then we remove the trivial productions and include A1 → w.

Lemma 4.5 If Γ(L), the language generated by Γ = (N, T, S, P), does not contain the empty word, then there exists a grammar Γ′ with no λ productions and no trivial productions such that Γ(L) = Γ′(L).

Proof First assume Γ has had the λ productions removed as shown in the previous lemma. Create Γ′ by removing all of the trivial productions and, wherever A1 → A2 → A3 → ⋯ → Am → B occurs, where each step is a trivial production and B → w, including the production A1 → w.

By construction, Γ′(L) ⊆ Γ(L).

Conversely, assume S →* w occurs in Γ. We use induction on the number of trivial productions to show that there is a derivation S →* w in Γ′. Obviously if there is no trivial production then the derivation is in Γ′. Assume there are k trivial productions in the derivation, and assume that the derivation is a leftmost derivation of w. Assume S →* w has the form

S →* V1A1V2 →* w1A1V2 → w1A2V2 → w1A3V2 → ⋯ → w1AmV2 → w1w′V2 →* w1w′w2

where A1 → A2 → A3 → ⋯ → Am is the last sequence of trivial productions in the derivation, and Am → w′. Then there are derivations V1 →* w1, V2 →* w2 in Γ, and

S →* V1A1V2 →* w1A1V2 → w1w′V2 →* w1w′w2

has fewer trivial productions, and all of its productions are in Γ ∪ Γ′. Hence by the induction hypothesis there is a derivation S →* w in Γ′.
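The removal of trivial productions in Lemma 4.5 can be sketched as follows; a minimal illustration, assuming nonterminals are single uppercase characters and λ productions have already been removed.

```python
# Sketch of trivial (unit) production removal: first compute, for each
# nonterminal C, every nonterminal reachable from C by trivial
# productions alone; then attach each non-trivial body to all of them.
def remove_trivial_productions(productions, nonterminals):
    reach = {a: {a} for a in nonterminals}
    changed = True
    while changed:
        changed = False
        for a, body in productions:
            if body in nonterminals:          # a trivial production A -> B
                for c in nonterminals:
                    if a in reach[c] and body not in reach[c]:
                        reach[c].add(body)
                        changed = True
    return {(c, body)
            for a, body in productions if body not in nonterminals
            for c in nonterminals if a in reach[c]}

# A -> B, B -> C, C -> aa  collapses to  A -> aa, B -> aa, C -> aa
print(sorted(remove_trivial_productions(
    [("A", "B"), ("B", "C"), ("C", "aa")], {"A", "B", "C"})))
```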

Lemma 4.6 If Γ(L), the language generated by Γ = (N, T, S, P), does not contain the empty word, then there exists a grammar Γ′ = (N′, T, S, P′) in which every production either has the form A → A1A2A3 … Am for m ≥ 2, where A, A1, A2, A3, …, Am are nonterminals, or the form A → a, where A is a nonterminal and a is a terminal, such that Γ(L) = Γ′(L).

Proof Assume all λ productions and all trivial productions have been eliminated. Thus all productions are of the form A → A1A2A3 … Am, where m ≥ 2 and Ai ∈ N ∪ T, or A → a, where A is a nonterminal and a is a terminal. If A1, A2, A3, …, Am are all nonterminals then the production has the proper form. If not, for each Ai = ai, where ai is a terminal, form a new nonterminal Xai. Replace A → A1A2A3 … Am with A → A′1A′2A′3 … A′m, where A′i = Ai if Ai is a nonterminal and A′i = Xai if Ai is a terminal. Thus if we have V1a1V2a2V3a3 … VnanVn+1, where Vi ∈ N* and each ai is a terminal, we replace it with V1Xa1V2Xa2V3Xa3 … VnXanVn+1 and add the productions Xai → ai for 1 ≤ i ≤ n. Let Γ′ = (N′, T, S, P′) be the new grammar formed. We need to show that Γ′(L) = Γ(L). Clearly Γ(L) ⊆ Γ′(L) since

A → V1a1V2a2V3a3 … VnanVn+1

in Γ can be replaced by

A → V1Xa1V2Xa2V3Xa3 … VnXanVn+1
  → V1a1V2Xa2V3Xa3 … VnXanVn+1
  → V1a1V2a2V3Xa3 … VnXanVn+1
  ⋮
  → V1a1V2a2V3a3 … VnanVn+1

in Γ′.

Conversely, assume that a derivation

S →* w1ww2

in Γ′ uses a production

A → V1Xa1V2Xa2 … VnXanVn+1

which is in Γ′ and not in Γ, where U →* w1, V →* w2, Vi →* vi, and A →* w = v1a1v2a2 … vnanvn+1, so that the derivation is

S →* U A V
  →* w1 A V
  → w1 V1Xa1V2Xa2 … VnXanVn+1 V
  →* w1 v1 Xa1V2Xa2 … VnXanVn+1 V
  → w1 v1 a1 V2Xa2 … VnXanVn+1 V
  →* w1 v1 a1 v2 Xa2 … VnXanVn+1 V
  → w1 v1 a1 v2 a2 V3Xa3 … VnXanVn+1 V
  ⋮
  → w1 v1 a1 v2 a2 v3 a3 … vn an Vn+1 V
  →* w1 v1 a1 v2 a2 v3 a3 … vn an vn+1 V
  →* w1 w w2.

This may be replaced by

S →* U V1 a1 V2 a2 … Vn an Vn+1 V
  →* w1 V1 a1 V2 a2 … Vn an Vn+1 V
  →* w1 v1 a1 V2 a2 … Vn an Vn+1 V
  ⋮
  →* w1 v1 a1 v2 a2 … an Vn+1 V
  →* w1 v1 a1 v2 a2 … an vn+1 V
  →* w1 v1 a1 v2 a2 … an vn+1 w2 = w1 w w2.

We now have a derivation for S →* w1ww2 that uses one fewer occurrence of a production not in Γ; repeating the process yields a derivation in Γ. Hence Γ′(L) ⊆ Γ(L).
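The construction of Lemma 4.6 can be sketched as follows; a minimal illustration, assuming terminals are lowercase single characters and writing the new nonterminal Xa as the two-character string "Xa".

```python
# Sketch of Lemma 4.6: in every right-hand side of length at least 2,
# each terminal a is replaced by a fresh nonterminal X_a (written "Xa"),
# and the production X_a -> a is added. Productions of the form A -> a
# are left alone.
def lift_terminals(productions):
    new, extra = set(), set()
    for a, body in productions:
        if len(body) >= 2:
            lifted = ""
            for s in body:
                if s.islower():              # a terminal inside a long body
                    lifted += "X" + s
                    extra.add(("X" + s, s))  # production X_a -> a
                else:
                    lifted += s
            new.add((a, lifted))
        else:
            new.add((a, body))               # A -> a stays as it is
    return new | extra

# A -> aB becomes A -> Xa B together with Xa -> a
print(sorted(lift_terminals({("A", "aB"), ("B", "b")})))
```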

From the above lemmas we are now able to prove that a context-free grammar whose language does not contain the empty word can be expressed in Chomsky normal form.

Lemma 4.7 If Γ(L), the language generated by Γ = (N, T, S, P), does not contain the empty word, then there exists a grammar Γ′ in which every production has either the form

A → BC

or

A → a

where A, B, and C are nonterminals and a is a terminal, such that Γ(L) = Γ′(L).

Proof By the previous lemma, there is a grammar Γ in which every production has either the form A → A1A2A3 … Am, where A, A1, A2, A3, …, Am are nonterminals, or the form A → a, where A is a nonterminal and a is a terminal. We construct a new grammar Γ′ by replacing every production of the form A → A1A2A3 … Am with m > 2 by the set of productions A → A1X1, X1 → A2X2, …, Xm−2 → Am−1Am, where each replacement of a production in Γ uses a new set of symbols. Since

A → A1X1 → A1A2X2 →* A1A2A3 … Am

is a derivation in Γ′, Γ(L) ⊆ Γ′(L).

Conversely, if a derivation S →* w in Γ′ contains no productions which are not in Γ, then w ∈ Γ(L). If it does, let Wm be the last term in the derivation containing a symbol which is not in Γ, so that we have Wm → Wm+1 →* w, and Wm → Wm+1 in Γ′ has the form U Xm−2 V → U Am−1Am V. Therefore the derivation uses the set of productions A → A1X1, X1 → A2X2, …, Xm−2 → Am−1Am and has the form

S →* U′ A V′
  → U′ A1X1 V′
  →* U′ A1A2X2 V′
  →* U′ A1A2A3X3 V′
  ⋮
  →* U′ A1 ⋯ Am−2 Xm−2 V′
  →* U Xm−2 V
  → U Am−1Am V = Wm+1
  →* w,

where U′ A1 ⋯ Am−2 →* U and V′ →* V are derivations in Γ′. Replacing the chain of productions A → A1X1, X1 → A2X2, …, Xm−2 → Am−1Am by the single production A → A1A2 ⋯ Am of Γ yields a derivation S →* w containing fewer symbols that are not in Γ. If this derivation S →* w is not in Γ, we again pick the last term in the derivation containing a symbol in Γ′ which is not in Γ, and continue the process until no such terms are left. Therefore w ∈ Γ(L) and Γ′(L) ⊆ Γ(L).
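The splitting step of Lemma 4.7 can be sketched as follows; a minimal illustration in which production bodies are tuples of symbols, so that fresh names such as "X1" cannot collide with the original single-character symbols.

```python
# Sketch of the final Chomsky-normal-form step: each production
# A -> A1 A2 ... Am with m >= 3 is replaced by the chain
# A -> A1 X1, X1 -> A2 X2, ..., X_{m-2} -> A_{m-1} A_m,
# using fresh nonterminals for each replaced production.
def binarize(productions):
    new, counter = set(), 0
    for a, body in productions:
        if len(body) <= 2:
            new.add((a, body))
            continue
        head = a
        for sym in body[:-2]:
            counter += 1
            fresh = f"X{counter}"
            new.add((head, (sym, fresh)))      # A -> A1 X1, X1 -> A2 X2, ...
            head = fresh
        new.add((head, (body[-2], body[-1])))  # X_{m-2} -> A_{m-1} A_m
    return new

# A -> B C D becomes A -> B X1 and X1 -> C D
print(sorted(binarize({("A", ("B", "C", "D"))})))
```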

Finally we remove the restriction that Γ(L) does not contain the empty word. As mentioned following the proof of Lemma 4.4, if Γ(L) contained the empty word, then eliminating λ productions produced the same language with only the empty word eliminated. Since all of the languages of the forms of grammars developed since Lemma 4.4 are the same, if Γ(L) contained the empty word, the language generated by the grammar in the previous lemma would differ from Γ(L) only in the fact that Γ(L) contained the empty word while Γ′(L) did not. Thus to get Γ(L) we need only add productions that put the empty word into the language and leave the rest of the language alone. We do this by adding two new nonterminal symbols, S′ and ψ, where S′ is the new start symbol, together with the productions S′ → Sψ and ψ → λ. Call this the λ extended Chomsky normal form.

Theorem 4.4 Given a context-free grammar Γ whose language contains the empty word, there is a context-free grammar Γ′ in λ extended Chomsky normal form so that Γ(L) = Γ′(L).

We now consider converting a context-free grammar to Greibach normal form. Even though we use leftmost derivations, we have no bound on how many steps may occur before the first terminal symbol appears at the left of the string. For example, using the production A → Aa, we can generate the string Aa…a (with n copies of a) for arbitrary n, using n steps, without beginning the string with a terminal symbol. We can eliminate this particular problem by eliminating productions of the form A → Aa. This is called elimination of left recursion.

In a grammar with no λ productions or trivial productions, let

A → AV1, A → AV2, …, A → AVn

be the productions with A on the left in which the right-hand side begins with an A, and let

A → U1, A → U2, …, A → Um

be the productions with A on the left in which the right-hand side does not begin with an A. We form a new grammar Γ′ by adding a new nonterminal B to the grammar and using the following steps:

(1) Eliminate all productions of the form A → AVi for 1 ≤ i ≤ n.
(2) Form productions A → UiB for 1 ≤ i ≤ m.
(3) Form productions B → ViB and B → Vi for 1 ≤ i ≤ n.
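Steps (1)–(3) can be sketched as follows; a minimal illustration in which production bodies are tuples and the new nonterminal added for A is written A'.

```python
# Sketch of left-recursion elimination for one nonterminal a, assuming
# the grammar has no λ or trivial productions. The A -> Ui productions
# survive step (1), since only the left-recursive ones are removed.
def eliminate_left_recursion(productions, a):
    rec = [body[1:] for h, body in productions if h == a and body[0] == a]
    non = [body for h, body in productions if h == a and body[0] != a]
    new = {(h, body) for h, body in productions if h != a}
    b = a + "'"                    # the new nonterminal for A
    for u in non:
        new.add((a, u))            # A -> Ui is kept
        new.add((a, u + (b,)))     # step (2): A -> Ui A'
    for v in rec:
        new.add((b, v + (b,)))     # step (3): A' -> Vi A'
        new.add((b, v))            #           A' -> Vi
    return new

# A -> Aa | b  becomes  A -> b | b A',  A' -> a A' | a
print(sorted(eliminate_left_recursion({("A", ("A", "a")), ("A", ("b",))}, "A")))
```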

Lemma 4.8 Γ′(L) = Γ(L).

Proof Consider a derivation in Γ beginning with A of the form

A → AV(1) → AV(2)V(1) →* AV(k) … V(2)V(1) → U(i)V(k) … V(2)V(1),

where V(j) ∈ {V1, V2, …, Vn} for all 1 ≤ j ≤ k and U(i) ∈ {U1, U2, …, Um}. Therefore, using leftmost derivation, any derivation containing such a use of A will have the form

wAW → wAV(1)W → wAV(2)V(1)W →* wAV(k) … V(2)V(1)W → wU(i)V(k) … V(2)V(1)W.

But

A → AV(1) → AV(2)V(1) →* AV(k) … V(2)V(1) → U(i)V(k) … V(2)V(1)

can be replaced by

A → U(i)B → U(i)V(k)B → U(i)V(k)V(k−1)B →* U(i)V(k) … V(2)B → U(i)V(k) … V(2)V(1).

Placing w on the left and W on the right of each term, we have wAW →* wU(i)V(k) … V(2)V(1)W in Γ′. Hence Γ(L) ⊆ Γ′(L).

The proof that Γ′(L) ⊆ Γ(L) is left to the reader.

Lemma 4.9 Let A → UBV be a production in Γ and let B → W1, B → W2, …, B → Wm be the set of all productions in Γ with B on the left. Let Γ′ be the grammar with the production A → UBV removed and the productions A → UWiV for 1 ≤ i ≤ m added. Then Γ′(L) = Γ(L).

Proof The production A → UWiV can always be replaced by the production A → UBV followed by the production B → Wi. Hence Γ′(L) ⊆ Γ(L). The proof that Γ(L) ⊆ Γ′(L) is left to the reader.
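The substitution of Lemma 4.9 can be sketched as follows; a minimal illustration with bodies as tuples, where pos is the index of B in the body of the production A → UBV.

```python
# Sketch of Lemma 4.9: remove the production A -> U B V and add
# A -> U Wi V for every production B -> Wi.
def substitute(productions, prod, pos):
    a, body = prod                     # A -> U B V, with B at index pos
    b = body[pos]
    rest = {p for p in productions if p != prod}
    for h, w in productions:
        if h == b:
            rest.add((a, body[:pos] + w + body[pos + 1:]))
    return rest

# Replace B inside A -> a B c by the bodies of B -> x and B -> y z
prods = {("A", ("a", "B", "c")), ("B", ("x",)), ("B", ("y", "z"))}
print(sorted(substitute(prods, ("A", ("a", "B", "c")), 1)))
```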


The theorem that every context-free grammar can be expressed in Greibach normal form can be proved by first expressing the grammar in Chomsky normal form. We shall not do so, however, so that the development of Chomsky normal form may be omitted if desired. Using the above lemmas, we are about to take a giant leap toward proving that every context-free grammar can be expressed in Greibach normal form.

Lemma 4.10 Any context-free grammar which does not generate λ can be expressed so that each of its productions is of the form

A → aW,

where a is a terminal and W is a string which is empty or consists of a string of terminals and/or nonterminals.

Proof We first order the nonterminals beginning with S, the start symbol. For simplicity, let the nonterminals be A1, A2, A3, …, Am. Our first goal is to change every production so that it is either in the form

A → aW,

where a is a terminal and W is a string which is empty or consists of a string of terminals and/or nonterminals, or in the form

Ai → AjY,

where i < j and Y consists of a string of terminals and/or nonterminals. Recall that using the procedures for elimination of left recursion and for eliminating a nonterminal described in Lemma 4.9 to alter the productions of the grammar does not change the language generated by the grammar.

Using induction, for i = 1, since S = A1 is automatically less than every other nonterminal, we need only consider S → SY; the S on the right-hand side of the production can be removed by the process of elimination of left recursion. Assume the statement is true for every Ai where i < k. We now prove it for i = k. In each case where Ak → AjY is a production for k > j, use the procedure in Lemma 4.9 to eliminate Aj. When Ak → AkY is a production, use the process of elimination of left recursion to remove Ak from the right-hand side.

Therefore by induction every production is either in the form

A → aW,

where a is a terminal and W is a string which is empty or consists of a string of terminals and/or nonterminals, or in the form

Ai → AjY,

where i < j and Y consists of a string of terminals and/or nonterminals.

Any production with Am on the left-hand side must have the form Am → aW, since there is no nonterminal larger than Am. If there is a production of the form Am−1 → AmW, use the procedure in Lemma 4.9 to eliminate Am. The result is a set of productions of the form Am−1 → bW′, where b is a terminal. In general, assume k is the largest value such that Ak → AjY is a production where k < j. Again using the procedure in Lemma 4.9 to eliminate Aj, we obtain productions of the form Ak → aW. When the process is completed, we have

Ai → aW,

where a is a terminal and W is a string which is empty or consists of a string of terminals and/or nonterminals, for every i. We now have to consider the nonterminals Bi created by the process of elimination of left recursion. From the construction of the Bi, it is impossible to have a production of the form Bi → BjW. Therefore productions with Bi on the left have the form Bi → aW or Bi → AjW. Repeating the process above we can change these to the form Bi → aW, and the lemma is proved.

Theorem 4.5 Every context-free grammar whose language does not contain λ can be expressed in Greibach normal form.

Proof We outline the proof. The details are left to the reader. Since we already know that every production can be written in the form

A → aW,

where a is a terminal and W is a string which is empty or consists of a string of terminals and/or nonterminals, for every terminal b in W replace it with a nonterminal Ab and add the production Ab → b. Hint: see the proof of Lemma 4.6.

For any context-free grammar containing the empty word we can form a grammar in extended Greibach normal form, which accepts the empty word, by simply adding the production S → λ after the completion of the Greibach normal form.


Exercises

(1) In Lemma 4.9 “Let A → UBV be a production in Γ and B → W1, B → W2, …, B → Wm be the set of all productions in Γ with B on the left. Let Γ′ be the grammar with the production A → UBV removed and the productions A → UWiV for 1 ≤ i ≤ m added; then Γ′(L) = Γ(L),” prove Γ(L) ⊆ Γ′(L).

(2) Prove Theorem 4.5 “Every context-free grammar can be expressed in

Greibach normal form.”

(3) Complete the proof of Lemma 4.8.

(4) Let Γ = (N, Σ, S, P) be the grammar defined by N = {S, A, B}, Σ = {a, b}, and P be the set of productions

S → ABABABA,  A → Aa,  A → λ,  B → b.

Express this grammar in Chomsky normal form.

(6) Let Γ = (N, T, S, P) be the grammar with N = {S, A, B}, T = {a, b}, and P contain the productions

S → SS,  S → BS,  S → SB,  S → ASA,  S → λ,  A → ab,  A → ba,  B → aa,  B → bb.

Express this grammar in Chomsky normal form.

(7) Express the previous grammar in Greibach normal form.

(8) Let Γ = (N, Σ, S, P) be the grammar defined by N = {S, A, B}, Σ = {a, b}, and P be the set of productions

S → AbaB,  A → bAa,  A → λ,  B → AAbBaabA.

Express this grammar in Chomsky normal form.

(9) Express the previous grammar in Greibach normal form.

4.3 Pushdown automata and context-free languages

The primary importance of the PDA is that a language is accepted by a PDA if and only if it is generated by a context-free grammar. Recall that a context-free language is a language that is generated by a context-free grammar. In the remainder of this section, we show that a language is context-free if and only if it is accepted by a PDA.

We first demonstrate how to construct a PDA that will accept the language generated by a context-free grammar.


Before beginning we need two tools. The first is the concept of pushing a string of stack symbols onto the stack. We are not changing the definition of the stack. To push a string onto the stack we simply mean that we push the last symbol of the string onto the stack, then the next to last symbol, and continue until the first symbol of the string has been pushed onto the stack. If the string is then popped a symbol at a time, the symbols form the original string. For example, to push abAc, we first push c, then push A, then push b, and finally push a. Thus we may consider a PDA to have the form M = (Σ, Q, s, I, Δ, F), where Σ is the alphabet, Q is the set of states, s is the initial or starting state, I is the set of stack symbols, F is the set of acceptance states, and Δ is the transition relation. The relation Δ is a finite subset of ((Q × Σ* × I*) × (Q × I*)). This means that the machine in some state q ∈ Q can read a possibly empty string over the alphabet (a letter at a time if the string is nonempty), pop a possibly empty string of stack symbols (a symbol at a time if the string is nonempty), and, as a result, change state and push a string of symbols onto the stack as described above.

Throughout the remainder of this section we shall assume that only leftmost derivations for context-free languages are used and that the PDA has only two states, s and t. The alphabet Σ of the PDA consists of the terminal symbols of the grammar Γ. The stack symbols of the PDA consist of the terminal and nonterminal symbols of the grammar Γ, i.e. I = T ∪ N.

To convert a context-free grammar Γ into a PDA which accepts the same language generated by Γ, we use the following rules:

(1) Begin by pushing S, the start symbol of the grammar, i.e. begin with the automaton

[diagram: Start → push S → state t]

(2) If a nonterminal A is popped from the stack, then for some production A → w in Γ, w is pushed onto the stack, i.e. we have the automaton in the figure

[diagram: pop A → push w]

(3) If a terminal a is popped from the stack, then a must be read, i.e. if we have the automaton

[diagram: pop a]

then it must be followed by

[diagram: read a]

Thus the terminal elements at the top of the stack are removed and matched with the letters from the tape.

The result is that we are imitating the productions in the grammar by popping the first part of the production and pushing the second part, so that while the grammar replaces the first part of the production with the second part, so does the PDA. The stack then resembles the strings derived in the grammar, except that the terminals on the left of the derived string (top of the stack) are removed as they occur in the stack and compared with the letters on the tape. As before, a word is accepted if the word has been read and the stack is empty.
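Rules (1)–(3) can be collected into a transition relation; a minimal sketch, writing a transition as ((state, string read, symbol popped), (state, string pushed)) with "" standing for λ, and using the two states "s" and "t".

```python
# Sketch of rules (1)-(3): push the start symbol; for each production
# A -> w, pop A and push w; for each terminal a, pop a and read a.
def grammar_to_pda(productions, nonterminals, terminals, start="S"):
    delta = {(("s", "", ""), ("t", start))}       # rule (1): push S
    for a, w in productions:                      # rule (2): pop A, push w
        delta.add((("t", "", a), ("t", w)))
    for a in terminals:                           # rule (3): pop a, read a
        delta.add((("t", a, a), ("t", "")))
    return delta

# The grammar S -> aSa | λ over {a}
print(sorted(grammar_to_pda({("S", "aSa"), ("S", "")}, {"S"}, {"a"})))
```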

Example 4.17 Let Γ = (N, T, S, P) be the grammar with N = {S}, T = {a, b}, and P contain the productions

S → aSa,  S → bSb,  S → λ,

which generates the language {wwᴿ : w ∈ T*}. This has the PDA

[diagram: start; push S; on pop S, push aSa, push bSb, or push nothing (S → λ); on pop a, read a; on pop b, read b; accept when the stack is empty]

Consider the word abba. We can trace its path with the following table:

state   instruction   stack   tape
s       start         λ       abba
t       push S        S       abba
t       pop S         λ       abba
t       push aSa      aSa     abba
t       pop a         Sa      abba
t       read a        Sa      bba
t       pop S         a       bba
t       push bSb      bSba    bba
t       pop b         Sba     bba
t       read b        Sba     ba
t       pop S         ba      ba
t       pop b         a       ba
t       read b        a       a
t       pop a         λ       a
t       read a        λ       λ
t       accept        λ       λ
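The trace above can be reproduced by a small nondeterministic search; a minimal sketch that collapses each pop/push or pop/read pair of the PDA into one recursive step.

```python
# Sketch of the acceptance check performed by the trace: pop a
# nonterminal and try pushing each right-hand side, or pop a terminal
# and match it against the next input letter; accept when both the
# input and the stack are empty.
def accepts(word, productions, nonterminals, stack="S"):
    if stack == "":
        return word == ""                  # empty stack, all input read
    top, rest = stack[0], stack[1:]
    if top in nonterminals:                # try every production top -> w
        return any(accepts(word, productions, nonterminals, w + rest)
                   for h, w in productions if h == top)
    # top is a terminal: it must match the next input letter
    return word != "" and word[0] == top and \
        accepts(word[1:], productions, nonterminals, rest)

prods = [("S", "aSa"), ("S", "bSb"), ("S", "")]
print(accepts("abba", prods, {"S"}))   # True: abba = w w^R with w = ab
print(accepts("abb", prods, {"S"}))    # False
```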

Example 4.18 Let Γ = (N, T, S, P) be the grammar with N = {S, A, B}, T = {a, b}, and P contain the productions

S → SS,  S → SB,  S → BS,  S → ASA,  S → λ,  A → ab,  A → ba,  B → aa,  B → bb,

which generates the language {w : w ∈ T* and w contains an even number of as and an even number of bs}. This has the PDA

[diagram: start; push S; on pop S, push SS, SB, BS, ASA, or nothing (S → λ); on pop A, push ab or ba; on pop B, push aa or bb; on pop a, read a; on pop b, read b; accept when the stack is empty]

Consider the word abbabb. We can trace its path with the following table. In this table, to save space, strings are pushed in one operation rather than pushing each symbol separately.

instruction   stack    tape
start         λ        abbabb
push S        S        abbabb
pop S         λ        abbabb
push SS       SS       abbabb
pop S         S        abbabb
push ASA      ASAS     abbabb
pop A         SAS      abbabb
push ab       abSAS    abbabb
pop a         bSAS     abbabb
read a        bSAS     bbabb
pop b         SAS      bbabb
read b        SAS      babb
pop S         AS       babb
pop A         S        babb
push ba       baS      babb
pop b         aS       babb
read b        aS       abb
pop a         S        abb
read a        S        bb
pop S         λ        bb
push SB       SB       bb
pop S         B        bb
pop B         λ        bb
push bb       bb       bb
pop b         b        bb
read b        b        b
pop b         λ        b
read b        λ        λ
accept        λ        λ

Before formally proving that a language is context-free if and only if it is accepted by a PDA, we shall adopt a notation for PDAs which will be more convenient. We shall denote by an ordered triple the current condition of the PDA. This triple consists of the current state of the machine, the remaining string of input symbols to be read, and the current string in the stack. For example, the triple (s, aabb, AaBaB) represents the PDA in state s, with aabb on the input tape, and AaBaB in the stack. Given triples (s, u, V) and (t, v, W), the notation (s, u, V) ⊢ (t, v, W) indicates that the PDA can be changed from (s, u, V) to (t, v, W) in a single transition from Δ. For example, the transition ((s, a, B), (t, λ)) when the PDA is in condition (s, ab, BaaB) gives (s, ab, BaaB) ⊢ (t, b, aaB), and the transition ((s, a, λ), (t, D)) when the PDA is in condition (s, ab, BaaB) gives (s, ab, BaaB) ⊢ (t, b, DBaaB). We say that (s, u, V) ⊢* (t, v, W) if the PDA can be changed from (s, u, V) to (t, v, W) in a finite number of transitions.

Lemma 4.11 Let Γ(L) be a context-free language. There exists a PDA

M = (Σ, Q, s, I, Δ, F)

that accepts Γ(L), where Σ is a finite alphabet, Q is a finite set of states, s is the initial or starting state, I is a finite set of stack symbols, Δ is the transition relation, and F is the set of acceptance states. The relation Δ is a subset of

((Q × Σλ × Iλ) × (Q × Iλ)),

where Σλ = Σ ∪ {λ} and Iλ = I ∪ {λ}.

Proof As previously mentioned, we shall assume that the PDA has two states, which we shall denote here as s and t, so that M = (Σ, Q, s, I, Δ, F), where Σ, the alphabet, consists of the terminal symbols T of the grammar Γ = (N, T, S, P); Q = {s, t}; s is the initial or starting state; I, the set of stack symbols, consists of the terminal and nonterminal symbols of the grammar, i.e. I = T ∪ N; F = {s, t}; and Δ is the transition relation defined as follows:

(1) ((s, λ, λ), (t, S)) ∈ Δ, so (s, u, λ) ⊢ (t, u, S) for u ∈ T*. Begin by pushing S, the start symbol of the grammar, i.e. the automaton begins with

[diagram: Start → push S → state t]

(2) If A → w is in Γ, then ((t, λ, A), (t, w)) ∈ Δ, so (t, u, A) ⊢ (t, u, w) for u ∈ T*. If a nonterminal A is popped from the stack, then for some production A → w in Γ, w is pushed onto the stack, i.e. we have in the automaton

[diagram: pop A → push w]

(3) For all a ∈ T, ((t, a, a), (t, λ)) ∈ Δ, so (t, au, aw) ⊢ (t, u, w) for u ∈ T*. If a terminal a is popped from the stack, then a must be read, i.e. if we have in the automaton

[diagram: pop a]

it must be followed by

[diagram: read a]

We shall assume that Γ is in Chomsky normal form. The reader is asked to prove the theorem when Γ is in Greibach normal form.

Using leftmost derivation, every string that is derived either begins with a terminal or contains no terminal. In the corresponding PDA, the derived string is placed in the stack using (2). If there is a terminal, it is compared with the next letter to be read as input. If they agree, then the terminal is removed from the stack using (3). If the word on the tape is the same as the word generated by the grammar, each terminal will be removed from the stack as it is generated, leaving an empty stack after the tape has been read.

We first show, assuming leftmost derivation in Γ, that if S →* αβ, where α ∈ T* and β begins with a nonterminal or is empty, then (t, α, S) ⊢* (t, λ, β). Hence if S →* α in Γ, where α ∈ T*, then (s, α, λ) ⊢ (t, α, S) ⊢* (t, λ, λ), and α is accepted by the PDA. We prove this using induction on the length of the derivation. Suppose n = 0; then we have S →* S, so α = λ, β = S, and (t, λ, S) ⊢* (t, λ, S) gives us (t, α, S) ⊢* (t, λ, β). Now assume S →* γ in k + 1 steps. Say

S → m1 → m2 →* mk → mk+1.

Then there is a first nonterminal B in the string mk and a production B → w, so mk = uBv and mk+1 = uwv. By the induction hypothesis, since S →* uBv, (t, u, S) ⊢* (t, λ, Bv). Since B → w, using relation (2) we have ((t, λ, B), (t, w)) ∈ Δ and (t, λ, Bv) ⊢ (t, λ, wv). If the production B → w has the form B → CD, where C and D are nonterminals, so that w = CD, then wv begins with a nonterminal and (t, u, S) ⊢* (t, λ, Bv) ⊢ (t, λ, wv), so (t, u, S) ⊢* (t, λ, wv), where mk+1 = uwv, as desired. If the production has the form B → a, so that w = a, where a is a terminal, then (t, ua, S) ⊢* (t, a, Bv) ⊢ (t, a, av) ⊢ (t, λ, v) using relation (3) above. Hence (t, uw, S) ⊢* (t, λ, v), where mk+1 = uwv, as desired. Note that since we are using leftmost derivation, v must begin with a nonterminal or be empty.

We now show that if (t, α, S) ⊢* (t, λ, β), with α ∈ T* and β ∈ I*, then S →* αβ. Hence if (s, α, λ) ⊢ (t, α, S) ⊢* (t, λ, λ), so that α is accepted by the PDA, then S →* α, so α is generated by the grammar.

We again use induction, on the length of the computation by the PDA. If k = 0, then (t, λ, S) ⊢* (t, λ, S), so S →* S, which is certainly true. Assume (t, α, S) ⊢* (t, λ, β) in k + 1 steps, so that (t, α, S) ⊢* (t, v, γ) in k steps and (t, v, γ) ⊢ (t, λ, β). If the transition for (t, v, γ) ⊢ (t, λ, β) is of type (2), then γ = Bδ, β = wδ, and B → w. Since no input is read, we have v = λ. Therefore by induction, S →* αγ = αBδ. Since B → w, αBδ → αwδ = αβ. Therefore S →* αβ. If the transition for (t, v, γ) ⊢ (t, λ, β) is of type (3), then v = a and γ = aβ for some terminal a. But since a is the last input read from the string α, α = ua for some u ∈ T*. Hence (t, u, S) ⊢* (t, λ, γ) and by induction S →* uγ = uaβ = αβ.

For a given pushdown automaton M = (Σ, Q, s, I, Δ, F), we next wish to construct a context-free grammar Γ = (N, T, S, P). The nonterminals in N shall be of the form ⟨p, B, q⟩, where p and q are states of the automaton and B is a stack symbol or λ. Thus ⟨p, B, q⟩ represents the input u read in passing from state p to state q while B is removed from the stack. In fact we shall have ⟨p, B, q⟩ →* u. The nonterminal ⟨p, λ, q⟩ represents the input read in passing from state p to state q leaving the stack as it was in state p. The productions consist of the following four types.

(1) For each q ∈ F, the production S → ⟨s, λ, q⟩.
(2) For each transition ((p, a, B), (q, D)) ∈ Δ, where B, D ∈ I ∪ {λ}, the productions ⟨p, B, t⟩ → a⟨q, D, t⟩ for all t ∈ Q.
(3) For each transition ((p, a, D), (q, B1B2 … Bn)) ∈ Δ, where D ∈ I ∪ {λ} and B1, B2, …, Bn ∈ I, the productions

⟨p, D, t⟩ → a⟨q, B1, q1⟩⟨q1, B2, q2⟩⟨q2, B3, q3⟩ … ⟨qn−1, Bn, t⟩

for all q1, q2, …, qn−1, t ∈ Q.
(4) For each q ∈ Q, the production ⟨q, λ, q⟩ → λ.

The first statement intuitively says that at the beginning we need to generate the entire word accepted by the PDA. The second statement intuitively says that the output generated by ⟨p, B, t⟩, which is the input to be read by the PDA in state p using stack B moving to state t, is equal on the right-hand side of the production to the input read in passing from state p with stack B to state q with stack D, followed by the output generated by ⟨q, D, t⟩, which is the input read by the PDA in state q with stack D moving to state t.

The third statement intuitively says that the output generated by ⟨p, D, t⟩, which is the input to be read by the PDA in state p with stack D moving to state t, is equal on the right-hand side of the production to the input read in passing from state p with stack D to state q with stack B1 B2 . . . Bn, followed by the output generated by ⟨q, B1, q1⟩, the input read in passing from state q using stack B1 to state q1, . . . , followed by the output generated by ⟨qn−1, Bn, t⟩, the input read in passing from state qn−1 using stack Bn to state t.

The fourth statement intuitively says that to move from a state to itself requires no input. Note that the productions of type (4) are the only ones which pop nonterminals without replacing them with other nonterminals. Hence a word in the language of the grammar cannot be generated without these productions.
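The four families of productions can be generated mechanically from the transition relation. The following Python sketch is illustrative only: the representation of Δ as tuples ((p, a, B), (q, γ)), with the empty string '' standing for λ and γ a string of stack symbols, is an assumption, not the book's notation.

```python
from itertools import product

def pda_to_cfg(states, finals, start_state, delta):
    """Build the productions of types (1)-(4) from a PDA transition
    relation.  Nonterminals are triples (p, B, q); the empty string ''
    stands for the empty word lambda."""
    prods = []
    # Type (1): S -> <s, lambda, q> for each accepting state q.
    for q in finals:
        prods.append(('S', [(start_state, '', q)]))
    for (p, a, B), (q, gamma) in delta:
        if len(gamma) <= 1:
            # Type (2): <p, B, t> -> a <q, D, t> for every state t,
            # where D is the (possibly empty) replacement symbol.
            for t in states:
                prods.append(((p, B, t), [a, (q, gamma, t)]))
        else:
            # Type (3): <p, B, t> -> a <q, B1, q1> ... <q_{n-1}, Bn, t>
            # for every choice of intermediate states q1..q_{n-1} and t.
            n = len(gamma)
            for qs in product(states, repeat=n):
                chain = (q,) + qs          # q0 = q, then q1..q_{n-1}, t
                rhs = [a] + [(chain[i], gamma[i], chain[i + 1])
                             for i in range(n)]
                prods.append(((p, B, chain[-1]), rhs))
    # Type (4): <q, lambda, q> -> lambda for each state q.
    for q in states:
        prods.append(((q, '', q), []))
    return prods
```

Note that a type (3) transition pushing n symbols contributes |Q|^n productions, which is why this construction, while correct, produces large grammars.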

Lemma 4.12 The language M(L) accepted by a pushdown automaton M = (Σ, Q, s, I, Δ, F) is a context-free language.

Proof Using the grammar Γ = (N, Σ, S, P), where the nonterminals and productions are described above, we show that Γ generates the same language as is accepted by M.

We first show that for p, q ∈ Q, B ∈ I ∪ {λ} and w ∈ Σ*,

⟨p, B, q⟩ →* w if and only if (p, w, B) ⊢* (q, λ, λ).

Thus for t ∈ F, ⟨s, λ, t⟩ →* w if and only if (s, w, λ) ⊢* (t, λ, λ), so that a word is generated by Γ if and only if it is accepted by M.

First, using induction on the number n of derivation steps, we show that if ⟨p, B, q⟩ →* w then (p, w, B) ⊢* (q, λ, λ). Beginning with n = 1, the only possibility is that a nonterminal is popped without replacement. This can only occur using productions of type (4), so we have p = q, B = λ, and w = λ. But this gives us (p, λ, λ) ⊢* (p, λ, λ), which is obvious. Assume n = k > 1; then the first production can only be of type (2) or type (3). If it is of type (2), we have ⟨p, B, q⟩ → a⟨r, D, q⟩ for p, r ∈ Q, where ((p, a, B), (r, D)) ∈ Δ. Hence, letting w = av, (p, w, B) ⊢ (r, v, D), and by induction, since ⟨r, D, q⟩ →* v, we have (r, v, D) ⊢* (q, λ, λ). Therefore (p, w, B) ⊢* (q, λ, λ).

If the first production is of type (3), we have

⟨p, B, q⟩ → a⟨q0, B1, q1⟩⟨q1, B2, q2⟩⟨q2, B3, q3⟩ . . . ⟨qn−1, Bn, q⟩ →* w

and ((p, a, B), (q0, B1 B2 . . . Bn)) ∈ Δ. So if w = av, then (p, w, B) ⊢ (q0, v, B1 B2 . . . Bn). For convenience of notation, let q = qn. Let ⟨qi−1, Bi, qi⟩ →* ui, so that w = au1 u2 . . . un and v = u1 u2 . . . un. By induction, (qi−1, ui, Bi) ⊢* (qi, λ, λ). Therefore we have

(p, w, B) ⊢ (q0, u1 u2 . . . un, B1 B2 . . . Bn)
⊢* (q1, u2 . . . un, B2 . . . Bn)
⊢* (q2, u3 . . . un, B3 . . . Bn)
. . .
⊢* (qn−1, un, Bn)
⊢* (qn, λ, λ)

so that (p, w, B) ⊢* (q, λ, λ).

We now show that if (p, w, B) ⊢* (q, λ, λ) then ⟨p, B, q⟩ →* w. We use induction on the number of steps in (p, w, B) ⊢* (q, λ, λ). If there are 0 steps, then p = q and w = B = λ. This corresponds to ⟨p, λ, p⟩ → λ, which is one of the productions. Therefore the statement is true for 0 steps.

Assume (p, w, B) ⊢* (q, λ, λ) in k + 1 steps. First assume that we have w = av and

(p, w, B) ⊢ (r, v, D) ⊢* (q, λ, λ)

where ((p, a, B), (r, D)) ∈ Δ and B, D ∈ I ∪ {λ}, giving the productions ⟨p, B, q⟩ → a⟨r, D, q⟩. Since (r, v, D) ⊢* (q, λ, λ), by the induction hypothesis ⟨r, D, q⟩ →* v. Therefore ⟨p, B, q⟩ → a⟨r, D, q⟩ →* av = w and we are finished.

Next assume w = av and the first step is (p, w, B) ⊢ (q0, v, B1 B2 . . . Bn), so we have

(p, w, B) ⊢ (q0, v, B1 B2 . . . Bn) ⊢* (q, λ, λ)

and each Bi is eventually removed from the stack in order, so that there are states q1, q2, . . . , qn−1, qn, where qn = q, and v = v1 v2 . . . vn−1 vn such that

(p, w, B) ⊢ (q0, v1 v2 . . . vn−1 vn, B1 B2 . . . Bn)
⊢* (q1, v2 . . . vn−1 vn, B2 . . . Bn)
⊢* (q2, v3 . . . vn−1 vn, B3 . . . Bn)
. . .
⊢* (qn−1, vn, Bn)
⊢* (qn, λ, λ).

By the induction hypothesis, ⟨qi−1, Bi, qi⟩ →* vi. But since the first production is of type (3),

⟨p, B, q⟩ → a⟨q0, B1, q1⟩⟨q1, B2, q2⟩⟨q2, B3, q3⟩ . . . ⟨qn−1, Bn, q⟩
→* av1⟨q1, B2, q2⟩⟨q2, B3, q3⟩ . . . ⟨qn−1, Bn, q⟩
→* av1 v2⟨q2, B3, q3⟩ . . . ⟨qn−1, Bn, q⟩
. . .
→* av1 v2 . . . vn−1⟨qn−1, Bn, q⟩
→* av1 v2 . . . vn−1 vn

so that ⟨p, B, q⟩ →* w.

Theorem 4.6 A language is context-free if and only if it is accepted by a PDA.

Exercises

(1) Construct a pushdown automaton which reads the same language as the grammar Γ = (N, Σ, S, P) defined by N = {S, A, B}, Σ = {a, b, c}, and the set of productions P given by

S → aA   A → aAB   A → a   B → b   B → λ.

(2) Construct a pushdown automaton which reads the same language as generated by the grammar Γ = (N, Σ, S, P) defined by N = {S, A, B}, Σ = {a, b, c}, and the set of productions P given by

S → AB   A → abaA   A → λ   B → Bcacc   B → λ.

(3) Construct a pushdown automaton which reads the same language as generated by the grammar Γ = (N, Σ, S, P) defined by N = {S, A, B}, Σ = {a, b, c}, and the set of productions P given by

S → AcB   A → abaA   A → λ   B → Bcacb   B → λ.

(4) Construct a pushdown automaton which reads the same language as generated by the grammar Γ = (N, Σ, S, P) defined by N = {S, A, B}, Σ = {a, b, c}, and the set of productions P given by

S → AB   A → acA   B → bcB   B → bB
A → aAa   B → λ   A → λ.

(5) Construct a pushdown automaton which reads the same language as generated by the grammar Γ = (N, Σ, S, P) defined by N = {S, A, B}, Σ = {a, b, c, d}, and the set of productions P given by

S → AB   A → aAc   B → bBc   B → bB
A → AaA   B → λ   A → λ.

(6) Construct a grammar which generates the language read by the pushdown automaton

[transition diagram of the pushdown automaton omitted]

(7) Construct a grammar which generates the language read by the pushdown automaton

[transition diagram of the pushdown automaton omitted]


(8) Construct a grammar which generates the language read by the pushdown automaton

[transition diagram of the pushdown automaton omitted]

(9) Construct a grammar which generates the language read by the pushdown automaton

[transition diagram of the pushdown automaton omitted]

(10) Prove Theorem 4.6 “A language is context-free if and only if it is accepted

by a PDA.” Assume the grammar is in Greibach normal form.

(11) Construct a pushdown automaton that reads the same language as the grammar Γ = (N, Σ, S, P) defined by N = {S, A, B}, Σ = {a, b, c}, and the set of productions P given by

S → aA   A → aAB   A → a   B → b   B → λ.


(12) Construct a pushdown automaton that reads the same language as the grammar Γ = (N, Σ, S, P) defined by N = {S, A, B}, Σ = {a, b, c}, and the set of productions P given by

S → AB   A → abaA   A → λ   B → Bcacc   B → λ.

(13) Construct a pushdown automaton that reads the same language as the grammar Γ = (N, Σ, S, P) defined by N = {S, A, B}, Σ = {a, b, c, d}, and the set of productions P given by

S → AdB   A → abaA   A → λ   B → Bcacb   B → λ.

(14) Construct a pushdown automaton that reads the same language as the grammar Γ = (N, Σ, S, P) defined by N = {S, A, B}, Σ = {a, b, c}, and the set of productions P given by

S → AB   A → acA   B → bcB   B → bB
A → aAa   B → λ   A → λ.

(15) Construct a pushdown automaton that reads the same language as the grammar Γ = (N, Σ, S, P) defined by N = {S, A, B}, Σ = {a, b, c, d}, and the set of productions P given by

S → AB   A → aAc   B → bBc   B → bB
A → AaA   B → λ   A → λ.

4.4 The Pumping Lemma and decidability

Just as we were able to show that there are languages that are not regular

languages, we are also able to show that there are languages that are not context-

free. We begin by returning to the concept of the parse tree or derivation tree.

The height of the tree is the length of the longest path in the tree. The level of

a vertex A in the tree is the length of the path from the vertex S to the vertex A.

A tree is a binary tree if each vertex has at most two children. Note that if the

grammar is in Chomsky normal form then every tree formed is a binary tree.

Lemma 4.13 If A →* w, where A is a nonterminal and the height of the corresponding derivation tree with root A is n, then the length of w is less than or equal to 2^(n−1).

Proof We use induction on the height of the derivation tree. If n = 1, then the derivation has the form A → a, and the length of w = a is 1 = 2^0. Assume the lemma is true for trees of height at most k, and let A →* w have a derivation tree of height k + 1. Then A → BC →* uv = w, where B →* u and C →* v, and the derivation trees for both of these derivations have height at most k. Therefore both u and v have length less than or equal to 2^(k−1), and w has length less than or equal to 2 · 2^(k−1) = 2^k = 2^((k+1)−1).

Previously we had a pumping theorem for regular languages. We now have

one for context-free languages.

Theorem 4.7 (Pumping Lemma) Let L be a context-free language. There exists an integer M so that any word in L longer than M has the form xuwvy, where uv is not the empty word, the word uwv has length less than or equal to M, and xu^n wv^n y ∈ L for all n ≥ 0.

Proof Let L − {λ} be a nonempty language generated by the grammar Γ = (N, Σ, S, P) in Chomsky normal form with p productions. Let M = 2^p. Assume there is a word in L with length greater than or equal to M. Then by the previous lemma, its derivation tree has height greater than p. Therefore there is a path S → · · · → a in the derivation tree with length greater than p, where a is a letter of the word. Since there are only p productions, some nonterminal occurs more than once on the left-hand side of a production along this path. Let C be the first nonterminal to occur a second time. Therefore we have a derivation

S →* αCβ →* xuCvy →* xuwvy

where α →* x, β →* y, C →* uCv and C →* w. But using these derivations, we can form the derivation

S →* xuCvy →* xu^2 Cv^2 y →* · · · →* xu^n wv^n y for any positive integer n.

Since the first production in the derivation C →* uCv has the form C → AB, and there are no empty words, either u or v is not the empty word. Pick a letter a in uwv; working back from a to the first occurrence of C, each production is used at most once, so the path has length at most p + 1. Hence the subtree rooted at C has height at most p + 1 and, by Lemma 4.13, the length of uwv is less than or equal to 2^p = M.
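As a concrete illustration (not part of the proof), consider the language {a^n b^n : n ≥ 1}, generated by S → aSb and S → ab. For the word aabb one valid decomposition is x = λ, u = a, w = ab, v = b, y = λ, and every pumped word stays in the language. A small Python check of this claim:

```python
def pumped(x, u, w, v, y, n):
    """Return x u^n w v^n y, the pumped word of Theorem 4.7."""
    return x + u * n + w + v * n + y

def in_language(s):
    """Membership test for {a^n b^n : n >= 1}."""
    half = len(s) // 2
    return len(s) >= 2 and s == 'a' * half + 'b' * half

# Decomposition of aabb: x = '', u = 'a', w = 'ab', v = 'b', y = ''
assert all(in_language(pumped('', 'a', 'ab', 'b', '', n))
           for n in range(6))
```

Pumping with n = 0 deletes u and v entirely, which is the direction usually used to prove that a language is not context-free, as in the examples below.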

We are now able to find a language which is not a context-free language.

Corollary 4.1 The language L = {a^m b^m a^m : m ≥ 1} is not a context-free language.

Proof Assume m is large enough that the length of a^m b^m a^m is larger than M. Therefore a^m b^m a^m = puqvr, where pu^n qv^n r ∈ L for all n ≥ 0. If either u or v contains both a and b, say u = a^i b^j, then (a^i b^j)^n must be a substring of pu^n qv^n r, which is clearly impossible. Thus u and v each consist entirely of as or entirely of bs, so each lies inside one of the three blocks of a^m b^m a^m. Pumping therefore changes the lengths of at most two of the three blocks while leaving the third at m, but every word of L has all three blocks of equal length. Hence pu^n qv^n r ∉ L for n > 1, which is a contradiction.

Example 4.19 The language L = {a^i b^j c^i d^j : i, j ≥ 1} is not a context-free language.

Proof Let M and xu^n wv^n y be as in the previous theorem, so that |uwv| ≤ M. Consider a^M b^M c^M d^M ∈ L. Since |uwv| ≤ M, the string uwv must be contained in a power of one of the letters a, b, c, or d, or in the powers of two adjacent letters. If it is contained in the power of one letter, say a, then pumping changes the number of as but not the number of cs, so some pumped word has fewer as than cs, which is a contradiction. If it is contained in a power of two adjacent letters, say a and b, then pumping changes the numbers of as and bs but not of cs and ds, again a contradiction.

Example 4.20 The language L = {ww : w ∈ {a, b}*} is not context-free. We shall see in Theorem 4.10 that the intersection of a regular language and a context-free language is context-free. But L ∩ a*b*a*b* = {a^i b^j a^i b^j : i, j ≥ 0} is not context-free; the argument is the same as in the previous example. Hence L is not context-free.

Example 4.21 The language L = {x : the length of x is a prime} is not context-free. Since primes are arbitrarily large, some word z in L of prime length greater than M must have the form z = xuwvy with xu^n wv^n y ∈ L for all n ≥ 0. Let m = |xwy| and k = |uv| ≥ 1, so |z| = m + k. Then |xu^n wv^n y| = m + nk. Taking n = m + k + 1 gives a word of length m + (m + k + 1)k = (m + k)(k + 1), which is not a prime since m + k ≥ 2 and k + 1 ≥ 2.

We have previously seen that the set of regular languages is closed under

the operations concatenation, union, Kleene star, intersection and complement.

We now explore these same operations for the set of context-free languages.

Theorem 4.8 The set of context-free languages is closed under the operations

of concatenation, union, and Kleene star.

Proof Let L1 and L2 be generated by grammars Γ1 = (N1, Σ1, S1, P1) and Γ2 = (N2, Σ2, S2, P2) respectively. Assume N1 and N2 are disjoint. This can always be accomplished by relabeling the elements in either N1 or N2.

The language L1L2 can be generated by the grammar Γ = (N, Σ, S, P) where N = N1 ∪ N2 ∪ {S}, Σ = Σ1 ∪ Σ2, and P = P1 ∪ P2 ∪ {S → S1S2}. If u ∈ L1 and v ∈ L2, then S1 →* u in Γ1, S2 →* v in Γ2, and using leftmost derivations we have S → S1S2 →* uS2 →* uv in Γ.

The language L1 ∪ L2 can be generated by the grammar Γ = (N, Σ, S, P) where N = N1 ∪ N2 ∪ {S}, Σ = Σ1 ∪ Σ2, and P = P1 ∪ P2 ∪ {S → S1, S → S2}. Let w ∈ L1 ∪ L2. Therefore w ∈ L1 or w ∈ L2. If w ∈ L1, then S1 →* w in Γ1, and S → S1 →* w in Γ. If w ∈ L2, then S2 →* w in Γ2, and S → S2 →* w in Γ.

The language L1* can be generated by the grammar Γ = (N, Σ, S, P) where N = N1 ∪ {S}, Σ = Σ1, and P = P1 ∪ {S → S1S, S → λ}. Let w1, w2, w3, . . . , wn ∈ L1. Using the productions S → S1S and S → λ, we can form the derivation S →* S1S1S1 · · · S1 (n copies of S1). Using leftmost derivations we can then derive S →* S1S1S1 · · · S1 →* w1S1S1 · · · S1 →* w1w2S1 · · · S1 →* w1w2 · · · wn in Γ. Hence L1* = L, the language generated by Γ.
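These three constructions translate directly into code. A minimal Python sketch, assuming a grammar is represented as a triple (nonterminals, start symbol, productions) with productions as (head, body) pairs, that the two nonterminal sets are disjoint, and that 'S' is a fresh symbol used by neither grammar:

```python
def concat(g1, g2):
    """Grammar for L1 L2: add the production S -> S1 S2."""
    (n1, s1, p1), (n2, s2, p2) = g1, g2
    return (n1 | n2 | {'S'}, 'S', p1 + p2 + [('S', [s1, s2])])

def union(g1, g2):
    """Grammar for L1 | L2: add S -> S1 and S -> S2."""
    (n1, s1, p1), (n2, s2, p2) = g1, g2
    return (n1 | n2 | {'S'}, 'S', p1 + p2 + [('S', [s1]), ('S', [s2])])

def star(g1):
    """Grammar for L1*: add S -> S1 S and S -> lambda (empty body)."""
    (n1, s1, p1) = g1
    return (n1 | {'S'}, 'S', p1 + [('S', [s1, 'S']), ('S', [])])
```

Each function adds exactly the productions used in the proof and leaves P1 and P2 untouched.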

Theorem 4.9 The set of context-free languages is not closed under the oper-

ations of intersection and complement.

Proof The sets {a^n b^n c^m : m, n ≥ 0} and {a^n b^m c^m : m, n ≥ 0} are context-free. The first is generated by the grammar with productions

P = {S → BC, B → aBb, B → λ, C → cC, C → λ}.

The second is generated by the grammar with productions

P = {S → AB, A → aA, A → λ, B → bBc, B → λ}.

However, the intersection is the language L = {a^n b^n c^n : n ≥ 0}, which is not context-free; the proof is the same as for {a^m b^m a^m} in Corollary 4.1.

If the set of context-free languages were closed under complement then, since L1 ∩ L2 = (L1′ ∪ L2′)′, the set of context-free languages would be closed under intersection, which we have already shown is not true.

Although the intersection of context-free languages is not necessarily a

context-free language, the intersection of a context-free language and a reg-

ular language is a context-free language. The proof is somewhat similar to the

one showing that the union of languages accepted by an automaton is accepted

by an automaton.

Theorem 4.10 The intersection of a regular language and a context-free

language is context-free.

Proof Let the pushdown automaton be M = (Σ, Q, s, I, Δ, F), where Σ is the alphabet, Q is the set of states, s is the initial or starting state, I is the set of stack symbols, F is the set of acceptance states, and Δ is the transition relation, a finite subset of ((Q × Σ* × I*) × (Q × I*)). Let the deterministic finite automaton be M1 = (Σ1, Q1, q0, δ1, F1), where Σ1 is the alphabet, Q1 is the set of states, q0 is the initial or starting state, F1 is the set of acceptance states, and δ1 is the transition function. We now define the pushdown automaton M2 = (Σ2, Q2, s2, I2, Δ2, F2), where s2 = (s, q0), Σ2 = Σ1 ∪ Σ, I2 = I, Q2 = Q × Q1, and F2 = F × F1. Define Δ2 by (((si, qj), a, X), ((sm, qn), b)) ∈ Δ2 if and only if ((si, a, X), (sm, b)) ∈ Δ in M and δ1(qj, a) = qn in M1. A word w is accepted in M2 if and only if ((s, q0), w, λ) ⊢* ((sa, qb), λ, λ) in M2, where sa and qb are acceptance states in M and M1 respectively. Thus w is accepted by the pushdown automaton M and also accepted by M1.

To show M2(L) = M(L) ∩ M1(L), the reader is asked to first show that ((s, q0), w, λ) ⊢* ((sm, qn), λ, α) in M2 if and only if (s, w, λ) ⊢* (sm, λ, α) in M and (q0, w) ⊢* (qn, λ) in M1, using induction on the number of steps in the computation. The theorem immediately follows.
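The definition of Δ2 can be written out mechanically. A Python sketch, illustrative only: delta is assumed to be a list of PDA transitions ((p, a, X), (q, b)) each reading at most one input symbol, with '' standing for λ, and delta1 a dict giving the DFA transition function. On a λ-move of the PDA the DFA component stays put, since no input is read.

```python
def product_pda(delta, dfa_states, delta1):
    """Transition relation of the product automaton M2 = M x M1."""
    delta2 = []
    for (p, a, X), (q, b) in delta:
        for r in dfa_states:
            if a == '':
                # lambda-move of the PDA: the DFA state is unchanged
                delta2.append((((p, r), a, X), ((q, r), b)))
            else:
                # both machines read the same input symbol a
                delta2.append((((p, r), a, X), ((q, delta1[(r, a)]), b)))
    return delta2
```

The acceptance states of the product are F × F1, exactly as in the proof.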

Definition 4.10 A nonterminal in a context-free grammar is useless if it does not occur in any derivation S →* w, for w ∈ Σ*. If a nonterminal is not useless, then it is useful.

Theorem 4.11 Given a context-free grammar, it is possible to find and remove all productions with useless nonterminals.

Proof We first remove any nonterminal U for which there is no derivation U →* w, for w ∈ Σ*. To find such nonterminals, let X be defined as follows: (1) For each nonterminal V such that V → w is a production with w ∈ Σ*, let V ∈ X. (2) If V → V1V2 · · · Vn, where each Vi is in X or in Σ, let V ∈ X. Continue step (2) until no new nonterminals are added to X. Any nonterminal U not in X has no derivation U →* w. If S is not in X, then the language generated by the context-free grammar is empty and we are done; there are no useful nonterminals. Assume S is in X. All productions containing nonterminals not in X are removed from the set of productions P.

Assume such productions have been removed. We now have to remove any nonterminal U which is not reachable from S, i.e. for which there is no derivation S →* W where U occurs in the string W. To test each nonterminal U we form a set YU as follows: (1) If V → W and U is in the string W, then V ∈ YU. (2) If R → T, where an element of YU is in the string T, then R ∈ YU. Continue step (2) until no new nonterminals are added to YU. If S ∈ YU, then U is reachable from S. If not, U is not reachable from S. Remove all productions which contain nonterminals not reachable from S. The context-free grammar created contains no useless nonterminals.
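Both passes of the proof are fixpoint computations and are easy to code. A Python sketch; the representation of productions as (head, body) pairs with bodies given as lists of symbols is an assumption for illustration:

```python
def remove_useless(start, prods, terminals):
    """Remove productions containing useless nonterminals.
    prods: list of (head, body) pairs, body a list of symbols."""
    # Pass 1: the set X of nonterminals deriving some terminal string.
    generating = set()
    changed = True
    while changed:
        changed = False
        for head, body in prods:
            if head not in generating and all(
                    s in terminals or s in generating for s in body):
                generating.add(head)
                changed = True
    prods = [(h, b) for h, b in prods
             if h in generating
             and all(s in terminals or s in generating for s in b)]
    # Pass 2: nonterminals reachable from the start symbol.
    reachable, frontier = {start}, [start]
    while frontier:
        a = frontier.pop()
        for head, body in prods:
            if head == a:
                for s in body:
                    if s not in terminals and s not in reachable:
                        reachable.add(s)
                        frontier.append(s)
    return [(h, b) for h, b in prods
            if h in reachable
            and all(s in terminals or s in reachable for s in b)]
```

For example, a grammar with a non-generating nonterminal B (only B → Bb) and an unreachable nonterminal C loses both groups of productions.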

Theorem 4.12 It is possible to determine whether a context-free language L

is empty.

Proof A context-free language L is empty if and only if its grammar contains no useful nonterminals, and by Theorem 4.11 these can be found.


Theorem 4.13 Given w ∈ Σ* and a context-free grammar G, it is possible to determine whether w is in the language generated by G.

Proof Assume that G is in Chomsky normal form. Let w be a word of length n ≥ 1. At each step of a derivation the derived string becomes longer, except when a nonterminal is replaced by a terminal. Therefore the derivation S →* w has length k ≤ 2n − 1, since G is in Chomsky normal form. Therefore check all derivations of length k ≤ 2n − 1.
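The exhaustive check in the proof can be sketched directly: explore all leftmost derivations of at most 2n − 1 steps, pruning sentential forms longer than n (in Chomsky normal form a form deriving w can never be longer than w). The representation, and the convention that nonterminals are uppercase strings, are assumptions for illustration:

```python
def generates(start, prods, w):
    """Decide whether a CNF grammar derives the terminal word w by
    trying all leftmost derivations of length at most 2n - 1.
    prods: list of (head, body) pairs; nonterminals are uppercase."""
    n = len(w)
    if n == 0:
        return False               # no lambda productions in CNF here
    limit = 2 * n - 1              # derivation-length bound from the proof
    seen = set()
    frontier = {(start,)}
    for _ in range(limit):
        nxt = set()
        for form in frontier:
            if list(form) == list(w):
                return True
            # find the leftmost nonterminal
            i = next((j for j, s in enumerate(form) if s.isupper()), None)
            if i is None:
                continue           # all terminals but not equal to w
            for head, body in prods:
                if head == form[i]:
                    new = form[:i] + tuple(body) + form[i + 1:]
                    # a form deriving w can never exceed length n
                    if len(new) <= n and new not in seen:
                        seen.add(new)
                        nxt.add(new)
        frontier = nxt
    return any(list(f) == list(w) for f in frontier)
```

This is exponential in general; the point of the proof is decidability, not efficiency.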

The above proof shows that it is possible to determine whether a word is in the language generated by a grammar, but it is not a practical way to do so.

Theorem 4.14 Let G be a context-free grammar in Chomsky normal form with exactly p productions. The language L(G) is infinite if and only if there exists a word ω in L(G) such that 2^p ≤ |ω| < 2^(p+1).

Proof If there is a word of length at least 2^p = M, then by the proof of the Pumping Lemma, L(G) is infinite. Conversely, suppose L(G) is infinite but contains no word ω with 2^p ≤ |ω| < 2^(p+1). Since L(G) is infinite, it contains words of length at least 2^(p+1); let ω be the shortest of them. By the Pumping Lemma, ω = xuwvy, where the length of uwv is at most 2^p, uv is not empty, and μ = xwy is in L(G). But |μ| ≥ |ω| − |uv| ≥ 2^(p+1) − 2^p = 2^p. Also |μ| < |ω|, and ω is the shortest word of length at least 2^(p+1); therefore |μ| < 2^(p+1). Hence 2^p ≤ |μ| < 2^(p+1), a contradiction.

Theorem 4.15 It is possible to determine whether the language generated by a context-free grammar is finite or infinite.

Proof Since it is possible to determine whether a word is in the language of a context-free grammar, simply try all words w with 2^p ≤ |w| < 2^(p+1) to see whether one of them is in the language. If one is, the language is infinite. If not, the language is finite.

Exercises

(1) Let grammar Γ = (N, Σ, S, P) be defined by N = {S, A, B}, Σ = {a, b, c}, and the set of productions P given by

S → AB   A → acA   B → bcB   B → bB
A → aBa   B → λ   A → a.

Let L be the language generated by Γ. Find the grammar that generates L*.

(2) Let L1 be the language generated by the grammar Γ1 = (N, Σ, S, P) defined by N = {S, A, B}, Σ = {a, b, c}, and the set of productions P given by

S → aA   A → aAB   A → a   B → b   B → λ,

and L2 be the language generated by the grammar Γ2 = (N, Σ, S, P) defined by N = {S, A, B}, Σ = {a, b, c}, and the set of productions P given by

S → AB   A → abaA   A → λ   B → Bcacc   B → λ.

Find the grammar that generates L1L2.

(3) Let L1 be the language generated by the grammar Γ1 = (N, Σ, S, P) defined by N = {S, A, B}, Σ = {a, b, c, d}, and the set of productions P given by

S → AdB   A → abaA   A → λ   B → Bcacb   B → λ,

and L2 be the language generated by the grammar Γ2 = (N, Σ, S, P) defined by N = {S, A, B}, Σ = {a, b, c}, and the set of productions P given by

S → AB   A → acA   B → bcB   B → bB
A → aAa   B → λ   A → λ.

Find the grammar that generates L1 ∪ L2.

Determine whether the following languages are context-free. If the language is context-free, construct a grammar that generates it. If it is not context-free, prove that it is not.

(4) L = {a^m b^n c^(2n) : m, n = 1, 2, . . . }.
(5) L = {ww^R w : w ∈ {a, b}*}.
(6) L = {w ∈ {a, b, c}* : w has an equal number of as and bs}.
(7) L = {a^n b^(2n) c^n : n = 1, 2, . . . }.
(8) Prove the induction step in Theorem 4.10.

5 Turing machines

5.1 Deterministic Turing machines

The Turing machine is certainly the most powerful of the machines that we

have considered and, in a sense, is the most powerful machine that we can

consider. It is believed that every well-defined algorithm that people can be

taught to perform or that can be performed by any computer can be performed

on a Turing machine. This is essentially the statement made by Alonzo Church

in 1936 and is known as Church's Thesis. This is not a theorem. It has not been

mathematically proven. However, no one has found any reason for doubting it.

It is interesting that although the computer, as we know it, had not yet been

invented when the Turing machine was created, the Turing machine contains

the theory on which computers are based. Many students have been amazed to

find that, using a Turing machine, they are actually writing computer programs.

Thus computer programs preceded the computer.

We warn the reader in advance that if they look at different books on Turing

machines, they will find the descriptions to be quite different. One author will

state a certain property to be required of their machine. Another author will

strictly prohibit the same property on their machine. Nevertheless, the machines,

although different, have the same capabilities.

The Turing machine has an input alphabet Σ, a set of tape symbols,