a

b

b

b

s3

s0 s1

a

a

(7) Find the Moore machine that duplicates the ¬nite automaton

a

s1

a b a a

b b

s0 s2 s3

b

(8) Find the Moore machine that duplicates the ¬nite automaton

a b

b

s4

a

a s1 a

s0

ba

s3 b

b

s2

3.7 Mealy and Moore machines 111

(9) Let the Mealy machine Me = ( , A, S, s0 , ’, δ) be given by the

diagram

b/0

s1

b/0

a/1

a/1

s0 s3

b/1

s2 a/1

b/1

a/1

Describe A, , and S. Find tables for ’ and δ.

(10) Let the Mealy machine Me = ( , A, S, s0 , ’, δ) be given by the

diagram

a/1

c/0

a/1

s0 s1 c/1

c/1

b/0 a/0 b/1

a/0

s2 s3 c/1

b/1

b/0

Describe A, , and S. Find tables for ’ and δ.

(11) Let the Mealy machine Me = ( , A, S, s0 , ’, δ) be given by the

diagram

a/1

s1

s0 b/0 c/1

c/1 b/0 a/0

c/1

a/0

b/0

b/0 s3

s2

a/0

c/1

Find the output with input abaabbab.

(a)

Find the output with input bbaaba.

(b)

Find the output with input aabbaaa.

(c)

Find the output with input », the empty word.

(d)

Automata

112

(12) Let the Mealy machine Me = ( , A, S, s0 , ’, δ) be given by the diagram

a/1

c/0

a/1

s0 s1 c/1

c/1

b/0 a/0 b/1

a/0

s2 s3 c/1

b/1

b/0

(a) Find the output with input abcccbab.

(b) Find the output with input bbaabc.

(c) Find the output with input aaccbba.

(13) Given the Moore machine Mo = ( , A, S, s0 , ’, φ)

b

a a,b

a

a

b

s1 /0 s2 /1 s3 /0

s0 /0

b

¬nd the equivalent Mealy machine.

(14) Given the Moore machine Mo = ( , A, S, s0 , ’, φ)

a

s0 /0 s1 /1

b c

a

c

b

a c

b

b

s2 /1 s3 /0

a

c

¬nd the equivalent Mealy machine.

(15) Given the Mealy machine Me = ( , A, S, s0 , ’, δ)

a/1

b/0 s1

s0

a/1

b/ 0

b/0

s2

a/0

3.7 Mealy and Moore machines 113

¬nd the equivalent Moore machine.

(16) Given the Mealy machine Me = ( , A, S, s0 , ’, δ)

a/1

b/0 s1

s0

a /1

b/0

a/0

b /1 s2

¬nd the equivalent Moore machine.

(17) Construct a Mealy machine which directly subtracts a signed binary num-

ber from another signed binary number.

(18) Let Z 5 = {0, 1, 2, 3, 4} be the set of integers modulo 5, where the “sum” of

¯¯¯¯¯

two integers is found by adding the numbers and ¬nding the remainder of

this sum when divided by 5. Therefore 3 + 4 = 2 and 2 + 3 = 0. Construct

¯¯ ¯ ¯¯ ¯

the Moore machine that gives a sum of initial strings of elements of Z 5 .

¯¯¯¯¯¯¯ ¯¯¯¯¯¯¯¯

Thus the input 2140321 produces output 02322023.

4

Grammars

4.1 Formal grammars

A grammar is intuitively a set of rules which are used to construct a language

contained in — for some alphabet . These rules allow us to replace symbols

or strings of symbols with other symbols or strings of symbols until we ¬nally

have strings of symbols contained in allowing us to form an element of

the language. By placing restrictions on the rules, we shall see that we can

develop different types of languages. In particular we can restrict our rules to

produce desirable qualities in our language. For example in our examples below

we would not want 3 + ·4 ’ —6. We also would not want a sentence Slowly

cowboy the leaped sunset. Suppose that we begin with a word add, and that we

have a rule that allows us to replace add with A + B and that both A and B can

be replaced with any nonnegative integer less that ten. Using this rule, we can

replace A with 5 and B with 3 to get 5 + 3. There might also be an additional

rule that allows us to replace add with a different string of symbols.

If we add further rules that A can be replaced by A + B and B can be

replaced by A — B, we can start by replacing add with A + B. If we then

replace A with A + B and B with A — B, we get A + B + A — B. We can

continue this process getting longer and longer strings, so that we can continue

to build strings of arbitrary length, but eventually we will want to replace all of

the As and Bs with integers. As noted above, we have choices in the replacement

of A and B so there is not necessarily any uniqueness in replacing a symbol or

string of symbols. Hence grammars are not deterministic. If we have derived

A + A — B and choose to replace the As and Bs with integers we do not have

to replace both by the same As with the same value. If we replace the ¬rst A

with 3, the second A with 5, and B with 7, we have 3 + 5 — 7.

Note that in the above rules add, A, and B can be replaced by other symbols

while + and the integers cannot be replaced. The symbols that can be replaced

114

4.1 Formal grammars 115

by other symbols are called nonterminal symbols and the symbols that can

not be replaced by other symbols are called terminal symbols. We generate

an element of the language when the string consists only of terminal symbols.

The rules which tell us how to replace symbols are called productions. We

denote the production (or rule) which tells us that add can be replaced with

A+B

add ’ A + B.

Thus the productions for our ¬rst example above are

’ A+B

add

’ A+B

A

’ A—B

B

’ ’

A B

0 0

’ ’

A B

1 1

’ ’

A B

2 2

. . . . . .

. . . . . .

. . . . . .

A’9 B ’ 9.

Below, we shall expand our rules to do arbitrary addition, subtraction, multi-

plication, and division of integers.

A grammar is formally de¬ned as follows:

De¬nition 4.1 A formal grammar or phrase structure grammar is denoted

by the 4-tuple (N , , S, P) which consists of a ¬nite set of nonterminal symbols

N , a ¬nite set of terminal symbols , an element S ∈ N , called the start symbol

and a ¬nite set of productions P, which is a relation in (N ∪ )— such that each

¬rst element in an ordered pair of P contains a symbol from N and at least one

production has S as the left string in some ordered pair.

De¬nition 4.2 If W and W are elements of (N ∪ )— , W = uvw, W =

uv w, and v ’ v is a production, this is denoted by W ’ W . If

W1 ’ W2 ’ W3 ’ · · · ’ Wn

for n ≥ 1, then Wn is derived from W1 . This is denoted by W1 ’— Wn and is

n

called a derivation. If the number of productions in not important we simply

use W1 ’— Wn . The set of all strings of elements of which may be generated

by the set of productions P is called the language generated by the grammar

and is denoted by (L).

To generate a word from the grammar , we keep using productions to derive

new strings until we have a string consisting only of terminal elements.

Grammars

116

Thus in our example above,

N = {add, A, B},

= {+, —, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9},

S = add,

and

P = {(add, A + B), (A, A + B), (B, A — B), (A, 0),

(A, 1), . . . , (A, 9), (B, 0), (B, 1), . . . , (B, 9)},

where we will denote (add, A + B) by add ’ A + B, (A, A + B) by A ’

A + B, etc. If we eliminate the production (B, A — B), the language generated

by is the set of all formal expressions of ¬nite sums of nonnegative integers

less than 10.

Example 4.1 In the grammar described above, derive the expression

2 + 4 + 7 — 6.

Begin with the production

add ’ A + B

to derive

A + B.

Then use the production

B ’ A—B

to derive

A + A — B.

Then use the production

A’ A+B

to derive

A + A + B — B.

Then use the productions

A’2 A’4 B’7 B’6

4.1 Formal grammars 117

to derive 2 + 4 + 7 — 6. Note that we cannot derive

3 — 2 + 4 + 7 — 6.

Example 4.2 Suppose we want a grammar which derives arithmetic expres-

sions for the set of integers {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}. Thus the language gen-

erated by the grammar is the set of all ¬nite arithmetic expressions for the

set of integers {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}. Examples would be 3 — (5 + 4) and

(4 + 5) · (3ˆ2), where ˆ denotes exponent. As mentioned above, we obviously

want to exclude expressions such as 3 + —6 and 3 + ·6 — 4 ’ 5. Let the set

N = {S, A, B} and = {+, ’, —, ·, ˆ, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, (, )}. We will

need the following productions:

’ (A + B) ’ (A + B)

S B

’ (A ’ B) ’ (A ’ B)

S B

’ (A — B) ’ (A — B)

S B

’ (A · B) ’ (A · B)

S B

’ (AˆB) ’ (AˆB)

S B

’ (A + B) ’0

A A

. ..

. ..

A ’ (A ’ B) . ..

A ’ (A — B) ’9

A

A ’ (A · B) ’0

B

. ..

. ..

A ’ (AˆB) . ..

’ 9.

B

We will use the grammar to derive the arithmetic expression

((2 + 3) · (4 + 5)).

We begin with the production

S ’ (A · B).

We then use the productions

A ’ (A + B)

and

B ’ (A + B)

to derive

((A + B) · (A + B)).

Grammars

118

The productions

A’2 B’3

and

give us

((2 + 3) · (A + B)).

Finally we use the productions

A’4 B’5

and

to derive

((2 + 3) · (4 + 5)).

We next use the grammar to derive the arithmetic expression

((3ˆ2) · (5 — 7)).

We begin with the production

S ’ (A · B).

We then use the productions

A ’ (AˆB) B ’ (A — B)

and

to derive

((AˆB) · (A — B)).

The productions

A’3 B’2

and

give us

((3ˆ2) · (A — B)).

Finally we use the productions

A’5 B’7

and

to derive

((3ˆ2) · (5 — 7)).

Example 4.3 In a similar manner, we may form arithmetic expressions in

post¬x notation. Let the set N = {S, A, B} and

= {+, ’, —, /, ˆ, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9}.

4.1 Formal grammars 119

We will need the following productions:

S ’ AB+ A ’ AB+ B ’ AB+ A’0

.

.

S ’ AB’ A ’ AB’ B ’ AB’ .

S ’ AB— A ’ AB— B ’ AB— A’9

S ’ AB· A ’ AB· B ’ AB· B’0

.

.

S ’ ABˆ A ’ ABˆ B ’ ABˆ .

B ’ 9.

Consider the expression 3 2 + 4 7 + —. Since our integers are all less

than ten, 3 2+ represents the integer symbol 3, followed by the integer

symbol 2 and the + symbol. To construct this expression we begin with the

production

S ’ AB — .

We then use the productions

A’ A+B B ’ A+B

and

to derive

AB + AB + —.

The productions

A’2 B’3

and

give us

3 + AB + —.

2

Finally we use the productions

A’4 B’7

and

to derive

3+4 7 + —.

2

Example 4.4 A grammar may also be used to derive proper sentences. These

sentences are proper in the sense that they are grammatically correct, although

they may not have any meaning. Suppose we want a grammar which will derive

the following statements, among others:

Joe chased the dog.

The fast horse leaped over the old fence.

The cowboy rode slowly into the sunset.

Grammars

120

Before actually stating the grammar let us decide upon its structure. This

allows us to be assured that each sentence in the grammar is a grammatically

correct sentence. Each of our sentences has a noun phrase (noun p), a verb

phrase (verb p), and another noun phrase. In addition the last two sentences

have a preposition (prep). Therefore let the ¬rst production be

S ’< noun p >< verb p >< prep >< noun p > .

In our example, the most general form of a noun phrase is an article followed

by an adjective and then a noun. Therefore let the next production be

< noun phrase > ’ < art >< adj >< noun >

where “art” represents article and “adj” represents adjective

The most general form of a verb phrase is a verb followed by an adverb.

Therefore let the next production be

< verb p > ’ < adv >< verb >

where “adv” represents adverb.

At this point, we know that the terminal set = {Joe, chased, the, The,

dog, fast, horse, leaped, over, old, fence, cowboy, rode, slowly, into, sunset}.

The nonterminal set N = {S, <noun p>, <verb p>, <art>, <adj>, <noun>,

<adv>, <verb>, <prep>}.

We next need productions which will assign values to <art>, <adj>,

<noun>, <adv>, and <verb>. In some of our sentences we do not need

<art>, <adjective>, <prep>, and <adv>. To solve this problem, we include

the productions

< art > ’ » < adj > ’ » < adv > ’ » < prep > ’ ».

By assigning these symbols to the empty set, we simply erase them when

they are not needed. The remainder of our productions consists of the following:

< art > ’ < noun > ’ < noun > ’

the horse fence

< adj > ’ < noun > ’ < adv > ’

fast dog slowly

< adj > ’ < noun > ’ < verb > ’

old cowboy chased

< noun > ’ < noun > ’ < verb > ’

Joe sunset leaped

< verb > ’ < prep > ’ < prep > ’

rode over into

< art > ’ The.

To derive the sentence “Joe chased the dog,” we begin with

S ’< noun p >< verb p >< prep >< noun p >

4.1 Formal grammars 121

to derive

< noun p >< verb p >< prep >< noun p > .

Using the production

< noun p > ’ < article >< adjective >< noun >

we derive

< art >< adj >< noun >< verb p >< prep >< noun p > .

Using

<art> ’ » <adj>’ »

we derive

< noun >< verb p >< prep >< noun p > .

Repeating the process for the second <noun phrase>, we derive

< noun >< verb p >< prep >< art >< noun > .

Using

< verb p > ’ < adv >< verb >,

we derive

< noun >< adv >< verb >< prep >< art >< noun > .

Using

<adv> ’ » <prep> ’ »

we derive

< noun >< verb >< art >< noun > .

Using

< noun > ’ Joe < noun > ’ dog < verb > ’ chased < art > ’ the

we derive “Joe chased the dog.”

To derive the sentence “The fast horse leaped over the tall fence,” we again

begin with

S ’< noun p >< verb p >< prep >< noun p >

to derive

< noun p >< verb p >< prep >< noun p > .

Grammars

122

Using the production

< noun p > ’< art >< adj >< noun >

we derive

< art >< adj >< noun >< verb p >< prep >< noun p > .

Using

< art > ’ the < art > ’ The < adj > ’ fast < noun > ’ horse

we derive

The fast horse < verb p >< prep >< noun p > .

Using

< verb p > ’< adv >< verb >,

we derive

The fast horse < adv >< verb >< prep >< noun p > .

Using

< adv > ’ » < verb > ’ leaped

we derive

The fast horse leaped < prep >< noun p > .

Using

< prep > ’ over,

we derive

The fast horse leaped over < noun p > .

Using the production

< noun p > ’< art >< adj >< noun >

we derive

The fast horse leaped over < art >< adj >< noun > .

Using

< art > ’ the < adj > ’ tall < noun > ’ fence

4.1 Formal grammars 123

we derive

The fast horse leaped over the tall fence.

Derivation of the last sentence is left to the reader.

De¬nition 4.3 For each production P ’ w1 w2 w3 . . . wn the corresponding

tree is

P

...

w1 w2 w3 wn

Thus the corresponding tree for S ’ A + B is

S

A+B

De¬nition 4.4 If the corresponding trees of the productions used to derive a

given expression are connected, they form a tree with root S, called the parse

tree or the derivation tree. If A ’ B occurs in the derivation then there is an

edge from A to B in the tree. The symbols A and B are called vertices or nodes.

The vertex B is called the child of A. Note that a terminal at a vertex has no

children. Such a vertex is called a leaf of the tree. The leaves of the tree, when

read left to right, form the word generated by the tree. If A0 ’ A1 ’ · · · ’ An

forms a string of edges in the tree then there is a path of length n from A0 to An .

Example 4.5 In Example 4.1, we used productions to derive 3 + 2 + 4.

To construct the tree, begin with the ¬rst production used

add ’ A + B

to form corresponding tree

add

A+B

Then use the corresponding tree

A

A+B

Grammars

124

of the production

A’ A+B

to form the tree

add

B

A +

A+B

Then use the corresponding tree in

B

— B

A

of the production

B ’ A—B

to get the corresponding tree

add

B

A +

A+B A —B

Then use the corresponding trees of the next productions

A’2 B’4 A’7 B’6

to form the parse tree

add

B

A +

A+B A —B

4 7 6

2

Example 4.6 In Example 4.3, to derive ((2 + 3) — (4 + 5)), we use the

productions

S ’ (A — B) A ’ (A + B) A’2 B’3

B ’ (A + B) A’4 B ’ 5.

4.1 Formal grammars 125

Therefore the parse tree is the tree

S

— B)

(A

(A + B) (A + B)

3

2 5

4

Example 4.7 In Example 4.4, to derive the sentence “Joe chased the dog,”

using productions

S ’< noun p >< verb p >< prep >< noun p >

to get

< noun p >< verb p >< prep >< noun p >

< noun p > ’< article >< adjective >< noun >

to get

< art >< adj >< noun >< verb p >< prep >< noun p >

Using

< art > ’ » < adj > ’ »

we get

< noun >< verb p >< prep >< noun p >

Again using

< noun p > ’< article >< adjective >< noun >

and

< art > ’ » < adj > ’ »

we get

< noun >< verb p >< prep >< art >< noun > .

Using

< verb p > ’ < adv >< verb >,

Grammars

126

we get

< noun >< adv >< verb >< prep >< art >< noun > .

Using

< adv > ’ » < prep > ’ »

we get

< noun >< verb >< art >< noun > .

Using

< noun > ’ Joe < noun > ’ dog < verb > ’ chased art ’ the

we have the correspondence tree for “Joe chased the dog.”

S

(noun p) (verb p) (prep) (noun p)

l

(art) (adj) (noun) (adv) (verb) (art) (adj) (noun)

l

l l

l

Joe the dog

chased

Example 4.8 In Example 4.4, to derive the sentence “The large dog leaped

over the old fence,” we use productions

S ’< noun p >< verb p >< prep >< noun p >

< noun p > ’< art >< adj >< noun >

< noun p > ’ < art >< adj >< noun > < adj > ’ fast

< verb p > ’ < adv >< verb > < adv > ’ »

< prep > over < art > ’ The

< adj > ’ tall < noun > ’ fence

< noun > ’ horse < verb > ’ leaped

< art > ’ the.

4.1 Formal grammars 127

Thus the parse tree is

S

(noun p) (verb p) (prep) (noun p)

(art) (adj) (noun) (adv) (verb) (over) (art) (adj) (noun)

The fast horse the tall fence

leaped

l

In all of the grammars in this section, the productions have been of the form

A ’ W , where A is a nonterminal symbol. Therefore the production can be

used everywhere that A appears, regardless of its position in an expression.

Such grammars are called context-free grammars. A language generated by

a context-free grammar is called a context-free language. If a grammar has

a production of the form a Ab ’ W where A is a nonterminal and ab = »

then this production can only be used when a is on the left-hand side of A and

b is on the right-hand side. It therefore cannot be used whenever A appears and

so it is dependent on the context in which A appears. Such a grammar is called

a context-sensitive grammar.

In the following examples, we consider context-free grammars which gen-

erate more abstract languages:

= (N , , S, P) be the grammar de¬ned by N =

Example 4.9 Let

{S, A, B}, = {a, b} and P be the set of productions

S ’ AB A’a B ’ Bb B’» A’» A ’ a A.

Using the production S ’ AB, we derive AB. Next using the productions

A ’ a and B ’ », we derive a. If we use the productions

S ’ AB A’» B ’ Bb B’»

in order, we derive b. We can also generate aabbb, aaaa, aaab, and bbbbb.

In fact, we can generate a m bn for all nonnegative integers m, n. Hence the

expression for the language generated by is a — b— .

= (N , , S, P) be the grammar de¬ned by N =

Example 4.10 Let

{S, A}, = {a, b} and P be the set of productions

S ’ a Ab A ’ a Ab A ’ ».

Grammars

128

Using the productions S ’ a Ab and A ’ » we derive ab. Using the

productions

S ’ a Ab A ’ a Ab A’»

in order, we derive aabb or a 2 b2 . Using the productions

S ’ a Ab A ’ a Ab A ’ a Ab A ’ ab

in order, we derive aaabbb or a 3 b3 . It is easily seen that the language generated

by is {a n bn : n is a positive integer}. Note that this is not the same as a — b—

since this would also include a m bn where m and n are not equal.

= (N , , S, P) be the grammar de¬ned by N =

Example 4.11 Let

{S, A, B}, = {a, b} and P be the set of productions

S ’ AB AB AB A A ’ Aa A’» B ’ b.

It can be shown that the expression for the language generated by is

a — ba — ba — ba — . This is the language consisting of all words containing exactly

three bs.

In Example 4.10 we generated the language {a n bn : n is a positive integer}.

Intuitively, we can see that this is not a regular language since the only way that

we can generate an in¬nite regular language using a ¬nite alphabet is with the

Kleene star —. In this case the only possibility is a — b— but, as mentioned earlier,

this does not work since this would also include a m bn where m and n are not

equal.

One might ask if there is a particular type of grammar which generates only

regular languages. The answer is yes, as we shall now show.

De¬nition 4.5 A context-free grammar = (N , , S, P) is called a regular

grammar if every production p ∈ P has the form n ’ w where w is the empty

word » or the string w contains at most one nonterminal symbol and it occurs

at the end of the string if at all.

Therefore w could be of the form aac A, ab, » or b A, where a, b, and

c are terminals and A is a nonterminal. However, w could not be of the

form a Ab, a AB, or Aa. The production n ’ abc A could be replaced by the

productions

n ’ aB B ’ bC C ’ c A.

Also it is possible w could contain no terminal and one nonterminal so we have

B ’ C, but if this is followed by C ’ t D, where t is a terminal, then we

4.1 Formal grammars 129

can combine the two productions to get B ’ t D. Hence it is no restriction to

require each production to be one of the following forms:

A ’ aB B’b C ’»

where A, B, and C are nonterminal elements, and a and b are terminal elements.

More formally, we de¬ne a linear regular grammar as follows:

De¬nition 4.6 A context-free grammar = (N , , S, P) is called a linear

regular grammar if every production p ∈ P has the form n ’ w where the

string w has the form xY , x or » where x ∈ and Y ∈ N .

Theorem 4.1 A language is generated by a linear regular grammar if and

only if it is generated by a regular grammar.

Proof Obviously every language that is generated by a linear regular grammar

is generated by a regular grammar. To show every regular grammar is generated

by a linear regular grammar, we divide the proof into two parts. We ¬rst show

the language of a regular grammar can be generated by productions of the forms

A ’ aB B’b C ’» C’D

where A, B, C, and D are nonterminals and a, b are terminals. Let =

(N , , S, P) be a regular grammar and L be the language generated by

= (N , , S, P ) be the grammar formed by replacing every pro-

. Let

duction A ’ a1 a2 a3 . . . an’1 B by the set of productions A ’ a1 A1 , A1 ’

a2 A2 , . . . , An’1 ’ an’2 An’2 , An ’ an’1 B where A1 , A2 , . . . , An are new

nonterminal symbols. Let L be the language generated by . By construction

we have A ’— a1 a2 a3 . . . an’1 B. So any word of L will be created by the gram-

mar . Conversely if A ’— a1 a2 a3 . . . an’1 B is formed by productions A ’

a1 A1 , A1 ’ a2 A2 , . . . , An’1 ’ an’2 An’2 , An ’ an’1 B, then there must be

a production A ’ a1 a2 a3 . . . an’1 B in since the symbols A1 , A2 , . . . , An are

symbols which appear only in forming A ’ a1 a2 a3 . . . an’1 B.

Hence we can now assume that a regular grammar can be formed using only

productions of the form

A ’ aB B’b C ’» C’D

where A, B, C, and D are nonterminals and a, b are terminals. We want to

show that we can form a regular grammar without productions of the form

C ’ D where C and D are both nonterminals. Call this a 1-production. Let

be a regular grammar formed using the productions above and L be the

language generated by . Assume that we have productions of the form above.

Let be the grammar with all 1-productions deleted and insert the production

Grammars

130

A1 ’ An b if

A1 ’ A2 , A2 ’ A3 , · · · , An’2 ’ An’1 , An’1 ’ b An

occurred in L.

If

A1 ’ A2 , A2 ’ A3 , · · · , An’2 ’ An’1 , An’1 ’ b

occurred in L, insert the production A1 ’ b in L .

If

A1 ’ A2 , A2 ’ A3 , · · · , An’2 ’ An’1 , An’1 ’ »

occurred in L, insert the production A1 ’ » in L . Let L be the language

generated by the grammar . Certainly L ⊆ L .

Assume we have S ’— w where the productions are from or or both

—

and w ∈ . If all of the productions are from , then w ∈ L. If not then there

exists u B ’ vC in the sequence where B ’ aC is not a production of and

v = ua. Take the ¬rst such production. Therefore there exist productions B ’

A1 , A1 ’ A2 , A2 ’ A3 , · · · , An’2 ’ An’1 , An’1 ’ aC in and we can

replace u B ’ vC with u B ’ u A1 ’ u A2 · · · u An’1 ’ uaC = vC, where

all of the productions are in . Since there are only a ¬nite number of produc-

tions not in , we can continue this process until all of the productions are in

and w ∈ L. Therefore L ⊆ L .

We now proceed to prove the following theorem.

Theorem 4.2 A language is regular if and only if it is generated by a regular

grammar.

Since a language is regular if and only if it is accepted by an automa-

ton, all we need to know is that a language is generated by a regular gram-

mar if and only if it is accepted by an automaton. We ¬rst show how

to construct a regular grammar which generates the same language that is

accepted by a given deterministic automaton and we then show how to con-

struct an automaton which accepts the language generated by a given regular

grammar.

Normally when we consider a word being read by an automaton, we probably

think of the automaton as removing letters from the word as it reads it. Thus if

the word to be read is abbc, and there is an a-arrow from state s0 to s1 , then

we read a, move to state s1 , and still have bbc left to read. If there is a b-arrow

from state s1 to s2 then we read b, move to state s2 , and still have bc left to read.

If there is a b-arrow from state s2 to s3 then we read b, move to state s3 , and

4.1 Formal grammars 131

still have c left to read. Finally, if there is a c-arrow from state s3 to s4 then we

read c, move to state s4 , and have nothing left to read. If s4 is a terminal state,

then we accept the word abbc.

We may also think of an automaton adding letters to words rather than

removing them. Suppose that we consider the string that has been read rather

than the string left to read. In the example above, at state s1 , we have read a.

At state s2 we have read ab. At state s3 we have read abb, and at state s4 we

have read abbc. Thus at each state we are adding a letter. Consider the grammar

= (N , , s0 , P), where N = {s0 , s1 , s2 , s3 , s4 }, = {a, b, c}, and P is the

set of productions

s0 ’ as1 s1 ’ bs2 s2 ’ bs3 s3 ’ cs4 s4 ’ »

where we have a production s4 ’ » only if s4 is a terminal state. It is easily

seen that generates the word abbc. Thus to change an automaton to a regular

grammar, if there is a k-arrow from si to s j , in the corresponding grammar,

form the production si ’ ks j . If s j is an acceptance state, add the production

s j ’ ». We shall shortly show that grammar will generate the same language

accepted by the automaton.

Example 4.12 Given the automaton,

s1

a,b

a b a

s0

a

s2

we form the productions for the corresponding grammar as follows:

Description of automaton Production

s0 ’ as1

There is an a-arrow from s0 to s1

s0 ’ bs1

There is a b-arrow from s0 to s1

s1 ’ as0

There is an a-arrow from s1 to s0

s1 ’ bs2

There is a b-arrow from s1 to s2

s2 ’ as1

There is an a-arrow from s2 to s1

s0 ’ bs2

There is a b-arrow from s0 to s2

s2 ’ ».

The state s2 is an acceptance state

= (N , , s0 , P), where N =

Hence the corresponding grammar is

{s0 , s1 , s2 }, = {a, b}, and P is the above set of productions.

Grammars

132

Example 4.13 Given the automaton

b

s1

b

a

a s3

b

s0

b

s2 a

a

we form the productions for the corresponding grammar as follows:

Description of automaton Production

There is an a-arrow from s0 to s1 s0 ’ as1

s0 ’ bs2

There is a b-arrow from s0 to s2

There is an a-arrow from s1 to s2 s1 ’ as2

s1 ’ bs3

There is a b-arrow from s1 to s3

There is an a-arrow from s2 to s2 s2 ’ as2

s2 ’ bs3

There is a b-arrow from s2 to s3

There is an a-arrow from s3 to s2 s3 ’ as2

s3 ’ bs3

There is a b-arrow from s3 to s3

The state s2 is an acceptance state s2 ’ »

The state s3 is an acceptance state s3 ’ ».

= (N , , s0 , P), where N =

Hence the corresponding grammar is

{s0 , s1 , s2 , s3 }, T = {a, b}, and P is the above set of productions.

Given an automata M = (A, S, s0 , T, F), we now give a formal de¬nition of

the grammar M = (N , , S, P), associated with an automaton and then show

that the language accepted by M is generated by M .

M = (N , T, S, P), the grammar associated with the

De¬nition 4.7

automaton M = ( , Q, s0 , T, F) has N = Q, and s0 = S. The production

si ’ as j is in P if and only if F(a, si ) = s j , and s j ’ » if and only if s j is an

acceptance state.

Lemma 4.1 The language L 1 accepted by M is equal to the language L 2

generated by M .

Proof From the above de¬nition, we have si ’ as j if and only if (si , a)

(s j , »). Thus if (si , ab) (s j , b) (sk , ») in M then si ’ as j ’ absk in M .

More generally, (si , w) — (sk , ») if and only if si ’— wsk .

We ¬rst show L 1 ⊆ L 2 . Assume w is accepted by M, then (s0 , w) — (sk , »)

where sk is an acceptance state. Since (s0 , w) — (sk , »), we have s0 ’— wsk in

—

M . Since sk is an acceptance state, sk ’ » is a production. Therefore s0 ’ w

and w is generated by M .

4.1 Formal grammars 133

Conversely, let w be generated by M . Let s0 ’— w, then s0 ’— wsk ’ w.

Therefore sk ’ » is a production, and by de¬nition of M , sk is an acceptance

state. Since s0 ’— w, we have (s0 , w) — (sk , ») in M. Therefore w is accepted

by M.

Given a regular grammar in linear regular grammar form, we now construct

an automaton which accepts this linear grammar. Given = (N , , S, P),

intuitively we add an additional nonterminal t to N and for each production

B ’ a, where a is a terminal, we remove this production and replace it with the

productions B ’ at and t ’ ». Obviously this does not change the language

of the grammar. Let M = ( , Q, s0 , T, F) be the automaton in which Q is the

set of nonterminals together with the additional nonterminal t, s0 = S. The set

F is de¬ned by F(a, A) = B if and only if A ’ a B is in P. The state B ∈ T

if B ’ ».

= (N , , S, P) be the grammar de¬ned by N =

Example 4.14 Let

{S, A, B, C}, = {a, b, c}, and P be the set of productions

S ’ aA A ’ aA S ’ bB B ’ bB

A ’ cC C ’ cC B ’ aA C ’ ».

The corresponding automaton is

a

A c

a

a C c

S b B

b

= (N , T, S, P) be the grammar de¬ned by N =

Example 4.15 Let

{S, A, B, C}, T = {a, b, c}, and P be the set of productions

S ’ aA A ’ bB S ’ bB B ’ cC A ’ aC

C ’ cA B ’ aA C ’» B ’ ».

The corresponding automaton is

a

A c

a

b

S a

b C

B c

Grammars

134

More formally, given a regular grammar = (N , T, S, P) in linear regular

grammar form, we de¬ne a nondeterministic automaton M which accepts

this linear grammar. Given = (N , T, S, P), let M = ( , Q, s0 , ’, F) be

= T and N is the set of non-

the nondeterministic automaton in which

terminals together with an additional nonterminal t, s0 = S. The set ’ is

de¬ned by B ∈ ’(a, A) if A ’ a B is in P and t ∈ ’(a, A) if A ’ a is in

P. The state B ∈ T if B ’ » or B = t. Hence (A, a) (B, ») if and only if

A ’ a B. Thus if A ’ a B ’ abC in then (A, ab) (B, b) (C, ») in M .

More generally, A ’— w B if and only if (A, w) — (B, ») for nonterminals A

and B.

Theorem 4.3 The language L 1 accepted by M is equal to the language L 2

generated by .

Proof We ¬rst show L 2 ⊆ L 1 . Let w = va be generated by . If A ’—

va B ’ va, then (A, va) — (B, ») and since the last production is B ’ », B

is an acceptance state. If A ’— v B ’ va, then (A, v) — (B, ») so (A, va) —

(B, a) since the last production is B ’ a, (B, a) (t, »), and t is an acceptance

state. Therefore w is accepted by M .

To show L 1 ⊆ L 2 let w = va be accepted by M , then if (A, va) — (B, »)

and B is an acceptance state with B ’ » then A ’— va B ’ va. If (A, va) —

(B, a) (t, »), then (A, v) — (B, ») so A ’— v B and since (B, a) (t, »),

B ’ a, so A ’— v B ’ va. Either way, w is generated by .

Exercises

Using the grammar in Example 4.9, construct a parse tree for abbb.

(1)

Using the grammar in Example 4.10, construct a parse tree for aaabbb.

(2)

Using the grammar in Example 4.11, construct a parse tree for babaab.

(3)

(4) In Example 4.4, derive the statement “The cowboy rode slowly into the

sunset” and construct the correspondence parse tree.

(5) Find the language generated by the grammar = (N , T, S, P) de¬ned

by N = {S, A, B}, T = {a, b} and the set of productions P given by

S ’ AB A ’ aA A’» B ’ Bb B ’ ».

(6) Find the language generated by the grammar = (N , T, S, P) de¬ned

by N = {S, A, B}, T = {a, b} and the set of productions P given by

S ’ aB B ’ bA A ’ aB B ’ b.

4.1 Formal grammars 135

(7) Find the language generated by the grammar = (N , T, S, P) de¬ned

by N = {S, A, B}, T = {a, b} and the set of productions P given by

S ’ aA B ’ aA S ’ bB A ’ aB

B ’ bB A ’ bA B’b A ’ a.

(8) Find the language generated by the grammar = (N , T, S, P) de¬ned

by N = {S, A, B, C}, T = {a, b} and the set of productions P given by

S’C A ’ aB C ’ bC B ’ bB

C ’ aA B ’ aA A ’ bA B ’ ».

(9) Find the grammar which generates the language wwr where w is a string

of as and bs and wr is the reverse string. For example, abba, abaaba, and

abbbba belong to wwr .

(10) Construct a grammar which generates the language wcwr where w ∈

{a, b} and wr is the reverse string.

(11) Construct a grammar which generates the language L = {w : where w ∈

{a, b} and w = wr }.

(12) Construct a grammar which generates the language L described by the

expression aa— bb— .

(13) Construct a grammar which generates the language L described by the

expression (abc)— .

(14) Construct a grammar which generates the language L described by the

expression (ab)— ∨ (ac)— .

(15) Construct a grammar which generates the language L described by the

expression ac(bc)— d.

(16) Construct a grammar which generates the language expressed by

(a— ba— ba— b)— .

(17) Construct a grammar which generates the language expressed by

(a— (ba)— bb— a)— .

(18) Construct a grammar which generates the language expressed by

(a— b) ∨ (b— a)— .

(19) Construct a grammar which generates the language expressed by

aa— bb— aa— .

(20) Construct a grammar which generates the language expressed by

(a— b) ∨ (c— b) ∨ (ac)— .

(21) Construct a grammar which generates the language expressed by

(a∨b)— (aa ∨ bb)(a ∨ b)— .

(22) Construct a grammar which generates the language expressed by

((aa— b) ∨ bb— a)ac— .

Grammars

136

(23) Construct a grammar to generate arithmetic expressions for positive inte-

gers less than ten in pre¬x notation.

(24) Find an automaton which accepts the language generated by the grammar

= (N , T, S, P) de¬ned by N = {S, A, B}, T = {a, b} and the set of

productions P given by

S ’ aB B ’ bA A ’ aB B ’ b.

(25) Find an automaton which accepts the language generated by the grammar

= (N , T, S, P) de¬ned by N = {S, A, B}, T = {a, b} and the set of

productions P given by

S ’ aA B ’ aA S ’ bB A ’ aB

B ’ bB A ’ bA B’b A ’ a.

(26) Find an automaton which accepts the language generated by the grammar

= (N , T, S, P) de¬ned by N = {S, A, B, C}, T = {a, b} and the set of

productions P given by

S’C A ’ aB C ’ bC B ’ bB

C ’ aA B ’ aA A ’ bA B ’ ».

(27) Find an automaton which accepts the language generated by the grammar

= (N , T, S, P) de¬ned by N = {S, A, B, C}, T = {a, b} and the set of

productions P given by

S’C C ’b C ’ aA A ’ aA

C ’ aC A’a C ’a A ’ ».

(28) Find an automaton which accepts the language generated by the grammar

= (N , T, S, P) de¬ned by N = {S, A, B, C}, T = {a, b} and the set of

productions P given by

S’C C ’ aaC C ’ abC

C ’ baC C ’ bbC C ’ ».

(29) Find an automaton which accepts the language generated by the grammar

= (N , T, S, P) de¬ned by N = {S, A, B, C}, T = {a, b} and the set of

productions P given by

S’C B ’ aB C ’ bC B ’ bB C ’ aA

B’a A ’ bC B’b A ’ a B.

4.1 Formal grammars 137

(30) Construct a grammar which generates the language accepted by the

automaton

b

s1 s3

a a

s0 b

b

s2 s4

b

a

(31) Construct a grammar which generates the language accepted by the

automaton

b

b b

s2

s1

a a

s0 b

b

s4

s2 b

b

a

(32) Construct a grammar which generates the language accepted by the

automaton

a

s1 b

s3

a

aa

b

s0 s2

b

b

(33) Construct a grammar which generates the language accepted by the

automaton

a,b

a a

s'0 s'1 s'2

b b

Grammars

138

b b

s2

s0 s1

a

a

a

s3 a

(34) Construct a grammar which generates the language accepted by the

automaton

(35) Construct a grammar which generates the language accepted by the

automaton

b

a s4

s3

s0

ca

b

a

b

s2 c

c

(36) Construct a grammar which generates the language accepted by the

automaton

a b

b

s4

a

a s1 a

s0

ba

b

s3

b

s2

4.2 Chomsky normal form and Greibach normal form

is in Chomsky normal form if each

De¬nition 4.8 A context-free grammar

of its productions is either of the form

A ’ BC

or

A’a

where A, B, and C are nonterminals and a is a terminal.

is in Greibach normal form if each

De¬nition 4.9 A context-free grammar

of its productions is of the form

A ’ aW

where a is a terminal and W is a possibly empty string of nonterminals.

4.2 Chomsky normal form and Greibach normal form 139

We shall show that every language L, not containing the empty word, which

is generated by a context-free grammar can be generated by a context-free

grammar in Chomsky normal form. We shall also show that every language L,

not containing the empty word, which is generated by a context-free grammar

can be generated by a context-free grammar in Greibach normal form.

We shall ¬rst show that a language L, not containing the empty word, gener-

ated by a context-free grammar, can be generated by a context-free grammar in

Chomsky normal form. We begin with a series of lemmas. The ¬rst lemma,

which demonstrates the ¬‚exibility of derivations in context-free languages

shows us that if we have a derivation U V ’— W where U, V, W ∈ (N ∪ T )—

then U and V may be treated separately.

Lemma 4.2 Let = (N , T, S, P) be a grammar and U V ’— W , where

U, V, W ∈ (N ∪ T )— , be a derivation in with n steps, then W can be expressed

as W1 W2 where U ’— W1 , V ’— W2 are derivations in , both containing at

most n steps.

Proof The proof of this lemma uses induction on the number of steps in the

production. Assume there is one step. Then only one nonterminal is replaced

using a production. Assume it is the production A ’ w Bw . Either A is in the

string U or is in the string V . Without loss of generality assume it is in U , so

U = X AY

and

U ’ X w Bw Y = U .

Further

U V ’ X w Bw Y V = U V,