
(c) aababaabbaa
(d) aabaabaabaab
(e) aabaaabbabb.
(8) Find the Turing machine that corresponds to the string
abaaabbbbaabbbabaabababbaababb.
(9) Find the Turing machine that corresponds to the string
abaaaababbabbbaababbaababaaababb.
(10) Find the Turing machine and input that correspond to the string
abaaaabbbbabbbaabbbbaabbbabbbaababaaababbbaaabbb.
(11) Find the Turing machine and input that correspond to the string
abaaaabbbbabbbaabaabaabbbabbbaaabaaabbbaaababaa
ababababaabb.
(12) Devise a method of coding that allows the use of A and B as well as
a and b by allowing strings of length 3 to represent input and output
symbols.
(13) Use the coding in the previous problem to find the string corresponding
to (s1, a, s3, A, R).
(14) Find the string that represents the machine
(s1, a, s2, b, R) (s1, b, s2, b, R) (s2, a, s2, #, R)
(s2, b, s2, b, R) (s1, #, s3, #, R)
together with input ababaab.
(15) Find the string that represents the machine
(s1, a, s2, b, R) (s1, b, s2, a, R) (s2, a, s2, #, R)
(s2, b, s2, #, R) (s1, #, s3, #, R)
together with input babbab.
Turing machines
200


(16) Let L be a language and let L′ be its complement. Prove that one and only one of the following must
be true:
(a) Neither L nor L′ is Turing acceptable.
(b) Both L and L′ are Turing decidable.
(c) Either L or L′ is Turing acceptable but not Turing decidable.



5.4 Undecidability problems for context-free languages
We begin with Post's Correspondence Problem, which is not only an interesting
problem in itself, but is also used to prove that certain statements about context-
free languages are undecidable.
Definition 5.6 Given an alphabet Σ, let P be a finite collection of ordered
pairs of nonempty strings (u1, v1), (u2, v2), . . . , (um, vm) of Σ. Thus P is
a finite subset of Σ+ × Σ+. A match of P is a string w for which there
exists a sequence of pairs (u_{i1}, v_{i1}), (u_{i2}, v_{i2}), . . . , (u_{im}, v_{im}) such that w =
u_{i1} u_{i2} · · · u_{im} = v_{i1} v_{i2} · · · v_{im}. Post's Correspondence Problem is to determine
if a match exists.
An alternative way to think about Post's Correspondence Problem is to
consider two lists A = u1, u2, . . . , un and B = v1, v2, . . . , vn where each ui
and vi is a nonempty string of Σ; there is a match if there exists w such
that w = u_{i1} u_{i2} · · · u_{im} = v_{i1} v_{i2} · · · v_{im}. The important factor is that the products
must consist of corresponding pairs.
Example 5.1 Let P = {(a, ab), (bc, cd), (de, ed), (df, f)}; then abcdedf
and abcdededf are both matches of P.
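Whether a given index sequence yields a match can be checked mechanically. The following Python sketch (the function name is our own) tests a candidate sequence of pair indices against the pairs of Example 5.1.

```python
def is_match(pairs, indices):
    """Return True if the index sequence produces a match of P,
    i.e. the concatenated first coordinates equal the concatenated
    second coordinates."""
    top = "".join(pairs[i][0] for i in indices)
    bottom = "".join(pairs[i][1] for i in indices)
    return top == bottom

P = [("a", "ab"), ("bc", "cd"), ("de", "ed"), ("df", "f")]

# The match abcdedf uses the pairs in order 0, 1, 2, 3:
assert is_match(P, [0, 1, 2, 3])
# The match abcdededf repeats the pair (de, ed):
assert is_match(P, [0, 1, 2, 2, 3])
# An arbitrary sequence generally fails:
assert not is_match(P, [0, 1])
```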
We wish to show that Post's Correspondence Problem is not decidable. To
help us do so we define a modified correspondence system. We shall show that
if the modified correspondence system is not decidable, then Post's Correspondence
Problem is not decidable. Finally we show that the modified correspondence
system is not decidable.
Definition 5.7 Given an alphabet Σ, let P be a finite collection of
ordered pairs of nonempty strings (u1, v1), (u2, v2), . . . , (um, vm) of Σ together
with a special pair (u0, v0). In a modified correspondence system, a
match of P is a string w such that there exists a sequence of pairs
(u0, v0), (u_{i1}, v_{i1}), (u_{i2}, v_{i2}), . . . , (u_{im}, v_{im}) such that w = u0 u_{i1} u_{i2} · · · u_{im} =
v0 v_{i1} v_{i2} · · · v_{im}. Thus a match must begin with the designated pair (u0, v0). The
modified Post's Correspondence Problem is to determine if a match exists in
a modified correspondence system.


Note that in the previous example any match w must begin with (a, ab);
however, P is not a modified correspondence system, since we are not required to begin
with (a, ab) to try to form a match.

Lemma 5.1 If Post's Correspondence Problem is decidable, then the modified
Post's Correspondence Problem is decidable.

Proof Let P1 be a modified correspondence system with the sequence of
ordered pairs (u0, v0), (u1, v1), (u2, v2), . . . , (um, vm), and let the alphabet Σ consist
of all symbols occurring in any of the ui or vi. Assume every match must begin
with u0 and v0. Assume also that ∗ and $ do not occur in Σ. For a string
w = a1 a2 a3 · · · ak, define L(w) = ∗a1 ∗a2 ∗a3 · · · ∗ak and R(w) = a1∗ a2∗
a3∗ · · · ak∗. Let P2 contain the pair (L(u0), L(v0)∗), and for all other (u, v) in
P1, let (L(u), R(v)) belong to P2. In addition, include (∗$, $) in P2. It is obvious
that only (L(u0), L(v0)∗) can begin a match in P2, since it is the only pair in which
both words begin with the same symbol; in every other pair exactly one of the
two words begins with a star. It is also obvious that the only pair that can end a
match in P2 is (∗$, $), since it is the only pair in which both words end with the
same symbol; in every other pair exactly one of the two words ends with a star.
It is also obvious that if there exists a sequence of pairs

(u0, v0), (u_{i1}, v_{i1}), (u_{i2}, v_{i2}), . . . , (u_{im}, v_{im})

in P1 such that w = u0 u_{i1} u_{i2} · · · u_{im} = v0 v_{i1} v_{i2} · · · v_{im}, then the sequence

(L(u0), L(v0)∗), (L(u_{i1}), R(v_{i1})), (L(u_{i2}), R(v_{i2})), . . . , (L(u_{im}), R(v_{im})), (∗$, $)

produces a match

w′ = L(u0) L(u_{i1}) L(u_{i2}) . . . L(u_{im}) ∗$ = L(v0)∗ R(v_{i1}) R(v_{i2}) . . . R(v_{im}) $

in P2. The words

L(u0) L(u_{i1}) L(u_{i2}) . . . L(u_{im}) ∗$

and

L(v0)∗ R(v_{i1}) R(v_{i2}) . . . R(v_{im}) $

in P2 differ from the words u0 u_{i1} u_{i2} · · · u_{im} and v0 v_{i1} v_{i2} · · · v_{im} respectively in
P1 only in the fact that they have stars between the letters and end in $. Conversely,
any match in P2 must begin with (L(u0), L(v0)∗) and end with (∗$, $), and deleting
the stars and the final $ from it yields a match in P1 beginning with (u0, v0).
Hence, since a match exists in the modified Post's correspondence system if and
only if a corresponding match exists in Post's correspondence system, if Post's
Correspondence Problem is decidable, then the modified Post's Correspondence
Problem is decidable.


Example 5.2 Using the previous modified Post's correspondence system

P1 = {(a, ab), (bc, cd), (de, ed), (df, f)}

with match abcdedf, we have

P2 = {(∗a, ∗a∗b∗), (∗b∗c, c∗d∗), (∗d∗e, e∗d∗), (∗d∗f, f∗), (∗$, $)}

with match ∗a∗b∗c∗d∗e∗d∗f∗$.
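The transformation of Lemma 5.1 can be sketched directly: L(w) puts a star before every letter, R(w) puts a star after every letter, and the pairs of the modified system become pairs of an ordinary system. The function names below are our own.

```python
def L(w):
    # a star before every letter
    return "".join("*" + c for c in w)

def R(w):
    # a star after every letter
    return "".join(c + "*" for c in w)

def mpcp_to_pcp(start, pairs):
    """start = (u0, v0) is the designated pair; pairs are the rest of P1."""
    p2 = [(L(start[0]), L(start[1]) + "*")]      # the only possible first pair
    p2 += [(L(u), R(v)) for (u, v) in pairs]
    p2.append(("*$", "$"))                       # the only possible last pair
    return p2

# Example 5.2: P1 with designated first pair (a, ab).
p2 = mpcp_to_pcp(("a", "ab"), [("bc", "cd"), ("de", "ed"), ("df", "f")])
assert p2[0] == ("*a", "*a*b*")
assert p2[1] == ("*b*c", "c*d*")
assert p2[-1] == ("*$", "$")

# The P1-match abcdedf becomes a P2-match ending with (*$, $):
used = [p2[0], p2[1], p2[2], p2[3], p2[4]]
top = "".join(p[0] for p in used)
bottom = "".join(p[1] for p in used)
assert top == bottom == "*a*b*c*d*e*d*f*$"
```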

Theorem 5.9 Post's Correspondence Problem is undecidable.

Proof We show that Post's Correspondence Problem is undecidable by showing
that the modified Post's Correspondence Problem is undecidable. We do
this by showing that if the modified Post's Correspondence Problem is decidable,
then L0 (see the previous section) is decidable, which means that it is
decidable whether a Turing machine accepts a given word. For a given Turing
machine M and word w, we construct a modified Post's correspondence
system that has a match if and only if M accepts w. Intuitively
assume

#s0w#α1s1β1#α2s2β2# . . . #αkskβk#

describes the process used by the Turing machine to read w, where w and
each of the αi and βi are strings, and the Turing machine begins in state s0. Each of
the following steps describes the process for the machine accepting w. Hence,
between the # symbols, each string in the match represents the symbols on the tape
and the state of the machine for each step as the Turing machine progresses in
its computation. We wish to create a modified Post's correspondence system
which has this description. We shall see that the overlapping produced by the
rules below, together with the fact that the top and bottom rows must match, gives
us the process described above.
Note that reaching an acceptance state is equivalent to reaching a halt state
since, as above, we can create rules that take us from an acceptance state to the
final state.
For a given Turing machine and word, we create the modified Post's
correspondence system as follows. We shall use two rows to represent the
first coordinates and second coordinates respectively. The following are the
rules we are allowed to use. We begin with the pair (#, #s0w) so that
we have

#
#s0w.


Since this is a modified Post's correspondence system, we can require that we
begin with this pair. For each tape symbol X we have the pair

(X, X).

We next use the following pairs to guide us in selecting the next string in our
match. For each state s which is not a final state, each state s′, and tape symbols
X, Y, and Z:

(sX, Ys′)        if δ(s, X) = (s′, Y, R)
(XsY, s′XZ)      if δ(s, Y) = (s′, Z, L)
(s#, Xs′#)       if δ(s, #) = (s′, X, R)
(Xs#, s′XY#)     if δ(s, #) = (s′, Y, L).
We shall call these the pairs generated by δ.
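The four rules above can be sketched in code. In the following Python sketch the function name and the dictionary encoding of δ are our own assumptions; # plays its role as the configuration marker, and the small δ at the end is a toy transition table, not the machine of the text.

```python
def pairs_from_delta(delta, tape_symbols, final_states):
    """Emit the pairs generated by delta, following the four rules above.
    delta maps (state, symbol) to (new_state, written_symbol, direction)."""
    pairs = []
    for (s, a), (s2, b, d) in delta.items():
        if s in final_states:
            continue
        if a != "#" and d == "R":            # (sX, Ys')
            pairs.append((s + a, b + s2))
        elif a != "#" and d == "L":          # (XsY, s'XZ), one pair per X
            for x in tape_symbols:
                pairs.append((x + s + a, s2 + x + b))
        elif a == "#" and d == "R":          # (s#, Xs'#)
            pairs.append((s + "#", b + s2 + "#"))
        elif a == "#" and d == "L":          # (Xs#, s'XY#), one pair per X
            for x in tape_symbols:
                pairs.append((x + s + "#", s2 + x + b + "#"))
    return pairs

# A two-rule toy delta, just to exercise the function:
delta = {("s1", "0"): ("s2", "1", "R"), ("s2", "1"): ("s2", "1", "L")}
ps = pairs_from_delta(delta, ["0", "1"], set())
assert ("s10", "1s2") in ps                      # right-move rule
assert ("0s21", "s201") in ps and ("1s21", "s211") in ps   # left-move rule
```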
In trying to get our match this set guides us to the next string. For example
if we have

. . . #
. . . #11si011#

in our match and one of the pairs above is (si0, 1sj), we will want the next
string to be #111sj11#. Note however that the two 1s at the beginning and end
of the string are not affected by the pair above. Hence we need pairs (#, #) and
(1, 1) to get

. . . #11si011#
. . . #11si011#111sj11#.

More precisely we would use the pairs (1, 1), (1, 1), (si0, 1sj), (1, 1), (1, 1), and
(#, #). Hence we need the pairs (X, X) for every tape symbol X, together with
(#, #).
Obviously if we never get to an acceptance state (and hence a final state) we
will never have a match since there will always be an overlap at the bottom. We


thus need rules to get a match if we reach a halt state sm. We use the following
pairs to get rid of the overlap:

(0sm0, sm)
(1, 1)
(#, #)
(0sm1, sm)
(1sm0, sm)
(1sm1, sm)
(0sm, sm)
(sm0, sm)
(1sm, sm)
(sm1, sm)
(sm##, #).

The last pair gets rid of the overlap sm when all of the other symbols have been
eliminated. Thus if we reached

. . . #11si011#
. . . #11si011#111sm11#

the rules (1sm1, sm), (1, 1), (1sm, sm), (#, #), and (sm##, #) would produce

. . . #11si011#111sm11#11sm1#1sm#sm##
. . . #11si011#111sm11#11sm1#1sm#sm##

as follows:

. . . #11si011#
. . . #11si011#111sm11#

. . . #11si011#1
. . . #11si011#111sm11#1

. . . #11si011#11
. . . #11si011#111sm11#11

. . . #11si011#111sm1
. . . #11si011#111sm11#11sm

. . . #11si011#111sm11
. . . #11si011#111sm11#11sm1

. . . #11si011#111sm11#
. . . #11si011#111sm11#11sm1#

. . . #11si011#111sm11#1
. . . #11si011#111sm11#11sm1#1

. . . #11si011#111sm11#11sm1
. . . #11si011#111sm11#11sm1#1sm

. . . #11si011#111sm11#11sm1#
. . . #11si011#111sm11#11sm1#1sm#

. . . #11si011#111sm11#11sm1#1sm
. . . #11si011#111sm11#11sm1#1sm#sm

. . . #11si011#111sm11#11sm1#1sm#
. . . #11si011#111sm11#11sm1#1sm#sm#

. . . #11si011#111sm11#11sm1#1sm#sm##
. . . #11si011#111sm11#11sm1#1sm#sm##.
Formally we give a proof of the theorem. If we have a valid sequence of configurations
describing the acceptance of w by M, using induction on the number of computation
steps we show that there is a partial solution

#s0w#α1s1β1#α2s2β2# . . . #α_{n−1}s_{n−1}β_{n−1}#
#s0w#α1s1β1#α2s2β2# . . . #α_{n−1}s_{n−1}β_{n−1}#αnsnβn#.

For n = 0, we have

#
#s0w.

Assuming the statement is true for k, where sk is not the halt state, we have

#s0w#α1s1β1#α2s2β2# . . . #α_{k−1}s_{k−1}β_{k−1}#
#s0w#α1s1β1#α2s2β2# . . . #α_{k−1}s_{k−1}β_{k−1}#αkskβk#.

The next pairs are chosen so the string at the top forms #αkskβk# using the rules
above. There is at most one pair in the pairs generated by δ that works.
We can thus form

#s0w#α1s1β1#α2s2β2# . . . #α_{k−1}s_{k−1}β_{k−1}#αkskβk#
#s0w#α1s1β1#α2s2β2# . . . #α_{k−1}s_{k−1}β_{k−1}#αkskβk#α_{k+1}s_{k+1}β_{k+1}#

and we have extended to a new partial solution. Since the pairs generated by δ apply
to only one letter, the pairs (1, 1) and (0, 0) may be needed to produce αk and βk. If M


starting with #s0w reaches a halt state, there is at each step a rule to get from
#αkskβk# to #α_{k+1}s_{k+1}β_{k+1}#; otherwise, for some k, there is no rule and
there can be no match. If, for some k, sk is the halt state, then as mentioned above,
there are rules to make the upper and lower lists agree.
As already mentioned, if we do not reach the halt state, we cannot have
a match. If we do reach the halt state, we can produce a match. Hence if
the modified Post's Correspondence Problem is decidable, L0 is decidable.
Therefore the modified Post's Correspondence Problem is undecidable.

Example 5.3 Let the Turing machine be

M = ({s0, s1, s2, h}, {0, 1}, {0, 1, Δ, #}, δ, s0, h),

where Δ denotes the blank symbol, with word 0110 and

δ(s0, 0) = (s1, Δ, R)
δ(s0, 1) = (s1, 1, R)
δ(s1, 1) = (s1, 1, R)
δ(s1, 0) = (s2, 0, L)
δ(s2, 1) = (s2, 1, L)
δ(s2, 0) = (s2, 1, R)
δ(s2, #) = (h, #, #)

with corresponding pairs

(s00, Δs1)
(s01, 1s1)
(s11, 1s1)

(0s10, s200)
(1s10, s210)
(Δs10, s2Δ0)

(0s21, s201)
(1s21, s211)
(Δs21, s2Δ1)

(s20, 1s2)
(s2Δ, h0).

In addition we have pairs

(0, 0)
(1, 1)
(#, #)
(Δ, Δ).


Our first pair is (#, #s0w), which produces

#
#s00110#.

We now use (s00, Δs1) to get

#s00
#s00110#Δs1.

We then use (1, 1) twice, (0, 0), and (#, #) to get

#s00110#
#s00110#Δs1110#.

We next use (Δ, Δ), (s11, 1s1), (1, 1), (0, 0), and (#, #) to get

#s00110#Δs1110#
#s00110#Δs1110#Δ1s110#,

and again using (Δ, Δ), (1, 1), (s11, 1s1), (0, 0), and (#, #) we get

#s00110#Δs1110#Δ1s110#
#s00110#Δs1110#Δ1s110#Δ11s10#.

Now using (Δ, Δ), (1, 1), (1s10, s210), and (#, #) we get

#s00110#Δs1110#Δ1s110#Δ11s10#
#s00110#Δs1110#Δ1s110#Δ11s10#Δ1s210#.

Using (Δ, Δ), (1s21, s211), (0, 0), (#, #), and then (Δs21, s2Δ1), (1, 1), (0, 0), (#, #), we get

#s00110#Δs1110#Δ1s110#Δ11s10#Δ1s210#Δs2110#
#s00110#Δs1110#Δ1s110#Δ11s10#Δ1s210#Δs2110#s2Δ110#.

Now using (s2Δ, h0), (1, 1) twice, (0, 0), and (#, #), we get

#s00110#Δs1110#Δ1s110#Δ11s10#Δ1s210#Δs2110#s2Δ110#
#s00110#Δs1110#Δ1s110#Δ11s10#Δ1s210#Δs2110#s2Δ110#h0110#.

Finally, using the pairs containing h, together with (1, 1), (0, 0), and (#, #), we
get

#s00110#Δs1110#Δ1s110#Δ11s10#Δ1s210#Δs2110#s2Δ110#h0110#h
#s00110#Δs1110#Δ1s110#Δ11s10#Δ1s210#Δs2110#s2Δ110#h0110#h.
We can now use the fact that Post's Correspondence Problem is undecidable
to settle several other questions about decidability with regard to context-free
languages.


Theorem 5.10 It is undecidable for arbitrary context-free grammars G1 and
G2 whether L(G1) ∩ L(G2) = ∅.

Proof Let P ⊂ Σ+ × Σ+ be an arbitrary correspondence system with pairs
(u0, v0), (u1, v1), (u2, v2), . . . , (un, vn). In the following, w^{−1} will be w with
the letters reversed. For example, 1101^{−1} is 1011. Let G1 be generated by the
productions

S → ui C vi^{−1} for i = 0 to n,
C → ui C vi^{−1} for i = 0 to n,
C → c,

where c is a new symbol not in Σ. Thus every word in L(G1) has the form
u_{i0} u_{i1} u_{i2} . . . u_{im} c v_{im}^{−1} . . . v_{i2}^{−1} v_{i1}^{−1} v_{i0}^{−1}.
Let L(G2) = {w c w^{−1} | w ∈ Σ∗}. Then w ∈ L(G1) ∩ L(G2) if and only if
u_{i0} u_{i1} u_{i2} . . . u_{im} = v_{i0} v_{i1} v_{i2} . . . v_{im}, which is a match of the correspondence
system. Hence it is undecidable for arbitrary context-free grammars
G1 and G2 whether L(G1) ∩ L(G2) = ∅.
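The heart of this reduction can be sketched in a few lines: for a chosen index sequence, G1 generates the u-product, a middle marker, and the reversed v-product, and that word lies in L(G2) = {w c w^{-1}} exactly when the two products agree. Since the letter c already occurs in Example 5.1's alphabet, the marker (called c in the text) is written "|" in this sketch; the function names are our own.

```python
def g1_word(pairs, indices):
    """The word of L(G1) determined by an index sequence."""
    top = "".join(pairs[i][0] for i in indices)
    bottom = "".join(pairs[i][1] for i in indices)
    return top + "|" + bottom[::-1]          # the v-product, reversed

def in_g2(word):
    """Membership in L(G2) = { w | w-reversed }, with '|' as the marker."""
    w, _, rest = word.partition("|")
    return rest == w[::-1]

P = [("a", "ab"), ("bc", "cd"), ("de", "ed"), ("df", "f")]
assert in_g2(g1_word(P, [0, 1, 2, 3]))       # the match abcdedf
assert not in_g2(g1_word(P, [0, 1]))         # abc vs abcd: no match
```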

Definition 5.8 A context-free grammar is ambiguous if there are two different
leftmost derivations of the same word.

Example 5.4 Let G = (N, Σ, S, P) be the grammar defined by N =
{S, A, B}, Σ = {a, b}, and P the set of productions

S → aSb   S → aA   A → Bb   A → aA   B → Bb   B → λ   S → λ.

Obviously a^n b^n can be generated in two different ways.

Theorem 5.11 It is undecidable whether an arbitrary context-free grammar
is ambiguous.

Proof Let P ⊂ Σ+ × Σ+ be an arbitrary correspondence system with pairs
(u0, v0), (u1, v1), (u2, v2), . . . , (un, vn). Let α0, α1, α2, . . . , αn be symbols not in
Σ. We construct two grammars G1 and G2 as follows:

G1 = (N1, Σα, S1, P1)

where N1 = {S1}, Σα = Σ ∪ {α0, α1, α2, . . . , αn}, and P1 = {S1 → αi S1 ui for
i = 0, 1, . . . , n, and S1 → λ}.

G2 = (N2, Σα, S2, P2)

where N2 = {S2}, Σα = Σ ∪ {α0, α1, α2, . . . , αn}, and P2 = {S2 → αi S2 vi for
i = 0, 1, . . . , n, and S2 → λ}.
Obviously G1 and G2 are not ambiguous.


Let G = (N, Σα, S, P) where N = {S, S1, S2} and P = P1 ∪ P2 ∪ {S →
S1, S → S2}. Obviously if there is a match, one derivation of the corresponding
word begins with S → S1 and the other with S → S2, so G is ambiguous.
Conversely, if G is ambiguous, some word must have one derivation through S1
and another through S2 (since G1 and G2 are themselves unambiguous); the string
of αi's in that word forces the same index sequence in both derivations, so
u_{i0} u_{i1} u_{i2} . . . u_{im} = v_{i0} v_{i1} v_{i2} . . . v_{im} and there is a match. Hence there
is a match if and only if the context-free grammar G is ambiguous. Therefore
it is undecidable whether an arbitrary context-free grammar is
ambiguous.
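The two grammars of this proof can be imitated directly. In the sketch below the marker symbols αi are written A0, A1, . . . (our own rendering). Note one subtlety: the productions S1 → αi S1 ui stack the u's in the reverse of the chosen index order, so the reversed match sequence 3, 2, 1, 0 of Example 5.1's match 0, 1, 2, 3 is what produces equal words.

```python
def derived_word(strings, indices):
    """The terminal word derived by choosing the productions with the
    given indices: alpha_{i1} ... alpha_{ik} s_{ik} ... s_{i1}."""
    markers = "".join("A" + str(i) for i in indices)
    tail = "".join(strings[i] for i in reversed(indices))
    return markers + tail

P = [("a", "ab"), ("bc", "cd"), ("de", "ed"), ("df", "f")]
us = [p[0] for p in P]
vs = [p[1] for p in P]

seq = [3, 2, 1, 0]          # reverse of the match sequence 0, 1, 2, 3
assert derived_word(us, seq) == derived_word(vs, seq)    # G is ambiguous
assert derived_word(us, [0, 1]) != derived_word(vs, [0, 1])
```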


Exercises
(1) Show that the class of Turing acceptable languages is closed under union.
(2) Show that the class of Turing acceptable languages is closed under
intersection.
(3) Show that the class of Turing decidable languages is closed under
intersection.
(4) Show that the class of Turing decidable languages is closed under union.
(5) Show that the class of Turing decidable languages is closed under
concatenation.
(6) Show that the class of Turing decidable languages is closed under Kleene
star.
(7) Show that it is an unsolvable problem to determine, for a given Turing
machine M, whether there is a string w such that M enters each of the
machine's states during its computation on input w.
(8) Show that it is undecidable for an arbitrary context-free grammar G
whether L(G) = Σ∗.
(9) Show that for arbitrary context-free grammars G1 and G2, it is undecidable
whether L(G1) = L(G2).
(10) Show that there is no algorithm that determines whether the intersection
of the languages of two context-free grammars contains infinitely many
elements.
(11) Show that there is no algorithm that determines whether the complement of
the language of a context-free grammar contains infinitely many elements.
6
A visual approach to formal languages




6.1 Introduction
Formal language theory is overlapped by a close relative among the family of
mathematical disciplines. This is the specialty known as Combinatorics on
Words. We must use a few of the most basic concepts and propositions of this
field. A nonnull word, q, is said to be primitive if it cannot be expressed in
the form x^k with x a word and k > 1. Thus, for any alphabet containing the
symbols a and b, each of the words a, b, ab, bab, and abababa is primitive. The
words aa and ababab are not primitive and neither is any word in (aba)+ other
than aba itself. One of the foundational facts of word combinatorics, which is
demonstrated here in Section 6.2, is that each nonnull word, w, consisting of
symbols from an alphabet Σ, can be expressed in a unique way in the form
w = q^n where q is a primitive word and n is a positive integer. The uniqueness
of the representation, w = q^n, allows a useful display of the free semigroup
Σ+, consisting of the nonnull words formed from symbols in Σ, in the form of
a Cartesian product, Q × N, where Q is the set of all primitive words in Σ+
and N is the set of positive integers. Each word w = q^n is identified with the
ordered pair (q, n). This chapter provides the groundwork for investigations of
concepts that arise naturally in visualizing languages as subsets of Q × N. In
the suggested visualizations, the order structure of N is respected. We regard
N as labeling a vertical axis (y-axis) that extends upward only. We regard Q as
providing labels for the integer points on a horizontal axis (x-axis) that extends
both to the left and right except in the case in which the alphabet Σ is a singleton.
When Σ is a singleton the unique symbol in Σ is the only primitive word and
Q × N occupies only the vertical axis.
The set Q of primitive words, over an alphabet having two or more letters, is
a rather mysterious language. It is known that Q is not a regular language and
that its complement is not a context-free language. (See Exercises 1 and 2 of
this section.) At this writing, it is not yet known whether Q itself is context-free.
Of course Q is clearly recursive and one can confirm that it is context-sensitive.
With Q itself not fully understood, it is surprising that insightful results can
be obtained about languages by displaying them in the half plane Q × N. It
is as though, in constructing displays above Q, we are building castles on
sand. Nevertheless we proceed by taking Q as a totally structureless countably
infinite set and we allow ourselves to place Q in one-to-one correspondence
with the set of integers (on an x-axis) in any way we wish in order to provide
the most visually coherent display of the language being treated. One might
say that we take advantage of our ability to sprinkle the grains of sand (i.e.,
primitive words) along the x-axis just as we please.
In the next section adequate tools and exercises are given to allow the
introduction of visually coherent displays of languages based on the concept of
primitive words.



Exercises
For this set of exercises let Σ = {a, b} serve as an alphabet and let Q be the set
of all primitive words in Σ+.

(1) Let Q′ be the complement of Q in Σ+ and note that, for every positive
integer n, ab^n aab^n a is in Q′. Prove that the language Q′ is not regular.
Conclude that the language Q also cannot be regular.
(2) Let Q′ be the complement of Q and note that, for every positive integer
n, ab^n aab^n aab^n a is in Q′. Prove that Q′ is not a context-free language.
Although complements of context-free languages need not be
context-free, there is a subclass of context-free languages, called the
deterministic context-free languages, for which complements are always
context-free. Conclude that Q cannot be a deterministic context-free
language.
(3) For each positive integer n, find three primitive words, u, v, and w, for
which uv = w^n.
(4) Show that both Q and Q′ are recursive languages.
(5) Let L be a language that has the property that there is a bound B in N
for which for every u in Σ∗ there is a word v in Σ∗ of length at most B
for which uv is in L. Show that L contains an infinite number of primitive
words.
(6) Characterize those regular languages that have the property stated in the
previous exercise using the intrinsic automaton of L.


6.2 A minimal taste of word combinatorics
Throughout this section, uppercase will denote a nonempty ¬nite set of
symbols that will be used as an alphabet, while u and v will be reserved to
denote nonnull words in + . Uppercase Q will denote the set of all primitive
words in + , while p and q will be reserved to denote individual primitive
words. The following de¬nition for the division of words provides a convenient
tool for the present exposition: For each pair of words u, v in + we de¬ne
u/v to be the ordered pair (n, r ) where n is a nonnegative integer and u = v n r
with r a suf¬x of u which does not have v as a pre¬x. We call r the remainder
when u is divided by v.
Observe that for each u, v there is only one such pair that meets the conditions
in the de¬nition. Note that u is a power of v if and only if u/v = (n, ») in which
case u = v n .
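The division operator just defined can be sketched in a few lines of Python (the function name is our own): repeatedly strip v from the front of u, counting the strips, until v is no longer a prefix.

```python
def divide(u, v):
    """Return (n, r) with u == v^n r and r not having v as a prefix."""
    n = 0
    while u.startswith(v):
        u = u[len(v):]
        n += 1
    return n, u

assert divide("ababab", "ab") == (3, "")      # u is a power of v: remainder λ
assert divide("ababba", "ab") == (2, "ba")
assert divide("baba", "ab") == (0, "baba")
```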

Proposition 6.1 Words u and v are powers of a common word if and only
if uv = vu. Moreover, when uv = vu both u and v are powers of the word w
that occurs as the last nonnull remainder in a sequence of word divisions. The
length of this w is the greatest common divisor of the lengths of u and v.

Proof Suppose first that u = w^m and v = w^n. Then uv = w^m w^n = w^{m+n} =
w^{n+m} = vu as required.
Suppose now that uv = vu. If |u| = |v| then u = v. Otherwise, by the symmetry
of the roles of u and v in the hypothesis, we may assume |v| < |u| and we
observe that v is a prefix of u. Let w0 = u and w1 = v. Define successively each
word w2, w3, . . . , as the remainder, wi, in the division w_{i−2}/w_{i−1} = (ni, wi),
stopping when the word wk = λ is obtained. That such a wk = λ must arise is a
consequence of the observation that each w_{i−1} is a nonnull prefix of w_{i−2} when
w_{i−1} itself is not null. (See Exercise 3 of this section.) We have in succession:
w_{k−2}/w_{k−1} = (nk, λ) and w_{k−2} is a power of w_{k−1}, w_{k−3} is a power of w_{k−1},
w_{k−4} is a power of w_{k−1}, . . . , w1 (= v) is a power of w_{k−1}, w0 (= u) is a power
of w_{k−1}. Thus u and v are powers of the common word w_{k−1}. A review of the
sequence of divisions confirms that |w_{k−1}| = gcd(|u|, |v|).

From the second paragraph of the previous proof we observe that if u and v
are powers of a common word, then the length of the longest word w for which
both u and v are powers of w is the greatest common divisor (gcd) of |u| and
|v|. Consequently, Proposition 6.1 provides two methods of deciding whether
two words u, v are powers of a common word; one can test the equality of uv
and vu. Alternatively one can compute g = gcd(|u|, |v|), test the equality of
the prefixes u′, v′ of length g of u, v, respectively, and if u′ = v′ then compute
u/u′ and v/v′ to test whether the remainder in each case is λ.
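Both tests just described can be sketched side by side in Python (the function names are our own):

```python
from math import gcd

def common_power_v1(u, v):
    # first test: u and v are powers of a common word iff uv == vu
    return u + v == v + u

def common_power_v2(u, v):
    # second test: compare the prefixes of length g = gcd(|u|, |v|),
    # then check that u and v are powers of that prefix (remainder λ)
    g = gcd(len(u), len(v))
    w = u[:g]
    return v[:g] == w and u == w * (len(u) // g) and v == w * (len(v) // g)

# the two tests agree on these sample pairs:
for u, v in [("abab", "ab"), ("abab", "ababab"), ("ab", "ba"), ("aab", "aabaab")]:
    assert common_power_v1(u, v) == common_power_v2(u, v)

assert common_power_v1("abab", "ababab")      # both are powers of ab
assert not common_power_v1("ab", "ba")
```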
Proposition 6.2 For each word u in Σ+ there is a unique pair (q, n), with q
in Q and n in N, for which u = q^n.

Proof For each i with 1 ≤ i ≤ |u|, let ui be the prefix of u of length i. Compute
successively the u/ui until a j occurs at which u/uj = (m, λ). Such a j will
certainly occur since u/u_{|u|} = (1, λ). For the pair (uj, m) we have uj in Q.
Consequently u = uj^m has the required form. Suppose now that u = p^m = q^n,
where both p and q are in Q. Then

uu = p^m p^m = p p^{m−1} p^m = p p^m p^{m−1} = p u p^{m−1} = p q^n p^{m−1} = (pq) q^{n−1} p^{m−1},

and

uu = q^n q^n = q q^{n−1} q^n = q q^n q^{n−1} = q u q^{n−1} = q p^m q^{n−1} = (qp) p^{m−1} q^{n−1}.

Thus pq = qp and by Proposition 6.1, p and q must be powers of a common
word. Since each is primitive, p = q and then also m = n, as required for
uniqueness.

For each word u the unique primitive word q for which u is a power of q will
be called the primitive root of u and will be denoted rt(u). Thus Proposition 6.1
may be rephrased: uv = vu if and only if rt(u) = rt(v). Note that for each
word u, u = rt(u)^n for a unique n in N. Thus for each positive integer m,
u^m = rt(u)^{nm}. Thus rt(u^m) = rt(u) for each word u and each positive integer
m. The exponent n in the unique representation u = rt(u)^n will be called the
exponent of u.
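The procedure in the proof of Proposition 6.2 translates directly into a sketch (the function name is our own): scan the prefixes of u in increasing length; the first prefix q with u = q^m is the primitive root rt(u), and m is the exponent.

```python
def primitive_root(u):
    """Return (q, m) with q primitive and u == q^m."""
    for i in range(1, len(u) + 1):
        q = u[:i]
        if len(u) % i == 0 and q * (len(u) // i) == u:
            return q, len(u) // i

assert primitive_root("abababab") == ("ab", 4)
assert primitive_root("aba") == ("aba", 1)        # aba is primitive
assert primitive_root("aaa") == ("a", 3)
# rt(u^m) = rt(u):
assert primitive_root("aba" * 5)[0] == "aba"
```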
Propositions 6.1 and 6.2 are the bedrock of the theory of word combinatorics
and should become familiar tools for anyone studying formal languages. They
contain only the information required to begin the discussion of language
visualization. The additional information about word combinatorics that is required
in the algorithmics of visualization is given in Section 8 of this chapter. For
further study of the fascinating but subtle mathematics of word combinatorics
see [36], [35], [28], and [13].


Exercises
(1) Let Σ = {a, b, c} serve as alphabet. Let

u = abcabcabcabcabcabcabcabcabcabc

and

v = abcabcabcabcabcabc.

Given that u and v commute, carry out the steps of the procedure given in
the second paragraph of the proof of Proposition 6.1 for finding the longest
word w for which u and v are powers of w. Give the finite sequence of the
words w0, w1, w2, . . . , wk that arises in this computation.
(2) Determine whether the words u and v as in the previous exercise are powers
of a common string w by each of the two methods stated in the paragraph
following Proposition 6.1. Note that the second method uses Euclid's
number-theoretic algorithm for finding greatest common divisors of integers
and produces the longest such w when u and v are powers of a common
string. The first method given produces the shortest such w when u and v
are powers of a common string.
(3) The proof of Proposition 6.1 contains the following assertion: "That such
a wk = λ must arise is a consequence of the observation that each w_{i−1}
is a nonnull prefix of w_{i−2} when w_{i−1} itself is not null." Prove the
observation that each w_{i−1} is a nonnull prefix of w_{i−2} when w_{i−1} itself is not
null.
(4) Observe that the length function | | : Σ+ → N is a semigroup homomorphism
that maps Σ+ onto the additive semigroup N. Let u and v be any two
nonnull words in Σ+ for which uv = vu. Show that the restriction of the
length function to the subsemigroup {u, v}+ is a semigroup homomorphism
| | : {u, v}+ → N that maps {u, v}+ one-to-one into the additive semigroup
N. What fails in your argument if uv ≠ vu?



6.3 The spectrum of a word with respect to a language
With each language L and each nonnull word w we define a subset of the
positive integers N called the spectrum, Sp(w, L), of w with respect to L:
Sp(w, L) = {n ∈ N : w^n is in L}. In the display of L in the half plane Q × N
the column at each primitive word, q, displays the spectrum of q. In fact,
the display of the spectra of the words in Q constitutes the representation of
L within Q × N. It is the spectra of the primitive words that are of primary
concern (since the spectrum of w^n can be read directly from the spectrum of w).
However, in Section 6.9 the value of defining the spectra of the nonprimitive
words along with the primitive words is justified.
It is convenient to classify spectra into five qualitatively distinct categories.
The spectrum of a word with respect to a language L may be the empty set,
a finite set, a cofinite set, the entire set N, or an intermittent set. When the
spectrum is the entire set N we say that the spectrum is full. Recall that a set
is cofinite if it has a finite complement. When a spectrum is empty it is also
finite and when a spectrum is full it is also cofinite. By an intermittent spectrum
we mean a spectrum that is neither finite nor cofinite. Note that if Sp(w, L) is
intermittent then, for every positive integer n, there are integers i > n and j >
n for which w^i ∈ L and w^j is not in L.
Each of the five cases is easily illustrated using the one letter alphabet Σ =
{a}: Sp(a, the empty language ∅) is empty. Sp(a, {a, aaa}) is the finite set
{1, 3}. Sp(a, a ∨ aaa+) is the cofinite set {n ∈ N : n = 1 or n ≥ 3}. Sp(a, a+)
is the full set N. Sp(a, (aa)+) is intermittent, being {n ∈ N : n is even}. The
single letter a is the only primitive word for the alphabet Σ = {a}. Spectra of
nonprimitive words for these same languages are illustrated: Sp(aa, ∅) = ∅,
Sp(aaa, {a, aaa}) = {1}, Sp(aa, a ∨ aaa+) = {n ∈ N : n ≥ 2}, Sp(aaa, a+)
is full, and Sp(aaa, (aa)+) = {n ∈ N : n is even}. Note that, in the case of a one
letter alphabet, such as Σ = {a}, the distinction between a language and
the spectrum of the letter a is somewhat artificial. Consider now the two letter
alphabet Σ = {a, b}. The spectrum of each word w with respect to the context-
free language L = {w ∈ Σ+ : a and b occur equally often in w} is either empty
or full; Sp(w, L) is full if w is in L and empty otherwise. For the regular
language L = (aa)+ ∨ (bbb)+, Sp(a^n, L) is full if n is even and intermittent if
n is odd. Sp(b^n, L) is full if n is divisible by three and intermittent otherwise.
Finally, Sp(w, L) = ∅ if both a and b occur in w.
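The spectra above can be computed directly from the definition, as in the following sketch. The bound restricting the computation to an initial segment of N is an artifact of the sketch, and membership in L is a Python predicate standing in for the language; all names are our own.

```python
def spectrum(w, in_L, bound=20):
    """Sp(w, L) restricted to {1, ..., bound}."""
    return {n for n in range(1, bound + 1) if in_L(w * n)}

# L = (aa)+ over {a}: the spectrum of a is the set of even numbers.
in_aa_plus = lambda x: len(x) % 2 == 0 and set(x) == {"a"}
assert spectrum("a", in_aa_plus) == set(range(2, 21, 2))

# L = (aa)+ v (bbb)+ over {a, b}:
in_L = lambda x: ((set(x) == {"a"} and len(x) % 2 == 0)
                  or (set(x) == {"b"} and len(x) % 3 == 0))
assert spectrum("a", in_L) == set(range(2, 21, 2))   # intermittent
assert spectrum("aa", in_L) == set(range(1, 21))     # full
assert spectrum("ab", in_L) == set()                 # empty
```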



Exercises
(1) Let L be a regular language in Σ+ and let N be represented in tally notation
N = |+. Show that, for any word w in Σ+, Sp(w, L) is a regular language
in N = |+.
(2) Let L be a context-free language in Σ+ and let N be represented in tally
notation. Show that, for any word w in Σ+, Sp(w, L) is a regular language
in N = |+.
(3) Let Σ be a finite alphabet that contains at least two symbols. Let w be one
specific fixed word in Σ+. For each of the following sets state whether the
set is countably infinite or uncountably infinite:
(a) {L : L is a language contained in Σ+},
(b) {Sp(w, L) : L ranges through all the languages in Σ+},
(c) {Sp(w, L) : L ranges through all the regular languages in Σ+}, and
(d) {Sp(w, L) : L ranges through all the context-free languages in Σ+}.


6.4 The spectral partition of Σ+ and the support of L
Let L be a language contained in Σ+. This language L provides an equivalence
relation, ∼, defined for words u and v in Σ+, by setting u ∼ v provided u and v
have identical spectra, i.e., Sp(u, L) = Sp(v, L). We call the partition provided
by ∼ the spectral partition, P(L), of Σ+ induced by L. This partition is a
fundamental tool for the present study. In Section 6.7 it is observed that, when L
is regular, P(L) consists of a finite number of constructible regular languages.
Using a refinement of P(L) and Theorem 6.1 gives a precise view of L, within
Q × N, when L is a regular language. The spectral partitions determined by
the languages discussed in Section 6.3 are given next as examples.
Let Σ = {a}. For the language L = Σ+, the spectrum of every word
in Σ+ is full. Consequently P(L) = P(Σ+) consists of a single class,
i.e., P(Σ+) = {Σ+}. For the empty language, ∅, the spectrum of every
word in Σ+ is ∅. Thus P(∅) also consists of the single class {Σ+}.
For L = {a, aaa}, P(L) = {{a}, {aaa}, Σ+\L}. For L = a ∨ aaa+, P(L) =
{{a}, {aa}, aaa+}. For L = (aa)+, P(L) = {{a^n : n is odd}, {a^n : n is even}}.
Now let Σ = {a, b}. For L = {w in Σ+ : a and b occur equally often in w},
P(L) = {L, Σ+\L}. For L = (aa)+ ∨ (bbb)+, P(L) = {L, a(aa)*, b(bbb)* ∨
bb(bbb)*, Σ*abΣ* ∨ Σ*baΣ*}.
For visualizing a language, L, within Q × N, the spectra of the primitive
words in Σ+ provide the whole picture. If desired, the spectrum of a nonprimitive
word, q^n, can be obtained from the spectrum of its primitive root, q. In
fact, for the task at hand here, there is little motive for interest in the spectra
of individual nonprimitive words. For each equivalence class, C, in P(L)
we are actually only interested in C ∩ Q. The single reason for providing the
definition of the spectra of nonprimitive words is that each resulting spectral
class, C, can often provide satisfactory access to the crucial set of primitive
words C ∩ Q. The first three crucial questions we ask about a set C ∩ Q are:
(a) Is C ∩ Q empty? (b) If not, is C ∩ Q infinite? (c) If C ∩ Q is finite, can its
elements be listed? These questions are answered for the languages discussed
in the previous paragraph in order to provide examples.
For a one-letter alphabet, Σ = {a}, the letter itself is the only primitive
word. Consequently for any language L contained in Σ+, C ∩ Q is empty for
each C other than the one containing the letter a. Now let Σ = {a, b}. For
L = {w in Σ+ : a and b occur equally often in w}, each of the two classes
in P(L) = {L, Σ+\L} contains an infinite number of primitive words. For
L = (aa)+ ∨ (bbb)+, we previously obtained P(L) = {L, a(aa)*, b(bbb)* ∨
bb(bbb)*, Σ*abΣ* ∨ Σ*baΣ*}. For these four spectral classes we have:


L ∩ Q is empty; (a(aa)*) ∩ Q = {a}; (b(bbb)* ∨ bb(bbb)*) ∩ Q = {b}; and
(Σ*abΣ* ∨ Σ*baΣ*) ∩ Q is infinite.
For each language L contained in Σ+, the set Su(L) = {q ∈ Q : Sp(q, L) is
not empty} will be called the support of L. For a one-letter alphabet, Σ = {a},
the support of each nonempty language L is Σ itself. Now let Σ = {a, b}.
For the language L = {w in Σ+ : a and b occur equally often in w}, Su(L) is
the infinite set L ∩ Q. For L = (aa)+ ∨ (bbb)+, Su(L) is the finite set {a, b}.
The cardinality of the support of a language is of special significance for the
investigations introduced here. When a support is finite, the specific primitive
words in the support are desired.


Exercises
(1) For Σ = {a, b} and L = {(ab^m)^n : m, n ∈ N}:
(a) determine the spectrum of each of the words ab, abbabbabb, ababb;
state whether each spectrum is empty, finite, cofinite, full, or intermittent;
(b) determine the spectral partition P(L); and
(c) determine the support Su(L) and state whether it is a regular language.
(2) Let Σ = {a, b} and L = {a^n b^n : n ∈ N}.
(a) Confirm that the spectrum of each word in Σ+ is either ∅ or {1}.
(b) Determine P(L) and Su(L).
(c) For the language LL, determine the spectra of ab, abab, and ababab.
(d) Describe P(LL) and Su(LL).



6.5 Visualizing languages
In order to spell out the visualization of a language L within Q × N, we begin
with the usual x-y plane with each point having associated real number coordinates
(x, y). We use only the upper half plane, {(x, y) : y > 0}. With each
integer i and each positive integer n we associate the unit square
R(i, n) = {(x, y) : i − 1 < x ≤ i, n − 1 < y ≤ n}.
In this way the upper half plane is partitioned into nonoverlapping unit squares
{R(i, n) : i an integer, n ∈ N}. To visualize a specific language L in Σ+ we
first identify the set Q with the set Z of integers using any chosen bijection
B : Q → Z. (The bijection B is chosen only after a study of the spectral partition
of the specific language L has been made, as illustrated below.) Once


the bijection B is chosen, each word q^n in Σ+ is associated with (figuratively,
“placed on”) the unit square R(B(q), n). Finally, the language L is visualized
by defining, using B, a sketch function S : {R(B(q), n) : q ∈ Q, n ∈ N} →
{Black, White} for which S(R(B(q), n)) = Black if q^n is in L and White otherwise.
For each given language L and each bijection B, the resulting sketch
function is said to provide a sketch of the language L. By the sketch we mean
the image of the sketch function that provides it. Thus we regard the sketch as a
half plane in which each of the unit squares is either black or white. Since there
are many possible choices for B, there may be many possible sketches of L.
For many languages, coherent sketches can be given by basing the choice of the
bijection B on a determination of the spectral decomposition of the language.
Examples follow for which we use the alphabet Σ = {a, b}. These examples
suggest several new formal language concepts that we believe are worthy of
theoretical development. Each definition given in this section follows immediately
below one or more examples that illustrate or clarify the concept being
defined.
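A sketch function of this kind is easy to render as text. The following Python sketch prints rows n = height, ..., 1 of the half plane restricted to a few columns; the column assignment `B_inv` and the membership predicate are illustrative assumptions:

```python
def sketch(B_inv, accepts, width=6, height=4):
    """Render the unit squares R(i, n): square (i, n) is '#' (Black)
    exactly when q^n is in L, where q = B_inv(i) is the primitive word
    placed at column i."""
    rows = []
    for n in range(height, 0, -1):          # top row first
        rows.append("".join("#" if accepts(B_inv(i) * n) else "."
                            for i in range(1, width + 1)))
    return rows
```

For L = a+ with the primitive word a placed at column 1, every square in column 1 is black and all other sampled squares are white.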
Example 6.1 For L = {w in Σ+ : a and b occur equally often in w}, each of
the two spectral classes in P(L) = {L, Σ+\L} contains an infinite number of
primitive words. The spectrum of each word in L is full and the spectrum of
each word in Σ+\L is empty. Let B be any bijection for which B(L ∩ Q) =
{i ∈ Z : i ≤ 0} and B((Σ+\L) ∩ Q) = {i ∈ Z : i ≥ 1}. The sketch provided
by this choice of B gives a black left quadrant and a white right quadrant. The
support of this language is the infinite set L ∩ Q.
Definition 6.1 A language L is cylindrical if, for each word w in Σ+, Sp(w, L)
is either empty or full.

The language L of Example 6.1 is cylindrical. There are numerous “naturally
occurring” examples of cylindrical languages: the fixed language L =
{w ∈ Σ+ : h(w) = w} of each endomorphism h of Σ+ is a cylindrical regular
language and so is the stationary language of each such endomorphism
[16][15]. Retracts and semiretracts [16][10][3][1] of free monoids are cylindrical
languages. Investigations of various forms of periodicity in the theory
of Lindenmayer systems have led to additional examples of cylindrical
languages [24][26].
Example 6.2 For L = {aa, aaa, aaaa, aaaaaa, bbb, bbbb, ababab} only
three primitive words have nonempty spectra: a, b, and ab. Let B be any
bijection for which B(a) = 1, B(b) = 2, and B(ab) = 3. The sketch provided
by such a B gives a half plane that is white except for the three columns
above the three primitive words a, b, and ab. The column above a reads, from


the bottom up, white, black, black, black, white, black, and white thereafter.
The column above b reads white, white, black, black, and white thereafter. The
column above ab reads white, white, black, and white thereafter. The support
of this language is the finite set Su(L) = {a, b, ab}.
Example 6.3 For L = {(a^m b)^n : m, n ∈ N, m ≥ n}, the support of L is
Su(L) = {a^m b : m ∈ N}. Let B be any bijection for which, for each m in N,
B(a^m b) = m. The sketch provided by such a B gives a white left quadrant. The
right quadrant is white above a sequence of black squares ascending upward
at 45 degrees and black below this sequence of squares. The support of this
nonregular language is the infinite regular language a+b.
Definition 6.2 A language L is bounded above if, for each word w in Σ+,
Sp(w, L) is finite. A language L is uniformly bounded above if it is bounded
above and there is a positive integer b for which, for each w in Σ+ and each n
in Sp(w, L), n ≤ b.

Any finite language, such as the one given in Example 6.2, is necessarily
uniformly bounded above. An infinite language may also be uniformly bounded
above (Exercise 4, below in this section) or bounded above without a uniform
bound, as illustrated in Example 6.3.

Example 6.4 For L = {a, aaa, aaaaa} ∪ b(a ∨ b)*, each word that begins
with a b has a full spectrum and each word that begins with an a and contains
a b has an empty spectrum. Let B be any bijection for which B(a) = 1 and
B(Q ∩ b(a ∨ b)*) = {n ∈ N : n ≥ 2}. The sketch provided by such a B gives
a white left quadrant and a right quadrant that is black except for the column
above a, which reads black, white, black, white, black, and white thereafter.
The support of this regular language is the infinite nonregular set
Su(L) = L ∩ Q.

Example 6.5 For L = {(a^m b)^n : m, n ∈ N, m odd, m ≥ n} ∪ {(a^m b)^n : m,
n ∈ N, m even, m ≤ n}, the support of L is Su(L) = a+b. Let B be any
bijection for which, for each m in N, B(a^m b) = m. The sketch provided by
such a B gives a white left quadrant. The right quadrant has a sequence of
black squares ascending upward at 45 degrees. For each odd positive integer
m, (a^m b)^n is black for n ≤ m and white for n > m, whereas, for each even
positive integer m, (a^m b)^n is white for n < m and black for n ≥ m.

Definition 6.3 A language L is eventual if, for each word w in Σ+, Sp(w, L)
is either finite or cofinite. The language L is uniformly eventual if there is an m
in N for which, for each word w in Σ+, either Sp(w, L) ⊆ {n ∈ N : n < m}
or Sp(w, L) ⊇ {n ∈ N : n ≥ m}.


The language of Example 6.4 is uniformly eventual. The language of Example
6.5 is eventual but not uniformly eventual. Note that each cylindrical
language is uniformly eventual (where any n in N may be taken as the uniform
bound). Note also that each language that is (uniformly) bounded above
is (uniformly) eventual. Every uniformly eventual language is the symmetric
difference of a cylindrical language and a language that is uniformly bounded
above (Exercise 2, below in this section). Each noncounting language [31] is
uniformly eventual, as was pointed out in [15], where the concept of an eventual
language was first introduced.

Example 6.6 For L = aa ∨ aaa ∨ (aabaab)+ ∨ (ababab)+ ∨ b(a ∨ b)*,
each word that begins with a b has a full spectrum. Each primitive word
that begins with an a has an empty spectrum except for the primitive words
a, aab, and ab. Let B be any bijection for which B(a) = 1, B(aab) = 2,
B(ab) = 3, and B(b(a ∨ b)* ∩ Q) = {n ∈ N : n ≥ 4}. The sketch provided
by such a B gives a white left quadrant and a right quadrant that is black
except for three columns. The column above a reads: white, black, black, and
white thereafter. The columns above aab and ab are both intermittent, with
the first having period two and the second having period three. Therefore
Su(L) = {a, aab, ab} ∪ (Q ∩ bΣ*).

Definition 6.4 A language L is almost cylindrical (respectively, almost
bounded above, almost uniformly bounded above, almost eventual, almost
uniformly eventual) if it is the union of a language with finite support and a
language that is cylindrical (respectively, bounded above, uniformly bounded
above, eventual, uniformly eventual).

The language of Example 6.6 is almost cylindrical and therefore also almost
uniformly eventual. The uniformly eventual language of Example 6.4 is almost
cylindrical. The language of Exercise 3 in this section, below, is uniformly
eventual, almost uniformly bounded above, and also almost cylindrical. The
union of the languages of Exercises 3 and 4 in this section, below, is uniformly
eventual and almost uniformly bounded above but not almost cylindrical. The
union of the languages of Examples 6.5 and 6.6 is almost eventual, but not
almost uniformly eventual.
John Harrison provided the first application of the concept of an almost
cylindrical language in [14].
If humor can be tolerated, we may say that the freedom we allow in choosing
the bijections B for determining our language sketches can be supported
with the slogans: “All Primitives Were Created Equal”, “End Domination by
Alphabetical Symbols”, and “Power to the Primitives!”


Exercises
(1) Regular languages that have a given property often have the uniform version
of the property:
(a) Show that every regular language that is bounded above is uniformly
bounded above.
(b) Show that every regular language that is almost eventual is almost
uniformly eventual. (Both parts of this exercise may be easier after reading
Section 6.10.)
(2) Show that each uniformly eventual language L is the symmetric difference
of a cylindrical language and a language that is uniformly bounded
above.
(3) Let L = a ∨ aaa ∨ (ab)+ ∨ b+. Describe the spectrum of each word in Q.
Find P(L) and Su(L). Choose a bijection B : Q → Z which will provide
a coherent sketch of L. Describe this sketch.
(4) Let L = {a^n b^n : n ∈ N}. Describe the spectrum of each word in Q. Find
P(L) and Su(L). Choose a bijection B : Q → Z which will provide a
coherent sketch of L. Describe this sketch.
(5) Let L = (ΣΣ)+. Describe the spectrum of each word in Σ+. State whether
each spectrum is empty, finite, cofinite, full, or intermittent. Find P(L) and
Su(L). Choose a bijection B : Q → Z which will provide a coherent sketch
of L. Describe this sketch.
(6) Let L = (ab+ab+)+. Find P(L) and Su(L). Choose a bijection B : Q → Z
which will provide a coherent sketch of L. Describe this sketch.
(7) Let L = ((ab+)^6)+. Find P(L) and Su(L). Choose a bijection B : Q → Z
which will provide a coherent sketch of L. Describe this sketch.


6.6 The sketch parameters of a language
Each sketch of a language L in Σ+ is given by a sketch function S that is
determined entirely by L and the choice of a bijection B : Q → Z. Given two
sketches of the same language L, each can be obtained from the other by an
appropriate permutation of the columns appearing in the sketches. Mathematically,
distinguishing between different sketches of the same language L is rather
artificial. The distinctions have been made because we prefer the more visually
coherent sketches to those that are less visually coherent. The class of all
sketches of a given language is determined by any one of its members. Observe
that the sketches of a language L are determined by what we call the sketch
parameters of L, which we define as follows: there is one sketch parameter for
each spectral class C that contains at least one primitive word. The parameter


associated with such a C is the ordered pair consisting of the spectrum of
any primitive word q in C and the cardinal number of C ∩ Q. This sketch
parameter is therefore the ordered pair (Sp(q), K) where q is in C ∩ Q and K
is the cardinal number of C ∩ Q. In the discussion of the examples that follows,
the cardinal number of N, i.e. the denumerably infinite cardinal, is denoted by
the symbol ∞.
For Example 6.1 of Section 6.5, there are only two sketch parameters,
(N, ∞) and (∅, ∞). For Example 6.2 of the same section, there are four
sketch parameters, ({2, 3, 4, 6}, 1), ({3, 4}, 1), ({3}, 1), and (∅, ∞). Example 6.3
has parameters (∅, ∞) and, for each n in N, ({m : m ≤ n}, 1). Example 6.4
has parameters ({1, 3, 5}, 1), (N, ∞), and (∅, ∞). Example 6.5 has parameters
as follows: for each m in N with m odd, ({n ∈ N : m ≥ n}, 1); for each
m in N with m even, ({n ∈ N : m ≤ n}, 1); and finally (∅, ∞). Example 6.6
has five parameters: ({2, 3}, 1), ({2n : n ∈ N}, 1), ({3n : n ∈ N}, 1), (N, ∞),
and (∅, ∞).
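Parameters of this kind can be estimated mechanically from a finite sample of primitive words and truncated spectra. A Python sketch follows; the sample, the predicate, and the bound are illustrative assumptions, and a genuine computation requires Theorem 6.2:

```python
from collections import Counter

def sketch_parameters(prim_words, accepts, limit=20):
    """Approximate sketch parameters: pairs (truncated spectrum, number of
    sampled primitive words having that spectrum)."""
    spec = lambda q: frozenset(n for n in range(1, limit + 1) if accepts(q * n))
    counts = Counter(spec(q) for q in prim_words)
    return set(counts.items())
```

For L = a+ ∨ b+ sampled on the primitive words a, b, ab, ba this yields the two parameters (N, 2) and (∅, 2), with N truncated at the bound.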
We say that two languages are sketch equivalent if they can be represented
by a common sketch. For example, the context-free language L of Example 6.1
is sketch equivalent to the regular language b(a ∨ b)* since each can be represented
by a sketch that has a black left quadrant and a white right quadrant. Similarly
the context-free language of Exercise 4 of Section 6.5 is sketch equivalent
to the regular language ba* since each can be represented by a sketch that has a
white left quadrant and a right quadrant that is white except for one horizontal
black stripe at n = 1. Since the sketch parameters of a language determine the
class of all possible sketches of the language, two languages are sketch equivalent
if and only if they have the same sketch parameters. Consequently if L and
L′ are languages for which the sketch parameters can be determined, then one
may be able to decide whether L and L′ are sketch equivalent by comparing
the sketch parameters of L and L′. This will certainly be the case if one of the
languages has only finitely many sketch parameters. In Section 6.10, it is shown
that every regular language has only finitely many sketch parameters and that
they can be calculated.


Open Ended Exercise Investigate the sketch parameters of QQ = {pq :
p, q ∈ Q}.


Open Ended Exercise Which sets of sketch parameters can occur as the set
of sketch parameters of a language L? This question becomes more interesting
when L is required to be regular. The regular case might be considered again
after reading one or more of the remaining sections.


6.7 Flag languages
Each language L is recognized by its intrinsic automaton M(L). The concept of
the recognition of a language by an automaton is thoroughly classical, at least
for the regular languages. A concise presentation for arbitrary languages has
been included in Chapter 3. In this chapter we apply M(L) only to the study
of the spectra of regular languages, although applications may be possible in
additional contexts. The notation of Chapter 3 is used to give a perfectly explicit
discussion of the sketches of regular languages.
Assume now that L is a regular language in Σ+ and that its recognizing
automaton M(L) has m states. With each word w in Σ+ we associate a finite
sequence of states of M(L) in the following way: consider the infinite sequence
of states {[w^n] : n a nonnegative integer}. Since M(L) has only m states, there
is a least nonnegative integer i for which there is a positive integer j for which
[w^i] = [w^{i+j}]. Let r be the least positive integer for which [w^i] = [w^{i+r}]. We
call the sequence {[w^n] : 0 ≤ n < i + r} the flag F(w) of the word w. The
length of F(w) is i + r. Since M(L) has only m states, the maximum length
of the flag of any word is m. The collection of distinct flags {F(w) : w ∈ Σ+}
associated with a regular language L is necessarily finite. By a flag F of the
language L we mean a sequence of states that constitutes the flag, relative to
L, of some word w in Σ+. With each flag F of L we associate the language
I(F) = {w ∈ Σ+ : F(w) = F}. We call I(F) the language of the flag F. For
each flag F = {s_j : 0 ≤ j ≤ k}, where the s_j denote the states in F, we have

I(F) = ∩{L(s_j, s_{j+1}) : 0 ≤ j ≤ k − 1}

where each L(s_j, s_{j+1}) is the language that consists of all words x in Σ+ for
which s_j x = s_{j+1}. Since each of the languages L(s_j, s_{j+1}) is regular, each flag
language is regular. The great value of the flag languages, for regular L, is that
they constitute a finite partition of Σ+ into equivalence classes, each of which is
a nonempty regular language. The flag partition of Σ+ into the flag languages
determined by L is denoted P′(L).
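The flag of a word can be computed by iterating the state reached by successive powers of the word until a state repeats. A minimal Python sketch follows; the extended transition function `delta`, taking a state and a whole word, is an assumed interface rather than anything fixed by the text:

```python
def flag(w, delta, start):
    """Return the flag of w: the states [w^0], [w^1], ..., [w^(i+r-1)],
    where i and r are the least preperiod and period of the eventually
    periodic sequence of states."""
    seq, seen = [], set()
    s = start
    while s not in seen:       # stop at the first repeated state
        seen.add(s)
        seq.append(s)
        s = delta(s, w)        # state reached after reading one more copy of w
    return seq
```

In a toy machine where state 0 plays the role of [λ] and any nonempty word leads to an absorbing state 1, the flag of every nonempty word is [0, 1].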

Open Ended Exercise In the theory of Abelian groups the concept of torsion
plays a fundamental role [8]. Can this suggest a worthwhile concept of torsion
for language theory? A first attempt might begin with the tentative definition: a
word w in Σ+ is a torsion word with respect to a language L if the flag of w
in M(L) is finite. If this definition is used then, for each regular L, all words in
Σ+ would be torsion words with respect to L. The torsion words with respect
to the context-free language L = {w in Σ+ : a and b occur equally often in w}
would be the words in L.


6.8 Additional tools from word combinatorics
This section contains three additional propositions on word combinatorics that
are needed for the algorithmics of the next section. Two words x and y are said to
be conjugates of one another if they possess factorizations of the form x = uv
and y = vu. From the next proposition it follows that conjugates have the
same exponent, which includes the information that the conjugates of primitive
words are primitive. This last fact, that conjugates of primitives are primitive,
is applied many times in Section 6.9.
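The exponent and primitive root of a word (the root is written rt(x) in Section 6.9) are computable by direct search, and primitivity admits the standard doubled-word test. The following Python sketch illustrates both; the test is a standard combinatorial fact, not an algorithm taken from this chapter:

```python
def primitive_root(w):
    """Return (q, e) where w = q^e and q is primitive."""
    n = len(w)
    for d in range(1, n + 1):
        if n % d == 0 and w[:d] * (n // d) == w:
            return w[:d], n // d

def is_primitive(w):
    """A word is a proper power exactly when it occurs inside (w + w)
    with the first and last letters removed."""
    return len(w) > 0 and w not in (w + w)[1:-1]
```

Conjugates share an exponent (Proposition 6.3): for example ababab = (ab)^3 and its conjugate bababa = (ba)^3.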
Proposition 6.3 If uv = p^n then vu = q^n with q a conjugate of p.

Proof Since uv = p^n, we may assume that p = u′v′ where u = p^i u′ and
v = v′p^j, with i and j nonnegative integers for which i + j = n − 1. For q =
v′u′ we have

q^n = (v′u′)^n = v′(u′v′)^{n−1}u′ = v′p^{n−1}u′ = v′p^j p^i u′ = vu.

Lemma 6.1 Let v be a word for which vv = xvy with x and y nonnull. Then
v = xy = yx.

Proof Since |v| = |x| + |y| and v has x as a prefix and y as a suffix, v = xy.
Then vv = xvy gives xyxy = xxyy and by cancellation yx = xy.

Proposition 6.4 If u^i and v^j, with i, j ≥ 2, have a common prefix of length
|u| + |v| then u and v are powers of a common word.

Proof By the symmetry of u and v in the hypothesis, we may assume |u| ≥ |v|.
Then v is a prefix of u and u = v^n x where u/v = (n, x). The v that occurs as
the prefix of the second u, in the series of u's concatenated to form u^i, occurs
also as a factor of the product of the two v's that occur as the (n + 1)st and
the (n + 2)nd v's concatenated to form v^j. This provides a factorization of the
form vv = xvy. By Lemma 6.1 and Proposition 6.1, x and y are powers of a
common word and therefore so are v = xy and u = (xy)^n x.

Proposition 6.5 Let u and v be words that are not powers of a common word.
For each n in N either u^n v is primitive or u^{n+1}v is primitive. Consequently the
set Q ∩ u+v is infinite.

Proof If both u^n v and u^{n+1}v fail to be primitive then, by Proposition 6.3,
the conjugate u^n vu of u^{n+1}v also fails to be primitive and we have
u^n v = p^i and u^n vu = q^j with p, q primitive and i, j ≥ 2. Then p^{2i} = u^n vu^n v
and q^j = u^n vu have the common initial segment u^n vu, which has length


|u^n vu| = (1/2)|u^n vu| + (1/2)|u^n vu| > (1/2)|u^n v| + (1/2)|u^n vu| ≥ |p| + |q|.
By Proposition 6.4, p and q are powers of a common word and, since they
are primitive, p = q. We then have u^n v = p^i and u^n vu = p^j, which gives the
contradiction: u = p^{j−i} and v = p^{i−n(j−i)}. Finally, since at least one member
of each pair from the infinite collection of pairs {u^k v, u^{k+1}v} must be primitive,
the set Q ∩ u+v is infinite.


Exercises
(1) Provide an alternative proof of Proposition 6.2 using Proposition 6.4.
(2) Provide an alternative proof of Proposition 6.2 using Lemma 6.1.
(3) Let Σ be an alphabet containing the symbols a and b. Let u be any word in
Σ+. Show that at least one of ua and ub must be primitive.
(4) Let u and v be in Σ+. Suppose that, for some n in N, no word in the set
{u^k v : k ≥ n} is primitive. Prove that uv = vu. Can you prove this using
only Lemma 6.1, without using either Proposition 6.4 or Proposition 6.5?



6.9 Counting primitive words in a regular language
In order to construct the sketch parameters of a language L we will need to
determine the cardinal number of the set C ∩ Q for each spectral class C of L.
The conceptually simple instructions for finding the cardinal of each set L ∩ Q
for any regular language L are given next, followed by a justification that
is a simplified version of a proposition provided by M. Ito, M. Katsura, H. J.
Shyr and S. S. Yu in [25].

The Counting Procedure Let A be an alphabet with at least two symbols
and let L be a regular language contained in A+. Let n ≥ 2 be the number of
states of the automaton M(L) that recognizes L and let B = 4n. Begin testing
the primitivity of words in L of length ≤ B. As the testing progresses, maintain
a list of all primitive words found thus far. If a primitive word p with |p| ≥ n
is encountered, STOP with the information that |L ∩ Q| = ∞. Otherwise
continue the testing and the listing process for words in L of length ≤ B until
either a primitive word p with |p| ≥ n is encountered, or all the words in L
of length ≤ B have been tested. If this procedure has not STOPPED with the
information that |L ∩ Q| = ∞, then the final list of primitive words found is
the complete list of all primitive words in L. Such a list will be finite and may be
empty.
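The Counting Procedure can be transcribed almost directly. In the Python sketch below, L is given as a membership predicate together with the state count n of M(L) (both assumed inputs), and primitivity is tested with the standard doubled-word trick:

```python
from itertools import product

def is_primitive(w):
    # w is a proper power exactly when it occurs inside (w + w) with the
    # first and last letters removed, so one substring test suffices.
    return len(w) > 0 and w not in (w + w)[1:-1]

def counting_procedure(accepts, n, alphabet="ab"):
    """Return the complete (finite, possibly empty) list of primitive words
    of L, or the string 'infinite' once a primitive word of length >= n
    is found; Theorem 6.1 justifies both outcomes."""
    B = 4 * n
    found = []
    for length in range(1, B + 1):
        for letters in product(alphabet, repeat=length):
            w = "".join(letters)
            if accepts(w) and is_primitive(w):
                if len(w) >= n:
                    return "infinite"
                found.append(w)
    return found
```

For L = a+, whose intrinsic automaton over {a, b} has three states (counting the dead state), the procedure returns the single primitive word a; for L = Σ+ (two states) it stops with "infinite".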
This counting procedure is justi¬ed by the following result:


Theorem 6.1 Let Σ be an alphabet with at least two symbols and let L be
a regular language contained in Σ+. Let n ≥ 2 be the number of states of the
automaton M(L) that recognizes L and let B = 4n. Then: (1) L ∩ Q is empty
if L contains no primitive word of length ≤ B; (2) L ∩ Q is infinite if L contains
a primitive word of length ≥ n; and (3) if L ∩ Q is infinite then it contains a
primitive word p with |p| ≤ B.

Proof Note first that (1) will follow immediately once (2) and (3) are proved:
suppose L ∩ Q contains no primitive word of length ≤ B. Then, if L ∩ Q
contained any primitive word at all, that word would have length > B. Then
L ∩ Q would be infinite by (2) and would contain, by (3), a primitive word p
for which |p| ≤ B, contradicting the original supposition. Next we prove (2).
We consider two distinct cases: suppose first that, for every state [u] in M(L),
there is a word v in Σ+ for which [uv] is a final state. Since M(L) has only n
states it follows that there is such a word v with |v| ≤ n − 1. Let a and b be two
distinct symbols in Σ. For every integer i ≥ n − 1 there is a word v of length
≤ n − 1 for which a^i bv is in L. Each such a^i bv is primitive and is therefore
in Q ∩ L. Consequently Q ∩ L is infinite, as was to be proved. Surprisingly,
perhaps, for this case we have a stronger version of (3) since a word w = a^{n−1}bv
lies in Q ∩ L and n ≤ |w| ≤ 2n − 1 < B. (Exercises 5 and 6 of Section 6.1
contain related concepts.)
Now suppose that M(L) has a state g for which [gv] is not final for any word
v in Σ+. Such a state g is often called a dead state. Suppose that w is in L ∩ Q
and |w| = r ≥ n. As w is read by M(L), a walk is made from the initial state
to a final state and this walk enters r states after leaving the initial state. This
walk does not enter g. Since this walk involves a sequence of r + 1 ≥ n + 1
states there must be a repetition of states among the last n states in the list. This
gives a factorization w = uxv for which [u] = [ux] where both u and x are
nonnull and ux*v ⊆ L. Since uxv = w is primitive, so is its conjugate xvu.
Since xvu is primitive, x and vu cannot be powers of a common word. By
Proposition 6.5 (Section 6.8) the set Q ∩ x+vu is infinite. Since each word in
ux+v is a conjugate of a word in x+vu, the set Q ∩ ux+v is also infinite and,
since also ux*v ⊆ L, Q ∩ L is infinite, as was to be proved.
Suppose now that |L ∩ Q| = ∞. Let z be a word of minimal length in L ∩ Q.
To conclude the proof it is only necessary to show that |z| ≤ B. Suppose that
|z| > B. Since B = 4n and M(L) has only n states, z possesses a factorization
z = ux′xvy′yw for which: |ux′x| < 2n; |y′yv| < 2n; [u] = [ux′] = [ux′x];
[ux′xv] = [ux′xvy′] = [ux′xvy′y]; and none of the words x′, x, y′, y,
uvw is null. We are concerned with the relative lengths of the four words x′, x,
y′, and y. It is sufficient to treat only the case in which |x| ≤ |x′|, |y| ≤ |y′|,


and |x′| ≤ |y′|. Each of the seven other settings of the inequalities can be treated
in an exactly analogous manner. (See Exercises 1 and 2 in this section, below.)
Since uxvw and uxxvw are in L and are shorter than z, neither is in Q.
Consequently, neither of their conjugates xvwu and xxvwu is in Q. From
Proposition 6.5 it follows that rt(x) = rt(vwu). Since uxvy′yw and uxxvy′yw
are in L and are shorter than z, neither is in Q. Consequently, neither of their
conjugates xvy′ywu and xxvy′ywu is in Q. From Proposition 6.5 it follows that
rt(x) = rt(vy′ywu). Since ux′vw and ux′x′vw are in L and are shorter than z,
neither is in Q. Consequently, neither of their conjugates x′vwu and x′x′vwu
is in Q. From Proposition 6.5, it follows that rt(x′) = rt(vwu). We now have
rt(x′) = rt(x) = rt(vy′ywu). Consequently the word (x′)(x)(vy′ywu) is not
primitive, being in fact a power of rt(x). Since z = ux′xvy′yw is a conjugate
of x′xvy′ywu it cannot be primitive either. This contradiction confirms that the
shortest word in L ∩ Q has length ≤ B.


Exercises
(1) Carry out the proof in the final two paragraphs of Theorem 6.1 above using
the settings: |x| ≤ |x′|, |y| ≤ |y′|, and |y′| ≤ |x′|.
(2) Carry out the proof in the final two paragraphs of Theorem 6.1 above using
the settings: |x′| ≤ |x|, |y| ≤ |y′|, and |y′| ≤ |x|.
(3) Study the proof of Theorem 6.1 above to see if the given proof will hold if
you replace B = 4n by B = 4n − 1. Can you reduce B any further without
some basic additional insight?

Remark 6.1 The value of B can be reduced a good deal in Theorem 6.1
and in the resulting Counting Procedure using more powerful tools from word
combinatorics. This is confirmed for B = 3n − 3 in [25] and later for B =
(1/2)(5n − 9) by M. Katsura and S. S. Yu. See also [13].



6.10 Algorithmic sketching of regular languages
The spectrum of any word w in Σ+ relative to a regular language L can be read
from the flag of w. This is merely a matter of noting which of the states in the
flag of w are final states of M(L). Thus words having the same flag have the
same spectrum. There are several absolutely fundamental consequences of this
fact: (1) the flag partition P′(L) of Σ+ refines the spectral partition P(L); (2) since a
regular language has only finitely many flags it has only finitely many distinct
spectra; and (3) since each spectral class is the union of (a finite number of) flag


languages (each of which is regular) the spectral classes of a regular language
are regular. Thus both P′(L) and P(L) are finite partitions of Σ+ into regular
sets.
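Reading a spectrum off a flag amounts to checking which flag states are final: positions n < i give the finite head explicitly, and positions n ≥ i repeat with period r. A Python sketch follows; the state lists and final-state sets are assumed toy data:

```python
def spectrum_from_flag(flag_states, i, finals):
    """Split Sp(w) into a finite head {n : 1 <= n < i, [w^n] final} and a
    set of residues: n >= i lies in Sp(w) iff (n - i) % r is in the cycle,
    where r = len(flag_states) - i is the period."""
    r = len(flag_states) - i
    head = {n for n in range(1, i) if flag_states[n] in finals}
    cycle = {j for j in range(r) if flag_states[i + j] in finals}
    return head, cycle
```

For w = a in M((aa)+), the flag is the state list [[λ], [a], [aa]] with i = 1 and r = 2; only [aa] is final, so the head is empty and the cycle is {1}: Sp(a) is the set of even integers, as computed in Section 6.3.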

Theorem 6.2 Each regular language has only finitely many sketch parameters
and these parameters are algorithmically computable.

Proof Given a regular language L, construct M(L). Let m be the number of
states of M(L). Only finitely many sequences of states F = {s_j : 0 ≤ j < k}
with k ≤ m could possibly occur as flags of words in Σ+. For each such sequence
F construct the intersection I = ∩{L(s_j, s_{j+1}) : 0 ≤ j ≤ k − 1}, where each
L(s_j, s_{j+1}) is the language that consists of all words x in Σ+ for which s_j x =
s_{j+1}. If I is empty then F is not the flag of any word. If I is not empty then
F is the flag of each word w in I and consequently we have I(F) = I. At
this point we have determined the partition P′(L) of Σ+ into the flag languages
determined by L. Note that each flag F determines the spectrum that is common
to each word w in I(F) since Sp(w) = {n ∈ N : [w^n] is a final state of M(L)}.
For each flag F associated with L, determine the spectrum of F and apply
the Counting Procedure in Section 6.9 to determine the cardinal number of
Su(I(F)). Since distinct flags may have the same spectrum, flag languages that
have a common spectrum must be collapsed together. Each spectral class C
arises as the union of the flag languages it contains. Thus the spectral partition
P(L) arises as the resulting coarsening of P′(L). Each sketch parameter arises
from a spectral class C that contains a primitive word q and has the form
(Sp(q), sum{|Su(I(F))| : I(F) ⊆ C}).

An Example Computation Let L = (a ∨ b)a*b*. One may verify that M(L)
has four states: [λ]; [a] = [b] = [aa] = [ba]; [ab] = [bb]; and [aba] = [bba] =
[abab] = [baba] = “dead.” There are two final states, [a] and [ab], and
two nonfinal states, [λ] and “dead.” There are six distinct flags: F(a): [λ],
[a] = [aa]; F(b): [λ], [b], [bb] = [bbb]; F(ab): [λ], [ab], [abab] = “dead;”
F(bb): [λ], [bb] = [bbbb]; F(ba): [λ], [ba], [baba] = “dead;” and F(aba):
[λ], [aba] = “dead.” The languages of these six flags are: L(F(a)) = a+;
L(F(b)) = {b}; L(F(ab)) = a+b+ ∨ ba+b+; L(F(bb)) = bb+; L(F(ba)) =
ba+; L(F(aba)) = (a ∨ b)*(aba ∨ bba)(a ∨ b)*. We count the primitive words
in each flag language: L(F(a)) contains one primitive word, namely a; L(F(b))
contains one, namely b; L(F(ab)) contains an infinite number of primitive
words; L(F(bb)) contains no primitive words; L(F(ba)) contains infinitely
many primitive words and so does L(F(aba)). The spectra of these flag
languages of primitive words are: Sp(a) = N; Sp(b) = N; Sp(ab) = {1};
Sp(ba) = {1}; and Sp(aba) = ∅. The flag languages containing a and
b, respectively, have the same spectrum N. Thus the union of these two flag
languages, which is a+ ∨ b+, constitutes a spectral class. The two flag lan-
guages, containing ab and ba, respectively, have the same spectrum {1}.
Thus the union of these two flag languages, which is a+b+ ∨ ba+b* ∨ ba+ =
a+b+ ∨ ba+b*, constitutes a second spectral class. Finally, the flag language
containing aba, namely (a ∨ b)*(aba ∨ bba)(a ∨ b)*, constitutes the third
spectral class of L. The first spectral class contains exactly two primitive words,
namely, a and b. This gives the parameter: (N, 2). The second spectral class
contains infinitely many primitive words. This gives the parameter: ({1}, ∞).
The third spectral class contains infinitely many primitive words which gives
the parameter: (∅, ∞).
Using the sketch parameters from the example above we provide a sketch of
L: Let B : Q → Z be any bijection for which: B(a) = 1; B(b) = 2; B estab-
lishes a one-to-one correspondence between the set of primitive words in the
second (infinite) spectral class above and the set {z ∈ Z : z ≤ 0}; and B estab-
lishes a one-to-one correspondence between the set of primitive words in the
third (infinite) spectral class above and the set {z ∈ Z : z ≥ 3}. In this sketch
of L, there is a vertical black stripe two units wide above a and b (i.e., x = 1
and x = 2). The remainder of the right quadrant is white. The left quadrant is
white except for one horizontal black stripe at the level n = 1. Although this
language is not bounded above, it is almost uniformly bounded above. It is
not almost cylindrical, but it is uniformly eventual. From this sketch of L all
further sketches of L can be obtained by permuting the columns of the given
sketch.
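The stripes just described can be rendered directly. Below is a hypothetical rendering of a small window of this sketch (columns x = −3..5 under the bijection B chosen above, rows n = 4 down to 1); the use of '#' and '.' for black and white cells, and the window size, are assumptions of this illustration.

```python
# Black cells at column x, level n, following the spectral classes above:
# x in {1, 2}: first class, Sp = N; x <= 0: second class, Sp = {1};
# x >= 3: third class, Sp = empty.
def black(x, n):
    if x in (1, 2):
        return True
    if x <= 0:
        return n == 1
    return False

rows = [''.join('#' if black(x, n) else '.' for x in range(-3, 6))
        for n in range(4, 0, -1)]
print('\n'.join(rows))
# ....##...
# ....##...
# ....##...
# ######...
```

The two-unit vertical stripe over x = 1, 2 and the horizontal stripe at n = 1 in the left quadrant are visible in the printed grid.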

Corollary 6.1 Sketch equivalence is decidable for each pair of regular lan-
guages. Each of the ten language-theoretic properties defined in Section 6.5 is
decidable for a regular language.

Procedures These decisions can be made after computing the sketch param-
eters of the languages in question. Two languages are sketch equivalent if and
only if they have the same set of sketch parameters. The ten decisions con-
cerning a regular language are easily made by an examination of the sketch
parameters of the language.
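As a small illustration of the first decision: once the sketch parameters are computed, sketch equivalence is a comparison of finite sets. The encoding below (the token 'N' for the spectrum N, frozensets for finite spectra, and float('inf') for an infinite count of primitive words) is an assumption of this sketch, not notation from the text.

```python
INF = float('inf')

# Sketch parameters of the example language L = (a v b)a*b* above.
params_L = {('N', 2), (frozenset({1}), INF), (frozenset(), INF)}
# The same parameters, as computed for some sketch-equivalent language M.
params_M = {(frozenset(), INF), (frozenset({1}), INF), ('N', 2)}

def sketch_equivalent(p, q):
    """Decide sketch equivalence by comparing the two parameter sets."""
    return p == q

print(sketch_equivalent(params_L, params_M))  # True
```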
Which of the two partitions P(L) and P′(L) induced by a language L in the
free semigroup Σ+ is more fundamental may not be clear at this time. In this
chapter the detailed work has been done at the flag level, P′(L). A previous
exposition [23] applied the algorithms given by M. Ito, H. J. Shyr, and S. S. Yu
in their paper [25] to construct the sketch parameters of the regular languages.
See also [13] for new elegant short proofs providing relevant tools.


Open Ended Exercise Let Σ be an alphabet and let K be an arbitrary language
contained in Σ+. If the sketch parameters of K are given, to what extent can
they be used to decide whether there is a regular language L that has these
sketch parameters? Special cases in which K is required to satisfy one or more
of the ten language-theoretic properties defined in Section 6.5 might be treated.
Open Ended Exercise Can additional classes of languages be found that
allow their sketches to be determined? Note that for each context-free language
L, w* ∩ L is regular and recall Exercise 2 of Section 6.3.
Open Ended Project The production of software for displaying sketches of
languages is encouraged.
An Aside to Readers Interested in Art The inspiration for the vision-based
approach to languages came in part from admiration for the late paintings of Piet
Mondrian and certain paintings by Barnett Newman. Note that one can sketch
two or more languages on the same half plane and use distinct color pairs for
distinct languages.
7
From biopolymers to formal language theory




7.1 Introduction
Living systems on our planet rely on the construction of long molecules by
linking relatively small units into sequences where each pair of adjoining units
is connected in a uniform manner. The units of polypeptides (proteins) are a
set of twenty amino acids. These units are connected by the carboxyl group
(COOH) of one unit being joined through the amino group (NH2 ) of the next
unit, with a water molecule being deleted in the process. The units of RNA are
a set of four ribonucleotides. These units are connected by the phosphate group
(PO4 attached at the 5′ carbon) of one unit being joined through replacement of
the hydroxyl group (OH attached at the 3′ carbon) of the next unit, with a water
molecule being deleted in the process. The units of single stranded DNA are a
set of four deoxyribonucleotides with the joining process as in the case of RNA.
Molecules lie in three-dimensional space, whereas words lie on a line. One
may adopt the convention of listing the amino acids of a protein on a line with
the free amino group on the left and the free carboxyl group on the right. For
both single stranded RNA and DNA molecules one may adopt the convention
of listing their units on a line with the phosphate at the left and the free hydroxyl
group at the right. These conventions allow us to model (without ambiguity)
these biopolymers as words over finite alphabets: a twenty letter alphabet of
symbols that denote the twenty amino acids and two four letter alphabets of
symbols denoting the four units for RNA and DNA, respectively.
Within a decade of the announcement in 1953 of the structure of DNA by
Watson and Crick, mathematicians and scientists were suggesting that bridges
be found between the study of the fundamental polymers of life and the mathe-
matical theory of words over abstract alphabets. The biopolymers were modeled
by words in a free monoid with word concatenation modeling the chemical end-
to-end joining of biopolymers through deletion of water. To obtain nontrivial
results in this rarefied context some additional source of structure seems to
be required. The Shannon information content of biopolymers has long been
studied. The transcription of DNA into RNA and the translation of RNA into
protein are easily viewed as actions of finite transducers. This chapter treats
additions to formal language theory that have their source in the conceptual
modeling of the actions of enzymatic processes on double stranded DNA. The
modeling process has motivated new concepts, constructions, and results in the
theory of formal languages and automata. The focus of this chapter is on these
new concepts, rather than the associated biomolecular science. Discussion of
the science that provoked these developments in formal language theory has
been restricted to this section and Section 7.3, with Section 7.3 optional reading

<<

. 8
( 9)



>>