. 5
( 10)


Get any book for free on: www.Abika.com

There is a formal problem with this idea. From a mathematical-logical point of view no
mathematical function can belong to its own domain or range. However, the functions that
describe component-systems try to do exactly this, if we take them literally. (p.212)

But in fact, two hundred pages later, Kampis admits that this complaint is not strictly accurate.
He refers to Lofgren's (1968) demonstration that the existence of functions belonging to their
own domain or range is independent of the ordinary axioms of set theory.

As it happens, Lofgren is not the only researcher to point out the existence of set theories in
which functions can belong to their own domain or range. From my point of view, Paul Aczel's
AFA axiom provides a much more elegant approach to constructing such unusual functions. So,
let us digress for a few paragraphs to describe the AFA axiom.

7.2.1. Hypersets

In mathematics, one defines complex concepts in terms of simpler concepts. But this process
must bottom out somewhere -- there must be something that is sosimple that there is nothing
simpler in terms of which to define it. In modern mathematics, this elementary concept is usually
taken to be the "set." The term "set" may be defined intuitively, as a collection of entities -- but
of course, this is circular, since what is a "collection" if not a "set"?

Mathematicians take this intuitive definition of a set, and then postulate certain simple rules
for dealing with sets. All the complex constructions of modern mathematics can be expressed in
terms of sets, and a great many of the theorems of modern mathematics can be proved by using
the rules of set theory. However, in the 1930's Kurt Godel showed that, given any particular list
of rules for manipulating sets, there are some mathematical theorems that can be expressed in the
language of sets, but cannot obtained by using the rules on the list. Essentially this is because
each list of rules has a certain finite amount of "algorithmic complexity", and cannot be used to
prove theorems that possess an algorithmic complexity in excess of this amount.

At first, mathematicians were very loose about what qualified as a set. They dealt with finite
sets like {1,2,3}, infinite sets like (1,2,3,...}, the set of all fractions, or the set of all numbers on
the number line; and also with abstract sets far more esoteric than these. But since the concept of
set never caused them any trouble, they had no motivation to fiddle with it.

But then, around the turn of the century, Bertrand Russell noted a problem. He said, consider
the set containing all sets that do not contain themselves. And he asked: does this set in fact
contain itself? The trouble is, what if it does contain itself? Then it is not a set that does not
contain itself -- so it cannot be an element of itself, it cannot contain itself. But, on the other
hand, if it does not contain itself then it must be an element of itself, since it is the set of all sets
that do not contain themselves. A serious problem!

Incidentally, for years this "Russell Paradox" has been formulated as follows: "There is a town
in which the barber shaves all men who do not shave themselves. Who shaves the barber?"
However, someone has wittily pointed out that this version is less potent than the original. The
solution is: the barber is a woman!

Get any book for free on: www.Abika.com

In order to avoid Russell's Paradox, a rule was added onto the original axioms of set theory: no
set can contain itself as an element. More generally, there can be no "descending chain of
membership" -- no set can Acan contain as an element a set which contains A as an element, and
so on.

Many mathematicians were uncomfortable with this rule, since it was not "natural", it was
simply appended onto the list of rules in order to avoid contradiction. And this discomfort
became especially acute after Godel came out with his Incompleteness Theorem. For Russell's
Paradox is essentially a variation on the old Paradox of Epiminides the Cretan: "This sentence is
false." But Godel's Theorem is an even cleverer variation on this same ancient paradox. Godel
showed that, by implementing a clever scheme of coding, one can use any mathematical system
to form the proposition X = "This proposition cannot be proved true or false within this
mathematical system." If X can be proved true, then it must be false, in which case it is true, so it
cannot be proved true. But if X can be proved false, then it must be false, so it has been proved
true, and has therefore not been proved false.

Godel, with his ingenious Incompleteness proof, showed that self-reference cannot be banned
from mathematics anyway, no matter how hard you try. This made Russell's elaborate theory of
Types seem even more excessive than it had before. But still, since mathematicians never
seemed to have any use for sets that contained themselves as elements, they simply accepted the
axiom and went on doing mathematics.

However, while working with various models of complex systems such as ecosystems and
brains, I found that I did have a need for sets that contained themselves as elements. I spent a
long time trying to concoct ways of avoiding this problem. But then, while sightseeing in
Cambridge and browsing through the MIT Bookstore, I came across a little book by Paul Aczel,
entitled Non-Well-Founded Sets. This book describes a research programme in mathematical
logic, active since the late 1960's, aimed at constructing a consistent set theory involving sets that
can contain themselves as elements.

The most easily applicable result of this intriguing research programme is the concept of a
hyperset. A labeled graph is defined as any collection of dots with a symbol drawn next to each
dot, and arrows drawn between the dots. Aczel's "AFA Axiom" implies that every finite graph
corresponds to some set. For instance the graph


A Objectivism


corresponds to the set A = {objectivism, subjectivism, mysticism}. And the graph


Get any book for free on: www.Abika.com

A Objectivism




corresponds to the set A = {{objectivism, subjectivism, mysticism}, botulism, aneurism}.

But what of the graph


This graph corresponds to the set A whose only element is A -- the set A = {A}. This is a "non-
well-founded" set.

And, similarly, the graph

A botulism



corresponds to the set A = { A, botulism, {objectivism, subjectivism, A} }.

To the mathematically indoctrinated mind, all this is incredibly liberating! Just as the
paradoxes of quantum physics free the mind from objectivism, so hyperset theory frees the mind
from the stifling preconception that, if A contains B, B cannot in turncontain A. Common sense
tells us that, if mind is a part of physical reality, then physical reality cannot possibly be a part of
mind. And common sense tells us that, if you are a part of my subjective reality, then I cannot
possibly be a part of your subjective reality. But hyperset theory tells us that in this case
common sense is wrong.

According to Godel's Theorem, once can never mathematically prove that a complicated
mathematical theory is consistent, devoid of self-contradictions. But Aczel has shown that, if
there are contradictions in hyperset theory, then there are also contradictions in plain ordinary
mathematics, the kind that every scientist uses to make calculations. This is as good a
consistency result as one could hope for. One may confidently say: there are mathematical
objects that contain one another as elements.

Get any book for free on: www.Abika.com

In the following pages, I will not need to use any of the technical mathematics of hyperset
theory. However, I will find it convenient to talk about sets that contain one another in a
"circular" way. Hyperset theory ensures that this is okay, that I am not contradicting myself any
more than a physicist is when he deals with algebraic or differential equations.

This section began, if you recall, with Kampis's observation that "from a mathematical-logical
point of view no mathematical function can belong to its own domain or range. However, the
functions that describe component-systems try to do exactly this, if one takes them literally." It is
easy to see that with hypersets, mathematical logic has transcended the limitation to which
Kampis is referring. One particular type of hyperset is the hyperfunction -- the function which is
contained in its domain or range.

A hyperfunction maps hyperfunctions and/or other entities into hyperfunctions and/or other
entities. Because it is a "function," it is not allowed to map any one thing into two different
things. To deal with the more general case, I will introduce the term hyperrelation. A
hyperrelation maps hyperrelations and/or other entities into hyperrelations and/or other entities;
and it may map one thing into as many other things as it likes.

These odd constructions, hyperrelations, are the first step on the path toward a cognitive
equation. For they give us a straightforward way to talk about components that truly transform
one another. A More Formal Treatment (*)

Recall that, in order to avoid Russell's Paradox, a rule called the Axiom of Foundation was
added onto the original axioms of set theory: no set can contain itself as an element. More
generally, one is not allowed to have an infinite descending sequence

... an+1 an ... a1 b

A set b which contains no such sequence is well-founded. All the traditional sets of mathematics
-- the sets involved in geometry, calculus, topology, etc. -- are well-founded. But, for example, S
= {S} is not well-founded, because it leads to the infinite descending sequence

... S S ... S S

And { 1, { 1, { 1, ... is not well-founded, even though it has the natural "solution"

x = {1,x}

Many mathematicians were uncomfortable with this rule, since it was not "natural", it was
simply appended onto the list of rules in order to avoid contradiction. But since they never had
any use for sets that contained themselves as elements, they simply accepted the axiom and went
on doing mathematics.

Get any book for free on: www.Abika.com

Paul Aczel (1988) was one of the few who decided to do something about his discomfort. He
constructed the "non-well-founded sets" which, following Jon Barwise and Larry Moss (1991), I
have called hypersets. According to Aczel's approach, the path to hypersets begins with graphs.
A digraph (G,E) consists of a set G of entities called "nodes," and a set E of ordered pairs of
nodes, these pairs being called edges. The most common examples of graphs are finite graphs, as
in Figure 1; however, the concept of an infinite graph presents no difficulties. If (n,m) is an
element of E, I will write n-->m, and call m the child of n, and n the parent of m. Fix a set A of
tags. Then a tagged digraph (G,E,t) is a digraph together with a function t that assigns a tag
drawn from A to each childless node of G.

Next, define an accessible pointed graph (apg) (G,E,t,p) to consist of a tagged digraph
together with a distinguished node p which has the property that every node can be reached by
some finite path from p. And define a decoration of an apg as a set-valued function d with
domain G, satisfying

d(n) = t(n)

if n is childless and

d(n) = {d(m):n-->m}

otherwise. That is, a decoration assigns to each childless node its tag, and to each parent node n
those nodes m which are its children.

Finally, let us say that an apg pictures a set b if there is a decoration d of the graph so that
d(p)=b; that is, so that b is the set which decorates the distinguished node. This permits us to
state Aczel's Anti-Foundation Axiom (AFA), which characterizes hypersets:

Every apg pictures a unique set.

According to this definition, all the sets of standard set theory are still sets. But there are other
sets too. Anything which is a set according to this definition, but not the classical definition is a
hyperset. For example, consider again the following graph:


There is only one node, so it must be the distinguished node. What is the unique set pictured by
this apg? It must be the set A = {A}! However, according to the ordinary axioms of set theory,
no set can contain itself.

Hypersets can contain themselves. One might at first think that this would lead to
contradictions, but Aczel has shown that if ordinary set theory is consistent, so is set theory
augmented by AFA. In addition, he has proved a very useful result called the Solution Lemma.
Roughly speaking, the Solution Lemma states that every system of equations in indeterminates
x,y,z,..., say

Get any book for free on: www.Abika.com




has a unique hyperset solution.

For a deep mathematical treatment of hypersets, the reader is referred to Aczel's (1988)
original monograph on the subject. However, the clearest discussion of the fundamentals of
hyperset theory which I have found is ina delightful little book by Jon Barwise and John
Etchemendy entitled The Liar (1988). This book contains a more rigorous statement of the
Solution Lemma.

As an example, let us construct a function which maps itself into itself -- a function so that f =
f(f). If f takes no other arguments besides itself, then by the standard definition f = { (f,f) } = { {
f, {f} } }. f is then the solution of the system of equations

w = {f}

x = {f,w}

f = {x}

Graphically, one has

w f


More generally, if f=f(y) and if is only defined on y, f is the solution of




And it is clear that, in a similar manner, one may cast any system of expressions of the form

f1(f1,...,fn,x1,x2,...) = fi(1)

... (*)

fn(f1,...,fn,x1,x2,...) = fi(n)

Get any book for free on: www.Abika.com

in the form required by the Solution Lemma and thus obtain a hyperset solution.

Getting back to Self-Modifying Systems, I suggest that component-systems should be
conceptualized in terms of systems of hyperfunctional equations of the form (*) given above;
and, more generally, hyperrelational equations defined in a similar way. Fuzzy Hypersets (*)

Although fuzzy sets are now commonplace in artificial intelligence, so far as I know fuzzy
hypersets have never before been discussed. Fortunately, therewould appear to be no particular
problems involved with this useful idea: the basic mathematics of fuzzy hypersets, at least as far
as I have worked it out, is completely straightforward.

The simplest example of a fuzzy hyperset is the set x defined by:

dx(x) = c,

dx(y)=0 for y not equal to x.

Here, if c=0, one has an ordinary well-founded set, namely the empty set. If c=1, one has the set
x={x}. Otherwise, one has something inbetween the empty set and x={x}.

Each fuzzy hyperset is characterized by a fuzzy apg, which is exactly like an apg except that
each link of the graph has a certain number in [0,1] associated with it. The Fuzzy AFA then
states that each fuzzy apg corresponds to a unique fuzzy set. It is easy to see that the natural
analogue of the Solution Lemma holds for fuzzy hypersets. And, of course, the consistency of
fuzzy hypersets with the axioms of set theory (besides the axiom of reducibility) follows trivially
from the fact that each fuzzy hyperset is, in fact, a hyperset under AFA.

7.2.2. Self-Generating Systems

Kampis's examples of component-systems are both relevant and elegant: mobile interattracting
LEGO blocks, enzyme systems, self-modifying robots,.... However, for reasons given above, I
do not find his formal definition of component-system entirely satisfactory. Thus I think it is
worthwhile to define a closely related type of system called a "self-generating system."

A self-generating system, at each time, consists of a collection of components which are
modelable as "finitely given hyperrelations" -- meaning that they are defined by their actions on
a finite number of different possible components. Each component may be thought of as having a
certain degree of membership in the system, with the constraint that the total degrees of all the
components should be finite.

Self-generating dynamics is defined as a two-stage process. First, universal action: each
component acts on each other component with a certain probability, yielding different new
components with different probabilities. Then, transformation: these resultant components are

Get any book for free on: www.Abika.com

transformed in some way, yielding a new collection ofcomponents. The results of transformation
are then fed back into the first step, and used as fodder for universal action.

The transformation rule may be stochastic -- for instance, it may make errors. It may change f
into w by mistake 4% of the time. Or, on the other hand, it may simply make a random addition
or deletion to the definition of each component a certain percentage of the time. Given f which
does not act on g at all, it may randomly define the action of f on g, or it may define the action
of f on g by some complex formula combining deterministic and stochastic elements. By use of
randomness, the transformation rule could generate dynamics that are fundamentally
unpredictable, in the sense of being non-Turing-computable.

The connection between self-generating systems and component-systems is quite simple. Not
every component-system is a self-generating system; but I propose that every physically useful
component-system is actually a self-generating system. Note that, since the definition of self-
generating systems is phrased in terms of hyper systems , it is perfectly natural for the number of
components to shift in the course of evolution.

Now, the converse is not true: not every physically useful self-generating system is a
component-system. The reason is that the self-generating dynamical equation is capable of
describing totally deterministic processes. Component-systems can only be obtained if the
function R is allowed to contain random elements, although rather good simulations of
component-systems can be obtained using chaotic or "pseudorandom" deterministic functions.
Thus the class of component-systems and the class of self-generating systems possess a
nontrivial intersection. I propose that real complex systems lie in this intersection. A More Formal Treatment (*)

Define an hyperrelation to be finitely given if its associated apg is finite, and the labels of its
associated apg are all encodable as finite binary sequences. Line up the set of all finitely given
hyperrelations in some arbitrary order: f1, f2, f3,.... Given this ordering, an hypersystem may be
defined as an infinite vector C = (p1,...,pn,...), where the pi are nonnegative and the sum p1 + p2 +
... + pn + ... is finite. (In functional-analytic lingo, the set of hypersystems is therefore isomorphic
to the space l1+).

The entry pi is to be interpreted as the degree with which fi belongs to the hypersystem
represented by C. For instance, in the context of enzyme systems, pi would denote the
concentration of the enzyme fi in the solution in question. In the deterministic case, an
important but special situation, all the pi are assumed to be 1 or 0.

Also, I will use the notation xit to denote "inanimate objects": entities which can be acted on,
but cannot act. I will not refer to these very often in the following, but they may be useful in
certain applications.

To each hypersystem Systemt, associate a set of "action products"

Get any book for free on: www.Abika.com

A[Systemt] = {fit(fjt), fit(fjt,xkt)}

Here the indices i and j run over all hyperrelations that have a nonzero probability of
membership in Systemt.

Next, define a "filtering operation" Yt which determines, based on the degrees of the elements
in Systemt, the degrees of the elements in A[Systemt]. The only restriction is that, if either f and
g have degree zero in Systemt, then Yt cannot assign f(g) or f(g,x) a nonzero probability. In the
deterministic case, Yt is only a formality, for it may be assumed that all probabilities are either 1
or 0. But in the most extreme fuzzy case, it may come about that A is the formality, leaving Yt to
do most of the work.

Finally, one may collapse these two operations into one composite operation R[Systemt] =
Yt(A[Systemt]) -- the "Raw Potentiality" operation. Then, where T is some stochastically
computable function mapping hypersystems to hypersystems, one may define a self-generating
system as an iteration

Systemt+1 = T( R[Systemt] ), (**)

Here the hypersystem System1 is considered to be given a priori, yielding a dynamic iteration.

7.2.3. Hypersets and Physical Reality

One would like to think of component-systems and self-generating systems as models of real
physical complex systems. At first sight, however, there seems to be a serious obstacle in the
way of this interpretation. Hyperrelations are peculiar set-theoretic objects. Formally, in Aczel's
construction they are defined asequivalence classes of sets of ordinary sets (this construction is
somewhat analogous to the construction of real numbers as equivalence classes of Cauchy
sequences of rationals). This places them in a cardinality class far, far above the countable
computable sets, and also far, far above the Hilbert- space-defined sets which quantum
mechanics associates with physical reality.

However, interpreted properly, a finite system of finitely given hyperrelations does not
violate the stochastic Church-Turing thesis, since any such system of equations can be simulated
on a stochastic computer. A stochastic computer can never actually contain hyperrelations, but if
they are finitely given, it can simulate their behavior easily enough. After all, manipulations with
finitely given hyperrelations are merely manipulations with finite graphs!

Physically, what does this mean? While quantum physics does not permit the existence of
physical hypersets, it does permit physical events that are effectively modeled as finite labeled
graphs. Now, suppose that the interactions of some of these physical events can be modeled as
interactions between finite labeled graphs, and that these graph interactions are usefully
describable using self-generating systems or component-systems. Then these mathematical
systems are emergent patterns in physical reality. No contradiction. No problem. Physical
reality can simulate component-systems; or, to put it another way, the reality of component-
systems can be understood as a "virtual reality" running on the hardware of quantum reality.

Get any book for free on: www.Abika.com

In sum, I suggest that Kampis's picture of complex system behavior is fundamentally right.
Complex systems consist of components that act on one another to create new components. Thus
they effectively violate the hierarchy of logical types; they contain emergent patterns which are
usefully modelable in terms of stochastic systems of hyperrelations.

But, on the other hand, as already mentioned, Kampis's theory of creativity is flawed. The
creativity of a complex system is due both to the unfolding of the rules implicit in its
components, and to the mutation of these rules by random error. The opposition which Kampis
has set up, between computation and component-systems, is in my opinion a false one. The
difference between simple systems and complex systems is not that the former are computable,
but that the latter contain emergent structures which are modelable in terms of stochastic
hyperfunctional iterations (self-generating systems). The concept of self-generating systems
makes this point in a very clear way. A Binary Model of Hypersets (*)

To make this conclusion more concrete, let me construct a specific computational scenario
which gives rise to hypersets as a natural model. Suppose one has a finite collection of
computable relations f, g, h,..., each of which maps binary sequences into binary sequences.
Then one may represent each relation by a finite code sequence, e.g.

sf = 010100101111010010...01

sg = 010111101001001010...10


And one may define the action f(g) as the result of the following two-step process:

1) letting the program f act upon the sequence sg, producing a new sequence sfg.

2) Decoding sfg into a program, by selecting the first (in alphabetic order) from among all
programs h for which the Hamming distance d(sh,sfg) is at its absolute minimum. This "first
closest" program is taken as f(g).

The relations f, g and h are computable relations -- but there is aboslutely nothing wrong with
thinking of them as hyperrelations, acting directly on each other. The whole system may be
elegantly modeled as a system of hyperrelations, without ever referring to the underlying bit-
string manipulations. This requires no new information, only a shift in point of view. I will
refer to this system for deriving hyperrelations from computable relations as the basic
computational model.


Coauthored with Harold Bowman

Get any book for free on: www.Abika.com

Our treatment of self-generating systems has up to this point been purely formal. However,
one may also describe self-generating systems in much a less mathematical way, in the guise of
self-referential sentences (Hofstadter, 1985). For example, suppose that the complete formal
definitions of the hyperrelations f, g and h are given by:

f(f) = g, f(g) = h, f(h) = f, f(g,h) = g

g(g) = g, g(f) = h, g(f,h) = f

h(f,g) = h, h(g,h) = g, h(h) = f

This looks terrible. To make it a little prettier, let us rename "f," "g," and "h" as "Fanny,"
"Geronimo" and "Hattie." Then one may represent this same collection of definitions as follows:

This sentence, which is named Fanny, turns itself into Geronimo, it turns Geronimo into
Hattie, it turns Hattie into itself, and it turns the pair 'Geronimo and Hattie' into Geronimo.

This sentence, which is named Geronimo, turns itself into itself, it turns Fanny into Hattie, and
it turns the pair 'Fanny and Hattie' into Fanny.

This sentence, which is named Hattie, turns itself into Fanny, it turns the pair 'Geronimo and
Hattie' into Geronimo, and it turns the pair 'Fanny and Geronimo' into Hattie.

This says the same thing as the preceding group of mathematical definitions, but it is a little more
colorful. The best way to visualize the situation is to think of Fanny, Geronimo and Hattie as a
group of three over-active magicians. Each one has a spell to turn each one into someone. For
instance, Fanny has a spell to turn herself into Geronimo; she has a spell to keep Hattie the same
as she was, and the has a spell to turn the combined group Geronimo/Hattie into Geronimo

Recall that a self-generating system is a stochastically computable rule for evolving
populations of finitely given hyperrelations. This is a very general definition; it leaves a lot of
freedom. For starters, therefore, let us consider the simplest possible situation. Given a collection
of hyperrelations {f,g,h,...}, one can form a vast variety of "compounds" of the form f(g), f(g,h),
h(g), and so forth. One of the very simplest self-generating systems says: "given a collection of
hyperrelations, replace it with the collection of all compounds which one can form from it."

For instance, one may think about self-generating systems in the context of our earlier
Fanny/Hattie/Geronimo example. Each magician had spells for changing magicians (or group of
magicians) into othermagicians. So the simplest self-generating system involving these
magicians consists of

1) each magician applying all the spells she knows, thus creating a new collection of

2) the new collection of magicians then applying all the spells they know

Get any book for free on: www.Abika.com

3) etc.

It is easy to see that, in the case of our simple example, if one starts with all the magicians
present, then all the magicians are so proficient that the group of three magicians will persist
forever. If one starts with just Fanny and Geronimo, they will immediately produce Hattie, and
the same will be true. Similarly, if one starts with just Fanny and Hattie, they will immediately
create Geronimo. But on the other hand, if one starts with Geronimo and Hattie, the two of them
will never be able to produce Fanny. Or if one starts with just Fanny, she will immediately turn
herself into Geronimo, who will then perpetuate herself forever....

A group of somewhat less proficient magicians is provided by the following set of rules:

f(f) = g

f(g) = f

g(g) = g

g(f) = h

h(f) = g

If one takes

{f,g} time = 1

this rule produces the collection {f(f),f(g),g(f),g(g)} = {g,f,g,h} = {f,g,h}, or

{f,g,h} time = 2

And iterated once again, the rule produces {f(f),f(g),f(h), g(f),g(g),g(h),h(f)}, or

{f,g,h} time = 3

In this particular case, after two steps, our dynamical rule has reached a fixed point. No matter
how many times one keeps iterating, one will keep on obtaining {f,g,h}.

In "magician-language", what one has here is

This sentence, whose name is Fanny, turns itself into Geronimo and turns Geronimo into

This sentence, whose name is Geronimo, turns itself into itself and turns Fanny into Hattie

This sentence, whose name is Hattie, turns Fanny into Geronimo

Get any book for free on: www.Abika.com

Starting from Fanny and Geronimo, as above, one will immediately get all three magicians.

More generally, a self-generating system is simply a rule which determines a "range"
collection of hyperrelations from the compounds formed by another "domain" collection of
hyperrelations. In these simple examples, I have taken collection of compounds itself as the
range collection. But this is not the only way to do things. As an example, one could consider the
rule: "given a collection of hyperrelations, replace it with two hyperrelations randomly drawn
from the collection of all compounds which one can form from it." The reader may determine for
herself the possible evolutionary courses which our collection {f,g} may take under this rule.

7.3.1. Antimagicians

Both of the "magician" examples discussed above were of the simplest kind. All possible
compounds were generated and kept. However, this type of self-generating system does not
appear to be capable of generating particularly complex behaviors. To get the full range of
dynamical behaviors, one must provide some way for compounds to eliminate one another. For
instance, in addition to our three faithful magicians Fanny, Geronimo and Hattie, one may
introduce three antimagicians, called anti-Fanny, anti-Geronimo and anti-Hattie. And one may
modify the rules of our game accordingly. At each time step, the following three processes are

first, all magicians cast all their spells

second, all magicians whose anti-magicians have been created are eliminated

third, all anti-magicians are eliminated

For instance, suppose one has

This sentence, named Fanny, turns itself into Geronimo, turns Geronimo into Hattie, and turns
Hattie into anti-Fanny

This sentence, named Geronimo, turns itself into Fanny, turns the pair 'Fanny and Geronimo'
into Hattie, and turns Hattie into anti-Hattie

This sentence, named Hattie, turns itself into Hattie, turns Fanny into Hattie, and turns
Geronimo into Fanny

Without anti-magicians, Hattie would be self-perpetuating. But the anti-magicians change all
that. Suppose one starts with

Geronimo and Hattie time 1

Then this evolves into the interim population

Get any book for free on: www.Abika.com

Fanny, Hattie, anti-Hattie

The magician and its anti-magician self-destruct, leaving

Fanny time 2

Then Fanny creates Geronimo, who creates Fanny, who creates Geronimo, and so on ad

Geronimo time 3

Fanny time 4

Geronimo time 5

Fanny time 6

Geronimo time 7

Fanny time 8

... ...

On the other hand, suppose one starts with

Fanny, Geronimo time 1

Then one has

Geronimo, Fanny, Hattie time 2

and from this one gets the interim population

Fanny, anti-Fanny, Geronimo,

Hattie, anti-Hattie

resulting in

Geronimo time 3

and yielding the same attracting cycle as before:

Fanny time 4

Geronimo time 5

Get any book for free on: www.Abika.com

Fanny time 6

Geronimo time 5

Fanny time 6

... ...

This sort of behavior, with Hattie and Fanny appearing and then disappearing, is much much
easier to set up with anti-magicians than without then. I will show a little later exactly how much
more computational power is yielded by the introduction of anti-magicians.

To get even more interesting behavior, one must stochasticize the dynamics. For example, one
could replace the three processes of our "antimagician" iteration with the following three
processes, to be executed at each time step:

first, each spell of each magician is cast with a certain fixed

probability (call it p)

second, all magicians whose anti-magicians have been created are eliminated

third, all anti-magicians are eliminated

This creates an unpredictable iteration: if one runs it several times, one may obtain many
different results, because there is no telling which spells will be chosen. The reader is
encouraged to explore the consequences of "stochasticizing" our Fanny/Geronimo/ Hattie

7.3.2. Self-Generation and Computation

So self-generating systems can be considered as models of real systems. But what kind of
behavior can they model? In fact it is not too hard to prove that they can model any kind of
behavior at all. Harold Bowman and I have constructed a very simple argument which shows
that self-generating systems are capable of universal computation. This means that any possible
behavior can be mimicked by some self-generating system.

Specifically, as hinted above, it turns out that simple systems of the Fanny/Geronimo/Hattie
variety are not enough. One needs to introduce anti-magicians as well. But if one does this, then
one very easily obtains a recipe for constructing a self-generating system tosimulate any given
computer. The basic idea is that systems with anti-magicians give one the ability to express the
two fundamental operations of conjunction (AND) and negation (NOT). Since all computers
can be built of AND and NOT gates, it follows (with a little work) that this type of system is a
universal computer.

Get any book for free on: www.Abika.com

The AND gate is easy; it can be done without anti-magicians. Let's say one wants Fanny to
create Josie only if both Geronimo and Hattie are present. Then one needs merely say

This sentence, named Fanny, turns the pair 'Geronimo and Hattie' into Josie

By simply not specifying Fanny to turn Geronimo individually or Hattie individually into Josie,
one makes Fanny into an AND function.

Of course, in the context of the whole system, it is possible that Geronimo or someone else
will turn Geronimo or Hattie into Josie -- but if one wants Fanny to be a good AND, one must
design one's system to prevent this from happening at the same time that Fanny is operating as an

Incidentally, it is worth noting that

This sentence, named Fanny, turns Fanny into Fanny, turns the pair 'Geronimo and Hattie' into
Josie, and turns Hattie into Geronimo

serves the purpose of executing the AND operation as well. Extra spells are allowed, so long as
they do not interfere.

To get NOT, on the other hand, one must proceed as follows. Let's say one wants Fanny to
create Hattie only if Geronimo is not present. Then one needs to specify

This sentence, named Fanny, turns itself into Hattie, and turns Geronimo into anti- Hattie

If Geronimo is not present, then Fanny produces Hattie, so Hattie is introduced into the next
population (assuming no one else is out there producing anti-Hatties). But if Geronimo is
present, then Fanny still acts on itself to produce Hattie, but it also acts on Geronimo to produce
anti-Hattie. The two cancel out, and one is left with no Hattie (assuming no one else is out there
producing Hatties).

7.3.3. Imperfectly Mixed Computation

The biggest lesson of the computer revolution is that by piecing ANDs and NOTs together one
can do just about anything. The reasoning of the past few paragraphs, elaborated appropriately,
leads to the consequent conclusion that self-generating systems (in particular "antimagician"
systems") can do just about anything.

But this is just the tip of the iceberg. The next question is, what about stochastic
"antimagician" systems? What if, at each stage, only a certain percentage of possible
compounds are formed? This is the case, for example, in real chemical solutions: not every
conceivable compounds forms at every moment. In chemical parlance, deterministic self-
generating systems correspond to infinitely "well-mixed" solutions, whereas stochastic self-
generating systems correspond to the more realistic case.

Get any book for free on: www.Abika.com

It turns out that, even in the stochastic case, it is possible to construct "antimagician" systems
which carry out universal computation -- to within any specified level of accuracy short of
perfection. The trick is an appropriate use of redundancy. For instance, if instead of just doing
NOT with one hyperfunction, one does it simultaneously with a sufficiently huge number of
hyperrelations, one is bound to get it right an arbitrarily high percentage of the time.

What this result shows is that one can build a viable computer in which each connection
between components has only a certain probability of existing. One may build a computer out of
"components" that circulate around each other, sometimes combining with one another to
produce new components, sometimes not. Computation can self-organize from an imperfectly-
mixed-up substrate.

Applied to biological and psychological systems, this conclusion would seem to have
profound consequences. The dual network model views the mind as a collection of processes,
interacting with one another and constantly creating new processes. The ideas of this sections
suggest that these general "processes" may perhaps be fruitfully modeled as interlocking self-
referential statements -- as simple statements about how other processes, and they themselves,
are to be transformed. This is an intriguing insight, and an important step on the path to the
"cognitive equation" of Chapter Eight.


In Section 7.3 I gave a simple "reductionist" model of hyperset dynamics -- the "basic
computational model." In this section I will briefly digress to describe a more interesting
elaboration of the same fundamental concept. Instead of mapping functions into sequences in an
arbitrary way, I will demonstrate how one might elegantly systematize the coding and decoding
of functions and sequences.

7.4.1. Array Operations

Let us begin with the concept of a rational array. An n-dimensional rational array may be
defined inductively as a finite sequence of (n-1)-dimensional rational arrays, where a 1-
dimensional rational array is just a finite sequence of rational numbers. Our eight basic
operations will be operations on rational arrays.

The most relevant examples are one, two and three-dimensional arrays; however, it is quite
possible to envision psychological uses for arrays of higher dimension. For instance, spacetime is
four-dimensional, and a five-dimensional array could therefore represent a scalar field over
spacetime, an eight-dimensional array a vector field over spacetime, etc.

Each rational array A comes equipped with a natural coordinate system, so that each rational
stored in A has a unique coordinate vector (a1,...,an), ai a nonnegative rational. This coordinate
system imposes a natural alphabetic order on the elements of A, which one may extend to
subsets of A by defining subset B to come before subset C if B - C contains a point which comes
before any point in C - B in the alphabetic ordering.

Get any book for free on: www.Abika.com

Human sensory inputs can be expressed very naturally in terms of rational arrays. For
instance, light on the retina forms a two-dimensional array; and sound waves on the eardrum
form a one-dimensional array. Muscle movements can also be easily expressed in terms of
rational arrays: when one has different amounts of stimulus sent to different points, the different
points can be envisioned as elements of a three-dimensional rational array. And, to take a
biological example, the interactions between proteins can also be effectively expressed in this
way: the surface of a protein is just a rational array.

In general, any continuous field can be approximated to within arbitrary accuracy by an
rational array. For example, if one wants to approximate a field on the positive hyperoctant of Rn
with 5 digits of accuracy, one may divide the positive hyperoctant of Rn up into alatticework of
cubes of side 10-6, and construct an n-dimensional rational array containing one element for each

Of course, given sufficiently complex codings, one can dispense with the whole formalism of
rational arrays and consider only binary sequences. But here I am not thinking in terms of
abstract algorithmic information, I am rather thinking in terms of concrete information
processing systems, for which "sufficiently complex codings" can present formidable practical
obstacles. Pointwise Operations

The first four of our nine operations are addition, multiplication, negation, and
maximization, which are defined pointwise. More explicitly, let A and B be two rational arrays.
Then the sum of A and B is A+B, the product of A and B is written AB, the maximum of A and
B is written A^B, and the negation of A is written -A. If a has coordinates (a1,...,an) in A, and a'
has coordinates (a1,...,an) in B, then,

a+a' has coordinates (a1,...,an) in A + B,

aa' has coordinates (a1,...,an) AB,

max(a,a') has coordinates (a1,...,an) in A^B, and

-a has coordinates (a1,...,an) in -A.

if A and B are of different sizes, then (a1,...,an) is assumed to exist in A + B, AB and A^B only if
it exists in both A and B.

It is worth noting that this collection of operations is redundant in two ways. One the one
hand, by combining negation and maximization one can generate any Boolean function, and thus
any computable function, including addition and multiplication. Secondly, by combining
multiplication and addition, one can generate any polynomial, and hence approximate any
continuous function, including the maximum function, to within arbitrary accuracy. However,
our goal here is not to give a minimal set of operations; it is to give an exhaustive set of basic

Get any book for free on: www.Abika.com
CHAOTIC LOGIC Combinatory Operations

Addition, multiplication, negation and maximization are all pointwise operations. Now I will
introduce two operations that act on whole arrays rather than on an entry-by-entry basis.

First, the cut-and-paste operator is a ternary operation which may be written C(A,B,S), where
A and B are general rational arrays and S is a sequence of nonnegative integers. The expression
C(A,B,S) is to be read: paste B into A, placing the first entry in B into position S in A.

More explicitly, what this means is as follows. Suppose S = (s1,...,sk). Then if a has coordinate
(s1 - r1,...,sk - rk) in A, where the rj are all nonnegative integers, then a has the same coordinate in
C(A,B,S). But if b has coordinate (s1 + r1 - 1,...,sk + rk -1) in B, where the rj are all nonnegative
integers, then b has the same coordinate in C(A,B,S).

For instance, C( (1,2,3,4,5,6,7), (9,9,9,9,9,9,9,9), 5) =

(1,2,3,4,9,9,9,9). The number 5 indicates that the elements 5-8 of the sequence B =
(9,9,9,9,9,9,9,9) are pasted onto the elements 1-4 of the sequence A = (1,2,3,4,5,6,7).

Cut-and-paste also permits us to build higher-dimensional arrays out of lower-dimensional
ones. For example, one has

C( (1,2), (3,4), (2,1) ) = 1 2


The coordinate (2,1) gives the point at which the array (3,4) is "pasted" onto the array (1,2).

In general, as the name suggests, cut-and-paste permits us to form new arrays by combining
parts of different old arrays.

Next, the reduce operation allows one to take part of an array and consider it as an array in
itself. This is of obvious utility as an adjunct to the cut-and-paste operation: it allows one to paste
in parts of arrays rather than just whole arrays. The simplest way to define the reduce operation
is as R(A,S,T), where A is an arbitrary rational array and S and R are lists of nonnegative
integers, with the property that the i'th entry in S never exceeds the i'th entry in S. Write S =
(s1,...,sn), R = (t1,...,tn). Then R(A,S,T) is the array composed of all elements of A whose
coordinates lie "between" the arrays S and T. Explicitly, if a has coordinate (a1,...,an) in
R(A,S,T), this means that a has coordinate (a1+s1-1,...,an+sn-1) in A, and ai + si < ti.

Finally, substitution is a ternary operation which may be denoted by S(A,B,C), to be read:
substitute A for B, everywhere B appears in C. The meaning of this isobvious in simple cases,
for instance S(6,(3,4),(1,2,3,4,3,4,5,3,4)) =

Get any book for free on: www.Abika.com


In general, there is an ambiguity here: what if two appearances of B overlap in C? However,
this can be resolved by the following rule: if there are two appearances of B in C, and it is not
possible to substitute A for both of them, substitute A for that appearance of B which occurs first
in C.

Note that substitution is a special instance of cut-and-paste. However, it is a very important
instance. A large percentage of the patterns that we recognize are repetitions . For instance, the
whole Behaviorist school of psychology is based on the recognition of repeated stimulus-
response associations! As we shall see, repetitions can be easily expressed in terms of

A special case of substitution is "change of notation." For instance, S(2,3,A) is the operation
of replacing every 3 in A with a 2. The inclusion of substitution as a basic operation guarantees
that no specific notation or "encoding" is essential to human thought. Random Generation

Our eighth operation, random generation, is the simplest of all. It may be defined as R(A),
where A is a one-dimensional integer array. Its function is to create a random array of
dimensions given by A, whose entries each have an equal probability of being either zero or one.
This operation could be simulated fairly well in terms of the other operations, by using standard
pseudo-randomness techniques; but for theoretical purposes I prefer to introduce true

The choice of a 50% chance of a 0 or 1 in each entry is purely a matter of convention. Using
the operation R(A) together with the previous seven operations, one may construct arrays so that
each entry a has a different probability pa, and one may choose the pa's to be arbitrarily close to
any number in [0,1]. Decoding

Finally, let us consider the operation of decoding. This is in a way the most fundamental
operation of all. Let us assign each one of our fundamental operations a code number, according
to the following arbitrary scheme:

addition = 1

multiplication = 2

negation = 3

maximization = 4

substitution = 5

Get any book for free on: www.Abika.com

cut-and-paste = 6

decoding = 7

random generation = 8

Next, let us arbitrarily assign the number 9 to stand for "open parenthese," and the number 10 to
stand for "close parenthese." Finally, the integers greater than 10 will be understood to denote
variables. Since the substitution operator is fundamental, the specifics of the encoding do not
matter; they can always be renamed.

Given this encoding, many integers sequences may be understood as sequences of operations .
This prepares us to define the decoding operator. Where A is a sequence (a one-dimensional
integer array), and B1,...,Bk are arbitrary rational arrays, D[A,B1,...,Bk] is the array obtained by
applying the sequence of operations encoded in A to the arrays B1,...,Bk, where the variable 'j' in
A is taken to refer to the array Bj+10.

For sake of generality, two notational conventions will be required. Not every sequence A
yields a well-defined sequence of operations; but if the operation D is given a sequence A which
does not yield a well-defined sequence of operations, it will be understood to give output
consisting of the 0-dimensional array "0". Also, if A contains m>k variable names, D[A,B1,...,Bk]
may be defined by D[A,B1,...,Bk,0,...,0], where each 0 represents an array of appropriate size and
dimension containing all zero entries.

Two simple examples are:

D[(1,9,9),(1,2,3)] = (1,2,3) + (1,2,3) = (2,4,6)


D[(1,9,2,10,9),(1,2,3),(4,2,1)] = (1,2,3) + (4,2,1)(1,2,3) = (5,6,6)

7.4.2. Array Component Systems

Now, using these nine operations, I will give an example of a new kind of self-generating
system -- a type of system called an array component system, or ACS. Let us begin with a list
of arrays Vi of the form

Vi = (Ai, Bi1,...,Bim(i,t)),

t = 0,1,2,...; i = 1,2,...,N(t)

where the Ai are code sequences, and the Bij are rational arrays. Each such array Vi may be
associated with a hyperfunction

H(Vi) = fi,

Get any book for free on: www.Abika.com

defined by the equation

fi(fjs) = H( D[Ai,Bi1,...,Bjm(j,t),Ajs, Bj1s,...,Bjm(j,t)s] )

The fi are the components.

In other words, each component is associated with a code sequence, and a bunch of arrays.
Applying one component to another means applying the code sequence of the first component to
the list of arrays consisting of the first component's arrays, the second component's code
sequence, and the second component's arrays. According to the conventions delineated above, it
is possible for each component to act on other components which contain different numbers of
arrays. "Missing" arrays are simply treated as zero arrays, and the control sequences Ai may
potentially contain conditional expressions indicating how to deal with zero arrays.

ACS's illustrate in a very concrete way how hyperrelations may emerge as natural models of
systems which are, in themselves, quite well-founded and computable. There is nothing in any
way mysterious about the Vi, nor about the idea that the various Vi can act on one another. But in
order to express this idea mathematically, one cannot use ordinary functions; one needs to use
hyperrelations. To put it another way: in order to express patterns relating to the interaction of
Vi, the only efficient course is to use hyperrelations.

7.4.3. Immune Systems as ACS's

Let us briefly consider a simple example, which will be taken up more thoroughly in Chapter
Ten: immunodynamics.

The immune system is complicated as well as complex, and it contains many different kinds of
cells. But the simplest mathematical models deal only with B-cells, and that is what I will do
here. Let us begin with the approach of de Boer et al (1990), in which each antibody type in the
immune system is associated with an integer sequence of length N. To be realistic, of
course,antibodies should be modeled as three-dimensional rational arrays, since they are three-
dimensional objects; but for the points I am making here, it is immaterial whether antibodies are
associated with 1-D or 3-D arrays.

In the 1-D model, one may think of a B-cell as a pair Vi = (A,Bi), where Bi is an integer
sequence, and A is an integer code sequence. The code sequence specifies what happens when
one forms fi(fj), or in other words when one forms


Specifically, what happens most of the time when fi(fj) is formed is nothing. But if the
conditions are right, the effects can be drastic. Define the raw match between two B-cells Vi and
Vj as the maximum number of consecutive bits in which the corresponding sequences Bi and Bj
are different. And define the match between two sequences as max{0, raw match(Bi,Bj) - T},
where T is some given threshold. In terms of component-systems, then, one may think of the

Get any book for free on: www.Abika.com

dynamics of the immune system as specifying how a B-cell Vi acts on those B-cells Vj for which
match(Bi,Bj) is large, thus causing the creation of new antibodies.

There is a great deal of biological subtlety involved here. In the crudest formal model,
however, what happens is as follows: if fi(fj) is formed and match(Bi,Bj) is large, then with a
certain probability, the cell Vj is killed by the cell Vi (in fact, this killing takes place indirectly,
via antibodies; but I do not need to consider these details here). But when the proportion of B-
cells with shape Bj that are killed falls within a certain critical range, then cells of this shape are
stimulated to reproduce. New B-cells are created.

Some of these new cells are identical to Vj, and some are new types Vk, which have have
shape sequences similar, but not identical to Bj. This is somatic mutation. It is the creation of
new B-cells by certain old B-cells acting on other old B-cells. There is randomness in the process
because there is no deterministic way of telling exactly which new types of B-cells will be

This B-cells-only model is an extreme oversimplication. But the more accurate models are
similar in spirit. Cells in the immune system act on one another, thus stimulating one another to
produce new cells. Sometimes these new cells are copies of old cells, but sometimes they are
structurally novel. Thisis a simple example of a component-system; and I have indicated in a
rough way how it can be modeled using systems of stochastically computable hypersets.

Chapter Eight


To anyone trained in physical science, the overall impression made by psychology and
neuroscience is one of incredible messiness. So many different chemical compounds, so many
different neural subsystems, so many different psychic dysfunctions, so many different
components of intelligence, perception, control.... And no overarching conceptual framework in
which all aspects come together to form a unified whole. No underlying equation except those of
physics and chemistry, which refer to a level incomprehensibly lower than that of thoughts,
emotions and beliefs. No cognitive law of motion.

Of course, there is no a priori reason to expect such a thing as a "cognitive law of motion" to
be possible at all. It is amazing that one can find far-reaching yet precise generalizations such as
Newton's laws in any field of study. To expect to find such conceptual jewels in every single
discipline may be asking more than the world has to offer.

But on the other hand, consider: Newton's laws would have been impossible without calculus,
general relativity would have been impossible without differential geometry, and quantum
physics would have been impossible without functional analysis. It is quite conceivable that,
once we have developed the appropriate mathematical concepts, the goal of a "cognitive law of
motion" will cease to appear so unrealistic.

Get any book for free on: www.Abika.com

In fact, my contention is that this time has already come . As of 1993, I suggest, we have
collectively developed precisely the mathematical and conceptual tools required to piece together
the rudiments of a "fundamental equation of mind." The most important of these tools, I suggest,
are four in number:

1) component systems

2) pattern theory

3) algorithmic information

4) strange attractors

In this chapter I show how these ideas may be used to formulate a new type of equation, which I
call a "self-generating pattern dynamic." This is the type of equation, I suggest, which makes one
thought, one emotion, drift into the next. It is the general form which a cognitive law of motion
must take.

In The Evolving Mind the term "self-structuring system" is used to describe a system which,
more than just organizing itself, structures and patterns itself; a system which studies the
patterns in its past, thus determining the patterns in its future. Here I will delineate a class of
systems which is a subset of the self-structuring systems -- namely, the class of systems that
evolve by self-generating pattern dynamics. My hypothesis is that minds, as well as being self-
structuring, also fall within this narrower category.

This is at the same time a brand new approach to mind, and a re-interpretation of the dual
network model given in Chapter Three. The cognitive equation presents a dynamical view of
mind, whereas the dual network presents a static view; but the two are ultimately getting at the
same thing. In the dual network perspective, one begins with a structure, asks what the dynamics
must be to retain that structure, and obtains the answer: something like the cognitive equation. In
the cognitive equation perspective, on the other hand, one begins with a dynamical iteration, asks
what sorts of structures will tend to persist under this iteration, and obtains the answer:
something like the dual network. Dynamics lead to statics, statics leads to dynamics, and the
simultaneous analysis of the two provides the beginning of an understanding of that mysterious
process called mind.


The systems theory of Chapter Seven gives us a new way of looking at the dual network. The
mind, filtered through the component-systems/self-generating-systems view, emerges as a
structured network of components.

Note that this conclusion refers primarily to the mind -- the patterns in the brain -- and not to
the brain itself. One could model the brain as a component-system, insofar as each neuron is not
a fixed "component" but aspace of potential components -- one component for each condition of
its synaptic potentials. When neuron A feeds its output to neuron B, thus altering its synaptic

Get any book for free on: www.Abika.com

potential, it is in effect "creating" a new element of the space corresponding to neuron B. This
may be a fruitful way to think about the brain. However, it is much more direct and elegant to
view the collection patterns in the brain as a self-generating component-system -- recalling that
a pattern is first of all a process. In the context of general systems theory, the pattern-theoretic
model of mind is not merely useful but conceptually essential.

The mind is vastly different from a soup of molecules -- unlike the immune system, it is not
even in a rough approximation well-mixed. (Putting brain tissue in a blender to make
"synaptosome soup" is a nifty method for determining the levels of different neurotransmitters in
the brain, but it has a definite negative effect on brain function.) But the relatively rigid structure
of the brain does not prevent it from being a genuine self-generating system, and a genuine

There is an overall global structure of mind; and this structure self-organizes itself by a
dynamic of typeless interaction, in which some mental processes act on others to produce yet
others, without respect for any kind of "function/argument" distinction. One can model this sort
of activity in terms of stochastic computation alone, without mentioning hypersets or
component-systems -- this is the contemporary trend, which I have followed in my previous
research. However, in many situations this point of view becomes awkward, and the only way to
express the reality clearly is to adopt a typeless formalism such as the one developed in Sections
8.2 and 8.4.

Let us take a simple heuristic example -- purely for expository purposes, without any pretense
of detailed biological realism. Let us think, in an abstract way, about the relation between a
mental process that recognizes simple patterns (say lines), and a mental process that recognizes
patterns among these simple patterns (say shapes). These shape recognizers may be understood
as subservient to yet higher-level processes, say object recognizers. If the shape recognizer has
some idea what sort of shape to expect, then it must partially reprogram the line recognizer, to
tell it what sort of lines to look for. But if the line recognizer perpetually receives instructions to
look for lines which are not there, then it must partially reprogram the shape recognizer, to
cause it to give more appropriateinstructions. Assuming there is a certain amount of error innate
in this process, one has an obvious circularity. The collection of two processors may be naturally
modeled as a self-generating system.

It seems likely that the specific programs involved in these perceptual processes involve linear
array operations. But still, one does not yet have an array component system. To see where
component-systems come in, one needs to take a slightly more realistic view of the perceptual
process. One must consider that the mapping between line-recognizing processes and shape-
recognizing processes is many-to-many. Each shape process makes use of many line-recognizing
process, and the typical line-recognizing process is connected to a few different shape-
recognizing processes. A shape-recognizing process is involved in creating new line-
recognizing processes; and a group of line-recognizing processes, by continually registering
complaints, can cause the object-recognizing parents of the shape-recognizing processes to create
new shape-recognizing processes.

Get any book for free on: www.Abika.com

What this means is that the reprogramming of processes by one another can be the causative
agent behind the creation of new processes. So the collection of processes, as a whole, is not
only a self-generating system but a component-system as well. By acting on one another, the
mental processes cause new mental processes to be created. And, due to the stochastic influence
of errors as well as to the inherent chaos of complex dynamics, this process of creation is
unpredictable. Certain processes are more likely to arise than others, but almost anything is
possible, within the parameters imposed by the remainder of the network of processes that is the

This example, as already emphasized, is merely a theoretical toy. The actual processes
underlying shape and line recognition are still a matter of debate. But the basic concept should be
clear. Whenever one has sophisticated multilevel control, combined with heterarchical
relationship, one has a situation in which self-referential models are appropriate. The whole
network of processes can be modeled otherwise, using only stochastic computer programs. But
the vocabulary of self-generating and component-systems leads to a novel understanding of the
basic phenomena involved.


Now let us return to the formal "process iterations" of Chapter Seven. Equation (**), in itself,
is much too general to be of any use as a "cognitive law of motion." If System1 and T are chosen
appropriately, then (**) can describe anything whatsoever. That is, after all, the meaning of
universal computation! However, this simple iteration is nevertheless the first stop along the path
to the desired equation. What is needed is merely to specialize the operator T.

Instead of taking the compounds formed from Systemt, I suggest, one must take the patterns
in these compounds . This completes the picture of the mind as a system which recognizes
patterns in itself, which forms its own patterns from its own patterns. There might seem to be
some kind of contradiction lurking here: after all, how can patterns in hyperrelations themselves
be hyperrelations? But of course, this is precisely the distinctive quality of hyperrelations: they
subvert the hierarchy of logical types by potentially belonging to their own domain and range.
And this unusual property does not violate the laws of physical reality, because the
hyperrelations required for practical modeling can themselves be perfectly well modeled in terms
of ordinary Boolean functions.

To make this more precise, define the relative structure St^ of a set A = {a, b, c, ...} as the set
of all x which are patterns in some subset of A relative to some other subset of A.

For instance, in the algorithmic information model, "x is an exact pattern in b relative to a"

1) b produces x from a

2) I(x|b,a) < I(x|a)

More generally, statement (2) must be replaced with a less specific formal notion such as

Get any book for free on: www.Abika.com

2') |x\{b,a}| < |x\a|

The generalization of this notion to encompass patterns that are approximate rather than exact is
quite straightforward.

In this notation, the simplest self-generating pattern dynamic says that, where Systemt is the
system at time t,

Systemt+1 = St^( R[Systemt] ) (***)


. 5
( 10)