

about what is given. To ignore this elementary fact can be to
land in serious trouble.

A common error
Another common error is to overlook the fact that the order
in which A and B are written matters and can be crucial.
Formally, it is not generally true that P(A given B) is equal
to P(B given A). Treating these two probabilities as the same
is often known as the prosecutor's fallacy because it sometimes
arises in legal cases.[4] Suppose that DNA found at the site of a
crime is compared with that obtained from a suspect and that
the two match. It may be claimed in court that the probability
of such a match occurring with the DNA of a randomly chosen
individual in the population is one in several million. The jury
is then invited to infer that the accused is guilty because what
has happened would be extremely unlikely if the accused were
not guilty. Let us express this in the formalism set out above.
The probability, which is said to have a value of one in several
million, may be written
P(match given accused not guilty).
If we (incorrectly) reverse the elements inside the brackets we
would interpret this as

P(accused not guilty given match).
As this is negligibly small, the complementary event, that the
accused is guilty, seems overwhelmingly probable. The reasoning
is fallacious but an over-zealous prosecutor and lay jury
can easily be persuaded that it is sound. It is true that there
are circumstances in which these two probabilities might happen
to coincide but this is not usually the case. I shall not go
further into the technicalities of why the reasoning behind the
prosecutor's fallacy is mistaken but one can get an indication
that something might be wrong by the following consideration.
Intuitively, one would expect that the judgement of guilt
ought to depend also on the probability of being not guilty
in the absence of any evidence at all – but here this does not
come into the picture.

[4] Examples are not confined to the field where the term was coined. One of
the earliest errors of this kind was due to Hoyle and Wickramasinghe and
before them, Le Comte du Noüy, who used it to prove that God exists. In
the next chapter we shall see that because some occurrence in the physical
world has an incredibly small probability given the chance hypothesis, it
cannot be assumed that this was the probability of the chance hypothesis
given the occurrence and hence the probability of its complement (that
God was the cause) was very close to one. More details will be found
in Bartholomew (1984), see especially chapter 3. Overman (1997) mentions
these and other examples and falls into the same trap.

God, Chance and Purpose
The prosecutor's fallacy often arises when very small probabilities
are involved, as in the above example. Very small
probabilities also play a key role in judging whether biologically
– and theologically – interesting events could have
occurred. It is, therefore, particularly important to be alert for
examples of A and B being inadvertently reversed.
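To see how far apart the two probabilities can lie, the following is a minimal sketch using Bayes' theorem. Every figure in it is an assumption made purely for illustration and is taken from no real case: a match probability of one in three million for an innocent person, a certain match for the guilty one and, in the absence of any other evidence, six million and one people each equally likely to be the source of the DNA.

```python
from fractions import Fraction

def posterior_not_guilty(p_match_given_not_guilty, prior_not_guilty,
                         p_match_given_guilty=Fraction(1)):
    """Bayes' theorem: P(not guilty given match)."""
    prior_guilty = 1 - prior_not_guilty
    numerator = p_match_given_not_guilty * prior_not_guilty
    denominator = numerator + p_match_given_guilty * prior_guilty
    return numerator / denominator

# Hypothetical inputs, chosen only to illustrate the point.
p_match_given_not_guilty = Fraction(1, 3_000_000)
prior_not_guilty = Fraction(6_000_000, 6_000_001)  # no evidence beyond the match

p_not_guilty_given_match = posterior_not_guilty(p_match_given_not_guilty,
                                                prior_not_guilty)
print(p_match_given_not_guilty, p_not_guilty_given_match)
```

On these assumed figures P(match given accused not guilty) is one in three million, yet P(accused not guilty given match) is two-thirds. The difference comes entirely from the prior probability of being not guilty, which is the ingredient the fallacious argument leaves out.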
Chapter 6
What can very small probabilities tell us?

It is often claimed that such things as the origin of life on earth and
the coincidental values of cosmic constants are so improbable on the
'chance hypothesis' that they must point to divine action. The correctness
of this conclusion depends on both correct calculation and valid
forms of inference. Most examples fail on both counts. Two common
mistakes in making probability calculations were identified in chapter
five. In this chapter I explain and illustrate the three main approaches
to inference: significance testing, likelihood inference and Bayesian
inference.

What is the argument about?
It is tempting to argue that if something has a very small
probability we can behave as if it were impossible. When we
say 'small' here we are usually thinking of something that is
extremely small, like one in several millions. We run our lives
on this principle. Every day we are subject to a multitude of
tiny risks that we habitually ignore. We might be struck by
a meteorite, contract the Ebola virus, find a two-headed coin
or forget our name. Life would come to a halt if we paused
to enumerate, much less prepare for, such possibilities. Even
the most determined hypochondriac would be hard put to
identify all the risks we run in matters of health. Surely, then,
we can safely reject the possibility of anything happening if
the probability is extremely small? Actually the position is
more subtle and we must be a little more careful about what
we are saying.
Every time we play a card game such as bridge we are dealt
a hand of thirteen cards. The probability of being given any
particular hand is extremely small – about one in 6.4 × 10^11 –
and yet it undoubtedly has happened and we have the hand
before us as evidence of the fact. That very small probability
does not, however, give us grounds for doubting that the
pack was properly shuffled. There are other situations like
this, where everything that could possibly happen has a very
small probability, so we know in advance that what will have
happened will be extremely unlikely. It is then certain that a
very rare event will occur! Does this mean that there is never
any ground to dismiss any hypothesis – however absurd?
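The figure quoted for a bridge hand can be checked directly. The short sketch below assumes only a standard pack of fifty-two cards and a hand of thirteen, with every hand equally likely after a proper shuffle; the variable names are my own.

```python
from math import comb

# Number of distinct thirteen-card hands from a fifty-two-card pack.
n_hands = comb(52, 13)

# Probability of any one particular hand if all hands are equally likely.
p_hand = 1 / n_hands

print(n_hands, p_hand)
```

There are 635,013,559,600 possible hands, so each has a probability of about one in 6.4 × 10^11, and yet one of them is certain to be dealt.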
It is clear that there is a problem of inference under uncer-
tainty here with which we must get to grips. Before plunging
into this major topic, I pause to mention a few examples where
very small probabilities have been used as evidence for the
divine hand. In the more spectacular cases these have been
held to show that there must have been a Designer of the
universe. It is important, therefore, that we get our thinking
straight on such matters. Actually there is a prior question
which arises concerning all of these examples; not only must
we ask whether the inferences based on them are legitimate,
but whether the probabilities have been correctly calculated
in the first place. Most fail on both counts.

Examples of very small probabilities
A common line of argument, employed by many Christian
apologists, is that if we can find some event in nature which,
on chance alone, would have an extremely small probability,
then chance can be ruled out as an explanation. The usual
alternative envisaged is that the event must have been caused
by the deity, or someone (or thing) operating in that role. It
is not hard to find such events and, even if their probabilities
cannot always be calculated exactly, enough can usually
be said about them to establish that they are, indeed, very
small. Some of these examples are longstanding and have
been sufficiently discredited not to need extended treatment
here. A number were given in God of Chance (Bartholomew
1984), and Overman (1997) reports five such calculations in his
section 3.7.
The earliest example appears to have been John
Arbuthnot's discovery that male births exceeded female births
in London parishes in each of the eighty-two years from 1629
to 1710. Le Comte du Noüy looked at the more fundamental
question of the origin of life and considered the probability
that it might have happened by the fortuitous coming
together of proteins to form an enzyme necessary for life
to exist. He quoted the extraordinarily small probability of
2.02 × (1/10)^321 (Hick 1970, p. 15). This is an example of
a mistake in the calculation. It falls into the common error
of supposing that enzymes are formed by a kind of random
shaking together of their constituent parts. Hick, appealing
to Matson (1965), recognised this and rejected the calcula-
tion. No biologist supposes that enzymes were formed in such
a way and the calculation is not so much wrong as irrelevant.
In effect this is another attempt to use the multiplication
rule for independent probabilities, when the probabilities
involved are not independent. Ayala (2003)[1] makes this
point very clearly (see especially page 20). He returns to
this topic in his most recent book (Ayala 2007). In chapter 8,
especially on pages 153 and 154, Ayala points out that the
assumptions on which such calculations rest are simply wrong.

[1] This paper by Ayala is an excellent review of the 'argument from design'
by a biologist and philosopher. Apart from dealing with the matter of small
probabilities the paper contains much that is relevant to the debate on
Intelligent Design, which is treated in the following chapter.

There would, perhaps, be no point in relating these episodes
were it not for the fact that the same error is continually being
repeated. In his attempt to show that life was 'beyond the
reach of chance', Denton (1985) uses essentially the same
argument, taking as an illustration the chance of forming
real words by the random assemblage of letters. This example
has been discussed in more detail in Bartholomew (1996,
pp. 173ff.)[2] but it suffices here to say that Denton's calculation
is based on an oversimplified model of what actually
happens in nature. In addition to Overman's examples mentioned
earlier, Hoyle and Wickramasinghe (1981) made the
same kind of error and this deserves special mention for two
reasons. First, it appears to have given currency to a vivid
picture which helps to grasp the extreme improbabilities of
the events in question. Hoyle compares the chance to that
of a Boeing 747 being assembled by a gale of wind blowing
through a scrap yard.[3] This analogy is widely quoted,
misquoted (one example refers to a Rolls Royce!) and misinterpreted,
as any web search will quickly show. It appears in
Hoyle's book The Intelligent Universe (1983, p. 19) in the
following form:

A junkyard contains all the bits and pieces of a Boeing-747, dismembered
and in disarray. A whirlwind happens to blow through the yard.
What is the chance that after its passage a fully assembled 747, ready
to fly, will be found standing there?

[2] Ayala (2007) also uses the analogy of words formed by typing at random in
a section beginning on page 61, headed 'A monkey's tale'. He then proposes
a more realistic version on page 62, which is based on the same idea as
the example referred to here. This example first appeared in Bartholomew

[3] None of the quotations I have tracked down give a precise reference to
its origin. However, on the web site http://home.wxs.nl/~gkorthof/kortho46a.htm
Gert Korthof says that the statement was first made in a
radio lecture given by Hoyle in 1982.

Inevitably, for such a famous utterance, there is even doubt as
to whether Hoyle was the true originator. Whether or not the
probability Hoyle and Wickramasinghe calculated bears any
relation at all to the construction of an enzyme is very doubt-
ful, but its vividness certainly puts very small probabilities in
Secondly, their argument is a flagrant example of the prosecutor's
fallacy, or the illegitimate reversal of the A and B in
the probability notation above (see p. 75). For what Hoyle and
Wickramasinghe actually claimed to have calculated was the
probability that life would have arisen given that chance
alone was operating. This is not the same as the probability
of chance given the occurrence of life, which is how
they interpret it. That being so impossibly small, Hoyle and
Wickramasinghe wrongly deduce that the only alternative
hypothesis they can see (God) must be virtually certain.
In more recent examples the errors in calculation are some-
times less obvious, usually because no explicit calculation is
made. Here I mention two very different examples. The first
lies behind Stephen J. Gould's claim that if the film of evolution
were to be rerun, its path would be very different and we,
in particular, would not be here to observe the fact! We shall
meet this example again in chapter 11 where its full significance
will become clear. Here I concentrate on it as an example of a
fallacious probability argument. Gould visualises evolution as
a tree of life. At its root are the original, primitive, forms from
which all life sprang. At each point in time at which some dif-
ferentiating event occurs, the tree branches. Evolution could
have taken this path or that. The tree thus represents all possi-
ble paths which evolution could have taken. The present form
of the world is just one of many end-points at the very tip of one
of the branches. All of the other tips are possible end-points
which happen not to have been actualised. In reality our pic-
ture of a tree is limiting, in the sense that there are vastly more
branches than on any tree we are familiar with. The probabil-
ity of ending up at our particular tip is the probability of taking
our particular route through the tree. At each junction there
are several options, each with an associated probability. These
probabilities need not be individually small but, if successive
choices are assumed to be independent, the total probability
will be the product of all the constituent probabilities along
the path. Given the enormous number of these intermediate
stages, the product is bound to be very small indeed. From this
Gould's conclusion follows – or does it? His assumption was
tacit but wrong. It is highly likely that the branch taken at any
stage will depend on some of the choices that have been made
earlier and that other environmental factors will be called into
play to modify the simplistic scheme on which the indepen-
dence hypothesis depends. In fact there is no need to rely on
speculation on this point. As we shall see in chapter 11, there
is strong empirical evidence that the number of possible paths
is very much smaller than Gould allows. In that chapter we
shall look at Simon Conway Morris' work on convergence in
evolution. This draws on the evidence of the Burgess Shale
of British Columbia just as Gould's work did. It shows how
facile it can be to make the casual assumption of independence
and then to build so much on it.
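How much hangs on the assumption of independence can be seen in a small numerical sketch. The number of junctions and all the probabilities below are invented for illustration; nothing of the kind is known for the real tree of life. The probability of a path is always the product of the probabilities of each step given the steps before it. Independence amounts to assuming that those conditional probabilities never change, whereas dependence allows earlier choices to make later ones nearly inevitable.

```python
from math import prod

def path_probability(step_probs):
    """Multiply the probability of each step, given all the steps before it."""
    return prod(step_probs)

n_junctions = 1000  # hypothetical length of the path through the tree

# Independence: every branch taken has probability 0.9, whatever went before.
p_independent = path_probability([0.9] * n_junctions)

# Dependence (hypothetical): after ten open choices, earlier events so constrain
# the later ones that each subsequent branch has conditional probability 0.999.
p_dependent = path_probability([0.9] * 10 + [0.999] * (n_junctions - 10))

print(p_independent, p_dependent)
```

On these made-up numbers the independent path has a probability below 10^-45, while the dependent path has a probability above one in ten. The smallness of the first figure is produced by the assumption and not by anything observed.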
My second example is quite different and is usually discussed
in relation to the anthropic principle.[4] It relates to the
remarkable coincidence in the basic parameters of the universe
which seem to have been necessary for a world of sentient
beings to come into existence and to observe the fact! Any list
of such coincidences will include the following:
(i) If the strong nuclear force which holds together the particles
in the nucleus of the atom were weaker by more than
2 per cent, the nucleus would not hold together and this
would leave hydrogen as the only element. If the force
were more than 1 per cent stronger, hydrogen would be
rare, as would elements heavier than iron.
(ii) The force of electromagnetism must be 10^40 times
stronger than the force of gravity in order for life as
we know it to exist.
(iii) If the nuclear weak force were slightly larger there would
have been little or no helium produced by the big bang.

[4] The anthropic principle was discussed from a statistical point of view in
Bartholomew (1988, pp. 140ff.), where some references are given. It has
been used as an argument for God's existence by, for example, Montefiore
(1985), where the list of coincidences which he gives in chapter 3 is somewhat
different from that given here. Other discussions will be found in Holder
(2004) and Dowe (2005). Sharpe and Walgate (2002) think that the principle
has been poorly framed and they wish to replace it by saying 'that the
universe must be as creative and fruitful as possible'. The most recent
discussion in a much broader context is in Davies (2006). An illuminating
but non-probabilistic discussion will be found in 'Where is natural theology
today?' by John Polkinghorne (2006). Dembski (1999, p. 11) regards these
coincidences as evidence of Intelligent Design.

In combination, these and other coincidences of the same
kind add up to a remarkable collection of surprising facts about
the universe. It thus turns out that the values of many basic
constants have to be within very narrow limits for anything
like our world to have emerged. Probabilities are brought into
the picture by claiming that the chance of such coincidences is
so remote that we must assume that the values were 'fixed' and
that the only way they could have been fixed was for there to be
a supreme being who would have done it. This is seen by many
to be one of the most powerful arguments for the existence of
God – and so it may be – but the probabilistic grounds for that
conclusion are decidedly shaky. The argument consists of two
parts. First it is argued that the probability of any one parameter
falling within the desired range must be infinitesimally
small. Secondly, the probability of them all falling within their
respective ranges, obtained by multiplying these very small
probabilities together, is fantastically small. Since this rules
out the possibility of chance, the only option remaining is that
the values were deliberately fixed.
Let us examine each step in the argument, beginning with
the second. There is no obvious reason for supposing that
the selections were made independently. Indeed, if there were
some deeper model showing how things came about, it could
well be that some of the parameter values were constrained
by others or even determined by them. The total probabil-
ity might then be much larger than the assumption of inde-
pendence suggests. But surely something might be salvaged
because there is so little freedom in the determination of each
parameter treated individually; this brings us back to the first
point. Why do we suppose it to be so unlikely that a given
parameter will fall in the required small interval? It seems
to depend on our intuition that, in some sense, all possible
values are equally likely. This is an application of the principle
of insufficient reason which says that, in the absence of
any reason for preferring one value over another, all should
be treated as equally likely. The application of this principle
is fraught with problems and it does not take much ingenuity
to construct examples which show its absurdity. The best we
can do is to say that if it were the case that all values were equally
likely and also if the parameters were independent, then it
would follow that the creation 'set-up' was, indeed, extremely
unlikely if nothing but chance were operating. But there is no
evidence whatsoever for either of these assumptions – they
merely refer to our state of mind on the issues and not to any
objective feature of the real world.
It should now be clear that the various very small probabil-
ities on offer in support of theistic conclusions must be treated
with some reserve. But that still leaves us with the question of
how we should interpret them if they were valid.

Inference from very small probabilities
The question is: if some event, or proposition, can be validly
shown to have an exceedingly small probability on the chance
hypothesis (whatever that might turn out to mean) on what
grounds can we reject the hypothesis? As noted already, this
is the question which Dembski seeks to answer in his support
of Intelligent Design but, as Dembski himself recognises, this
is not a new issue but one older even than modern statistics.
We shall look at Dembski's arguments in the following
chapter but first, and without introducing the complications
of the contemporary debate, consider the problem from first
principles.
First, I reiterate the point made in relation to dealing a
bridge hand selected from a pack of cards, namely that a
small probability is not, of itself, a sufficient reason for rejecting
the chance hypothesis. Secondly, there are three broad
approaches, intimately related to one another, which are in
current use among statisticians. I take each in turn.
I start with significance testing, which is part of the everyday
statistical routine – widely used and misused. If we wish to
eliminate chance from the list of possible explanations, we
need to know what outcomes are possible if chance is the
explanation. In most cases all possible outcomes could occur,
so chance cannot be ruled out with certainty. The best we
can hope for is to be nearly certain in our decision. The basic
idea of significance testing is to divide the possible outcomes
into two groups. For outcomes in one group we shall reject
the chance hypothesis; the rest will be regarded as consistent
with chance. If the outcomes in the rejection set taken together
have a very small probability, we shall be faced with only two
possibilities: (a) the chance hypothesis is false or (b) a very
rare event has occurred. Given that very rare events are, as
their name indicates, very rare, one may legitimately prefer
(a). If we make a practice of rejecting the chance hypothesis
whenever the outcome falls in the rejection set we shall only
be wrong on a very small proportion of the occasions when that
hypothesis is, indeed, true. The probability that we shall be wrong in
rejecting the chance hypothesis on any particular occasion is
therefore very small.
In essence this is how a significance test works, but it leaves
open two questions: (i) how small is 'small' and (ii) how should
we construct the rejection set?
This brings us to a second version of the approach, often
associated with the names of Neyman and Pearson, for which
I begin with the second of the above questions. Let us imagine
allocating possible outcomes to the rejection set one by one.
The best candidate for inclusion, surely, is the one which is
furthest from what the chance hypothesis predicts, or alter-
natively perhaps, the least likely. In practice, these are often
the same thing. In fact, in this case, we may define 'distance'
by probability, regarding those with smaller probability as
further away. We can then go on adding to the rejection set,
starting with the least probable and working our way up the
probability scale. But when should we stop? This is where
the answer to the first question comes in. We want to keep
the overall probability small, so we must stop before the col-
lective probability of the rejection set becomes too large. But
there are considerations which pull us in opposite directions.
On the one hand we want to keep the probability very small
and this argues for stopping early. On the other, if we make
the net too small, we reduce the chance of catching anything.
For the moment I shall describe the usual way that statisticians
take out of this dilemma. This says that there are two sorts
of mistakes we can make and that we should try to make the
chance of both as small as possible. The first kind of mistake
is to reject the chance hypothesis when it is actually true. The
probability of this happening is called the 'size of the rejection
region'. The second kind of error is failing to reject the
chance hypothesis when we ought to – that is, when some other
hypothesis is true. This argues for making the rejection region
as large as possible, because that way we make the net larger
and so increase our chance of catching the alternative. The
reader may already be familiar with these two kinds of error
under the guise of false positives and false negatives. If, for
example, a cervical smear test yields a positive result when the
patient is free of disease, this is said to be a false positive (error
of Type I). If a patient with the condition yields a negative
result that is described as a false negative (error of Type II). In
both contexts any rule for deciding must be a compromise.
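The construction of a rejection set described above can be sketched in a few lines. The setting is an assumption chosen only for illustration: ten tosses of a coin, with the chance hypothesis that heads and tails are equally likely, and a ceiling of one in twenty on the size of the rejection region. The function names are my own and the procedure is a bare outline, not a full account of the Neyman and Pearson theory.

```python
from fractions import Fraction
from math import comb

def binomial_pmf(n, p):
    """Probability of each possible number of heads in n tosses."""
    return {k: comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)}

def build_rejection_set(pmf, max_size):
    """Add outcomes, least probable first, while the total stays within max_size."""
    rejection_set, size = [], Fraction(0)
    for outcome, prob in sorted(pmf.items(), key=lambda item: (item[1], item[0])):
        if size + prob > max_size:
            break  # adding this outcome would make the rejection region too large
        rejection_set.append(outcome)
        size += prob
    return rejection_set, size

# Hypothetical example: ten tosses, fair coin, size at most 1/20.
pmf = binomial_pmf(10, Fraction(1, 2))
rejection_set, size = build_rejection_set(pmf, Fraction(1, 20))
print(sorted(rejection_set), size, float(size))
```

In this example the rejection set consists of 0, 1, 9 or 10 heads and has size 22/1024, a little over 2 per cent. Raising the ceiling would admit more outcomes and so catch more departures from chance, at the price of more false rejections, which is the compromise just described.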
Let us see how this works out in two particular cases. As
noted earlier, John Arbuthnot claimed to have found evidence
for the divine hand in the records of births in London parishes.
In the eighty-two years from 1629 to 1710 there was an excess of
male births over female births in every one of those years. If the
chances of male and female births were equal (as Arbuthnot
supposed they would be if chance ruled), one would have
expected an excess of males in about half of the years and an
excess of females in the rest. This is not what he found. In
fact the deviation from the 'chance' expectation was so great
as to be incredible if sex was determined as if by the toss
of a coin. Can we eliminate chance as an explanation in this
case? Arbuthnot thought we could and he based his case on
the extremely small probability of achieving such an extreme
departure from expectation if 'chance ruled'. It would have
been remarkably prescient of him to have also considered
the probability of failing to detect a departure if one really
existed, but this was implicit in his deciding to include those
cases where the proportion of males, or females, was very
different from equality.
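The size of the probability on which Arbuthnot relied is easily computed. The sketch below assumes, as he did, that under chance each year is equally likely to show an excess of males or of females and that the years are independent of one another.

```python
from fractions import Fraction

years = 82  # 1629 to 1710 inclusive

# Probability of a male excess in every one of the years if each year is a fair toss.
p_all_male_excess = Fraction(1, 2) ** years

print(float(p_all_male_excess))
```

The result is 1 in 2^82, or roughly 2 × 10^-25.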
My second example is included partly because it has figured
prominently in Dembski's own writings, to which we come
in the next chapter. It concerns the case of Nicolas Caputo,
the one-time clerk of a county in New Jersey in the United
States. It is a well-established empirical fact that the position
on the ballot paper in an election influences the chance of success.
The position is determined by the clerk and, in this case,
the Democrats had first position on forty out of forty-one
occasions. Naturally, the Republicans suspected foul play and
they filed a suit against Caputo in the New Jersey Supreme
Court. If the clerk had determined the position at random,
the two parties should have headed the list on about half of
the occasions. To have forty of forty-one outcomes in favour
of the party which Caputo supported seemed highly suspicious.
Such an outcome is certainly highly improbable on the
'chance' hypothesis – that Caputo allocated the top position
at random. But what is the logic behind our natural inclination
to infer that Caputo cheated? It cannot be simply that the
observed outcome – forty Democrats and only one Republican –
is highly improbable, because that outcome has exactly the
same probability as any other outcome. For example, twenty
Democrats and twenty-one Republicans occurring alternately
in the sequence would have had exactly the same probability,
even though it represents the nearest to a fifty:fifty split that
one could have. Our intuition tells us that it is the combination
of Caputo's known preference with the extreme outcome that
counts. According to the testing procedure outlined above,
we would choose the critical set so that it has both a very small
probability and includes those outcomes which are furthest
from what we would expect on the chance hypothesis that
Caputo played by the rules.
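The distinction between the probability of one particular sequence and the probability of a rejection set can be made concrete. In this sketch the rejection set is assumed to consist of all sequences in which the Democrats head the ballot at least forty times out of forty-one. That choice is mine, made to reflect the direction of Caputo's known preference, and the draws are taken to be independent and fair under the chance hypothesis.

```python
from fractions import Fraction
from math import comb

n = 41  # number of ballot draws

# Any one particular sequence of Democrats and Republicans, however it looks.
p_sequence = Fraction(1, 2) ** n

# Rejection set: forty or forty-one Democrats in first position.
p_rejection = Fraction(comb(n, 40) + comb(n, 41), 2 ** n)

print(float(p_sequence), float(p_rejection))
```

Every individual sequence, including the alternating one, has probability 1 in 2^41. The rejection set contains forty-two such sequences, so its probability is 42 in 2^41, a little under 2 in 100 billion. It is this collective probability, attached to outcomes chosen because they lie furthest from what chance would lead us to expect, that carries the inference.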

Inference to the best explanation
This is a second way of approaching the inference problem.[5]
I borrow the terminology from philosophers and only bring
statisticians into the picture at a later stage. This is a deliberate
choice in order to bring out a fundamental idea which
is easily lost if I talk, at this stage, about the statistical term
'likelihood'. Inference to the Best Explanation is the title of a
book by philosopher Peter Lipton (Lipton 2004), whose ideas
have been seen as applicable to theology. The term inference to
the best explanation is almost self-explanatory. It is not usually
difficult to conceive of several explanations for what we see
in the world around us. The question then is: which explanation
is the best? What do we mean by 'best' and how can we
decide which is best? This is where probability considerations
come in.

[5] There is an interesting parallel between developments in matters of inference
in philosophy and statistics which has not, as far as I am aware, been
remarked upon before. With very few exceptions, these have been entirely
independent. The basic idea of significance testing is that hypotheses are
rejected – not accepted. A statistical hypothesis remains tenable until there
is sufficient evidence to reject it. In philosophy Karl Popper's idea of
falsification is essentially the same. The likelihood principle, introduced into
statistics in the 1920s, has surfaced in philosophy as inference to the best
explanation, which is discussed in this section. Bayesian inference, to which
we come later in the chapter, is becoming popular in philosophy after the
pioneering work of Richard Swinburne set out in The Existence of God
(Swinburne 2004). In his earlier work he referred to Bayesian inference as
Confirmation Theory. A recent example is Bovens and Hartmann (2003).
One important difference is that statisticians give probability a more exclusive
place. For example, the judgement of what is 'best' in Lipton's work
does not have to be based on probability. There is an interesting field of
cross-disciplinary research awaiting someone here.

Suppose we return home one day and find a small parcel
on the doorstep. We ask ourselves how it came to be there.
One explanation is that a friend called and, since we were
not at home, left the parcel for us to find. An alternative is
that a carrier pigeon left it. Suppose, for the moment, that
no other explanation suggests itself to us. Which is the better
explanation? The 'friend' hypothesis is quite plausible and
possible; the pigeon hypothesis much less so. Although it is
just possible, it seems scarcely credible that a pigeon should
be dispatched with the parcel and choose to deposit it on a
strange doorstep. The appearance of the parcel seems much
more likely on the 'friend' rather than the 'pigeon' explanation.
In coming to this conclusion we have, informally, assessed
the probability of a parcel's being deposited by a friend or
a pigeon and concluded that the friend provides the better
explanation. Let us set out the logic behind this conclusion
more carefully. There are two probabilities involved. First
there is the probability that a friend decided to deliver a parcel
and, finding us out, deposited it on the doorstep. The event we
observe is the parcel. We might write the relevant probability
as P(parcel given friend). Similarly on the pigeon hypothesis
there is P(parcel given pigeon). We judge the former to
be much higher than the latter and hence prefer the friend
explanation because it makes what has happened much more
likely. In that sense the friend is the better explanation.
The procedure behind this method of inference is to take
each possible explanation in turn, to estimate how probable
each makes what we have observed and then to opt for the
one which maximises that probability. In essence that is how
inference to the best explanation works in this context.
There is one very important point to notice about all of
this – it merely helps us to choose between some given explanations.
There may be many other possible explanations which
we have not thought of. It does not, therefore, give us the 'best'
explanation in any absolute sense but merely the best among
the explanations on offer. Secondly, notice that the probabilities
involved do not have to be large. It need not be very likely
that our friend would have left a parcel on our doorstep –
indeed, it may be a very unlikely thing to have happened. All
that matters is that it should be much more likely than that a
pigeon put it there. The relative probabilities are what count.
If the probabilities are so very low, there would be a powerful
incentive to look for other explanations which would make
what has happened more likely, but that is a separate issue.
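The procedure can be written out as a short sketch. The two probabilities are invented for the purpose, since nobody knows how likely friends or pigeons are to leave parcels. All that the method uses is P(parcel given explanation) for each explanation on offer.

```python
# Hypothetical values of P(parcel given explanation); both are deliberately small.
likelihoods = {
    "friend": 1e-2,
    "pigeon": 1e-7,
}

def best_explanation(likelihoods):
    """Return the explanation that makes the observed event most probable."""
    return max(likelihoods, key=likelihoods.get)

best = best_explanation(likelihoods)
ratio = likelihoods["friend"] / likelihoods["pigeon"]

print(best, ratio)
```

With these made-up figures the friend is preferred, although the parcel is unlikely on either explanation. Only the ratio of the two probabilities matters to the choice, and an explanation missing from the list can never be selected.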
The debate between science and religion, when pursued as
objectively as possible, involves choosing between explanations.
Two explanations for life in the universe are (a) that God
created it in some fashion and (b) that it resulted, ultimately,
from the random nature of collisions of atoms (as it has been
put). If we could calculate the probabilities involved, it might
be possible to use the principle to decide between them. Dowe
(2005) [6], for example, uses the principle in relation to the fine
tuning of the universe and the evidence which that provides
for theism.

[6] The discussion begins on p. 154 in the section on God as an explanation.

This principle of choice has been one of the key ideas in the
theory of statistics for almost a century. It is also particularly
associated with the name of Sir Ronald Fisher and especially
with a major publication in 1922, although the idea went back
to 1912 (see Fisher Box (1978), especially number 18 in the
list of Collected Papers). In statistics the principle is known as
maximum likelihood, where the term likelihood is used in a
special sense which I shall explain below. One advantage of
drawing attention to this connection is that it facilitates the
elucidation of the limitations of the principle.
Because the ideas involved are so important for inference
under uncertainty, we shall look at a further example con-
structed purely for illustrative purposes. This is still to over-
simplify, but it enables me to repeat the essential points in a
slightly more general form.
Suppose tomorrow’s weather can be classified as Sunny (S),
Fair (F) or Cloudy (C) and that nothing else is possible. We
are interested in explaining a day’s weather in terms of the
barometric pressure on the previous day. Weather forecasters
are interested in such questions though they, of course,
would take many more factors into account. To keep this as
simple as possible, let us merely suppose the pressure to be
recorded, at an agreed time, as either above (A) or below
(B) 1,000 millibars (mb), which is near the middle of the
normal range of barometric pressure. The forecaster is able to
estimate probabilities for tomorrow’s weather outcomes given
today’s barometric pressure. We write these probabilities as,
for example, P(S given B). This is shorthand for the probability
that it is sunny (S) tomorrow, given that today’s
barometric pressure is below 1,000 mb (B). The probabilities
available to the forecaster can be set out systematically in a
table, as follows.

                     Sunny (S)      Fair (F)       Cloudy (C)
Above 1,000 mb (A)   P(S given A)   P(F given A)   P(C given A)
Below 1,000 mb (B)   P(S given B)   P(F given B)   P(C given B)

In the first row are the probabilities of tomorrow’s weather
given that the barometer is above 1,000 mb today; in the
second row are the corresponding values if the pressure is
below 1,000 mb today. A forecaster would choose the row
representing today’s pressure and look for the option which
had the largest probability. To make this more definite, suppose
the numbers came out as follows.

                     Sunny (S)   Fair (F)   Cloudy (C)
Above 1,000 mb (A)   0.6         0.3        0.1
Below 1,000 mb (B)   0.2         0.4        0.4

If the barometer today is above 1,000 mb, the first row is
relevant, and since sunny has the highest probability (0.6),
this is the most likely outcome. In the contrary case, fair and
cloudy have the same probabilities. Both are more likely than
sunny but, on this evidence, there is no basis for choosing
between them.
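The row-by-row reading of the table can be sketched in a few lines of Python. This is only an illustration of the forecasting step just described, using the illustrative numbers above; the names TABLE and forecast are invented here and are not part of the original argument.

```python
# Illustrative sketch only. Each row is indexed by today's pressure
# (A = above the threshold, B = below it) and holds the probabilities
# of tomorrow's weather given that pressure.
TABLE = {
    "A": {"S": 0.6, "F": 0.3, "C": 0.1},
    "B": {"S": 0.2, "F": 0.4, "C": 0.4},
}

def forecast(pressure):
    """Return the weather outcome(s) with the largest probability in a row."""
    row = TABLE[pressure]
    best = max(row.values())
    # Tied outcomes are returned together, because the row alone gives
    # no basis for choosing between them.
    return sorted(w for w, p in row.items() if p == best)

print(forecast("A"))  # ['S']
print(forecast("B"))  # ['C', 'F']
```

Returning every tied outcome, rather than picking one arbitrarily, mirrors the remark that fair and cloudy cannot be separated on this evidence.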
However, the real point of introducing this example is to
look at the problem the other way round. Imagine that a day
has passed and we now know what the weather is like today
but have no record of yesterday’s pressure. What now is the
best explanation we can give of today’s weather in terms of
the barometric pressure on the day before? In other words,
what do we think the pressure on the day before would have
been given that it is sunny, say, today?
To answer this question, the principle of inference to the
best explanation says we should look at the columns of the
table. In our example there are only two elements to look at:
P(S given A) and P(S given B) or, in the numerical example,
0.6 and 0.2. The principle says that we should choose
A because it gives a higher probability to what has actually
happened than does B (0.6 as against 0.2). Similarly, if it was
cloudy today we would say that B was the most likely expla-
nation (because 0.4 is bigger than 0.1).
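Reading the same table by column can be sketched in the same way. Again this is a minimal illustration under the numbers assumed in the example; the function names likelihoods and best_explanation are hypothetical.

```python
# Illustrative sketch only: the same table, P(weather given pressure).
TABLE = {
    "A": {"S": 0.6, "F": 0.3, "C": 0.1},
    "B": {"S": 0.2, "F": 0.4, "C": 0.4},
}

def likelihoods(weather):
    """Read down a column: P(weather given h) for each pressure state h."""
    return {h: row[weather] for h, row in TABLE.items()}

def best_explanation(weather):
    """Return the pressure state(s) giving the observed weather the highest probability."""
    column = likelihoods(weather)
    top = max(column.values())
    return sorted(h for h, p in column.items() if p == top)

for w in "SFC":
    print(w, likelihoods(w), best_explanation(w))
```

For sunny weather the column holds 0.6 and 0.2, so A is chosen; for cloudy weather it holds 0.1 and 0.4, so B is chosen, as in the text.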
When we look at the probabilities by column we call them
likelihoods. Choosing the largest value in the column is thus
maximising the likelihood. In statistics we speak of the principle
of maximum likelihood; in philosophy, of inference to the
best explanation. For those with some knowledge of mathematics,
we use the term probability function when we consider
the probability as a function of its first argument, with the
second (the one after the ‘given’) fixed. It is called the likelihood
function when it is considered as a function of the second
argument, with the first fixed.
We can put the matter in yet another way by saying that
when we use the probabilities to look forward in time we
are making forecasts; when we use them to look backward in
time we are making an inference to the best explanation. This
prompts a further observation before we leave this example.
The rows in the tables always add up to one because all the
possible outcomes are included, so one or other must happen;
that is, the event that one or other happens is certain. The
columns do not add up to one because, when looking down
a column, it is only the relative values of the entries that
matter.
Likelihood inference was used in this fashion in
Bartholomew (1996) in relation to the existence of God. However,
the likelihood principle is part of Bayes’ rule which has
been used extensively by the philosopher Richard Swinburne
and others. The problem with inference to the best explana-
tion is that it takes no account of anything we may know about
the hypothesis in advance. For example, in a country where
the barometric pressure was almost always high, the fact of
knowing that it was a sunny day would not greatly affect our
propensity to believe that the pressure had been high the day
before. The high prior probability of the pressure being high
is not greatly affected by the additional information that it is
sunny today, because the two usually go together anyhow. The
trouble with comparing likelihoods is that it does not take into
account the relative probabilities of hypotheses at the out-
set. This is what the third approach to inference, Bayesian
inference, does for us.

Bayesian inference
I shall not go into the technicalities but, instead, give a further
example which illustrates how the idea works. Incidentally, this
example also demonstrates that if we use likelihood inference
on the question of God’s existence, it produces a rather odd
conclusion. This fact gives added impetus to the search for
something better.
The likelihood comparison always makes the ‘God hypothesis’
more credible than any competing hypothesis. For simplicity,
suppose that we wish to compare just two hypotheses:
one that there exists a God capable of bringing into being whatever
world he wishes, the other that there is no such God. On
the first hypothesis it is certain that the world would turn out
exactly as God intended and hence the conditional probability
of its happening thus must be one. On the alternative hypothesis
we may not be able to determine the probability precisely,
or even approximately, but it will certainly be less than one,
so the ‘God hypothesis’ is to be preferred. I suspect that few
atheists would be converted by this argument and it is pertinent
to ask ‘why not?’ The whole thing smacks of sleight of
hand. It sounds like cheating if we contrive a hypothesis with
the sole purpose of making what has happened certain. If you
begin with the conviction that God’s existence is intrinsically
unlikely there ought to be some way of making sure that that
fact is taken into account. That is exactly what Bayes’ theorem
does.
Essentially it asks whether the likelihood is sufficient to
outweigh any prior odds against the hypothesis. Looked at in
another way, it allows us to bring our own feelings, judgements
or prior evidence into the equation. Whether or not we should
be allowed to do this is an important question.
In order to get a rough idea of how the theorem works let
us introduce the idea of a likelihood ratio. This is a natural
and simple extension of the ideas already introduced. Let us
go back to the example about weather forecasting. There we
considered two probabilities, P(S given A) and P(S given B).
Their numerical values as used in the example were 0.6 and
0.2, respectively. I regarded A as the more likely explanation of
the sunny weather, represented by S, because its likelihood was
three times (= 0.6/0.2) the likelihood of the alternative. This
factor of three is the likelihood ratio for comparing these two
hypotheses. The bigger it is, the more strongly inclined are we
to accept A as the explanation. However, if we had a justifiable prior
preference for B we might ask how strong that preference
would have to be to outweigh the evidence provided by the likelihood
ratio. Bayes’ theorem tells us that if the prior odds
on B were three to one, that would just balance the evidence
from the likelihood ratio the other way. The odds would have
to be less than three to one for the evidence of the likelihood
ratio to carry the day and more than three to one for it to go
the other way.
In general we have to compare the likelihood ratio to the
prior probability ratio. The product of the two determines
which way the conclusion goes.
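The balancing act described here can be sketched numerically. The snippet below assumes the odds form of Bayes' theorem (posterior odds equal the likelihood ratio multiplied by the prior odds) and uses the weather figures from the example; the function name posterior_odds and the trial prior odds of 2, 3 and 4 to 1 are chosen here purely for illustration.

```python
from fractions import Fraction

def posterior_odds(likelihood_ratio, prior_odds):
    """Odds form of Bayes' theorem; both arguments are odds of A against B."""
    return likelihood_ratio * prior_odds

# Likelihood ratio for A against B when it is sunny: 0.6 / 0.2 = 3.
lr_A_over_B = Fraction(6, 10) / Fraction(2, 10)

# Prior odds of k to 1 on B are prior odds of 1 to k on A.
for prior_on_B in (Fraction(2), Fraction(3), Fraction(4)):
    prior_A_over_B = 1 / prior_on_B
    post = posterior_odds(lr_A_over_B, prior_A_over_B)
    verdict = "A" if post > 1 else "B" if post < 1 else "balanced"
    print(f"prior odds on B {prior_on_B}:1 -> posterior odds A:B = {post} ({verdict})")
```

With prior odds on B of exactly three to one the product is one, so the evidence and the prior preference cancel; smaller prior odds leave A ahead and larger ones leave B ahead.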
We shall meet these ideas again in the next chapter when
we move on to the stage where Intelligent Design is debated.
Chapter 7
Can Intelligent Design be established scientifically?

Intelligent Design has been proposed as a credible scientific alternative
to the theory of evolution as an explanation of life on earth. Its
justification depends on an extension of Fisherian significance testing
developed by William Dembski. It is shown, in this chapter, that there
is a fatal flaw in the logic of his method, which involves a circularity.
In order to construct a test to detect design and ‘eliminate’ chance,
one has to know how to recognise design in the first place. Dembski’s
calculation of the probability required to implement the method is also
shown to be erroneous.

What is the argument about?
Intelligent Design is at the centre of one of the fiercest debates
currently taking place in the science and religion field. Its
proponents claim that the scientific establishment is set on
an atheistic course by refusing to countenance the possibility
that the world might contain evidence of design. All they ask
is that design should not be arbitrarily ruled out from the
start and that nature should be allowed to speak for itself; no
special privileges are asked for. The whole debate should then
take place within the bounds of science and according to its
principles of rationality.
The opponents will have none of this, claiming that Intelligent
Design makes no claims that can be tested empirically
and, because it cannot be falsified, it is not science. Many see
it as crypto-creationism masquerading under the guise of science.
They suspect that it is a scientific front for an ideology
whose aims are more sinister and which are carefully [...]
The United States of America is the birthplace and home of
Intelligent Design [1] and it is out of the heady mix of a conservative
fundamentalism and the threat of religion’s trespassing
into education that the heat of the debate comes. If evolution
is ‘only a theory’ then why, it is argued, should not other
theories get ‘equal time’ in education? After all, should not
children be given the opportunity to make up their own minds
on such an important matter and should not their parents have
a say in what their children are taught? The proponents of
Intelligent Design, it is alleged, wear the clothes of liberals
pleading that all sides should be given a fair hearing, whereas,
from the other side, the scientific establishment is presented
as a group of reactionaries seeking to control what is taught.

[1] There is an enormous literature on this topic, much of it highly controversial.
It is difficult to select a few articles as background reading. A broad, if
uncritical, survey will be found in O’Leary (2004) about a quarter of whose
book consists of notes, which reflect the author’s wide reading. O’Leary
is a journalist who makes no pretence of being an expert in science. Her
sense of fairness doubtless leads to her tendency to grant ‘equal time’ to
all sides of an argument. This means that minority viewpoints may appear,
to the uninitiated, to carry more weight than they ought. This is true,
for example, of the few creationists who have scientific qualifications. The
far more numerous members of the scientific establishment are treated with
less sympathy. A more academic treatment will be found in Peterson (2002).
The journal Perspectives on Science and Christian Faith (the journal of the
American Scientific Affiliation) has carried two extensive debates that reflect
the divisions on the subject among more conservative Christians: volume
54 (2002): 220-63 and volume 56 (2004): 266-98. Much of the technical
material is due to William Dembski and this will be referred to in the course
of the chapter.

This is not a private fight confined to one country but, since
it goes to the heart of what is true in both science and religion,
anyone may join in. Those who start as spectators may well
see more of the game (to change the metaphor) and so have
something to contribute.
The main thesis of the Intelligent Design movement runs
counter to the central argument of this book. Here I am arguing
that chance in the world should be seen as within the providence
of God. That is, chance is a necessary and desirable aspect of
natural and social processes which greatly enriches the poten-
tialities of the creation. Many, however, including Sproul,
Overman and Dembski, see things in exactly the opposite
way. To them, belief in the sovereignty of God requires that
God be in total control of every detail and that the presence
of chance rules out any possibility of design or of a Designer.
To such people, the fact that evolution by natural selection
involves chance in a fundamental way appears to rule out the
design and purpose without which the whole theistic edifice
is non-existent or, at best, is no more than a description of our
ignorance. It is allowed to have no ontological status at all.
The Intelligent Design movement is dedicated to showing
that the world, as we know it, simply could not have arisen
in the way that evolutionary theory claims. This is not to
say that evolution by natural selection could not have played
some part, but that it could have done so only in a secondary
manner. The broad picture could not, they argue, have come
about without the involvement of a Designer. In this chapter I
shall examine the claim that it can be rigorously demonstrated
that chance does not provide a sufficient explanation for what
has happened. I shall not be directly concerned with other
aspects of Dembski’s argument, in particular his claim that
information cannot be created by chance.
Essentially, there are two matters to be decided. Is the logic
of the argument which, it is claimed, leads to the ‘design’
conclusion valid and, if it is, are the probability calculations
which it requires correct? The logic is effectively that of
significance testing outlined in the last chapter. According to
the chief theoretician among the proponents of Intelligent
Design, William Dembski, the logic is an extension of Sir
Ronald Fisher’s theory of significance testing. Given Fisher’s
eminence as a statistician, it is appropriate that his voice should
be heard on a matter so close to the core of his subject. Prob-
ability calculations also come under the same heading, so I
shall examine how Dembski made his calculations.
William Dembski has single-mindedly pursued his goal of
establishing Intelligent Design as a credible alternative to
evolution in several major books and a host of other articles, books
and lectures. This publication trail started with The Design
Inference, in which he set out the basic logic of eliminating chance
as an explanation of how things developed. This was followed
by No Free Lunch and The Design Revolution (Dembski 1998,
2002 and 2004) [2]. The last of these is subtitled Answering the
Toughest Questions about Intelligent Design and is, perhaps,
the clearest exposition of his basic ideas for the non-technical
reader. He has also collaborated with Michael Ruse in editing
Debating Design: From Darwin to DNA (Dembski and Ruse
2004).

[2] In Zygon 34 (December 1999): 667-75, there was an essay review by Howard
J. van Till of both Dembski (1998) and Overman (1997). This is in substantial
agreement with the views expressed in this book. Van Till’s review was
followed by a rejoinder from Paul A. Nelson on pp. 677-82.

Much of Dembski’s argument is highly technical, and well
beyond the reach of anyone without a good preparation in
mathematics, probability theory and logic. This applies as
much to the material written for a general readership as to
the avowedly technical monograph which started it all off
(Dembski 1998). In one sense this is highly commendable,
because the clarity and rigour which can be attained by this
means offers, at least, the prospect of establishing the ideas
on a secure scientific foundation, so that the debate can take
place within the scientific community [3]. However, this fact
poses a serious dilemma for anyone who wishes to engage
with him. If the case against Intelligent Design is made at too
high a level, it will pass over the heads of many of those who
most need to question it. If it is too elementary, it will fail
to treat the opposition seriously enough. One must also bear
in mind the psychology of the readership. A highly technical
treatment can have two opposite effects. On the one hand
there is the tendency, on the part of some, to put undue trust
in mathematical arguments, thinking that anything which is
beyond their reach is also beyond question and almost certainly
correct! On the other hand, others may dismiss it instantly as
they dismiss all such material, on the grounds that what cannot
be expressed in simple everyday language can be ignored as
esoteric nonsense. Neither view is correct in this case. The
extensive theoretical treatment cannot be so easily dismissed,
but neither should it be swallowed whole.

[3] The two opposite reactions mentioned in this paragraph will be familiar to
anyone who, like the author, has attempted to explain technical (especially
mathematical) matters to lay audiences. As noted in the preface, the problem
is acute in a book such as this. It often is a case of being ‘damned if you do
and damned if you don’t’.

Dembski’s argument
To do justice to the subtleties of Dembski’s arguments we
shall have to take things fairly slowly, but the reader may be
grateful for a simple statement at the outset of the essence of
the situation.
[...] either enough time or enough space for some exceptionally
rare events to occur. Roughly speaking, Dembski claims to
be able to calculate the probability of the rarest event one
could have expected to happen ‘by chance’ somewhere at some
time. It is simply not reasonable to expect any chance event
with smaller probability to have occurred at all. Hence if we
can find existing biological entities, say, whose probability of
formation by chance is less than that critical bound, we can
infer that they could not have arisen by chance. Hence they
could not have arisen by evolution if that process is driven by
chance. Dembski claims that at least one such entity exists,
the bacterial flagellum, and that suffices to establish design
and, necessarily, a Designer.
It is important to notice that the ‘design’ conclusion is
reached by a process of elimination. According to Dembski
there are only two other possible explanations: natural law
or chance. Once these are eliminated, logic compels us to fall
back on design. It is not often noted that the entity must not
only be designed but also brought into being [4]. There must,
therefore, be a Maker as well as a Designer. Since natural law
can be regarded as a special, but degenerate, case of chance
the main thing is to eliminate chance. That is exactly what a
Fisherian significance test was designed to do.

[4] Howard van Till is an exception. Van Till distinguishes ‘the mind-like action
of designing from the hand-like action of actualising . . . what had first been
designed’ (2003, p. 128).

The elimination of chance
It can never be possible to eliminate the chance explanation
absolutely. The best we can do is to ensure that our probability
of being wrong, when we claim to have eliminated chance, is
so small that it can be neglected. Dembski believes that Fisher
essentially solved this problem but that his procedure had two
gaps which can be closed. When this is done, the way is clear
to reach the goal.
I shall begin by recapitulating the basic idea of a test of
significance, which starts from the assumption that chance is
the explanation and then seeks to demonstrate that what has
happened is not consistent with that hypothesis. This time,
however, I shall use what is, perhaps, the simplest possible
kind of example which still facilitates comparison with what
Dembski actually does. We have already seen this example,
John Arbuthnot’s on sex determination, in chapter 6.
Here, as he did, we suppose that the probability of a
female birth is exactly 0.5, independently of all other births.
In other words, it is just as if sex was determined by coin
tossing.
Suppose we have a sample of eight births. The first step
is to list all possible outcomes. One possibility is that they all
turn out to be male, which we might write as MMMMMMMM;
another would be MFMMFFMM, and so on. Altogether there
are 2^8 = 256 possibilities. The next step is to calculate the
probability of each outcome. Because of the simple assumptions we
have made they all have the same probability of 1/256. The
final step is to construct a rejection set such that any occurrence
in that set will lead us to reject the hypothesis. Since we
do not wish to reject the hypothesis when it is actually true, the
probability of falling in this set should be small. One possible
way to do this would be to make our rejection set consist of
the two outcomes MMMMMMMM and FFFFFFFF, that is: all
male or all female. It seems clear that if one of these goes in,
the other should too, because they represent equivalent departures
from what one would expect, namely roughly equal numbers
of males and females. The probability associated with this set
of two outcomes is 2/256 = 1/128, which is not particularly small but
it will serve for purposes of illustration if we treat it as ‘small’.
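The three steps (list the outcomes, attach a probability to each, form a rejection set) are easy to check by brute force. The following Python sketch does so under the stated assumption that each birth is independently male or female with probability 1/2; the variable names are invented for the illustration.

```python
from fractions import Fraction
from itertools import product

N_BIRTHS = 8

# Step 1: list every possible sequence of eight births, e.g. "MFMMFFMM".
outcomes = ["".join(seq) for seq in product("MF", repeat=N_BIRTHS)]

# Step 2: under the chance hypothesis every sequence is equally probable.
prob_each = Fraction(1, 2) ** N_BIRTHS

# Step 3: a rejection set made up of the all-male and all-female outcomes.
rejection_set = {"M" * N_BIRTHS, "F" * N_BIRTHS}
size = sum(prob_each for o in outcomes if o in rejection_set)

print(len(outcomes))  # 256
print(prob_each)      # 1/256
print(size)           # 1/128
```

Exact fractions are used rather than floating-point numbers so that the printed values match the figures in the text.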
If we now adopt the rule that we will reject the chance
hypothesis whenever we observe all males or all females in a set
of eight births, we shall wrongly reject the hypothesis one time
in 128. Is this a sensible procedure? It certainly ensures that
we shall rarely reject the chance hypothesis when it is, in fact,
true but that would be the case for any set of two outcomes we
might happen to select for the rejection set. What is required,
it seems, is a rejection set which both has small probability
and ‘catches’ those outcomes which are indicative of
non-randomness or, in Dembski’s terminology, design. At
this point there is some divergence between Dembski and the
traditional statistical approach, as represented by Fisher. It
will be instructive to look at these two approaches in turn.
The Fisherian would want to include in the rejection set
those outcomes which were ‘furthest’ from ‘chance’, in some
sense, that is, from what the hypothesis under test predicts. If
the probability of a male birth is really 0.5 we would expect
around four males in every eight births. An all-male or an
all-female outcome would be the most extreme and these would
be the first candidates for inclusion in the rejection set. Next
would come those with only one male or female, then those
with two, and so on. The process would stop when the probability
of the rejection set reached the small value we had chosen
as the risk we were prepared to run of wrongly rejecting
the chance hypothesis.
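The construction just described can be sketched as follows. The function name fisherian_rejection_counts is hypothetical, and one detail is an assumption of the sketch rather than of the text: because the probabilities come in discrete lumps, the loop stops just before the chosen risk would be exceeded instead of when it is reached exactly.

```python
from fractions import Fraction
from math import comb

def fisherian_rejection_counts(n, alpha):
    """Build a rejection set of male-counts for n births, assuming P(male) = 1/2.

    Counts are added most extreme first (k males together with k females)
    for as long as the total probability stays at or below alpha.
    Returns the sorted counts and the probability of the resulting set.
    """
    counts, size = [], Fraction(0)
    for k in range(n // 2 + 1):
        pair = {k, n - k}  # equally extreme in either direction
        p_pair = sum(Fraction(comb(n, j), 2 ** n) for j in pair)
        if size + p_pair > alpha:
            break
        counts.extend(pair)
        size += p_pair
    return sorted(counts), size

print(fisherian_rejection_counts(8, Fraction(1, 100)))
print(fisherian_rejection_counts(8, Fraction(1, 10)))
```

With a risk of 1 in 100 only the all-male and all-female samples are admitted (probability 1/128); with a risk of 1 in 10 the samples with a single male or a single female are added as well (probability 9/128).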
To the end of his life Fisher thought that the procedure just
described captured the essence of the way that scientists work,
though he strongly objected to the idea of rejection ‘rules’.
He preferred to quote the significance level, which was the
size of the smallest set which just included the observed sample.
Nevertheless, the distinction is not important for present
purposes. Others, perhaps most statisticians, came to think
that more explicit account should be taken of the alternative
hypotheses which one was aiming to detect. This was done
implicitly in the sex ratio example by constructing the rejection
region starting with those samples whose proportion of
males, and hence females, was furthest from 0.5. Neyman and
Pearson extended the theory to one which bears their name,
by arguing that the rejection set should be determined so as to
‘catch’ those outcomes which were indicative of the alternatives
envisaged. Thus, for example, if one were only interested
in detecting a tendency for males to be more common, then
one would only include outcomes where male births predominated.
It is in choosing the critical region (that is, the rejection set)
that Dembski’s aim is different. He is looking for outcomes which
show evidence of design, so his critical region needs to be made
up of those outcomes which bear evidence of being non-random. One
example of such an outcome would be MFMFMFMF. As the numbers of
male and female births are precisely equal, this outcome would not
be allocated to the critical region in the Fisherian approach.
This difference directs our attention to the fact that Dembski
is actually testing a different hypothesis. In our example, the
hypothesis concerned the value of the probability: whether
or not it was 0.5. The question we were asking was: are the outcomes
consistent with births being determined at random and
with equal probability, in other words, just as in coin tossing?
Dembski is not concerned with the value of the probability
but with the randomness, or otherwise, of the series. It is
not altogether clear from his writing whether Dembski has
noticed this distinction. He does, however, recognise that the
Fisherian scheme needs to be developed in two respects to meet
his needs. First he notes that one has to decide what counts
as ‘small’ in fixing the significance level. Dembski claims to
have an answer to this question and we shall return to it below.
The other point, which is germane to the discussion of how to
select the rejection region, is that Dembski wishes to eliminate
all chance hypotheses not, as in our example, just the one with
probability 0.5.
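The difference between testing the value of the probability and testing randomness can be made concrete with the alternating sequence mentioned above. In the sketch below the count of males stands for the Fisherian statistic, while the number of runs (blocks of identical letters) is used as one possible indicator of pattern. The runs statistic is my own choice for illustration and is not claimed to be Dembski's measure.

```python
from fractions import Fraction
from itertools import product

def n_males(seq):
    return seq.count("M")

def n_runs(seq):
    """Number of maximal blocks of identical letters, e.g. 'MMFM' has 3."""
    return 1 + sum(a != b for a, b in zip(seq, seq[1:]))

observed = "MFMFMFMF"
all_seqs = ["".join(s) for s in product("MF", repeat=len(observed))]

# Chance probability of a male count at least as far from 4 as the observed one.
far = abs(n_males(observed) - 4)
p_count = Fraction(sum(abs(n_males(s) - 4) >= far for s in all_seqs), len(all_seqs))

# Chance probability of at least as many runs as were observed.
p_runs = Fraction(sum(n_runs(s) >= n_runs(observed) for s in all_seqs), len(all_seqs))

print(n_males(observed), p_count)  # 4 1
print(n_runs(observed), p_runs)    # 8 1/128
```

Judged by the number of males the sequence is as unremarkable as it could be, yet judged by its runs it is as extreme as the all-male sample was in the earlier test, which is the sense in which the two approaches test different hypotheses.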
Although it is not totally clear to me how Dembski thinks
this should be handled, it is implicit in much of his discussion.
Essentially he wishes to include in the rejection set all those
outcomes which show unmistakable evidence of design. He
calls this property specified complexity. An interesting way of
approaching this is through the work of Gregory Chaitin and
his way of measuring non-randomness, which I have already
discussed in chapter 4. Outcomes exhibiting specified complexity
will score highly on a measure of non-randomness and
so will be candidates for inclusion. Of all possible outcomes it
is known that almost all of them appear to be random, so those
which show some pattern form a vanishingly small
set which must, inevitably, have small probability. Dembski’s
approach is slightly different in that he sometimes appears to
use ‘small probability’ as a proxy for ‘specified complexity’.
This is plausible if one thinks that anything which is designed is
bound to be virtually impossible to construct by chance alone
and hence must have an exceedingly small probability. Constructing
a rejection region by first including outcomes with
the smallest probabilities will thus ensure that we only reject
the chance hypothesis in favour of something which has specified
complexity. However, while it is true that any outcome
exhibiting specified complexity will have small probability,
the converse is not necessarily true.
All of these ideas are put to the test when we come to
consider particular examples and, for Dembski, that means the
bacterial flagellum. But first we must return to the question of
what is meant by ‘small’ in this context.

The universal probability bound: how small is small?
According to Dembski, one of the defects of Fisherian significance
testing is that it does not say what is meant by ‘small’
when choosing a significance level. He provides an answer to
this question in what he calls the universal probability bound.
The idea is very simple; the calculation less so. The idea is
that the universe is simply not old enough, or big enough, for
some events to have materialised anywhere at any time. To put
a figure on this requires a calculation of how many events of
specified complexity could have occurred. This leads to what
Dembski calls the universal probability bound of 1/10^150. I
shall not go into the details of his calculation but it depends,
for example, on the number of elementary particles in the
universe (10^80), the rate at which changes can take place, and
so on.
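For orientation only, the arithmetic usually quoted for this bound multiplies three round figures. Only the first, the number of elementary particles, appears in the text above; the other two (roughly 10^45 possible changes of state per second and roughly 10^25 seconds) are supplied here as assumptions drawn from Dembski's published account, not from the present discussion.

```latex
\[
  \underbrace{10^{80}}_{\text{particles}} \times
  \underbrace{10^{45}}_{\text{changes per second}} \times
  \underbrace{10^{25}}_{\text{seconds}} = 10^{150},
  \qquad\text{giving a bound of } \frac{1}{10^{150}} = 10^{-150}.
\]
```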
It is worth pausing to reflect on the extreme smallness of the
probability that I have just been talking about. The number
of elementary particles in the universe is, itself, unimaginably
large. It is difficult enough to imagine the number of stars in
the universe but this difficulty is compounded by the fact that
every single star is composed of a vast number of elementary
particles. Even when all these are added up we are still many
orders of magnitude short of the number 10^150. The only point
of raising this issue is to turn the spotlight onto the importance
[...] compare our calculated probability with some infinitesimally
small bound.
The calculation is not straightforward. Although we shall
not go into details, there are a number of pitfalls in making
such calculations, which Dembski avoids, even though he
has to invent a whole new terminology to express what he is
about. To begin with, Dembski finds it necessary to introduce
the notion of what he calls probabilistic resources. This has to
do with the fact that there may have been many opportunities
at many places for a particular event to occur. So the question
we ought to be asking is not whether that event occurs exactly
once, but at least once. Another difficulty is that there is simply
not always enough information to make an exact calculation. It
is sensible, however, to err on the safe side, so Dembski’s final
answer is not an exact figure but a lower bound.
This means that the true figure cannot be smaller than this
bound, but may be higher. So if, when we make a comparison
with another probability, that probability turns out to be
smaller than the bound, it will certainly be smaller than the true
figure.
There is one more important aspect of Dembski’s argument
to which we should pay attention. He proposes that the rejection
set should consist of the specifically complex outcomes. At
this point we run up against the fact that Dembski sometimes
appears to regard a rejection set as consisting of one outcome
but this is not strictly true. Just as he introduces the idea of
probabilistic resources so he introduces structurally complex
resources. The former allows for the fact that an event which
has only a very small probability of occurring at a particular
time and place will have a much larger probability if it can
occur at many places and times. Similarly, the latter allows for
the fact that there may not be just one structurally complex
outcome but a number. The probability of observing at least
one of them is, therefore, larger than that of exactly one. In
effect this means that we have a rejection set consisting of several
outcomes just as I supposed in describing the Fisherian
significance test.
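The effect of probabilistic resources can be illustrated with a short calculation. The sketch below is not Dembski’s own procedure; it simply assumes n independent opportunities, each with the same probability p, and the function name and the numbers are chosen purely for illustration.

```python
import math


def prob_at_least_once(p: float, n: float) -> float:
    """Probability that an event with per-opportunity probability p occurs
    at least once in n opportunities, ASSUMING the opportunities are
    independent and equally likely (an assumption made for this sketch)."""
    if not 0.0 <= p <= 1.0 or n < 0:
        raise ValueError("need 0 <= p <= 1 and n >= 0")
    if p == 1.0:
        return 1.0 if n > 0 else 0.0
    # 1 - (1 - p)**n, written with log1p/expm1 so that a very small p
    # is not lost to rounding.
    return -math.expm1(n * math.log1p(-p))


if __name__ == "__main__":
    p = 1e-6  # illustrative value only
    for n in (1, 1_000, 1_000_000):
        print(n, prob_at_least_once(p, n))
```

With many opportunities the probability of at least one occurrence can be far larger than p itself, which is why the number of opportunities has to be allowed for before any comparison with a bound is made.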
If we let this collection of specifically complex outcomes
constitute the rejection set, we will achieve two objectives at
once. First, since the set of specifically complex outcomes is
very small, its size (probability) will also be very small, thus
meeting the first requirement of a test of significance. Secondly,
if we reject the chance hypothesis whenever the outcome falls
in this set, we shall never make an error of Type II (false
negative). This is because every element in the rejection set is
certainly indicative of design, by definition. That is, we may
ascribe design to the outcome in such cases without any risk
of making a mistake. This is exactly what Dembski wishes to
achieve.
A moment’s reflection will show that there is something a
little odd about this line of reasoning. It says that we should
reject the chance hypothesis whenever the outcome exhibits
specific complexity. In doing so, we shall certainly be correct
if design is, in fact, present and our chance of wrongly
rejecting the chance hypothesis will be very small (the size
of the rejection set). However, one may legitimately ask why
we need all this technical apparatus if we know already that
certain outcomes exhibit design. The conclusion is, indeed, a
tautology. It says that if something bears unmistakable evidence
of design, then it has been designed! The nature of what
Dembski is doing, and its absurdity, will be even more obvious
when we set it in the context of what he calls ‘comparative’
methods below. First, I digress to point out the other flaw in
Dembski’s argument.

the probability of the bacterial flagellum
Although Dembski spends a great deal of time developing a
version of Fisherian significance testing designed to eliminate
chance, the main application is to one particular case where
the theory is not much in evidence. This concerns a remarkable
biological structure attached to the bacterium Escherichia
coli,5 which drives it in the manner of a propeller. The question
is whether this construction could have been assembled
by chance or whether its presence must be attributed to design.
Dembski makes a rough calculation of the probability that this
structure could have come about by chance and arrives at the
exceedingly small value of 1/10^263. How on earth, one may
wonder, could anyone ever arrive at such a seemingly precise
figure? Inevitably there have to be some approximations
along the way, but he chooses them so as to err on the safe side.
However, there is no need to stay on the details because the
whole enterprise is seriously flawed. Howard van Till (2003)
has put his finger on the source of the problem. His criticism
is that Dembski’s probability calculation in no way relates
to the way in which the flagellum might conceivably have
been formed. Dembski treats it as what van Till calls a discrete
combinatorial object. Essentially, Dembski counts the number
of ways in which the ingredients of the flagellum could be
brought together and assembled into a structure. The bland,
and false, assumption that all of these structures are equally
likely to have arisen then yields the probability.

5 The case of the bacterial flagellum dominates the literature, almost as though
it were the only sufficiently complicated biological organism. Later in this
paragraph we come to van Till’s discussion of its probability, which was the
main purpose of the paper quoted in note 4 above. In strict logic, of course,
only one case is needed to establish the conclusion that some things are too
complicated to have evolved.

It is difficult to understand how such an elementary mistake
can have been made by someone so mathematically sophisticated.
Possibly it stems from confusion about what is meant
by ‘pure chance’. There are many examples in the literature of
similar combinatorial calculations which purport to show that
the probability of such things as the origin of life must have
been exceedingly small. This has already been noted in chapter 6
in relation to the work of du Noüy and to Hoyle and Wickramasinghe,
supposed that such complicated entities can be assembled as a
result of some cosmic shuffling system. Indeed, the main point
of Dawkins’ book Climbing Mount Improbable (Dawkins 2006
[1996]) is to demonstrate that complicated structures which it
would be virtually impossible to assemble as discrete combinatorial
objects could be constructed in a series of small steps
which, taken together, might have a much larger probability
(see ‘Chance in evolution’ in chapter 11, below). According
to evolutionary theory the growth in complexity would have
taken place sequentially over immense periods of time. What is
needed is a model of how this might have happened before we
can begin to make any meaningful calculations. To produce an
argument, as Dembski does, that the flagellum could not have
been formed by an ‘all-at-once’ coming together and random
assembly of the ingredients is hardly more than a statement of
the blindingly obvious. The inference that Dembski wishes to
make thus fails, even if his universal probability is accepted.
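The contrast between an ‘all-at-once’ assembly and a sequence of small steps can be made concrete with a deliberately crude calculation. It is not a model of how the flagellum arose, which is exactly what the argument above says is still needed. It assumes k components, each obtained independently with probability p per trial, and, in the stepwise case, that a component once obtained is retained. All names and numbers are hypothetical.

```python
def expected_trials_all_at_once(p: float, k: int) -> float:
    """Mean number of trials until all k components appear in the SAME trial.
    Assumes independent components, each with probability p per trial."""
    # A single trial succeeds with probability p**k; the waiting time is
    # geometric, so its mean is 1 / p**k.
    return 1.0 / (p ** k)


def expected_trials_stepwise(p: float, k: int) -> float:
    """Mean number of trials when components are acquired one after another
    and each is retained once obtained (a strong simplifying assumption)."""
    # k successive geometric waits, each with mean 1 / p.
    return k / p


if __name__ == "__main__":
    p, k = 0.1, 20  # illustrative values only
    print("all at once:", expected_trials_all_at_once(p, k))
    print("stepwise:   ", expected_trials_stepwise(p, k))
```

Even in this toy setting the two waiting times differ by many orders of magnitude, so a probability computed for the all-at-once case says little about a sequential process.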

the paradox
We now return to the logic of Dembski’s argument. Because
the fallacy is so fundamental, I shall repeat what was said
above but in a slightly different way.
Dembski has always seen his approach as standing squarely
in the Fisherian tradition, in which no account needs to be
taken of alternative hypotheses. At first sight this seems to
be a reasonable position to adopt, because any alternative
hypothesis would have to be specified probabilistically and it
is the express purpose of the exercise to eliminate all chance
hypotheses. It is thus somewhat ironic that Dembski’s logic
can be set out quite simply within the framework of the
Neyman–Pearson approach to inference. The clarity which
we gain thereby also serves to underline the essentially tautological
character of the formalism.
Let us think of a situation, like the coin-tossing exercise,
in which there are very many possible outcomes, each having
very small probability (in Dembski’s terminology these are
complex). Some of these outcomes will be what Dembski calls
specifically complex. These outcomes exhibit some kind of
pattern which bears the hallmark of design – let us leave aside
for the moment the question of whether or not ‘pattern’ can
be adequately defined. The essence of the Neyman–Pearson
approach to statistical inference is to choose the rejection set
to include those outcomes which are most likely to have arisen
under some alternative hypothesis. In this case the alternative
is that the outcomes are the result of design. The characteristic
of a designed outcome is that it exhibits specified complexity.
The rejection set should therefore consist of all those outcomes
which exhibit specified complexity.
Now let us consider the consequences of what we have
done. The likelihood of wrongly rejecting the chance hypothesis
is very small because specified outcomes have very small
probability. The probability of correctly rejecting the chance
hypothesis is one (that is, certain) because all outcomes in the
rejection set are certainly the result of design (that is why they
were selected). In other words, we have maximised the chance
of detecting design when it is present. We thus seem to have
a foolproof method of detecting design whose logic has been
made clearer by setting it in the Neyman–Pearson framework
(which Dembski seems to be hardly aware of). So where is the
catch? The problem is that, in order to construct the rejection
set, we have to be able to identify those outcomes which are
the result of design. If we know that already, why do we need
the test in the first place?
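The way the Neyman–Pearson approach builds a rejection set can be shown on a small coin-tossing example. The sketch assumes, purely for illustration, an alternative hypothesis that is specified probabilistically (a coin biased towards heads), which is precisely what the design alternative is not. The function name and the values of n, p_alt and alpha are assumptions made for the example.

```python
from itertools import product


def rejection_set(n: int, p_alt: float, alpha: float):
    """Build a rejection set for n tosses of a coin.

    Null hypothesis: fair coin. Hypothetical alternative: P(heads) = p_alt.
    Outcomes are ranked by likelihood ratio (alternative / null) and added
    while the total null probability (the size) stays at or below alpha.
    Returns (outcomes in the set, size, power)."""
    null_p = 0.5 ** n  # every sequence is equally likely under the null
    ranked = []
    for seq in product("HT", repeat=n):
        heads = seq.count("H")
        alt_p = p_alt ** heads * (1.0 - p_alt) ** (n - heads)
        ranked.append((alt_p / null_p, "".join(seq), alt_p))
    ranked.sort(key=lambda item: item[0], reverse=True)

    chosen, size, power = [], 0.0, 0.0
    for _ratio, seq, alt_p in ranked:
        if size + null_p > alpha:
            break
        chosen.append(seq)
        size += null_p   # probability of wrongly rejecting the null
        power += alt_p   # probability of correctly rejecting it
    return chosen, size, power


if __name__ == "__main__":
    chosen, size, power = rejection_set(n=10, p_alt=0.9, alpha=0.011)
    print(len(chosen), size, power)
```

Here the set is derived from a stated alternative rather than assumed in advance. In the argument above, by contrast, membership of the rejection set already presupposes that the designed outcomes have been identified.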
One possible response is to say that we only identify design
indirectly through the very small probability which it assigns
should be formed of those outcomes which have the smallest
probabilities and leave, in particular, those which are less than
the universal probability bound. In that case we are entitled to
ask why we need the formalism at all. If the rule to follow is to
reject the chance hypothesis whenever the outcome observed
has probability that is so small that it could not have arisen in
a universe as large or old as the one we inhabit, is that not a
sufficient ground of itself?

dembski’s criticisms of comparative methods
Dembski is highly critical of what he calls comparative methods
and his reasons are set out in chapter 33 of Dembski
(2004).6 A comparative method is any method which involves
the comparison of the chance hypothesis with an alternative.
Such a method involves selecting one from several possibilities
and is thus concerned with the relative rather than the
absolute credibility of hypotheses. At first sight this is a surprising
position to take because there clearly is an alternative
in mind – that the complexity we observe is the work of a
Designer. However, this alternative clearly has a different
status in Dembski’s mind, presumably because it is not specified
probabilistically. There are three comparative methods
in common use which I have already reviewed in chapter 6.
The first is the Neyman–Pearson approach, which uses the
alternative hypotheses to select the rejection set; the second
is the likelihood method, or inference to the best explanation
approach; and the third is the Bayesian method, in which the
alternatives have specified prior probabilities. Dembski seems
hardly aware of the first two approaches and concentrates his
fire on the Bayesian threat. Possibly this is because his own
treatment has been challenged from that quarter and this is
where much current philosophical interest in uncertain inference
lies.

6 Dembski mentions (2004, p. 242) a conference at Calvin College in May
2001 on Design Reasoning, at which he spoke. Timothy and Linda McGrew
and Robin Collins are reported as putting Bayesian arguments. In particular
these critics objected to the notion of specification.

I agree with Dembski’s strictures on the use of Bayesian
inference in this particular context primarily because the introduction
of prior probabilities makes the choice too subjective.
In an important sense it begs the question because we have to
decide, in advance, the strength of our prior belief in the existence,
or not, of a designer. Bayesian inference tells us how our
beliefs should be changed by evidence, not how they should be
formed in the first place. What Dembski seems to have overlooked
is that his method is, in fact, a comparative method and
that it can be seen as such by setting it within the framework
of the Neyman–Pearson theory as demonstrated in the last
section. By viewing it in that way we saw that its tautological
character was made clear and hence the whole inferential edifice
collapses. Given, in addition, that Dembski’s probability
calculation is certainly incorrect I conclude that Intelligent
Design has not been established scientifically.
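The dependence of a Bayesian conclusion on the prior, noted earlier in this section, can be seen from Bayes’ theorem applied to two hypotheses. In the sketch below the likelihoods and the priors are invented numbers, used only to show how the same evidence leads to very different posterior probabilities; nothing here is taken from Dembski or his critics.

```python
def posterior(prior: float, like_h: float, like_other: float) -> float:
    """Posterior probability of a hypothesis H against a single rival.

    prior      -- prior probability of H (the rival gets 1 - prior)
    like_h     -- probability of the evidence if H is true
    like_other -- probability of the evidence if the rival is true"""
    numerator = prior * like_h
    return numerator / (numerator + (1.0 - prior) * like_other)


if __name__ == "__main__":
    like_h, like_other = 0.9, 0.001   # hypothetical likelihoods
    for prior in (1e-6, 0.01, 0.5):    # hypothetical priors
        print(prior, posterior(prior, like_h, like_other))
```

The evidence is held fixed throughout; only the starting belief changes, and with it the conclusion.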

is intelligent design science?
Much of the debate over Intelligent Design has not been on
its statistical underpinning but on the more general question
of whether or not it is science. This usually turns on whether
it makes empirical statements which can be tested empirically.
Although this is true, in my view it approaches the
problem from the wrong direction. To make my point it is
important to distinguish ‘science’ from ‘scientific method’.
Scientific method is the means by which science as a body
of knowledge is built up. Dembski has proposed a method by
which, he claims, knowledge is validly acquired. The question
then is: is this method valid? That is, does it yield verifiable
facts about the real world? As I noted at the outset, two questions
have to be answered: is the logic sound and is the method
correctly applied? To the first question my answer is that the
logic is not sound, because the extension proposed to Fisherian
significance testing is not adequate in itself and also because
almost all statisticians find the original Fisher method incom-

