. 5
( 5)

to see that we will actually need 0 = 0, for otherwise each of the polynomials n ak x k
vanishes at x = 0 (and so has distance at least 1 from the constant 1 function, for example).
If the n's are integers, it's also clear that we'll have to have n ! 1 as n ! 1. But
what else is needed? The answer comes to us from Muntz in 1914. (You sometimes see
the name Otto Szasz associated with Muntz's theorem, because Szasz proved a similar
theorem at nearly the same time (1916).)
Theorem. Let 0 < 1< 2< . Then, the functions (x n ) have dense linear span
in C 0 1 ] if and only if 0 = 0 and 1 ;1 = 1.
n=1 n

What Muntz is trying to tell us here is that the n's can't get big too quickly. In
Pn a xk2 are evidently not dense in C 0 1 ]. On
particular, the polynomials of the form k=0 k
the other hand, the n's don't have to be unbounded indeed, Muntz's theorem implies an
earlier result of Bernstein from 1912: If 0 < 1 < 2 < < K (some constant), then
1 x 1 x 2 : : : have dense linear span in C 0 1 ].
Before we give the proof of Muntz's theorem, let's invent a bit of notation: We write
(X )
ak x k : a0 : : : an 2 R
Xn =
and, given f 2 C 0 1 ], we write dist(f Xn ) to denote the distance from f to the space
S1 X . That is, X is the linear span of
spanned by 1 x 1 : : : x n . Let's also write X =
n=0 n
the entire sequence (x n )1 . The question here is whether X is dense, and we'll address
the problem by determining whether dist(f Xn ) ! 0, as n ! 1, for every f 2 C 0 1 ].
If we can show that each ( xed) power xm can be uniformly approximated by a linear
combination of x n 's, then the Weierstrass theorem will tell us that X is dense in C 0 1 ].
Muntz Theorems 140
(How?) Surprisingly, the numbers dist(xm Xn) can be estimated. Our proof won't give
the best estimate, but it will show how the condition 1 ;1 = 1 comes into the
n=1 n
1; m .
Lemma. Let m > 0. Then, dist(xm Xn )
Proof. We may certainly assume that m 6= n for any n. Given this, we inductively
de ne a sequence of functions by setting P0(x) = xm and
t;1; n Pn;1(t) dt
Pn(x) = ( n ; m) x n
for n 1. For example,
Z1 1
t;1; 1 tm dt = ;x 1 tm;
P1(x) = ( 1 ; m) x = xm ; x 1 :
1 1
By induction, each Pn is of the form xm ; n ak x k for some scalars (ak ):
t;1; n Pn;1(t) dt
Pn (x) = ( n ; m) x n
" #
Z1 X
t;1; tm ;
= ( n ; m) x ak t dt
n n k
x k=0
n;1 ak
xm ; x n + ( n ; m) (x k ; x n ):
k=0 n ; k
Finally, kP0k = 1 and kPnk j1 ; m jkPn;1k, because
t;1; n dt = j n ; mj (1 ; x n ) 1; m :
j n ; mj x n
n n
1; m :
dist(xm Xn ) kPnk

The preceding result is due to v. Golitschek. A slightly better estimate, also due to
Qn jm; k j .
v. Golitschek (1970), is dist(xm Xn) k=1 m+ k
Now a well-known fact about in nite products is that, for positive ak 's, the product
Q1 1 ; ak diverges (to 0) if and only if the series P1 ak diverges (to 1) if and only
k=1 k=1
Muntz Theorems 141
if the product 1 1 + ak diverges (to 1). In particular, n 1 ; m ! 0 if and only
Pn 1 ! 1.k=1 is, dist(xm X ) ! 0 if and only if P1k=1 1 = 1. This proves the
if k=1 k That n k=1 k
\backward" direction of Muntz's theorem.
We'll prove the \forward" direction of Muntz's theorem by proving a version of Muntz's
theorem for the space L2 0 1 ]. For our purposes, L2 0 1 ] denotes the space C 0 1 ]
endowed with the norm Z1 1=2
kfk2 = jf(x)j2 dx
although our results are equally valid in the \real" space L2 0 1 ] (consisting of square-
integrable, Lebegue measurable functions). In the latter case, we no longer need to assume
that 0 = 0, but we do need to assume that each n > ;1=2 (in order that x2 n be
integrable on 0 1 ]).
Remarkably, the distance from f to the span of x 0 x 1 : : : x n can be computed
exactly in the L2 norm. For this we'll need some more notation: Given linearly independent
vectors f1 : : : fn in an inner product space, we call
h f1 f1 i h f1 fn i
. .
... = det hfi fj i i j
G(f1 : : : fn ) = . .
. .
h f n f1 i h fn fn i
the Gram determinant of the fk 's.
Lemma. (Gram) Let F be a nite dimensional subspace of an inner product space V ,
and let g 2 V n F. Then, the distance d from g to F is given by
d 2 = G(g f1: :: :: : ffn)
G(f1 n)
where f1 : : : fn is any basis for F.
Pn a f be the best approximation to g out of F. Then, since g ; f
Proof. Let f = i=1 i i
is orthogonal to F, we have, in particular, h fj g i = h fj g i for all j that is,
ai h fj fi i = hfj g i j = 1 : : : n: ()
Since this system of equations always has a unique solution a1 : : : an, we must have
G(f1 : : : fn) 6= 0 (and so the formula in our Lemma at least makes sense).
Muntz Theorems 142
Next, notice that

d 2 = hg ; f g ; f i = h g ; f g i = h g g i ; hg f i
in other words,
aih g fi i = hg g i:
d2 + ()
Now consider ( ) and ( ) as a system of n+1 equations in the n+1 unknowns a1 : : : an,
and d 2 in matrix form we have
2 1 h g f1 i h g fn i 3 2 d 2 3 2 hg g i 3
6 0 h f1 f1 i h f1 fn i 7 6 a1 7 = 6 hf1 g i 7 :
6. 76 . 7 6 . 7
6 . .. 76 . 7 6 . 7
. .
4. 54 . 5 4 . 5
. .
. . .
0 h fn f1 i hfn fn i an hfn g i
Solving for d 2 using Cramer's rule gives the desired result expanding along the rst
column shows that the matrix of coe cients has determinant G(f1 : : : fn ), while the
matrix obtained by replacing the \d column" by the right-hand side has determinant
G(g f1 : : : fn).
Note: By our last Lemma and induction, every Gram determinant is positive!
In what follows, we will still use Xn to denote the span of x 0 : : : x n , but now we'll
write dist 2 (f Xn) to denote the distance from f to Xn in the L2 norm.
Theorem. Let m, > ;1=2 for k = 0 1 2 : : :. Then,
Y jm ; k j
dist 2(xm Xn ) = p :
2m + 1 k=0 m + k + 1

Proof. The proof is based on a determinant formula due to Cauchy:
1 1
a1 +b1 a1 +bn
. .
. . (ai ; aj )(bi ; bj ):
(ai + bj ) =
. .
ij i>j
1 1
an +b1 an +bn
If we consider each of the ai 's and bj 's as \variables," then each side of the equation is a
polynomial in a1 : : : an b1 : : : bn. (Why?) Now the right-hand side clearly vanishes if
Muntz Theorems 143
ai = aj or bi = bj for some i 6= j, but the left-hand side also vanishes in any of these cases.
Thus, the right-hand side divides the left-hand side. But both polynomials have degree
n ; 1 in each of the ai 's and bj 's. (Why?) Thus, the left-hand side is a constant multiple
of the right-hand side. To show that the constant must be 1, write the left-hand side as
a1 +b1 a1 +b1
1 a1 +b2 a1 +bn
Y a2 +b2 a2 +b2
(ai + bj ) a2+b1 a2 +bn
. ..
. .
an +bn an +bn 1
an +b1 an +bn;1
and now take the limit as b1 ! ;a1 , b2 ! ;a2, etc. The expression above tends to
Q (ai ; aj ), as does the right-hand side of Cauchy's formula.
Now, h xp xq i = 01 xp+q dx = p+q+1 for p, q > ;1=2, so
! Q(;
j )2
1 i
Q i>j +
G(x : : : x n) = det + j + 1 i j = i j i j + 1)
with a similar formula holding for G(xm x 0 : : : x n ). Substituting these expressions into
our distance formula and taking square roots nishes the proof.
Now we can determine exactly when X is dense in L2 0 1 ]. For easier comparison to
the C 0 1 ] case, we suppose that the n 's are nonnegative.
Theorem. Let 0 < 1 < 2 < . Then, the functions (x n ) have dense linear span
in L2 0 1 ] if and only if 1 ;1 = 1.
n=1 n
P1 Q Q
n n (m+1)
n=1 n < 1, then each of the products k=1 1 ; k and k=1 1 + k
Proof. If
converges to some nonzero limit for any m not equal to any k . Thus, dist 2(xm Xn) 6! 0,
as n ! 1, for any m 6= k , k = 0 1 2 : : :. In particular, the functions (x n ) cannot have
dense linear span in L2 0 1 ].
Conversely, if 1 1n = 1, then n 1 ; m diverges to 0 while n 1 + (m+1)
n=1 k=1 k=1
k k
diverges to +1. Thus, dist 2(xm Xn) ! 0, as n ! 1, for every m > ;1=2. Since the
polynomials are dense in L2 0 1 ], this nishes the proof.
Finally, we can nish the proof of Muntz's theorem in the case of C 0 1 ]. Suppose
that the functions (x n ) have dense linear span in C 0 1 ]. Then, since kfk2 kfk, it
Muntz Theorems 144
follows that the functions (x n ) must also have dense linear span in L2 0 1 ]. (Why?)
Hence, 1 1n = 1.
Just for good measure, here's a second proof of the \backward" direction for C 0 1 ]
based on the L2 0 1 ] version. Suppose that 1 1n = 1, and let m 1. Then,

1 Z x tm;1 dt ; X ak Z x t k ;1 dt
n n
xm ; ak x =m
k=0 k 0
Z1 1 tm;1 ; X ak t k;1 dt
m k=0 k
0Z 11=2
1 tm;1 ; X ak
@ dtA :
t k;1
m k=0 k
Now the functions (x k ;1) have dense linear span in L2 0 1 ] because n>1 n1;1 = 1.
Thus, we can nd ak 's so that the right-hand side of this inequality is less than some ".
Since this estimate is independent of x, we've shown that
max xm ; ak x < ":
0x1 k=0
Application. Let 0 = ;1 = 1, and let f be a continuous
0< 1< 2< with n=1 n
function on 0 1) for which c = t!1 f(t) exists. Then, f can be uniformly approximated
by nite linear combinations of the exponentials (e; nt )1 .

Proof. The function g(x) = f(; log x), for 0 < x 1, and g(0) = c, is continuous on
0 1 ]. In other words, g(e;t) = f(t) for each 0 t < 1. Thus, given " > 0, we can nd
n and a0 : : : an such that
n n
ak e; kt
max g(x) ; = max f(t) ;
ak x < ":
0x1 0 t<1
k=0 k=0
The Stone-Weierstrass Theorem
Math 682

To begin, an algebra is a vector space A on which there is a multiplication (f g) 7! fg
(from A A into A) satisfying
(i) (fg)h = f(gh), for all f, g, h 2 A
(ii) f(g + h) = fg + fh and (f + g)h = fg + gh, for all f, g, h 2 A
(iii) (fg) = ( f)g = f( g), for all scalars and all f, g 2 A.
In other words, an algebra is a ring under vector addition and multiplication, together
with a compatible scalar multiplication. The algebra is commutative if
(iv) fg = gf, for all f, g 2 A.
And we say that A has an identity element if there is a vector e 2 A such that
(v) fe = ef = f, for all f 2 A.
In case A is a normed vector space, we also require that the norm satisfy
(vi) kfgk kfkkgk
(this simpli es things a bit), and in this case we refer to A as a normed algebra. If a
normed algebra is complete, we refer to it as a Banach algebra. Finally, a subset B of an
algebra A is called a subalgebra (of A) if B is itself an algebra (under the same operations)
that is, if B is a (vector) subspace of A which is closed under multiplication.
If A is a normed algebra, then all of the various operations on A (or A A) are
continuous. For example, since
kfg ; hkk = kfg ; fk + fk ; hkk kfk kg ; kk + kkk kf ; hk
it follows that multiplication is continuous. (How?) In particular, if B is a subspace (or
subalgebra) of A, then B, the closure of B, is also a subspace (or subalgebra) of A.
1. If we de ne multiplication of vectors \coordinatewise," then Rn is a commutative
Banach algebra with identity (the vector (1 : : : 1)) when equipped with the norm
kxk1 = max jxi j.
Stone-Weierstrass 146
2. It's not hard to identify the subalgebras of Rn among its subspaces. For example, the
subalgebras of R2 are f(x 0) : x 2 Rg, f(0 y) : y 2 Rg, and f(x x) : x 2 Rg, along
with f(0 0)g and R2.
3. Given a set X, we write B(X) for the space of all bounded, real-valued functions on
X. If we endow B(X) with the sup norm, and if we de ne arithmetic with functions
pointwise, then B(X) is a commutative Banach algebra with identity (the constant
1 function). The constant functions in B(X) form a subalgebra isomorphic (in every
sense of the word) to R.
4. If X is a metric (or topological) space, then we may consider C(X), the space of all
continuous, real-valued functions on X. If we again de ne arithmetic with functions
pointwise, then C(X) is a commutative algebra with identity (the constant 1 function).
The bounded, continuous functions on X, written Cb(X) = C(X) \ B(X), form a
closed subalgebra of B(X). If X is compact, then Cb(X) = C(X). In other words,
if X is compact, then C(X) is itself a closed subalgebra of B(X) and, in particular,
C(X) is a Banach algebra with identity.
5. The polynomials form a dense subalgebra of C a b ]. The trig polynomials form a
dense subalgebra of C 2 . These two sentences summarize Weierstrass's two classical
theorems in modern parlance and form the basis for Stone's version of the theorem.

Using this new language, we may restate the classical Weierstrass theorem to read:
If a subalgebra A of C a b ] contains the functions e(x) = 1 and f(x) = x, then A is
dense in C a b ]. Any subalgebra of C a b ] containing 1 and x actually contains all the
polynomials thus, our restatement of Weierstrass's theorem amounts to the observation
that any subalgebra containing a dense set is itself dense in C a b ].
Our goal in this section is to prove an analogue of this new version of the Weierstrass
theorem for subalgebras of C(X), where X is a compact metric space. In particular, we
will want to extract the essence of the functions 1 and x from this statement. That is, we
seek conditions on a subalgebra A of C(X) that will force A to be dense in C(X). The
key role played by 1 and x, in the case of C a b ], is that a subalgebra containing these
two functions must actually contain a much larger set of functions. But since we can't
be assured of anything remotely like polynomials living in the more general C(X) spaces,
Stone-Weierstrass 147
we might want to change our point of view. What we really need is some requirement on
a subalgebra A of C(X) that will allow us to construct a wide variety of functions in A.
And, if A contains a su ciently rich variety of functions, it might just be possible to show
that A is dense.
Since the two replacement conditions we have in mind make sense in any collection of
real-valued functions, we state them in some generality.
Let A be a collection of real-valued functions on some set X. We say that A separates
points in X if, given x 6= y 2 X, there is some f 2 A such that f(x) 6= f(y). We say that
A vanishes at no point of X if, given x 2 X, there is some f 2 A such that f(x) 6= 0.

6. The single function f(x) = x clearly separates points in a b ], and the function e(x) =
1 obviously vanishes at no point in a b ]. Any subalgebra A of C a b ] containing
these two functions will likewise separate points and vanish at no point in a b ].
7. The set E of even functions in C ;1 1 ] fails to separate points in ;1 1 ] indeed,
f(x) = f(;x) for any even function. However, since the constant functions are even,
E vanishes at no point of ;1 1 ]. It's not hard to see that E is a proper closed
subalgebra of C ;1 1 ]. The set of odd functions will separate points (since f(x) = x
is odd), but the odd functions all vanish at 0. The set of odd functions is a proper
closed subspace of C ;1 1 ], although not a subalgebra.
8. The set of all functions f 2 C ;1 1 ] for which f(0) = 0 is a proper closed subalgebra
of C ;1 1 ]. In fact, this set is a maximal (in the sense of containment) proper closed
subalgebra of C ;1 1 ]. Note, however, that this set of functions does separate points
in ;1 1 ] (again, because it contains f(x) = x).
9. It's easy to construct examples of non-trivial closed subalgebras of C(X). Indeed,
given any closed subset X0 of X, the set A(X0 ) = ff 2 C(X) : f vanishes on X0g is
a non-empty, proper subalgebra of C(X). It's closed in any reasonable topology on
C(X) because it's closed under pointwise limits. Subalgebras of the type A(X0 ) are
of interest because they're actually ideals in the ring C(X). That is, if f 2 C(X),
and if g 2 A(X0 ), then fg 2 A(X0 ).
Stone-Weierstrass 148
As these few examples illustrate, neither of our new conditions, taken separately, is
enough to force a subalgebra of C(X) to be dense. But both conditions together turn
out to be su cient. In order to better appreciate the utility of these new conditions, let's
isolate the key computational tool that they permit within an algebra of functions.
Lemma. Let A be an algebra of real-valued functions on some set X, and suppose that
A separates points in X and vanishes at no point of X. Then, given x = y 2 X and a,
b 2 R, we can nd an f 2 A with f(x) = a and f(y) = b.
Proof. Given any pair of distinct points x 6= y 2 X, the set A = f f(x) f(y) : f 2 Ag
is a subalgebra of R2. If A separates points in X, then A is evidently neither f(0 0)g nor
f(x x) : x 2 Rg. If A vanishes at no point, then f(x 0) : x 2 Rg and f(0 y) : y 2 Rg are
both excluded. Thus A = R2. That is, for any a, b 2 R, there is some f 2 A for which
(f(x) f(y)) = (a b).
Now we can state Stone's version of the Weierstrass theorem (for compact metric
spaces). It should be pointed out that the theorem, as stated, also holds in C(X) when
X is a compact Hausdor topological space (with the same proof), but does not hold for
algebras of complex-valued functions over C . More on this later.
Stone-Weierstrass Theorem. (real scalars) Let X be a compact metric space, and let
A be a subalgebra of C(X). If A separates points in X and vanishes at no point of X,
then A is dense in C(X).
What Cheney calls an \embryonic" version of this theorem appeared in 1937, as a small
part of a massive 106 page paper! Later versions, appearing in 1948 and 1962, bene tted
from the work of the great Japanese mathematician Kakutani and were somewhat more
palatable to the general mathematical public. But, no matter which version you consult,
you'll nd them di cult to read. For more details, I would recommend you rst consult
Folland's Real Analysis, or Simmons's Topology and Modern Analysis.
As a rst step in attacking the proof of Stone's theorem, notice that if A satis es the
conditions of the theorem, then so does its closure A. (Why?) Thus, we may assume that
A is actually a closed subalgebra of C(X) and prove, instead, that A = C(X). Now the
closed subalgebras of C(X) inherit more structure than you might rst imagine.
Stone-Weierstrass 149
Theorem. If A is a subalgebra of C(X), and if f 2 A, then jfj 2 A. Consequently, A is
a sublattice of C(X).
jtj on the interval ;kfk kfk . By the
Proof. Let " > 0, and consider the function
Weierstrass theorem, there is a polynomial p(t) = n ak tk such that jtj; p(t) < " for
all jtj kfk. In particular, notice that jp(0)j = ja0 j < ".
Now, since jf(x)j kfk for all x 2 X, it follows that jf(x)j ; p(f(x)) < " for all
x 2 X. But p(f(x)) = (p(f))(x), where p(f) = a0 1 + a1f + + an f n , and the function
g = a1 f + + an f n 2 A, since A is an algebra. Thus, jf(x)j ; g(x) ja0j + " < 2"
for all x 2 X. In other words, for each " > 0, we can supply an element g 2 A such that
k jfj ; gk < 2". That is, jfj 2 A.
The statement that A is a sublattice of C(X) means that if we're given f, g 2 A, then
maxff gg 2 A and minff gg 2 A, too. But this is actually just a statement about real
numbers. Indeed, since

2 maxfa bg = a + b + ja ; bj 2 minfa bg = a + b ; ja ; bj

it follows that a subspace of C(X) is a sublattice precisely when it contains the absolute
values of all its elements.
The point to our last result is that if we're given a closed subalgebra A of C(X), then
A is \closed" in every sense of the word: Sums, products, absolute values, max's, and
min's of elements from A, and even limits of sequences of these, are all back in A. This is
precisely the sort of freedom we'll need if we hope to show that A = C(X).
Please notice that we could have avoided our appeal to the Weierstrass theorem in this
last result. Indeed, we really only need to supply polynomial approximations for the single
function jxj on ;1 1 ], and this can be done directly. For example, we could appeal instead
p1 ; (1 ; x2 ). The resulting series can be shown
to the binomial theorem, using jxj =
to converge uniformly on ;1 1 ]. By side-stepping the classical Weierstrass theorem, it
becomes a corollary to Stone's version (rather than the other way around).
Now we're ready for the proof of the Stone-Weierstrass theorem. As we've already
pointed out, we may assume that we're given a closed subalgebra (subspace, and sublattice)
Stone-Weierstrass 150
A of C(X) and we want to show that A = C(X). We'll break the remainder of the proof
into two steps:
Step 1: Given f 2 C(X), x 2 X, and " > 0, there is an element gx 2 A with gx(x) = f(x)
and gx(y) > f(y) ; " for all y 2 X.
From our \computational" Lemma, we know that for each y 2 X, y 6= x, we can nd
an hy 2 A so that hy (x) = f(x) and hy (y) = f(y). Since hy ;f is continuous and vanishes
at both x and y, the set Uy = ft 2 X : hy (t) > f(t);"g is open and contains both x and y.
Thus, the sets (Uy )y6=x form an open cover for X. Since X is compact, nitely many Uy 's
Uyn . Now set gx = maxfhy1 : : : hyn g. Because A is a lattice,
su ce, say X = Uy1
we have gx 2 A. Note that gx(x) = f(x) since each hyi agrees with f at x. And gx > f ;"
since, given y 6= x, we have y 2 Uyi for some i, and hence gx(y) hyi (y) > f(y) ; ".
Step 2: Given f 2 C(X) and " > 0, there is an h 2 A with kf ; hk < ".
From Step 1, for each x 2 X we can nd some gx 2 A such that gx(x) = f(x) and
gx(y) > f(y);" for all y 2 X. And now we reverse the process used in Step 1: For each x,
the set Vx = fy 2 X : gx(y) < f(y)+"g is open and contains x. Again, since X is compact,
Vxm for some x1 : : : xm . This time, set h = minfgx1 : : : gxm g 2 A. As
X = Vx1
before, h(y) > f(y) ;" for all y, since each gxi does so, and h(y) < f(y) + " for all y, since
at least one gxi does so.
The conclusion of Step 2 is that A is dense in C(X) but, since A is closed, this means
that A = C(X).
Corollary. If X and Y are compact metric spaces, then the subspace of C(X Y ) spanned
by the functions of the form f(x y) = g(x) h(y), g 2 C(X), h 2 C(Y ), is dense in C(X Y ).
Corollary. If K is a compact subset of Rn, then the polynomials (in n-variables) are
dense in C(K).
Applications to C 2
In many texts, the Stone-Weierstrass theorem is used to show that the trig polynomials are
dense in C 2 . One approach here might be to identify C 2 with the closed subalgebra of
C 0 2 ] consisting of those functions f satisfying f(0) = f(2 ). Probably easier, though,
Stone-Weierstrass 151
is to identify C 2 with the continuous functions on the unit circle T = fei : 2 Rg =
fz 2 C : jzj = 1g in the complex plane using the identi cation
f 2 C2 ! g 2 C(T) where g(eit) = f(t):
Under this correspondence, the trig polynomials in C 2 match up with (certain) polyno-
mials in z = eit and z = e;it . But, as we've seen, even if we start with real-valued trig
polynomials, we'll end up with polynomials in z and z having complex coe cients.
Given this, it might make more sense to consider the complex-valued continuous func-
tions on T. We'll write CC (T) to denote the complex-valued continuous functions on
T, and CR (T) to denote the real-valued continuous functions on T. Similarly, CC is
the space of complex-valued, 2 -periodic functions on R, while CR2 stands for the real-
valued, 2 -periodic functions on R. Now, under the identi cation we made earlier, we have
CC (T) = CC2 and CR (T) = CR2 . The complex-valued trig polynomials in CC2 now match
up with the full set of polynomials, with complex coe cients, in z = eit and z = e;it . We'll
use the Stone-Weierstrass theorem to show that these polynomials are dense in CC (T).
Now the polynomials in z obviously separate points in T and vanish at no point of T.
Nevertheless, the polynomials in z alone are not dense in CC (T). To see this, here's a proof
that f(z) = z cannot be uniformly approximated by polynomials in z. First, suppose that
we're given some polynomial p(z) = n ck zk . Then,
Z2 Z2 X Z2
f(eit ) p(eit ) dt = eit p(eit ) dt = ei(k+1)t dt = 0
0 0 0
and so Z2 Z2
f(eit ) f(eit ) ; p(eit ) dt
f(eit ) f(eit ) dt =
0 0
because f(z) f(z) = jf(z)j2 = 1. Now, taking absolute values, we get
f(eit ) ; p(eit ) dt 2 kf ; pk:
That is, kf ; pk 1 for any polynomial p.
We might as well proceed in some generality: Given a compact metric space X, we'll
write CC (X) for the set of all continuous, complex-valued functions f : X ! C , and we
Stone-Weierstrass 152
norm CC (X) by kfk = max jf(x)j (where jf(x)j is the modulus of the complex number
f(x), of course). CC (X) is a Banach algebra over C . In order to make it clear which eld
of scalars is involved, we'll write CR (X) for the real-valued members of CC (X). Notice,
though, that CR (X) is nothing other than C(X) with a new name.
More generally, we'll write AC to denote an algebra, over C , of complex-valued func-
tions and AR to denote the real-valued members of AC . It's not hard to see that AR is
then an algebra, over R, of real-valued functions.
Now if f is in CC (X), then so is the function f(x) = f(x) (the complex-conjugate of
f(x)). This puts
; 1;
Ref = 1 f + f and Imf = 2i f ; f
the real and imaginary parts of f, in CR (X) too. Conversely, if g, h 2 CR (X), then
g + ih 2 CC (X).
This simple observation gives us a hint as to how we might apply the Stone-Weierstrass
theorem to subalgebras of CC (X). Given a subalgebra AC of CC (X), suppose that we could
prove that AR is dense in CR (X). Then, given any f 2 CC (X), we could approximate Ref
and Imf by elements g, h 2 AR . But since AR AC , this means that g + ih 2 AC , and
g + ih approximates f. That is, AC is dense in CC (X). Great! And what did we really use
here? Well, we need AR to contain the real and imaginary parts of \most" functions in
CC (X). If we insist that AC separate points and vanish at no point, then AR will contain
\most" of CR (X). And, to be sure that we get both the real and imaginary parts of each
element of AC , we'll insist that AC contain the conjugates of each of its members: f 2 AC
whenever f 2 AC . That is, we'll require that AC be self-conjugate (or, as some authors
say, self-adjoint).
Stone-Weierstrass Theorem. (complex scalars) Let X be a compact metric space, and
let AC be a subalgebra, over C , of CC (X). If AC separates points in X, vanishes at no
point of X, and is self-conjugate, then AC is dense in CC (X).
Proof. Again, write AR for the set of real-valued members of AC . Since AC is self-
conjugate, AR contains the real and imaginary parts of every f 2 AC
1 ;f + f 2 A and Imf = 1 ;f ; f 2 A :
Ref = 2 2i
Stone-Weierstrass 153
Moreover, AR is a subalgebra, over R, of CR (X). In addition, AR separates points in X
and vanishes at no point of X. Indeed, given x 6= y 2 X and f 2 AC with f(x) 6= f(y),
we must have at least one of Ref(x) 6= Ref(y) or Imf(x) 6= Imf(y). Similarly, f(x) 6= 0
means that at least one of Ref(x) 6= 0 or Imf(x) 6= 0 holds. That is, AR satis es the
hypotheses of the real-scalar version of the Stone-Weierstrass theorem. Consequently, AR
is dense in CR (X).
Now, given f 2 CC (X) and " > 0, take g, h 2 AR with kg ; Refk < "=2 and
kh ; Imfk < "=2. Then, g + ih 2 AC and kf ; (g + ih)k < ". Thus, AC is dense in
CC (X).
Corollary. The polynomials, with complex coe cients, in z and z are dense in CC (T). In
other words, the complex trig polynomials are dense in CC2 .
Note that it follows from the complex-scalar proof that the real parts of the polyno-
mials in z and z, that is, the real trig polynomials, are dense in CR (T) = CR2 .
Corollary. The real trig polynomials are dense in CR2 .
Application: Lipschitz Functions
In most Real Analysis courses, the classical Weierstrass theorem is used to prove that
C a b ] is separable. Likewise, the Stone-Weierstrass theorem can be used to show that
C(X) is separable, where X is a compact metric space. While we won't have anything quite
so convenient as polynomials at our disposal, we do, at least, have a familiar collection of
functions to work with.
Given a metric space (X d ), and 0 K < 1, we'll write lipK (X) to denote the
collection of all real-valued Lipschitz functions on X with constant at most K that is,
f : X ! R is in lipK (X) if jf(x) ; f(y)j Kd(x y) for all x, y 2 X. And we'll write
lip(X) to denote the set of functions that are in lipK (X) for some K in other words,
lip(X) = 1 lipK (X). It's easy to see that lip(X) is a subspace of C(X) in fact, if X
is compact, then lip(X) is even a subalgebra of C(X). Indeed, given f 2 lipK (X) and
g 2 lipM (X), we have
jf(x)g(x) ; f(y)g(y)j jf(x)g(x) ; f(y)g(x)j + jf(y)g(x) ; f(y)g(y)j
Kkgk jx ; yj + Mkfk jx ; yj:
Stone-Weierstrass 154
Lemma. If X is a compact metric space, then lip(X) is dense in C(X).
Proof. Clearly, lip(X) contains the constant functions and so vanishes at no point of
X. To see that lip(X) separates point in X, we use the fact that the metric d is Lipschitz:
Given x0 6= y0 2 X, the function f(x) = d(x y0 ) satis es f(x0 ) > 0 = f(y0 ) moreover,
f 2 lip1(X) since

jf(x) ; f(y)j = jd(x y0 ) ; d(y y0 )j d(x y):

Thus, by the Stone-Weierstrass Theorem, lip(X) is dense in C(X).
Theorem. If X is a compact metric space, then C(X) is separable.
Proof. It su ces to show that lip(X) is separable. (Why?) To see this, rst notice that
lip(X) = 1 EK , where

EK = ff 2 C(X) : kfk K and f 2 lipK (X)g:

(Why?) The sets EK are (uniformly) bounded and equicontinuous. Hence, by the Arzela-
Ascoli theorem, each EK is compact in C(X). Since compact sets are separable, as are
countable unions of compact sets, it follows that lip(X) is separable.
As it happens, the converse is also true (which is why this is interesting) see Folland's
Real Analysis for more details.
Theorem. If C(X) is separable, where X is a compact Hausdor topological space, then
X is metrizable.
A Short List of References
Abramowitz, M. and Stegun, I., eds., Handbook of Mathematical Functions with For-
mulas, Graphs, and Mathematical Tables, Dover, 1965.
Birkhoff, G., A Source Book in Classical Analysis, Harvard, 1973.
Buck, R. C., ed., Studies in Modern Analysis, MAA, 1962.
Carothers, N. L., Real Analysis, Cambridge, 2000.
Cheney, E. W., Introduction to Approximation Theory, Chelsea, 1982.
Davis, P. J., Interpolation and Approximation, Dover, 1975.
DeVore, R. A. and Lorentz, G. G., Constructive Approximation, Springer-Verlag,
Dudley, R. M., Real Analysis and Probability, Wadsworth & Brooks/Cole, 1989.
Folland, G. B., Real Analysis: modern techniques and their applications, Wiley, 1984.
Fox, L. and Parker, I. B., Chebyshev Polynomials, Oxford University Press, 1968.
Hoffman, K., Analysis in Euclidean Space, Prentice-Hall, 1975.
Jackson, D., Theory of Approximation, AMS, 1930.
Jackson, D., Fourier Series and Orthogonal Polynomials, MAA, 1941.
Korner, T. W., Fourier Analysis, Cambridge, 1988.
Korovkin, P. P., Linear Operators and Approximation Theory, Hindustan Publishing,
La Vallee Poussin, Ch.-J. de, Lecons sur l'Approximation des Fonctions d'une Vari-
able Reele, Gauthier-Villars, 1919.
Lorentz, G. G., Bernstein Polynomials, Chelsea, 1986.
Lorentz, G. G., Approximation of Functions, Chelsea, 1986.
Natanson, I., Constructive Function Theory, 3 vols., Ungar, 1964{1965.
Powell, M. J. D., Approximation Theory and Methods, Cambridge, 1981.
Rivlin, T. J., An Introduction to the Approximation of Functions, Dover, 1981.
Rivlin, T. J., The Chebyshev Polynomials, Wiley, 1974.
Rudin, W., Principles of Mathematical Analysis, 3rd. ed., McGraw-Hill, 1976.
Simmons, G. F., Introduction to Topology and Modern Analysis, McGraw-Hill, 1963
reprinted by Robert E. Krieger Publishing, 1986.
Boas, R. P., \Inequalities for the derivatives of polynomials," Mathematics Magazine,
42 (1969), 165{174.
Fisher, S., \Quantitative approximation theory," The American Mathematical Monthly,
85 (1978), 318{332.
References 156
Hedrick, E. R., \The signi cance of Weierstrass's theorem," The American Mathemat-
ical Monthly, 20 (1927), 211{213.
Jackson, D., \The general theory of approximation by polynomials and trigonometric
sums," Bulletin of the American Mathematical Society, 27 (1920{1921), 415{431.
Lebesgue, H., \Sur l'approximation des fonctions," Bulletin des Sciences Mathematique,
22 (1898), 278{287.
Shields, A., \Polynomial approximation," The Mathematical Intelligencer, 9 (1987),
No. 3, 5{7.
Shohat, J. A., \On the development of functions in series of orthogonal polynomials,"
Bulletin of the American Mathematical Society, 41 (1935), 49{82.
Stone, M. H., \Applications of the theory of Boolean rings to general topology," Trans-
actions of the American Mathematical Society, 41 (1937), 375{481.
Stone, M. H., \A generalized Weierstrass theorem," in Studies in Modern Analysis, R.
C. Buck, ed., MAA, 1962.
Van Vleck, E. B., \The in uence of Fourier's series upon the development of mathe-
matics," Science, 39 (1914), 113{124.
Weierstrass, K., \Uber die analytische Darstellbarkeit sogenannter willkurlicher Func-
tionen einer reellen Veranderlichen," Sitzungsberichte der Koniglich Preussischen
Akademie der Wissenshcaften zu Berlin, (1885), 633{639, 789{805.
Weierstrass, K., \Sur la possibilite d'une representation analytique des fonctions
dites arbitraires d'une variable reele," Journal de Mathematiques Pures et Appliquees,
2 (1886), 105{138.


. 5
( 5)