
Approximation Theory

Math 682 Summer 1998

N. L. Carothers

Department of Mathematics and Statistics

Bowling Green State University

Table of Contents

Preliminaries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1

Problem Set: Function Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25

Approximation by Algebraic Polynomials . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30

Problem Set: Uniform Approximation by Polynomials . . . . . . . . . . . . . . . . 42

Trigonometric Polynomials . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46

Problem Set: Trigonometric Polynomials . . . . . . . . . . . . . . . . . . . . . . . . . . 54

Characterization of Best Approximation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56

Problem Set: Chebyshev Polynomials . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72

Examples: Chebyshev Polynomials in Practice . . . . . . . . . . . . . . . . . . . . . . 74

A Brief Introduction to Interpolation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76

Problem Set: Lagrange Interpolation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85

Approximation on Finite Sets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86

A Brief Introduction to Fourier Series . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98

Problem Set: Fourier Series . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111

Jackson's Theorems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 112

Orthogonal Polynomials . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119

Problem Set: Orthogonal Polynomials . . . . . . . . . . . . . . . . . . . . . . . . . . . 122

Gaussian Quadrature . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 124

The Müntz Theorems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 139

The Stone-Weierstrass Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 145

A Short List of References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 155

Preface

These are notes for a six-week summer course on approximation theory that I offer occasionally at Bowling Green State University. Our summer classes meet for 90 minutes, five days a week, for a total of 45 hours. But the pace is somewhat leisurely and there is probably not quite enough material here for a "regulation" one-semester (45 hour) course. On the other hand, there is more than enough material here for a one-quarter (30 hour) course and evidently enough for a five- or six-week summer course.

I should stress that my presentation here is by no means original: I borrow heavily from a number of well-known texts on approximation theory (see the list of references at the end of these notes). I use T. J. Rivlin's book, An Introduction to the Approximation of Functions, as a complementary text, and thus you will see many references to Rivlin throughout the notes. Also, a few passages here and there are taken from my book, Real Analysis. In particular, large portions of these notes are based on copyrighted material. They are offered here solely as an aid to teachers and students of approximation theory and are intended for limited personal use only. I should also point out that I am not an expert in approximation theory, and I make no claims that the material presented here is in current fashion among experts in the field.

My interest in approximation theory stems from its beauty, its utility, and its rich history. There are also many connections that can be drawn to questions in both classical and modern analysis. For the purposes of this short introductory course, I focus on a handful of classical topics (with a little bit of modern terminology here and there) and "name" theorems. Indeed, the Weierstrass approximation theorem, along with its various relatives, is the central theme of the course.

In terms of prerequisites, I assume at least a one-semester course in advanced calculus or real analysis (compactness, completeness, uniform convergence, uniform continuity, normed spaces, etc.) along with a course in linear algebra. The first chapter, entitled Preliminaries, contains four brief appendices that provide an all too brief review of such topics; they are included in order to make the notes as self-contained as possible. The course is designed for beginning master's students (in both pure and applied mathematics), but should be largely accessible to advanced undergraduates. From my experience, there are plenty of topics here that even advanced PhD students will find entertaining.

Preliminaries

Math 682 5/18/98

Introduction

In 1853, the great Russian mathematician P. L. Chebyshev [Čebyšev], while working on a problem of linkages, devices which translate the linear motion of a steam engine into the circular motion of a wheel, considered the following problem:

Given a continuous function $f$ defined on a closed interval $[a, b]$ and a positive integer $n$, can we "represent" $f$ by a polynomial $p(x) = \sum_{k=0}^{n} a_k x^k$, of degree at most $n$, in such a way that the maximum error at any point $x$ in $[a, b]$ is controlled? In particular, is it possible to construct $p$ in such a way that the error $\max_{a \le x \le b} |f(x) - p(x)|$ is minimized?

This problem raises several questions, the first of which Chebyshev himself ignored:

– Why should such a polynomial even exist?

– If it does, can we hope to construct it?

– If it exists, is it also unique?

– What happens if we change the measure of the error to, say, $\int_a^b |f(x) - p(x)|^2 \, dx$?

Chebyshev's problem is perhaps best understood by rephrasing it in modern terms. What we have here is a problem of linear approximation in a normed linear space. Recall that a norm on a (real) vector space $X$ is a nonnegative function on $X$ satisfying

$\|x\| \ge 0$, and $\|x\| = 0 \iff x = 0$;

$\|\alpha x\| = |\alpha| \, \|x\|$ for $\alpha \in \mathbb{R}$;

$\|x + y\| \le \|x\| + \|y\|$ for any $x$, $y \in X$.

Any norm on $X$ induces a metric or distance function by setting $\mathrm{dist}(x, y) = \|x - y\|$. The abstract version of our problem(s) can now be restated:

– Given a subset (or even a subspace) $Y$ of $X$ and a point $x \in X$, is there an element $y^* \in Y$ which is "nearest" to $x$; that is, can we find a vector $y^* \in Y$ such that $\|x - y^*\| = \inf_{z \in Y} \|x - z\|$? If there is such a "best approximation" to $x$ from elements of $Y$, is it unique?


Examples

1. In $X = \mathbb{R}^n$ with its usual norm $\|(x_k)_{k=1}^n\|_2 = \left( \sum_{k=1}^n |x_k|^2 \right)^{1/2}$, the problem has a complete solution for any subspace (or, indeed, any closed convex set) $Y$. This problem is often considered in Calculus or Linear Algebra where it is called "least-squares approximation." A large part of the current course will be taken up with least-squares approximations, too. For now let's simply note that the problem changes character dramatically if we consider a different norm on $\mathbb{R}^n$.

Consider $X = \mathbb{R}^2$ under the norm $\|(x, y)\| = \max\{|x|, |y|\}$, and consider the subspace $Y = \{(0, y) : y \in \mathbb{R}\}$ (i.e., the $y$-axis). It's not hard to see that the point $x = (1, 0) \in \mathbb{R}^2$ has infinitely many nearest points in $Y$; indeed, every point $(0, y)$, $-1 \le y \le 1$, is nearest to $x$.
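The claim about infinitely many nearest points is easy to check numerically; here is a minimal sketch (mine, not from the notes), sampling a few points on the $y$-axis:

```python
# Sketch: in the max norm on R^2, the point (1, 0) is at distance 1 from
# every point (0, y) with |y| <= 1 on the y-axis, and farther otherwise.
def max_norm_dist(p, q):
    """Distance induced by the norm ||(x, y)|| = max(|x|, |y|)."""
    return max(abs(p[0] - q[0]), abs(p[1] - q[1]))

x = (1.0, 0.0)
# Distance from x to (0, y) is max(1, |y|): equal to 1 on [-1, 1], larger outside.
dists = {y: max_norm_dist(x, (0.0, y)) for y in [-1.0, -0.5, 0.0, 0.5, 1.0, 2.0]}
```

Every sampled $y$ in $[-1, 1]$ gives distance exactly 1, while $y = 2$ gives distance 2, matching the claim.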

2. There are many norms we might consider on $\mathbb{R}^n$. Of particular interest are the $\ell_p$-norms; that is, the scale of norms:

$$\|(x_i)_{i=1}^n\|_p = \left( \sum_{i=1}^n |x_i|^p \right)^{1/p}, \qquad 1 \le p < \infty,$$

and

$$\|(x_i)_{i=1}^n\|_\infty = \max_{1 \le i \le n} |x_i| .$$

It's easy to see that $\| \cdot \|_1$ and $\| \cdot \|_\infty$ define norms. The other cases take a bit more work; we'll supply full details later.
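The defining formulas translate directly into code; a small sketch (mine), with `float('inf')` standing in for $p = \infty$:

```python
# Sketch: the l_p norms on R^n, computed directly from the definitions above.
def lp_norm(x, p):
    """||x||_p for 1 <= p < infinity, or the max norm when p = float('inf')."""
    if p == float('inf'):
        return max(abs(t) for t in x)
    return sum(abs(t) ** p for t in x) ** (1.0 / p)

x = [3.0, -4.0]
n1 = lp_norm(x, 1)               # 3 + 4 = 7
n2 = lp_norm(x, 2)               # sqrt(9 + 16) = 5, the Euclidean norm
ninf = lp_norm(x, float('inf'))  # max(3, 4) = 4
```

Note the familiar ordering $\|x\|_\infty \le \|x\|_2 \le \|x\|_1$ visible already in this one example.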

3. Our original problem concerns $X = C[a, b]$, the space of all continuous functions $f : [a, b] \to \mathbb{R}$ under the uniform norm $\|f\| = \max_{a \le x \le b} |f(x)|$. The word "uniform" is used because convergence in this norm is the same as uniform convergence on $[a, b]$:

$$\|f_n - f\| \to 0 \iff f_n \rightrightarrows f \text{ on } [a, b] .$$

In this case we're interested in approximations by elements of $Y = \mathcal{P}_n$, the subspace of all polynomials of degree at most $n$ in $C[a, b]$. It's not hard to see that $\mathcal{P}_n$ is a finite-dimensional subspace of $C[a, b]$ of dimension exactly $n + 1$. (Why?)

If we consider the subspace $Y = \mathcal{P}$ consisting of all polynomials in $X = C[a, b]$, we readily see that the existence of best approximations can be problematic. It follows from the Weierstrass theorem, for example, that each $f \in C[a, b]$ has distance 0 from $\mathcal{P}$; but, since not every $f \in C[a, b]$ is a polynomial (why?), we can't hope for a best approximating polynomial to exist in every case. For example, the function $f(x) = x \sin(1/x)$ is continuous on $[0, 1]$ but can't possibly agree with any polynomial on $[0, 1]$. (Why?)

The key to the problem of polynomial approximation is the fact that each $\mathcal{P}_n$ is finite-dimensional. To see this, it will be most efficient to consider the abstract setting of finite-dimensional subspaces of arbitrary normed spaces.

"Soft" Approximation

Lemma. Let $V$ be a finite-dimensional vector space. Then all norms on $V$ are equivalent. That is, if $\| \cdot \|$ and $||| \cdot |||$ are norms on $V$, then there exist constants $0 < A \le B < \infty$ such that

$$A \, \|x\| \le |||x||| \le B \, \|x\|$$

for all vectors $x \in V$.

k k is a norm on V . Fix a basis

Proof. Suppose that V is n-dimensional and that

e1 : : : en for V and consider the norm

X X

n n

jaij = k(ai )n k1

ai ei = i=1

i=1 i=1

1

P

for x = n ai ei 2 V . Since e1 : : : en is a basis for V , it's not hard to see that k k1 is,

i=1

indeed, a norm on V . It now su ces to show that k k and k k1 are equivalent. (Why?)

One inequality is easy to show; indeed, notice that

$$\left\| \sum_{i=1}^n a_i e_i \right\| \le \sum_{i=1}^n |a_i| \, \|e_i\| \le \left( \max_{1 \le i \le n} \|e_i\| \right) \sum_{i=1}^n |a_i| = B \left\| \sum_{i=1}^n a_i e_i \right\|_1 .$$

The real work comes in establishing the other inequality.

To begin, notice that we've actually set up a correspondence between $\mathbb{R}^n$ and $V$; specifically, the map $(a_i)_{i=1}^n \mapsto \sum_{i=1}^n a_i e_i$ is obviously both one-to-one and onto. Moreover, this correspondence is an isometry between $(\mathbb{R}^n, \| \cdot \|_1)$ and $(V, \| \cdot \|_1)$.


Now the inequality we've just established shows that the function $x \mapsto \|x\|$ is continuous on the space $(V, \| \cdot \|_1)$, since

$$\bigl| \, \|x\| - \|y\| \, \bigr| \le \|x - y\| \le B \, \|x - y\|_1$$

for any $x$, $y \in V$. Thus, $\| \cdot \|$ assumes a minimum value on the compact set

$$S = \{ x \in V : \|x\|_1 = 1 \} .$$

(Why is $S$ compact?) In particular, there is some $A > 0$ such that $\|x\| \ge A$ whenever $\|x\|_1 = 1$. (Why can we assume that $A > 0$?) The inequality we need now follows from the homogeneity of the norm:

$$\left\| \frac{x}{\|x\|_1} \right\| \ge A \implies \|x\| \ge A \, \|x\|_1 .$$

Corollary. Given $a < b$ (fixed) and a positive integer $n$, there exist $0 < A_n, B_n < \infty$ (constants which may depend on $n$) such that

$$A_n \sum_{k=0}^n |a_k| \le \max_{a \le x \le b} \left| \sum_{k=0}^n a_k x^k \right| \le B_n \sum_{k=0}^n |a_k| .$$

Exercise

Find explicit "formulas" for $A_n$ and $B_n$, above. (This can be done without any fancy theorems.) If it helps, you may consider the case $[a, b] = [0, 1]$.

Corollary. Let $Y$ be a finite-dimensional normed space and let $M > 0$. Then any closed ball $\{ y \in Y : \|y\| \le M \}$ is compact.

Proof. Again suppose that $Y$ is $n$-dimensional and that $e_1, \ldots, e_n$ is a basis for $Y$. From our previous lemma we know that there is some constant $A > 0$ such that

$$A \sum_{i=1}^n |a_i| \le \left\| \sum_{i=1}^n a_i e_i \right\|$$

for all $x = \sum_{i=1}^n a_i e_i \in Y$. In particular,

$$\left\| \sum_{i=1}^n a_i e_i \right\| \le M \implies A \, |a_i| \le A \sum_{i=1}^n |a_i| \le M, \quad \text{i.e., } |a_i| \le \frac{M}{A} \text{ for } i = 1, \ldots, n .$$

Thus, $\{ y \in Y : \|y\| \le M \}$ is a closed subset (why?) of the compact set

$$\left\{ x = \sum_{i=1}^n a_i e_i : |a_i| \le \frac{M}{A}, \; i = 1, \ldots, n \right\} .$$

Corollary. Every finite-dimensional normed space is complete. In particular, if $Y$ is a finite-dimensional subspace of a normed linear space $X$, then $Y$ is a closed subset of $X$.

Theorem. Let $Y$ be a finite-dimensional subspace of a normed linear space $X$, and let $x \in X$. Then there exists a (not necessarily unique) $y^* \in Y$ such that

$$\|x - y^*\| \le \|x - y\|$$

for all $y \in Y$. That is, there is a best approximation to $x$ by elements of $Y$.

Proof. First notice that since $0 \in Y$, we know that a nearest point $y^*$ will satisfy $\|x - y^*\| \le \|x\| = \|x - 0\|$. Thus, it suffices to look for $y^*$ among the vectors $y \in Y$ satisfying $\|x - y\| \le \|x\|$. It will be convenient to use a slightly larger set of vectors, though. By the triangle inequality,

$$\|x - y\| \le \|x\| \implies \|y\| \le \|x - y\| + \|x\| \le 2\|x\| .$$

Thus, we may restrict our attention to those $y$'s in the compact set

$$K = \{ y \in Y : \|y\| \le 2\|x\| \} .$$

To finish the proof, we need only notice that the function $f(y) = \|x - y\|$ is continuous:

$$|f(y) - f(z)| = \bigl| \, \|x - y\| - \|x - z\| \, \bigr| \le \|y - z\| ,$$

hence attains a minimum value at some point $y^* \in K$.

Corollary. For each $f \in C[a, b]$ and each positive integer $n$, there is a (not necessarily unique) polynomial $p_n^* \in \mathcal{P}_n$ such that

$$\|f - p_n^*\| = \min_{p \in \mathcal{P}_n} \|f - p\| .$$

Corollary. Given $f \in C[a, b]$ and a (fixed) positive integer $n$, there exists a constant $R < \infty$ such that, whenever

$$\left\| f - \sum_{k=0}^n a_k x^k \right\| \le \|f\| ,$$

we have $\max_{0 \le k \le n} |a_k| \le R$.

Examples

Nothing in our Corollary says that $p_n^*$ will be a polynomial of degree exactly $n$; rather, a polynomial of degree at most $n$. For example, the best approximation to $f(x) = x$ by a polynomial of degree at most 3 is, of course, $p(x) = x$. Even examples involving non-polynomial functions are easy to come by; for instance, the best linear approximation to $f(x) = |x|$ on $[-1, 1]$ is actually the constant function $p(x) = 1/2$, and this makes for an entertaining exercise.
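Before doing the exercise by hand, it can be persuasive to see the answer emerge numerically. The sketch below (mine, not from the notes) brute-forces the minimax problem over a coarse grid of linear coefficients:

```python
# Sketch: search for the best uniform approximation to |x| on [-1, 1]
# among linear functions p(x) = c0 + c1*x, by minimizing the sampled
# sup-error over a grid of coefficient pairs.
xs = [i / 50.0 for i in range(-50, 51)]   # sample points in [-1, 1]

def sup_error(c0, c1):
    """Approximate uniform distance between |x| and c0 + c1*x on [-1, 1]."""
    return max(abs(abs(x) - (c0 + c1 * x)) for x in xs)

best = min((sup_error(i / 20.0, j / 20.0), i / 20.0, j / 20.0)
           for i in range(-20, 21) for j in range(-20, 21))
err, c0, c1 = best   # the minimum lands at c0 = 1/2, c1 = 0, with error 1/2
```

The winning pair is the constant $p(x) = 1/2$, with uniform error $1/2$ attained at $x = -1$, $0$, and $1$, an early glimpse of the equioscillation phenomenon studied later in the notes.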

Before we leave these "soft" arguments behind, let's discuss the problem of uniqueness of best approximations. First, let's see why we want best approximations to be unique:

Lemma. Let $Y$ be a finite-dimensional subspace of a normed linear space $X$, and suppose that each $x \in X$ has a unique nearest point $y_x \in Y$. Then the nearest point map $x \mapsto y_x$ is continuous.

Proof. Let's write $P(x) = y_x$ for the nearest point map, and let's suppose that $x_n \to x$ in $X$. We want to show that $P(x_n) \to P(x)$, and for this it's enough to show that there is a subsequence of $(P(x_n))$ which converges to $P(x)$. (Why?)

Since the sequence $(x_n)$ is bounded in $X$, say $\|x_n\| \le M$ for all $n$, we have

$$\|P(x_n)\| \le \|P(x_n) - x_n\| + \|x_n\| \le 2\|x_n\| \le 2M .$$

Thus, $(P(x_n))$ is a bounded sequence in $Y$, a finite-dimensional space. As such, by passing to a subsequence, we may suppose that $(P(x_n))$ converges to some element $P_0 \in Y$. (How?) Now we need to show that $P_0 = P(x)$. But

$$\|P(x_n) - x_n\| \le \|P(x) - x_n\| \quad \text{(why?)}$$

for any $n$, and hence, letting $n \to \infty$,

$$\|P_0 - x\| \le \|P(x) - x\| .$$

Since nearest points in $Y$ are unique, we must have $P_0 = P(x)$.

Exercise

Let $X$ be a normed linear space and let $P : X \to X$. Show that $P$ is continuous at $x \in X$ if and only if, whenever $x_n \to x$ in $X$, some subsequence of $(P(x_n))$ converges to $P(x)$. [Hint: The forward direction is easy; for the backward implication, suppose that $(P(x_n))$ fails to converge to $P(x)$ and work toward a contradiction.]

It should be pointed out that the nearest point map is, in general, nonlinear and, as such, can be very difficult to work with. Later we'll see at least one case in which nearest point maps always turn out to be linear.

We next observe that the set of best approximations is always pretty reasonable:

Theorem. Let $Y$ be a subspace of a normed linear space $X$, and let $x \in X$. The set $Y_x$, consisting of all best approximations to $x$ out of $Y$, is a bounded, convex set.

Proof. As we've seen, the set $Y_x$ is a subset of $\{ y \in X : \|y\| \le 2\|x\| \}$ and, hence, is bounded.

Now recall that a subset $K$ of a vector space $V$ is said to be convex if $K$ contains the line segment joining any pair of its points. Specifically, $K$ is convex if

$$x, y \in K, \; 0 \le \lambda \le 1 \implies \lambda x + (1 - \lambda) y \in K .$$

Now, $y_1$, $y_2 \in Y_x$ means that

$$\|x - y_1\| = \|x - y_2\| = \min_{y \in Y} \|x - y\| .$$

Next, given $0 \le \lambda \le 1$, set $y^* = \lambda y_1 + (1 - \lambda) y_2$. We want to show that $y^* \in Y_x$, but notice that we at least have $y^* \in Y$. Finally, we estimate:

$$\begin{aligned}
\|x - y^*\| &= \|x - (\lambda y_1 + (1 - \lambda) y_2)\| \\
&= \|\lambda (x - y_1) + (1 - \lambda)(x - y_2)\| \\
&\le \lambda \|x - y_1\| + (1 - \lambda) \|x - y_2\| \\
&= \min_{y \in Y} \|x - y\| .
\end{aligned}$$

Hence, $\|x - y^*\| = \min_{y \in Y} \|x - y\|$; that is, $y^* \in Y_x$.

Exercise

If, in addition, $Y$ is finite-dimensional, show that $Y_x$ is closed (hence compact).

If $Y_x$ contains more than one point, then, in fact, it contains an entire line segment. Thus, $Y_x$ is either empty, contains exactly one point, or contains infinitely many points. This observation gives us a sufficient condition for uniqueness of nearest points: If our normed space $X$ contains no line segments on any sphere $\{ x \in X : \|x\| = r \}$, then any best approximation (out of any set) will be unique.

A norm $\| \cdot \|$ on a vector space $X$ is said to be strictly convex if, for any $x \ne y \in X$ with $\|x\| = r = \|y\|$, we always have $\|\lambda x + (1 - \lambda) y\| < r$ for any $0 < \lambda < 1$. That is, the open line segment between any pair of points on the surface of the ball of radius $r$ in $X$ lies entirely inside the ball. We often simply say that the space $X$ is strictly convex, with the understanding that a property of the norm in $X$ is implied. Here's an immediate corollary to our last result:

Corollary. If $X$ has a strictly convex norm, then, for any subspace $Y$ of $X$ and any point $x \in X$, there can be at most one best approximation to $x$ out of $Y$. That is, $Y_x$ is either empty or consists of a single point.

In order to arrive at a condition that's somewhat easier to check, let's translate our original definition into a statement about the triangle inequality in $X$.

Lemma. $X$ has a strictly convex norm if and only if the triangle inequality is strict on non-parallel vectors; that is, if and only if

$$x \ne \lambda y, \; y \ne \lambda x \text{ for all } \lambda \in \mathbb{R} \implies \|x + y\| < \|x\| + \|y\| .$$

Proof. First suppose that $X$ is strictly convex, and let $x$ and $y$ be non-parallel vectors in $X$. Then, in particular, the vectors $x / \|x\|$ and $y / \|y\|$ must be different. (Why?) Hence,

$$\left\| \frac{\|x\|}{\|x\| + \|y\|} \cdot \frac{x}{\|x\|} + \frac{\|y\|}{\|x\| + \|y\|} \cdot \frac{y}{\|y\|} \right\| < 1 .$$

That is, $\|x + y\| < \|x\| + \|y\|$.

Next suppose that the triangle inequality is strict on non-parallel vectors, and let $x \ne y \in X$ with $\|x\| = r = \|y\|$. If $x$ and $y$ are parallel, then we must have $y = -x$. (Why?) In this case,

$$\|\lambda x + (1 - \lambda) y\| = |2\lambda - 1| \, \|x\| < r$$

since $|2\lambda - 1| < 1$ whenever $0 < \lambda < 1$. Otherwise, $x$ and $y$ are non-parallel. In this case, for any $0 < \lambda < 1$, the vectors $\lambda x$ and $(1 - \lambda) y$ are likewise non-parallel. Thus,

$$\|\lambda x + (1 - \lambda) y\| < \lambda \|x\| + (1 - \lambda) \|y\| = r .$$

Examples

1. The usual norm on $C[a, b]$ is not strictly convex (and so the problem of uniqueness of best approximations is all the more interesting to tackle). For example, if $f(x) = x$ and $g(x) = x^2$ in $C[0, 1]$, then $\|f\| = 1 = \|g\|$ and $f \ne g$, while $\|f + g\| = 2$. (Why?)

2. The usual norm on $\mathbb{R}^n$ is strictly convex, as is any one of the norms $\| \cdot \|_p$, $1 < p < \infty$. (We'll prove these facts shortly.) The norms $\| \cdot \|_1$ and $\| \cdot \|_\infty$, on the other hand, are not strictly convex. (Why?)
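The first example can be confirmed by sampling; a sketch (mine), approximating the uniform norm by a maximum over a grid on $[0, 1]$:

```python
# Sketch: the uniform norm on C[0, 1] is not strictly convex.  With
# f(x) = x and g(x) = x^2, both have norm 1, yet ||f + g|| = 2: the
# triangle inequality is an equality for two non-parallel functions.
xs = [i / 1000.0 for i in range(1001)]   # grid on [0, 1], including x = 1

def sup_norm(h):
    return max(abs(h(x)) for x in xs)

f = lambda x: x
g = lambda x: x * x
nf, ng = sup_norm(f), sup_norm(g)          # both 1, attained at x = 1
nfg = sup_norm(lambda x: f(x) + g(x))      # 2, also attained at x = 1
```

The point is that $f$, $g$, and $f + g$ all attain their maxima at the same endpoint $x = 1$, which is exactly why the inequality collapses to equality.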

Appendix A

For completeness, we supply a few of the missing details concerning the $\ell_p$-norms. We begin with a handful of classical inequalities of independent interest. First recall that we have defined a scale of "norms" on $\mathbb{R}^n$ by setting:

$$\|x\|_p = \left( \sum_{i=1}^n |x_i|^p \right)^{1/p}, \qquad 1 \le p < \infty,$$

and

$$\|x\|_\infty = \max_{1 \le i \le n} |x_i| ,$$

where $x = (x_i)_{i=1}^n \in \mathbb{R}^n$. Please note that the case $p = 2$ gives the usual Euclidean norm on $\mathbb{R}^n$ and that the cases $p = 1$ and $p = \infty$ clearly give rise to legitimate norms on $\mathbb{R}^n$. Common parlance is to refer to these expressions as $\ell_p$-norms and to refer to the space $(\mathbb{R}^n, \| \cdot \|_p)$ as $\ell_p^n$. The space of all infinite sequences $x = (x_n)_{n=1}^\infty$ for which the analogous infinite sum (or supremum) $\|x\|_p$ is finite is referred to as $\ell_p$. What's more, there is a "continuous" analogue of this scale: We might also consider the norms

$$\|f\|_p = \left( \int_a^b |f(x)|^p \, dx \right)^{1/p}, \qquad 1 \le p < \infty,$$

and

$$\|f\|_\infty = \sup_{a \le x \le b} |f(x)| ,$$

where $f$ is in $C[a, b]$ (or is simply Lebesgue integrable). The subsequent discussion actually covers all of these cases, but we will settle for writing our proofs in the $\mathbb{R}^n$ setting only.

Lemma. (Young's inequality): Let $1 < p < \infty$, and let $1 < q < \infty$ be defined by $\frac{1}{p} + \frac{1}{q} = 1$; that is, $q = \frac{p}{p-1}$. Then, for any $a$, $b \ge 0$, we have

$$ab \le \frac{a^p}{p} + \frac{b^q}{q} .$$

Moreover, equality can only occur if $a^p = b^q$. (We refer to $p$ and $q$ as conjugate exponents; note that $p$ satisfies $p = \frac{q}{q-1}$. Please note that the case $p = q = 2$ yields the familiar arithmetic-geometric mean inequality.)

Proof. A quick calculation before we begin:

$$q - 1 = \frac{p}{p - 1} - 1 = \frac{p - (p - 1)}{p - 1} = \frac{1}{p - 1} .$$

Now we just estimate areas; for this you might find it helpful to draw the graph of $y = x^{p-1}$ (or, equivalently, the graph of $x = y^{q-1}$). Comparing areas we get:

$$ab \le \int_0^a x^{p-1} \, dx + \int_0^b y^{q-1} \, dy = \frac{a^p}{p} + \frac{b^q}{q} .$$

The case for equality also follows easily from the graph of $y = x^{p-1}$ (or $x = y^{q-1}$), since $b = a^{p-1} = a^{p/q}$ means that $a^p = b^q$.
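A numerical sanity check of the lemma, including its equality case; this sketch (mine) just compares the two sides:

```python
# Sketch: Young's inequality ab <= a^p/p + b^q/q with 1/p + 1/q = 1,
# checked at one generic point and at a point where a^p = b^q (equality).
def young_gap(a, b, p):
    """Right-hand side minus left-hand side; nonnegative by the lemma."""
    q = p / (p - 1.0)
    return a**p / p + b**q / q - a * b

gap = young_gap(2.0, 3.0, 1.5)   # strictly positive here, since a^p != b^q

a = 2.0
p = 3.0
q = p / (p - 1.0)
b = a ** (p / q)                 # chosen so that a^p = b^q (here b = 4)
eq_gap = young_gap(a, b, p)      # numerically zero: the equality case
```

Picking $b = a^{p/q}$ forces $a^p = b^q$, and the gap vanishes exactly as the lemma predicts.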

Corollary. (Hölder's inequality): Let $1 < p < \infty$, and let $1 < q < \infty$ be defined by $\frac{1}{p} + \frac{1}{q} = 1$. Then, for any $a_1, \ldots, a_n$ and $b_1, \ldots, b_n$ in $\mathbb{R}$ we have:

$$\sum_{i=1}^n |a_i b_i| \le \left( \sum_{i=1}^n |a_i|^p \right)^{1/p} \left( \sum_{i=1}^n |b_i|^q \right)^{1/q} .$$

(Please note that the case $p = q = 2$ yields the familiar Cauchy-Schwarz inequality.)

Moreover, equality in Hölder's inequality can only occur if there exist nonnegative scalars $\alpha$ and $\beta$ such that $\alpha \, |a_i|^p = \beta \, |b_i|^q$ for all $i = 1, \ldots, n$.

Proof. Let $A = \left( \sum_{i=1}^n |a_i|^p \right)^{1/p}$ and let $B = \left( \sum_{i=1}^n |b_i|^q \right)^{1/q}$. We may clearly assume that $A$, $B \ne 0$ (why?), and hence we may divide (and appeal to Young's inequality):

$$\frac{|a_i b_i|}{AB} \le \frac{|a_i|^p}{p A^p} + \frac{|b_i|^q}{q B^q} .$$

Adding, we get:

$$\frac{1}{AB} \sum_{i=1}^n |a_i b_i| \le \frac{1}{p A^p} \sum_{i=1}^n |a_i|^p + \frac{1}{q B^q} \sum_{i=1}^n |b_i|^q = \frac{1}{p} + \frac{1}{q} = 1 .$$

That is, $\sum_{i=1}^n |a_i b_i| \le AB$.

The case for equality in Hölder's inequality follows from what we know about Young's inequality: Equality in Hölder's inequality means that either $A = 0$, or $B = 0$, or else $|a_i|^p / A^p = |b_i|^q / B^q$ for all $i = 1, \ldots, n$. In short, there must exist nonnegative scalars $\alpha$ and $\beta$ such that $\alpha \, |a_i|^p = \beta \, |b_i|^q$ for all $i = 1, \ldots, n$.

Notice, too, that the case $p = 1$ ($q = \infty$) works, and is easy:

$$\sum_{i=1}^n |a_i b_i| \le \left( \sum_{i=1}^n |a_i| \right) \max_{1 \le i \le n} |b_i| .$$

Exercise

When does equality occur in the case $p = 1$ ($q = \infty$)?
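Hölder's inequality is also easy to test on concrete vectors; a sketch (mine), with the conjugate exponent computed from $p$:

```python
# Sketch: both sides of Holder's inequality sum |a_i b_i| <= ||a||_p ||b||_q
# for a concrete pair of vectors and a generic exponent p.
def holder_sides(a, b, p):
    q = p / (p - 1.0)   # conjugate exponent: 1/p + 1/q = 1
    lhs = sum(abs(x * y) for x, y in zip(a, b))
    rhs = (sum(abs(x)**p for x in a) ** (1.0 / p)
           * sum(abs(y)**q for y in b) ** (1.0 / q))
    return lhs, rhs

lhs, rhs = holder_sides([1.0, -2.0, 3.0], [4.0, 0.5, -1.0], 3.0)
```

Here the inequality is strict, as it must be: these two vectors fail the proportionality condition $\alpha \, |a_i|^p = \beta \, |b_i|^q$.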

Finally, an application of Hölder's inequality leads to an easy proof that $\| \cdot \|_p$ is actually a norm. It will help matters here if we first make a simple observation: If $1 < p < \infty$ and if $q = \frac{p}{p-1}$, notice that

$$\left\| \left( |a_i|^{p-1} \right)_{i=1}^n \right\|_q = \left( \sum_{i=1}^n |a_i|^p \right)^{(p-1)/p} = \|a\|_p^{\,p-1} .$$

Lemma. (Minkowski's inequality): Let $1 < p < \infty$ and let $a = (a_i)_{i=1}^n$, $b = (b_i)_{i=1}^n \in \mathbb{R}^n$. Then $\|a + b\|_p \le \|a\|_p + \|b\|_p$.

Proof. In order to prove the triangle inequality, we once again let $q$ be defined by $\frac{1}{p} + \frac{1}{q} = 1$, and now we use Hölder's inequality to estimate:

$$\begin{aligned}
\sum_{i=1}^n |a_i + b_i|^p &= \sum_{i=1}^n |a_i + b_i| \, |a_i + b_i|^{p-1} \\
&\le \sum_{i=1}^n |a_i| \, |a_i + b_i|^{p-1} + \sum_{i=1}^n |b_i| \, |a_i + b_i|^{p-1} \\
&\le \|a\|_p \left\| \left( |a_i + b_i|^{p-1} \right)_{i=1}^n \right\|_q + \|b\|_p \left\| \left( |a_i + b_i|^{p-1} \right)_{i=1}^n \right\|_q \\
&= \|a + b\|_p^{\,p-1} \left( \|a\|_p + \|b\|_p \right) .
\end{aligned}$$

That is, $\|a + b\|_p^p \le \|a + b\|_p^{\,p-1} \left( \|a\|_p + \|b\|_p \right)$, and the triangle inequality follows.

If $1 < p < \infty$, then equality in Minkowski's inequality can only occur if $a$ and $b$ are parallel; that is, the $\ell_p$-norm is strictly convex for $1 < p < \infty$. Indeed, if $\|a + b\|_p = \|a\|_p + \|b\|_p$, then either $a = 0$, or $b = 0$, or else $a$, $b \ne 0$ and we have equality at each stage of our proof. Now equality in the first inequality means that $|a_i + b_i| = |a_i| + |b_i|$, which easily implies that $a_i$ and $b_i$ have the same sign. Next, equality in our application of Hölder's inequality implies that there are nonnegative scalars $C$ and $D$ such that $|a_i|^p = C \, |a_i + b_i|^p$ and $|b_i|^p = D \, |a_i + b_i|^p$ for all $i = 1, \ldots, n$. Thus, $a_i = E \, b_i$ for some scalar $E$ and all $i = 1, \ldots, n$.

Of course, the triangle inequality also holds in either of the cases $p = 1$ or $p = \infty$ (with much simpler proofs).

Exercises

When does equality occur in the triangle inequality in the cases $p = 1$ or $p = \infty$? In particular, show that neither of the norms $\| \cdot \|_1$ nor $\| \cdot \|_\infty$ is strictly convex.
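As with the earlier inequalities, Minkowski's inequality and its equality case for parallel vectors are easy to probe numerically; a sketch (mine):

```python
# Sketch: Minkowski's inequality ||a + b||_p <= ||a||_p + ||b||_p, with
# strict inequality for non-parallel vectors and equality for parallel ones.
def lp(x, p):
    return sum(abs(t)**p for t in x) ** (1.0 / p)

p = 2.5
a = [1.0, -2.0, 3.0]
b = [0.5, 4.0, -1.0]                          # not a multiple of a
s = [x + y for x, y in zip(a, b)]
slack = lp(a, p) + lp(b, p) - lp(s, p)        # strictly positive

c = [2.0 * x for x in a]                      # c parallel to a
s2 = [x + y for x, y in zip(a, c)]
eq_slack = lp(a, p) + lp(c, p) - lp(s2, p)    # numerically zero
```

With $c = 2a$ we have $\|a + c\|_p = 3\|a\|_p = \|a\|_p + \|c\|_p$ by homogeneity, so the slack vanishes exactly for parallel vectors.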

Appendix B

Next, we provide a brief review of completeness and compactness. Such a review is doomed to inadequacy; the reader unfamiliar with these concepts would be well served to consult a text on advanced calculus such as Analysis in Euclidean Space by K. Hoffman, or Principles of Mathematical Analysis by W. Rudin.

To begin, we recall that a subset $A$ of a normed space $X$ (such as $\mathbb{R}$ or $\mathbb{R}^n$) is said to be closed if $A$ is closed under the taking of sequential limits. That is, $A$ is closed if, whenever $(a_n)$ is a sequence from $A$ converging to some point $x \in X$, we always have $x \in A$. It's not hard to see that any closed interval, such as $[a, b]$ or $[a, \infty)$, is, indeed, a closed subset of $\mathbb{R}$ in this sense. There are, however, much more complicated examples of closed sets in $\mathbb{R}$.

A normed space $X$ is said to be complete if every Cauchy sequence from $X$ converges (to a point in $X$). It is a familiar fact from Calculus that $\mathbb{R}$ is complete, as is $\mathbb{R}^n$. In fact, the completeness of $\mathbb{R}$ is often assumed as an axiom (in the form of the least upper bound axiom). There are, however, many examples of normed spaces which are not complete; that is, there are examples of normed spaces in which Cauchy sequences need not converge.

We say that a subset $A$ of a normed space $X$ is complete if every Cauchy sequence from $A$ converges to a point in $A$. Please note here that we require not only that Cauchy sequences from $A$ converge, but also that the limit be back in $A$. As you might imagine, the completeness of $A$ depends on properties of both $A$ and the containing space $X$.

First note that a complete subset is necessarily also closed; indeed, since every convergent sequence is also Cauchy, a complete subset must contain the limit of each of its convergent sequences.

Exercise

If $A$ is a complete subset of a normed space $X$, show that $A$ is also closed.

If the containing space $X$ is itself complete, then it's easy to tell which of its subsets are complete. Indeed, since every Cauchy sequence in $X$ converges (somewhere), all we need to know is whether the subset is closed.

Exercise

Let $A$ be a subset of a complete normed space $X$. Show that $A$ is complete if and only if $A$ is a closed subset of $X$. In particular, please note that every closed subset of $\mathbb{R}$ (or $\mathbb{R}^n$) is complete.

Finally, we recall that a subset $A$ of a normed space $X$ is said to be compact if every sequence from $A$ has a subsequence which converges to a point in $A$. Again, since we have insisted that certain limits remain in $A$, it's not hard to see that compact sets are necessarily also closed.

Exercise

If $A$ is a compact subset of a normed space $X$, show that $A$ is also closed.

Moreover, since a Cauchy sequence with a convergent subsequence must itself converge (why?), we actually have that every compact set is necessarily complete.

Exercise

If $A$ is a compact subset of a normed space $X$, show that $A$ is also complete.

Since the compactness of a subset $A$ has something to do with every sequence in $A$, it's not hard to believe that it is a more stringent property than the others we've considered so far. In particular, it's not hard to see that a compact set must be bounded.

Exercise

If $A$ is a compact subset of a normed space $X$, show that $A$ is also bounded. [Hint: If not, then $A$ would contain a sequence $(a_n)$ with $\|a_n\| \to \infty$.]

Now it is generally not so easy to describe the compact subsets of a particular normed space $X$; however, it is quite easy to describe the compact subsets of $\mathbb{R}$ (or $\mathbb{R}^n$). This well-known result goes by many names; we will refer to it as the Heine-Borel theorem.

Theorem. A subset $A$ of $\mathbb{R}$ (or $\mathbb{R}^n$) is compact if and only if $A$ is both closed and bounded.

Proof. One direction of the proof is easy: As we've already seen, compact sets in $\mathbb{R}$ are necessarily closed and bounded. For the other direction, notice that if $A$ is a bounded subset of $\mathbb{R}$, then it follows from the Bolzano-Weierstrass theorem that every sequence from $A$ has a subsequence which converges in $\mathbb{R}$. If $A$ is also a closed set, then this limit must, in fact, be back in $A$. Thus, every sequence in $A$ has a subsequence converging to a point in $A$.

Appendix C

We next offer a brief review of pointwise and uniform convergence. We begin with an elementary example:

Example

(a) For each $n = 1, 2, 3, \ldots$, consider the function $f_n(x) = e^x + \frac{x}{n}$ for $x \in \mathbb{R}$. Note that for each (fixed) $x$ the sequence $(f_n(x))_{n=1}^\infty$ converges to $f(x) = e^x$ because

$$|f_n(x) - f(x)| = \frac{|x|}{n} \to 0 \quad \text{as } n \to \infty .$$

In this case we say that the sequence of functions $(f_n)$ converges pointwise to the function $f$ on $\mathbb{R}$. But notice, too, that the rate of convergence depends on $x$. In particular, in order to get $|f_n(x) - f(x)| < 1/2$ we would need to take $n > 2|x|$. Thus, at $x = 2$, the inequality is satisfied for all $n > 4$, while at $x = 1000$, the inequality is satisfied only for $n > 2000$. In short, the rate of convergence is not uniform in $x$.

(b) Consider the same sequence of functions as above, but now let's suppose that we restrict the values of $x$ to the interval $[-5, 5]$. Of course, we still have that $f_n(x) \to f(x)$ for each (fixed) $x$ in $[-5, 5]$; in other words, we still have that $(f_n)$ converges pointwise to $f$ on $[-5, 5]$. But notice that the rate of convergence is now uniform over $x$ in $[-5, 5]$. To see this, just rewrite the initial calculation:

$$|f_n(x) - f(x)| = \frac{|x|}{n} \le \frac{5}{n} \quad \text{for } x \in [-5, 5],$$

and notice that the upper bound $5/n$ tends to 0, as $n \to \infty$, independent of the choice of $x$. In this case, we say that $(f_n)$ converges uniformly to $f$ on $[-5, 5]$. The point here is that the notion of uniform convergence depends on the underlying domain as well as on the sequence of functions at hand.
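The contrast between (a) and (b) can be made concrete by computing the worst-case error on a grid; a sketch (mine), using the fact derived above that the error is exactly $|x|/n$:

```python
# Sketch: for f_n(x) = exp(x) + x/n and f(x) = exp(x), the error is |x|/n.
# On [-5, 5] the worst case is 5/n, which shrinks with n (uniform
# convergence); far from the origin the error stays large for fixed n.
def err(x, n):
    """|f_n(x) - f(x)| = |x|/n, computed from the closed form above."""
    return abs(x) / n

xs = [i / 10.0 for i in range(-50, 51)]       # grid on [-5, 5]
sup_err_10 = max(err(x, 10) for x in xs)      # 5/10  = 0.5
sup_err_1000 = max(err(x, 1000) for x in xs)  # 5/1000 = 0.005
far_err = err(1000.0, 10)                     # 100: huge at x = 1000, n = 10
```

The supremum over $[-5, 5]$ is driven to 0 by taking $n$ large, while no single $n$ controls the error over all of $\mathbb{R}$, which is precisely the distinction between uniform and merely pointwise convergence.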

With this example in mind, we now offer formal definitions of pointwise and uniform convergence. In both cases we consider a sequence of functions $f_n : X \to \mathbb{R}$, $n = 1, 2, 3, \ldots$, each defined on the same underlying set $X$, and another function $f : X \to \mathbb{R}$ (the candidate for the limit).

We say that $(f_n)$ converges pointwise to $f$ on $X$ if, for each $x \in X$, we have $f_n(x) \to f(x)$ as $n \to \infty$; thus, for each $x \in X$ and each $\varepsilon > 0$, we can find an integer $N$ (which depends on $\varepsilon$ and which may also depend on $x$) such that $|f_n(x) - f(x)| < \varepsilon$ whenever $n > N$. A convenient shorthand for pointwise convergence is: $f_n \to f$ on $X$ or, if $X$ is understood, simply $f_n \to f$.

We say that $(f_n)$ converges uniformly to $f$ on $X$ if, for each $\varepsilon > 0$, we can find an integer $N$ (which depends on $\varepsilon$ but not on $x$) such that $|f_n(x) - f(x)| < \varepsilon$ for each $x \in X$, provided that $n > N$. Please notice that the phrase "for each $x \in X$" now occurs well after the phrase "for each $\varepsilon > 0$" and, in particular, that the rate of convergence $N$ does not depend on $x$. It should be reasonably clear that uniform convergence implies pointwise convergence; in other words, uniform convergence is "stronger" than pointwise convergence. For this reason, we sometimes use the shorthand: $f_n \rightrightarrows f$ on $X$ or, if $X$ is understood, simply $f_n \rightrightarrows f$.

The definition of uniform convergence can be simplified by "hiding" one of the quantifiers under different notation; indeed, note that the phrase "$|f_n(x) - f(x)| < \varepsilon$ for any $x \in X$" is (essentially) equivalent to the phrase "$\sup_{x \in X} |f_n(x) - f(x)| < \varepsilon$." Thus, our definition may be reworded as follows: $(f_n)$ converges uniformly to $f$ on $X$ if, given $\varepsilon > 0$, there is an integer $N$ such that $\sup_{x \in X} |f_n(x) - f(x)| < \varepsilon$ for all $n > N$.

The notion of uniform convergence exists for one very good reason: Continuity is preserved under uniform limits. This fact is well worth stating.

Exercise

Let $X$ be a subset of $\mathbb{R}$, let $f$, $f_n : X \to \mathbb{R}$ for $n = 1, 2, 3, \ldots$, and let $x_0 \in X$. If each $f_n$ is continuous at $x_0$, and if $f_n \rightrightarrows f$ on $X$, then $f$ is continuous at $x_0$. In particular, if each $f_n$ is continuous on all of $X$, then so is $f$. Give an example showing that this result may fail if we only assume that $f_n \to f$ on $X$.

Appendix D

Lastly, we discuss continuity for linear transformations between normed vector spaces. Throughout this section, we consider a linear map $T : V \to W$ between vector spaces $V$ and $W$; that is, we suppose that $T$ satisfies $T(\alpha x + \beta y) = \alpha \, T(x) + \beta \, T(y)$ for all $x$, $y \in V$ and all scalars $\alpha$, $\beta$. Please note that every linear map $T$ satisfies $T(0) = 0$. If we further suppose that $V$ is endowed with the norm $\| \cdot \|$, and that $W$ is endowed with the norm $||| \cdot |||$, then we may consider the issue of continuity of the map $T$.

The key result for our purposes is that, for linear maps, continuity (even at a single point) is equivalent to uniform continuity (and then some!).

Theorem. Let (V, ‖·‖) and (W, |||·|||) be normed vector spaces, and let T : V → W be a linear map. Then, the following are equivalent:

(i) T is Lipschitz;
(ii) T is uniformly continuous;
(iii) T is continuous (everywhere);
(iv) T is continuous at 0 ∈ V;
(v) there is a constant C < ∞ such that |||T(x)||| ≤ C‖x‖ for all x ∈ V.

Proof. Clearly, (i) ⟹ (ii) ⟹ (iii) ⟹ (iv). We need to show that (iv) ⟹ (v), and that (v) ⟹ (i) (for example). The second of these is easier, so let's start there.

(v) ⟹ (i): If condition (v) holds for a linear map T, then T is Lipschitz (with constant C) since |||T(x) − T(y)||| = |||T(x − y)||| ≤ C‖x − y‖ for any x, y ∈ V.

(iv) ⟹ (v): Suppose that T is continuous at 0. Then we may choose a δ > 0 so that |||T(x)||| = |||T(x) − T(0)||| ≤ 1 whenever ‖x‖ = ‖x − 0‖ ≤ δ. (How?) Given 0 ≠ x ∈ V, we may scale by the factor δ/‖x‖ to get ‖ δx/‖x‖ ‖ = δ. Hence, |||T(δx/‖x‖)||| ≤ 1. But T(δx/‖x‖) = (δ/‖x‖) T(x), since T is linear, and so we get |||T(x)||| ≤ (1/δ)‖x‖. That is, C = 1/δ works in condition (v). (Note that since condition (v) is trivial for x = 0, we only care about the case x ≠ 0.)

A linear map satisfying condition (v) of the Theorem (i.e., a continuous linear map) is often said to be bounded. The meaning in this context is slightly different than usual. Here it means that T maps bounded sets to bounded sets. This follows from the fact that T is Lipschitz. Indeed, if |||T(x)||| ≤ C‖x‖ for all x ∈ V, then (as we've seen) |||T(x) − T(y)||| ≤ C‖x − y‖ for any x, y ∈ V, and hence T maps the ball about x of radius r into the ball about T(x) of radius Cr. In symbols, T(B_r(x)) ⊂ B_{Cr}(T(x)). More generally, T maps a set of diameter d into a set of diameter at most Cd. There's no danger of confusion in our using the word bounded to mean something new here; the ordinary usage of the word (as applied to functions) is uninteresting for linear maps. A nonzero linear map always has an unbounded range. (Why?)

The smallest constant that works in (v) is called the norm of the operator T and is usually written ‖T‖. In symbols,

‖T‖ = sup_{x≠0} |||T(x)||| / ‖x‖ = sup_{‖x‖=1} |||T(x)|||.

Thus, T is bounded (continuous) if and only if ‖T‖ < ∞.


The fact that all norms on a finite-dimensional normed space are equivalent provides a final (rather spectacular) corollary.

Corollary. Let V and W be normed vector spaces with V finite-dimensional. Then, every linear map T : V → W is continuous.

Proof. Let x1, …, xn be a basis for V and let ‖ Σ_{i=1}^n α_i x_i ‖_1 = Σ_{i=1}^n |α_i|, as before. From the Lemma on page 3, we know that there is a constant B < ∞ such that ‖x‖_1 ≤ B‖x‖ for every x ∈ V.

Now if T : (V, ‖·‖) → (W, |||·|||) is linear, we get

|||T(Σ_{i=1}^n α_i x_i)||| = |||Σ_{i=1}^n α_i T(x_i)|||
  ≤ Σ_{i=1}^n |α_i| |||T(x_i)|||
  ≤ max_{1≤j≤n} |||T(x_j)||| · Σ_{i=1}^n |α_i|
  ≤ B · max_{1≤j≤n} |||T(x_j)||| · ‖ Σ_{i=1}^n α_i x_i ‖.

That is, |||T(x)||| ≤ C‖x‖, where C = B max_{1≤j≤n} |||T(x_j)||| (a constant depending only on T and the choice of basis for V). From our last result, T is continuous (bounded).

Problem Set: Function Spaces

Math 682 5/18/98

[Problems marked (▷) are essential to a full understanding of the course; we will discuss most of these in class. Problems marked (∗) are of general interest and are offered as a contribution to your personal growth. Unmarked problems are just for fun.]

The most important collection of functions for our purposes is the space C[a, b], consisting of all continuous functions f : [a, b] → R. It's easy to see that C[a, b] is a vector space under the usual pointwise operations on functions: (f + g)(x) = f(x) + g(x) and (αf)(x) = αf(x) for α ∈ R. Actually, we will be most interested in the finite-dimensional subspaces Pn of C[a, b], consisting of all algebraic polynomials of degree at most n.

▷ 1. The subspace Pn has dimension exactly n + 1. Why?

Another useful subset of C[a, b] is the collection lip_K α, consisting of all those f's which satisfy a Lipschitz condition of order α > 0 with constant 0 < K < ∞; i.e., those f's for which |f(x) − f(y)| ≤ K|x − y|^α for all x, y in [a, b]. [Some authors would say that f is Hölder continuous with exponent α.]

2. (a) Show that lip_K α is, indeed, a subset of C[a, b].

(b) If α > 1, show that lip_K α contains only the constant functions.

(c) Show that √x is in lip_1(1/2) and that sin x is in lip_1 1 on [0, 1].

(d) Show that the collection lip α, consisting of all those f's which are in lip_K α for some K, is a subspace of C[a, b].

(e) Show that lip 1 contains all the polynomials.

(f) If f ∈ lip α for some α > 0, show that f ∈ lip β for all 0 < β < α.

(g) Given 0 < α < 1, show that x^α is in lip_1 α on [0, 1] but not in lip β for any β > α.

We will also want to consider a norm on the vector space C[a, b]; we typically use the uniform or sup norm (Rivlin calls this the Chebyshev norm) defined by ‖f‖ = max_{a≤x≤b} |f(x)|. [Some authors write ‖f‖_u or ‖f‖_∞.]

3. Show that Pn and lip_K α are closed subsets of C[a, b] (under the sup norm). Is lip α closed? A bit harder: Show that lip 1 is both first category and dense in C[a, b].

P

. 4. Fix n and consider the norm kpk1 = n jak j for p(x) = a0 + a1x + + anxn 2 Pn.

k=0

Function Spaces 20

Show that there are constants 0 < An Bn < 1 such that Ankpk1 kpk Bnkpk1,

where kpk = max jp(x)j. Do An and Bn really depend on n ?

axb

We will occasionally consider spaces of real-valued functions defined on finite sets; that is, we will consider R^n under various norms. (Why is this the same?) We define a scale of norms on R^n by ‖x‖_p = (Σ_{i=1}^n |x_i|^p)^{1/p}, where x = (x_1, …, x_n) and 1 ≤ p < ∞ (we need p ≥ 1 in order for this expression to be a legitimate norm, but the expression makes perfect sense for any p > 0, and even for p < 0 provided no x_i is 0). Notice, please, that the usual norm on R^n is given by ‖x‖_2.

5. Show that lim_{p→∞} ‖x‖_p = max_{1≤i≤n} |x_i|. For this reason we define ‖x‖_∞ = max_{1≤i≤n} |x_i|. Thus R^n under the norm ‖·‖_∞ is the same as C({1, 2, …, n}) with its usual norm.

6. Assuming x_i ≠ 0 for i = 1, …, n, compute lim_{p→0+} ‖x‖_p and lim_{p→−∞} ‖x‖_p.

7. Consider R² under the norm ‖x‖_p. Draw the graph of the unit sphere {x : ‖x‖_p = 1} for various values of p (especially p = 1, 2, ∞).
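For a quick feel for Problem 5, the limit can be checked numerically; the helper and the sample vector below are our own, not part of the problem set.

```python
# ||x||_p = (sum |x_i|^p)^(1/p); as p grows this decreases toward max |x_i|.

def p_norm(x, p):
    """The p-norm of a tuple of reals, for p > 0."""
    return sum(abs(t)**p for t in x) ** (1.0 / p)

x = (1.0, -3.0, 2.0)     # here max |x_i| = 3, and ||x||_1 = 6
```

For this x, p_norm(x, 1) = 6, p_norm(x, 2) = √14 ≈ 3.74, and p_norm(x, 50) is already within 10⁻⁹ of the limiting value 3.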

8. (Young's inequality): Let 1 < p < ∞ and let q satisfy 1/p + 1/q = 1. Show that ab ≤ a^p/p + b^q/q for all a, b ≥ 0, with equality if and only if a^p = b^q.

9. (Hölder's inequality): Let 1 < p < ∞ and let q satisfy 1/p + 1/q = 1. Show that

(a) Σ_{i=1}^n |a_i b_i| ≤ (Σ_{i=1}^n |a_i|^p)^{1/p} (Σ_{i=1}^n |b_i|^q)^{1/q}, and

(b) ∫_a^b |f(x) g(x)| dx ≤ (∫_a^b |f(x)|^p dx)^{1/p} (∫_a^b |g(x)|^q dx)^{1/q}.

Describe the case for equality in each inequality. What happens if p = 1 (q = ∞)?

10. (Minkowski's inequality): For 1 ≤ p < ∞, show that

(a) (Σ_{i=1}^n |a_i + b_i|^p)^{1/p} ≤ (Σ_{i=1}^n |a_i|^p)^{1/p} + (Σ_{i=1}^n |b_i|^p)^{1/p}, and that

(b) (∫_a^b |f(x) + g(x)|^p dx)^{1/p} ≤ (∫_a^b |f(x)|^p dx)^{1/p} + (∫_a^b |g(x)|^p dx)^{1/p}.

Describe the case for equality in each inequality. What happens if p = 1?

Exercise 10 shows that ‖·‖_p is indeed a norm for 1 ≤ p < ∞. We write Lp[a, b] to mean the vector space of functions on [a, b] for which the integral norm is defined and finite; we write ℓ_p^n to mean the vector space of sequences of length n, that is, R^n supplied with the norm ‖·‖_p; and we write ℓ_p to mean the vector space of infinite sequences x = (x_n)_{n=1}^∞ for which ‖x‖_p < ∞. In each space, the usual algebraic operations are defined pointwise (or coordinatewise) and the norm is understood to be ‖·‖_p.

A normed space (X, ‖·‖) is said to be strictly convex if ‖x + y‖ = ‖x‖ + ‖y‖ always implies that x and y lie in the same direction; that is, either x = λy or y = λx for some nonnegative scalar λ. Equivalently, X is strictly convex if the triangle inequality is always strict on nonparallel vectors.

11. Prove that the following are equivalent:

(a) (X, ‖·‖) is strictly convex.

(b) If x, y ∈ X are nonparallel, then ‖(x + y)/2‖ < (‖x‖ + ‖y‖)/2.

(c) If x ≠ y ∈ X with ‖x‖ = 1 = ‖y‖, then ‖(x + y)/2‖ < 1.

12. Show that Lp and ℓ_p are strictly convex for 1 < p < ∞. Show also that this fails in case p = 1. [Hint: This is actually a statement about the function |t|^p, 1 < p < ∞.]

Strictly convex spaces are of interest when considering the problem of nearest points: Given a nonempty subset K of a normed space X and a point x ∉ K, we ask whether there is a best approximation to x from elements of K; that is, we want to know if there exist one or more points y_0 ∈ K satisfying

‖x − y_0‖ = inf_{y∈K} ‖x − y‖ = dist(x, K).

It's not hard to see that a satisfactory answer to this question will require that we take K to be a closed set in X (for otherwise the points in K̄ \ K wouldn't have nearest points). Less easy to see is that we typically also want to assume that K is a convex set. Recall that a subset K of a vector space X is said to be convex if it contains the line segment joining any pair of its points; that is, K is convex if

x, y ∈ K, 0 ≤ λ ≤ 1 ⟹ λx + (1 − λ)y ∈ K.

Obviously, any subspace of X is a convex set and, for our purposes at least, this is the most important example.

13. Let X be a normed space and let B = {x ∈ X : ‖x‖ ≤ 1}. Show that B is a closed convex set.


14. Consider R² under the norm ‖·‖_1. Let B = {y ∈ R² : ‖y‖_1 ≤ 1} and let x = (2, 0). Show that there are infinitely many points in B nearest to x.

15. (a) Let K = {f ∈ L_1[0, 1] : f ≥ 0 and ‖f‖_1 = 1}. Show that K is a closed convex set in L_1[0, 1], that 0 ∉ K, and that every point in K is a nearest point to 0.

(b) Let K = {f ∈ C[0, 1] : f(0) = 0 and ∫_0^1 f = 1}. Again, show that K is a closed convex set in C[0, 1], that 0 ∉ K, but that no point in K is nearest to 0.

16. Let K be a compact convex set in a strictly convex space X and let x ∈ X. Show that x has a unique nearest point y_0 ∈ K.

17. Let K be a closed subset of a complete normed space X. Prove that K is convex if and only if K is midpoint convex; that is, if and only if (x + y)/2 ∈ K whenever x, y ∈ K. Is this result true in more general settings? For example, can you prove it without assuming completeness? Or, for that matter, is it true for arbitrary sets in any vector space (i.e., without even assuming the presence of a norm)?

Approximation by Algebraic Polynomials

Math 682 5/20/98

Introduction

Let's begin with some notation. Throughout, we're concerned with the problem of best (uniform) approximation of a given function f ∈ C[a, b] by elements from Pn, the subspace of algebraic polynomials of degree at most n in C[a, b]. We know that the problem has a solution (possibly more than one), which we've chosen to write as pn. We set

En(f) = min_{p∈Pn} ‖f − p‖ = ‖f − pn‖.

Since Pn ⊂ Pn+1 for each n, it's clear that En(f) ≥ En+1(f) for each n. Our goal in this chapter is to prove that En(f) → 0. We'll accomplish this by proving:

Theorem. (The Weierstrass Approximation Theorem, 1885): Let f ∈ C[a, b]. Then, for every ε > 0, there is a polynomial p such that ‖f − p‖ < ε.

It follows from the Weierstrass theorem that pn ⇉ f for each f ∈ C[a, b]. (Why?) This is an important first step in determining the exact nature of En(f) as a function of f and n. We'll look for much more precise information in later sections.

Now there are many proofs of the Weierstrass theorem (a mere three are outlined in the exercises, but there are hundreds!), and all of them start with one simplification: The underlying interval [a, b] is of no consequence.

Lemma. If the Weierstrass theorem holds for C[0, 1], then it also holds for C[a, b], and conversely. In fact, C[0, 1] and C[a, b] are, for all practical purposes, identical: They are linearly isometric as normed spaces, order isomorphic as lattices, and isomorphic as algebras (rings).

Proof. We'll settle for proving only the first assertion; the second is outlined in the exercises (and uses a similar argument).

Given f ∈ C[a, b], notice that the function

g(x) = f(a + (b − a)x), 0 ≤ x ≤ 1,

defines an element of C[0, 1]. Now, given ε > 0, suppose that we can find a polynomial p such that ‖g − p‖ < ε; in other words, suppose that

max_{0≤x≤1} | f(a + (b − a)x) − p(x) | < ε.

Then,

max_{a≤t≤b} | f(t) − p((t − a)/(b − a)) | < ε.

(Why?) But if p(x) is a polynomial in x, then q(t) = p((t − a)/(b − a)) is a polynomial in t (again, why?) satisfying ‖f − q‖ < ε.

The proof of the converse is entirely similar: If g(x) is an element of C[0, 1], then f(t) = g((t − a)/(b − a)), a ≤ t ≤ b, defines an element of C[a, b]. Moreover, if q(t) is a polynomial in t approximating f(t), then p(x) = q(a + (b − a)x) is a polynomial in x approximating g(x). The remaining details are left as an exercise.
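The change of variables in this proof is a one-liner in code; the helper names below are our own, and the check simply confirms that transporting f to [0, 1] and back is the identity.

```python
# Transporting functions between C[a,b] and C[0,1] via the affine maps
# x -> a + (b - a)x and t -> (t - a)/(b - a), as in the Lemma.

def to_unit(f, a, b):
    """g(x) = f(a + (b - a)x): carries f in C[a,b] to an element of C[0,1]."""
    return lambda x: f(a + (b - a) * x)

def from_unit(g, a, b):
    """f(t) = g((t - a)/(b - a)): carries g in C[0,1] back to C[a,b].
    Composing a polynomial with an affine map gives a polynomial of the
    same degree, which is the point of the Lemma."""
    return lambda t: g((t - a) / (b - a))
```

For example, an approximation q to to_unit(f, a, b) on [0, 1] yields the approximation from_unit(q, a, b) to f on [a, b], with the same uniform error.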

The point to our first result is that it suffices to prove the Weierstrass theorem for any interval we like; [0, 1] and [−1, 1] are popular choices, but it hardly matters which interval we use.

Bernstein's Proof

The proof of the Weierstrass theorem we present here is due to the great Russian mathematician S. N. Bernstein in 1912. Bernstein's proof is of interest to us for a variety of reasons; perhaps most important is that Bernstein actually displays a sequence of polynomials that approximate a given f ∈ C[0, 1]. Moreover, as we'll see later, Bernstein's proof generalizes to yield a powerful, unifying theorem, called the Bohman-Korovkin theorem.

If f is any bounded function on [0, 1], we define the sequence of Bernstein polynomials for f by

(Bn(f))(x) = Σ_{k=0}^n f(k/n) (n choose k) x^k (1 − x)^{n−k}, 0 ≤ x ≤ 1.

Please note that Bn(f) is a polynomial of degree at most n. Also, it's easy to see that (Bn(f))(0) = f(0), and (Bn(f))(1) = f(1). In general, (Bn(f))(x) is an average of the numbers f(k/n), k = 0, …, n. Bernstein's theorem states that Bn(f) ⇉ f for each f ∈ C[0, 1]. Surprisingly, the proof actually only requires that we check three easy cases:

f0(x) = 1, f1(x) = x, and f2(x) = x².


This, and more, is the content of the following lemma.

Lemma. (i) Bn(f0) = f0 and Bn(f1) = f1.

(ii) Bn(f2) = (1 − 1/n) f2 + (1/n) f1, and hence Bn(f2) ⇉ f2.

(iii) Σ_{k=0}^n ((k/n) − x)² (n choose k) x^k (1 − x)^{n−k} = x(1 − x)/n ≤ 1/(4n), if 0 ≤ x ≤ 1.

(iv) Given δ > 0 and 0 ≤ x ≤ 1, let F denote the set of k's in {0, …, n} for which |(k/n) − x| ≥ δ. Then Σ_{k∈F} (n choose k) x^k (1 − x)^{n−k} ≤ 1/(4nδ²).

Proof. That Bn(f0) = f0 follows from the binomial formula:

Σ_{k=0}^n (n choose k) x^k (1 − x)^{n−k} = [x + (1 − x)]^n = 1.

To see that Bn(f1) = f1, first notice that for k ≥ 1 we have

(k/n) (n choose k) = (n − 1)! / ((k − 1)! (n − k)!) = (n−1 choose k−1).

Consequently,

Σ_{k=0}^n (k/n) (n choose k) x^k (1 − x)^{n−k} = x Σ_{k=1}^n (n−1 choose k−1) x^{k−1} (1 − x)^{n−k}
  = x Σ_{j=0}^{n−1} (n−1 choose j) x^j (1 − x)^{(n−1)−j} = x.

Next, to compute Bn(f2), we rewrite twice:

(k/n)² (n choose k) = (k/n) (n−1 choose k−1) = ((n−1)/n) ((k−1)/(n−1)) (n−1 choose k−1) + (1/n) (n−1 choose k−1), if k ≥ 1
  = (1 − 1/n) (n−2 choose k−2) + (1/n) (n−1 choose k−1), if k ≥ 2.

Thus,

Σ_{k=0}^n (k/n)² (n choose k) x^k (1 − x)^{n−k}
  = (1 − 1/n) Σ_{k=2}^n (n−2 choose k−2) x^k (1 − x)^{n−k} + (1/n) Σ_{k=1}^n (n−1 choose k−1) x^k (1 − x)^{n−k}
  = (1 − 1/n) x² + (1/n) x,

which establishes (ii) since ‖Bn(f2) − f2‖ = (1/n) ‖f1 − f2‖ → 0 as n → ∞.

To prove (iii) we combine the results in (i) and (ii) and simplify. Since ((k/n) − x)² = (k/n)² − 2x(k/n) + x², we get

Σ_{k=0}^n ((k/n) − x)² (n choose k) x^k (1 − x)^{n−k} = (1 − 1/n) x² + (1/n) x − 2x² + x²
  = (1/n) x(1 − x) ≤ 1/(4n)

for 0 ≤ x ≤ 1.

Finally, to prove (iv), note that 1 ≤ ((k/n) − x)²/δ² for k ∈ F, and hence

Σ_{k∈F} (n choose k) x^k (1 − x)^{n−k} ≤ (1/δ²) Σ_{k∈F} ((k/n) − x)² (n choose k) x^k (1 − x)^{n−k}
  ≤ (1/δ²) Σ_{k=0}^n ((k/n) − x)² (n choose k) x^k (1 − x)^{n−k}
  ≤ 1/(4nδ²), from (iii).

Now we're ready for the proof of Bernstein's theorem:

Proof. Let f ∈ C[0, 1] and let ε > 0. Then, since f is uniformly continuous, there is a δ > 0 such that |f(x) − f(y)| < ε/2 whenever |x − y| < δ. Now we use the previous lemma to estimate ‖f − Bn(f)‖. First notice that since the numbers (n choose k) x^k (1 − x)^{n−k} are nonnegative and sum to 1, we have

|f(x) − Bn(f)(x)| = | f(x) − Σ_{k=0}^n f(k/n) (n choose k) x^k (1 − x)^{n−k} |
  = | Σ_{k=0}^n (f(x) − f(k/n)) (n choose k) x^k (1 − x)^{n−k} |
  ≤ Σ_{k=0}^n |f(x) − f(k/n)| (n choose k) x^k (1 − x)^{n−k}.

Now fix n (to be specified in a moment) and let F denote the set of k's in {0, …, n} for which |(k/n) − x| ≥ δ. Then |f(x) − f(k/n)| < ε/2 for k ∉ F, while |f(x) − f(k/n)| ≤ 2‖f‖ for k ∈ F. Thus,

|f(x) − (Bn(f))(x)| ≤ (ε/2) Σ_{k∉F} (n choose k) x^k (1 − x)^{n−k} + 2‖f‖ Σ_{k∈F} (n choose k) x^k (1 − x)^{n−k}
  < ε/2 + 2‖f‖ · (1/(4nδ²)), from (iv) of the Lemma,
  < ε, provided that n > ‖f‖/(εδ²).

Landau's Proof

Just because it's good for us, let's give a second proof of Weierstrass's theorem. This one is due to Landau in 1908. First, given f ∈ C[0, 1], notice that it suffices to approximate f − p, where p is any polynomial. (Why?) In particular, by subtracting the linear function f(0) + x(f(1) − f(0)), we may suppose that f(0) = f(1) = 0 and, hence, that f ≡ 0 outside [0, 1]. That is, we may suppose that f is defined and uniformly continuous on all of R.

Again we will display a sequence of polynomials that converge uniformly to f; this time we define

Ln(x) = cn ∫_{−1}^{1} f(x + t) (1 − t²)^n dt,

where cn is chosen so that

cn ∫_{−1}^{1} (1 − t²)^n dt = 1.

Note that by our assumptions on f, we may rewrite this expression as

Ln(x) = cn ∫_{−x}^{1−x} f(x + t) (1 − t²)^n dt = cn ∫_0^1 f(t) (1 − (t − x)²)^n dt.

Written this way, it's clear that Ln is a polynomial in x of degree at most 2n.

We first need to estimate cn. An easy induction argument will convince you that (1 − t²)^n ≥ 1 − nt², and so we get

∫_{−1}^{1} (1 − t²)^n dt ≥ 2 ∫_0^{1/√n} (1 − nt²) dt = 4/(3√n) > 1/√n,

from which it follows that cn < √n. In particular, for any 0 < δ < 1,

cn ∫_{δ}^{1} (1 − t²)^n dt < √n (1 − δ²)^n → 0 (n → ∞),

which is the inequality we'll need.

Next, let ε > 0 be given, and choose 0 < δ < 1 such that

|f(x) − f(y)| ≤ ε/2 whenever |x − y| ≤ δ.

Then, since cn(1 − t²)^n ≥ 0 and integrates to 1, we get

|Ln(x) − f(x)| = | cn ∫_{−1}^{1} (f(x + t) − f(x)) (1 − t²)^n dt |
  ≤ cn ∫_{−1}^{1} |f(x + t) − f(x)| (1 − t²)^n dt
  ≤ (ε/2) cn ∫_{−δ}^{δ} (1 − t²)^n dt + 2‖f‖ cn ∫_{δ≤|t|≤1} (1 − t²)^n dt
  ≤ ε/2 + 4‖f‖ √n (1 − δ²)^n < ε,

provided that n is sufficiently large.
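Landau's polynomials can also be computed numerically. The sketch below is our own discretization, not from the text: it approximates both integrals with midpoint Riemann sums, for an f satisfying f(0) = f(1) = 0 as the proof assumes.

```python
# Numerical Landau approximation: L_n(x) ~ c_n * int_0^1 f(t)(1-(t-x)^2)^n dt,
# where c_n normalizes int_{-1}^1 (1-t^2)^n dt to 1.

def landau(f, n, x, steps=2000):
    """Midpoint-rule approximation to L_n(x); f must vanish at 0 and 1."""
    h = 2.0 / steps
    mass = sum((1 - (-1 + (i + 0.5) * h) ** 2) ** n for i in range(steps)) * h
    cn = 1.0 / mass                     # normalizes the kernel on [-1, 1]
    h2 = 1.0 / steps
    return cn * sum(f((i + 0.5) * h2) * (1 - ((i + 0.5) * h2 - x) ** 2) ** n
                    for i in range(steps)) * h2
```

The kernel cn(1 − t²)^n concentrates near t = 0 at rate roughly 1/√n, so for moderate n the value landau(f, n, x) already sits close to f(x) away from the endpoints.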

A third proof of the Weierstrass theorem, due to Lebesgue in 1898, is outlined in the exercises. Lebesgue's proof is of particular interest since it inspired Stone's version of the Weierstrass theorem; we'll discuss the Stone-Weierstrass theorem a bit later in the course.

Before we go on, let's stop and make an observation or two: While the Bernstein polynomials Bn(f) offer a convenient and explicit polynomial approximation to f, they are by no means the best approximations. Indeed, recall that if f1(x) = x and f2(x) = x², then Bn(f2) = (1 − 1/n) f2 + (1/n) f1 ≠ f2. Clearly, the best approximation to f2 out of Pn should be f2 itself whenever n ≥ 2. On the other hand, since we always have

En(f) ≤ ‖f − Bn(f)‖, (why?)

a detailed understanding of Bernstein's proof will lend insight into the general problem of polynomial approximation. Our next project, then, is to improve upon our estimate of the error ‖f − Bn(f)‖.

Improved Estimates

To begin, we will need a bit more notation. The modulus of continuity of a bounded function f on the interval [a, b] is defined by

ω_f(δ) = ω_f([a, b]; δ) = sup { |f(x) − f(y)| : x, y ∈ [a, b], |x − y| ≤ δ }

for any δ > 0. Note that ω_f(δ) is a measure of the "ε" that goes along with δ (in the definition of uniform continuity); literally, we have written ε = ω_f(δ) as a function of δ.

Here are a few easy facts about the modulus of continuity:

Exercises

1. We always have |f(x) − f(y)| ≤ ω_f(|x − y|) for any x ≠ y ∈ [a, b].

2. If 0 < δ′ ≤ δ, then ω_f(δ′) ≤ ω_f(δ).

3. f is uniformly continuous if and only if ω_f(δ) → 0 as δ → 0+.

4. If f′ exists and is bounded on [a, b], then ω_f(δ) ≤ Kδ for some constant K.

5. More generally, we say that f satisfies a Lipschitz condition of order α with constant K, where 0 < α ≤ 1 and 0 ≤ K < ∞, if |f(x) − f(y)| ≤ K|x − y|^α for all x, y. We abbreviate this statement by the symbols: f ∈ lip_K α. Check that if f ∈ lip_K α, then ω_f(δ) ≤ Kδ^α for all δ > 0.
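A brute-force numerical estimate of ω_f(δ) is easy to write and handy for experimenting with facts 2 and 5 above; the helper below is our own, not part of the notes.

```python
# Grid estimate of w_f(delta) = sup{ |f(x)-f(y)| : |x-y| <= delta } on [0, 1].

def modulus(f, delta, grid=400):
    """Approximate the modulus of continuity of f on [0, 1] at delta."""
    pts = [i / grid for i in range(grid + 1)]
    return max(abs(f(x) - f(y)) for x in pts for y in pts
               if abs(x - y) <= delta + 1e-12)
```

For f(x) = |x − 1/2|, which lies in lip_1 1, the estimate returns ω_f(δ) = δ, and shrinking δ shrinks the modulus, as fact 2 demands.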

For the time being, we actually only need one simple fact about ω_f(δ):

Lemma. Let f be a bounded function on [a, b] and let δ > 0. Then, ω_f(nδ) ≤ n ω_f(δ) for n = 1, 2, …. Consequently, ω_f(λδ) ≤ (1 + λ) ω_f(δ) for any λ > 0.

Proof. Given x < y with |x − y| ≤ nδ, split the interval [x, y] into n pieces, each of length at most δ. Specifically, if we set z_k = x + k(y − x)/n, for k = 0, 1, …, n, then |z_k − z_{k−1}| ≤ δ for any k ≥ 1, and so

|f(x) − f(y)| = | Σ_{k=1}^n (f(z_k) − f(z_{k−1})) |
  ≤ Σ_{k=1}^n |f(z_k) − f(z_{k−1})|
  ≤ n ω_f(δ).

Thus, ω_f(nδ) ≤ n ω_f(δ).

The second assertion follows from the first (and one of our exercises). Given λ > 0, choose an integer n so that n − 1 < λ ≤ n. Then,

ω_f(λδ) ≤ ω_f(nδ) ≤ n ω_f(δ) ≤ (1 + λ) ω_f(δ).

We next repeat the proof of Bernstein's theorem, making a few minor adjustments here and there.

Theorem. For any bounded function f on [0, 1] we have

‖f − Bn(f)‖ ≤ (3/2) ω_f(1/√n).

In particular, if f ∈ C[0, 1], then En(f) ≤ (3/2) ω_f(1/√n) → 0 as n → ∞.

Proof. We first do some term juggling:

|f(x) − Bn(f)(x)| = | Σ_{k=0}^n (f(x) − f(k/n)) (n choose k) x^k (1 − x)^{n−k} |
  ≤ Σ_{k=0}^n |f(x) − f(k/n)| (n choose k) x^k (1 − x)^{n−k}
  ≤ Σ_{k=0}^n ω_f(|x − (k/n)|) (n choose k) x^k (1 − x)^{n−k}
  ≤ ω_f(1/√n) Σ_{k=0}^n (1 + √n |x − (k/n)|) (n choose k) x^k (1 − x)^{n−k}
  = ω_f(1/√n) [ 1 + √n Σ_{k=0}^n |x − (k/n)| (n choose k) x^k (1 − x)^{n−k} ],

where the third inequality follows from our previous Lemma (by taking λ = √n |x − (k/n)| and δ = 1/√n). All that remains is to estimate the sum, and for this we'll use Cauchy-Schwarz (and our earlier observations about Bernstein polynomials). Since each of the terms (n choose k) x^k (1 − x)^{n−k} is nonnegative, we have

Σ_{k=0}^n |x − (k/n)| (n choose k) x^k (1 − x)^{n−k}
  ≤ [ Σ_{k=0}^n (x − (k/n))² (n choose k) x^k (1 − x)^{n−k} ]^{1/2} [ Σ_{k=0}^n (n choose k) x^k (1 − x)^{n−k} ]^{1/2}
  ≤ (1/(4n))^{1/2} = 1/(2√n).

Finally,

|f(x) − Bn(f)(x)| ≤ ω_f(1/√n) (1 + √n · (1/(2√n))) = (3/2) ω_f(1/√n).

Examples

1. If f ∈ lip_K α, it follows that ‖f − Bn(f)‖ ≤ (3/2) K n^{−α/2}, and hence En(f) ≤ (3/2) K n^{−α/2}.

2. As a particular case of the first example, consider f(x) = |x − 1/2| on [0, 1]. Then f ∈ lip_1 1, and so ‖f − Bn(f)‖ ≤ (3/2) n^{−1/2}. But, as Rivlin points out (see Remark 3 on p. 16 of his book), ‖f − Bn(f)‖ > (1/2) n^{−1/2}. Thus, we can't hope to improve on the power of n in this estimate. Nevertheless, we will see an improvement in our estimate of En(f).
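The upper bound in Example 2 is easy to test numerically; the code below (our own, restating the Bernstein formula) estimates ‖f − Bn(f)‖ for f(x) = |x − 1/2| on a grid and checks it against (3/2) n^{−1/2}.

```python
# Check that ||f - B_n(f)|| <= (3/2) n^(-1/2) for f(x) = |x - 1/2|,
# and that the error actually decreases as n grows.
from math import comb, sqrt

def bn_error(n, grid=400):
    """Grid estimate of ||f - B_n(f)|| for f(x) = |x - 1/2| on [0, 1]."""
    f = lambda x: abs(x - 0.5)
    def bn(x):
        return sum(f(k / n) * comb(n, k) * x**k * (1 - x)**(n - k)
                   for k in range(n + 1))
    return max(abs(f(i / grid) - bn(i / grid)) for i in range(grid + 1))
```

The observed errors sit comfortably under the theorem's bound while decaying at the n^{−1/2} rate, consistent with the remark that the power of n cannot be improved.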

The Bohman-Korovkin Theorem

The real value to us in Bernstein's approach is that the map f ↦ Bn(f), while providing a simple formula for an approximating polynomial, is also linear and positive. In other words,

Bn(f + g) = Bn(f) + Bn(g),
Bn(αf) = α Bn(f) for α ∈ R,

and

Bn(f) ≥ 0 whenever f ≥ 0.

As it happens, any positive, linear map T : C[0, 1] → C[0, 1] is necessarily also continuous!


Lemma. If T : C[a, b] → C[a, b] is both positive and linear, then T is continuous.

Proof. First note that a positive, linear map is also monotone. That is, T satisfies T(f) ≤ T(g) whenever f ≤ g. (Why?) Thus, for any f ∈ C[a, b], we have

−|f| ≤ f ≤ |f| ⟹ −T(|f|) ≤ T(f) ≤ T(|f|);

that is, |T(f)| ≤ T(|f|). But now |f| ≤ ‖f‖ · 1, where 1 denotes the constant 1 function, and so we get

|T(f)| ≤ T(|f|) ≤ ‖f‖ T(1).

Thus,

‖T(f)‖ ≤ ‖f‖ ‖T(1)‖

for any f ∈ C[a, b]. Finally, since T is linear, it follows that T is Lipschitz with constant ‖T(1)‖:

‖T(f) − T(g)‖ = ‖T(f − g)‖ ≤ ‖T(1)‖ ‖f − g‖.

Consequently, T is continuous.

Now positive, linear maps abound in analysis, so this is a fortunate turn of events. What's more, Bernstein's theorem generalizes very nicely when placed in this new setting. The following elegant theorem was proved (independently) by Bohman and Korovkin in, roughly, 1952.

Theorem. Let Tn : C[0, 1] → C[0, 1] be a sequence of positive, linear maps, and suppose that Tn(f) → f uniformly in each of the three cases

f0(x) = 1, f1(x) = x, and f2(x) = x².

Then, Tn(f) → f uniformly for every f ∈ C[0, 1].

The proof of the Bohman-Korovkin theorem is essentially identical to the proof of Bernstein's theorem except, of course, we write Tn(f) in place of Bn(f). For full details, see Cheney's book An Introduction to Approximation Theory, Chelsea, 1982. Rather than proving the theorem, let's settle for a quick application.


Example

Let f ∈ C[0, 1] and, for each n, let Ln(f) be the "polygonal" approximation to f with nodes at k/n, k = 0, 1, …, n. That is, Ln(f) is linear on each subinterval [(k − 1)/n, k/n] and agrees with f at each of the endpoints: Ln(f)(k/n) = f(k/n). Then, Ln(f) → f uniformly for each f ∈ C[0, 1]. This is actually an easy calculation all by itself, but let's see why the Bohman-Korovkin theorem makes short work of it.

That Ln is positive and linear is (nearly) obvious; that Ln(f0) = f0 and Ln(f1) = f1 are really easy since, in fact, Ln(f) = f for any linear function f. We just need to show that Ln(f2) ⇉ f2. But a picture will convince you that the maximum distance between Ln(f2) and f2 on the interval [(k − 1)/n, k/n] is at most

(k/n)² − ((k − 1)/n)² = (2k − 1)/n² ≤ 2/n.

That is, ‖f2 − Ln(f2)‖ ≤ 2/n → 0 as n → ∞.

[Note that Ln is a linear projection from C[0, 1] onto the subspace of polygonal functions based on the nodes k/n, k = 0, …, n. An easy calculation, similar in spirit to the example above, will show that ‖f − Ln(f)‖ ≤ 2 ω_f(1/n) → 0 as n → ∞ for any f ∈ C[0, 1].]
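The polygonal operator is easy to implement; the sketch below (our own code) evaluates Ln(f)(x) by linear interpolation between the nodes k/n, and the checks mirror the example's computation for f2.

```python
# The polygonal operator L_n: piecewise-linear interpolation of f at
# the nodes k/n, k = 0, ..., n.

def polygonal(f, n, x):
    """Evaluate L_n(f)(x) for x in [0, 1]."""
    if x >= 1.0:
        return f(1.0)
    k = int(x * n)            # x lies in [k/n, (k+1)/n)
    t = x * n - k             # relative position within the subinterval
    return (1 - t) * f(k / n) + t * f((k + 1) / n)
```

Linear functions are reproduced exactly, and for f2(x) = x² the maximum error is 1/(4n²), attained at the subinterval midpoints, comfortably within the example's cruder bound 2/n.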

Problem Set: Uniform Approximation by Polynomials

Math 682 5/20/98

One of our first tasks will be to give a constructive proof of Weierstrass's Theorem, stating that each f ∈ C[a, b] is the uniform limit of a sequence of polynomials. As it happens, the choice of interval [a, b] is inconsequential: If Weierstrass's theorem is true for one, then it's true for all.

▷ 18. Define σ : [0, 1] → [a, b] by σ(t) = a + t(b − a) for 0 ≤ t ≤ 1, and define a transformation T_σ : C[a, b] → C[0, 1] by (T_σ(f))(t) = f(σ(t)). Prove that T_σ satisfies:

(a) T_σ(f + g) = T_σ(f) + T_σ(g) and T_σ(cf) = c T_σ(f) for c ∈ R.

(b) T_σ(fg) = T_σ(f) T_σ(g). In particular, T_σ maps polynomials to polynomials.

(c) T_σ(f) ≤ T_σ(g) if and only if f ≤ g.

(d) ‖T_σ(f)‖ = ‖f‖.

(e) T_σ is both one-to-one and onto. Moreover, (T_σ)^{−1} = T_{σ^{−1}}.

The point to exercise 18 is that C[a, b] and C[0, 1] are identical as vector spaces, metric spaces, algebras, and lattices. For all practical purposes, they are one and the same space. While Bernstein's proof of the Weierstrass theorem (below) will prove most useful for our purposes, there are many others; two of these (in the case of C[0, 1]) are sketched below.

19. (Landau's proof): For each n = 1, 2, … and 0 ≤ δ ≤ 1, define In(δ) = ∫_δ^1 (1 − x²)^n dx. Show that In(δ)/In(0) → 0 as n → ∞ for any δ > 0. Now, given f ∈ C[0, 1] with f(0) = f(1) = 0, show that the polynomial Ln(x) = (2In(0))^{−1} ∫_0^1 f(t) (1 − (t − x)²)^n dt converges uniformly to f(x) on [0, 1] as n → ∞. [Hint: You may assume that f ≡ 0 outside of [0, 1].] To get the result for general f ∈ C[0, 1], we simply need to subtract the linear function f(0) + x(f(1) − f(0)).

20. (Lebesgue's proof): Given f ∈ C[0, 1], first show that f can be uniformly approximated by a polygonal function. Specifically, given a positive integer N, define L(x) by the conditions L(k/N) = f(k/N) for k = 0, 1, …, N, and L(x) is linear for k/N ≤ x ≤ (k+1)/N; show that ‖f − L‖ is small provided that N is sufficiently large. The function L(x) can be written (uniquely) as a linear combination of the "angles" φ_k(x) = |x − k/N| + (x − k/N) and φ_N(x) = 1; the equation L(x) = Σ_{k=0}^N c_k φ_k(x) can be solved since the system of equations L(j/N) = Σ_{k=0}^N c_k φ_k(j/N), j = 0, …, N, can be solved (uniquely) for c_0, …, c_N. (How?) To finish the proof, we need to show that |x| can be approximated by polynomials on any interval [a, b]. (Why?)

21. Here's an elementary proof that there is a sequence of polynomials (Pn) converging uniformly to |x| on [−1, 1].

(a) Define (Pn) recursively by P_{n+1}(x) = Pn(x) + [x − Pn(x)²]/2, where P_0(x) = 0. Clearly, each Pn is a polynomial.

(b) Check that 0 ≤ Pn(x) ≤ P_{n+1}(x) ≤ √x for 0 ≤ x ≤ 1. Use Dini's theorem to conclude that Pn(x) ⇉ √x on [0, 1].

(c) Pn(x²) is also a polynomial, and Pn(x²) ⇉ |x| on [−1, 1].

▷ 22. The result in problem 19 (or 20) shows that the polynomials are dense in C[0, 1]. Using the results in 18, conclude that the polynomials are also dense in C[a, b].

▷ 23. How do we know that there are non-polynomial elements in C[0, 1]? In other words, is it possible that every element of C[0, 1] agrees with some polynomial on [0, 1]?

24. Let (Qn) be a sequence of polynomials of degree mn, and suppose that (Qn) converges uniformly to f on [a, b], where f is not a polynomial. Show that mn → ∞.

25. If f ∈ C[−1, 1] (or C^{2π}) is an even function, show that f may be uniformly approximated by even polynomials (or even trig polynomials).

26. If f ∈ C[0, 1] and if f(0) = f(1) = 0, show that the sequence of polynomials Σ_{k=0}^n [(n choose k) f(k/n)] x^k (1 − x)^{n−k} with integer coefficients converges uniformly to f (where [x] denotes the greatest integer in x). The same trick works for any f ∈ C[a, b] provided that 0 < a < b < 1.

27. If p is a polynomial and ε > 0, prove that there is a polynomial q with rational coefficients such that ‖p − q‖ < ε on [0, 1]. Conclude that C[0, 1] is separable.

28. Let (x_i) be a sequence of numbers in (0, 1) such that lim_{n→∞} (1/n) Σ_{i=1}^n x_i^k exists for every k = 0, 1, 2, …. Show that lim_{n→∞} (1/n) Σ_{i=1}^n f(x_i) exists for every f ∈ C[0, 1].

29. If f ∈ C[0, 1] and if ∫_0^1 x^n f(x) dx = 0 for each n = 0, 1, 2, …, show that f ≡ 0. [Hint: Using the Weierstrass theorem, show that ∫_0^1 f² = 0.]

The next proof of the Weierstrass theorem that we consider is quite explicit; we actually display a sequence of polynomials that converges uniformly to a given f ∈ C[0, 1]. Given f ∈ C[0, 1], we define the sequence (Bn(f))_{n=1}^∞ of Bernstein polynomials for f by

(Bn(f))(x) = Σ_{k=0}^n f(k/n) (n choose k) x^k (1 − x)^{n−k}.

Please note that Bn(f) is a polynomial of degree at most n. Also, it's easy to see that (Bn(f))(0) = f(0) and (Bn(f))(1) = f(1). In general, (Bn(f))(x) is an average of the numbers f(k/n), k = 0, …, n. Bernstein's theorem states that the sequence Bn(f) converges uniformly to f for each f ∈ C[0, 1]; the proof is rather simple once we have a few facts about the Bernstein polynomials at our disposal. For later reference, let's write

f0(x) = 1, f1(x) = x, and f2(x) = x².

Among other things, the following exercise establishes Bernstein's theorem for these three polynomials. Curiously, these few special cases will imply the general result.

▷ 30. (i) Bn(f0) = f0 and Bn(f1) = f1. [Hint: Use the binomial theorem.]

(ii) Bn(f2) = (1 − 1/n) f2 + (1/n) f1, and hence (Bn(f2)) converges uniformly to f2.
