
(iii) $\displaystyle\sum_{k=0}^{n} \left(\frac{k}{n} - x\right)^2 \binom{n}{k} x^k (1-x)^{n-k} = \frac{x(1-x)}{n} \le \frac{1}{4n}$, if $0 \le x \le 1$.

(iv) Given $\delta > 0$ and $0 \le x \le 1$, let $F$ denote the set of $k$'s in $\{0, \ldots, n\}$ for which $\left|\frac{k}{n} - x\right| \ge \delta$. Then $\displaystyle\sum_{k \in F} \binom{n}{k} x^k (1-x)^{n-k} \le \frac{1}{4n\delta^2}$.

. 31. Show that $|B_n(f)| \le B_n(|f|)$, and that $B_n(f) \ge 0$ whenever $f \ge 0$. Conclude that $\|B_n(f)\| \le \|f\|$.

32. If $f$ is a bounded function on $[0,1]$, show that $B_n(f)(x) \to f(x)$ at each point of continuity of $f$.

33. (Bohman, Korovkin) Let $(T_n)$ be a sequence of monotone linear operators on $C[0,1]$; that is, each $T_n$ is a linear map from $C[0,1]$ into itself satisfying $T_n(f) \le T_n(g)$ whenever $f \le g$. Suppose also that $T_n(f_0) \to f_0$, $T_n(f_1) \to f_1$, and $T_n(f_2) \to f_2$ uniformly, where $f_j(x) = x^j$. Prove that $T_n(f) \to f$ uniformly for every $f \in C[0,1]$. [Hint: Mimic the proof of Bernstein's theorem.]

34. Find $B_n(f)$ for $f(x) = x^3$. [Hint: $k^2 = (k-1)(k-2) + 3(k-1) + 1$.] The same method of calculation can be used to show that $B_n(f) \in \mathcal{P}_m$ whenever $f \in \mathcal{P}_m$ and $n > m$.
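The Bernstein operator is easy to experiment with numerically. The sketch below (plain Python; the function name `bernstein` is ours) evaluates $B_n(f)$ straight from the definition and checks it against the closed form for $f(x) = x^3$ that the hint's identity produces.

```python
from math import comb

def bernstein(f, n, x):
    """Evaluate the Bernstein polynomial B_n(f)(x) on [0, 1]."""
    return sum(f(k / n) * comb(n, k) * x ** k * (1 - x) ** (n - k)
               for k in range(n + 1))

# For f(x) = x^3, writing k^3 = k(k-1)(k-2) + 3k(k-1) + k (the hint, times k)
# and summing gives the closed form below, a cubic for every n >= 3.
n = 10
closed = lambda x: (n * (n - 1) * (n - 2) * x ** 3
                    + 3 * n * (n - 1) * x ** 2 + n * x) / n ** 3
grid = [i / 100 for i in range(101)]
max_gap = max(abs(bernstein(lambda t: t ** 3, n, x) - closed(x)) for x in grid)
```

As a sanity check, $B_n$ reproduces linear functions exactly, so `bernstein(lambda x: x, n, x)` returns `x` up to rounding.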


35. Let $f$ be continuously differentiable on $[a,b]$, and let $\varepsilon > 0$. Show that there is a polynomial $p$ such that $\|f - p\| < \varepsilon$ and $\|f' - p'\| < \varepsilon$.

36. Suppose that $f \in C[a,b]$ is twice continuously differentiable and has $f'' > 0$. Prove that the best linear approximation to $f$ on $[a,b]$ is $a_0 + a_1 x$, where $a_1 = f'(c)$, $a_0 = [f(a) + f(c) - f'(c)(a + c)]/2$, and where $c$ is the unique solution to $f'(c) = (f(b) - f(a))/(b - a)$.

The next several exercises concern the modulus of continuity. Given a bounded real-valued function $f$ defined on some interval $I$, we define $\omega_f$, the modulus of continuity of $f$, by
\[ \omega_f(I; \delta) = \omega_f(\delta) = \sup\bigl\{\, |f(x) - f(y)| : x, y \in I,\ |x - y| \le \delta \,\bigr\}, \qquad \delta \ge 0. \]
Note, for example, that if $f$ is uniformly continuous, then $\omega_f(\delta) \to 0$ as $\delta \to 0^+$. Indeed, the statement that $|f(x) - f(y)| \le \varepsilon$ whenever $|x - y| \le \delta$ is equivalent to the statement that $\omega_f(\delta) \le \varepsilon$. On the other hand, if the graph of $f$ has a jump of magnitude 1, say, then $\omega_f(\delta) \ge 1$ for all $\delta > 0$.

. 37. If $f$ satisfies the Lipschitz condition $|f(x) - f(y)| \le K|x - y|$, what can you say about $\omega_f$? Calculate $\omega_g$ for $g(x) = \sqrt{x}$.
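A rough numerical sketch can make this exercise concrete. The helper below (name ours; assumes NumPy) estimates $\omega_f(\delta)$ over a finite grid, which gives a lower bound for the true supremum; for the two functions shown, the grid contains the extremal pairs, so the estimate is essentially exact.

```python
import numpy as np

def modulus(f, xs, delta):
    """Grid estimate of w_f(delta) = sup{|f(x) - f(y)| : |x - y| <= delta}."""
    vals = f(xs)
    best = 0.0
    for i in range(len(xs)):
        # Small tolerance so floating-point grid spacing doesn't drop the
        # boundary pair |x - y| = delta.
        near = np.abs(xs - xs[i]) <= delta + 1e-12
        best = max(best, float(np.max(np.abs(vals[near] - vals[i]))))
    return best

xs = np.linspace(0, 1, 2001)
w_lip = modulus(lambda x: 2 * x, xs, 0.1)   # Lipschitz with K = 2: w(d) = 2d
w_sqrt = modulus(np.sqrt, xs, 0.25)         # w(d) = sqrt(d), attained at x = 0, y = d
```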

38. If $f \in C[a,b]$, show that $\omega_f(\delta_1 + \delta_2) \le \omega_f(\delta_1) + \omega_f(\delta_2)$ and that $\omega_f(\delta) \downarrow 0$ as $\delta \downarrow 0$. Use this to show that $\omega_f$ is continuous for $\delta \ge 0$. Finally, show that the modulus of continuity of $\omega_f$ is again $\omega_f$.

. 39. (a) If $x = \cos\theta$, where $-1 \le x \le 1$, and if $g(\theta) = f(\cos\theta)$, show that $\omega_g([-\pi, \pi]; \delta) = \omega_g([0, \pi]; \delta) \le \omega_f([-1, 1]; \delta)$.

(b) If $g(x) = f(ax + b)$ for $c \le x \le d$ (assuming $a > 0$), show that $\omega_g([c, d]; \delta) = \omega_f([ac + b,\, ad + b]; a\delta)$.

40. Let $f$ be continuously differentiable on $[0,1]$. Show that $(B_n(f))'$ converges uniformly to $f'$ by showing that $\|B_n(f') - (B_{n+1}(f))'\| \le \omega_{f'}(1/(n+1))$. [In order to see why this is of interest, find a uniformly convergent sequence of polynomials whose derivatives fail to converge uniformly. Compare this result with problem 35.]

Trigonometric Polynomials

Math 682 5/26/98

Introduction

A (real) trigonometric polynomial, or trig polynomial for short, is a function of the form

\[ a_0 + \sum_{k=1}^{n} \bigl( a_k \cos kx + b_k \sin kx \bigr) \tag{$*$} \]

where $a_0, \ldots, a_n$ and $b_1, \ldots, b_n$ are real numbers. The degree of a trig polynomial is the highest frequency occurring in any representation of the form $(*)$; thus, $(*)$ has degree $n$ provided that one of $a_n$ or $b_n$ is nonzero. We will use $\mathcal{T}_n$ to denote the collection of trig polynomials of degree at most $n$, and $\mathcal{T}$ to denote the collection of all trig polynomials (i.e., the union of the $\mathcal{T}_n$'s).

It is convenient to take the space of all continuous $2\pi$-periodic functions on $\mathbb{R}$ as the containing space for $\mathcal{T}_n$; a space we denote by $C^{2\pi}$. The space $C^{2\pi}$ has several equivalent descriptions. For one, it's obvious that $C^{2\pi}$ is a subspace of $C(\mathbb{R})$, the space of all continuous functions on $\mathbb{R}$. But we might also consider $C^{2\pi}$ as a subspace of $C[0, 2\pi]$ in the following way: The $2\pi$-periodic continuous functions on $\mathbb{R}$ may be identified with the set of functions $f \in C[0, 2\pi]$ satisfying $f(0) = f(2\pi)$. Each such $f$ extends to a $2\pi$-periodic element of $C(\mathbb{R})$ in an obvious way, and it's not hard to see that the condition $f(0) = f(2\pi)$ defines a subspace of $C[0, 2\pi]$. As a third description, it is often convenient to identify $C^{2\pi}$ with the collection $C(\mathbb{T})$, consisting of all the continuous real-valued functions on $\mathbb{T}$, where $\mathbb{T}$ is the unit circle in the complex plane $\mathbb{C}$. That is, we simply make the identifications
\[ \theta \leftrightarrow e^{i\theta} \qquad \text{and} \qquad f(\theta) \leftrightarrow f(e^{i\theta}). \]
In any case, each $f \in C^{2\pi}$ is uniformly continuous and uniformly bounded on all of $\mathbb{R}$, and is completely determined by its values on any interval of length $2\pi$. In particular, we may (and will) endow $C^{2\pi}$ with the sup norm:
\[ \|f\| = \max_{0 \le x \le 2\pi} |f(x)| = \max_{x \in \mathbb{R}} |f(x)|. \]

Our goal in this chapter is to prove what is sometimes called Weierstrass's second

theorem (also from 1885).


Theorem. (Weierstrass's Second Theorem, 1885) Let $f \in C^{2\pi}$. Then, for every $\varepsilon > 0$, there exists a trig polynomial $T$ such that $\|f - T\| < \varepsilon$.

Ultimately, we will give several different proofs of this theorem. Weierstrass gave a separate proof of this result in the same paper containing his theorem on approximation by algebraic polynomials, but it was later pointed out by Lebesgue (1898) that the two theorems are, in fact, equivalent. Lebesgue's proof is based on several elementary observations. We will outline these elementary facts as "exercises with hints," supplying a few proofs here and there, but leaving full details to the reader.

We first justify the use of the word "polynomial" in describing $(*)$.

Lemma. $\cos nx$ and $\sin(n+1)x / \sin x$ can be written as polynomials of degree exactly $n$ in $\cos x$ for any integer $n \ge 0$.

Proof. Using the recurrence formula $\cos kx + \cos(k-2)x = 2\cos(k-1)x \cos x$, it's not hard to see that $\cos 2x = 2\cos^2 x - 1$, $\cos 3x = 4\cos^3 x - 3\cos x$, and $\cos 4x = 8\cos^4 x - 8\cos^2 x + 1$. More generally, by induction, $\cos nx$ is a polynomial of degree $n$ in $\cos x$ with leading coefficient $2^{n-1}$. Using this fact and the identity $\sin(k+1)x - \sin(k-1)x = 2\cos kx \sin x$ (along with another easy induction argument), it follows that $\sin(n+1)x$ can be written as $\sin x$ times a polynomial of degree $n$ in $\cos x$ with leading coefficient $2^n$.

Alternatively, notice that by writing $(i \sin x)^{2k} = (\cos^2 x - 1)^k$ we have
\[ \cos nx = \mathrm{Re}\bigl[ (\cos x + i \sin x)^n \bigr] = \mathrm{Re}\left[ \sum_{k=0}^{n} \binom{n}{k} (i \sin x)^k \cos^{n-k} x \right] = \sum_{k=0}^{[n/2]} \binom{n}{2k} (\cos^2 x - 1)^k \cos^{n-2k} x. \]
The coefficient of $\cos^n x$ in this expansion is then
\[ \sum_{k=0}^{[n/2]} \binom{n}{2k} = \frac{1}{2} \sum_{k=0}^{n} \binom{n}{k} = 2^{n-1}. \]
(All the binomial coefficients together sum to $(1+1)^n = 2^n$, but the even or odd terms taken separately sum to exactly half this amount since $(1 + (-1))^n = 0$.)


Similarly,
\[ \sin(n+1)x = \mathrm{Im}\bigl[ (\cos x + i \sin x)^{n+1} \bigr] = \mathrm{Im}\left[ \sum_{k=0}^{n+1} \binom{n+1}{k} (i \sin x)^k \cos^{n+1-k} x \right] = \sum_{k=0}^{[n/2]} \binom{n+1}{2k+1} (\cos^2 x - 1)^k \cos^{n-2k} x \, \sin x, \]
where we've written $(i \sin x)^{2k+1} = i(\cos^2 x - 1)^k \sin x$. The coefficient of $\cos^n x \sin x$ is
\[ \sum_{k=0}^{[n/2]} \binom{n+1}{2k+1} = \frac{1}{2} \sum_{k=0}^{n+1} \binom{n+1}{k} = 2^n. \]
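The counting fact used twice above is easy to spot-check. The sketch below (names ours; standard library only) verifies that the even-index and the odd-index binomial coefficients each sum to $2^{n-1}$: all $2^n$ of them come from $(1+1)^n$, and $(1-1)^n = 0$ forces the two halves to be equal.

```python
from math import comb

def half_sums(n):
    """Sum of the even-index and of the odd-index binomial coefficients."""
    even = sum(comb(n, 2 * k) for k in range(n // 2 + 1))
    odd = sum(comb(n, 2 * k + 1) for k in range((n + 1) // 2))
    return even, odd

results = [half_sums(n) for n in range(1, 12)]
```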

Corollary. Any trig polynomial $(*)$ may be written as $P(\cos x) + Q(\cos x) \sin x$, where $P$ and $Q$ are algebraic polynomials of degree at most $n$ and $n-1$, respectively. If $(*)$ represents an even function, then it can be written using only cosines.

Corollary. The collection $\mathcal{T}$, consisting of all trig polynomials, is both a subspace and a subring of $C^{2\pi}$ (that is, $\mathcal{T}$ is closed under both linear combinations and products). In other words, $\mathcal{T}$ is a subalgebra of $C^{2\pi}$.

It's not hard to see that the procedure we've described above can be reversed; that is, each algebraic polynomial in $\cos x$ and $\sin x$ can be written in the form $(*)$. For example, $4\cos^3 x = 3\cos x + \cos 3x$. But, rather than duplicate our efforts, let's use a bit of linear algebra. First, the $2n+1$ functions
\[ A = \{\, 1,\ \cos x,\ \cos 2x,\ \ldots,\ \cos nx,\ \sin x,\ \sin 2x,\ \ldots,\ \sin nx \,\} \]
are linearly independent; the easiest way to see this is to notice that we may define an inner product on $C^{2\pi}$ under which these functions are orthogonal. Specifically,
\[ \langle f, g \rangle = \int_0^{2\pi} f(x)\, g(x)\, dx = 0 \qquad \text{and} \qquad \langle f, f \rangle = \int_0^{2\pi} f(x)^2\, dx \ne 0 \]
for any pair of functions $f \ne g$ in $A$. (We'll pursue this direction in greater detail later in the course.) Second, we've shown that each element of $A$ lives in the space spanned by the $2n+1$ functions
\[ B = \{\, 1,\ \cos x,\ \cos^2 x,\ \ldots,\ \cos^n x,\ \sin x,\ \cos x \sin x,\ \ldots,\ \cos^{n-1} x \sin x \,\}. \]


That is,
\[ \operatorname{span} A = \mathcal{T}_n \subseteq \operatorname{span} B. \]
By comparing dimensions, we have
\[ 2n + 1 = \dim \mathcal{T}_n = \dim(\operatorname{span} A) \le \dim(\operatorname{span} B) \le 2n + 1, \]
and hence we must have $\operatorname{span} A = \operatorname{span} B$. The point here is that $\mathcal{T}_n$ is a finite-dimensional subspace of $C^{2\pi}$ of dimension $2n+1$, and we may use either one of these sets of functions as a basis for $\mathcal{T}_n$.
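The orthogonality claim behind the independence of $A$ is easy to verify numerically. The sketch below (names ours; assumes NumPy) builds the Gram matrix of $A$ for $n = 3$ using a uniform Riemann sum over one full period, which integrates these trig products essentially exactly; only the diagonal survives.

```python
import numpy as np

n = 3
m = 4096
xs = np.linspace(0, 2 * np.pi, m, endpoint=False)

# A = {1, cos x, ..., cos nx, sin x, ..., sin nx}
funcs = [np.ones_like(xs)]
funcs += [np.cos(k * xs) for k in range(1, n + 1)]
funcs += [np.sin(k * xs) for k in range(1, n + 1)]

def inner(f, g):
    """Riemann-sum approximation of the inner product over [0, 2*pi]."""
    return (2 * np.pi / m) * np.sum(f * g)

gram = np.array([[inner(f, g) for g in funcs] for f in funcs])
off_diag = np.max(np.abs(gram - np.diag(np.diag(gram))))
```

The diagonal entries are $2\pi$ for the constant function and $\pi$ for each of the others.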

Before we leave these issues behind, let's summarize the situation for complex trig polynomials; i.e., the case where we allow complex coefficients in $(*)$. Now it's clear that every trig polynomial $(*)$, whether real or complex, can be written as
\[ \sum_{k=-n}^{n} c_k e^{ikx} \tag{$**$} \]
where the $c_k$'s are complex; that is, a trig polynomial is actually a polynomial (over $\mathbb{C}$) in $z = e^{ix}$ and $\bar{z} = e^{-ix}$. Conversely, every polynomial $(**)$ can be written in the form $(*)$, using complex $a_k$'s and $b_k$'s. Thus, the complex trig polynomials of degree $n$ form a vector space of dimension $2n+1$ over $\mathbb{C}$ (hence of dimension $2(2n+1)$ when considered as a vector space over $\mathbb{R}$). But, not every polynomial in $z$ and $\bar{z}$ represents a real trig polynomial.

Rather, the real trig polynomials are the real parts of the complex trig polynomials. To see this, notice that $(**)$ represents a real-valued function if and only if
\[ \sum_{k=-n}^{n} c_k e^{ikx} = \overline{\sum_{k=-n}^{n} c_k e^{ikx}} = \sum_{k=-n}^{n} \overline{c_{-k}}\, e^{ikx}; \]
that is, $c_k = \overline{c_{-k}}$ for each $k$. In particular, $c_0$ must be real, and hence

\[ \begin{aligned}
\sum_{k=-n}^{n} c_k e^{ikx} &= c_0 + \sum_{k=1}^{n} \bigl( c_k e^{ikx} + c_{-k} e^{-ikx} \bigr) \\
&= c_0 + \sum_{k=1}^{n} \bigl( c_k e^{ikx} + \overline{c_k}\, e^{-ikx} \bigr) \\
&= c_0 + \sum_{k=1}^{n} \bigl[ (c_k + \overline{c_k}) \cos kx + i (c_k - \overline{c_k}) \sin kx \bigr] \\
&= c_0 + \sum_{k=1}^{n} \bigl[ 2\,\mathrm{Re}(c_k) \cos kx - 2\,\mathrm{Im}(c_k) \sin kx \bigr],
\end{aligned} \]


which is of the form $(*)$ with $a_k$ and $b_k$ real.

Conversely, given any real trig polynomial $(*)$, we have
\[ a_0 + \sum_{k=1}^{n} \bigl( a_k \cos kx + b_k \sin kx \bigr) = a_0 + \sum_{k=1}^{n} \left( \frac{a_k - i b_k}{2}\, e^{ikx} + \frac{a_k + i b_k}{2}\, e^{-ikx} \right), \]
which is of the form $(**)$ with $c_k = \overline{c_{-k}}$ for each $k$.

It's time we returned to approximation theory! Since we've been able to identify $C^{2\pi}$ with a subspace of $C[0, 2\pi]$, and since $\mathcal{T}_n$ is a finite-dimensional subspace of $C^{2\pi}$, we have

Corollary. Each $f \in C^{2\pi}$ has a best approximation (on all of $\mathbb{R}$) out of $\mathcal{T}_n$. If $f$ is an even function, then it has a best approximation which is also even.

Proof. We only need to prove the second claim, so suppose that $f \in C^{2\pi}$ is even and that $T^* \in \mathcal{T}_n$ satisfies
\[ \|f - T^*\| = \min_{T \in \mathcal{T}_n} \|f - T\|. \]
Then, since $f$ is even, $\widetilde{T}(x) = T^*(-x)$ is also a best approximation to $f$ out of $\mathcal{T}_n$; indeed,
\[ \|f - \widetilde{T}\| = \max_{x \in \mathbb{R}} |f(x) - T^*(-x)| = \max_{x \in \mathbb{R}} |f(-x) - T^*(x)| = \max_{x \in \mathbb{R}} |f(x) - T^*(x)| = \|f - T^*\|. \]
But now, the even trig polynomial
\[ \widehat{T}(x) = \frac{T^*(x) + \widetilde{T}(x)}{2} = \frac{T^*(-x) + T^*(x)}{2} \]
is also a best approximation out of $\mathcal{T}_n$, since
\[ \|f - \widehat{T}\| = \left\| \frac{(f - T^*) + (f - \widetilde{T})}{2} \right\| \le \frac{\|f - T^*\| + \|f - \widetilde{T}\|}{2} = \min_{T \in \mathcal{T}_n} \|f - T\|. \]

We next give (de la Vallée Poussin's version of) Lebesgue's proof of Weierstrass's second theorem; that is, we will deduce the second theorem from the first.


Theorem. Let $f \in C^{2\pi}$ and let $\varepsilon > 0$. Then, there is a trig polynomial $T$ such that $\|f - T\| = \max_{x \in \mathbb{R}} |f(x) - T(x)| < \varepsilon$.

Proof. We will prove that Weierstrass's first theorem for $C[-1,1]$ implies his second theorem for $C^{2\pi}$.

Step 1. If $f$ is even, then $f$ may be uniformly approximated by even trig polynomials.

If $f$ is even, then it's enough to approximate $f$ on the interval $[0, \pi]$. In this case, we may consider the function $g(y) = f(\arccos y)$, $-1 \le y \le 1$, in $C[-1,1]$. By Weierstrass's first theorem, there is an algebraic polynomial $p(y)$ such that
\[ \max_{-1 \le y \le 1} |f(\arccos y) - p(y)| = \max_{0 \le x \le \pi} |f(x) - p(\cos x)| < \varepsilon. \]
But $T(x) = p(\cos x)$ is an even trig polynomial! Hence,
\[ \|f - T\| = \max_{x \in \mathbb{R}} |f(x) - T(x)| < \varepsilon. \]
Let's agree to abbreviate $\|f - T\| < \varepsilon$ as $f \approx T$.
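Step 1 is already a practical recipe, and a sketch helps make it concrete. Below (names ours; assumes NumPy), the even periodic function $f(x) = |\sin x|$ is approximated by the even trig polynomial $p(\cos x)$, where $p$ is a degree-30 least-squares polynomial fit to $g(y) = f(\arccos y) = \sqrt{1 - y^2}$; the fit merely stands in for the polynomial that Weierstrass's first theorem provides.

```python
import numpy as np

f = lambda x: np.abs(np.sin(x))     # even and 2*pi-periodic
g = lambda y: np.sqrt(1 - y ** 2)   # g(y) = f(arccos y) on [-1, 1]

# Sample at Chebyshev-type points so the least-squares fit behaves like a
# truncated Chebyshev expansion of g.
ys = np.cos(np.linspace(0, np.pi, 2001))
p = np.polynomial.chebyshev.Chebyshev.fit(ys, g(ys), deg=30)

# T(x) = p(cos x) is an even trig polynomial approximating f uniformly.
xs = np.linspace(-np.pi, np.pi, 4001)
err = float(np.max(np.abs(f(xs) - p(np.cos(xs)))))
```

The square-root singularity of $g$ at $\pm 1$ limits the convergence rate, but the uniform error is already small at this degree.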

Step 2. Given $f \in C^{2\pi}$, there is a trig polynomial $T$ such that $2f(x) \sin^2 x \approx T(x)$.

Each of the functions $f(x) + f(-x)$ and $[f(x) - f(-x)] \sin x$ is even. Thus, we may choose even trig polynomials $T_1$ and $T_2$ such that
\[ f(x) + f(-x) \approx T_1(x) \qquad \text{and} \qquad [f(x) - f(-x)] \sin x \approx T_2(x). \]
Multiplying the first expression by $\sin^2 x$, the second by $\sin x$, and adding, we get
\[ 2f(x) \sin^2 x \approx T_1(x) \sin^2 x + T_2(x) \sin x \equiv T_3(x), \]
where $T_3(x)$ is still a trig polynomial, and where "$\approx$" now means "within $2\varepsilon$" (since $|\sin x| \le 1$).

Step 3. Given $f \in C^{2\pi}$, there is a trig polynomial $T$ such that $2f(x) \cos^2 x \approx T(x)$, where "$\approx$" means "within $2\varepsilon$."

Repeat Step 2 for $f(x - \pi/2)$ and translate: We first choose a trig polynomial $T_4(x)$ such that
\[ 2f\left(x - \frac{\pi}{2}\right) \sin^2 x \approx T_4(x). \]
That is,
\[ 2f(x) \cos^2 x \approx T_4\left(x + \frac{\pi}{2}\right) \equiv T_5(x), \]
where $T_5(x)$ is a trig polynomial.

Finally, by combining the conclusions of Steps 2 and 3 (and using $\sin^2 x + \cos^2 x = 1$), we find that there is a trig polynomial $T_6(x)$ such that $f \approx T_6$, where, again, "$\approx$" means "within $2\varepsilon$."

Just for fun, let's complete the circle and show that Weierstrass's second theorem for $C^{2\pi}$ implies his first theorem for $C[-1,1]$. Since, as we'll see, it's possible to give an independent proof of the second theorem, this is a meaningful exercise.

Theorem. Given $f \in C[-1,1]$ and $\varepsilon > 0$, there exists an algebraic polynomial $p$ such that $\|f - p\| < \varepsilon$.

Proof. Given $f \in C[-1,1]$, the function $f(\cos x)$ is an even function in $C^{2\pi}$. By our Corollary to Weierstrass's second theorem, we may approximate $f(\cos x)$ by an even trig polynomial:
\[ f(\cos x) \approx a_0 + a_1 \cos x + a_2 \cos 2x + \cdots + a_n \cos nx. \]
But, as we've seen, $\cos kx$ can be written as an algebraic polynomial in $\cos x$. Hence, there is some algebraic polynomial $p$ such that $f(\cos x) \approx p(\cos x)$. That is,
\[ \max_{0 \le x \le \pi} |f(\cos x) - p(\cos x)| = \max_{-1 \le t \le 1} |f(t) - p(t)| < \varepsilon. \]

The algebraic polynomials $T_n(x)$ satisfying
\[ T_n(\cos x) = \cos nx, \qquad n = 0, 1, 2, \ldots \]
are called the Chebyshev polynomials of the first kind. Please note that this formula uniquely defines $T_n$ as a polynomial of degree exactly $n$, and hence uniquely determines the values of $T_n(x)$ for $|x| > 1$, too. The algebraic polynomials $U_n(x)$ satisfying
\[ U_n(\cos x) = \frac{\sin(n+1)x}{\sin x}, \qquad n = 0, 1, 2, \ldots \]
are called the Chebyshev polynomials of the second kind. Likewise, note that this formula uniquely defines $U_n$ as a polynomial of degree exactly $n$.


We will discover many intriguing properties of the Chebyshev polynomials in the next chapter. For now, let's settle for just one: The recurrence formula we gave earlier,
\[ \cos nx = 2 \cos x \cos(n-1)x - \cos(n-2)x, \]
now becomes
\[ T_n(x) = 2x\, T_{n-1}(x) - T_{n-2}(x), \qquad n \ge 2, \]
where $T_0(x) = 1$ and $T_1(x) = x$. This recurrence relation (along with the initial cases $T_0$ and $T_1$) may be taken as a definition for the Chebyshev polynomials of the first kind. At any rate, it's now easy to list any number of the Chebyshev polynomials $T_n$; for example, the next few are $T_2(x) = 2x^2 - 1$, $T_3(x) = 4x^3 - 3x$, $T_4(x) = 8x^4 - 8x^2 + 1$, and $T_5(x) = 16x^5 - 20x^3 + 5x$.
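The recurrence translates directly into code. The sketch below (function name ours; standard library only) builds the coefficient list of $T_n$ from $T_0 = 1$ and $T_1 = x$, and checks the defining identity $T_n(\cos\theta) = \cos n\theta$ at a sample point.

```python
from math import cos

def chebyshev_T(n):
    """Coefficient list of T_n (constant term first), via the recurrence
    T_n = 2x T_{n-1} - T_{n-2}."""
    if n == 0:
        return [1]
    prev, curr = [1], [0, 1]                 # T_0 and T_1
    for _ in range(n - 1):
        two_x_curr = [0] + [2 * c for c in curr]                # 2x * T_k
        padded_prev = prev + [0] * (len(two_x_curr) - len(prev))
        prev, curr = curr, [a - b for a, b in zip(two_x_curr, padded_prev)]
    return curr

theta = 1.2
t5_val = sum(c * cos(theta) ** i for i, c in enumerate(chebyshev_T(5)))
```

Running `chebyshev_T(5)` reproduces the coefficients of $16x^5 - 20x^3 + 5x$ listed above.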

Problem Set: Trigonometric Polynomials

Math 682 5/26/98

A (real) trigonometric polynomial, or trig polynomial for short, is a function of the form
\[ a_0 + \sum_{k=1}^{n} \bigl( a_k \cos kx + b_k \sin kx \bigr) \tag{$*$} \]
where $a_0, \ldots, a_n$ and $b_1, \ldots, b_n$ are real numbers. We will use $\mathcal{T}_n$ to denote the collection of trig polynomials of degree at most $n$, considered as a subspace of $C^{2\pi}$, the space of all continuous $2\pi$-periodic functions on $\mathbb{R}$. The space $C^{2\pi}$ may, in turn, be considered as a subspace of $C[0, 2\pi]$. Indeed, the $2\pi$-periodic continuous functions on $\mathbb{R}$ may be identified with the subspace of $C[0, 2\pi]$ consisting of those $f$'s which satisfy $f(0) = f(2\pi)$. As an alternate description, it is often convenient to instead identify $C^{2\pi}$ with the collection $C(\mathbb{T})$, consisting of all continuous real-valued functions on $\mathbb{T}$, where $\mathbb{T}$ is the unit circle in the complex plane $\mathbb{C}$. In this case, we simply make the identifications
\[ \theta \leftrightarrow e^{i\theta} \qquad \text{and} \qquad f(\theta) \leftrightarrow f(e^{i\theta}). \]

. 41. (a) By using the recurrence formulas $\cos kx + \cos(k-2)x = 2\cos(k-1)x \cos x$ and $\sin(k+1)x - \sin(k-1)x = 2\cos kx \sin x$, show that each of the functions $\cos kx$ and $\sin(k+1)x / \sin x$ may be written as algebraic polynomials of degree exactly $k$ in $\cos x$. In each case, what is the coefficient of $\cos^k x$?

(b) Equivalently, use the binomial formula to write the real and imaginary parts of $(\cos x + i \sin x)^n = \cos nx + i \sin nx$ as algebraic polynomials in $\cos x$ and $\sin x$. Again, what are the leading coefficients of these polynomials?

(c) If $P(x, y)$ is an algebraic polynomial (in two variables) of degree at most $n$, show that $P(\cos x, \sin x)$ may be written as $Q(\cos x) + R(\cos x) \sin x$, where $Q$ and $R$ are algebraic polynomials (in one variable) of degrees at most $n$ and $n-1$, respectively.

(d) Show that $\cos^n x$ can be written as a linear combination of the functions $\cos kx$, $k = 0, 1, \ldots, n$, and that $\cos^{n-1} x \sin x$ can be written as a linear combination of the functions $\sin kx$, $k = 1, \ldots, n$. Thus, each polynomial $P(\cos x, \sin x)$ in $\cos x$ and $\sin x$ can be written in the form $(*)$.


(e) If $(*)$ represents an even function, show that it can be written using only cosines. Conversely, if $P(x, y)$ is an even polynomial, show that $P(\cos x, \sin x)$ can be written using only cosines.

42. Show that $\mathcal{T}_n$ has dimension exactly $2n + 1$ (as a vector space over $\mathbb{R}$).

43. We might also consider complex trig polynomials; that is, functions of the form $(*)$ in which we now allow the $a_k$'s and $b_k$'s to be complex numbers.

(a) Show that every trig polynomial, whether real or complex, may be written as
\[ \sum_{k=-n}^{n} c_k e^{ikx} \tag{$**$} \]
where the $c_k$'s are complex. Thus, complex trig polynomials are just algebraic polynomials in $z$ and $\bar{z}$, where $z = e^{ix} \in \mathbb{T}$.

(b) Show that $(**)$ is real-valued if and only if $c_k = \overline{c_{-k}}$ for each $k$.

(c) If $(**)$ is a real-valued function, show that it may be written as a real trig polynomial; that is, it may be written in the form $(*)$ using only real coefficients.

Characterization of Best Approximation

Math 682 5/27/98

We next discuss Chebyshev's solution to the problem of best polynomial approximation from 1854. Given that there was no reason to believe that the problem even had a solution, let alone a unique solution, Chebyshev's accomplishment should not be underestimated. Chebyshev might very well have been able to prove Weierstrass's result, 30 years early, had the thought simply occurred to him! Chebyshev's original papers are apparently rather sketchy. It wasn't until 1903 that full details were given by Kirchberger. Curiously, Kirchberger's proofs foreshadow very modern techniques such as convexity and separation arguments. The presentation we'll give owes much to Haar and to de la Vallée Poussin (both from around 1918).

We begin with an easy observation:

Lemma. Let $f \in C[a,b]$ and let $p = p_n^*$ be a best approximation to $f$ out of $\mathcal{P}_n$. Then, there are at least two distinct points $x_1, x_2 \in [a,b]$ such that
\[ f(x_1) - p(x_1) = -\bigl( f(x_2) - p(x_2) \bigr) = \|f - p\|. \]
That is, $f - p$ attains both of the values $\pm\|f - p\|$.

Proof. Let's write $E = E_n(f) = \|f - p\| = \max_{a \le x \le b} |f(x) - p(x)|$. If the conclusion of the Lemma is false, then we might as well suppose that $f(x_1) - p(x_1) = E$ for some $x_1$, but that
\[ e = \min_{a \le x \le b} \bigl( f(x) - p(x) \bigr) > -E. \]
In particular, $E + e \ne 0$, and so $q = p + (E + e)/2$ is an element of $\mathcal{P}_n$ with $q \ne p$. We claim that $q$ is a better approximation to $f$ than $p$. Here's why:
\[ E - \frac{E+e}{2} \ \ge\ f(x) - p(x) - \frac{E+e}{2} \ \ge\ e - \frac{E+e}{2}, \]
or
\[ \frac{E-e}{2} \ \ge\ f(x) - q(x) \ \ge\ -\frac{E-e}{2}. \]
That is,
\[ \|f - q\| \le \frac{E-e}{2} < E = \|f - p\|, \]


a contradiction.

Corollary. The best approximating constant to $f \in C[a,b]$ is
\[ p_0^* = \frac{1}{2} \left[ \max_{a \le x \le b} f(x) + \min_{a \le x \le b} f(x) \right] \]
and
\[ E_0(f) = \frac{1}{2} \left[ \max_{a \le x \le b} f(x) - \min_{a \le x \le b} f(x) \right]. \]

Proof. Exercise.

Now all of this is meant as motivation for the general case, which essentially repeats the observation of our first Lemma inductively. A little experimentation will convince you that a best linear approximation, for example, would imply the existence of (at least) three points at which $f - p_1$ alternates between $\pm\|f - p_1\|$.

A bit of notation will help us set up the argument for the general case: Given $g$ in $C[a,b]$, we'll say that $x \in [a,b]$ is a $(+)$ point for $g$ (respectively, a $(-)$ point for $g$) if $g(x) = \|g\|$ (respectively, $g(x) = -\|g\|$). A set of distinct points $a \le x_0 < x_1 < \cdots < x_n \le b$ will be called an alternating set for $g$ if the $x_i$'s are alternately $(+)$ points and $(-)$ points; that is, if
\[ |g(x_i)| = \|g\|, \qquad i = 0, 1, \ldots, n, \]
and
\[ g(x_i) = -g(x_{i-1}), \qquad i = 1, 2, \ldots, n. \]

Using this notation, we will be able to characterize the polynomial of best approximation. Since the following three theorems are particularly important, we will number them for future reference. Our first result is where all the fighting takes place:

Theorem 1. Let $f \in C[a,b]$, and suppose that $p = p_n^*$ is a best approximation to $f$ out of $\mathcal{P}_n$. Then, there is an alternating set for $f - p$ consisting of at least $n + 2$ points.

Proof. If $f \in \mathcal{P}_n$, there's nothing to show. (Why?) Thus, we may suppose that $f \notin \mathcal{P}_n$ and, hence, that $E = E_n(f) = \|f - p\| > 0$.


Now consider the (uniformly) continuous function $\varphi = f - p$. We may partition $[a,b]$ by way of $a = t_0 < t_1 < \cdots < t_n = b$ into sufficiently small intervals so that
\[ |\varphi(x) - \varphi(y)| < E/2 \qquad \text{whenever} \qquad x, y \in [t_i, t_{i+1}]. \]
Here's why we'd want to do such a thing: If $[t_i, t_{i+1}]$ contains a $(+)$ point for $\varphi = f - p$, then $\varphi$ is positive on all of $[t_i, t_{i+1}]$. Indeed,
\[ x, y \in [t_i, t_{i+1}] \ \text{ and } \ \varphi(x) = E \implies \varphi(y) > E/2 > 0. \]
Similarly, if $[t_i, t_{i+1}]$ contains a $(-)$ point for $\varphi$, then $\varphi$ is negative on all of $[t_i, t_{i+1}]$. Consequently, no interval $[t_i, t_{i+1}]$ can contain both $(+)$ points and $(-)$ points.

Call $[t_i, t_{i+1}]$ a $(+)$ interval (respectively, a $(-)$ interval) if it contains a $(+)$ point (respectively, a $(-)$ point) for $\varphi = f - p$. Notice that no $(+)$ interval can even touch a $(-)$ interval. In other words, a $(+)$ interval and a $(-)$ interval must be strictly separated (by some interval containing a zero for $\varphi$).

We now relabel the $(+)$ and $(-)$ intervals from left to right, ignoring the "neither" intervals. There's no harm in supposing that the first "signed" interval is a $(+)$ interval. Thus, we suppose that our relabeled intervals are written
\[ \begin{aligned}
&I_1,\ I_2,\ \ldots,\ I_{k_1} && (+) \text{ intervals,} \\
&I_{k_1+1},\ I_{k_1+2},\ \ldots,\ I_{k_2} && (-) \text{ intervals,} \\
&\qquad \vdots \\
&I_{k_{m-1}+1},\ I_{k_{m-1}+2},\ \ldots,\ I_{k_m} && (-1)^{m-1} \text{ intervals,}
\end{aligned} \]
where $I_{k_1}$ is the last $(+)$ interval before we reach the first $(-)$ interval, $I_{k_1+1}$. And so on.

For later reference, we let $S$ denote the union of all the "signed" intervals $[t_i, t_{i+1}]$; that is, $S = I_1 \cup I_2 \cup \cdots \cup I_{k_m}$, and we let $N$ denote the union of all the "neither" intervals $[t_i, t_{i+1}]$. Thus, $S$ and $N$ are compact sets with $S \cup N = [a,b]$ (note that while $S$ and $N$ aren't quite disjoint, they are at least "non-overlapping"; their interiors are disjoint).

Our goal here is to show that $m \ge n + 2$. (So far we only know that $m \ge 2$!) Let's suppose that $m < n + 2$ and see what goes wrong.


Since any $(+)$ interval is strictly separated from any $(-)$ interval, we can find points $z_1, \ldots, z_{m-1} \in N$ such that
\[ \max I_{k_1} < z_1 < \min I_{k_1+1}, \qquad \max I_{k_2} < z_2 < \min I_{k_2+1}, \qquad \ldots, \qquad \max I_{k_{m-1}} < z_{m-1} < \min I_{k_{m-1}+1}. \]
And now we construct the offending polynomial:
\[ q(x) = (z_1 - x)(z_2 - x) \cdots (z_{m-1} - x). \]

Notice that $q \in \mathcal{P}_n$ since $m - 1 \le n$. (Here is the only use we'll make of the assumption $m < n + 2$!) We're going to show that $p + \lambda q \in \mathcal{P}_n$ is a better approximation to $f$ than $p$, for some suitable scalar $\lambda$.

We first claim that $q$ and $f - p$ have the same sign. Indeed, $q$ has no zeros in any of the $(\pm)$ intervals, hence is of constant sign on any such interval. Thus, $q > 0$ on $I_1, \ldots, I_{k_1}$, because each $(z_j - x) > 0$ on these intervals; $q < 0$ on $I_{k_1+1}, \ldots, I_{k_2}$, because here $(z_1 - x) < 0$ while $(z_j - x) > 0$ for $j > 1$; and so on.

We next find $\lambda$. Let $e = \max_{x \in N} |f(x) - p(x)|$, where $N$ is the union of all the subintervals $[t_i, t_{i+1}]$ which are neither $(+)$ intervals nor $(-)$ intervals. Then, $e < E$. (Why?) Now choose $\lambda > 0$ so that $\lambda\|q\| < \min\{E - e,\ E/2\}$. We claim that $p + \lambda q$ is a better approximation to $f$ than $p$. One case is easy: If $x \in N$, then
\[ |f(x) - (p(x) + \lambda q(x))| \le |f(x) - p(x)| + \lambda|q(x)| \le e + \lambda\|q\| < E. \]
On the other hand, if $x \notin N$, then $x$ is in either a $(+)$ interval or a $(-)$ interval. In particular, we know that $|f(x) - p(x)| > E/2 > \lambda\|q\|$ and that $f(x) - p(x)$ and $\lambda q(x)$ have the same sign. Thus,
\[ |f(x) - (p(x) + \lambda q(x))| = |f(x) - p(x)| - \lambda|q(x)| \le E - \lambda \min_{x \in S} |q(x)| < E, \]


since $q$ is nonzero on $S$. This contradiction finishes the proof. (Phew!)

Remarks

1. It should be pointed out that the number $n + 2$ here is actually $1 + \dim \mathcal{P}_n$.

2. Notice, too, that if $f - p_n^*$ alternates in sign $n + 2$ times, then $f - p_n^*$ must have at least $n + 1$ zeros. Thus, $p_n^*$ actually agrees with $f$ (or "interpolates" $f$) at $n + 1$ points.

We're now ready to establish the uniqueness of the polynomial of best approximation.

Theorem 2. Let $f \in C[a,b]$. Then, the polynomial of best approximation to $f$ out of $\mathcal{P}_n$ is unique.

Proof. Suppose that $p, q \in \mathcal{P}_n$ both satisfy $\|f - p\| = \|f - q\| = E_n(f) = E$. Then, as we've seen, their average $r = (p + q)/2 \in \mathcal{P}_n$ is also best: $\|f - r\| = E$, since $f - r = (f - p)/2 + (f - q)/2$.

By Theorem 1, $f - r$ has an alternating set $x_0 < x_1 < \cdots < x_{n+1}$ containing $n + 2$ points. Thus, for each $i$,
\[ (f - p)(x_i) + (f - q)(x_i) = \pm 2E \quad \text{(alternating),} \]
while
\[ -E \le (f - p)(x_i),\ (f - q)(x_i) \le E. \]
But this means that
\[ (f - p)(x_i) = (f - q)(x_i) = \pm E \quad \text{(alternating)} \]
for each $i$. (Why?) That is, $x_0, x_1, \ldots, x_{n+1}$ is an alternating set for both $f - p$ and $f - q$. In particular, the polynomial $q - p = (f - p) - (f - q)$ has $n + 2$ zeros! Since $q - p \in \mathcal{P}_n$, we must have $p = q$.

Finally, we come full circle:

Theorem 3. Let $f \in C[a,b]$, and let $p \in \mathcal{P}_n$. If $f - p$ has an alternating set containing $n + 2$ (or more) points, then $p$ is the best approximation to $f$ out of $\mathcal{P}_n$.


Proof. Let $x_0 < x_1 < \cdots < x_{n+1}$ be an alternating set for $f - p$, and suppose that some $q \in \mathcal{P}_n$ is a better approximation to $f$ than $p$; that is, $\|f - q\| < \|f - p\|$. In particular, then, we must have
\[ |f(x_i) - p(x_i)| = \|f - p\| > \|f - q\| \ge |f(x_i) - q(x_i)| \]
for each $i = 0, 1, \ldots, n+1$. Now the inequality $|a| > |b|$ implies that $a$ and $a - b$ have the same sign (why?), hence $q - p = (f - p) - (f - q)$ alternates in sign $n + 2$ times (because $f - p$ does). But then, $q - p$ would have at least $n + 1$ zeros. Since $q - p \in \mathcal{P}_n$, we must have $q = p$, which is a contradiction. Thus, $p$ is the best approximation to $f$ out of $\mathcal{P}_n$.

Example (taken from Rivlin)

While an alternating set for $f - p_n^*$ is supposed to have at least $n + 2$ points, it may well have more than $n + 2$ points; thus, alternating sets need not be unique. For example, consider the function $f(x) = \sin 4x$ on $[-\pi, \pi]$. Since there are 8 points where $f$ alternates between $\pm 1$, it follows that $p_0^* = 0$ and that there are $4 \cdot 4 = 16$ different alternating sets consisting of exactly 2 points (not to mention all those with more than 2 points). In addition, notice that we actually have $p_1^* = \cdots = p_6^* = 0$, but that $p_7^* \ne 0$. (Why?)

Exercise

Show that $y = x - 1/8$ is the best linear approximation to $y = x^2$ on $[0,1]$.
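Theorem 3 turns this exercise into a finite check, which the sketch below (assumes NumPy) carries out numerically. The error $e(x) = x^2 - (x - 1/8) = x^2 - x + 1/8$ takes the values $+1/8, -1/8, +1/8$ at $x = 0, 1/2, 1$, an alternating set of $n + 2 = 3$ points for $n = 1$, and $|e|$ never exceeds $1/8$.

```python
import numpy as np

e = lambda x: x ** 2 - (x - 0.125)   # error of the candidate x - 1/8

# Values at the claimed alternating set 0, 1/2, 1.
values = [e(0.0), e(0.5), e(1.0)]

# Sup norm of the error over a fine grid on [0, 1].
xs = np.linspace(0, 1, 100001)
max_err = float(np.max(np.abs(e(xs))))
```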

Essentially repeating the proof given for Theorem 3 yields a lower bound for $E_n(f)$.

Theorem. Let $f \in C[a,b]$, and suppose that $q \in \mathcal{P}_n$ is such that $f(x_i) - q(x_i)$ alternates in sign at $n + 2$ points $a \le x_0 < x_1 < \cdots < x_{n+1} \le b$. Then,
\[ E_n(f) \ \ge\ \min_{i = 0, \ldots, n+1} |f(x_i) - q(x_i)|. \]

Proof. If the inequality fails, then the best approximation $p = p_n^*$ would satisfy
\[ \max_{0 \le i \le n+1} |f(x_i) - p(x_i)| \ \le\ E_n(f) \ <\ \min_{0 \le i \le n+1} |f(x_i) - q(x_i)|. \]
Now we could repeat (essentially) the same argument used in the proof of Theorem 3 to arrive at a contradiction. The details are left as an exercise.
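This bound is useful in practice because it certifies $E_n(f)$ from below using only a candidate $q$ and a handful of sample points. The sketch below (names ours) applies it to $f(x) = x^2$, $n = 1$ on $[0,1]$ with the deliberately non-optimal candidate $q(x) = x - 0.1$; the resulting bound $E_1(f) \ge 0.1$ is consistent with the true value $1/8$ from the exercise above.

```python
# Error of the candidate q(x) = x - 0.1 against f(x) = x^2.
r = lambda x: x ** 2 - (x - 0.1)

pts = [0.0, 0.5, 1.0]                 # n + 2 = 3 points
errs = [r(x) for x in pts]            # +0.1, -0.15, +0.1 (up to rounding)

signs_alternate = all(errs[i] * errs[i + 1] < 0 for i in range(len(errs) - 1))
lower_bound = min(abs(v) for v in errs)   # de la Vallee Poussin bound: >= 0.1
```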


Even for relatively simple functions, the problem of actually finding the polynomial of best approximation is genuinely difficult (even computationally). We end this section by stating two important problems that Chebyshev was able to solve.

Problem

Find the polynomial $p_{n-1}^* \in \mathcal{P}_{n-1}$, of degree at most $n - 1$, that best approximates $f(x) = x^n$ on the interval $[-1,1]$. (This particular choice of interval makes for a tidy solution; we'll discuss the general situation later.)

Since $p_{n-1}^*$ is to minimize $\max_{|x| \le 1} |x^n - p_{n-1}(x)|$, our first problem is equivalent to:

Problem

Find the monic polynomial of degree $n$ which deviates least from 0 on $[-1,1]$. In other words, find the monic polynomial of degree $n$ which has smallest norm in $C[-1,1]$.

We'll give two solutions to this problem (which we know has a unique solution, of course). First, let's simplify our notation. We write
\[ p(x) = x^n - p_{n-1}^*(x) \quad \text{(the solution)} \qquad \text{and} \qquad M = \|p\| = E_{n-1}(x^n; [-1,1]). \]
All we know about $p$ is that it has an alternating set $-1 \le x_0 < x_1 < \cdots < x_n \le 1$ containing $(n-1) + 2 = n + 1$ points; that is, $|p(x_i)| = M$ and $p(x_{i+1}) = -p(x_i)$ for all $i$. Using this tiny bit of information, Chebyshev was led to compare the polynomials $p^2$ and $p'$. Watch closely!

Step 1. At any $x_i$ in $(-1, 1)$, we must have $p'(x_i) = 0$ (because $p(x_i)$ is a relative extreme value for $p$). But $p'$ is a polynomial of degree $n - 1$ and so can have at most $n - 1$ zeros. Thus, we must have
\[ x_i \in (-1, 1) \quad \text{and} \quad p'(x_i) = 0 \qquad \text{for } i = 1, \ldots, n-1 \]
(in fact, $x_1, \ldots, x_{n-1}$ are all the zeros of $p'$), and
\[ x_0 = -1,\ \ p'(x_0) \ne 0; \qquad x_n = 1,\ \ p'(x_n) \ne 0. \]


Step 2. Now consider the polynomial $M^2 - p^2 \in \mathcal{P}_{2n}$. We know that $M^2 - (p(x_i))^2 = 0$ for $i = 0, 1, \ldots, n$, and that $M^2 - p^2 \ge 0$ on $[-1,1]$. Thus, $x_1, \ldots, x_{n-1}$ must be double roots (at least) of $M^2 - p^2$. But this makes for $2(n-1) + 2 = 2n$ roots already, so we must have them all. Hence, $x_1, \ldots, x_{n-1}$ are double roots, $x_0$ and $x_n$ are simple roots, and these are all the roots of $M^2 - p^2$.

Step 3. Next consider $(p')^2 \in \mathcal{P}_{2(n-1)}$. We know that $(p')^2$ has a double root at each of $x_1, \ldots, x_{n-1}$ (and no other roots), hence $(1 - x^2)(p'(x))^2$ has a double root at each of $x_1, \ldots, x_{n-1}$, and simple roots at $x_0$ and $x_n$. Since $(1 - x^2)(p'(x))^2 \in \mathcal{P}_{2n}$, we've found all of its roots.

Here's the point to all this "rooting":

Step 4. Since $M^2 - (p(x))^2$ and $(1 - x^2)(p'(x))^2$ are polynomials of the same degree with the same roots, they are, up to a constant multiple, the same polynomial! It's easy to see what constant, too: The leading coefficient of $p$ is 1, while the leading coefficient of $p'$ is $n$; thus,
\[ M^2 - (p(x))^2 = \frac{(1 - x^2)(p'(x))^2}{n^2}. \]
After tidying up,
\[ \frac{p'(x)}{\sqrt{M^2 - (p(x))^2}} = \frac{n}{\sqrt{1 - x^2}}. \]

We really should have an extra $\pm$ here, but we know that $p'$ is positive on some interval; we'll simply assume that it's positive on $[-1, x_1]$. Now, upon integrating,
\[ \arccos\left( \frac{p(x)}{M} \right) = n \arccos x + C, \]
or
\[ p(x) = M \cos(n \arccos x + C). \]
But $p(-1) = -M$ (because $p'(-1) \ge 0$), so
\[ \cos(n\pi + C) = -1 \implies C = m\pi \ \ (\text{with } n + m \text{ odd}) \implies p(x) = \pm M \cos(n \arccos x) \implies p(\cos x) = \pm M \cos nx. \]
Look familiar? Since we know that $\cos nx$ is a polynomial of degree $n$ in $\cos x$ with leading coefficient $2^{n-1}$ (the $n$-th Chebyshev polynomial $T_n$), the solution to our problem must be
\[ p(x) = 2^{-n+1}\, T_n(x). \]


Since $|T_n(x)| \le 1$ for $|x| \le 1$ (why?), the minimum norm is $M = 2^{-n+1}$.
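A quick numerical sanity check of this minimality is reassuring. The sketch below (assumes NumPy; the comparison against random perturbations is our own illustration, not part of the proof) evaluates the monic polynomial $2^{1-n} T_n$ on a fine grid and confirms that subtracting random elements of $\mathcal{P}_{n-1}$ only increases the sup norm.

```python
import numpy as np

n = 6
xs = np.linspace(-1, 1, 20001)
Tn = np.cos(n * np.arccos(xs))        # T_n on [-1, 1]
p = 2.0 ** (1 - n) * Tn               # the minimal monic polynomial
norm_p = float(np.max(np.abs(p)))     # should be 2^(1-n)

# Try random competitors q in P_{n-1}; none should beat the minimal norm.
rng = np.random.default_rng(0)
beats = 0
for _ in range(200):
    q = np.polynomial.Polynomial(rng.normal(size=n))  # degree <= n - 1
    beats += int(np.max(np.abs(p - q(xs))) < norm_p)
```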

Next we give a "fancy" solution, based on our characterization of best approximations (Theorem 3) and a few simple properties of the Chebyshev polynomials.

Theorem. For any $n \ge 1$, the formula $p(x) = x^n - 2^{-n+1}\, T_n(x)$ defines a polynomial $p \in \mathcal{P}_{n-1}$ satisfying
\[ 2^{-n+1} = \max_{|x| \le 1} |x^n - p(x)| < \max_{|x| \le 1} |x^n - q(x)| \]
for any other $q \in \mathcal{P}_{n-1}$.

Proof. We know that $2^{-n+1}\, T_n(x)$ has leading coefficient 1, and so $p \in \mathcal{P}_{n-1}$. Now set $x_k = \cos((n-k)\pi/n)$ for $k = 0, 1, \ldots, n$. Then, $-1 = x_0 < x_1 < \cdots < x_n = 1$ and
\[ T_n(x_k) = T_n\bigl( \cos((n-k)\pi/n) \bigr) = \cos((n-k)\pi) = (-1)^{n-k}. \]
Since $|T_n(x)| = |T_n(\cos\theta)| = |\cos n\theta| \le 1$ for $-1 \le x \le 1$, we've found an alternating set for $T_n$ containing $n + 1$ points.

In other words, $x^n - p(x) = 2^{-n+1}\, T_n(x)$ satisfies $|x^n - p(x)| \le 2^{-n+1}$ and, for each $k = 0, 1, \ldots, n$, has $x_k^n - p(x_k) = 2^{-n+1}\, T_n(x_k) = (-1)^{n-k}\, 2^{-n+1}$. By our characterization of best approximations (Theorem 3), $p$ must be the best approximation to $x^n$ out of $\mathcal{P}_{n-1}$.

Corollary. The monic polynomial of degree exactly $n$ having smallest norm in $C[a,b]$ is
\[ \frac{(b-a)^n}{2^{2n-1}}\ T_n\!\left( \frac{2x - b - a}{b - a} \right). \]

Proof. Exercise. [Hint: If $p(x)$ is a polynomial of degree $n$ with leading coefficient 1, then $\tilde{p}(x) = p\bigl( (2x - b - a)/(b - a) \bigr)$ is a polynomial of degree $n$ with leading coefficient $2^n/(b-a)^n$. Moreover, $\max_{a \le x \le b} |\tilde{p}(x)| = \max_{-1 \le x \le 1} |p(x)|$.]

Properties of the Chebyshev Polynomials

As we've seen, the Chebyshev polynomial T_n(x) is the (unique, real) polynomial of degree n (having leading coefficient 1 if n = 0, and 2^{n−1} if n ≥ 1) such that T_n(cos θ) = cos nθ


for all θ. The Chebyshev polynomials have dozens of interesting properties and satisfy all

sorts of curious equations. We'll catalogue just a few.

C1. T_n(x) = 2x T_{n−1}(x) − T_{n−2}(x) for n ≥ 2.

Proof. It follows from the trig identity cos nθ = 2 cos θ cos(n−1)θ − cos(n−2)θ that T_n(cos θ) = 2 cos θ T_{n−1}(cos θ) − T_{n−2}(cos θ) for all θ; that is, the equation T_n(x) = 2x T_{n−1}(x) − T_{n−2}(x) holds for all −1 ≤ x ≤ 1. But since both sides are polynomials, equality must hold for all x.

The next two properties are proved in essentially the same way:

C2. T_m(x) T_n(x) = ½ [T_{m+n}(x) + T_{m−n}(x)] for m > n.

C3. T_m(T_n(x)) = T_{mn}(x).

C4. T_n(x) = ½ [(x + √(x² − 1))^n + (x − √(x² − 1))^n].

Proof. First notice that the expression on the right-hand side is actually a polynomial, since on combining the binomial expansions of (x + √(x² − 1))^n and (x − √(x² − 1))^n, the odd powers of √(x² − 1) cancel. Next, for x = cos θ,

T_n(x) = T_n(cos θ) = cos nθ = ½ (e^{inθ} + e^{−inθ})
       = ½ [(cos θ + i sin θ)^n + (cos θ − i sin θ)^n]
       = ½ [(x + i√(1 − x²))^n + (x − i√(1 − x²))^n]
       = ½ [(x + √(x² − 1))^n + (x − √(x² − 1))^n].

We've shown that these two polynomials agree for |x| ≤ 1, hence they must agree for all x (real or complex, for that matter).
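Property C4 (and C5 just below) can be spot-checked numerically. The sketch below is our own, not part of the notes; it uses a complex square root so that the single closed-form expression covers both |x| ≤ 1 and |x| ≥ 1:

```python
import cmath
import math

def t_closed(n, x):
    # Property C4: T_n(x) = ((x + sqrt(x^2-1))^n + (x - sqrt(x^2-1))^n) / 2.
    # A complex square root makes the same formula work for |x| < 1,
    # where x^2 - 1 is negative; the imaginary parts cancel.
    s = cmath.sqrt(x * x - 1)
    return (((x + s) ** n + (x - s) ** n) / 2).real

# On [-1, 1] the formula agrees with cos(n * arccos x) ...
vals_inside = [abs(t_closed(4, x) - math.cos(4 * math.acos(x)))
               for x in (-0.9, 0.0, 0.3, 1.0)]
# ... and for x >= 1 it agrees with cosh(n * arccosh x) (property C5).
vals_outside = [abs(t_closed(4, x) - math.cosh(4 * math.acosh(x)))
                for x in (1.0, 1.5, 3.0)]
```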

For real x with |x| ≥ 1, the expression ½ [(x + √(x² − 1))^n + (x − √(x² − 1))^n] equals cosh(n cosh^{−1} x). In other words,

C5. T_n(cosh x) = cosh nx for all real x.

The next property also follows from property C4.


C6. |T_n(x)| ≤ (|x| + √(x² − 1))^n for |x| ≥ 1.

An approach similar to the proof of property C4 allows us to write x^n in terms of the Chebyshev polynomials T₀, T₁, …, T_n.

C7. For n odd, 2^n x^n = Σ_{k=0}^{[n/2]} 2 (n choose k) T_{n−2k}(x); for n even, the final term 2 T₀ should be replaced by T₀.

Proof. For −1 ≤ x ≤ 1,

2^n x^n = 2^n (cos θ)^n = (e^{iθ} + e^{−iθ})^n
        = e^{inθ} + (n choose 1) e^{i(n−2)θ} + (n choose 2) e^{i(n−4)θ} + ⋯
          + (n choose n−2) e^{−i(n−4)θ} + (n choose n−1) e^{−i(n−2)θ} + e^{−inθ}
        = 2 cos nθ + (n choose 1) 2 cos(n−2)θ + (n choose 2) 2 cos(n−4)θ + ⋯
        = 2 T_n(x) + (n choose 1) 2 T_{n−2}(x) + (n choose 2) 2 T_{n−4}(x) + ⋯,

where, if n is even, the last term in this last sum is (n choose [n/2]) T₀ (since the central term in the binomial expansion, namely (n choose [n/2]) T₀, isn't doubled in this case).
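Here is a small sanity check of C7 (a sketch of our own; `x_power_via_cheb` is a hypothetical helper name, not from the notes). It reassembles x^n from the Chebyshev expansion, using T_m(x) = cos(m arccos x) on [−1, 1]:

```python
from math import comb, cos, acos

def x_power_via_cheb(n, x):
    # Property C7: 2^n x^n = sum_{k=0}^{[n/2]} 2*C(n,k)*T_{n-2k}(x),
    # with the doubled T_0 term replaced by a single T_0 when n is even.
    total = 0.0
    for k in range(n // 2 + 1):
        weight = comb(n, k) * (1.0 if 2 * k == n else 2.0)
        total += weight * cos((n - 2 * k) * acos(x))  # T_m(x) = cos(m*arccos x)
    return total / 2 ** n

errors = [abs(x_power_via_cheb(n, x) - x ** n)
          for n in (3, 4, 5, 6) for x in (-0.8, -0.1, 0.25, 0.9)]
```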

C8. The zeros of T_n are x_k^{(n)} = cos((2k − 1)π/2n), k = 1, …, n. They're real, simple, and lie in the open interval (−1, 1).

Proof. Just check! But notice, please, that the zeros are listed here in decreasing order

(because cosine decreases).

C9. Between two consecutive zeros of T_n, there is precisely one root of T_{n−1}.

Proof. It's not hard to check that

(2k − 1)/2n < (2k − 1)/2(n − 1) < (2k + 1)/2n

for k = 1, …, n − 1, which means that x_k^{(n)} > x_k^{(n−1)} > x_{k+1}^{(n)}.
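Properties C8 and C9 are easy to verify numerically; the following sketch (ours, not from the notes) lists the zeros in increasing order and checks the interlacing:

```python
import math

def cheb_zeros(n):
    # C8: the zeros of T_n are cos((2k-1)pi/2n), k = 1..n;
    # we return them sorted in increasing order.
    return sorted(math.cos((2 * k - 1) * math.pi / (2 * n))
                  for k in range(1, n + 1))

n = 8
z_n, z_nm1 = cheb_zeros(n), cheb_zeros(n - 1)
# C9: each zero of T_{n-1} lies strictly between consecutive zeros of T_n.
interlaced = all(z_n[i] < z_nm1[i] < z_n[i + 1] for i in range(n - 1))
```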

C10. T_n and T_{n−1} have no common zeros.

Proof. Although this is immediate from property C9, there's another way to see it: T_n(x₀) = 0 = T_{n−1}(x₀) implies that T_{n−2}(x₀) = 0 by property C1. Repeating this observation, we would have T_k(x₀) = 0 for every k < n, including k = 0. No good! T₀(x) = 1 has no zeros.

C11. The set {x_k^{(n)} : 1 ≤ k ≤ n, n = 1, 2, …} is dense in [−1, 1].

Proof. Since cos x is (strictly) monotone on [0, π], it's enough to know that the set {(2k − 1)π/2n}_{k≤n} is dense in [0, π], and for this it's enough to know that {(2k − 1)/2n}_{k≤n} is dense in [0, 1]. (Why?) But

(2k − 1)/2n = k/n − 1/2n ≈ k/n

for n large; that is, the set {(2k − 1)/2n}_{k≤n} is dense among the rationals in [0, 1].

It's interesting to note here that the distribution of the roots {x_k^{(n)}}_{k≤n} can be estimated (see Natanson, Constructive Function Theory, Vol. I, pp. 48–51). For large n, the number of roots of T_n that lie in an interval [x, x + Δx] ⊆ [−1, 1] is approximately

n Δx / (π √(1 − x²)).

In particular, for n large, the roots of T_n are "thickest" near the endpoints ±1.

In probabilistic terms, this means that if we assign equal probability to each of the roots x_1^{(n)}, …, x_n^{(n)} (that is, if we think of each root as the position of a point with mass 1/n), then the density of this probability distribution (or the density of the system of point masses) at a point x is approximately 1/(π√(1 − x²)) for large n. In still other words, this tells us that the probability that a root of T_n lies in the interval [a, b] is approximately

(1/π) ∫_a^b dx/√(1 − x²).

C12. The Chebyshev polynomials are mutually orthogonal relative to the weight w(x) = (1 − x²)^{−1/2} on [−1, 1].

Proof. For m ≠ n the substitution x = cos θ yields

∫_{−1}^{1} T_n(x) T_m(x) dx/√(1 − x²) = ∫_{0}^{π} cos mθ cos nθ dθ = 0,

while for m = n we get

∫_{−1}^{1} T_n(x)² dx/√(1 − x²) = ∫_{0}^{π} cos² nθ dθ = π if n = 0, and π/2 if n > 0.
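The orthogonality relations can be confirmed with a crude midpoint rule applied to the substituted integral ∫₀^π cos mθ cos nθ dθ (a numerical sketch of our own, not from the notes):

```python
import math

def cheb_inner(m, n, steps=100000):
    # Midpoint-rule approximation of int_0^pi cos(m t) cos(n t) dt,
    # the substituted form of the weighted inner product from C12.
    h = math.pi / steps
    return sum(math.cos(m * t) * math.cos(n * t)
               for t in (h * (i + 0.5) for i in range(steps))) * h

orth = cheb_inner(3, 5)   # distinct indices: integral should be ~0
same = cheb_inner(4, 4)   # equal indices, n > 0: should be ~pi/2
zero = cheb_inner(0, 0)   # n = 0 case: should be ~pi
```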

C13. |T_n′(x)| ≤ n² for −1 ≤ x ≤ 1, and |T_n′(±1)| = n².

Proof. For −1 < x < 1 we have

(d/dx) T_n(x) = (d/dθ) T_n(cos θ) / (d/dθ) cos θ = n sin nθ / sin θ.

Thus, |T_n′(x)| ≤ n² because |sin nθ| ≤ n |sin θ| (which can be easily checked by induction, for example). At x = ±1, we interpret this derivative formula as a limit (as θ → 0 and θ → π) and find that |T_n′(±1)| = n².

As we'll see later, each p ∈ P_n satisfies |p′(x)| ≤ ‖p‖ n² = ‖p‖ T_n′(1) for −1 ≤ x ≤ 1, and this is, of course, best possible. As it happens, T_n(x) has the largest possible rate of growth outside of [−1, 1] among all polynomials of degree n. Specifically:

Theorem. Let p ∈ P_n and let ‖p‖ = max_{−1≤x≤1} |p(x)|. Then, for any x₀ with |x₀| ≥ 1 and any k = 0, 1, …, n, we have

|p^{(k)}(x₀)| ≤ ‖p‖ |T_n^{(k)}(x₀)|,

where p^{(k)} is the k-th derivative of p.

We'll prove only the case k = 0. In other words, we'll check that |p(x₀)| ≤ ‖p‖ |T_n(x₀)|. The more general case is in Rivlin, Theorem 1.10, p. 31.

Proof. Since all the zeros of T_n lie in (−1, 1), we know that T_n(x₀) ≠ 0. Thus, we may consider the polynomial

q(x) = (p(x₀)/T_n(x₀)) T_n(x) − p(x) ∈ P_n.

If the claim is false, then

‖p‖ < |p(x₀)/T_n(x₀)|.

Now at each of the points y_k = cos(kπ/n), k = 0, 1, …, n, we have T_n(y_k) = (−1)^k and, hence,

q(y_k) = (−1)^k (p(x₀)/T_n(x₀)) − p(y_k).

Since |p(y_k)| ≤ ‖p‖, it follows that q alternates in sign at these n + 1 points. In particular, q must have at least n zeros in (−1, 1). But q(x₀) = 0, by design, and |x₀| ≥ 1. That is, we've found n + 1 zeros for a polynomial of degree n. So, q ≡ 0; that is,

p(x) = (p(x₀)/T_n(x₀)) T_n(x).

But then,

|p(1)| = |p(x₀)/T_n(x₀)| > ‖p‖,

since T_n(1) = T_n(cos 0) = 1, which is a contradiction.

Corollary. Let p ∈ P_n and let ‖p‖ = max_{−1≤x≤1} |p(x)|. Then, for any x₀ with |x₀| ≥ 1, we have

|p(x₀)| ≤ ‖p‖ (|x₀| + √(x₀² − 1))^n.

Rivlin's proof of our last Theorem in the general case uses the following observation:

C14. For x ≥ 1 and k = 0, 1, …, n, we have T_n^{(k)}(x) > 0.

Proof. Exercise. [Hint: It follows from Rolle's theorem that T_n^{(k)} is never zero for x ≥ 1. (Why?) Now just compute T_n^{(k)}(1).]

Uniform Approximation by Trig Polynomials

We end this section by summarizing (without proofs) the analogues of Theorems 1–3 for uniform approximation by trig polynomials. Throughout, f ∈ C^{2π} and T_n denotes the collection of trig polynomials of degree at most n.

1. f has a best approximation T* ∈ T_n.

2. f − T* has an alternating set containing 2n + 2 (or more) points in [0, 2π). (Note here that 2n + 2 = 1 + dim T_n.)

3. T* is unique.

4. If T ∈ T_n is such that f − T has an alternating set containing 2n + 2 or more points in [0, 2π), then T = T*.

The proofs of 1–4 are very similar to the corresponding results for algebraic polynomials. As you might imagine, 2 is where all the fighting takes place, and there are a few technical difficulties to cope with. Nevertheless, we'll swallow these facts whole and apply them with a clear conscience to a few examples.

Example

For m > n, the best approximation to f(x) = A cos mx + B sin mx out of T_n is 0!

Proof. We may write f(x) = R cos m(x − x₀) for some R and x₀. (How?) Now we need only display a sufficiently large alternating set for f (in some interval of length 2π). Setting x_k = x₀ + kπ/m, k = 1, 2, …, 2m, we get f(x_k) = R cos kπ = R(−1)^k and x_k ∈ (x₀, x₀ + 2π]. Since m > n, it follows that 2m ≥ 2n + 2.

Example

The best approximation to

f(x) = a₀ + Σ_{k=1}^{n+1} (a_k cos kx + b_k sin kx)

out of T_n is

T*(x) = a₀ + Σ_{k=1}^{n} (a_k cos kx + b_k sin kx),

and ‖f − T*‖ = √(a_{n+1}² + b_{n+1}²) in C^{2π}.

Proof. By our last example, the best approximation to f − T* out of T_n is 0, hence T* must be the best approximation to f. (Why?) The last assertion is easy to check: Since we can always write A cos mx + B sin mx = √(A² + B²) cos m(x − x₀) for some x₀, it follows that ‖f − T*‖ = √(a_{n+1}² + b_{n+1}²).
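The identity behind the last assertion, A cos mx + B sin mx = √(A² + B²) cos m(x − x₀), is easy to test numerically. The coefficients below are arbitrary sample values (our own sketch, not from the notes):

```python
import math

# a*cos(mx) + b*sin(mx) = sqrt(a^2 + b^2) * cos(m(x - x0)) for some x0,
# so its sup norm over a period is sqrt(a^2 + b^2).
a, b, m = 0.3, -0.4, 7           # arbitrary sample coefficients
g = lambda x: a * math.cos(m * x) + b * math.sin(m * x)

# sample |g| densely over one period [0, 2*pi)
sup = max(abs(g(2 * math.pi * k / 20000)) for k in range(20000))
```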

Finally, let's make a simple connection between the two types of polynomial approximation:

Theorem. Let f ∈ C[−1, 1] and define φ ∈ C^{2π} by φ(θ) = f(cos θ). Then,

E_n(f) = min_{p∈P_n} ‖f − p‖ = min_{T∈T_n} ‖φ − T‖ ≡ E_n^T(φ).

Proof. Suppose that p*(x) = Σ_{k=0}^{n} a_k x^k is the best approximation to f out of P_n. Then, T̂(θ) = p*(cos θ) is in T_n and, clearly,

max_{−1≤x≤1} |f(x) − p*(x)| = max_{0≤θ≤2π} |f(cos θ) − p*(cos θ)|.

Thus, ‖f − p*‖ = ‖φ − T̂‖ ≥ min_{T∈T_n} ‖φ − T‖.

On the other hand, since φ is even, we know that T*, its best approximation out of T_n, is also even. Thus, T*(θ) = q(cos θ) for some algebraic polynomial q ∈ P_n. Consequently, ‖φ − T*‖ = ‖f − q‖ ≥ min_{p∈P_n} ‖f − p‖.

Remarks

1. Once we know that min_{p∈P_n} ‖f − p‖ = min_{T∈T_n} ‖φ − T‖, it follows that we must also have T*(θ) = p*(cos θ).

2. Each even φ ∈ C^{2π} corresponds to an f ∈ C[−1, 1] by f(x) = φ(arccos x) and, of course, the conclusions of the Theorem and of Remark 1 hold in this case, too.

3. Whenever we speak of even trig polynomials, the Chebyshev polynomials are lurking somewhere in the background. Indeed, let T(θ) be an even trig polynomial, write x = cos θ, as usual, and consider the following cryptic equation:

T(θ) = Σ_{k=0}^{n} a_k cos kθ = Σ_{k=0}^{n} a_k T_k(cos θ) = p(cos θ),

where p(x) = Σ_{k=0}^{n} a_k T_k(x) ∈ P_n.

Problem Set: Chebyshev Polynomials

Math 682 5/28/98

We've shown that cos nθ and sin(n + 1)θ / sin θ can be written as algebraic polynomials of degree n in cos θ; we use this observation to define the Chebyshev polynomials. The Chebyshev polynomials of the first kind (T_n(x)) are defined by T_n(cos θ) = cos nθ, for n = 0, 1, 2, …, while the Chebyshev polynomials of the second kind (U_n(x)) are defined by U_n(cos θ) = sin(n + 1)θ / sin θ for n = 0, 1, 2, ….

. 44. Establish the following properties of T_n(x).

(i) T₀(x) = 1, T₁(x) = x, and T_n(x) = 2x T_{n−1}(x) − T_{n−2}(x) for n ≥ 2.

(ii) T_n(x) is a polynomial of degree n having leading coefficient 2^{n−1} for n ≥ 1, and containing only even (resp., odd) powers of x if n is even (resp., odd).

(iii) |T_n(x)| ≤ 1 for −1 ≤ x ≤ 1; when does equality occur? Where are the zeros of T_n(x)? Show that between two consecutive zeros of T_n(x) there is exactly one zero of T_{n−1}(x). Can T_n(x) and T_{n−1}(x) have a common zero?

(iv) |T_n′(x)| ≤ n² for −1 ≤ x ≤ 1, and |T_n′(±1)| = n².

(v) T_m(x) T_n(x) = ½ [T_{m+n}(x) + T_{m−n}(x)] for m > n.

(vi) T_m(T_n(x)) = T_{mn}(x).

(vii) Evaluate ∫_{−1}^{1} T_n(x) T_m(x) dx/√(1 − x²).

(viii) Show that T_n is a solution to (1 − x²)y″ − xy′ + n²y = 0.

(ix) T_n(x) = ½ [(x + √(x² − 1))^n + (x − √(x² − 1))^n] for any x, real or complex.

(x) Re(Σ_{n=0}^{∞} t^n e^{inθ}) = Σ_{n=0}^{∞} t^n cos nθ = (1 − t cos θ)/(1 − 2t cos θ + t²) for −1 < t < 1; that is, Σ_{n=0}^{∞} t^n T_n(x) = (1 − tx)/(1 − 2tx + t²) (this is a generating function for T_n; it's closely related to the Poisson kernel).

(xi) Find analogues of (i)–(x) (if possible) for U_n(x).

. 45. Show that every p ∈ P_n has a unique representation as p = a₀ + a₁T₁ + ⋯ + a_nT_n. Find this representation in the case p(x) = x^n.

Chebyshev Polynomials 65

. 46. The polynomial of degree n having leading coefficient 1 and deviating least from 0 on [−1, 1] is given by T_n(x)/2^{n−1}. On an arbitrary interval [a, b] we would instead take

((b − a)^n / 2^{2n−1}) T_n((2x − b − a)/(b − a)).

Is this solution unique? Explain.

47. If p is a polynomial on [a, b] of degree n having leading coefficient a_n > 0, then ‖p‖ ≥ a_n (b − a)^n / 2^{2n−1}. If b − a ≥ 4, then no polynomial of degree exactly n with integer coefficients can satisfy ‖p‖ < 2 (compare this with problem 26 on the "Uniform Approximation by Polynomials" problem set).

48. Given p ∈ P_n, show that |p(x)| ≤ ‖p‖ |T_n(x)| for |x| > 1.

49. If p ∈ P_n with ‖p‖ = 1 on [−1, 1], and if |p(x_i)| = 1 at n + 1 distinct points x₀, …, x_n in [−1, 1], show that either p ≡ ±1, or else p = ±T_n. [Hint: One approach is to compare the polynomials 1 − p² and (1 − x²)(p′)².]

50. Compute T_n^{(k)}(1) for k = 0, 1, …, n, where T_n^{(k)} is the k-th derivative of T_n. For x ≥ 1 and k = 0, 1, …, n, show that T_n^{(k)}(x) > 0.

Examples: Chebyshev Polynomials in Practice

Math 682 5/28/98

The following examples are cribbed from the book Chebyshev Polynomials, by L. Fox and

I. B. Parker (Oxford University Press, 1968).

As we've seen, the Chebyshev polynomials can be generated by a recurrence relation. By reversing the procedure, we could solve for x^n in terms of T₀, T₁, …, T_n (we'll do this calculation in class). Here are the first few terms in each of these relations:

T₀(x) = 1                     1 = T₀(x)
T₁(x) = x                     x = T₁(x)
T₂(x) = 2x² − 1               x² = (T₀(x) + T₂(x))/2
T₃(x) = 4x³ − 3x              x³ = (3 T₁(x) + T₃(x))/4
T₄(x) = 8x⁴ − 8x² + 1         x⁴ = (3 T₀(x) + 4 T₂(x) + T₄(x))/8
T₅(x) = 16x⁵ − 20x³ + 5x      x⁵ = (10 T₁(x) + 5 T₃(x) + T₅(x))/16

Note the separation of even and odd terms in each case. Writing ordinary, garden variety

polynomials in their equivalent Chebyshev form has some distinct advantages for numerical

computations. Here's why:

1 − x + x² − x³ + x⁴ = (15/8) T₀(x) − (7/4) T₁(x) + T₂(x) − (1/4) T₃(x) + (1/8) T₄(x)

(after some simplification). Now we see at once that we can get a cubic approximation to 1 − x + x² − x³ + x⁴ on [−1, 1] with error at most 1/8 by simply dropping the T₄ term on the right-hand side (since |T₄(x)| ≤ 1), whereas simply using 1 − x + x² − x³ as our cubic approximation could cause an error as big as 1. Pretty slick! This gimmick of truncating the equivalent Chebyshev form is called economization.
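Economization is mechanical enough to automate. The sketch below is our own (the notes give no code); it converts power-basis coefficients to Chebyshev coefficients by back-substitution against recurrence-generated T_k, and reproduces the expansion above:

```python
def power_to_cheb(a):
    # Convert power-basis coefficients a[0..n] (lowest degree first) into
    # Chebyshev coefficients c with p(x) = sum_k c[k] * T_k(x).
    a = list(map(float, a))
    n = len(a) - 1
    T = [[1.0], [0.0, 1.0]]                  # coefficient lists for T_0, T_1
    for k in range(2, n + 1):                # build T_k via the recurrence
        two_x_t = [0.0] + [2.0 * t for t in T[k - 1]]
        prev = T[k - 2] + [0.0] * (len(two_x_t) - len(T[k - 2]))
        T.append([u - v for u, v in zip(two_x_t, prev)])
    c = [0.0] * (n + 1)
    for k in range(n, -1, -1):               # back-substitute, top degree first
        c[k] = a[k] / T[k][k]                # T_k has leading coeff 2^(k-1)
        for j in range(k + 1):
            a[j] -= c[k] * T[k][j]
    return c

# The example from the text: 1 - x + x^2 - x^3 + x^4.
c = power_to_cheb([1, -1, 1, -1, 1])
```

Dropping the T₄ term then changes the polynomial by at most |c[4]| = 1/8 on [−1, 1], exactly as in the text.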

As a second example we note that a polynomial with small norm on [−1, 1] may have annoyingly large coefficients:

(1 − x²)¹⁰ = 1 − 10x² + 45x⁴ − 120x⁶ + 210x⁸ − 252x¹⁰ + 210x¹² − 120x¹⁴ + 45x¹⁶ − 10x¹⁸ + x²⁰,

Chebyshev Polynomials in Practice 67

but in Chebyshev form (look out!):

(1 − x²)¹⁰ = (1/524288) [92378 T₀(x) − 167960 T₂(x) + 125970 T₄(x) − 77520 T₆(x) + 38760 T₈(x) − 15504 T₁₀(x) + 4845 T₁₂(x) − 1140 T₁₄(x) + 190 T₁₆(x) − 20 T₁₈(x) + T₂₀(x)].

The largest coefficient is now only about 0.3, and the omission of the last three terms produces a maximum error of about 0.0004. Not bad.

As a last example, consider the Taylor polynomial e^x = Σ_{k=0}^{n} x^k/k! + x^{n+1} e^{ξ}/(n + 1)! (with remainder), where −1 ≤ x, ξ ≤ 1. Taking n = 6, the truncated series has error no greater than e/7! ≈ 0.0005. But if we "economize" the first six terms, then:

Σ_{k=0}^{6} x^k/k! = 1.26606 T₀(x) + 1.13021 T₁(x) + 0.27148 T₂(x) + 0.04427 T₃(x) + 0.00547 T₄(x) + 0.00052 T₅(x) + 0.00004 T₆(x).

The initial approximation already has an error of about 0.0005, so we can certainly drop the T₆ term without any additional error. Even dropping the T₅ term causes an error of no more than 0.001 (or thereabouts). The resulting approximation has a far smaller error than the corresponding truncated Taylor series: e/5! ≈ 0.023.

The approach used in our last example has the decided disadvantage that we must first decide where to truncate the Taylor series, which might converge very slowly. A better approach would be to write e^x as a series involving Chebyshev polynomials directly. That is, if possible, we want to write e^x = Σ_{k=0}^{∞} a_k T_k(x). If the a_k's are absolutely summable, it will be very easy to estimate any truncation error. We'll get some idea of how to go about this when we talk about "least-squares" approximation. As it happens, such a series is easy to find (it's rather like a Fourier series), and its partial sums are remarkably good uniform approximations.

A Brief Introduction to Interpolation

Math 682 6/2/98

Our goal in this section is to prove the following result (as well as discuss its ramifications). In fact, this result is so fundamental that we will present three proofs!

Theorem. Let x₀, x₁, …, x_n be distinct points, and let y₀, y₁, …, y_n be arbitrary points in R. Then, there exists a unique polynomial p ∈ P_n satisfying p(x_i) = y_i, i = 0, 1, …, n.

First notice that uniqueness is obvious. Indeed, if two polynomials p, q ∈ P_n agree at n + 1 points, then p ≡ q. (Why?) The real work comes in proving existence.

First Proof. (Vandermonde's determinant.) We seek c₀, c₁, …, c_n so that p(x) = Σ_{k=0}^{n} c_k x^k satisfies

p(x_i) = Σ_{k=0}^{n} c_k x_i^k = y_i,  i = 0, 1, …, n.

That is, we need to solve a system of n + 1 linear equations for the c_i's. In matrix form:

[ 1  x₀   x₀²  ⋯  x₀ⁿ  ] [ c₀  ]   [ y₀  ]
[ 1  x₁   x₁²  ⋯  x₁ⁿ  ] [ c₁  ]   [ y₁  ]
[ ⋮   ⋮    ⋮    ⋱   ⋮   ] [ ⋮   ] = [ ⋮   ]
[ 1  x_n  x_n² ⋯  x_nⁿ ] [ c_n ]   [ y_n ]

This equation always has a unique solution because the coefficient matrix has determinant

D = Π_{0≤j<i≤n} (x_i − x_j) ≠ 0.

(D is called Vandermonde's determinant; note that D > 0 if x₀ < x₁ < ⋯ < x_n.) Since this fact is of independent interest, we'll sketch a short proof below.

Lemma. D = Π_{0≤j<i≤n} (x_i − x_j).

Proof. Consider

                          | 1  x₀       x₀²       ⋯  x₀ⁿ       |
                          | 1  x₁       x₁²       ⋯  x₁ⁿ       |
V(x₀, x₁, …, x_{n−1}, x) = | ⋮   ⋮        ⋮        ⋱   ⋮        |
                          | 1  x_{n−1}  x_{n−1}²  ⋯  x_{n−1}ⁿ  |
                          | 1  x        x²        ⋯  xⁿ        |

Interpolation 69

V(x₀, x₁, …, x_{n−1}, x) is a polynomial of degree n in x, and it's 0 whenever x = x_i, i = 0, 1, …, n − 1. Thus, V(x₀, …, x) = c Π_{i=0}^{n−1} (x − x_i), by comparing roots and degree. However, it's easy to see that the coefficient of xⁿ in V(x₀, …, x) is V(x₀, …, x_{n−1}). Thus, V(x₀, …, x) = V(x₀, …, x_{n−1}) Π_{i=0}^{n−1} (x − x_i). The result now follows by induction and the obvious 2 × 2 case:

| 1  x₀ |
| 1  x₁ | = x₁ − x₀.
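The product formula for D is easy to test in exact arithmetic. The following sketch (ours, not from the notes) compares a brute-force Leibniz-expansion determinant against the product, using rational arithmetic to avoid rounding:

```python
from fractions import Fraction
from itertools import permutations

def det(M):
    # Leibniz expansion of the determinant; fine for small test matrices.
    n = len(M)
    total = Fraction(0)
    for perm in permutations(range(n)):
        sign = 1
        for i in range(n):
            for j in range(i + 1, n):
                if perm[i] > perm[j]:
                    sign = -sign          # each inversion flips the sign
        prod = Fraction(1)
        for i, col in enumerate(perm):
            prod *= M[i][col]
        total += sign * prod
    return total

def vandermonde_det(xs):
    # The coefficient matrix from the first proof: rows (1, x, x^2, ..., x^n).
    return det([[Fraction(x) ** k for k in range(len(xs))] for x in xs])

def product_formula(xs):
    # The lemma: D = prod over j < i of (x_i - x_j).
    prod = Fraction(1)
    for i in range(len(xs)):
        for j in range(i):
            prod *= Fraction(xs[i]) - Fraction(xs[j])
    return prod

ok = vandermonde_det([1, 2, 4, 7]) == product_formula([1, 2, 4, 7])
```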

Second Proof. (Lagrange interpolation.) We could define p immediately if we had polynomials ℓ_i(x) ∈ P_n, i = 0, …, n, such that ℓ_i(x_j) = δ_{ij} (where δ_{ij} is Kronecker's delta; that is, δ_{ij} = 0 for i ≠ j, and δ_{ij} = 1 for i = j). Indeed, p(x) = Σ_{i=0}^{n} y_i ℓ_i(x) would then work as our interpolating polynomial. In short, notice that the polynomials {ℓ₀, ℓ₁, …, ℓ_n} would form a (particularly convenient) basis for P_n.

We'll give two formulas for ℓ_i(x):

(a). Clearly, ℓ_i(x) = Π_{j≠i} (x − x_j)/(x_i − x_j) works.

(b). Start with W(x) = (x − x₀)(x − x₁) ⋯ (x − x_n), and notice that the polynomial we need satisfies

ℓ_i(x) = a_i W(x)/(x − x_i)

for some a_i ∈ R. (Why?) But then, 1 = ℓ_i(x_i) = a_i W′(x_i) (again, why?); that is,

ℓ_i(x) = W(x) / ((x − x_i) W′(x_i)).

Please note that ℓ_i(x) is a multiple of the polynomial Π_{j≠i} (x − x_j), for i = 0, …, n, and that p(x) is then a suitable linear combination of such polynomials.

Third Proof. (Newton's formula.) We seek p(x) of the form

p(x) = a₀ + a₁(x − x₀) + a₂(x − x₀)(x − x₁) + ⋯ + a_n(x − x₀) ⋯ (x − x_{n−1}).

(Please note that x_n does not appear on the right-hand side.) This form makes it almost effortless to solve for the a_i's by plugging in the x_i's, i = 0, …, n − 1.

y₀ = p(x₀) = a₀

y₁ = p(x₁) = a₀ + a₁(x₁ − x₀)  ⟹  a₁ = (y₁ − a₀)/(x₁ − x₀).

Continuing, we find

a₂ = (y₂ − a₀ − a₁(x₂ − x₀)) / ((x₂ − x₀)(x₂ − x₁))

a₃ = (y₃ − a₀ − a₁(x₃ − x₀) − a₂(x₃ − x₀)(x₃ − x₁)) / ((x₃ − x₀)(x₃ − x₁)(x₃ − x₂))

and so on. [Natanson, Vol. III, gives another formula for the a_i's.]

Example

As a quick means of comparing these three solutions, let's find the interpolating polynomial (quadratic) passing through (1, 2), (2, −1), and (3, 1). You're invited to check the following:

Vandermonde: p(x) = 10 − (21/2)x + (5/2)x².

Lagrange: p(x) = (x − 2)(x − 3) + (x − 1)(x − 3) + (1/2)(x − 1)(x − 2).

Newton: p(x) = 2 − 3(x − 1) + (5/2)(x − 1)(x − 2).
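All three methods produce the same polynomial, of course. The sketch below (our own code, not part of the notes) implements Newton's divided differences and direct Lagrange evaluation, and reproduces the coefficients 2, −3, 5/2 found above:

```python
def newton_coeffs(xs, ys):
    # Divided-difference coefficients a_i for Newton's form
    # p(x) = a0 + a1(x - x0) + a2(x - x0)(x - x1) + ...
    a = list(map(float, ys))
    for k in range(1, len(xs)):
        for i in range(len(xs) - 1, k - 1, -1):
            a[i] = (a[i] - a[i - 1]) / (xs[i] - xs[i - k])
    return a

def newton_eval(xs, a, x):
    # Horner-style evaluation of Newton's form.
    total = a[-1]
    for i in range(len(a) - 2, -1, -1):
        total = total * (x - xs[i]) + a[i]
    return total

def lagrange_eval(xs, ys, x):
    # Direct evaluation of sum_i y_i * l_i(x).
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        li = 1.0
        for j, xj in enumerate(xs):
            if j != i:
                li *= (x - xj) / (xi - xj)
        total += yi * li
    return total

xs, ys = [1.0, 2.0, 3.0], [2.0, -1.0, 1.0]
a = newton_coeffs(xs, ys)     # should match the Newton form in the example
```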

As you might have already surmised, Lagrange's method is the easiest to apply by hand, although Newton's formula has much to recommend it too (it's especially well-suited to situations where we introduce additional nodes). We next set up the necessary notation to discuss the finer points of Lagrange's method.

Given n + 1 distinct points a ≤ x₀ < x₁ < ⋯ < x_n ≤ b (sometimes called nodes), we first form the polynomials

W(x) = Π_{i=0}^{n} (x − x_i)

and

ℓ_i(x) = Π_{j≠i} (x − x_j)/(x_i − x_j) = W(x) / ((x − x_i) W′(x_i)).

The Lagrange interpolation formula is

L_n(f)(x) = Σ_{i=0}^{n} f(x_i) ℓ_i(x).

That is, L_n(f) is the unique polynomial in P_n that agrees with f at the x_i's. In particular, notice that we must have L_n(p) = p whenever p ∈ P_n. In fact, L_n is a linear projection from C[a, b] onto P_n. [Why is L_n(f) linear in f?]


Typically we're given (or construct) an array of nodes:

        x₀^{(0)}
        x₀^{(1)}  x₁^{(1)}
X :     x₀^{(2)}  x₁^{(2)}  x₂^{(2)}
        ⋮         ⋮         ⋮        ⋱

and form the corresponding sequence of projections

L_n(f)(x) = Σ_{i=0}^{n} f(x_i^{(n)}) ℓ_i^{(n)}(x).

An easy (but admittedly pointless) observation is that for a given f ∈ C[a, b] we can always find an array X so that L_n(f) = p_n*, the polynomial of best approximation to f out of P_n (since f − p_n* has n + 1 zeros, we may use these for the x_i's). Thus, ‖L_n(f) − f‖ = E_n(f) → 0 in this case. However, the problem of convergence changes character dramatically if we first choose X and then consider L_n(f). In general, there's no reason to believe that L_n(f) converges to f. In fact, quite the opposite is true:

Theorem. (Faber, 1914) Given any array of nodes X in [a, b], there is some f ∈ C[a, b] for which ‖L_n(f) − f‖ is unbounded.

The problem here has little to do with interpolation and everything to do with projections:

Theorem. (Kharshiladze, Lozinski, 1941) For each n, let L_n be a continuous, linear projection from C[a, b] onto P_n. Then, there is some f ∈ C[a, b] for which ‖L_n(f) − f‖ is unbounded.

Evidently, the operators L_n aren't positive (monotone), for otherwise the Bohman-Korovkin theorem (and the fact that L_n is a projection onto P_n) would imply that L_n(f) converges uniformly to f for every f ∈ C[a, b].

The proofs of these theorems are long and difficult; we'll save them for another day. (Some of you may recognize the Principle of Uniform Boundedness at work here.) The real point here is that we can't have everything: A positive result about convergence of interpolation will require that we impose some extra conditions on the functions f we want to approximate. As a first step in this direction, we prove that if f has sufficiently many derivatives, then the error ‖L_n(f) − f‖ can at least be measured.


Theorem. Suppose that f has n + 1 continuous derivatives on [a, b]. Let a ≤ x₀ < x₁ < ⋯ < x_n ≤ b, let p ∈ P_n be the polynomial that interpolates f at the x_i's, and let W(x) = Π_{i=0}^{n} (x − x_i). Then,

‖f − p‖ ≤ (1/(n + 1)!) ‖f^{(n+1)}‖ ‖W‖.

Proof. We'll prove the Theorem by showing that, given x in [a, b], there is a ξ in (a, b) with

f(x) − p(x) = (1/(n + 1)!) f^{(n+1)}(ξ) W(x).   (∗)

If x is one of the x_i's, then both sides of this formula are 0 and we're done. Otherwise, W(x) ≠ 0 and we may set λ = [f(x) − p(x)]/W(x). Now consider

φ(t) = f(t) − p(t) − λ W(t).

Clearly, φ(x_i) = 0 for each i = 0, 1, …, n and, by our choice of λ, we also have φ(x) = 0. Here comes Rolle's theorem! Since φ has n + 2 distinct zeros in [a, b], we must have φ^{(n+1)}(ξ) = 0 for some ξ in (a, b). (Why?) Hence,

0 = φ^{(n+1)}(ξ) = f^{(n+1)}(ξ) − p^{(n+1)}(ξ) − λ W^{(n+1)}(ξ)
  = f^{(n+1)}(ξ) − ((f(x) − p(x))/W(x)) (n + 1)!,

because p has degree at most n and W is monic of degree n + 1.

Observations

1. Equation (∗) is called the Lagrange formula with remainder. [Compare this result to Taylor's formula with remainder.]

2. The term f^{(n+1)}(ξ) is actually a continuous function of x. That is, [f(x) − p(x)]/W(x) is continuous; its value at an x_i is [f′(x_i) − p′(x_i)]/W′(x_i) (why?) and W′(x_i) = Π_{j≠i} (x_i − x_j) ≠ 0.

3. On any interval [a, b], using any nodes, the sequence of Lagrange interpolating polynomials for e^x converges uniformly to e^x. In this case,

‖e^x − L_n(e^x)‖ ≤ c (b − a)^{n+1} / (n + 1)!,

where c = ‖e^x‖ in C[a, b]. The same would hold true for any infinitely differentiable function satisfying, say, ‖f^{(n)}‖ ≤ M^n (any entire function, for example).

4. On [−1, 1], the norm of Π_{i=1}^{n} (x − x_i) is minimized by taking x_i = cos((2i − 1)π/2n), i = 1, …, n, the zeros of the n-th Chebyshev polynomial T_n. (Why?) As Rivlin points out, the zeros of the Chebyshev polynomials are a nearly optimal choice for the nodes if good uniform approximation is desired.
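Remark 4 is worth seeing in action. In the sketch below (our own; Runge's function 1/(1 + 25x²) is a standard test case we chose, not an example from the notes), interpolating at Chebyshev nodes beats interpolating at the same number of equally spaced nodes by a wide margin:

```python
import math

def interp_eval(xs, ys, x):
    # Lagrange form of the interpolating polynomial through (xs, ys).
    total = 0.0
    for i, xi in enumerate(xs):
        li = 1.0
        for j, xj in enumerate(xs):
            if j != i:
                li *= (x - xj) / (xi - xj)
        total += ys[i] * li
    return total

f = lambda x: 1.0 / (1.0 + 25.0 * x * x)   # Runge's function (our choice)
n = 10
equi = [-1.0 + 2.0 * i / n for i in range(n + 1)]
cheb = [math.cos((2 * i - 1) * math.pi / (2 * (n + 1))) for i in range(1, n + 2)]

grid = [-1.0 + 2.0 * k / 400 for k in range(401)]
def max_err(nodes):
    ys = [f(t) for t in nodes]
    return max(abs(f(x) - interp_eval(nodes, ys, x)) for x in grid)

err_equi, err_cheb = max_err(equi), max_err(cheb)
```

For n = 10 the equispaced error is large near the endpoints (this is Runge's phenomenon), while the Chebyshev-node error is roughly an order of magnitude smaller.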

The question of convergence of interpolation is actually very closely related to the analogous question for the convergence of Fourier series, and the answer here is nearly the same. We'll have more to say about this analogy later. First, let's note that L_n is continuous (bounded); this will give us our first bit of insight into Faber's negative result.

Lemma. ‖L_n(f)‖ ≤ ‖f‖ max_{a≤x≤b} Σ_{i=0}^{n} |ℓ_i(x)| for any f ∈ C[a, b].

Proof. Exercise.

The numbers Λ_n = max_{a≤x≤b} Σ_{i=0}^{n} |ℓ_i(x)| are called the Lebesgue numbers associated to this process. It's not hard to see that Λ_n is the smallest possible constant that will work in this inequality (in other words, ‖L_n‖ = Λ_n). Indeed, if

max_{a≤x≤b} Σ_{i=0}^{n} |ℓ_i(x)| = Σ_{i=0}^{n} |ℓ_i(x₀)|,

then we can find an f ∈ C[a, b] with ‖f‖ = 1 and f(x_i) = sgn(ℓ_i(x₀)) for all i. (How?) Then,

‖L_n(f)‖ ≥ |L_n(f)(x₀)| = Σ_{i=0}^{n} sgn(ℓ_i(x₀)) ℓ_i(x₀) = Σ_{i=0}^{n} |ℓ_i(x₀)| = Λ_n ‖f‖.

As it happens, Λ_n ≥ c log n (this is where the hard work comes in; see Rivlin or Natanson for further details), and, in particular, Λ_n → ∞ as n → ∞.
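The Lebesgue numbers can be estimated directly from the definition Λ_n = max_x Σ_i |ℓ_i(x)|. The sketch below (ours, not from the notes) does this on a fine grid for equally spaced nodes, where the growth in n is painfully fast:

```python
def lebesgue_number(nodes, grid):
    # Lambda_n = max over x of sum_i |l_i(x)|, approximated on a grid.
    best = 0.0
    for x in grid:
        s = 0.0
        for i, xi in enumerate(nodes):
            li = 1.0
            for j, xj in enumerate(nodes):
                if j != i:
                    li *= (x - xj) / (xi - xj)
            s += abs(li)
        best = max(best, s)
    return best

grid = [-1.0 + 2.0 * k / 1000 for k in range(1001)]
equi = lambda n: [-1.0 + 2.0 * i / n for i in range(n + 1)]
lam4, lam8, lam12 = (lebesgue_number(equi(n), grid) for n in (4, 8, 12))
```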

A simple application of the triangle inequality will allow us to bring E_n(f) back into the picture:

Lemma. (Lebesgue's theorem) ‖f − L_n(f)‖ ≤ (1 + Λ_n) E_n(f), for any f ∈ C[a, b].

Proof. Let p* be the best approximation to f out of P_n. Then, since L_n(p*) = p*, we have

‖f − L_n(f)‖ ≤ ‖f − p*‖ + ‖L_n(f − p*)‖ ≤ (1 + Λ_n) ‖f − p*‖ = (1 + Λ_n) E_n(f).

Appendix

Although we won't need anything quite so fancy, it is of some interest to discuss more general problems of interpolation. We again suppose that we are given distinct points x₀ < ⋯ < x_n in [a, b], but now we suppose that we are given an array of information

y₀   y₀′   y₀″   …   y₀^{(m₀)}
y₁   y₁′   y₁″   …   y₁^{(m₁)}
⋮     ⋮     ⋮          ⋮
y_n  y_n′  y_n″  …   y_n^{(m_n)},

where each m_i is a nonnegative integer. Our problem is to find the polynomial p of least degree that incorporates all of this data by satisfying

p(x₀) = y₀,  p′(x₀) = y₀′,  …,  p^{(m₀)}(x₀) = y₀^{(m₀)}
p(x₁) = y₁,  p′(x₁) = y₁′,  …,  p^{(m₁)}(x₁) = y₁^{(m₁)}
⋮
p(x_n) = y_n,  p′(x_n) = y_n′,  …,  p^{(m_n)}(x_n) = y_n^{(m_n)}.

In other words, we specify not only the value of p at each x_i, but also the first m_i derivatives of p at x_i. This is often referred to as the problem of Hermite interpolation.

Since the problem has a total of m₀ + m₁ + ⋯ + m_n + n + 1 "degrees of freedom," it won't come as any surprise that it has a (unique) solution p of degree (at most) N = m₀ + m₁ + ⋯ + m_n + n. Rather than discuss this particular problem any further, let's instead discuss the general problem of linear interpolation.

The notational framework for our problem is an n-dimensional vector space X on which m linear, real-valued functions (or linear functionals) L₁, …, L_m are defined. The general problem of linear interpolation asks whether the system of equations

L_i(f) = y_i,  i = 1, …, m   (∗∗)

has a (unique) solution f ∈ X for any given set of scalars y₁, …, y_m ∈ R. Since a linear functional is completely determined by its values on any basis, we would next be led to consider a basis f₁, …, f_n for X, and from here it is a small step to rewrite (∗∗) as a matrix equation. That is, we seek a solution f = a₁f₁ + ⋯ + a_nf_n satisfying

a₁ L₁(f₁) + ⋯ + a_n L₁(f_n) = y₁
a₁ L₂(f₁) + ⋯ + a_n L₂(f_n) = y₂
⋮
a₁ L_m(f₁) + ⋯ + a_n L_m(f_n) = y_m.

If we are to guarantee a solution a₁, …, a_n for each choice of y₁, …, y_m, then we'll need to have m = n and, moreover, the matrix [L_i(f_j)] will have to be nonsingular.

Lemma. Let X be an n-dimensional vector space with basis f₁, …, f_n, and let L₁, …, L_n be linear functionals on X. Then, L₁, …, L_n are linearly independent if and only if the matrix [L_i(f_j)] is nonsingular; that is, if and only if det(L_i(f_j)) ≠ 0.

Proof. If [L_i(f_j)] is singular, then the system of equations

c₁L₁(f₁) + ⋯ + c_nL_n(f₁) = 0
c₁L₁(f₂) + ⋯ + c_nL_n(f₂) = 0
⋮
c₁L₁(f_n) + ⋯ + c_nL_n(f_n) = 0

has a nontrivial solution c₁, …, c_n. Thus, the functional c₁L₁ + ⋯ + c_nL_n satisfies
