
Since $f_1, \dots, f_n$ form a basis for $X$, this means that

$$(c_1 L_1 + \cdots + c_n L_n)(f) = 0$$

for all $f \in X$. That is, $c_1 L_1 + \cdots + c_n L_n = 0$ (the zero functional), and so $L_1, \dots, L_n$ are linearly dependent.

Conversely, if $L_1, \dots, L_n$ are linearly dependent, just reverse the steps in the first part of the proof to see that $[L_i(f_j)]$ is singular.

Interpolation 76

Theorem. Let $X$ be an $n$-dimensional vector space and let $L_1, \dots, L_n$ be linear functionals on $X$. Then, the interpolation problem

$$L_i(f) = y_i, \qquad i = 1, \dots, n \tag{$*$}$$

always has a (unique) solution $f \in X$ for any choice of scalars $y_1, \dots, y_n$ if and only if $L_1, \dots, L_n$ are linearly independent.

Proof. Let $f_1, \dots, f_n$ be a basis for $X$. Then, $(*)$ is equivalent to the system of equations

$$\begin{aligned}
a_1 L_1(f_1) + \cdots + a_n L_1(f_n) &= y_1 \\
a_1 L_2(f_1) + \cdots + a_n L_2(f_n) &= y_2 \\
&\;\;\vdots \tag{$**$} \\
a_1 L_n(f_1) + \cdots + a_n L_n(f_n) &= y_n
\end{aligned}$$

by taking $f = a_1 f_1 + \cdots + a_n f_n$. Thus, $(*)$ always has a solution if and only if $(**)$ always has a solution, if and only if $[L_i(f_j)]$ is nonsingular, if and only if $L_1, \dots, L_n$ are linearly independent. In any of these cases, note that the solution must be unique.

In the case of Lagrange interpolation, $X = \mathcal{P}_n$ and $L_i$ is evaluation at $x_i$; i.e., $L_i(f) = f(x_i)$, which is easily seen to be linear in $f$. Moreover, $L_0, \dots, L_n$ are linearly independent provided that $x_0, \dots, x_n$ are distinct. (Why?)

In the case of Hermite interpolation, the linear functionals are of the form $L_{x,k}(f) = f^{(k)}(x)$, differentiation composed with a point evaluation. If $x \neq y$, then $L_{x,k}$ and $L_{y,m}$ are independent for any $k$ and $m$; if $k \neq m$, then $L_{x,k}$ and $L_{x,m}$ are independent. (How would you check this?)
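The "(Why?)" in the Lagrange case comes down to a determinant: the evaluations $L_i(f) = f(x_i)$ are linearly independent exactly when the matrix $[L_i(f_j)] = [x_i^j]$ (a Vandermonde matrix) is nonsingular. A minimal sketch in Python with exact rational arithmetic (the helper names are mine, not the text's):

```python
from fractions import Fraction

def det(m):
    """Determinant via exact Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in m]
    n, sign, d = len(m), 1, Fraction(1)
    for c in range(n):
        piv = next((r for r in range(c, n) if m[r][c] != 0), None)
        if piv is None:
            return Fraction(0)          # singular
        if piv != c:
            m[c], m[piv] = m[piv], m[c]
            sign = -sign
        d *= m[c][c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n):
                m[r][k] -= f * m[c][k]
    return sign * d

def gram(nodes):
    """The matrix [L_i(f_j)] for L_i = evaluation at nodes[i], f_j(x) = x^j."""
    return [[x ** j for j in range(len(nodes))] for x in nodes]

print(det(gram([0, 1, 2])))   # nonzero: distinct nodes give independent functionals
print(det(gram([0, 1, 1])))   # zero: a repeated node makes them dependent
```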

Problem Set: Lagrange Interpolation

Math 682 6/2/98

Throughout, $x_0, x_1, \dots, x_n$ are distinct points in some interval $[a, b]$, and $V(x_0, x_1, \dots, x_n)$ denotes the Vandermonde determinant:

$$V(x_0, x_1, \dots, x_n) = \begin{vmatrix}
1 & x_0 & x_0^2 & \cdots & x_0^n \\
1 & x_1 & x_1^2 & \cdots & x_1^n \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
1 & x_n & x_n^2 & \cdots & x_n^n
\end{vmatrix}.$$

51. Show by induction that $V(x_0, x_1, \dots, x_n) = \prod_{0 \le j < i \le n} (x_i - x_j)$.

[Hint: In order to reduce to the $n \times n$ case, replace $c_j$, the $j$-th column, by $c_j - x_0\, c_{j-1}$, starting on the right with $j = n$. Factor and use the induction hypothesis.]
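Problem 51's identity is easy to spot-check before proving it by induction. A sketch using the Leibniz expansion of the determinant (exact arithmetic; fine for small $n$):

```python
from fractions import Fraction
from itertools import permutations
from math import prod

def perm_sign(p):
    """Sign of a permutation, by counting inversions."""
    s = 1
    for i in range(len(p)):
        for j in range(i + 1, len(p)):
            if p[i] > p[j]:
                s = -s
    return s

def vandermonde_det(xs):
    """det [x_i^j] by the Leibniz formula."""
    n = len(xs)
    return sum(perm_sign(p) * prod(Fraction(xs[i]) ** p[i] for i in range(n))
               for p in permutations(range(n)))

def product_formula(xs):
    """prod over j < i of (x_i - x_j)."""
    return prod(Fraction(xs[i] - xs[j]) for i in range(len(xs)) for j in range(i))

xs = [0, 1, 3, 7]
print(vandermonde_det(xs), product_formula(xs))   # the two values agree
```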

52. Let $y_0, y_1, \dots, y_n \in \mathbb{R}$ be given. Show that the polynomial $p \in \mathcal{P}_n$ satisfying $p(x_i) = y_i$, $i = 0, 1, \dots, n$, may be written as

$$p(x) = c \begin{vmatrix}
0 & 1 & x & x^2 & \cdots & x^n \\
y_0 & 1 & x_0 & x_0^2 & \cdots & x_0^n \\
\vdots & \vdots & \vdots & \vdots & \ddots & \vdots \\
y_n & 1 & x_n & x_n^2 & \cdots & x_n^n
\end{vmatrix}$$

where $c$ is a certain constant. Find $c$ and prove the formula.

53. Given $f \in C[a, b]$, let $L_n(f)$ denote the polynomial of degree at most $n$ that agrees with $f$ at the $x_i$'s. Prove that $L_n$ is a linear projection onto $\mathcal{P}_n$. That is, show that $L_n(\alpha f + \beta g) = \alpha L_n(f) + \beta L_n(g)$, and that $L_n(f) = f$ if and only if $f \in \mathcal{P}_n$.

54. Let $\ell_i(x)$, $i = 0, 1, \dots, n$, denote the Lagrange interpolating polynomials of degree at most $n$ associated with the nodes $x_0, x_1, \dots, x_n$; that is, $\ell_i(x_j) = \delta_{i,j}$. Show that $\sum_{i=0}^n \ell_i(x) \equiv 1$ and, more generally, that $\sum_{i=0}^n x_i^k\, \ell_i(x) = x^k$, for $k = 0, 1, \dots, n$.

55. If $\ell_i$ and $L_n$ are as above, show that the error in the Lagrange interpolation formula is $(L_n(f) - f)(x) = \sum_{i=0}^n \left[ f(x_i) - f(x) \right] \ell_i(x)$.

56. With $\ell_i$ and $L_n$ as above, show that $\|L_n(f)\| \le \Lambda_n \|f\|$, where $\Lambda_n = \max_{a \le x \le b} \sum_{i=0}^n |\ell_i(x)|$. Show that no smaller number has this property for all $f \in C[a, b]$.
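The constant $\Lambda_n$ of Problem 56 (the Lebesgue constant of the node set) can be estimated on a fine grid. A sketch comparing equally spaced nodes with Chebyshev nodes on $[-1, 1]$; both node choices are illustrative:

```python
import math

def lebesgue_const(nodes, grid_size=2001):
    """Estimate the max over [-1, 1] of the sum of |l_i(x)| on a fine grid."""
    best = 0.0
    for g in range(grid_size):
        x = -1 + 2 * g / (grid_size - 1)
        s = 0.0
        for i, xi in enumerate(nodes):
            li = 1.0
            for j, xj in enumerate(nodes):
                if j != i:
                    li *= (x - xj) / (xi - xj)   # Lagrange basis polynomial
            s += abs(li)
        best = max(best, s)
    return best

n = 8
equi = [-1 + 2 * i / n for i in range(n + 1)]
cheb = [math.cos((2 * i + 1) * math.pi / (2 * (n + 1))) for i in range(n + 1)]
print(lebesgue_const(equi), lebesgue_const(cheb))   # equally spaced nodes fare worse
```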

Approximation on Finite Sets

Math 682 6/3/98

In this section we consider a question of computational interest: Since best approximations are often very hard to find, how might we approximate the best approximation? The answer to this question lies in approximations over finite sets. Here's the plan:

(1) Fix a finite subset $X_m$ of $[a, b]$ consisting of $m$ distinct points $a \le x_1 < \cdots < x_m \le b$, and find the best approximation to $f$ out of $\mathcal{P}_n$, considered as a subspace of $C(X_m)$. In other words, if we call the best approximation $p_n(X_m)$, then

$$\max_{1 \le i \le m} |f(x_i) - p_n(X_m)(x_i)| = \min_{p \in \mathcal{P}_n} \max_{1 \le i \le m} |f(x_i) - p(x_i)| \equiv E_n(f; X_m).$$

(2) Argue that this process converges (in some sense) to the best approximation on all of $[a, b]$ provided that $X_m$ "gets big" as $m \to \infty$. In actual practice, there's no need to worry about $p_n(X_m)$ converging to $p_n$ (the best approximation on all of $[a, b]$); rather, we will argue that $E_n(f; X_m) \to E_n(f)$ and appeal to "abstract nonsense."

(3) Find an efficient strategy for carrying out items (1) and (2).

Observations

1. If $m \le n + 1$, then $E_n(f; X_m) = 0$. That is, we can always find a polynomial $p \in \mathcal{P}_n$ that agrees with $f$ at $n + 1$ (or fewer) points. (How?) Of course, $p$ won't be unique if $m < n + 1$. (Why?) In any case, we might as well assume that $m \ge n + 2$. In fact, as we'll see, the case $m = n + 2$ is all that we really need to worry about.

2. If $X \subset Y \subset [a, b]$, then $E_n(f; X) \le E_n(f; Y) \le E_n(f)$. Indeed, if $p \in \mathcal{P}_n$ is the best approximation on $Y$, then

$$E_n(f; X) \le \max_{x \in X} |f(x) - p(x)| \le \max_{x \in Y} |f(x) - p(x)| = E_n(f; Y).$$

Consequently, we expect $E_n(f; X_m)$ to increase to $E_n(f)$ as $X_m$ "gets big."

Now if we were to repeat our earlier work on characterizing best approximations,

restricting ourselves to Xm everywhere, here's what we'd get:


Theorem. Let $m \ge n + 2$. Then,

(i) $p \in \mathcal{P}_n$ is a best approximation to $f$ on $X_m$ if and only if $f - p$ has an alternating set containing $n + 2$ points out of $X_m$; that is, $f - p = \pm E_n(f; X_m)$, alternately, on $X_m$.

(ii) $p_n(X_m)$ is unique.

Next let's see how this reduces our study to the case $m = n + 2$.

Theorem. Fix $n$, $m \ge n + 2$, and $f \in C[a, b]$.

(i) If $p_n \in \mathcal{P}_n$ is best on all of $[a, b]$, then there is a subset $X_{n+2}^*$ of $[a, b]$, containing $n + 2$ points, such that $p_n = p_n(X_{n+2}^*)$. Moreover, $E_n(f; X_{n+2}) \le E_n(f) = E_n(f; X_{n+2}^*)$ for any other subset $X_{n+2}$ of $[a, b]$, with equality if and only if $p_n(X_{n+2}) = p_n$.

(ii) If $p_n(X_m) \in \mathcal{P}_n$ is best on $X_m$, then there is a subset $X_{n+2}^*$ of $X_m$ such that $p_n(X_m) = p_n(X_{n+2}^*)$ and $E_n(f; X_m) = E_n(f; X_{n+2}^*)$. For any other $X_{n+2} \subset X_m$ we have $E_n(f; X_{n+2}) \le E_n(f; X_{n+2}^*) = E_n(f; X_m)$, with equality if and only if $p_n(X_{n+2}) = p_n(X_m)$.

Proof. (i): Let $X_{n+2}^*$ be an alternating set for $f - p_n$ over $[a, b]$ containing exactly $n + 2$ points. Then, $X_{n+2}^*$ is also an alternating set for $f - p_n$ over $X_{n+2}^*$. That is, for $x \in X_{n+2}^*$,

$$\pm (f(x) - p_n(x)) = E_n(f) = \max_{y \in X_{n+2}^*} |f(y) - p_n(y)|.$$

So, by uniqueness of best approximations on $X_{n+2}^*$, we must have $p_n = p_n(X_{n+2}^*)$ and $E_n(f) = E_n(f; X_{n+2}^*)$. The second assertion follows from a similar argument using the uniqueness of $p_n$ on $[a, b]$.

(ii): This is just (i) with $[a, b]$ replaced everywhere by $X_m$.

Here's the point: Through some as yet undisclosed method, we choose $X_m$ with $m \ge n + 2$ (in fact, $m \gg n + 2$) such that $E_n(f; X_m) \le E_n(f) \le E_n(f; X_m) + \varepsilon$, and then we search for the "best" $X_{n+2} \subset X_m$, meaning the largest value of $E_n(f; X_{n+2})$. We then take $p_n(X_{n+2})$ as an approximation for $p_n$. As we'll see momentarily, $p_n(X_{n+2})$ can be computed directly and explicitly.


Now suppose that the elements of $X_{n+2}$ are $a \le x_0 < x_1 < \cdots < x_{n+1} \le b$, let $p = p_n(X_{n+2})$ be $p(x) = a_0 + a_1 x + \cdots + a_n x^n$, and let

$$E = E_n(f; X_{n+2}) = \max_{0 \le i \le n+1} |f(x_i) - p(x_i)|.$$

In order to compute $p$ and $E$, we use the fact that $f(x_i) - p(x_i) = \pm E$, alternately, and write (for instance)

$$\begin{aligned}
f(x_0) &= E + p(x_0) \\
f(x_1) &= -E + p(x_1) \\
&\;\;\vdots \\
f(x_{n+1}) &= (-1)^{n+1} E + p(x_{n+1})
\end{aligned}$$

(where the "$E$ column" might, instead, read $-E$, $E$, $\dots$, $(-1)^n E$). That is, in order to find $p$ and $E$, we need to solve a system of $n + 2$ linear equations in the $n + 2$ unknowns $E$, $a_0, \dots, a_n$. The determinant of this system is (up to sign)

$$\begin{vmatrix}
1 & 1 & x_0 & \cdots & x_0^n \\
-1 & 1 & x_1 & \cdots & x_1^n \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
(-1)^{n+1} & 1 & x_{n+1} & \cdots & x_{n+1}^n
\end{vmatrix} = A_0 + A_1 + \cdots + A_{n+1} > 0,$$

where we have expanded by cofactors along the first column and have used the fact that each minor $A_k$ is a Vandermonde determinant (and hence each $A_k > 0$). If we apply Cramer's rule to find $E$ we get

$$E = \frac{f(x_0) A_0 - f(x_1) A_1 + \cdots + (-1)^{n+1} f(x_{n+1}) A_{n+1}}{A_0 + A_1 + \cdots + A_{n+1}}
= \lambda_0 f(x_0) - \lambda_1 f(x_1) + \cdots + (-1)^{n+1} \lambda_{n+1} f(x_{n+1}),$$

where $\lambda_i > 0$ and $\sum_{i=0}^{n+1} \lambda_i = 1$. Moreover, these same $\lambda_i$'s satisfy $\sum_{i=0}^{n+1} (-1)^i \lambda_i\, q(x_i) = 0$ for every polynomial $q \in \mathcal{P}_n$, since $E = E_n(q; X_{n+2}) = 0$ for polynomials of degree at most $n$ (and since Cramer's rule supplies the same coefficients for all $f$'s).
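The $(n + 2) \times (n + 2)$ system is small enough to solve exactly. A sketch for $n = 1$, with $f(x) = x^3$ and nodes $\{0, 1/2, 1\}$ chosen purely for illustration:

```python
from fractions import Fraction

def solve(A, b):
    """Exact Gauss-Jordan elimination over the rationals."""
    n = len(A)
    M = [[Fraction(v) for v in row] + [Fraction(bi)] for row, bi in zip(A, b)]
    for c in range(n):
        piv = next(r for r in range(c, n) if M[r][c] != 0)
        M[c], M[piv] = M[piv], M[c]
        for r in range(n):
            if r != c:
                fac = M[r][c] / M[c][c]
                for k in range(c, n + 1):
                    M[r][k] -= fac * M[c][k]
    return [M[i][n] / M[i][i] for i in range(n)]

# Unknowns E, a0, a1; each row encodes  f(x_i) = (-1)^i E + a0 + a1 x_i.
nodes = [Fraction(0), Fraction(1, 2), Fraction(1)]
f = lambda x: x ** 3
A = [[(-1) ** i, 1, x] for i, x in enumerate(nodes)]
b = [f(x) for x in nodes]
E, a0, a1 = solve(A, b)
print(E, a0, a1)   # E = 3/16, and the candidate line is p(x) = -3/16 + x
```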

It may be instructive to see a more explicit solution to this problem. For this, recall

that since we have n + 2 points we may interpolate exactly out of Pn+1. Given this, our

original problem can be rephrased quite succinctly.


Let $p$ be the (unique) polynomial in $\mathcal{P}_{n+1}$ satisfying $p(x_i) = f(x_i)$, $i = 0, 1, \dots, n + 1$, and let $e$ be the (unique) polynomial in $\mathcal{P}_{n+1}$ satisfying $e(x_i) = (-1)^i$, $i = 0, 1, \dots, n + 1$. If it is possible to find a scalar $\lambda$ so that $p - \lambda e \in \mathcal{P}_n$, then $p - \lambda e = p_n(X_{n+2})$ and $|\lambda| = E_n(f; X_{n+2})$. Why? Because $f - (p - \lambda e) = \lambda e = \pm\lambda$, alternately, on $X_{n+2}$, and so $|\lambda| = \max_{x \in X_{n+2}} |f(x) - (p(x) - \lambda e(x))|$. Thus, we need to compare leading coefficients of $p$ and $e$.

Now if $p$ has degree less than $n + 1$, then $p = p_n(X_{n+2})$ and $E_n(f; X_{n+2}) = 0$. Thus, $\lambda = 0$ would do nicely in this case. Otherwise, $p$ has degree exactly $n + 1$ and the question is whether $e$ does too. Now,

$$e(x) = \sum_{i=0}^{n+1} \frac{(-1)^i}{W'(x_i)} \cdot \frac{W(x)}{x - x_i},$$

where $W(x) = \prod_{i=0}^{n+1} (x - x_i)$, and so the leading coefficient of $e$ is $\sum_{i=0}^{n+1} (-1)^i / W'(x_i)$.

We'll be done if we can convince ourselves that this is nonzero. But

$$W'(x_i) = \prod_{j \ne i} (x_i - x_j) = (-1)^{n-i+1} \prod_{j=0}^{i-1} (x_i - x_j) \prod_{j=i+1}^{n+1} (x_j - x_i),$$

hence $(-1)^i / W'(x_i)$ is of constant sign $(-1)^{n+1}$. Finally, since

$$p(x) = \sum_{i=0}^{n+1} \frac{f(x_i)}{W'(x_i)} \cdot \frac{W(x)}{x - x_i},$$

$p$ has leading coefficient $\sum_{i=0}^{n+1} f(x_i)/W'(x_i)$, and it's easy to find the value of $\lambda$.

Conclusion. $p_n(X_{n+2}) = p - \lambda e$, where

$$\lambda = \frac{\sum_{i=0}^{n+1} f(x_i)/W'(x_i)}{\sum_{i=0}^{n+1} (-1)^i / W'(x_i)} = \sum_{i=0}^{n+1} (-1)^i \lambda_i f(x_i)$$

and

$$\lambda_i = \frac{1/|W'(x_i)|}{\sum_{j=0}^{n+1} 1/|W'(x_j)|},$$

and $|\lambda| = E_n(f; X_{n+2})$. Moreover, $\sum_{i=0}^{n+1} (-1)^i \lambda_i\, q(x_i) = 0$ for every $q \in \mathcal{P}_n$.
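The weights $\lambda_i$ in the Conclusion are cheap to compute directly. A sketch (the node set is illustrative) checking that $\sum_i \lambda_i = 1$ and that $\sum_i (-1)^i \lambda_i\, q(x_i) = 0$ for monomials $q$ of degree at most $n$:

```python
from fractions import Fraction

nodes = [Fraction(0), Fraction(1, 3), Fraction(2, 3), Fraction(1)]   # n + 2 = 4, so n = 2

def Wp(i):
    """W'(x_i) = prod over j != i of (x_i - x_j)."""
    r = Fraction(1)
    for j, xj in enumerate(nodes):
        if j != i:
            r *= nodes[i] - xj
    return r

total = sum(1 / abs(Wp(j)) for j in range(len(nodes)))
lam = [(1 / abs(Wp(i))) / total for i in range(len(nodes))]

print(lam, sum(lam))   # positive weights summing to 1
for k in range(3):     # q(x) = x^k for k <= n = 2 is annihilated
    print(k, sum((-1) ** i * lam[i] * nodes[i] ** k for i in range(len(nodes))))
```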


Example

Find the best linear approximation to $f(x) = x^2$ on $X_4 = \{0, 1/3, 2/3, 1\} \subset [0, 1]$.

We seek $p(x) = a_0 + a_1 x$, and we need only consider subsets of $X_4$ of size $1 + 2 = 3$. There are four:

$$X_4^1 = \{0, 1/3, 2/3\}, \quad X_4^2 = \{0, 1/3, 1\}, \quad X_4^3 = \{0, 2/3, 1\}, \quad X_4^4 = \{1/3, 2/3, 1\}.$$

In each case we find a $p$ and a $\lambda$ ($= \pm E$ in our earlier setup). For instance, in the case of $X_4^2$ we would solve the system of equations $f(x) = \pm\lambda + p(x)$ for $x = 0, 1/3, 1$:

$$\left.\begin{aligned}
0 &= \lambda^{(2)} + a_0 \\
\tfrac{1}{9} &= -\lambda^{(2)} + a_0 + \tfrac{1}{3} a_1 \\
1 &= \lambda^{(2)} + a_0 + a_1
\end{aligned}\right\} \implies \lambda^{(2)} = \tfrac{1}{9}, \quad a_0 = -\tfrac{1}{9}, \quad a_1 = 1.$$

In the other three cases you would find that $\lambda^{(1)} = 1/18$, $\lambda^{(3)} = 1/9$, and $\lambda^{(4)} = 1/18$. Since we need the largest $\lambda$, we're done: $X_4^2$ (or $X_4^3$) works, and $p_1(X_4)(x) = x - 1/9$. (Recall that the best approximation on all of $[0, 1]$ is $p_1(x) = x - 1/8$.)
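The four values $\lambda^{(1)}, \dots, \lambda^{(4)}$ can be reproduced from the $W'$ formula in the Conclusion, with no linear systems at all. A sketch:

```python
from fractions import Fraction
from itertools import combinations

X4 = [Fraction(0), Fraction(1, 3), Fraction(2, 3), Fraction(1)]
f = lambda x: x * x

def lam(sub):
    """|lambda| = |sum f(x_i)/W'(x_i)| / |sum (-1)^i / W'(x_i)| on an (n+2)-point set."""
    def Wp(i):
        r = Fraction(1)
        for j in range(len(sub)):
            if j != i:
                r *= sub[i] - sub[j]
        return r
    num = sum(f(x) / Wp(i) for i, x in enumerate(sub))
    den = sum((-1) ** i / Wp(i) for i in range(len(sub)))
    return abs(num / den)

for sub in combinations(X4, 3):
    print(sub, lam(sub))   # 1/18, 1/9, 1/9, 1/18, matching the text
```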

Where does this leave us? We still need to know that there is some hope of finding an initial set $X_m$ with $E_n(f) - \varepsilon \le E_n(f; X_m) \le E_n(f)$, and we need a more efficient means of searching through the $\binom{m}{n+2}$ subsets $X_{n+2} \subset X_m$. In order to attack the problem of finding an initial $X_m$, we'll need a few classical inequalities. We won't directly attack the second problem; instead, we'll outline an algorithm that begins with an initial set $X_{n+2}^0$, containing exactly $n + 2$ points, which is then "improved" to some $X_{n+2}^1$ by changing only a single point.

The Inequalities of Markov and Bernstein

In order to discuss the convergence of approximations over finite sets, we will need to know that differentiation is bounded on $\mathcal{P}_n$ (a fact that is nearly obvious by itself). The inequality we'll use is due to A. A. Markov from 1889:


Theorem. If $p \in \mathcal{P}_n$, and if $|p(x)| \le 1$ for $|x| \le 1$, then $|p'(x)| \le n^2$ for $|x| \le 1$. Moreover, $|p'(x)| = n^2$ can only occur at $x = \pm 1$, and only when $p = \pm T_n$, the Chebyshev polynomial of degree $n$.

Markov's brother, V. A. Markov, later improved on this, in 1916, by showing that $|p^{(k)}(x)| \le T_n^{(k)}(1)$. We've alluded to this fact already (see Rivlin, p. 31), and even more is true. For our purposes, it's enough to have some bound on differentiation; in particular, we'll only use

$$\|p'\| \le n^2 \|p\| \qquad \text{and} \qquad \|p''\| \le n^4 \|p\|,$$

where $\|\cdot\|$ is the norm in $C[-1, 1]$.

About 20 years after Markov, in 1912, Bernstein asked for a similar bound for the derivative of a complex polynomial over the unit disk $|z| \le 1$. Now the maximum modulus theorem tells us that we may reduce to the case $|z| = 1$, that is, $z = e^{i\theta}$, and so Bernstein was able to restate the problem in terms of trig polynomials.

Theorem. If $S \in \mathcal{T}_n$, and if $|S(\theta)| \le 1$, then $|S'(\theta)| \le n$. Equality is only possible for $S(\theta) = \sin n(\theta - \theta_0)$.

Our plan is to deduce Markov's inequality from Bernstein's inequality by a method of proof due to Pólya and Szegő in 1928. To begin, let's consider the Lagrange interpolation formula in the case where $x_i = \cos((2i - 1)\pi/2n)$, $i = 1, \dots, n$, are the zeros of the Chebyshev polynomial $T_n$. Recall that we have $-1 < x_n < x_{n-1} < \cdots < x_1 < 1$.

Lemma 1. Each polynomial $p \in \mathcal{P}_{n-1}$ may be written

$$p(x) = \frac{1}{n} \sum_{i=1}^{n} p(x_i)\, (-1)^{i-1} \sqrt{1 - x_i^2}\; \frac{T_n(x)}{x - x_i}.$$

Proof. We know that the Lagrange interpolation formula is exact for polynomials of degree $< n$, and we know that, up to a constant multiple, $T_n(x)$ is the product $W(x) = (x - x_1) \cdots (x - x_n)$. All that remains is to compute $T_n'(x_i)$. But recall that for $x = \cos\theta$ we have

$$T_n'(x) = \frac{n \sin n\theta}{\sin\theta} = \frac{n \sin n\theta}{\sqrt{1 - \cos^2\theta}} = \frac{n \sin n\theta}{\sqrt{1 - x^2}}.$$

But for $x_i = \cos((2i - 1)\pi/2n)$, i.e., for $\theta_i = (2i - 1)\pi/2n$, it follows that $\sin n\theta_i = \sin((2i - 1)\pi/2) = (-1)^{i-1}$; that is,

$$\frac{1}{T_n'(x_i)} = \frac{(-1)^{i-1} \sqrt{1 - x_i^2}}{n}.$$

Lemma 2. For any polynomial $p \in \mathcal{P}_{n-1}$, we have

$$\max_{-1 \le x \le 1} |p(x)| \le \max_{-1 \le x \le 1} \left| n \sqrt{1 - x^2}\, p(x) \right|.$$

Proof. To save wear and tear, let's write $M = \max_{-1 \le x \le 1} \left| n \sqrt{1 - x^2}\, p(x) \right|$.

;1 x 1

First consider an x in the interval xn x1 ] that is, jxj cos( =2n) = x1. In this case

p

we can estimate 1 ; x2 from below:

r

q

p 1

1 ; x2 1 ; x2 = 1 ; cos2 = sin 2n

1 n

2n

=2 (from the mean value theorem). Hence, for jxj

2 = for 0

because sin

p

cos( =2n), we get jp(x)j n 1 ; x2 jp(x)j M.

Now, for $x$'s outside the interval $[x_n, x_1]$, we apply our interpolation formula. In this case, each of the factors $x - x_i$ is of the same sign. Thus,

$$|p(x)| = \left| \frac{1}{n} \sum_{i=1}^{n} p(x_i)\, (-1)^{i-1} \sqrt{1 - x_i^2}\; \frac{T_n(x)}{x - x_i} \right|
\le \frac{M}{n^2} \sum_{i=1}^{n} \left| \frac{T_n(x)}{x - x_i} \right| = \frac{M}{n^2} \left| \sum_{i=1}^{n} \frac{T_n(x)}{x - x_i} \right|.$$

But,

$$\sum_{i=1}^{n} \frac{T_n(x)}{x - x_i} = T_n'(x) \qquad \text{(why?)}$$

and we know that $|T_n'(x)| \le n^2$. Thus, $|p(x)| \le M$.

We next turn our attention to trig polynomials. As usual, given an algebraic polynomial $p \in \mathcal{P}_n$, we will sooner or later consider $S(\theta) = p(\cos\theta)$. In this case, $S'(\theta) = -p'(\cos\theta) \sin\theta$ is an odd trig polynomial of degree at most $n$ and $|S'(\theta)| = |p'(\cos\theta) \sin\theta| = |p'(x)| \sqrt{1 - x^2}$. Conversely, if $S \in \mathcal{T}_n$ is an odd trig polynomial, then $S(\theta)/\sin\theta$ is even, and so may be written $S(\theta)/\sin\theta = p(\cos\theta)$ for some algebraic polynomial $p$ of degree at most $n - 1$. From Lemma 2,

$$\max_{0 \le \theta \le 2\pi} \left| \frac{S(\theta)}{\sin\theta} \right| = \max_{0 \le \theta \le 2\pi} |p(\cos\theta)| \le n \max_{0 \le \theta \le 2\pi} |p(\cos\theta) \sin\theta| = n \max_{0 \le \theta \le 2\pi} |S(\theta)|.$$

This proves

Corollary. If $S \in \mathcal{T}_n$ is an odd trig polynomial, then

$$\max_{0 \le \theta \le 2\pi} \left| \frac{S(\theta)}{\sin\theta} \right| \le n \max_{0 \le \theta \le 2\pi} |S(\theta)|.$$

Now we're ready for Bernstein's inequality.

Bernstein's Inequality. If $S \in \mathcal{T}_n$, then

$$\max_{0 \le \theta \le 2\pi} |S'(\theta)| \le n \max_{0 \le \theta \le 2\pi} |S(\theta)|.$$

Proof. We first define an auxiliary function $f(\varphi) = \left[ S(\theta + \varphi) - S(\theta - \varphi) \right]/2$. For $\theta$ fixed, $f(\varphi)$ is an odd trig polynomial in $\varphi$ of degree at most $n$. Consequently,

$$\left| \frac{f(\varphi)}{\sin\varphi} \right| \le n \max_{0 \le \varphi \le 2\pi} |f(\varphi)| \le n \max_{0 \le \theta \le 2\pi} |S(\theta)|.$$

But

$$S'(\theta) = \lim_{\varphi \to 0} \frac{S(\theta + \varphi) - S(\theta - \varphi)}{2 \sin\varphi} = \lim_{\varphi \to 0} \frac{f(\varphi)}{\sin\varphi},$$

and hence $|S'(\theta)| \le n \max_{0 \le \theta \le 2\pi} |S(\theta)|$.

Finally, we prove Markov's inequality.

Markov's Inequality. If $p \in \mathcal{P}_n$, then

$$\max_{-1 \le x \le 1} |p'(x)| \le n^2 \max_{-1 \le x \le 1} |p(x)|.$$


Proof. We know that $S(\theta) = p(\cos\theta)$ is a trig polynomial of degree at most $n$ satisfying

$$\max_{-1 \le x \le 1} |p(x)| = \max_{0 \le \theta \le 2\pi} |p(\cos\theta)|.$$

Since $S'(\theta) = -p'(\cos\theta) \sin\theta$ is also a trig polynomial of degree at most $n$, Bernstein's inequality yields

$$\max_{0 \le \theta \le 2\pi} |p'(\cos\theta) \sin\theta| \le n \max_{0 \le \theta \le 2\pi} |p(\cos\theta)|.$$

In other words,

$$\max_{-1 \le x \le 1} \left| p'(x) \sqrt{1 - x^2} \right| \le n \max_{-1 \le x \le 1} |p(x)|.$$

Since $p' \in \mathcal{P}_{n-1}$, the desired inequality now follows easily from Lemma 2:

$$\max_{-1 \le x \le 1} |p'(x)| \le n \max_{-1 \le x \le 1} \left| p'(x) \sqrt{1 - x^2} \right| \le n^2 \max_{-1 \le x \le 1} |p(x)|.$$
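Markov's bound is easy to probe numerically: for $T_n$ itself, $|T_n'|$ on $[-1, 1]$ tops out at exactly $n^2$, at the endpoints. A sketch using the three-term recurrence and a central difference (the grid and step size are ad hoc):

```python
import math

def cheb_T(n, x):
    """T_n(x) by the recurrence T_{k+1}(x) = 2x T_k(x) - T_{k-1}(x)."""
    a, b = 1.0, x
    if n == 0:
        return a
    for _ in range(n - 1):
        a, b = b, 2 * x * b - a
    return b

def deriv(fn, x, h=1e-6):
    """Central-difference derivative."""
    return (fn(x + h) - fn(x - h)) / (2 * h)

n = 7
grid = [-1 + 2 * k / 4000 for k in range(4001)]
best = max(abs(deriv(lambda t: cheb_T(n, t), x)) for x in grid)
print(best, n * n)   # the max of |T_n'| on [-1, 1] is n^2, attained at x = +-1
```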

Convergence of Approximations over Finite Sets

In order to simplify things here, we will make several assumptions: For one, we will consider only approximation over the interval $I = [-1, 1]$. As before, we consider a fixed $f \in C[-1, 1]$ and a fixed integer $n = 0, 1, 2, \dots$. For each integer $m \ge 1$ we choose a finite subset $X_m \subset I$, consisting of $m$ points $-1 \le x_1 < \cdots < x_m \le 1$; in addition, we will assume that $x_1 = -1$ and $x_m = 1$. If we put

$$\delta_m = \max_{x \in I} \min_{1 \le i \le m} |x - x_i| > 0,$$

then each $x \in I$ is within $\delta_m$ of some $x_i$. If $X_m$ consists of equally spaced points, for example, it's easy to see that $\delta_m = 1/(m - 1)$.

Our goal is to prove

Theorem. If $\delta_m \to 0$, then $E_n(f; X_m) \to E_n(f)$.

And we would hope to accomplish this in such a way that $\delta_m$ is a measurable quantity, depending on $f$, $m$, and a prescribed tolerance $\varepsilon = E_n(f) - E_n(f; X_m)$.

As a first step in this direction, let's bring Markov's inequality into the picture.


Lemma. Suppose that $\gamma_m \equiv \delta_m^2 n^4 / 2 < 1$. Then, for any $p \in \mathcal{P}_n$, we have

(1) $\displaystyle \max_{-1 \le x \le 1} |p(x)| \le (1 - \gamma_m)^{-1} \max_{1 \le i \le m} |p(x_i)|$

and

(2) $\displaystyle \omega_p([-1, 1];\, \delta_m) \le \delta_m n^2 (1 - \gamma_m)^{-1} \max_{1 \le i \le m} |p(x_i)|$.

Proof. (1): Take $a$ in $[-1, 1]$ with $|p(a)| = \|p\|$. If $a = \pm 1 \in X_m$, we're done (since $(1 - \gamma_m)^{-1} > 1$). Otherwise, we'll have $-1 < a < 1$ and $p'(a) = 0$. Next, choose $x_i \in X_m$ with $|a - x_i| \le \delta_m$ and apply Taylor's theorem:

$$p(x_i) = p(a) + (x_i - a)\, p'(a) + \frac{(x_i - a)^2}{2}\, p''(c)$$

for some $c$ in $(-1, 1)$. Re-writing, we have

$$|p(a)| \le |p(x_i)| + \frac{\delta_m^2}{2}\, |p''(c)|.$$

And now we bring in Markov:

$$\|p\| \le \max_{1 \le i \le m} |p(x_i)| + \frac{\delta_m^2 n^4}{2}\, \|p\|,$$

which is what we need.

(2): The real point here is that each $p \in \mathcal{P}_n$ is Lipschitz with constant $n^2 \|p\|$. Indeed,

$$|p(s) - p(t)| = |(s - t)\, p'(c)| \le |s - t|\, \|p'\| \le n^2 \|p\|\, |s - t|$$

(from the mean value theorem and Markov's inequality). Thus, $\omega_p(\delta_m) \le \delta_m n^2 \|p\|$ and, combining this with (1), we get

$$\omega_p(\delta_m) \le \delta_m n^2 \|p\| \le \delta_m n^2 (1 - \gamma_m)^{-1} \max_{1 \le i \le m} |p(x_i)|.$$

Now we're ready to compare $E_n(f; X_m)$ to $E_n(f)$. Our result won't be as good as Rivlin's (he uses a fancier version of Markov's inequality), but it will be a bit easier to prove. As in the Lemma, we'll suppose that

$$\gamma_m = \frac{\delta_m^2 n^4}{2} < 1,$$

and we'll set

$$\beta_m = \frac{\delta_m n^2}{1 - \gamma_m}.$$

[Note that as $\delta_m \to 0$ we also have $\gamma_m \to 0$ and $\beta_m \to 0$.]


Theorem. For $f \in C[-1, 1]$,

$$E_n(f; X_m) \le E_n(f) \le (1 + \beta_m)\, E_n(f; X_m) + \omega_f([-1, 1];\, \delta_m) + \beta_m \|f\|.$$

Consequently, if $\delta_m \to 0$, then $E_n(f; X_m) \to E_n(f)$ (as $m \to \infty$).

Proof. Let $p = p_n(X_m) \in \mathcal{P}_n$ be the best approximation to $f$ on $X_m$. Recall that

$$\max_{1 \le i \le m} |f(x_i) - p(x_i)| = E_n(f; X_m) \le E_n(f) \le \|f - p\|.$$

Our plan is to estimate $\|f - p\|$. Let $x \in [-1, 1]$ and choose $x_i \in X_m$ with $|x - x_i| \le \delta_m$. Then,

$$\begin{aligned}
|f(x) - p(x)| &\le |f(x) - f(x_i)| + |f(x_i) - p(x_i)| + |p(x_i) - p(x)| \\
&\le \omega_f(\delta_m) + E_n(f; X_m) + \omega_p(\delta_m) \\
&\le \omega_f(\delta_m) + E_n(f; X_m) + \beta_m \max_{1 \le i \le m} |p(x_i)|,
\end{aligned}$$

where we've used (2) from the previous Lemma to estimate $\omega_p(\delta_m)$. All that remains is to revise this last estimate, eliminating reference to $p$. For this we use the triangle inequality again:

$$\max_{1 \le i \le m} |p(x_i)| \le \max_{1 \le i \le m} |f(x_i) - p(x_i)| + \max_{1 \le i \le m} |f(x_i)| \le E_n(f; X_m) + \|f\|.$$

Putting all the pieces together gives us our result:

$$E_n(f) \le \omega_f(\delta_m) + E_n(f; X_m) + \beta_m \left[ E_n(f; X_m) + \|f\| \right].$$

As Rivlin points out, it is quite possible to give a lower bound on $m$ in the case of, say, equally spaced points, which will give $E_n(f; X_m) \le E_n(f) \le E_n(f; X_m) + \varepsilon$, but this is surely an inefficient approach to the problem. Instead, we'll discuss the one point exchange algorithm.

The One Point Exchange Algorithm

We're given $f \in C[-1, 1]$, $n$, and $\varepsilon > 0$.


1. Pick a starting "reference" $X_{n+2}$. A convenient choice is the set $x_i = \cos\left(\frac{(n+1-i)\pi}{n+1}\right)$, $i = 0, 1, \dots, n + 1$. These are the "peak points" of $T_{n+1}$; that is, $T_{n+1}(x_i) = (-1)^{n+1-i}$ (and so $T_{n+1}$ is the polynomial $e$ from our "Conclusion").

2. Find $p = p_n(X_{n+2})$ and $\lambda$ (by solving a system of linear equations). Recall that

$$|\lambda| = |f(x_i) - p(x_i)| \le \|f - p^*\| \le \|f - p\|,$$

where $p^*$ is the best approximation to $f$ on all of $[-1, 1]$.

3. Find (approximately, if necessary) the "error function" $e(x) = f(x) - p(x)$ and any point $\eta$ where $|f(\eta) - p(\eta)| = \|f - p\|$. (According to Powell, this can be accomplished using "local quadratic fits.")

4. Replace an appropriate $x_i$ by $\eta$ so that the new reference $X_{n+2}' = \{x_0', x_1', \dots\}$ has the properties that $f(x_i') - p(x_i')$ alternates in sign, and that $|f(x_i') - p(x_i')| \ge |\lambda|$ for all $i$. The new polynomial $p' = p_n(X_{n+2}')$ and new $\lambda'$ must then satisfy

$$|\lambda| \le \min_{0 \le i \le n+1} |f(x_i') - p(x_i')| \le \max_{0 \le i \le n+1} |f(x_i') - p'(x_i')| = |\lambda'|.$$

This is an observation due to de la Vallée Poussin: Since $f - p$ alternates in sign on an alternating set for $f - p'$, it follows that $f - p'$ increases the minimum error over this set. (See the Theorem on page 53 of "Characterization of Best Approximation" for a precise statement.) Again according to Powell, the new $p'$ and $\lambda'$ can be found quickly through matrix "updating" techniques. (Since we've only changed one of the $x_i$'s, only one row of the matrix on page 82 needs to be changed.)

5. The new $\lambda'$ satisfies $|\lambda'| \le \|f - p^*\| \le \|f - p'\|$, and the calculation stops when

$$\|f - p'\| - |\lambda'| = |f(\eta') - p'(\eta')| - |\lambda'| < \varepsilon.$$

A Brief Introduction to Fourier Series

Math 682 6/8/98

The Fourier series of a $2\pi$-periodic (bounded, integrable) function $f$ is

$$\frac{a_0}{2} + \sum_{k=1}^{\infty} \left( a_k \cos kx + b_k \sin kx \right),$$

where the coefficients are defined by

$$a_k = \frac{1}{\pi} \int_{-\pi}^{\pi} f(t) \cos kt \, dt \qquad \text{and} \qquad b_k = \frac{1}{\pi} \int_{-\pi}^{\pi} f(t) \sin kt \, dt.$$

Please note that if $f$ is Riemann integrable on $[-\pi, \pi]$, then each of these integrals is well-defined and finite; indeed,

$$|a_k| \le \frac{1}{\pi} \int_{-\pi}^{\pi} |f(t)| \, dt,$$

and so, for example, we would have $|a_k| \le 2\|f\|$ for $f \in C^{2\pi}$.

We write the partial sums of the series as

$$s_n(f)(x) = \frac{a_0}{2} + \sum_{k=1}^{n} \left( a_k \cos kx + b_k \sin kx \right).$$

Now while $s_n(f)$ need not converge pointwise to $f$ (in fact, it may even diverge at a given point), and while $s_n(f)$ is not typically a good uniform approximation to $f$, it is still a very natural choice for an approximation to $f$ in the "least-squares" sense (which we'll make precise shortly). Said in other words, the Fourier series for $f$ provides a useful representation for $f$ even if it fails to converge pointwise to $f$.

Observations

1. The functions $1, \cos x, \cos 2x, \dots, \sin x, \sin 2x, \dots$ are orthogonal on $[-\pi, \pi]$. That is,

$$\int_{-\pi}^{\pi} \cos mx \cos nx \, dx = \int_{-\pi}^{\pi} \sin mx \sin nx \, dx = \int_{-\pi}^{\pi} \cos mx \sin nx \, dx = 0$$

for any $m \neq n$ (and the last equation even holds for $m = n$),

$$\int_{-\pi}^{\pi} \sin^2 mx \, dx = \int_{-\pi}^{\pi} \cos^2 mx \, dx = \pi$$

for any $m \neq 0$, and, of course, $\int_{-\pi}^{\pi} 1 \, dx = 2\pi$.

2. What this means is that if $T(x) = \frac{\alpha_0}{2} + \sum_{k=1}^{n} \left( \alpha_k \cos kx + \beta_k \sin kx \right)$, then

$$\frac{1}{\pi} \int_{-\pi}^{\pi} T(x) \cos mx \, dx = \frac{\alpha_m}{\pi} \int_{-\pi}^{\pi} \cos^2 mx \, dx = \alpha_m$$

for $m \neq 0$, while

$$\frac{1}{\pi} \int_{-\pi}^{\pi} T(x) \, dx = \frac{\alpha_0}{2\pi} \int_{-\pi}^{\pi} dx = \alpha_0.$$

That is, if $T \in \mathcal{T}_n$, then $T$ is actually equal to its own Fourier series.

3. The partial sum operator $s_n$ is a linear projection from $C^{2\pi}$ onto $\mathcal{T}_n$.

4. If $T(x) = \frac{\alpha_0}{2} + \sum_{k=1}^{n} \left( \alpha_k \cos kx + \beta_k \sin kx \right)$ is a trig polynomial, then

$$\begin{aligned}
\frac{1}{\pi} \int_{-\pi}^{\pi} f(x)\, T(x) \, dx
&= \frac{\alpha_0}{2\pi} \int_{-\pi}^{\pi} f(x) \, dx
+ \sum_{k=1}^{n} \frac{\alpha_k}{\pi} \int_{-\pi}^{\pi} f(x) \cos kx \, dx
+ \sum_{k=1}^{n} \frac{\beta_k}{\pi} \int_{-\pi}^{\pi} f(x) \sin kx \, dx \\
&= \frac{\alpha_0 a_0}{2} + \sum_{k=1}^{n} \left( \alpha_k a_k + \beta_k b_k \right),
\end{aligned}$$

where $(a_k)$ and $(b_k)$ are the Fourier coefficients for $f$. [This should remind you of the dot product of the coefficients.]

5. Motivated by 1, 2, and 4, we define the inner product of two elements $f, g \in C^{2\pi}$ by

$$\langle f, g \rangle = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x)\, g(x) \, dx.$$

Note that from 4 we have $\langle f, s_n(f) \rangle = \langle s_n(f), s_n(f) \rangle$ for any $n$. (Why?)

6. If some $f \in C^{2\pi}$ has $a_k = b_k = 0$ for all $k$, then $f \equiv 0$. Indeed, by 4 (or linearity of the integral), this means that

$$\int_{-\pi}^{\pi} f(x)\, T(x) \, dx = 0$$

for any trig polynomial $T$. But from Weierstrass's second theorem we know that $f$ is the uniform limit of some sequence of trig polynomials $(T_n)$. Thus,

$$\int_{-\pi}^{\pi} f(x)^2 \, dx = \lim_{n \to \infty} \int_{-\pi}^{\pi} f(x)\, T_n(x) \, dx = 0.$$

Since $f$ is continuous, this easily implies that $f \equiv 0$.

7. If $f, g \in C^{2\pi}$ have the same Fourier series, then $f \equiv g$. Hence, the Fourier series for an $f \in C^{2\pi}$ provides a representation for $f$ (even if the series fails to converge to $f$).

8. The coefficients $a_0, a_1, \dots, a_n$ and $b_1, b_2, \dots, b_n$ minimize the expression

$$\varphi(a_0, a_1, \dots, b_n) = \int_{-\pi}^{\pi} \left[ f(x) - s_n(f)(x) \right]^2 dx.$$

It's not hard to see, for example, that

$$\frac{\partial \varphi}{\partial a_k} = -\int_{-\pi}^{\pi} 2 \left[ f(x) - s_n(f)(x) \right] \cos kx \, dx = 0$$

precisely when $a_k$ satisfies

$$\int_{-\pi}^{\pi} f(x) \cos kx \, dx = a_k \int_{-\pi}^{\pi} \cos^2 kx \, dx.$$

9. The partial sum $s_n(f)$ is the best approximation to $f$ out of $\mathcal{T}_n$ relative to the $L_2$ norm

$$\|f\|_2 = \sqrt{\langle f, f \rangle} = \left( \frac{1}{\pi} \int_{-\pi}^{\pi} f(x)^2 \, dx \right)^{1/2}.$$

(Be forewarned: Some authors prefer $1/2\pi$ in place of $1/\pi$.) That is,

$$\|f - s_n(f)\|_2 = \min_{T \in \mathcal{T}_n} \|f - T\|_2.$$

Moreover, using 4 and 5, we have

$$\begin{aligned}
\|f - s_n(f)\|_2^2 &= \langle f - s_n(f),\, f - s_n(f) \rangle \\
&= \langle f, f \rangle - 2 \langle f, s_n(f) \rangle + \langle s_n(f), s_n(f) \rangle \\
&= \|f\|_2^2 - \|s_n(f)\|_2^2 \\
&= \frac{1}{\pi} \int_{-\pi}^{\pi} f(x)^2 \, dx - \frac{a_0^2}{2} - \sum_{k=1}^{n} \left( a_k^2 + b_k^2 \right).
\end{aligned}$$

[This should remind you of the Pythagorean theorem.]

10. It follows from 9 that

$$\frac{1}{\pi} \int_{-\pi}^{\pi} s_n(f)(x)^2 \, dx = \frac{a_0^2}{2} + \sum_{k=1}^{n} \left( a_k^2 + b_k^2 \right) \le \frac{1}{\pi} \int_{-\pi}^{\pi} f(x)^2 \, dx.$$

In other symbols, $\|s_n(f)\|_2 \le \|f\|_2$. In particular, the Fourier coefficients of any $f \in C^{2\pi}$ are square summable. (Why?)

11. If $f \in C^{2\pi}$, then its Fourier coefficients $(a_n)$ and $(b_n)$ tend to zero as $n \to \infty$.

12. It follows from 10 and Weierstrass's second theorem that $s_n(f) \to f$ in the $L_2$ norm whenever $f \in C^{2\pi}$. Indeed, given $\varepsilon > 0$, choose a trig polynomial $T$ such that $\|f - T\| < \varepsilon$. Then, since $s_n(T) = T$ for large enough $n$, we have

$$\|f - s_n(f)\|_2 \le \|f - T\|_2 + \|s_n(T - f)\|_2 \le 2\|f - T\|_2 \le 2\sqrt{2}\, \|f - T\| < 2\sqrt{2}\, \varepsilon.$$

(Compare this calculation with Lebesgue's Theorem, page 74.)

By way of comparison, let's give a simple class of functions whose Fourier partial sums provide good uniform approximations.

Theorem. If $f'' \in C^{2\pi}$, then the Fourier series for $f$ converges absolutely and uniformly to $f$.

Proof. First notice that integration by parts leads to an estimate on the order of growth of the Fourier coefficients of $f$:

$$\pi a_k = \int_{-\pi}^{\pi} f(x) \cos kx \, dx = \int_{-\pi}^{\pi} f(x) \, d\!\left( \frac{\sin kx}{k} \right) = -\frac{1}{k} \int_{-\pi}^{\pi} f'(x) \sin kx \, dx$$

(because $f$ is $2\pi$-periodic). Thus, $|a_k| \le 2\|f'\|/k \to 0$ as $k \to \infty$. Now we integrate by parts again:

$$-\pi k\, a_k = \int_{-\pi}^{\pi} f'(x) \sin kx \, dx = -\int_{-\pi}^{\pi} f'(x) \, d\!\left( \frac{\cos kx}{k} \right) = \frac{1}{k} \int_{-\pi}^{\pi} f''(x) \cos kx \, dx$$

(because $f'$ is $2\pi$-periodic). Thus, $|a_k| \le 2\|f''\|/k^2 \to 0$ as $k \to \infty$. More importantly, this inequality (along with the Weierstrass M-test) implies that the Fourier series for $f$ is both uniformly and absolutely convergent:

$$\left| \frac{a_0}{2} + \sum_{k=1}^{\infty} \left( a_k \cos kx + b_k \sin kx \right) \right| \le \frac{|a_0|}{2} + \sum_{k=1}^{\infty} \left( |a_k| + |b_k| \right) \le C \sum_{k=1}^{\infty} \frac{1}{k^2} < \infty.$$


But why should the series actually converge to $f$? Well, if we call the sum

$$g(x) = \frac{a_0}{2} + \sum_{k=1}^{\infty} \left( a_k \cos kx + b_k \sin kx \right),$$

then $g \in C^{2\pi}$ (why?) and $g$ has the same Fourier coefficients as $f$ (why?). Hence (by 7), $g = f$.

Our next chore is to find a closed expression for $s_n(f)$. For this we'll need a couple of trig identities; the first two need no explanation.

$$\cos kt \cos kx + \sin kt \sin kx = \cos k(t - x)$$

$$2 \cos\alpha \sin\beta = \sin(\alpha + \beta) - \sin(\alpha - \beta)$$

$$\frac12 + \cos\theta + \cos 2\theta + \cdots + \cos n\theta = \frac{\sin\left(n + \frac12\right)\theta}{2 \sin \frac{\theta}{2}}$$

Here's a short proof for the third:

$$\sin \frac{\theta}{2} + \sum_{k=1}^{n} 2 \cos k\theta \sin \frac{\theta}{2} = \sin \frac{\theta}{2} + \sum_{k=1}^{n} \left[ \sin\left(k + \tfrac12\right)\theta - \sin\left(k - \tfrac12\right)\theta \right] = \sin\left(n + \tfrac12\right)\theta.$$

The function

$$D_n(t) = \frac{\sin\left(n + \frac12\right)t}{2 \sin \frac{t}{2}}$$

is called Dirichlet's kernel. It plays an important role in our next calculation.

Now we're ready to re-write our formula for $s_n(f)$:

$$\begin{aligned}
s_n(f)(x) &= \frac{a_0}{2} + \sum_{k=1}^{n} \left( a_k \cos kx + b_k \sin kx \right) \\
&= \frac{1}{\pi} \int_{-\pi}^{\pi} f(t) \left[ \frac12 + \sum_{k=1}^{n} \left( \cos kt \cos kx + \sin kt \sin kx \right) \right] dt \\
&= \frac{1}{\pi} \int_{-\pi}^{\pi} f(t) \left[ \frac12 + \sum_{k=1}^{n} \cos k(t - x) \right] dt \\
&= \frac{1}{\pi} \int_{-\pi}^{\pi} f(t)\, \frac{\sin\left(n + \frac12\right)(t - x)}{2 \sin \frac{t - x}{2}} \, dt \\
&= \frac{1}{\pi} \int_{-\pi}^{\pi} f(t)\, D_n(t - x) \, dt
= \frac{1}{\pi} \int_{-\pi}^{\pi} f(x + t)\, D_n(t) \, dt.
\end{aligned}$$
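The closed form can be sanity-checked by quadrature: integrating a degree-3 trig polynomial against $D_3$ reproduces it, while $D_2$ drops the degree-3 term. A sketch (the test function and step count are my choices):

```python
import math

def f(x):
    return math.cos(2 * x) + 0.5 * math.sin(3 * x)   # a trig polynomial of degree 3

def dirichlet(n, t):
    """D_n(t) = sin((n + 1/2) t) / (2 sin(t/2)), with the limiting value at t = 0."""
    s = math.sin(t / 2)
    if abs(s) < 1e-12:
        return n + 0.5
    return math.sin((n + 0.5) * t) / (2 * s)

def s_n(n, x, steps=2000):
    """(1/pi) * integral over [-pi, pi] of f(x+t) D_n(t) dt, midpoint rule."""
    h = 2 * math.pi / steps
    ts = (-math.pi + (k + 0.5) * h for k in range(steps))
    return sum(f(x + t) * dirichlet(n, t) for t in ts) * h / math.pi

x = 0.7
print(s_n(3, x), f(x))             # s_3 reproduces f (up to quadrature error)
print(s_n(2, x), math.cos(2 * x))  # s_2 keeps only the cos 2x term
```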


It now follows easily that $s_n(f)$ is linear in $f$ (because integration against $D_n$ is linear), that $s_n(f) \in \mathcal{T}_n$ (because $D_n \in \mathcal{T}_n$), and, in fact, that $s_n(\mathcal{T}_m) = \mathcal{T}_{\min(m,n)}$. In other words, $s_n$ is indeed a linear projection onto $\mathcal{T}_n$.

While we know that $s_n(f)$ is a good approximation to $f$ in the $L_2$ norm, a better understanding of its effectiveness as a uniform approximation will require a better understanding of the Dirichlet kernel $D_n$. Here are a few pertinent facts:

Lemma. (a) $D_n$ is even;

(b) $\displaystyle \frac{1}{\pi} \int_{-\pi}^{\pi} D_n(t) \, dt = \frac{2}{\pi} \int_{0}^{\pi} D_n(t) \, dt = 1$;

(c) $|D_n(t)| \le n + \frac12$ and $D_n(0) = n + \frac12$;

(d) $\displaystyle \frac{\left| \sin\left(n + \frac12\right)t \right|}{t} \le |D_n(t)| \le \frac{\pi \left| \sin\left(n + \frac12\right)t \right|}{2t}$ for $0 < t < \pi$;

(e) If $\displaystyle \lambda_n = \frac{1}{\pi} \int_{-\pi}^{\pi} |D_n(t)| \, dt$, then $\frac{4}{\pi^2} \log n \le \lambda_n \le 3 + \log n$.

Proof. (a), (b), and (c) are relatively clear from the fact that

$$D_n(t) = \frac12 + \cos t + \cos 2t + \cdots + \cos nt.$$

(Notice, too, that (b) follows from the fact that $s_n(1) = 1$.) For (d) we use a more delicate estimate: Since $2\theta/\pi \le \sin\theta$ for $0 < \theta < \pi/2$, it follows that $2t/\pi \le 2 \sin(t/2) \le t$ for $0 < t < \pi$. Hence,

$$\frac{\left| \sin\left(n + \frac12\right)t \right|}{t} \le \frac{\left| \sin\left(n + \frac12\right)t \right|}{2 \sin \frac{t}{2}} \le \frac{\pi \left| \sin\left(n + \frac12\right)t \right|}{2t}$$

for $0 < t < \pi$. Next, the upper estimate in (e) is easy:

$$\begin{aligned}
\frac{2}{\pi} \int_{0}^{\pi} |D_n(t)| \, dt &= \frac{2}{\pi} \int_{0}^{\pi} \frac{\left| \sin\left(n + \frac12\right)t \right|}{2 \sin \frac{t}{2}} \, dt \\
&\le \frac{2}{\pi} \int_{0}^{1/n} \left( n + \tfrac12 \right) dt + \frac{2}{\pi} \int_{1/n}^{\pi} \frac{\pi}{2t} \, dt \\
&= \frac{2n + 1}{\pi n} + \log \pi + \log n < 3 + \log n.
\end{aligned}$$

The lower estimate takes some work:

$$\begin{aligned}
\frac{2}{\pi} \int_{0}^{\pi} |D_n(t)| \, dt
&= \frac{2}{\pi} \int_{0}^{\pi} \frac{\left| \sin\left(n + \frac12\right)t \right|}{2 \sin \frac{t}{2}} \, dt
\ge \frac{2}{\pi} \int_{0}^{\pi} \frac{\left| \sin\left(n + \frac12\right)t \right|}{t} \, dt \\
&= \frac{2}{\pi} \int_{0}^{(n + \frac12)\pi} \frac{|\sin x|}{x} \, dx
\ge \frac{2}{\pi} \int_{0}^{n\pi} \frac{|\sin x|}{x} \, dx \\
&= \frac{2}{\pi} \sum_{k=1}^{n} \int_{(k-1)\pi}^{k\pi} \frac{|\sin x|}{x} \, dx
\ge \frac{2}{\pi} \sum_{k=1}^{n} \frac{1}{k\pi} \int_{(k-1)\pi}^{k\pi} |\sin x| \, dx
= \frac{4}{\pi^2} \sum_{k=1}^{n} \frac{1}{k} \\
&\ge \frac{4}{\pi^2} \log n,
\end{aligned}$$

because $\sum_{k=1}^{n} \frac{1}{k} \ge \log n$.
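The two-sided bound in (e) can be watched numerically. A sketch estimating $\lambda_n$ by the midpoint rule (step count is ad hoc):

```python
import math

def dirichlet(n, t):
    """D_n(t) = sin((n + 1/2) t) / (2 sin(t/2)), with the limiting value at t = 0."""
    s = math.sin(t / 2)
    if abs(s) < 1e-12:
        return n + 0.5
    return math.sin((n + 0.5) * t) / (2 * s)

def lebesgue_number(n, steps=50000):
    """lambda_n = (1/pi) * integral over [-pi, pi] of |D_n(t)| dt."""
    h = 2 * math.pi / steps
    return sum(abs(dirichlet(n, -math.pi + (k + 0.5) * h)) for k in range(steps)) * h / math.pi

for n in (4, 16, 64):
    # lower bound, lambda_n, upper bound
    print(n, (4 / math.pi ** 2) * math.log(n), lebesgue_number(n), 3 + math.log(n))
```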

The numbers $\lambda_n = \|D_n\|_1 = \frac{1}{\pi} \int_{-\pi}^{\pi} |D_n(t)| \, dt$ are called the Lebesgue numbers associated to this process (compare this to the terminology we used for interpolation). The point here is that $\lambda_n$ gives the norm of the partial sum operator (projection) on $C^{2\pi}$ and (just as with interpolation) $\lambda_n \to \infty$ as $n \to \infty$. As a matter of no small curiosity, notice that, from Observation 10, the norm of $s_n$ as an operator on $L_2$ is 1.

Corollary. If $f \in C^{2\pi}$, then

$$|s_n(f)(x)| \le \frac{1}{\pi} \int_{-\pi}^{\pi} |f(x + t)|\, |D_n(t)| \, dt \le \lambda_n \|f\|. \tag{$*$}$$

In particular, $\|s_n(f)\| \le \lambda_n \|f\| \le (3 + \log n) \|f\|$.

If we approximate the function $\operatorname{sgn} D_n$ by a continuous function $f$ of norm one, then

$$s_n(f)(0) \approx \frac{1}{\pi} \int_{-\pi}^{\pi} |D_n(t)| \, dt = \lambda_n.$$

Thus, $\lambda_n$ is the smallest constant that works in $(*)$. The fact that the partial sum operators are not uniformly bounded on $C^{2\pi}$, along with the Baire category theorem, tells us that there must be some $f \in C^{2\pi}$ for which $\|s_n(f)\|$ is unbounded. But, as we've seen, this has more to do with projections than it does with Fourier series:

Theorem. (Kharshiladze, Lozinski) For each $n$, let $L_n$ be a continuous, linear projection from $C^{2\pi}$ onto $\mathcal{T}_n$. Then, there is some $f \in C^{2\pi}$ for which $\|L_n(f) - f\|$ is unbounded.

Although our last Corollary may not look very useful, it does give us some information about the effectiveness of $s_n(f)$ as a uniform approximation to $f$. Specifically, we have Lebesgue's theorem:


Theorem. If $f \in C^{2\pi}$, and if we set $E_n^T(f) = \min_{T \in \mathcal{T}_n} \|f - T\|$, then

$$E_n^T(f) \le \|f - s_n(f)\| \le (4 + \log n)\, E_n^T(f).$$

Proof. Let $T^*$ be the best uniform approximation to $f$ out of $\mathcal{T}_n$. Then, since $s_n(T^*) = T^*$, we get

$$\|f - s_n(f)\| \le \|f - T^*\| + \|s_n(T^* - f)\| \le (4 + \log n)\, \|f - T^*\|.$$

As an application of Lebesgue's theorem, let's speak briefly about "Chebyshev series," a notion that fits neatly in between our discussions of approximation by algebraic polynomials and by trig polynomials.

Theorem. Suppose that $f \in C[-1, 1]$ is twice continuously differentiable. Then, $f$ may be written as a uniformly and absolutely convergent Chebyshev series; that is, $f(x) = \sum_{k=0}^{\infty} a_k T_k(x)$, where $\sum_{k=0}^{\infty} |a_k| < \infty$.

Proof. As usual, consider $\varphi(\theta) = f(\cos\theta) \in C^{2\pi}$. Since $\varphi$ is even and twice differentiable, its Fourier series is an absolutely and uniformly convergent cosine series:

$$f(\cos\theta) = \varphi(\theta) = \sum_{k=0}^{\infty} a_k \cos k\theta = \sum_{k=0}^{\infty} a_k T_k(\cos\theta),$$

where $|a_k| \le 2\|\varphi''\|/k^2$. Thus, $f(x) = \sum_{k=0}^{\infty} a_k T_k(x)$.

If we write $S_n(f)(x) = \sum_{k=0}^{n} a_k T_k(x)$, we get an interesting consequence of this Theorem. First, notice that

$$S_n(f)(\cos\theta) = s_n(\varphi)(\theta).$$

Thus, from Lebesgue's theorem,

$$E_n(f) \le \|f - S_n(f)\|_{C[-1,1]} = \|\varphi - s_n(\varphi)\|_{C^{2\pi}} \le (4 + \log n)\, E_n^T(\varphi) = (4 + \log n)\, E_n(f).$$

For $n < 400$, this reads

$$E_n(f) \le \|f - S_n(f)\| \le 10\, E_n(f).$$

That is, for numerical purposes, the error incurred by using $\sum_{k=0}^{n} a_k T_k(x)$ to approximate $f$ is within one decimal place accuracy of the best approximation! Notice, too, that $E_n(f)$ would be very easy to estimate in this case since

$$E_n(f) \le \|f - S_n(f)\| = \left\| \sum_{k > n} a_k T_k \right\| \le \sum_{k > n} |a_k| \le 2\|\varphi''\| \sum_{k > n} \frac{1}{k^2}.$$
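The coefficients $a_k = \frac{2}{\pi} \int_0^{\pi} f(\cos\theta) \cos k\theta \, d\theta$ (with $a_0$ halved in the sum) are easy to compute by quadrature, and for a smooth $f$ the truncated Chebyshev series is already a near-best uniform approximation. A sketch with $f = e^x$, my choice of test function:

```python
import math

def cheb_coef(f, k, steps=5000):
    """a_k = (2/pi) * integral over [0, pi] of f(cos t) cos kt dt, midpoint rule."""
    h = math.pi / steps
    s = sum(f(math.cos((j + 0.5) * h)) * math.cos(k * (j + 0.5) * h) for j in range(steps))
    return 2 * s * h / math.pi

f = math.exp
n = 8
a = [cheb_coef(f, k) for k in range(n + 1)]

def S(x):
    """S_n(f)(x) = a_0/2 + sum a_k T_k(x), using T_k(cos t) = cos kt."""
    t = math.acos(x)
    return a[0] / 2 + sum(a[k] * math.cos(k * t) for k in range(1, n + 1))

err = max(abs(f(x) - S(x)) for x in [i / 100 - 1 for i in range(201)])
print(err)   # uniform error of the degree-8 Chebyshev partial sum of e^x is tiny
```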

Lebesgue's theorem should remind you of our "fancy" version of Bernstein's theorem; if we knew that $E_n^T(f) \log n \to 0$ as $n \to \infty$, then we'd know that $s_n(f)$ converged uniformly to $f$. Our goal, then, is to improve our estimates on $E_n^T(f)$, and the idea behind these improvements is to replace $D_n$ by a better kernel (with regard to uniform approximation). Before we pursue anything quite so delicate as an estimate on $E_n^T(f)$, though, let's investigate a simple (and useful) replacement for $D_n$.

Since the sequence of partial sums $(s_n)$ need not converge to $f$, we might try looking at their arithmetic means (or Cesàro sums):
$$\sigma_n = \frac{s_0 + s_1 + \cdots + s_{n-1}}{n}.$$
(These averages typically have better convergence properties than the partial sums themselves. Consider $\sigma_n$ in the (scalar) case $s_n = (-1)^n$, for example.) Specifically, we set
$$\sigma_n(f)(x) = \frac{1}{n}\Big[\, s_0(f)(x) + \cdots + s_{n-1}(f)(x) \,\Big] = \frac{1}{\pi}\int_{-\pi}^{\pi} f(x+t)\,\bigg[\frac{1}{n}\sum_{k=0}^{n-1} D_k(t)\bigg]\,dt = \frac{1}{\pi}\int_{-\pi}^{\pi} f(x+t)\,K_n(t)\,dt,$$
where $K_n = (D_0 + D_1 + \cdots + D_{n-1})/n$ is called Fejér's kernel. The same techniques we used earlier can be applied to find a closed form for $\sigma_n(f)$ which, of course, reduces to simplifying $(D_0 + D_1 + \cdots + D_{n-1})/n$. As before, we begin with a trig identity:

$$2\sin\theta \sum_{k=0}^{n-1} \sin{(2k+1)\theta} = \sum_{k=0}^{n-1} \big[\, \cos 2k\theta - \cos(2k+2)\theta \,\big] = 1 - \cos 2n\theta = 2\sin^2 n\theta.$$
Thus,
$$K_n(t) = \frac{1}{n} \sum_{k=0}^{n-1} \frac{\sin{(2k+1)t/2}}{2\sin(t/2)} = \frac{\sin^2(nt/2)}{2n\sin^2(t/2)}.$$
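The closed form is easy to sanity-check numerically: averaging the Dirichlet kernels $D_k(t) = \frac{1}{2} + \cos t + \cdots + \cos kt$ directly should match $\sin^2(nt/2)/(2n\sin^2(t/2))$ at every $t \neq 0$. A minimal check (names ours):

```python
import math

def fejer_avg(n, t):
    # (D_0 + ... + D_{n-1})/n, with D_k(t) = 1/2 + cos t + ... + cos kt
    def D(k):
        return 0.5 + sum(math.cos(j * t) for j in range(1, k + 1))
    return sum(D(k) for k in range(n)) / n

def fejer_closed(n, t):
    # the closed form sin^2(nt/2) / (2 n sin^2(t/2)), valid for t != 0
    return math.sin(n * t / 2) ** 2 / (2 * n * math.sin(t / 2) ** 2)

for t in (0.3, 1.1, 2.5):
    print(abs(fejer_avg(7, t) - fejer_closed(7, t)))  # ~ machine epsilon
```

Notice that the closed form makes the nonnegativity of $K_n$ obvious, something the average of the (sign-changing) $D_k$'s does not.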


Please note that $K_n$ is even, nonnegative, and $\frac{1}{\pi}\int_{-\pi}^{\pi} K_n(t)\,dt = 1$. Thus, $\sigma_n(f)$ is a positive, linear map from $C^{2\pi}$ onto $\mathcal{T}_n$ (but it's not a projection; why?), satisfying $\|\sigma_n(f)\|_2 \le \|f\|_2$ (why?).

Now the arithmetic mean operator $\sigma_n(f)$ is still a good approximation to $f$ in $L_2$ norm. Indeed,
$$\|f - \sigma_n(f)\|_2 = \bigg\| \frac{1}{n}\sum_{k=0}^{n-1} \big(f - s_k(f)\big) \bigg\|_2 \le \frac{1}{n}\sum_{k=0}^{n-1} \|f - s_k(f)\|_2 \to 0$$
as $n \to \infty$ (since $\|f - s_k(f)\|_2 \to 0$). But, more to the point, $\sigma_n(f)$ is actually a good uniform approximation to $f$, a fact that we'll call Fejér's theorem:

Theorem. If $f \in C^{2\pi}$, then $\sigma_n(f)$ converges uniformly to $f$ as $n \to \infty$.

Note that, since $\sigma_n(f) \in \mathcal{T}_n$, Fejér's theorem implies Weierstrass's second theorem. Curiously, Fejér was only 19 years old when he proved this result (about 1900), while Weierstrass was 75 at the time he proved his approximation theorems.

We'll give two proofs of Fejér's theorem, one with details, one without. But both follow from quite general considerations. First:

Theorem. Suppose that $k_n \in C^{2\pi}$ satisfies
(a) $k_n \ge 0$,
(b) $\displaystyle \frac{1}{\pi}\int_{-\pi}^{\pi} k_n(t)\,dt = 1$, and
(c) $\displaystyle \int_{\delta \le |t| \le \pi} k_n(t)\,dt \to 0$ for every $\delta > 0$.
Then, $\displaystyle \frac{1}{\pi}\int_{-\pi}^{\pi} f(x+t)\,k_n(t)\,dt \rightrightarrows f(x)$ for each $f \in C^{2\pi}$.

Proof. Let $\varepsilon > 0$. Since $f$ is uniformly continuous, we may choose $\delta > 0$ so that $|f(x) - f(x+t)| < \varepsilon$, for any $x$, whenever $|t| < \delta$. Next, we use the fact that $k_n$ is nonnegative and integrates to 1 to write
$$\bigg| f(x) - \frac{1}{\pi}\int_{-\pi}^{\pi} f(x+t)\,k_n(t)\,dt \bigg| = \bigg| \frac{1}{\pi}\int_{-\pi}^{\pi} \big[\, f(x) - f(x+t) \,\big]\,k_n(t)\,dt \bigg|$$
$$\le \frac{1}{\pi}\int_{-\pi}^{\pi} \big|\, f(x) - f(x+t) \,\big|\,k_n(t)\,dt$$


$$\le \frac{\varepsilon}{\pi}\int_{|t|<\delta} k_n(t)\,dt + \frac{2\|f\|}{\pi}\int_{\delta \le |t| \le \pi} k_n(t)\,dt < \varepsilon + \varepsilon = 2\varepsilon$$
for $n$ sufficiently large.

To see that Fejér's kernel satisfies the conditions of the Theorem is easy. In particular, (c) follows from the fact that $K_n(t) \rightrightarrows 0$ on the set $\delta \le |t| \le \pi$. Indeed, since $\sin(t/2)$ increases on $[\,0, \pi\,]$, we have, for $\delta \le |t| \le \pi$,
$$K_n(t) = \frac{\sin^2(nt/2)}{2n\sin^2(t/2)} \le \frac{1}{2n\sin^2(\delta/2)} \to 0.$$

Our second proof, or sketch, really, is based on a variant of the Bohman-Korovkin theorem for $C^{2\pi}$, due to Korovkin. In this setting, the three "test cases" are
$$f_0(x) = 1, \qquad f_1(x) = \cos x, \qquad f_2(x) = \sin x,$$
and:

Theorem. Let $(L_n)$ be a sequence of positive, linear maps on $C^{2\pi}$. If $L_n(f) \rightrightarrows f$ for each of the three functions $f_0(x) = 1$, $f_1(x) = \cos x$, and $f_2(x) = \sin x$, then $L_n(f) \rightrightarrows f$ for every $f \in C^{2\pi}$.

We won't prove this theorem; rather, we'll check that $\sigma_n(f) \rightrightarrows f$ in each of the three test cases. Since $s_n$ is a projection, this is painfully simple:
$$\sigma_n(f_0) = \frac{1}{n}\,(f_0 + f_0 + \cdots + f_0) = f_0$$
$$\sigma_n(f_1) = \frac{1}{n}\,(0 + f_1 + f_1 + \cdots + f_1) = \frac{n-1}{n}\,f_1$$
$$\sigma_n(f_2) = \frac{1}{n}\,(0 + f_2 + f_2 + \cdots + f_2) = \frac{n-1}{n}\,f_2.$$
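Fejér's theorem is also pleasant to watch numerically. Averaging the partial sums amounts to damping the $k$-th Fourier coefficient by the factor $1 - k/n$, so $\sigma_n(f) = \frac{a_0}{2} + \sum_{k=1}^{n-1}\big(1 - \frac{k}{n}\big)(a_k\cos kx + b_k\sin kx)$. The sketch below applies this to the merely Lipschitz function $f(x) = |x|$ on $[-\pi,\pi]$ and watches the sup-error shrink; every name and the midpoint-rule quadrature are our own scaffolding.

```python
import math

def fourier_coeffs(f, N, m=4000):
    # a_k, b_k for k < N via (1/pi) * integral of f(t) cos(kt), f(t) sin(kt)
    h = 2 * math.pi / m
    ts = [-math.pi + (j + 0.5) * h for j in range(m)]
    vals = [f(t) for t in ts]
    a = [h / math.pi * sum(v * math.cos(k * t) for v, t in zip(vals, ts)) for k in range(N)]
    b = [h / math.pi * sum(v * math.sin(k * t) for v, t in zip(vals, ts)) for k in range(N)]
    return a, b

def cesaro_mean_fn(f, n):
    # sigma_n(f) as a callable, using the Fejer weights (1 - k/n)
    a, b = fourier_coeffs(f, n)
    def sigma(x):
        return a[0] / 2 + sum((1 - k / n) * (a[k] * math.cos(k * x) + b[k] * math.sin(k * x))
                              for k in range(1, n))
    return sigma

for n in (5, 20, 80):
    sig = cesaro_mean_fn(abs, n)
    err = max(abs(sig(x) - abs(x)) for x in (0.0, 0.5, 1.5, 3.0))
    print(n, round(err, 4))
```

The error decays slowly near the kink at $0$ (roughly like $\log n / n$ for Lipschitz functions), but it does decay, with no Gibbs-type overshoot, exactly because $K_n \ge 0$.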

Kernel operators abound in analysis; for example, Landau's proof of the Weierstrass theorem uses the kernel $L_n(x) = c_n(1 - x^2)^n$. And, in the next section, we'll encounter Jackson's kernel $J_n(t) = c_n \sin^4 nt / (n^3 \sin^4 t)$, which is essentially the square of Fejér's kernel. While we will have no need for a general theory of such operators, please note that the key to their utility is the fact that they're nonnegative!


Lastly, a word or two about Fourier series involving complex coefficients. Most modern textbooks consider the case of a $2\pi$-periodic, integrable function $f : \mathbb{R} \to \mathbb{C}$ and define the Fourier series of $f$ by
$$\sum_{k=-\infty}^{\infty} c_k e^{ikt},$$
where now we have only one formula for the $c_k$'s:
$$c_k = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(t)\,e^{-ikt}\,dt,$$

but, of course, the $c_k$'s may well be complex. This somewhat simpler approach has other advantages; for one, the exponentials $e^{ikt}$ are now an orthonormal set (relative to the normalizing constant $1/2\pi$). And, if we remain consistent with this choice and define the $L_2$ norm by
$$\|f\|_2 = \left( \frac{1}{2\pi}\int_{-\pi}^{\pi} |f(t)|^2\,dt \right)^{1/2},$$
then we have the simpler estimate $\|f\|_2 \le \|f\|$ for $f \in C^{2\pi}$.

The Dirichlet and Fejér kernels are essentially the same in this case, too, except that we would now write $s_n(f)(x) = \sum_{k=-n}^{n} c_k e^{ikx}$. Given this, the Dirichlet and Fejér kernels can be written
$$D_n(x) = \sum_{k=-n}^{n} e^{ikx} = 1 + \sum_{k=1}^{n} \big( e^{ikx} + e^{-ikx} \big) = 1 + 2\sum_{k=1}^{n} \cos kx = \frac{\sin\,(n + \frac{1}{2})\,x}{\sin \frac{1}{2}x}$$
and
$$K_n(x) = \frac{1}{n}\sum_{m=0}^{n-1} \sum_{k=-m}^{m} e^{ikx} = \sum_{k=-n}^{n} \Big( 1 - \frac{|k|}{n} \Big)\, e^{ikx} = \frac{1}{n}\sum_{m=0}^{n-1} \frac{\sin\,(m + \frac{1}{2})\,x}{\sin \frac{1}{2}x} = \frac{\sin^2(nx/2)}{n\sin^2(x/2)}.$$
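A quick check of the one-formula definition of the $c_k$'s: for the real trig polynomial $f(t) = \cos 3t + \sin t$ we should find $c_3 = c_{-3} = \frac{1}{2}$ and $c_{\pm 1} = \mp \frac{i}{2}$. The helper below, with its midpoint-rule quadrature, is our own sketch; the midpoint rule on a uniform periodic grid integrates trig polynomials of low degree exactly (up to roundoff).

```python
import cmath
import math

def fourier_coeff(f, k, m=4000):
    # c_k = (1/(2*pi)) * integral over [-pi, pi] of f(t) e^{-ikt} dt
    h = 2 * math.pi / m
    total = 0j
    for j in range(m):
        t = -math.pi + (j + 0.5) * h
        total += f(t) * cmath.exp(-1j * k * t)
    return total * h / (2 * math.pi)

def f(t):
    return math.cos(3 * t) + math.sin(t)

print(fourier_coeff(f, 3))   # close to 0.5
print(fourier_coeff(f, 1))   # close to -0.5j
```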


In other words, each is twice its real coefficient counterpart. Since the choice of normalizing constant ($1/\pi$ versus $1/2\pi$, and sometimes even $1/\sqrt{\pi}$ or $1/\sqrt{2\pi}$) has a (small) effect on these formulas, you may find some variation in other textbooks.

Problem Set: Fourier Series

Math 682 6/8/98

57. Define $f(x) = (\pi - x)^2$ for $0 \le x \le 2\pi$, and extend $f$ to a $2\pi$-periodic continuous function on $\mathbb{R}$ in the obvious way. Check that the Fourier series for $f$ is $\pi^2/3 + 4\sum_{n=1}^{\infty} \cos nx / n^2$. Since this series is uniformly convergent, it actually converges to $f$. In particular, note that setting $x = 0$ yields the familiar formula $\sum_{n=1}^{\infty} 1/n^2 = \pi^2/6$.
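Not a solution, but a numeric sanity check of the claimed series is quick to run (the names below are ours; the slow $1/n^2$ tail means we need many terms for even modest accuracy):

```python
import math

def f(x):
    # the 2*pi-periodic extension of (pi - x)^2 from [0, 2*pi]
    x = x % (2 * math.pi)
    return (math.pi - x) ** 2

def partial_sum(x, N):
    # pi^2/3 + 4 * sum_{n=1}^{N} cos(nx)/n^2
    return math.pi ** 2 / 3 + 4 * sum(math.cos(n * x) / n ** 2 for n in range(1, N + 1))

print(abs(partial_sum(1.0, 100000) - f(1.0)))        # tiny
print(abs(partial_sum(0.0, 100000) - math.pi ** 2))  # f(0) = pi^2
```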

58. (a) Given $n \ge 1$ and $\varepsilon > 0$, show that there is a continuous function $f \in C^{2\pi}$ satisfying $\|f\| = 1$ and $\frac{1}{\pi}\int_{-\pi}^{\pi} |f(t) - \operatorname{sgn} D_n(t)|\,dt < \varepsilon/(n+1)$.
(b) Show that $s_n(f)(0) \ge \lambda_n - \varepsilon$ and, hence, that $\|s_n(f)\| \ge \lambda_n - \varepsilon$.

59. (a) If $f, k \in C^{2\pi}$, prove that $g(x) = \int_{-\pi}^{\pi} f(x+t)\,k(t)\,dt$ is also in $C^{2\pi}$.
(b) If we only assume that $f$ is $2\pi$-periodic and Riemann integrable on $[-\pi,\pi]$ (but still $k \in C^{2\pi}$), is $g$ still continuous?
(c) If we simply assume that $f$ and $k$ are $2\pi$-periodic and Riemann integrable on $[-\pi,\pi]$, is $g$ still continuous?

60. Suppose that $k_n \in C^{2\pi}$ satisfies $k_n \ge 0$,
$$\frac{1}{\pi}\int_{-\pi}^{\pi} k_n(t)\,dt = 1, \qquad\text{and}\qquad \int_{\delta \le |t| \le \pi} k_n(t)\,dt \to 0 \quad (n \to \infty)$$
for every $\delta > 0$. If $f$ is Riemann integrable, show that $\frac{1}{\pi}\int_{-\pi}^{\pi} f(x+t)\,k_n(t)\,dt \to f(x)$ pointwise, as $n \to \infty$, at each point of continuity of $f$. In particular, $\sigma_n(f)(x) \to f(x)$ at each point of continuity of $f$.

61. Given $f, g \in C^{2\pi}$, we define the convolution of $f$ and $g$, written $f * g$, by
$$(f * g)(x) = \frac{1}{\pi}\int_{-\pi}^{\pi} f(t)\,g(x-t)\,dt.$$
(Compare this integral with that used in problem 59.)
(a) Show that $f * g = g * f$ and that $f * g \in C^{2\pi}$.
(b) If one of $f$ or $g$ is a trig polynomial, show that $f * g$ is again a trig polynomial (of the same degree).
(c) If one of $f$ or $g$ is continuously differentiable, show that $f * g$ is likewise continuously differentiable, and find an integral formula for $(f * g)'(x)$.

Jackson's Theorems

Math 682 6/16/98

We continue our investigations of the "middle ground" between algebraic and trigonometric approximation by presenting several results due to the great American mathematician Dunham Jackson (from roughly 1911-1912). The first of these results will give us the best possible estimate of $E_n(f)$ in terms of $\omega_f$ and $n$.

Jackson's Theorem 1. If $f \in C^{2\pi}$, then $E_n^T(f) \le 6\,\omega_f\big([-\pi,\pi];\,\frac{1}{n}\big)$.

Theorem 1 should be viewed as an improvement over Bernstein's Theorem, which stated that $E_n(f) \le \frac{3}{2}\,\omega_f\big(\frac{1}{\sqrt{n}}\big)$ for $f \in C[-1,1]$. As we'll see, the proof of Theorem 1 not only mimics the proof of Bernstein's result, but also uses some of the ideas we talked about in the last section. In particular, the proof we'll give involves integration against an "improved" Dirichlet kernel.

Before we dive into the proof, let's list several immediate and important Corollaries:
Corollary. Weierstrass's second theorem (since $\omega_f(\frac{1}{n}) \to 0$ for any $f \in C^{2\pi}$).
Corollary. The Dini-Lipschitz theorem: If $\omega_f(\frac{1}{n}) \log n \to 0$ as $n \to \infty$, then the Fourier series for $f$ converges uniformly to $f$.

Proof. From Lebesgue's theorem,
$$\|f - s_n(f)\| \le (4 + \log n)\,E_n^T(f) \le 6\,(4 + \log n)\;\omega_f\Big(\frac{1}{n}\Big) \to 0.$$

Jackson's Theorem 2. If $f \in C[-1,1]$, then $E_n(f) \le 6\,\omega_f\big([-1,1];\,\frac{1}{n}\big)$.
Proof. Let $\varphi(\theta) = f(\cos\theta)$. Then, as we've seen,
$$E_n(f) = E_n^T(\varphi) \le 6\,\omega_\varphi\Big([-\pi,\pi];\,\frac{1}{n}\Big) \le 6\,\omega_f\Big([-1,1];\,\frac{1}{n}\Big),$$
where the last inequality follows from the fact that
$$|\varphi(\theta) - \varphi(\psi)| = |f(\cos\theta) - f(\cos\psi)| \le \omega_f(\,|\cos\theta - \cos\psi|\,) \le \omega_f(\,|\theta - \psi|\,).$$

Jackson's Theorems 105

Corollary. If $f \in \operatorname{lip}_K \alpha$ on $[-1,1]$, then $E_n(f) \le 6K\,n^{-\alpha}$. (Recall that Bernstein's theorem gives only $n^{-\alpha/2}$.)
Corollary. If $f \in C[-1,1]$ has a bounded derivative, then $E_n(f) \le \frac{6}{n}\,\|f'\|$.
Corollary. If $f \in C[-1,1]$ has a continuous derivative, then $E_n(f) \le \frac{6}{n}\,E_{n-1}(f')$.

Proof. Let $p^* \in \mathcal{P}_{n-1}$ be the best uniform approximation to $f'$, and consider $p(x) = \int_{-1}^{x} p^*(t)\,dt \in \mathcal{P}_n$. From the previous Corollary,
$$E_n(f) = E_n(f - p) \le \frac{6}{n}\,\|f' - p^*\| = \frac{6}{n}\,E_{n-1}(f').$$
(Why is $E_n(f) = E_n(f - p)$?)
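We can watch the $O(1/n)$ rate (versus Bernstein's $O(1/\sqrt{n})$) for the Lipschitz function $f(x) = |x|$ by using Chebyshev interpolation as a cheap, computable stand-in for best approximation: the interpolant lies in $\mathcal{P}_n$, so its error is an upper bound for $E_n(f)$, though not equal to it. The barycentric evaluation below is a standard device, and all names are our own scaffolding, not part of the notes.

```python
import math

def cheb_interp_error(f, n, m=2000):
    # interpolate f at the n+1 Chebyshev points cos((2j+1)pi/(2n+2)),
    # evaluate by the barycentric formula, return the max error on a grid
    nodes = [math.cos((2 * j + 1) * math.pi / (2 * n + 2)) for j in range(n + 1)]
    # barycentric weights for first-kind Chebyshev points
    w = [(-1) ** j * math.sin((2 * j + 1) * math.pi / (2 * n + 2)) for j in range(n + 1)]
    fv = [f(x) for x in nodes]
    def p(x):
        num = den = 0.0
        for xj, wj, fj in zip(nodes, w, fv):
            if x == xj:
                return fj
            num += wj * fj / (x - xj)
            den += wj / (x - xj)
        return num / den
    return max(abs(p(-1 + 2 * k / m) - f(-1 + 2 * k / m)) for k in range(m + 1))

for n in (4, 16, 64):
    print(n, round(cheb_interp_error(abs, n), 5))
```

Roughly quadrupling $n$ roughly quarters the error, consistent with the $O(1/n)$ prediction for a $\operatorname{lip}_1 1$ function (the interpolation error carries an extra slowly growing Lebesgue-constant factor, so the match is approximate).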

Iterating this last inequality will give the following result:
Corollary. If $f \in C[-1,1]$ is $k$-times continuously differentiable, then
$$E_n(f) \le \frac{6^{k+1}}{n(n-1)\cdots(n-k+1)}\;\omega_k\Big(\frac{1}{n-k}\Big),$$
where $\omega_k$ is the modulus of continuity of $f^{(k)}$.

Well, enough corollaries. It's time we proved Jackson's Theorem 1. Now Jackson's approach was to show that
$$\frac{1}{\pi}\int_{-\pi}^{\pi} f(x+t)\; c_n \Big( \frac{\sin nt}{\sin t} \Big)^{4}\, dt \;\rightrightarrows\; f(x),$$
where $J_n(t) = c_n(\sin nt/\sin t)^4$ is the "improved" kernel we alluded to earlier (it's essentially the square of Fejér's kernel). The approach we'll take, due to Korovkin, proves the existence of a suitable kernel without giving a tidy formula for it. On the other hand, it's relatively easy to outline the idea. The key here is that $J_n(t)$ should be an even, nonnegative, trig polynomial of degree $n$ with $\frac{1}{\pi}\int_{-\pi}^{\pi} J_n(t)\,dt = 1$. In other words,
$$J_n(t) = \frac{1}{2} + \sum_{k=1}^{n} \rho_{k,n} \cos kt$$
(why is the first term 1/2?), where $\rho_{1,n}, \ldots, \rho_{n,n}$ must be chosen so that $J_n(t) \ge 0$. Assuming we can find such $\rho_{k,n}$'s, here's what we get:

suming we can nd such k n's, here's what we get:


Lemma. If $f \in C^{2\pi}$, then
$$\bigg| f(x) - \frac{1}{\pi}\int_{-\pi}^{\pi} f(x+t)\,J_n(t)\,dt \bigg| \le \omega_f\Big(\frac{1}{n}\Big) \bigg[\, 1 + n\pi \sqrt{\frac{1 - \rho_{1,n}}{2}} \,\bigg].$$

Proof. We already know how the first several lines of the proof will go:
$$\bigg| f(x) - \frac{1}{\pi}\int_{-\pi}^{\pi} f(x+t)\,J_n(t)\,dt \bigg| = \bigg| \frac{1}{\pi}\int_{-\pi}^{\pi} \big[\, f(x) - f(x+t) \,\big]\,J_n(t)\,dt \bigg|$$
$$\le \frac{1}{\pi}\int_{-\pi}^{\pi} |f(x) - f(x+t)|\,J_n(t)\,dt \le \frac{1}{\pi}\int_{-\pi}^{\pi} \omega_f(\,|t|\,)\,J_n(t)\,dt.$$

Next we borrow a trick from Bernstein. We replace $\omega_f(\,|t|\,)$ by
$$\omega_f(\,|t|\,) = \omega_f\Big( n|t| \cdot \frac{1}{n} \Big) \le \big( 1 + n|t| \big)\;\omega_f\Big(\frac{1}{n}\Big),$$
and so the last integral on the right-hand side, above, is dominated by
$$\omega_f\Big(\frac{1}{n}\Big)\,\frac{1}{\pi}\int_{-\pi}^{\pi} \big( 1 + n|t| \big)\,J_n(t)\,dt = \omega_f\Big(\frac{1}{n}\Big) \bigg[\, 1 + n\,\frac{1}{\pi}\int_{-\pi}^{\pi} |t|\,J_n(t)\,dt \,\bigg].$$

All that remains is to estimate $\frac{1}{\pi}\int_{-\pi}^{\pi} |t|\,J_n(t)\,dt$, and for this we'll appeal to the Cauchy-Schwarz inequality (again, compare this to the proof of Bernstein's theorem):
$$\frac{1}{\pi}\int_{-\pi}^{\pi} |t|\,J_n(t)\,dt = \frac{1}{\pi}\int_{-\pi}^{\pi} |t|\,J_n(t)^{1/2}\,J_n(t)^{1/2}\,dt$$
$$\le \bigg( \frac{1}{\pi}\int_{-\pi}^{\pi} |t|^2 J_n(t)\,dt \bigg)^{1/2} \bigg( \frac{1}{\pi}\int_{-\pi}^{\pi} J_n(t)\,dt \bigg)^{1/2} = \bigg( \frac{1}{\pi}\int_{-\pi}^{\pi} |t|^2 J_n(t)\,dt \bigg)^{1/2}.$$

But,
$$|t|^2 \le \pi^2 \sin^2 \frac{t}{2} = \frac{\pi^2}{2}\,\big( 1 - \cos t \,\big) \qquad \text{for } |t| \le \pi.$$
So,
$$\frac{1}{\pi}\int_{-\pi}^{\pi} |t|\,J_n(t)\,dt \le \bigg( \frac{\pi^2}{2}\cdot\frac{1}{\pi}\int_{-\pi}^{\pi} \big( 1 - \cos t \,\big)\,J_n(t)\,dt \bigg)^{1/2} = \pi\,\sqrt{\frac{1 - \rho_{1,n}}{2}}.$$


Now we still have to prove that we can actually find a suitable choice of scalars $\rho_{1,n}, \ldots, \rho_{n,n}$. We already know that we need to choose the $\rho_{k,n}$'s so that $J_n(t)$ will be nonnegative, but now it's clear that we also want $\rho_{1,n}$ to be very close to 1. To get us started, let's first see why it's easy to generate nonnegative cosine polynomials. Given real numbers $c_0, \ldots, c_n$, note that

numbers c0 : : : cn, note that

! 0X 1

X X X

2

n n n

ck eikx @ cj e;ijx A

ck eikx ck cj ei(k;j)x

0 = =

j=0

k=0 k=0 kj

X2 X ;

n

ck cj ei(k;j)x + ei(j;k)x

ck

= +

k=0 k>j

X2 X

n

ck cj cos(k ; j)x

ck

= +2

k=0 k>j

X2 X

n n;1

ck ck ck+1 cos x + + 2c0cn cos nx:

= +2 ()

k=0 k=0

In particular, we need to find $c_0, \ldots, c_n$ with
$$\sum_{k=0}^{n} c_k^2 = \frac{1}{2} \qquad \text{and} \qquad \rho_{1,n} = 2\sum_{k=0}^{n-1} c_k c_{k+1} \approx 1.$$
What we'll do is find $c_k$'s with $\sum_{k=0}^{n-1} c_k c_{k+1} \approx \sum_{k=0}^{n} c_k^2$, and then normalize. But, in fact, we won't actually find anything; we'll simply write down a choice of $c_k$'s that happens to work! Consider:

$$\sum_{k=0}^{n} \sin\frac{(k+1)\pi}{n+2}\,\sin\frac{(k+2)\pi}{n+2} = \sum_{k=0}^{n} \sin\frac{k\pi}{n+2}\,\sin\frac{(k+1)\pi}{n+2}$$
$$= \frac{1}{2}\sum_{k=0}^{n} \bigg[\, \sin\frac{k\pi}{n+2} + \sin\frac{(k+2)\pi}{n+2} \,\bigg] \sin\frac{(k+1)\pi}{n+2}.$$
By changing the index of summation, it's easy to see that the first two sums are equal and, hence, each is equal to the average of the two. Next we re-write this last sum, using the trig identity $\frac{1}{2}\,\big( \sin A + \sin B \big) = \cos\frac{A-B}{2}\,\sin\frac{A+B}{2}$, to get
$$\sum_{k=0}^{n} \sin\frac{(k+1)\pi}{n+2}\,\sin\frac{(k+2)\pi}{n+2} = \cos\frac{\pi}{n+2}\, \sum_{k=0}^{n} \sin^2\frac{(k+1)\pi}{n+2}.$$


Since $\cos\frac{\pi}{n+2} \approx 1$ for large $n$, we've done it! If we define $c_k = c\,\sin\frac{(k+1)\pi}{n+2}$, where $c$ is chosen so that $\sum_{k=0}^{n} c_k^2 = 1/2$, and if we define $J_n(x)$ using $(*)$, then $J_n(x) \ge 0$ and $\rho_{1,n} = \cos\frac{\pi}{n+2}$ (why?). The estimate needed in our Lemma becomes

$$\omega_f\Big(\frac{1}{n}\Big) \bigg[\, 1 + n\pi\,\sqrt{\frac{1 - \cos\frac{\pi}{n+2}}{2}} \,\bigg].$$
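The whole construction is easy to check numerically: with $c_k = c\,\sin\frac{(k+1)\pi}{n+2}$ normalized so that $\sum c_k^2 = \frac{1}{2}$, the polynomial $J_n(x) = \big|\sum_k c_k e^{ikx}\big|^2$ is nonnegative by construction, has constant term $\frac{1}{2}$ (so $\frac{1}{\pi}\int_{-\pi}^{\pi} J_n = 1$), and its $\cos x$ coefficient is $\rho_{1,n} = 2\sum_k c_k c_{k+1} = \cos\frac{\pi}{n+2}$. A sketch, with names of our own choosing:

```python
import cmath
import math

def jackson_coeffs(n):
    # c_k = c * sin((k+1) pi / (n+2)), normalized so that sum c_k^2 = 1/2
    raw = [math.sin((k + 1) * math.pi / (n + 2)) for k in range(n + 1)]
    c = math.sqrt(0.5 / sum(r * r for r in raw))
    return [c * r for r in raw]

def rho1(n):
    # the cos x coefficient: 2 * sum c_k c_{k+1}
    c = jackson_coeffs(n)
    return 2 * sum(c[k] * c[k + 1] for k in range(n))

def J(n, x):
    # J_n(x) = |sum c_k e^{ikx}|^2 >= 0 by construction
    c = jackson_coeffs(n)
    return abs(sum(ck * cmath.exp(1j * k * x) for k, ck in enumerate(c))) ** 2

n = 10
print(rho1(n), math.cos(math.pi / (n + 2)))  # the two values agree
```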