
(ii) $B_n(f_2) = \bigl(1 - \frac{1}{n}\bigr) f_2 + \frac{1}{n} f_1$; hence $B_n(f_2) \rightrightarrows f_2$;
(iii) $\sum_{k=0}^{n} \bigl(\frac{k}{n} - x\bigr)^{2} \binom{n}{k} x^k (1-x)^{n-k} = \frac{x(1-x)}{n} \le \frac{1}{4n}$, if $0 \le x \le 1$;
(iv) Given $\delta > 0$ and $0 \le x \le 1$, let $F$ denote the set of $k$'s in $\{0, \ldots, n\}$ for which
$\bigl|\frac{k}{n} - x\bigr| \ge \delta$. Then $\sum_{k \in F} \binom{n}{k} x^k (1-x)^{n-k} \le \frac{1}{4n\delta^2}$.
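For readers who like to experiment, here is a quick numerical sanity check of identity (iii). This script is our own illustration, not part of the original notes, and the function name is ours.

```python
# Check: sum_{k=0}^n (k/n - x)^2 C(n,k) x^k (1-x)^(n-k) = x(1-x)/n <= 1/(4n)
import math

def bernstein_second_moment(n, x):
    # second moment of the Bernstein weights about x
    return sum((k / n - x) ** 2 * math.comb(n, k) * x ** k * (1 - x) ** (n - k)
               for k in range(n + 1))

n, x = 20, 0.3
print(bernstein_second_moment(n, x))   # 0.0105...
print(x * (1 - x) / n)                 # 0.0105, the claimed exact value
print(1 / (4 * n))                     # 0.0125, the upper bound 1/(4n)
```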
. 31. Show that $|B_n(f)| \le B_n(|f|)$, and that $B_n(f) \ge 0$ whenever $f \ge 0$. Conclude that
$\|B_n(f)\| \le \|f\|$.
32. If $f$ is a bounded function on $[0,1]$, show that $B_n(f)(x) \to f(x)$ at each point of
continuity of $f$.
33. (Bohman, Korovkin) Let $(T_n)$ be a sequence of monotone linear operators on $C[0,1]$;
that is, each $T_n$ is a linear map from $C[0,1]$ into itself satisfying $T_n(f) \le T_n(g)$
whenever $f \le g$. Suppose also that $T_n(f_0) \rightrightarrows f_0$, $T_n(f_1) \rightrightarrows f_1$, and $T_n(f_2) \rightrightarrows f_2$.
Prove that $T_n(f) \rightrightarrows f$ for every $f \in C[0,1]$. [Hint: Mimic the proof of Bernstein's
theorem.]
34. Find $B_n(f)$ for $f(x) = x^3$. [Hint: $k^2 = (k-1)(k-2) + 3(k-1) + 1$.] The same
method of calculation can be used to show that $B_n(f) \in \mathcal{P}_m$ whenever $f \in \mathcal{P}_m$ and
$n > m$.
35. Let $f$ be continuously differentiable on $[a,b]$, and let $\varepsilon > 0$. Show that there is a
polynomial $p$ such that $\|f - p\| < \varepsilon$ and $\|f' - p'\| < \varepsilon$.
36. Suppose that $f \in C[a,b]$ is twice continuously differentiable and has $f'' > 0$. Prove
that the best linear approximation to $f$ on $[a,b]$ is $a_0 + a_1 x$ where $a_1 = f'(c)$,
$a_0 = [f(a) + f(c) - f'(c)(a + c)]/2$, and where $c$ is the unique solution to $f'(c) =
(f(b) - f(a))/(b - a)$.
The next several exercises concern the modulus of continuity. Given a bounded real-valued
function $f$ defined on some interval $I$, we define $\omega_f$, the modulus of continuity of $f$, by
$$\omega_f(I; \delta) = \omega_f(\delta) = \sup\bigl\{\, |f(x) - f(y)| : x, y \in I,\ |x - y| \le \delta \,\bigr\}, \qquad \delta \ge 0.$$
Note, for example, that if $f$ is uniformly continuous, then $\omega_f(\delta) \to 0$ as $\delta \to 0^+$. Indeed,
the statement that $|f(x) - f(y)| \le \varepsilon$ whenever $|x - y| \le \delta$ is equivalent to the statement
that $\omega_f(\delta) \le \varepsilon$. On the other hand, if the graph of $f$ has a jump of magnitude 1, say, then
$\omega_f(\delta) \ge 1$ for all $\delta > 0$.
. 37. If $f$ satisfies the Lipschitz condition $|f(x) - f(y)| \le K|x - y|$, what can you say about
$\omega_f$? Calculate $\omega_g$ for $g(x) = \sqrt{x}$.
38. If $f \in C[a,b]$, show that $\omega_f(\delta_1 + \delta_2) \le \omega_f(\delta_1) + \omega_f(\delta_2)$ and that $\omega_f(\delta) \downarrow 0$ as $\delta \downarrow 0$.
Use this to show that $\omega_f$ is continuous for $\delta \ge 0$. Finally, show that the modulus of
continuity of $\omega_f$ is again $\omega_f$.
. 39. (a) If $x = \cos\theta$, where $-1 \le x \le 1$, and if $g(\theta) = f(\cos\theta)$, show that $\omega_g([-\pi,\pi]; \delta) =
\omega_g([0,\pi]; \delta) \le \omega_f([-1,1]; \delta)$.
(b) If $g(x) = f(ax + b)$ for $c \le x \le d$ (with $a > 0$), show that $\omega_g([c,d]; \delta) = \omega_f([ac+b, ad+b]; a\delta)$.
40. Let $f$ be continuously differentiable on $[0,1]$. Show that $(B_n(f))'$ converges uniformly
to $f'$ by showing that $\|B_n(f') - (B_{n+1}(f))'\| \le \omega_{f'}(1/(n+1))$. In order to see
why this is of interest, find a uniformly convergent sequence of polynomials whose
derivatives fail to converge uniformly. [Compare this result with problem 35.]
Trigonometric Polynomials
Math 682 5/26/98

Introduction
A (real) trigonometric polynomial, or trig polynomial for short, is a function of the form
$$a_0 + \sum_{k=1}^{n} \bigl( a_k \cos kx + b_k \sin kx \bigr), \eqno(*)$$
where $a_0, \ldots, a_n$ and $b_1, \ldots, b_n$ are real numbers. The degree of a trig polynomial is the
highest frequency occurring in any representation of the form $(*)$; thus, $(*)$ has degree $n$
provided that one of $a_n$ or $b_n$ is nonzero. We will use $\mathcal{T}_n$ to denote the collection of trig
polynomials of degree at most $n$, and $\mathcal{T}$ to denote the collection of all trig polynomials
(i.e., the union of the $\mathcal{T}_n$'s).
It is convenient to take the space of all continuous $2\pi$-periodic functions on $\mathbb{R}$ as the
containing space for $\mathcal{T}_n$, a space we denote by $C^{2\pi}$. The space $C^{2\pi}$ has several equivalent
descriptions. For one, it's obvious that $C^{2\pi}$ is a subspace of $C(\mathbb{R})$, the space of all con-
tinuous functions on $\mathbb{R}$. But we might also consider $C^{2\pi}$ as a subspace of $C[0, 2\pi]$ in the
following way: The $2\pi$-periodic continuous functions on $\mathbb{R}$ may be identified with the set
of functions $f \in C[0, 2\pi]$ satisfying $f(0) = f(2\pi)$. Each such $f$ extends to a $2\pi$-periodic
element of $C(\mathbb{R})$ in an obvious way, and it's not hard to see that the condition $f(0) = f(2\pi)$
defines a subspace of $C[0, 2\pi]$. As a third description, it is often convenient to identify $C^{2\pi}$
with the collection $C(\mathbb{T})$, consisting of all the continuous real-valued functions on $\mathbb{T}$, where
$\mathbb{T}$ is the unit circle in the complex plane $\mathbb{C}$. That is, we simply make the identifications
$$\theta \leftrightarrow e^{i\theta} \qquad \text{and} \qquad f(\theta) \leftrightarrow f(e^{i\theta}).$$
In any case, each $f \in C^{2\pi}$ is uniformly continuous and uniformly bounded on all of $\mathbb{R}$, and
is completely determined by its values on any interval of length $2\pi$. In particular, we may
(and will) endow $C^{2\pi}$ with the sup norm:
$$\|f\| = \max_{0 \le x \le 2\pi} |f(x)| = \max_{x \in \mathbb{R}} |f(x)|.$$

Our goal in this chapter is to prove what is sometimes called Weierstrass's second
theorem (also from 1885).

Theorem. (Weierstrass's Second Theorem, 1885) Let $f \in C^{2\pi}$. Then, for every $\varepsilon > 0$,
there exists a trig polynomial $T$ such that $\|f - T\| < \varepsilon$.
Ultimately, we will give several different proofs of this theorem. Weierstrass gave a
separate proof of this result in the same paper containing his theorem on approximation
by algebraic polynomials, but it was later pointed out by Lebesgue (1898) that the two
theorems are, in fact, equivalent. Lebesgue's proof is based on several elementary obser-
vations. We will outline these elementary facts as "exercises with hints," supplying a few
proofs here and there, but leaving full details to the reader.
We first justify the use of the word "polynomial" in describing $(*)$.

Lemma. $\cos nx$ and $\sin(n+1)x / \sin x$ can be written as polynomials of degree exactly $n$
in $\cos x$ for any integer $n \ge 0$.

Proof. Using the recurrence formula $\cos kx + \cos(k-2)x = 2\cos(k-1)x \cos x$, it's not
hard to see that $\cos 2x = 2\cos^2 x - 1$, $\cos 3x = 4\cos^3 x - 3\cos x$, and $\cos 4x = 8\cos^4 x - 8\cos^2 x + 1$. More generally, by induction, $\cos nx$ is a polynomial of degree $n$ in $\cos x$
with leading coefficient $2^{n-1}$. Using this fact and the identity $\sin(k+1)x - \sin(k-1)x = 2\cos kx \sin x$ (along with another easy induction argument), it follows that $\sin(n+1)x$ can
be written as $\sin x$ times a polynomial of degree $n$ in $\cos x$ with leading coefficient $2^n$.
Alternatively, notice that by writing $(i \sin x)^{2k} = (\cos^2 x - 1)^k$ we have
$$\cos nx = \mathrm{Re}\bigl[(\cos x + i \sin x)^n\bigr]
= \mathrm{Re}\Biggl[\,\sum_{k=0}^{n} \binom{n}{k} (i \sin x)^k \cos^{n-k} x \Biggr]
= \sum_{k=0}^{[n/2]} \binom{n}{2k} (\cos^2 x - 1)^k \cos^{n-2k} x.$$
The coefficient of $\cos^n x$ in this expansion is then
$$\sum_{k=0}^{[n/2]} \binom{n}{2k} = \frac{1}{2} \sum_{k=0}^{n} \binom{n}{k} = 2^{n-1}.$$
(All the binomial coefficients together sum to $(1+1)^n = 2^n$, but the even or odd terms
taken separately sum to exactly half this amount since $(1 + (-1))^n = 0$.)
Similarly,
$$\sin(n+1)x = \mathrm{Im}\bigl[(\cos x + i \sin x)^{n+1}\bigr]
= \mathrm{Im}\Biggl[\,\sum_{k=0}^{n+1} \binom{n+1}{k} (i \sin x)^k \cos^{n+1-k} x \Biggr]
= \sum_{k=0}^{[n/2]} \binom{n+1}{2k+1} (\cos^2 x - 1)^k \cos^{n-2k} x \, \sin x,$$
where we've written $(i \sin x)^{2k+1} = i(\cos^2 x - 1)^k \sin x$. The coefficient of $\cos^n x \sin x$ is
$$\sum_{k=0}^{[n/2]} \binom{n+1}{2k+1} = \frac{1}{2} \sum_{k=0}^{n+1} \binom{n+1}{k} = 2^n.$$

Corollary. Any trig polynomial $(*)$ may be written as $P(\cos x) + Q(\cos x) \sin x$, where
$P$ and $Q$ are algebraic polynomials of degree at most $n$ and $n-1$, respectively. If $(*)$
represents an even function, then it can be written using only cosines.

Corollary. The collection $\mathcal{T}$, consisting of all trig polynomials, is both a subspace and
a subring of $C^{2\pi}$ (that is, $\mathcal{T}$ is closed under both linear combinations and products). In
other words, $\mathcal{T}$ is a subalgebra of $C^{2\pi}$.
It's not hard to see that the procedure we've described above can be reversed; that is,
each algebraic polynomial in $\cos x$ and $\sin x$ can be written in the form $(*)$. For example,
$4\cos^3 x = 3\cos x + \cos 3x$. But, rather than duplicate our efforts, let's use a bit of linear
algebra. First, the $2n+1$ functions
$$A = \{\, 1,\ \cos x,\ \cos 2x,\ \ldots,\ \cos nx,\ \sin x,\ \sin 2x,\ \ldots,\ \sin nx \,\}$$
are linearly independent; the easiest way to see this is to notice that we may define an
inner product on $C^{2\pi}$ under which these functions are orthogonal. Specifically,
$$\langle f, g \rangle = \int_0^{2\pi} f(x)\, g(x)\, dx = 0 \qquad \text{and} \qquad \langle f, f \rangle = \int_0^{2\pi} f(x)^2\, dx \ne 0$$
for any pair of functions $f \ne g \in A$. (We'll pursue this direction in greater detail later in
the course.) Second, we've shown that each element of $A$ lives in the space spanned by
the $2n+1$ functions
$$B = \{\, 1,\ \cos x,\ \cos^2 x,\ \ldots,\ \cos^n x,\ \sin x,\ \cos x \sin x,\ \ldots,\ \cos^{n-1} x \sin x \,\}.$$
That is,
$$\mathrm{span}\, A \subset \mathcal{T}_n \subset \mathrm{span}\, B.$$
By comparing dimensions, we have
$$2n + 1 = \dim \mathcal{T}_n = \dim(\mathrm{span}\, A) \le \dim(\mathrm{span}\, B) \le 2n + 1,$$
and hence we must have $\mathrm{span}\, A = \mathrm{span}\, B$. The point here is that $\mathcal{T}_n$ is a finite-dimensional
subspace of $C^{2\pi}$ of dimension $2n+1$, and we may use either one of these sets of functions
as a basis for $\mathcal{T}_n$.
Before we leave these issues behind, let's summarize the situation for complex trig
polynomials, i.e., the case where we allow complex coefficients in $(*)$. Now it's clear that
every trig polynomial $(*)$, whether real or complex, can be written as
$$\sum_{k=-n}^{n} c_k e^{ikx}, \eqno(**)$$
where the $c_k$'s are complex; that is, a trig polynomial is actually a polynomial (over $\mathbb{C}$) in
$z = e^{ix}$ and $\bar z = e^{-ix}$. Conversely, every polynomial $(**)$ can be written in the form $(*)$,
using complex $a_k$'s and $b_k$'s. Thus, the complex trig polynomials of degree $n$ form a vector
space of dimension $2n+1$ over $\mathbb{C}$ (hence of dimension $2(2n+1)$ when considered as a vector
space over $\mathbb{R}$). But, not every polynomial in $z$ and $\bar z$ represents a real trig polynomial.
Rather, the real trig polynomials are the real parts of the complex trig polynomials. To
see this, notice that $(**)$ represents a real-valued function if and only if
$$\sum_{k=-n}^{n} c_k e^{ikx} = \overline{\sum_{k=-n}^{n} c_k e^{ikx}} = \sum_{k=-n}^{n} \overline{c_k}\, e^{-ikx} = \sum_{k=-n}^{n} \overline{c_{-k}}\, e^{ikx};$$
that is, $c_k = \overline{c_{-k}}$ for each $k$. In particular, $c_0$ must be real, and hence
$$\sum_{k=-n}^{n} c_k e^{ikx} = c_0 + \sum_{k=1}^{n} \bigl( c_k e^{ikx} + c_{-k} e^{-ikx} \bigr)
= c_0 + \sum_{k=1}^{n} \bigl( c_k e^{ikx} + \overline{c_k}\, e^{-ikx} \bigr)$$
$$= c_0 + \sum_{k=1}^{n} \bigl[ (c_k + \overline{c_k}) \cos kx + i (c_k - \overline{c_k}) \sin kx \bigr]
= c_0 + \sum_{k=1}^{n} \bigl[ 2\,\mathrm{Re}(c_k) \cos kx - 2\,\mathrm{Im}(c_k) \sin kx \bigr],$$
which is of the form $(*)$ with $a_k$ and $b_k$ real.
Conversely, given any real trig polynomial $(*)$, we have
$$a_0 + \sum_{k=1}^{n} \bigl( a_k \cos kx + b_k \sin kx \bigr)
= a_0 + \sum_{k=1}^{n} \Bigl( \frac{a_k - i b_k}{2}\, e^{ikx} + \frac{a_k + i b_k}{2}\, e^{-ikx} \Bigr),$$
which is of the form $(**)$ with $c_k = \overline{c_{-k}}$ for each $k$.
It's time we returned to approximation theory! Since we've been able to identify $C^{2\pi}$
with a subspace of $C[0, 2\pi]$, and since $\mathcal{T}_n$ is a finite-dimensional subspace of $C^{2\pi}$, we have

Corollary. Each $f \in C^{2\pi}$ has a best approximation (on all of $\mathbb{R}$) out of $\mathcal{T}_n$. If $f$ is an
even function, then it has a best approximation which is also even.
Proof. We only need to prove the second claim, so suppose that $f \in C^{2\pi}$ is even and
that $T^* \in \mathcal{T}_n$ satisfies
$$\|f - T^*\| = \min_{T \in \mathcal{T}_n} \|f - T\|.$$
Then, since $f$ is even, $\widetilde{T}(x) = T^*(-x)$ is also a best approximation to $f$ out of $\mathcal{T}_n$; indeed,
$$\|f - \widetilde{T}\| = \max_{x \in \mathbb{R}} |f(x) - T^*(-x)| = \max_{x \in \mathbb{R}} |f(-x) - T^*(x)| = \max_{x \in \mathbb{R}} |f(x) - T^*(x)| = \|f - T^*\|.$$
But now, the even trig polynomial
$$\widehat{T}(x) = \frac{T^*(x) + \widetilde{T}(x)}{2} = \frac{T^*(-x) + T^*(x)}{2}$$
is also a best approximation out of $\mathcal{T}_n$ since
$$\|f - \widehat{T}\| = \biggl\| \frac{(f - T^*) + (f - \widetilde{T})}{2} \biggr\| \le \frac{\|f - T^*\| + \|f - \widetilde{T}\|}{2} = \min_{T \in \mathcal{T}_n} \|f - T\|.$$

We next give (de la Vallée Poussin's version of) Lebesgue's proof of Weierstrass's
second theorem; that is, we will deduce the second theorem from the first.
Theorem. Let $f \in C^{2\pi}$ and let $\varepsilon > 0$. Then, there is a trig polynomial $T$ such that
$\|f - T\| = \max_{x \in \mathbb{R}} |f(x) - T(x)| < \varepsilon$.

Proof. We will prove that Weierstrass's first theorem for $C[-1,1]$ implies his second
theorem for $C^{2\pi}$.

Step 1. If $f$ is even, then $f$ may be uniformly approximated by even trig polynomials.

If $f$ is even, then it's enough to approximate $f$ on the interval $[0, \pi]$. In this case, we
may consider the function $g(y) = f(\arccos y)$, $-1 \le y \le 1$, in $C[-1,1]$. By Weierstrass's
first theorem, there is an algebraic polynomial $p(y)$ such that
$$\max_{-1 \le y \le 1} |f(\arccos y) - p(y)| = \max_{0 \le x \le \pi} |f(x) - p(\cos x)| < \varepsilon.$$
But $T(x) = p(\cos x)$ is an even trig polynomial! Hence,
$$\|f - T\| = \max_{x \in \mathbb{R}} |f(x) - T(x)| < \varepsilon.$$
Let's agree to abbreviate $\|f - T\| < \varepsilon$ as $f \approx T$.

Step 2. Given $f \in C^{2\pi}$, there is a trig polynomial $T$ such that $2f(x) \sin^2 x \approx T(x)$.

Each of the functions $f(x) + f(-x)$ and $[f(x) - f(-x)] \sin x$ is even. Thus, we may
choose even trig polynomials $T_1$ and $T_2$ such that
$$f(x) + f(-x) \approx T_1(x) \qquad \text{and} \qquad [f(x) - f(-x)] \sin x \approx T_2(x).$$
Multiplying the first expression by $\sin^2 x$, the second by $\sin x$, and adding, we get
$$2f(x) \sin^2 x \approx T_1(x) \sin^2 x + T_2(x) \sin x \equiv T_3(x),$$
where $T_3(x)$ is still a trig polynomial, and where "$\approx$" now means "within $2\varepsilon$" (since
$|\sin x| \le 1$).

Step 3. Given $f \in C^{2\pi}$, there is a trig polynomial $T$ such that $2f(x) \cos^2 x \approx T(x)$, where
"$\approx$" means "within $2\varepsilon$."

Repeat Step 2 for $f(x - \pi/2)$ and translate: We first choose a trig polynomial $T_4(x)$
such that
$$2 f\bigl(x - \tfrac{\pi}{2}\bigr) \sin^2 x \approx T_4(x).$$
That is, replacing $x$ by $x + \pi/2$,
$$2f(x) \cos^2 x \approx T_5(x),$$
where $T_5(x) = T_4(x + \pi/2)$ is a trig polynomial.
Finally, by combining the conclusions of Steps 2 and 3, we find that there is a trig
polynomial $T_6(x)$ such that $f \approx T_6(x)$, where, again, "$\approx$" means "within $2\varepsilon$."
Just for fun, let's complete the circle and show that Weierstrass's second theorem
for $C^{2\pi}$ implies his first theorem for $C[-1,1]$. Since, as we'll see, it's possible to give an
independent proof of the second theorem, this is a meaningful exercise.
Theorem. Given $f \in C[-1,1]$ and $\varepsilon > 0$, there exists an algebraic polynomial $p$ such
that $\|f - p\| < \varepsilon$.

Proof. Given $f \in C[-1,1]$, the function $f(\cos x)$ is an even function in $C^{2\pi}$. By our
Corollary to Weierstrass's second theorem, we may approximate $f(\cos x)$ by an even trig
polynomial:
$$f(\cos x) \approx a_0 + a_1 \cos x + a_2 \cos 2x + \cdots + a_n \cos nx.$$
But, as we've seen, $\cos kx$ can be written as an algebraic polynomial in $\cos x$. Hence, there
is some algebraic polynomial $p$ such that $f(\cos x) \approx p(\cos x)$. That is,
$$\max_{0 \le x \le \pi} |f(\cos x) - p(\cos x)| = \max_{-1 \le t \le 1} |f(t) - p(t)| < \varepsilon.$$

The algebraic polynomials $T_n(x)$ satisfying
$$T_n(\cos x) = \cos nx \qquad \text{for } n = 0, 1, 2, \ldots$$
are called the Chebyshev polynomials of the first kind. Please note that this formula
uniquely defines $T_n$ as a polynomial of degree exactly $n$, and hence uniquely determines
the values of $T_n(x)$ for $|x| > 1$, too. The algebraic polynomials $U_n(x)$ satisfying
$$U_n(\cos x) = \frac{\sin(n+1)x}{\sin x} \qquad \text{for } n = 0, 1, 2, \ldots$$
are called the Chebyshev polynomials of the second kind. Likewise, note that this formula
uniquely defines $U_n$ as a polynomial of degree exactly $n$.
We will discover many intriguing properties of the Chebyshev polynomials in the next
chapter. For now, let's settle for just one: The recurrence formula we gave earlier,
$$\cos nx = 2 \cos x \cos(n-1)x - \cos(n-2)x,$$
now becomes
$$T_n(x) = 2x\, T_{n-1}(x) - T_{n-2}(x), \qquad n \ge 2,$$
where $T_0(x) = 1$ and $T_1(x) = x$. This recurrence relation (along with the initial cases $T_0$
and $T_1$) may be taken as a definition for the Chebyshev polynomials of the first kind. At
any rate, it's now easy to list any number of the Chebyshev polynomials $T_n$; for example,
the next few are $T_2(x) = 2x^2 - 1$, $T_3(x) = 4x^3 - 3x$, $T_4(x) = 8x^4 - 8x^2 + 1$, and $T_5(x) = 16x^5 - 20x^3 + 5x$.
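The recurrence is also a convenient way to generate the $T_n$ by machine. Here is a minimal sketch in Python (our own illustration, not part of the notes), with polynomials stored as coefficient lists $[c_0, c_1, \ldots]$ in increasing powers of $x$:

```python
# Generate T_n via T_n = 2x T_{n-1} - T_{n-2}.
def chebyshev_T(n):
    T_prev, T_curr = [1], [0, 1]              # T_0 = 1, T_1 = x
    if n == 0:
        return T_prev
    for _ in range(n - 1):
        nxt = [0] + [2 * c for c in T_curr]   # multiply T_curr by 2x
        for i, c in enumerate(T_prev):        # subtract the earlier polynomial
            nxt[i] -= c
        T_prev, T_curr = T_curr, nxt
    return T_curr

for n in range(6):
    print(n, chebyshev_T(n))
# 5 [0, 5, 0, -20, 0, 16]  agrees with T_5(x) = 16x^5 - 20x^3 + 5x
```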
Problem Set: Trigonometric Polynomials
Math 682 5/26/98

A (real) trigonometric polynomial, or trig polynomial for short, is a function of the form
$$a_0 + \sum_{k=1}^{n} \bigl( a_k \cos kx + b_k \sin kx \bigr), \eqno(*)$$
where $a_0, \ldots, a_n$ and $b_1, \ldots, b_n$ are real numbers. We will use $\mathcal{T}_n$ to denote the collection
of trig polynomials of degree at most $n$, considered as a subspace of $C^{2\pi}$, the space of all
continuous $2\pi$-periodic functions on $\mathbb{R}$. The space $C^{2\pi}$ may, in turn, be considered as a
subspace of $C[0, 2\pi]$. Indeed, the $2\pi$-periodic continuous functions on $\mathbb{R}$ may be identified
with the subspace of $C[0, 2\pi]$ consisting of those $f$'s which satisfy $f(0) = f(2\pi)$. As an
alternate description, it is often convenient to instead identify $C^{2\pi}$ with the collection
$C(\mathbb{T})$, consisting of all continuous real-valued functions on $\mathbb{T}$, where $\mathbb{T}$ is the unit circle in
the complex plane $\mathbb{C}$. In this case, we simply make the identifications
$$\theta \leftrightarrow e^{i\theta} \qquad \text{and} \qquad f(\theta) \leftrightarrow f(e^{i\theta}).$$

. 41. (a) By using the recurrence formulas $\cos kx + \cos(k-2)x = 2\cos(k-1)x \cos x$ and
$\sin(k+1)x - \sin(k-1)x = 2\cos kx \sin x$, show that each of the functions $\cos kx$
and $\sin(k+1)x / \sin x$ may be written as algebraic polynomials of degree exactly
$k$ in $\cos x$. In each case, what is the coefficient of $\cos^k x$?
(b) Equivalently, use the binomial formula to write the real and imaginary parts of
$(\cos x + i \sin x)^n = \cos nx + i \sin nx$ as algebraic polynomials in $\cos x$ and $\sin x$.
Again, what are the leading coefficients of these polynomials?
(c) If $P(x,y)$ is an algebraic polynomial (in two variables) of degree at most $n$, show
that $P(\cos x, \sin x)$ may be written as $Q(\cos x) + R(\cos x) \sin x$, where $Q$ and
$R$ are algebraic polynomials (in one variable) of degrees at most $n$ and $n-1$,
respectively.
(d) Show that $\cos^n x$ can be written as a linear combination of the functions $\cos kx$,
$k = 0, 1, \ldots, n$, and that $\cos^{n-1} x \sin x$ can be written as a linear combination of
the functions $\sin kx$, $k = 1, \ldots, n$. Thus, each polynomial $P(\cos x, \sin x)$ in $\cos x$
and $\sin x$ can be written in the form $(*)$.
(e) If $(*)$ represents an even function, show that it can be written using only cosines.
Conversely, if $P(x,y)$ is an even polynomial, show that $P(\cos x, \sin x)$ can be
written using only cosines.
42. Show that $\mathcal{T}_n$ has dimension exactly $2n+1$ (as a vector space over $\mathbb{R}$).
43. We might also consider complex trig polynomials; that is, functions of the form $(*)$ in
which we now allow the $a_k$'s and $b_k$'s to be complex numbers.
(a) Show that every trig polynomial, whether real or complex, may be written as
$$\sum_{k=-n}^{n} c_k e^{ikx}, \eqno(**)$$
where the $c_k$'s are complex. Thus, complex trig polynomials are just algebraic
polynomials in $z$ and $\bar z$, where $z = e^{ix} \in \mathbb{T}$.
(b) Show that $(**)$ is real-valued if and only if $c_k = \overline{c_{-k}}$ for each $k$.
(c) If $(**)$ is a real-valued function, show that it may be written as a real trig poly-
nomial; that is, it may be written in the form $(*)$ using only real coefficients.
Characterization of Best Approximation
Math 682 5/27/98

We next discuss Chebyshev's solution to the problem of best polynomial approximation
from 1854. Given that there was no reason to believe that the problem even had a solution,
let alone a unique solution, Chebyshev's accomplishment should not be underestimated.
Chebyshev might very well have been able to prove Weierstrass's result, 30 years early,
had the thought simply occurred to him! Chebyshev's original papers are apparently
rather sketchy. It wasn't until 1903 that full details were given by Kirchberger. Curiously,
Kirchberger's proofs foreshadow very modern techniques such as convexity and separation
arguments. The presentation we'll give owes much to Haar and to de la Vallée Poussin
(both from around 1918).
We begin with an easy observation:
Lemma. Let $f \in C[a,b]$ and let $p = p_n^*$ be a best approximation to $f$ out of $\mathcal{P}_n$. Then,
there are at least two distinct points $x_1, x_2 \in [a,b]$ such that
$$f(x_1) - p(x_1) = -(f(x_2) - p(x_2)) = \|f - p\|.$$
That is, $f - p$ attains both of the values $\pm\|f - p\|$.

Proof. Let's write $E = E_n(f) = \|f - p\| = \max_{a \le x \le b} |f(x) - p(x)|$. If the conclusion of the
Lemma is false, then we might as well suppose that $f(x_1) - p(x_1) = E$ for some $x_1$, but
that
$$e = \min_{a \le x \le b} \bigl( f(x) - p(x) \bigr) > -E.$$
In particular, $E + e \ne 0$ and so $q = p + (E + e)/2$ is an element of $\mathcal{P}_n$ with $q \ne p$. We
claim that $q$ is a better approximation to $f$ than $p$. Here's why:
$$E - \frac{E+e}{2} \ \ge\ f(x) - p(x) - \frac{E+e}{2} \ \ge\ e - \frac{E+e}{2},$$
or
$$\frac{E-e}{2} \ \ge\ f(x) - q(x) \ \ge\ -\,\frac{E-e}{2}.$$
That is,
$$\|f - q\| \le \frac{E-e}{2} < E = \|f - p\|,$$
a contradiction.
Corollary. The best approximating constant to $f \in C[a,b]$ is
$$p_0 = \frac{1}{2} \Bigl[\, \max_{a \le x \le b} f(x) + \min_{a \le x \le b} f(x) \Bigr]$$
and
$$E_0(f) = \frac{1}{2} \Bigl[\, \max_{a \le x \le b} f(x) - \min_{a \le x \le b} f(x) \Bigr].$$

Proof. Exercise.

Now all of this is meant as motivation for the general case, which essentially repeats
the observation of our first Lemma inductively. A little experimentation will convince you
that a best linear approximation, for example, would imply the existence of three points
(at least) at which $f - p_1^*$ alternates between $\pm\|f - p_1^*\|$.
A bit of notation will help us set up the argument for the general case: Given $g$ in
$C[a,b]$, we'll say that $x \in [a,b]$ is a $(+)$ point for $g$ (respectively, a $(-)$ point for $g$) if
$g(x) = \|g\|$ (respectively, $g(x) = -\|g\|$). A set of distinct points $a \le x_0 < x_1 < \cdots < x_n \le b$
will be called an alternating set for $g$ if the $x_i$'s are alternately $(+)$ points and $(-)$ points;
that is, if
$$|g(x_i)| = \|g\|, \qquad i = 0, 1, \ldots, n,$$
and
$$g(x_i) = -g(x_{i-1}), \qquad i = 1, 2, \ldots, n.$$
Since the following three theorems are particularly important, we will number them for
future reference. Our rst result is where all the ghting takes place:
Theorem 1. Let f 2 C a b ], and suppose that p = pn is a best approximation to f out
of Pn. Then, there is an alternating set for f ; p consisting of at least n + 2 points.
2 Pn, there's nothing to show. (Why?) Thus, we may suppose that f 2 Pn
Proof. If f =
and, hence, that E = En(f) = kf ; pk > 0.
Now consider the (uniformly) continuous function $\varphi = f - p$. We may partition $[a,b]$
by way of $a = t_0 < t_1 < \cdots < t_N = b$ into sufficiently small intervals so that
$$|\varphi(x) - \varphi(y)| < E/2 \qquad \text{whenever} \qquad x, y \in [t_i, t_{i+1}].$$
Here's why we'd want to do such a thing: If $[t_i, t_{i+1}]$ contains a $(+)$ point for $\varphi = f - p$,
then $\varphi$ is positive on all of $[t_i, t_{i+1}]$. Indeed,
$$x, y \in [t_i, t_{i+1}] \ \text{and} \ \varphi(x) = E \implies \varphi(y) > E/2 > 0.$$
Similarly, if $[t_i, t_{i+1}]$ contains a $(-)$ point for $\varphi$, then $\varphi$ is negative on all of $[t_i, t_{i+1}]$.
Consequently, no interval $[t_i, t_{i+1}]$ can contain both $(+)$ points and $(-)$ points.
Call $[t_i, t_{i+1}]$ a $(+)$ interval (respectively, a $(-)$ interval) if it contains a $(+)$ point
(respectively, a $(-)$ point) for $\varphi = f - p$. Notice that no $(+)$ interval can even touch a
$(-)$ interval. In other words, a $(+)$ interval and a $(-)$ interval must be strictly separated
(by some interval containing a zero for $\varphi$).
We now relabel the $(+)$ and $(-)$ intervals from left to right, ignoring the "neither"
intervals. There's no harm in supposing that the first "signed" interval is a $(+)$ interval.
Thus, we suppose that our relabeled intervals are written
$$I_1, I_2, \ldots, I_{k_1} \qquad (+) \text{ intervals},$$
$$I_{k_1+1}, I_{k_1+2}, \ldots, I_{k_2} \qquad (-) \text{ intervals},$$
$$\cdots\cdots$$
$$I_{k_{m-1}+1}, I_{k_{m-1}+2}, \ldots, I_{k_m} \qquad (-1)^{m-1} \text{ intervals},$$
where $I_{k_1}$ is the last $(+)$ interval before we reach the first $(-)$ interval, $I_{k_1+1}$. And so on.
For later reference, we let $S$ denote the union of all the "signed" intervals $[t_i, t_{i+1}]$;
that is, $S = \bigcup_{j=1}^{k_m} I_j$; and we let $N$ denote the union of all the "neither" intervals $[t_i, t_{i+1}]$.
Thus, $S$ and $N$ are compact sets with $S \cup N = [a,b]$ (note that while $S$ and $N$ aren't
quite disjoint, they are at least "non-overlapping": their interiors are disjoint).
Our goal here is to show that $m \ge n + 2$. (So far we only know that $m \ge 2$!) Let's
suppose that $m < n + 2$ and see what goes wrong.
Since any $(+)$ interval is strictly separated from any $(-)$ interval, we can find points
$z_1, \ldots, z_{m-1} \in N$ such that
$$\max I_{k_1} < z_1 < \min I_{k_1+1},$$
$$\max I_{k_2} < z_2 < \min I_{k_2+1},$$
$$\cdots\cdots$$
$$\max I_{k_{m-1}} < z_{m-1} < \min I_{k_{m-1}+1}.$$
And now we construct the offending polynomial:
$$q(x) = (z_1 - x)(z_2 - x) \cdots (z_{m-1} - x).$$
Notice that $q \in \mathcal{P}_n$ since $m - 1 \le n$. (Here is the only use we'll make of the assumption
$m < n + 2$!) We're going to show that $p + \lambda q \in \mathcal{P}_n$ is a better approximation to $f$ than $p$,
for some suitable scalar $\lambda$.
We first claim that $q$ and $f - p$ have the same sign. Indeed, $q$ has no zeros in any
of the $(\pm)$ intervals, hence is of constant sign on any such interval. Thus, $q > 0$ on
$I_1, \ldots, I_{k_1}$ because each $(z_j - x) > 0$ on these intervals; $q < 0$ on $I_{k_1+1}, \ldots, I_{k_2}$ because
here $(z_1 - x) < 0$, while $(z_j - x) > 0$ for $j > 1$; and so on.
We next find $\lambda$. Let $e = \max_{x \in N} |f(x) - p(x)|$, where $N$ is the union of all the subin-
tervals $[t_i, t_{i+1}]$ which are neither $(+)$ intervals nor $(-)$ intervals. Then, $e < E$. (Why?)
Now choose $\lambda > 0$ so that $\lambda\|q\| < \min\{E - e,\ E/2\}$. We claim that $p + \lambda q$ is a better
approximation to $f$ than $p$. One case is easy: If $x \in N$, then
$$|f(x) - (p(x) + \lambda q(x))| \le |f(x) - p(x)| + \lambda|q(x)| \le e + \lambda\|q\| < E.$$
On the other hand, if $x \notin N$, then $x$ is in either a $(+)$ interval or a $(-)$ interval. In
particular, we know that $|f(x) - p(x)| > E/2 > \lambda\|q\|$ and that $f(x) - p(x)$ and $\lambda q(x)$ have
the same sign. Thus,
$$|f(x) - (p(x) + \lambda q(x))| = |f(x) - p(x)| - \lambda|q(x)| \le E - \lambda \min_{x \in S} |q(x)| < E,$$
since $q$ is nonzero on $S$. This contradiction finishes the proof. (Phew!)
Remarks
1. It should be pointed out that the number $n + 2$ here is actually $1 + \dim \mathcal{P}_n$.
2. Notice, too, that if $f - p_n^*$ alternates in sign $n + 2$ times, then $f - p_n^*$ must have at
least $n + 1$ zeros. Thus, $p_n^*$ actually agrees with $f$ (or "interpolates" $f$) at $n + 1$ points.
We're now ready to establish the uniqueness of the polynomial of best approximation.

Theorem 2. Let $f \in C[a,b]$. Then, the polynomial of best approximation to $f$ out of
$\mathcal{P}_n$ is unique.

Proof. Suppose that $p, q \in \mathcal{P}_n$ both satisfy $\|f - p\| = \|f - q\| = E_n(f) = E$. Then,
as we've seen, their average $r = (p + q)/2 \in \mathcal{P}_n$ is also best: $\|f - r\| = E$ since $f - r = (f - p)/2 + (f - q)/2$.
By Theorem 1, $f - r$ has an alternating set $x_0, x_1, \ldots, x_{n+1}$ containing $n + 2$ points.
Thus, for each $i$,
$$(f - p)(x_i) + (f - q)(x_i) = \pm 2E \quad \text{(alternating)},$$
while
$$-E \le (f - p)(x_i),\ (f - q)(x_i) \le E.$$
But this means that
$$(f - p)(x_i) = (f - q)(x_i) = \pm E \quad \text{(alternating)}$$
for each $i$. (Why?) That is, $x_0, x_1, \ldots, x_{n+1}$ is an alternating set for both $f - p$ and $f - q$.
In particular, the polynomial $q - p = (f - p) - (f - q)$ has $n + 2$ zeros! Since $q - p \in \mathcal{P}_n$,
we must have $p = q$.
Finally, we come full circle:

Theorem 3. Let $f \in C[a,b]$, and let $p \in \mathcal{P}_n$. If $f - p$ has an alternating set containing
$n + 2$ (or more) points, then $p$ is the best approximation to $f$ out of $\mathcal{P}_n$.

Proof. Let $x_0, x_1, \ldots, x_{n+1}$ be an alternating set for $f - p$, and suppose that some $q \in \mathcal{P}_n$
is a better approximation to $f$ than $p$; that is, $\|f - q\| < \|f - p\|$. In particular, then, we
must have
$$|f(x_i) - p(x_i)| = \|f - p\| > \|f - q\| \ge |f(x_i) - q(x_i)|$$
for each $i = 0, 1, \ldots, n + 1$. Now the inequality $|a| > |b|$ implies that $a$ and $a - b$ have the
same sign (why?), hence $q - p = (f - p) - (f - q)$ alternates in sign $n + 2$ times (because
$f - p$ does). But then, $q - p$ would have at least $n + 1$ zeros. Since $q - p \in \mathcal{P}_n$, we must have
$q = p$, which is a contradiction. Thus, $p$ is the best approximation to $f$ out of $\mathcal{P}_n$.
Example (taken from Rivlin)
While an alternating set for $f - p_n^*$ is supposed to have at least $n + 2$ points, it may well
have more than $n + 2$ points; thus, alternating sets need not be unique. For example,
consider the function $f(x) = \sin 4x$ on $[-\pi, \pi]$. Since there are 8 points where $f$ alternates
between $\pm 1$, it follows that $p_0^* = 0$ and that there are $4 \cdot 4 = 16$ different alternating
sets consisting of exactly 2 points (not to mention all those with more than 2 points). In
addition, notice that we actually have $p_1^* = \cdots = p_6^* = 0$, but that $p_7^* \ne 0$. (Why?)
Exercise
Show that $y = x - 1/8$ is the best linear approximation to $y = x^2$ on $[0,1]$.
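(A sketch of one way to organize the verification, using Theorem 3: the error function equioscillates at three points. Indeed,
$$x^2 - \bigl(x - \tfrac{1}{8}\bigr) = \bigl(x - \tfrac{1}{2}\bigr)^2 - \tfrac{1}{8},$$
which has norm $\tfrac18$ on $[0,1]$ and takes the values $+\tfrac18$, $-\tfrac18$, $+\tfrac18$ at $x = 0, \tfrac12, 1$. That's an alternating set containing $3 = n + 2$ points for $n = 1$, so Theorem 3 applies.)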
Essentially repeating the proof given for Theorem 3 yields a lower bound for $E_n(f)$.

Theorem. Let $f \in C[a,b]$, and suppose that $q \in \mathcal{P}_n$ is such that $f(x_i) - q(x_i)$ alternates
in sign at $n + 2$ points $a \le x_0 < x_1 < \cdots < x_{n+1} \le b$. Then,
$$E_n(f) \ge \min_{i=0,\ldots,n+1} |f(x_i) - q(x_i)|.$$

Proof. If the inequality fails, then the best approximation $p = p_n^*$ would satisfy
$$\max_{0 \le i \le n+1} |f(x_i) - p(x_i)| \le E_n(f) < \min_{0 \le i \le n+1} |f(x_i) - q(x_i)|.$$
Now we could repeat (essentially) the same argument used in the proof of Theorem 3 to
arrive at a contradiction. The details are left as an exercise.
Even for relatively simple functions, the problem of actually finding the polynomial
of best approximation is genuinely difficult (even computationally). We end this section
by stating two important problems that Chebyshev was able to solve.

Problem
Find the polynomial $p_{n-1}^* \in \mathcal{P}_{n-1}$, of degree at most $n - 1$, that best approximates
$f(x) = x^n$ on the interval $[-1,1]$. (This particular choice of interval makes for a tidy
solution; we'll discuss the general situation later.)

Since $p_{n-1}^*$ is to minimize $\max_{|x| \le 1} |x^n - p_{n-1}(x)|$, our first problem is equivalent to:

Problem
Find the monic polynomial of degree $n$ which deviates least from 0 on $[-1,1]$. In other
words, find the monic polynomial of degree $n$ which has smallest norm in $C[-1,1]$.
We'll give two solutions to this problem (which we know has a unique solution, of
course). First, let's simplify our notation. We write
$$p(x) = x^n - p_{n-1}^*(x) \quad \text{(the solution)}$$
and
$$M = \|p\| = E_{n-1}(x^n;\ [-1,1]).$$
All we know about $p$ is that it has an alternating set $-1 \le x_0 < x_1 < \cdots < x_n \le 1$
containing $(n-1) + 2 = n + 1$ points; that is, $|p(x_i)| = M$ and $p(x_{i+1}) = -p(x_i)$ for all $i$.
Using this tiny bit of information, Chebyshev was led to compare the polynomials $p^2$ and
$p'$. Watch closely!

Step 1. At any $x_i$ in $(-1,1)$, we must have $p'(x_i) = 0$ (because $p(x_i)$ is a relative extreme
value for $p$). But, $p'$ is a polynomial of degree $n - 1$ and so can have at most $n - 1$ zeros.
Thus, we must have
$$x_i \in (-1,1) \ \text{and} \ p'(x_i) = 0 \qquad \text{for } i = 1, \ldots, n-1$$
(in fact, $x_1, \ldots, x_{n-1}$ are all the zeros of $p'$), and
$$x_0 = -1, \quad p'(x_0) \ne 0; \qquad x_n = 1, \quad p'(x_n) \ne 0.$$
Step 2. Now consider the polynomial $M^2 - p^2 \in \mathcal{P}_{2n}$. We know that $M^2 - (p(x_i))^2 = 0$
for $i = 0, 1, \ldots, n$, and that $M^2 - p^2 \ge 0$ on $[-1,1]$. Thus, $x_1, \ldots, x_{n-1}$ must be double
roots (at least) of $M^2 - p^2$. But this makes for $2(n-1) + 2 = 2n$ roots already, so we
must have them all. Hence, $x_1, \ldots, x_{n-1}$ are double roots, $x_0$ and $x_n$ are simple roots, and
these are all the roots of $M^2 - p^2$.

Step 3. Next consider $(p')^2 \in \mathcal{P}_{2(n-1)}$. We know that $(p')^2$ has a double root at each
of $x_1, \ldots, x_{n-1}$ (and no other roots), hence $(1 - x^2)(p'(x))^2$ has a double root at each of
$x_1, \ldots, x_{n-1}$, and simple roots at $x_0$ and $x_n$. Since $(1 - x^2)(p'(x))^2 \in \mathcal{P}_{2n}$, we've found all
of its roots.
Here's the point to all this "rooting":

Step 4. Since $M^2 - (p(x))^2$ and $(1 - x^2)(p'(x))^2$ are polynomials of the same degree with
the same roots, they are, up to a constant multiple, the same polynomial! It's easy to see
what constant, too: The leading coefficient of $p$ is 1 while the leading coefficient of $p'$ is
$n$; thus,
$$M^2 - (p(x))^2 = \frac{(1 - x^2)(p'(x))^2}{n^2}.$$
After tidying up,
$$\frac{p'(x)}{\sqrt{M^2 - (p(x))^2}} = \frac{n}{\sqrt{1 - x^2}}.$$
We really should have an extra $\pm$ here, but we know that $p'$ is positive on some interval;
we'll simply assume that it's positive on $[-1, x_1]$. Now, upon integrating,
$$\arccos \frac{p(x)}{M} = n \arccos x + C,$$
or
$$p(x) = M \cos(n \arccos x + C).$$
But $p(-1) = -M$ (because $p'(-1) \ge 0$), so
$$\cos(n\pi + C) = -1 \implies C = m\pi \ \ (\text{with } n + m \text{ odd})
\implies p(x) = \pm M \cos(n \arccos x)
\implies p(\cos\theta) = \pm M \cos n\theta.$$
Look familiar? Since we know that $\cos n\theta$ is a polynomial of degree $n$ in $\cos\theta$ with leading coeffi-
cient $2^{n-1}$ (the $n$-th Chebyshev polynomial $T_n$), and since $p$ is monic (so the $+$ sign must hold), the solution to our problem must be
$$p(x) = 2^{-n+1}\, T_n(x).$$
Since $|T_n(x)| \le 1$ for $|x| \le 1$ (why?), the minimum norm is $M = 2^{-n+1}$.
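A quick numerical confirmation (ours, not from the notes): the monic multiple of $T_n$ really does have norm $2^{-n+1}$ on $[-1,1]$, while the naive monic $x^n$ deviates far more.

```python
# Check the minimal norm M = 2^(1-n) of the monic Chebyshev polynomial.
import numpy as np

n = 5
x = np.linspace(-1, 1, 10001)
Tn = np.cos(n * np.arccos(x))                # T_n(x) = cos(n arccos x) on [-1,1]
print(np.max(np.abs(2.0 ** (1 - n) * Tn)))   # 0.0625 = 2^(1-5)
print(np.max(np.abs(x ** n)))                # 1.0: the monic x^n is much worse
```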
Next we give a "fancy" solution, based on our characterization of best approximations
(Theorem 3) and a few simple properties of the Chebyshev polynomials.

Theorem. For any $n \ge 1$, the formula $p(x) = x^n - 2^{-n+1}\, T_n(x)$ defines a polynomial
$p \in \mathcal{P}_{n-1}$ satisfying
$$2^{-n+1} = \max_{|x| \le 1} |x^n - p(x)| < \max_{|x| \le 1} |x^n - q(x)|$$
for any other $q \in \mathcal{P}_{n-1}$.

Proof. We know that $2^{-n+1}\, T_n(x)$ has leading coefficient 1, and so $p \in \mathcal{P}_{n-1}$. Now set
$x_k = \cos((n-k)\pi/n)$ for $k = 0, 1, \ldots, n$. Then, $-1 = x_0 < x_1 < \cdots < x_n = 1$ and
$$T_n(x_k) = T_n(\cos((n-k)\pi/n)) = \cos((n-k)\pi) = (-1)^{n-k}.$$
Since $|T_n(x)| = |T_n(\cos\theta)| = |\cos n\theta| \le 1$ for $-1 \le x \le 1$, we've found an alternating set
for $T_n$ containing $n + 1$ points.
In other words, $x^n - p(x) = 2^{-n+1}\, T_n(x)$ satisfies $|x^n - p(x)| \le 2^{-n+1}$ and, for each
$k = 0, 1, \ldots, n$, has $x_k^n - p(x_k) = 2^{-n+1}\, T_n(x_k) = (-1)^{n-k}\, 2^{-n+1}$. By our characterization
of best approximations (Theorem 3), $p$ must be the best approximation to $x^n$ out of
$\mathcal{P}_{n-1}$.
Corollary. The monic polynomial of degree exactly $n$ having smallest norm in $C[a,b]$ is
$$\frac{(b-a)^n}{2^{2n-1}}\, T_n\Bigl( \frac{2x - b - a}{b - a} \Bigr).$$

Proof. Exercise. [Hint: If $p(x)$ is a polynomial of degree $n$ with leading coefficient 1,
then $\tilde p(x) = p((2x - b - a)/(b - a))$ is a polynomial of degree $n$ with leading coefficient
$2^n/(b-a)^n$. Moreover, $\max_{a \le x \le b} |\tilde p(x)| = \max_{-1 \le x \le 1} |p(x)|$.]
Properties of the Chebyshev Polynomials
As we've seen, the Chebyshev polynomial $T_n(x)$ is the (unique, real) polynomial of degree
$n$ (having leading coefficient 1 if $n = 0$, and $2^{n-1}$ if $n \ge 1$) such that $T_n(\cos\theta) = \cos n\theta$
for all $\theta$. The Chebyshev polynomials have dozens of interesting properties and satisfy all
sorts of curious equations. We'll catalogue just a few.
C1. $T_n(x) = 2x\, T_{n-1}(x) - T_{n-2}(x)$ for $n \ge 2$.

Proof. It follows from the trig identity $\cos n\theta = 2 \cos\theta \cos(n-1)\theta - \cos(n-2)\theta$ that
$T_n(\cos\theta) = 2\cos\theta\, T_{n-1}(\cos\theta) - T_{n-2}(\cos\theta)$ for all $\theta$; that is, the equation $T_n(x) = 2x\, T_{n-1}(x) - T_{n-2}(x)$ holds for all $-1 \le x \le 1$. But since both sides are polynomials,
equality must hold for all $x$.

The next two properties are proved in essentially the same way:

C2. $T_m(x)\, T_n(x) = \frac{1}{2} \bigl[ T_{m+n}(x) + T_{m-n}(x) \bigr]$ for $m > n$.

C3. $T_m(T_n(x)) = T_{mn}(x)$.

C4. $T_n(x) = \frac{1}{2} \bigl[ \bigl(x + \sqrt{x^2 - 1}\,\bigr)^n + \bigl(x - \sqrt{x^2 - 1}\,\bigr)^n \bigr]$.
Proof. First notice that the expression on the right-hand side is actually a polynomial,
since, on combining the binomial expansions of $(x + \sqrt{x^2-1}\,)^n$ and $(x - \sqrt{x^2-1}\,)^n$, the
odd powers of $\sqrt{x^2-1}$ cancel. Next, for $x = \cos\theta$,
$$T_n(x) = T_n(\cos\theta) = \cos n\theta = \tfrac{1}{2}\bigl( e^{in\theta} + e^{-in\theta} \bigr)
= \tfrac{1}{2}\bigl[ (\cos\theta + i\sin\theta)^n + (\cos\theta - i\sin\theta)^n \bigr]$$
$$= \tfrac{1}{2}\Bigl[ \bigl(x + i\sqrt{1-x^2}\,\bigr)^n + \bigl(x - i\sqrt{1-x^2}\,\bigr)^n \Bigr]
= \tfrac{1}{2}\Bigl[ \bigl(x + \sqrt{x^2-1}\,\bigr)^n + \bigl(x - \sqrt{x^2-1}\,\bigr)^n \Bigr].$$
We've shown that these two polynomials agree for $|x| \le 1$, hence they must agree for all $x$
(real or complex, for that matter).
For real $x$ with $|x| \ge 1$, the expression $\tfrac{1}{2}\bigl[ (x + \sqrt{x^2-1}\,)^n + (x - \sqrt{x^2-1}\,)^n \bigr]$ equals
$\cosh(n \cosh^{-1} x)$. In other words,

C5. $T_n(\cosh x) = \cosh nx$ for all real $x$.

The next property also follows from property C4.

C6. $|T_n(x)| \le \bigl( |x| + \sqrt{x^2-1}\, \bigr)^n$ for $|x| \ge 1$.
An approach similar to the proof of property C4 allows us to write $x^n$ in terms of the
Chebyshev polynomials $T_0, T_1, \ldots, T_n$.

C7. For $n$ odd, $\displaystyle 2^n x^n = \sum_{k=0}^{[n/2]} \binom{n}{k}\, 2\, T_{n-2k}(x)$; for $n$ even, the final term $2\,T_0$ should be replaced by $T_0$.

Proof. For $-1 \le x \le 1$,
$$2^n x^n = 2^n (\cos\theta)^n = \bigl( e^{i\theta} + e^{-i\theta} \bigr)^n$$
$$= e^{in\theta} + \binom{n}{1} e^{i(n-2)\theta} + \binom{n}{2} e^{i(n-4)\theta} + \cdots
+ \binom{n}{n-2} e^{-i(n-4)\theta} + \binom{n}{n-1} e^{-i(n-2)\theta} + e^{-in\theta}$$
$$= 2\cos n\theta + \binom{n}{1}\, 2\cos(n-2)\theta + \binom{n}{2}\, 2\cos(n-4)\theta + \cdots$$
$$= 2\,T_n(x) + \binom{n}{1}\, 2\,T_{n-2}(x) + \binom{n}{2}\, 2\,T_{n-4}(x) + \cdots,$$
where, if $n$ is even, the last term in this last sum is $\binom{n}{[n/2]} T_0$ (since the central term in the
binomial expansion, namely $\binom{n}{n/2} T_0$, isn't doubled in this case). Since both sides are
polynomials, the identity holds for all $x$.
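For instance, taking $n = 3$ and $n = 4$ in C7 (note the un-doubled central term in the even case):
$$2^3 x^3 = \binom{3}{0}\, 2\,T_3(x) + \binom{3}{1}\, 2\,T_1(x) = 2(4x^3 - 3x) + 6x = 8x^3,$$
$$2^4 x^4 = \binom{4}{0}\, 2\,T_4(x) + \binom{4}{1}\, 2\,T_2(x) + \binom{4}{2}\, T_0(x) = 2(8x^4 - 8x^2 + 1) + 8(2x^2 - 1) + 6 = 16x^4.$$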

C8. The zeros of $T_n$ are $x_k^{(n)} = \cos((2k-1)\pi/2n)$, $k = 1, \ldots, n$. They're real, simple, and
lie in the open interval $(-1,1)$.

Proof. Just check! But notice, please, that the zeros are listed here in decreasing order
(because cosine decreases).

C9. Between two consecutive zeros of $T_n$, there is precisely one zero of $T_{n-1}$.

Proof. It's not hard to check that
$$\frac{2k-1}{2n} < \frac{2k-1}{2(n-1)} < \frac{2k+1}{2n}$$
for $k = 1, \ldots, n-1$, which means that $x_k^{(n)} > x_k^{(n-1)} > x_{k+1}^{(n)}$.

C10. $T_n$ and $T_{n-1}$ have no common zeros.

Proof. Although this is immediate from property C9, there's another way to see it:
$T_n(x_0) = 0 = T_{n-1}(x_0)$ implies that $T_{n-2}(x_0) = 0$ by property C1. Repeating this
observation, we would have $T_k(x_0) = 0$ for every $k < n$, including $k = 0$. No good!
$T_0(x) = 1$ has no zeros.
C11. The set $\{\, x_k^{(n)} : 1 \le k \le n,\ n = 1, 2, \ldots \}$ is dense in $[-1,1]$.

Proof. Since $\cos x$ is (strictly) monotone on $[0,\pi]$, it's enough to know that the set
$\{(2k-1)\pi/2n\}_{k \le n}$ is dense in $[0,\pi]$, and for this it's enough to know that $\{(2k-1)/2n\}_{k \le n}$
is dense in $[0,1]$. (Why?) But
$$\frac{2k-1}{2n} = \frac{k}{n} - \frac{1}{2n} \approx \frac{k}{n}$$
for $n$ large; that is, the set $\{(2k-1)/2n\}_{k \le n}$ is dense among the rationals in $[0,1]$.
It's interesting to note here that the distribution of the roots $\{x_k^{(n)}\}_{k \le n}$ can be esti-
mated (see Natanson, Constructive Function Theory, Vol. I, pp. 48-51). For large $n$, the
number of roots of $T_n$ that lie in an interval $[x, x + \Delta x] \subset [-1,1]$ is approximately
$$\frac{n\, \Delta x}{\pi \sqrt{1 - x^2}}.$$
In particular, for $n$ large, the roots of $T_n$ are "thickest" near the endpoints $\pm 1$.
In probabilistic terms, this means that if we assign equal probability to each of the
roots $x_1^{(n)}, \ldots, x_n^{(n)}$ (that is, if we think of each root as the position of a point with mass
$1/n$), then the density of this probability distribution (or the density of the system of point
masses) at a point $x$ is approximately $1/(\pi\sqrt{1 - x^2}\,)$ for large $n$. In still other words, this
tells us that the probability that a root of $T_n$ lies in the interval $[a,b]$ is approximately
$$\frac{1}{\pi} \int_a^b \frac{dx}{\sqrt{1 - x^2}}.$$
C12. The Chebyshev polynomials are mutually orthogonal relative to the weight $w(x) = (1 - x^2)^{-1/2}$ on $[-1,1]$.

Proof. For $m \ne n$ the substitution $x = \cos\theta$ yields
$$\int_{-1}^{1} T_n(x)\, T_m(x)\, \frac{dx}{\sqrt{1 - x^2}} = \int_0^\pi \cos m\theta \cos n\theta \, d\theta = 0,$$
while for $m = n$ we get
$$\int_{-1}^{1} T_n(x)^2\, \frac{dx}{\sqrt{1 - x^2}} = \int_0^\pi \cos^2 n\theta \, d\theta =
\begin{cases} \pi & \text{if } n = 0, \\ \pi/2 & \text{if } n > 0. \end{cases}$$
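These integrals are easy to confirm numerically (our own check, not part of the notes); after the substitution $x = \cos\theta$ they become plain integrals over $[0, \pi]$:

```python
# Midpoint-rule check of the orthogonality relations in C12.
import numpy as np

N = 200000
theta = (np.arange(N) + 0.5) * np.pi / N     # midpoints of a partition of [0, pi]

def inner(m, n):
    return np.sum(np.cos(m * theta) * np.cos(n * theta)) * (np.pi / N)

print(inner(2, 3))   # ~0        (m != n: orthogonal)
print(inner(3, 3))   # ~1.5708   (= pi/2 for n > 0)
print(inner(0, 0))   # ~3.1416   (= pi for n = 0)
```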

C13. $|T_n'(x)| \le n^2$ for $-1 \le x \le 1$, and $|T_n'(\pm 1)| = n^2$.

Proof. For $-1 < x < 1$ we have
$$\frac{d}{dx} T_n(x) = \frac{\frac{d}{d\theta} T_n(\cos\theta)}{\frac{d}{d\theta} \cos\theta} = \frac{n \sin n\theta}{\sin\theta}.$$
Thus, $|T_n'(x)| \le n^2$ because $|\sin n\theta| \le n\,|\sin\theta|$ (which can be easily checked by induction,
for example). At $x = \pm 1$, we interpret this derivative formula as a limit (as $\theta \to 0$ and
$\theta \to \pi$) and find that $|T_n'(\pm 1)| = n^2$.
As we'll see later, each $p \in \mathcal{P}_n$ satisfies $|p'(x)| \le \|p\|\, n^2 = \|p\|\, T_n'(1)$ for $-1 \le x \le 1$,
and this is, of course, best possible. As it happens, $T_n(x)$ has the largest possible rate of
growth outside of $[-1,1]$ among all polynomials of degree $n$. Specifically:

Theorem. Let $p \in \mathcal{P}_n$ and let $\|p\| = \max_{-1 \le x \le 1} |p(x)|$. Then, for any $x_0$ with $|x_0| \ge 1$ and
any $k = 0, 1, \ldots, n$, we have
$$|p^{(k)}(x_0)| \le \|p\|\, |T_n^{(k)}(x_0)|,$$
where $p^{(k)}$ is the $k$-th derivative of $p$.

We'll prove only the case $k = 0$. In other words, we'll check that $|p(x_0)| \le \|p\|\,|T_n(x_0)|$.
The more general case is in Rivlin, Theorem 1.10, p. 31.
Proof. Since all the zeros of $T_n$ lie in $(-1,1)$, we know that $T_n(x_0) \ne 0$. Thus, we may
consider the polynomial
$$q(x) = \frac{p(x_0)}{T_n(x_0)}\, T_n(x) - p(x) \in \mathcal{P}_n.$$
If the claim is false, then
$$\|p\| < \biggl| \frac{p(x_0)}{T_n(x_0)} \biggr|.$$
Now at each of the points $y_k = \cos(k\pi/n)$, $k = 0, 1, \ldots, n$, we have $T_n(y_k) = (-1)^k$ and,
hence,
$$q(y_k) = (-1)^k\, \frac{p(x_0)}{T_n(x_0)} - p(y_k).$$
Since $|p(y_k)| \le \|p\|$, it follows that $q$ alternates in sign at these $n + 1$ points. In particular,
$q$ must have at least $n$ zeros in $(-1,1)$. But $q(x_0) = 0$, by design, and $|x_0| \ge 1$. That is,
we've found $n + 1$ zeros for a polynomial of degree $n$. So, $q \equiv 0$; that is,
$$p(x) = \frac{p(x_0)}{T_n(x_0)}\, T_n(x).$$
But then,
$$|p(1)| = \biggl| \frac{p(x_0)}{T_n(x_0)} \biggr| > \|p\|,$$
since $T_n(1) = T_n(\cos 0) = 1$, which is a contradiction.
Corollary. Let $p \in \mathcal{P}_n$ and let $\|p\| = \max_{-1 \le x \le 1} |p(x)|$. Then, for any $x_0$ with $|x_0| \ge 1$, we
have
$$|p(x_0)| \le \|p\| \bigl( |x_0| + \sqrt{x_0^2 - 1}\, \bigr)^n.$$

Rivlin's proof of our last Theorem in the general case uses the following observation:

C14. For $x \ge 1$ and $k = 0, 1, \ldots, n$, we have $T_n^{(k)}(x) > 0$.

Proof. Exercise. [Hint: It follows from Rolle's theorem that $T_n^{(k)}$ is never zero for
$x \ge 1$. (Why?) Now just compute $T_n^{(k)}(1)$.]
Uniform Approximation by Trig Polynomials
We end this section by summarizing (without proofs) the analogues of Theorems 1-3
for uniform approximation by trig polynomials. Throughout, $f \in C^{2\pi}$ and $\mathcal{T}_n$ denotes the
collection of trig polynomials of degree at most $n$.
1. $f$ has a best approximation $T^* \in \mathcal{T}_n$.
2. $f - T^*$ has an alternating set containing $2n+2$ (or more) points in $[0, 2\pi)$. (Note here
that $2n + 2 = 1 + \dim \mathcal{T}_n$.)
3. $T^*$ is unique.
4. If $T \in \mathcal{T}_n$ is such that $f - T$ has an alternating set containing $2n+2$ or more points
in $[0, 2\pi)$, then $T = T^*$.
The proofs of 1-4 are very similar to the corresponding results for algebraic polyno-
mials. As you might imagine, 2 is where all the fighting takes place, and there are a few
technical difficulties to cope with. Nevertheless, we'll swallow these facts whole and apply
them with a clear conscience to a few examples.
Example
For $m > n$, the best approximation to $f(x) = A \cos mx + B \sin mx$ out of $\mathcal{T}_n$ is 0!

Proof. We may write $f(x) = R \cos m(x - x_0)$ for some $R$ and $x_0$. (How?) Now we need
only display a sufficiently large alternating set for $f$ (in some interval of length $2\pi$).
Setting $x_k = x_0 + k\pi/m$, $k = 1, 2, \ldots, 2m$, we get $f(x_k) = R \cos k\pi = R(-1)^k$ and
$x_k \in (x_0, x_0 + 2\pi]$. Since $m > n$, it follows that $2m \ge 2n + 2$.
Example
The best approximation to
$$f(x) = a_0 + \sum_{k=1}^{n+1} \bigl( a_k \cos kx + b_k \sin kx \bigr)$$
out of $\mathcal{T}_n$ is
$$T(x) = a_0 + \sum_{k=1}^{n} \bigl( a_k \cos kx + b_k \sin kx \bigr),$$
and $\|f - T\| = \sqrt{a_{n+1}^2 + b_{n+1}^2}$ in $C^{2\pi}$.

Proof. By our last example, the best approximation to $f - T$ out of $\mathcal{T}_n$ is 0, hence $T$
must be the best approximation to $f$. (Why?) The last assertion is easy to check: Since
we can always write $A \cos mx + B \sin mx = \sqrt{A^2 + B^2}\, \cos m(x - x_0)$, for some $x_0$, it
follows that $\|f - T\| = \sqrt{a_{n+1}^2 + b_{n+1}^2}$.

Finally, let's make a simple connection between the two types of polynomial approxi-
mation:

Theorem. Let $f \in C[-1,1]$ and define $\varphi \in C^{2\pi}$ by $\varphi(\theta) = f(\cos\theta)$. Then,
$$E_n(f) = \min_{p \in \mathcal{P}_n} \|f - p\| = \min_{T \in \mathcal{T}_n} \|\varphi - T\| \equiv E_n^T(\varphi).$$

Proof. Suppose that $p^*(x) = \sum_{k=0}^{n} a_k x^k$ is the best approximation to $f$ out of $\mathcal{P}_n$.
Then, $\widehat{T}(\theta) = p^*(\cos\theta)$ is in $\mathcal{T}_n$ and, clearly,
$$\max_{-1 \le x \le 1} |f(x) - p^*(x)| = \max_{0 \le \theta \le 2\pi} |f(\cos\theta) - p^*(\cos\theta)|.$$
Thus, $\|f - p^*\| = \|\varphi - \widehat{T}\| \ge \min_{T \in \mathcal{T}_n} \|\varphi - T\|$.
On the other hand, since $\varphi$ is even, we know that $T^*$, its best approximation out of $\mathcal{T}_n$,
is also even. Thus, $T^*(\theta) = q(\cos\theta)$ for some algebraic polynomial $q \in \mathcal{P}_n$. Consequently,
$\|\varphi - T^*\| = \|f - q\| \ge \min_{p \in \mathcal{P}_n} \|f - p\|$.
Remarks
1. Once we know that $\min_{p \in \mathcal{P}_n} \|f - p\| = \min_{T \in \mathcal{T}_n} \|\varphi - T\|$, it follows that we must also have
$T^*(\theta) = p^*(\cos\theta)$.
2. Each even $\varphi \in C^{2\pi}$ corresponds to an $f \in C[-1,1]$ by $f(x) = \varphi(\arccos x)$ and, of
course, the conclusions of the Theorem and of Remark 1 hold in this case, too.
3. Whenever we speak of even trig polynomials, the Chebyshev polynomials are lurking
somewhere in the background. Indeed, let $T(\theta)$ be an even trig polynomial, write
$x = \cos\theta$, as usual, and consider the following cryptic equation:
$$T(\theta) = \sum_{k=0}^{n} a_k \cos k\theta = \sum_{k=0}^{n} a_k T_k(\cos\theta) = p(\cos\theta),$$
where $p(x) = \sum_{k=0}^{n} a_k T_k(x) \in \mathcal{P}_n$.
Problem Set: Chebyshev Polynomials
Math 682 5/28/98


We've shown that $\cos n\theta$ and $\sin(n+1)\theta / \sin\theta$ can be written as algebraic polynomials
of degree $n$ in $\cos\theta$; we use this observation to define the Chebyshev polynomials. The
Chebyshev polynomials of the first kind $(T_n(x))$ are defined by $T_n(\cos\theta) = \cos n\theta$, for
$n = 0, 1, 2, \ldots$, while the Chebyshev polynomials of the second kind $(U_n(x))$ are defined
by $U_n(\cos\theta) = \sin(n+1)\theta / \sin\theta$ for $n = 0, 1, 2, \ldots$.
. 44. Establish the following properties of $T_n(x)$.
(i) $T_0(x) = 1$, $T_1(x) = x$, and $T_n(x) = 2x\,T_{n-1}(x) - T_{n-2}(x)$ for $n \ge 2$.
(ii) $T_n(x)$ is a polynomial of degree $n$ having leading coefficient $2^{n-1}$ for $n \ge 1$, and
containing only even (resp., odd) powers of $x$ if $n$ is even (resp., odd).
(iii) $|T_n(x)| \le 1$ for $-1 \le x \le 1$; when does equality occur? Where are the zeros of
$T_n(x)$? Show that between two consecutive zeros of $T_n(x)$ there is exactly one
zero of $T_{n-1}(x)$. Can $T_n(x)$ and $T_{n-1}(x)$ have a common zero?
(iv) $|T_n'(x)| \le n^2$ for $-1 \le x \le 1$, and $|T_n'(\pm 1)| = n^2$.
(v) $T_m(x)\,T_n(x) = \frac{1}{2}\bigl[ T_{m+n}(x) + T_{m-n}(x) \bigr]$ for $m > n$.
(vi) $T_m(T_n(x)) = T_{mn}(x)$.
(vii) Evaluate $\displaystyle\int_{-1}^{1} T_n(x)\, T_m(x)\, \frac{dx}{\sqrt{1 - x^2}}$.
(viii) Show that $T_n$ is a solution to $(1 - x^2)\, y'' - x\, y' + n^2 y = 0$.
(ix) $T_n(x) = \frac{1}{2}\bigl[ \bigl(x + \sqrt{x^2-1}\,\bigr)^n + \bigl(x - \sqrt{x^2-1}\,\bigr)^n \bigr]$ for any $x$, real or complex.
(x) $\mathrm{Re}\Bigl( \sum_{n=0}^{\infty} t^n e^{in\theta} \Bigr) = \sum_{n=0}^{\infty} t^n \cos n\theta = \dfrac{1 - t\cos\theta}{1 - 2t\cos\theta + t^2}$ for $-1 < t < 1$; that is,
$\sum_{n=0}^{\infty} t^n T_n(x) = \dfrac{1 - tx}{1 - 2tx + t^2}$ (this is a generating function for $T_n$; it's closely
related to the Poisson kernel).
(xi) Find analogues of (i)-(x) (if possible) for $U_n(x)$.
. 45. Show that every $p \in \mathcal{P}_n$ has a unique representation as $p = a_0 + a_1 T_1 + \cdots + a_n T_n$.
Find this representation in the case $p(x) = x^n$.
. 46. The polynomial of degree $n$ having leading coefficient 1 and deviating least from 0 on
$[-1,1]$ is given by $T_n(x)/2^{n-1}$. On an arbitrary interval $[a,b]$ we would instead take
$$\frac{(b-a)^n}{2^{2n-1}}\, T_n\Bigl( \frac{2x - b - a}{b - a} \Bigr).$$
Is this solution unique? Explain.
47. If $p$ is a polynomial on $[a,b]$ of degree $n$ having leading coefficient $a_n > 0$, then
$\|p\| \ge a_n (b-a)^n / 2^{2n-1}$. If $b - a \ge 4$, then no polynomial of degree exactly $n$ with
integer coefficients can satisfy $\|p\| < 2$ (compare this with problem 26 on the "Uniform
Approximation by Polynomials" problem set).
48. Given $p \in \mathcal{P}_n$, show that $|p(x)| \le \|p\|\,|T_n(x)|$ for $|x| > 1$.
49. If $p \in \mathcal{P}_n$ with $\|p\| = 1$ on $[-1,1]$, and if $|p(x_i)| = 1$ at $n + 1$ distinct points $x_0, \ldots, x_n$
in $[-1,1]$, show that either $p \equiv \pm 1$, or else $p = \pm T_n$. [Hint: One approach is to
compare the polynomials $1 - p^2$ and $(1 - x^2)(p')^2$.]
50. Compute $T_n^{(k)}(1)$ for $k = 0, 1, \ldots, n$, where $T_n^{(k)}$ is the $k$-th derivative of $T_n$. For $x \ge 1$
and $k = 0, 1, \ldots, n$, show that $T_n^{(k)}(x) > 0$.
Examples: Chebyshev Polynomials in Practice
Math 682 5/28/98


The following examples are cribbed from the book Chebyshev Polynomials, by L. Fox and
I. B. Parker (Oxford University Press, 1968).
As we've seen, the Chebyshev polynomials can be generated by a recurrence relation. By
reversing the procedure, we could solve for $x^n$ in terms of $T_0, T_1, \ldots, T_n$ (we'll do this
calculation in class). Here are the first few terms in each of these relations:
$$T_0(x) = 1 \qquad\qquad 1 = T_0(x)$$
$$T_1(x) = x \qquad\qquad x = T_1(x)$$
$$T_2(x) = 2x^2 - 1 \qquad\qquad x^2 = (T_0(x) + T_2(x))/2$$
$$T_3(x) = 4x^3 - 3x \qquad\qquad x^3 = (3\,T_1(x) + T_3(x))/4$$
$$T_4(x) = 8x^4 - 8x^2 + 1 \qquad\qquad x^4 = (3\,T_0(x) + 4\,T_2(x) + T_4(x))/8$$
$$T_5(x) = 16x^5 - 20x^3 + 5x \qquad\qquad x^5 = (10\,T_1(x) + 5\,T_3(x) + T_5(x))/16$$
Note the separation of even and odd terms in each case. Writing ordinary, garden-variety
polynomials in their equivalent Chebyshev form has some distinct advantages for numerical
computations. Here's why:

$$1 - x + x^2 - x^3 + x^4 = \tfrac{15}{8}\,T_0(x) - \tfrac{7}{4}\,T_1(x) + T_2(x) - \tfrac{1}{4}\,T_3(x) + \tfrac{1}{8}\,T_4(x)$$
(after some simplification). Now we see at once that we can get a cubic approximation to
$1 - x + x^2 - x^3 + x^4$ on $[-1,1]$ with error at most $1/8$ by simply dropping the $T_4$ term on
the right-hand side (since $|T_4(x)| \le 1$), whereas simply using $1 - x + x^2 - x^3$ as our cubic
approximation could cause an error as big as 1. Pretty slick! This gimmick of truncating
the equivalent Chebyshev form is called economization.
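NumPy's Chebyshev helpers will do this bookkeeping for us; here is a short sketch (our own, not from the notes) reproducing the computation above:

```python
# Economization: convert to Chebyshev form, then truncate the small terms.
import numpy as np
from numpy.polynomial import chebyshev as C

power_coefs = [1, -1, 1, -1, 1]            # 1 - x + x^2 - x^3 + x^4, low to high
cheb_coefs = C.poly2cheb(power_coefs)
print(cheb_coefs)                          # [1.875 -1.75 1. -0.25 0.125]

cubic = C.Chebyshev(cheb_coefs[:4])        # drop the T_4 term
x = np.linspace(-1, 1, 1001)
quartic = np.polyval(power_coefs[::-1], x)
print(np.max(np.abs(quartic - cubic(x))))  # 0.125, exactly the dropped coefficient
```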
As a second example we note that a polynomial with small norm on $[-1,1]$ may have
annoyingly large coefficients:
$$(1 - x^2)^{10} = 1 - 10x^2 + 45x^4 - 120x^6 + 210x^8 - 252x^{10}
+ 210x^{12} - 120x^{14} + 45x^{16} - 10x^{18} + x^{20},$$
but in Chebyshev form (look out!):
$$(1 - x^2)^{10} = \tfrac{1}{524288} \bigl[\, 92378\,T_0(x) - 167960\,T_2(x) + 125970\,T_4(x) - 77520\,T_6(x)$$
$$+\ 38760\,T_8(x) - 15504\,T_{10}(x) + 4845\,T_{12}(x) - 1140\,T_{14}(x)$$
$$+\ 190\,T_{16}(x) - 20\,T_{18}(x) + T_{20}(x) \,\bigr].$$
The largest coefficient is now only about 0.3, and the omission of the last three terms
produces a maximum error of about 0.0004. Not bad.
As a last example, consider the Taylor polynomial $e^x = \sum_{k=0}^{n} x^k/k! + x^{n+1} e^{\xi}/(n+1)!$
(with remainder), where $-1 \le x, \xi \le 1$. Taking $n = 6$, the truncated series has error no
greater than $e/7! \approx 0.0005$. But if we "economize" the first six terms, then:
$$\sum_{k=0}^{6} x^k/k! = 1.26606\,T_0(x) + 1.13021\,T_1(x) + 0.27148\,T_2(x) + 0.04427\,T_3(x)$$
$$+\ 0.00547\,T_4(x) + 0.00052\,T_5(x) + 0.00004\,T_6(x).$$
The initial approximation already has an error of about 0.0005, so we can certainly drop
the $T_6$ term without any additional error. Even dropping the $T_5$ term causes an error of
no more than 0.001 (or thereabouts). The resulting approximation has a far smaller error
than the corresponding truncated Taylor series: $e/5! \approx 0.023$.
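The quoted coefficients are easy to reproduce (our own check, not part of the notes): convert the degree-6 Taylor polynomial to Chebyshev form.

```python
# Chebyshev form of the degree-6 Taylor polynomial of e^x.
import math
from numpy.polynomial import chebyshev as C

taylor = [1 / math.factorial(k) for k in range(7)]   # 1, 1, 1/2, ..., 1/720
print(C.poly2cheb(taylor))
# approximately [1.26606 1.13021 0.27148 0.04427 0.00547 0.00052 0.00004]
```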
The approach used in our last example has the decided disadvantage that we must first
decide where to truncate the Taylor series, which might converge very slowly. A better
approach would be to write $e^x$ as a series involving Chebyshev polynomials directly. That
is, if possible, we want to write $e^x = \sum_{k=0}^{\infty} a_k T_k(x)$. If the $a_k$'s are absolutely summable,
it will be very easy to estimate any truncation error. We'll get some idea on how to go
about this when we talk about "least-squares" approximation. As it happens, such a series
is easy to find (it's rather like a Fourier series), and its partial sums are remarkably good
uniform approximations.
A Brief Introduction to Interpolation
Math 682 6/2/98

Our goal in this section is to prove the following result (as well as discuss its ramifications).
In fact, this result is so fundamental that we will present three proofs!

Theorem. Let $x_0, x_1, \ldots, x_n$ be distinct points, and let $y_0, y_1, \ldots, y_n$ be arbitrary points
in $\mathbb{R}$. Then, there exists a unique polynomial $p \in \mathcal{P}_n$ satisfying $p(x_i) = y_i$, $i = 0, 1, \ldots, n$.

First notice that uniqueness is obvious. Indeed, if two polynomials $p, q \in \mathcal{P}_n$ agree at
$n + 1$ points, then $p \equiv q$. (Why?) The real work comes in proving existence.
First Proof. (Vandermonde's determinant.) We seek $c_0, c_1, \ldots, c_n$ so that $p(x) = \sum_{k=0}^{n} c_k x^k$ satisfies
$$p(x_i) = \sum_{k=0}^{n} c_k x_i^k = y_i, \qquad i = 0, 1, \ldots, n.$$
That is, we need to solve a system of $n + 1$ linear equations for the $c_i$'s. In matrix form:
$$\begin{bmatrix} 1 & x_0 & x_0^2 & \cdots & x_0^n \\ 1 & x_1 & x_1^2 & \cdots & x_1^n \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & x_n & x_n^2 & \cdots & x_n^n \end{bmatrix}
\begin{bmatrix} c_0 \\ c_1 \\ \vdots \\ c_n \end{bmatrix} =
\begin{bmatrix} y_0 \\ y_1 \\ \vdots \\ y_n \end{bmatrix}.$$
This equation always has a unique solution because the coefficient matrix has determinant
$$D = \prod_{0 \le j < i \le n} (x_i - x_j) \ne 0.$$
($D$ is called Vandermonde's determinant; note that $D > 0$ if $x_0 < x_1 < \cdots < x_n$.) Since
this fact is of independent interest, we'll sketch a short proof below.
Lemma. $D = \prod_{0 \le j < i \le n} (x_i - x_j)$.

Proof. Consider
$$V(x_0, x_1, \ldots, x_{n-1}, x) = \begin{vmatrix} 1 & x_0 & x_0^2 & \cdots & x_0^n \\ 1 & x_1 & x_1^2 & \cdots & x_1^n \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & x_{n-1} & x_{n-1}^2 & \cdots & x_{n-1}^n \\ 1 & x & x^2 & \cdots & x^n \end{vmatrix}.$$
$V(x_0, x_1, \ldots, x_{n-1}, x)$ is a polynomial of degree $n$ in $x$, and it's 0 whenever $x = x_i$, $i = 0, 1, \ldots, n-1$. Thus, $V(x_0, \ldots, x) = c \prod_{i=0}^{n-1} (x - x_i)$, by comparing roots and degree.
However, it's easy to see that the coefficient of $x^n$ in $V(x_0, \ldots, x)$ is $V(x_0, \ldots, x_{n-1})$.
Thus, $V(x_0, \ldots, x) = V(x_0, \ldots, x_{n-1}) \prod_{i=0}^{n-1} (x - x_i)$. The result now follows by induction
and the obvious case
$$\begin{vmatrix} 1 & x_0 \\ 1 & x_1 \end{vmatrix} = x_1 - x_0.$$
Second Proof. (Lagrange interpolation.) We could define $p$ immediately if we had
polynomials $\ell_i(x) \in \mathcal{P}_n$, $i = 0, \ldots, n$, such that $\ell_i(x_j) = \delta_{ij}$ (where $\delta_{ij}$ is Kronecker's
delta; that is, $\delta_{ij} = 0$ for $i \ne j$, and $\delta_{ij} = 1$ for $i = j$). Indeed, $p(x) = \sum_{i=0}^{n} y_i\, \ell_i(x)$
would then work as our interpolating polynomial. In short, notice that the polynomials
$\{\ell_0, \ell_1, \ldots, \ell_n\}$ would form a (particularly convenient) basis for $\mathcal{P}_n$.
We'll give two formulas for $\ell_i(x)$:
(a). Clearly, $\displaystyle \ell_i(x) = \prod_{j \ne i} \frac{x - x_j}{x_i - x_j}$ works.
(b). Start with $W(x) = (x - x_0)(x - x_1) \cdots (x - x_n)$, and notice that the polynomial we
need satisfies
$$\ell_i(x) = a_i\, \frac{W(x)}{x - x_i}$$
for some $a_i \in \mathbb{R}$. (Why?) But then, $1 = \ell_i(x_i) = a_i\, W'(x_i)$ (again, why?); that is,
$$\ell_i(x) = \frac{W(x)}{(x - x_i)\, W'(x_i)}.$$
Please note that $\ell_i(x)$ is a multiple of the polynomial $\prod_{j \ne i} (x - x_j)$, for $i = 0, \ldots, n$, and
that $p(x)$ is then a suitable linear combination of such polynomials.
Third Proof. (Newton's formula.) We seek $p(x)$ of the form
$$p(x) = a_0 + a_1(x - x_0) + a_2(x - x_0)(x - x_1) + \cdots + a_n(x - x_0) \cdots (x - x_{n-1}).$$
(Please note that $x_n$ does not appear on the right-hand side.) This form makes it almost
effortless to solve for the $a_i$'s by plugging in the $x_i$'s successively:
$$y_0 = p(x_0) = a_0,$$
$$y_1 = p(x_1) = a_0 + a_1(x_1 - x_0) \implies a_1 = \frac{y_1 - a_0}{x_1 - x_0}.$$
Continuing, we find
$$a_2 = \frac{y_2 - a_0 - a_1(x_2 - x_0)}{(x_2 - x_0)(x_2 - x_1)},$$
$$a_3 = \frac{y_3 - a_0 - a_1(x_3 - x_0) - a_2(x_3 - x_0)(x_3 - x_1)}{(x_3 - x_0)(x_3 - x_1)(x_3 - x_2)},$$
and so on. [Natanson, Vol. III, gives another formula for the $a_i$'s.]
Example
As a quick means of comparing these three solutions, let's find the interpolating polynomial
(quadratic) passing through $(1,2)$, $(2,-1)$, and $(3,1)$. You're invited to check the following:
Vandermonde: $p(x) = 10 - \tfrac{21}{2}\, x + \tfrac{5}{2}\, x^2$.
Lagrange: $p(x) = (x-2)(x-3) + (x-1)(x-3) + \tfrac{1}{2}\,(x-1)(x-2)$.
Newton: $p(x) = 2 - 3(x-1) + \tfrac{5}{2}\,(x-1)(x-2)$.
As you might have already surmised, Lagrange's method is the easiest to apply by
hand, although Newton's formula has much to recommend it too (it's especially well-suited
to situations where we introduce additional nodes). We next set up the necessary notation
to discuss the finer points of Lagrange's method.
Given $n + 1$ distinct points $a \le x_0 < x_1 < \cdots < x_n \le b$ (sometimes called nodes), we
first form the polynomials
$$W(x) = \prod_{i=0}^{n} (x - x_i)$$
and
$$\ell_i(x) = \prod_{j \ne i} \frac{x - x_j}{x_i - x_j} = \frac{W(x)}{(x - x_i)\, W'(x_i)}.$$
The Lagrange interpolation formula is
$$L_n(f)(x) = \sum_{i=0}^{n} f(x_i)\, \ell_i(x).$$
That is, $L_n(f)$ is the unique polynomial in $\mathcal{P}_n$ that agrees with $f$ at the $x_i$'s. In particular,
notice that we must have $L_n(p) = p$ whenever $p \in \mathcal{P}_n$. In fact, $L_n$ is a linear projection
from $C[a,b]$ onto $\mathcal{P}_n$. [Why is $L_n(f)$ linear in $f$?]
Typically we're given (or construct) an array of nodes:
$$X = \begin{matrix} x_0^{(0)} & & & \\ x_0^{(1)} & x_1^{(1)} & & \\ x_0^{(2)} & x_1^{(2)} & x_2^{(2)} & \\ \vdots & \vdots & \vdots & \ddots \end{matrix}$$
and form the corresponding sequence of projections
$$L_n(f)(x) = \sum_{i=0}^{n} f(x_i^{(n)})\, \ell_i^{(n)}(x).$$
An easy (but admittedly pointless) observation is that for a given $f \in C[a,b]$ we can always
find an array $X$ so that $L_n(f) = p_n^*$, the polynomial of best approximation to $f$ out of $\mathcal{P}_n$
(since $f - p_n^*$ has $n + 1$ zeros, we may use these for the $x_i$'s). Thus, $\|L_n(f) - f\| = E_n(f) \to 0$
in this case. However, the problem of convergence changes character dramatically if we
first choose $X$ and then consider $L_n(f)$. In general, there's no reason to believe that $L_n(f)$
converges to $f$. In fact, quite the opposite is true:
Theorem. (Faber, 1914) Given any array of nodes $X$ in $[a,b]$, there is some $f \in C[a,b]$
for which $\|L_n(f) - f\|$ is unbounded.

The problem here has little to do with interpolation and everything to do with pro-
jections:

Theorem. (Kharshiladze, Lozinski, 1941) For each $n$, let $L_n$ be a continuous, linear
projection from $C[a,b]$ onto $\mathcal{P}_n$. Then, there is some $f \in C[a,b]$ for which $\|L_n(f) - f\|$
is unbounded.

Evidently, the operators $L_n$ aren't positive (monotone), for otherwise the Bohman-
Korovkin theorem (and the fact that $L_n$ is a projection onto $\mathcal{P}_n$) would imply that $L_n(f)$
converges uniformly to $f$ for every $f \in C[a,b]$.
The proofs of these theorems are long and difficult; we'll save them for another day.
(Some of you may recognize the Principle of Uniform Boundedness at work here.) The
real point here is that we can't have everything: A positive result about convergence of
interpolation will require that we impose some extra conditions on the functions $f$ we want
to approximate. As a first step in this direction, we prove that if $f$ has sufficiently many
derivatives, then the error $\|L_n(f) - f\|$ can at least be measured.
Theorem. Suppose that $f$ has $n + 1$ continuous derivatives on $[a,b]$. Let $a \le x_0 <
x_1 < \cdots < x_n \le b$, let $p \in \mathcal{P}_n$ be the polynomial that interpolates $f$ at the $x_i$'s, and let
$W(x) = \prod_{i=0}^{n} (x - x_i)$. Then,
$$\|f - p\| \le \frac{1}{(n+1)!}\, \|f^{(n+1)}\|\, \|W\|.$$

Proof. We'll prove the Theorem by showing that, given $x$ in $[a,b]$, there is a $\xi$ in $(a,b)$
with
$$f(x) - p(x) = \frac{1}{(n+1)!}\, f^{(n+1)}(\xi)\, W(x). \eqno(*)$$
If $x$ is one of the $x_i$'s, then both sides of this formula are 0 and we're done. Otherwise,
$W(x) \ne 0$ and we may set $\lambda = [f(x) - p(x)]/W(x)$. Now consider
$$\varphi(t) = f(t) - p(t) - \lambda\, W(t).$$
Clearly, $\varphi(x_i) = 0$ for each $i = 0, 1, \ldots, n$ and, by our choice of $\lambda$, we also have $\varphi(x) = 0$.
Here comes Rolle's theorem! Since $\varphi$ has $n + 2$ distinct zeros in $[a,b]$, we must have
$\varphi^{(n+1)}(\xi) = 0$ for some $\xi$ in $(a,b)$. (Why?) Hence,
$$0 = \varphi^{(n+1)}(\xi) = f^{(n+1)}(\xi) - p^{(n+1)}(\xi) - \lambda\, W^{(n+1)}(\xi)
= f^{(n+1)}(\xi) - \frac{f(x) - p(x)}{W(x)}\, (n+1)!,$$
because $p$ has degree at most $n$ and $W$ is monic of degree $n + 1$.
Observations
1. Equation $(*)$ is called the Lagrange formula with remainder. [Compare this result to
Taylor's formula with remainder.]
2. The term $f^{(n+1)}(\xi)$ is actually a continuous function of $x$. That is, $[f(x) - p(x)]/W(x)$
is continuous; its value at an $x_i$ is $[f'(x_i) - p'(x_i)]/W'(x_i)$ (why?) and $W'(x_i) = \prod_{j \ne i} (x_i - x_j) \ne 0$.
3. On any interval $[a,b]$, using any nodes, the sequence of Lagrange interpolating poly-
nomials for $e^x$ converges uniformly to $e^x$. In this case,
$$\|e^x - L_n(e^x)\| \le \frac{c\, (b-a)^{n+1}}{(n+1)!},$$
where $c = \|e^x\|$ in $C[a,b]$. The same would hold true for any infinitely differentiable
function satisfying, say, $\|f^{(n)}\| \le M^n$ (any entire function, for example).
4. On $[-1,1]$, the norm of $\prod_{i=1}^{n} (x - x_i)$ is minimized by taking $x_i = \cos((2i-1)\pi/2n)$,
$i = 1, \ldots, n$, the zeros of the $n$-th Chebyshev polynomial $T_n$. (Why?) As Rivlin points out, the
zeros of the Chebyshev polynomials are a nearly optimal choice for the nodes if good
uniform approximation is desired.
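To see observation 4 in action, compare equally spaced nodes with Chebyshev nodes on Runge's well-known example $f(x) = 1/(1 + 25x^2)$. This is our own illustration (the function, node counts, and use of SciPy's barycentric interpolator are our choices):

```python
# Equispaced vs. Chebyshev nodes on Runge's example.
import numpy as np
from scipy.interpolate import BarycentricInterpolator

f = lambda x: 1.0 / (1.0 + 25.0 * x ** 2)
x = np.linspace(-1, 1, 2001)

for n in [10, 20]:
    equi = np.linspace(-1, 1, n + 1)
    cheb = np.cos((2 * np.arange(1, n + 2) - 1) * np.pi / (2 * (n + 1)))
    for name, nodes in [("equispaced", equi), ("Chebyshev", cheb)]:
        p = BarycentricInterpolator(nodes, f(nodes))
        print(n, name, np.max(np.abs(f(x) - p(x))))
# Equispaced errors blow up as n grows (the Runge phenomenon);
# Chebyshev-node errors shrink.
```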

The question of convergence of interpolation is actually very closely related to the
analogous question for the convergence of Fourier series, and the answer here is nearly
the same. We'll have more to say about this analogy later. First, let's note that $L_n$ is
continuous (bounded); this will give us our first bit of insight into Faber's negative result.

Lemma. $\|L_n(f)\| \le \|f\| \max_{a \le x \le b} \sum_{i=0}^{n} |\ell_i(x)|$ for any $f \in C[a,b]$.

Proof. Exercise.

The numbers $\Lambda_n = \max_{a \le x \le b} \sum_{i=0}^{n} |\ell_i(x)|$ are called the Lebesgue numbers associated to this
process. It's not hard to see that $\Lambda_n$ is the smallest possible constant that will work in
this inequality (in other words, $\|L_n\| = \Lambda_n$). Indeed, if
$$\max_{a \le x \le b} \sum_{i=0}^{n} |\ell_i(x)| = \sum_{i=0}^{n} |\ell_i(x_0)|,$$
then we can find an $f \in C[a,b]$ with $\|f\| = 1$ and $f(x_i) = \mathrm{sgn}(\ell_i(x_0))$ for all $i$. (How?)
Then,
$$\|L_n(f)\| \ge |L_n(f)(x_0)| = \sum_{i=0}^{n} \mathrm{sgn}(\ell_i(x_0))\, \ell_i(x_0) = \sum_{i=0}^{n} |\ell_i(x_0)| = \Lambda_n \|f\|.$$
As it happens, $\Lambda_n \ge c \log n$ (this is where the hard work comes in; see Rivlin or Natanson
for further details) and, in particular, $\Lambda_n \to \infty$ as $n \to \infty$.
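The Lebesgue numbers are easy to estimate numerically; here is a rough sketch (our own, not from the notes) for equally spaced nodes on $[-1,1]$:

```python
# Estimate Lambda_n = max_x sum_i |l_i(x)| on a fine grid.
import numpy as np

def lebesgue_number(nodes, grid):
    total = np.zeros_like(grid)
    for i, xi in enumerate(nodes):
        li = np.ones_like(grid)
        for j, xj in enumerate(nodes):
            if j != i:
                li *= (grid - xj) / (xi - xj)
        total += np.abs(li)
    return np.max(total)

grid = np.linspace(-1, 1, 20001)
for n in [5, 10, 15, 20]:
    print(n, lebesgue_number(np.linspace(-1, 1, n + 1), grid))
# For equispaced nodes, Lambda_n grows roughly like 2^n/(n log n),
# far faster than the log n lower bound that holds for every array of nodes.
```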
A simple application of the triangle inequality will allow us to bring $E_n(f)$ back into
the picture:

Lemma. (Lebesgue's theorem) $\|f - L_n(f)\| \le (1 + \Lambda_n)\, E_n(f)$, for any $f \in C[a,b]$.

Proof. Let $p^*$ be the best approximation to $f$ out of $\mathcal{P}_n$. Then, since $L_n(p^*) = p^*$, we
have
$$\|f - L_n(f)\| \le \|f - p^*\| + \|L_n(f - p^*)\| \le (1 + \Lambda_n)\, \|f - p^*\| = (1 + \Lambda_n)\, E_n(f).$$
Appendix
Although we won't need anything quite so fancy, it is of some interest to discuss more
general problems of interpolation. We again suppose that we are given distinct points
$x_0 < \cdots < x_n$ in $[a,b]$, but now we suppose that we are given an array of information
$$\begin{matrix} y_0 & y_0' & y_0'' & \cdots & y_0^{(m_0)} \\ y_1 & y_1' & y_1'' & \cdots & y_1^{(m_1)} \\ \vdots & \vdots & \vdots & & \vdots \\ y_n & y_n' & y_n'' & \cdots & y_n^{(m_n)} \end{matrix}$$
where each $m_i$ is a nonnegative integer. Our problem is to find the polynomial $p$ of least
degree that incorporates all of this data by satisfying
$$p(x_0) = y_0, \quad p'(x_0) = y_0', \quad \ldots, \quad p^{(m_0)}(x_0) = y_0^{(m_0)},$$
$$p(x_1) = y_1, \quad p'(x_1) = y_1', \quad \ldots, \quad p^{(m_1)}(x_1) = y_1^{(m_1)},$$
$$\vdots$$
$$p(x_n) = y_n, \quad p'(x_n) = y_n', \quad \ldots, \quad p^{(m_n)}(x_n) = y_n^{(m_n)}.$$
In other words, we specify not only the value of $p$ at each $x_i$, but also the first $m_i$ derivatives
of $p$ at $x_i$. This is often referred to as the problem of Hermite interpolation.
Since the problem has a total of $m_0 + m_1 + \cdots + m_n + n + 1$ "degrees of freedom,"
it won't come as any surprise that it has a (unique) solution $p$ of degree (at most) $N = m_0 + m_1 + \cdots + m_n + n$. Rather than discuss this particular problem any further, let's
instead discuss the general problem of linear interpolation.
The notational framework for our problem is an $n$-dimensional vector space $X$ on
which $m$ linear, real-valued functions (or linear functionals) $L_1, \ldots, L_m$ are defined. The
general problem of linear interpolation asks whether the system of equations
$$L_i(f) = y_i, \qquad i = 1, \ldots, m, \eqno(*)$$
has a (unique) solution $f \in X$ for any given set of scalars $y_1, \ldots, y_m \in \mathbb{R}$. Since a linear
functional is completely determined by its values on any basis, we would next be led to
consider a basis $f_1, \ldots, f_n$ for $X$, and from here it is a small step to rewrite $(*)$ as a matrix
equation. That is, we seek a solution $f = a_1 f_1 + \cdots + a_n f_n$ satisfying
$$a_1 L_1(f_1) + \cdots + a_n L_1(f_n) = y_1$$
$$a_1 L_2(f_1) + \cdots + a_n L_2(f_n) = y_2$$
$$\vdots$$
$$a_1 L_m(f_1) + \cdots + a_n L_m(f_n) = y_m.$$
If we are to guarantee a solution $a_1, \ldots, a_n$ for each choice of $y_1, \ldots, y_m$, then we'll need
to have $m = n$ and, moreover, the matrix $[L_i(f_j)]$ will have to be nonsingular.

Lemma. Let $X$ be an $n$-dimensional vector space with basis $f_1, \ldots, f_n$, and let $L_1, \ldots, L_n$
be linear functionals on $X$. Then, $L_1, \ldots, L_n$ are linearly independent if and only if the
matrix $[L_i(f_j)]$ is nonsingular; that is, if and only if $\det\bigl( L_i(f_j) \bigr) \ne 0$.

Proof. If $[L_i(f_j)]$ is singular, then the matrix equation
$$c_1 L_1(f_1) + \cdots + c_n L_n(f_1) = 0$$
$$c_1 L_1(f_2) + \cdots + c_n L_n(f_2) = 0$$
$$\vdots$$
$$c_1 L_1(f_n) + \cdots + c_n L_n(f_n) = 0$$
has a nontrivial solution $c_1, \ldots, c_n$. Thus, the functional $c_1 L_1 + \cdots + c_n L_n$ satisfies
