$$\sqrt{\frac{1 - \cos\frac{\pi}{n+2}}{2}} \;=\; \sin\frac{\pi}{2n+4},$$
and so we have
$$E_n^T(f) \;\le\; \Bigl(1 + \frac{\pi^2}{2}\Bigr)\,\omega_f\Bigl(\frac{1}{n}\Bigr) \;<\; 6\,\omega_f\Bigl(\frac{1}{n}\Bigr).$$
Jackson's theorems are what we might call direct theorems. If we know something
about f, then we can say something about En(f). There is also the notion of an inverse
theorem, meaning that if we know something about En(f), we should be able to say
something about f. In other words, we would expect an inverse theorem to be, more or
less, the converse of some direct theorem. Now inverse theorems are typically much harder
to prove than direct theorems, but in order to have some idea of what such theorems might
tell us (and to see some of the techniques used in their proofs), we present one of the easier
inverse theorems, due to Bernstein. This result gives the converse to one of our corollaries
to Jackson's theorem (see the top of page 105).
Theorem. If $f \in C^{2\pi}$ satisfies $E_n^T(f) \le A\,n^{-\alpha}$ for some constants $A$ and $0 < \alpha < 1$, then $f \in \operatorname{lip}_K \alpha$ for some constant $K$.

Proof. For each $n$, choose $U_n \in T_n$ so that $\|f - U_n\| \le A\,n^{-\alpha}$. Then, in particular, $(U_n)$ converges uniformly to $f$. Now if we set $V_0 = U_1$ and $V_n = U_{2^n} - U_{2^{n-1}}$ for $n \ge 1$, then $V_n \in T_{2^n}$ and $f = \sum_{n=0}^{\infty} V_n$. Indeed,
$$\|V_n\| \le \|U_{2^n} - f\| + \|U_{2^{n-1}} - f\| \le A\,(2^n)^{-\alpha} + A\,(2^{n-1})^{-\alpha} = B\,2^{-n\alpha},$$
which is summable; thus, the (telescoping) series $\sum_{n=0}^{\infty} V_n$ converges uniformly to $f$. (Why?)

Next we estimate $|f(x) - f(y)|$ using finitely many of the $V_n$'s, the precise number to be specified later. Using the mean value theorem and Bernstein's inequality we get
$$\begin{aligned}
|f(x) - f(y)| &\le \sum_{n=0}^{\infty} |V_n(x) - V_n(y)| \\
&\le \sum_{n=0}^{m-1} |V_n(x) - V_n(y)| + 2 \sum_{n=m}^{\infty} \|V_n\| \\
&= \sum_{n=0}^{m-1} |V_n'(\xi_n)|\,|x - y| + 2 \sum_{n=m}^{\infty} \|V_n\| \\
&\le |x - y| \sum_{n=0}^{m-1} 2^n \|V_n\| + 2 \sum_{n=m}^{\infty} \|V_n\| \\
&\le |x - y|\,B \sum_{n=0}^{m-1} 2^{n(1-\alpha)} + 2B \sum_{n=m}^{\infty} 2^{-n\alpha} \\
&\le C\,\bigl[\,|x - y|\,2^{m(1-\alpha)} + 2^{-m\alpha}\,\bigr], \qquad (*)
\end{aligned}$$
where we've used, in the fourth line, Bernstein's inequality together with the fact that $V_n \in T_{2^n}$ and, in the last line, standard estimates for geometric series. Now we want the right-hand side to be dominated by a constant times $|x - y|^{\alpha}$. In other words, if we set $|x - y| = \delta$, then we want
$$\delta\,2^{m(1-\alpha)} + 2^{-m\alpha} \le D\,\delta^{\alpha}$$
or, equivalently,
$$(2^m \delta)^{1-\alpha} + (2^m \delta)^{-\alpha} \le D.$$
Thus, we should choose $m$ so that $2^m \delta$ is both bounded above and bounded away from zero. For example, if $0 < \delta < 1$, we could choose $m$ so that $1 \le 2^m \delta < 2$.
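The choice of $m$ in the last step is easy to make concrete. Here's a tiny sketch (the function name `split_index` is ours, not from the notes) that picks the smallest admissible $m$:

```python
def split_index(delta):
    """Return the smallest m with 2**m * delta >= 1; for 0 < delta < 1
    this m is at least 1 and, by minimality, satisfies
    1 <= 2**m * delta < 2, exactly as required in the proof."""
    m = 0
    while 2 ** m * delta < 1:
        m += 1
    return m
```

Minimality is what gives the upper bound: since $2^{m-1}\delta < 1$, doubling yields $2^m\delta < 2$.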
In order to better explain the phrase "more or less the converse of some direct theorem," let's see how the previous result falls apart when $\alpha = 1$. Although we might hope that $E_n^T(f) \le A/n$ would imply that $f \in \operatorname{lip}_K 1$, it happens not to be true. The best result in this regard is due to Zygmund, who gave necessary and sufficient conditions on $f$ so that $E_n^T(f) \le A/n$ (and these conditions do not characterize $\operatorname{lip}_K 1$ functions). Instead of pursuing Zygmund's result, we'll settle for simple "surgery" on our previous result, keeping an eye out for what goes wrong. This result is again due to Bernstein.

Theorem. If $f \in C^{2\pi}$ satisfies $E_n^T(f) \le A/n$, then $\omega_f(\delta) \le K\,\delta\,|\log\delta|$ for some constant $K$ and all $\delta$ sufficiently small.
Proof. If we repeat the previous proof, setting $\alpha = 1$, only a few lines change. In particular, the conclusion of that long string of inequalities $(*)$ would now read
$$|f(x) - f(y)| \le C\,\bigl[\,|x - y|\,m + 2^{-m}\,\bigr] = C\,\bigl[\,\delta m + 2^{-m}\,\bigr].$$
Clearly, the right-hand side cannot be dominated by a constant times $\delta$, as we might have hoped, for this would force $m$ to be bounded (independent of $\delta$), which in turn bounds $\delta$ away from zero. But, if we again think of $2^m \delta$ as the "variable" in this inequality, then the term $\delta m$ suggests that the correct order of magnitude of the right-hand side is $\delta\,|\log\delta|$. Thus, we would try to find a constant $D$ so that
$$\delta m + 2^{-m} \le D\,\delta\,|\log\delta|$$
or
$$m\,(2^m \delta) + 1 \le D\,(2^m \delta)\,|\log\delta|.$$
Now if we take $0 < \delta < 1/2$, then $\log 2 < -\log\delta = |\log\delta|$. Hence, if we again choose $m \ge 1$ so that $1 \le 2^m \delta < 2$, we'll get
$$m\log 2 + \log\delta < \log 2 \;\Longrightarrow\; m < \frac{\log 2 - \log\delta}{\log 2} < \frac{2}{\log 2}\,|\log\delta|$$
and, finally,
$$m\,(2^m \delta) + 1 \le 2m + 1 \le 3m \le \frac{6}{\log 2}\,|\log\delta| \le \frac{6}{\log 2}\,(2^m \delta)\,|\log\delta|.$$
Orthogonal Polynomials
Math 682 6/17/98

Given a positive (except possibly at finitely many points), Riemann integrable weight function $w(x)$ on $[a, b]$, the expression
$$\langle f, g \rangle = \int_a^b f(x)\,g(x)\,w(x)\,dx$$
defines an inner product on $C[a, b]$, and
$$\|f\|_2 = \langle f, f \rangle^{1/2} = \left( \int_a^b f(x)^2\,w(x)\,dx \right)^{1/2}$$
defines a strictly convex norm on $C[a, b]$. Thus, given a finite dimensional subspace $E$ of $C[a, b]$ and an element $f \in C[a, b]$, there is a unique $g^* \in E$ such that
$$\|f - g^*\|_2 = \min_{h \in E} \|f - h\|_2.$$
We say that $g^*$ is the least-squares approximation to $f$ out of $E$ (relative to $w$).
Now if we apply the Gram-Schmidt procedure to the sequence $1, x, x^2, \ldots$, we will arrive at a sequence $(Q_n)$ of orthogonal polynomials relative to the above inner product. In this special case, however, the Gram-Schmidt procedure simplifies substantially:

Theorem. The following procedure defines a sequence $(Q_n)$ of orthogonal polynomials (relative to $w$). Set:
$$Q_0(x) = 1, \qquad Q_1(x) = x - a_0, \qquad\text{and}\qquad Q_{n+1}(x) = (x - a_n)\,Q_n(x) - b_n\,Q_{n-1}(x)$$
for $n \ge 1$, where
$$a_n = \langle x\,Q_n, Q_n \rangle \big/ \langle Q_n, Q_n \rangle \qquad\text{and}\qquad b_n = \langle x\,Q_n, Q_{n-1} \rangle \big/ \langle Q_{n-1}, Q_{n-1} \rangle$$
(and where $x\,Q_n$ is shorthand for the polynomial $x\,Q_n(x)$).

Proof. It's easy to see from these formulas that $Q_n$ is a monic polynomial of degree exactly $n$. In particular, the $Q_n$'s are linearly independent (and nonzero).

Now we checked in class that $Q_0$, $Q_1$, and $Q_2$ are mutually orthogonal, so let's use induction and check that $Q_{n+1}$ is orthogonal to each $Q_k$, $k \le n$. First,
$$\langle Q_{n+1}, Q_n \rangle = \langle x\,Q_n, Q_n \rangle - a_n \langle Q_n, Q_n \rangle - b_n \langle Q_{n-1}, Q_n \rangle = 0$$
and
$$\langle Q_{n+1}, Q_{n-1} \rangle = \langle x\,Q_n, Q_{n-1} \rangle - a_n \langle Q_n, Q_{n-1} \rangle - b_n \langle Q_{n-1}, Q_{n-1} \rangle = 0,$$
since $\langle Q_{n-1}, Q_n \rangle = 0$. Next, we take $k < n-1$ and use the recurrence formula twice:
$$\begin{aligned}
\langle Q_{n+1}, Q_k \rangle &= \langle x\,Q_n, Q_k \rangle - a_n \langle Q_n, Q_k \rangle - b_n \langle Q_{n-1}, Q_k \rangle \\
&= \langle x\,Q_n, Q_k \rangle = \langle Q_n, x\,Q_k \rangle \qquad \text{(Why?)} \\
&= \langle Q_n,\; Q_{k+1} + a_k Q_k + b_k Q_{k-1} \rangle = 0,
\end{aligned}$$
since $k + 1 < n$.
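To see the recurrence in action, here is a small Python sketch (the helper names are ours, not from the notes). For the weight $w \equiv 1$ on $[-1, 1]$ the inner products can be computed exactly in rational arithmetic, and the recurrence produces the monic Legendre polynomials that appear later in these notes:

```python
from fractions import Fraction

def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists (constant term first)."""
    r = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            r[i + j] += a * b
    return r

def poly_sub(p, q):
    n = max(len(p), len(q))
    p = p + [Fraction(0)] * (n - len(p))
    q = q + [Fraction(0)] * (n - len(q))
    return [a - b for a, b in zip(p, q)]

def scale(p, c):
    return [c * a for a in p]

def inner(p, q):
    """<p, q> = integral of p*q over [-1, 1] with w == 1, done exactly:
    the integral of x^k is 2/(k+1) for even k and 0 for odd k."""
    return sum(Fraction(2, k + 1) * c
               for k, c in enumerate(poly_mul(p, q)) if k % 2 == 0)

def monic_orthogonal(N):
    """First N+1 monic orthogonal polynomials via the three-term
    recurrence Q_{n+1} = (x - a_n) Q_n - b_n Q_{n-1}."""
    x = [Fraction(0), Fraction(1)]
    Q = [[Fraction(1)]]
    a0 = inner(poly_mul(x, Q[0]), Q[0]) / inner(Q[0], Q[0])
    Q.append(poly_sub(x, [a0]))
    for n in range(1, N):
        an = inner(poly_mul(x, Q[n]), Q[n]) / inner(Q[n], Q[n])
        bn = inner(poly_mul(x, Q[n]), Q[n - 1]) / inner(Q[n - 1], Q[n - 1])
        Q.append(poly_sub(poly_mul(poly_sub(x, [an]), Q[n]),
                          scale(Q[n - 1], bn)))
    return Q
```

Running `monic_orthogonal(3)` yields $1,\; x,\; x^2 - \tfrac13,\; x^3 - \tfrac35 x$, in agreement with the values $b_1 = 1/3$, $b_2 = 4/15$ that the formulas produce for this weight.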
Observations
1. Using the same trick as above, we have
$$b_n = \langle x\,Q_n, Q_{n-1} \rangle \big/ \langle Q_{n-1}, Q_{n-1} \rangle = \langle Q_n, Q_n \rangle \big/ \langle Q_{n-1}, Q_{n-1} \rangle > 0.$$
2. Each $p \in P_n$ can be uniquely written $p = \sum_{i=0}^n \alpha_i Q_i$, where $\alpha_i = \langle p, Q_i \rangle / \langle Q_i, Q_i \rangle$.
3. If $Q$ is any monic polynomial of degree exactly $n$, then $Q = Q_n + \sum_{i=0}^{n-1} \alpha_i Q_i$ (why?) and hence
$$\|Q\|_2^2 = \|Q_n\|_2^2 + \sum_{i=0}^{n-1} \alpha_i^2\,\|Q_i\|_2^2 > \|Q_n\|_2^2$$
unless $Q = Q_n$. That is, $Q_n$ has the least $\|\cdot\|_2$ norm of all monic polynomials of degree $n$.
4. The $Q_n$'s are unique in the following sense: If $(P_n)$ is another sequence of orthogonal polynomials such that $P_n$ has degree exactly $n$, then $P_n = \lambda_n Q_n$ for some $\lambda_n \ne 0$. (Why?) Consequently, there's no harm in referring to the $Q_n$'s as the sequence of orthogonal polynomials relative to $w$.
5. For $n \ge 1$, note that $\int_a^b Q_n(t)\,w(t)\,dt = \langle Q_0, Q_n \rangle = 0$.
Examples
1. On $[-1, 1]$, the Chebyshev polynomials of the first kind $(T_n)$ are orthogonal relative to the weight $w(x) = 1/\sqrt{1 - x^2}$:
$$\int_{-1}^1 T_m(x)\,T_n(x)\,\frac{dx}{\sqrt{1 - x^2}} = \int_0^{\pi} \cos m\theta\,\cos n\theta\,d\theta = \begin{cases} 0, & m \ne n \\ \pi/2, & m = n \ne 0 \\ \pi, & m = n = 0. \end{cases}$$
Since $T_n$ has degree exactly $n$, this must be the right choice. Notice, too, that $\frac{1}{\sqrt{2}}\,T_0, T_1, T_2, \ldots$ are orthonormal relative to the weight $2/(\pi\sqrt{1 - x^2}\,)$.

In terms of the inductive procedure given above, we must have $Q_0 = T_0 = 1$ and $Q_n = 2^{-n+1}\,T_n$ for $n \ge 1$. (Why?) From this it follows that $a_n = 0$, $b_1 = 1/2$, and $b_n = 1/4$ for $n \ge 2$. (Why?) That is, the recurrence formula given in our first Theorem reduces to the familiar relationship $T_{n+1}(x) = 2x\,T_n(x) - T_{n-1}(x)$. Curiously, $Q_n = 2^{-n+1}\,T_n$ minimizes both
$$\max_{-1 \le x \le 1} |p(x)| \qquad\text{and}\qquad \left( \int_{-1}^1 p(x)^2\,\frac{dx}{\sqrt{1 - x^2}} \right)^{1/2}$$
over all monic polynomials of degree exactly $n$.
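The familiar recurrence is easy to run in integer arithmetic; a quick sketch (the function name is ours):

```python
def chebyshev_T(n):
    """Coefficient list (constant term first) of T_n, generated by the
    recurrence T_{n+1}(x) = 2x T_n(x) - T_{n-1}(x), T_0 = 1, T_1 = x."""
    T = [[1], [0, 1]]
    for _ in range(2, n + 1):
        prev, cur = T[-2], T[-1]
        nxt = [0] + [2 * c for c in cur]   # multiply T_n by 2x
        for i, c in enumerate(prev):       # subtract T_{n-1}
            nxt[i] -= c
        T.append(nxt)
    return T[n]
```

The leading coefficient comes out as $2^{n-1}$, confirming that $Q_n = 2^{-n+1} T_n$ is the monic normalization used above.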
The Chebyshev polynomials also satisfy $(1 - x^2)\,T_n''(x) - x\,T_n'(x) + n^2\,T_n(x) = 0$. Since this is a polynomial identity, it suffices to check it for all $x = \cos\theta$. In this case,
$$T_n'(x) = \frac{n\sin n\theta}{\sin\theta}$$
and
$$T_n''(x) = \frac{n^2\cos n\theta\,\sin\theta - n\sin n\theta\,\cos\theta}{\sin^2\theta\,(-\sin\theta)}.$$
Hence,
$$(1 - x^2)\,T_n''(x) - x\,T_n'(x) + n^2\,T_n(x) = -n^2\cos n\theta + n\sin n\theta\,\cot\theta - n\sin n\theta\,\cot\theta + n^2\cos n\theta = 0.$$
2. On $[-1, 1]$, the Chebyshev polynomials of the second kind $(U_n)$ are orthogonal relative to the weight $w(x) = \sqrt{1 - x^2}$:
$$\int_{-1}^1 U_m(x)\,U_n(x)\,\sqrt{1 - x^2}\,dx = \int_0^{\pi} \frac{\sin(m+1)\theta}{\sin\theta}\,\frac{\sin(n+1)\theta}{\sin\theta}\,\sin^2\theta\,d\theta = \begin{cases} 0, & m \ne n \\ \pi/2, & m = n. \end{cases}$$
While we're at it, notice that
$$T_n'(x) = \frac{n\sin n\theta}{\sin\theta} = n\,U_{n-1}(x).$$
As a rule, the derivatives of a sequence of orthogonal polynomials are again orthogonal polynomials, but relative to a different weight.
3. On $[-1, 1]$ with weight $w(x) \equiv 1$, the sequence $(P_n)$ of Legendre polynomials is orthogonal; these are typically normalized by $P_n(1) = 1$. The first few Legendre polynomials are $P_0(x) = 1$, $P_1(x) = x$, $P_2(x) = \frac{3}{2}x^2 - \frac{1}{2}$, and $P_3(x) = \frac{5}{2}x^3 - \frac{3}{2}x$. (Check this!) After we've seen a few more examples, we'll come back and give an explicit formula for $P_n$.
4. All of the examples we've seen so far are special cases of the following: On $[-1, 1]$, consider the weight $w(x) = (1-x)^{\alpha}(1+x)^{\beta}$, where $\alpha, \beta > -1$. The corresponding orthogonal polynomials $(P_n^{(\alpha,\beta)})$ are called the Jacobi polynomials and are typically normalized by requiring that
$$P_n^{(\alpha,\beta)}(1) = \binom{n+\alpha}{n} = \frac{(\alpha+1)(\alpha+2)\cdots(\alpha+n)}{n!}.$$
It follows that $P_n^{(0,0)} = P_n$,
$$P_n^{(-1/2,-1/2)} = \frac{1\cdot 3\cdot 5\cdots(2n-1)}{2^n\,n!}\;T_n,$$
and
$$P_n^{(1/2,1/2)} = \frac{1\cdot 3\cdot 5\cdots(2n+1)}{2^n\,(n+1)!}\;U_n.$$
The polynomials $P_n^{(\alpha,\alpha)}$ are called ultraspherical polynomials.
5. There are also several classical examples of orthogonal polynomials on unbounded intervals. In particular:
$(0, \infty)$, $w(x) = e^{-x}$: Laguerre polynomials;
$(0, \infty)$, $w(x) = x^{\alpha} e^{-x}$: generalized Laguerre polynomials;
$(-\infty, \infty)$, $w(x) = e^{-x^2}$: Hermite polynomials.

Since $Q_n$ is orthogonal to every element of $P_{n-1}$, a fuller understanding of $Q_n$ will follow from a characterization of the orthogonal complement of $P_{n-1}$. We begin with an easy fact about least-squares approximations in inner product spaces.

Lemma. Let $E$ be a finite dimensional subspace of an inner product space $X$, and let $x \in X \setminus E$. Then, $y^* \in E$ is the least-squares approximation to $x$ out of $E$ (a.k.a. the nearest point to $x$ in $E$) if and only if $\langle x - y^*, y \rangle = 0$ for every $y \in E$; that is, if and only if $(x - y^*) \perp E$.

Proof. [We've taken $E$ to be finite dimensional so that nearest points will exist; since $X$ is an inner product space, nearest points must also be unique (see the exercises for a proof that every inner product norm is strictly convex).]

($\Leftarrow$) First suppose that $(x - y^*) \perp E$. Then, given any $y \in E$, we have
$$\|x - y\|^2 = \|(x - y^*) + (y^* - y)\|^2 = \|x - y^*\|^2 + \|y^* - y\|^2,$$
because $y^* - y \in E$ and, hence, $(x - y^*) \perp (y^* - y)$. Thus, $\|x - y\| > \|x - y^*\|$ unless $y = y^*$; that is, $y^*$ is the (unique) nearest point to $x$ in $E$.

($\Rightarrow$) Suppose that $x - y^*$ is not orthogonal to $E$. Then, there is some $y \in E$ with $\|y\| = 1$ such that $\alpha = \langle x - y^*, y \rangle \ne 0$. Now I claim that $y^* + \alpha y \in E$ is a better approximation to $x$ than $y^*$ (and $y^* + \alpha y \ne y^*$, of course); that is, $y^*$ is not the least-squares approximation to $x$. To see this, we again compute:
$$\begin{aligned}
\|x - (y^* + \alpha y)\|^2 &= \|(x - y^*) - \alpha y\|^2 = \langle (x - y^*) - \alpha y,\; (x - y^*) - \alpha y \rangle \\
&= \|x - y^*\|^2 - 2\alpha\,\langle x - y^*, y \rangle + \alpha^2 \\
&= \|x - y^*\|^2 - \alpha^2 < \|x - y^*\|^2.
\end{aligned}$$
Thus, we must have $\langle x - y^*, y \rangle = 0$ for every $y \in E$.
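As a sanity check on the lemma, here is a small computation (assuming $w \equiv 1$ on $[-1, 1]$; the `moment` helper is ours). The least-squares approximation to $x^3$ out of $\operatorname{span}\{1, x, x^2\}$ is found by imposing the orthogonality conditions, after which the residual really is orthogonal to every basis element:

```python
from fractions import Fraction

def moment(k):
    """Integral of x^k over [-1, 1]: 2/(k+1) for even k, 0 for odd k."""
    return Fraction(2, k + 1) if k % 2 == 0 else Fraction(0)

# Least-squares approximation g = c0 + c1*x + c2*x^2 to f(x) = x^3:
# the conditions <x^3 - g, x^j> = 0 (j = 0, 1, 2) decouple by parity,
# forcing c0 = c2 = 0 and c1 = <x^3, x> / <x, x>.
c1 = moment(4) / moment(2)

# residual r(x) = x^3 - c1*x; check <r, x^j> = 0 for j = 0, 1, 2
orth = [moment(3 + j) - c1 * moment(1 + j) for j in range(3)]
```

The answer $g(x) = \tfrac35 x$ also matches problem 68 below: $x^3 - \tfrac35 x$ is the monic orthogonal polynomial $Q_3$ for this weight.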
Lemma 1. (Integration by parts.)
$$\int_a^b u^{(n)}\,v = \sum_{k=1}^n (-1)^{k-1}\,\Bigl[\,u^{(n-k)}\,v^{(k-1)}\,\Bigr]_a^b + (-1)^n \int_a^b u\,v^{(n)}.$$
Now if $v$ is a polynomial of degree $< n$, then $v^{(n)} = 0$ and we get:

Lemma 2. $f \in C[a, b]$ satisfies $\int_a^b f(x)\,p(x)\,w(x)\,dx = 0$ for all polynomials $p \in P_{n-1}$ if and only if there is an $n$-times differentiable function $u$ on $[a, b]$ satisfying $fw = u^{(n)}$ and $u^{(k)}(a) = u^{(k)}(b) = 0$ for all $k = 0, 1, \ldots, n-1$.

Proof. One direction is clear from Lemma 1: Given $u$ as above, we would have $\int_a^b fpw = \int_a^b u^{(n)} p = (-1)^n \int_a^b u\,p^{(n)} = 0$.
So, suppose we have that $\int_a^b fpw = 0$ for all $p \in P_{n-1}$. By integrating $fw$ repeatedly, choosing constants appropriately, we may define a function $u$ satisfying $fw = u^{(n)}$ and $u^{(k)}(a) = 0$ for all $k = 0, 1, \ldots, n-1$. We want to show that the hypotheses on $f$ force $u^{(k)}(b) = 0$ for all $k = 0, 1, \ldots, n-1$.

Now Lemma 1 tells us that
$$0 = \int_a^b fpw = \sum_{k=1}^n (-1)^{k-1}\,u^{(n-k)}(b)\,p^{(k-1)}(b)$$
for all $p \in P_{n-1}$. But the numbers $p(b), p'(b), \ldots, p^{(n-1)}(b)$ are completely arbitrary; that is (again by integrating repeatedly, choosing our constants as we please), we can find polynomials $p_k$ of degree $k < n$ such that $p_k^{(k)}(b) \ne 0$ and $p_k^{(j)}(b) = 0$ for $j \ne k$. In fact, $p_k(x) = (x - b)^k$ works just fine! In any case, we must have $u^{(k)}(b) = 0$ for all $k = 0, 1, \ldots, n-1$.
Rolle's theorem tells us a bit more about the functions orthogonal to $P_{n-1}$:

Lemma 3. If $w(x) > 0$ in $(a, b)$, and if $f \in C[a, b]$ satisfies $\int_a^b f(x)\,p(x)\,w(x)\,dx = 0$ for all polynomials $p \in P_{n-1}$, then $f$ has at least $n$ distinct zeros in the open interval $(a, b)$.

Proof. Write $fw = u^{(n)}$, where $u^{(k)}(a) = u^{(k)}(b) = 0$ for all $k = 0, 1, \ldots, n-1$. In particular, since $u(a) = u(b) = 0$, Rolle's theorem tells us that $u'$ has at least one zero $c$ in $(a, b)$. But then $u'(a) = u'(c) = u'(b) = 0$, and so $u''$ must have at least two zeros in $(a, b)$. Continuing, we find that $fw = u^{(n)}$ must have at least $n$ zeros in $(a, b)$. Since $w > 0$, the result follows.

Corollary. Let $(Q_n)$ be the sequence of orthogonal polynomials associated to a given weight $w$ with $w > 0$ in $(a, b)$. Then, the roots of $Q_n$ are real, simple, and lie in $(a, b)$.

Lemma 4. If $p^*$ is the least-squares approximation to $f \in C[a, b]$ out of $P_{n-1}$, and if $w > 0$ in $(a, b)$, then $f - p^*$ has at least $n$ distinct zeros in $(a, b)$.

Proof. The least-squares approximation satisfies $\langle f - p^*, p \rangle = 0$ for all $p \in P_{n-1}$.

The sheer volume of literature on orthogonal polynomials and other "special functions" is truly staggering. We'll content ourselves with the Legendre and the Chebyshev polynomials. In particular, let's return to the problem of finding an explicit formula for the Legendre polynomials. We could, as Rivlin does, use induction and a few observations that simplify the basic recurrence formula (you're encouraged to read this; see pp. 53-54). Instead we'll give a simple (but at first sight intimidating) formula that is of use in more general settings than ours.

Lemma 2 (with $w \equiv 1$ and $[a, b] = [-1, 1]$) says that if we want to find a polynomial $f$ of degree $n$ that is orthogonal to $P_{n-1}$, then we'll need to take a polynomial for $u$, and this $u$ will have to be divisible by $(x-1)^n (x+1)^n$. (Why?) That is, we must have $P_n(x) = c_n\,D^n\,(x^2 - 1)^n$, where $D$ denotes differentiation, and where we find $c_n$ by evaluating the right-hand side at $x = 1$.

Lemma 5. (Leibniz's formula) $\displaystyle D^n(fg) = \sum_{k=0}^n \binom{n}{k}\,D^k(f)\,D^{n-k}(g)$.

Proof. Induction and the fact that $\binom{n-1}{k-1} + \binom{n-1}{k} = \binom{n}{k}$.

Consequently, $Q(x) = D^n\bigl[(x-1)^n (x+1)^n\bigr] = \sum_{k=0}^n \binom{n}{k}\,D^k(x-1)^n\,D^{n-k}(x+1)^n$, and it follows that $Q(1) = 2^n n!$ and $Q(-1) = (-1)^n 2^n n!$. This, finally, gives us the formula discovered by Rodrigues in 1814:
$$P_n(x) = \frac{1}{2^n n!}\,D^n\,(x^2 - 1)^n.$$
The Rodrigues formula is quite useful (and easily generalizes to the Jacobi polynomials).
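The Rodrigues formula is straightforward to run in exact arithmetic; a sketch (the helper name is ours):

```python
from fractions import Fraction
from math import comb, factorial

def legendre(n):
    """Coefficients (constant term first) of P_n from the Rodrigues
    formula P_n = (1/(2^n n!)) D^n (x^2 - 1)^n."""
    # binomial expansion of (x^2 - 1)^n: the x^{2k} coefficient
    c = [Fraction(0)] * (2 * n + 1)
    for k in range(n + 1):
        c[2 * k] = Fraction((-1) ** (n - k) * comb(n, k))
    for _ in range(n):                      # differentiate n times
        c = [i * c[i] for i in range(1, len(c))]
    return [ck / (2 ** n * factorial(n)) for ck in c]
```

The output reproduces $P_2 = \tfrac32 x^2 - \tfrac12$ and $P_3 = \tfrac52 x^3 - \tfrac32 x$ from Example 3, and summing the coefficients confirms the normalization $P_n(1) = 1$.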
Observations
6. By Lemma 3, the roots of $P_n$ are real, distinct, and lie in $(-1, 1)$.
7. $(x^2 - 1)^n = \sum_{k=0}^n (-1)^k \binom{n}{k} x^{2n-2k}$. If we apply $\frac{1}{2^n n!}\,D^n$ and simplify, we get another formula for the Legendre polynomials:
$$P_n(x) = \frac{1}{2^n} \sum_{k=0}^{[n/2]} (-1)^k \binom{n}{k} \binom{2n-2k}{n}\,x^{n-2k}.$$
In particular, if $n$ is even (odd), then $P_n$ is even (odd). Notice, too, that if we let $\widetilde{P}_n$ denote the (monic) polynomial given by the standard construction, then we must have $P_n = 2^{-n}\binom{2n}{n}\,\widetilde{P}_n$.
8. In terms of our standard recurrence formula, it follows that $a_n = 0$ (because $x\,P_n(x)^2$ is always odd). It remains to compute $b_n$. First, integrating by parts,
$$\int_{-1}^1 P_n(x)^2\,dx = \Bigl[\,x\,P_n(x)^2\,\Bigr]_{-1}^{1} - \int_{-1}^1 x\cdot 2\,P_n(x)\,P_n'(x)\,dx,$$
or $\langle P_n, P_n \rangle = 2 - 2\,\langle P_n, x\,P_n' \rangle$. But $x\,P_n' = n\,P_n + \text{lower degree terms}$; hence, $\langle P_n, x\,P_n' \rangle = n\,\langle P_n, P_n \rangle$. Thus, $\langle P_n, P_n \rangle = 2/(2n+1)$. Using this and the fact that $P_n = 2^{-n}\binom{2n}{n}\widetilde{P}_n$, we'd find that $b_n = n^2/(4n^2 - 1)$. Thus,
$$P_{n+1} = 2^{-n-1}\binom{2n+2}{n+1}\widetilde{P}_{n+1} = 2^{-n-1}\binom{2n+2}{n+1}\Bigl[\,x\,\widetilde{P}_n - \frac{n^2}{4n^2 - 1}\,\widetilde{P}_{n-1}\,\Bigr] = \frac{2n+1}{n+1}\,x\,P_n - \frac{n}{n+1}\,P_{n-1}.$$
That is, the Legendre polynomials satisfy the recurrence formula
$$(n+1)\,P_{n+1}(x) = (2n+1)\,x\,P_n(x) - n\,P_{n-1}(x).$$
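The recurrence gives a convenient way to evaluate $P_n$ pointwise; a short sketch (the name is ours):

```python
def legendre_eval(n, x):
    """Evaluate P_n(x) via (k+1) P_{k+1} = (2k+1) x P_k - k P_{k-1},
    starting from P_0 = 1 and P_1 = x."""
    if n == 0:
        return 1.0
    p_prev, p = 1.0, x
    for k in range(1, n):
        p_prev, p = p, ((2 * k + 1) * x * p - k * p_prev) / (k + 1)
    return p
```

Running it at $x = 1$ returns 1 for every $n$, as the normalization $P_n(1) = 1$ demands.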
9. It follows from 8 that the sequence $\widehat{P}_n = \sqrt{\frac{2n+1}{2}}\,P_n$ is orthonormal on $[-1, 1]$.
10. The Legendre polynomials satisfy $(1 - x^2)\,P_n''(x) - 2x\,P_n'(x) + n(n+1)\,P_n(x) = 0$. If we set $u = (x^2 - 1)^n$, that is, if $u^{(n)} = 2^n n!\,P_n$, note that $u'\,(x^2 - 1) = 2nx\,u$. Now we apply $D^{n+1}$ to both sides of this last equation (using Leibniz's formula) and simplify:
$$u^{(n+2)}(x^2 - 1) + (n+1)\,u^{(n+1)}\cdot 2x + \binom{n+1}{2}\,u^{(n)}\cdot 2 = 2n\,\bigl[\,u^{(n+1)}\,x + (n+1)\,u^{(n)}\,\bigr]$$
$$\Longrightarrow\quad (1 - x^2)\,u^{(n+2)} - 2x\,u^{(n+1)} + n(n+1)\,u^{(n)} = 0.$$
11. Through a series of exercises, similar in spirit to 10, Rivlin shows that $|P_n(x)| \le 1$ on $[-1, 1]$. See pp. 63-64 of Rivlin for details.
Given an orthogonal sequence, it makes sense to consider "generalized Fourier series" relative to the sequence and to find analogues of the Dirichlet kernel, Lebesgue's theorem, and so on. In the case of the Legendre polynomials we have the following:

Example. The "Fourier-Legendre" series for $f \in C[-1, 1]$ is given by $\sum_k \langle f, \widehat{P}_k \rangle\,\widehat{P}_k$, where
$$\widehat{P}_k = \sqrt{\frac{2k+1}{2}}\;P_k \qquad\text{and}\qquad \langle f, \widehat{P}_k \rangle = \int_{-1}^1 f(x)\,\widehat{P}_k(x)\,dx.$$

The partial sum operator $S_n(f) = \sum_{k=0}^n \langle f, \widehat{P}_k \rangle\,\widehat{P}_k$ is a linear projection onto $P_n$ and may be written as
$$S_n(f)(x) = \int_{-1}^1 f(t)\,K_n(t, x)\,dt,$$
where $K_n(t, x) = \sum_{k=0}^n \widehat{P}_k(t)\,\widehat{P}_k(x)$. (Why?)
Since the $\widehat{P}_k$'s are orthonormal, we have
$$\sum_{k=0}^n |\langle f, \widehat{P}_k \rangle|^2 = \|S_n(f)\|_2^2 \le \|f\|_2^2 = \sum_{k=0}^{\infty} |\langle f, \widehat{P}_k \rangle|^2,$$
and so the generalized Fourier coefficients $\langle f, \widehat{P}_k \rangle$ are square summable; in particular, $\langle f, \widehat{P}_k \rangle \to 0$ as $k \to \infty$. As in the case of Fourier series, the fact that the polynomials (i.e., the span of the $\widehat{P}_k$'s) are dense in $C[a, b]$ implies that $S_n(f)$ actually converges to $f$ in the $\|\cdot\|_2$ norm. These same observations remain valid for any sequence of orthogonal polynomials. The real question remains, just as with Fourier series, whether $S_n(f)$ is a good uniform (or even pointwise) approximation to $f$.

If you're willing to swallow the fact that $|P_n(x)| \le 1$, then
$$|K_n(t, x)| \le \sum_{k=0}^n \sqrt{\frac{2k+1}{2}}\,\sqrt{\frac{2k+1}{2}} = \frac{1}{2}\sum_{k=0}^n (2k+1) = \frac{(n+1)^2}{2}.$$
Hence, $\|S_n(f)\| \le (n+1)^2\,\|f\|$. That is, the "Lebesgue numbers" for this process are at most $(n+1)^2$. The analogue of Lebesgue's theorem in this case would then read:
$$\|f - S_n(f)\| \le C\,n^2\,E_n(f).$$
Thus, $S_n(f) \rightrightarrows f$ whenever $n^2 E_n(f) \to 0$, and Jackson's theorem tells us when this will happen: If $f$ is twice continuously differentiable, then the Fourier-Legendre series for $f$ converges uniformly to $f$ on $[-1, 1]$.
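The projection property of $S_n$ is easy to test numerically. In the sketch below (our own helper names; the inner products are computed with composite Simpson's rule rather than exactly), $S_3$ reproduces $f(x) = x^3$, while $S_2$ projects it onto $\tfrac35 x$, in agreement with the earlier least-squares computation:

```python
def legendre_eval(n, x):
    """P_n(x) via the three-term recurrence."""
    if n == 0:
        return 1.0
    p_prev, p = 1.0, x
    for k in range(1, n):
        p_prev, p = p, ((2 * k + 1) * x * p - k * p_prev) / (k + 1)
    return p

def integrate(g, m=2000):
    """Composite Simpson's rule on [-1, 1] (m even)."""
    h = 2.0 / m
    s = g(-1.0) + g(1.0)
    for i in range(1, m):
        s += (4 if i % 2 else 2) * g(-1.0 + i * h)
    return s * h / 3.0

def partial_sum(f, n):
    """S_n(f) = sum_k <f, Phat_k> Phat_k with Phat_k = sqrt((2k+1)/2) P_k;
    note <f, Phat_k> Phat_k(x) = ((2k+1)/2) <f, P_k> P_k(x)."""
    coef = [(2 * k + 1) / 2 * integrate(lambda t, k=k: f(t) * legendre_eval(k, t))
            for k in range(n + 1)]
    return lambda x: sum(c * legendre_eval(k, x) for k, c in enumerate(coef))
```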
The Christoffel-Darboux Identity
It would also be of interest to have a closed form for $K_n(t, x)$. That this is indeed always possible, for any sequence of orthogonal polynomials, is a very important fact.

Using our original notation, let $(Q_n)$ be the sequence of monic orthogonal polynomials corresponding to a given weight $w$, and let $(\widehat{Q}_n)$ be the orthonormal counterpart of $(Q_n)$; in other words, $\widehat{Q}_n = \lambda_n Q_n$, where $\lambda_n = \langle Q_n, Q_n \rangle^{-1/2}$. It will help things here if you recall (from Observation 1 on page 112) that $\lambda_n^{-2} = b_n\,\lambda_{n-1}^{-2}$.

As with the Legendre polynomials, each $f \in C[a, b]$ is represented by the generalized Fourier series $\sum_k \langle f, \widehat{Q}_k \rangle\,\widehat{Q}_k$, with partial sum operator
$$S_n(f)(x) = \int_a^b f(t)\,K_n(t, x)\,w(t)\,dt,$$
where $K_n(t, x) = \sum_{k=0}^n \widehat{Q}_k(t)\,\widehat{Q}_k(x)$. As before, $S_n$ is a projection onto $P_n$; in particular, $S_n(1) = 1$ for every $n$.

Theorem. (Christoffel-Darboux) The kernel $K_n(t, x)$ can be written
$$\sum_{k=0}^n \widehat{Q}_k(t)\,\widehat{Q}_k(x) = \lambda_{n+1}^{-1}\lambda_n\;\frac{\widehat{Q}_{n+1}(t)\,\widehat{Q}_n(x) - \widehat{Q}_n(t)\,\widehat{Q}_{n+1}(x)}{t - x}.$$
Proof. We begin with the standard recurrence formulas
$$Q_{n+1}(t) = (t - a_n)\,Q_n(t) - b_n\,Q_{n-1}(t),$$
$$Q_{n+1}(x) = (x - a_n)\,Q_n(x) - b_n\,Q_{n-1}(x)$$
(where $b_0 = 0$). Multiplying the first by $Q_n(x)$, the second by $Q_n(t)$, and subtracting:
$$Q_{n+1}(t)\,Q_n(x) - Q_n(t)\,Q_{n+1}(x) = (t - x)\,Q_n(t)\,Q_n(x) + b_n\,\bigl[\,Q_n(t)\,Q_{n-1}(x) - Q_n(x)\,Q_{n-1}(t)\,\bigr]$$
(and again, $b_0 = 0$). If we divide both sides of this equation by $\lambda_n^{-2}$ (recalling that $b_n\,\lambda_n^2 = \lambda_{n-1}^2$), we get
$$\lambda_n^2\,\bigl[\,Q_{n+1}(t)\,Q_n(x) - Q_n(t)\,Q_{n+1}(x)\,\bigr] = (t - x)\,\widehat{Q}_n(t)\,\widehat{Q}_n(x) + \lambda_{n-1}^2\,\bigl[\,Q_n(t)\,Q_{n-1}(x) - Q_n(x)\,Q_{n-1}(t)\,\bigr].$$
Thus, we may repeat the process, arriving finally at
$$\lambda_n^2\,\bigl[\,Q_{n+1}(t)\,Q_n(x) - Q_n(t)\,Q_{n+1}(x)\,\bigr] = (t - x)\,\sum_{k=0}^n \widehat{Q}_k(t)\,\widehat{Q}_k(x).$$
The Christoffel-Darboux identity now follows by writing $\widehat{Q}_n = \lambda_n Q_n$, etc.
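The identity is easy to verify numerically in the Legendre case. In this sketch (names ours), `lam` is the leading coefficient of the orthonormal Legendre polynomial, computed from the coefficient $2^{-n}\binom{2n}{n}$ of $P_n$ noted in Observation 7; the ratio $\lambda_n/\lambda_{n+1}$ is exactly the factor $\lambda_{n+1}^{-1}\lambda_n$ in the theorem:

```python
from math import sqrt, comb

def legendre_eval(n, x):
    """P_n(x) via the three-term recurrence."""
    if n == 0:
        return 1.0
    p_prev, p = 1.0, x
    for k in range(1, n):
        p_prev, p = p, ((2 * k + 1) * x * p - k * p_prev) / (k + 1)
    return p

def phat(k, x):
    """Orthonormal Legendre polynomial sqrt((2k+1)/2) P_k."""
    return sqrt((2 * k + 1) / 2) * legendre_eval(k, x)

def lam(n):
    """Leading coefficient of phat_n."""
    return sqrt((2 * n + 1) / 2) * comb(2 * n, n) / 2.0 ** n

n, t, x = 4, 0.3, -0.6
lhs = sum(phat(k, t) * phat(k, x) for k in range(n + 1))
rhs = (lam(n) / lam(n + 1)) * (phat(n + 1, t) * phat(n, x)
                               - phat(n, t) * phat(n + 1, x)) / (t - x)
```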
And we now have a version of the Dini-Lipschitz theorem:
Theorem. Let $f \in C[a, b]$ and suppose that at some point $x_0$ in $[a, b]$ we have:
(i) $f$ is Lipschitz at $x_0$; that is, $|f(x_0) - f(x)| \le K\,|x_0 - x|$ for some constant $K$ and all $x$ in $[a, b]$; and
(ii) the sequence $(\widehat{Q}_n(x_0))$ is bounded.
Then, the series $\sum_k \langle f, \widehat{Q}_k \rangle\,\widehat{Q}_k(x_0)$ converges to $f(x_0)$.
Proof. First note that the sequence $\lambda_{n+1}^{-1}\lambda_n$ is bounded: Indeed, by Cauchy-Schwarz,
$$\lambda_{n+1}^{-2} = \langle Q_{n+1}, Q_{n+1} \rangle = \langle Q_{n+1}, x\,Q_n \rangle \le \|Q_{n+1}\|_2\,\|x\|_{\infty}\,\|Q_n\|_2 = \max\{|a|, |b|\}\;\lambda_{n+1}^{-1}\,\lambda_n^{-1}.$$
Thus, $\lambda_{n+1}^{-1}\lambda_n \le c = \max\{|a|, |b|\}$. Now, using the Christoffel-Darboux identity (and the fact that $S_n(1) = 1$),
$$\begin{aligned}
S_n(f)(x_0) - f(x_0) &= \int_a^b \bigl[\,f(t) - f(x_0)\,\bigr]\,K_n(t, x_0)\,w(t)\,dt \\
&= \lambda_{n+1}^{-1}\lambda_n \int_a^b \frac{f(t) - f(x_0)}{t - x_0}\,\bigl[\,\widehat{Q}_{n+1}(t)\,\widehat{Q}_n(x_0) - \widehat{Q}_n(t)\,\widehat{Q}_{n+1}(x_0)\,\bigr]\,w(t)\,dt \\
&= \lambda_{n+1}^{-1}\lambda_n\,\bigl[\,\langle h, \widehat{Q}_{n+1} \rangle\,\widehat{Q}_n(x_0) - \langle h, \widehat{Q}_n \rangle\,\widehat{Q}_{n+1}(x_0)\,\bigr],
\end{aligned}$$
where $h(t) = (f(t) - f(x_0))/(t - x_0)$. But $h$ is bounded (and continuous everywhere except, possibly, at $x_0$) by hypothesis (i), $\lambda_{n+1}^{-1}\lambda_n$ is bounded, and $\widehat{Q}_n(x_0)$ is bounded by hypothesis (ii). All that remains is to notice that the numbers $\langle h, \widehat{Q}_n \rangle$ are the generalized Fourier coefficients of the bounded, Riemann integrable function $h$, and so must tend to zero (since, in fact, they're even square summable).
We end this section with a negative result, due to Nikolaev:

Theorem. There is no weight $w$ such that every $f \in C[a, b]$ has a uniformly convergent expansion in terms of orthogonal polynomials. In fact, given any $w$, there is always some $f$ for which $\|f - S_n(f)\|$ is unbounded.
Problem Set: Orthogonal Polynomials
Math 682 6/17/98

Throughout, $w$ denotes a fixed, positive (except possibly at finitely many points), Riemann integrable weight function on $[a, b]$, and we consider the inner product on $C[a, b]$ defined by
$$\langle f, g \rangle = \int_a^b f(x)\,g(x)\,w(x)\,dx$$
and the associated (strictly convex) norm
$$\|f\|_2 = \langle f, f \rangle^{1/2} = \left( \int_a^b |f(x)|^2\,w(x)\,dx \right)^{1/2}.$$

62. Prove that every inner product norm is strictly convex. Specifically, let $\langle \cdot, \cdot \rangle$ be an inner product on a vector space $X$, and let $\|x\| = \sqrt{\langle x, x \rangle}$ be the associated norm. Show that:
(a) $\|x+y\|^2 + \|x-y\|^2 = 2\,(\|x\|^2 + \|y\|^2)$ for all $x$, $y \in X$ (the parallelogram identity).
(b) If $\|x\| = r = \|y\|$ and if $\|x - y\| = d$, then $\left\|\frac{x+y}{2}\right\|^2 = r^2 - (d/2)^2$. In particular, $\left\|\frac{x+y}{2}\right\| < r$ whenever $x \ne y$.

We define a sequence of polynomials $(Q_n)$, mutually orthogonal relative to $w$, by setting $Q_0(x) = 1$, $Q_1(x) = x - a_0$, and
$$Q_{n+1}(x) = (x - a_n)\,Q_n(x) - b_n\,Q_{n-1}(x) \quad\text{for } n \ge 1,$$
where
$$a_n = \langle x\,Q_n, Q_n \rangle \big/ \langle Q_n, Q_n \rangle \qquad\text{and}\qquad b_n = \langle x\,Q_n, Q_{n-1} \rangle \big/ \langle Q_{n-1}, Q_{n-1} \rangle$$
(and where $x\,Q_n$ is shorthand for the polynomial $x\,Q_n(x)$).

63. Check that $Q_n$ is a monic polynomial of degree exactly $n$.
64. If $(P_n)$ is another sequence of orthogonal polynomials such that $P_n$ has degree exactly $n$, for each $n$, show that $P_n = \lambda_n Q_n$ for some $\lambda_n \ne 0$. In particular, if $P_n$ is a monic polynomial, then $P_n = Q_n$. [Hint: Choose $\lambda_n$ so that $P_n - \lambda_n Q_n \in P_{n-1}$ and note that $(P_n - \lambda_n Q_n) \perp P_{n-1}$. Conclude that $P_n - \lambda_n Q_n = 0$.]
65. Check that $\langle x\,Q_n, Q_{n-1} \rangle = \langle Q_n, Q_n \rangle$, and conclude that $b_n > 0$ for each $n$.
66. Given $f \in C[a, b]$ and $n \ge 0$, prove that $q_n \in P_n$ is the least-squares approximation to $f$ out of $P_n$ (with respect to $w$) if and only if
$$\langle f - q_n, p \rangle = \int_a^b (f(x) - q_n(x))\,p(x)\,w(x)\,dx = 0$$
for every $p \in P_n$; that is, if and only if $(f - q_n) \perp P_n$.
67. If $f \in C[a, b]$ but $f \notin P_n$, show that $f - q_n$ changes sign at $n+1$ (or more) points in $(a, b)$. [Hint: If not, show that there is a polynomial $p \in P_n$ such that $(f - q_n)\,p \ge 0$ (but $(f - q_n)\,p \not\equiv 0$) in $(a, b)$. Now appeal to the result in problem 66 to arrive at a contradiction.]
68. Show that the least-squares approximation to $f(x) = x^n$ out of $P_{n-1}$ (relative to $w$) is $q_{n-1}(x) = x^n - Q_n(x)$.
69. Show that $Q_n$ has $n$ distinct, simple zeros in $(a, b)$. [Hint: Combine 67 and 68.]
70. Given $f \in C[a, b]$, let $p_n$ denote the best uniform approximation to $f$ out of $P_n$ and let $q_n$ denote the least-squares approximation to $f$ out of $P_n$. Show that $\|f - q_n\|_2 \le \|f - p_n\|_2$ and conclude that $\|f - q_n\|_2 \to 0$ as $n \to \infty$.
71. Show that the Chebyshev polynomials of the first kind, $(T_n)$, and of the second kind, $(U_n)$, satisfy the identities
$$T_n(x) = U_n(x) - x\,U_{n-1}(x) \qquad\text{and}\qquad (1 - x^2)\,U_{n-1}(x) = x\,T_n(x) - T_{n+1}(x).$$
72. Show that the Chebyshev polynomials of the second kind, $(U_n)$, satisfy the recurrence relation
$$U_{n+1}(x) = 2x\,U_n(x) - U_{n-1}(x), \qquad n \ge 1,$$
where $U_0(x) = 1$ and $U_1(x) = 2x$. [Please compare this with the recurrence relation satisfied by the $T_n$'s!]
Math 682 6/23/98

Numerical integration, or quadrature, is the process of approximating the value of a definite integral $\int_a^b f(x)\,w(x)\,dx$ based only on a finite number of values or "samples" of $f$ (much like a Riemann sum). A linear quadrature formula takes the form
$$\int_a^b f(x)\,w(x)\,dx \approx \sum_{k=1}^n A_k\,f(x_k),$$
where the nodes $(x_k)$ and the weights $(A_k)$ are at our disposal. (Note that both sides of the formula are linear in $f$.)

Example. Consider the quadrature formula
$$I(f) = \int_{-1}^1 f(x)\,dx \approx \frac{1}{n} \sum_{k=-n}^{n-1} f\Bigl(\frac{2k+1}{2n}\Bigr) = I_n(f).$$
If $f$ is continuous, then we clearly have $I_n(f) \to \int_{-1}^1 f$ as $n \to \infty$. (Why?) But in the particular case $f(x) = x^2$ we have (after some simplification)
$$I_n(f) = \frac{1}{n} \sum_{k=-n}^{n-1} \Bigl(\frac{2k+1}{2n}\Bigr)^2 = \frac{1}{2n^3} \sum_{k=0}^{n-1} (2k+1)^2 = \frac{2}{3} - \frac{1}{6n^2}.$$
That is, $|I_n(f) - I(f)| = 1/6n^2$. In particular, we would need to take $n \ge 130$ to get $1/6n^2 \le 10^{-5}$, for example, and this would require that we perform over 250 evaluations of $f$. We'd like a method that converges a bit faster! In other words, there's no shortage of quadrature formulas; we just want faster ones.
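The example above is just the midpoint rule; a quick sketch reproducing the $2/3 - 1/(6n^2)$ computation (the function name is ours):

```python
def midpoint_rule(f, n):
    """I_n(f) = (1/n) * sum_{k=-n}^{n-1} f((2k+1)/(2n)): the midpoint
    rule on [-1, 1] with 2n equal subintervals."""
    return sum(f((2 * k + 1) / (2 * n)) for k in range(-n, n)) / n
```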
One reasonable requirement for our proposed quadrature formula is that it be exact
for polynomials of low degree. As it happens, this is easy to come by.
Lemma 1. Given $w(x)$ on $[a, b]$ and nodes $a \le x_1 < \cdots < x_n \le b$, there exist unique weights $A_1, \ldots, A_n$ such that
$$\int_a^b p(x)\,w(x)\,dx = \sum_{i=1}^n A_i\,p(x_i)$$
for all polynomials $p \in P_{n-1}$.

Proof. Let $\ell_1, \ldots, \ell_n$ be the Lagrange interpolating polynomials of degree $n-1$ associated to the nodes $x_1, \ldots, x_n$, and recall that we have $p = \sum_{i=1}^n p(x_i)\,\ell_i$ for all $p \in P_{n-1}$. Hence,
$$\int_a^b p(x)\,w(x)\,dx = \sum_{i=1}^n p(x_i) \int_a^b \ell_i(x)\,w(x)\,dx.$$
That is, $A_i = \int_a^b \ell_i(x)\,w(x)\,dx$ works. To see that this is the only choice, suppose that
$$\int_a^b p(x)\,w(x)\,dx = \sum_{i=1}^n B_i\,p(x_i)$$
is exact for all $p \in P_{n-1}$, and set $p = \ell_j$:
$$A_j = \int_a^b \ell_j(x)\,w(x)\,dx = \sum_{i=1}^n B_i\,\ell_j(x_i) = B_j.$$

The point here is that $\ell_1, \ldots, \ell_n$ form a basis for $P_{n-1}$ and integration is linear; thus, integration is completely determined by its action on the basis, that is, by the $n$ values $A_i = I(\ell_i)$, $i = 1, \ldots, n$.
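The weights $A_i = \int \ell_i\,w\,dx$ are easy to compute exactly when $w \equiv 1$ on $[-1, 1]$. The sketch below (helper names ours) recovers Simpson's rule from the nodes $-1, 0, 1$:

```python
from fractions import Fraction

def times_linear(c, r):
    """Multiply the coefficient list c (constant term first) by (x - r)."""
    out = [Fraction(0)] + list(c)
    for k, ck in enumerate(c):
        out[k] -= r * ck
    return out

def lagrange_weights(nodes):
    """A_i = integral over [-1, 1] of the Lagrange polynomial ell_i."""
    ws = []
    for i, xi in enumerate(nodes):
        c, denom = [Fraction(1)], Fraction(1)
        for j, xj in enumerate(nodes):
            if j != i:
                c = times_linear(c, xj)    # numerator of ell_i
                denom *= xi - xj
        integral = sum(Fraction(2, k + 1) * ck
                       for k, ck in enumerate(c) if k % 2 == 0)
        ws.append(integral / denom)
    return ws
```

The resulting weights $1/3, 4/3, 1/3$ are exactly Simpson's rule on $[-1, 1]$, and they sum to $2 = \int_{-1}^1 1\,dx$, as exactness on constants requires.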
Said another way, the $n$ point evaluations $\delta_i(p) = p(x_i)$ satisfy $P_{n-1} \cap \bigl(\bigcap_{i=1}^n \ker\delta_i\bigr) = \{0\}$, and it follows that every linear, real-valued function on $P_{n-1}$ must be a linear combination of the $\delta_i$'s. Here's why: Since the $x_i$'s are distinct, $P_{n-1}$ may be identified with $\mathbb{R}^n$ by way of the isomorphism $p \mapsto (p(x_1), \ldots, p(x_n))$. A linear, real-valued function on $P_{n-1}$ must, then, correspond to some linear, real-valued function on $\mathbb{R}^n$. In other words, it's given by the inner product against some fixed vector $(A_1, \ldots, A_n)$; in particular, we must have $I(p) = \sum_{i=1}^n A_i\,p(x_i)$.
In any case, we now have our quadrature formula: For $f \in C[a, b]$ we define $I_n(f) = \sum_{i=1}^n A_i\,f(x_i)$, where $A_i = \int_a^b \ell_i(x)\,w(x)\,dx$. But notice that the proof of our last result suggests an alternate way of writing our quadrature formula. Indeed, if $L_{n-1}(f)(x) = \sum_{i=1}^n f(x_i)\,\ell_i(x)$ is the Lagrange interpolating polynomial for $f$ of degree $n-1$ based on the nodes $x_1, \ldots, x_n$, then
$$\int_a^b (L_{n-1}(f))(x)\,w(x)\,dx = \sum_{i=1}^n f(x_i) \int_a^b \ell_i(x)\,w(x)\,dx = \sum_{i=1}^n A_i\,f(x_i).$$
In summary, $I_n(f) = I(L_{n-1}(f)) \approx I(f)$; that is,
$$I_n(f) = \sum_{i=1}^n A_i\,f(x_i) = \int_a^b (L_{n-1}(f))(x)\,w(x)\,dx \approx \int_a^b f(x)\,w(x)\,dx = I(f),$$
where $L_{n-1}(f)$ is the Lagrange interpolating polynomial for $f$ of degree $n-1$ based on the nodes $x_1, \ldots, x_n$. This formula is obviously exact for $f \in P_{n-1}$.
It's easy to give a bound on $|I_n(f)|$ in terms of $\|f\|$; indeed,
$$|I_n(f)| \le \sum_{i=1}^n |A_i|\,|f(x_i)| \le \|f\| \left(\sum_{i=1}^n |A_i|\right).$$
By considering a norm one continuous function $f$ satisfying $f(x_i) = \operatorname{sgn} A_i$ for each $i = 1, \ldots, n$, it's easy to see that $\sum_{i=1}^n |A_i|$ is the smallest constant that works in this inequality. In other words, $\Lambda_n = \sum_{i=1}^n |A_i|$, $n = 1, 2, \ldots$, are the "Lebesgue numbers" for this process. As with all previous settings, we want these numbers to be uniformly bounded.

If $w(x) \equiv 1$ and if $f$ is $n$-times continuously differentiable, we even have an error estimate for our quadrature formula:
$$\left|\int_a^b f - \int_a^b L_{n-1}(f)\right| \le \int_a^b |f - L_{n-1}(f)| \le \frac{1}{n!}\,\|f^{(n)}\| \int_a^b \prod_{i=1}^n |x - x_i|\,dx$$
(recall the Theorem on page 72 of "A Brief Introduction to Interpolation"). As it happens, the integral on the right is minimized when the $x_i$'s are taken to be the zeros of the Chebyshev polynomial $U_n$ (see Rivlin, page 72).
The fact that a quadrature formula is exact for polynomials of low degree does not by itself guarantee that the formula is highly accurate. The problem is that $\sum_{i=1}^n A_i\,f(x_i)$ may be estimating a very small quantity through the cancellation of very large quantities. So, for example, a positive function may yield a negative result in this approximate integral. This wouldn't happen if the $A_i$'s were all positive, and we've already seen how useful positivity can be. Our goal here is to further improve our quadrature formula to have this property. But we have yet to take advantage of the fact that the $x_i$'s are at our disposal. We'll let Gauss show us the way!
Theorem. (Gauss) Fix a weight $w(x)$ on $[a, b]$, and let $(Q_n)$ be the canonical sequence of orthogonal polynomials relative to $w$. Given $n$, let $x_1, \ldots, x_n$ be the zeros of $Q_n$ (these all lie in $(a, b)$), and choose $A_1, \ldots, A_n$ so that the formula $\sum_{i=1}^n A_i\,f(x_i) \approx \int_a^b f(x)\,w(x)\,dx$ is exact for polynomials of degree less than $n$. Then, in fact, the formula is exact for all polynomials of degree less than $2n$.

Proof. Given a polynomial $P$ of degree less than $2n$, we may divide: $P = Q_n R + S$, where $R$ and $S$ are polynomials of degree less than $n$. Then,
$$\begin{aligned}
\int_a^b P(x)\,w(x)\,dx &= \int_a^b Q_n(x)\,R(x)\,w(x)\,dx + \int_a^b S(x)\,w(x)\,dx \\
&= \int_a^b S(x)\,w(x)\,dx \qquad\text{since } \deg R < n \\
&= \sum_{i=1}^n A_i\,S(x_i) \qquad\text{since } \deg S < n.
\end{aligned}$$
But $P(x_i) = Q_n(x_i)\,R(x_i) + S(x_i) = S(x_i)$, since $Q_n(x_i) = 0$. Hence, $\int_a^b P(x)\,w(x)\,dx = \sum_{i=1}^n A_i\,P(x_i)$ for all polynomials $P$ of degree less than $2n$.

Amazing! But, well, not really: $P_{2n-1}$ is of dimension $2n$, and we had $2n$ numbers $x_1, \ldots, x_n$ and $A_1, \ldots, A_n$ to choose as we saw fit. Said another way, the division algorithm tells us that $P_{2n-1} = Q_n P_{n-1} \oplus P_{n-1}$. Since $Q_n P_{n-1} \subset \ker(I_n)$, the action of $I_n$ on $P_{2n-1}$ is the same as its action on a "copy" of $P_{n-1}$.

In still other words, since any polynomial that vanishes at all the $x_i$'s must be divisible by $Q_n$ (and conversely), we have $Q_n P_{n-1} = P_{2n-1} \cap \bigl(\bigcap_{i=1}^n \ker\delta_i\bigr) = \ker(I_n|_{P_{2n-1}})$. Thus, $I_n$ "factors through" the quotient space $P_{2n-1}/Q_n P_{n-1} \cong P_{n-1}$.
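For a concrete instance, take $w \equiv 1$ on $[-1, 1]$ and $n = 3$: the nodes are the zeros of $P_3(x) = (5x^3 - 3x)/2$, namely $0$ and $\pm\sqrt{3/5}$, and the classical weights are $8/9$ and $5/9$. By Gauss's theorem the rule is exact on $P_5$ but, as the sketch shows, not on $P_6$:

```python
from math import sqrt

gauss_nodes = [-sqrt(3 / 5), 0.0, sqrt(3 / 5)]   # zeros of P_3
gauss_weights = [5 / 9, 8 / 9, 5 / 9]            # classical 3-point weights

def gauss3(f):
    """Three-point Gauss-Legendre approximation to the integral of f
    over [-1, 1]."""
    return sum(A * f(x) for A, x in zip(gauss_weights, gauss_nodes))
```

Note that the weights are positive and sum to $2 = \int_{-1}^1 1\,dx$, anticipating Lemma 3 below.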
Also not surprising is that this particular choice of $x_i$'s is unique.

Lemma 2. Suppose that $a \le x_1 < \cdots < x_n \le b$ and $A_1, \ldots, A_n$ are given so that the equation $\int_a^b P(x)\,w(x)\,dx = \sum_{i=1}^n A_i\,P(x_i)$ is satisfied for all polynomials $P$ of degree less than $2n$. Then, $x_1, \ldots, x_n$ are the zeros of $Q_n$.

Proof. Let $Q(x) = \prod_{i=1}^n (x - x_i)$. Then, for $k < n$, the polynomial $Q\,Q_k$ has degree $n + k < 2n$. Hence,
$$\int_a^b Q(x)\,Q_k(x)\,w(x)\,dx = \sum_{i=1}^n A_i\,Q(x_i)\,Q_k(x_i) = 0.$$
Since $Q$ is a monic polynomial of degree $n$ which is orthogonal to each $Q_k$, $k < n$, we must have $Q = Q_n$. Thus, the $x_i$'s are actually the zeros of $Q_n$.

According to Rivlin, the phrase Gaussian quadrature is usually reserved for the specific quadrature formula whereby $\int_{-1}^1 f(x)\,dx$ is approximated by $\int_{-1}^1 (L_{n-1}(f))(x)\,dx$, where $L_{n-1}(f)$ is the Lagrange interpolating polynomial to $f$ using the zeros of the $n$-th Legendre polynomial as nodes. (What a mouthful!) What is actually being described in our version of Gauss's theorem is Gaussian-type quadrature.

Before computers, Gaussian quadrature was little more than a curiosity; the roots of $Q_n$ are typically irrational, and certainly not easy to come by. By now, though, it's considered a standard quadrature technique. In any case, we still can't judge the quality of Gauss's method without a bit more information.

First, let's summarize our rather cumbersome notation.

    orthogonal polynomial    zeros                                  weights                                approximate integral
    $Q_1$                    $x^{(1)}_1$                            $A^{(1)}_1$                            $I_1$
    $Q_2$                    $x^{(2)}_1,\ x^{(2)}_2$                $A^{(2)}_1,\ A^{(2)}_2$                $I_2$
    $Q_3$                    $x^{(3)}_1,\ x^{(3)}_2,\ x^{(3)}_3$    $A^{(3)}_1,\ A^{(3)}_2,\ A^{(3)}_3$    $I_3$
    ...                      ...                                    ...                                    ...

Hidden here is the Lagrange interpolation formula $L_{n-1}(f) = \sum_{i=1}^n f(x^{(n)}_i)\,\ell^{(n-1)}_i$, where $\ell^{(n-1)}_1, \ldots, \ell^{(n-1)}_n$ denote the Lagrange polynomials of degree $n-1$ based on $x^{(n)}_1, \ldots, x^{(n)}_n$. The $n$-th quadrature formula is then
$$I_n(f) = \int_a^b L_{n-1}(f)(x)\,w(x)\,dx = \sum_{i=1}^n A^{(n)}_i\,f(x^{(n)}_i) \approx \int_a^b f(x)\,w(x)\,dx,$$
which is exact for polynomials of degree less than $2n$.
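As a quick sanity check, the exactness claim is easy to test numerically. A minimal sketch (assuming NumPy, whose `leggauss` returns the zeros of the $n$-th Legendre polynomial and the corresponding weights for $w \equiv 1$ on $[-1,1]$):

```python
import numpy as np
from numpy.polynomial.legendre import leggauss

n = 5
nodes, weights = leggauss(n)   # zeros of the 5th Legendre polynomial, weights A_i

# A random polynomial of degree 2n - 1 = 9, which I_n should integrate exactly.
rng = np.random.default_rng(0)
p = np.polynomial.Polynomial(rng.standard_normal(2 * n))

exact = p.integ()(1.0) - p.integ()(-1.0)   # true value of the integral over [-1, 1]
approx = weights @ p(nodes)                # the quadrature sum I_n(p)
print(abs(exact - approx))                 # agreement to machine precision
```

The same five nodes and weights integrate every polynomial of degree at most nine exactly, which is the $2n$ "degrees of freedom" count from Gauss's theorem.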
By way of one example, Hermite showed that $A^{(n)}_k = \pi/n$ for the Chebyshev weight $w(x) = (1-x^2)^{-1/2}$ on $[-1,1]$. Remarkably, $A^{(n)}_k$ doesn't depend on $k$! The quadrature formula in this case reads:
$$\int_{-1}^1 \frac{f(x)}{\sqrt{1-x^2}}\,dx \approx \frac{\pi}{n} \sum_{k=1}^n f\Bigl(\cos\frac{(2k-1)\pi}{2n}\Bigr).$$
Or, if you prefer,
$$\int_{-1}^1 f(x)\,dx \approx \frac{\pi}{n} \sum_{k=1}^n f\Bigl(\cos\frac{(2k-1)\pi}{2n}\Bigr)\,\sin\frac{(2k-1)\pi}{2n}.$$
(Why?) You can find full details in Natanson's Constructive Function Theory, Vol. III.
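Hermite's formula is simple enough to test directly. A short sketch (assuming NumPy) using the equal weights $\pi/n$ at the zeros of the $n$-th Chebyshev polynomial:

```python
import numpy as np

def chebyshev_gauss(f, n):
    # Quadrature for the weight (1 - x^2)^{-1/2}: equal weights pi/n
    # at the zeros cos((2k-1)pi/2n) of the n-th Chebyshev polynomial.
    k = np.arange(1, n + 1)
    nodes = np.cos((2 * k - 1) * np.pi / (2 * n))
    return (np.pi / n) * np.sum(f(nodes))

# The integral of x^2 / sqrt(1 - x^2) over [-1, 1] is pi/2, and n = 4
# is exact for any polynomial of degree < 8.
print(chebyshev_gauss(lambda x: x**2, 4))   # ~ pi/2
```

With only four evaluations of $f$ the rule already reproduces every polynomial of degree up to seven exactly.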
The key result, due to Stieltjes, is that $I_n$ is positive:

Lemma 3. $A^{(n)}_1, \ldots, A^{(n)}_n > 0$ and $\sum_{i=1}^n A^{(n)}_i = \int_a^b w(x)\,dx$.

Proof. The second assertion is obvious (since $I_n(1) = I(1)$). For the first, fix $1 \le j \le n$ and notice that $(\ell^{(n-1)}_j)^2$ is of degree $2(n-1) < 2n$. Thus,
$$0 < \bigl\langle \ell^{(n-1)}_j, \ell^{(n-1)}_j \bigr\rangle = \int_a^b \bigl[\ell^{(n-1)}_j(x)\bigr]^2\,w(x)\,dx = \sum_{i=1}^n A^{(n)}_i \bigl[\ell^{(n-1)}_j(x^{(n)}_i)\bigr]^2 = A^{(n)}_j,$$
because $\ell^{(n-1)}_j(x^{(n)}_i) = \delta_{ij}$. $\square$
Now our last calculation is quite curious; what we've shown is that
$$A^{(n)}_j = \int_a^b \ell^{(n-1)}_j(x)\,w(x)\,dx = \int_a^b \bigl[\ell^{(n-1)}_j(x)\bigr]^2\,w(x)\,dx.$$
The same calculation as above also proves

Corollary. $\langle \ell^{(n-1)}_i, \ell^{(n-1)}_j \rangle = 0$ for $i \ne j$.

Since $A^{(n)}_1, \ldots, A^{(n)}_n > 0$, it follows that $I_n$ is positive; that is, $I_n(f) \ge 0$ whenever $f \ge 0$. The second assertion in Lemma 3 tells us that the $I_n$'s are uniformly bounded:
$$|I_n(f)| \le \|f\| \sum_{i=1}^n A^{(n)}_i = \|f\| \int_a^b w(x)\,dx,$$
and this is the same bound that holds for $I(f) = \int_a^b f(x)\,w(x)\,dx$ itself. Given all of this, proving that $I_n(f) \to I(f)$ is a piece of cake. The following result is again due to Stieltjes (a la Lebesgue).
Theorem. In the above notation, $|I_n(f) - I(f)| \le 2\,E_{2n-1}(f)\int_a^b w(x)\,dx$. In particular, $I_n(f) \to I(f)$ for every $f \in C[a,b]$.

Proof. Let $p^*$ be the best uniform approximation to $f$ out of $\mathcal{P}_{2n-1}$. Then, since $I_n(p^*) = I(p^*)$, we have
$$|I(f) - I_n(f)| \le |I(f - p^*)| + |I_n(f - p^*)|$$
$$\le \|f - p^*\| \int_a^b w(x)\,dx + \|f - p^*\| \sum_{i=1}^n A^{(n)}_i$$
$$= 2\,\|f - p^*\| \int_a^b w(x)\,dx = 2\,E_{2n-1}(f) \int_a^b w(x)\,dx. \qquad \square$$
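For a concrete illustration of this convergence, here is a small numerical sketch (assuming NumPy; Gauss–Legendre, i.e. $w \equiv 1$ on $[-1,1]$, applied to the non-polynomial $f(x) = |x|$, whose exact integral is $1$):

```python
import numpy as np
from numpy.polynomial.legendre import leggauss

f = np.abs          # continuous on [-1, 1], but certainly not a polynomial
exact = 1.0         # integral of |x| over [-1, 1]

errors = []
for n in (2, 4, 8, 16, 32):
    x, w = leggauss(n)
    errors.append(abs(w @ f(x) - exact))
print(errors)       # the errors shrink as n grows, as the theorem predicts
```

The error at each $n$ is controlled by $E_{2n-1}(f)$, so the kink of $|x|$ at the origin slows the convergence but cannot stop it.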

Computational Considerations

You've probably been asking yourself: "How do I find the $A_i$'s without integrating?" Well, first let's recall the definition: In the case of Gaussian-type quadrature we have
$$A^{(n)}_i = \int_a^b \ell^{(n-1)}_i(x)\,w(x)\,dx = \int_a^b \frac{Q_n(x)}{(x - x^{(n)}_i)\,Q_n'(x^{(n)}_i)}\,w(x)\,dx$$
(because the node polynomial "$W$" is the same as $Q_n$ here; the $x_i$'s are the zeros of $Q_n$). Next, consider the function
$$\varphi_n(x) = \int_a^b \frac{Q_n(t) - Q_n(x)}{t - x}\,w(t)\,dt.$$
Since $t - x$ divides $Q_n(t) - Q_n(x)$, note that $\varphi_n$ is actually a polynomial (of degree at most $n-1$) and that
$$\varphi_n(x^{(n)}_i) = \int_a^b \frac{Q_n(t)}{t - x^{(n)}_i}\,w(t)\,dt = A^{(n)}_i\,Q_n'(x^{(n)}_i).$$
Now $Q_n'(x^{(n)}_i)$ is readily available; we just need to compute $\varphi_n(x^{(n)}_i)$.

Claim. The $\varphi_n$'s satisfy the same recurrence formula as the $Q_n$'s,
$$\varphi_{n+1}(x) = (x - a_n)\,\varphi_n(x) - b_n\,\varphi_{n-1}(x), \qquad n \ge 1,$$
but with different starting values:
$$\varphi_0(x) \equiv 0 \qquad\text{and}\qquad \varphi_1(x) \equiv \int_a^b w(x)\,dx.$$

Proof. The formulas for $\varphi_0$ and $\varphi_1$ are obviously correct, since $Q_0(x) \equiv 1$ and $Q_1(x) = x - a_0$. We only need to check the recurrence formula itself:
$$\varphi_{n+1}(x) = \int_a^b \frac{Q_{n+1}(t) - Q_{n+1}(x)}{t - x}\,w(t)\,dt$$
$$= \int_a^b \frac{(t - a_n)\,Q_n(t) - b_n\,Q_{n-1}(t) - (x - a_n)\,Q_n(x) + b_n\,Q_{n-1}(x)}{t - x}\,w(t)\,dt$$
$$= (x - a_n)\int_a^b \frac{Q_n(t) - Q_n(x)}{t - x}\,w(t)\,dt - b_n \int_a^b \frac{Q_{n-1}(t) - Q_{n-1}(x)}{t - x}\,w(t)\,dt$$
$$= (x - a_n)\,\varphi_n(x) - b_n\,\varphi_{n-1}(x),$$
since $(t - a_n)\,Q_n(t) - (x - a_n)\,Q_n(x) = (t - x)\,Q_n(t) + (x - a_n)\bigl(Q_n(t) - Q_n(x)\bigr)$ and $\int_a^b Q_n(t)\,w(t)\,dt = 0$. $\square$
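Putting the claim to work: the sketch below (assuming NumPy) computes the weights as $A^{(n)}_i = \varphi_n(x^{(n)}_i)/Q_n'(x^{(n)}_i)$ for the Legendre case $w \equiv 1$ on $[-1,1]$, running the recurrences for $Q_n$, $Q_n'$, and $\varphi_n$ side by side. The monic Legendre coefficients $a_k = 0$, $b_k = k^2/(4k^2-1)$, and the starting value $\varphi_1 \equiv \int w = 2$ are standard facts assumed here; the nodes themselves are taken from NumPy for convenience.

```python
import numpy as np
from numpy.polynomial.legendre import leggauss

def christoffel_numbers(n):
    # Nodes (zeros of Q_n) are taken from NumPy; the point here is the weights.
    t, _ = leggauss(n)
    Q_prev, Q = np.ones_like(t), t.copy()                 # Q_0 = 1, Q_1 = x (a_0 = 0)
    dQ_prev, dQ = np.zeros_like(t), np.ones_like(t)       # Q_0', Q_1'
    ph_prev, ph = np.zeros_like(t), np.full_like(t, 2.0)  # phi_0 = 0, phi_1 = 2
    for k in range(1, n):
        b = k * k / (4.0 * k * k - 1.0)    # monic Legendre: a_k = 0, b_k = k^2/(4k^2-1)
        Q_next = t * Q - b * Q_prev
        dQ_next = Q + t * dQ - b * dQ_prev   # Q_{k+1}' = Q_k + x Q_k' - b_k Q_{k-1}'
        ph_next = t * ph - b * ph_prev       # phi satisfies the same recurrence
        Q_prev, Q = Q, Q_next
        dQ_prev, dQ = dQ, dQ_next
        ph_prev, ph = ph, ph_next
    return t, ph / dQ                      # A_i = phi_n(x_i) / Q_n'(x_i)

nodes, weights = christoffel_numbers(6)
_, ref = leggauss(6)
print(np.max(np.abs(weights - ref)))       # essentially zero
```

The computed weights agree with NumPy's tabulated Gauss–Legendre weights to machine precision, and they sum to $\int_{-1}^1 w = 2$ as Lemma 3 requires.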
Of course, the derivatives $Q_n'$ satisfy a recurrence relation of sorts, too:
$$Q_{n+1}'(x) = Q_n(x) + (x - a_n)\,Q_n'(x) - b_n\,Q_{n-1}'(x).$$
But $Q_n'(x^{(n)}_i)$ can be computed without knowing $Q_n'(x)$. Indeed, $Q_n(x) = \prod_{i=1}^n (x - x^{(n)}_i)$, so we have $Q_n'(x^{(n)}_i) = \prod_{j \ne i} (x^{(n)}_i - x^{(n)}_j)$.
The weights $A^{(n)}_i$, or Christoffel numbers, together with the zeros of $Q_n$, are tabulated in a variety of standard cases. See, for example, Handbook of Mathematical Functions with Formulas, Graphs, and Tables, by Abramowitz and Stegun, eds. In practice, of course, it's enough to tabulate data for the case $[a,b] = [-1,1]$.
Applications to Interpolation

Although $L_n(f)$ isn't typically a good uniform approximation to $f$, if we interpolate at the zeros of an orthogonal polynomial $Q_{n+1}$, then $L_n(f)$ will be a good approximation in the $\|\cdot\|_1$ or $\|\cdot\|_2$ norm generated by the corresponding weight $w$. Specifically, by rewording our earlier results, it's easy to get estimates for each of the errors $\int_a^b |f - L_n(f)|\,w$ and $\int_a^b |f - L_n(f)|^2\,w$. We use essentially the same notation as before, except now we take
$$L_n(f) = \sum_{i=1}^{n+1} f\bigl(x^{(n+1)}_i\bigr)\,\ell^{(n)}_i,$$
where $x^{(n+1)}_1, \ldots, x^{(n+1)}_{n+1}$ are the roots of $Q_{n+1}$ and $\ell^{(n)}_i$ is of degree $n$. This leads to a quadrature formula that's exact on polynomials of degree less than $2(n+1)$.
As we've already seen, $\ell^{(n)}_1, \ldots, \ell^{(n)}_{n+1}$ are orthogonal, and so $\|L_n(f)\|_2$ may be computed exactly.

Lemma. $\|L_n(f)\|_2 \le \|f\| \bigl(\int_a^b w(x)\,dx\bigr)^{1/2}$.

Proof. Since $L_n(f)^2$ is a polynomial of degree $2n < 2(n+1)$, we have
$$\|L_n(f)\|_2^2 = \int_a^b [L_n(f)]^2\,w(x)\,dx = \sum_{j=1}^{n+1} A^{(n+1)}_j \Bigl[\sum_{i=1}^{n+1} f\bigl(x^{(n+1)}_i\bigr)\,\ell^{(n)}_i\bigl(x^{(n+1)}_j\bigr)\Bigr]^2$$
$$= \sum_{j=1}^{n+1} A^{(n+1)}_j \bigl[f\bigl(x^{(n+1)}_j\bigr)\bigr]^2 \le \|f\|^2 \sum_{j=1}^{n+1} A^{(n+1)}_j = \|f\|^2 \int_a^b w(x)\,dx. \qquad \square$$
Please note that we also have $\|f\|_2 \le \|f\|\bigl(\int_a^b w(x)\,dx\bigr)^{1/2}$; that is, this same estimate holds for $\|f\|_2$ itself.
As usual, once we have an estimate for the norm of an operator, we also have an analogue of Lebesgue's theorem.

Theorem. $\|f - L_n(f)\|_2 \le 2\bigl(\int_a^b w(x)\,dx\bigr)^{1/2} E_n(f)$.

Proof. Here we go again! Let $p^*$ be the best uniform approximation to $f$ out of $\mathcal{P}_n$ and use the fact that $L_n(p^*) = p^*$ to see that
$$\|f - L_n(f)\|_2 \le \|f - p^*\|_2 + \|L_n(f - p^*)\|_2$$
$$\le \|f - p^*\|\Bigl(\int_a^b w(x)\,dx\Bigr)^{1/2} + \|f - p^*\|\Bigl(\int_a^b w(x)\,dx\Bigr)^{1/2} = 2\,E_n(f)\Bigl(\int_a^b w(x)\,dx\Bigr)^{1/2}. \qquad \square$$
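Here is a small numerical illustration of this (assuming NumPy): the Legendre case $w \equiv 1$ on $[-1,1]$, interpolating $f = e^x$ at the zeros of $Q_{n+1}$, with the $L^2$ error estimated crudely on a fine grid.

```python
import numpy as np
from numpy.polynomial.legendre import leggauss

f = np.exp
t = np.linspace(-1.0, 1.0, 4001)       # fine grid for a crude L2 norm (w = 1 here)
dt = t[1] - t[0]

errs = []
for n in (2, 4, 8):
    x, _ = leggauss(n + 1)             # interpolate at the zeros of Q_{n+1}
    c = np.polyfit(x, f(x), n)         # coefficients of L_n(f)
    err2 = np.sum((f(t) - np.polyval(c, t)) ** 2) * dt
    errs.append(float(np.sqrt(err2)))
print(errs)   # the L2 error drops rapidly with n
```

Since $e^x$ is entire, $E_n(f)$ decays faster than any power of $n$ here, and the $\|\cdot\|_2$ errors shrink accordingly.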

Hence, if we interpolate $f \in C[a,b]$ at the zeros of the $Q_n$'s, then $L_n(f) \to f$ in $\|\cdot\|_2$ norm. The analogous result for the $\|\cdot\|_1$ norm is now easy:

Corollary. $\int_a^b |f(x) - L_n(f)(x)|\,w(x)\,dx \le 2 \int_a^b w(x)\,dx \; E_n(f)$.
Proof. We apply the Cauchy-Schwarz inequality:
$$\int_a^b |f(x) - L_n(f)(x)|\,w(x)\,dx = \int_a^b |f(x) - L_n(f)(x)|\,\sqrt{w(x)}\,\sqrt{w(x)}\,dx$$
$$\le \Bigl(\int_a^b |f(x) - L_n(f)(x)|^2\,w(x)\,dx\Bigr)^{1/2}\Bigl(\int_a^b w(x)\,dx\Bigr)^{1/2} \le 2\,E_n(f)\int_a^b w(x)\,dx. \qquad \square$$
Essentially the same device allows an estimate of $\int_a^b f(x)\,dx$ in terms of $\int_a^b f(x)\,w(x)\,dx$ (which may be easier to compute).

Corollary. If $\int_a^b w(x)^{-1}\,dx$ is finite, then
$$\int_a^b |f(x) - L_n(f)(x)|\,dx = \int_a^b |f(x) - L_n(f)(x)|\,\sqrt{w(x)}\,\frac{1}{\sqrt{w(x)}}\,dx$$
$$\le \Bigl(\int_a^b |f(x) - L_n(f)(x)|^2\,w(x)\,dx\Bigr)^{1/2}\Bigl(\int_a^b \frac{dx}{w(x)}\Bigr)^{1/2}$$
$$\le 2\,E_n(f)\Bigl(\int_a^b w(x)\,dx\Bigr)^{1/2}\Bigl(\int_a^b \frac{dx}{w(x)}\Bigr)^{1/2}.$$
In particular, the Chebyshev weight satisfies
$$\int_{-1}^1 \frac{dx}{\sqrt{1-x^2}} = \pi \qquad\text{and}\qquad \int_{-1}^1 \sqrt{1-x^2}\,dx = \frac{\pi}{2}.$$
Thus, interpolation at the zeros of the Chebyshev polynomials (of the first kind) would provide good, simultaneous approximation in each of the norms $\|\cdot\|_1$, $\|\cdot\|_2$, and $\|\cdot\|$.
The Moment Problem

Given a positive, continuous weight function $w(x)$ on $[a,b]$, the number
$$\mu_k = \int_a^b x^k\,w(x)\,dx$$
is called the $k$-th moment of $w$. In physical terms, if we think of $w(x)$ as the density of a thin rod placed on the interval $[a,b]$, then $\mu_0$ is the mass of the rod, $\mu_1/\mu_0$ is its center of mass, $\mu_2$ is its moment of inertia (about $0$), and so on. In probabilistic terms, if $\mu_0 = 1$, then $w$ is the probability density function for some random variable, $\mu_1$ is the expected value, or mean, of this random variable, and $\mu_2 - \mu_1^2$ is its variance. The moment problem (or problems, really) concerns the inverse procedure. What can be measured in real life are the moments; can the moments be used to find the density function?
Questions: Do the moments determine $w$? Do different weights have different moment sequences? If we knew the sequence $(\mu_k)$, could we find $w$? How do we tell if a given sequence $(\mu_k)$ is the moment sequence for some positive weight? Do "special" weights give rise to "special" moment sequences?
Now we've already answered one of these questions: The Weierstrass theorem tells us that different weights have different moment sequences. Said another way, if
$$\int_a^b x^k\,w(x)\,dx = 0 \qquad\text{for all } k = 0, 1, 2, \ldots,$$
then $w \equiv 0$. Indeed, by linearity, this says that $\int_a^b p(x)\,w(x)\,dx = 0$ for all polynomials $p$, which, in turn, tells us that $\int_a^b w(x)^2\,dx = 0$. (Why?) The remaining questions are harder to answer. We'll settle for simply stating a few pertinent results.
Given a sequence of numbers $(\mu_k)$, we define the $n$-th difference sequence $(\Delta^n \mu_k)$ by
$$\Delta^0 \mu_k = \mu_k, \qquad \Delta^1 \mu_k = \mu_k - \mu_{k+1}, \qquad \Delta^n \mu_k = \Delta^{n-1}\mu_k - \Delta^{n-1}\mu_{k+1}, \quad n \ge 1.$$
For example, $\Delta^2 \mu_k = \mu_k - 2\mu_{k+1} + \mu_{k+2}$. More generally, induction will show that
$$\Delta^n \mu_k = \sum_{i=0}^n (-1)^i \binom{n}{i}\,\mu_{k+i}.$$
In the case of a weight $w$ on the interval $[0,1]$, this sum is easy to recognize as an integral. Indeed,
$$\int_0^1 x^k (1-x)^n\,w(x)\,dx = \sum_{i=0}^n (-1)^i \binom{n}{i} \int_0^1 x^{k+i}\,w(x)\,dx = \sum_{i=0}^n (-1)^i \binom{n}{i}\,\mu_{k+i} = \Delta^n \mu_k.$$
In particular, if $w$ is nonnegative, then we must have $\Delta^n \mu_k \ge 0$ for every $n$ and $k$. This observation serves as motivation for
Theorem. The following are equivalent:
(a) $(\mu_k)$ is the moment sequence of some nonnegative weight function $w$ on $[0,1]$.
(b) $\Delta^n \mu_k \ge 0$ for every $n$ and $k$.
(c) $a_0\mu_0 + a_1\mu_1 + \cdots + a_n\mu_n \ge 0$ whenever $a_0 + a_1 x + \cdots + a_n x^n \ge 0$ for all $0 \le x \le 1$.

The equivalence of (a) and (b) is due to Hausdorff. A real sequence satisfying (b) or (c) is sometimes said to be positive definite.
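As a sanity check on (a) implies (b), take $w \equiv 1$ on $[0,1]$, so that $\mu_k = 1/(k+1)$. The integral formula above then gives $\Delta^n\mu_k = \int_0^1 x^k(1-x)^n\,dx = k!\,n!/(k+n+1)!$, a Beta integral, so every difference is strictly positive. A short sketch confirming this:

```python
from math import factorial

mu = [1.0 / (k + 1) for k in range(12)]    # moments of w = 1 on [0, 1]

seq = mu
for n in range(5):
    # every n-th difference should equal the Beta integral k! n! / (k+n+1)!
    for k, v in enumerate(seq):
        beta = factorial(k) * factorial(n) / factorial(k + n + 1)
        assert abs(v - beta) < 1e-12 and v > 0
    seq = [a - b for a, b in zip(seq, seq[1:])]   # next difference sequence
print("all differences positive, matching k! n! / (k+n+1)!")
```

Each round of differencing shortens the sequence by one, which is why a finite list of moments only lets us test finitely many of Hausdorff's conditions.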
Now dozens of mathematicians worked on various aspects of the moment problem: Chebyshev, Markov, Stieltjes, Cauchy, Riesz, Fréchet, and on and on. And several of them, in particular Cauchy and Stieltjes, noticed the importance of the integral $\int_a^b \frac{w(t)}{x-t}\,dt$ in attacking the problem. (Compare this expression to Cauchy's integral formula.) It was Stieltjes, however, who gave the first complete solution to such a problem, developing his own integral (by considering $\int_a^b \frac{dW(t)}{x-t}$), his own variety of continued fractions, and planting the seeds for the study of orthogonal polynomials while he was at it! We will attempt to at least sketch a few of these connections.
To begin, let's fix our notation: To simplify things, we suppose that we're given a nonnegative weight $w(x)$ on a symmetric interval $[-a,a]$, and that all of the moments of $w$ are finite. We will otherwise stick to our usual notations for $(Q_n)$, the Gaussian-type quadrature formulas, and so on. Next, we consider the moment-generating function:

Lemma. If $x \notin [-a,a]$, then $\displaystyle\int_{-a}^a \frac{w(t)}{x-t}\,dt = \sum_{k=0}^\infty \frac{\mu_k}{x^{k+1}}$.

Proof. $\dfrac{1}{x-t} = \dfrac{1}{x}\cdot\dfrac{1}{1 - (t/x)} = \displaystyle\sum_{k=0}^\infty \frac{t^k}{x^{k+1}}$, and the sum converges uniformly because $|t/x| \le a/|x| < 1$. Now just multiply by $w(t)$ and integrate. $\square$
By way of an example, consider the Chebyshev weight $w(x) = (1-x^2)^{-1/2}$ on $[-1,1]$. For $x > 1$ we have
$$\int_{-1}^1 \frac{dt}{(x-t)\sqrt{1-t^2}} = \frac{\pi}{\sqrt{x^2-1}} \qquad \bigl(\text{set } t = 2u/(1+u^2)\bigr)$$
$$= \frac{\pi}{x}\Bigl(1 - \frac{1}{x^2}\Bigr)^{-1/2} = \frac{\pi}{x}\Bigl(1 + \frac{1}{2}\,\frac{1}{x^2} + \frac{1\cdot 3}{2^2\,2!}\,\frac{1}{x^4} + \cdots\Bigr)$$
using the binomial formula. Thus, we've found all the moments:
$$\mu_0 = \int_{-1}^1 \frac{dt}{\sqrt{1-t^2}} = \pi,$$
$$\mu_{2n-1} = \int_{-1}^1 \frac{t^{2n-1}\,dt}{\sqrt{1-t^2}} = 0,$$
$$\mu_{2n} = \int_{-1}^1 \frac{t^{2n}\,dt}{\sqrt{1-t^2}} = \pi\,\frac{1\cdot 3\cdot 5\cdots(2n-1)}{2^n\,n!}.$$
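These moment formulas mesh nicely with the quadrature rule for the Chebyshev weight given earlier: with $m$ nodes that rule is exact for polynomials of degree less than $2m$, so it must reproduce $\mu_{2n}$ exactly whenever $2n < 2m$. A quick check (assuming NumPy):

```python
import numpy as np
from math import factorial, pi

def odd_double_factorial(m):        # 1 * 3 * 5 * ... * m, with the convention (-1)!! = 1
    out = 1
    while m > 1:
        out *= m
        m -= 2
    return out

m = 20
k = np.arange(1, m + 1)
nodes = np.cos((2 * k - 1) * pi / (2 * m))     # zeros of the 20th Chebyshev polynomial

for n in range(6):
    mu = (pi / m) * np.sum(nodes ** (2 * n))   # quadrature value of mu_{2n}
    closed = pi * odd_double_factorial(2 * n - 1) / (2 ** n * factorial(n))
    assert abs(mu - closed) < 1e-12
print("quadrature reproduces the closed-form moments")
```

For instance, $n = 1$ recovers $\mu_2 = \pi/2$, the same value used in the Chebyshev quadrature example above.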
Stieltjes proved much more: The integral $\int_{-a}^a \frac{w(t)}{x-t}\,dt$ is actually an analytic function of $x$ in $\mathbb{C} \setminus [-a,a]$. In any case, since $x \notin [-a,a]$, we know that $\frac{1}{x-t}$ is continuous on $[-a,a]$. In particular, we can apply our quadrature formulas (and Stieltjes' theorem, p. 132) to write
$$\int_{-a}^a \frac{w(t)}{x-t}\,dt = \lim_{n\to\infty} \sum_{i=1}^n \frac{A^{(n)}_i}{x - x^{(n)}_i},$$
and these sums are recognizable:

Lemma. $\displaystyle\sum_{i=1}^n \frac{A^{(n)}_i}{x - x^{(n)}_i} = \frac{\varphi_n(x)}{Q_n(x)}$.
Proof. Since $\varphi_n$ has degree $< n$ and $\varphi_n(x^{(n)}_i) \ne 0$ for any $i$, we may appeal to partial fractions to write
$$\frac{\varphi_n(x)}{Q_n(x)} = \frac{\varphi_n(x)}{(x - x^{(n)}_1)\cdots(x - x^{(n)}_n)} = \sum_{i=1}^n \frac{c_i}{x - x^{(n)}_i},$$
where $c_i$ is given by
$$c_i = \frac{\varphi_n(x)}{Q_n(x)}\,(x - x^{(n)}_i)\,\Big|_{x = x^{(n)}_i} = \frac{\varphi_n(x^{(n)}_i)}{Q_n'(x^{(n)}_i)} = A^{(n)}_i. \qquad \square$$
Now here's where the continued fractions come in: Stieltjes recognized the fact that
$$\frac{\varphi_{n+1}(x)}{Q_{n+1}(x)} = \cfrac{b_0}{(x - a_0) - \cfrac{b_1}{(x - a_1) - \cfrac{b_2}{\ddots \; - \cfrac{b_n}{(x - a_n)}}}}$$
(which can be proved by induction), where $b_0 = \int_a^b w(t)\,dt$. More generally, induction will show that the $n$-th convergent of a continued fraction can be written as
$$\frac{A_n}{B_n} = \cfrac{p_1}{q_1 - \cfrac{p_2}{q_2 - \cfrac{p_3}{\ddots \; - \cfrac{p_n}{q_n}}}}$$
by means of the recurrence formulas
$$A_0 = 0, \quad A_1 = p_1, \qquad B_0 = 1, \quad B_1 = q_1,$$
$$A_n = q_n A_{n-1} - p_n A_{n-2}, \qquad B_n = q_n B_{n-1} - p_n B_{n-2},$$
where $n = 2, 3, 4, \ldots$ (the minus signs matching the minus signs in the fraction). Please note that $A_n$ and $B_n$ satisfy the same recurrence formula, but with different starting values (as is the case with $\varphi_n$ and $Q_n$); indeed, taking $p_k = b_{k-1}$ and $q_k = x - a_{k-1}$ gives $A_n = \varphi_n$ and $B_n = Q_n$, which is precisely the displayed formula for $\varphi_{n+1}/Q_{n+1}$.
Again using the Chebyshev weight as an example, for $x > 1$ we have
$$\int_{-1}^1 \frac{dt}{(x-t)\sqrt{1-t^2}} = \frac{\pi}{\sqrt{x^2-1}} = \cfrac{\pi}{x - \cfrac{1/2}{x - \cfrac{1/4}{x - \cfrac{1/4}{x - \ddots}}}}$$
since $a_n = 0$ for all $n$, $b_1 = 1/2$, and $b_n = 1/4$ for $n \ge 2$. In other words, we've just found a continued fraction expansion for $(x^2 - 1)^{-1/2}$.
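This expansion is easy to check numerically. A minimal sketch that evaluates the truncated continued fraction bottom-up and compares it with $\pi/\sqrt{x^2-1}$:

```python
from math import pi, sqrt

def chebyshev_cf(x, depth):
    # Evaluate  pi / (x - (1/2)/(x - (1/4)/(x - (1/4)/(...)))),  truncated
    # after `depth` of the 1/4 levels, working from the inside out.
    b = [0.5] + [0.25] * depth    # b_1 = 1/2, b_n = 1/4 thereafter (b_0 = pi)
    val = x
    for bk in reversed(b):
        val = x - bk / val
    return pi / val

x = 1.5
print(chebyshev_cf(x, 30), pi / sqrt(x * x - 1))   # the two values agree
```

The convergents approach the limit geometrically for $x > 1$, so thirty levels already give full double precision here.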
Appendix

Finally, here is a brief review of some of the fancier bits of linear algebra used in this chapter. To begin, we discuss sums and quotients of vector spaces.

Each subspace $M$ of a finite-dimensional vector space $X$ induces an equivalence relation on $X$ by
$$x \sim y \iff x - y \in M.$$
Standard arguments show that the equivalence classes under this relation are the cosets (translates) $x + M$, $x \in X$. That is,
$$x + M = y + M \iff x - y \in M \iff x \sim y.$$
Equally standard is the induced vector arithmetic
$$(x + M) + (y + M) = (x + y) + M \qquad\text{and}\qquad \alpha(x + M) = (\alpha x) + M,$$
where $x, y \in X$ and $\alpha \in \mathbb{R}$. The collection of cosets (or equivalence classes) is a vector space under these operations; it's denoted $X/M$ and called the quotient of $X$ by $M$. Please note that the zero vector in $X/M$ is simply $M$ itself.

Associated to the quotient space $X/M$ is the quotient map $q(x) = x + M$. It's easy to check that $q : X \to X/M$ is a vector space homomorphism with kernel $M$. (Why?)
Next we recall the isomorphism theorem.

Theorem. Let $T : X \to Y$ be a linear map between finite-dimensional vector spaces, and let $q : X \to X/\ker T$ be the quotient map. Then, there exists a (unique, into) isomorphism $S : X/\ker T \to Y$ satisfying $S(q(x)) = T(x)$ for every $x \in X$.

Proof. Since $q$ maps onto $X/\ker T$, it's "legal" to define a map $S : X/\ker T \to Y$ by setting $S(q(x)) = T(x)$ for $x \in X$. Please note that $S$ is well-defined, since
$$T(x) = T(y) \iff T(x - y) = 0 \iff x - y \in \ker T \iff q(x - y) = 0 \iff q(x) = q(y).$$
It's easy to see that $S$ is linear, and so precisely the same argument as above shows that $S$ is one-to-one. $\square$

Corollary. Let $T : X \to Y$ be a linear map between finite-dimensional vector spaces. Then, the range of $T$ is isomorphic to $X/\ker T$.
The Müntz Theorems

Math 682

For several weeks now we've taken advantage of the fact that the monomials $1, x, x^2, \ldots$ have dense linear span in $C[0,1]$. What, if anything, is so special about these particular powers? How about if we consider polynomials of the form $\sum_{k=0}^n a_k x^{k^2}$; are they dense, too? More generally, what can be said about the span of a sequence of monomials $(x^{\lambda_n})$, where $\lambda_0 < \lambda_1 < \lambda_2 < \cdots$? Of course, we'll have to assume that $\lambda_0 \ge 0$, but it's not hard