$$1 - \lambda_1 = 1 - \cos\frac{\pi}{n+2} = 2\sin^2\frac{\pi}{2n+4} \le \frac{\pi^2}{2(n+2)^2},$$

and so we have

$$E_n^T(f) \le \left(1 + \frac{\pi^2}{2}\right)\omega_f\!\left(\frac{1}{n}\right) < 6\,\omega_f\!\left(\frac{1}{n}\right).$$

Jackson's theorems are what we might call direct theorems. If we know something

about f, then we can say something about En(f). There is also the notion of an inverse

theorem, meaning that if we know something about En(f), we should be able to say

something about f. In other words, we would expect an inverse theorem to be, more or

less, the converse of some direct theorem. Now inverse theorems are typically much harder

to prove than direct theorems, but in order to have some idea of what such theorems might

tell us (and to see some of the techniques used in their proofs), we present one of the easier

inverse theorems, due to Bernstein. This result gives the converse to one of our corollaries

to Jackson's theorem (see the top of page 105).

Theorem. If $f \in C^{2\pi}$ satisfies $E_n^T(f) \le A\,n^{-\alpha}$ for some constants $A$ and $0 < \alpha < 1$, then $f \in \mathrm{lip}_K\alpha$ for some constant $K$.

Proof. For each $n$, choose $U_n \in \mathcal{T}_n$ so that $\|f - U_n\| \le A\,n^{-\alpha}$. Then, in particular, $(U_n)$ converges uniformly to $f$. Now if we set $V_0 = U_1$ and $V_n = U_{2^n} - U_{2^{n-1}}$ for $n \ge 1$, then $V_n \in \mathcal{T}_{2^n}$ and $f = \sum_{n=0}^\infty V_n$. Indeed,

$$\|V_n\| \le \|U_{2^n} - f\| + \|U_{2^{n-1}} - f\| \le A\,(2^n)^{-\alpha} + A\,(2^{n-1})^{-\alpha} = B\,2^{-n\alpha},$$

which is summable; thus, the (telescoping) series $\sum_{n=0}^\infty V_n$ converges uniformly to $f$. (Why?)

Next we estimate $|f(x) - f(y)|$ using finitely many of the $V_n$'s, the precise number to be specified later. Using the mean value theorem and Bernstein's inequality we get

$$\begin{aligned}
|f(x) - f(y)| &\le \sum_{n=0}^\infty |V_n(x) - V_n(y)|\\
&\le \sum_{n=0}^{m-1} |V_n(x) - V_n(y)| + 2\sum_{n=m}^\infty \|V_n\|\\
&= \sum_{n=0}^{m-1} |V_n'(\xi_n)|\,|x-y| + 2\sum_{n=m}^\infty \|V_n\|\\
&\le |x-y|\sum_{n=0}^{m-1} 2^n\|V_n\| + 2\sum_{n=m}^\infty \|V_n\|\\
&\le |x-y|\sum_{n=0}^{m-1} B\,2^{n(1-\alpha)} + 2\sum_{n=m}^\infty B\,2^{-n\alpha}\\
&\le C\left[\,|x-y|\,2^{m(1-\alpha)} + 2^{-m\alpha}\,\right] \qquad (*)
\end{aligned}$$

where we've used, in the fourth line, the fact that $V_n \in \mathcal{T}_{2^n}$ and, in the last line, standard estimates for geometric series. Now we want the right-hand side to be dominated by a constant times $|x-y|^\alpha$. In other words, if we set $|x-y| = \delta$, then we want

$$\delta\,2^{m(1-\alpha)} + 2^{-m\alpha} \le D\,\delta^\alpha$$

or, equivalently,

$$(2^m\delta)^{1-\alpha} + (2^m\delta)^{-\alpha} \le D.$$

Thus, we should choose $m$ so that $2^m\delta$ is both bounded above and bounded away from zero. For example, if $0 < \delta < 1$, we could choose $m$ so that $1 \le 2^m\delta < 2$.

In order to better explain the phrase "more or less the converse of some direct theorem," let's see how the previous result falls apart when $\alpha = 1$. Although we might hope that $E_n^T(f) \le A/n$ would imply that $f \in \mathrm{lip}_K 1$, it happens not to be true. The best result in this regard is due to Zygmund, who gave necessary and sufficient conditions on $f$ so that $E_n^T(f) \le A/n$ (and these conditions do not characterize $\mathrm{lip}_K 1$ functions). Instead of pursuing Zygmund's result, we'll settle for simple "surgery" on our previous result, keeping an eye out for what goes wrong. This result is again due to Bernstein.

Theorem. If $f \in C^{2\pi}$ satisfies $E_n^T(f) \le A/n$, then $\omega_f(\delta) \le K\,\delta\,|\log\delta|$ for some constant $K$ and all $\delta$ sufficiently small.

Proof. If we repeat the previous proof, setting $\alpha = 1$, only a few lines change. In particular, the conclusion of that long string of inequalities $(*)$ would now read

$$|f(x) - f(y)| \le C\left[\,|x-y|\,m + 2^{-m}\,\right] = C\left[\,\delta m + 2^{-m}\,\right].$$

Clearly, the right-hand side cannot be dominated by a constant times $\delta$, as we might have hoped, for this would force $m$ to be bounded (independent of $\delta$), which in turn bounds $\delta$ away from zero. But, if we again think of $2^m\delta$ as the "variable" in this inequality, then the term $\delta m$ suggests that the correct order of magnitude of the right-hand side is $\delta\,|\log\delta|$. Thus, we would try to find a constant $D$ so that

$$\delta m + 2^{-m} \le D\,\delta\,|\log\delta|$$

or

$$m(2^m\delta) + 1 \le D\,(2^m\delta)\,|\log\delta|.$$

Now if we take $0 < \delta < 1/2$, then $\log 2 < -\log\delta = |\log\delta|$. Hence, if we again choose $m \ge 1$ so that $1 \le 2^m\delta < 2$, we'll get

$$m\log 2 + \log\delta < \log 2 \implies m < \frac{\log 2 - \log\delta}{\log 2} < \frac{2}{\log 2}\,|\log\delta|$$

and, finally,

$$m(2^m\delta) + 1 \le 2m + 1 \le 3m \le \frac{6}{\log 2}\,|\log\delta| \le \frac{6}{\log 2}\,(2^m\delta)\,|\log\delta|.$$

Orthogonal Polynomials

Math 682 6/17/98

Given a positive (except possibly at finitely many points), Riemann integrable weight function $w(x)$ on $[a,b]$, the expression

$$\langle f, g\rangle = \int_a^b f(x)\,g(x)\,w(x)\,dx$$

defines an inner product on $C[a,b]$, and

$$\|f\|_2 = \left(\int_a^b f(x)^2\,w(x)\,dx\right)^{1/2} = \sqrt{\langle f, f\rangle}$$

defines a strictly convex norm on $C[a,b]$. Thus, given a finite-dimensional subspace $E$ of $C[a,b]$ and an element $f \in C[a,b]$, there is a unique $g \in E$ such that

$$\|f - g\|_2 = \min_{h \in E}\|f - h\|_2.$$

We say that $g$ is the least-squares approximation to $f$ out of $E$ (relative to $w$).

Now if we apply the Gram-Schmidt procedure to the sequence $1, x, x^2, \ldots$, we will arrive at a sequence $(Q_n)$ of orthogonal polynomials relative to the above inner product. In this special case, however, the Gram-Schmidt procedure simplifies substantially:

Theorem. The following procedure defines a sequence $(Q_n)$ of orthogonal polynomials (relative to $w$). Set:

$$Q_0(x) = 1, \qquad Q_1(x) = x - a_0, \qquad\text{and}\qquad Q_{n+1}(x) = (x - a_n)\,Q_n(x) - b_n\,Q_{n-1}(x)$$

for $n \ge 1$, where

$$a_n = \frac{\langle x\,Q_n, Q_n\rangle}{\langle Q_n, Q_n\rangle} \qquad\text{and}\qquad b_n = \frac{\langle x\,Q_n, Q_{n-1}\rangle}{\langle Q_{n-1}, Q_{n-1}\rangle}$$

(and where $x\,Q_n$ is shorthand for the polynomial $x\,Q_n(x)$).

Proof. It's easy to see from these formulas that $Q_n$ is a monic polynomial of degree exactly $n$. In particular, the $Q_n$'s are linearly independent (and nonzero).

Now we checked in class that $Q_0$, $Q_1$, and $Q_2$ are mutually orthogonal, so let's use induction and check that $Q_{n+1}$ is orthogonal to each $Q_k$, $k \le n$. First,

$$\langle Q_{n+1}, Q_n\rangle = \langle x\,Q_n, Q_n\rangle - a_n\langle Q_n, Q_n\rangle - b_n\langle Q_{n-1}, Q_n\rangle = 0$$

and

$$\langle Q_{n+1}, Q_{n-1}\rangle = \langle x\,Q_n, Q_{n-1}\rangle - a_n\langle Q_n, Q_{n-1}\rangle - b_n\langle Q_{n-1}, Q_{n-1}\rangle = 0,$$

since $\langle Q_{n-1}, Q_n\rangle = 0$. Next, we take $k < n-1$ and use the recurrence formula twice:

$$\begin{aligned}
\langle Q_{n+1}, Q_k\rangle &= \langle x\,Q_n, Q_k\rangle - a_n\langle Q_n, Q_k\rangle - b_n\langle Q_{n-1}, Q_k\rangle\\
&= \langle x\,Q_n, Q_k\rangle = \langle Q_n, x\,Q_k\rangle \qquad\text{(Why?)}\\
&= \langle Q_n, Q_{k+1} + a_kQ_k + b_kQ_{k-1}\rangle = 0,
\end{aligned}$$

since $k + 1 < n$.
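The procedure in the theorem above is easy to try numerically. Here is a minimal Python sketch, assuming a midpoint-rule approximation of the inner product; the weight $w \equiv 1$ on $[-1,1]$ (the Legendre case) is just an illustrative choice, and the helper names are ours.

```python
def make_inner(w, a, b, N=4000):
    """Return <f,g> = integral of f*g*w over [a,b], via a midpoint Riemann sum."""
    h = (b - a) / N
    xs = [a + (i + 0.5) * h for i in range(N)]
    def inner(f, g):
        return sum(f(x) * g(x) * w(x) for x in xs) * h
    return inner

def monic_orthogonal(inner, n):
    """First n+1 monic orthogonal polynomials from the three-term recurrence
    Q_{k+1}(x) = (x - a_k) Q_k(x) - b_k Q_{k-1}(x)."""
    Q = [lambda x: 1.0]
    a0 = inner(lambda x: x, Q[0]) / inner(Q[0], Q[0])
    Q.append(lambda x, a0=a0: x - a0)
    for k in range(1, n):
        qk, qk1 = Q[k], Q[k - 1]
        ak = inner(lambda x: x * qk(x), qk) / inner(qk, qk)
        bk = inner(lambda x: x * qk(x), qk1) / inner(qk1, qk1)
        Q.append(lambda x, qk=qk, qk1=qk1, ak=ak, bk=bk:
                 (x - ak) * qk(x) - bk * qk1(x))
    return Q

inner = make_inner(lambda x: 1.0, -1.0, 1.0)   # w = 1: monic Legendre polynomials
Q = monic_orthogonal(inner, 3)
```

With $w \equiv 1$ the recurrence reproduces the monic Legendre polynomials; for instance $Q_2(x) = x^2 - \tfrac13$, up to quadrature error.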

Observations

1. Using the same trick as above, we have

$$b_n = \frac{\langle x\,Q_n, Q_{n-1}\rangle}{\langle Q_{n-1}, Q_{n-1}\rangle} = \frac{\langle Q_n, Q_n\rangle}{\langle Q_{n-1}, Q_{n-1}\rangle} > 0.$$

2. Each $p \in \mathcal{P}_n$ can be uniquely written $p = \sum_{i=0}^n \alpha_i Q_i$, where $\alpha_i = \langle p, Q_i\rangle / \langle Q_i, Q_i\rangle$.

3. If $Q$ is any monic polynomial of degree exactly $n$, then $Q = Q_n + \sum_{i=0}^{n-1}\alpha_i Q_i$ (why?) and hence

$$\|Q\|_2^2 = \|Q_n\|_2^2 + \sum_{i=0}^{n-1}\alpha_i^2\,\|Q_i\|_2^2 > \|Q_n\|_2^2$$

unless $Q = Q_n$. That is, $Q_n$ has the least $\|\cdot\|_2$ norm of all monic polynomials of degree $n$.

4. The $Q_n$'s are unique in the following sense: If $(P_n)$ is another sequence of orthogonal polynomials such that $P_n$ has degree exactly $n$, then $P_n = \lambda_n Q_n$ for some $\lambda_n \neq 0$. (Why?) Consequently, there's no harm in referring to the $Q_n$'s as the sequence of orthogonal polynomials relative to $w$.

5. For $n \ge 1$ note that $\int_a^b Q_n(t)\,w(t)\,dt = \langle Q_0, Q_n\rangle = 0$.


Examples

1. On $[-1,1]$, the Chebyshev polynomials of the first kind $(T_n)$ are orthogonal relative to the weight $w(x) = 1/\sqrt{1-x^2}$:

$$\int_{-1}^1 T_m(x)\,T_n(x)\,\frac{dx}{\sqrt{1-x^2}} = \int_0^\pi \cos m\theta\,\cos n\theta\,d\theta = \begin{cases} 0, & m \neq n\\ \pi/2, & m = n \neq 0\\ \pi, & m = n = 0. \end{cases}$$

Since $T_n$ has degree exactly $n$, this must be the right choice. Notice, too, that $\frac{1}{\sqrt2}T_0, T_1, T_2, \ldots$ are orthonormal relative to the weight $2/(\pi\sqrt{1-x^2})$.

In terms of the inductive procedure given above, we must have $Q_0 = T_0 = 1$ and $Q_n = 2^{-n+1}T_n$ for $n \ge 1$. (Why?) From this it follows that $a_n = 0$, $b_1 = 1/2$, and $b_n = 1/4$ for $n \ge 2$. (Why?) That is, the recurrence formula given in our first Theorem reduces to the familiar relationship $T_{n+1}(x) = 2x\,T_n(x) - T_{n-1}(x)$. Curiously, $Q_n = 2^{-n+1}T_n$ minimizes both

$$\max_{-1 \le x \le 1}|p(x)| \qquad\text{and}\qquad \left(\int_{-1}^1 p(x)^2\,\frac{dx}{\sqrt{1-x^2}}\right)^{1/2}$$

over all monic polynomials of degree exactly $n$.
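The familiar relationship $T_{n+1}(x) = 2x\,T_n(x) - T_{n-1}(x)$ can be checked directly against the defining identity $T_n(\cos\theta) = \cos n\theta$; a short Python sketch:

```python
import math

def cheb_T(n, x):
    """Evaluate T_n(x) via the recurrence T_{n+1} = 2x T_n - T_{n-1}."""
    t0, t1 = 1.0, x
    if n == 0:
        return t0
    for _ in range(n - 1):
        t0, t1 = t1, 2.0 * x * t1 - t0
    return t1
```

On $[-1,1]$ this agrees with $\cos(n\arccos x)$ to machine precision.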

The Chebyshev polynomials also satisfy $(1-x^2)\,T_n''(x) - x\,T_n'(x) + n^2\,T_n(x) = 0$. Since this is a polynomial identity, it suffices to check it for all $x = \cos\theta$. In this case,

$$T_n'(x) = \frac{n\sin n\theta}{\sin\theta}$$

and

$$T_n''(x) = \frac{n^2\cos n\theta\,\sin\theta - n\sin n\theta\,\cos\theta}{\sin^2\theta\,(-\sin\theta)}.$$

Hence,

$$(1-x^2)\,T_n''(x) - x\,T_n'(x) + n^2\,T_n(x) = -n^2\cos n\theta + n\sin n\theta\,\cot\theta - n\sin n\theta\,\cot\theta + n^2\cos n\theta = 0.$$

2. On $[-1,1]$, the Chebyshev polynomials of the second kind $(U_n)$ are orthogonal relative to the weight $w(x) = \sqrt{1-x^2}$:

$$\int_{-1}^1 U_m(x)\,U_n(x)\,\sqrt{1-x^2}\,dx = \int_0^\pi \frac{\sin(m+1)\theta}{\sin\theta}\,\frac{\sin(n+1)\theta}{\sin\theta}\,\sin^2\theta\,d\theta = \begin{cases} 0, & m \neq n\\ \pi/2, & m = n. \end{cases}$$

While we're at it, notice that

$$T_n'(x) = \frac{n\sin n\theta}{\sin\theta} = n\,U_{n-1}(x).$$

As a rule, the derivatives of a sequence of orthogonal polynomials are again orthogonal polynomials, but relative to a different weight.

3. On $[-1,1]$ with weight $w(x) \equiv 1$, the sequence $(P_n)$ of Legendre polynomials are orthogonal, and are typically normalized by $P_n(1) = 1$. The first few Legendre polynomials are $P_0(x) = 1$, $P_1(x) = x$, $P_2(x) = \frac32x^2 - \frac12$, and $P_3(x) = \frac52x^3 - \frac32x$. (Check this!) After we've seen a few more examples, we'll come back and give an explicit formula for $P_n$.

4. All of the examples we've seen so far are special cases of the following: On $[-1,1]$, consider the weight $w(x) = (1-x)^\alpha(1+x)^\beta$, where $\alpha, \beta > -1$. The corresponding orthogonal polynomials $(P_n^{(\alpha,\beta)})$ are called the Jacobi polynomials and are typically normalized by requiring that

$$P_n^{(\alpha,\beta)}(1) = \binom{n+\alpha}{n} = \frac{(\alpha+1)(\alpha+2)\cdots(\alpha+n)}{n!}.$$

It follows that $P_n^{(0,0)} = P_n$,

$$P_n^{(-1/2,\,-1/2)} = \frac{1\cdot3\cdot5\cdots(2n-1)}{2^n\,n!}\,T_n$$

and

$$P_n^{(1/2,\,1/2)} = \frac{1\cdot3\cdot5\cdots(2n+1)}{2^n\,(n+1)!}\,U_n.$$

The polynomials $P_n^{(\alpha,\alpha)}$ are called ultraspherical polynomials.

5. There are also several classical examples of orthogonal polynomials on unbounded intervals. In particular:

    $(0,\infty)$,          $w(x) = e^{-x}$:            Laguerre polynomials;
    $(0,\infty)$,          $w(x) = x^\alpha e^{-x}$:   generalized Laguerre polynomials;
    $(-\infty,\infty)$,    $w(x) = e^{-x^2}$:          Hermite polynomials.

Since $Q_n$ is orthogonal to every element of $\mathcal{P}_{n-1}$, a fuller understanding of $Q_n$ will follow from a characterization of the orthogonal complement of $\mathcal{P}_{n-1}$. We begin with an easy fact about least-squares approximations in inner product spaces.

Lemma. Let $E$ be a finite-dimensional subspace of an inner product space $X$, and let $x \in X \setminus E$. Then, $y^* \in E$ is the least-squares approximation to $x$ out of $E$ (a.k.a. the nearest point to $x$ in $E$) if and only if $\langle x - y^*, y\rangle = 0$ for every $y \in E$; that is, if and only if $(x - y^*) \perp E$.

Proof. [We've taken $E$ to be finite-dimensional so that nearest points will exist; since $X$ is an inner product space, nearest points must also be unique (see the exercises for a proof that every inner product norm is strictly convex).]

($\Leftarrow$) First suppose that $(x - y^*) \perp E$. Then, given any $y \in E$, we have

$$\|x - y\|^2 = \|(x - y^*) + (y^* - y)\|^2 = \|x - y^*\|^2 + \|y^* - y\|^2,$$

because $y^* - y \in E$ and, hence, $(x - y^*) \perp (y^* - y)$. Thus, $\|x - y\| > \|x - y^*\|$ unless $y = y^*$; that is, $y^*$ is the (unique) nearest point to $x$ in $E$.

($\Rightarrow$) Suppose that $x - y^*$ is not orthogonal to $E$. Then, there is some $y \in E$ with $\|y\| = 1$ such that $\alpha = \langle x - y^*, y\rangle \neq 0$. Now I claim that $y^* + \alpha y \in E$ is a better approximation to $x$ than $y^*$ (and $y^* + \alpha y \neq y^*$, of course); that is, $y^*$ is not the least-squares approximation to $x$. To see this, we again compute:

$$\begin{aligned}
\|x - (y^* + \alpha y)\|^2 &= \|(x - y^*) - \alpha y\|^2 = \langle (x - y^*) - \alpha y,\ (x - y^*) - \alpha y\rangle\\
&= \|x - y^*\|^2 - 2\alpha\langle x - y^*, y\rangle + \alpha^2\\
&= \|x - y^*\|^2 - \alpha^2 < \|x - y^*\|^2.
\end{aligned}$$

Thus, we must have $\langle x - y^*, y\rangle = 0$ for every $y \in E$.

Lemma 1. (Integration by parts.)

$$\int_a^b u^{(n)}v = \left[\sum_{k=1}^n (-1)^{k-1}\,u^{(n-k)}\,v^{(k-1)}\right]_a^b + (-1)^n\int_a^b u\,v^{(n)}.$$

Now if $v$ is a polynomial of degree $< n$, then $v^{(n)} = 0$ and we get:

Lemma 2. $f \in C[a,b]$ satisfies $\int_a^b f(x)\,p(x)\,w(x)\,dx = 0$ for all polynomials $p \in \mathcal{P}_{n-1}$ if and only if there is an $n$-times differentiable function $u$ on $[a,b]$ satisfying $fw = u^{(n)}$ and $u^{(k)}(a) = u^{(k)}(b) = 0$ for all $k = 0, 1, \ldots, n-1$.

Proof. One direction is clear from Lemma 1: Given $u$ as above, we would have $\int_a^b fpw = \int_a^b u^{(n)}p = (-1)^n\int_a^b u\,p^{(n)} = 0$.

R

So, suppose we have that ab fpw = 0 for all p 2 Pn;1. By integrating fw repeatedly,

choosing constants appropriately, we may de ne a function u satisfying fw = u(n) and

u(k)(a) = 0 for all k = 0 1 : : : n ; 1. We want to show that the hypotheses on f force

u(k)(b) = 0 for all k = 0 1 : : : n ; 1.

Now Lemma 1 tells us that

Zb X

n

(;1)k;1 u(n;k)(b) p(k;1) (b)

fpw =

0=

a k=1

for all p 2 Pn;1. But the numbers p(b) p0 (b) : : : p(n;1)(b) are completely arbitrary that

is (again by integrating repeatedly, choosing our constants as we please), we can nd

polynomials pk of degree k < n such that p(k)(b) 6= 0 and p(j)(b) = 0 for j 6= k. In

k k

fact, pk (x) = (x ; b)k works just ne! In any case, we must have u(k)(b) = 0 for all

k = 0 1 : : : n ; 1.

Rolle's theorem tells us a bit more about the functions orthogonal to $\mathcal{P}_{n-1}$:

Lemma 3. If $w(x) > 0$ in $(a,b)$, and if $f \in C[a,b]$ satisfies $\int_a^b f(x)\,p(x)\,w(x)\,dx = 0$ for all polynomials $p \in \mathcal{P}_{n-1}$, then $f$ has at least $n$ distinct zeros in the open interval $(a,b)$.

Proof. Write $fw = u^{(n)}$, where $u^{(k)}(a) = u^{(k)}(b) = 0$ for all $k = 0, 1, \ldots, n-1$. In particular, since $u(a) = u(b) = 0$, Rolle's theorem tells us that $u'$ has at least one zero $c$ in $(a,b)$. But then $u'(a) = u'(c) = u'(b) = 0$, and so $u''$ must have at least two zeros in $(a,b)$. Continuing, we find that $fw = u^{(n)}$ must have at least $n$ zeros in $(a,b)$. Since $w > 0$, the result follows.

Corollary. Let $(Q_n)$ be the sequence of orthogonal polynomials associated to a given weight $w$ with $w > 0$ in $(a,b)$. Then, the roots of $Q_n$ are real, simple, and lie in $(a,b)$.

Lemma 4. If $p^*$ is the least-squares approximation to $f \in C[a,b]$ out of $\mathcal{P}_{n-1}$, and if $w > 0$ in $(a,b)$, then $f - p^*$ has at least $n$ distinct zeros in $(a,b)$.

Proof. The least-squares approximation satisfies $\langle f - p^*, p\rangle = 0$ for all $p \in \mathcal{P}_{n-1}$.

The sheer volume of literature on orthogonal polynomials and other "special functions" is truly staggering. We'll content ourselves with the Legendre and the Chebyshev polynomials. In particular, let's return to the problem of finding an explicit formula for the Legendre polynomials. We could, as Rivlin does, use induction and a few observations that simplify the basic recurrence formula (you're encouraged to read this; see pp. 53-54). Instead we'll give a simple (but at first sight intimidating) formula that is of use in more general settings than ours.

Lemma 2 (with $w \equiv 1$ and $[a,b] = [-1,1]$) says that if we want to find a polynomial $f$ of degree $n$ that is orthogonal to $\mathcal{P}_{n-1}$, then we'll need to take a polynomial for $u$, and this $u$ will have to be divisible by $(x-1)^n(x+1)^n$. (Why?) That is, we must have $P_n(x) = c_n\,D^n\big[(x^2-1)^n\big]$, where $D$ denotes differentiation, and where we find $c_n$ by evaluating the right-hand side at $x = 1$.

Lemma 5. (Leibniz's formula) $\displaystyle D^n(fg) = \sum_{k=0}^n \binom nk\,D^k(f)\,D^{n-k}(g)$.

Proof. Induction and the fact that $\binom{n-1}{k-1} + \binom{n-1}{k} = \binom nk$.

Consequently, $Q(x) = D^n\big[(x-1)^n(x+1)^n\big] = \sum_{k=0}^n \binom nk\,D^k\big[(x-1)^n\big]\,D^{n-k}\big[(x+1)^n\big]$, and it follows that $Q(1) = 2^n n!$ and $Q(-1) = (-1)^n 2^n n!$. This, finally, gives us the formula discovered by Rodrigues in 1814:

$$P_n(x) = \frac{1}{2^n n!}\,D^n\big[(x^2-1)^n\big].$$

The Rodrigues formula is quite useful (and easily generalizes to the Jacobi polynomials).
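As a quick sanity check, the Rodrigues formula can be carried out mechanically on lists of coefficients; here is a minimal Python sketch (the helper names are ours, purely for illustration).

```python
from math import comb, factorial

def x2m1_pow(n):
    """Ascending coefficients of (x^2 - 1)^n, by the binomial theorem."""
    c = [0.0] * (2 * n + 1)
    for k in range(n + 1):
        c[2 * k] = comb(n, k) * (-1.0) ** (n - k)
    return c

def deriv(c, times):
    """Differentiate a coefficient list the given number of times."""
    for _ in range(times):
        c = [i * c[i] for i in range(1, len(c))]
    return c

def legendre_rodrigues(n):
    """P_n via the Rodrigues formula P_n = D^n[(x^2 - 1)^n] / (2^n n!)."""
    c = deriv(x2m1_pow(n), n)
    scale = 2 ** n * factorial(n)
    return [a / scale for a in c]

def poly_eval(c, x):
    return sum(a * x ** i for i, a in enumerate(c))
```

For example, `legendre_rodrigues(2)` returns the coefficients of $\frac32x^2 - \frac12$, and $P_n(1) = 1$ for every $n$, as the normalization requires.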

Observations

6. By Lemma 3, the roots of $P_n$ are real, distinct, and lie in $(-1,1)$.

7. $(x^2-1)^n = \sum_{k=0}^n (-1)^k\binom nk\,x^{2n-2k}$. If we apply $\frac{1}{2^n n!}D^n$ and simplify, we get another formula for the Legendre polynomials:

$$P_n(x) = \frac{1}{2^n}\sum_{k=0}^{[n/2]} (-1)^k\binom nk\binom{2n-2k}{n}\,x^{n-2k}.$$

In particular, if $n$ is even (odd), then $P_n$ is even (odd). Notice, too, that if we let $\widetilde P_n$ denote the (monic) polynomial given by the standard construction, then we must have $P_n = 2^{-n}\binom{2n}{n}\widetilde P_n$.


8. In terms of our standard recurrence formula, it follows that $a_n = 0$ (because $x\,P_n(x)^2$ is always odd). It remains to compute $b_n$. First, integrating by parts,

$$\int_{-1}^1 P_n(x)^2\,dx = \big[x\,P_n(x)^2\big]_{-1}^1 - \int_{-1}^1 x\cdot 2P_n(x)\,P_n'(x)\,dx,$$

or $\langle P_n, P_n\rangle = 2 - 2\langle P_n, x\,P_n'\rangle$. But $x\,P_n' = n\,P_n + {}$lower-degree terms; hence, $\langle P_n, x\,P_n'\rangle = n\langle P_n, P_n\rangle$. Thus, $\langle P_n, P_n\rangle = 2/(2n+1)$. Using this and the fact that $P_n = 2^{-n}\binom{2n}{n}\widetilde P_n$, we'd find that $b_n = n^2/(4n^2-1)$. Thus,

$$P_{n+1} = 2^{-n-1}\binom{2n+2}{n+1}\widetilde P_{n+1} = 2^{-n-1}\binom{2n+2}{n+1}\left[x\,\widetilde P_n - \frac{n^2}{4n^2-1}\,\widetilde P_{n-1}\right] = \frac{2n+1}{n+1}\,x\,P_n - \frac{n}{n+1}\,P_{n-1}.$$

That is, the Legendre polynomials satisfy the recurrence formula

$$(n+1)\,P_{n+1}(x) = (2n+1)\,x\,P_n(x) - n\,P_{n-1}(x).$$

9. It follows from 8 that the sequence $\widehat P_n = \sqrt{\frac{2n+1}{2}}\,P_n$ is orthonormal on $[-1,1]$.

10. The Legendre polynomials satisfy $(1-x^2)\,P_n''(x) - 2x\,P_n'(x) + n(n+1)\,P_n(x) = 0$. If we set $u = (x^2-1)^n$, that is, if $u^{(n)} = 2^n n!\,P_n$, note that $u'(x^2-1) = 2nx\,u$. Now we apply $D^{n+1}$ to both sides of this last equation (using Leibniz's formula) and simplify:

$$u^{(n+2)}(x^2-1) + (n+1)\,u^{(n+1)}\cdot 2x + \frac{(n+1)n}{2}\,u^{(n)}\cdot 2 = 2n\big[u^{(n+1)}x + (n+1)\,u^{(n)}\big]$$
$$\implies (1-x^2)\,u^{(n+2)} - 2x\,u^{(n+1)} + n(n+1)\,u^{(n)} = 0.$$

11. Through a series of exercises, similar in spirit to 10, Rivlin shows that $|P_n(x)| \le 1$ on $[-1,1]$. See pp. 63-64 of Rivlin for details.
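The recurrence in Observation 8 also gives a convenient way to evaluate $P_n$ pointwise; a minimal Python sketch:

```python
def legendre_P(n, x):
    """Evaluate P_n(x) via (k+1) P_{k+1} = (2k+1) x P_k - k P_{k-1}."""
    p0, p1 = 1.0, x
    if n == 0:
        return p0
    for k in range(1, n):
        p0, p1 = p1, ((2 * k + 1) * x * p1 - k * p0) / (k + 1)
    return p1
```

The values match the explicit low-order formulas above, e.g. $P_3(x) = \frac52x^3 - \frac32x$, and $|P_n(x)| \le 1$ on $[-1,1]$ as Observation 11 asserts.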

Given an orthogonal sequence, it makes sense to consider "generalized Fourier series" relative to the sequence and to find analogues of the Dirichlet kernel, Lebesgue's theorem, and so on. In the case of the Legendre polynomials we have the following:

Example. The "Fourier-Legendre" series for $f \in C[-1,1]$ is given by $\sum_k \langle f, \widehat P_k\rangle\widehat P_k$, where

$$\widehat P_k = \sqrt{\frac{2k+1}{2}}\,P_k \qquad\text{and}\qquad \langle f, \widehat P_k\rangle = \int_{-1}^1 f(x)\,\widehat P_k(x)\,dx.$$

The partial sum operator $S_n(f) = \sum_{k=0}^n \langle f, \widehat P_k\rangle\widehat P_k$ is a linear projection onto $\mathcal{P}_n$ and may be written as

$$S_n(f)(x) = \int_{-1}^1 f(t)\,K_n(t,x)\,dt,$$

where $K_n(t,x) = \sum_{k=0}^n \widehat P_k(t)\,\widehat P_k(x)$. (Why?)

Since the $\widehat P_k$'s are orthonormal, we have

$$\sum_{k=0}^n |\langle f, \widehat P_k\rangle|^2 = \|S_n(f)\|_2^2 \le \|f\|_2^2$$

for every $n$, and so the generalized Fourier coefficients $\langle f, \widehat P_k\rangle$ are square summable; in particular, $\langle f, \widehat P_k\rangle \to 0$ as $k \to \infty$. As in the case of Fourier series, the fact that the polynomials (i.e., the span of the $\widehat P_k$'s) are dense in $C[a,b]$ implies that $S_n(f)$ actually converges to $f$ in the $\|\cdot\|_2$ norm. These same observations remain valid for any sequence of orthogonal polynomials. The real question remains, just as with Fourier series, whether $S_n(f)$ is a good uniform (or even pointwise) approximation to $f$.

If you're willing to swallow the fact that $|P_n(x)| \le 1$, then

$$|K_n(t,x)| \le \sum_{k=0}^n \sqrt{\frac{2k+1}{2}}\,\sqrt{\frac{2k+1}{2}} = \frac12\sum_{k=0}^n (2k+1) = \frac{(n+1)^2}{2}.$$

Hence, $\|S_n(f)\| \le (n+1)^2\,\|f\|$. That is, the "Lebesgue numbers" for this process are at most $(n+1)^2$. The analogue of Lebesgue's theorem in this case would then read:

$$\|f - S_n(f)\| \le C\,n^2\,E_n(f).$$

Thus, $S_n(f) \rightrightarrows f$ whenever $n^2E_n(f) \to 0$, and Jackson's theorem tells us when this will happen: If $f$ is twice continuously differentiable, then the Fourier-Legendre series for $f$ converges uniformly to $f$ on $[-1,1]$.
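To see the partial sum operator $S_n$ in action, here is a small Python sketch computing Fourier-Legendre coefficients by a midpoint Riemann sum (the discretization size and the test function are our own illustrative choices):

```python
import math

def legendre_P(n, x):
    # Evaluate P_n(x) by the three-term Legendre recurrence.
    p0, p1 = 1.0, x
    if n == 0:
        return p0
    for k in range(1, n):
        p0, p1 = p1, ((2 * k + 1) * x * p1 - k * p0) / (k + 1)
    return p1

def fourier_legendre_Sn(f, n, N=4000):
    """Partial sum S_n(f) = sum_k <f, Phat_k> Phat_k, Phat_k = sqrt((2k+1)/2) P_k."""
    h = 2.0 / N
    xs = [-1.0 + (i + 0.5) * h for i in range(N)]
    def phat(k, x):
        return math.sqrt((2 * k + 1) / 2.0) * legendre_P(k, x)
    coeffs = [sum(f(x) * phat(k, x) for x in xs) * h for k in range(n + 1)]
    return lambda x: sum(c * phat(k, x) for k, c in enumerate(coeffs))

# S_n is a projection onto P_n, so a cubic is reproduced (up to quadrature error).
S3 = fourier_legendre_Sn(lambda x: x ** 3 - x, 3)
```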

The Christoffel-Darboux Identity

It would also be of interest to have a closed form for $K_n(t,x)$. That this is indeed always possible, for any sequence of orthogonal polynomials, is a very important fact.

Using our original notation, let $(Q_n)$ be the sequence of monic orthogonal polynomials corresponding to a given weight $w$, and let $(\widehat Q_n)$ be the orthonormal counterpart of $(Q_n)$; in other words, $\widehat Q_n = \lambda_nQ_n$, where $\lambda_n = \langle Q_n, Q_n\rangle^{-1/2}$. It will help things here if you recall (from Observation 1 on page 112) that $\lambda_{n-1}^2 = b_n\,\lambda_n^2$.

n n;1

As with the Legendre polynomials, each $f \in C[a,b]$ is represented by the generalized Fourier series $\sum_k \langle f, \widehat Q_k\rangle\widehat Q_k$, with partial sum operator

$$S_n(f)(x) = \int_a^b f(t)\,K_n(t,x)\,w(t)\,dt,$$

where $K_n(t,x) = \sum_{k=0}^n \widehat Q_k(t)\,\widehat Q_k(x)$. As before, $S_n$ is a projection onto $\mathcal{P}_n$; in particular, $S_n(1) = 1$ for every $n$.

Theorem. (Christoffel-Darboux) The kernel $K_n(t,x)$ can be written

$$\sum_{k=0}^n \widehat Q_k(t)\,\widehat Q_k(x) = \frac{\lambda_n}{\lambda_{n+1}}\cdot\frac{\widehat Q_{n+1}(t)\,\widehat Q_n(x) - \widehat Q_n(t)\,\widehat Q_{n+1}(x)}{t-x}.$$

Proof. We begin with the standard recurrence formulas

$$Q_{n+1}(t) = (t - a_n)\,Q_n(t) - b_n\,Q_{n-1}(t)$$
$$Q_{n+1}(x) = (x - a_n)\,Q_n(x) - b_n\,Q_{n-1}(x)$$

(where $b_0 = 0$). Multiplying the first by $Q_n(x)$, the second by $Q_n(t)$, and subtracting:

$$Q_{n+1}(t)\,Q_n(x) - Q_n(t)\,Q_{n+1}(x) = (t-x)\,Q_n(t)\,Q_n(x) + b_n\big[Q_n(t)\,Q_{n-1}(x) - Q_n(x)\,Q_{n-1}(t)\big]$$

(and again, $b_0 = 0$). If we divide both sides of this equation by $\langle Q_n, Q_n\rangle = \lambda_n^{-2}$ we get

$$\lambda_n^2\big[Q_{n+1}(t)\,Q_n(x) - Q_n(t)\,Q_{n+1}(x)\big] = (t-x)\,\widehat Q_n(t)\,\widehat Q_n(x) + \lambda_{n-1}^2\big[Q_n(t)\,Q_{n-1}(x) - Q_n(x)\,Q_{n-1}(t)\big].$$

Thus, we may repeat the process, arriving finally at

$$\lambda_n^2\big[Q_{n+1}(t)\,Q_n(x) - Q_n(t)\,Q_{n+1}(x)\big] = (t-x)\sum_{k=0}^n \widehat Q_k(t)\,\widehat Q_k(x).$$

The Christoffel-Darboux identity now follows by writing $\widehat Q_n = \lambda_nQ_n$, etc.

And we now have a version of the Dini-Lipschitz theorem:


Theorem. Let $f \in C[a,b]$ and suppose that at some point $x_0$ in $[a,b]$ we have:

(i) $f$ is Lipschitz at $x_0$; that is, $|f(x_0) - f(x)| \le K|x_0 - x|$ for some constant $K$ and all $x$ in $[a,b]$; and

(ii) the sequence $(\widehat Q_n(x_0))$ is bounded.

Then, the series $\sum_k \langle f, \widehat Q_k\rangle\widehat Q_k(x_0)$ converges to $f(x_0)$.

Proof. First note that the sequence $\lambda_n\lambda_{n+1}^{-1}$ is bounded: Indeed, by Cauchy-Schwarz,

$$\lambda_{n+1}^{-2} = \langle Q_{n+1}, Q_{n+1}\rangle = \langle Q_{n+1}, x\,Q_n\rangle \le \|Q_{n+1}\|_2\,\|x\|_\infty\,\|Q_n\|_2 = \max\{|a|, |b|\}\,\lambda_{n+1}^{-1}\lambda_n^{-1}.$$

Thus, $\lambda_n\lambda_{n+1}^{-1} \le c = \max\{|a|, |b|\}$. Now, using the Christoffel-Darboux identity,

$$\begin{aligned}
S_n(f)(x_0) - f(x_0) &= \int_a^b \big[f(t) - f(x_0)\big]\,K_n(t, x_0)\,w(t)\,dt\\
&= \frac{\lambda_n}{\lambda_{n+1}}\int_a^b \frac{f(t) - f(x_0)}{t - x_0}\,\big[\widehat Q_{n+1}(t)\,\widehat Q_n(x_0) - \widehat Q_n(t)\,\widehat Q_{n+1}(x_0)\big]\,w(t)\,dt\\
&= \frac{\lambda_n}{\lambda_{n+1}}\,\big[\langle h, \widehat Q_{n+1}\rangle\,\widehat Q_n(x_0) - \langle h, \widehat Q_n\rangle\,\widehat Q_{n+1}(x_0)\big],
\end{aligned}$$

where $h(t) = (f(t) - f(x_0))/(t - x_0)$. But $h$ is bounded (and continuous everywhere except, possibly, at $x_0$) by hypothesis (i), $\lambda_n\lambda_{n+1}^{-1}$ is bounded, and $\widehat Q_n(x_0)$ is bounded by hypothesis (ii). All that remains is to notice that the numbers $\langle h, \widehat Q_n\rangle$ are the generalized Fourier coefficients of the bounded, Riemann integrable function $h$, and so must tend to zero (since, in fact, they're even square summable).

We end this section with a negative result, due to Nikolaev:

Theorem. There is no weight $w$ such that every $f \in C[a,b]$ has a uniformly convergent expansion in terms of orthogonal polynomials. In fact, given any $w$, there is always some $f$ for which $\|f - S_n(f)\|$ is unbounded.

Problem Set: Orthogonal Polynomials

Math 682 6/17/98

Throughout, $w$ denotes a fixed, positive (except possibly at finitely many points), Riemann integrable weight function on $[a,b]$, and we consider the inner product on $C[a,b]$ defined by

$$\langle f, g\rangle = \int_a^b f(x)\,g(x)\,w(x)\,dx$$

and the associated (strictly convex) norm

$$\|f\|_2 = \sqrt{\langle f, f\rangle} = \left(\int_a^b |f(x)|^2\,w(x)\,dx\right)^{1/2}.$$

62. Prove that every inner product norm is strictly convex. Specifically, let $\langle\cdot,\cdot\rangle$ be an inner product on a vector space $X$, and let $\|x\| = \sqrt{\langle x, x\rangle}$ be the associated norm. Show that:

(a) $\|x+y\|^2 + \|x-y\|^2 = 2\,(\|x\|^2 + \|y\|^2)$ for all $x, y \in X$ (the parallelogram identity).

(b) If $\|x\| = r = \|y\|$ and if $\|x - y\| = \delta$, then $\left\|\frac{x+y}{2}\right\|^2 = r^2 - (\delta/2)^2$. In particular, $\left\|\frac{x+y}{2}\right\| < r$ whenever $x \neq y$.

We define a sequence of polynomials $(Q_n)$ which are mutually orthogonal, relative to $w$, by setting $Q_0(x) = 1$, $Q_1(x) = x - a_0$, and

$$Q_{n+1}(x) = (x - a_n)\,Q_n(x) - b_n\,Q_{n-1}(x) \quad\text{for } n \ge 1,\ \text{where}$$

$$a_n = \frac{\langle x\,Q_n, Q_n\rangle}{\langle Q_n, Q_n\rangle} \qquad\text{and}\qquad b_n = \frac{\langle x\,Q_n, Q_{n-1}\rangle}{\langle Q_{n-1}, Q_{n-1}\rangle}$$

(and where $x\,Q_n$ is shorthand for the polynomial $x\,Q_n(x)$).

63. Check that $Q_n$ is a monic polynomial of degree exactly $n$.

64. If $(P_n)$ is another sequence of orthogonal polynomials such that $P_n$ has degree exactly $n$, for each $n$, show that $P_n = \lambda_nQ_n$ for some $\lambda_n \neq 0$. In particular, if $P_n$ is a monic polynomial, then $P_n = Q_n$. [Hint: Choose $\lambda_n$ so that $P_n - \lambda_nQ_n \in \mathcal{P}_{n-1}$ and note that $(P_n - \lambda_nQ_n) \perp \mathcal{P}_{n-1}$. Conclude that $P_n - \lambda_nQ_n = 0$.]

65. Check that $\langle x\,Q_n, Q_{n-1}\rangle = \langle Q_n, Q_n\rangle$, and conclude that $b_n > 0$ for each $n$.

66. Given $f \in C[a,b]$ and $n \ge 0$, prove that $q_n \in \mathcal{P}_n$ is the least-squares approximation to $f$ out of $\mathcal{P}_n$ (with respect to $w$) if and only if

$$\langle f - q_n, p\rangle = \int_a^b (f(x) - q_n(x))\,p(x)\,w(x)\,dx = 0$$

for every $p \in \mathcal{P}_n$; that is, if and only if $(f - q_n) \perp \mathcal{P}_n$.

67. If $f \in C[a,b]$ but $f \notin \mathcal{P}_n$, show that $f - q_n$ changes sign at $n+1$ (or more) points in $(a,b)$. [Hint: If not, show that there is a polynomial $p \in \mathcal{P}_n$ such that $(f - q_n)\,p \ge 0$ (but $(f - q_n)\,p \not\equiv 0$) in $(a,b)$. Now appeal to the result in problem 66 to arrive at a contradiction.]

68. Show that the least-squares approximation to $f(x) = x^n$ out of $\mathcal{P}_{n-1}$ (relative to $w$) is $q_{n-1}(x) = x^n - Q_n(x)$.

69. Show that $Q_n$ has $n$ distinct, simple zeros in $(a,b)$. [Hint: Combine 67 and 68.]

70. Given $f \in C[a,b]$, let $p_n$ denote the best uniform approximation to $f$ out of $\mathcal{P}_n$ and let $q_n$ denote the least-squares approximation to $f$ out of $\mathcal{P}_n$. Show that $\|f - q_n\|_2 \le \|f - p_n\|_2$ and conclude that $\|f - q_n\|_2 \to 0$ as $n \to \infty$.

71. Show that the Chebyshev polynomials of the first kind, $(T_n)$, and of the second kind, $(U_n)$, satisfy the identities

$$T_n(x) = U_n(x) - x\,U_{n-1}(x) \qquad\text{and}\qquad (1 - x^2)\,U_{n-1}(x) = x\,T_n(x) - T_{n+1}(x).$$

72. Show that the Chebyshev polynomials of the second kind, $(U_n)$, satisfy the recurrence relation

$$U_{n+1}(x) = 2x\,U_n(x) - U_{n-1}(x), \qquad n \ge 1,$$

where $U_0(x) = 1$ and $U_1(x) = 2x$. [Please compare this with the recurrence relation satisfied by the $T_n$'s!]

Gaussian Quadrature

Math 682 6/23/98

Numerical integration, or quadrature, is the process of approximating the value of a definite integral $\int_a^b f(x)\,w(x)\,dx$ based only on a finite number of values or "samples" of $f$ (much like a Riemann sum). A linear quadrature formula takes the form

$$\int_a^b f(x)\,w(x)\,dx \approx \sum_{k=1}^n A_k\,f(x_k),$$

where the nodes $(x_k)$ and the weights $(A_k)$ are at our disposal. (Note that both sides of the formula are linear in $f$.)

Example. Consider the quadrature formula

$$I(f) = \int_{-1}^1 f(x)\,dx \approx \frac1n\sum_{k=-n}^{n-1} f\!\left(\frac{2k+1}{2n}\right) = I_n(f).$$

If $f$ is continuous, then we clearly have $I_n(f) \to \int_{-1}^1 f$ as $n \to \infty$. (Why?) But in the particular case $f(x) = x^2$ we have (after some simplification)

$$I_n(f) = \frac1n\sum_{k=-n}^{n-1}\left(\frac{2k+1}{2n}\right)^2 = \frac{1}{2n^3}\sum_{k=0}^{n-1}(2k+1)^2 = \frac23 - \frac{1}{6n^2}.$$

That is, $|I_n(f) - I(f)| = 1/6n^2$. In particular, we would need to take $n \ge 130$ to get $1/6n^2 \le 10^{-5}$, for example, and this would require that we perform over 250 evaluations of $f$. We'd like a method that converges a bit faster! In other words, there's no shortage of quadrature formulas; we just want faster ones.
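The arithmetic in this example is easy to reproduce; a few lines of Python confirming that the error for $f(x) = x^2$ is exactly $1/6n^2$:

```python
def I_n(f, n):
    """Midpoint rule for the integral of f over [-1,1] with 2n equal panels."""
    return sum(f((2 * k + 1) / (2 * n)) for k in range(-n, n)) / n

err = abs(I_n(lambda x: x * x, 10) - 2.0 / 3.0)   # equals 1/600 for n = 10
```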

One reasonable requirement for our proposed quadrature formula is that it be exact

for polynomials of low degree. As it happens, this is easy to come by.

Lemma 1. Given $w(x)$ on $[a,b]$ and nodes $a \le x_1 < \cdots < x_n \le b$, there exist unique weights $A_1, \ldots, A_n$ such that

$$\int_a^b p(x)\,w(x)\,dx = \sum_{i=1}^n A_i\,p(x_i)$$

for all polynomials $p \in \mathcal{P}_{n-1}$.

Proof. Let $\ell_1, \ldots, \ell_n$ be the Lagrange interpolating polynomials of degree $n-1$ associated to the nodes $x_1, \ldots, x_n$, and recall that we have $p = \sum_{i=1}^n p(x_i)\,\ell_i$ for all $p \in \mathcal{P}_{n-1}$. Hence,

$$\int_a^b p(x)\,w(x)\,dx = \sum_{i=1}^n p(x_i)\int_a^b \ell_i(x)\,w(x)\,dx.$$

That is, $A_i = \int_a^b \ell_i(x)\,w(x)\,dx$ works. To see that this is the only choice, suppose that

$$\int_a^b p(x)\,w(x)\,dx = \sum_{i=1}^n B_i\,p(x_i)$$

is exact for all $p \in \mathcal{P}_{n-1}$, and set $p = \ell_j$:

$$A_j = \int_a^b \ell_j(x)\,w(x)\,dx = \sum_{i=1}^n B_i\,\ell_j(x_i) = B_j.$$

The point here is that $\ell_1, \ldots, \ell_n$ form a basis for $\mathcal{P}_{n-1}$ and integration is linear; thus, integration is completely determined by its action on the basis, that is, by the $n$ values $A_i = I(\ell_i)$, $i = 1, \ldots, n$.

T

Said another way, the n point evaluations i (p) = p(xi ) satisfy Pn;1 \ ( n ker i ) =

i=1

f0g, and it follows that every linear, real-valued function on Pn;1 must be a linear com-

bination of the i 's. Here's why: Since the xi 's are distinct, Pn;1 may be identi ed with

Rn by way of the isomorphism p 7! (p(x1 ) : : : p(xn )). A linear, real-valued function on

Pn;1 must, then, correspond to some linear, real-valued function on Rn. In other words,

it's given by inner product against some xed vector (A1 : : : An ) in particular, we must

Pn A p(x ).

have I(p) = i=1 i i

In any case, we now have our quadrature formula: For $f \in C[a,b]$ we define $I_n(f) = \sum_{i=1}^n A_i\,f(x_i)$, where $A_i = \int_a^b \ell_i(x)\,w(x)\,dx$. But notice that the proof of our last result suggests an alternate way of writing our quadrature formula. Indeed, if $L_{n-1}(f)(x) = \sum_{i=1}^n f(x_i)\,\ell_i(x)$ is the Lagrange interpolating polynomial for $f$ of degree $n-1$ based on the nodes $x_1, \ldots, x_n$, then

$$\int_a^b (L_{n-1}(f))(x)\,w(x)\,dx = \sum_{i=1}^n f(x_i)\int_a^b \ell_i(x)\,w(x)\,dx = \sum_{i=1}^n A_i\,f(x_i).$$

In summary, $I_n(f) = I(L_{n-1}(f)) \approx I(f)$; that is,

$$I_n(f) = \sum_{i=1}^n A_i\,f(x_i) = \int_a^b (L_{n-1}(f))(x)\,w(x)\,dx \approx \int_a^b f(x)\,w(x)\,dx = I(f),$$

where $L_{n-1}$ is the Lagrange interpolating polynomial of degree $n-1$ based on the nodes $x_1, \ldots, x_n$. This formula is obviously exact for $f \in \mathcal{P}_{n-1}$.
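Lemma 1's weights $A_i = \int_a^b \ell_i(x)\,w(x)\,dx$ can be computed directly; here is a Python sketch for $w \equiv 1$ (the midpoint-sum integration and the node choice are our own illustrative assumptions):

```python
def lagrange_weights(nodes, a=-1.0, b=1.0, N=20000):
    """Interpolatory weights A_i = integral of ell_i over [a,b], with w = 1."""
    def ell(i, x):
        v = 1.0
        for j, xj in enumerate(nodes):
            if j != i:
                v *= (x - xj) / (nodes[i] - xj)
        return v
    h = (b - a) / N
    xs = [a + (m + 0.5) * h for m in range(N)]
    return [sum(ell(i, x) for x in xs) * h for i in range(len(nodes))]

# Three equally spaced nodes on [-1,1] recover Simpson's rule weights 1/3, 4/3, 1/3.
W = lagrange_weights([-1.0, 0.0, 1.0])
```

Since $\sum_i \ell_i \equiv 1$, the weights always add up to $\int_a^b w$, here $2$.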

It's easy to give a bound on $|I_n(f)|$ in terms of $\|f\|$; indeed,

$$|I_n(f)| \le \sum_{i=1}^n |A_i|\,|f(x_i)| \le \|f\|\left(\sum_{i=1}^n |A_i|\right).$$

By considering a norm-one continuous function $f$ satisfying $f(x_i) = \operatorname{sgn} A_i$ for each $i = 1, \ldots, n$, it's easy to see that $\sum_{i=1}^n |A_i|$ is the smallest constant that works in this inequality. In other words, $\Lambda_n = \sum_{i=1}^n |A_i|$, $n = 1, 2, \ldots$, are the "Lebesgue numbers" for this process. As with all previous settings, we want these numbers to be uniformly bounded.

If $w(x) \equiv 1$ and if $f$ is $n$-times continuously differentiable, we even have an error estimate for our quadrature formula:

$$\left|\int_a^b f - \int_a^b L_{n-1}(f)\right| \le \int_a^b |f - L_{n-1}(f)| \le \frac{\|f^{(n)}\|}{n!}\int_a^b \prod_{i=1}^n |x - x_i|\,dx$$

(recall the Theorem on page 72 of "A Brief Introduction to Interpolation"). As it happens, the integral on the right is minimized when the $x_i$'s are taken to be the zeros of the Chebyshev polynomial $U_n$ (see Rivlin, page 72).

The fact that a quadrature formula is exact for polynomials of low degree does not by itself guarantee that the formula is highly accurate. The problem is that $\sum_{i=1}^n A_i\,f(x_i)$ may be estimating a very small quantity through the cancellation of very large quantities. So, for example, a positive function may yield a negative result in this approximate integral. This wouldn't happen if the $A_i$'s were all positive, and we've already seen how useful positivity can be. Our goal here is to further improve our quadrature formula to have this property. But we have yet to take advantage of the fact that the $x_i$'s are at our disposal. We'll let Gauss show us the way!

Theorem. (Gauss) Fix a weight $w(x)$ on $[a,b]$, and let $(Q_n)$ be the canonical sequence of orthogonal polynomials relative to $w$. Given $n$, let $x_1, \ldots, x_n$ be the zeros of $Q_n$ (these all lie in $(a,b)$), and choose $A_1, \ldots, A_n$ so that the formula $\sum_{i=1}^n A_i\,f(x_i) \approx \int_a^b f(x)\,w(x)\,dx$ is exact for polynomials of degree less than $n$. Then, in fact, the formula is exact for all polynomials of degree less than $2n$.

Proof. Given a polynomial $P$ of degree less than $2n$, we may divide: $P = Q_nR + S$, where $R$ and $S$ are polynomials of degree less than $n$. Then,

$$\begin{aligned}
\int_a^b P(x)\,w(x)\,dx &= \int_a^b Q_n(x)\,R(x)\,w(x)\,dx + \int_a^b S(x)\,w(x)\,dx\\
&= \int_a^b S(x)\,w(x)\,dx \qquad\text{(since } \deg R < n\text{)}\\
&= \sum_{i=1}^n A_i\,S(x_i) \qquad\text{(since } \deg S < n\text{)}.
\end{aligned}$$

But $P(x_i) = Q_n(x_i)\,R(x_i) + S(x_i) = S(x_i)$, since $Q_n(x_i) = 0$. Hence, $\int_a^b P(x)\,w(x)\,dx = \sum_{i=1}^n A_i\,P(x_i)$ for all polynomials $P$ of degree less than $2n$.

Amazing! But, well, not really: $\mathcal{P}_{2n-1}$ is of dimension $2n$, and we had $2n$ numbers $x_1, \ldots, x_n$ and $A_1, \ldots, A_n$ to choose as we saw fit. Said another way, the division algorithm tells us that $\mathcal{P}_{2n-1} = Q_n\mathcal{P}_{n-1} \oplus \mathcal{P}_{n-1}$. Since $Q_n\mathcal{P}_{n-1} \subset \ker(I_n)$, the action of $I_n$ on $\mathcal{P}_{2n-1}$ is the same as its action on a "copy" of $\mathcal{P}_{n-1}$.

In still other words, since any polynomial that vanishes at all the $x_i$'s must be divisible by $Q_n$ (and conversely), we have $Q_n\mathcal{P}_{n-1} = \mathcal{P}_{2n-1} \cap \left(\bigcap_{i=1}^n \ker\delta_i\right) = \ker(I_n|_{\mathcal{P}_{2n-1}})$. Thus, $I_n$ "factors through" the quotient space $\mathcal{P}_{2n-1}/Q_n\mathcal{P}_{n-1} \cong \mathcal{P}_{n-1}$.

Also not surprising is that this particular choice of $x_i$'s is unique.

Lemma 2. Suppose that $a \le x_1 < \cdots < x_n \le b$ and $A_1, \ldots, A_n$ are given so that the equation $\int_a^b P(x)\,w(x)\,dx = \sum_{i=1}^n A_i\,P(x_i)$ is satisfied for all polynomials $P$ of degree less than $2n$. Then, $x_1, \ldots, x_n$ are the zeros of $Q_n$.

Proof. Let $Q(x) = \prod_{i=1}^n (x - x_i)$. Then, for $k < n$, the polynomial $Q\,Q_k$ has degree $n + k < 2n$. Hence,

$$\int_a^b Q(x)\,Q_k(x)\,w(x)\,dx = \sum_{i=1}^n A_i\,Q(x_i)\,Q_k(x_i) = 0.$$

Since $Q$ is a monic polynomial of degree $n$ which is orthogonal to each $Q_k$, $k < n$, we must have $Q = Q_n$. Thus, the $x_i$'s are actually the zeros of $Q_n$.
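For the Legendre weight this is concrete: the zeros of $Q_2(x) = x^2 - \tfrac13$ are $\pm1/\sqrt3$, and the corresponding weights both equal $1$, giving the classical two-point Gauss rule; a Python sketch verifying the degree-$< 2n$ exactness:

```python
import math

# Two-point Gauss rule on [-1,1] with w = 1: the nodes are the zeros of the
# monic orthogonal polynomial Q_2(x) = x^2 - 1/3, and both weights are 1.
NODES = (-1.0 / math.sqrt(3.0), 1.0 / math.sqrt(3.0))
WEIGHTS = (1.0, 1.0)

def gauss2(f):
    return sum(A * f(x) for A, x in zip(WEIGHTS, NODES))
```

The rule integrates every cubic exactly, but not $x^4$ (it gives $2/9$ instead of $2/5$), exactly as the $2n = 4$ bound predicts.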

According to Rivlin, the phrase Gaussian quadrature is usually reserved for the specific quadrature formula whereby $\int_{-1}^1 f(x)\,dx$ is approximated by $\int_{-1}^1 (L_{n-1}(f))(x)\,dx$, where $L_{n-1}(f)$ is the Lagrange interpolating polynomial to $f$ using the zeros of the $n$-th Legendre polynomial as nodes. (What a mouthful!) What is actually being described in our version of Gauss's theorem is Gaussian-type quadrature.

Before computers, Gaussian quadrature was little more than a curiosity; the roots of $Q_n$ are typically irrational, and certainly not easy to come by. By now, though, it's considered a standard quadrature technique. In any case, we still can't judge the quality of Gauss's method without a bit more information.

Gaussian-type Quadrature

First, let's summarize our rather cumbersome notation.

    orthogonal polynomial    zeros                                   weights                                 approximate integral
    $Q_1$                    $x_1^{(1)}$                             $A_1^{(1)}$                             $I_1$
    $Q_2$                    $x_1^{(2)},\ x_2^{(2)}$                 $A_1^{(2)},\ A_2^{(2)}$                 $I_2$
    $Q_3$                    $x_1^{(3)},\ x_2^{(3)},\ x_3^{(3)}$     $A_1^{(3)},\ A_2^{(3)},\ A_3^{(3)}$     $I_3$
    ...                      ...                                     ...                                     ...

Hidden here is the Lagrange interpolation formula $L_{n-1}(f) = \sum_{i=1}^n f\bigl(x_i^{(n)}\bigr)\,\ell_i^{(n-1)}$, where $\ell_i^{(n-1)}$ denote the Lagrange polynomials of degree $n-1$ based on $x_1^{(n)}, \dots, x_n^{(n)}$. The $n$-th quadrature formula is then
\[
I_n(f) = \int_a^b L_{n-1}(f)(x)\,w(x)\,dx = \sum_{i=1}^n A_i^{(n)} f\bigl(x_i^{(n)}\bigr) \approx \int_a^b f(x)\,w(x)\,dx,
\]
which is exact for polynomials of degree less than $2n$.

By way of one example, Hermite showed that $A_k^{(n)} = \pi/n$ for the Chebyshev weight $w(x) = (1 - x^2)^{-1/2}$ on $[-1,1]$. Remarkably, $A_k^{(n)}$ doesn't depend on $k$! The quadrature formula in this case reads:
\[
\int_{-1}^1 \frac{f(x)}{\sqrt{1 - x^2}}\,dx \approx \frac{\pi}{n} \sum_{k=1}^n f\Bigl(\cos\frac{(2k-1)\pi}{2n}\Bigr).
\]
Or, if you prefer,
\[
\int_{-1}^1 f(x)\,dx \approx \frac{\pi}{n} \sum_{k=1}^n f\Bigl(\cos\frac{(2k-1)\pi}{2n}\Bigr)\,\sin\frac{(2k-1)\pi}{2n}.
\]
(Why?) You can find full details in Natanson's Constructive Function Theory, Vol. III.
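Hermite's formula is easy to try out. The sketch below (an illustration of the displayed rule, assuming nothing beyond it) applies the $n$-point Chebyshev rule, with every weight equal to $\pi/n$, to a low-degree polynomial:

```python
import math

def chebyshev_gauss(f, n):
    """Approximate the integral of f(x)/sqrt(1 - x^2) over [-1, 1] using
    n nodes at the Chebyshev zeros cos((2k-1)pi/2n); by Hermite's result
    every weight equals pi/n."""
    return (math.pi / n) * sum(
        f(math.cos((2 * k - 1) * math.pi / (2 * n))) for k in range(1, n + 1))

# The rule with n = 5 is exact for degree < 10, so x^2 comes out exactly:
# the integral of x^2 / sqrt(1 - x^2) over [-1, 1] equals pi/2.
print(chebyshev_gauss(lambda x: x**2, 5))
```

Note that no weights had to be tabulated; for this particular weight the nodes and weights are both in closed form.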

The key result, due to Stieltjes, is that $I_n$ is positive:

Lemma 3. $A_1^{(n)}, \dots, A_n^{(n)} > 0$ and $\sum_{i=1}^n A_i^{(n)} = \int_a^b w(x)\,dx$.

Proof. The second assertion is obvious (since $I_n(1) = I(1)$). For the first, fix $1 \le j \le n$ and notice that $(\ell_j^{(n-1)})^2$ is of degree $2(n-1) < 2n$. Thus,
\[
0 < \bigl\langle \ell_j^{(n-1)}, \ell_j^{(n-1)} \bigr\rangle = \int_a^b \bigl[\ell_j^{(n-1)}(x)\bigr]^2\,w(x)\,dx = \sum_{i=1}^n A_i^{(n)} \bigl[\ell_j^{(n-1)}\bigl(x_i^{(n)}\bigr)\bigr]^2 = A_j^{(n)},
\]
because $\ell_j^{(n-1)}\bigl(x_i^{(n)}\bigr) = \delta_{ij}$.

Now our last calculation is quite curious; what we've shown is that
\[
A_j^{(n)} = \int_a^b \ell_j^{(n-1)}(x)\,w(x)\,dx = \int_a^b \bigl[\ell_j^{(n-1)}(x)\bigr]^2\,w(x)\,dx.
\]
The same calculation as above also proves

Corollary. $\bigl\langle \ell_i^{(n-1)}, \ell_j^{(n-1)} \bigr\rangle = 0$ for $i \ne j$.

Since $A_1^{(n)}, \dots, A_n^{(n)} > 0$, it follows that $I_n$ is positive; that is, $I_n(f) \ge 0$ whenever $f \ge 0$. The second assertion in Lemma 3 tells us that the $I_n$'s are uniformly bounded:
\[
|I_n(f)| \le \|f\| \sum_{i=1}^n A_i^{(n)} = \|f\| \int_a^b w(x)\,dx,
\]
and this is the same bound that holds for $I(f) = \int_a^b f(x)\,w(x)\,dx$ itself. Given all of this, proving that $I_n(f) \to I(f)$ is a piece of cake. The following result is again due to Stieltjes (a la Lebesgue).

Theorem. In the above notation, $|I_n(f) - I(f)| \le 2\,E_{2n-1}(f) \int_a^b w(x)\,dx$. In particular, $I_n(f) \to I(f)$ for every $f \in C[a,b]$.

Proof. Let $p^*$ be the best uniform approximation to $f$ out of $\mathcal{P}_{2n-1}$. Then, since $I_n(p^*) = I(p^*)$, we have
\[
\begin{aligned}
|I(f) - I_n(f)| &\le |I(f - p^*)| + |I_n(f - p^*)| \\
&\le \|f - p^*\| \int_a^b w(x)\,dx + \|f - p^*\| \sum_{i=1}^n A_i^{(n)} \\
&= 2\,\|f - p^*\| \int_a^b w(x)\,dx = 2\,E_{2n-1}(f) \int_a^b w(x)\,dx.
\end{aligned}
\]
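Convergence is visible numerically. In the sketch below (my own illustration), the $n$-point Chebyshev rule from the previous section is applied to the smooth, non-polynomial function $e^x$; since $E_{2n-1}(f)$ decays rapidly for smooth $f$, the approximations stabilize almost immediately:

```python
import math

def I_n(f, n):
    # n-point Chebyshev-Gauss rule (the quadrature formula from the
    # Hermite example: equal weights pi/n at the Chebyshev zeros)
    return (math.pi / n) * sum(
        f(math.cos((2 * k - 1) * math.pi / (2 * n))) for k in range(1, n + 1))

# I_n(f) -> I(f): for f(x) = e^x the successive approximations to the
# integral of e^x / sqrt(1 - x^2) over [-1, 1] agree to many digits by n = 8.
vals = [I_n(math.exp, n) for n in (2, 4, 8, 16)]
print(vals)
```

The limiting value is $\int_{-1}^1 e^x (1-x^2)^{-1/2}\,dx = \pi I_0(1) \approx 3.9775$ (with $I_0$ the modified Bessel function), though the point here is only the rapid stabilization predicted by the theorem.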

Computational Considerations

You've probably been asking yourself: "How do I find the $A_i$'s without integrating?" Well, first let's recall the definition: In the case of Gaussian-type quadrature we have
\[
A_i^{(n)} = \int_a^b \ell_i^{(n-1)}(x)\,w(x)\,dx = \int_a^b \frac{Q_n(x)}{\bigl(x - x_i^{(n)}\bigr)\,Q_n'\bigl(x_i^{(n)}\bigr)}\,w(x)\,dx
\]
(because the node polynomial "$W$" from our interpolation formula is the same as $Q_n$ here; the $x_i$'s are the zeros of $Q_n$). Next, consider the function
\[
\varphi_n(x) = \int_a^b \frac{Q_n(t) - Q_n(x)}{t - x}\,w(t)\,dt.
\]
Since $t - x$ divides $Q_n(t) - Q_n(x)$, note that $\varphi_n$ is actually a polynomial (of degree at most $n-1$) and that
\[
\varphi_n\bigl(x_i^{(n)}\bigr) = \int_a^b \frac{Q_n(t)}{t - x_i^{(n)}}\,w(t)\,dt = A_i^{(n)}\,Q_n'\bigl(x_i^{(n)}\bigr).
\]
Now $Q_n'\bigl(x_i^{(n)}\bigr)$ is readily available; we just need to compute $\varphi_n\bigl(x_i^{(n)}\bigr)$.

Claim. The $\varphi_n$'s satisfy the same recurrence formula as the $Q_n$'s,
\[
\varphi_{n+1}(x) = (x - a_n)\,\varphi_n(x) - b_n\,\varphi_{n-1}(x), \qquad n \ge 1,
\]
but with different starting values:
\[
\varphi_0(x) \equiv 0 \qquad\text{and}\qquad \varphi_1(x) \equiv \int_a^b w(x)\,dx.
\]

Proof. The formulas for $\varphi_0$ and $\varphi_1$ are obviously correct, since $Q_0(x) \equiv 1$ and $Q_1(x) = x - a_0$. We only need to check the recurrence formula itself.
\[
\begin{aligned}
\varphi_{n+1}(x) &= \int_a^b \frac{Q_{n+1}(t) - Q_{n+1}(x)}{t - x}\,w(t)\,dt \\
&= \int_a^b \frac{(t - a_n)\,Q_n(t) - b_n\,Q_{n-1}(t) - (x - a_n)\,Q_n(x) + b_n\,Q_{n-1}(x)}{t - x}\,w(t)\,dt \\
&= (x - a_n) \int_a^b \frac{Q_n(t) - Q_n(x)}{t - x}\,w(t)\,dt - b_n \int_a^b \frac{Q_{n-1}(t) - Q_{n-1}(x)}{t - x}\,w(t)\,dt \\
&= (x - a_n)\,\varphi_n(x) - b_n\,\varphi_{n-1}(x),
\end{aligned}
\]
since $(t - a_n)\,Q_n(t) - (x - a_n)\,Q_n(x) = (x - a_n)\bigl(Q_n(t) - Q_n(x)\bigr) + (t - x)\,Q_n(t)$ and $\int_a^b Q_n(t)\,w(t)\,dt = 0$.

Of course, the derivatives $Q_n'$ satisfy a recurrence relation of sorts, too:
\[
Q_{n+1}'(x) = Q_n(x) + (x - a_n)\,Q_n'(x) - b_n\,Q_{n-1}'(x).
\]

But $Q_n'\bigl(x_i^{(n)}\bigr)$ can be computed without knowing $Q_n'(x)$. Indeed, $Q_n(x) = \prod_{i=1}^n \bigl(x - x_i^{(n)}\bigr)$, so we have $Q_n'\bigl(x_i^{(n)}\bigr) = \prod_{j \ne i} \bigl(x_i^{(n)} - x_j^{(n)}\bigr)$.
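For instance (a hand-worked case of the recipe $A_i^{(n)} = \varphi_n(x_i^{(n)})/Q_n'(x_i^{(n)})$, not from the text), take the Legendre weight $w \equiv 1$ on $[-1,1]$ and $n = 2$. Here $Q_2(x) = x^2 - \tfrac13$, so $\bigl(Q_2(t) - Q_2(x)\bigr)/(t - x) = t + x$ and $\varphi_2(x) = \int_{-1}^1 (t + x)\,dt = 2x$, while $Q_2'(x) = 2x$:

```python
import math

# Legendre case w = 1 on [-1, 1], n = 2:
# Q_2(x) = x^2 - 1/3, so (Q_2(t) - Q_2(x))/(t - x) = t + x and
# phi_2(x) = integral of (t + x) dt over [-1, 1] = 2x.
def phi2(x):
    return 2 * x

def Q2_prime(x):
    return 2 * x

nodes = [-1 / math.sqrt(3), 1 / math.sqrt(3)]
weights = [phi2(x) / Q2_prime(x) for x in nodes]
print(weights)  # both weights equal 1, matching the two-point Gauss rule
```

No integration against $w$ was needed once $\varphi_2$ was in hand; in practice one generates $\varphi_n$ by the recurrence of the Claim rather than in closed form.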

The weights $A_i^{(n)}$, or Christoffel numbers, together with the zeros of $Q_n$ are tabulated in a variety of standard cases. See, for example, Handbook of Mathematical Functions with Formulas, Graphs, and Tables, by Abramowitz and Stegun, eds. In practice, of course, it's enough to tabulate data for the case $[a,b] = [-1,1]$.

Applications to Interpolation

Although $L_n(f)$ isn't typically a good uniform approximation to $f$, if we interpolate at the zeros of an orthogonal polynomial $Q_{n+1}$, then $L_n(f)$ will be a good approximation in the $\|\cdot\|_1$ or $\|\cdot\|_2$ norm generated by the corresponding weight $w$. Specifically, by rewording our earlier results, it's easy to get estimates for each of the errors $\int_a^b |f - L_n(f)|\,w$ and $\int_a^b |f - L_n(f)|^2\,w$. We use essentially the same notation as before, except now we take
\[
L_n(f) = \sum_{i=1}^{n+1} f\bigl(x_i^{(n+1)}\bigr)\,\ell_i^{(n)},
\]
where $x_1^{(n+1)}, \dots, x_{n+1}^{(n+1)}$ are the roots of $Q_{n+1}$ and $\ell_i^{(n)}$ is of degree $n$. This leads to a quadrature formula that's exact on polynomials of degree less than $2(n+1)$.

As we've already seen, $\ell_1^{(n)}, \dots, \ell_{n+1}^{(n)}$ are orthogonal, and so $\|L_n(f)\|_2$ may be computed exactly.

Lemma. $\|L_n(f)\|_2 \le \|f\| \bigl(\int_a^b w(x)\,dx\bigr)^{1/2}$.

Proof. Since $L_n(f)^2$ is a polynomial of degree $2n < 2(n+1)$, we have
\[
\begin{aligned}
\|L_n(f)\|_2^2 &= \int_a^b [L_n(f)]^2\,w(x)\,dx \\
&= \sum_{j=1}^{n+1} A_j^{(n+1)} \Bigl[\,\sum_{i=1}^{n+1} f\bigl(x_i^{(n+1)}\bigr)\,\ell_i^{(n)}\bigl(x_j^{(n+1)}\bigr)\Bigr]^2 \\
&= \sum_{j=1}^{n+1} A_j^{(n+1)} \bigl[f\bigl(x_j^{(n+1)}\bigr)\bigr]^2 \\
&\le \|f\|^2 \sum_{j=1}^{n+1} A_j^{(n+1)} = \|f\|^2 \int_a^b w(x)\,dx.
\end{aligned}
\]
Please note that we also have $\|f\|_2 \le \|f\| \bigl(\int_a^b w(x)\,dx\bigr)^{1/2}$; that is, this same estimate holds for $\|f\|_2$ itself.
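The middle identity in the proof, $\|L_n(f)\|_2^2 = \sum_j A_j^{(n+1)} f(x_j^{(n+1)})^2$, is worth checking numerically. Below is a sketch (my own, for the Chebyshev weight, where every $A_j^{(n+1)} = \pi/(n+1)$): the weighted sum of squared samples is compared against a direct evaluation of $\int [L_n(f)]^2\,w$ by a much finer rule, which is also exact here since $\deg [L_n(f)]^2 = 2n$.

```python
import math

def cheb_nodes(m):
    # zeros of the Chebyshev polynomial T_m
    return [math.cos((2 * k - 1) * math.pi / (2 * m)) for k in range(1, m + 1)]

def lagrange(f, xs):
    """Return the Lagrange interpolating polynomial to f at the nodes xs."""
    ys = [f(x) for x in xs]
    def L(x):
        total = 0.0
        for i, xi in enumerate(xs):
            term = ys[i]
            for j, xj in enumerate(xs):
                if j != i:
                    term *= (x - xj) / (xi - xj)
            total += term
        return total
    return L

n = 4
xs = cheb_nodes(n + 1)            # interpolate e^x at the zeros of T_5
Ln = lagrange(math.exp, xs)

# Lemma identity: ||L_n(f)||_2^2 = sum_j A_j f(x_j)^2, with A_j = pi/(n+1)
via_weights = (math.pi / (n + 1)) * sum(math.exp(x) ** 2 for x in xs)

# Direct weighted integral of [L_n(f)]^2 via a 40-point Chebyshev rule,
# exact because deg [L_n(f)]^2 = 8 < 2 * 40
m = 40
direct = (math.pi / m) * sum(Ln(t) ** 2 for t in cheb_nodes(m))

print(via_weights, direct)        # the two agree to rounding error
```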

As usual, once we have an estimate for the norm of an operator, we also have an analogue of Lebesgue's theorem.

Theorem. $\|f - L_n(f)\|_2 \le 2 \bigl(\int_a^b w(x)\,dx\bigr)^{1/2} E_n(f)$.

Proof. Here we go again! Let $p^*$ be the best uniform approximation to $f$ out of $\mathcal{P}_n$ and use the fact that $L_n(p^*) = p^*$ to see that:
\[
\begin{aligned}
\|f - L_n(f)\|_2 &\le \|f - p^*\|_2 + \|L_n(f - p^*)\|_2 \\
&\le \|f - p^*\| \Bigl(\int_a^b w(x)\,dx\Bigr)^{1/2} + \|f - p^*\| \Bigl(\int_a^b w(x)\,dx\Bigr)^{1/2} \\
&= 2\,E_n(f) \Bigl(\int_a^b w(x)\,dx\Bigr)^{1/2}.
\end{aligned}
\]
Hence, if we interpolate $f \in C[a,b]$ at the zeros of the $Q_n$'s, then $L_n(f) \to f$ in $\|\cdot\|_2$ norm. The analogous result for the $\|\cdot\|_1$ norm is now easy:

Corollary. $\int_a^b |f(x) - L_n(f)(x)|\,w(x)\,dx \le 2\,E_n(f) \int_a^b w(x)\,dx$.

Proof. We apply the Cauchy-Schwarz inequality:
\[
\begin{aligned}
\int_a^b |f(x) - L_n(f)(x)|\,w(x)\,dx &= \int_a^b |f(x) - L_n(f)(x)|\,\sqrt{w(x)}\,\sqrt{w(x)}\,dx \\
&\le \Bigl(\int_a^b |f(x) - L_n(f)(x)|^2\,w(x)\,dx\Bigr)^{1/2} \Bigl(\int_a^b w(x)\,dx\Bigr)^{1/2} \\
&\le 2\,E_n(f) \int_a^b w(x)\,dx.
\end{aligned}
\]

Essentially the same device allows an estimate of $\int_a^b f(x)\,dx$ in terms of $\int_a^b f(x)\,w(x)\,dx$ (which may be easier to compute).

Corollary. If $\int_a^b w(x)^{-1}\,dx$ is finite, then
\[
\begin{aligned}
\int_a^b |f(x) - L_n(f)(x)|\,dx &= \int_a^b |f(x) - L_n(f)(x)|\,\sqrt{w(x)}\,\frac{1}{\sqrt{w(x)}}\,dx \\
&\le \Bigl(\int_a^b |f(x) - L_n(f)(x)|^2\,w(x)\,dx\Bigr)^{1/2} \Bigl(\int_a^b \frac{1}{w(x)}\,dx\Bigr)^{1/2} \\
&\le 2\,E_n(f) \Bigl(\int_a^b w(x)\,dx\Bigr)^{1/2} \Bigl(\int_a^b \frac{1}{w(x)}\,dx\Bigr)^{1/2}.
\end{aligned}
\]
In particular, the Chebyshev weight satisfies
\[
\int_{-1}^1 \frac{dx}{\sqrt{1 - x^2}} = \pi \qquad\text{and}\qquad \int_{-1}^1 \sqrt{1 - x^2}\,dx = \frac{\pi}{2}.
\]
Thus, interpolation at the zeros of the Chebyshev polynomials (of the first kind) would provide good, simultaneous approximation in each of the norms $\|\cdot\|_1$, $\|\cdot\|_2$, and $\|\cdot\|$.
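The simultaneous quality of Chebyshev-node interpolation is easy to observe. The sketch below (my own, using plain Lagrange interpolation at the zeros of $T_{10}$) measures the uniform error for the smooth test function $e^x$ on a grid; since the $\|\cdot\|_1$ and $\|\cdot\|_2$ errors are dominated by the uniform error times a constant, a tiny uniform error certifies all three at once:

```python
import math

def cheb_nodes(n):
    # zeros of T_{n+1}: n + 1 nodes, supporting interpolation of degree n
    return [math.cos((2 * k - 1) * math.pi / (2 * (n + 1)))
            for k in range(1, n + 2)]

def lagrange(f, xs):
    """Return the Lagrange interpolating polynomial to f at the nodes xs."""
    ys = [f(x) for x in xs]
    def L(x):
        total = 0.0
        for i, xi in enumerate(xs):
            term = ys[i]
            for j, xj in enumerate(xs):
                if j != i:
                    term *= (x - xj) / (xi - xj)
            total += term
        return total
    return L

# Degree-9 interpolation of e^x at the zeros of T_10
L9 = lagrange(math.exp, cheb_nodes(9))

# Uniform error sampled on a grid over [-1, 1]
grid = [i / 100 - 1 for i in range(201)]
err = max(abs(L9(x) - math.exp(x)) for x in grid)
print(err)  # tiny for this smooth f
```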

The Moment Problem

Given a positive, continuous weight function $w(x)$ on $[a,b]$, the number
\[
\mu_k = \int_a^b x^k\,w(x)\,dx
\]
is called the $k$-th moment of $w$. In physical terms, if we think of $w(x)$ as the density of a thin rod placed on the interval $[a,b]$, then $\mu_0$ is the mass of the rod, $\mu_1/\mu_0$ is its center of mass, $\mu_2$ is its moment of inertia (about 0), and so on. In probabilistic terms, if $\mu_0 = 1$, then $w$ is the probability density function for some random variable, $\mu_1$ is the expected value, or mean, of this random variable, and $\mu_2 - \mu_1^2$ is its variance. The moment problem (or problems, really) concerns the inverse procedure. What can be measured in real life are the moments; can the moments be used to find the density function?

Questions: Do the moments determine $w$? Do different weights have different moment sequences? If we knew the sequence $(\mu_k)$, could we find $w$? How do we tell if a given sequence $(\mu_k)$ is the moment sequence for some positive weight? Do "special" weights give rise to "special" moment sequences?

Now we've already answered one of these questions: The Weierstrass theorem tells us that different weights have different moment sequences. Said another way, if
\[
\int_a^b x^k\,w(x)\,dx = 0 \qquad\text{for all } k = 0, 1, 2, \dots,
\]
then $w \equiv 0$. Indeed, by linearity, this says that $\int_a^b p(x)\,w(x)\,dx = 0$ for all polynomials $p$, which, in turn, tells us that $\int_a^b w(x)^2\,dx = 0$. (Why?) The remaining questions are harder to answer. We'll settle for simply stating a few pertinent results.

Given a sequence of numbers $(\mu_k)$, we define the $n$-th difference sequence $(\Delta^n \mu_k)$ by
\[
\Delta^0 \mu_k = \mu_k, \qquad \Delta^1 \mu_k = \mu_k - \mu_{k+1}, \qquad \Delta^n \mu_k = \Delta^{n-1}\mu_k - \Delta^{n-1}\mu_{k+1}, \quad n \ge 1.
\]
For example, $\Delta^2 \mu_k = \mu_k - 2\mu_{k+1} + \mu_{k+2}$. More generally, induction will show that
\[
\Delta^n \mu_k = \sum_{i=0}^n (-1)^i \binom{n}{i}\,\mu_{k+i}.
\]
In the case of a weight $w$ on the interval $[0,1]$, this sum is easy to recognize as an integral. Indeed,
\[
\int_0^1 x^k (1-x)^n\,w(x)\,dx = \sum_{i=0}^n (-1)^i \binom{n}{i} \int_0^1 x^{k+i}\,w(x)\,dx = \sum_{i=0}^n (-1)^i \binom{n}{i}\,\mu_{k+i} = \Delta^n \mu_k.
\]
In particular, if $w$ is nonnegative, then we must have $\Delta^n \mu_k \ge 0$ for every $n$ and $k$. This observation serves as motivation for

Theorem. The following are equivalent:

(a) $(\mu_k)$ is the moment sequence of some nonnegative weight function $w$ on $[0,1]$.

(b) $\Delta^n \mu_k \ge 0$ for every $n$ and $k$.

(c) $a_0\mu_0 + a_1\mu_1 + \cdots + a_n\mu_n \ge 0$ whenever $a_0 + a_1 x + \cdots + a_n x^n \ge 0$ for all $0 \le x \le 1$.

The equivalence of (a) and (b) is due to Hausdorff. A real sequence satisfying (b) or (c) is sometimes said to be positive definite.
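Condition (b) can be checked exactly in a concrete case. The sketch below (my own illustration) takes $w \equiv 1$ on $[0,1]$, so $\mu_k = 1/(k+1)$, and verifies with exact rational arithmetic that every difference $\Delta^n \mu_k$ is positive and equals the Beta integral $\int_0^1 x^k(1-x)^n\,dx = k!\,n!/(k+n+1)!$:

```python
from math import comb, factorial
from fractions import Fraction

# Moments of the weight w = 1 on [0, 1]: mu_k = 1/(k + 1).
def mu(k):
    return Fraction(1, k + 1)

def delta(n, k):
    # n-th difference via the closed form: sum_i (-1)^i C(n, i) mu_{k+i}
    return sum((-1)**i * comb(n, i) * mu(k + i) for i in range(n + 1))

# Each difference equals the integral of x^k (1 - x)^n over [0, 1],
# i.e. the Beta value k! n! / (k + n + 1)! -- positive, as (b) demands.
for n in range(6):
    for k in range(6):
        beta = Fraction(factorial(k) * factorial(n), factorial(k + n + 1))
        assert delta(n, k) == beta
print("all differences positive and equal to the Beta integral")
```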

Now dozens of mathematicians worked on various aspects of the moment problem: Chebyshev, Markov, Stieltjes, Cauchy, Riesz, Frechet, and on and on. And several of them, in particular Cauchy and Stieltjes, noticed the importance of the integral $\int_a^b \frac{w(t)}{x - t}\,dt$ in attacking the problem. (Compare this expression to Cauchy's integral formula.) It was Stieltjes, however, who gave the first complete solution to such a problem, developing his own integral (by considering $\int_a^b \frac{dW(t)}{x - t}$), his own variety of continued fractions, and planting the seeds for the study of orthogonal polynomials while he was at it! We will attempt to at least sketch a few of these connections.

To begin, let's fix our notation: To simplify things, we suppose that we're given a nonnegative weight $w(x)$ on a symmetric interval $[-a,a]$, and that all of the moments of $w$ are finite. We will otherwise stick to our usual notations for $(Q_n)$, the Gaussian-type quadrature formulas, and so on. Next, we consider the moment-generating function:

Lemma. If $x \notin [-a,a]$, then $\displaystyle\int_{-a}^a \frac{w(t)}{x - t}\,dt = \sum_{k=0}^\infty \frac{\mu_k}{x^{k+1}}$.

Proof. $\dfrac{1}{x - t} = \dfrac{1}{x} \cdot \dfrac{1}{1 - (t/x)} = \displaystyle\sum_{k=0}^\infty \frac{t^k}{x^{k+1}}$, and the sum converges uniformly because $|t/x| \le a/|x| < 1$. Now just multiply by $w(t)$ and integrate.

By way of an example, consider the Chebyshev weight $w(x) = (1 - x^2)^{-1/2}$ on $[-1,1]$. For $x > 1$ we have
\[
\begin{aligned}
\int_{-1}^1 \frac{dt}{(x - t)\sqrt{1 - t^2}} &= \frac{\pi}{\sqrt{x^2 - 1}} \qquad \bigl(\text{set } t = 2u/(1 + u^2)\bigr) \\
&= \frac{\pi}{x} \Bigl(1 - \frac{1}{x^2}\Bigr)^{-1/2} \\
&= \frac{\pi}{x} \Bigl(1 + \frac{1}{2}\,\frac{1}{x^2} + \frac{1 \cdot 3}{2^2\,2!}\,\frac{1}{x^4} + \cdots\Bigr),
\end{aligned}
\]
using the binomial formula. Thus, we've found all the moments:
\[
\mu_0 = \int_{-1}^1 \frac{dt}{\sqrt{1 - t^2}} = \pi, \qquad
\mu_{2n-1} = \int_{-1}^1 \frac{t^{2n-1}\,dt}{\sqrt{1 - t^2}} = 0, \qquad
\mu_{2n} = \int_{-1}^1 \frac{t^{2n}\,dt}{\sqrt{1 - t^2}} = \pi\,\frac{1 \cdot 3 \cdot 5 \cdots (2n-1)}{2^n\,n!}.
\]
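These closed forms can be cross-checked against quadrature. In the sketch below (my own), each moment $\mu_k$ is computed by the $m$-point Chebyshev rule, which is exact for $t^k$ as soon as $k < 2m$, and compared with the double-factorial formula just derived:

```python
import math

def double_fact_odd(n):
    # 1 * 3 * 5 * ... * (2n - 1), with the empty product equal to 1
    out = 1
    for j in range(1, n + 1):
        out *= 2 * j - 1
    return out

def moment(k, m=40):
    # mu_k via the m-point Chebyshev-Gauss rule; exact once k < 2m
    return (math.pi / m) * sum(
        math.cos((2 * i - 1) * math.pi / (2 * m)) ** k for i in range(1, m + 1))

for n in range(6):
    closed = math.pi * double_fact_odd(n) / (2**n * math.factorial(n))
    assert abs(moment(2 * n) - closed) < 1e-10
    assert abs(moment(2 * n + 1)) < 1e-10   # odd moments vanish by symmetry
print("moments match the binomial-series formula")
```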

Stieltjes proved much more: The integral $\int_{-a}^a \frac{w(t)}{x - t}\,dt$ is actually an analytic function of $x$ in $\mathbb{C} \setminus [-a,a]$. In any case, since $x \notin [-a,a]$, we know that $\frac{1}{x - t}$ is continuous on $[-a,a]$. In particular, we can apply our quadrature formulas (and Stieltjes' theorem, p. 132) to write
\[
\int_{-a}^a \frac{w(t)}{x - t}\,dt = \lim_{n\to\infty} \sum_{i=1}^n \frac{A_i^{(n)}}{x - x_i^{(n)}},
\]
and these sums are recognizable:

Lemma. $\displaystyle\sum_{i=1}^n \frac{A_i^{(n)}}{x - x_i^{(n)}} = \frac{\varphi_n(x)}{Q_n(x)}$.

Proof. Since $\varphi_n$ has degree $< n$ and $\varphi_n\bigl(x_i^{(n)}\bigr) \ne 0$ for any $i$, we may appeal to partial fractions to write
\[
\frac{\varphi_n(x)}{Q_n(x)} = \frac{\varphi_n(x)}{\bigl(x - x_1^{(n)}\bigr) \cdots \bigl(x - x_n^{(n)}\bigr)} = \sum_{i=1}^n \frac{c_i}{x - x_i^{(n)}},
\]
where $c_i$ is given by
\[
c_i = \frac{\varphi_n(x)}{Q_n(x)}\,\bigl(x - x_i^{(n)}\bigr)\bigg|_{x = x_i^{(n)}} = \frac{\varphi_n\bigl(x_i^{(n)}\bigr)}{Q_n'\bigl(x_i^{(n)}\bigr)} = A_i^{(n)}.
\]

Now here's where the continued fractions come in: Stieltjes recognized the fact that
\[
\frac{\varphi_{n+1}(x)}{Q_{n+1}(x)} = \cfrac{b_0}{(x - a_0) - \cfrac{b_1}{(x - a_1) - \cfrac{b_2}{\ddots \ - \cfrac{b_n}{(x - a_n)}}}}
\]
(which can be proved by induction), where $b_0 = \int_a^b w(t)\,dt$. More generally, induction will show that the $n$-th convergent of a continued fraction can be written as
\[
\frac{A_n}{B_n} = \cfrac{p_1}{q_1 - \cfrac{p_2}{q_2 - \cfrac{p_3}{\ddots \ - \cfrac{p_n}{q_n}}}}
\]
by means of the recurrence formulas
\[
A_0 = 0, \quad B_0 = 1, \qquad A_1 = p_1, \quad B_1 = q_1,
\]
\[
A_n = q_n A_{n-1} + p_n A_{n-2}, \qquad B_n = q_n B_{n-1} + p_n B_{n-2},
\]
where $n = 2, 3, 4, \dots$. Please note that $A_n$ and $B_n$ satisfy the same recurrence formula, but with different starting values (as is the case with $\varphi_n$ and $Q_n$).

Again using the Chebyshev weight as an example, for $x > 1$ we have
\[
\int_{-1}^1 \frac{dt}{(x - t)\sqrt{1 - t^2}} = \frac{\pi}{\sqrt{x^2 - 1}} = \cfrac{\pi}{x - \cfrac{1/2}{x - \cfrac{1/4}{x - \cfrac{1/4}{x - \ddots}}}}
\]
since $a_n = 0$ for all $n$, $b_1 = 1/2$, and $b_n = 1/4$ for $n \ge 2$. In other words, we've just found a continued fraction expansion for $(x^2 - 1)^{-1/2}$.
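As a sanity check (my own numerical sketch, not in the text), the truncated fraction can be evaluated bottom-up and compared with $\pi/\sqrt{x^2 - 1}$:

```python
import math

def cf_sqrt_inv(x, depth=60):
    """Evaluate the truncated continued fraction
       pi / (x - (1/2) / (x - (1/4) / (x - (1/4) / ...)))
    bottom-up; for x > 1 it converges to pi / sqrt(x^2 - 1)."""
    tail = 0.0
    for _ in range(depth):
        tail = 0.25 / (x - tail)            # the b_n = 1/4 levels, n >= 2
    return math.pi / (x - 0.5 / (x - tail))  # top levels: b_0 = pi, b_1 = 1/2

x = 2.0
print(cf_sqrt_inv(x), math.pi / math.sqrt(x * x - 1))  # the two agree
```

The agreement is already excellent at modest depth for $x$ well away from $[-1,1]$; near the interval the convergents $\varphi_n/Q_n$ converge more slowly.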

Appendix

Finally, here is a brief review of some of the fancier bits of linear algebra used in this chapter. To begin, we discuss sums and quotients of vector spaces.

Each subspace $M$ of a finite-dimensional vector space $X$ induces an equivalence relation on $X$ by
\[
x \sim y \iff x - y \in M.
\]
Standard arguments show that the equivalence classes under this relation are the cosets (translates) $x + M$, $x \in X$. That is,
\[
x + M = y + M \iff x - y \in M \iff x \sim y.
\]
Equally standard is the induced vector arithmetic
\[
(x + M) + (y + M) = (x + y) + M \qquad\text{and}\qquad \alpha\,(x + M) = (\alpha x) + M,
\]
where $x, y \in X$ and $\alpha \in \mathbb{R}$. The collection of cosets (or equivalence classes) is a vector space under these operations; it's denoted $X/M$ and called the quotient of $X$ by $M$. Please note that the zero vector in $X/M$ is simply $M$ itself.

Associated to the quotient space $X/M$ is the quotient map $q(x) = x + M$. It's easy to check that $q : X \to X/M$ is a vector space homomorphism with kernel $M$. (Why?)

Next we recall the isomorphism theorem.

Theorem. Let $T : X \to Y$ be a linear map between finite-dimensional vector spaces, and let $q : X \to X/\ker T$ be the quotient map. Then, there exists a (unique, into) isomorphism $S : X/\ker T \to Y$ satisfying $S(q(x)) = T(x)$ for every $x \in X$.

Proof. Since $q$ maps onto $X/\ker T$, it's "legal" to define a map $S : X/\ker T \to Y$ by setting $S(q(x)) = T(x)$ for $x \in X$. Please note that $S$ is well-defined, since
\[
T(x) = T(y) \iff T(x - y) = 0 \iff x - y \in \ker T \iff q(x - y) = 0 \iff q(x) = q(y).
\]
It's easy to see that $S$ is linear, and so precisely the same argument as above shows that $S$ is one-to-one.

Corollary. Let $T : X \to Y$ be a linear map between finite-dimensional vector spaces. Then, the range of $T$ is isomorphic to $X/\ker T$.

The Muntz Theorems

Math 682

For several weeks now we've taken advantage of the fact that the monomials $1, x, x^2, \dots$ have dense linear span in $C[0,1]$. What, if anything, is so special about these particular powers? How about if we consider polynomials of the form $\sum_{k=0}^n a_k x^{k^2}$; are they dense, too? More generally, what can be said about the span of a sequence of monomials $(x^{\lambda_n})$, where $\lambda_0 < \lambda_1 < \lambda_2 < \cdots$? Of course, we'll have to assume that $\lambda_0 \ge 0$, but it's not hard