
Fermat's Last Theorem

15.1 Overview

Around 1637, Fermat wrote in the margin of his copy of Diophantus's work that, when $n \ge 3$,

$$a^n + b^n = c^n, \qquad abc \ne 0 \tag{15.1}$$

has no solution in integers $a, b, c$. This has become known as Fermat's Last Theorem. Note that it suffices to consider only the cases where $n = 4$ and where $n = \ell$ is an odd prime (since any $n \ge 3$ has either 4 or such an $\ell$ as a factor). The case $n = 4$ was proved by Fermat using his method of infinite descent (see Section 8.6). At least one unsuccessful attempt to prove the case $n = 3$ appears in Arab manuscripts in the 900s (see [34]). This case was settled by Euler (and possibly by Fermat). The first general result was due to Kummer in the 1840s. Define the Bernoulli numbers $B_n$ by the power series

$$\frac{t}{e^t - 1} = \sum_{n=0}^{\infty} B_n \frac{t^n}{n!}.$$

For example,

$$B_2 = \frac{1}{6}, \quad B_4 = -\frac{1}{30}, \quad \ldots, \quad B_{12} = -\frac{691}{2730}.$$

Let $\ell$ be an odd prime. If $\ell$ does not divide the numerator of any of the Bernoulli numbers

$$B_2, B_4, \ldots, B_{\ell-3}$$

then (15.1) has no solutions for $n = \ell$. This criterion allowed Kummer to prove Fermat's Last Theorem for all prime exponents less than 100, except for $\ell = 37, 59, 67$. For example, 37 divides the numerator of the 32nd Bernoulli number, so this criterion does not apply. Using more refined criteria, based on the knowledge of which Bernoulli numbers are divisible by these exceptional $\ell$, Kummer was able to prove Fermat's Last Theorem for the three remaining exponents. Refinements of Kummer's ideas by Vandiver and others, plus the advent of computers, yielded extensions of Kummer's results to many more exponents. For example, in 1992, Buhler, Crandall, Ernvall, and Metsänkylä proved Fermat's Last Theorem for all exponents less than $4 \times 10^6$. How could one check so many cases without seeing a pattern that would lead to a full proof? The reason is that these methods were a prime-by-prime check. For each prime $\ell$, the Bernoulli numbers were computed mod $\ell$. For around 61% of the primes, none of these Bernoulli numbers was divisible by $\ell$, so Kummer's initial criterion yielded the result. For the remaining 39% of the primes, more refined criteria were used, based on the knowledge of which Bernoulli numbers were divisible by $\ell$. For $\ell$ up to $4 \times 10^6$, these criteria sufficed to prove the theorem. But it was widely suspected that eventually there would be exceptions to these criteria, and hence more refinements would be needed. The underlying problem with this approach was that it did not include any conceptual reason for why Fermat's Last Theorem should be true. In particular, there was no reason why there couldn't be a few random exceptions.

© 2008 by Taylor & Francis Group, LLC
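Kummer's initial criterion is easy to test for small exponents. The following Python sketch computes Bernoulli numbers exactly and lists the odd primes below 100 for which the criterion fails. The function names are illustrative and not from the text, and the recurrence used (which gives $B_1 = -1/2$, in line with the power series $t/(e^t - 1)$) is one standard way to generate the $B_n$, not necessarily the method used in the computations cited above.

```python
from fractions import Fraction
from math import comb


def bernoulli_numbers(n_max):
    """Return [B_0, ..., B_{n_max}] as exact fractions (convention B_1 = -1/2)."""
    B = [Fraction(1)]
    for m in range(1, n_max + 1):
        # Recurrence: sum_{k=0}^{m} C(m+1, k) * B_k = 0 for m >= 1.
        s = sum(comb(m + 1, k) * B[k] for k in range(m))
        B.append(-s / (m + 1))
    return B


def is_prime(n):
    return n >= 2 and all(n % d for d in range(2, int(n ** 0.5) + 1))


def irregular_indices(ell, B):
    """Even indices 2 <= k <= ell - 3 with ell dividing the numerator of B_k."""
    return [k for k in range(2, ell - 2, 2) if B[k].numerator % ell == 0]


B = bernoulli_numbers(96)
odd_primes = [p for p in range(3, 100) if is_prime(p)]
# Primes for which Kummer's initial criterion does not apply.
exceptions = {}
for p in odd_primes:
    bad = irregular_indices(p, B)
    if bad:
        exceptions[p] = bad
print(exceptions)
```

The dictionary keys are the exceptional exponents, and each value records which Bernoulli numbers are responsible, which is the information the refined criteria start from.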

In 1986, the situation changed. Suppose that

$$a^\ell + b^\ell = c^\ell, \qquad abc \ne 0. \tag{15.2}$$

By removing common factors, we may assume that $a, b, c$ are integers with $\gcd(a, b, c) = 1$, and by rearranging $a, b, c$ and changing signs if necessary, we may assume that

$$b \equiv 0 \pmod{2}, \qquad a \equiv -1 \pmod{4}. \tag{15.3}$$

Frey suggested that the elliptic curve

$$E_{\text{Frey}}: \quad y^2 = x(x - a^\ell)(x + b^\ell)$$

(this curve had also been considered by Hellegouarch) has such restrictive properties that it cannot exist, and therefore there cannot be any solutions to (15.2). As we'll outline below, subsequent work of Ribet and Wiles showed that this is the case.

When $\ell \ge 5$, the elliptic curve $E_{\text{Frey}}$ has good or multiplicative reduction (see Exercise 2.24) at all primes (in other words, there is no additive reduction). Such an elliptic curve is called semistable. The discriminant of the cubic is the square of the product of the differences of the roots, namely

$$\bigl(a^\ell(-b^\ell)(a^\ell + b^\ell)\bigr)^2 = (abc)^{2\ell}$$

(we have used (15.2)). Because of technicalities involving the prime 2 (related to the restrictions in (15.3)), the discriminant needs to be modified at 2 to yield what is known as the minimal discriminant

$$\Delta = 2^{-8}(abc)^{2\ell}$$

of $E_{\text{Frey}}$. A conjecture of Brumer and Kramer predicts that a semistable elliptic curve over $\mathbf{Q}$ whose minimal discriminant is an $\ell$th power will have a point of order $\ell$. Mazur's Theorem (8.11) says that an elliptic curve over $\mathbf{Q}$ cannot have a point of order $\ell$ when $\ell \ge 11$. Moreover, if the 2-torsion is rational, as is the case with $E_{\text{Frey}}$, then there are no points of order $\ell$ when $\ell \ge 5$. Since $\Delta$ is almost an $\ell$th power, we expect $E_{\text{Frey}}$ to act similarly to a curve that has a point of order $\ell$. Such curves cannot exist when $\ell \ge 5$, so $E_{\text{Frey}}$ should act like a curve that cannot exist. Therefore, we expect that $E_{\text{Frey}}$ does not exist. The problem is to make these ideas precise.

Recall (see Chapter 14) that the $L$-series of an elliptic curve $E$ over $\mathbf{Q}$ is defined as follows. For each prime $p$ of good reduction, let

$$a_p = p + 1 - \#E(\mathbf{F}_p).$$

Then

$$L_E(s) = (*) \prod_p \bigl(1 - a_p p^{-s} + p^{1-2s}\bigr)^{-1} = \sum_{n=1}^{\infty} \frac{a_n}{n^s},$$

where $(*)$ represents the factors for the bad primes (see Section 14.2) and the product is over the good primes. Suppose $E(\mathbf{Q})$ contains a point of order $\ell$. By Theorem 8.9, $E(\mathbf{F}_p)$ contains a point of order $\ell$ for all primes $p \ne \ell$ such that $E$ has good reduction at $p$. Therefore, $\ell \mid \#E(\mathbf{F}_p)$, so

$$a_p \equiv p + 1 \pmod{\ell} \tag{15.4}$$

for all such $p$. This is an example of how the arithmetic of $E$ is related to properties of the coefficients $a_p$. We hope to obtain information by studying these coefficients.
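The congruence (15.4) can be observed numerically. The sketch below assumes, as an illustration not taken from the text, the curve $y^2 + y = x^3 - x^2$, which has bad reduction only at 11 and on which $(0, 0)$ is a rational point of order 5. It counts points by brute force, so it is only suitable for small primes.

```python
def count_points(p):
    """#E(F_p) for E: y^2 + y = x^3 - x^2, by brute force (includes the point at infinity)."""
    affine = sum(
        1
        for x in range(p)
        for y in range(p)
        if (y * y + y - (x ** 3 - x * x)) % p == 0
    )
    return affine + 1


ell = 5  # order of the assumed rational torsion point
good_primes = [2, 3, 7, 13, 17, 19, 23]  # small primes other than 5 and 11
for p in good_primes:
    n_p = count_points(p)
    a_p = p + 1 - n_p
    # (15.4): a_p is congruent to p + 1 modulo ell.
    print(p, n_p, a_p, (a_p - (p + 1)) % ell == 0)
```

Each printed row ends with `True` exactly when $\ell$ divides $\#E(\mathbf{F}_p)$, which is what forces the congruence.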

In particular, we expect a congruence similar to (15.4) to hold for $E_{\text{Frey}}$. In fact, a close analysis (requiring more detail than we give in Section 15.3) of Ribet's proof shows that $E_{\text{Frey}}$ is trying to satisfy this congruence. However, the irreducibility of a certain Galois representation is preventing it, and this leads to the contradiction that proves the theorem.

The problem with this approach is that the numbers $a_p$ at first seem to be fairly independent of each other as $p$ varies. However, the Conjecture of Taniyama-Shimura-Weil (now Theorem 14.4) claims that, for an elliptic curve $E$ over $\mathbf{Q}$,

$$f_E(\tau) = \sum_{n=1}^{\infty} a_n q^n$$

(where $q = e^{2\pi i \tau}$) is a modular form for $\Gamma_0(N)$ for some $N$ (see Section 14.2). In this case, we say that $E$ is modular. This is a fairly rigid condition and can be interpreted as saying that the numbers $a_p$ have some coherence as $p$ varies. For example, it is likely that if we change one coefficient $a_p$, then the modularity will be lost. Therefore, modularity is a tool for keeping the numbers $a_p$ under control. Frey predicted the following, which Ribet proved in 1986:

THEOREM 15.1

$E_{\text{Frey}}$ cannot be modular. Therefore, the conjecture of Taniyama-Shimura-Weil implies Fermat's Last Theorem.

This result finally gave a theoretical reason for believing Fermat's Last Theorem. Then in 1994, Wiles proved

THEOREM 15.2

All semistable elliptic curves over $\mathbf{Q}$ are modular.

This result was subsequently extended to include all elliptic curves over $\mathbf{Q}$. See Theorem 14.4. Since the Frey curve is semistable, the theorems of Wiles and Ribet combine to show that $E_{\text{Frey}}$ cannot exist, hence

THEOREM 15.3

Fermat's Last Theorem is true.

In the following three sections, we sketch some of the ideas that go into the proofs of Ribet's and Wiles's theorems.

15.2 Galois Representations

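Let $E$ be an elliptic curve over $\mathbf{Q}$ and let $m$ be an integer. From Theorem 3.2, we know that

$$E[m] \simeq \mathbf{Z}_m \oplus \mathbf{Z}_m.$$

Let $\{\beta_1, \beta_2\}$ be a basis of $E[m]$ and let $\sigma \in G$, where

$$G = \mathrm{Gal}(\overline{\mathbf{Q}}/\mathbf{Q}).$$

Since $\sigma\beta_i \in E[m]$, we can write

$$\sigma\beta_1 = a\beta_1 + c\beta_2, \qquad \sigma\beta_2 = b\beta_1 + d\beta_2$$

with $a, b, c, d \in \mathbf{Z}_m$. We thus obtain a homomorphism

$$\rho_m: G \longrightarrow GL_2(\mathbf{Z}_m), \qquad \sigma \longmapsto \begin{pmatrix} a & b \\ c & d \end{pmatrix}.$$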


If $m = \ell$ is a prime, we call $\rho_\ell$ the mod $\ell$ Galois representation attached to $E$. We can also take $m = \ell^n$ for $n = 1, 2, 3, \ldots$. By choosing an appropriate sequence of bases, we obtain representations $\rho_{\ell^n}$ such that

$$\rho_{\ell^n} \equiv \rho_{\ell^{n+1}} \pmod{\ell^n}$$

for all $n$. These may be combined to obtain

$$\rho_{\ell^\infty}: G \longrightarrow GL_2(\mathcal{O}_\ell),$$

where $\mathcal{O}_\ell$ denotes any ring containing the $\ell$-adic integers (see Appendix A). This is called the $\ell$-adic Galois representation attached to $E$. An advantage of working with $\rho_{\ell^\infty}$ is that the $\ell$-adic integers have characteristic 0, so instead of congruences mod powers of $\ell$, we can work with equalities.

Notation: Throughout this chapter, we will need rings that are finite extensions of the $\ell$-adic integers. We'll denote such rings by $\mathcal{O}_\ell$. For many purposes, we can take $\mathcal{O}_\ell$ to equal the $\ell$-adic integers, but sometimes we need slightly larger rings. Since we do not want to discuss the technical issues that arise in this regard, we simply use $\mathcal{O}_\ell$ to denote a varying ring that is large enough for whatever is required. The reader will not lose much by pretending that $\mathcal{O}_\ell$ is always the ring of $\ell$-adic integers.

Suppose $r$ is a prime of good reduction for $E$. There exists an element $\mathrm{Frob}_r \in G$ such that the action of $\mathrm{Frob}_r$ on $E(\overline{\mathbf{Q}})$ yields the action of the Frobenius $\phi_r$ on $E(\overline{\mathbf{F}}_r)$ when $E$ is reduced mod $r$ (the element $\mathrm{Frob}_r$ is not unique, but this will not affect us). In particular, when $\ell \ne r$, the matrices describing the actions of $\mathrm{Frob}_r$ and $\phi_r$ on the $\ell$-power torsion are the same (use a basis and its reduction to compute the matrices). Let

$$a_r = r + 1 - \#E(\mathbf{F}_r).$$

From Proposition 4.11, we obtain that

$$\mathrm{Trace}(\rho_{\ell^n}(\mathrm{Frob}_r)) \equiv a_r \pmod{\ell^n}, \qquad \det(\rho_{\ell^n}(\mathrm{Frob}_r)) \equiv r \pmod{\ell^n},$$

and therefore

$$\mathrm{Trace}(\rho_{\ell^\infty}(\mathrm{Frob}_r)) = a_r, \qquad \det(\rho_{\ell^\infty}(\mathrm{Frob}_r)) = r.$$

Recall that the numbers $a_r$ are used to produce the modular form $f_E$ attached to $E$ (see Section 14.2).

Suppose now that

$$\rho: G \longrightarrow GL_2(\mathcal{O}_\ell)$$

is a representation of $G$. Under certain technical conditions (namely, $\rho$ is unramified at all but finitely many primes; see the end of this section), we may choose elements $\mathrm{Frob}_r$ (for the unramified primes) and define

$$a_r = \mathrm{Trace}(\rho(\mathrm{Frob}_r)).$$


This allows us to define a formal series

$$g = \sum_{n=1}^{\infty} a_n q^n.$$

We refer to $g$ as the potential modular form attached to $\rho$. Of course, some conditions must be imposed on the $a_r$ in order for this to represent a complex function (for example, the numbers $a_n \in \mathcal{O}_\ell$ must be identified with complex numbers), but we will not discuss this general problem here.

Let $N$ be a positive integer. Recall that a modular form $f$ of weight 2 and level $N$ is a function analytic in the upper half plane satisfying

$$f\left(\frac{a\tau + b}{c\tau + d}\right) = (c\tau + d)^2 f(\tau) \tag{15.5}$$

for all

$$\begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \Gamma_0(N)$$

(where $\Gamma_0(N)$ is the group of integral matrices of determinant 1 such that $c \equiv 0 \pmod{N}$). There are also technical conditions that we won't discuss for the behavior of $f$ at the cusps. The cusp forms of weight 2 and level $N$, which we'll denote by $S(N)$, are those modular forms that take the value 0 at all the cusps. $S(N)$ is a finite dimensional vector space over $\mathbf{C}$. We represent cusp forms by their Fourier expansions:

$$f(\tau) = \sum_{n=1}^{\infty} b_n q^n,$$

where $q = e^{2\pi i \tau}$.

If $M \mid N$, then $\Gamma_0(N) \subseteq \Gamma_0(M)$, so a modular form of level $M$ can be regarded as a modular form of level $N$. More generally, if $d \mid (N/M)$ and $f(\tau)$ is a cusp form of level $M$, then it can be shown that $f(d\tau)$ is a cusp form of level $N$. The subspace of $S(N)$ generated by such $f$, where $M$ ranges through proper divisors of $N$ and $d$ ranges through divisors of $N/M$, is called the subspace of oldforms. There is a naturally defined inner product on $S(N)$, called the Petersson inner product. The space of newforms of level $N$ is the perpendicular complement of the space of oldforms. Intuitively, the newforms are those that do not come from levels lower than $N$.

We now need to introduce the Hecke operators. Let $r$ be a prime. Define

$$T_r\left(\sum_{n=1}^{\infty} b_n q^n\right) = \begin{cases} \sum_{n=1}^{\infty} b_{rn} q^n + \sum_{n=1}^{\infty} r b_n q^{rn}, & \text{if } r \nmid N \\ \sum_{n=1}^{\infty} b_{rn} q^n, & \text{if } r \mid N. \end{cases} \tag{15.6}$$

It can be shown that $T_r$ maps $S(N)$ into $S(N)$ and that the $T_r$'s commute with each other. Define the Hecke algebra

$$\mathbf{T} = \mathbf{T}_N \subseteq \mathrm{End}(S(N))$$

to be the image of $\mathbf{Z}[T_2, T_3, T_5, \ldots]$ in the endomorphism ring of $S(N)$ (the endomorphism ring of $S(N)$ is the ring of linear transformations from the vector space $S(N)$ to itself).

A normalized eigenform of level $N$ is a newform

$$f = \sum_{n=1}^{\infty} b_n q^n \in S(N)$$

of level $N$ with $b_1 = 1$ and such that

$$T_r(f) = b_r f \quad \text{for all } r.$$

It can be shown that the space of newforms in $S(N)$ has a basis of normalized eigenforms. Henceforth, essentially all of the modular forms that we encounter will be normalized eigenforms of level $N$. Often, we shall refer to them simply as modular forms.
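The formula (15.6) acts directly on lists of Fourier coefficients, so the eigenform condition $T_r(f) = b_r f$ can be checked on a truncated expansion. The sketch below assumes, as an example not taken from the text, the weight 2 cusp form of level 11 given by the product $q \prod_{n \ge 1} (1 - q^n)^2 (1 - q^{11n})^2$. The function names are hypothetical.

```python
def eta_product_level_11(n_max):
    """Coefficients b_0..b_{n_max} of q * prod_{n>=1} (1 - q^n)^2 (1 - q^{11 n})^2."""
    # series[i] holds the coefficient of q^i in the infinite product, truncated.
    series = [0] * n_max
    series[0] = 1
    for n in range(1, n_max):
        for step in (n, n, 11 * n, 11 * n):  # factors (1 - q^n)^2 (1 - q^{11n})^2
            if step < n_max:
                # In-place multiplication by (1 - q^step), highest degree first.
                for i in range(n_max - 1, step - 1, -1):
                    series[i] -= series[i - step]
    return [0] + series  # multiplying by q shifts every index up by one


def hecke(r, level, b, n_max):
    """Coefficients of T_r f up to q^{n_max}, following (15.6); b must reach index r * n_max."""
    out = [0] * (n_max + 1)
    for n in range(1, n_max + 1):
        out[n] = b[r * n]
        if level % r != 0 and n % r == 0:
            out[n] += r * b[n // r]
    return out


n_max = 20
b = eta_product_level_11(11 * n_max)
for r in (2, 3, 5, 7, 11):
    image = hecke(r, 11, b, n_max)
    # Compare T_r f with b_r * f coefficient by coefficient.
    print(r, b[r], image[1:] == [b[r] * x for x in b[1 : n_max + 1]])
```

Only finitely many coefficients are compared, so this is a consistency check rather than a proof that the form is an eigenform.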

Let $f$ be a normalized eigenform and suppose the coefficients $b_n$ of $f$ are rational integers. In this case, Eichler and Shimura showed that $f$ determines an elliptic curve $E_f$ over $\mathbf{Q}$, and $E_f$ has the property that

$$b_r = a_r$$

for all $r$ (where $a_r = r + 1 - \#E_f(\mathbf{F}_r)$ for the primes of good reduction). In particular, the potential modular form $f_{E_f}$ for $E_f$ is the modular form $f$. Moreover, $E_f$ has good reduction at the primes not dividing $N$. This result is, in a sense, a converse of the conjecture of Taniyama-Shimura-Weil. The conjecture can be restated as claiming that every elliptic curve $E$ over $\mathbf{Q}$ arises from this construction. Actually, we have to modify this statement a little. Two elliptic curves $E_1$ and $E_2$ are called isogenous over $\mathbf{Q}$ if there is a nonconstant homomorphism $E_1(\overline{\mathbf{Q}}) \to E_2(\overline{\mathbf{Q}})$ that is described by rational functions over $\mathbf{Q}$ (see Chapter 12). It can be shown that, in this case, $f_{E_1} = f_{E_2}$. Conversely, Faltings showed that if $f_{E_1} = f_{E_2}$ then $E_1$ and $E_2$ are isogenous. Since only one of $E_1$, $E_2$ can be the curve $E_f$, we must ask whether an elliptic curve $E$ over $\mathbf{Q}$ is isogenous to one produced by the result of Eichler and Shimura. Theorem 14.4 says that the answer is yes.
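The equality $b_r = a_r$ can be watched in a small case. The sketch below compares the coefficients of the level 11 form $q \prod (1 - q^n)^2 (1 - q^{11n})^2$ with point counts on $y^2 + y = x^3 - x^2$. Both the form and the curve are assumptions brought in for illustration (this curve is taken to be in the isogeny class attached to that form), and the helper names are hypothetical.

```python
def eta_product_level_11(n_max):
    """Coefficients b_0..b_{n_max} of q * prod_{n>=1} (1 - q^n)^2 (1 - q^{11 n})^2."""
    series = [0] * n_max
    series[0] = 1
    for n in range(1, n_max):
        for step in (n, n, 11 * n, 11 * n):
            if step < n_max:
                for i in range(n_max - 1, step - 1, -1):
                    series[i] -= series[i - step]
    return [0] + series


def a_p(p):
    """a_p = p + 1 - #E(F_p) for E: y^2 + y = x^3 - x^2 (brute-force count)."""
    affine = sum(
        1
        for x in range(p)
        for y in range(p)
        if (y * y + y - (x ** 3 - x * x)) % p == 0
    )
    return p + 1 - (affine + 1)


b = eta_product_level_11(30)
primes = [2, 3, 5, 7, 13, 17, 19, 23, 29]  # primes of good reduction below 30
for p in primes:
    print(p, b[p], a_p(p))
```

The second and third columns of the output should agree at every prime listed, since isogenous curves share the same $a_p$.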

If we have an elliptic curve $E$, how can we predict what $N$ should be? The smallest possible $N$ is called the conductor of $E$. For $E = E_f$, the primes dividing the conductor $N$ are exactly the primes of bad reduction of $E_f$ (these are also the primes of bad reduction of any curve isogenous to $E_f$ over $\mathbf{Q}$). Moreover, $p \mid N$ and $p^2 \nmid N$ if and only if $E_f$ has multiplicative reduction at $p$. Therefore, if $E_f$ is semistable, then

$$N = \prod_{p \mid \Delta} p, \tag{15.7}$$

namely, the product of the primes dividing the minimal discriminant $\Delta$. We see that $N$ is squarefree if and only if $E_f$ is semistable. Therefore, if $E$ is an arbitrary modular semistable elliptic curve over $\mathbf{Q}$, then $N$ is given by (15.7).
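Formula (15.7) says that, in the semistable case, the conductor is the product of the distinct primes dividing the minimal discriminant. A minimal sketch, with a hypothetical function name and trial division that is only adequate for small inputs:

```python
def semistable_conductor(min_disc):
    """Product of the distinct primes dividing min_disc, as in (15.7).

    Only meaningful when the curve is semistable; min_disc is a nonzero integer.
    """
    n = abs(min_disc)
    conductor = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            conductor *= p
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:  # whatever remains is a single prime factor
        conductor *= n
    return conductor


# Assumed example: a semistable curve with minimal discriminant -11.
print(semistable_conductor(-11))
# A made-up discriminant 2^4 * 3^2 * 5, used only to show repeated primes collapsing.
print(semistable_conductor(2 ** 4 * 3 ** 2 * 5))
```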


Combining the result of Eichler and Shimura with the Galois representations discussed above, we obtain the following. If $f = \sum b_n q^n$ is a normalized newform with rational integer coefficients, then there is a Galois representation

$$\rho_f: G \longrightarrow GL_2(\mathcal{O}_\ell)$$

such that

$$\mathrm{Trace}(\rho_f(\mathrm{Frob}_r)) = b_r, \qquad \det(\rho_f(\mathrm{Frob}_r)) = r \tag{15.8}$$

for all $r \nmid \ell N$.

More generally, Eichler and Shimura showed that if $f = \sum b_n q^n$ is any normalized newform (with no assumptions on its coefficients), then there is a Galois representation

$$\rho_f: G \longrightarrow GL_2(\mathcal{O}_\ell)$$

satisfying (15.8).

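Returning to the situation where the coefficients $b_n$ are in $\mathbf{Z}$, we let $\mathcal{M}$ be the kernel of the ring homomorphism

$$\mathbf{T} \longrightarrow \mathbf{F}_\ell, \qquad T_r \longmapsto b_r \pmod{\ell}.$$

Since the homomorphism is surjective (because 1 maps to 1) and $\mathbf{F}_\ell$ is a field, $\mathcal{M}$ is a maximal ideal of $\mathbf{T}$. Also, $\mathbf{T}/\mathcal{M} = \mathbf{F}_\ell$. Since $T_r - b_r \in \mathcal{M}$, the mod $\ell$ version of (15.8) says that

$$\mathrm{Trace}(\rho_f(\mathrm{Frob}_r)) \equiv T_r \bmod \mathcal{M}, \qquad \det(\rho_f(\mathrm{Frob}_r)) \equiv r \bmod \mathcal{M}$$

for all $r \nmid \ell N$. This has been greatly generalized by Deligne and Serre: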

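THEOREM 15.4

Let $\mathcal{M}$ be a maximal ideal of $\mathbf{T}$ and let $\ell$ be the characteristic of $\mathbf{T}/\mathcal{M}$. There exists a semisimple representation

$$\rho_{\mathcal{M}}: G \longrightarrow GL_2(\mathbf{T}/\mathcal{M})$$

such that

$$\mathrm{Trace}(\rho_{\mathcal{M}}(\mathrm{Frob}_r)) \equiv T_r \bmod \mathcal{M}, \qquad \det(\rho_{\mathcal{M}}(\mathrm{Frob}_r)) \equiv r \bmod \mathcal{M}$$

for all primes $r \nmid \ell N$.

The semisimplicity of $\rho_{\mathcal{M}}$ means that either $\rho_{\mathcal{M}}$ is irreducible or it is the sum of two one-dimensional representations.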

In general, let $A$ be either $\mathcal{O}_\ell$ or a finite field. If

$$\rho: G \longrightarrow GL_2(A)$$

is a semisimple representation, then we say that $\rho$ is modular of level $N$ if there exists a homomorphism

$$\pi: \mathbf{T} \longrightarrow A$$

such that

$$\mathrm{Trace}(\rho(\mathrm{Frob}_r)) = \pi(T_r), \qquad \det(\rho(\mathrm{Frob}_r)) = \pi(r)$$

for all $r \nmid \ell N$. This says that $\rho$ is equivalent to a representation coming from one of the above constructions.

When $A = \mathbf{T}/\mathcal{M}$, the homomorphism $\pi$ is the map $\mathbf{T} \to \mathbf{T}/\mathcal{M}$.

When $f = \sum b_n q^n$ is a normalized eigenform and $A = \mathcal{O}_\ell$, recall that $T_r(f) = b_r f$ for all $r$. This gives a homomorphism $\pi: \mathbf{T} \to \mathcal{O}_\ell$ (it is possible to regard the coefficients $b_r$ as elements of a sufficiently large $\mathcal{O}_\ell$).

The way to obtain maximal ideals $\mathcal{M}$ of $\mathbf{T}$ is to use a normalized eigenform to get a map $\mathbf{T} \to \mathcal{O}_\ell$, then map $\mathcal{O}_\ell$ to a finite field. The kernel of the map from $\mathbf{T}$ to the finite field is a maximal ideal $\mathcal{M}$.

When $A$ is a finite field, the level $N$ of the representation $\rho$ is not unique. In fact, a key result of Ribet (see Section 15.3) analyzes how the level can be changed. Also, in the definition of modularity in this case, we should allow modular forms of weight $k \ge 2$ (this means that the factor $(c\tau + d)^2$ in (15.5) is replaced by $(c\tau + d)^k$). However, this more general situation can be ignored for the present purposes.

If $\rho$ is a modular representation of some level, and $c \in G$ is complex conjugation (regard $\overline{\mathbf{Q}}$ as a subfield of $\mathbf{C}$) then it can be shown that $\det(\rho(c)) = -1$. This says that $\rho$ is an odd representation. A conjecture of Serre [105], which was a motivating force for much of the work described in this chapter, predicts that (under certain mild hypotheses) odd representations in the finite field case are modular (where we need to allow modular forms of weight $k \ge 2$ in the definition of modularity). Serre also predicts the level and the weight of a modular form that yields the representation.

Finally, there is a type of representation, called finite, that plays an important role in Ribet's proof. Let $p$ be a prime. We can regard the Galois group for the $p$-adics as a subgroup of the Galois group for $\mathbf{Q}$:

$$G_p = \mathrm{Gal}(\overline{\mathbf{Q}}_p/\mathbf{Q}_p) \subset G = \mathrm{Gal}(\overline{\mathbf{Q}}/\mathbf{Q}).$$

There is a natural map from $G_p$ to $\mathrm{Gal}(\overline{\mathbf{F}}_p/\mathbf{F}_p)$. The kernel is denoted $I_p$ and is called the inertia subgroup of $G_p$:

$$G_p/I_p \simeq \mathrm{Gal}(\overline{\mathbf{F}}_p/\mathbf{F}_p). \tag{15.9}$$

A representation

$$\rho: G \to GL_2(\mathbf{F}_\ell)$$

is said to be unramified at $p$ if $\rho(I_p) = 1$, namely, $I_p$ is contained in the kernel of $\rho$. If $p \ne \ell$ and $\rho$ is unramified at $p$, then $\rho$ is said to be finite at $p$.


If $p = \ell$, the definition of finite is much more technical (it involves finite flat group schemes) and we omit it. However, for the representation $\rho_\ell$ coming from an elliptic curve, there is the following:

PROPOSITION 15.5

Let $E$ be an elliptic curve defined over $\mathbf{Q}$ and let $\Delta$ be the minimal discriminant of $E$. Let $\ell$ and $p$ be primes (the case $p = \ell$ is allowed) and let $\rho_\ell$ be the representation of $G$ on $E[\ell]$. Then $\rho_\ell$ is finite at $p$ if and only if $v_p(\Delta) \equiv 0 \pmod{\ell}$, where $v_p$ denotes the $p$-adic valuation (see Appendix A).

For a proof, see [105].

Consider the Frey curve. The minimal discriminant is

$$\Delta = 2^{-8}(abc)^{2\ell}.$$

Therefore, $v_p(\Delta) \equiv 0 \pmod{\ell}$ for all $p \ne 2$, so $\rho_\ell$ is finite at all odd primes. Moreover, $\rho_\ell$ is not finite at 2.
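The valuation computation behind this can be traced with placeholder numbers. For an odd prime $p$ we get $v_p(\Delta) = 2\ell\, v_p(abc)$, while $v_2(\Delta) = 2\ell\, v_2(abc) - 8 \equiv -8 \pmod{\ell}$, which is nonzero because $\ell$ is odd. The sketch below evaluates the expression $2^{-8}(abc)^{2\ell}$ for the triple $(3, 4, 5)$ with $\ell = 5$. This triple is NOT a solution of (15.2) (none exists) and is used only to exercise the arithmetic. The function names are hypothetical.

```python
from fractions import Fraction


def valuation(x, p):
    """p-adic valuation of a nonzero rational number x."""
    x = Fraction(x)
    num, den = x.numerator, x.denominator
    v = 0
    while num % p == 0:
        num //= p
        v += 1
    while den % p == 0:
        den //= p
        v -= 1
    return v


def frey_discriminant_expression(a, b, c, ell):
    """The expression 2^{-8} (abc)^{2 ell}; the inputs are placeholders, not a Fermat solution."""
    return Fraction((a * b * c) ** (2 * ell), 2 ** 8)


ell = 5
delta = frey_discriminant_expression(3, 4, 5, ell)  # a = 3 (mod 4), b even, as in (15.3)
for p in (2, 3, 5):
    v = valuation(delta, p)
    print(p, v, v % ell == 0)
```

Here the odd primes give valuations divisible by $\ell$, and the prime 2 does not, mirroring the statement about where $\rho_\ell$ is finite.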

15.3 Sketch of Ribet's Proof

The key theorem that Ribet proved is the following.

THEOREM 15.6

Let $\ell \ge 3$ and let

$$\rho: G \to GL_2(\mathbf{F}_\ell)$$

be an irreducible representation. Assume that $\rho$ is modular of squarefree level $N$ and that there exists a prime $q \mid N$, $q \ne \ell$, at which $\rho$ is not finite. Suppose $p \mid N$ is a prime at which $\rho$ is finite. Then $\rho$ is modular of level $N/p$.

In other words, if $\rho$ comes from a modular form of level $N$, then, under suitable hypotheses, it also comes from a modular form of level $N/p$.

COROLLARY 15.7

$E_{\text{Frey}}$ cannot be modular.

PROOF Since there are no solutions to the Fermat equation, and hence no Frey curves, when $\ell = 3$, we may assume $\ell \ge 5$. If $E_{\text{Frey}}$ is modular, then the associated representation $\rho_\ell$ is modular of some level $N$. Since $E_{\text{Frey}}$ is semistable, (15.7) says that

$$N = \prod_{p \mid abc} p.$$

It can be shown that $\rho_\ell$ is irreducible when $\ell \ge 5$ (see [105], where it is obtained as a corollary of Mazur's theorem (Theorem 8.11)). Let $q = 2$ in Ribet's theorem. As we showed at the end of Section 15.2, $\rho_\ell$ is not finite at 2 and is finite at all other primes. Therefore, Ribet's theorem allows us to remove the odd primes from $N$ one at a time. We eventually find that $\rho_\ell$ is modular of level 2. This means that there is a normalized cusp form of weight 2 for $\Gamma_0(2)$ such that $\rho_\ell$ is the associated mod $\ell$ representation. But there are no nonzero cusp forms of weight 2 for $\Gamma_0(2)$, so we have a contradiction. Therefore, $E_{\text{Frey}}$ cannot be modular.

COROLLARY 15.8

The Taniyama-Shimura-Weil conjecture (for semistable elliptic curves) implies Fermat's Last Theorem.

PROOF We may restrict to prime exponents $\ell \ge 5$. If there is a nontrivial solution to the Fermat equation for $\ell$, then the Frey curve exists. However, Corollary 15.7 and the Taniyama-Shimura-Weil conjecture imply that the Frey curve cannot exist. Therefore, there are no nontrivial solutions to the Fermat equation.

We now give a brief sketch of the proof of Ribet's theorem. The proof uses the full power of Grothendieck's algebraic geometry and is not elementary. Therefore, we give only a sampling of some of the ideas that go into the proof. For more details, see [90], [89], [85], [29].

We assume that $\rho$ is as in Theorem 15.6 and that $N$ is chosen so that

1. $\rho$ is modular of squarefree level $N$,

2. both $p$ and $q$ divide $N$,

3. $\rho$ is finite at $p$ but is not finite at $q$.

The goal is to show that $p$ can be removed from $N$. The main ingredient of the proof is a relation between Jacobians of modular curves and Shimura curves. In the following, we describe modular curves and Shimura curves and give a brief indication of how they occur in Ribet's proof.


Modular curves

Recall that $SL_2(\mathbf{Z})$ acts on the upper half plane $\mathcal{H}$ by linear fractional transformations:

$$\begin{pmatrix} a & b \\ c & d \end{pmatrix} \tau = \frac{a\tau + b}{c\tau + d}.$$

The fundamental domain $\mathcal{F}$ for this action is described in Section 9.3. The subgroup $\Gamma_0(N)$ (defined by the condition that $c \equiv 0 \pmod{N}$) also acts on $\mathcal{H}$. The modular curve $X_0(N)$ is defined over $\mathbf{C}$ by taking the upper half plane modulo the action of $\Gamma_0(N)$, and then adding finitely many points, called cusps, to make $X_0(N)$ compact. We obtain a fundamental domain $\mathcal{D}$ for $\Gamma_0(N)$ by writing

$$SL_2(\mathbf{Z}) = \bigcup_i \gamma_i \Gamma_0(N)$$

for some coset representatives $\gamma_i$ and letting $\mathcal{D} = \bigcup_i \gamma_i^{-1} \mathcal{F}$. Certain edges of this fundamental domain are equivalent under the action of $\Gamma_0(N)$. When equivalent edges are identified, the fundamental domain gets bent around to form a surface. There is a hole in the surface corresponding to $i\infty$, and there are also finitely many holes corresponding to points where the fundamental domain touches the real axis. These holes are filled in by points, called cusps, to obtain $X_0(N)$. It can be shown that $X_0(N)$ can be represented as an algebraic curve defined over $\mathbf{Q}$.

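Figure 15.1 gives a fundamental domain for $\Gamma_0(2)$. The three pieces are obtained as $\gamma_i^{-1} \mathcal{F}$, where

$$\gamma_1 = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \qquad \gamma_2 = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}, \qquad \gamma_3 = \begin{pmatrix} 1 & 1 \\ -1 & 0 \end{pmatrix}.$$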
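The count of three pieces matches the index of $\Gamma_0(2)$ in $SL_2(\mathbf{Z})$. The sketch below uses the standard index formula $N \prod_{p \mid N} (1 + 1/p)$, which is not stated in the text and is brought in here as an assumption. It then checks that three candidate representatives, $\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$, $\begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$, $\begin{pmatrix} 1 & 1 \\ -1 & 0 \end{pmatrix}$ (our reading of the matrices in the text), lie in distinct cosets $\gamma \Gamma_0(2)$.

```python
def index_gamma0(N):
    """Index of Gamma_0(N) in SL_2(Z), via N * prod over primes p | N of (1 + 1/p)."""
    index, n, p = N, N, 2
    while p * p <= n:
        if n % p == 0:
            index = index // p * (p + 1)
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:
        index = index // n * (n + 1)
    return index


def inverse(m):
    """Inverse of an integer matrix ((a, b), (c, d)) of determinant 1."""
    (a, b), (c, d) = m
    return ((d, -b), (-c, a))


def mul(m1, m2):
    (a, b), (c, d) = m1
    (e, f), (g, h) = m2
    return ((a * e + b * g, a * f + b * h), (c * e + d * g, c * f + d * h))


def same_coset(g, h, N):
    """g * Gamma_0(N) == h * Gamma_0(N) iff g^{-1} h has lower-left entry divisible by N."""
    return mul(inverse(g), h)[1][0] % N == 0


gammas = [((1, 0), (0, 1)), ((0, -1), (1, 0)), ((1, 1), (-1, 0))]
distinct = all(
    not same_coset(gammas[i], gammas[j], 2)
    for i in range(3)
    for j in range(i + 1, 3)
)
print(index_gamma0(2), distinct)
```

Three pairwise inequivalent representatives in a subgroup of index 3 form a complete set of coset representatives.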

The modular curve $X_0(N)$ has another useful description, which works over arbitrary fields $K$ with the characteristic of $K$ not dividing $N$. Consider pairs $(E, C)$, where $E$ is an elliptic curve (defined over the algebraic closure $\overline{K}$) and $C$ is a cyclic subgroup of $E(\overline{K})$ of order $N$. The set of such pairs is in one-to-one correspondence with the noncuspidal points of $X_0(N)(\overline{K})$. Of course, it is not obvious that this collection of pairs can be given the structure of an algebraic curve in a natural way. This takes some work.

[Figure 15.1: A Fundamental Domain for $\Gamma_0(2)$]

Example 15.1

When $K = \mathbf{C}$, we can see this one-to-one correspondence as follows. An elliptic curve can be represented as

$$E_\tau = \mathbf{C}/(\mathbf{Z}\tau + \mathbf{Z}),$$

with $\tau \in \mathcal{H}$, the upper half plane. The set

$$C_\tau = \left\{0, \frac{1}{N}, \ldots, \frac{N-1}{N}\right\}$$

is a cyclic subgroup of $E_\tau$ of order $N$. Let

$$\gamma = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \Gamma_0(N)$$

and let

$$\gamma\tau = \frac{a\tau + b}{c\tau + d}.$$

Since

$$\mathbf{Z}\tau + \mathbf{Z} = \mathbf{Z}(a\tau + b) + \mathbf{Z}(c\tau + d) = (c\tau + d)(\mathbf{Z}\gamma\tau + \mathbf{Z}),$$

there is an isomorphism

$$f_\gamma: \mathbf{C}/(\mathbf{Z}\tau + \mathbf{Z}) \longrightarrow \mathbf{C}/(\mathbf{Z}\gamma\tau + \mathbf{Z})$$

given by

$$f_\gamma(z) = z/(c\tau + d).$$

This isomorphism between $E_\tau$ and $E_{\gamma\tau}$ maps the point $k/N$ to

$$\frac{k}{N(c\tau + d)} = \frac{ka}{N} - k\,\frac{c}{N}\,\frac{a\tau + b}{c\tau + d} \equiv \frac{ka}{N} \mod \mathbf{Z}\gamma\tau + \mathbf{Z}$$

(we have used the fact that $c \equiv 0 \pmod{N}$). Therefore, the subgroup $C_\tau$ of $E_\tau$ is mapped to the corresponding subgroup $C_{\gamma\tau}$ of $E_{\gamma\tau}$, so $f_\gamma$ maps the pair $(E_\tau, C_\tau)$ to the pair $(E_{\gamma\tau}, C_{\gamma\tau})$. We conclude that if $\tau_1, \tau_2 \in \mathcal{H}$ are equivalent under the action of $\Gamma_0(N)$, then the corresponding pairs $(E_{\tau_j}, C_{\tau_j})$ are isomorphic. It is not hard to show that, conversely, if the pairs are isomorphic then the corresponding $\tau_j$'s are equivalent under $\Gamma_0(N)$. Moreover, every pair $(E, C)$ of an elliptic curve over $\mathbf{C}$ and a cyclic subgroup $C$ of order $N$ is isomorphic to a pair $(E_\tau, C_\tau)$ for some $\tau \in \mathcal{H}$. Therefore, the set of isomorphism classes of these pairs is in one-to-one correspondence with the points of $\mathcal{H}$ mod the action of $\Gamma_0(N)$. These are the noncuspidal points of $X_0(N)$.
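The identity used in Example 15.1, $k/(N(c\tau + d)) = ka/N - (kc/N)\,\gamma\tau$ with $kc/N$ an integer, can be checked in floating point. The matrix and the value of $\tau$ below are arbitrary choices made for illustration, and the function name is hypothetical.

```python
def check_example(N, a, b, c, d, tau):
    """Check Example 15.1 numerically for gamma = [[a, b], [c, d]] in Gamma_0(N)."""
    assert a * d - b * c == 1 and c % N == 0
    gamma_tau = (a * tau + b) / (c * tau + d)
    max_error = 0.0
    images = []
    for k in range(N):
        image = (k / N) / (c * tau + d)  # f_gamma(k/N)
        # Claimed identity: k/(N(c tau + d)) = ka/N - (kc/N) * gamma_tau, with kc/N an integer.
        rhs = k * a / N - (k * c // N) * gamma_tau
        max_error = max(max_error, abs(image - rhs))
        # Modulo Z*gamma_tau + Z, the image is (ka mod N)/N.
        images.append((k * a) % N)
    return max_error, sorted(images)


error, images = check_example(5, 2, 1, 5, 3, complex(0.3, 1.2))
print(error < 1e-12, images)
```

The sorted list of residues $ka \bmod N$ covers every residue once, which reflects that $f_\gamma$ carries $C_\tau$ onto $C_{\gamma\tau}$ (here $\gcd(a, N) = 1$ because $ad - bc = 1$ and $N \mid c$).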

Of course, over arbitrary fields, we cannot work with the upper half plane $\mathcal{H}$, and it is much more difficult to show that the pairs $(E, C)$ can be collected together as the points on a curve $X_0(N)$. However, when this is done, it yields a convenient way to work with the modular curve $X_0(N)$ and its reductions mod primes.

For a nonsingular algebraic curve $C$ over a field $K$, let $J(C)$ be the divisors (over $\overline{K}$) of degree 0 modulo divisors of functions. It is possible to represent $J(C)$ as an algebraic variety, called the Jacobian of $C$. When $C$ is an elliptic curve $E$, we showed (Corollary 11.4; see also the sequence (9.3)) that $J(E)$ is a group isomorphic to $E(\overline{K})$. When $K = \mathbf{C}$, we thus obtained a torus. In general, if $K = \mathbf{C}$ and $C$ is a curve of genus $g$, then $J(C)$ is isomorphic to a higher dimensional torus, namely, $\mathbf{C}^g$ mod a lattice of rank $2g$. The Jacobian of $X_0(N)$ is denoted $J_0(N)$.

The Jacobian $J_0(N)$ satisfies various functorial properties. In particular, a nonconstant map $\pi: X_0(N) \to E$ induces a map $\pi^*: E \to J_0(N)$ obtained by mapping a point $P$ of $E$ to the divisor on $X_0(N)$ formed by the sum of the inverse images of $P$ minus the inverse images of $\infty \in E$:

$$\pi^*: P \longmapsto \sum_{\pi(Q) = P} [Q] - \sum_{\pi(R) = \infty} [R].$$

Therefore, we can map $E$ to a subgroup of $J_0(N)$ (this map might have a nontrivial, but finite, kernel).

An equivalent formulation of the modularity of $E$ is to say that there is a nonconstant map from $X_0(N)$ to $E$ and therefore that $E$ is isogenous to an elliptic curve contained in some $J_0(N)$.

If $p$ is a prime dividing $N$, there are two natural maps $X_0(N) \to X_0(N/p)$. If $(E, C)$ is a pair corresponding to a point in $X_0(N)$, then there is a unique subgroup $C' \subset C$ of order $N/p$. So we have a map

$$\alpha: (E, C) \longmapsto (E, C'). \tag{15.10}$$

However, there is also a unique subgroup $P \subset C$ of order $p$. It can be shown that $E/P$ is an elliptic curve and therefore $(E/P, C/P)$ is a pair corresponding to a point on $X_0(N/p)$. This gives a map

$$\beta: (E, C) \longmapsto (E/P, C/P). \tag{15.11}$$

These two maps can be interpreted in terms of the complex model of $X_0(N)$. Since $\Gamma_0(N) \subset \Gamma_0(N/p)$, we can map $\mathcal{H}$ mod $\Gamma_0(N)$ to $\mathcal{H}$ mod $\Gamma_0(N/p)$ by mapping the equivalence class of $\tau$ mod $\Gamma_0(N)$ to the equivalence class of $\tau$ mod $\Gamma_0(N/p)$. This corresponds to the map $\alpha$. The map $\beta$ can be shown to correspond to the map $\tau \mapsto p\tau$. Note that these two maps represent the two methods of using modular forms for $\Gamma_0(N/p)$ to produce oldforms for $\Gamma_0(N)$.

The Hecke algebra $\mathbf{T}$ acts on $J_0(N)$. Let $P$ be a point on $X_0(N)$. Recall that $P$ corresponds to a pair $(E, C)$, where $E$ is an elliptic curve and $C$ is a cyclic subgroup of order $N$. Let $p$ be a prime. For each subgroup $D$ of $E$ of order $p$ with $D \not\subseteq C$, we can form the pair $(E/D, (C + D)/D)$. It can be shown that $E/D$ is an elliptic curve and $(C + D)/D$ is a cyclic subgroup of order $N$. Therefore, this pair represents a point on $X_0(N)$. Define

$$T_p([(E, C)]) = \sum_D [(E/D, (C + D)/D)] \in \mathrm{Div}(X_0(N)),$$

where the sum is over those $D$ of order $p$ with $D \not\subseteq C$ and where $\mathrm{Div}(X_0(N))$ denotes the divisors of $X_0(N)$ (see Chapter 11). It is not hard to show that this corresponds to the formulas for $T_p$ given in (15.6). Clearly $T_p$ maps divisors of degree 0 to divisors of degree 0, and it can be shown that it maps principal divisors to principal divisors. Therefore, $T_p$ gives a map from $J_0(N)$ to itself. This yields an action of $\mathbf{T}$ on $J_0(N)$, and these endomorphisms are defined over $\mathbf{Q}$.
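The number of terms in the sum defining $T_p$ can be counted in a toy model. Since $E[p] \simeq \mathbf{Z}_p \oplus \mathbf{Z}_p$, there are $p + 1$ subgroups of order $p$. When $p \mid N$, exactly one of them lies in $C$ and is excluded, which parallels the two cases of (15.6). The sketch below models $E[p]$ as pairs mod $p$ and, as an arbitrary modeling choice, takes the order-$p$ part of $C$ to be the line generated by $(1, 0)$. The function names are hypothetical.

```python
def order_p_subgroups(p):
    """The p + 1 subgroups of order p in (Z/p)^2, each as a frozenset of points."""
    generators = [(1, t) for t in range(p)] + [(0, 1)]
    return [
        frozenset(((k * x) % p, (k * y) % p) for k in range(p))
        for x, y in generators
    ]


def hecke_term_count(p, N):
    """Number of order-p subgroups D of E[p] with D not contained in C (C cyclic of order N)."""
    subgroups = order_p_subgroups(p)
    if N % p == 0:
        # C contains exactly one subgroup of order p; model it as the line through (1, 0).
        c_part = frozenset((k % p, 0) for k in range(p))
        subgroups = [D for D in subgroups if D != c_part]
    return len(subgroups)


print(hecke_term_count(2, 11), hecke_term_count(11, 11))
```

So $T_p$ sends a point to a divisor of degree $p + 1$ when $p \nmid N$ and of degree $p$ when $p \mid N$.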

Let $\alpha \in \mathbf{T}$ and let $J_0(N)[\alpha]$ denote the kernel of $\alpha$ on $J_0(N)$. More generally, let $I$ be an ideal of $\mathbf{T}$. Define

$$J_0(N)[I] = \bigcap_{\alpha \in I} J_0(N)[\alpha].$$

For example, when $I = n\mathbf{T}$ for an integer $n$, then $J_0(N)[I]$ is just $J_0(N)[n]$, the $n$-torsion on $J_0(N)$.

Now let's consider the representation $\rho$ of Theorem 15.6. Since $\rho$ is assumed to be modular, it corresponds to a maximal ideal $\mathcal{M}$ of $\mathbf{T}$. Let $\mathbf{F} = \mathbf{T}/\mathcal{M}$, which is a finite field. Then $W = J_0(N)[\mathcal{M}]$ has an action of $\mathbf{F}$, which means that it is a vector space over $\mathbf{F}$. Let $\ell$ be the characteristic of $\mathbf{F}$. Since $\ell = 0$ in $\mathbf{F}$, it follows that

$$W \subseteq J_0(N)[\ell],$$

the $\ell$-torsion of $J_0(N)$. Since $G$ acts on $W$, we see that $W$ yields a representation $\rho'$ of $G$ over $\mathbf{F}$. It can be shown that $\rho'$ is equivalent to $\rho$, so we can regard the representation space for $\rho$ as living inside the $\ell$-torsion of $J_0(N)$. This has great advantages. For example, if $M \mid N$ then there are natural maps $X_0(N) \to X_0(M)$. These yield (just as for the map $X_0(N) \to E$ above) maps $J_0(M) \to J_0(N)$. Showing that the level can be reduced from $N$ to $M$ is equivalent to showing that this representation space lives in these images of $J_0(M)$. Also, we are now working with a representation that lives inside a fairly concrete object, namely the $\ell$-torsion of an abelian variety, rather than a more abstract situation, so we have more control over $\rho$.


Shimura curves

We now need to introduce what are known as Shimura curves. Recall that in Section 10.2 we defined quaternion algebras as (noncommutative) rings of the form

$$\mathcal{Q} = \mathbf{Q} + \mathbf{Q}\alpha + \mathbf{Q}\beta + \mathbf{Q}\alpha\beta,$$

where

$$\alpha^2, \beta^2 \in \mathbf{Q}, \qquad \beta\alpha = -\alpha\beta.$$

We omit the requirement from Section 10.2 that $\alpha^2 < 0$ and $\beta^2 < 0$ since we want to consider indefinite quaternion algebras as well. Let $r$ be a prime (possibly $\infty$) and let $\mathcal{Q}_r$ be the ring obtained by allowing $r$-adic coefficients in the definition of $\mathcal{Q}$. As we mentioned in Section 10.2, there is a finite set of primes $r$, called the ramified primes, for which $\mathcal{Q}_r$ has no zero divisors. On the other hand, when $r$ is unramified, $\mathcal{Q}_r$ is isomorphic to $M_2(\mathbf{Q}_r)$, the ring of $2 \times 2$ matrices with $r$-adic entries.
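The multiplication rule follows from $\alpha^2 = A$, $\beta^2 = B$, and $\beta\alpha = -\alpha\beta$, and it is short enough to write out. The sketch below represents $x_0 + x_1\alpha + x_2\beta + x_3\alpha\beta$ as a 4-tuple. The names `quat_mul`, `A`, and `B` are illustrative. It contrasts $A = B = -1$ with $A = B = 1$, where zero divisors such as $(1 + \alpha)(1 - \alpha) = 0$ appear already over $\mathbf{Q}$.

```python
def quat_mul(x, y, A, B):
    """Product in the algebra with alpha^2 = A, beta^2 = B, beta*alpha = -alpha*beta.

    A tuple (x0, x1, x2, x3) stands for x0 + x1*alpha + x2*beta + x3*alpha*beta.
    """
    x0, x1, x2, x3 = x
    y0, y1, y2, y3 = y
    return (
        x0 * y0 + A * x1 * y1 + B * x2 * y2 - A * B * x3 * y3,
        x0 * y1 + x1 * y0 - B * x2 * y3 + B * x3 * y2,
        x0 * y2 + x2 * y0 + A * x1 * y3 - A * x3 * y1,
        x0 * y3 + x3 * y0 + x1 * y2 - x2 * y1,
    )


alpha, beta = (0, 1, 0, 0), (0, 0, 1, 0)
# The defining relations, checked for A = B = -1.
print(quat_mul(alpha, beta, -1, -1), quat_mul(beta, alpha, -1, -1))
# With A = B = 1 there are zero divisors: (1 + alpha)(1 - alpha) = 1 - alpha^2 = 0.
print(quat_mul((1, 1, 0, 0), (1, -1, 0, 0), 1, 1))
# With A = B = -1, x times its conjugate is a sum of four squares.
x = (1, 2, 3, 4)
conj = (1, -2, -3, -4)
print(quat_mul(x, conj, -1, -1))
```

In the case $A = B = -1$ the product of a nonzero element with its conjugate is a positive rational number, so nonzero elements are invertible and there are no zero divisors over $\mathbf{Q}$ or $\mathbf{R}$.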

Given two distinct primes $p$ and $q$, there is a quaternion algebra $B$ that is ramified exactly at $p$ and $q$. In particular, $B$ is unramified at $\infty$, so

$$B_\infty = M_2(\mathbf{R}).$$

Corresponding to the integer $M = N/pq$, there is an order $\mathcal{O} \subset B$, called an Eichler order of level $M$ (an order in $B$ is a subring of $B$ that has rank 4 as an additive abelian group; see Section 10.2). Regarding $\mathcal{O}$ as a subset of $B_\infty = M_2(\mathbf{R})$, define

$$\Gamma_\infty = \mathcal{O} \cap SL_2(\mathbf{R}).$$

Then $\Gamma_\infty$ acts on $\mathcal{H}$ by linear fractional transformations. The Shimura curve $C$ is defined to be $\mathcal{H}$ modulo $\Gamma_\infty$.

There is another description of $C$, analogous to the one given above for $X_0(N)$. Let $\mathcal{O}_{\max}$ be a maximal order in $B$. Consider pairs $(A, B)$, where $A$ is a two-dimensional abelian variety (these are algebraic varieties that, over $\mathbf{C}$, can be described as $\mathbf{C}^2$ mod a rank 4 lattice) and $B$ is a subgroup of $A$ isomorphic to $\mathbf{Z}_M \oplus \mathbf{Z}_M$. We restrict our attention to those pairs such that $\mathcal{O}_{\max}$ is contained in the endomorphism ring of $A$ and such that $\mathcal{O}_{\max}$ maps $B$ to $B$. When we are working over $\mathbf{C}$, such pairs are in one-to-one correspondence with the points on $C$. In general, over arbitrary fields, such pairs correspond in a natural way to points on an algebraic curve, which we again denote $C$.

Let J be the Jacobian of C. The description of C in terms of pairs (A, B)

means that we can deļ¬ne an action of the Hecke operators on J, similarly to

what we did for the modular curves.

Let $J[\ell]$ be the $\ell$-torsion of the Jacobian $J$ of $C$. It can be shown that
the representation $\rho$ occurs in $J[M]$, so there is a space $V$ isomorphic to the
representation space $W$ of $\rho$ with

\[ V \subseteq J[M] \subseteq J[\ell]. \]

We now have the representation $\rho$ living in $J_0(N)[\ell]$ and in $J[\ell]$. The
representation $\rho$ can be detected using the reduction of $J_0(N)$ mod $q$ and also
using the reduction of $J$ mod $p$, and Ribet uses a calculation with quaternion
algebras to establish a relationship between these two reductions. This
relationship allows him to show that $p$ can be removed from the level $N$.

REMARK 15.9 A correspondence between modular forms for $GL_2$ and
modular forms for the multiplicative group of a quaternion algebra plays a
major role in work of Jacquet-Langlands. This indicates a relation between
$J_0(N)$ and $J$. In fact, there is a surjection from $J_0(N)$ to $J$. However, this
map is not being used in the present case since such a map would relate the
reduction of $J_0(N)$ mod $q$ to the reduction of $J$ mod $q$. Instead, Ribet works
with the reduction of $J_0(N)$ mod $q$ and the reduction of $J$ mod $p$. This switch
between $p$ and $q$ is a major step in the proof of Ribet's theorem.

15.4 Sketch of Wiles's Proof

In this section, we outline the proof that all semistable elliptic curves over
$\mathbf{Q}$ are modular. For more details, see [29], [32], [118], [133]. Let $E$ be a
semistable elliptic curve and let

\[ f_E = \sum_{n \ge 1} a_n q^n \]

be the associated potential modular form. We want to prove that $f_E$ is a
modular form (for some $\Gamma_0(N)$).

Suppose we have two potential modular forms

\[ f = \sum_{n \ge 1} c_n q^n, \qquad g = \sum_{n \ge 1} c_n' q^n \]

arising from Galois representations $G \to GL_2(O_p)$ (where $O_p$ is some ring
containing the $p$-adic integers; we assume that all of the coefficients $c_n$, $c_n'$
are embedded in $O_p$). Let $\tilde{p}$ be the prime above $p$ in $O_p$. (If $O_p$ is the ring of
$p$-adic integers, then $\tilde{p} = p$.) If $c_\ell \equiv c_\ell' \pmod{\tilde{p}}$ for almost all primes $\ell$ (that
is, we allow finitely many exceptions), then we write

\[ f \equiv g \pmod{\tilde{p}}. \]

This means that the Galois representations mod $\tilde{p}$ associated to $f$ and $g$ are
equivalent.

The following result of Langlands and Tunnell gives us a place to start.

THEOREM 15.10

Let $E$ be an elliptic curve defined over $\mathbf{Q}$ and let $f_E = \sum_{n \ge 1} a_n q^n$ be the
associated potential modular form. There exists a modular form

\[ g_0 = \sum_{n \ge 1} b_n q^n \]

such that

\[ a_\ell \equiv b_\ell \pmod{\tilde{3}} \]

for almost all primes $\ell$ (that is, with possibly finitely many exceptions), and
where $\tilde{3}$ denotes a prime of $O_3$.

Recall that $O_3$ denotes an unspecified ring containing the 3-adic integers.
If $O_3$ is sufficiently large, the coefficients $b_\ell$, which are algebraic integers, can
be regarded as lying in $O_3$.

The reason that 3 is used is that the group $GL_2(\mathbf{F}_3)$ has order 48, hence
is solvable. The representation $\rho_3$ of $G$ on $E[3]$ therefore has its image in
a solvable group. The techniques of base change developed in the Langlands
program apply to cyclic groups, hence to solvable groups, and these techniques
are the key to proving the result. The groups $GL_2(\mathbf{F}_p)$ for $p \ge 5$ are not
solvable, so the base change techniques do not apply. On the other hand, the
representation $\rho_2$ for the Galois action on $E[2]$ is trivial for the Frey curves
since the 2-torsion is rational for these curves. Therefore, it is not expected
that $\rho_2$ should yield any information.
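The order 48 quoted above can be checked by brute force. The sketch below counts invertible $2 \times 2$ matrices over $\mathbf{F}_p$ and compares the count with the standard formula $(p^2 - 1)(p^2 - p)$; the function names are illustrative. It only verifies the group orders and does not test solvability.

```python
from itertools import product

def gl2_order(p):
    """Count 2x2 matrices over F_p with nonzero determinant (brute force)."""
    return sum(
        1
        for a, b, c, d in product(range(p), repeat=4)
        if (a * d - b * c) % p != 0
    )

def gl2_order_formula(p):
    """Closed form: choose a nonzero first row, then a second row off its span."""
    return (p * p - 1) * (p * p - p)

orders = {p: gl2_order(p) for p in (2, 3, 5)}
```

For $p = 3$ the count is 48, as stated; for $p = 5$ it is already 480.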

Note that the modular form $g_0$ does not necessarily have rational coefficients.
Therefore, $g_0$ is not necessarily the modular form associated to an
elliptic curve. Throughout Wiles's proof, Galois representations associated to
arbitrary modular forms are used.

The result of Langlands and Tunnell leads us to consider the following.

GENERAL PROBLEM

Fix a prime $p$. Let $g = \sum_{n \ge 1} a_n q^n$ be a potential modular form (associated
to a 2-dimensional Galois representation). Suppose there is a modular form
$g_0 = \sum b_n q^n$ such that $g \equiv g_0 \pmod{\tilde{p}}$. Can we prove that $g$ is a modular
form?

The work of Wiles shows that the answer to the general problem is often
yes. Let $A$ be the set of all potential modular forms $g$ with $g \equiv g_0 \pmod{\tilde{p}}$
(subject to certain restrictions). Let $M \subseteq A$ be the set of modular $g$'s in $A$.
We are assuming that $g_0 \in M$. The basic idea is the following. Let $T_A$ be the
tangent space to $A$ at $g_0$ and let $T_M$ be the tangent space to $M$ at $g_0$. The
goal is to show that $T_A = T_M$. Wiles shows that the spaces $A$ and $M$ are nice
enough that the equality of tangent spaces suffices to imply that $A = M$.

[Figure 15.2: Tangent Spaces. The sets $M \subseteq A$ with their tangent spaces $T_M$ and $T_A$ at the base point $g_0$.]

Example 15.2

Let $E$ be given by

\[ y^2 + xy + y = x^3 - x^2 - 171x + 1904. \]

This curve has multiplicative reduction at 17 and 37 and good reduction at
all other primes. Therefore, $E$ is semistable. The minimal discriminant of $E$
is $\Delta = -17 \cdot 37^5$. Since $E$ is semistable, the conductor of $E$ is $N = 17 \cdot 37$.
Therefore, we expect that $g_E$ is a modular form for $\Gamma_0(17 \cdot 37)$. Counting
points on $E$ mod $\ell$ for various $\ell$ yields the following values for $a_\ell$ (we ignore
the bad prime 17):

\[
\begin{array}{c|rrrrrrrrr}
\ell   & 2  & 3 & 5 & 7  & 11 & 13 & 17        & 19 & 23 \\ \hline
a_\ell & -1 & 0 & 3 & -1 & -5 & -2 & \text{--} & 1  & -6
\end{array}
\]

Therefore,

\[ g_E = q - q^2 + 0 \cdot q^3 - q^4 + 3q^5 + \cdots. \]

There is a modular form

\[ g_0 = \sum b_n q^n = q - q^2 + 0 \cdot q^3 - q^4 - 2q^5 + \cdots \]

for $\Gamma_0(17)$. The first few values of $b_\ell$ are as follows:

\[
\begin{array}{c|rrrrrrrrr}
\ell   & 2  & 3 & 5  & 7 & 11 & 13 & 17        & 19 & 23 \\ \hline
b_\ell & -1 & 0 & -2 & 4 & 0  & -2 & \text{--} & -4 & 4
\end{array}
\]

It can be shown that $a_\ell \equiv b_\ell \pmod{5}$ for all $\ell \ne 17, 37$ (we ignore these
bad primes), so

\[ g_E \equiv g_0 \pmod{5}. \]

Can we prove that $g_E$ is a modular form?
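The point counts behind these tables can be reproduced for small primes with a brute-force sketch. The function names are illustrative. The second curve, `E17`, is an assumption that does not appear in the text: it is an elliptic curve of conductor 17, used here only as a convenient way to generate the coefficients $b_\ell$ of the level 17 form by the same point-counting method.

```python
def count_points(coeffs, ell):
    """Number of points over F_ell on
    y^2 + a1*x*y + a3*y = x^3 + a2*x^2 + a4*x + a6,
    including the point at infinity (brute force over all pairs)."""
    a1, a2, a3, a4, a6 = coeffs
    affine = sum(
        1
        for x in range(ell)
        for y in range(ell)
        if (y * y + a1 * x * y + a3 * y
            - (x ** 3 + a2 * x * x + a4 * x + a6)) % ell == 0
    )
    return affine + 1

def trace(coeffs, ell):
    """a_ell = ell + 1 - #E(F_ell), valid at primes of good reduction."""
    return ell + 1 - count_points(coeffs, ell)

# Coefficients (a1, a2, a3, a4, a6).
E = (1, -1, 1, -171, 1904)   # the curve of Example 15.2
E17 = (1, -1, 1, -1, -14)    # assumed conductor-17 curve, not from the text

primes = (2, 3, 5, 7, 11, 13, 19, 23)   # bad primes 17 and 37 omitted
rows = [(ell, trace(E, ell), trace(E17, ell)) for ell in primes]
congruent_mod_5 = all((a - b) % 5 == 0 for _, a, b in rows)

for ell, a, b in rows:
    print(ell, a, b, (a - b) % 5)
```

A finite check like this is only evidence for the congruence at the primes tested; it does not replace the argument that $a_\ell \equiv b_\ell \pmod{5}$ for all good $\ell$.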

Let $A$ be the set of all potential modular forms $g$ with $g \equiv g_0 \pmod{5}$ and
where the level $N$ for $g$ is allowed to contain only the primes 5, 17, 37 in its
factorization. There is also a technical condition, which we omit, on the ring
generated by the coefficients of $g$. The subspace $M$ of true modular forms
contains $g_0$. Here are pictures of $A$ and $M$:

\[
A:\ \underset{g_0}{\bullet} \quad \underset{g_E}{\bullet}
\qquad\qquad
M:\ \underset{g_0}{\bullet} \quad \text{or} \quad \underset{g_0}{\bullet} \quad \underset{g_E}{\bullet}
\]

Therefore, our intuitive picture given in Figure 15.2 is not quite accurate.
In particular, the sets $A$ and $M$ are finite. However, by reinterpreting the
geometric picture algebraically, we can still discuss tangent spaces.

Since the sets $A$ and $M$ are finite, why not count the elements in both sets
and compare? First of all, this seems to be hard to do. Secondly, the tangent
spaces yield enough information. Consider the following situation. Suppose
you arrive at a train station in a small town. There are no signs telling you
which town it is, but you know it must be either I or II. You have the maps
given in Figure 15.3, where the large dot in the center indicates the station.

[Figure 15.3: Two Small Towns. Street maps of towns I and II.]

By counting the streets emanating from the station, you can immediately
determine which town you are in. The reason is that you have a base point. If
you didn't, then you might be on any of the vertices of I or II. You would not
be able to count streets and identify the town. The configuration of streets at
the station is the analogue of the tangent space at the base point. Of course,
it is possible that two towns could have the same tangent spaces, but Wiles
shows that this does not happen in his situation.

Tangent spaces

We now want to translate the notion of a tangent space into a useful algebraic
formulation. Let $\mathbf{R}[x, y]$ be the ring of polynomials in two variables and
let $f(x, y) \in \mathbf{R}[x, y]$. We can regard $f$ as a function from the $xy$-plane to $\mathbf{R}$.
Restricting $f$ to the parabola $y = x^2 - 6x$, we obtain a function

\[ f : \text{parabola} \longrightarrow \mathbf{R}. \]

If $g(x, y) \in \mathbf{R}[x, y]$, then $f$ and $g$ give the same function on the parabola if
and only if $f - g$ is a multiple of $y + 6x - x^2$. For example, let $f = x^3 - y$ and
$g = 6x + xy + 5x^2$. Then

\[ f - g = -(x + 1)(y + 6x - x^2). \]

If we choose a point $(a, b)$ on the parabola, then $b + 6a - a^2 = 0$, so

\[ f(a, b) = g(a, b) - (a + 1)(b + 6a - a^2) = g(a, b). \]

Therefore, there is a one-to-one correspondence

\[ \text{polynomial functions on the parabola} \;\longleftrightarrow\; \mathbf{R}[x, y]/(y + 6x - x^2). \]

The ring on the right consists of congruence classes of polynomials, where
we say that two polynomials are congruent if their difference is a multiple of
$y + 6x - x^2$. In this way, we have represented a geometric object, the parabola,
by an algebraic object, the ring $\mathbf{R}[x, y]/(y + 6x - x^2)$.
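The example with $f = x^3 - y$ and $g = 6x + xy + 5x^2$ can be checked numerically. The sketch below evaluates both polynomials on a grid of integer points to test the identity $f - g = -(x+1)(y + 6x - x^2)$, and then on points of the parabola to confirm that $f$ and $g$ agree there. The helper name `h` for the defining polynomial is an illustrative choice, and a grid check is of course evidence rather than a symbolic proof.

```python
def f(x, y):
    return x ** 3 - y

def g(x, y):
    return 6 * x + x * y + 5 * x ** 2

def h(x, y):
    """Defining polynomial of the parabola: zero exactly when y = x^2 - 6x."""
    return y + 6 * x - x ** 2

grid = [(x, y) for x in range(-5, 6) for y in range(-5, 6)]

# The difference f - g equals -(x + 1) * h at every grid point.
identity_holds = all(f(x, y) - g(x, y) == -(x + 1) * h(x, y) for x, y in grid)

# On the parabola h vanishes, so f and g take the same values.
parabola_points = [(a, a * a - 6 * a) for a in range(-5, 6)]
agree_on_parabola = all(f(a, b) == g(a, b) for a, b in parabola_points)

# Off the parabola they generally differ, e.g. at (1, 0).
differ_off_parabola = f(1, 0) != g(1, 0)
```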

Now let's consider the tangent line $y + 6x = 0$ at $(0, 0)$. It is obtained by
taking the degree 1 terms in $y + 6x - x^2$. We can represent it by the set

\[ \{ax + by \mid a, b \in \mathbf{R}\} \bmod (y + 6x), \]

where we are taking all linear functions and regarding two of them as congruent
if they differ by a multiple of $y + 6x$. Of course, we could have represented
the tangent line by the ring $\mathbf{R}[x, y]/(y + 6x)$, but, since we already know that
the tangent line is defined by a linear equation, we do not lose any information
by replacing $\mathbf{R}[x, y]$ by the linear polynomials $ax + by$.


Now consider the surface

\[ y - x^2 + xz + 6x + z = 0. \]

This surface contains the parabola $y = x^2 - 6x$, $z = 0$. The inclusion of the
parabola in the surface corresponds to a surjective ring homomorphism

\[
\begin{array}{ccc}
\mathbf{R}[x, y, z]/(y - x^2 + xz + 6x + z) & \longrightarrow & \mathbf{R}[x, y]/(y + 6x - x^2) \\
f(x, y, z) & \longmapsto & f(x, y, 0).
\end{array}
\]

We also have a surjective map on the algebraic objects representing the tangent
spaces

\[ \{ax + by + cz\} \bmod (y + 6x + z) \;\longrightarrow\; \{ax + by\} \bmod (y + 6x) \]

corresponding to the inclusion of the tangent line to the parabola in the tangent
plane for the surface at $(0, 0, 0)$. In this way, we can study relations
between geometric objects by looking at the corresponding algebraic objects.
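The passage from a defining equation to its tangent space at the origin, and the compatibility with setting $z = 0$, can be sketched with a minimal polynomial representation. Here a polynomial is a dictionary from exponent tuples to coefficients; this representation and the function names are illustrative assumptions, and the recipe only applies when the base point is the origin.

```python
def linear_part(poly):
    """Keep only the monomials of total degree 1."""
    return {exp: c for exp, c in poly.items() if sum(exp) == 1 and c != 0}

def set_z_zero(poly):
    """Image under f(x, y, z) -> f(x, y, 0): drop every monomial containing z
    and record the rest with exponents (deg x, deg y)."""
    return {(i, j): c for (i, j, k), c in poly.items() if k == 0}

# y - x^2 + xz + 6x + z, with exponents written as (deg x, deg y, deg z).
surface = {
    (0, 1, 0): 1,    # y
    (2, 0, 0): -1,   # -x^2
    (1, 0, 1): 1,    # xz
    (1, 0, 0): 6,    # 6x
    (0, 0, 1): 1,    # z
}

tangent_plane = linear_part(surface)    # y + 6x + z
parabola = set_z_zero(surface)          # y - x^2 + 6x
tangent_line = linear_part(parabola)    # y + 6x

# Taking linear parts commutes with setting z = 0.
commutes = set_z_zero(tangent_plane) == tangent_line
```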

Wiles works with rings such as $O_p[[x]]/(x^2 - px)$, where for simplicity we
henceforth assume that $O_p$ is the $p$-adic integers and where $O_p[[x]]$ denotes
power series with $p$-adic coefficients. The zeros of $x^2 - px$ are 0 and $p$, so this
ring corresponds to the geometric object

\[ S_1:\quad \underset{0}{\bullet} \qquad \underset{p}{\bullet} \]

The tangent space is represented by the set obtained by looking only at the
linear terms, namely $\{ax \mid a \in O_p\} \bmod (px)$. Since

\[ a_1 x \equiv a_2 x \pmod{px} \iff a_1 \equiv a_2 \pmod{p}, \]

the tangent space can be identified with $\mathbf{Z}_p$ (the integers mod $p$).

As another example, consider the ring $O_p[[x]]/(x(x - p)(x - p^3))$, which
corresponds to the geometric object

\[ S_2:\quad \underset{0}{\bullet} \qquad \underset{p}{\bullet} \qquad \underset{p^3}{\bullet} \]

The linear term of $x(x - p)(x - p^3)$ is $p^4 x$, so the tangent space is $\mathbf{Z}_{p^4}$.

There is an inclusion $S_1 \subset S_2$, which corresponds to the natural ring
homomorphism

\[ O_p[[x]]/(x(x - p)(x - p^3)) \longrightarrow O_p[[x]]/(x^2 - px). \]

The map on tangent spaces is the map from $\mathbf{Z}_{p^4}$ to $\mathbf{Z}_p$ that takes a number
mod $p^4$ and reduces it mod $p$.
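The sizes of these two tangent spaces can be read off from the coefficient of $x$ in the defining polynomial. The sketch below expands a product of factors $(x - r)$ and takes the $p$-part of the linear coefficient. The function names are illustrative, ordinary integers stand in for $p$-adic integers, and the roots are assumed to include the base point 0 as a simple root.

```python
def poly_from_roots(roots):
    """Coefficients [c0, c1, ...] (constant term first) of the product of (x - r)."""
    coeffs = [1]
    for r in roots:
        shifted = [0] + coeffs                     # x * (current polynomial)
        scaled = [-r * c for c in coeffs] + [0]    # -r * (current polynomial)
        coeffs = [s + t for s, t in zip(shifted, scaled)]
    return coeffs

def tangent_size(roots, p):
    """Size of {a x} mod (c1 x) over the p-adic integers, where c1 is the
    coefficient of x: the largest power of p dividing c1."""
    c1 = abs(poly_from_roots(roots)[1])
    if c1 == 0:
        raise ValueError("0 is not a simple root; the linear term vanishes")
    size = 1
    while c1 % p == 0:
        c1 //= p
        size *= p
    return size

p = 5
size_S1 = tangent_size([0, p], p)            # x^2 - p x
size_S2 = tangent_size([0, p, p ** 3], p)    # x (x - p)(x - p^3)
```

With $p = 5$ this gives tangent spaces of sizes $5$ and $5^4 = 625$, matching $\mathbf{Z}_p$ and $\mathbf{Z}_{p^4}$.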

Now consider the ring $O_p[[x, y]]/(x^2 - px,\ y^2 - py)$. In this case, we are
looking at power series in two variables, and two power series are congruent if their
difference is a linear combination of the form $A(x, y)(x^2 - px) + B(x, y)(y^2 - py)$
with $A, B \in O_p[[x, y]]$. The corresponding geometric object is

\[
S_3:\quad
\begin{array}{cc}
\underset{(0,\,p)}{\bullet} & \underset{(p,\,p)}{\bullet} \\[1ex]
\underset{(0,\,0)}{\bullet} & \underset{(p,\,0)}{\bullet}
\end{array}
\]

It can be shown that two power series give the same function on this set of
four points if they differ by a linear combination of $x^2 - px$ and $y^2 - py$. The
tangent space is represented by

\[ \{ax + by \mid a, b \in O_p\} \bmod (px, py), \]

which means we are considering two linear polynomials to be congruent if
their difference is a linear combination of $px$ and $py$. It is easy to see that

\[ a_1 x + b_1 y \equiv a_2 x + b_2 y \pmod{(px, py)} \iff a_1 \equiv a_2,\ b_1 \equiv b_2 \pmod{p}. \]

Therefore, the tangent space is isomorphic to $\mathbf{Z}_p \oplus \mathbf{Z}_p$.

The inclusion $S_1 \subset S_3$ corresponds to the ring homomorphism

\[ O_p[[x, y]]/(x^2 - px,\ y^2 - py) \longrightarrow O_p[[x]]/(x^2 - px). \]

The map on tangent spaces is the map $\mathbf{Z}_p \oplus \mathbf{Z}_p \to \mathbf{Z}_p$ given by projection
onto the first factor.

In all three examples above, the rings are given by power series over $O_p$.
The number of variables equals the number of relations and the resulting
ring is a finitely generated $O_p$-module (this is easily verified in the three
examples). Such rings are called local complete intersections. For such
rings, it is possible to recognize when a map is an isomorphism by looking at
the tangent spaces.

Before proceeding, let's look at an example that is not a local complete
intersection. Consider the ring

\[ O_p[[x, y]]/(x^2 - px,\ y^2 - py,\ xy). \]

The corresponding geometric object is

\[
S_4:\quad
\begin{array}{cc}
\underset{(0,\,p)}{\bullet} & \\[1ex]
\underset{(0,\,0)}{\bullet} & \underset{(p,\,0)}{\bullet}
\end{array}
\]

There are two variables and three relations, so we do not have a complete
intersection. The tangent space is $\mathbf{Z}_p \oplus \mathbf{Z}_p$. The inclusion $S_4 \subset S_3$ corresponds
to the ring homomorphism

\[ O_p[[x, y]]/(x^2 - px,\ y^2 - py) \longrightarrow O_p[[x, y]]/(x^2 - px,\ y^2 - py,\ xy) \]

and the map on tangent spaces is an isomorphism. However, $S_3 \ne S_4$. The
problem is that the tangent space calculation does not notice the relation $xy$,
which removed the point $(p, p)$ from $S_3$ to get $S_4$. Therefore, the tangent
space thinks this point is still there and incorrectly predicts an isomorphism
between the three point space and the four point space.
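The failure described here can be seen in a small computation: the two rings have different point sets but tangent spaces of the same size. In the sketch below the points are found by testing the four candidates in $\{0, p\}^2$ against the relations, and the tangent space size is computed as the index of the lattice spanned by the linear parts of the relations, using the gcd of the $2 \times 2$ minors. The names are illustrative, and ordinary integers stand in for $p$-adic integers, which is harmless here because the index is a power of $p$.

```python
from itertools import combinations
from math import gcd

p = 5

S3_relations = [lambda x, y: x * x - p * x, lambda x, y: y * y - p * y]
S4_relations = S3_relations + [lambda x, y: x * y]

def points(relations):
    """Candidate points in {0, p}^2 on which every relation vanishes."""
    candidates = [(x, y) for x in (0, p) for y in (0, p)]
    return [pt for pt in candidates if all(r(*pt) == 0 for r in relations)]

# Linear parts at (0, 0), written as (coefficient of x, coefficient of y).
S3_linear = [(-p, 0), (0, -p)]        # from x^2 - px and y^2 - py
S4_linear = S3_linear + [(0, 0)]      # the relation xy has no linear term

def tangent_size(rows):
    """Index in Z^2 of the lattice spanned by the rows (assumed of rank 2),
    computed as the gcd of all 2x2 minors."""
    g = 0
    for (a, b), (c, d) in combinations(rows, 2):
        g = gcd(g, a * d - b * c)
    return g

S3_points, S4_points = points(S3_relations), points(S4_relations)
S3_tangent, S4_tangent = tangent_size(S3_linear), tangent_size(S4_linear)
```

Both tangent spaces have size $p^2 = 25$, while the point counts are 4 and 3.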

The general fact we need is that if we have a surjective homomorphism of

rings that are local complete intersections, and if the induced map on tangent

spaces is an isomorphism, then the ring homomorphism is an isomorphism.

Deformations of Galois representations

Now let's return to our sets $A$ and $M$. Corresponding to these two sets are
rings $R_A$ and $R_M$. We have $g_0 \in M \subseteq A$. Let $T_A$ and $T_M$ be the tangent
spaces at $g_0$. In the examples above, the base point $g_0$ would correspond to
$x = 0$ or to $(x, y) = (0, 0)$. Corresponding to the inclusion $M \subseteq A$, there are
surjective maps

\[ R_A \longrightarrow R_M, \qquad T_A \longrightarrow T_M. \]

Therefore,

\[ \#T_M \le \#T_A. \]

The ring $R_M$ can be constructed using the Hecke algebra and the ring $R_A$
is constructed using results about representability of functors. In fact, it was
shown that there is a representation

\[ \rho_{\text{universal}} : G \longrightarrow GL_2(R_A) \]

with the following property. Let

\[ \rho : G \longrightarrow GL_2(O_p) \]

be a representation and let $g$ be the potential modular form attached to $\rho$.
Assume that $\rho$ is unramified outside a fixed finite set of primes. If $g \equiv g_0
\pmod{\tilde{p}}$, then there exists a unique ring homomorphism

\[ \varphi : R_A \longrightarrow O_p \]

such that the diagram

\[
\begin{array}{ccc}
G & \xrightarrow{\ \rho_{\text{universal}}\ } & GL_2(R_A) \\
  & {\scriptstyle \rho}\searrow & \downarrow{\scriptstyle \varphi} \\
  & & GL_2(O_p)
\end{array}
\]

commutes.

The representations $\rho$ such that $g \equiv g_0 \pmod{\tilde{p}}$ are examples of what are
known as deformations of the Galois representation for $g_0$. The representation
$\rho_{\text{universal}}$ is called a universal deformation.

Example 15.3

We continue with Example 15.2. Let $p = 5$ and take the fixed set of primes
to be $\{5, 17, 37\}$. Then it can be shown that

\[ R_A \simeq O_5[[x]]/(x^2 - bx), \]

where $b/5$ is a 5-adic unit and $O_5$ is the ring of 5-adic integers. This implies
that $T_A = \mathbf{Z}_5$. The set $A$ has two points, $g_0$ and $g$, corresponding to $x = 0$
and $x = b$.

There exists an integer $n$, defined below, such that

\[ n \le \#T_M \le \#T_A. \]

Moreover, a result of Flach shows that $n \cdot T_A = 0$. If it can be shown that
$n = \#T_A$, then $T_A = T_M$.

In our example, $n = 5$. Since we know that $T_A = \mathbf{Z}_5$, we have $n = \#T_A$.
Therefore, $T_A = T_M$. It can be shown that $R_A$ and $R_M$ are local complete
intersections. This yields $R_A = R_M$ and $A = M$. This implies that $g$ is a
modular form.
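The counting argument in this example can be written out as a short chain of steps. The following is a sketch under the assumptions stated in the example (the presentation of $R_A$, the value $n = 5$, and the surjectivity of $T_A \to T_M$); the symbol $u$ for the unit $b/5$ is introduced here only for illustration.

```latex
\begin{align*}
R_A &\simeq O_5[[x]]/(x^2 - bx), \qquad b = 5u,\ u \in O_5^{\times}, \\
T_A &= \{ax \mid a \in O_5\} \bmod (bx) \;\cong\; O_5/bO_5 \;=\; O_5/5O_5 \;\cong\; \mathbf{Z}_5,
      \qquad \#T_A = 5, \\
5 = n &\le \#T_M \le \#T_A = 5 \quad\Longrightarrow\quad \#T_M = \#T_A .
\end{align*}
```

Since $T_A \to T_M$ is a surjective map between finite sets of the same size, it is a bijection, which is the sense in which $T_A = T_M$.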

In general, recall that we started with a semistable elliptic curve $E$. Associated
to $E$ is the 3-adic Galois representation $\rho_{3^\infty}$. The theorem of Langlands-Tunnell
yields a modular form $g_0$, and therefore a Galois representation

\[ \rho_0 : G \longrightarrow GL_2(O_3). \]

We have

\[ \rho_{3^\infty} \equiv \rho_0 \pmod{\tilde{3}}, \]

so the base point $\rho_0$ is modular and semistable mod $\tilde{3}$ (the notion of semistability
can be defined for general Galois representations). Under the additional
assumption that $\rho_3$ restricted to $\mathrm{Gal}(\overline{\mathbf{Q}}/\mathbf{Q}(\sqrt{-3}))$ is absolutely irreducible,
Wiles showed that if $R_M$ is a local complete intersection then $n = \#T_A$ and
the map $R_A \to R_M$ is an isomorphism of local complete intersections. Finally,
in 1994, Wiles and Taylor used an ingenious argument to show that $R_M$ is a
local complete intersection, and therefore $A = M$.

What happens if $\rho_3$ does not satisfy the irreducibility assumption? Wiles
showed that there is a semistable elliptic curve $E'$ with the same mod 5
representation as $E$ but whose mod 3 representation is irreducible. Therefore,
$E'$ is modular, so the mod 5 representation of $E'$ is modular. This means
that the mod 5 representation of $E$ is modular. If the mod 5 representation,
restricted to $\mathrm{Gal}(\overline{\mathbf{Q}}/\mathbf{Q}(\sqrt{5}))$, is absolutely irreducible, then the above result
of Wiles, with 5 in place of 3, shows that $E$ is modular.

There are only finitely many elliptic curves over $\mathbf{Q}$ for which both the mod 3
representation (restricted to $\mathrm{Gal}(\overline{\mathbf{Q}}/\mathbf{Q}(\sqrt{-3}))$) and the mod 5 representation
(restricted to $\mathrm{Gal}(\overline{\mathbf{Q}}/\mathbf{Q}(\sqrt{5}))$) are not absolutely irreducible. These finitely
many exceptions can be proved to be modular individually.

Therefore, semistable elliptic curves over Q are modular. Eventually, the

argument was extended by Breuil, Conrad, Diamond, and Taylor to include

all elliptic curves over Q (Theorem 14.4).

The integer $n$ is defined as follows. Let $g_0 = \sum b_m q^m$ and let
