Chapter 15
Fermat’s Last Theorem

15.1 Overview
Around 1637, Fermat wrote in the margin of his copy of Diophantus’s work
that, when $n \ge 3$,

\[ a^n + b^n = c^n, \qquad abc \ne 0 \tag{15.1} \]

has no solution in integers $a, b, c$. This has become known as Fermat’s Last
Theorem. Note that it suffices to consider only the cases where $n = 4$ and
where $n = \ell$ is an odd prime (since any $n \ge 3$ has either 4 or such an
$\ell$ as a factor). The case $n = 4$ was proved by Fermat using his method of
infinite descent (see Section 8.6). At least one unsuccessful attempt to prove
the case $n = 3$ appears in Arab manuscripts in the 900s (see [34]). This case
was settled by Euler (and possibly by Fermat). The first general result was due
to Kummer in the 1840s: Define the Bernoulli numbers $B_n$ by the power series
\[ \frac{t}{e^t - 1} = \sum_{n=0}^{\infty} B_n \frac{t^n}{n!}. \]

For example,

\[ B_2 = \frac{1}{6}, \quad B_4 = -\frac{1}{30}, \quad \ldots, \quad B_{12} = -\frac{691}{2730}. \]

Let $\ell$ be an odd prime. If $\ell$ does not divide the numerator of any of
the Bernoulli numbers

\[ B_2,\ B_4,\ \ldots,\ B_{\ell-3} \]

then (15.1) has no solutions for $n = \ell$. This criterion allowed Kummer to
prove Fermat’s Last Theorem for all prime exponents less than 100, except
for $\ell = 37, 59, 67$. For example, 37 divides the numerator of the 32nd
Bernoulli number, so this criterion does not apply. Using more refined criteria,
based on the knowledge of which Bernoulli numbers are divisible by these
exceptional $\ell$, Kummer was able to prove Fermat’s Last Theorem for the three
remaining

© 2008 by Taylor & Francis Group, LLC

exponents. Refinements of Kummer’s ideas by Vandiver and others, plus the
advent of computers, yielded extensions of Kummer’s results to many more
exponents. For example, in 1992, Buhler, Crandall, Ernvall, and Metsänkylä
proved Fermat’s Last Theorem for all exponents less than $4 \times 10^6$. How
could one check so many cases without seeing a pattern that would lead to a full
proof? The reason is that these methods were a prime-by-prime check. For
each prime $\ell$, the Bernoulli numbers were computed mod $\ell$. For around 61%
of the primes, none of these Bernoulli numbers was divisible by $\ell$, so
Kummer’s initial criterion yielded the result. For the remaining 39% of the
primes, more refined criteria were used, based on the knowledge of which
Bernoulli numbers were divisible by $\ell$. For $\ell$ up to $4 \times 10^6$, these
criteria sufficed to prove the theorem. But it was widely suspected that
eventually there would be exceptions to these criteria, and hence more
refinements would be needed. The underlying problem with this approach was that
it did not include any conceptual reason for why Fermat’s Last Theorem should be
true. In particular, there was no reason why there couldn’t be a few random
exceptions.
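
As a small illustration of Kummer's initial criterion, the following Python
sketch computes the Bernoulli numbers exactly, using the recurrence that follows
from the defining power series, and lists for each odd prime below 100 the even
indices $k \le \ell - 3$ for which $\ell$ divides the numerator of $B_k$. The
function names are ours and purely illustrative; the large computations described
above worked mod $\ell$ rather than with exact fractions.

```python
from fractions import Fraction
from math import comb


def bernoulli_numbers(n_max):
    """Return [B_0, ..., B_n_max] as exact fractions (so B_1 = -1/2)."""
    B = [Fraction(1)]
    for m in range(1, n_max + 1):
        # Comparing coefficients in t = (e^t - 1) * sum B_n t^n / n! gives
        # sum_{k=0}^{m} C(m+1, k) B_k = 0 for m >= 1.
        s = sum(comb(m + 1, k) * B[k] for k in range(m))
        B.append(-s / (m + 1))
    return B


def is_prime(n):
    return n >= 2 and all(n % d for d in range(2, int(n ** 0.5) + 1))


def irregular_indices(ell, B):
    """Even k in [2, ell - 3] such that ell divides the numerator of B_k."""
    return [k for k in range(2, ell - 2, 2) if B[k].numerator % ell == 0]


B = bernoulli_numbers(100)
exceptions = {}
for ell in range(3, 100):
    if is_prime(ell):
        ks = irregular_indices(ell, B)
        if ks:  # Kummer's initial criterion does not apply to this prime
            exceptions[ell] = ks
print(exceptions)
```

Only the primes 37, 59, and 67 should be reported, in agreement with the
exceptions named above, and the index listed for 37 is 32.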
In 1986, the situation changed. Suppose that

\[ a^\ell + b^\ell = c^\ell, \qquad abc \ne 0. \tag{15.2} \]

By removing common factors, we may assume that $a, b, c$ are integers with
$\gcd(a, b, c) = 1$, and by rearranging $a, b, c$ and changing signs if
necessary, we may assume that

\[ b \equiv 0 \pmod{2}, \qquad a \equiv -1 \pmod{4}. \tag{15.3} \]

Frey suggested that the elliptic curve

\[ E_{\mathrm{Frey}} : \quad y^2 = x(x - a^\ell)(x + b^\ell) \]

(this curve had also been considered by Hellegouarch) has such restrictive
properties that it cannot exist, and therefore there cannot be any solutions to
(15.2). As we’ll outline below, subsequent work of Ribet and Wiles showed
that this is the case.
When $\ell \ge 5$, the elliptic curve $E_{\mathrm{Frey}}$ has good or
multiplicative reduction (see Exercise 2.24) at all primes (in other words,
there is no additive reduction). Such an elliptic curve is called semistable.
The discriminant of the cubic is the square of the product of the differences of
the roots, namely

\[ \left( a^\ell (-b^\ell)(a^\ell + b^\ell) \right)^2 = (abc)^{2\ell} \]

(we have used (15.2)). Because of technicalities involving the prime 2 (related
to the restrictions in (15.3)), the discriminant needs to be modified at 2 to
yield what is known as the minimal discriminant

\[ \Delta = 2^{-8} (abc)^{2\ell} \]

of $E_{\mathrm{Frey}}$. A conjecture of Brumer and Kramer predicts that a
semistable elliptic curve over $\mathbf{Q}$ whose minimal discriminant is an
$\ell$th power will have a point of order $\ell$. Mazur’s Theorem (8.11) says
that an elliptic curve over $\mathbf{Q}$ cannot have a point of order $\ell$ when
$\ell \ge 11$. Moreover, if the 2-torsion is rational, as is the case with
$E_{\mathrm{Frey}}$, then there are no points of order $\ell$ when $\ell \ge 5$.
Since $\Delta$ is almost an $\ell$th power, we expect $E_{\mathrm{Frey}}$ to act
similarly to a curve that has a point of order $\ell$. Such curves cannot exist
when $\ell \ge 5$, so $E_{\mathrm{Frey}}$ should act like a curve that cannot
exist. Therefore, we expect that $E_{\mathrm{Frey}}$ does not exist. The problem
is to make these ideas precise.
Recall (see Chapter 14) that the $L$-series of an elliptic curve $E$ over
$\mathbf{Q}$ is defined as follows. For each prime $p$ of good reduction, let

\[ a_p = p + 1 - \#E(\mathbf{F}_p). \]

Then

\[ L_E(s) = (*) \prod_p \left( 1 - a_p p^{-s} + p^{1-2s} \right)^{-1} = \sum_{n=1}^{\infty} \frac{a_n}{n^s}, \]

where $(*)$ represents the factors for the bad primes (see Section 14.2) and the
product is over the good primes. Suppose $E(\mathbf{Q})$ contains a point of
order $\ell$. By Theorem 8.9, $E(\mathbf{F}_p)$ contains a point of order $\ell$
for all primes $p \ne \ell$ such that $E$ has good reduction at $p$. Therefore,
$\ell \mid \#E(\mathbf{F}_p)$, so

\[ a_p \equiv p + 1 \pmod{\ell} \tag{15.4} \]

for all such $p$. This is an example of how the arithmetic of $E$ is related to
properties of the coefficients $a_p$. We hope to obtain information by studying
these coefficients.
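
The congruence (15.4) is easy to test numerically. The sketch below uses the
curve $y^2 + y = x^3 - x^2$, which is not taken from the text: it is a standard
example whose point $(0, 0)$ has order 5 and which has bad reduction only at 11.
Points are counted by brute force, so this is only suitable for small primes.

```python
def count_points(p):
    """#E(F_p) for E: y^2 + y = x^3 - x^2, counted by brute force."""
    affine = sum(
        1
        for x in range(p)
        for y in range(p)
        if (y * y + y - (x ** 3 - x * x)) % p == 0
    )
    return affine + 1  # add the point at infinity


ELL = 5                                   # order of the rational point (0, 0)
GOOD_PRIMES = [2, 3, 7, 13, 17, 19, 23, 29, 31]   # p != ELL and p != 11

for p in GOOD_PRIMES:
    n_points = count_points(p)
    a_p = p + 1 - n_points
    # (15.4): a_p = p + 1 mod ELL, equivalently ELL divides #E(F_p)
    print(p, n_points, a_p, (a_p - (p + 1)) % ELL == 0)
```

Every line should end in `True`: the reduction of the point of order 5 forces
$5 \mid \#E(\mathbf{F}_p)$ at each of these primes.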
In particular, we expect a congruence similar to (15.4) to hold for
$E_{\mathrm{Frey}}$. In fact, a close analysis (requiring more detail than we
give in Section 15.3) of Ribet’s proof shows that $E_{\mathrm{Frey}}$ is trying
to satisfy this congruence. However, the irreducibility of a certain Galois
representation is preventing it, and this leads to the contradiction that proves
the theorem.
The problem with this approach is that the numbers $a_p$ at first seem to
be fairly independent of each other as $p$ varies. However, the Conjecture of
Taniyama-Shimura-Weil (now Theorem 14.4) claims that, for an elliptic curve
$E$ over $\mathbf{Q}$,

\[ f_E(\tau) = \sum_{n=1}^{\infty} a_n q^n \]

(where $q = e^{2\pi i \tau}$) is a modular form for $\Gamma_0(N)$ for some $N$
(see Section 14.2). In this case, we say that $E$ is modular. This is a fairly
rigid condition and can be interpreted as saying that the numbers $a_p$ have
some coherence as $p$ varies. For example, it is likely that if we change one
coefficient $a_p$, then the modularity will be lost. Therefore, modularity is a
tool for keeping the
numbers $a_p$ under control. Frey predicted the following, which Ribet proved
in 1986:

THEOREM 15.1
$E_{\mathrm{Frey}}$ cannot be modular. Therefore, the conjecture of
Taniyama-Shimura-Weil implies Fermat’s Last Theorem.

This result finally gave a theoretical reason for believing Fermat’s Last
Theorem. Then in 1994, Wiles proved

THEOREM 15.2
All semistable elliptic curves over $\mathbf{Q}$ are modular.

This result was subsequently extended to include all elliptic curves over
$\mathbf{Q}$. See Theorem 14.4. Since the Frey curve is semistable, the theorems
of Wiles and Ribet combine to show that $E_{\mathrm{Frey}}$ cannot exist, hence

THEOREM 15.3
Fermat’s Last Theorem is true.

In the following three sections, we sketch some of the ideas that go into the
proofs of Ribet’s and Wiles’s theorems.

15.2 Galois Representations
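Let $E$ be an elliptic curve over $\mathbf{Q}$ and let $m$ be an integer. From
Theorem 3.2, we know that

\[ E[m] \simeq \mathbf{Z}_m \oplus \mathbf{Z}_m. \]

Let $\{\beta_1, \beta_2\}$ be a basis of $E[m]$ and let $\sigma \in G$, where

\[ G = \mathrm{Gal}(\overline{\mathbf{Q}}/\mathbf{Q}). \]

Since $\sigma\beta_i \in E[m]$, we can write

\[ \sigma\beta_1 = a\beta_1 + c\beta_2, \qquad \sigma\beta_2 = b\beta_1 + d\beta_2 \]

with $a, b, c, d \in \mathbf{Z}_m$. We thus obtain a homomorphism

\[ \rho_m : G \longrightarrow GL_2(\mathbf{Z}_m), \qquad \sigma \longmapsto \begin{pmatrix} a & b \\ c & d \end{pmatrix}. \]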

If $m = \ell$ is a prime, we call $\rho_\ell$ the mod $\ell$ Galois
representation attached to $E$. We can also take $m = \ell^n$ for
$n = 1, 2, 3, \ldots$. By choosing an appropriate sequence of bases, we obtain
representations $\rho_{\ell^n}$ such that

\[ \rho_{\ell^n} \equiv \rho_{\ell^{n+1}} \pmod{\ell^n} \]

for all $n$. These may be combined to obtain

\[ \rho_{\ell^\infty} : G \longrightarrow GL_2(\mathcal{O}_\ell), \]

where $\mathcal{O}_\ell$ denotes any ring containing the $\ell$-adic integers
(see Appendix A). This is called the $\ell$-adic Galois representation attached
to $E$. An advantage of working with $\rho_{\ell^\infty}$ is that the
$\ell$-adic integers have characteristic 0, so instead of congruences mod powers
of $\ell$, we can work with equalities.
Notation: Throughout this chapter, we will need rings that are finite
extensions of the $\ell$-adic integers. We’ll denote such rings by
$\mathcal{O}_\ell$. For many purposes, we can take $\mathcal{O}_\ell$ to equal
the $\ell$-adic integers, but sometimes we need slightly larger rings. Since we
do not want to discuss the technical issues that arise in this regard, we simply
use $\mathcal{O}_\ell$ to denote a varying ring that is large enough for whatever
is required. The reader will not lose much by pretending that
$\mathcal{O}_\ell$ is always the ring of $\ell$-adic integers.
Suppose $r$ is a prime of good reduction for $E$. There exists an element
$\mathrm{Frob}_r \in G$ such that the action of $\mathrm{Frob}_r$ on
$E(\overline{\mathbf{Q}})$ yields the action of the Frobenius $\phi_r$ on
$E(\overline{\mathbf{F}}_r)$ when $E$ is reduced mod $r$ (the element
$\mathrm{Frob}_r$ is not unique, but this will not affect us). In particular,
when $\ell \ne r$, the matrices describing the actions of $\mathrm{Frob}_r$ and
$\phi_r$ on the $\ell$-power torsion are the same (use a basis and its reduction
to compute the matrices). Let

\[ a_r = r + 1 - \#E(\mathbf{F}_r). \]

From Proposition 4.11, we obtain that

\[ \mathrm{Trace}(\rho_{\ell^n}(\mathrm{Frob}_r)) \equiv a_r \pmod{\ell^n}, \qquad \det(\rho_{\ell^n}(\mathrm{Frob}_r)) \equiv r \pmod{\ell^n}, \]

and therefore

\[ \mathrm{Trace}(\rho_{\ell^\infty}(\mathrm{Frob}_r)) = a_r, \qquad \det(\rho_{\ell^\infty}(\mathrm{Frob}_r)) = r. \]

Recall that the numbers $a_r$ are used to produce the modular form $f_E$
attached to $E$ (see Section 14.2).
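
The simplest instance of the trace congruence is $m = 2$, where it can be
checked by hand or by machine. For $E : y^2 = x^3 + ax + b$ the nonzero points of
$E[2]$ are $(x_0, 0)$ with $x_0$ a root of the cubic, and an element of
$GL_2(\mathbf{Z}_2)$ has trace 0 exactly when it fixes a nonzero vector. So
$a_r$ should be even precisely when the cubic has a root mod $r$. The sketch
below tests this for the curve $y^2 = x^3 - x + 1$, an example of our own
choosing (discriminant $-16 \cdot 23$); the function names are illustrative.

```python
def frobenius_trace(a, b, r):
    """a_r = r + 1 - #E(F_r) for E: y^2 = x^3 + a*x + b, r an odd prime."""
    square_roots = [0] * r                 # square_roots[v] = #{y : y^2 = v mod r}
    for y in range(r):
        square_roots[y * y % r] += 1
    affine = sum(square_roots[(x ** 3 + a * x + b) % r] for x in range(r))
    return r + 1 - (affine + 1)            # +1 for the point at infinity


def frobenius_fixes_2_torsion(a, b, r):
    """True if x^3 + a*x + b has a root mod r, i.e. a point of order 2 over F_r."""
    return any((x ** 3 + a * x + b) % r == 0 for x in range(r))


A, B = -1, 1                               # hypothetical example curve
ODD_GOOD_PRIMES = [3, 5, 7, 11, 13, 17, 19, 29, 31, 37, 41, 43, 47]

for r in ODD_GOOD_PRIMES:
    a_r = frobenius_trace(A, B, r)
    # Trace(rho_2(Frob_r)) = a_r mod 2 vanishes iff Frob_r fixes a point of E[2].
    print(r, a_r, (a_r % 2 == 0) == frobenius_fixes_2_torsion(A, B, r))
```

The determinant congruence is automatic here, since $r$ is odd and every element
of $GL_2(\mathbf{Z}_2)$ has determinant 1.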
Suppose now that

\[ \rho : G \longrightarrow GL_2(\mathcal{O}_\ell) \]

is a representation of $G$. Under certain technical conditions (namely, $\rho$
is unramified at all but finitely many primes; see the end of this section), we
may choose elements $\mathrm{Frob}_r$ (for the unramified primes) and define

\[ a_r = \mathrm{Trace}(\rho(\mathrm{Frob}_r)). \]

This allows us to define a formal series

\[ g = \sum_{n=1}^{\infty} a_n q^n. \]

We refer to $g$ as the potential modular form attached to $\rho$. Of course,
some conditions must be imposed on the $a_r$ in order for this to represent a
complex function (for example, the numbers $a_n \in \mathcal{O}_\ell$ must be
identified with complex numbers), but we will not discuss this general problem
here.
Let $N$ be a positive integer. Recall that a modular form $f$ of weight 2 and
level $N$ is a function analytic in the upper half plane satisfying

\[ f\!\left( \frac{a\tau + b}{c\tau + d} \right) = (c\tau + d)^2 f(\tau) \tag{15.5} \]

for all

\[ \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \Gamma_0(N) \]

(where $\Gamma_0(N)$ is the group of integral matrices of determinant 1 such that
$c \equiv 0 \pmod{N}$). There are also technical conditions that we won’t discuss
for the behavior of $f$ at the cusps. The cusp forms of weight 2 and level $N$,
which we’ll denote by $S(N)$, are those modular forms that take the value 0 at
all the cusps. $S(N)$ is a finite dimensional vector space over $\mathbf{C}$. We
represent cusp forms by their Fourier expansions:

\[ f(\tau) = \sum_{n=1}^{\infty} b_n q^n, \]

where $q = e^{2\pi i \tau}$.
If $M \mid N$, then $\Gamma_0(N) \subseteq \Gamma_0(M)$, so a modular form of
level $M$ can be regarded as a modular form of level $N$. More generally, if
$d \mid (N/M)$ and $f(\tau)$ is a cusp form of level $M$, then it can be shown
that $f(d\tau)$ is a cusp form of level $N$. The subspace of $S(N)$ generated by
such $f$, where $M$ ranges through proper divisors of $N$ and $d$ ranges through
divisors of $N/M$, is called the subspace of oldforms. There is a naturally
defined inner product on $S(N)$, called the Petersson inner product. The space
of newforms of level $N$ is the perpendicular complement of the space of
oldforms. Intuitively, the newforms are those that do not come from levels lower
than $N$.
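We now need to introduce the Hecke operators. Let $r$ be a prime. Define

\[ T_r\!\left( \sum_{n=1}^{\infty} b_n q^n \right) =
\begin{cases}
\sum_{n=1}^{\infty} b_{rn} q^n + \sum_{n=1}^{\infty} r\, b_n q^{rn}, & \text{if } r \nmid N \\[4pt]
\sum_{n=1}^{\infty} b_{rn} q^n, & \text{if } r \mid N.
\end{cases} \tag{15.6} \]

It can be shown that $T_r$ maps $S(N)$ into $S(N)$ and that the $T_r$’s commute
with each other. Define the Hecke algebra

\[ \mathbf{T} = \mathbf{T}_N \subseteq \mathrm{End}(S(N)) \]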
to be the image of $\mathbf{Z}[T_2, T_3, T_5, \ldots]$ in the endomorphism ring
of $S(N)$ (the endomorphism ring of $S(N)$ is the ring of linear transformations
from the vector space $S(N)$ to itself).
A normalized eigenform of level $N$ is a newform

\[ f = \sum_{n=1}^{\infty} b_n q^n \in S(N) \]

of level $N$ with $b_1 = 1$ and such that

\[ T_r(f) = b_r f \quad \text{for all } r. \]

It can be shown that the space of newforms in $S(N)$ has a basis of normalized
eigenforms. Henceforth, essentially all of the modular forms that we encounter
will be normalized eigenforms of level $N$. Often, we shall refer to them simply
as modular forms.
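
To see formula (15.6) and the eigenform condition in action, here is a sketch in
Python. It uses an example that is not taken from the text: the product
$q \prod_{n \ge 1} (1 - q^n)^2 (1 - q^{11n})^2$, which is a classical description
of the normalized cusp form of weight 2 and level 11. The helper names are ours,
and the check is only up to a fixed number of coefficients.

```python
def level_11_form(n_terms):
    """Coefficients b[0..n_terms] of q * prod_{n>=1} (1 - q^n)^2 (1 - q^(11n))^2."""
    b = [0] * (n_terms + 1)
    b[1] = 1                                         # the leading factor q
    for n in range(1, n_terms + 1):
        for step in (n, n, 11 * n, 11 * n):          # each factor occurs squared
            for k in range(n_terms, step - 1, -1):
                b[k] -= b[k - step]                  # multiply by (1 - q^step)
    return b


def hecke(r, b, level, n_out):
    """Coefficients 0..n_out of T_r(sum b_n q^n), following (15.6)."""
    out = [0] * (n_out + 1)
    for n in range(1, n_out + 1):
        out[n] = b[r * n]                            # first sum in (15.6)
        if level % r != 0 and n % r == 0:
            out[n] += r * b[n // r]                  # second sum, only if r does not divide N
    return out


LEVEL = 11
N_OUT = 20
PRIMES = [2, 3, 5, 7, 11, 13]
b = level_11_form(max(PRIMES) * N_OUT)

for r in PRIMES:
    lhs = hecke(r, b, LEVEL, N_OUT)
    rhs = [0] + [b[r] * b[n] for n in range(1, N_OUT + 1)]
    print(r, b[r], lhs == rhs)                       # T_r f = b_r f, coefficientwise
```

Note that `hecke` needs the input coefficients up to index `r * n_out`, which is
why the series is expanded to `max(PRIMES) * N_OUT` terms.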
Let $f$ be a normalized eigenform and suppose the coefficients $b_n$ of $f$ are
rational integers. In this case, Eichler and Shimura showed that $f$ determines
an elliptic curve $E_f$ over $\mathbf{Q}$, and $E_f$ has the property that

\[ b_r = a_r \]

for all $r$ (where $a_r = r + 1 - \#E_f(\mathbf{F}_r)$ for the primes of good
reduction). In particular, the potential modular form $f_{E_f}$ for $E_f$ is the
modular form $f$. Moreover, $E_f$ has good reduction at the primes not dividing
$N$. This result is, in a sense, a converse of the conjecture of
Taniyama-Shimura-Weil. The conjecture can be restated as claiming that every
elliptic curve $E$ over $\mathbf{Q}$ arises from this construction. Actually, we
have to modify this statement a little. Two elliptic curves $E_1$ and $E_2$ are
called isogenous over $\mathbf{Q}$ if there is a nonconstant homomorphism
$E_1(\overline{\mathbf{Q}}) \to E_2(\overline{\mathbf{Q}})$ that is described by
rational functions over $\mathbf{Q}$ (see Chapter 12). It can be shown that, in
this case, $f_{E_1} = f_{E_2}$. Conversely, Faltings showed that if
$f_{E_1} = f_{E_2}$ then $E_1$ and $E_2$ are isogenous. Since only one of
$E_1, E_2$ can be the curve $E_f$, we must ask whether an elliptic curve $E$ over
$\mathbf{Q}$ is isogenous to one produced by the result of Eichler and Shimura.
Theorem 14.4 says that the answer is yes.
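
The equality $b_r = a_r$ can be observed in the smallest case. The sketch below
compares the coefficients of the weight 2 cusp form of level 11, written as the
product $q \prod (1 - q^n)^2 (1 - q^{11n})^2$, with point counts on the curve
$y^2 + y = x^3 - x^2$ of conductor 11. Both the form and the curve are standard
examples that we bring in for illustration; they do not appear in the text, and
the curve is only asserted to lie in the isogeny class attached to the form.

```python
def form_coefficients(n_terms):
    """Coefficients b[0..n_terms] of q * prod_{n>=1} (1 - q^n)^2 (1 - q^(11n))^2."""
    b = [0] * (n_terms + 1)
    b[1] = 1
    for n in range(1, n_terms + 1):
        for step in (n, n, 11 * n, 11 * n):
            for k in range(n_terms, step - 1, -1):
                b[k] -= b[k - step]          # multiply the series by (1 - q^step)
    return b


def trace_of_frobenius(p):
    """a_p = p + 1 - #E(F_p) for E: y^2 + y = x^3 - x^2 (brute-force count)."""
    affine = sum(
        1
        for x in range(p)
        for y in range(p)
        if (y * y + y - x ** 3 + x * x) % p == 0
    )
    return p + 1 - (affine + 1)


GOOD_PRIMES = [2, 3, 5, 7, 13, 17, 19, 23, 29, 31]   # all primes < 32 except 11
b = form_coefficients(31)

for p in GOOD_PRIMES:
    print(p, b[p], trace_of_frobenius(p))            # the two columns should agree
```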
If we have an elliptic curve $E$, how can we predict what $N$ should be? The
smallest possible $N$ is called the conductor of $E$. For $E = E_f$, the primes
dividing the conductor $N$ are exactly the primes of bad reduction of $E_f$
(these are also the primes of bad reduction of any curve isogenous to $E_f$ over
$\mathbf{Q}$). Moreover, $p \mid N$ and $p^2 \nmid N$ if and only if $E_f$ has
multiplicative reduction at $p$. Therefore, if $E_f$ is semistable, then

\[ N = \prod_{p \mid \Delta} p, \tag{15.7} \]

namely, the product of the primes dividing the minimal discriminant $\Delta$. We
see that $N$ is squarefree if and only if $E_f$ is semistable. Therefore, if $E$
is an arbitrary modular semistable elliptic curve over $\mathbf{Q}$, then $N$ is
given by (15.7).

Combining the result of Eichler and Shimura with the Galois representations
discussed above, we obtain the following. If $f = \sum b_n q^n$ is a normalized
newform with rational integer coefficients, then there is a Galois representation

\[ \rho_f : G \longrightarrow GL_2(\mathcal{O}_\ell) \]

such that

\[ \mathrm{Trace}(\rho_f(\mathrm{Frob}_r)) = b_r, \qquad \det(\rho_f(\mathrm{Frob}_r)) = r \tag{15.8} \]

for all $r \nmid \ell N$.
More generally, Eichler and Shimura showed that if $f = \sum b_n q^n$ is any
normalized newform (with no assumptions on its coefficients), then there is a
Galois representation

\[ \rho_f : G \to GL_2(\mathcal{O}_\ell) \]

satisfying (15.8).
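Returning to the situation where the coefficients $b_n$ are in $\mathbf{Z}$, we
let $\mathcal{M}$ be the kernel of the ring homomorphism

\[ \mathbf{T} \longrightarrow \mathbf{F}_\ell, \qquad T_r \longmapsto b_r \pmod{\ell}. \]

Since the homomorphism is surjective (because 1 maps to 1) and $\mathbf{F}_\ell$
is a field, $\mathcal{M}$ is a maximal ideal of $\mathbf{T}$. Also,
$\mathbf{T}/\mathcal{M} = \mathbf{F}_\ell$. Since $T_r - b_r \in \mathcal{M}$,
the mod $\ell$ version of (15.8) says that

\[ \mathrm{Trace}(\rho_f(\mathrm{Frob}_r)) \equiv T_r \bmod \mathcal{M}, \qquad \det(\rho_f(\mathrm{Frob}_r)) \equiv r \bmod \mathcal{M} \]

for all $r \nmid \ell N$. This has been greatly generalized by Deligne and Serre: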

THEOREM 15.4
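Let $\mathcal{M}$ be a maximal ideal of $\mathbf{T}$ and let $\ell$ be the
characteristic of $\mathbf{T}/\mathcal{M}$. There exists a semisimple
representation

\[ \rho_{\mathcal{M}} : G \longrightarrow GL_2(\mathbf{T}/\mathcal{M}) \]

such that

\[ \mathrm{Trace}(\rho_{\mathcal{M}}(\mathrm{Frob}_r)) \equiv T_r \bmod \mathcal{M}, \qquad \det(\rho_{\mathcal{M}}(\mathrm{Frob}_r)) \equiv r \bmod \mathcal{M} \]

for all primes $r \nmid \ell N$.

The semisimplicity of $\rho_{\mathcal{M}}$ means that either
$\rho_{\mathcal{M}}$ is irreducible or it is the sum of two one-dimensional
representations.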
In general, let $A$ be either $\mathcal{O}_\ell$ or a finite field. If

\[ \rho : G \longrightarrow GL_2(A) \]
is a semisimple representation, then we say that $\rho$ is modular of level $N$
if there exists a homomorphism

\[ \pi : \mathbf{T} \longrightarrow A \]

such that

\[ \mathrm{Trace}(\rho(\mathrm{Frob}_r)) = \pi(T_r), \qquad \det(\rho(\mathrm{Frob}_r)) = \pi(r) \]

for all $r \nmid \ell N$. This says that $\rho$ is equivalent to a representation
coming from one of the above constructions.
When $A = \mathbf{T}/\mathcal{M}$, the homomorphism $\pi$ is the map
$\mathbf{T} \to \mathbf{T}/\mathcal{M}$.
When $f = \sum b_n q^n$ is a normalized eigenform and $A = \mathcal{O}_\ell$,
recall that $T_r(f) = b_r f$ for all $r$. This gives a homomorphism
$\pi : \mathbf{T} \to \mathcal{O}_\ell$ (it is possible to regard the
coefficients $b_r$ as elements of a sufficiently large $\mathcal{O}_\ell$).
The way to obtain maximal ideals $\mathcal{M}$ of $\mathbf{T}$ is to use a
normalized eigenform to get a map $\mathbf{T} \to \mathcal{O}_\ell$, then map
$\mathcal{O}_\ell$ to a finite field. The kernel of the map from $\mathbf{T}$ to
the finite field is a maximal ideal $\mathcal{M}$.
When $A$ is a finite field, the level $N$ of the representation $\rho$ is not
unique. In fact, a key result of Ribet (see Section 15.3) analyzes how the level
can be changed. Also, in the definition of modularity in this case, we should
allow modular forms of weight $k \ge 2$ (this means that the factor
$(c\tau + d)^2$ in (15.5) is replaced by $(c\tau + d)^k$). However, this more
general situation can be ignored for the present purposes.
If $\rho$ is a modular representation of some level, and $c \in G$ is complex
conjugation (regard $\overline{\mathbf{Q}}$ as a subfield of $\mathbf{C}$) then
it can be shown that $\det(\rho(c)) = -1$. This says that $\rho$ is an odd
representation. A conjecture of Serre [105], which was a motivating force for
much of the work described in this chapter, predicts that (under certain mild
hypotheses) odd representations in the finite field case are modular (where we
need to allow modular forms of weight $k \ge 2$ in the definition of
modularity). Serre also predicts the level and the weight of a modular form that
yields the representation.
Finally, there is a type of representation, called finite, that plays an
important role in Ribet’s proof. Let $p$ be a prime. We can regard the Galois
group for the $p$-adics as a subgroup of the Galois group for $\mathbf{Q}$:

\[ G_p = \mathrm{Gal}(\overline{\mathbf{Q}}_p/\mathbf{Q}_p) \subset G = \mathrm{Gal}(\overline{\mathbf{Q}}/\mathbf{Q}). \]

There is a natural map from $G_p$ to
$\mathrm{Gal}(\overline{\mathbf{F}}_p/\mathbf{F}_p)$. The kernel is denoted
$I_p$ and is called the inertia subgroup of $G_p$:

\[ G_p/I_p \simeq \mathrm{Gal}(\overline{\mathbf{F}}_p/\mathbf{F}_p). \tag{15.9} \]

A representation

\[ \rho : G \to GL_2(\mathbf{F}_\ell) \]

is said to be unramified at $p$ if $\rho(I_p) = 1$, namely, $I_p$ is contained
in the kernel of $\rho$. If $p \ne \ell$ and $\rho$ is unramified at $p$, then
$\rho$ is said to be finite at $p$.
If $p = \ell$, the definition of finite is much more technical (it involves
finite flat group schemes) and we omit it. However, for the representation
$\rho_\ell$ coming from an elliptic curve, there is the following:

PROPOSITION 15.5
Let $E$ be an elliptic curve defined over $\mathbf{Q}$ and let $\Delta$ be the
minimal discriminant of $E$. Let $\ell$ and $p$ be primes (the case $p = \ell$ is
allowed) and let $\rho_\ell$ be the representation of $G$ on $E[\ell]$. Then
$\rho_\ell$ is finite at $p$ if and only if $v_p(\Delta) \equiv 0 \pmod{\ell}$,
where $v_p$ denotes the $p$-adic valuation (see Appendix A).

For a proof, see [105].
Consider the Frey curve. The minimal discriminant is

\[ \Delta = 2^{-8} (abc)^{2\ell}. \]

Therefore, $v_p(\Delta) \equiv 0 \pmod{\ell}$ for all $p \ne 2$, so $\rho_\ell$
is finite at all odd primes. Moreover, $\rho_\ell$ is not finite at 2.
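
For the reader who wants the valuation computation behind these two statements
written out, it is a routine check using only the formula for the minimal
discriminant of the Frey curve and the fact that the exponent $\ell$ is an odd
prime:

\[
\begin{aligned}
v_p(\Delta) &= 2\ell \, v_p(abc) \equiv 0 \pmod{\ell} && \text{for odd primes } p, \\
v_2(\Delta) &= 2\ell \, v_2(abc) - 8 \equiv -8 \not\equiv 0 \pmod{\ell} && \text{since an odd prime } \ell \text{ does not divide } 8.
\end{aligned}
\]

Proposition 15.5 then gives finiteness at every odd prime and non-finiteness
at 2.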

15.3 Sketch of Ribet’s Proof
The key theorem that Ribet proved is the following.

THEOREM 15.6
Let $\ell \ge 3$ and let

\[ \rho : G \to GL_2(\mathbf{F}_\ell) \]

be an irreducible representation. Assume that $\rho$ is modular of squarefree
level $N$ and that there exists a prime $q \mid N$, $q \ne \ell$, at which $\rho$
is not finite. Suppose $p \mid N$ is a prime at which $\rho$ is finite. Then
$\rho$ is modular of level $N/p$.

In other words, if $\rho$ comes from a modular form of level $N$, then, under
suitable hypotheses, it also comes from a modular form of level $N/p$.

COROLLARY 15.7
$E_{\mathrm{Frey}}$ cannot be modular.

PROOF Since there are no solutions to the Fermat equation, and hence
no Frey curves, when $\ell = 3$, we may assume $\ell \ge 5$. If
$E_{\mathrm{Frey}}$ is modular, then the associated representation $\rho_\ell$
is modular of some level $N$. Since $E_{\mathrm{Frey}}$ is
semistable, (15.7) says that

\[ N = \prod_{p \mid abc} p. \]

It can be shown that $\rho_\ell$ is irreducible when $\ell \ge 5$ (see [105],
where it is obtained as a corollary of Mazur’s theorem (Theorem 8.11)). Let
$q = 2$ in Ribet’s theorem. As we showed at the end of Section 15.2, $\rho_\ell$
is not finite at 2 and is finite at all other primes. Therefore, Ribet’s theorem
allows us to remove the odd primes from $N$ one at a time. We eventually find
that $\rho_\ell$ is modular of level 2. This means that there is a normalized
cusp form of weight 2 for $\Gamma_0(2)$ such that $\rho_\ell$ is the associated
mod $\ell$ representation. But there are no nonzero cusp forms of weight 2 for
$\Gamma_0(2)$, so we have a contradiction. Therefore, $E_{\mathrm{Frey}}$ cannot
be modular.
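
The last step of this proof can be made concrete with a short computation. The
sketch below relies on two standard facts that are not stated in the text and
are brought in here as assumptions: the dimension of the space of weight 2 cusp
forms for $\Gamma_0(N)$ equals the genus of $X_0(N)$, and that genus is given by
the usual formula in terms of the index, the elliptic points, and the cusps. The
starting level $2 \cdot 3 \cdot 5 \cdot 7$ is a made-up squarefree level used
only to show the loop.

```python
from fractions import Fraction
from math import gcd


def prime_divisors(n):
    primes, p = [], 2
    while p * p <= n:
        if n % p == 0:
            primes.append(p)
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:
        primes.append(n)
    return primes


def euler_phi(n):
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)


def genus_x0(N):
    """Genus of X_0(N): 1 + mu/12 - nu2/4 - nu3/3 - cusps/2 (standard formula)."""
    primes = prime_divisors(N)
    mu = Fraction(N)                       # index of Gamma_0(N) in SL_2(Z)
    for p in primes:
        mu *= Fraction(p + 1, p)
    nu2 = 0 if N % 4 == 0 else 1           # elliptic points of order 2
    nu3 = 0 if N % 9 == 0 else 1           # elliptic points of order 3
    for p in primes:
        nu2 *= 1 + (0 if p == 2 else (1 if p % 4 == 1 else -1))
        nu3 *= 1 + (0 if p == 3 else (1 if p % 3 == 1 else -1))
    cusps = sum(euler_phi(gcd(d, N // d)) for d in range(1, N + 1) if N % d == 0)
    g = 1 + mu / 12 - Fraction(nu2, 4) - Fraction(nu3, 3) - Fraction(cusps, 2)
    return int(g)


level = 2 * 3 * 5 * 7                      # hypothetical squarefree level
for p in [q for q in prime_divisors(level) if q != 2]:
    level //= p                            # one application of Ribet's theorem
    print("level lowered to", level)
print("dim S(2) =", genus_x0(2))           # no nonzero cusp forms of level 2
```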

COROLLARY 15.8
The Taniyama-Shimura-Weil conjecture (for semistable elliptic curves) implies
Fermat’s Last Theorem.

PROOF We may restrict to prime exponents $\ell \ge 5$. If there is a nontrivial
solution to the Fermat equation for $\ell$, then the Frey curve exists. However,
Corollary 15.7 and the Taniyama-Shimura-Weil conjecture imply that the Frey
curve cannot exist. Therefore, there are no nontrivial solutions to the Fermat
equation.

We now give a brief sketch of the proof of Ribet’s theorem. The proof uses
the full power of Grothendieck’s algebraic geometry and is not elementary.
Therefore, we give only a sampling of some of the ideas that go into the proof.
For more details, see [90], [89], [85], [29].
We assume that $\rho$ is as in Theorem 15.6 and that $N$ is chosen so that

1. $\rho$ is modular of squarefree level $N$,

2. both $p$ and $q$ divide $N$,

3. $\rho$ is finite at $p$ but is not finite at $q$.

The goal is to show that $p$ can be removed from $N$. The main ingredient
of the proof is a relation between Jacobians of modular curves and Shimura
curves. In the following, we describe modular curves and Shimura curves and
give a brief indication of how they occur in Ribet’s proof.

Modular curves
Recall that $SL_2(\mathbf{Z})$ acts on the upper half plane $\mathcal{H}$ by
linear fractional transformations:

\[ \begin{pmatrix} a & b \\ c & d \end{pmatrix} \tau = \frac{a\tau + b}{c\tau + d}. \]

The fundamental domain $\mathcal{F}$ for this action is described in Section 9.3.
The subgroup $\Gamma_0(N)$ (defined by the condition that $c \equiv 0 \pmod{N}$)
also acts on $\mathcal{H}$. The modular curve $X_0(N)$ is defined over
$\mathbf{C}$ by taking the upper half plane modulo the action of $\Gamma_0(N)$,
and then adding finitely many points, called cusps, to make $X_0(N)$ compact. We
obtain a fundamental domain $D$ for $\Gamma_0(N)$ by writing

\[ SL_2(\mathbf{Z}) = \bigcup_i \gamma_i \Gamma_0(N) \]

for some coset representatives $\gamma_i$ and letting
$D = \bigcup_i \gamma_i^{-1} \mathcal{F}$. Certain edges of this fundamental
domain are equivalent under the action of $\Gamma_0(N)$. When equivalent edges
are identified, the fundamental domain gets bent around to form a surface. There
is a hole in the surface corresponding to $i\infty$, and there are also finitely
many holes corresponding to points where the fundamental domain touches the real
axis. These holes are filled in by points, called cusps, to obtain $X_0(N)$. It
can be shown that $X_0(N)$ can be represented as an algebraic curve defined over
$\mathbf{Q}$.
Figure 15.1 gives a fundamental domain for $\Gamma_0(2)$. The three pieces are
obtained as $\gamma_i^{-1} \mathcal{F}$, where

\[ \gamma_1 = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \qquad
\gamma_2 = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}, \qquad
\gamma_3 = \begin{pmatrix} 1 & 1 \\ -1 & 0 \end{pmatrix}. \]
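
As a sanity check on the coset decomposition, the following Python sketch
verifies that the three matrices for $\Gamma_0(2)$ have determinant 1, lie in
distinct left cosets, and exhaust the cosets. It uses the standard
identification, assumed here rather than taken from the text, of the left cosets
$\gamma \Gamma_0(N)$ with first columns of $\gamma$ taken mod $N$ up to units.
The matrix entries are our reading of the display above.

```python
from math import gcd

GAMMAS = [((1, 0), (0, 1)), ((0, -1), (1, 0)), ((1, 1), (-1, 0))]


def det(g):
    return g[0][0] * g[1][1] - g[0][1] * g[1][0]


def same_coset(g, h, N):
    """True if g*Gamma_0(N) == h*Gamma_0(N): lower-left entry of g^-1 h is 0 mod N."""
    return (g[0][0] * h[1][0] - g[1][0] * h[0][0]) % N == 0


def index_gamma0(N):
    """Number of left cosets of Gamma_0(N), counted as points of P^1(Z/NZ)."""
    classes = set()
    for a in range(N):
        for c in range(N):
            if gcd(gcd(a, c), N) != 1:
                continue                   # (a, c) is not a primitive column mod N
            orbit = frozenset(
                (u * a % N, u * c % N) for u in range(1, N + 1) if gcd(u, N) == 1
            )
            classes.add(orbit)
    return len(classes)


N = 2
print([det(g) for g in GAMMAS])            # all determinants
print(index_gamma0(N), len(GAMMAS))        # index and number of representatives
print(any(same_coset(GAMMAS[i], GAMMAS[j], N)
          for i in range(3) for j in range(i + 1, 3)))
```

The index 3 matches the three pieces of the fundamental domain.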
The modular curve $X_0(N)$ has another useful description, which works over
arbitrary fields $K$ with the characteristic of $K$ not dividing $N$. Consider
pairs $(E, C)$, where $E$ is an elliptic curve (defined over the algebraic
closure $\overline{K}$) and $C$ is a cyclic subgroup of $E(\overline{K})$ of
order $N$. The set of such pairs is in one-to-one correspondence with the
noncuspidal points of $X_0(N)(\overline{K})$. Of course, it is not obvious that
this collection of pairs can be given the structure of an algebraic curve in a
natural way. This takes some work.

[Figure 15.1: A Fundamental Domain for $\Gamma_0(2)$]

Example 15.1
When $K = \mathbf{C}$, we can see this one-to-one correspondence as follows. An
elliptic curve can be represented as

\[ E_\tau = \mathbf{C}/(\mathbf{Z}\tau + \mathbf{Z}), \]

with $\tau \in \mathcal{H}$, the upper half plane. The set

\[ C_\tau = \left\{ 0, \frac{1}{N}, \ldots, \frac{N-1}{N} \right\} \]

is a cyclic subgroup of $E_\tau$ of order $N$. Let

\[ \gamma = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \Gamma_0(N) \]

and let

\[ \gamma\tau = \frac{a\tau + b}{c\tau + d}. \]

Since

\[ \mathbf{Z}\tau + \mathbf{Z} = \mathbf{Z}(a\tau + b) + \mathbf{Z}(c\tau + d) = (c\tau + d)(\mathbf{Z}\gamma\tau + \mathbf{Z}), \]

there is an isomorphism

\[ f_\gamma : \mathbf{C}/(\mathbf{Z}\tau + \mathbf{Z}) \longrightarrow \mathbf{C}/(\mathbf{Z}\gamma\tau + \mathbf{Z}) \]

given by

\[ f_\gamma(z) = z/(c\tau + d). \]

This isomorphism between $E_\tau$ and $E_{\gamma\tau}$ maps the point $k/N$ to

\[ \frac{k}{N(c\tau + d)} = \frac{ka}{N} - k \, \frac{c}{N} \, \frac{a\tau + b}{c\tau + d} \equiv \frac{ka}{N} \mod \mathbf{Z}\gamma\tau + \mathbf{Z} \]

(we have used the fact that $c \equiv 0 \pmod{N}$). Therefore, the subgroup
$C_\tau$ of $E_\tau$ is mapped to the corresponding subgroup $C_{\gamma\tau}$ of
$E_{\gamma\tau}$, so $f_\gamma$ maps the pair $(E_\tau, C_\tau)$ to the pair
$(E_{\gamma\tau}, C_{\gamma\tau})$. We conclude that if
$\tau_1, \tau_2 \in \mathcal{H}$ are equivalent under the action of
$\Gamma_0(N)$, then the corresponding pairs $(E_{\tau_j}, C_{\tau_j})$ are
isomorphic. It is not hard to show that, conversely, if the pairs are isomorphic
then the corresponding $\tau_j$’s are equivalent under $\Gamma_0(N)$. Moreover,
every pair $(E, C)$ of an elliptic curve over $\mathbf{C}$ and a cyclic subgroup
$C$ of order $N$ is isomorphic to a pair $(E_\tau, C_\tau)$ for some
$\tau \in \mathcal{H}$. Therefore, the set of isomorphism classes of these pairs
is in one-to-one correspondence with the points of $\mathcal{H}$ mod the action
of $\Gamma_0(N)$. These are the noncuspidal points of $X_0(N)$.
Of course, over arbitrary fields, we cannot work with the upper half plane
$\mathcal{H}$, and it is much more difficult to show that the pairs $(E, C)$ can
be collected together as the points on a curve $X_0(N)$. However, when this is
done, it yields a convenient way to work with the modular curve $X_0(N)$ and its
reductions mod primes.
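
The key step in Example 15.1, that $f_\gamma$ sends $k/N$ to a point congruent to
$ka/N$ modulo the lattice $\mathbf{Z}\gamma\tau + \mathbf{Z}$, can be checked
numerically. The values of $N$, $\gamma$, and $\tau$ below are arbitrary choices
made for illustration, and the comparison is done in floating point, so it is a
plausibility check rather than a proof.

```python
def lattice_coordinates(w, omega):
    """Real numbers (m, n) with w = m * omega + n, for non-real omega."""
    m = w.imag / omega.imag
    n = (w - m * omega).real
    return m, n


N = 5
a, b, c, d = 2, 1, 5, 3                    # hypothetical element of Gamma_0(5)
assert a * d - b * c == 1 and c % N == 0

tau = complex(0.3, 1.1)                    # arbitrary point of the upper half plane
gamma_tau = (a * tau + b) / (c * tau + d)

for k in range(N):
    image = (k / N) / (c * tau + d)        # f_gamma(k/N)
    m, n = lattice_coordinates(image - k * a / N, gamma_tau)
    # image - ka/N lies in Z*gamma_tau + Z exactly when m and n are integers
    print(k, round(m), round(n),
          abs(m - round(m)) < 1e-9 and abs(n - round(n)) < 1e-9)
```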

For a nonsingular algebraic curve $C$ over a field $K$, let $J(C)$ be the
divisors (over $\overline{K}$) of degree 0 modulo divisors of functions. It is
possible to represent $J(C)$ as an algebraic variety, called the Jacobian of $C$.
When $C$ is an elliptic curve $E$, we showed (Corollary 11.4; see also the
sequence (9.3)) that $J(E)$ is a group isomorphic to $E(\overline{K})$. When
$K = \mathbf{C}$, we thus obtained a torus. In general, if $K = \mathbf{C}$ and
$C$ is a curve of genus $g$, then $J(C)$ is isomorphic to a higher dimensional
torus, namely, $\mathbf{C}^g$ mod a lattice of rank $2g$. The Jacobian of
$X_0(N)$ is denoted $J_0(N)$.
The Jacobian $J_0(N)$ satisfies various functorial properties. In particular, a
nonconstant map $\phi : X_0(N) \to E$ induces a map $\phi^* : E \to J_0(N)$
obtained by mapping a point $P$ of $E$ to the divisor on $X_0(N)$ formed by the
sum of the inverse images of $P$ minus the inverse images of $\infty \in E$:

\[ \phi^* : P \longmapsto \sum_{\phi(Q) = P} [Q] - \sum_{\phi(R) = \infty} [R]. \]

Therefore, we can map $E$ to a subgroup of $J_0(N)$ (this map might have a
nontrivial, but finite, kernel).
An equivalent formulation of the modularity of $E$ is to say that there is a
nonconstant map from $X_0(N)$ to $E$ and therefore that $E$ is isogenous to an
elliptic curve contained in some $J_0(N)$.
If $p$ is a prime dividing $N$, there are two natural maps
$X_0(N) \to X_0(N/p)$. If $(E, C)$ is a pair corresponding to a point in
$X_0(N)$, then there is a unique subgroup $C' \subset C$ of order $N/p$. So we
have a map

\[ \alpha : (E, C) \longmapsto (E, C'). \tag{15.10} \]

However, there is also a unique subgroup $P \subset C$ of order $p$. It can be
shown that $E/P$ is an elliptic curve and therefore $(E/P, C/P)$ is a pair
corresponding to a point on $X_0(N/p)$. This gives a map

\[ \beta : (E, C) \longmapsto (E/P, C/P). \tag{15.11} \]

These two maps can be interpreted in terms of the complex model of $X_0(N)$.
Since $\Gamma_0(N) \subset \Gamma_0(N/p)$, we can map $\mathcal{H}$ mod
$\Gamma_0(N)$ to $\mathcal{H}$ mod $\Gamma_0(N/p)$ by
mapping the equivalence class of $\tau$ mod $\Gamma_0(N)$ to the equivalence
class of $\tau$ mod $\Gamma_0(N/p)$. This corresponds to the map $\alpha$. The
map $\beta$ can be shown to correspond to the map $\tau \mapsto p\tau$. Note that
these two maps represent the two methods of using modular forms for
$\Gamma_0(N/p)$ to produce oldforms for $\Gamma_0(N)$.
The Hecke algebra $\mathbf{T}$ acts on $J_0(N)$. Let $P$ be a point on $X_0(N)$.
Recall that $P$ corresponds to a pair $(E, C)$, where $E$ is an elliptic curve
and $C$ is a cyclic subgroup of order $N$. Let $p$ be a prime. For each subgroup
$D$ of $E$ of order $p$ with $D \not\subseteq C$, we can form the pair
$(E/D, (C + D)/D)$. It can be shown that $E/D$ is an elliptic curve and
$(C + D)/D$ is a cyclic subgroup of order $N$. Therefore, this pair represents a
point on $X_0(N)$. Define

\[ T_p([(E, C)]) = \sum_D [(E/D, (C + D)/D)] \in \mathrm{Div}(X_0(N)), \]

where the sum is over those $D$ of order $p$ with $D \not\subseteq C$ and where
$\mathrm{Div}(X_0(N))$ denotes the divisors of $X_0(N)$ (see Chapter 11). It is
not hard to show that this corresponds to the formulas for $T_p$ given in (15.6).
Clearly $T_p$ maps divisors of degree 0 to divisors of degree 0, and it can be
shown that it maps principal divisors to principal divisors. Therefore, $T_p$
gives a map from $J_0(N)$ to itself. This yields an action of $\mathbf{T}$ on
$J_0(N)$, and these endomorphisms are defined over $\mathbf{Q}$.
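
The number of terms in the sum defining $T_p$ mirrors the two cases of (15.6).
The sketch below counts the subgroups $D$ in the model
$E[p] \simeq \mathbf{Z}_p \oplus \mathbf{Z}_p$ of Theorem 3.2. When $p \mid N$,
the cyclic group $C$ contains exactly one subgroup of order $p$, and placing it
on the line spanned by $(1, 0)$ is a choice of coordinates made only for this
illustration.

```python
def subgroups_of_order_p(p):
    """All subgroups of order p in (Z/pZ)^2, each as a frozenset of elements."""
    subgroups = set()
    for x in range(p):
        for y in range(p):
            if (x, y) != (0, 0):
                subgroups.add(
                    frozenset(((k * x) % p, (k * y) % p) for k in range(p))
                )
    return subgroups


def hecke_degree(p, N):
    """Number of subgroups D of order p with D not contained in C."""
    all_D = subgroups_of_order_p(p)
    if N % p != 0:
        return len(all_D)                  # C has no elements of order p
    inside_C = frozenset((k % p, 0) for k in range(p))
    return len(all_D - {inside_C})         # drop the one subgroup lying in C


print(hecke_degree(3, 11))                 # p does not divide N: p + 1 terms
print(hecke_degree(11, 11))                # p divides N: p terms
```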
Let $\alpha \in \mathbf{T}$ and let $J_0(N)[\alpha]$ denote the kernel of
$\alpha$ on $J_0(N)$. More generally, let $I$ be an ideal of $\mathbf{T}$. Define

\[ J_0(N)[I] = \bigcap_{\alpha \in I} J_0(N)[\alpha]. \]

For example, when $I = n\mathbf{T}$ for an integer $n$, then $J_0(N)[I]$ is just
$J_0(N)[n]$, the $n$-torsion on $J_0(N)$.
Now let’s consider the representation $\rho$ of Theorem 15.6. Since $\rho$ is
assumed to be modular, it corresponds to a maximal ideal $\mathcal{M}$ of
$\mathbf{T}$. Let $\mathbf{F} = \mathbf{T}/\mathcal{M}$, which is a finite field.
Then $W = J_0(N)[\mathcal{M}]$ has an action of $\mathbf{F}$, which means that it
is a vector space over $\mathbf{F}$. Let $\ell$ be the characteristic of
$\mathbf{F}$. Since $\ell = 0$ in $\mathbf{F}$, it follows that

\[ W \subseteq J_0(N)[\ell], \]

the $\ell$-torsion of $J_0(N)$. Since $G$ acts on $W$, we see that $W$ yields a
representation $\rho'$ of $G$ over $\mathbf{F}$. It can be shown that $\rho'$ is
equivalent to $\rho$, so we can regard the representation space for $\rho$ as
living inside the $\ell$-torsion of $J_0(N)$. This has great advantages. For
example, if $M \mid N$ then there are natural maps $X_0(N) \to X_0(M)$. These
yield (just as for the map $X_0(N) \to E$ above) maps $J_0(M) \to J_0(N)$.
Showing that the level can be reduced from $N$ to $M$ is equivalent to showing
that this representation space lives in these images of $J_0(M)$. Also, we are
now working with a representation that lives inside a fairly concrete object,
namely the $\ell$-torsion of an abelian variety, rather than a more abstract
situation, so we have more control over $\rho$.

Shimura curves
We now need to introduce what are known as Shimura curves. Recall that
in Section 10.2 we deļ¬ned quaternion algebras as (noncommutative) rings of
the form
Q = Q + QĪ± + QĪ² + QĪ±Ī²,
where
Ī±2 , Ī² 2 ā Q, Ī²Ī± = ā’Ī±Ī².
We omit the requirement from Section 10.2 that Ī±2 < 0 and Ī² 2 < 0 since
we want to consider indeļ¬nite quaternion algebras as well. Let r be a prime
(possibly ā) and let Qr be the ring obtained by allowing r-adic coeļ¬cients
in the deļ¬nition of Q. As we mentioned in Section 10.2, there is a ļ¬nite set
of primes r, called the ramiļ¬ed primes, for which Qr has no zero divisors. On
the other hand, when r is unramiļ¬ed, Qr is isomorphic to M2 (Qr ), the ring
of 2 Ć— 2 matrices with r-adic entries.
Given two distinct primes $p$ and $q$, there is a quaternion algebra $\mathcal{B}$ that is
ramified exactly at $p$ and $q$. In particular, $\mathcal{B}$ is unramified at $\infty$, so
$$\mathcal{B}_\infty = M_2(\mathbf{R}).$$
Corresponding to the integer $M = N/pq$, there is an order $\mathcal{O} \subset \mathcal{B}$, called
an Eichler order of level $M$ (an order in $\mathcal{B}$ is a subring of $\mathcal{B}$ that has rank 4
as an additive abelian group; see Section 10.2). Regarding $\mathcal{O}$ as a subset of
$\mathcal{B}_\infty = M_2(\mathbf{R})$, define
$$\Gamma_\infty = \mathcal{O} \cap SL_2(\mathbf{R}).$$
Then $\Gamma_\infty$ acts on $\mathcal{H}$ by linear fractional transformations. The Shimura curve
$C$ is defined to be $\mathcal{H}$ modulo $\Gamma_\infty$.
There is another description of $C$, analogous to the one given above for
$X_0(N)$. Let $\mathcal{O}_{\max}$ be a maximal order in $\mathcal{B}$. Consider pairs $(A, B)$, where $A$
is a two-dimensional abelian variety (these are algebraic varieties that, over
$\mathbf{C}$, can be described as $\mathbf{C}^2$ mod a rank 4 lattice) and $B$ is a subgroup of
$A$ isomorphic to $\mathbf{Z}_M \oplus \mathbf{Z}_M$. We restrict our attention to those pairs such
that $\mathcal{O}_{\max}$ is contained in the endomorphism ring of $A$ and such that $\mathcal{O}_{\max}$
maps $B$ to $B$. When we are working over $\mathbf{C}$, such pairs are in one-to-one
correspondence with the points on $C$. In general, over arbitrary fields, such
pairs correspond in a natural way to points on an algebraic curve, which we
again denote $C$.
Let $J$ be the Jacobian of $C$. The description of $C$ in terms of pairs $(A, B)$
means that we can define an action of the Hecke operators on $J$, similarly to
what we did for the modular curves.
Let $J[\ell]$ be the $\ell$-torsion of the Jacobian $J$ of $C$. It can be shown that
the representation $\rho$ occurs in $J[\mathfrak{M}]$, so there is a space $V$ isomorphic to the
representation space $W$ of $\rho$ with
$$V \subseteq J[\mathfrak{M}] \subseteq J[\ell].$$
We now have the representation $\rho$ living in $J_0(N)[\ell]$ and in $J[\ell]$. The
representation $\rho$ can be detected using the reduction of $J_0(N)$ mod $q$ and also
using the reduction of $J$ mod $p$, and Ribet uses a calculation with quaternion
algebras to establish a relationship between these two reductions. This
relationship allows him to show that $p$ can be removed from the level $N$.

REMARK 15.9 A correspondence between modular forms for $GL_2$ and
modular forms for the multiplicative group of a quaternion algebra plays a
major role in work of Jacquet-Langlands. This indicates a relation between
$J_0(N)$ and $J$. In fact, there is a surjection from $J_0(N)$ to $J$. However, this
map is not being used in the present case since such a map would relate the
reduction of $J_0(N)$ mod $q$ to the reduction of $J$ mod $q$. Instead, Ribet works
with the reduction of $J_0(N)$ mod $q$ and the reduction of $J$ mod $p$. This switch
between $p$ and $q$ is a major step in the proof of Ribet's theorem.

15.4 Sketch of Wiles's Proof
In this section, we outline the proof that all semistable elliptic curves over
$\mathbf{Q}$ are modular. For more details, see [29], [32], [118], [133]. Let $E$ be a
semistable elliptic curve and let
$$f_E = \sum_{n \ge 1} a_n q^n$$
be the associated potential modular form. We want to prove that $f_E$ is a
modular form (for some $\Gamma_0(N)$).
Suppose we have two potential modular forms
$$f = \sum_{n \ge 1} c_n q^n, \qquad g = \sum_{n \ge 1} c_n' q^n$$
arising from Galois representations $G \to GL_2(\mathcal{O}_p)$ (where $\mathcal{O}_p$ is some ring
containing the $p$-adic integers; we assume that all of the coefficients $c_n, c_n'$
are embedded in $\mathcal{O}_p$). Let $\tilde{p}$ be the prime above $p$ in $\mathcal{O}_p$. (If $\mathcal{O}_p$ is the ring of
$p$-adic integers, then $\tilde{p} = p$.) If $c_\ell \equiv c_\ell' \pmod{\tilde{p}}$ for almost all primes $\ell$ (that
is, we allow finitely many exceptions), then we write
$$f \equiv g \pmod{\tilde{p}}.$$
This means that the Galois representations mod $\tilde{p}$ associated to $f$ and $g$ are
equivalent.
The following result of Langlands and Tunnell gives us a place to start.

THEOREM 15.10
Let $E$ be an elliptic curve defined over $\mathbf{Q}$ and let $f_E = \sum_{n \ge 1} a_n q^n$ be the
associated potential modular form. There exists a modular form
$$g_0 = \sum_{n \ge 1} b_n q^n$$
such that
$$a_\ell \equiv b_\ell \pmod{\tilde{3}}$$
for almost all primes $\ell$ (that is, with possibly finitely many exceptions), and
where $\tilde{3}$ denotes a prime of $\mathcal{O}_3$.

Recall that $\mathcal{O}_3$ denotes an unspecified ring containing the 3-adic integers.
If $\mathcal{O}_3$ is sufficiently large, the coefficients $b_\ell$, which are algebraic integers, can
be regarded as lying in $\mathcal{O}_3$.
The reason that 3 is used is that the group $GL_2(\mathbf{F}_3)$ has order 48, hence
is solvable. The representation $\rho_3$ of $G$ on $E[3]$ therefore has its image in
a solvable group. The techniques of base change developed in the Langlands
program apply to cyclic groups, hence to solvable groups, and these techniques
are the key to proving the result. The groups $GL_2(\mathbf{F}_p)$ for $p \ge 5$ are not
solvable, so the base change techniques do not apply. On the other hand, the
representation $\rho_2$ for the Galois action on $E[2]$ is trivial for the Frey curves
since the 2-torsion is rational for these curves. Therefore, it is not expected
that $\rho_2$ should yield any information.
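The order 48 can be confirmed by brute force. The following Python sketch counts the $2 \times 2$ matrices with nonzero determinant over the field with $p$ elements and compares the count with the standard formula $(p^2 - 1)(p^2 - p)$. The function name `gl2_order` is an illustrative choice. The count alone does not prove solvability; it only checks the group order quoted above.

```python
from itertools import product

def gl2_order(p):
    """Count invertible 2x2 matrices over the field with p elements (p prime)."""
    return sum(
        1
        for a, b, c, d in product(range(p), repeat=4)
        if (a * d - b * c) % p != 0
    )

for p in (2, 3, 5):
    # Closed form: choose a nonzero first row, then a second row not a multiple of it.
    assert gl2_order(p) == (p**2 - 1) * (p**2 - p)
    print(p, gl2_order(p))
```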
Note that the modular form $g_0$ does not necessarily have rational coefficients.
Therefore, $g_0$ is not necessarily the modular form associated to an
elliptic curve. Throughout Wiles's proof, Galois representations associated to
arbitrary modular forms are used.
The result of Langlands and Tunnell leads us to consider the following.

GENERAL PROBLEM
Fix a prime $p$. Let $g = \sum_{n \ge 1} a_n q^n$ be a potential modular form (associated
to a 2-dimensional Galois representation). Suppose there is a modular form
$g_0 = \sum_{n \ge 1} b_n q^n$ such that $g \equiv g_0 \pmod{\tilde{p}}$. Can we prove that $g$ is a modular
form?

The work of Wiles shows that the answer to the general problem is often
yes. Let $A$ be the set of all potential modular forms $g$ with $g \equiv g_0 \pmod{\tilde{p}}$
(subject to certain restrictions). Let $M \subseteq A$ be the set of modular $g$'s in $A$.
We are assuming that $g_0 \in M$. The basic idea is the following. Let $T_A$ be the
tangent space to $A$ at $g_0$ and let $T_M$ be the tangent space to $M$ at $g_0$. The
goal is to show that $T_A = T_M$. Wiles shows that the spaces $A$ and $M$ are nice
enough that the equality of tangent spaces suffices to imply that $A = M$.

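[Figure: the space $A$ containing the subspace $M$, with the tangent spaces $T_A$ and $T_M$ drawn at the base point $g_0$.]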

Figure 15.2
Tangent Spaces

Example 15.2
Let $E$ be given by
$$y^2 + xy + y = x^3 - x^2 - 171x + 1904.$$
This curve has multiplicative reduction at 17 and 37 and good reduction at
all other primes. Therefore, $E$ is semistable. The minimal discriminant of $E$
is $\Delta = -17 \cdot 37^5$. Since $E$ is semistable, the conductor of $E$ is $N = 17 \cdot 37$.
Therefore, we expect that $g_E$ is a modular form for $\Gamma_0(17 \cdot 37)$. Counting
points on $E$ mod $\ell$ for various $\ell$ yields the following values for $a_\ell$ (we ignore
the bad prime 17):

$\ell$       2    3    5    7   11   13   17   19   23
$a_\ell$    -1    0    3   -1   -5   -2    -    1   -6

Therefore,
$$g_E = q - q^2 + 0 \cdot q^3 - q^4 + 3q^5 + \cdots.$$
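The point counts can be reproduced with a short brute-force computation. The following Python sketch assumes the usual convention $a_\ell = \ell + 1 - \#E(\mathbf{F}_\ell)$, where the count includes the point at infinity; the function name `a_ell` and the list of primes are illustrative choices. It is only practical for small $\ell$, since it tests every pair $(x, y)$.

```python
def a_ell(ell):
    """Return a_ell = ell + 1 - #E(F_ell) for the curve of this example."""
    affine = 0
    for x in range(ell):
        rhs = (x**3 - x**2 - 171 * x + 1904) % ell
        for y in range(ell):
            # Test the equation y^2 + xy + y = x^3 - x^2 - 171x + 1904 mod ell.
            if (y * y + x * y + y - rhs) % ell == 0:
                affine += 1
    return ell + 1 - (affine + 1)  # the extra 1 is the point at infinity

good_primes = [2, 3, 5, 7, 11, 13, 19, 23]  # 17 is a bad prime and is skipped
values = {ell: a_ell(ell) for ell in good_primes}
print(values)
```

The printed dictionary can be compared entry by entry with the tabulated values of $a_\ell$.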
There is a modular form
$$g_0 = \sum_{n \ge 1} b_n q^n = q - q^2 + 0 \cdot q^3 - q^4 - 2q^5 + \cdots$$
for $\Gamma_0(17)$. The first few values of $b_\ell$ are as follows:

$\ell$       2    3    5    7   11   13   17   19   23
$b_\ell$    -1    0   -2    4    0   -2    -   -4    4

It can be shown that $a_\ell \equiv b_\ell \pmod 5$ for all $\ell \ne 17, 37$ (we ignore these
bad primes), so
$$g_E \equiv g_0 \pmod 5.$$
Can we prove that $g_E$ is a modular form?
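For the primes listed in the two tables, the congruence mod 5 is easy to confirm directly. The following Python sketch uses the tabulated values as we read them from the text (the dictionary names `a` and `b` are illustrative); it checks only these finitely many primes and says nothing about the general statement.

```python
# Values transcribed from the two tables in the text (the bad prime 17 is omitted).
a = {2: -1, 3: 0, 5: 3, 7: -1, 11: -5, 13: -2, 19: 1, 23: -6}
b = {2: -1, 3: 0, 5: -2, 7: 4, 11: 0, 13: -2, 19: -4, 23: 4}

# Collect the primes at which a_ell and b_ell differ mod 5.
mismatches = [ell for ell in a if (a[ell] - b[ell]) % 5 != 0]
print(mismatches)  # an empty list means agreement mod 5 at every listed prime
```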
Let $A$ be the set of all potential modular forms $g$ with $g \equiv g_0 \pmod 5$ and
where the level $N$ for $g$ is allowed to contain only the primes 5, 17, 37 in its
factorization. There is also a technical condition, which we omit, on the ring
generated by the coefficients of $g$. The subspace $M$ of true modular forms
contains $g_0$. Here are pictures of $A$ and $M$:

ā¢ ā¢
A:
g0 gE

ā¢ ā¢ ā¢
M: or
g0 g0 gE
Therefore, our intuitive picture given in Figure 15.2 is not quite accurate.
In particular, the sets $A$ and $M$ are finite. However, by reinterpreting the
geometric picture algebraically, we can still discuss tangent spaces.

Since the sets $A$ and $M$ are finite, why not count the elements in both sets
and compare? First of all, this seems to be hard to do. Secondly, the tangent
spaces yield enough information. Consider the following situation. Suppose
you arrive at a train station in a small town. There are no signs telling you
which town it is, but you know it must be either I or II. You have the maps
given in Figure 15.3, where the large dot in the center indicates the station.

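[Figure: street maps of two small towns, labeled I and II, each with the station marked by a large dot.]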

Figure 15.3
Two Small Towns

By counting the streets emanating from the station, you can immediately
determine which town you are in. The reason is that you have a base point. If
you didn't, then you might be on any of the vertices of I or II. You would not
be able to count streets and identify the town. The configuration of streets at
the station is the analogue of the tangent space at the base point. Of course,
it is possible that two towns could have the same tangent spaces, but Wiles
shows that this does not happen in his situation.

Tangent spaces
We now want to translate the notion of a tangent space into a useful algebraic
formulation. Let $\mathbf{R}[x, y]$ be the ring of polynomials in two variables and
let $f(x, y) \in \mathbf{R}[x, y]$. We can regard $f$ as a function from the $xy$-plane to $\mathbf{R}$.
Restricting $f$ to the parabola $y = x^2 - 6x$, we obtain a function
$$f : \text{parabola} \longrightarrow \mathbf{R}.$$

If $g(x, y) \in \mathbf{R}[x, y]$, then $f$ and $g$ give the same function on the parabola if
and only if $f - g$ is a multiple of $y + 6x - x^2$. For example, let $f = x^3 - y$ and
$g = 6x + xy + 5x^2$. Then
$$f - g = -(x + 1)(y + 6x - x^2).$$
If we choose a point $(a, b)$ on the parabola, then $b + 6a - a^2 = 0$, so
$$f(a, b) = g(a, b) - (a + 1)(b + 6a - a^2) = g(a, b).$$

Therefore, there is a one-to-one correspondence
$$\{\text{polynomial functions on the parabola}\} \longleftrightarrow \mathbf{R}[x, y]/(y + 6x - x^2).$$
The ring on the right consists of congruence classes of polynomials, where
we say that two polynomials are congruent if their difference is a multiple of
$y + 6x - x^2$. In this way, we have represented a geometric object, the parabola,
by an algebraic object, the ring $\mathbf{R}[x, y]/(y + 6x - x^2)$.
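The example with $f = x^3 - y$ and $g = 6x + xy + 5x^2$ can be checked numerically. The following Python sketch verifies the factorization of $f - g$ on a grid of integer points (enough points to force the polynomial identity, given the low degrees involved) and then confirms that $f$ and $g$ agree at sample points of the parabola. The helper name `h` for the defining polynomial is an illustrative choice.

```python
def f(x, y):
    return x**3 - y

def g(x, y):
    return 6 * x + x * y + 5 * x**2

def h(x, y):
    # Defining polynomial of the parabola: h = 0 exactly when y = x^2 - 6x.
    return y + 6 * x - x**2

# The identity f - g = -(x + 1) * h, checked on an 11 x 11 grid of integer points.
identity_holds = all(
    f(x, y) - g(x, y) == -(x + 1) * h(x, y)
    for x in range(-5, 6)
    for y in range(-5, 6)
)

# On the parabola itself h vanishes, so f and g take the same values there.
agree_on_parabola = all(
    f(a, a * a - 6 * a) == g(a, a * a - 6 * a) for a in range(-10, 11)
)
print(identity_holds, agree_on_parabola)
```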
Now let's consider the tangent line $y + 6x = 0$ at $(0, 0)$. It is obtained by
taking the degree 1 terms in $y + 6x - x^2$. We can represent it by the set
$$\{ax + by \mid a, b \in \mathbf{R}\} \bmod (y + 6x),$$
where we are taking all linear functions and regarding two of them as congruent
if they differ by a multiple of $y + 6x$. Of course, we could have represented
the tangent line by the ring $\mathbf{R}[x, y]/(y + 6x)$, but, since we already know that
the tangent line is defined by a linear equation, we do not lose any information
by replacing $\mathbf{R}[x, y]$ by the linear polynomials $ax + by$.

Now consider the surface
$$y - x^2 + xz + 6x + z = 0.$$
This surface contains the parabola $y = x^2 - 6x$, $z = 0$. The inclusion of the
parabola in the surface corresponds to a surjective ring homomorphism
$$\mathbf{R}[x, y, z]/(y - x^2 + xz + 6x + z) \longrightarrow \mathbf{R}[x, y]/(y + 6x - x^2), \qquad f(x, y, z) \longmapsto f(x, y, 0).$$
We also have a surjective map on the algebraic objects representing the tangent
spaces
$$\{ax + by + cz\} \bmod (y + 6x + z) \longrightarrow \{ax + by\} \bmod (y + 6x)$$
corresponding to the inclusion of the tangent line to the parabola in the tangent
plane for the surface at $(0, 0, 0)$. In this way, we can study relations
between geometric objects by looking at the corresponding algebraic objects.
Wiles works with rings such as $\mathcal{O}_p[[x]]/(x^2 - px)$, where for simplicity we
henceforth assume that $\mathcal{O}_p$ is the $p$-adic integers and where $\mathcal{O}_p[[x]]$ denotes
power series with $p$-adic coefficients. The zeros of $x^2 - px$ are 0 and $p$, so this
ring corresponds to the geometric object

$S_1$: two points, $0$ and $p$.

The tangent space is represented by the set obtained by looking only at the
linear terms, namely $\{ax \mid a \in \mathcal{O}_p\} \bmod (px)$. Since
$$a_1 x \equiv a_2 x \bmod px \iff a_1 \equiv a_2 \pmod p,$$
the tangent space can be identified with $\mathbf{Z}_p$ (the integers mod $p$).
As another example, consider the ring $\mathcal{O}_p[[x]]/(x(x - p)(x - p^3))$, which
corresponds to the geometric object

$S_2$: three points, $0$, $p$, and $p^3$.

The tangent space is $\mathbf{Z}_{p^4}$.
There is an inclusion $S_1 \subset S_2$, which corresponds to the natural ring
homomorphism
$$\mathcal{O}_p[[x]]/(x(x - p)(x - p^3)) \longrightarrow \mathcal{O}_p[[x]]/(x^2 - px).$$
The map on tangent spaces is the map from $\mathbf{Z}_{p^4}$ to $\mathbf{Z}_p$ that takes a number
mod $p^4$ and reduces it mod $p$.
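Both tangent spaces come from the coefficient of $x$ in the defining relation: if that coefficient is $c$, the linear terms $ax$ are taken mod $cx$, so the tangent space is the integers mod $p^v$, where $p^v$ is the exact power of $p$ dividing $c$. The following Python sketch expands the two relations for the illustrative choice $p = 5$ and reads off $v$. The function names are hypothetical helpers, not notation from the text.

```python
def poly_from_roots(roots):
    """Coefficients (constant term first) of the product of (x - r) over the roots."""
    coeffs = [1]
    for r in roots:
        shifted = [0] + coeffs                    # x * current polynomial
        scaled = [-r * c for c in coeffs] + [0]   # -r * current polynomial
        coeffs = [s + t for s, t in zip(shifted, scaled)]
    return coeffs

def valuation(n, p):
    """Exponent of the prime p in the nonzero integer n."""
    v = 0
    while n % p == 0:
        n //= p
        v += 1
    return v

p = 5  # illustrative choice of prime
lin_S1 = poly_from_roots([0, p])[1]         # coefficient of x in x^2 - p*x
lin_S2 = poly_from_roots([0, p, p**3])[1]   # coefficient of x in x(x - p)(x - p^3)
print(valuation(lin_S1, p), valuation(lin_S2, p))
```

The two valuations correspond to tangent spaces of sizes $p$ and $p^4$, matching $\mathbf{Z}_p$ and $\mathbf{Z}_{p^4}$.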
Now consider the ring $\mathcal{O}_p[[x, y]]/(x^2 - px, y^2 - py)$. In this case, we are looking
at power series in two variables, and two power series are congruent if their
difference is a linear combination of the form $A(x, y)(x^2 - px) + B(x, y)(y^2 - py)$
with $A, B \in \mathcal{O}_p[[x, y]]$. The corresponding geometric object is

$S_3$: four points, $(0, 0)$, $(p, 0)$, $(0, p)$, and $(p, p)$.

It can be shown that two power series give the same function on this set of
four points if they differ by a linear combination of $x^2 - px$ and $y^2 - py$. The
tangent space is represented by
$$\{ax + by \mid a, b \in \mathcal{O}_p\} \bmod (px, py),$$
which means we are considering two linear polynomials to be congruent if
their difference is a linear combination of $px$ and $py$. It is easy to see that
$$a_1 x + b_1 y \equiv a_2 x + b_2 y \bmod (px, py) \iff a_1 \equiv a_2, \ b_1 \equiv b_2 \pmod p.$$
Therefore, the tangent space is isomorphic to $\mathbf{Z}_p \oplus \mathbf{Z}_p$.
The inclusion $S_1 \subset S_3$ corresponds to the ring homomorphism
$$\mathcal{O}_p[[x, y]]/(x^2 - px, y^2 - py) \longrightarrow \mathcal{O}_p[[x]]/(x^2 - px).$$
The map on tangent spaces is the map $\mathbf{Z}_p \oplus \mathbf{Z}_p \to \mathbf{Z}_p$ given by projection
onto the first factor.
In all three examples above, the rings are given by power series over $\mathcal{O}_p$.
The number of variables equals the number of relations and the resulting
ring is a finitely generated $\mathcal{O}_p$-module (this is easily verified in the three
examples). Such rings are called local complete intersections. For such
rings, it is possible to recognize when a map is an isomorphism by looking at
the tangent spaces.
Before proceeding, let's look at an example that is not a local complete
intersection. Consider the ring
$$\mathcal{O}_p[[x, y]]/(x^2 - px, y^2 - py, xy).$$
The corresponding geometric object is

$S_4$: three points, $(0, 0)$, $(p, 0)$, and $(0, p)$.

There are two variables and three relations, so we do not have a complete
intersection. The tangent space is $\mathbf{Z}_p \oplus \mathbf{Z}_p$. The inclusion $S_4 \subset S_3$ corresponds
to the ring homomorphism
$$\mathcal{O}_p[[x, y]]/(x^2 - px, y^2 - py) \longrightarrow \mathcal{O}_p[[x, y]]/(x^2 - px, y^2 - py, xy)$$
and the map on tangent spaces is an isomorphism. However, $S_3 \ne S_4$. The
problem is that the tangent space calculation does not notice the relation $xy$,
which removed the point $(p, p)$ from $S_3$ to get $S_4$. Therefore, the tangent
space thinks this point is still there and incorrectly predicts an isomorphism
between the three point space and the four point space.
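The failure can be seen in a small computation. The following Python sketch, for the illustrative choice $p = 5$, stores each relation as a dictionary of monomial coefficients (a hypothetical encoding chosen for this example), lists the common zeros among the candidate points $\{0, p\} \times \{0, p\}$, and extracts the degree 1 part of each relation. The relation $xy$ has no degree 1 part, so it is invisible to the tangent space computation even though it removes a point.

```python
from itertools import product

p = 5  # illustrative choice of prime

# Relations as dicts {(deg_x, deg_y): coefficient}.
rel_x = {(2, 0): 1, (1, 0): -p}   # x^2 - p*x
rel_y = {(0, 2): 1, (0, 1): -p}   # y^2 - p*y
rel_xy = {(1, 1): 1}              # x*y

def evaluate(rel, x, y):
    return sum(c * x**i * y**j for (i, j), c in rel.items())

def points(relations):
    """Common zeros of the relations among the candidates {0, p} x {0, p}."""
    return [
        pt for pt in product((0, p), repeat=2)
        if all(evaluate(r, *pt) == 0 for r in relations)
    ]

def linear_parts(relations):
    """Nonzero degree 1 parts, as (coefficient of x, coefficient of y)."""
    parts = {(r.get((1, 0), 0), r.get((0, 1), 0)) for r in relations}
    parts.discard((0, 0))
    return parts

S3 = [rel_x, rel_y]
S4 = [rel_x, rel_y, rel_xy]
print(len(points(S3)), len(points(S4)))
print(linear_parts(S3) == linear_parts(S4))
```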
The general fact we need is that if we have a surjective homomorphism of
rings that are local complete intersections, and if the induced map on tangent
spaces is an isomorphism, then the ring homomorphism is an isomorphism.

Deformations of Galois representations
Now let's return to our sets $A$ and $M$. Corresponding to these two sets are
rings $R_A$ and $R_M$. We have $g_0 \in M \subseteq A$. Let $T_A$ and $T_M$ be the tangent
spaces at $g_0$. In the examples above, the base point $g_0$ would correspond to
$x = 0$ or to $(x, y) = (0, 0)$. Corresponding to the inclusion $M \subseteq A$, there are
surjective maps
$$R_A \longrightarrow R_M, \qquad T_A \longrightarrow T_M.$$
Therefore,
$$\#T_M \le \#T_A.$$
The ring $R_M$ can be constructed using the Hecke algebra and the ring $R_A$
is constructed using results about representability of functors. In fact, it was
shown that there is a representation
$$\rho_{\text{universal}} : G \longrightarrow GL_2(R_A)$$
with the following property. Let
$$\rho : G \longrightarrow GL_2(\mathcal{O}_p)$$
be a representation and let $g$ be the potential modular form attached to $\rho$.
Assume that $\rho$ is unramified outside a fixed finite set of primes. If $g \equiv g_0 \pmod{\tilde{p}}$,
then there exists a unique ring homomorphism
$$\phi : R_A \longrightarrow \mathcal{O}_p$$
such that the diagram
$$\begin{array}{ccc}
G & \xrightarrow{\ \rho_{\text{universal}}\ } & GL_2(R_A) \\
 & {\scriptstyle \rho} \searrow & \downarrow {\scriptstyle \phi} \\
 & & GL_2(\mathcal{O}_p)
\end{array}$$
commutes.
The representations $\rho$ such that $g \equiv g_0 \pmod{\tilde{p}}$ are examples of what are
known as deformations of the Galois representation for $g_0$. The representation
$\rho_{\text{universal}}$ is called a universal deformation.

Example 15.3
We continue with Example 15.2. Let $p = 5$ and take the fixed set of primes
to be $\{5, 17, 37\}$. Then it can be shown that
$$R_A \simeq \mathcal{O}_5[[x]]/(x^2 - bx),$$
where $b/5$ is a 5-adic unit and $\mathcal{O}_5$ is the ring of 5-adic integers. This implies
that $T_A = \mathbf{Z}_5$. The set $A$ has two points, $g_0$ and $g$, corresponding to $x = 0$
and $x = b$.

There exists an integer $n$, defined below, such that
$$n \le \#T_M \le \#T_A.$$
Moreover, a result of Flach shows that $n \cdot T_A = 0$. If it can be shown that
$n = \#T_A$, then $T_A = T_M$.
In our example, $n = 5$. Since we know that $T_A = \mathbf{Z}_5$, we have $n = \#T_A$.
Therefore, $T_A = T_M$. It can be shown that $R_A$ and $R_M$ are local complete
intersections. This yields $R_A = R_M$ and $A = M$. This implies that $g$ is a
modular form.
In general, recall that we started with a semistable elliptic curve $E$. Associated
to $E$ is the 3-adic Galois representation $\rho_{3^\infty}$. The theorem of Langlands-Tunnell
yields a modular form $g_0$, and therefore a Galois representation
$$\rho_0 : G \longrightarrow GL_2(\mathcal{O}_3).$$
We have
$$\rho_{3^\infty} \equiv \rho_0 \pmod{\tilde{3}},$$
so the base point $\rho_0$ is modular and semistable mod $\tilde{3}$ (the notion of semistability
can be defined for general Galois representations). Under the additional
assumption that $\rho_3$ restricted to $\mathrm{Gal}(\overline{\mathbf{Q}}/\mathbf{Q}(\sqrt{-3}))$ is absolutely irreducible,
Wiles showed that if $R_M$ is a local complete intersection then $n = \#T_A$ and
the map $R_A \to R_M$ is an isomorphism of local complete intersections. Finally,
in 1994, Wiles and Taylor used an ingenious argument to show that $R_M$ is a
local complete intersection, and therefore $A = M$.
What happens if $\rho_3$ does not satisfy the irreducibility assumption? Wiles
showed that there is a semistable elliptic curve $E'$ with the same mod 5
representation as $E$ but whose mod 3 representation is irreducible. Therefore,
$E'$ is modular, so the mod 5 representation of $E'$ is modular. This means
that the mod 5 representation of $E$ is modular. If the mod 5 representation,
restricted to $\mathrm{Gal}(\overline{\mathbf{Q}}/\mathbf{Q}(\sqrt{5}))$, is absolutely irreducible, then the above result
of Wiles, with 5 in place of 3, shows that $E$ is modular.
There are only finitely many elliptic curves over $\mathbf{Q}$ for which both the mod 3
representation (restricted to $\mathrm{Gal}(\overline{\mathbf{Q}}/\mathbf{Q}(\sqrt{-3}))$) and the mod 5 representation
(restricted to $\mathrm{Gal}(\overline{\mathbf{Q}}/\mathbf{Q}(\sqrt{5}))$) are not absolutely irreducible. These finitely
many exceptions can be proved to be modular individually.
Therefore, semistable elliptic curves over Q are modular. Eventually, the
argument was extended by Breuil, Conrad, Diamond, and Taylor to include
all elliptic curves over Q (Theorem 14.4).
The integer $n$ is defined as follows. Let $g_0 = \sum b_m q^m$ and let