Breaking Ciphers in the Real World
Mark Stamp
Richard M. Low
San Jose State University
San Jose, CA
BICENTENNIAL
BICENTENNIAL
WILEYINTERSCIENCE
A JOHN WILEY & SONS, INC., PUBLICATION
This Page Intentionally Left Blank
APPLIED CRYPTANALYSIS
THE W I L E Y BICENTENNIALKNOWLEDGE
FOR G E N E R A T I O N S
ach generation has its unique needs and aspirations. When Charles Wiley first
opened his small printing shop in lower Manhattan in 1807, it was a generation
of boundless potential searching for an identity. And we were there, helping to
define a new American literary tradition. Over half a century later, in the midst
of the Second Industrial Revolution, it was a generation focused on building the
future. Once again, we were there, supplying the critical scientific, technical, and
engineering knowledge that helped frame the world. Throughout the 20th
Century, and into the new millennium, nations began to reach out beyond their
own borders and a new international community was born. Wiley was there,
expanding its operations around the world to enable a global exchange of ideas,
opinions, and knowhow.
For 200 years, Wiley has been an integral part of each generation™s journey,
enabling the flow of information and understanding necessary to meet their needs
and fulfill their aspirations. Today, bold new technologies are changing the way
we live and learn. Wiley will be there, providing you the musthave knowledge
you need to imagine new worlds, new possibilities, and new opportunities.
Generations come and go, but you can always count on Wiley to provide you the
knowledge you need, when and where you need it!
6
WILEY
WILLIAM J. PESCE PETER BOOTH
BOARD
CHAIRMAN OF THE
PRESIDENT AND CHIEF EXECUTIVE OFFICER
APPLl ED CRYPTANALYSIS
Breaking Ciphers in the Real World
Mark Stamp
Richard M. Low
San Jose State University
San Jose, CA
BICENTENNIAL
BICENTENNIAL
WILEYINTERSCIENCE
A JOHN WILEY & SONS, INC., PUBLICATION
Copyright 02007 by John Wiley & Sons, Inc. All rights reserved.
Published by John Wiley & Sons, Inc., Hoboken, New Jersey
Published simultaneously in Canada.
No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form
or by any means, electronic, mechanical, photocopying, recording, scanning, or otherwise, except as
permitted under Section 107 or 108 of the 1976 United States Copyright Act, without either the prior
written permission of the Publisher, or authorization through payment of the appropriate percopy fee to
the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, (978) 7508400, fax
(978) 7504470, or on the web at www.copyright.com. Requests to the Publisher for permission should
be addressed to the Permissions Department, John Wiley & Sons, Inc., 11 1 River Street, Hoboken, NJ
07030, (201) 748601 1. fax (201) 7486008, or online at http://www.wiley.codgo/permission.
Limit of Liability/Disclaimer of Warranty: While the publisher and author have used their best efforts in
preparing this book, they make no representations or warranties with respect to the accuracy or
completeness of the contents of this book and specifically disclaim any implied warranties of
merchantability or fitness for a particular purpose. No warranty may be created or extended by sales
representatives or written sales materials. The advice and strategies contained herein may not be
suitable for your situation. You should consult with a professional where appropriate. Neither the
publisher nor author shall be liable for any loss of profit or any other commercial damages, including
but not limited to special, incidental, consequential, or other damages.
For general information on our other products and services or for technical support, please contact our
Customer Care Department within the United States at (800) 7622974, outside the United States at
(317) 5723993 or fax (317) 5724002.
Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may
not be available in electronic format. For information about Wiley products, visit our web site at
www.wiley.com.
Wiley Hicentennial Logo: Richard J . Pacific0
Library of Congress CataloginginPublication Data:
Stamp, Mark.
Applicd cryptanalysis : breaking ciphers in the real world / Mark Stamp,
Richard M. Low.
p. cm.
lncludes bibliographical references and index.
ISBN 97804701 14865 (pbk.)
1 , Computer security. 2. Data encryption (Computer science) 3.
Cryptography. I. Low, Richard M., 1967 11. Title.
QA76.9.A25S687 2007
005.8'24˜22 2007001277
Printed in the United States of America
10987654321
To Melody, Austin, and Males MSS
˜
TOAmy  RML
This Page Intentionally Left Blank
Contents
Preface xiii
About the Authors xvii
Acknowledgments xix
1 Classic Ciphers 1
1.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.2 Good Guys and Bad Guys . . . . . . . . . . . . . . . . . . . . 1
1.3 Terminology . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.4 Selected Classic Crypto Topics . . . . . . . . . . . . . . . . . 4
1.4.1 Transposition Ciphers . . . . . . . . . . . . . . . . . . 5
1.4.2 Subst.itution Ciphers . . . . . . . . . . . . . . . . . . . 8
1.4.3 OneTime Pad . . . . . . . . . . . . . . . . . . . . . . 18
1.4.4 Codebook Ciphers . . . . . . . . . . . . . . . . . . . . 20
1.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
1.6 Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
2 World War I1 Ciphers 25
2.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
2.2 Enigma . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
2.2.1 Enigma Cipher Machine . . . . . . . . . . . . . . . . . 26
2.2.2 Enigma Keyspace . . . . . . . . . . . . . . . . . . . . . 29
2.2.3 Rotors . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
2.2.4 Enigma Attack . . . . . . . . . . . . . . . . . . . . . . 34
2.2.5 More Secure Enigma? . . . . . . . . . . . . . . . . . . 37
2.3 Purple . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
2.3.1 Purple Cipher Machine . . . . . . . . . . . . . . . . . 38
2.3.2 Purple Keyspace . . . . . . . . . . . . . . . . . . . . . 44
2.3.3 Purple Diagnosis . . . . . . . . . . . . . . . . . . . . . 45
2.3.4 Decrypting Purple . . . . . . . . . . . . . . . . . . . . 49
2.3.5 Purple versus Enigma . . . . . . . . . . . . . . . . . . 50
2.4 Sigaba . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
vii
...
CONTENTS
Vlll
2.4.1 Sigaba Cipher Machine . . . . . . . . . . . . . .... 52
2.4.2 Sigaba Keyspace . . . . . . . . . . . . . . . . . . ... 57
2.4.3 Sigaba Attack . . . . . . . . . . . . . . . . . . . . ... 59
2.4.4 Sigaba Conclusion . . . . . . . . . . . . . . . . . ... 67
Summary . . . . . . . . . . . . . . . . . . . . . . . . . . ...
2.5 68
Problerns . . . . . . . . . . . . . . . . . . . . . . . . . . ...
2.6 69
Stream Ciphers
3 79
3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
3.2 Shift Registers . . . . . . . . . . . . . . . . . . . . . . . . . . 81
3.2.1 BerlekampMassey Algorithm . . . . . . . . . . . . . . 83
3.2.2 Cryptographically Strong Sequences . . . . . . . . . . 85
3.2.3 Shift RegisterBased Stream Ciphers . . . . . . . . . . 89
3.2.4 Correlation Attack . . . . . . . . . . . . . . . . . . . . 90
3.3 ORYX . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93
3.3.1 ORYX Cipher . . . . . . . . . . . . . . . . . . . . . . . 94
3.3.2 ORYX Attack . . . . . . . . . . . . . . . . . . . . . . . 97
3.3.3 Secure ORYX? . . . . . . . . . . . . . . . . . . . . . . 102
3.4 RC4 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
3.4.1 RC4 Algorithm . . . . . . . . . . . . . . . . . . . . . . 105
3.4.2 RC4 Attack . . . . . . . . . . . . . . . . . . . . . . . . 105
3.4.3 Preventing the RC4 Attack . . . . . . . . . . . . . . . 110
3.5 I™KZIP . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110
3.5.1 PKZIP Cipher . . . . . . . . . . . . . . . . . . . . . . 111
3.5.2 PKZIP Attack . . . . . . . . . . . . . . . . . . . . . . 113
3.5.3 Improved PKZIP? . . . . . . . . . . . . . . . . . . . . 120
3.6 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 120
3.7 Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 121
4 Block Ciphers 127
4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 127
4.2 Block Cipher Modes . . . . . . . . . . . . . . . . . . . . . . . 128
4.3 Feistel Cipher . . . . . . . . . . . . . . . . . . . . . . . . . . . 131
4.4 Hellman™s TimeMemory Tradeoff . . . . . . . . . . . . . . . 133
4.4.1 Cryptanalytic T M T O . . . . . . . . . . . . . . . . . . 133
4.4.2 Bad Chains . . . . . . . . . . . . . . . . . . . . . . . . 137
4.4.3 Succcss Probability . . . . . . . . . . . . . . . . . . . . 141
4.4.4 Distributed T M T O . . . . . . . . . . . . . . . . . . . . 142
4.4.5 T M T O Conclusioris . . . . . . . . . . . . . . . . . . . 143
4.5 CMEA . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 144
4.5.1 CMEA Cipher . . . . . . . . . . . . . . . . . . . . . . 144
4.5.2 SCMEA Cipher . . . . . . . . . . . . . . . . . . . . . . 146
4.5.3 SCMEA Chosen Plaintext Attack . . . . . . . . . . . 147
CONTENTS ix
4.5.4 CMEA Chosen Plaintext Attack . . . . . . . . . . . . 148
4.5.5 SCMEA Known Plaintext Attack . . . . . . . . . . . 151
4.5.6 CMEA Known Plaintext Attack . . . . . . . . . . . . 158
4.5.7 More Secure CMEA? . . . . . . . . . . . . . . . . . . . 159
4.6 Akelarre . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 160
4.6.1 Akelarre Cipher . . . . . . . . . . . . . . . . . . . . . . 160
4.6.2 Akelarre Attack . . . . . . . . . . . . . . . . . . . . . . 166
4.6.3 Improved Akelarre? . . . . . . . . . . . . . . . . . . . 169
4.7 FEAL . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 170
4.7.1 FEAL4 Cipher . . . . . . . . . . . . . . . . . . . . . . 171
4.7.2 FEAL4 Differential Attack . . . . . . . . . . . . . . . 172
4.7.3 FEAL4 Linear Attack . . . . . . . . . . . . . . . . . . 177
4.7.4 Confusion and Diffusion . . . . . . . . . . . . . . . . . 182
4.8 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 183
4.9 Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 183
Hash Functions 193
5
5.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 193
5.2 Birthdays and Hashing . . . . . . . . . . . . . . . . . . . . . . 200
5.2.1 The Birthday Problem . . . . . . . . . . . . . . . . . . 200
5.2.2 Birthday Attacks on Hash Functions . . . . . . . . . . 201
5.2.3 Digital Signature Birthday Attack . . . . . . . . . . . 202
5.2.4 Nostradamus Attack . . . . . . . . . . . . . . . . . . . 203
5.3 MD4 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 208
5.3.1 MD4 Algorithm . . . . . . . . . . . . . . . . . . . . . . 208
5.3.2 MD4 Attack . . . . . . . . . . . . . . . . . . . . . . . 210
5.3.3 A Meaningful Collision . . . . . . . . . . . . . . . . . . 224
5.4 MD5 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 225
5.4.1 MD5 Algorithm . . . . . . . . . . . . . . . . . . . . . . 225
5.4.2 A Precise Differential . . . . . . . . . . . . . . . . . . 231
5.4.3 Outline of Wang™s Attack . . . . . . . . . . . . . . . . 233
5.4.4 Wang™s MD5 Differentials . . . . . . . . . . . . . . . . 235
5.4.5 Reverse Engineering Wang™s Attack . . . . . . . . . . 238
5.4.6 Stevens™ Implementation of Wang™s Attack . . . . . . 252
5.4.7 A Practical Attack . . . . . . . . . . . . . . . . . . . 253
5.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 256
5.6 Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 257
Public Key Systems
6 265
6.1 Introduction . . . . . . . . . . . . . . . . . . . ......... 265
6.2 MerkleeHellman Knapsack . . . . . . . . . . . ......... 267
6.2.1 LatticeReduction Attack . . . . . . . . . . . . . . . . 270
6.2.2 Knapsack Conclusion . . . . . . . . . . . . . . . . . . 275
CONTENTS
X
DifieHellman Key Exchange . . . . . . . . . . . . . . . . . .
6.3 275
6.3.1 ManintheMiddle Attack . . . . . . . . . . . . . . . . 277
6.3.2 DiffieHellman Conclusion . . . . . . . . . . . . . . . . 278
Arithmetica Key Exchange . . . . . . . . . . . . . . . . . . .
6.4 279
6.4.1 HughesTannenbaum Length Attack . . . . . . . . . . 283
6.4.2 Arithmetica Conclusion . . . . . . . . . . . . . . . . . 284
RSA . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
6.5 284
6.5.1 Mathematical Issues . . . . . . . . . . . . . . . . . . . 285
6.5.2 RSA Conclusion . . . . . . . . . . . . . . . . . . . . . 288
Rabin Cipher . . . . . . . . . . . . . . . . . . . . . . . . . . .
6.6 289
6.6.1 Chosen Ciphertext Attack . . . . . . . . . . . . . . . 291
6.6.2 Rabin Cryptosystenl Conclusion . . . . . . . . . . . . 292
NTRU Cipher . . . . . . . . . . . . . . . . . . . . . . . . . . .
6.7 293
6.7.1 MeetintheMiddle Attack . . . . . . . . . . . . . . . 299
6.7.2 Multiple Transmission Attack . . . . . . . . . . . . . 301
6.7.3 Chosen Ciphertext Attack . . . . . . . . . . . . . . . 302
6.7.4 NTRU Conclusion . . . . . . . . . . . . . . . . . . . . 304
ElGarnal Signature Scheme . . . . . . . . . . . . . . . . . . .
6.8 305
6.8.1 Mathematical Issues . . . . . . . . . . . . . . . . . . . 308
6.8.2 ElGamal Signature Conclusioil . . . . . . . . . . . . . 308
Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
6.9 309
Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
6.10 309
7 Public Key Attacks 315
7.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 315
7.2 Factoring Algorithms . . . . . . . . . . . . . . . . . . . . . . . 316
7.2.1 Trial Division . . . . . . . . . . . . . . . . . . . . . . . 316
7.2.2 Dixon™s Algorithm . . . . . . . . . . . . . . . . . . . . 317
7.2.3 Quadratic Sieve . . . . . . . . . . . . . . . . . . . . . . 323
7.2.4 Factoring Conclusions . . . . . . . . . . . . . . . . . . 327
7.3 Discrete Log Algorithms . . . . . . . . . . . . . . . . . . . . . 330
7.3.1 Trial Multiplication . . . . . . . . . . . . . . . . . . . 330
7.3.2 BabyStep GiantStep . . . . . . . . . . . . . . . . . . 331
7.3.3 Index Calculus . . . . . . . . . . . . . . . . . . . . . . 332
7.3.4 Discrete Log Conchlsions . . . . . . . . . . . . . . . . 333
7.4 RSA Iniplenieritation Attacks . . . . . . . . . . . . . . . . . . 334
7.4.1 Tinling Attacks . . . . . . . . . . . . . . . . . . . . . 334
7.4.2 Glitchirlg Attack . . . . . . . . . . . . . . . . . . . . . 353
7.4.3 Implementatiorl Attacks Conclusiorls . . . . . . . . . . 354
7.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 355
7.6 Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 355
CONTENTS xi
Appendix 361
A1 MD5Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . 361
A2 Math . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 371
A2.1 Number Theory . . . . . . . . . . . . . . . . . . . . . 371
A2.2 Group Theory . . . . . . . . . . . . . . . . . . . . . . 372
A2.3 Ring Theory . . . . . . . . . . . . . . . . . . . . . . . 372
A2.4 Linea.r Algebra . . . . . . . . . . . . . . . . . . . . . . 373
Annotated Bibliography 375
Index 393
This Page Intentionally Left Blank
Preface
To paraphrase Barbie, “cryptanalysis is hard” [6]. Unfortunately, many
cryptanalysis papers seem to be written in their own impenetrable secret
code, making the subject appear to be even more difficult than it really is.
In this book, we strive to present applied cryptanalytic attacks in an acces
sible form. Here, we are focused on practical attacks that actually break real
world systems, not attacks that merely indicate some theoretical weakness
in a cipher. Consequently, we consider real ciphers and, primarily, modern
ciphers. Many attacks that satisfy our criteria are scattered throughout the
literature.™ With a few notable exceptions, these papers require a Herculean
effort to digest and understand. One of our goals is to lift this unintentional
veil on the exciting and fascinating field of cryptanalysis.
Most of the topics presented in this book require only a modest math
ematical background. Some of the public key topics are inherently more
mathematical, but in every case we have strived to minimize the advanced
mathematics. We also believe that we have provided enough background in
formation so that the book is essentially selfcontained. Some of the more
advanced mathematical topics are treated briefly in the Appendix. Any moti
vated upperdivision undergraduate studentin any technical field of study
should be able to tackle this book. Some of the material is not easy, but those
who persist will be rewarded with a solid understanding of cryptanalysis, as
well as the knowledge, tools, and experience to confidently explore cutting
edge cryptanalytic topics.
We have provided an extensive set of problems for each chapter. A few of
these problems are relatively easy, but most range from moderate to some
what challenging. Generally, we have tried to avoid obvious problems of the
“implement suchandsuch attack” variety. Of course, it is useful and instruc
tive to implement an attack, but the problems are intended to reinforce and
expand on material presented in the text, without placing an overwhelming
burden on the reader. A fairly complete solutions manual is available to
instructors directly froni your Wiley representative.
™A large percentage of the cryptanalysis literature is informal in the sense that many
papers never receive any formal peer review. Although the academic peerreview process
suffers from a multitude of sins, no peer review is no better.
xiii
PREFACE
xiv
To really understand the material in this book, i t is necessary to work a
significant number of the problems. Cryptarialysis is definitely not a spectator
sport. We believe that the computer is an essential cryptanalytic tool. It is
riot coincidental that many of the homework problems require some computer
programming.
For the terminally cryptanalytically insane, we have created it collection
of challenge problems. These problems, which are posted on the textbook
website at
http://cs.sjsu.edu/faculty/stamp/crypto/
consist primarily of cryptanalytic challenges based on the ciphers and attacks
presented in the text. A few researchoriented problems are also included.
Each problem carries a difficulty rating so that you will have some idea of
what you might be getting into. For each challenge problem, a small prize2 is
offered to the first solver. We promise to update the website as the challenge
problems are solved. The website includes source code arid test vectors for
many of the ciphers discussed here. In addition, a complete set of quality
PowerPoint slides is available.
The text is organized around four major themes, namely, classic ciphers
(Chapters 1 and a ) , symmetric ciphers (Chapters 3 and 4), hash functions
(Chapter 5 ) , and public key crypto (Chapters 6 and 7). The specific topics
covered in each chapter are summarized below:
Chapter Topics
1. Classic Ciphers Penandpaper systems
Enigma, Purple, Sigaba
2. World War I1 Ciphers
Shift registers,
3 . Stream Ciphers
correlation at tacks,
ORYX. RC4, PKZIP
Block cipher modes,
4. Block Ciphers
MAC, Hellman's TMTO,
CMEA, Akelarre, FEAL
HMAC, birthday attacks,
5. Hash Functions
Nostrasamus at tack,
MD4, MD5
Knapsack, DiffieHellman,
6. Public Key Systems
Arithmetica, RSA
Rabin, NTRU, EIGamal
Factoring, discrete log,
7. Public Key Attacks
RSA timing attacks,

RSA ditching attack
Y
'The emphasis here is on '?,mall ''
PREFACE xv
The first author wrote Chapters 2 through 5 and 7, while the second
author wrote the majority of Chapters 1 and 6. The first author extensively
edited all chapters to give the book a more consistent “look and feel.” The
first author did his best to resist including too many bad jokes, but some
proved irresistible. Most of these have, mercifully, been relegated to footnotes.
The majority of the book consists of a series of cryptanalytic vignettes,
organized by topic. Chapters 3, 4, and 5 each begin with a relatively generic
method of attack (correlation attacks, Hellman™s TMTO and birthday at
tacks, respectively). These attacks are interesting in their own right, but
each also serves as an introduction to the type of cipher under consideration.
Each of these chapters then segues into the cryptanalysis of specific ciphers.
For public key crypto, the introductory material has been expanded to
an entire chapter. In Chapter 6, several public key systems are introduced
and discussed from the perspective of relatively straightforward attacks or
implementation issues that can lead to weaknesses. Then selected public key
attacks are covered in depth in Chapter 7.
The chapters are highly independent of each other, as are many of the sec
tions within chapters. The most dependent chapters are 6 and 7, which cover
public key crypto. In addition, some familiarity with hashing (Chapter 5)
would be useful before diving into the public key material. The terminology
and background covered in Chapter 1 is used throughout the text. Regardless
of your background in cryptography, we recommend that you read Chapter 1
first, since terminology is not consistent throughout the crypto world. Not
only is crypto terminology inconsistent, but notation is even worse. Notation
wise, we have tried to be as internally consistent as possible. Consequently,
our notation often differs from the original source.
The first author™s information security textbook [142] covers four ma
jor topics, one of which is cryptography. The only significant overlap be
tween [142] and this book is Hellman™s timememory tradeoff attack, dis
cussed here in Section 4.4. A brief section on the knapsack attack is also
included in both books; here, in Section 6.2.
Finally, we apologize in advance for the inevitable “bugs” in this book.
Any computer program of sufficient size has bugs and it is more difficult to
debug a textbook than a program, since there is at least some hope of getting
a program to misbehave during testing. There is no method to “exercise” a
textbook other than to proofread it and to teach from it,the more times the
better. The first author has taught virtually all of the material in this text,
and several careful proofreadings have been done. Nevertheless, it is a sure
bet that errors remain. Please tell us of any bugs you find. We would also
appreciate any other comments you have regarding this book.
Mark Stamp
Richard M , Low
San Jose State University
This Page Intentionally Left Blank
About the Authors
Mark Stamp has an extensive background in information security in general
and cryptography in particular, having spent more than seven years as a
Cryptologic Mathematician at the National Security Agency. His other rele
vant experience includes two years as Chief Cryptologic Scientist at a small
Silicon Valley startup company. Since the demise of his startup company
in 2002, he has been a faculty member in the department of computer science
at San Jose State University, where he primarily teaches courses in informa
tion security. In 2005, Dr. Stamp published his first textbook, Information
Security: Principles a.nd Practice (Wiley Interscience).
Richard M. Low has a PhD in mathematics and is a faculty member in
the department of mathematics at San Jose State University. His research
interests include cryptography, combinatorics and group theory. In addition
to teaching mathematics, he has conducted a popular cryptography seminar
at SJSU.
xvii
Acknowledgments
I want to thank the following San Jose State University students who con
tributed significantly to the indicated sections: Heather Kwong (Enigma);
Thang Dao (Purple); Wing On Chan and Ethan Le (Sigaba); Thuy Nguyen
phuc (ORYX); Bevan Jones and Daniel T u (Akelarre); Tom Austin, Ying
Zhang, and Narayana Kashyap (MD5); Ramya Venkataramu (RSA timing
attack); Natalia Khuri (RSA); Edward Yin (Chapter 2 solutions).
As always, thanks to my PhD advisor, Clyde F. Martin. Clyde is the one
who introduced me to cryptography.
Richard Low deserves credit for enthusiastically signing on to this project
and for convincing me to persevere at a couple of points where I was ready
to throw in the towel. He also tolerated my occasional rants very well.
A very special thanks to WanTeh Chang for his careful reading of most
sections of this book. WanTeh has an excellent eye for detail and he provided
numerous corrections and useful suggestions.
Thanks are due to all of the people at Wiley who were involved with this
book. In particular, I want to thank Paul Petralia, Whitney A. Lesch, and
Kellsee Chu who were extremely helpful throughout.
Last but certainly not least, thanks to my lovely wife, Melody, and my
two boys, Austin and Miles, for their patience during the seemingly endless
hours I spent working on this project.
MSS
˜
My love of mathematics was cultivated by many of my former math teach
ers (from junior high school to graduate school). Those that come particularly
to mind include: Joseph Buckley, Gary Chartrand, Daniel Goldston, Doug
Harik, Philip Hsieh, SinMin Lee, John Martino, John Mitchem, Thomas
Richardson, Gerhard Ringel, Jerome Schroeder, Michael Slack, Arthur Stod
dart, Sandra Swanson, Arthur White, Gregg Whitnah, and KungWei Yang.
Thank you for showing me the way.
RML

xix
This Page Intentionally Left Blank
Chapter 1
Classic Ciphers
You are in a maze of twisty little passages, all alike.
 Adventure
Introduction
1.1
Most of this chapter is devoted to introducing terminology and discussing a
select few classic “pen and paper” ciphers. Our goal here is not to cover clas
sical cryptography in detail, since there are already many excellent sources of
information on such ciphers. For example, Kahn™s history [74] has a general
discussion of virtually every cipher developed prior to its original publica
tion date of 1967, Barr [7] presents a readable introduction to cryptography,
Spillman [139] nicely covers the cryptanalysis of several classic cipher systems
and Bauer [8]provides rigorous coverage of a large number of classical crypto
topics. The ciphers we discuss in this chapter have been selected t o illustrate
a few important points that arise in upcoming chapters.
Even if you are familiar with classical cryptosystems, you should read
the next two sections where terminology is discussed, since the terminology
in cryptography is not always consistent. In addition, the material in Sec
tions 1.4.3 and 1.4.4 is directly referenced in upcoming chapters.
Good Guys and Bad Guys
1.2
In cryptography, it is traditional that Alice and Bob are the good guys who
are trying to communicate securely over an insecure channel. We employ
Trudy (the “intruder”) as our generic bad guy. Some books have a whole
cast of bad guys with the name indicating the particular evil activity (Eve,
the eavesdropper, for example), but we use Trudy as our allpurpose bad
“guy” .
1
CLASSIC CIPHERS
2
Since this is a cryptanalysis book, we often play the role of Trudy. Trudy
is an inherently more interesting character than boring old Alice and Bob, and
this is part of what makes cryptanalysis so much more fun than cryptography.
Trudy does not have to play by any preconceived set of rules. However, it
is important to remember that attacks on real systems are almost certainly
illegal, so do not attempt to play Trudy in the real world.
Terminology
1.3
Oryptology is the art and science of making and breaking “secret codes.”
Cryptology can be subdivided irito cryptography (the art and science of mak
ing secret codes) and cryptanalysis (the breaking of secret codes). The secret
codes themselves are known as ciphers or cryptosystems. In this book, we are
focused on cryptanalysis, but many topics in cryptography naturally arise.
It is common practice to use the term cryptography as a synonym for
cryptology, and we generally follow this practice. In fact, we often use crypto
as shorthand for cryptology, cryptography, cryptanalysis, or any variety of
other cryptorelated topics. The precise meaning should be clear from the
context.
The original readable message is the plaintext, while the ciphertext is the
unreadable text that results from encrypting the plaintext. Decryption is the
inverse process, where the ciphertext is converted into plaintext.
A k e y is used to configure a cryptosystem. All classic systems are s y m 
m e t r i c ciphers, meaning that the same key is used to encrypt as to decrypt.
In socalled public k e y cryptography the encryption and decryption keys are
different, which means that the encryption key can be be made public, but
the decryption key must remain private. We cover public key cryptosystems
in Chapters 6 and 7, while all of the remaining chaptersincluding the re
maining sections of this chapter˜deal with symmetric ciphers.
Note that decryption is distinct from cryptanalysis, since cryptanalysis
implies an attack of some sort has been used to read the messages, while
decryption implies that the plaintext has been retrieved using the key by the
expectcd process. Of course, if Trudy recovers the key via cryptanalysis, then
she can simply decrypt a particular ciphertext.
The typical encryption and decryption process is illustrated in Figure 1.1,
where Pi is the ith unit of plaintext (which may be a bit, a letter, a word, or
a la.rger block, depending on the particular cipher), Ci is the corresponding
unit of ciphertext, and the squiggly line represenh the transmission of the
ciphertext over an insecure channel.
There are several generic types of attacks on ciphers. In a ciphertext
only attack, the attacker attempts to recover the key or plaintext from the
ciphertext. In particular, in a ciphertextonly attack, the cryptanalyst does
1.3 TERMINOLOGY 3
w dI˜ p
I
Pi
plaintext encryption plaintext
algorithm
ciphertext
Figure 1.1: Encryption and decryption.
not know any of the underlying plaintext. A basic assumption is that the
ciphertext is always available to an attacker. After all˜ if the ciphertext is not
available to the attacker, why bother to encrypt?
In a known plaintext attack, Trudy has the ciphertext as well as some of
the corresponding plaintext. This might give the attacker some advantage
over the ciphertext only scenariocertainly the attacker is no worse off with
known plaintext. If Trudy knows all of the plaintext, there is probably not
much point in bothering to attack the system, so the implicit assumption is
that the amount of known plaintext is relatively limited.
As the name implies, in a chosen plaintext attack, an adversary can choose
the plaintext and then obtain the corresponding ciphertext. This can only
help the attacker, as compared to a known plaintext scenario. Similarly, in
a chosen ciphertext attack, the cryptanalyst chooses ciphertext and gets to
see the corresponding plaintext. There are also related key attacks, where the
attacker can break the system if two keys are used that happen to be related
in some very special way. While this may seem somewhat esoteric, we will
see an example of a realworld related key attack in Chapter 3.
In most cases, recovering the key is Trudy™s ultimate goal, but there are
attacks that recover the plaintext without revealing the key. A cipher is
generally not considered secure unless it is secure against all plausible attacks.
Cryptographers are, by nature, a paranoid bunch, so “plausible” is usually
defined very broadly.
Kerckhoffs™ Principle is one of the fundamental concepts underlying cryp
tography. This principle states that the strength of a cryptosystem depends
only on the key and, in particular, the security does not depend on keeping
the encryption algorithm secret. This principle is generally construed even
more broadly to imply that the attacker knows the protocols and overall sys
tem in which a cryptosystem is used. Adherence to Kerckhoffs™ Principle
should ensure that the security of a cryptosystem does not depend on the
muchdreaded “security by obscurity”, since the security does not depend
on a secret algorithm. Unfortunately, there are many realworld pressures
that can lead to violations of Kerckhoffs™ Principle, usually with disastrous
consequences.
CLASSIC CIPHERS
4
Why do we insist on Kerckhoffs™ Principle? After all, the attacker™s job
certainly must be more difficult if the crypto algorithm is unknown. In part,
the answer is that Kerckhoffs™ Principle is just a codification of reality
algorithms never remain secret for long so it is far better to find flaws before
hand, rather than after an algorithm is embedded in millions of applications
dispersed across the globe. It also happens to be true that designing a secure
cipher is not easy, and it is made all the more difficult when efficiency is an
issue, which is usually the case. An extensive peer review process is essential
before any algorithm can be considered sufficiently secure for use. We will
see several realworld examples that illustrate the wisdom of Kerckhoffs in
upcoming chapters.
Suppose that Alice encrypts a message and sends the ciphertext to Bob.
Figure 1.2 illustrates what information is available to Alice, Bob and the
attacker, Trudy. At a minimum we assume that Trudy has access to the
ciphertext and, by Kerckhoffs™ Principle, she also knows how the crypto al
gorithm works. In some cases, Trudy may have additional information, such
as known plaintext, chosen plaintext, etc.
.
..........................
, ,
.˜.................................................. I .
Figure 1.2: Who knows what.
In the next section we highlight a few selected classic crypto topics. We
also discuss some important cryptanalytic principles arid we provide details
on a few specific ciphers that are relevant to later chapters.
Selected Classic Crypto Topics
1.4
If you have done much traveling, you know that it is almost impossible to
see everything, and if you try, you are bound to regret it. It is usually far
more productive to avoid the “tourist death march” and instead focus on a
few specific interesting locations. We will take a similar approach here as we
peruse selected classic crypto topics, stopping at a few points of interest, but
making no attempt to cover every possible topic along the way. Since our
focus in the remainder of the book is cryptanalysis, we emphasize attacks on
the classic ciphers that we cover.
1.4 SELECTED CLASSICC R Y P T 0 TOPICS 5
Since ancient times, cryptography has been used for military and diplo
matic purposes. In the remainder of this chapter we consider a few specific
examples of classic ciphers. These ciphers have been carefully selected to il
lustrate important topics that arise in the study of modern ciphers presented
in subsequent chapters.
The history of crypto is itself a fascinating topic, but it is not our focus
here. For more crypto history, a good crypto timeline can be found at [lo41
and there is always Kahn™s book [74]. For a more indepth technical look at
classic ciphers, see Bauer™s fine book [8].
Transposition Ciphers
1.4.1
Transposition ciphers jumble the letters of the message in a way that is de
signed to confuse the attacker, but can be unjumbled by the intended recipi
ent. The concept of transposition is an important one and is widely used in
the design of modern ciphers, as will be seen in subsequent chapters. Note
that the key must provide sufficient information to unscramble the ciphertext.
Scytale
One of the earliest recorded uses of cryptography was the Spartan scytale
(circa 500 B.C.). A thin strip of parchment was wrapped helically around a
cylindrical rod and the message was written across the rod, with each letter
on a successive turn of the parchment. The strip was unwound and delivered
to the receiver. The message could then be decrypted with the use of an
identical cylindrical rod. To anyone who intercepted the message, and did
not understand the encryption technique, the message would appear to be a
jumble of letters. A clever cryptanalyst with access to a number of rods of
various diameters will soon recover the plaintext.
For the scytale cipher, which is an example of a transposition cipher, the
key is the rod (or its diameter). This is a very weak cipher since the system
could be easily broken by anyone who understands the encryption method.
Columnar Transposition
Suppose we have plaintext SEETHELIGHT and we want to encrypt this using
a columnar transposition cipher. We first put the plaintext into the rows of
an array of some given dimension. Then we read the ciphertext out of the
columns. The key consists of the the number of columns in the array. For
example, suppose we choose the key to be four, which means that we write
the plaintext in four columns as
SEET
CLASSIC CIPHERS
6
where the final X is used as to fill out the array. The ciphertext is then read
from the columns, which in this case yields SHGEEHELTTIX. The intended
recipient, who knows the number of columns, can put the ciphertext into an
appropriatesized array and read the plaintext out from the rows.
Not surprisingly, a columnar transposition is not particularly strong. To
perform a ciphertext only attack on this cipher, we simply need to test all
possible decrypts using c columns, where c is a divisor of the number of
characters in the ciphertext.
Keyword Columnar Transposition
The columnar transposition cipher can be strengthened by using a keyword,
where the keyword determines the order in which the columns of ciphertext
are transcribed. We refer to this as a keyword columnar transposition cipher.
For example, consider encrypting the plaintext CRYPTOISFUN using a keyword
columnar transposition cipher with keyword MATH, again using four columns.
In this case, we get the array
MATH
The ciphertext is read from the columns in alphabetical order (as determined
by the keyword), so that, in this example, the ciphertext is ROUPSXCTFYIN.
Is it possible to conduct a ciphertextonly attack on a keyword columnar
transposition cipher? It is certainly not as straightforward as attacking a
nonkeyword columnar cipher. Suppose we obtain the ciphertext
VOESA IVENE MRTNL EANGE WTNIM HTMEE ADLTR NISHO DWOEH
which we believe was encrypted using a keyword columnar transposition.
Our goal is to recover the key and the plaintext. First, note that there are 45
letters in the ciphertext. Assuming the array is not a single column or row,
the array could have any of the following dimensions: 9 x 5. 5 x 9. 15 x 3
or 3 x 15. Suppose that we first try a 9 x 5 array. Then we have the ciphertext
array in Table 1.1.
We focus our attention on the top row of the array in Table 1.1. If we
permute the columns as shown in Table 1.2, we see the word GIVE in the first
row and we see words or partial words in the other rows. Therefore, we have
almost certainly recovered the key.
This method is somewhat ad hoc, but the process could be automated,
provided we can automatically recognize likely plaintexts. In this example,
we have recovered the encryption key 24013 and the plaintext is
GIVE ME SOMEWHERE TO STAND AND I WILL MOVE THE EARTH.
1.4 SELECTED CLASSIC C R Y P T 0 TOPICS 7
Table 1.1: Ciphertext Array
0 1 2 3 4
VE G M I
OM E E S
ER W E H
ST T A O
AN N D D
IL I L W
VE M T O
EA H R E
NN T N H
Table 1.2: Permuted Ciphertext Array
G I V EM
E S O ME
W H E RE
T O S TA
N D A ND
I W I LL
M O V ET
H E E AR
T H N NN
There are many ways to systematically mix the letters of the plaintext.
For example, we can strengthen the columnar transposition cipher by allowing
the permutation of columns and rows. Since two transpositions are involved,
this is known as a double transposition cipher, which we briefly describe next.
Double Transposition Cipher
To encrypt with a double transposition cipher, we first write the plaintext
into an array of a given size and then permute the rows and columns accord
ing to specified permutations. For example, suppose we write the plaintext
ATTACKATDAWN into a 3 x 4 array:
CLASSIC CIPHERS
8
Now if we transpose the rows according to (0,1,2) + (2,1,0) and then trans
(3 ,1 ,0, 2), we obtain
pose the columns according to (0,1,2,3)
[:: : ;I+ : ; :I[: : : :I.
j
A T T A D A W N N A D W
The ciphertext is read directly from the final array:
NADWTKCAATAT.
For the double t,ransposition, the key consists of the size of the matrix and
the row and column permutations. The recipient who knows the key can
simply put the ciphertext into the appropriate sized matrix and undo the
permutations to recover the plaintext.
If Trudy happens to know the size of the matrix used in a double transpo
sition, she can insert the ciphertext into a matrix of the appropriate size. She
can then try to unscramble the columns to reveal words (or partial words).
Once the column transposition has been undone, she can easily unscramble
the rows; see Problem 12 for an example. This attack illustrates the fun
damental principle of divide and conquer. That is, Trudy can recover the
double transposition key in parts, instead of attacking the entire key all at
once. There are many exarnples of divide and conquer attacks throughout
the remainder of this book.
in spite of the inherent divide and conquer attack, the double transposi
tion cipher is relatively strongat least in comparison to many other classic
cipher. The interested reader is directed to [88] for a thorough cryptanalysis
of the double transposition.
Substitution Ciphers
1.4.2
Like transposition, substitution is a crucial concept in the design of modern
ciphers. in fact, Shannon™s [133] two fundamental principles for the design
of symmetric ciphers are confusion and diflusion, which, roughly, correspond
to the classic concepts of substitution and transposition, respectively. These
are still the guiding principles in the design of symmetric ciphers.
In this section we discuss several classic substitution ciphers. We highlight,
some of the clever techniques that can be brought to bear to attack such
ciphers.
Caesar™s Cipher
in 50 R.C., Gaius Julius Caesar described the use of a specific cipher that,
goes by the name of Caesar™s c2pher.l In Caesar™s cipher, encryption is ac
˜Historians generally agree that the Caesar™s cipher was named after the Roman dictator.
not t h e salad.
1.4 SELECTED CLASSIC CRYPT0 TOPICS 9
complished by replacing each plaintext letter with its corresponding “shift
bythree” letter, that is, A is replaced by D, B is replaced by E, C is replaced
by F, and so on. At the end of the alphabet, a wrap around occurs, with X re
placed by A, Y replaced by B and Z replaced by C. Decryption is accomplished
by replacing each ciphertext letter with its corresponding leftshiftbythree
letter, again, taking the wrap around into account.
Suppose we assign numerical values 0 , 1 , . . . ,25 to the letters A, B, . . . ,Z,
respectively, Let p i be the ith plaintext letter of a given message, and ci the
corresponding ith ciphertext letter. Then Caesar™s cipher can be mathemat
+
ically stated as ci = pi 3 (mod 26) and, therefore, pi = ci  3 (mod 26).
In Caesar™s cipher, the key is “3”, which is not very secure, since there is
only one keyanyone who knows that the Caesar™s cipher is being used can
immediately decrypt the message.
Simple Substitution
A simple substitution (or monoalphabetic substitution) cipher is a general
ization of the Caesar™s cipher where the key can be any permutation of the
alphabet. For the simple substitution, there are 26! = 288 keys available.
This is too many keys for any attacker to simply try them all, but even with
this huge number of keys, the simple substitution cipher is insecure. Before
we discuss the attack on the simple substitution, we consider a few special
types of related ciphers that have been used in the past.
Nomenclator
Circa 1400, a type of cipher known as a nomenclator was invented and came
into widespread use by trading states in Europe and by the Catholic Church.
A nomenclator is a book that describes how letters, syllables, and words are
converted into ciphertext and vice versa. In effect, this is a hybrid between
a simple substitution and a codebook cipher (described below), and it has
a larger number of possible keys than a simple substitution cipher. All else
being equal (which it never is), this should make the cryptanalyst™s job more
difficult.
Polyalphabetic Substitution
During the Renaissance, the first polyalphabetic substitution cipher was in
vented by one Leon Battista Alberti (14041472). Such a cipher is essentially
a variable simple substitution cipher, that is, a different substitution alpha
bet is used for different parts of the message. In Alberti™s cipher, this was
accomplished by use of a device that included an inner and outer cipher wheel
with the alphabet written in particular ways on each wheel. The inner wheel
freely rotated allowing the two alphabets to be aligned in any fashion, with
CLASSICCIPHERS
10
each alignment generating a different (simple) substitution. As the message
was encrypted, differing substitution alphabets could be used, as determined
by both parties in advance, or as specified within the message itself.
In his book Traict6 des Chaffres, Blaise de Vigenkre (1585) discusses a
polyalphabetic substitution that uses a 26 x 26 rectangular array of letters.
The first row of the array is A, B, C, . . . , Z, and each succeeding row is a cyclic
left shift of the preceding one. A keyword can then be used to determine
which of the cipher alphabets to use at each position in the text. In this
way, all “shiftbyn” simple substitutions are readily available for use. The
Vigenkre cipher, and its cryptanalysis, is discussed below.
Affine Cipher
An ajJine cipher is a simple substitution where ci = api + b (mod 26). Here,
the constants a and b are integers in the range 0 to 25 (as are p , and ci).
To decrypt uniquelyalways a nice feature for a cipher systemwe must
have gcd(a, 26) = 1. Consequently, there are 26.4(26) = 312 affine ciphers for
the English language, where 4 is the Eulerphi function (see the Appendix for
a definition of the 4 function). The decryption function for the affine cipher
is pi = aP1(ci  b ) (mod 26), where aal = 1 (mod 26), that is, u p 1 is the
multiplicative inverse of a,modulo 26.
Affine ciphers are weak for several reasons, but the most obvious problem
is that they have a small keyspace. A ciphertext only attack can be performed
by conducting a brute force search of all 312 possible key pairs ( a , b). This
attack is trivial, provided we can recognize the plaintext when we see it (or,
better yet, automatically test for it).
Simple Substitution Cryptanalysis
Trying all possible keys is known as an exhaustive key search, and this attack
is always an option for Trudy. If there are N possible keys, then Trudy will,
on average, need to try about half of these, that is; N/2 of the keys, before she
can expect to find the correct key. Therefore, the first rule of cryptography
is that any cipher must have a large enough keyspace so that an exhaustive
search is impractical. However, a large keyspace does not ensure that a cipher
is secure. To see that this is the case, we next consider an attack that will
work against any simple substitution cipher and, in the general case, requires
far less work than an exhaustive key search. This attack relies on the fact
that statistical information that is present in the plaintext language “leaks”
through a simple substitution.
Suppose we have a reasonably large ciphertext message generated by a
simple substitution, and we know that the underlying plaintext is English.
Consider the English letter frequency information in Table 1.3, which was
1.4 SELECTED CLASSIC CRYPT0 TOPICS 11
compiled from a 7834letter sample of written English. By simply computing
letter frequency counts on our ciphertext, we can make educated guesses
as to which plaintext letters correspond to some of the ciphertext letters.
For example, the most common ciphertext letter probably corresponds to
plaintext E. We can obtain additional statistical information by making use
of digraphs (pairs of letters) and common trigraphs (triples). This type of
statistical attack on a simple substitution, is very effective. After a few letters
have been guessed correctly, partial words will start to appear and the cipher
should then quickly unravel.
Table 1.3: English Letter Frequencies as Percentages
11
Relative Relative
6.778
N
1.442 7.493
0
B
2.527 1.991
C P
4 0.077
4.800
D
12.15 6.063
R
E
2.132 6.319
F S
2.323 8.999
G T
6.025 2.783
U
H
6.485 0.996
I V
0.102 2.464
J W
0.689
K 0.204
X
4.008 2.157
Y
L
2.566 0.025
M
Vigenere Cipher
Recall that a polyalphabetic substitution cipher uses multiple simple substi
tutions to encrypt a message. The Vigenkre cipher is a classic polyalphabetic
substitution cipher. The World War I1 cipher machines discussed in Chap
ter 2 are more recent examples of polyalphabetic substitutions.
In the Vigenkre cipher, a key of the form K = (ko,k l ; . . . , k n  l ) , where
each ki E {0,1,. . . ,25}, is used to encipher the plaintext. Here, each ki
represents a particular shift of the alphabet. To encrypt a message,
+ ki (mod n )(mod
= Pi 26)
CZ
and to decrypt
CLASSIC CIPHERS
12
For example, suppose K = (12,0,19,7), which corresponds to the keyword
MATH (since M corresponds to a shift of 12, A corresponds to a shift of 0, and
so on). Using this keyword, the the plaintext SECRETMESSAGE is encrypted
as EEVYQTFLESTNQ.
Next, we cryptanalyze the Vigenkre cipher. But first, note that a poly
alphabetic substitution (such as the VigenBre cipher) does not preserve plain
text letter frequencies to the same degree as a monoalphabetic substitution.
Furthermore, if the number of alphabets is large relative to the message size,
the plaintext letter frequencies will not be preserved at all. Therefore, the
generic simple substitution attack discussed above will not work on a poly
alphabetic substitution.
However, the VigenBre cipher is vulnerable to a slightly more sophisticated
statistical attack. To see how this works, first consider a VigenBre cipher with
a small keyword. Suppose that the following ciphertext was created using a
VigenBre cipher with a threelettered keyword:
RLWRV MRLAQ EDUEQ QWGKI LFMFE XZYXA QXGJH FMXKM QWRLA
LKLFE LGWCL SOLMX RLWPI OCVWL SKNIS IMFES JUVAR MFEXZ
(1.1)
CVWUS MJHTC RGRVM RLSZS MREFW XZGRY RLWPI OMYDB SFJCT
CAZYX AQ.
To recover the key and decrypt the message, we can make use of the fact that
the ciphertext is composed of three simple substitutions. To accomplish t,his,
we tabulate the letter frequencies for the sets
{Q, c 4 , ˜ 7 , . . . }, and 52 = {c2,cg, ex.. . . }.
SO= {co,c:˜, . . }, 51 =
cg,.
Doing so, we obtain the results in
where c, is the ith ciphertext letter.
Tables 1.4, 1.5, and 1.6, respectively.
Table 1.4: Letter Frcquericics in So
Letter II R QUKFEYJMLGPCNIZWB
Freauencv110 4 3 1 2 3 2 3 3 4 2 2 4 1 I 1 1 1
Table 1.5: Letter Frequencies in S
1
Letter IL v E w I M x Q K S H R Y C A
Frequency 16 5 4 2 4 4 7 1 1 6 1 2 1 1 1
From the S ciphertext in Table 1.4, we might reasonably guess that
o
ciphertext R corresponds to plaintext E. T, N, 0, R, I. A or S. which gives 11s
1.4 SELECTED CLASSIC C R Y P T 0 TOPICS 13
Table 1.6: Letter Frequencies in S2
WMADQGLFZKC0XSJTY
Letter
Frequency 6 4 5 2 1 3 3 5 4 2 1 3 1 2 1 2 1
candidate values for ko, namely ko E {13,24,4,3,0,9,17,25}. Similarly, for
set S ,
1 ciphertext X niight correspond to plaintext E, T, N, 0, R, I, A or S,
from which we obtain likely values for k l , and from set Sz, ciphertext W
likely correspond to plaintext E, T, N, 0,R, I,A or S. The corresponding likely
keyword letters are tabulated in Table 1.7.
Table 1.7: Likely Keyword Letters
ko k2
k1
N T S
Y E D
E K J
D J I
A G F
0
J P
R X W
Z F E
The conibinations of likely keyword letters in Table 1.7 yield 83 = 2 ™
putative keywords. By testing each of these putative keyword on the first
few letters of the ciphertext, we can easily determine which, if any, is the
actual keyword. For this example, we find that ( k o ,k l , k2) = (24,4,18),
which corresponds t o YES, and the original plaintext is
THE TRUTH IS ALWAYS SOMETHING THAT IS TOLD, NOT
SOMETHING THAT IS KNOWN. IF THERE WERE NO SPEAKING
OR WRITING, THERE WOULD BE NO TRUTH ABOUT ANYTHING.
THERE WOULD ONLY BE WHAT IS.
This attack provides a significant shortcut, as conipared t o trying all possi
ble 263 M 214 keywords.
Knowing the length of the keyword used in a Vigenkre cipher helps greatly
in the cryptanalysis. If the keyword is known, and the message is long enough,
we can simply perform letter frequency counts on the associated sets of ci
phertext t o begin solving for the plaintext. However, it is not so obvious
how to determine the length of an unknown keyword. Next, we consider two
methods for approximating the length of the keyword in a Vigenhre cipher.
CLASSICCZPHER,S
14
Friederich W. Kasiski (18051881) was a major in the East Prussian in
fantry regiment and the author of the cryptologic text Die Geheimschriflen
und die Dechigerkunst. Kasiski developed a test (amazingly, known as the
Kasiski Test), that can sonietimes be used to find the length of a keyword
used in a cipher such as the Vigenkre. It relies on the occasional coincidental
alignment of letter groups in plaintext with the keyword. To attack a periodic
cipher using the Kasiski Test, we find repeated letter groups in the ciphertext
arid tabulate the separations between them. The greatest common divisor of
these separations (or a divisor of it) gives a possible length for the keyword.
For example, suppose we encrypt the plaintext
THECHILDISFATHEROFTHEMAN
with a Vigenkre cipher using the keyword POETRY.The resulting ciphertest is
IVIVYGARMLMYIVIKFDIVIFRL.
Notice that the second Occurrence of the ciphertext letters IVI begins ex
actly 12 letters after the first, and the third occurrence of IVI occurs exactly
six letters after the second. Therefore, it is likely that the length of the
keyword is a divisor of six. In this case, the keyword length is exactly six.
Index of Coincidence
While working at the Riverbank Laboratory, William F. Friedman (1891L
1969) developed the index of coincidence. For a given ciphertext, the index
of coincidence I is defined to be the probability that two randomly selected
letters in the ciphertext represent, t.he same plaintext symbol.
For a given ciphertext, let no, 1 2 1 , . . . ,1225 be the respective letter counts
+ +
of A , B, C, . . . , Z in the ciphertext, and set 7 1 = n o + 111 . . . r125. Then, the
index of coincidence can be computed as
To see why the index of coincidence gives us useful information, first note
that the empirical probability of randomly selecting two identical letters from
a large English plaintext is
25
1=0
where po is the probability of selecting an A, p l is the probability of selecting
a B, and so on, and the values of p , are given in Table 1.3. This implies that
an (English) ciphertext having an index of coincidence I x 0.065 is probably
1.4 SELECTED CLASSIC C R Y P T 0 TOPICS 15
associated with a monoalphabetic substitution cipher, since this statistic will
not change if the letters are simply relabeled (which is the effect of encrypting
with a simple substitution).
The longer and more random a Vigenkre cipher keyword is, the more
evenly the letters are distributed throughout the ciphertext. With a very
long and very random keyword, we would expect to find
(4)
I 2= 1
26 0.03846.
E
Therefore, a ciphertext having I E 0.03846 could be associated with a poly
alphabetic cipher using a large keyword. Note that for any English ciphertext,
the index of coincidence I must satisfy 0.03846 5 I 5 0.065.
The question remains as to how to determine the length of the keyword
of a Vigenkre cipher using the index of coincidence. The main weakness of
the Vigenkre (or any similar periodic cipher) is that two identical charac
ters occurring a distance apart that is a multiple of the key length will be
encrypted identically. In such cryptosystems, the key length k can be ap
proximated by a function involving the index of coincidence I and the length
of the ciphertext R. The following example illustrates this technique.
Suppose an English plaintext containing n letters is encrypted using a
VigenBre cipher, with a keyword of length k , where, for simplicity, we as
sume R is a multiple of k . Now suppose that we arrange the ciphertext
letters into a rectangular array of n / k rows and k columns, from left to right,
top to bottom. If we select two letters from different columns in the array,
this would be similar to choosing from a collection of letters that is uniformly
distributed, since the keyword is moreorless “random”. In this case, the
portion of pairs of identical letters is, approximately,
(i) (): n 2 ( k  1)
0.03846 = 0.03846
2k .
On the other hand, if the two selected letters are from the same column,
this would correspond to choosing from ciphertext having a symbol distribu
tion similar to printed English plaintext, since, in effect, a simple substitution
is applied to each column. In this case, the portion of pairs of identical letters
is approximately
(5) k
n
(f) (f 1) k
0.065 = 0.065 0.065
=

CLASSIC CIPHI3R.S
16
Therefore, the index of coincidence satisfies
(w)
+
n˜(k1)
0.03846 2 : 0.065
k
I%
(3
+ (0.065)(n k)
 O.O3846n(k  1) 

(1.3)
k ( n  1)
The attacker, Trudy, does not know k , but she can solve for k in (1.3) to
obtain
0.02654n
+
k = (0.065  I ) n ( I  0.03846) ˜ (1.4)
Then given n and I , which are easily computed from the ciphertext, Trudy
can approximate k , the number of letters in the keyword of the underlying
Vigenkre cipher.
The index of coincidence was a cryptologic breakthrough, since it can be
used to gain information about polyalphabetic substitution ciphers. Fried
man™s work on the index of coincidence was one of his most important contri
butions to cryptology, and it provided invaluable information to cryptanalysts
during WWII, where polyalphabetic ciphers played a major role.
Hill Cipher
As a final example of a substitution cipher, we consider the Hill cipher, which
was introduced by mathematician Lester Hill in 1929 [67]. The Hill cipher
is interesting since it is a premodern block cipher. The idea behind the Hill
cipher is to create a substitution cipher with an extremely large “alphabet”.
Such a system is more resilient to cryptanalysis that relies on letter frequency
counts and statistical analysis of the plaintext language. However, the cipher
is linear which makes i t vulnerable to a relatively straightforward known
plaintext attack. The description of the Hill cipher requires some elementary
linear algebra; see the Appendix for the necessary background information.
Suppose that Alice wants to send a message to Bob and they have decided
to use the Hill cipher. First, the plaintext is divided into blocks po,pl,pz, . . .,
each consisting of n letters. Alice then chooses an ri x n invertible matrix A,
with the entries reduced modulo 26, which acts as the key. Encryption is
accomplished by computing the ciphertext as ci = Api (mod 26) for each
plaintext block pi. Bob decrypts the message by computing Alci (mod 26),
for each ciphertext block c i , where A™ is the inverse of A , modulo 26.
For example, suppose Alice wants t,o send the plaintext MEETMEHERE, using
[:?
y]
the encryption matrix
A=
1.4 SELECTED CLASSICC R Y P T 0 TOPICS 17
Converting letters t o numbers, Alice finds
MEETMEHERE = (12,4,4,19,12,4,7,4,17,4).
Next, she divides the plaintext into blocks of length two and then represents
each block as a column vector, which yields
To encrypt, Alice computes ci = Api (mod 26) for each column vector p i .
r;]
In this example, the resulting ciphertext is
[;4 [;I
[t2] ,
, c1 = , c2 = [] ,
2
;
co = c3 = c4 =
Converting into letters, we have
(4,22,23,9,4,22,24,19,10,25)
= EWXJEWYTKZ,
which Alice sends to Bob. When Bob receives the ciphertext, he breaks it into
blocks ci of length two and treats these as column vectors. He then decrypts
the message by computing pi = Alci (mod 26) for each ciphertext block c i .
The Hill cipher, with an invertible matrix A (mod 26) and block length n,
can be viewed as a substitution cipher utilizing an alphabet of 26n possible
“letters” and the expected letter frequency distribution in the ciphertext is
far more uniform than that of the plaintext. This makes a ciphertext only
attack generally impractical. However, the Hill cipher is highly vulnerable to
a known plaintext attack.
Suppose that Trudy suspects Alice of using a Hill cipher with an n x n
encryption matrix A . Further, suppose that Trudy can obtain ciphertext
blocks c i , for i = 0 , 1 , . . . , n  1, where each block is of length n, as well as
the corresponding plaintext blocks, that is, pi, for i = 0 , 1 , . . . ,n  1. Then
Trudy may be able t o recover t,he key A as follows: Let P and C be the n x n