
Lecture Notes in Physics
Founding Editors: W. Beiglböck, J. Ehlers, K. Hepp, H. Weidenmüller

Editorial Board
R. Beig, Vienna, Austria
W. Beiglböck, Heidelberg, Germany
W. Domcke, Garching, Germany
B.-G. Englert, Singapore
U. Frisch, Nice, France
F. Guinea, Madrid, Spain
P. Hänggi, Augsburg, Germany
W. Hillebrandt, Garching, Germany
R. L. Jaffe, Cambridge, MA, USA
W. Janke, Leipzig, Germany
H. v. Löhneysen, Karlsruhe, Germany
M. Mangano, Geneva, Switzerland
J.-M. Raimond, Paris, France
M. Salmhofer, Heidelberg, Germany
D. Sornette, Zurich, Switzerland
S. Theisen, Potsdam, Germany
D. Vollhardt, Augsburg, Germany
W. Weise, Garching, Germany
J. Zittartz, Köln, Germany
The Lecture Notes in Physics
The series Lecture Notes in Physics (LNP), founded in 1969, reports new developments
in physics research and teaching quickly and informally, but with a high quality and
the explicit aim to summarize and communicate current knowledge in an accessible way.
Books published in this series are conceived as bridging material between advanced grad-
uate textbooks and the forefront of research and to serve three purposes:
• to be a compact and modern up-to-date source of reference on a well-defined topic
• to serve as an accessible introduction to the field to postgraduate students and
nonspecialist researchers from related areas
• to be a source of advanced teaching material for specialized seminars, courses and
schools
Both monographs and multi-author volumes will be considered for publication. Edited
volumes should, however, consist of a very limited number of contributions only. Pro-
ceedings will not be considered for LNP.

Volumes published in LNP are disseminated both in print and in electronic formats, the
electronic archive being available at springerlink.com. The series content is indexed, ab-
stracted and referenced by many abstracting and information services, bibliographic net-
works, subscription agencies, library networks, and consortia.

Proposals should be sent to a member of the Editorial Board, or directly to the managing
editor at Springer:

Christian Caron
Springer Heidelberg
Physics Editorial Department I
Tiergartenstrasse 17
69121 Heidelberg / Germany
christian.caron@springer.com
C. Kollmitzer
M. Pivk (Eds.)




Applied Quantum
Cryptography




Mario Pivk
Pöckau 171
9601 Arnoldstein
Austria
mpivk@edu.uni-klu.ac.at

Christian Kollmitzer
AIT Austrian Institute of Technology GmbH
Safety & Security Department
Quantum Technologies
Lakeside B01A, 9020 Klagenfurt
Austria
christian.kollmitzer@ait.ac.at




Kollmitzer C., Pivk M. (Eds.), Applied Quantum Cryptography, Lect. Notes Phys. 797
(Springer, Berlin Heidelberg 2010), DOI 10.1007/978-3-642-04831-9




Lecture Notes in Physics ISSN 0075-8450 e-ISSN 1616-6361
ISBN 978-3-642-04829-6 e-ISBN 978-3-642-04831-9
DOI 10.1007/978-3-642-04831-9
Springer Heidelberg Dordrecht London New York

Library of Congress Control Number: 2010920541

© Springer-Verlag Berlin Heidelberg 2010
This work is subject to copyright. All rights are reserved, whether the whole or part of the material is
concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting,
reproduction on microfilm or in any other way, and storage in data banks. Duplication of this publication
or parts thereof is permitted only under the provisions of the German Copyright Law of September 9,
1965, in its current version, and permission for use must always be obtained from Springer. Violations are
liable to prosecution under the German Copyright Law.
The use of general descriptive names, registered names, trademarks, etc. in this publication does not imply,
even in the absence of a specific statement, that such names are exempt from the relevant protective laws
and regulations and therefore free for general use.

Cover design: Integra Software Services Pvt. Ltd., Pondicherry

Printed on acid-free paper

Springer is part of Springer Science+Business Media (www.springer.com)
CK: For my family and Verena
MP: For all those who enrich my life
Foreword




Using the quantum properties of single photons to exchange binary keys between
two partners for subsequent encryption of secret data is an absolutely novel
technology. Only a few years ago quantum cryptography, or better Quantum Key
Distribution, was the domain of basic research laboratories at universities. But
during the last few years things changed. Quantum Key Distribution, or QKD, left
the laboratories and was picked up by more practically oriented teams that worked
hard to develop a practically applicable technology out of the astonishing results of
basic research.
One major milestone toward a QKD technology was a large research and development
project funded by the European Commission that aimed at combining quantum
physics with the complementary technologies necessary to create a technical
solution: electronics, software, and network components were added within the
project SECOQC (Development of a Global Network for Secure Communication
based on Quantum Cryptography), which teamed up expertise at the European level
to arrive at a technology for future cryptography.
Lead-managed by a team at the Austrian Research Centers in Vienna, the practical
application of QKD in a standard optical fiber network was demonstrated, giving
a glimpse of the future of secure communication. Although many steps still have
to be taken to achieve a truly mature technology, the cornerstone for future
secure communication has already been laid. QKD will not be the Holy Grail of
security; it will not be able to solve all problems forevermore. But QKD has the
potential to replace one of the weakest parts of symmetric encryption: the exchange
of the key. It can be proven that the key exchange process cannot be corrupted and
that keys that are generated and exchanged quantum cryptographically will remain
secure forever (as long as some additional conditions are kept).
This book shows the state of the art of Quantum Cryptography and sketches how
it can be implemented in standard communication infrastructure. The growing
vulnerability of sensitive data requires new concepts, and QKD will be a possible
solution to overcome some of today's limitations.

Vienna, Austria Christian Monyk




Acknowledgements




We would like to thank the Austrian Research Centers GmbH, ARC Klagenfurt
and Vienna, for their support of this book. This work was supported by the
EC/IST Integrated Project SECOQC (contract no. 506813). M.S. is grateful to T.
Länger, T. Lorünser, C. Pacher, M. Peev, and A. Poppe for discussion, help, and
assistance in writing Chap. 6. We would like to take this opportunity to express
our gratitude to Roland Potzmann from the Central Institute for Meteorology and
Geodynamics (ZAMG) who provided us with the climate data. Furthermore, a spe-
cial thanks to the people of the University of Vienna for their support. Christian
Kollmitzer would like to give thanks to Mr. Gerald Dissauer for discussions and
explanations of Medical Information Systems. He would also like to give thanks
to Michele Mosca, Norbert Lütkenhaus, and Daniel Gottesman from the University
of Waterloo, Ontario, Canada, and to Takashi Linzbichler from the University of
Applied Science Joanneum, Kapfenberg, Austria, and their students for many hours
of discussion on the “Ring of Trust” model. We want to thank Claus Ascheron for
his work in the making of this book and our reviewers for their fruitful comments.




Contents




1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
C. Kollmitzer

2 Preliminaries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
M. Pivk
2.1 Quantum Information Theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
2.2 Unconditional Secure Authentication . . . . . . . . . . . . . . . . . . . . . . . . . 14
2.3 Entropy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21

3 Quantum Key Distribution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
M. Pivk
3.1 Quantum Channel . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
3.2 Public Channel . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
3.3 QKD Gain . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
3.4 Finite Resources . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46

4 Adaptive Cascade . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
S. Rass and C. Kollmitzer
4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
4.2 Error Correction and the Cascade Protocol . . . . . . . . . . . . . . . . . . . . . 49
4.3 Adaptive Initial Block-Size Selection . . . . . . . . . . . . . . . . . . . . . . . . . 52
4.4 Fixed Initial Block-Size . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
4.5 Dynamic Initial Block-Size . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
4.6 Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
4.7 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68

5 Attack Strategies on QKD Protocols . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
S. Schauer
5.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72


5.2 Attack Strategies in an Ideal Environment . . . . . . . . . . . . . . . . . . . . . 73
5.3 Individual Attacks in a Realistic Environment . . . . . . . . . . . . . . . . 89
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94

6 QKD Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97
M. Suda
6.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97
6.2 QKD Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98
6.3 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 117
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119

7 Statistical Analysis of QKD Networks in Real-Life Environment . . . . . . . . 123
K. Lessiak and J. Pilz
7.1 Statistical Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 123
7.2 Results of the Experiments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 127
7.3 Statistical Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 142
7.4 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 147
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 148

8 QKD Networks Based on Q3P . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 151
O. Maurhart
8.1 QKD Networks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 151
8.2 PPP . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 154
8.3 Q3P . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 155
8.4 Routing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 167
8.5 Transport . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 168
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 170

9 Quantum-Cryptographic Networks from a Prototype to the Citizen . . . . . . 173
P. Schartner and C. Kollmitzer
9.1 The SECOQC Project . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 173
9.2 How to Bring QKD into the “Real” Life . . . . . . . . . . . . . . . . . . . . . . . 176
9.3 Resumee . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 182
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 183

10 The Ring of Trust Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 185
C. Kollmitzer and C. Moesslacher
10.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 185
10.2 Model of the Point of Trust Architecture . . . . . . . . . . . . . . . . . . . . . . 186
10.3 Communication in the Point of Trust Model . . . . . . . . . . . . . . . . . . . 186
10.4 Exemplified Communications . . . . . . . . . . . . . . . . . . . . . . . . . . . 194
10.5 A Medical Information System Based on the Ring of Trust . . . . . . . 204
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 210

Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 211
Chapter 1
Introduction

C. Kollmitzer




Quantum cryptography, or more precisely quantum key distribution (QKD), is a new
technology that currently receives a high level of attention worldwide. The possibility
to exchange information in a provably secure way is a milestone in communication
history. The main problem in QKD is the limited range between the communication
partners Alice and Bob. Several experiments have shown that the distance
between Alice and Bob can be extended. Some of these experiments used optical
fibers; others were based on free-space technologies. Beyond that, it is now possible
to build communication networks based on QKD. Thus, QKD is no longer restricted to
point-to-point connections; modern communication structures can be built with it.
The first fully functional QKD-based network was presented in October 2008 in
Vienna, Austria. It acted as the base layer for a video conferencing network,
which connected several parts of the city. Not just one QKD technology was used:
five different systems were deployed. A single communication could use several of
them, invisibly to the user.
This book covers the following aspects:
First, fundamental techniques are discussed which form the basis for all further
concepts, in particular for the QKD systems introduced here. The individual steps
of the communication setup are shown in detail, especially sifting, reconciliation,
error correction, and privacy amplification.
Regarding error correction, the Adaptive Cascade protocol, an improvement of
the original error correction protocol Cascade, is introduced. It enhances the classical
Cascade approach and supplements it with a method to determine the optimal
initial block size and hence improve its efficiency.
In order to ensure the security of communication systems, different attack strategies
must be examined. Besides the classical attack strategies, QKD systems open up a
range of new aspects, which are also presented.


C. Kollmitzer (B)
Safety & Security Department, Quantum Technologies, AIT Austrian Institute of Technology
GmbH, Lakeside B01A 9020 Klagenfurt, Austria,
christian.kollmitzer@ait.ac.at; http://www.ait.ac.at




Kollmitzer, C.: Introduction. Lect. Notes Phys. 797, 1–2 (2010)
© Springer-Verlag Berlin Heidelberg 2010
DOI 10.1007/978-3-642-04831-9_1

After that we present in detail the current QKD systems which were used in the
SECOQC network project of the European Union and were part of the first QKD-based
network in October 2008 in Vienna, Austria.
Although QKD systems have been used for years within different experimental
setups, many of these experiments took place in a laboratory. Due to the
deployment of the SECOQC network, it was possible to collect data from a long-term
setup in an urban environment. The influence of environmental factors such as
temperature and humidity could be recorded for the first time and subsequently
evaluated statistically. The results are discussed in detail.
QKD systems are designed as an enhancement to existing communication networks.
Thus, their integration into current communication systems is a crucial factor.
Special network protocols have been developed, among which the Q3P
protocol is of particular importance; it is presented in detail.
A communication network is one of the key developments on the way to field
use. The appropriate fundamentals are presented. Apart from that, we deal with the
users themselves and their benefit in using QKD networks. In particular, the use of
QKD-generated keys with current communication equipment such as the iPhone is
described.
Because of the range limitation of QKD systems, the development of global networks
is one of the most intensively examined research areas. A possible solution
based on a network of trusted communication centers is also presented. The main
advantage of this model is the possibility to generate keys on demand, so that the
user does not need to store them in a relatively insecure environment.
We hope that this book can promote interest in QKD systems and the new
possibilities associated with them. We expect that these new options will be
the subject of international research activities over the coming years and that the
results will have a massive impact on the communication structures of tomorrow.
Chapter 2
Preliminaries

M. Pivk




This chapter discusses the basics necessary for the following chapters. All fields are
treated only briefly; some areas, such as quantum information theory, would deserve
a much deeper explanation, but that is out of the scope of this chapter.


2.1 Quantum Information Theory
In this section a short introduction to quantum information is given. For a detailed
explanation we refer to the book of Nielsen and Chuang [4], where this topic is
treated in depth.


2.1.1 Quantum Bits

Since Shannon and the beginning of information theory, the bit has been the basic
unit of classical information. The states of a bit are either 0 or 1. In accordance with
this classical concept, quantum information has the qubit (short for quantum bit).
As for the classical bit, two basis states are possible, |0⟩ and |1⟩. This notation
'|·⟩' is called the Dirac notation (or ket) and is the standard notation for states in
quantum mechanics. The major difference to the classical bit, which accepts only
0 or 1, is that a qubit also allows states in between |0⟩ and |1⟩, which are called
superpositions. Let us denote this by
\[ |\psi\rangle = \alpha|0\rangle + \beta|1\rangle, \tag{2.1} \]
where α, β ∈ ℂ. Because these factors are complex numbers, the state of a qubit
can be described as a vector in a two-dimensional complex vector space ℂ², also
called Hilbert space. The states |0⟩ and |1⟩ form the computational basis and are

M. Pivk (B)
Safety & Security Department, Quantum Technologies, AIT Austrian Institute of Technology
GmbH, Lakeside B01A, 9020 Klagenfurt, Austria, mario.pivk@ait.ac.at;
http://www.ait.ac.at



Pivk, M.: Preliminaries. Lect. Notes Phys. 797, 3–21 (2010)
© Springer-Verlag Berlin Heidelberg 2010
DOI 10.1007/978-3-642-04831-9_2

orthonormal (see Definition 2.4) to each other, e.g.,
\[ |0\rangle = \begin{pmatrix} 1 \\ 0 \end{pmatrix} \quad \text{and} \quad |1\rangle = \begin{pmatrix} 0 \\ 1 \end{pmatrix}. \]
Since a qubit state is a unit vector, meaning its length is normalized to 1, the following
equation must be fulfilled by the scalars α, β:
\[ |\alpha|^2 + |\beta|^2 = 1. \tag{2.2} \]

Using this fact we can rewrite the state of a qubit as
\[ |\psi\rangle = \cos\frac{\theta}{2}\,|0\rangle + e^{i\varphi}\sin\frac{\theta}{2}\,|1\rangle, \tag{2.3} \]
where θ, φ are real numbers and define a point on a sphere called the Bloch sphere
(see Fig. 2.1).

Fig. 2.1 Bloch sphere representation of a qubit: the state |ψ⟩ sits on the sphere at polar angle θ from the z axis, with |0⟩ and |1⟩ at the poles and x, y, z as the coordinate axes


The measurement of qubits is a subtle issue. In the special case when α or β is 0,
the mapping to the classical bit will result in 1 or 0, respectively, as expected. But
what happens if the qubit is in a genuine superposition, i.e., α, β ≠ 0? Depending on
the scalars, the qubit will be measured as 1 with a certain probability or as 0 with
the complementary probability. Since the scalars fulfill Eq. 2.2, the probability for a
qubit to be measured as 0 is |α|² and as 1 it is |β|². We will see this in detail in Sect. 2.1.3.
Furthermore, in quantum mechanics the scalars α and β are also called the amplitudes
of the states |0⟩ and |1⟩, respectively. But there exists a second notion describing
a qubit, the phase. Consider the state e^{iφ}|ψ⟩, where |ψ⟩ is a state vector and φ is
a real number. We say that the state e^{iφ}|ψ⟩ is equal to |ψ⟩ up to the global phase
factor e^{iφ}. From a statistical point of view the measurements of these two states
are the same, as you will see in Sect. 2.1.2.

Another kind of phase is the relative phase. Consider these two states:
\[ |+\rangle = \frac{1}{\sqrt{2}}\left(|0\rangle + |1\rangle\right) \quad \text{and} \quad |-\rangle = \frac{1}{\sqrt{2}}\left(|0\rangle - |1\rangle\right). \tag{2.4} \]
In the state |+⟩ the amplitude of |1⟩ is 1/√2. In the state |−⟩ the amplitude has the same
magnitude but a different sign. We say that two amplitudes α₁, α₂ of some states
differ by a relative phase if there is a real φ such that α₁ = e^{iφ}α₂. In contrast to
the global phase, where both amplitudes of the state differ by the factor e^{iφ},
a relative phase affects only one amplitude by the factor e^{iφ}.
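As a small illustration of the amplitudes and measurement probabilities just discussed, the following Python sketch (assuming NumPy is available; all variable names are our own) builds |+⟩ and |−⟩, checks the normalization of Eq. 2.2, and shows that a global phase leaves the probabilities unchanged.

import numpy as np

# Computational basis states |0> and |1> as column vectors
ket0 = np.array([1, 0], dtype=complex)
ket1 = np.array([0, 1], dtype=complex)

# Superpositions |+> and |-> from Eq. (2.4)
plus = (ket0 + ket1) / np.sqrt(2)
minus = (ket0 - ket1) / np.sqrt(2)

for name, psi in [("+", plus), ("-", minus)]:
    alpha, beta = psi[0], psi[1]
    norm = abs(alpha) ** 2 + abs(beta) ** 2        # Eq. (2.2): must equal 1
    p0, p1 = abs(alpha) ** 2, abs(beta) ** 2       # probabilities of measuring 0 and 1
    print(name, norm, p0, p1)                      # -> 1.0, 0.5, 0.5

# A global phase factor e^{i*phi} does not change the probabilities
phi = 0.7
psi_global = np.exp(1j * phi) * plus
print(abs(psi_global[0]) ** 2, abs(psi_global[1]) ** 2)   # still 0.5, 0.5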



2.1.2 Linear Operators

The state change of qubits is done by linear operators. For this, a function A is
used, taking vectors from V to W (V and W are vector spaces over ℂ). The most
convenient way to describe such a function is the matrix representation. If the matrix A
has m rows and n columns and this matrix is multiplied with a vector |v⟩ ∈ ℂⁿ, we
get a new vector |w⟩ ∈ ℂᵐ as a result. The requirement for such a matrix A is to fulfill
the linearity equation [4]
\[ A\left(\sum_i a_i |v_i\rangle\right) = \sum_i a_i A|v_i\rangle. \tag{2.5} \]
Let A : V → W be a linear operator, let |v₁⟩, ..., |vₙ⟩ be a basis of V and
|w₁⟩, ..., |wₘ⟩ a basis of W. Then there exist complex numbers A₁ⱼ, ..., Aₘⱼ with
\[ A|v_j\rangle = \sum_i A_{ij} |w_i\rangle, \qquad 1 \le i \le m,\ 1 \le j \le n, \tag{2.6} \]
which form the matrix representation of the operator A.
Conversely, an n × m matrix can be understood as a linear operator in the opposite
direction, sending vectors from the vector space W to the vector space V by
matrix multiplication.
We use a notation which differs from the usual notation in linear algebra.
Table 2.1 lists some frequently used symbols in quantum mechanics. As we know, a
vector can be represented as a sum of the vectors of the computational basis.
For simplification we take the computational basis
\[ v_1 = \begin{pmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{pmatrix}, \quad v_2 = \begin{pmatrix} 0 \\ 1 \\ \vdots \\ 0 \end{pmatrix}, \quad \ldots, \quad v_n = \begin{pmatrix} 0 \\ 0 \\ \vdots \\ 1 \end{pmatrix}, \]
such that a vector |v⟩ = Σᵢ aᵢ|vᵢ⟩ can also be written as
\[ |v\rangle = \begin{pmatrix} a_1 \\ \vdots \\ a_n \end{pmatrix}. \]
If another computational basis is used, it is written explicitly.

Table 2.1 Summary of some standard quantum mechanical notation

Notation      Description
z*            Complex conjugate of the complex number z, e.g., (1 + i)* = 1 − i
|v⟩           Vector, also known as a ket: |v⟩ = Σᵢ aᵢ|vᵢ⟩
⟨v|           Vector dual to |v⟩, also known as a bra: ⟨v| = Σᵢ aᵢ*⟨vᵢ|
λ|v⟩          Multiplication by a scalar λ: λ|v⟩ = Σᵢ λaᵢ|vᵢ⟩
⟨v|w⟩         Inner product between the vectors |v⟩ and |w⟩
|v⟩⟨w|        Outer product of |v⟩ and |w⟩
|v⟩ ⊗ |w⟩     Tensor product of |v⟩ and |w⟩
A*            Complex conjugate of the matrix A
Aᵀ            Transpose of the matrix A
A†            Hermitian conjugate or adjoint of the matrix A, A† = (Aᵀ)*
⟨φ|A|ψ⟩       Inner product between |φ⟩ and A|ψ⟩


2.1.2.1 The Pauli Matrices
Four extremely useful matrices are the Pauli matrices. These are 2 × 2 matrices
and represent some frequently needed operations on qubits. The matrices are
\[ X = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad Y = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \quad Z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}. \tag{2.7} \]
The fourth matrix is the identity matrix (I). The Pauli operators X and Z are also
known as the bit flip and phase flip operators. If we apply the X operation on a qubit we
see that |0⟩ changes to |1⟩ and vice versa, i.e.,
\[ X|0\rangle = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}\begin{pmatrix} 1 \\ 0 \end{pmatrix} = \begin{pmatrix} 0 \\ 1 \end{pmatrix} \quad \text{and} \quad X|1\rangle = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}\begin{pmatrix} 0 \\ 1 \end{pmatrix} = \begin{pmatrix} 1 \\ 0 \end{pmatrix}. \]
The Z operator is called the phase flip operator because it flips the sign of the amplitude of |1⟩, i.e.,
\[ Z|+\rangle = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}\begin{pmatrix} \tfrac{\sqrt{2}}{2} \\ \tfrac{\sqrt{2}}{2} \end{pmatrix} = \begin{pmatrix} \tfrac{\sqrt{2}}{2} \\ -\tfrac{\sqrt{2}}{2} \end{pmatrix} \quad \text{and} \quad Z|-\rangle = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}\begin{pmatrix} \tfrac{\sqrt{2}}{2} \\ -\tfrac{\sqrt{2}}{2} \end{pmatrix} = \begin{pmatrix} \tfrac{\sqrt{2}}{2} \\ \tfrac{\sqrt{2}}{2} \end{pmatrix}. \]
A possibility to illustrate Y is to multiply the matrix by the imaginary unit i,
so that we deal only with a real-valued matrix. Thus, the reformulated version of
Y is
\[ iY = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}. \]
The iY operator performs both flips, a bit flip and a phase flip, since
\[ iY = ZX = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} = i\begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}. \]
Therefore, when the iY operator is applied to the states |0⟩ and |1⟩ we get
\[ iY|0\rangle = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}\begin{pmatrix} 1 \\ 0 \end{pmatrix} = \begin{pmatrix} 0 \\ -1 \end{pmatrix} \quad \text{and} \quad iY|1\rangle = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}\begin{pmatrix} 0 \\ 1 \end{pmatrix} = \begin{pmatrix} 1 \\ 0 \end{pmatrix}. \]
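These relations can be checked numerically in a few lines of Python (a sketch assuming NumPy; variable names are ours):

import numpy as np

X = np.array([[0, 1], [1, 0]], dtype=complex)   # bit flip
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)  # phase flip

ket0 = np.array([1, 0], dtype=complex)
ket1 = np.array([0, 1], dtype=complex)
plus = (ket0 + ket1) / np.sqrt(2)

print(X @ ket0)                     # -> |1>
print(Z @ plus)                     # -> |->
print(np.allclose(1j * Y, Z @ X))   # iY = ZX -> True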


2.1.2.2 Inner Products
An inner product (or scalar product) ⟨v|w⟩ (written (|v⟩, |w⟩) in the usual linear-algebra
notation) is a function which takes as input two vectors |v⟩ and |w⟩ from a vector
space V and produces a complex number as output. For example, the inner product
of two n-dimensional vectors over the field of complex numbers is defined as
\[ \langle v|w\rangle = \sum_i a_i^* b_i = \begin{pmatrix} a_1^* & \cdots & a_n^* \end{pmatrix} \cdot \begin{pmatrix} b_1 \\ \vdots \\ b_n \end{pmatrix}. \tag{2.8} \]
The inner product satisfies the following requirements:

1. It is linear in the second argument,
\[ \left(|v\rangle, \sum_i \lambda_i |w_i\rangle\right) = \sum_i \lambda_i \left(|v\rangle, |w_i\rangle\right). \]
2. ⟨v|w⟩ = ⟨w|v⟩*.
3. ⟨v|v⟩ ≥ 0 with equality if and only if |v⟩ = 0.

In the following, some definitions in connection with the inner product are given.

Definition 2.1 Let V be a set of vectors over ℂⁿ. Two vectors |v⟩, |w⟩ ∈ V are
orthogonal if their inner product is 0.

Definition 2.2 Let V be a set of vectors over ℂⁿ. The norm of a vector |v⟩ ∈ V is
defined by ‖|v⟩‖ = √⟨v|v⟩. The norm of a vector is often understood as its length
or size.

Definition 2.3 Let V be a set of vectors over ℂⁿ. A vector |v⟩ ∈ V is called
a unit vector, or normalized, if ‖|v⟩‖ = 1. Normalizing a vector means
dividing it by its norm:
\[ \left\| \frac{|v\rangle}{\||v\rangle\|} \right\| = 1. \]

Definition 2.4 Let V be a set of vectors over ℂⁿ. A subset of vectors |vᵢ⟩ ∈ V
is called orthonormal if each vector |vᵢ⟩ is a unit vector and distinct vectors are
orthogonal, ⟨vᵢ|vⱼ⟩ = 0 for i, j = 1, ..., n, i ≠ j.

The computational basis of a vector space must satisfy this last definition of
orthonormality. Those vectors form a spanning set for the vector space, and any
vector of this space can be written as a linear combination of them.
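As a quick numerical check of Definitions 2.1–2.4, the Python sketch below (assuming NumPy) verifies that |+⟩ and |−⟩ are unit vectors and mutually orthogonal:

import numpy as np

ket0 = np.array([1, 0], dtype=complex)
ket1 = np.array([0, 1], dtype=complex)
plus = (ket0 + ket1) / np.sqrt(2)
minus = (ket0 - ket1) / np.sqrt(2)

inner = lambda v, w: np.vdot(v, w)       # <v|w>; np.vdot conjugates the first argument

print(abs(inner(plus, plus)))            # norm squared of |+> -> 1.0 (unit vector)
print(abs(inner(minus, minus)))          # -> 1.0
print(inner(plus, minus))                # <+|-> -> 0 (orthogonal)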

2.1.2.3 Outer Products
The outer product of two vectors is the counterpart of the inner product.
In contrast to the inner product, which results in a single complex value, the outer
product yields a matrix:
\[ |v\rangle\langle w| = A_{i,j} = \begin{pmatrix} a_1 \\ \vdots \\ a_m \end{pmatrix} \cdot \begin{pmatrix} b_1^* & \cdots & b_n^* \end{pmatrix} = \begin{pmatrix} a_1 b_1^* & \cdots & a_1 b_n^* \\ \vdots & \ddots & \vdots \\ a_m b_1^* & \cdots & a_m b_n^* \end{pmatrix}. \tag{2.9} \]
The outer product representation is a useful way of representing linear operators
which makes use of the inner product. Let |v⟩ be a vector in an inner product space
V and |w⟩ be a vector in an inner product space W. Define |w⟩⟨v| to be the linear
operator from V to W acting as
\[ (|w\rangle\langle v|)(|v'\rangle) = |w\rangle\langle v|v'\rangle = \langle v|v'\rangle\,|w\rangle. \tag{2.10} \]
The equation admits at least two interpretations. On the one hand, the vector |v′⟩
is mapped by the matrix to a vector which lies in W; on the other hand, it is simply
the vector |w⟩ multiplied by the complex value ⟨v|v′⟩.
One application of the outer product notation comes from an important
result known as the completeness relation for orthonormal vectors. Let |vᵢ⟩ be an
orthonormal basis for the vector space V. Then the following equation must be fulfilled:
\[ \sum_i |v_i\rangle\langle v_i| = I. \tag{2.11} \]

2.1.2.4 Tensor Products
The tensor product is an operation to create a larger vector space from two smaller
vector spaces. Take two vector spaces V and W of dimensions m and n, respectively.
Then V ⊗ W is an mn-dimensional vector space, whose elements are linear
combinations of tensor products of elements |v⟩ ∈ V and |w⟩ ∈ W.
For example, the tensor product of the vectors (1, 2) and (3, 4) is the vector
\[ \begin{pmatrix} 1 \\ 2 \end{pmatrix} \otimes \begin{pmatrix} 3 \\ 4 \end{pmatrix} = \begin{pmatrix} 1 \cdot 3 \\ 1 \cdot 4 \\ 2 \cdot 3 \\ 2 \cdot 4 \end{pmatrix} = \begin{pmatrix} 3 \\ 4 \\ 6 \\ 8 \end{pmatrix}. \]
In the previous example we performed the operation on vectors, but the
tensor product can also be applied to the linear operators of vector spaces. Assume
A : V → V and B : W → W; then A ⊗ B : V ⊗ W → V ⊗ W. Suppose A is an
m by n matrix and B is a p by q matrix. Then we have the matrix representation
\[ A \otimes B \equiv \begin{pmatrix} A_{11}B & A_{12}B & \cdots & A_{1n}B \\ A_{21}B & A_{22}B & \cdots & A_{2n}B \\ \vdots & \vdots & \ddots & \vdots \\ A_{m1}B & A_{m2}B & \cdots & A_{mn}B \end{pmatrix}, \tag{2.12} \]
an mp by nq matrix in which each entry A_{ij}B denotes a p by q submatrix. For example,
the tensor product of the Pauli matrices X and Y is
\[ X \otimes Y = \begin{pmatrix} 0 \cdot Y & 1 \cdot Y \\ 1 \cdot Y & 0 \cdot Y \end{pmatrix} = \begin{pmatrix} 0 & 0 & 0 & -i \\ 0 & 0 & i & 0 \\ 0 & -i & 0 & 0 \\ i & 0 & 0 & 0 \end{pmatrix}. \]
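The tensor product of vectors and matrices corresponds to the Kronecker product, so the two examples above can be reproduced with a short Python sketch (assuming NumPy):

import numpy as np

v = np.array([1, 2])
w = np.array([3, 4])
print(np.kron(v, w))        # -> [3 4 6 8]

X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
print(np.kron(X, Y))        # the 4x4 matrix X tensor Y from the example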


2.1.3 Quantum Measurement

This section provides ways of describing the effects of measurements on quantum
systems, with reference to the four postulates in [4]. Before we can measure a quantum
system we have to set up the arena in which quantum mechanics takes place.
Postulate 1 of [4]: A complex vector space with inner product (also called Hilbert
space) is associated with any isolated physical system. This is also known as the state
space of the system. The unit vectors of the system's state space (state vectors)
span the complete system.
To get more information about a particular classical system we would simply measure it.
Not so in quantum mechanics: here we cannot measure what the state space of the
system is, nor can we tell what the state vector of that system is. The simplest and

most important system is the qubit, described in Sect. 2.1.1. The next postulate
describes how states change with time.
Postulate 2 of [4]: The evolution of a closed quantum system is described by a
unitary transformation. That is, the state |ψ⟩ of the system at time t₁ is related to
the state |ψ′⟩ of the system at time t₂ by a unitary operator U which depends only
on the times t₁ and t₂:
\[ |\psi'\rangle = U|\psi\rangle. \tag{2.13} \]
While quantum mechanics does not tell us the state space and quantum state of
a system, it does assure us that the change in any closed quantum system is described
by such unitary operators U. Operators of this kind we have already seen in Sect. 2.1.2.
Having set up the basics, we can continue with the measurement.
Postulate 3 of [4]: Quantum measurements are described by a collection {Mₘ} of
measurement operators. These are operators acting on the state space of the system
being measured. The index m refers to the measurement outcomes that may occur
in the experiment. If the state of the quantum system is |ψ⟩ immediately before the
measurement, then the probability that result m occurs is given by
\[ p(m) = \langle\psi| M_m^\dagger M_m |\psi\rangle, \tag{2.14} \]
and the state of the system after the measurement is
\[ \frac{M_m|\psi\rangle}{\sqrt{\langle\psi| M_m^\dagger M_m |\psi\rangle}}. \tag{2.15} \]
The measurement operators satisfy the completeness equation
\[ \sum_m M_m^\dagger M_m = I. \tag{2.16} \]
The completeness equation expresses the fact that the probabilities sum to 1:
\[ \sum_m p(m) = \sum_m \langle\psi| M_m^\dagger M_m |\psi\rangle = 1. \tag{2.17} \]
There are different types of measurements, but the most important in our case
is the measurement in the computational basis. We know that |0⟩ and |1⟩
form a computational basis for the two-dimensional complex vector space (the space of
qubits). As said in a previous section, we can map these states onto the states 0 and
1 of a classical bit during measurement. Now we define two measurement operators
M₀, M₁:
\[ M_0 = |0\rangle\langle 0| = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} \quad \text{and} \quad M_1 = |1\rangle\langle 1| = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}. \tag{2.18} \]

Observe that these operators satisfy M₀† = M₀, M₁† = M₁ and M₀² = M₀,
M₁² = M₁. Consider a measurement on a qubit in the state |ψ⟩ = α|0⟩ + β|1⟩. If we use the
operator M₀ on the qubit we obtain the probability that the result is 0:
\[ \begin{aligned} p(0) &= \langle\psi| M_0^\dagger M_0 |\psi\rangle = \langle\psi| M_0 |\psi\rangle \\ &= \begin{pmatrix} \alpha^* & \beta^* \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} \begin{pmatrix} \alpha \\ \beta \end{pmatrix} = \begin{pmatrix} \alpha^* & \beta^* \end{pmatrix} \begin{pmatrix} \alpha \\ 0 \end{pmatrix} \\ &= \alpha^*\alpha + 0 = |\alpha|^2. \end{aligned} \tag{2.19} \]

Similarly, we get the probability p(1) = |β|² for the measurement result 1.
The state of the system after the measurement is
\[ |\psi'\rangle = \frac{M_0|\psi\rangle}{\sqrt{\langle\psi| M_0^\dagger M_0 |\psi\rangle}} = \frac{\begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}\begin{pmatrix} \alpha \\ \beta \end{pmatrix}}{\sqrt{|\alpha|^2}} = \frac{\begin{pmatrix} \alpha \\ 0 \end{pmatrix}}{|\alpha|} = \frac{\alpha|0\rangle}{|\alpha|} = \frac{\alpha}{|\alpha|}\,|0\rangle \tag{2.20} \]
if the result was 0. Analogously, the state is
\[ |\psi'\rangle = \frac{\beta}{|\beta|}\,|1\rangle \tag{2.21} \]
for the measurement result 1.
Based on Eq. 2.4, if the qubit is in the specific state |+⟩ we have α = β = 1/√2,
and using Eq. 2.14 we get 0 as well as 1 with probability
\[ p(0) = p(1) = |\alpha|^2 = |\beta|^2 = \left(\frac{1}{\sqrt{2}}\right)^2 = \frac{1}{2}. \tag{2.22} \]
Due to Eq. 2.15, after the measurement the system's state is
\[ \frac{\alpha}{|\alpha|}\,|0\rangle = \frac{1/\sqrt{2}}{1/\sqrt{2}}\,|0\rangle = |0\rangle \quad \text{or} \quad \frac{\beta}{|\beta|}\,|1\rangle = \frac{1/\sqrt{2}}{1/\sqrt{2}}\,|1\rangle = |1\rangle, \tag{2.23} \]
respectively. We get the same results for the state |−⟩.

To perform a correct measurement for such states, we cannot use the computational
basis {|0⟩, |1⟩}. Instead we have to use the basis {|+⟩, |−⟩} (defined
in Eq. 2.4). We have seen that these states are orthonormal to each other, since
⟨+|+⟩ = ⟨−|−⟩ = 1 and ⟨+|−⟩ = 0, so we can use them as a computational basis.
As before, we map the two states |+⟩, |−⟩ onto classical bits, e.g., |+⟩ → 0 and
|−⟩ → 1. Thus, the two operators are
\[ M_0 = |+\rangle\langle+| = \frac{1}{2}\begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix} \quad \text{and} \quad M_1 = |-\rangle\langle-| = \frac{1}{2}\begin{pmatrix} 1 & -1 \\ -1 & 1 \end{pmatrix}. \tag{2.24} \]
Using these measurement operators and the states |+⟩ and |−⟩ of Eq. 2.4 we get
the probability for the result 0,
\[ p(0) = \langle+| M_0^\dagger M_0 |+\rangle = \langle+|+\rangle\langle+|+\rangle = 1 \cdot 1 = 1, \tag{2.25} \]
and for the other case
\[ p(1) = \langle+| M_1^\dagger M_1 |+\rangle = \langle+|-\rangle\langle-|+\rangle = 0 \cdot 0 = 0, \tag{2.26} \]
and the state after the measurement is
\[ \frac{M_0|+\rangle}{\sqrt{\langle+| M_0^\dagger M_0 |+\rangle}} = \frac{|+\rangle\langle+|+\rangle}{\sqrt{\langle+|+\rangle\langle+|+\rangle}} = \frac{1}{1}\cdot|+\rangle = |+\rangle. \tag{2.27} \]
As the probability for the result 0 was 1, the state |+⟩ is preserved even after the
measurement.
Conversely, measuring the states |0⟩, |1⟩ of the computational basis {|0⟩, |1⟩} in the
basis {|+⟩, |−⟩} yields the analogous probabilities:
\[ \begin{aligned} p(0) &= \langle 0| M_0^\dagger M_0 |0\rangle = \langle 0|+\rangle\langle+|0\rangle = \frac{1}{\sqrt{2}} \cdot \frac{1}{\sqrt{2}} = \frac{1}{2}, \\ p(1) &= \langle 0| M_1^\dagger M_1 |0\rangle = \langle 0|-\rangle\langle-|0\rangle = \frac{1}{\sqrt{2}} \cdot \frac{1}{\sqrt{2}} = \frac{1}{2}, \\ p(0) &= \langle 1| M_0^\dagger M_0 |1\rangle = \langle 1|+\rangle\langle+|1\rangle = \frac{1}{\sqrt{2}} \cdot \frac{1}{\sqrt{2}} = \frac{1}{2}, \\ p(1) &= \langle 1| M_1^\dagger M_1 |1\rangle = \langle 1|-\rangle\langle-|1\rangle = \left(-\frac{1}{\sqrt{2}}\right) \cdot \left(-\frac{1}{\sqrt{2}}\right) = \frac{1}{2}, \end{aligned} \tag{2.28} \]
and the state after the measurement is

\[ \begin{aligned} \frac{M_0|0\rangle}{\sqrt{\langle 0| M_0^\dagger M_0 |0\rangle}} &= \frac{|+\rangle\langle+|0\rangle}{\sqrt{\langle 0|+\rangle\langle+|0\rangle}} = \frac{1/\sqrt{2}}{\sqrt{1/2}}\,|+\rangle = |+\rangle \\ \text{or}\qquad \frac{M_1|0\rangle}{\sqrt{\langle 0| M_1^\dagger M_1 |0\rangle}} &= \frac{|-\rangle\langle-|0\rangle}{\sqrt{\langle 0|-\rangle\langle-|0\rangle}} = \frac{1/\sqrt{2}}{\sqrt{1/2}}\,|-\rangle = |-\rangle \\ \text{and}\qquad \frac{M_0|1\rangle}{\sqrt{\langle 1| M_0^\dagger M_0 |1\rangle}} &= \frac{|+\rangle\langle+|1\rangle}{\sqrt{\langle 1|+\rangle\langle+|1\rangle}} = \frac{1/\sqrt{2}}{\sqrt{1/2}}\,|+\rangle = |+\rangle \\ \text{or}\qquad \frac{M_1|1\rangle}{\sqrt{\langle 1| M_1^\dagger M_1 |1\rangle}} &= \frac{|-\rangle\langle-|1\rangle}{\sqrt{\langle 1|-\rangle\langle-|1\rangle}} = \frac{-1/\sqrt{2}}{\sqrt{1/2}}\,|-\rangle = (-1)\cdot|-\rangle. \end{aligned} \tag{2.29} \]

In Sect. 2.1.1 we said that if qubits differ only by a global phase factor, they have the
same statistical properties, and so we consider them to be the same. Thus, in the last
line of Eq. 2.29 we can neglect the factor −1.
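These measurement rules are easy to verify numerically; the following Python sketch (assuming NumPy; the helper name prob is ours) computes p(0) and p(1) for a measurement of |+⟩ in both bases.

import numpy as np

ket0 = np.array([1, 0], dtype=complex)
ket1 = np.array([0, 1], dtype=complex)
plus = (ket0 + ket1) / np.sqrt(2)
minus = (ket0 - ket1) / np.sqrt(2)

def prob(M, psi):
    # p(m) = <psi| M^dagger M |psi>, Eq. (2.14)
    return np.real(psi.conj() @ (M.conj().T @ M) @ psi)

# Measurement in the computational basis {|0>, |1>}
M0 = np.outer(ket0, ket0.conj())
M1 = np.outer(ket1, ket1.conj())
print(prob(M0, plus), prob(M1, plus))    # -> 0.5, 0.5  (Eq. 2.22)

# Measurement in the basis {|+>, |->}
N0 = np.outer(plus, plus.conj())
N1 = np.outer(minus, minus.conj())
print(prob(N0, plus), prob(N1, plus))    # -> 1.0, 0.0  (Eqs. 2.25 and 2.26)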


2.1.4 The No-Cloning Theorem

Is it possible to make a copy of an unknown quantum state? The answer is no. In
[10] the no-cloning theorem was presented for the first time.
Suppose we have a quantum machine with two slots labeled A and B. Slot A is
the data slot and starts out in a quantum state |ψ⟩. We do not know which state
it holds. The goal is to copy the state into slot B, the target slot. We assume that
the target slot starts out in some independent state |s⟩. Thus the initial state of the
copying machine is
\[ |\psi\rangle \otimes |s\rangle. \tag{2.30} \]
Some unitary evolution U now effects the copying procedure, ideally,
\[ |\psi\rangle \otimes |s\rangle \;\xrightarrow{U}\; U\left(|\psi\rangle \otimes |s\rangle\right) = |\psi\rangle \otimes |\psi\rangle. \tag{2.31} \]
Suppose this copying procedure works for two particular states, |ψ⟩ and |φ⟩.
Then we have
\[ U\left(|\psi\rangle \otimes |s\rangle\right) = |\psi\rangle \otimes |\psi\rangle, \tag{2.32} \]
\[ U\left(|\varphi\rangle \otimes |s\rangle\right) = |\varphi\rangle \otimes |\varphi\rangle. \tag{2.33} \]
Taking the inner product of these two equations gives
\[ \langle\psi|\varphi\rangle = \left(\langle\psi|\varphi\rangle\right)^2. \tag{2.34} \]
But x = x² has only two solutions, x = 0 and x = 1, so either |ψ⟩ = |φ⟩ or
|ψ⟩ and |φ⟩ are orthogonal. If a cloning device existed, it could only clone states
which are orthogonal to one another, and therefore a general quantum cloning device
cannot exist. For example, if we have qubits in the states |ψ⟩ and |0⟩, where |ψ⟩ is
neither |0⟩ nor |1⟩, it is impossible for a quantum cloner to copy them, since these
states are neither equal nor orthogonal.
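As a concrete instance of Eq. 2.34 (our own worked example), take |ψ⟩ = |+⟩ and |φ⟩ = |0⟩:
\[ \langle\psi|\varphi\rangle = \langle +|0\rangle = \frac{1}{\sqrt{2}}\bigl(\langle 0| + \langle 1|\bigr)|0\rangle = \frac{1}{\sqrt{2}}, \qquad \left(\langle +|0\rangle\right)^2 = \frac{1}{2} \neq \frac{1}{\sqrt{2}}, \]
so no single unitary U can clone both |+⟩ and |0⟩.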



2.2 Unconditional Secure Authentication

QKD communication uses two channels: the quantum channel and the classical
public channel. On the public channel we have the problem that the adversary
Eve can mount a man-in-the-middle attack, so authentication of the communication
is essential. Usually authentication is done by public key methods,
e.g., RSA [6] or DSS [3], but their security is only computational. QKD's security
is unconditional; by using such an authentication scheme we would reduce it to
computational security as well. Therefore unconditionally secure authentication
schemes are necessary for QKD.
Next we present universal hashing, which can be used as an authentication
scheme where the authenticating parties hold a pre-shared secret. This was first
presented by Wegman and Carter [9]. Nowadays, more efficient symmetric authentication
methods are known. We will use Wegman–Carter authentication because it
gives an upper bound for the required symmetric authentication key.



2.2.1 Universal Hashing

The fundamental idea of universal hashing is to choose a hash function at random,
independently of the input, such that the probability that any two distinct inputs have
the same hash value is sufficiently low. To put this more precisely, let A and B be
finite sets, characterized by the numbers of elements a = |A| and b = |B|, where
a ≥ b. A hash function h maps an element x ∈ A to an element h(x) ∈ B. For
h and for x, y ∈ A, x ≠ y, we define δ_h(x, y) = 1 if h(x) = h(y) and δ_h(x, y) = 0
otherwise; that is, δ_h(x, y) = 1 if and only if the hashed values of x and y collide.
The collection of hash functions h mapping the set A to the set B is denoted H.
The number of collisions for the hash function set H and the values x, y is then
defined as
\[ \delta_{\mathcal{H}}(x, y) = \sum_{h \in \mathcal{H}} \delta_h(x, y). \]
Thus, the collision probability for the two distinct values can be computed as δ_H(x, y)/|H|.
The goal is to make this probability small, which can be achieved by a large number
of hash functions |H|. But when increasing |H| the number of bits for specifying the

function, log₂|H|, increases as well. This would be impractical for applications,
since more random bits are needed. So we have to take this into account when choosing H.
Wegman and Carter [1, 9] were the first to deal with universal hashing, and
they defined some useful properties. Below, a definition of the term universal is
given [8].

Definition 2.5 Let ε be a positive real number. H is ε-almost universal₂ (or ε-AU₂)
if δ_H(x, y) ≤ ε|H| for all x, y ∈ A, x ≠ y.

In other words, the collision probability for any two distinct inputs x and y is at most ε.
Note that ε is bounded below by 1/b. In the special case where ε = 1/b we can
skip the term almost and speak only of a universal₂ class of hash functions, as
Wegman and Carter [1] have done. Next we define a stricter property for such
classes [8].

Definition 2.6 Let ε be a positive real number. H is ε-almost strongly universal₂
(or ε-ASU₂) if

(a) for every x₁ ∈ A and for every y₁ ∈ B, |{h ∈ H : h(x₁) = y₁}| = |H|/|B|,
(b) for every x₁, x₂ ∈ A (x₁ ≠ x₂) and for every y₁, y₂ ∈ B,
|{h ∈ H : h(x₁) = y₁, h(x₂) = y₂}| ≤ ε|H|/|B|.

The first part of the definition says that any input x₁ has probability 1/b of being
mapped to a hash value y₁ (see Fig. 2.2). The second part concerns the conditional
probability: given a tuple (x₁, y₁), the probability that x₂ is mapped to y₂ is
at most ε (see Fig. 2.3). As before, in the special case ε = 1/b we call it a strongly
universal₂ class of hash functions.
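To make these definitions concrete, here is a small Python sketch (our own illustration, not taken from [1, 9]) of the classic pairwise-independent family h_{a,b}(x) = (a·x + b) mod p over a prime field, which is strongly universal₂ when A = B = {0, ..., p−1}:

p = 101  # a prime; here A = B = {0, ..., p-1}, so a = b = p

def h(a_key, b_key, x):
    # One member of the family H = {h_{a,b}(x) = (a*x + b) mod p : a, b in Z_p}
    return (a_key * x + b_key) % p

# Definition 2.6a: for fixed x1 and y1, exactly |H|/|B| = p keys map x1 to y1.
x1, y1 = 3, 42
print(sum(1 for a in range(p) for b in range(p) if h(a, b, x1) == y1))   # -> 101

# Definition 2.6b with eps = 1/b: for x1 != x2, exactly eps*|H|/|B| = 1 key
# maps x1 -> y1 and x2 -> y2 simultaneously.
x2, y2 = 7, 13
print(sum(1 for a in range(p) for b in range(p)
          if h(a, b, x1) == y1 and h(a, b, x2) == y2))                   # -> 1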




Fig. 2.2 Definition 2.6a: any element x₁ of A is mapped by h ∈ H to a given element y₁ of B with probability 1/b



Fig. 2.3 Definition 2.6b: given the tuple (x₁, y₁), any x₂ ≠ x₁ is mapped by h ∈ H to y₂ with probability ≤ ε



2.2.2 Authentication

In the previous section we defined what ε-almost strongly universal₂ families of
hash functions are. For authentication we can use them in the following manner:
Alice and Bob share a secret random key k and have agreed on an ε-ASU₂ set of hash
functions H = {h | h : M → T} (not necessarily secret), where M is the set of
possible messages, T is the set of authentication tags, k has the length log₂(|H|), and
the hash functions in H are ordered {h₀, h₁, ..., h_{|H|−1}}. Alice sends Bob a message
m ∈ M and the authentication tag t = h_k(m). For Bob it is now easy to check
whether the message is really from Alice: he checks whether the received tag t is equal
to h_k(m) and accepts the message as authentic if it is. The key k is then discarded.
The authentication system's resistance against forgery still has to be stated. A forger
has two possibilities to attack the system. First, he can place a new message m into
the channel. Because the key k is random and secret, he does not know which hash
function h_k he has to use, and his only choice is a guess. Due to Definition 2.6a
the probability of success is 1/|T|. The second possibility for the forger is to wait for a
message pair (m, t). With the knowledge of the tuple (m, t), the class of hash
functions H, and enough computing power, the forger can find all |H|/|T| matching keys
or hash functions, respectively. Nevertheless, according to Definition 2.6b, with
this knowledge he has a probability of at most ε of guessing the correct authentication
tag for a modified message m′, where m′ ≠ m.
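The tagging and verification steps can be sketched in a few lines of Python, reusing the pairwise-independent family from above (again our own illustration; the constructions of [9] described below are what would be used in practice):

import secrets

p = 2**61 - 1   # a prime; messages and tags live in Z_p, so |T| = p

def tag(key, m):
    # key = (a, b) selects one hash function h_{a,b} from the strongly universal_2 family
    a, b = key
    return (a * m + b) % p

def verify(key, m, t):
    return tag(key, m) == t

# Alice and Bob share the random key in advance (e.g., distilled from QKD);
# it is used for a single message and then discarded.
key = (secrets.randbelow(p), secrets.randbelow(p))

m = 123456789
t = tag(key, m)
print(verify(key, m, t))        # True: Bob accepts
print(verify(key, m + 1, t))    # False: a modified message is rejected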
Regarding security, the aim is therefore to make ε very small, ideally bringing it
down to the lower bound 1/|T|. The disadvantage when ε = 1/|T| is that the number
of hash functions is very high. Since we need a string (key) to address the hash
function, its size increases with log₂|H|, which results in a key as long as the message
or longer (|k| ∈ O(|m|)). This is not practical for applications, so we have to find an ε
small enough but which increases the key length only logarithmically with the message length.
The first solution for an almost strongly universal₂ class whose key grows only
logarithmically was presented by Wegman and Carter [9]. Their construction is shown in
Fig. 2.4. Let s = b + log₂log₂a, where b = log₂|T| is the length of an authentication
tag and a = log₂|M| is the length of the messages.

Fig. 2.4 Schematic of the 2/|T|-almost strongly universal₂ class of hash functions: the message is split into blocks which are repeatedly hashed by SU₂ functions keyed with k₁, k₂, k₃, k₄ (the key) until a single block remains, whose low-order bits form the tag



Construct a strongly universal₂ class of hash functions which maps an input string of
length 2s to a hash value of length s. The idea now is to split the message into blocks of
length 2s, adding zero padding at the end if necessary. Now pick a hash function h ∈ H
and apply it to all blocks. The iteration ends with concatenating all outputs. Then start
a new iteration, again splitting this string into blocks of length 2s and picking a hash
function, until only one block of length s remains. Finally, the tag is the low-order
log₂|T| bits of this block. To generate a tag we need log₂a − log₂b hash functions or
keys, respectively. Above we mentioned that the key length in strongly universal₂
classes increases as O(|m|). Now the input for the hash function is a block of length
2s, so a key for one iteration is of size O(2s) = O(s) = O(b + log₂log₂a). We need
log₂a − log₂b iterations, resulting in a key length of O((b + log₂log₂a) · log₂a). In
what follows we will use the suggestion in [9], where a strongly universal₂ class is
used that needs a key roughly twice the size of the input. With this class the following
expression is definitely an upper bound for the key size:
\[ 4 \cdot \left( (b + \log_2\log_2 a) \cdot \log_2 a \right). \tag{2.35} \]
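To get a rough feel for Eq. 2.35, the following short sketch (our own numerical example) evaluates the bound for a 1 Mbit message and a 32-bit tag:

import math

a = 10**6   # message length in bits (a = log2 |M|)
b = 32      # tag length in bits    (b = log2 |T|)

key_bits = 4 * ((b + math.log2(math.log2(a))) * math.log2(a))   # Eq. (2.35)
print(round(key_bits))   # roughly 2.9 kbit of key to authenticate a 1 Mbit message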

We thus achieve a key that grows only logarithmically with the message.
To prove that this construction is an ε-almost strongly universal₂ class of
hash functions, we have to look at the iterations. After every iteration the
probability for the images of two distinct messages m₁, m₂ to be equal is 1/2ˢ. Since we
iterate log₂a − log₂b times, the probability that the final tags match is at most log₂a/2ˢ,
which equals 1/2ᵇ = 1/|T|. Because we use a strongly universal₂ class of hash
functions, the last reduction takes m₁ to any tag t₁ with equal probability and
fulfills the first part, Definition 2.6a. And as long as the penultimate blocks of m₁
and m₂ are different, m₂ will also be taken to any tag t₂ with probability less than
1/|T|; but if these blocks are the same, fewer than a 2/|T| fraction of the hash functions
will take m₂ to any t₂. So ε is 2/|T| and the construction fulfills the second part,
Definition 2.6b.
The second solution presented in [9] tries to reduce the key length needed per
message. Again a strongly universal₂ class H is constructed, which maps
M to T. The two communicating parties (Alice and Bob) split the shared key into two
parts. The first part has length log₂|H| to specify a hash function, and the second
part is a sequence (r₁, r₂, ...) of elements of T, each of length log₂|T|. Secure
authentication requires uniquely indexed messages. The shared secret key tells
both parties which hash function h to choose. To create the authentication tag
tᵢ for a message mᵢ with a unique message number (e.g., i), the message is hashed by the
hash function and then exclusive-ORed with rᵢ, so that tᵢ = h(mᵢ) ⊕ rᵢ.
For the probability of guessing an authentication tag t for a message m, we define
a set for the tag t as Sₜ = {(h, r) | h ∈ H, r ∈ T, h(m₁) ⊕ r = t₁, and h(m) ⊕ r = t},
where m₁ and t₁ are the first message and its authentication tag. In words, Sₜ is the
set of partial keys which map the new message m to the tag t, given the knowledge that
m₁ is mapped to t₁. Since the chosen class of hash functions is strongly universal₂,
all the sets Sₜ have the same size. The only way to extend these partial keys of Sₜ
is to append rᵢ = h(mᵢ) ⊕ tᵢ such that message mᵢ maps to tag tᵢ for i = 2, 3, ....
Thus, the sets have the same size and m will be assigned to any tag t ∈ T with the same
probability as any other tag. So the guessing probability of a forger depends only on the
tag length, which results in 1/|T|.
Unlike in the first construction of Wegman and Carter, the required key size
per message now depends on the tag length log₂|T| (neglecting the specifier for the
hash function). The first construction returns one authentication tag for one large
message; in contrast, the second construction splits the message into smaller pieces,
numbers them, and generates an authentication tag for each piece. In Fig. 2.5 we
compare both constructions. Let H be a strongly universal₂ class of hash functions,
where h : M → T and mᵢ, m_{j,k} ∈ M, m′ ∉ M, tᵢ ∈ T, t′ ∈ T, |mᵢ| = 2|tᵢ|,
i = 1, ..., n, j = 1, ..., |m′|/|t|, k = 1, ..., log₂(|m′|/|t|), and lb_b(x) denotes the b
lower-order bits of x. The additional space needed for the numbering of the messages
mᵢ is neglected.
The probability of inserting a forged message–tag pair, even with knowledge of a valid
tuple (mᵢ, tᵢ), is 1/|T|. In contrast, for the first construction the probability of inserting a
forged pair, even with knowledge of a valid tuple (m′, t′), is 2/|T|. For the second
construction the input size (message size) of the strongly universal₂ class is much
larger than the output size (tag size).


Fig. 2.5 The two constructions of Wegman and Carter [9]. Left (first construction): the message m′ is split into blocks m_{1,1}, m_{2,1}, ..., m_{|m′|/|t|,1}; in each round m_{j,k} = h_{k−1}(m_{2j−1,k−1}) ‖ h_{k−1}(m_{2j,k−1}) (with m_{j,k} = 0 if it does not exist), and the tag is t′ = lb_{|t′|}(h_{log(|m′|/|t|)}(m_{1,log(|m′|/|t|)})). Right (second construction): each message mᵢ is tagged with tᵢ = h(mᵢ) ⊕ rᵢ using the split key (h; r₁, r₂, ..., rₙ)


2.3 Entropy
Entropy is a key concept in the field of information theory, a branch of
applied mathematics concerned with quantifying information. In the
following, a short overview of the parts we need in later chapters is given. In general it
is advisable to read up on information theory; we refer to the book of
Cover and Thomas [2], where the subject is handled in detail.


2.3.1 Shannon Entropy
The Shannon entropy [7] is a basic measure in information theory. Let X be a
discrete random variable on a finite set 𝒳 = {x₁, ..., xₙ}; then the Shannon entropy (or
information entropy) H(X) is defined as
\[ H(X) = -\sum_{x \in \mathcal{X}} p(x) \log_2 p(x), \tag{2.36} \]
where p(x) is the probability distribution function of X. In other words, H(X) is the
expected number of bits needed to specify a value x ∈ 𝒳. Therefore it
is easy to see that the entropy is upper bounded by
\[ H(X) \le \log_2 |\mathcal{X}|, \tag{2.37} \]
with equality if p(x) = 1/|𝒳| for each x. For probabilities p(x) = 0 we use the
convention 0 · log₂ 0 = 0. Usually the base of the logarithm is 2, so the entropy is
measured in bits. Other common bases are e and 10.
If X and Y are random variables on 𝒳 and 𝒴, respectively, the conditional Shannon
entropy H(X|Y) is defined as


\[ H(X|Y) = -\sum_{y \in \mathcal{Y}} p(y) \sum_{x \in \mathcal{X}} p(x|y) \log_2 p(x|y), \tag{2.38} \]
where p(x|y) = p(x, y)/p(y) is the conditional probability distribution of X given Y.
Further, the mutual information I(X; Y) is defined as
\[ I(X;Y) = \sum_{y \in \mathcal{Y}} \sum_{x \in \mathcal{X}} p(x, y) \log_2 \frac{p(x|y)}{p(x)} = \sum_{y \in \mathcal{Y}} \sum_{x \in \mathcal{X}} p(x, y) \log_2 \frac{p(x, y)}{p(x)\,p(y)}. \tag{2.39} \]
Note that it is called mutual because
\[ I(X;Y) = I(Y;X). \tag{2.40} \]
To understand the relationship between H(X), H(Y), H(X|Y), H(Y|X), and
I(X;Y), take a look at Fig. 2.6.

Fig. 2.6 Interpretation of H(X), H(Y), H(X|Y), H(Y|X), and I(X;Y) as overlapping regions: I(X;Y) is the overlap of H(X) and H(Y), while H(X|Y) and H(Y|X) are the parts of H(X) and H(Y) outside the overlap
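For a concrete feel of these quantities, the following Python sketch (our own example) computes H(X), H(Y), H(X|Y), and I(X;Y) for a small joint distribution:

import math

# Joint distribution p(x, y) of two binary random variables (an arbitrary example)
p_xy = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}

def H(dist):
    # Shannon entropy, Eq. (2.36), with the convention 0*log2(0) = 0
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)

p_x = {x: sum(p for (xx, _), p in p_xy.items() if xx == x) for x in (0, 1)}
p_y = {y: sum(p for (_, yy), p in p_xy.items() if yy == y) for y in (0, 1)}

# Conditional entropy H(X|Y), Eq. (2.38), written as a sum over the joint distribution
H_X_given_Y = -sum(p * math.log2(p / p_y[y]) for (x, y), p in p_xy.items() if p > 0)
I_XY = H(p_x) - H_X_given_Y                # mutual information, consistent with Eq. (2.39)
print(H(p_x), H(p_y), H_X_given_Y, I_XY)   # -> 1.0, 1.0, ~0.72, ~0.28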




2.3.2 Rényi Entropy

The Rényi entropy [5] of order α is a generalization of the Shannon entropy. In the
same manner as for the Shannon entropy, let X be a discrete random variable on a finite
set 𝒳 = {x₁, ..., xₙ}; then the Rényi entropy of order α, H_α(X), with α ≥ 0, α ≠ 1, is
defined as
\[ H_\alpha(X) = \frac{1}{1-\alpha} \log_2 \sum_{x \in \mathcal{X}} p(x)^\alpha, \tag{2.41} \]
where p(x) is the probability distribution function of X. In the limit α → 1, H_α(X)
converges to the Shannon entropy H(X). Again we use base 2 for the logarithm, so
the entropy is measured in bits. We single out the Rényi entropy of order 2 (R(X) for
short):
\[ R(X) = H_2(X) = -\log_2 \sum_{x \in \mathcal{X}} p(x)^2. \tag{2.42} \]

If we consider only the term inside the logarithm, we observe that it equals the
collision probability. The collision probability p_c(X) of X is defined as the probability
that X yields the same value or event twice in two independent executions. The
connection to the Shannon entropy is that the Rényi entropy of order 2 is upper bounded
by it,
\[ R(X) \le H(X), \tag{2.43} \]
for every probability distribution p(x).
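As a small illustration (our own example), for a biased coin with p(0) = 0.9 and p(1) = 0.1 the two entropies can be compared directly:

import math

p = [0.9, 0.1]
H = -sum(q * math.log2(q) for q in p)     # Shannon entropy, Eq. (2.36)
pc = sum(q ** 2 for q in p)               # collision probability
R = -math.log2(pc)                        # Renyi entropy of order 2, Eq. (2.42)
print(round(H, 3), round(R, 3), R <= H)   # -> 0.469, 0.286, True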
If X and Y are random variables on 𝒳 and 𝒴, respectively, the conditional Rényi
entropy R(X|Y) is defined as
\[ R(X|Y) = -\sum_{y \in \mathcal{Y}} p(y) \log_2 \sum_{x \in \mathcal{X}} p(x|y)^2, \tag{2.44} \]
where p(x|y) is the conditional probability distribution of X given Y. The upper
bound can be derived following Eq. 2.43 as
