Founding Editors: W. Beiglböck, J. Ehlers, K. Hepp, H. Weidenmüller

Editorial Board

R. Beig, Vienna, Austria

W. Beiglböck, Heidelberg, Germany

W. Domcke, Garching, Germany

B.-G. Englert, Singapore

U. Frisch, Nice, France

F. Guinea, Madrid, Spain

P. Hänggi, Augsburg, Germany

W. Hillebrandt, Garching, Germany

R. L. Jaffe, Cambridge, MA, USA

W. Janke, Leipzig, Germany

H. v. Löhneysen, Karlsruhe, Germany

M. Mangano, Geneva, Switzerland

J.-M. Raimond, Paris, France

M. Salmhofer, Heidelberg, Germany

D. Sornette, Zurich, Switzerland

S. Theisen, Potsdam, Germany

D. Vollhardt, Augsburg, Germany

W. Weise, Garching, Germany

J. Zittartz, Köln, Germany

The Lecture Notes in Physics

The series Lecture Notes in Physics (LNP), founded in 1969, reports new developments in physics research and teaching – quickly and informally, but with a high quality and the explicit aim to summarize and communicate current knowledge in an accessible way. Books published in this series are conceived as bridging material between advanced graduate textbooks and the forefront of research and to serve three purposes:

• to be a compact and modern up-to-date source of reference on a well-defined topic
• to serve as an accessible introduction to the field to postgraduate students and nonspecialist researchers from related areas
• to be a source of advanced teaching material for specialized seminars, courses and schools

Both monographs and multi-author volumes will be considered for publication. Edited volumes should, however, consist of a very limited number of contributions only. Proceedings will not be considered for LNP.

Volumes published in LNP are disseminated both in print and in electronic formats, the electronic archive being available at springerlink.com. The series content is indexed, abstracted and referenced by many abstracting and information services, bibliographic networks, subscription agencies, library networks, and consortia.

Proposals should be sent to a member of the Editorial Board, or directly to the managing

editor at Springer:

Christian Caron

Springer Heidelberg

Physics Editorial Department I

Tiergartenstrasse 17

69121 Heidelberg / Germany

christian.caron@springer.com

C. Kollmitzer
M. Pivk (Eds.)

Applied Quantum Cryptography

Mario Pivk
Pöckau 171
9601 Arnoldstein
Austria
mpivk@edu.uni-klu.ac.at

Christian Kollmitzer
AIT Austrian Institute of Technology GmbH
Safety & Security Department
Quantum Technologies
Lakeside B01A, 9020 Klagenfurt
Austria
christian.kollmitzer@ait.ac.at

Kollmitzer C., Pivk M. (Eds.), Applied Quantum Cryptography, Lect. Notes Phys. 797

(Springer, Berlin Heidelberg 2010), DOI 10.1007/978-3-642-04831-9

Lecture Notes in Physics ISSN 0075-8450 e-ISSN 1616-6361

ISBN 978-3-642-04829-6 e-ISBN 978-3-642-04831-9

DOI 10.1007/978-3-642-04831-9

Springer Heidelberg Dordrecht London New York

Library of Congress Control Number: 2010920541

© Springer-Verlag Berlin Heidelberg 2010

This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilm or in any other way, and storage in data banks. Duplication of this publication or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965, in its current version, and permission for use must always be obtained from Springer. Violations are liable to prosecution under the German Copyright Law.

The use of general descriptive names, registered names, trademarks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

Cover design: Integra Software Services Pvt. Ltd., Pondicherry

Printed on acid-free paper

Springer is part of Springer Science+Business Media (www.springer.com)

CK: For my family and Verena

MP: For all those who enrich my life

Foreword

Using the quantum properties of single photons to exchange binary keys between two partners for subsequent encryption of secret data is an absolutely novel technology. Only a few years ago quantum cryptography – or better, Quantum Key Distribution – was the domain of basic research laboratories at universities. But during the last few years things have changed. Quantum Key Distribution, or QKD, left the laboratories and was picked up by more practically oriented teams that worked hard to develop a practically applicable technology out of the astonishing results of basic research.

One major milestone toward a QKD technology was a large research and development project funded by the European Commission that aimed at combining quantum physics with the complementary technologies necessary to create a technical solution: electronics, software, and network components were added within the project SECOQC (Development of a Global Network for Secure Communication Based on Quantum Cryptography), which teamed up expertise at the European level to arrive at a technology for future cryptography.

Lead-managed by a team at the Austrian Research Centers in Vienna, the practical application of QKD in a standard optical fiber network was demonstrated, giving a glimpse of the future of secure communication. Although many steps still have to be taken in order to achieve a truly mature technology, the cornerstone for future secure communication is already laid. QKD will not be the Holy Grail of security; it will not be able to solve all problems for evermore. But QKD has the potential to replace one of the weakest parts of symmetric encryption: the exchange of the key. It can be proven that the key exchange process cannot be corrupted and that keys that are generated and exchanged quantum cryptographically will be secure forever (as long as some additional conditions are kept).

This book will show the state of the art of Quantum Cryptography and sketch how it can be implemented in standard communication infrastructure. The growing vulnerability of sensitive data requires new concepts, and QKD will be a possible solution to overcome some of today's limitations.

Vienna, Austria Christian Monyk


Acknowledgements

We would like to give thanks to the Austrian Research Centers GmbH – ARC Klagenfurt and Vienna for their support of this book. This work was supported by the EC/IST Integrated Project SECOQC (contract no. 506813). M.S. is grateful to T. Länger, T. Lorünser, C. Pacher, M. Peev, and A. Poppe for discussion, help, and assistance in writing Chap. 6. We would like to take this opportunity to express our gratitude to Roland Potzmann from the Central Institute for Meteorology and Geodynamics (ZAMG), who provided us with the climate data. Furthermore, special thanks to the people of the University of Vienna for their support. Christian Kollmitzer would like to give thanks to Mr. Gerald Dissauer for discussions and explanations of Medical Information Systems. He would also like to give thanks to Michele Mosca, Norbert Lütkenhaus, and Daniel Gottesman from the University of Waterloo, Ontario, Canada, and to Takashi Linzbichler from the University of Applied Sciences Joanneum, Kapfenberg, Austria, and their students for many hours of discussion on the "Ring of Trust" model. We want to thank Claus Ascheron for his work in the process of making this book and our reviewers for their fruitful comments.


Contents

1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1

C. Kollmitzer

2 Preliminaries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3

M. Pivk

2.1 Quantum Information Theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3

2.2 Unconditional Secure Authentication . . . . . . . . . . . . . . . . . . . . . . . . . 14

2.3 Entropy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19

References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21

3 Quantum Key Distribution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23

M. Pivk

3.1 Quantum Channel . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24

3.2 Public Channel . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27

3.3 QKD Gain . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45

3.4 Finite Resources . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46

References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46

4 Adaptive Cascade . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49

S. Rass and C. Kollmitzer

4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49

4.2 Error Correction and the Cascade Protocol . . . . . . . . . . . . . . . . . . . . . 49

4.3 Adaptive Initial Block-Size Selection . . . . . . . . . . . . . . . . . . . . . . . . . 52

4.4 Fixed Initial Block-Size . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53

4.5 Dynamic Initial Block-Size . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56

4.6 Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65

4.7 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66

References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68

5 Attack Strategies on QKD Protocols . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71

S. Schauer

5.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72


5.2 Attack Strategies in an Ideal Environment . . . . . . . . . . . . . . . . . . . . . 73

5.3 Individual Attacks in a Realistic Environment . . . . . . . . . . . . . . . 89

References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94

6 QKD Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97

M. Suda

6.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97

6.2 QKD Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98

6.3 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 117

References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119

7 Statistical Analysis of QKD Networks in Real-Life Environment . . . . . . . . 123

K. Lessiak and J. Pilz

7.1 Statistical Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 123

7.2 Results of the Experiments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 127

7.3 Statistical Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 142

7.4 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 147

References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 148

8 QKD Networks Based on Q3P . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 151

O. Maurhart

8.1 QKD Networks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 151

8.2 PPP . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 154

8.3 Q3P . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 155

8.4 Routing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 167

8.5 Transport . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 168

References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 170

9 Quantum-Cryptographic Networks from a Prototype to the Citizen . . . . . . 173

P. Schartner and C. Kollmitzer

9.1 The SECOQC Project . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 173

9.2 How to Bring QKD into the “Real” Life . . . . . . . . . . . . . . . . . . . . . . . 176

9.3 Resumee . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 182

References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 183

10 The Ring of Trust Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 185

C. Kollmitzer and C. Moesslacher

10.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 185

10.2 Model of the Point of Trust Architecture . . . . . . . . . . . . . . . . . . . . . . 186

10.3 Communication in the Point of Trust Model . . . . . . . . . . . . . . . . . . . 186

10.4 Exemplified Communications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 194

10.5 A Medical Information System Based on the Ring of Trust . . . . . . . 204

References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 210

Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 211

Chapter 1

Introduction

C. Kollmitzer

Quantum cryptography, or more precisely quantum key distribution (QKD), is a new technology which is receiving a high level of attention worldwide today. The possibility to exchange information in a provably secure way is a milestone in communication history. The main problem in QKD is the range limitation between the communication partners Alice and Bob. Several experiments have shown that the distance between Alice and Bob can be extended. Some of these experiments used optical fibers; others were based on free-space technologies. But beyond that it is now possible to build communication networks based on QKD. Thus, communication is no longer restricted to peer-to-peer connections; modern communication structures can be built.

The first fully functional QKD-based network was presented in October 2008 in Vienna, Austria. It thereby acted as the base layer for a video conferencing network which connected several parts of the city. Not just one QKD technology was used: five different systems were deployed, and single communications used several of them, invisibly to the user.

This book covers the following aspects:

First, fundamental techniques are discussed which form a basis for all further concepts, in particular for the QKD systems also introduced here. The individual steps of the communication setup are shown in detail, especially sifting, reconciliation, error correction, and privacy amplification.

Regarding error correction, the Adaptive Cascade protocol, an improvement of the original error correction protocol Cascade, is introduced. It enhances the classical Cascade approach and supplements it with a method to determine the optimal initial block size and hence improve its efficiency.

In order to ensure the security of communication systems, different attack strategies must be examined. Besides the classical attack strategies, QKD systems offer a range of new aspects, which are also presented.

C. Kollmitzer (B)

Safety & Security Department, Quantum Technologies, AIT Austrian Institute of Technology

GmbH, Lakeside B01A 9020 Klagenfurt, Austria,

christian.kollmitzer@ait.ac.at; http://www.ait.ac.at

Kollmitzer, C.: Introduction. Lect. Notes Phys. 797, 1–2 (2010)
© Springer-Verlag Berlin Heidelberg 2010
DOI 10.1007/978-3-642-04831-9_1


After that we present in detail the current QKD systems which were used in the SECOQC network project of the European Union and were part of the first QKD-based network in October 2008 in Vienna, Austria.

Although QKD systems have been used for years within different experimental setups, many of these experiments took place in a laboratory. Due to the deployment of the SECOQC network, however, it was possible to collect data from a long-term setup within an urban environment. The influence of the environment was examined, and temperature, humidity, etc., could be captured for the first time and subsequently statistically evaluated. The results are discussed in detail.

QKD systems are designed as an enhancement to existing communication networks. Thus, their integration into current communication systems is a crucial factor. Special network protocols have been developed, among which the Q3P protocol is of particular importance; it is presented in detail.

A communication network is one of the key developments on the way to field use. The appropriate fundamentals are presented. Apart from that, we deal with the users themselves and their benefit in using QKD networks. In particular, the employment of QKD-generated keys with current communication equipment such as the iPhone is described.

Because of the range limitation of QKD systems, the development of global networks is one of the most intensively examined research areas. A possible solution based on a network of trusted communication centers is also presented. The main advantage of this model is the possibility to generate keys on demand, so that the user does not need to store them in a relatively insecure environment.

We hope that with this book we can promote interest in QKD systems and the associated new possibilities. We assume that these new options will be the subject of international research activities worldwide over the next years and that the results will have a massive impact on the communication structures of tomorrow.

Chapter 2

Preliminaries

M. Pivk

This chapter discusses basics necessary for the following chapters. All fields are treated only briefly: some areas, like quantum information theory, would need more explanation, but this is out of scope for this chapter.

2.1 Quantum Information Theory

In this section a short introduction to quantum information is given. For a detailed explanation we refer to the book of Nielsen and Chuang [4], where this topic is amplified.

2.1.1 Quantum Bits

Since Shannon and the beginning of information theory, the bit has been the basic term in classical information. The states of a bit are either 0 or 1. In accordance with the classical concept, in quantum information there exists the qubit (short for quantum bit). As for the classical bit, two states are possible, |0⟩ and |1⟩. This special notation '|·⟩' is called the Dirac notation (or ket) and is the standard notation for states in quantum mechanics. The major difference to the classical bit, which accepts only 0 or 1, is that a qubit also allows states in between |0⟩ and |1⟩, which are called superpositions. Let us denote this by

|ψ⟩ = α|0⟩ + β|1⟩,    (2.1)

where α, β ∈ C. Because these factors are complex numbers, the state of a qubit can be described as a vector in a two-dimensional complex vector space C², also called Hilbert space. The states |0⟩ and |1⟩ form the computational basis and are orthonormal (see Definition 2.4) to each other, e.g., |0⟩ = (1, 0)ᵀ and |1⟩ = (0, 1)ᵀ.

M. Pivk (B)
Safety & Security Department, Quantum Technologies, AIT Austrian Institute of Technology GmbH, Lakeside B01A, 9020 Klagenfurt, Austria, mario.pivk@ait.ac.at; http://www.ait.ac.at

Pivk, M.: Preliminaries. Lect. Notes Phys. 797, 3–21 (2010)
© Springer-Verlag Berlin Heidelberg 2010
DOI 10.1007/978-3-642-04831-9_2

Since a qubit state is a unit vector, meaning its length is normalized to 1, the following equation must be fulfilled by the scalars α, β:

|α|² + |β|² = 1.    (2.2)

Using this fact we can rewrite the state of a qubit as

|ψ⟩ = cos(θ/2)|0⟩ + e^{iφ} sin(θ/2)|1⟩,    (2.3)

where θ, φ are real numbers and define a point on a sphere called the Bloch sphere (see Fig. 2.1).

Fig. 2.1 Bloch sphere representation of a qubit (the state |ψ⟩ is a point on the sphere with polar angle θ and azimuthal angle φ; |0⟩ and |1⟩ sit at the poles of the z axis)
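As an illustration (not part of the original text), the parametrization of Eq. (2.3) and the normalization constraint of Eq. (2.2) are easy to check numerically. The following NumPy sketch, with illustrative variable names, builds a qubit from Bloch angles and recovers them from the amplitudes:

```python
import cmath
import numpy as np

# Qubit |psi> = cos(theta/2)|0> + e^{i*phi} sin(theta/2)|1>, Eq. (2.3),
# built here for theta = pi/2, phi = pi/2 (a point on the Bloch equator).
theta, phi = np.pi / 2, np.pi / 2
alpha = np.cos(theta / 2)
beta = cmath.exp(1j * phi) * np.sin(theta / 2)
psi = np.array([alpha, beta], dtype=complex)

# Eq. (2.2): the amplitudes of a valid qubit state satisfy |alpha|^2 + |beta|^2 = 1.
norm_sq = abs(alpha) ** 2 + abs(beta) ** 2

# Recover the Bloch angles from the amplitudes (assuming alpha real and >= 0).
theta_rec = 2 * np.arccos(abs(alpha))
phi_rec = cmath.phase(beta)
```

Any choice of θ, φ yields a normalized state, which is why two real angles suffice to picture a qubit on the sphere.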

The measurement of qubits is a problem. In the special case when α or β is 0, the mapping to the classical bit will result in 1 or 0, respectively, as expected. But what happens if the qubit is in another superposition, i.e., α, β ≠ 0? Depending on the scalars, the qubit will be measured as 1 with a certain probability or as 0 with the complementary probability. Since the scalars fulfil Eq. (2.2), the probability for a qubit to be measured as 0 is |α|² and as 1 it is |β|². We will see this in detail in Sect. 2.1.3.

Furthermore, in quantum mechanics the scalars α and β are also called the amplitudes of the states |0⟩ and |1⟩, respectively. But there exists a second term describing a qubit, the phase. Consider the state e^{iφ}|ψ⟩, where |ψ⟩ is a state vector and φ is a real number. We say that the state e^{iφ}|ψ⟩ is equal to |ψ⟩, up to the global phase factor e^{iφ}. Measurements on these two states are statistically the same, as you will see in Sect. 2.1.2.


Another kind of phase is the relative phase. Consider these two states:

|+⟩ = (1/√2)(|0⟩ + |1⟩)  and  |−⟩ = (1/√2)(|0⟩ − |1⟩).    (2.4)

In the state |+⟩ the amplitude of |1⟩ is 1/√2. In the state |−⟩ the amplitude has the same magnitude but a different sign. We define that two amplitudes α₁, α₂ for some states differ by a relative phase if there is a real φ such that α₁ = e^{iφ}α₂. In contrast to the global phase, where both amplitudes of the state differ by the factor e^{iφ}, the relative phase differs in only one amplitude by the factor e^{iφ}.
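The distinction can be made concrete numerically (a small sketch of ours, not from the original text): a global phase leaves all computational-basis probabilities unchanged, and even the relative phase between |+⟩ and |−⟩ is invisible in this basis.

```python
import numpy as np

# |+> and |-> from Eq. (2.4): same amplitude magnitudes, different relative phase.
plus = np.array([1, 1]) / np.sqrt(2)
minus = np.array([1, -1]) / np.sqrt(2)

# A global phase e^{i*phi} leaves all measurement probabilities unchanged ...
phi = 0.7  # arbitrary
plus_global = np.exp(1j * phi) * plus
probs_plus = np.abs(plus) ** 2
probs_global = np.abs(plus_global) ** 2

# ... and in the computational basis even the relative phase is invisible:
probs_minus = np.abs(minus) ** 2
```

Distinguishing |+⟩ from |−⟩ therefore requires a measurement in a different basis, as discussed at the end of Sect. 2.1.3.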

2.1.2 Linear Operators

The state change of qubits is done by linear operators. For this a function A is used, taking vectors from V to W (V and W are vector spaces over the complex numbers). The most convenient way to describe such a function is the matrix representation. If the matrix A has m rows and n columns and is multiplied with a vector |v⟩ ∈ Cⁿ, we get a new vector |w⟩ ∈ Cᵐ as result. The claim for such a matrix A is to fulfill the linearity equation [4]

A(Σᵢ aᵢ|vᵢ⟩) = Σᵢ aᵢ A|vᵢ⟩.    (2.5)

Let A : V → W be a linear operator, |v₁⟩, …, |vₙ⟩ a basis of V, and |w₁⟩, …, |wₘ⟩ a basis of W. There exist complex numbers A₁ⱼ, …, Aₘⱼ with

A|vⱼ⟩ = Σᵢ Aᵢⱼ|wᵢ⟩,  1 ≤ i ≤ m, 1 ≤ j ≤ n,    (2.6)

which form the matrix representation of the operator A.

Conversely, an n × m matrix can be understood as the opposite linear operator, sending vectors out of the vector space W to the vector space V by performing the matrix multiplication with those vectors.

We use a notation which differs from the usual notation in linear algebra. Table 2.1 lists some frequently used symbols in quantum mechanics. As we know, a vector can be represented by the sum of the vectors of the computational basis. For simplification we take the computational basis v₁ = (1, 0, …, 0)ᵀ, v₂ = (0, 1, …, 0)ᵀ, …, vₙ = (0, 0, …, 1)ᵀ, such that a vector |v⟩ = Σᵢ aᵢ|vᵢ⟩ can also be written as |v⟩ = (a₁, …, aₙ)ᵀ. If another computational basis is used, it is written explicitly.

Table 2.1 Summary of some standard quantum mechanical notation

Notation     Description
z*           Complex conjugate of the complex number z, e.g., (1 + i)* = 1 − i
|v⟩          Vector. Also known as a ket. |v⟩ = Σᵢ aᵢ|vᵢ⟩
⟨v|          Vector dual to |v⟩. Also known as a bra. ⟨v| = Σᵢ aᵢ*⟨vᵢ|
λ|v⟩         Multiplication by a scalar λ. λ|v⟩ = Σᵢ λaᵢ|vᵢ⟩
⟨v|w⟩        Inner product between the vectors |v⟩ and |w⟩
|v⟩⟨w|       Outer product of |v⟩ and |w⟩
|v⟩ ⊗ |w⟩    Tensor product of |v⟩ and |w⟩
A*           Complex conjugate of the matrix A
Aᵀ           Transpose of the matrix A
A†           Hermitian conjugate or adjoint of the matrix A, A† = (Aᵀ)*
⟨φ|A|ψ⟩      Inner product between |φ⟩ and A|ψ⟩
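The dictionary of Table 2.1 maps directly onto matrix operations: a ket is a column vector, the bra is its conjugate transpose, and the inner, outer, and adjoint operations follow. A brief NumPy illustration (ours, with arbitrary example vectors):

```python
import numpy as np

# Kets as column vectors; the bra <v| is the conjugate transpose of |v>.
v = np.array([[1 + 1j], [2]])        # |v>
w = np.array([[3], [1j]])            # |w>
bra_v = v.conj().T                   # <v|

inner = (bra_v @ w).item()           # <v|w>, a single complex number
outer = v @ w.conj().T               # |v><w|, a 2x2 matrix
dagger = outer.conj().T              # A^dagger = (A^T)^*
```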

2.1.2.1 The Pauli Matrices

Four extremely useful matrices are the Pauli matrices. These are 2 × 2 matrices and represent some needed effects on qubits. The matrices are

X = [[0, 1], [1, 0]],  Y = [[0, −i], [i, 0]],  Z = [[1, 0], [0, −1]].    (2.7)

The fourth matrix is the identity matrix I. The Pauli operators X and Z are also known as the bit flip and phase flip operators. If we apply the X operation on a qubit we see that |0⟩ changes to |1⟩ and vice versa, i.e.,

X|0⟩ = [[0, 1], [1, 0]] (1, 0)ᵀ = (0, 1)ᵀ = |1⟩  and  X|1⟩ = [[0, 1], [1, 0]] (0, 1)ᵀ = (1, 0)ᵀ = |0⟩.

The Z operator is called the phase flip operator because it changes the phase of |1⟩ by the sign, i.e.,

Z|+⟩ = [[1, 0], [0, −1]] (1/√2, 1/√2)ᵀ = (1/√2, −1/√2)ᵀ = |−⟩  and
Z|−⟩ = [[1, 0], [0, −1]] (1/√2, −1/√2)ᵀ = (1/√2, 1/√2)ᵀ = |+⟩.


A possibility to illustrate Y is to multiply the matrix by the imaginary unit i so that we deal only with real-valued matrices. Thus, the reformulated version of Y is

iY = [[0, 1], [−1, 0]].

The iY operator performs both flips, a bit flip and a phase flip, since

iY = ZX = [[1, 0], [0, −1]] [[0, 1], [1, 0]] = [[0, 1], [−1, 0]] = i [[0, −i], [i, 0]].

Therefore, when the iY operator is applied on the states |0⟩ and |1⟩ we get

iY|0⟩ = [[0, 1], [−1, 0]] (1, 0)ᵀ = (0, −1)ᵀ = −|1⟩  and  iY|1⟩ = [[0, 1], [−1, 0]] (0, 1)ᵀ = (1, 0)ᵀ = |0⟩.
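These identities are quick to verify numerically; the sketch below (ours, not from the original text) defines the Pauli matrices of Eq. (2.7) and checks the bit flip and the relation iY = ZX:

```python
import numpy as np

# The Pauli matrices of Eq. (2.7) plus the identity.
I = np.eye(2)
X = np.array([[0, 1], [1, 0]])
Y = np.array([[0, -1j], [1j, 0]])
Z = np.array([[1, 0], [0, -1]])

ket0 = np.array([1, 0])
ket1 = np.array([0, 1])

flip = X @ ket0      # bit flip: X|0> = |1>
iY = Z @ X           # iY = ZX performs both a bit flip and a phase flip
```

Each Pauli matrix is also its own inverse (X² = Y² = Z² = I), which is why applying the same flip twice restores the original state.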

2.1.2.2 Inner Products

An inner product (or scalar product) ⟨v|w⟩ (usual notation in linear algebra: (|v⟩, |w⟩)) is a function which takes as input two vectors |v⟩ and |w⟩ from a vector space V and produces a complex number as output. For example, the inner product of two n-dimensional vectors over the field of complex numbers is defined as

⟨v|w⟩ = Σᵢ aᵢ* bᵢ = (a₁* ⋯ aₙ*) · (b₁, …, bₙ)ᵀ.    (2.8)

The inner product satisfies the following requirements:

1. It is linear in the second argument:
   (|v⟩, Σᵢ λᵢ|wᵢ⟩) = Σᵢ λᵢ (|v⟩, |wᵢ⟩).
2. ⟨v|w⟩ = ⟨w|v⟩*.
3. ⟨v|v⟩ ≥ 0, with equality if and only if |v⟩ = 0.

In the following, some definitions in connection with the inner product are given.

Definition 2.1 Let V be a set of vectors over Cⁿ. Then two vectors |v⟩, |w⟩ ∈ V are orthogonal if their inner product is 0.


Definition 2.2 Let V be a set of vectors over Cⁿ. The norm of a vector |v⟩ ∈ V is defined by ‖|v⟩‖ = √⟨v|v⟩. The norm of a vector is often understood as its length or size.

Definition 2.3 Let V be a set of vectors over Cⁿ. Then a vector |v⟩ ∈ V is called a unit vector, or normalized, if ‖|v⟩‖ = 1. Normalizing a vector means dividing it by its norm:

‖ |v⟩ / ‖|v⟩‖ ‖ = 1.

Definition 2.4 Let V be a set of vectors over Cⁿ. Then a subset of vectors |vᵢ⟩ ∈ V is called orthonormal if each vector |vᵢ⟩ is a unit vector and distinct vectors are orthogonal: ⟨vᵢ|vⱼ⟩ = 0 for i, j = 1…n, i ≠ j.

For the computational basis of a vector space this last definition, orthonormality, must hold. Those vectors then form a spanning set for the vector space, and any vector out of this space can be written as a linear combination of them.
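Definitions 2.2–2.4 correspond to standard numerical operations; a short NumPy check (ours, with an arbitrary example vector — note that `np.vdot` conjugates its first argument, exactly as Eq. (2.8) requires):

```python
import numpy as np

# Norm of |v> per Definition 2.2: sqrt(<v|v>).
v = np.array([3, 4j])
norm = np.sqrt(np.vdot(v, v).real)   # <v|v> is real and non-negative

# Normalizing per Definition 2.3: divide by the norm.
v_unit = v / norm

# The computational basis vectors are orthonormal (Definition 2.4).
e0 = np.array([1, 0])
e1 = np.array([0, 1])
```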

2.1.2.3 Outer Products

The outer product of two vectors is the contrary multiplication to the inner product. In contrast to the inner product, which results in a single complex value, the outer product yields a matrix:

|v⟩⟨w| = (a₁, …, aₘ)ᵀ · (b₁* ⋯ bₙ*) = A,  with entries Aᵢⱼ = aᵢ bⱼ*.    (2.9)

The outer product representation is a useful way of representing linear operators which makes use of the inner product. Let |v⟩ be a vector in an inner product space V and |w⟩ be a vector in an inner product space W. Define |w⟩⟨v| to be the linear operator from V to W acting as

(|w⟩⟨v|)(|v⟩) = |w⟩⟨v|v⟩ = ⟨v|v⟩ |w⟩.    (2.10)

The equation admits at least two interpretations. On the one hand, the vector |v⟩ is mapped by the matrix to a vector which lies in W; on the other hand, the result is simply the vector |w⟩ multiplied by a complex value.

One application of the outer product notation can be discerned from an important result known as the completeness relation for orthonormal vectors. Let |vᵢ⟩ be an orthonormal basis for the vector space V. Then the following equation must be fulfilled:

Σᵢ |vᵢ⟩⟨vᵢ| = I.    (2.11)
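The completeness relation Eq. (2.11) and the action of an outer product, Eq. (2.10), can be verified for the orthonormal basis {|+⟩, |−⟩} of Eq. (2.4); a NumPy sketch of ours, with arbitrary example vectors:

```python
import numpy as np

# Orthonormal basis {|+>, |->}; the completeness relation Eq. (2.11)
# says the outer products |v_i><v_i| sum to the identity.
plus = np.array([[1], [1]]) / np.sqrt(2)
minus = np.array([[1], [-1]]) / np.sqrt(2)

completeness = plus @ plus.conj().T + minus @ minus.conj().T

# Outer product acting on a vector, Eq. (2.10): (|w><v|)|u> = <v|u> |w>.
w = np.array([[2], [0]])
u = np.array([[0.5], [0.5]])
lhs = (w @ plus.conj().T) @ u          # apply the operator |w><+| to |u>
rhs = (plus.conj().T @ u).item() * w   # scale |w> by the inner product <+|u>
```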


2.1.2.4 Tensor Products

The tensor product is an operation to create a larger vector space from two smaller vector spaces. We have two vector spaces V and W of dimensions m and n, respectively. Then V ⊗ W is an mn-dimensional vector space, whose elements are linear combinations of tensor products of elements |v⟩ ∈ V and |w⟩ ∈ W.

For example, the tensor product of the vectors (1, 2) and (3, 4) is the vector

(1, 2)ᵀ ⊗ (3, 4)ᵀ = (1·3, 1·4, 2·3, 2·4)ᵀ = (3, 4, 6, 8)ᵀ.

In the previous example, we performed the operation on two vector spaces, but the tensor product can also be applied to the linear operators of vector spaces. Assume A : V → V and B : W → W; then A ⊗ B : V ⊗ W → V ⊗ W. Suppose A is an m × n matrix and B is a p × q matrix. Then we have the matrix representation

A ⊗ B ≡ [[A₁₁B, A₁₂B, …, A₁ₙB], [A₂₁B, A₂₂B, …, A₂ₙB], …, [Aₘ₁B, Aₘ₂B, …, AₘₙB]],    (2.12)

an (mp) × (nq) matrix, where each AᵢⱼB denotes a p × q submatrix. For example, the tensor product of the Pauli matrices X and Y is

X ⊗ Y = [[0·Y, 1·Y], [1·Y, 0·Y]] = [[0, 0, 0, −i], [0, 0, i, 0], [0, −i, 0, 0], [i, 0, 0, 0]].
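NumPy's `kron` computes exactly this block-structured product; the sketch below (ours, not from the original text) reproduces the two examples above:

```python
import numpy as np

# Tensor product of vectors, matching the worked example in the text:
a = np.array([1, 2])
b = np.array([3, 4])
ab = np.kron(a, b)            # (1*3, 1*4, 2*3, 2*4)

# Tensor product of operators: X (x) Y per Eq. (2.12).
X = np.array([[0, 1], [1, 0]])
Y = np.array([[0, -1j], [1j, 0]])
XY = np.kron(X, Y)            # 4x4 matrix built from 2x2 blocks A_ij * B
```

The block structure is visible in the result: the top-right 2 × 2 block of X ⊗ Y is 1·Y, while the top-left block is 0·Y, i.e., all zeros.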

2.1.3 Quantum Measurement

This section provides ways of describing the effects of measurements on quantum systems, with reference to the four postulates in [4]. Before we can measure a quantum system we have to set up the arena in which quantum mechanics takes place.

Postulate 1 of [4]: A complex vector space with inner product (also called Hilbert space) is associated with any isolated physical system. This is also known as the state space of the system. The system is completely described by its state vector, a unit vector in the system's state space.

To get more information about a particular system we would measure the state space. But not in quantum mechanics: here we can neither measure what the state space of the system is, nor can we tell what the state vector of that system is. The simplest and most important system is the qubit, described in Sect. 2.1.1. The next postulate describes how states change with time.

Postulate 2 of [4]: The evolution of a closed quantum system is described by a unitary transformation. That is, the state |ψ⟩ of the system at time t₁ is related to the state |ψ′⟩ of the system at time t₂ by a unitary operator U which depends only on the times t₁ and t₂:

|ψ′⟩ = U|ψ⟩.    (2.13)

While quantum mechanics does not tell us the state space and quantum state of a system, it does assure us that the change in any closed quantum system is described by unitary operators U. Such operators we have already seen in Sect. 2.1.2. Having set up the basics, we can continue with the measurement.

Postulate 3 of [4]: Quantum measurements are described by a collection {Mₘ} of measurement operators. These are operators acting on the state space of the system being measured. The index m refers to the measurement outcomes that may occur in the experiment. If the state of the quantum system is |ψ⟩ immediately before the measurement, then the probability that result m occurs is given by

p(m) = ⟨ψ|Mₘ†Mₘ|ψ⟩,    (2.14)

and the state of the system after the measurement is

Mₘ|ψ⟩ / √⟨ψ|Mₘ†Mₘ|ψ⟩.    (2.15)

The measurement operators satisfy the completeness equation

Σₘ Mₘ†Mₘ = I.    (2.16)

The completeness equation expresses the fact that the probabilities sum to 1:

Σₘ p(m) = Σₘ ⟨ψ|Mₘ†Mₘ|ψ⟩ = 1.    (2.17)

There are different types of measurements, but the most important in our case is the measurement in the computational basis. We know that |0⟩ and |1⟩ form a computational basis for the two-dimensional complex vector space (the space of qubits). As said in a previous section, we can map these states onto the states 0 and 1 of a classical bit during measurement. Now we define two measurement operators M₀, M₁:

M₀ = |0⟩⟨0| = [[1, 0], [0, 0]]  and  M₁ = |1⟩⟨1| = [[0, 0], [0, 1]].    (2.18)


Observe that these operators are Hermitian, M_0† = M_0 and M_1† = M_1, and idempotent, M_0² = M_0 and M_1² = M_1. Consider a measurement on a qubit in the state |ψ⟩ = α|0⟩ + β|1⟩. If we apply the operator M_0 to the qubit, we obtain the probability that the result is 0:

p(0) = ⟨ψ|M_0† M_0|ψ⟩
= ⟨ψ|M_0|ψ⟩
= (α* β*) [1 0; 0 0] (α; β)
= (α* β*) (α; 0)
= α*α + 0 = |α|². (2.19)

Similarly, we get the probability p(1) = |β|² for the measurement result 1.

The state of the system after the measurement is

|ψ'⟩ = M_0|ψ⟩/√(⟨ψ|M_0† M_0|ψ⟩) = [1 0; 0 0](α; β)/√(|α|²) = (α; 0)/|α| = (α/|α|)|0⟩ = |0⟩ (2.20)

if the result was 0 (the global phase factor α/|α| can be dropped). Analogously, the state is

|ψ'⟩ = (β/|β|)|1⟩ = |1⟩ (2.21)

for the measurement result 1.

Based on Eq. 2.4, if the qubit is in the specific state |+⟩ we have α = β = 1/√2, and using Eq. 2.14 we get 0 as well as 1 with probability

p(0) = p(1) = |α|² = |β|² = (1/√2)² = 1/2. (2.22)

Due to Eq. 2.15, after the measurement the system's state is

(α/|α|)|0⟩ = ((1/√2)/(1/√2))|0⟩ = |0⟩ or (β/|β|)|1⟩ = ((1/√2)/(1/√2))|1⟩ = |1⟩, (2.23)

respectively. We get the same results for the state |−⟩.
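These numbers are easy to verify numerically. A minimal NumPy sketch of Eqs. 2.14 and 2.15 applied to the state |+⟩ (function and variable names are illustrative, not from the source):

```python
import numpy as np

# Check of Eqs. 2.14 and 2.15 for |+> = (|0> + |1>)/sqrt(2),
# with the computational-basis operators M_0 = |0><0|, M_1 = |1><1| of Eq. 2.18.
ket0 = np.array([1.0, 0.0])
ket1 = np.array([0.0, 1.0])
plus = (ket0 + ket1) / np.sqrt(2)          # alpha = beta = 1/sqrt(2)

M0 = np.outer(ket0, ket0)                  # |0><0|
M1 = np.outer(ket1, ket1)                  # |1><1|

def prob(M, psi):
    """p(m) = <psi| M^dagger M |psi>  (Eq. 2.14)."""
    return float(np.real(psi.conj() @ M.conj().T @ M @ psi))

def post_state(M, psi):
    """M|psi> / sqrt(<psi| M^dagger M |psi>)  (Eq. 2.15)."""
    return (M @ psi) / np.sqrt(prob(M, psi))

print(prob(M0, plus), prob(M1, plus))      # both ~0.5, matching Eq. 2.22
print(post_state(M0, plus))                # ~[1, 0], i.e. |0>, matching Eq. 2.23
```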


To distinguish such states reliably by measurement, we cannot use the computational basis {|0⟩, |1⟩}. Instead we have to use the basis {|+⟩, |−⟩} (defined in Eq. 2.4). We have seen that these states are orthonormal, since ⟨+|+⟩ = ⟨−|−⟩ = 1 and ⟨+|−⟩ = 0, so we can use them as a computational basis. As before, we map the two states |+⟩, |−⟩ onto classical bits, e.g., |+⟩ → 0 and |−⟩ → 1. Thus, the two operators are

M_0 = |+⟩⟨+| = (1/2)·[1 1; 1 1] and M_1 = |−⟩⟨−| = (1/2)·[1 −1; −1 1]. (2.24)

Using these measurement operators and the states |+⟩ and |−⟩ of Eq. 2.4, we get the probability for the result 0:

p(0) = ⟨+|M_0† M_0|+⟩ = ⟨+|+⟩⟨+|+⟩ = 1 · 1 = 1. (2.25)

For the other case,

p(1) = ⟨+|M_1† M_1|+⟩ = ⟨+|−⟩⟨−|+⟩ = 0 · 0 = 0, (2.26)

and the state after the measurement is

M_0|+⟩/√(⟨+|M_0† M_0|+⟩) = |+⟩⟨+|+⟩/√(⟨+|+⟩⟨+|+⟩) = (1/1) · |+⟩ = |+⟩. (2.27)

As the probability for the result 0 was 1, the state |+⟩ is preserved even after the measurement.

Conversely, measuring the states |0⟩, |1⟩ in the computational basis {|+⟩, |−⟩} yields the same probabilistic results as measuring |+⟩, |−⟩ in the basis {|0⟩, |1⟩}:

p(0) = ⟨0|M_0† M_0|0⟩ = ⟨0|+⟩⟨+|0⟩ = (1/√2) · (1/√2) = 1/2,
p(1) = ⟨0|M_1† M_1|0⟩ = ⟨0|−⟩⟨−|0⟩ = (1/√2) · (1/√2) = 1/2,
p(0) = ⟨1|M_0† M_0|1⟩ = ⟨1|+⟩⟨+|1⟩ = (1/√2) · (1/√2) = 1/2,
p(1) = ⟨1|M_1† M_1|1⟩ = ⟨1|−⟩⟨−|1⟩ = (−1/√2) · (−1/√2) = 1/2, (2.28)

and the state after the measurement is

M_0|0⟩/√(⟨0|M_0† M_0|0⟩) = |+⟩⟨+|0⟩/√(⟨0|+⟩⟨+|0⟩) = ((1/√2)/(1/√2)) · |+⟩ = |+⟩ or
M_1|0⟩/√(⟨0|M_1† M_1|0⟩) = |−⟩⟨−|0⟩/√(⟨0|−⟩⟨−|0⟩) = ((1/√2)/(1/√2)) · |−⟩ = |−⟩,
and
M_0|1⟩/√(⟨1|M_0† M_0|1⟩) = |+⟩⟨+|1⟩/√(⟨1|+⟩⟨+|1⟩) = ((1/√2)/(1/√2)) · |+⟩ = |+⟩ or
M_1|1⟩/√(⟨1|M_1† M_1|1⟩) = |−⟩⟨−|1⟩/√(⟨1|−⟩⟨−|1⟩) = ((−1/√2)/(1/√2)) · |−⟩ = (−1) · |−⟩ = |−⟩. (2.29)

In Sect. 2.1.1 we said that if qubits differ only by a global phase factor, they have the same statistical properties, and so we consider them to be the same. Thus, in the last line of Eq. 2.29 we can neglect the factor −1.
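The same formalism covers the {|+⟩, |−⟩} basis. A small NumPy sketch of Eqs. 2.24 through 2.29, with a generic `measure` helper (an illustrative name, not from the source):

```python
import numpy as np

# Measuring in the basis {|+>, |->} via the operator formalism of Postulate 3.
plus  = np.array([1.0,  1.0]) / np.sqrt(2)
minus = np.array([1.0, -1.0]) / np.sqrt(2)

M0 = np.outer(plus, plus)      # |+><+|  (Eq. 2.24)
M1 = np.outer(minus, minus)    # |-><-|

def measure(M, psi):
    """Return (p(m), post-measurement state) per Eqs. 2.14 and 2.15."""
    p = float(np.real(psi.conj() @ M.conj().T @ M @ psi))
    return p, (M @ psi) / np.sqrt(p)

# |+> measured in the {|+>,|->} basis is deterministic (Eqs. 2.25-2.27):
p0, after = measure(M0, plus)
print(p0)                       # ~1.0; the state |+> is preserved

# |0> measured in the same basis gives both outcomes with probability 1/2 (Eq. 2.28):
ket0 = np.array([1.0, 0.0])
p_plus, _  = measure(M0, ket0)
p_minus, _ = measure(M1, ket0)
print(p_plus, p_minus)          # ~0.5 ~0.5
```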

2.1.4 The No-Cloning Theorem

Is it possible to make a copy of an unknown quantum state? The answer is no. The no-cloning theorem was first presented in [10].

Suppose we have a quantum machine with two slots labeled A and B. Slot A is the data slot and starts out in a quantum state |ψ⟩; we do not know which state it is. The goal is to copy the state into slot B, the target slot. We assume that the target slot starts out in some independent state |s⟩. Thus the initial state of the copying machine is

|ψ⟩ ⊗ |s⟩. (2.30)

Some unitary evolution U now effects the copying procedure, ideally,

|ψ⟩ ⊗ |s⟩ → U(|ψ⟩ ⊗ |s⟩) = |ψ⟩ ⊗ |ψ⟩. (2.31)

Suppose this copying procedure works for two particular states, |ψ⟩ and |φ⟩. Then we have

U(|ψ⟩ ⊗ |s⟩) = |ψ⟩ ⊗ |ψ⟩, (2.32)
U(|φ⟩ ⊗ |s⟩) = |φ⟩ ⊗ |φ⟩. (2.33)

Taking the inner product of these two equations gives

⟨ψ|φ⟩ = (⟨ψ|φ⟩)². (2.34)


But x = x² has only two solutions, x = 0 and x = 1, so either |ψ⟩ = |φ⟩ or |ψ⟩ and |φ⟩ are orthogonal. A cloning device could thus only clone states which are orthogonal to one another, and therefore a general quantum cloning device cannot exist. For example, if we have qubits in the states |ψ⟩ and |0⟩, where |ψ⟩ is neither |0⟩ nor |1⟩, it is impossible for a quantum cloner to copy them, since these states are not orthogonal.
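The inner-product argument is easy to illustrate numerically: for the non-orthogonal pair |0⟩ and |+⟩, the condition of Eq. 2.34 cannot hold. A minimal sketch:

```python
import numpy as np

# Eq. 2.34 demands <psi|phi> = <psi|phi>^2, i.e. an overlap of 0 or 1.
# For |0> and |+> the overlap is 1/sqrt(2), so no unitary can clone both.
ket0 = np.array([1.0, 0.0])
plus = np.array([1.0, 1.0]) / np.sqrt(2)

overlap = float(ket0 @ plus)          # <0|+> = 1/sqrt(2)
print(overlap, overlap**2)            # ~0.707 vs 0.5: Eq. 2.34 fails
```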

2.2 Unconditionally Secure Authentication

QKD communication is divided over two channels: the quantum channel and the classical public channel. On the public channel we face the problem that an adversary, Eve, can mount a man-in-the-middle attack, so authentication of the communication is essential. Usually authentication is done by public-key methods, e.g., RSA [6] or DSS [3], but their security is only computational. QKD's security is unconditional; by using such an authentication scheme we would reduce it to computational security as well. Therefore unconditionally secure authentication schemes are necessary for QKD.

Next we present universal hashing, which can be used as an authentication scheme in which the authenticating parties hold a pre-shared secret. It was first presented by Wegman and Carter [9]. Nowadays more efficient symmetric authentication methods are known, but we will use Wegman–Carter authentication because it provides an upper bound for the symmetric authentication key needed.

2.2.1 Universal Hashing

The fundamental idea of universal hashing is to choose a hash function at random, independently of the input, such that the probability that any two distinct inputs have the same hash value is sufficiently low. More precisely, let A and B be finite sets, characterized by the numbers of elements a = |A| and b = |B|, where a ≥ b. A hash function h maps an element x ∈ A to an element h(x) ∈ B. For h and for x, y ∈ A with x ≠ y, we define δ_h(x, y) = 1 if h(x) = h(y) and δ_h(x, y) = 0 otherwise; that is, δ_h(x, y) = 1 if and only if the hashed values of x and y collide. The collection of all hash functions h mapping the set A to the set B is denoted H. The number of collisions for the hash function set H and the values x, y is then defined as

δ_H(x, y) = Σ_{h∈H} δ_h(x, y).

Thus, the collision probability for the two distinct values can be computed as δ_H(x, y)/|H|. The goal is to make this probability small, which can be achieved with a large number of hash functions |H|. But when increasing |H|, the number of bits needed to specify the function, log2 |H|, increases as well. This is impractical for applications, since more random bits are needed, so we have to take it into account when choosing H.

Wegman and Carter [1, 9] were the first to deal with universal hashing, and they defined some useful properties. Below, a definition of the term universal is given [8].

Definition 2.5 Let µ be a positive real number. H is µ-almost universal₂ (or µ-AU₂) if δ_H(x, y) ≤ µ|H| for all x, y ∈ A, x ≠ y.

In other words, the collision probability for any two distinct inputs x and y is at most µ. Note that µ is bounded below by 1/b. In the special case µ = 1/b we can drop the term almost and speak simply of a universal₂ class of hash functions, as Wegman and Carter [1] did. Next we define a stricter property for such classes [8].

Definition 2.6 Let µ be a positive real number. H is µ-almost strongly universal₂ (or µ-ASU₂) if

(a) for every x_1 ∈ A and for every y_1 ∈ B, |{h ∈ H : h(x_1) = y_1}| = |H|/|B|,
(b) for every x_1, x_2 ∈ A (x_1 ≠ x_2) and for every y_1, y_2 ∈ B, |{h ∈ H : h(x_1) = y_1, h(x_2) = y_2}| ≤ µ|H|/|B|.

The first part of the definition says that any input x_1 has probability 1/b of being mapped to a hash value y_1 (see Fig. 2.2). The second part concerns the conditional probability: given the tuple (x_1, y_1), the probability that x_2 is mapped to y_2 is at most µ (see Fig. 2.3). As before, in the special case µ = 1/b we call it a strongly universal₂ class of hash functions.

Fig. 2.2 Definition 2.6a. Any element of A is mapped to any element of B with probability 1/b

Fig. 2.3 Definition 2.6b. Given the tuple (x_1, y_1), any x_2 ≠ x_1 is mapped to y_2 with probability ≤ µ
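A concrete strongly universal₂ family makes Definition 2.6 tangible. The affine family h_{c,d}(x) = (c·x + d) mod p over A = B = Z_p is a standard textbook example (not necessarily the class used in [1, 9]); a brute-force check of both parts of the definition:

```python
# Exhaustive check of Definition 2.6 for h_{c,d}(x) = (c*x + d) mod p
# over A = B = Z_p with p prime. Here |H| = p^2 and b = |B| = p.
p = 5
H = [(c, d) for c in range(p) for d in range(p)]   # all p^2 keys

def h(key, x):
    c, d = key
    return (c * x + d) % p

# (a) every x1 is mapped to every y1 by exactly |H|/|B| = p functions:
count_a = sum(1 for key in H if h(key, 1) == 3)
print(count_a, len(H) // p)                         # 5 5

# (b) for x1 != x2, each pair (y1, y2) is hit by exactly one function,
# i.e. mu*|H|/|B| = 1 with mu = 1/b (strongly universal_2):
count_b = sum(1 for key in H if h(key, 1) == 3 and h(key, 2) == 0)
print(count_b)                                      # 1
```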

2.2.2 Authentication

In the previous section we defined what µ-almost strongly universal₂ families of hash functions are. For authentication we can use them in the following manner. Alice and Bob share a secret random key k and have agreed on a µ-ASU₂ set of hash functions H = {h | h : M → T} (not necessarily secret), where M is the set of possible messages, T is the set of authentication tags, k has length log2(|H|), and the hash functions in H are ordered {h_0, h_1, ..., h_{|H|−1}}. Alice sends Bob a message m ∈ M together with the authentication tag t = h_k(m). It is now easy for Bob to check whether the message is really from Alice: he compares the received tag t with h_k(m) and accepts the message as authentic if they are equal. The key k is then discarded.
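The tag-and-verify step above is short in code. The following sketch uses a toy affine family as a stand-in for an ASU₂ class; all names and parameters are illustrative, not from the source:

```python
import secrets

# Sketch of Wegman-Carter-style authentication: the shared key k = (c, d)
# selects a hash function from the toy family h_{c,d}(m) = (c*m + d) mod p.
p = 2**13 - 1                      # a prime modulus; tag set T = Z_p

def tag(key, m):
    c, d = key
    return (c * m + d) % p

# shared secret key, chosen uniformly at random
k = (secrets.randbelow(p), secrets.randbelow(p))

m = 4242                           # message Alice sends ...
t = tag(k, m)                      # ... together with tag t = h_k(m)

# Bob recomputes the tag and accepts iff it matches; k is then discarded.
accepted = (tag(k, m) == t)
print(accepted)                    # True
```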

The authentication system's resistance against forgery must be examined. A forger has two ways to attack the system. First, he can place a new message m' into the channel. Because the key k is random and secret, he does not know which hash function h_k he has to use, and his only choice is a guess; due to Definition 2.6a the probability of success is 1/|T|. The second possibility for the forger is to wait for a message pair (m, t). With the knowledge of the tuple (m, t), the class of hash functions H, and enough computing power, the forger can find all |H|/|T| matching keys or hash functions, respectively. Nevertheless, according to Definition 2.6b, with this knowledge he has a probability of at most µ of guessing the correct authentication tag for a modified message m', where m' ≠ m.

So the aim, from the security standpoint, is to make µ very small, ideally bringing it down to the lower bound 1/|T|. The disadvantage when µ = 1/|T| is that the number of hash functions is very high. Since we need a string (key) to address the hash function, its size increases with log2 |H|, which results in a key as long as the message or longer (|k| ∈ O(|m|)). This is not practicable for applications, so we have to find a µ small enough that nevertheless lets the key length grow only logarithmically in the message length.

The first solution, an almost strongly universal₂ class whose key grows only logarithmically, was presented by Wegman and Carter [9]. Their construction is shown in Fig. 2.4. Let s = b + log2 log2 a, where b = log2 |T| is the length of an authentication tag and a = log2 |M| is the length of a message.

Fig. 2.4 Schematic of the 2/|T|-almost strongly universal₂ class of hash functions

Construct a strongly universal₂ class of hash functions which maps an input string of length 2s to a hash value of length s. The idea is to split the message into blocks of length 2s, adding zero padding at the end if necessary. Now pick a hash function h ∈ H and apply it to all blocks; the iteration ends by concatenating all outputs. Then start a new iteration: split this string into blocks of length 2s again and pick a fresh hash function, until only one block of length s remains. Finally, the tag is the b low-order bits of this block. To generate a tag we need log2 a − log2 b hash functions or keys, respectively. Earlier we mentioned that the key length in strongly universal₂

classes increases as O(|m|). Here, however, the input to each hash function is a block of length 2s, so the key for one iteration is O(2s) = O(s) = O(b + log2 log2 a). We need log2 a − log2 b iterations, resulting in a key length of O((b + log2 log2 a) · log2 a). For further use we adopt the suggestion in [9], where a strongly universal₂ class is used that needs a key roughly twice the size of its input. With this class the following expression is definitively an upper bound for the key size:

4 · ((b + log2 log2 a) · log2 a). (2.35)
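The iterated construction can be sketched in a few lines. The block hash below is a toy stand-in for a strongly universal₂ class mapping 2s bits to s bits; names and parameters are illustrative, not the concrete class of [9]:

```python
import secrets

s = 16                                  # block-hash output length in bits
p = (1 << (2 * s)) + 15                 # 4294967311, a prime just above 2^(2s)

def h_block(key, block_int):
    """Toy stand-in block hash: 2s-bit integer -> s-bit integer."""
    c, d = key
    return ((c * block_int + d) % p) % (1 << s)

def wc_tag(message_bits, keys, b):
    """Split into 2s-bit blocks, hash each to s bits with one key per level,
    concatenate, repeat until one s-bit block remains; tag = b low-order bits."""
    blocks = message_bits
    for key in keys:
        if len(blocks) % (2 * s):                       # zero padding
            blocks += "0" * (2 * s - len(blocks) % (2 * s))
        hashed = [format(h_block(key, int(blocks[i:i + 2 * s], 2)), "0%db" % s)
                  for i in range(0, len(blocks), 2 * s)]
        blocks = "".join(hashed)
        if len(blocks) == s:
            break
    return blocks[-b:]                  # low-order b bits of the final block

keys = [(1 + secrets.randbelow(p - 1), secrets.randbelow(p)) for _ in range(8)]
msg = format(0xDEADBEEFCAFEBABE, "064b")   # a 64-bit toy message
print(wc_tag(msg, keys, b=8))
```

Note how the per-level key is only O(s) bits while the message can be arbitrarily long, which is exactly the source of the logarithmic key growth.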


We thus achieve a key that grows only logarithmically with the message. To prove that this construction is a µ-almost strongly universal₂ class of hash functions, we have to look at the iterations. After every iteration, the probability that two distinct messages m_1, m_2 have become equal is 1/2^s. Since we iterate log2 a − log2 b times, the probability that the final tags match is at most (log2 a)/2^s, which, because 2^s = 2^b · log2 a, equals 1/2^b = 1/|T|. Because we use a strongly universal₂ class of hash functions, the last reduction takes m_1 to any tag t_1 with equal probability, fulfilling part (a) of Definition 2.6. And as long as the penultimate blocks of m_1 and m_2 are different, m_2 will also be taken to any tag t_2 with probability less than 1/|T|; but if these blocks are the same, less than 2/|T| of the hash functions will take m_2 to any t_2. So µ is 2/|T|, and part (b) of Definition 2.6 is fulfilled.

The second solution presented in [9] tries to reduce the key length needed per message. Again a strongly universal₂ class H is constructed, which maps M to T. The two communicating parties (Alice and Bob) split the shared key into two parts. The first part has length log2 |H| and specifies a hash function; the second part is a sequence (r_1, r_2, ...) of elements of T, each of length log2 |T|. Secure authentication requires uniquely numbered messages. The first part of the shared secret key tells both parties which hash function h to choose. To create the authentication tag t_i for a message m_i with a unique message number (e.g., i), the message is hashed by the hash function and then exclusive-or'ed with r_i, so that t_i = h(m_i) ⊕ r_i.
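The second construction is essentially one line per message. A sketch with an illustrative modular hash as a stand-in for the strongly universal₂ class:

```python
import secrets

# Sketch of the counter-based scheme: one hash function h (fixed by the first
# key part) plus a fresh one-time pad r_i per numbered message,
# giving t_i = h(m_i) XOR r_i.
p = 2**31 - 1                        # Mersenne prime; tags are 31-bit values

c, d = 1 + secrets.randbelow(p - 1), secrets.randbelow(p)
def h(m):                            # hash function chosen once for all messages
    return (c * m + d) % p

pads = [secrets.randbelow(p) for _ in range(4)]    # key part 2: r_1, r_2, ...

messages = [11, 22, 33, 44]
tags = [h(m) ^ r for m, r in zip(messages, pads)]  # t_i = h(m_i) XOR r_i

# verification on the receiving side uses the same h and the same pads
ok = all(h(m) ^ r == t for m, r, t in zip(messages, pads, tags))
print(ok)                            # True
```

The per-message key cost is just one pad of length log2 |T|, which is the point of this variant.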

For the probability of guessing an authentication tag t for a message m, we define for each tag t the set S_t = {(h, r) | h ∈ H, r ∈ T, h(m_1) ⊕ r = t_1, and h(m) ⊕ r = t}, where m_1 and t_1 are the first message and its authentication tag. In words, S_t is the set of partial keys which map the new message m to the tag t, given the knowledge that m_1 is mapped to t_1. Since the chosen class of hash functions is strongly universal₂, all sets S_t have the same size. The only way to extend these partial keys of S_t is to append r_i = h(m_i) ⊕ t_i such that message m_i maps to tag t_i for i = 2, 3, .... Thus the sets retain the same size, and m will be assigned to any tag t ∈ T with the same probability as any other tag. So the guessing probability of a forger depends only on the tag length, resulting in 1/|T|.

Unlike in the first construction of Wegman and Carter, the key size needed per message depends only on the tag length log2 |T| (neglecting the specifier for the hash function). The first construction returns one authentication tag for one large message; in contrast, the second construction splits the message into smaller pieces, numbers them, and generates an authentication tag for each piece. In Fig. 2.5 we compare both constructions. Let H be a strongly universal₂ class of hash functions, where h : M → T and m_i, m_{j,k} ∈ M, m' ∉ M, t_i ∈ T, t' ∈ T, |m_i| = 2|t_i|, i = 1, ..., n, j = 1, ..., |m'|/|t'|, k = 1, ..., log2(|m'|/|t'|), and lb_b(x) denotes the b lower-order bits of x. The additional space needed for the numbering of the messages m_i is neglected.

The probability to insert a forged tuple, even with knowledge of a valid tuple (m_i, t_i), is 1/|T|. Contrary to this, for the first construction the probability to insert a forged tuple even with knowledge of a valid tuple (m', t') is 2/|T|. For the second construction the input size (message size) of the strongly universal₂ class is much larger than the output size (tag size).


Fig. 2.5 The two constructions of Wegman and Carter [9]. Left (first construction): the message m' is split into blocks; each level computes m_{j,k} = h_{k−1}(m_{2j−1,k−1}) || h_{k−1}(m_{2j,k−1}), with m_{j,k} = 0 if it does not exist, and the tag is t' = lb_{|t'|}(h_{log(|m'|/|t'|)}(m_{1,log(|m'|/|t'|)})). Right (second construction): the key is split into a hash-function specifier h and pads r_i, and t_i = h(m_i) ⊕ r_i

2.3 Entropy

Entropy is a key concept in information theory, a branch of applied mathematics concerned with quantifying information. In the following, we present the small part of it that we need in later chapters. In general it is advisable to read up on information theory; we refer to the book of Cover and Thomas [2], where the subject is treated in detail.

2.3.1 Shannon Entropy

The Shannon entropy [7] is a basic measure in information theory. Let X be a discrete random variable on a finite set 𝒳 = {x_1, ..., x_n}; then the Shannon entropy (or information entropy) H(X) is defined as

H(X) = −Σ_{x∈𝒳} p(x) log2 p(x), (2.36)

where p(x) is the probability distribution function of X. In other words, H(X) is the expected number of bits needed to specify a value x ∈ 𝒳. Therefore it is easy to see that the entropy is bounded above by

H(X) ≤ log2 |𝒳|, (2.37)

with equality if p(x) = 1/|𝒳| for each x. For probabilities p(x) = 0 we use the convention 0 log2 0 = 0. Usually the base of the logarithm is 2, so the entropy is measured in bits; other common bases are e and 10.
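A quick numeric sketch of Eqs. 2.36 and 2.37 (the distributions are made up for illustration):

```python
from math import log2

# Shannon entropy (Eq. 2.36) with the convention 0 log2 0 = 0,
# and the upper bound of Eq. 2.37.
def H(probs):
    return -sum(p * log2(p) for p in probs if p > 0)

uniform = [0.25] * 4
biased  = [0.5, 0.25, 0.125, 0.125]
print(H(uniform))   # 2.0 = log2(4): the bound of Eq. 2.37 is attained
print(H(biased))    # 1.75 < 2.0
```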

If X and Y are random variables on 𝒳 and 𝒴, respectively, the conditional Shannon entropy H(X|Y) is defined as

H(X|Y) = −Σ_{y∈𝒴} p(y) Σ_{x∈𝒳} p(x|y) log2 p(x|y), (2.38)

where p(x|y) = p(x, y)/p(y) is the conditional probability distribution of X given Y.

Further, the mutual information I(X;Y) is defined as

I(X;Y) = Σ_{y∈𝒴} Σ_{x∈𝒳} p(x, y) log2 (p(x|y)/p(x)) = Σ_{y∈𝒴} Σ_{x∈𝒳} p(x, y) log2 (p(x, y)/(p(x) p(y))). (2.39)

Note that it is called mutual because

I(X;Y) = I(Y;X). (2.40)

To understand the relationship between H(X), H(Y), H(X|Y), H(Y|X), and I(X;Y), take a look at Fig. 2.6.

Fig. 2.6 Interpretation of H(X), H(Y), H(X|Y), H(Y|X), and I(X;Y)
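The quantities in Fig. 2.6 can be checked on a toy joint distribution, in particular the identity I(X;Y) = H(X) − H(X|Y). A sketch (the distribution is illustrative):

```python
from math import log2

# Conditional entropy (Eq. 2.38) and mutual information (Eq. 2.39)
# for a small joint distribution p(x, y) over {0,1} x {0,1}.
pxy = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}
px = {x: sum(v for (a, _), v in pxy.items() if a == x) for x in (0, 1)}
py = {y: sum(v for (_, b), v in pxy.items() if b == y) for y in (0, 1)}

H_X   = -sum(p * log2(p) for p in px.values())
# H(X|Y) = -sum_{x,y} p(x,y) log2 p(x|y), with p(x|y) = p(x,y)/p(y)
H_XgY = -sum(v * log2(v / py[y]) for (x, y), v in pxy.items())
I_XY  = sum(v * log2(v / (px[x] * py[y])) for (x, y), v in pxy.items())

print(H_X, H_XgY, I_XY)
print(abs(I_XY - (H_X - H_XgY)) < 1e-12)   # True: I(X;Y) = H(X) - H(X|Y)
```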

2.3.2 Rényi Entropy

The Rényi entropy [5] of order α is a generalization of the Shannon entropy. As for the Shannon entropy, let X be a discrete random variable on a finite set 𝒳 = {x_1, ..., x_n}; then the Rényi entropy of order α, H_α(X), with α ≥ 0 and α ≠ 1, is defined as

H_α(X) = (1/(1 − α)) log2 Σ_{x∈𝒳} p(x)^α, (2.41)

where p(x) is the probability distribution function of X. In the limit α → 1, H_α(X) converges to the Shannon entropy H(X). Again we use base 2 for the logarithm, so the entropy is measured in bits. We single out the Rényi entropy of order 2 (R(X) for short):

R(X) = H_2(X) = −log2 Σ_{x∈𝒳} p(x)². (2.42)
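A small sketch relating Eq. 2.42 to the collision probability and checking the bound R(X) ≤ H(X) of Eq. 2.43 (the distribution is chosen for illustration):

```python
from math import log2

# Renyi entropy of order 2 (Eq. 2.42) as -log2 of the collision probability.
def shannon(probs):
    return -sum(p * log2(p) for p in probs if p > 0)

def renyi2(probs):
    return -log2(sum(p * p for p in probs))

dist = [0.5, 0.25, 0.125, 0.125]
pc = sum(p * p for p in dist)        # collision probability p_c(X)
print(pc)                            # 0.34375
print(renyi2(dist), shannon(dist))   # R(X) ~1.54 <= H(X) = 1.75 (Eq. 2.43)
```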


If we consider only the term inside the logarithm, we observe that it equals the collision probability. The collision probability p_c(X) of X is defined as the probability that X yields the same value or event twice in two independent executions. The connection to the Shannon entropy is that R(X) is bounded above by it,

R(X) ≤ H(X), (2.43)

for every probability distribution p(x).

If X and Y are random variables on 𝒳 and 𝒴, respectively, the conditional Rényi entropy R(X|Y) is defined as

R(X|Y) = −Σ_{y∈𝒴} p(y) log2 Σ_{x∈𝒳} p(x|y)², (2.44)

where p(x|y) is the conditional probability distribution of X given Y. The upper bound can be derived following Eq. 2.43 as