Introduction
In the realm of cryptography, one of the most critical concepts is that of perfect secrecy, initially formalized by Claude Shannon in his groundbreaking work from 1948. Perfect secrecy constitutes the strongest notion of secrecy, essentially ensuring that ciphertext provides no additional information about the corresponding plaintext, regardless of what prior knowledge an attacker possesses. This article dives deep into the concept of perfect secrecy, discussing its definitions, the underlying principles, and its importance in cryptographic systems.
Understanding Perfect Secrecy
What is Perfect Secrecy?
Perfect secrecy is a cryptographic notion where the ciphertext does not reveal any information about the plaintext message. More formally, an encryption scheme is considered perfectly secure if for every plaintext and corresponding ciphertext, the a priori distribution of the plaintext remains the same before and after the ciphertext is observed by the attacker. This means that even with complete knowledge of the encryption mechanism, the adversary cannot determine any more information about the plaintext just from observing the ciphertext.
The Attack Model
The well-known attack model analyzed in the context of perfect secrecy is the ciphertext-only attack model, where:
- The sender and receiver agree on a random key value (k).
- The sender encrypts a single plaintext message (m) using the encryption algorithm under key (k).
- The ciphertext generated is intercepted by an adversary.
The unique aspect of this model is that the adversary is considered computationally unbounded, implying they can perform any computation, including brute force attacks, without any restrictions on time or resources.
Shannon’s Definition of Perfect Secrecy
According to Shannon, an encryption process is said to be perfectly secure if:
- Prior Information: This encompasses the information known by the attacker about the plaintext prior to seeing the ciphertext.
- No Additional Information: After observing the ciphertext, the attacker should gain no additional information regarding the plaintext.
Mathematically, this can be expressed using probability distributions over message (M), key (K), and ciphertext (C) spaces:
-
The probability of the plaintext being equal to m given the ciphertext is c should equal the a priori probability of the plaintext being m, i.e.,
P(M = m | C = c) = P(M = m)
This condition must hold for every plaintext message and every ciphertext output, effectively asserting that the ciphertext reveals nothing beyond the prior information already held by the adversary.
Probability Distributions in Perfect Secrecy
1. Distribution over Key Space
The key generation algorithm induces a probability distribution over the key space. Under Kerckhoffs’ principle, the workings of the key generation process are assumed to be public knowledge, meaning that the adversary knows how keys are generated and can make corresponding calculations.
2. Distribution over Plaintext Space
This is where the attacker’s prior knowledge about the plaintext comes into play. For example, if the attacker believes that certain messages have a higher likelihood of being sent, this distribution impacts their deduction about the intercepted ciphertext.
3. Distribution over Ciphertext Space
The probability distribution over the ciphertext space is determined by both the distributions over the message and key spaces. When the encryption process is known, adversaries can compute the probability of each possible ciphertext based on their knowledge.
Equivalent Definitions of Perfect Secrecy
1. Independence of Ciphertext Distribution
An equivalent definition states that the ciphertext distribution must be independent of the plaintext. This means that for any pair of messages (m0, m1) selected with non-zero probability, knowing the ciphertext should not reveal whether m0 or m1 was encrypted at any point in time.
2. Game-Based Definition of Perfect Secrecy
The game-based definition involves an adversary attempting to determine which of two plaintext messages has been encrypted. The encryption process, along with key generation, operates randomly. The adversary must produce a guess that aligns with the encrypted message's index. The encryption scheme is said to be perfectly secure if the adversary’s probability of guessing correctly remains at 0.5, regardless of any observed ciphertext.
The Importance of Perfect Secrecy
Achieving perfect secrecy is the lofty goal in the field of cryptography. Perfect secrecy implies that even the most sophisticated adversary with unlimited computational resources will be unable to derive any information about the plaintext solely based on the intercepted ciphertext. This is crucial for protecting sensitive information against unauthorized disclosures.
Case Study: The Vigenère Cipher
To illustrate the concepts of perfect secrecy, let’s evaluate the Vigenère cipher, which has been shown not to achieve perfect secrecy. When testing for perfect security, if an adversary encrypts the same or two different characters under specific keys, the output will reveal some information by allowing the adversary to draw conclusions based on patterns.
Example of Winning Probability
In the Vigenère cipher instance mentioned, an adversary can guess whether the encrypted text corresponds to plaintext “00” or “01” with a probability significantly higher than 0.5. This demonstrates that the Vigenère cipher does not satisfy the conditions for perfect secrecy, as it does not prevent an adversary from gaining additional information.
Conclusion
In this exploration of perfect secrecy, we uncovered its foundational principles, defined key terminologies, and examined the underlying cryptographic mechanics that govern secure communication. Understanding perfect secrecy is crucial for constructing robust encryption schemes that safeguard sensitive information against any form of cryptographic attack. In a world increasingly reliant on digital communication, the implications of perfect secrecy remain vital in protecting data integrity and confidentiality. Thank you for engaging in this comprehensive overview of one of cryptography’s cornerstone concepts.
Plan for this lecture is as follows: we will
discuss perfect secrecy, which is the first formal notion of security, which is also the
strongest notion of secrecy. We also will discuss various equivalent definitions of
perfect secrecy.
So, the definition of perfect security was
given by Claude Shannon in his classical work in 1948 and Shannon is often considered as
the father of information theory and this notion of security is also called as unconditional
security and information theoretic security.
The attack model that is considered in the
definition of perfect security is the ciphertext only attack model, where we assume that we
have a sender and a receiver who have agreed upon a random key value k.
We assume that sender has encrypted a single
message m using a publicly known encryption process under that key k, and the ciphertext
has been intercepted by the adversary. The interesting part of this attack model is that
we assume here that the adversary is computationally
unbounded. That means we make no assumption
whatsoever about his computing power. We assume that he can do any kind of computation with
brute force or any other computation. So that is the interesting aspect of this security
model.
Informally, perfect security says that irrespective
of any prior information that the attacker has about the underlying plain text, seeing
the cipher text provides no additional information to him about the underlying plain text. That
means seeing the ciphertext is absolutely
useless for the attacker. Whatever it could
infer about the message from the ciphertext, same it could have already inferred before
any ciphertext would have been communicated. So here we have 2 entities, we have the prior
information and we have the term no additional
information. We have to now learn a little
bit of math to understand how to formalize these 2 notions namely that of prior information
and that have no additional information. So, imagine we are given a publicly known
cipher namely a triplet of algorithms key
generation algorithm, encryption algorithm
and a decryption algorithm. Then any attacker has the following information about the encryption
process. Namely, he knows 3 spaces, namely the message space, the key space and the cipher
text space, where the message space is the
set of all legitimate messages which could
be encrypted by the encryption process. Keys space denotes all possible keys which could
be output by the key generation algorithm. And the ciphertext denotes all possible ciphertext,
which could be generated by the encryption
algorithm. From the viewpoint of the attacker, any encryption
scheme induces 3 probability distribution, one probability distribution over the message
space, one probability distribution over the
key space and one probability distribution
over the ciphertext space. So now let us go over this probability distributions one by
one. The first probability distribution is over
the key space and this is induced by the key
generation algorithm. So remember, as per
the Kerckhoffs’ principle, we assume that steps of the key generation algorithm, encryption
algorithm and decryption algorithm are publicly known to the attacker. We also discussed that
key generational algorithm has to be a randomized
algorithm. That means, every time we run the
key generation algorithm it is going to output a possible candidate key from the key space
with certain probability. So that is what we mean by the probability distribution over
the key space induced by the key generation
algorithm. Also, we discussed in one of the earlier lectures
that in most of the cases, the key generation algorithm is going to output a uniformly random
key from the key space. Namely, if the key
space size is cardinality of key space, then
any candidate key could be generated with probability one over the key space. So that
is the uniform distribution over the key space, which adversary will already have if he knows
the steps of key generation algorithm. But
it is not necessary that your key generation
algorithm always outputs uniformly random keys from the key space. So that would be
inducing another kind of probability distribution. Whatever is the case since the steps of the
key generation algorithms are publicly known.
We know that from the viewpoint of the attacker,
there is a probability distribution over different values of keys, which could occur with different
probabilities. And to capture that, we introduced this notation |K|, which is a random variable
which denotes the value of candidate key,
which could be obtained by running the key
generation algorithm. And since the key generation algorithm is going to be a randomized algorithm,
it is not the case that every time we run the key generation algorithm, we obtain the
same value of k. And different values of k
could occur with different probabilities to
denote what value of k has been obtained by running the key generation algorithm, we introduced
this random variable |K|. The notation probability, random variable
k is equal to k denotes the probability with
which the key generation algorithm would output
the candidate value of key to be k. That is the probability distribution that adversary
has about the key space. The next important probability distribution
is that over the plain text space, and this
models any kind of prior information that
the attacker might have about the underlying plaintext. For example, if the attacker already
knows or feels that depending upon the underlying context sender would either encrypt a message
“attack” 70% times or the message “retreat”
30% times. Then that is a probability distribution
over the message space that is induced and known to the attacker. But it might be the case that attacker is
completely clueless what exactly is going
to be the message it could be any uniform
the random message from the plain text space in that case, the probability distribution
is a uniform probability distribution. We introduced this random variable |M|, which
is a random variable to denote the candidate
plain text, which sender might encrypt using
the encryption algorithm. Since attacker does not know what exactly
the message is going to be encrypted, we have problems for different candidate messages
to occur with different probabilities. The
notation probability random variable M equal
to m denotes a probability with which attacker feels that sender would encrypt a plain text
m using the encryption algorithm. So that is a probability distribution over the plain
text space.
The third probability distribution is the
probability distribution over the ciphertext space. This is determined by the probability
distribution over the message space and a key space and the steps of the encryption
process. Because once the attacker knows,
what is the probability that a certain plain
text would be encrypted by the sender, and what is the probability that a candidate key
would be obtained by the key generation algorithm. Then by running the steps of the encryption
algorithm over that candidate plain text and
the candidate key adversary can find out what
is the probability of occurrence of a certain ciphertext. We introduced this third random
variable |C| to denote value of the candidate ciphertext and the notation probability the
random variable C equal to c denotes a probability
with which the encryption algorithm would
output a cipher text c. Notice that there are random variables, M
and K are independent of each other because the key is obtained by running the key generation
algorithm, which is run independent of what
message is going to be encrypted by the Enc
algorithm. Whereas the random variable c, it depends upon the random value of the random
variable m and a random variable k. So now let us see a numerical example to understand
these concepts in a better fashion. Imagine
I have a candidate encryption process where
the candidate messages could be either the character a or b, or c or d. In the same way,
imagine I have a key generation algorithm which could output 3 keys with different probabilities
and say the attacker has the following probability
distribution over the pain text space. That means it knows depending upon the underlying
context, it feels or it knows somehow that the message could be either “a” with a
probability 1/4, or the message could b with
a probability 3/10 and so on. In the same
way, assume that the key generation algorithm outputs the candidate key to be k1, k2, k3
with probability 1/4, 1/2 and 1/4 respectively. So that is a probability distribution over
the key space that attacker has.
Imagine the adversary knows the steps of the
encryption algorithm. So, it could compute an encryption matrix where along the columns
you have the candidate plain text namely a b c d, these are the candidate plain text.
And here you have the candidate keys which
could be obtained by running the key generation
algorithm. This value 3 denotes that the encryption algorithm is going to output cipher text 3
if the plain text would have been “a” and the key would have been k1.
In the same way, for example, this last entry
denotes that the ciphertext will be 2, if the plain text would have been d, and the
key would have been k3, and so on. Adversary could compute this matrix because it knows
the steps of the algorithm and it knows the
different candidate values of plain text and
candidate values of key. So that is the information that is available with the attacker. So now let us try to compute the probability
distribution over the ciphertext space. So
as you can see from the encryption matrix,
the ciphertext could be either 1 or 2, or 3 or 4. Let us try to compute the probability
with which the ciphertext could be 1 or 2, or 3 or 4. So let us first compute the probability
that what is the probability that encryption
algorithm produces ciphertext to be 1. If you see the encryption matrix, ciphertext
could be 1, if the plain text is b and the keys k2, that is one option or one condition
under which the cipher text will be 1. The
other condition in which the cipher text could
be 1 is when the plain text is c, and the key would have been k3. The third condition
under which the ciphertext could be 1 is when the plain text is d, and the key would have
been k1.
And all these 3 conditions or events are independent
of each other. So we apply the sum rule here. The first term here denotes the probability
that the key would have been k2. That means by running the key generation algorithm sender
would have obtained the key to b k2, and the
probability that sender would have used the
message b for encryption. Both these events occur together, then the ciphertext would
have been 1. The second term denotes the other possible
condition in which the ciphertext could have
been 1 namely, plaintext at sender wants to
encrypt is c. And the key that is obtained by running the key generation algorithm is
k3. And the third term here denotes the probability of occurrence of the third condition, namely,
the plain text is d and the key generation
algorithm outputs the k to the k1. And since we know the value of each of these
individual probabilities, which is available to the attacker, attacker can compute this
and hence it knows that the probability that
encryption algorithm could produce ciphertext
to be 1 is 0.2625. So that is the value of probability of occurrence of 1 as the ciphertext.
In the same way, let us try to compute what is the probability that the ciphertext could
be 2.
Again, if you see the encryption matrix, there
are 3 possible conditions in which the ciphertext could be 2, the first condition is if sender
would have encrypted the plain text c, and the key that is used by the sender namely
the key generation algorithm outputs the key
to k1. So that is first term. The second condition in which the ciphertext
character would have been 2 is if the plain text would have been d and the key generation
algorithm for sender would have output the
k to k2. So that is the second term. And the
third condition under which the ciphertext character could have been 2 is if the plaintext
would have been d, and the key would have been k3. Again, we know all these individual
probabilities and we can sum them together
to obtain the probability of ciphertext being
2 being produced by the encryption algorithm. In the same way you can compute the remaining
probabilities, namely, the probability of ciphertext being 3 and the probability that
ciphertext character is 4. And that gives
you the probability distribution over the
ciphertext space. Not only we can compute the probability distribution
over the cipher text space, we can also compute some additional values, namely some kind of
conditional probabilities. For example, let
us try to compute the conditional probability
that given the plain text is m, what is the probability that ciphertext would have been
c. So for instance, suppose let us try to compute the value of probability that ciphertext
is 1 given that plaintext is equal to a.
So again, if we see the encryption matrix,
you see the column under the plaintext a, then you can see that if the plain text would
have been a, no way it is possible that the ciphertext could be 1 because the encryption
of a could give either the ciphertext being
3 or ciphertext being 4. For no value of k,
the encryption of a would have given you the ciphertext to be 1. That is why we can say that the conditional
probability that ciphertext is 1 given the
plain text is a is 0. In the same way, the
conditional probability that ciphertext is going to be 2 given the plaintext is 1 is
again 0 because again, you can see the encryption matrix or the encryption algorithm, which
states that for no value of k encryption of
a under that key would give you the ciphertext
value 2. Now let us compute the conditional probability
that given the message a, what is the probability ciphertext is equal to 3? Now, if you see
the encryption matrix, under the plain text
a, there are 2 values of keys namely k1 and
k2, both of which could produce an encryption of a to be 3. Since the key generation algorithm
is a randomized algorithm, the probability that the key is equal to k1 is given to you
the probability that key is equal to k2 is
given to you and if any of these 2 events
occur, then the encryption of a under the key k1 or under the key k2 would have given
you the ciphertext equal to 3. The conditional probability that given the
plain text is a, you obtain the cipher text
3 is 0.75. In the same way, if you want to
compute the conditional probability that given the plain text is “a” what is the probability
ciphertext equal to 4? Then again, if you look into the encryption matrix, you can see
that if the plain text is a then the only
condition in which you can obtain the ciphertext
to be 4 is if the key generation algorithm would have produced the key to be k3. And the probability that the key generation
algorithm would produce the key to be k3 is
0.25. In the same way, you could compute other
conditional probabilities as well. So now let us see the original definition
of perfect security as given by Shannon. Recall the informal requirement from the perfect
secrecy is that: we will say an encryption
process to be perfectly secure, if irrespective
of any prior information the attacker has about the underlying plaintext, cipher text
that it intercepts, leak no additional information about the plaintext.
We know how to model this prior information.
That is precisely the probability distribution over the plain text space that attacker has.
We can also mathematically capture what exactly we mean by no additional information. Formally,
an encryption process namely a collection
of algorithms key generation, Enc and Dec
over a plain text space M is called perfectly secure if for every probability distribution
over the plain text space and key space, and for every plain text m belonging to the plain
text space, according to that probability
distribution, and for every ciphertext c belonging
to the ciphertext space, the following equality holds. So let us try to understand what exactly
the LHS and RHS of this equality stands for. If you consider the RHS part of this equality,
namely the probability that the plain text
is equal to m that is precisely the apriori
probability that message m could be communicated by the sender. That is a prior information
about the underlying plain text before any communication has happened from the sender
to receiver, before any key generation algorithm
has been used, and the message has been encrypted.
That is the apriori information about the underlying plaintext. Whereas the LHS part of this equality namely
the conditional probability that the plain
text is equal to m given the ciphertext equal
to c that supposed posteriori probability that the message m is encrypted in c. That
means, given that adversary has seen or intercepted a cipher text c, what is the probability that
the plain text m is encrypted in this cipher
text c. So, intuitively, when we say that
these 2 probabilities are equal, it means that whatever the adversary knew about the
underlying plain text before any cipher text was communicated with same probability adversary
knows that a plain text could be m and in
the given cipher text c. That means observing the ciphertext c does
not change attacker’s knowledge about the distribution of the plaintext. Whatever the
attacker’s knowledge was before seeing the
ciphertext, the same knowledge it has, even
after seeing the ciphertext. Moreover, it holds even if adversary is computationally
unbounded. That is the importance of this definition.
This definition does not put any kind of restriction
on the computational power of the adversary. Even if adversary knows the steps of the algorithm,
even if it knows what could be the candidate keys even if it is allowed to do brute force,
its view or its knowledge about the underlying
plain text or the distribution of the plain
text should not change before and after seeing the ciphertext.
If that holds, then we say that our underlying encryption process is perfectly secure. Notice
that in this definition, I have highlighted
few things namely, I have said that the condition
should hold for every probability distribution over the plain text space. That means it does not matter whether the
distribution over a plain text space is a
uniform distribution or a bias distribution,
the condition should hold for any possible probability distribution over the message
space. In the same way, the condition or equality should hold for any kind of probability distribution
over the key space whether it is a uniformly
generated key, whether the key generation
algorithm output uniformly generate random keys or bias keys still the condition should
hold. Moreover, once we fixed up probability distribution
of the plain text space and the key space,
the condition should hold for every plaintext
belonging to the message space and every candidate ciphertext belonging to the ciphertext space. So now, what we will do is we will see some
alternate equivalent definitions for perfect
security. This is the original definition
of perfect security as given by Shannon and the interpretation of equality of these 2
probabilities is that the probability of knowing a plaintext remains the same before and after
seeing the cipher text. That is the interpretation
of the original definition of perfect security
as given by Shannon. Now let us see the first alternate definition.
The first alternative definition says that we will say an encryption process or an encryption
scheme to be perfectly secure if for every
probability distribution over the plaintext
space and key space and for every pair of message m0, m1 which occurs with non-zero
probability with respect to that probability distribution and every cipher text c the following
equality should hold.
The equality says that the conditional probability
that ciphertext is c given the plaintext is m0 is same as the conditional probability
that ciphertext is c given the plaintext is m1. The interpretation of this equality is
that the probability distribution of the ciphertext
is independent of what exactly is your underlying
plain text. That means if the adversary sees the ciphertext c over the channel, then it
does not matter whether the plain text is m0 or whether the plain text is m1 with equal
probability from the viewpoint of the attacker,
the ciphertext c should be a candidate encryption
of m0 as well as a candidate encryption of m1. There should not be any bias in the probability.
But whether it is an encryption of m0 or whether
it is an encryption of m1, that means the
ciphertext distribution should be independent of the underlying plaintext. So now what we are going to do next is we
are going to prove that both these 2 definitions
are equivalent. Namely, we will show that
if there is an encryption process which satisfies the original condition of Shannon's perfect
security, then the same encryption process has to satisfy the alternate definition and
vice versa. Let us first prove the equivalence
of the definition assuming that we have an
encryption process which satisfies the condition of alternate definition. Namely, we assume that we have an encryption
process where for every probability distribution,
the conditional probability that m0 is encrypted
in c and m1 is encrypted in c is same. And we call that probability to be say delta,
for all pair of messages m0, m1 in the plain text space, and for all ciphertext c belonging
to the cipher text space. Given this, we will
prove that the original condition of the Shannon’s
perfect security is also true for the encryption process. The conditional probability that the message
is m in the ciphertext c is same as the probability
that plain text is m. So what we are going
to do is assume we have an arbitrary plain text and arbitrary cipher text character.
Now let us try to compute the conditional probability that message is m given the cipher
text is c. So what I am going to do here is
I am going to apply the Bayes theorem here. By applying the Bayes theorem, I can change
the numerator and denominator of the conditional probabilities. In the left-hand side, the
numerator is message being m and the denominator
is ciphertext being c. I am applying the Bayes
theorem and rewriting the same expression in LHS, where I am changing the numerator
and denominator and as for the Bayes theorem, I will get the right-hand side part.
Now let us try to compute the value of probability
what is C = c. The probability that your ciphertext is c is going to be this summation. This summation
state that you take all possible messages from the plain text space and find out what
is the probability that the message is that
candidate plain text namely m’ and given
that the message is m’, what is the probability that m’ is encrypted in c. That will give you the probability distribution
over the cipher text c. Because what have
to basically do is imagine you have the plain
text space you take each of the candidate plaintext that is precisely the first term
in this summation. And once you have fixed that candidate plaintext you just compute
what is the probability that that candidate
plaintext will take you to the ciphertext
c, that is this second term. And if you multiply all this, if these 2 probabilities
and take the summation over all candidate plain text, that will give you the probability
distribution of ciphertext being c. Now, as
per our hypothesis of the condition, we know
that the conditional probability that ciphertext is c given the plain text is m’ is same
for all m’ namely, it is delta, because that is what is the assumption that we are
making.
We are making the assumption that our encryption
process satisfies the alternate condition. So for every m’, I can replace the second
term in the summation by delta. As a result, I get this simplified form of the summation
and I can take out delta common and what I
will be left with summation of probabilities
over the plain text space, that plain text is little m’. As per the rules of probability,
if I sum up all those probabilities I will get 1. So that means I can say that the probability
that ciphertext is c is nothing but delta.
We have computed the value of probability
of C = c, that is delta. Now, let us try to compute the conditional probability that ciphertext
is c given the plain text is m, that is the numerator part of this RHS expression. And
again, as per our hypothesis of this theorem,
or this lemma, we already know that the given
the plain text is m, the probability that ciphertext will be c is nothing but delta.
And that is irrespective of what is my m, it could be m0, it could be m1 it could be
m2 for any candidate m from the plaintext
base, the probability that it could lead to
the ciphertext when c is delta. So, that means the numerator part of this RHS expression
is also delta. Hence, I can say that my original LHS namely, the conditional probability that
plain text is m given the ciphertext is c
is nothing but this equality. And in the numerator as well as in denominator,
I have the delta so I can cancel it out. And hence, I obtain that this conditional probability
is nothing but the probability that plaintext
is m, which is precisely what exactly we wanted
to prove. That means we have proved that if the encryption process satisfies the condition
of the alternate definition, then it also has to satisfy the condition of original Shannon's
definition. So, now let us do the proof in
the reverse direction. Namely, we assume that we have an encryption
process which satisfies the condition of the original Shannon's definition. That is the
conditional probability that message is m
given the cipher text is c is same as the
probability that message is m, for all messages in the plain text space and all cipher text
in the cipher text space. Given this will prove that the distribution
of the cipher text is independent of the underlying
plaintext. Namely, it does not matter whether
the plain text is m0 or whether the plain text is m1 with equal probability, it could
lead to the cipher text c, and this holds for every pair of messages m0, m1 in from
the plain text space and every cipher text
c from the cipher text space. Let us first try to compute the probability
that ciphertext is c given the plain text is m0. For this we are going to use the fact
that as per the given condition, since the
encryption scheme satisfies the original Shannon’s
condition, we know that the conditional probability that message is m0 given ciphertext is c is
same as the probability that message is m0. So now what I am going to do is I am going
to expand my LHS part here using the Bayes
theorem, where I am just changing the numerator
and the denominator of the conditional probability. And by applying the Bayes theorem, I get this
equality. Now, what I can do is I can cancel out this
common term from both the LHS and RHS. Hence
I obtain that the conditional probability
that m0 will lead to an encryption of c is same as the probability that ciphertext is
c. By applying the same logic, I can conclude that the conditional probability that ciphertext
is c given the plain text is m1 is same as
the probability that ciphertext is c. As a result, both this conditional probability
that m0 will lead to an encryption of c and the probability that m1 will lead to an encryption
of c are same namely, it is probability that
ciphertext is equal to c. That means, we have
proved that if the original Shannon's condition is satisfied, then the encryption process
also has to satisfy the condition for the alternate definition. That means, both these
definitions are equivalent to each other.
So, if you are given an encryption process
and if you are asked to prove whether it is perfectly secure or not, then you can either
prove it as per the original Shannon's condition or you can prove it as per the first alternate
definition.
Now, let us see another equivalent definition
of perfect security. Before going into this definition, let us first try to understand
the underlying intuition that we want to capture through this definition. So, the goal of the
perfect security is the following: imagine
a scenario where a key for the encryption
process has been agreed upon between the sender and the receiver. Suppose the adversary knows
that the candidate plaintext for the sender is going to be either m0 or m1 and that too
with equal probability.
Suppose indeed sender is going to encrypt
message m0 or m1 with equal probability. And it encrypts one of these messages randomly
using the key k as per the encryption process, computes the cipher text c and sent it over
the channel. Say, the attacker intercepts
the cipher text c, and the attacker has unbounded
computing power. Intuitively perfect secrecy demands that the adversary’s knowledge about
the underlying plain text should remain the same even after seeing the ciphertext c.
So what was this information about the plain
text before any cipher text was communicated in this particular case: with probability
half from the viewpoint of the attacker, the plain text could be m0, and with probability
half the plain text would be m1. Now perfect
secrecy demands that even after seeing the
cipher text c and even after knowing the steps of the encryption process, and even if the
adversary has unbounded computing power, the advantage that adversity gets after seeing
the cipher text and learning the underlying
plain text should be 0.
That means, even after seeing the cipher text c, the probability that the underlying plain
text would be m0 or m1 should be half. Adversary can do nothing better than guessing the underlying
plain text. That is the underlying intuition
that we are now going to formalize. And this intuition is formalized by a challenge
response game, which we call this experiment. And we are going to see that in the rest of
this course, every security definition is
going to be formalized by this kind of challenge
response experiment or a game which model something which we want, which can happen
in reality. The experiment is as follows: we assume that we are given a publicly known
cipher namely a triplet of algorithms and
the message space, and the game is played
between 2 entities. The first entity here is an adversary or an
algorithm for the attacker which we denote by this algorithm A and the algorithm or the
adversary is computationally unbounded. This
models the fact that we are trying to capture
the notion of perfect secrecy where the adversary is computationally unbound. The second entity
here is the hypothetical verifier which is going to model the sender.
Now, the nomenclature that we are going to
follow while naming this kind of experiments is as follows: this particular experiment
is denoted by this name. Now let us try to decode each of the individual parts of this
complicated name. So, the name PrivK denotes
here that we are trying to model an experiment
for a symmetric encryption process or a private key encryption process. That is why the name PrivK, later when we
will try to model the security requirement
for public key primitives. This privK will
be replaced by PubK. The name of the string eav denotes here we are considering an adversary
where the adversary is only allowed to eavesdrop or just listen the cipher text because we
are in the cipher text only attack model.
The name A here denotes the name of the algorithm
here and pi denotes the name of the encryption process with respect to which this game is
going to be played. That is the nomenclature we are going to follow to denote this particular
experiment.
Now, what are the rules of this game? This
experiment is going to be a randomized experiment. The first step here is that adversary selects
a pair of messages from the plain text space, say m0 and m1 with the restriction that their
size has to be same. An adversary can choose
any pair of messages. There is absolutely
no restriction we put on which pair of messages that it can submit to the verifier. The only
restriction is that their length have to be same, so adversary can deterministically pick
up a pair of message, it could randomly pick
a pair of message, it can use any strategy
to pick the pair of message that it wants to send to the verifier. Now, once the pair of messages are communicated
to the verifier, what the verifier is going
to do is the following: it is going to run
the key generation algorithm which will output a uniformly random key or a key as per the
steps of the key generation algorithm for the verifier.
And the verifier is going to randomly pick
one of these two messages for encryption. The index of the message which is going to
pick for encryption is denoted by b, which is picked uniformly randomly. That means,
with probability half, it might be picking
the message m0 for encryption, or with probability
half it might be picking the message m1 for encryption. Once it has decided the index
of the message, it encrypts that message mb using the key k which is known only to the
verifier and the ciphertext is given to the
adversary. And now the goal of the adversary is to find
out the index of the message which has been encrypted in c, that means the adversary has
to find out whether the c is an encryption
of m0 or whether it is an encryption of m1.
After analyzing the cipher text c as per whatever strategy adversary wants to follow, it gives
them prediction. Namely, it outputs a bit, which is either 0 or 1 corresponding to the
index, which it feels that that particular
message has been encrypted in the cipher text
c. That means b’ denotes the index, which according
to the adversary is the index of the message which is encrypted in the ciphertext c. That
is the experiment. As you can see, this is
a randomized experiment, because adversary
could pick any pair of messages as per its randomness. In the same way the verifier is
going to pick any random message out of this pair for encryption, and the key could be
any random key as per the key generation algorithm.
Now the output of the experiment is decided
as follows. We say that output of the experiment is 1 or which is interpreted as adversary
has won the game, if it has correctly predicted what exactly is the message which is encrypted
in the ciphertext c. That means if it correctly
output b’ equal to b then we say that the
adversary has won this experiment or the adversary’s experiment’s output is 1. On the other hand, if the adversary incorrectly
identifies what is encrypted in that cipher
text c that means it outputs b’ which is
different from b, then we say that output of the experiment is 0, which is interpreted
as adversary has lost the game. Now, this is the experiment and now we have
a definition: we say that an encryption process
pi with respect to which this game has been
played, is perfectly indistinguishable, if for every attacker who participates in this
game, the probability that the output of the experiment is 1 that means probability that
the adversary outputs b equal to b’ in the
experiment is half. If it is more than half,
then we say that the encryption process is not perfectly indistinguishable. Also, the probability here is taken over the
randomness over adversary and the randomness
over the verifier namely the value of b and
the value of k, output by the key generation. So, that is our third equivalent definition
of perfect secrecy. The original definition of perfect secrecy
as given by Shannon was this, we saw the first
equivalent definition and now we have this
game-based definition. We have now 3 different definitions. It can be proved that all these
3 definitions are equivalent to each other. We already proved that the first 2 definitions
are equivalent to each other. We can also
prove that this third definition is equivalent
to the first definition. And the third definition is also equivalent
to the second definition. That means, if you are given an encryption process, we can prove
it to be secure as per any of these 3 conditions,
we can directly prove it using the Shannon's
original definition, we can try to prove it using the equivalent definition, or we can
try to prove it using the game based definition. So these are the 3 equivalent definitions
of perfect security.
So now I would like to stress a little bit
on this game-based definition. This game-based definition describes the notion of what we
call as perfect indistinguishability. That means the ciphertext is completely indistinguishable
from the viewpoint of the attacker, and it
leaks no information whatsoever, whether it
is an encryption of m0 or whether it is an encryption of m1, and this notion of indistinguishability
is very important. Because for the rest of the course, we will see that all the definitions
that we are going to formulate will be in
terms of the notion of indistinguishability. So now let us do an example here. We will
prove that the Vigenere cipher which we have discussed in the last lecture is not perfectly
secure. So, we are going to consider an instance
of Vigenere cipher, where the message space
and the cipher text space are going to be of length two, namely, the plaintext is going
to be a two character string and the cipher text is going to be a two character string
that is why the message space and the cipher
text space are strings over the set (0 – 25)
X (0 – 25). The key generation algorithm for this instance
of Vigenere cipher is going to do the following: Since the message string, plain text string
is of length two, then the key which is going
to be used should be of length either one
or it should be either length two. So that means the period of the key which is the value
t is a uniformly random value, which could be either one or two. Once the period of the
key has been selected, we select the key of
that much size uniformly, randomly from the
set 0-25. So that is a key generation algorithm. Once the key has been picked, we encrypt the
plain text that is available with the sender as per the encryption process of the Vigenere
cipher. Namely, we divide the plain text into
blocks of size t. If t is equal to one, that
means the same t is used to encrypt both the characters of the plain text, whereas if t
is equal to two, we divide the plain text into one single block of size two and, encrypt
the whole block in one row using the two different
characters using the two characters of the
key. That is encryption process. The decryption process is just the reverse
operation of the corresponding encryption process. So now, we are going to prove that
this instance of Vigenere cipher is not perfectly
secure. That means we will give a concrete
attack where the attacker can significantly find out what exactly is the underlying message
which has been encrypted and we are going to use the game-based definition here.
Namely, we will show an instance of the game
here we will show that the distinguishing advantage of the attacker is significantly
better than half. The instance of the experiment here is as follows: adversary submits the
following pair of messages, it submits a message
m0, which consists of the same characters
00. And it submits another plain text m1 which consists of two different plain text characters. Now, as per the rules of the game, the verifier
is going to randomly encrypt one of these
two messages as per the key generation algorithm.
It decides whether it is going to encrypt the message m0 or whether it is going to encrypt
the message m1 with probably half and half. Once it decides what is going to encrypt,
it runs a key generation algorithm which is
going to output a period which could be either
1 or 2. And once the period is generated a key of
that much size is uniformly randomly picked and given to the verifier. And now using that
key the verifier encrypts and sends the ciphertext
to the attacker. Let the cipher text character
be c1 and c2. Since in this case the plain text, which is encrypted consist of two characters,
the cipher text is also going to contain two characters.
Now the goal of the attacker is to find out
after seeing this cipher text c, whether the cipher text c corresponds to an encryption
of 00, or whether it corresponds to an encryption of 01. And the adversary does that by applying
the following strategy: it compares the two
cipher text characters in c, namely c1 and
c2. If it finds that c1 and c2 are same, then it predicts that the message m0 is encrypted
in c, whereas if the cipher text characters c1 and c2 are different then it predicts that
the message m1 is encrypted in ciphertext.
That is analysis of the attack. So now let us see what the distinguishing
advantage of the attacker is, whether it can significantly distinguish apart, whether the
c that it is seeing is an encryption of 00
or whether it is an encryption of 01. We claim
that the probability that adversary can win this experiment, namely, it can correctly
identify what has been encrypted in the cipher text c is 0.75, which is significantly better
than half. Then that means that this scheme,
this instance of Vigenere cipher, is not perfectly
secure. So now let us prove our claim. What is the
probability that adversary wins the experiment? The probability adversary wins to experiment
is the following: we can view this instance
of the experiment as two individual experiments,
the experiment where the index b is equal to 0, namely the verifier has selected the
message m0 for encryption and c is the encryption of m0. The second experiment here is the version
of the experiment where the value of b is
equal to 1, that means the verifier has selected
the message m1 for encryption and c is the encryption of m1. The probability that the
adversary wins the experiment is what is the probability that b is equal to 0 and probability
of b equal to 0 is half. And given that b
equal to 0 adversary, indeed outputs b’
equal to 0. That is one case in which adversary can win the experiment. The second case is when b is equal to 1, which
can occur with probability half. And given
that b is equal to 1, what is the probability
that adversary correctly output the b’ equal to 1. If we sum up these two things, then
that gives the winning probability of the adversary for this experiment.
So now let us compute each of these 2 individual
probabilities one by one. So let us first compute the probability that adversary’s
analysis outputs b’ equal to 0, given that indeed b equal to 0, that means we now want
to analyze what is the probability that if
m0 is encrypted in c adversary’s analysis
indeed outputs b’ equal to 0. If you see the adversary’s analysis, adversary outputs
b’ equal to 0 only if both the ciphertext characters are same.
If indeed 00 is encrypted, then there are
two possibilities. The key could be either of period 1, or the key could be of period
2. If the plain text 00 is encrypted with a period of size 1, that means that the key
size is 1 and the same character of the key
is used to encrypt first 0 as well as a second
0, then of course the both the cipher text characters are going to be same. Whereas if
the key is of size 2, then the probability that both the cipher text characters are same
is same as probability with which the key
of size 2 satisfies the following condition,
the first character of the key is the same as the second character of the key. If that
happens, then again, the encryption of the message m0 will produce the ciphertext in
both the characters being the same.
These are the two individual events under
which an encryption of m 00 encryption of the message 00 will lead to a cipher text
we are both the cipher text characters are same. So now what is the probability that
the key generation algorithm of the verifier
outputs a key of size 1. Well, it is 1/2. And what is the probability
that the key generation algorithm for the verifier outputs a random key of size 2 where
both the key characters are the same. The
probability of that is 1/2 * 1/26. If we sum
these 2 probabilities, then we get the probability that adversary’s analysis correctly output
b’ equal to 0, given that verifier has indeed encrypted the message m0. So that is the first
probability we have computed.
Now let us try to compute the second probability
here. Namely, what is the probability that adversary outputs b’ equal to 1, given that
indeed b equal to 1, that means we are now trying to analyze that assume the verifier
has encrypted the message (0, 1), what is
the probability that adversary’s analysis
also outputs b’ equal to 1. For this we are actually going to subtract
the complimentary probability namely, we will compute the probability that adversary incorrectly
output b’ equal to 0 given that b equal
to 1 and we are going to subtract this probability
from 1 and that will give us the required probability. Now let us try to compute the
probability that adversary’s analysis or adversary’s algorithm outputs b’ equal
to 0 given that b equal to 1.
The probability of this event is same as a
key of size period 2 is selected with the first character being 1 more than the second
character. Because, for example, if the key is k1, k2 where k1 and k2 differs by 1 character,
or whether the difference in the sense that
say, the first character is the 1 and the
second character is 0 of the key, or say the first character of the key is 2 and the second
character of the key is 1 and so on. If this kind of relationship holds between
the 2 characters of the key, and if indeed
the message (0, 1) is encrypted then after
encryption both the cipher text characters will be same. Only if this event happens then
adversary will incorrectly end up outputting b’ equal to 0, even though b is equal to
1. Now what is the probability that a key
period 2 is selected with the first character
being 1 more than the second character. And the probability of this is 1/2 * 1/26.
Because with probability 1/2, a key of period 2 could be selected, because the probability
half the key could be a period 1 probability
half the key could be period 2. The 1/2 here
denotes the probability that key is of period 2. Given this, what is the probability that
the first character of the key is 1 more than the second character of the key, it is same
as 1/26.
If we multiply these 2 probabilities, we get
this complimentary probability. And if we now subtract this complimentary probability
from 1, we get our required probability. Now we have both this individual probabilities
computed, and if we sum them together, and
we have outside 1/2 as the common thing, we
get that the probability with which adversary analysis in this case could correctly identify
what is encrypted is 3/4, which is significantly more than 1/2. That means this instance of
Vigenere cipher is not perfectly secure. So
that brings me to the end of this lecture. To summarize, in this lecture we discussed
the notion of perfect secrecy, which is the first formal notion of secrecy, and which
is also the most strongest form of secrecy
because this notion of security is against
an adversary which is computationally unbounded. So, if you achieve perfect secrecy, then that
is the best possible notion of secrecy that you can think of.
We also discuss some alternate definitions
of perfect secrecy, namely, the distribution of the ciphertext should be independent of
the underlying plaintext. We also saw the game-based definition, which models perfect
indistinguishability. I hope you enjoyed this
lecture. Thank you!
Heads up!
This summary and transcript were automatically generated using AI with the Free YouTube Transcript Summary Tool by LunaNotes.
Generate a summary for freeRelated Summaries
Understanding Modern Cryptography: The Shift from Perfect Secrecy to Computational Security
Explore the evolution of cryptography, focusing on key reusability and computational security concepts.
Understanding Modern Cryptography: Foundations of Computational Security
Explore modern cryptography's principles, focusing on computational security and efficient algorithms. Dive into the birth of cryptography!
Understanding the Limitations of One Time Pad Encryption
Explore the one time pad encryption, its perfect security, and inherent limitations.
Understanding Cryptography: Key Agreement and Secure Communication
Explore the fundamentals of cryptography, including key agreement and secure communication problems.
Understanding Cryptography: Key Agreement and Symmetric Encryption
Explore the fundamental problems of cryptography including key agreement and symmetric encryption techniques.
Most Viewed Summaries
Kolonyalismo at Imperyalismo: Ang Kasaysayan ng Pagsakop sa Pilipinas
Tuklasin ang kasaysayan ng kolonyalismo at imperyalismo sa Pilipinas sa pamamagitan ni Ferdinand Magellan.
A Comprehensive Guide to Using Stable Diffusion Forge UI
Explore the Stable Diffusion Forge UI, customizable settings, models, and more to enhance your image generation experience.
Pamamaraan at Patakarang Kolonyal ng mga Espanyol sa Pilipinas
Tuklasin ang mga pamamaraan at patakaran ng mga Espanyol sa Pilipinas, at ang epekto nito sa mga Pilipino.
Mastering Inpainting with Stable Diffusion: Fix Mistakes and Enhance Your Images
Learn to fix mistakes and enhance images with Stable Diffusion's inpainting features effectively.
Pamaraan at Patakarang Kolonyal ng mga Espanyol sa Pilipinas
Tuklasin ang mga pamamaraan at patakarang kolonyal ng mga Espanyol sa Pilipinas at ang mga epekto nito sa mga Pilipino.

