Introduction
In this lecture, we examine the crucial notion of semantic security in cryptography, focusing on scenarios involving ciphertext-only attacks. Understanding semantic security is essential for anyone designing or analyzing encryption schemes. This lecture provides a well-rounded perspective on the definitions, their implications, and the relationship between various models of security.
What is Semantic Security?
Semantic security in cryptography is a condition that ensures that the encryption of a plaintext does not reveal any additional information to a computationally bounded adversary. This security measure comes into play particularly when the adversary only has access to the ciphertext and nothing else related to the plaintext or the key used for encryption.
Definition in Ciphertext-Only Attack Model
In the ciphertext-only attack (COA) model, the encryption process is deemed semantically secure if the knowledge of the ciphertext does not significantly aid the adversary in determining any properties of the plaintext. The goal is to formalize this intuition and ensure that:
- The probability of the adversary computing function f(m) using the ciphertext c alongside any prior information must be nearly identical to the probability of computing the same function without the ciphertext.
Key Components and Functions
- History Function h(m): Models any prior knowledge about the plaintext that the adversary might possess.
- Function f(m): Represents the additional information the adversary hopes to obtain about the plaintext after seeing the ciphertext.
The essence of semantic security can be understood clearly when the definition requires that:
- The advantage of the adversary computing f(m) with the ciphertext must be nearly equal to the advantage of computing it without the ciphertext—i.e., the ciphertext should provide no help to the adversary beyond what they already knew.
Indistinguishability and Security
To define semantic security more effectively, we introduce the indistinguishability game:
Indistinguishability Game Framework
In this framework, a sender has two messages, m0 and m1, and randomly encrypts one of the two. The adversary, aware that one of the messages is encrypted, must identify which one. The security requirement here is set such that:
- The adversary's probability of correctly identifying which message was encrypted should not exceed 1/2 + negligible, assuming the adversary is computationally bounded.
Adjustments to the Indistinguishability Game
The distinction from perfect secrecy is that the adversary is now computationally bounded rather than unbounded. We also allow a negligible advantage because, in practical scenarios, a vanishingly small amount of leakage can be tolerated without compromising security.
Comparable Definitions
The relationship between the original semantic security definition and the indistinguishability-based definition is profound:
- Original Definition states that the advantages in both worlds should differ negligibly.
- Indistinguishability Definition posits that an adversary cannot conclusively guess which message has been encrypted.
Both paradigms are fundamentally equivalent, affirming that if one holds true, so does the other.
Practical Implications in Different Attack Models
The concepts of semantic security can also be adapted into higher security models:
- In models like CPA (Chosen Plaintext Attack) or CCA (Chosen Ciphertext Attack), similar indistinguishability definitions can be formulated where the adversary benefits from additional context or oracle access, yet the conditions of indistinguishability remain.
Reduction-Based Proofs
One of the critical methodologies introduced in this lecture is reduction-based proofs:
- We showcase how, if an adversary could extract information (like a bit of plaintext) significantly better than random guessing, we could construct another adversary capable of winning the indistinguishability game—leading to a contradiction.
- The foundational idea is that if A can compute any bit of the plaintext, then a new adversary A’ can successfully distinguish between two encryptions, thereby contradicting the scheme's assumed indistinguishability.
This method, common in cryptographic proofs, validates the strength of the proposed encryption schemes.
Conclusion
Semantic security emphasizes the importance of ensuring that ciphertext does not leak additional information about plaintext, even under the capabilities of computationally bounded adversaries. The duality of semantic security definitions—through both original and indistinguishability perspectives—illustrates the rigor with which security must be analyzed in cryptography. By understanding and applying reduction-based proofs, we can affirm the robustness of encryption schemes against various adversarial scenarios.
In summary, the strong foundation of semantic security underlies much of modern cryptography, and mastering its principles is essential for security professionals and academics alike. I hope you found this lecture insightful. Thank you for your attention!
Hello everyone, welcome to lecture 7. The plan for this lecture is as follows. We will define the notion of semantic security in the ciphertext-only attack model, we will see an equivalent version of this definition based on an indistinguishability game, and we will also introduce reduction-based proofs, which are central to cryptography.
So, let us begin with the semantic-security definition in the ciphertext-only attack model. The scenario is the following. We are in the ciphertext-only attack model, where we have an adversary, and now we will consider a computationally bounded adversary.
Remember, in the last lecture we discussed that if key reusability is your ultimate goal, then you have to settle for an adversary that is computationally bounded. So, we assume a computationally bounded adversary, who sees a ciphertext c of some unknown message m, encrypted by the sender using an unknown key k as per the encryption algorithm, where the steps of the encryption algorithm are known to the adversary.
Intuitively, we will say that our encryption process is semantically secure in this ciphertext-only attack model if the ciphertext does not reveal any additional information about the underlying plaintext to the attacker. Moreover, this should hold even if the adversary has some prior external information about the underlying plaintext, which could have been leaked to the attacker by other means before the ciphertext was communicated. Even though this intuition is straightforward to understand, it is quite challenging to formalize. So let us proceed to formalize it.
We first introduce an abstract function, namely h(m), which models any prior external information about the underlying plaintext that might be leaked to the adversary through other means before any ciphertext has been communicated. So h(m) is a kind of history function. There might be a scenario where the adversary has absolutely no prior external information, in which case h(m) is the empty function, and there might be a scenario where some prior external information about the underlying plaintext has indeed been leaked to the adversary through other means. Whatever the case, this abstract function models the prior external information that the adversary has.
We next introduce another function f(m), which models the additional information about the underlying plaintext that the adversary would like to compute after seeing the ciphertext. Intuitively, the goal of semantic security is to ensure the following.
We say that our encryption process is semantically secure if the probability with which the adversary computes this additional information, namely the function f(m), using the ciphertext c together with the history function (the prior information), is almost the same as the probability with which the adversary could have computed f(m) using the prior information alone, in the absence of the ciphertext. If that is the case, the implication is that the ciphertext is of no help whatsoever to the attacker in computing f(m).
Pictorially, what I mean is the following. Imagine that in one world we have an adversary A, who actually sees a ciphertext, which is an encryption of some unknown message under an unknown key, and who also has access to the history function, namely any prior information that may have been leaked to the adversary through some external mechanism independent of the ciphertext. Compare this with another world, where we have another adversary, say A’, who does not see the ciphertext; this adversary A’ has access only to the history function, namely the prior information about the underlying plaintext that the sender might communicate over the channel. The intuition behind semantic security is that the probabilities with which A and A’ compute f(m), the additional information about the underlying plaintext, are almost the same. Namely, we will say that our encryption process is semantically secure in the ciphertext-only attack model if the absolute difference between the following two probabilities is upper bounded by some negligible function.
Let us have a closer look at the respective probabilities. The first probability is the probability with which the adversary A, the adversary in the first world, outputs the value of f(m), where A is given the ciphertext as well as the output of the history function. The second probability is the probability with which the adversary in the second world, namely A’, computes the value of f(m) using only the output of the history function.
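In symbols, one hedged way to write this condition (suppressing the exact sampling of the message m and the key k, in the style of the Katz–Lindell formulation) is:

```latex
\bigl|\;\Pr\bigl[A(\mathsf{Enc}_k(m),\, h(m)) = f(m)\bigr] \;-\; \Pr\bigl[A'(h(m)) = f(m)\bigr]\;\bigr| \;\le\; \mathsf{negl}(n)
```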
If the absolute difference between these two probabilities is upper bounded by a negligible quantity, it means that whatever the adversary could learn about f(m) with the help of c, it could have computed with almost the same probability without seeing c at all. In that case, our encryption process is so good that the ciphertext looks like a random bit string and provides no significant aid to the adversary in computing f(m). That is the intuition behind the notion of semantic security.
This is the original definition of semantic security, and it turns out that proving the semantic security of an arbitrary encryption process as per this original definition is somewhat complicated, because you have to handle the arbitrary history function h(m) as well as the arbitrary function f(m) that the adversary would like to compute. Instead, we will see an equivalent version of this definition based on an indistinguishability experiment. You can think of this alternate definition as the computationally secure variant of the indistinguishability-based definition of perfect secrecy.
Let us first recall the indistinguishability-based definition that we used to define perfect secrecy. The essence of that definition is this: if a sender has two messages m0 and m1 and randomly encrypts one of them, and the adversary is aware that the sender has encrypted either m0 or m1, then even with this prior information and the ciphertext c in hand, the adversary should not be able to identify which message has been encrypted with probability better than half. That was the intuition we wanted to capture through the indistinguishability-based definition of perfect secrecy.
This was captured very nicely by the following experiment, which we used to define the notion of perfect secrecy. We have a publicly known scheme, namely a triplet of algorithms over some message space, and in the model of perfect secrecy we had a computationally unbounded adversary. The name of the experiment was PrivK^eav_{A,Π}.
Just to recall, the nomenclature of the experiment is as follows: PrivK denotes that we are modeling an experiment in the context of private-key (symmetric) encryption, eav means we are considering an eavesdropping adversary, A is the name of the adversarial algorithm, and Π is the name of the scheme. In this experiment the rules are as follows. The adversary is allowed to submit any pair of messages from the plaintext space, with the restriction that the two plaintexts must be of the same size, and the experiment (a hypothetical verifier or challenger) does the following: it randomly generates a key by running the key-generation algorithm and randomly encrypts one of the two messages under that key. The challenge for the adversary is to identify which plaintext has been encrypted in the challenge ciphertext c, m0 or m1. The adversary outputs a bit, namely its guess about what has been encrypted in the challenge ciphertext. We say the scheme Π is perfectly secure, or perfectly indistinguishable, if the probability with which the adversary successfully identifies the encrypted message is upper bounded by half. If the adversary correctly identifies the encrypted message, we say that the adversary has won the game, or that the output of the experiment equals 1; that is, the event that the experiment outputs 1 is exactly the event b’ = b.
So, in the context of perfect secrecy, our requirement was that the probability that the adversary correctly identifies b, i.e. that b’ = b, should be upper bounded by half.
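In symbols, the perfect-secrecy requirement on the experiment reads:

```latex
\Pr\bigl[\mathsf{PrivK}^{\mathsf{eav}}_{A,\Pi} = 1\bigr] \;\le\; \frac{1}{2}
```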
Now, let us see the indistinguishability-based definition that models the notion of semantic security in the ciphertext-only attack model. We will make the following changes. The first change is this:
Instead of assuming that our adversary is computationally unbounded, we will assume that our adversary is computationally bounded. This models the first relaxation that we agreed to make in the computational security model. Remember, in the last lecture we discussed that if key reusability is your ultimate goal, then you should target security only against a computationally bounded adversary, namely an adversary whose running time is upper bounded by some polynomial function of the security parameter. That is the first change in the experiment: we no longer consider an adversary with unbounded running time. Consequently, the name of the experiment is going to be PrivK^coa_{A,Π}(n). Instead of eav, I now call this experiment coa to denote that it is the indistinguishability experiment in the ciphertext-only attack model. The second difference in the nomenclature is that I now introduce the security parameter n, because the running time of the adversary is upper bounded by a polynomial function of the security parameter; in the nomenclature of the perfect-secrecy experiment there was no such security parameter, because the adversary was allowed unlimited running time.
That is the second change in the experiment. The third change is that instead of saying our encryption process is perfectly indistinguishable, we will say it is computationally indistinguishable if the probability with which the adversary correctly identifies what has been encrypted in the challenge ciphertext is upper bounded by half plus some negligible function. In the experiment for perfect secrecy, the requirement was that the adversary should not be able to identify what has been encrypted with probability better than half; now we grant the adversary an extra negligible advantage. This extra advantage, namely a negligible additive probability, models the second relaxation that we have to make in the computational security model when key reusability is the goal. Again, recall from the last lecture: if you want to design a scheme where the key is reused, then instead of demanding that the adversary learn nothing additional, you should be willing to let the adversary learn something about the underlying message, or break your scheme, with some additional probability, and that additional probability should be so small, a negligible probability, that for most practical purposes you can ignore it. That is why I bring this additional advantage, a negligible function of n, into my security definition. So, that is the computationally secure indistinguishability experiment in the ciphertext-only attack model.
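To make the experiment concrete, here is a minimal sketch of the coa indistinguishability experiment in Python-style pseudocode. The names keygen, enc, and the adversary interface (choose_messages, guess) are placeholders for an arbitrary scheme Π and adversary A, not a specific library API; the comments mark the three changes relative to the perfect-secrecy experiment.

```python
import secrets

def privk_coa(keygen, enc, adversary, n):
    """One run of PrivK^coa_{A,Pi}(n); returns 1 if the adversary wins.

    Changes vs. the perfect-secrecy experiment:
      1. the adversary is polynomial-time in the security parameter n,
      2. the experiment is parameterized by n,
      3. security demands Pr[win] <= 1/2 + negl(n), not Pr[win] <= 1/2.
    """
    # Adversary submits two equal-length messages of its choice.
    m0, m1 = adversary.choose_messages(n)
    assert len(m0) == len(m1)

    # Challenger samples a fresh key and a uniformly random bit b.
    k = keygen(n)
    b = secrets.randbits(1)

    # Challenge ciphertext: an encryption of m_b.
    c = enc(k, m1 if b else m0)

    # Adversary outputs its guess b' from the ciphertext alone (COA model).
    b_guess = adversary.guess(c)
    return 1 if b_guess == b else 0
```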
The essence of this experiment is the following. Suppose a sender has a pair of messages, say m0 and m1, the (computationally bounded) adversary is aware of this pair, one of the two messages has been encrypted by the sender and communicated to the receiver, and the adversary intercepts the ciphertext. Then we require that the adversary should not be able to identify whether the ciphertext c it is seeing is an encryption of m0 or of m1 with probability better than half plus negligible. That is the real-world scenario we are trying to capture through this experiment.
So now we have the following two definitions. The first is the original definition of semantic security in the ciphertext-only attack model, which captures that the difference between the advantage of the adversary in the first world and that of the adversary in the second world is upper bounded by a negligible probability. The second is the indistinguishability-based definition. It turns out that these two definitions are equivalent: if an encryption process satisfies the first condition, then we can prove that the same encryption process also satisfies the second condition, and vice versa. I would like to stress the following.
In the experiment we have discussed, when I say that the probability that the adversary correctly identifies what has been encrypted in the challenge ciphertext should be upper bounded by half plus negligible, this probability is over the randomness of the experiment: the experiment chooses the message m0 to be encrypted in the challenge ciphertext with probability half, and with probability half it chooses m1. The probability of correctly identifying what has been encrypted in c is also over the randomness of the adversary, because the entire experiment is a randomized experiment. As I said, these two definitions are equivalent; the proof of equivalence is somewhat involved, and in the interest of time we will not go into its details. If you want a high-level overview of the proof, you can refer to the book by Katz and Lindell.
Interestingly, it turns out that the equivalence of these two definitions holds in the other attack models as well. Currently we are considering the ciphertext-only attack model, where the adversary only gets access to the encryption of some message. If we go to a higher, i.e. more powerful, attack model, say the CPA attack model, where apart from the ciphertext the adversary also gets access to an encryption oracle, then we can formulate a corresponding semantic-security definition analogous to the one we are giving here for the COA model. Namely, we would state that the absolute difference between the probabilities of the adversary computing the function f(m) in the two worlds should be upper bounded by a negligible function, where in the first world the adversary gets access to the encryption oracle in addition to the ciphertext. In the same way, if we take this definition to the CCA attack model, then apart from the ciphertext, the adversary in the first world will have access to the encryption oracle as well as the decryption oracle, and so on. So we can come up with a version of semantic security in the CPA model, in the CCA model, and so on, where the essence of the definition is always that the absolute difference between the two probabilities of the adversary computing f(m) in the first world and in the second world is upper bounded by a negligible probability. It turns out that irrespective of the model, we can come up with a corresponding indistinguishability-based definition and prove that the original semantic-security version is equivalent to it. That is why, for the rest of the course, we will not use the original version of the semantic security definition; we will rather follow the indistinguishability-based definition and, depending on whether we are in the CPA or CCA world, enhance the indistinguishability experiment accordingly.
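As an illustration of how the experiment is enhanced, here is a hedged sketch of the CPA variant, reusing the conventions of the coa sketch above. The only structural change is that the adversary is handed an encryption oracle (a closure over the hidden challenge key) both before and after choosing its messages; the names and interfaces are again placeholders, not a fixed API.

```python
import secrets

def privk_cpa(keygen, enc, adversary, n):
    """One run of the CPA indistinguishability experiment (sketch)."""
    k = keygen(n)

    # The CPA enhancement: the adversary may query encryptions of
    # messages of its choice under the (hidden) challenge key.
    def enc_oracle(m):
        return enc(k, m)

    m0, m1 = adversary.choose_messages(n, enc_oracle)
    assert len(m0) == len(m1)

    b = secrets.randbits(1)
    c = enc(k, m1 if b else m0)

    # Oracle access continues while the adversary forms its guess.
    b_guess = adversary.guess(c, enc_oracle)
    return 1 if b_guess == b else 0
```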
So, this is the indistinguishability-based definition in the ciphertext-only attack model, and it turns out that we can come up with an alternate version of this definition.
The original definition requires that the probability that the adversary correctly identifies the message encrypted in c is upper bounded by half plus negligible. The alternate definition demands that the output of the adversary be (almost) the same irrespective of which message has been encrypted in the challenge ciphertext c. Remember that this indistinguishability experiment is randomized: with probability half, b equals zero and m0 is encrypted in c, and with probability half the message encrypted in c is m1. The goal of the adversary is to identify whether it is m0 or m1 that has been encrypted in c. The alternate definition demands that the adversary's output be the same whether it is m0 or m1 that is encrypted in c. More formally, it should not matter whether m0 or m1 has been encrypted: in both cases the adversary's output should be almost the same, except with a negligible probability. That means the absolute advantage of the adversary in distinguishing whether it is seeing an encryption of m0 or an encryption of m1 in the ciphertext c should be upper bounded by some negligible function. If this is the case, we say that our encryption process has indistinguishable encryptions in the ciphertext-only attack model.
Another interpretation of this difference of probabilities being upper bounded by a negligible quantity is in terms of distinguishing advantage: you can view the difference between these two probabilities as the distinguishing advantage of our adversary. The essence of the alternate definition is that the distinguishing advantage of the adversary in this experiment should be upper bounded by a negligible probability. It turns out that these two versions of the indistinguishability-based definition are equivalent. The first says that the encryption process is computationally indistinguishable if the probability with which the adversary correctly outputs b’ = b is upper bounded by half plus some negligible function. The second says that the attacker's advantage in distinguishing whether it is seeing an encryption of m0 or of m1 should be upper bounded by a negligible probability. We can prove that if an encryption process Π satisfies the first condition, then it also satisfies the second, and vice versa. What we are going to do is follow the implication in the direction that condition 2 implies condition 1. Namely, we assume we have an encryption process where condition 2 holds, i.e. the distinguishing advantage of the attacker is upper bounded by some negligible probability, and we will show that no matter how the adversary participates in the indistinguishability experiment, the probability that it correctly achieves b’ = b is upper bounded by half plus some negligible function.
So, let us prove that. What is the probability that, in the indistinguishability experiment, the adversary outputs b’ = b? (If the adversary outputs b’ = b, that is exactly the event that the experiment outputs 1.) Recall that there are two versions of the experiment: one where the challenger has encrypted the message m0 in the ciphertext c, and one where it has encrypted m1. With probability half the challenger uses m0, and with probability half it uses m1. So overall, the probability that the adversary outputs b’ = b splits into the case where it outputs b’ = b when b = 0 and the case where it outputs b’ = b when b = 1; both events b = 0 and b = 1 occur with probability half, which is why we can take the factor of one half out as common.
Now, I am going to rewrite the first probability as 1 minus the complementary probability, namely the probability that the adversary incorrectly outputs b’ = 1 even though the message m0 has been encrypted in the challenge ciphertext. Subtracting that probability from 1 gives the probability with which the adversary identifies that m0 has been encrypted in c, given that m0 has indeed been encrypted in c. That is the substitution I have made. Next, I take the 1/2 inside the bracket and rearrange the terms to obtain the resulting expression.
Now I use the assumption that, for my encryption process, the distinguishing advantage of the attacker is upper bounded by some negligible probability. The highlighted quantity on the slide is nothing but the absolute difference between the two conditional probabilities, which is exactly the distinguishing advantage of the attacker, namely the extent to which it can tell apart an encryption of m0 from an encryption of m1. By assumption this is upper bounded by some negligible function, so I can substitute a negligible function for the highlighted quantity. What I obtain is that the probability that the adversary outputs b’ = b in the experiment is upper bounded by half plus some other negligible function, because one half times a negligible function can always be replaced by another negligible function, negligible’.
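Since the slide itself is not reproduced in the transcript, the chain of equalities just described can be reconstructed as follows:

```latex
\begin{aligned}
\Pr[b' = b]
  &= \tfrac{1}{2}\Pr[b'=0 \mid b=0] \;+\; \tfrac{1}{2}\Pr[b'=1 \mid b=1] \\
  &= \tfrac{1}{2}\bigl(1 - \Pr[b'=1 \mid b=0]\bigr) \;+\; \tfrac{1}{2}\Pr[b'=1 \mid b=1] \\
  &= \tfrac{1}{2} \;+\; \tfrac{1}{2}\bigl(\Pr[b'=1 \mid b=1] - \Pr[b'=1 \mid b=0]\bigr) \\
  &\le \tfrac{1}{2} \;+\; \tfrac{1}{2}\,\mathsf{negl}(n) \;=\; \tfrac{1}{2} + \mathsf{negl}'(n).
\end{aligned}
```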
So, what I have shown is that if condition 2 holds for an arbitrary encryption process, then condition 1 holds as well. In the same way, we can prove that if condition 1 holds for our encryption process, then condition 2 holds too; I leave that as an exercise for you. That means both versions of the definition are the same: to say that our encryption process is COA-secure as per the indistinguishability game, we can either demand that condition 1 holds or demand that the distinguishing advantage of the adversary is upper bounded by a negligible probability. Both are equivalent. Throughout the course, depending on convenience, we will use either of these two versions of the indistinguishability-based definition for the ciphertext-only attack model. So, here is the summary.
So we have the original definition of semantic security, and we have an indistinguishability-based definition, and even though we have not proved it formally, you have to believe me that these two conditions are equivalent to each other. Now, you might wonder how these two definitions can possibly be equivalent: in the original definition of semantic security you have an abstract history function and some function f(m) that the adversary would like to compute, whereas in the computational indistinguishability-based definition there is no such history function and no such function f(m).
To see why, I am going to take an illustration. We will consider the case where our encryption process satisfies the indistinguishability-based definition, and we will assume that the plaintext distribution is uniform over the set of messages, namely the set of l-bit strings. What we are going to show is that if our encryption process satisfies the indistinguishability-based definition, then the adversary, by looking at the ciphertext, cannot compute any bit of the underlying plaintext. So the function f(m) that the adversary is interested in computing in this illustration is a bit of the underlying message, and our goal is to prove that if our encryption process satisfies the indistinguishability-based definition, then the probability with which the adversary can compute this candidate f(m) is upper bounded by half plus a negligible function. The proof strategy we will use in this illustration is a reduction-based proof, which is central to cryptography: in the rest of the course, almost all the proofs will follow this reduction-based pattern. Reduction-based proofs actually come from another interesting branch of computer science, namely complexity theory.
Specifically, if you want to prove that a problem y is NP-complete, you show that an existing NP-complete problem x can be reduced to an instance of the new problem y, and the implication of this reduction is that if you had a polynomial-time solution for problem y, then you would have a polynomial-time solution for problem x as well. That is the essence of a reduction. In the same way, we are going to use reduction-based proofs in cryptography, where from an adversarial algorithm that breaks or attacks one scheme we design another adversary that breaks something else as well.
In more detail, what we are going to show in this illustration is the following. Imagine that the underlying message, the message the sender has encrypted, is randomly selected from the set of l-bit strings, we are in the ciphertext-only attack model, and our adversary has seen a ciphertext c. The claim is that if your encryption process is computationally indistinguishable, then it is impossible for any polynomial-time adversary to compute any of the underlying bits of the plaintext with probability significantly better than half. That means: for each i from 1 to l, the probability that any algorithm A, given an encryption (under an unknown key) of an unknown message chosen uniformly from the set of l-bit strings, outputs the ith bit of the underlying plaintext is upper bounded by half plus negligible. Here m^i denotes the ith bit of the underlying plaintext.
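Written out, with m drawn uniformly from the l-bit strings and m^i denoting the ith bit of m, the claim is:

```latex
\forall i \in \{1, \dots, \ell\}: \qquad
\Pr_{\,m \leftarrow \{0,1\}^{\ell},\; k}\Bigl[\, A\bigl(\mathsf{Enc}_k(m)\bigr) = m^{i} \,\Bigr] \;\le\; \frac{1}{2} + \mathsf{negl}(n)
```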
This illustration will show that if your encryption process satisfies the indistinguishability-based definition, then indeed no polynomial-time algorithm can compute this candidate f(m), where in this particular illustration f(m) is the ith bit of the underlying plaintext. The basic intuition behind the proof is the following. Imagine we have an adversary A which, given an encryption of a random message, can compute the ith bit of the underlying message with probability significantly better than half. Then, using this adversary, we can design another adversary that can distinguish, with significant probability, an encryption of a message m0 from an encryption of a message m1 whose ith bits differ. But this violates the assumption that the encryption process is computationally indistinguishable, because computational indistinguishability means that no adversary can distinguish an encryption of m0 from an encryption of m1, including pairs whose ith bits differ. That is the basic intuition of the claim; however, it has to be formalized by giving a reduction-based proof, so let us see the details.
The claim, formally stated, is this: if you have an encryption process over the message space of l-bit strings, the underlying message is selected uniformly at random, and your encryption process is computationally indistinguishable, then there exists no adversary which, given an encryption of a random unknown message under a random key, can compute the ith bit of the underlying plaintext with probability significantly better than half. The proof strategy is proof by contrapositive. Namely, we suppose there exists an adversary who can compute the ith bit of the underlying plaintext with significant advantage, where the underlying plaintext is randomly chosen from the set of l-bit strings and neither the plaintext nor the key is known to the adversary. That is, the property of this adversary is that, given a ciphertext c of an unknown, randomly chosen message m, it correctly outputs the ith bit of the underlying plaintext with probability significantly better than half. Assuming the existence of this adversary, we are going to show that the claim that our encryption process is computationally indistinguishable does not hold.
Namely, we will design an adversary A’ which participates in an instance of the indistinguishability game in the COA attack model and, with probability significantly better than half, can tell whether it is seeing an encryption of m0 or an encryption of m1. In other words, with whatever advantage A can identify the ith bit of the underlying plaintext, i.e. compute the function f(m) (which in this case is the ith bit of the plaintext), with almost the same advantage we can design another algorithm A’ that wins an instance of the COA indistinguishability experiment. That is what we are going to show through this reduction. So let us see the details of the reduction.
Assume we have an adversary A whose property is that, if it sees an encryption of a random message from the plaintext space of l-bit strings, it can identify the ith bit of the underlying plaintext with probability half plus non-negligible, for some i in the range 1 to l. For simplicity, you can imagine i = 1: the property of this adversary is that, given an encryption of a randomly chosen message, where neither the message nor the underlying key is known to it, this adversary in polynomial time correctly outputs the first bit of the message with probability half plus some non-negligible function.
Now, assuming the existence of such an adversary A, we design an adversary A’ who participates in an instance of the COA indistinguishability game as follows. As per the steps of the game, A’ has to submit two messages from the plaintext space, and it is free to choose any pair. A’ cleverly chooses the pair so that the message m0 has its ith bit equal to 0 and the message m1 has its ith bit equal to 1. That is, m0 and m1 differ in their ith bit, and apart from that, all the other bits of m0 and m1 are randomly chosen. That is the pair of messages the adversary submits to the experiment. The experiment then runs the key-generation algorithm and decides to encrypt either m0 or m1. So imagine A’ is given the encryption of the message mb, where b is the index of the message encrypted by the experiment, and the goal of the attacker A’ is to identify whether it is m0 or m1 that has been encrypted in c.
Now, what this attacker is going to do is take the help of the adversarial algorithm A, which we have assumed exists. Namely, A’ invokes the algorithm A by supplying it the ciphertext c, the same ciphertext that A’ obtained in the COA experiment; in effect, A’ challenges A to identify the ith bit of the underlying message encrypted in c. The algorithm A then outputs its guess for the ith bit of the underlying message encrypted in the challenge ciphertext, because the property of A is that, given an encryption of a random message, it can tell you whether the ith bit of the underlying message is 0 or 1. Depending on the output that A’ obtains from A, A’ returns to the COA experiment and outputs b’ = m^i; that is, the output of A’ in the COA experiment is the same bit that A output after seeing the ciphertext c.
My claim now is that the probability with which A’ outputs the correct answer, namely the probability with which A’ outputs b’ = b in the COA indistinguishability game, is exactly the probability with which the algorithm A correctly outputs the ith bit of the underlying message encrypted in the ciphertext c. Mathematically, the probability with which A’ wins the COA experiment is exactly the probability with which A identifies the ith bit of the message encrypted in the challenge ciphertext. This is because the view that A’ provides to A is exactly the view that A expects in order to launch its attack. Let me elaborate on this in the next slide, which shows the reduction.
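Here is a minimal sketch of this reduction in the same pseudocode style as before. The interface of the assumed bit-guessing adversary A (here a function guess_ith_bit) and of the COA game are placeholders; the point is only the structure: A’ embeds its COA challenge as A's input and echoes A's answer.

```python
import secrets

def make_distinguisher(guess_ith_bit, i, l):
    """Build A' for the COA game from an assumed adversary A that, given
    Enc_k(m) for uniform m in {0,1}^l, outputs the i-th bit of m with
    probability 1/2 + non-negligible (indices here are 0-based)."""

    class APrime:
        def choose_messages(self, n):
            # Pick m0, m1 with all bits random except the i-th,
            # which is forced to 0 in m0 and 1 in m1.
            m0 = [secrets.randbits(1) for _ in range(l)]
            m1 = [secrets.randbits(1) for _ in range(l)]
            m0[i], m1[i] = 0, 1
            return m0, m1

        def guess(self, c):
            # Forward the challenge ciphertext to A; from A's view, c is
            # an encryption of a uniform l-bit string, so A's success
            # probability is preserved. Echo A's bit as the guess b'.
            return guess_ith_bit(c)

    return APrime()
```

Plugging this APrime into the privk_coa sketch above, the probability that A’ wins equals the probability that A outputs the i-th bit correctly, which is exactly the equality claimed in the lecture.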
As you can see, one part of the picture is an instance of the COA indistinguishability experiment, and in the other part A’ is taking the help of A. The first observation about this reduction is the following: the running time of our algorithm A’ is almost the same as the running time of the algorithm A. What is the running time of A’? It has to submit a pair of random messages, which it can do in a polynomial amount of time, and then it has to invoke the algorithm A; if the running time of A is polynomial, then the cost of invoking A is also polynomial. Finally, A’ outputs a bit b’, which takes a polynomial amount of time. So if the algorithm A that I am assuming to exist runs in polynomial time, then the algorithm A’ also runs in polynomial time. The second observation is that the algorithm A’ invokes the algorithm A in a black-box fashion.
Namely, A’ does not need to know what procedure the algorithm A follows to identify the ith bit of the underlying message encrypted in c. You can imagine there is an interface: an input interface through which A’ provides a challenge to A, and an output interface through which A returns its answer, namely the ith bit of the underlying message encrypted in c. Apart from that, we make no assumption whatsoever about the internal workings of the algorithm A. What this models is that if there is any way at all to identify the ith bit of the underlying plaintext, where the underlying message is random and encrypted in c, then without even knowing the internal details of the algorithm A, the adversary A’ can win an instance of the COA experiment.
The important thing is how the reduction has been formulated: A’ has created a view for A, where by view I mean the information provided to the algorithm A, namely whatever A sees. What A sees is an encryption of a random l-bit string, and the property of our algorithm A is precisely that it expects an encryption of a uniformly random l-bit string; given such an encryption, with probability half plus non-negligible it can tell you the ith bit of the underlying message. So the reduction has created a view that is identical to the one A needs to launch its attack. In the COA experiment, with probability half it is the message m0 that is encrypted in c and with probability half it is m1, and both m0 and m1 are randomly chosen except for their ith bit. So what A’ has created for A is a view in which A sees an encryption of a random message whose ith bit is 0 or 1 with probability 1/2 each. It is very important that A’ create a view identical to the one A expects, because otherwise we could not claim that with whatever advantage A identifies the ith bit from the ciphertext c, with almost the same probability A’ identifies which message has been encrypted in c.
Let me go back to the previous slide, where I claimed that the probability with which the adversary A’ wins the COA experiment is identical to the probability that the adversary A identifies the ith bit of the random message encrypted in c.
The reason this equality holds is as follows. When I say that A can output the ith bit of a randomly chosen message, then with probability 1/2 that randomly chosen message lies in the set of l-bit strings whose ith bit is 0, and with probability 1/2 in the set of l-bit strings whose ith bit is 1; the probability of A outputting the ith bit is over the random choice of the underlying message. Coming back to the reduction: with probability 1/2, b is going to be 0, and with probability 1/2, b is going to be 1. That means c is either an encryption of a random message whose ith bit is 0, or an encryption of a random message whose ith bit is 1, and that is exactly what is forwarded to the algorithm A to identify the ith bit. So if the message m0 has been encrypted in the challenge ciphertext c, then with probability half plus non-negligible the adversary A outputs the ith bit as 0, which ensures b’ = 0. Whereas if m1 has been encrypted in c, so that the ith bit of the encrypted message is 1, then A is seeing an encryption of a random message whose ith bit is 1, and by the property of the algorithm, with probability half plus non-negligible it outputs m^i = 1, and with the same probability A’ outputs b’ = 1. So with whatever advantage the adversary A identifies the ith bit of the message encrypted in c, with exactly the same probability our adversary A’ identifies what has been encrypted in c.
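In equation form (again a reconstruction, since the slide is not in the transcript; A → β denotes the event that A outputs bit β, and ε(n) is A's non-negligible advantage):

```latex
\Pr\bigl[\mathsf{PrivK}^{\mathsf{coa}}_{A',\Pi}(n) = 1\bigr]
  \;=\; \tfrac{1}{2}\Pr[A \to 0 \mid b = 0] + \tfrac{1}{2}\Pr[A \to 1 \mid b = 1]
  \;=\; \Pr\bigl[A(\mathsf{Enc}_k(m)) = m^{i}\bigr]
  \;\ge\; \tfrac{1}{2} + \varepsilon(n)
```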
So, through this reduction, what we have established is this: if the adversary A could compute f(m), namely identify the ith bit of the underlying message with probability significantly better than half, then with exactly the same probability our algorithm A’ would win the COA game, and that contradicts our assumption that our encryption process has indistinguishable encryptions. When I say my encryption process has indistinguishable encryptions, that means no such A’ exists, which in turn implies that no such A exists; this is what the reduction formally establishes. Throughout this course we are going to do proofs like this, based on reductions. This is one of the simplest reductions, and you should make sure you understand it clearly.
That brings me to the end of this lecture. To summarize: in this lecture we introduced the notion of semantic security in the COA attack model. The original definition states that a scheme is COA-secure if the probability with which the adversary can compute some function of the underlying message by seeing a ciphertext is almost the same as the probability with which it could compute the same function without seeing the ciphertext. An equivalent version of this definition is the indistinguishability-based definition, where the requirement is that if the adversary sees an encryption of one message randomly chosen from a pair of messages known to the attacker, then the probability with which it can identify whether it is an encryption of the zeroth message or the first message is upper bounded by half plus negligible. This can also be stated in terms of distinguishing advantage: the attacker cannot tell whether the challenge ciphertext it sees in the experiment corresponds to m0 or to m1, except with a negligible advantage. We also saw an illustration showing that if your encryption process satisfies the indistinguishability-based definition, then the adversary cannot compute any of the underlying bits of the plaintext with probability significantly better than half. In that illustration we introduced reduction-based proofs, which are central to cryptography. I hope you enjoyed this lecture. Thank you.