Difference between revisions of "GenericAttacksMerkleDamgaard"

From The ECRYPT Hash Function Website
(Second Preimage Attacks)
Line 1: Line 1:
=== Second Preimage Attacks ===
+
= Collision style Attacks =
 +
In case a hash function is not considered as a black box, but built
 +
from compression functions (which in turn are considered as black
 +
boxes at this point), multi-collisions can be constructed more
 +
efficiently. Ideally, the effort to find $2^t$ single collisions
 +
should grow according to the birthday paradox: for $t \ll n/2$ the
 +
effort should grow almost linearly with each additional collision.
 +
What Joux showed in 2004~\cite{} is that for iterated constructions
 +
the effort to find a $2^t$-multicollision is actually $t*2^{n/2}$.
 +
The idea is to simply concatenate $t$ collisions found by a birthday
 +
attack (or by any other mean like shortcut attacks for that matter).
 +
Since each collision allows to pick a message out of a pair of
 +
messages, and this choice is available $t$ times, a set of $2^t$
 +
different messages consisting of $t$ message blocks can be
 +
constructed that all lead to the same hash value.
  
<bibtex>
+
 
@inproceedings{Kelsey2005SecondPreimageOn,
+
An application of Joux's multicollisions (also given in~\cite{}) is
  author    = {John Kelsey and Bruce Schneier},
+
the analysis of concatenated constructions. Assuming two hash
  title    = {Second Preimages on n-Bit Hash Functions for Much Less than $2^n$ Work.},
+
functions of output size $n$ each whose outputs is concatenated, one
  booktitle = {EUROCRYPT},
+
would ideally expect a security of $2^n$ against birthday based
  year      = {2005},
+
collision search attacks. Generating a $2^{n/2}$ multicollision for
  pages    = {474-490},
+
one of the hash functions is however enough to find a collision in
  url      = {http://dx.doi.org/10.1007/11426639_28},
+
the concatenated construction. This has a total cost of
  publisher = {Springer},
+
$2^{n/2+log(n)}$.
  series    = {LNCS},
+
 
  volume    = {3494},
+
As a historic note, it should be mentioned that Coppersmith's attack
  abstract  = {We expand a previous result of Dean [Dea99] to provide a second preimage attack on all n-bit iterated hash functions with Damgård-Merkle strengthening and n-bit intermediate states, allowing a second preimage to be found for a 2<sup>k</sup>-message-block message with about k × 2<sup>n/2+1</sup> + 2<sup>n - k + 1</sup> work. Using RIPEMD-160 as an example, our attack can find a second preimage for a 2<sup>60</sup> byte message in about 2<sup>106</sup> work, rather than the previously expected 2<sup>160</sup> work. We also provide slightly cheaper ways to find multicollisions than the method of Joux [Jou04]. Both of these results are based on expandable messages-patterns for producing messages of varying length, which all collide on the intermediate hash result immediately after processing the message. We provide an algorithm for finding expandable messages for any n-bit hash function built using the Damgård-Merkle construction, which requires only a small multiple of the work done to find a single collision in the hash function.},
+
in 1985~\cite{DBLP:conf/crypto/Coppersmith85} on the Davies-Price
}
+
variant~\cite{} of Rabin's scheme~\cite{} builds already on exactly
</bibtex>
+
this idea.
  
 
<bibtex>
 
<bibtex>
Line 33: Line 47:
 
</bibtex>
 
</bibtex>
  
'''Note:''' This artcle shows that second preimages can be found in much less than 2<sup>n</sup> work. This approach works for all iterated hash functions. Nevertheless, this attack is not practical since a huge amount of data is required.
+
 
 +
= Second Preimage Attacks =
 +
Discoveries about second preimage attacks on iterated hash functions
 +
span the last three decades. Merkle notes in 1979 that for messages
 +
of length $2^k$, the same number of different target hash values
 +
will speed-up the search for second preimages (of potentially
 +
different length) to $2^{n-k}$ trials. \todo{Winternitz,lai}.
 +
 
 +
 
 +
One of the reasons to include the message length as part of the
 +
message to be hashed in constructions since then, is to prevent
 +
these type of attacks. However, Dean~\cite{} describes in 1999 a way
 +
to circumvent this measure by used so-called expandable messages.
 +
Expandable messages are a set of messages of different lengths that
 +
all yield the same intermediate hash value.
 +
 
 +
Dean's construction only works for compression functions that have
 +
easily constructed fixed-points, \ie where it is easy to find a
 +
message block and an input chaining value that results into the same
 +
output chaining value. Many popular hash function construction
 +
indeed do have this property. In 2005, Kelsey and Schneier managed
 +
to remove this constraint and gave an algorithm to construct
 +
expandable messages for any compression function with an $n$-bit
 +
intermediate value. Their idea is to construct multicollisions out
 +
of collisions between message blocks of different length. From that,
 +
again example messages can be constructed and hence the search for
 +
second preimages is again of order $2^{n-k+1}$ word.
 +
 
 +
<bibtex>
 +
@inproceedings{Kelsey2005SecondPreimageOn,
 +
  author    = {John Kelsey and Bruce Schneier},
 +
  title    = {Second Preimages on n-Bit Hash Functions for Much Less than $2^n$ Work.},
 +
  booktitle = {EUROCRYPT},
 +
  year      = {2005},
 +
  pages    = {474-490},
 +
  url      = {http://dx.doi.org/10.1007/11426639_28},
 +
  publisher = {Springer},
 +
  series    = {LNCS},
 +
  volume    = {3494},
 +
  abstract  = {We expand a previous result of Dean [Dea99] to provide a second preimage attack on all n-bit iterated hash functions with Damgård-Merkle strengthening and n-bit intermediate states, allowing a second preimage to be found for a 2<sup>k</sup>-message-block message with about k × 2<sup>n/2+1</sup> + 2<sup>n - k + 1</sup> work. Using RIPEMD-160 as an example, our attack can find a second preimage for a 2<sup>60</sup> byte message in about 2<sup>106</sup> work, rather than the previously expected 2<sup>160</sup> work. We also provide slightly cheaper ways to find multicollisions than the method of Joux [Jou04]. Both of these results are based on expandable messages-patterns for producing messages of varying length, which all collide on the intermediate hash result immediately after processing the message. We provide an algorithm for finding expandable messages for any n-bit hash function built using the Damgård-Merkle construction, which requires only a small multiple of the work done to find a single collision in the hash function.},
 +
}
 +
</bibtex>
 +
 
 +
 
 +
= Preimage Attacks =
 +
 
 +
Herding attacks are a special kind of preimage attack, in the sense
 +
that an additional assumption is being made for the attack to work.
 +
The basic scenario in which herding attacks are applicable is as
 +
follows. At the cost of a pre-computation step, an attacker can
 +
commit to a digest of a hash function without yet knowing the input.
 +
In \cite{Kelsey2005HerdingHashFunctionsa}, this attack is described
 +
and shows that for all iterated hash functions the complexity is
 +
less than one would expect from an ideal hash function.
 +
 
 +
\begin{definition}[Resistance against herding attacks]
 +
Given a hash function $h$, the attacker may choose a digest $H$. If
 +
she is given $P$, it should not be possible to find $S$ such that
 +
$h(P||S)=H$ is considerably faster than by $2^n$ invocations of $h$.
 +
\end{definition}
 +
 
 +
For short suffixes, the workfactor for a herding attack on an
 +
iterative hash functions as shown in
 +
\cite{Kelsey2005HerdingHashFunctionsa} is $2^{(2n-5)/3}$. First a
 +
so-called diamond structure is built in a precomputation phase that
 +
results in a particular digest $H$. After $P$ is given to the
 +
attacker, a linking message $S_1$ is searched that connects $P$ with
 +
one of the edges of the diamond structure. Let's denote the path
 +
between the found entry point in the diamond structure and the
 +
digest $H$ at its end $S_2$, then the result string $S$ such that
 +
$h(P||S)=H$ is $S=S_1 || S_2$.
 +
 
 +
 
 +
Besides observing this theoretical weakness, we can also consider
 +
the feasibility in practice of this attack. In the case of SHA-1,
 +
and without partial knowledge of $P$, a pre-computation effort of
 +
$2^{107}$ would be needed to compute $H$. This requires about
 +
$2^{60}$ bits of storage for the required data-structure.
 +
Afterwards, $2^{107}$ effort would be needed to compute $S$ given a
 +
particular $P$, by search for a linking message block. This amounts
 +
to a total running time of $2^{108}$. If partial knowledge of $P$
 +
exists (as is the case when facing the challenge of predicting the
 +
outcome of presidential elections when MD5 is
 +
used~\cite{Stevens2008}), the attack can be much faster.
 +
 
 +
In order to exploit dedicated collision-search attacks on SHA-1, a
 +
collision search which is faster than about $2^{55.5}$ would be
 +
needed. Such a fast collision search would need to find a pair
 +
$(m,m^*)$  such that $h_c(cv_1,m)=h_c(cv_2,m^*)$ where the attacker
 +
has little control over the chaining variables $cv_1$ and $cv_2$.
 +
Such an algorithm is not known to date.
 +
 
 
----
 
----

Revision as of 17:37, 10 March 2008

1 Collision style Attacks

In case a hash function is not considered as a black box, but built from compression functions (which in turn are considered as black boxes at this point), multi-collisions can be constructed more efficiently. Ideally, the effort to find $2^t$ single collisions should grow according to the birthday paradox: for $t \ll n/2$ the effort should grow almost linearly with each additional collision. What Joux showed in 2004~\cite{} is that for iterated constructions the effort to find a $2^t$-multicollision is actually $t*2^{n/2}$. The idea is to simply concatenate $t$ collisions found by a birthday attack (or by any other mean like shortcut attacks for that matter). Since each collision allows to pick a message out of a pair of messages, and this choice is available $t$ times, a set of $2^t$ different messages consisting of $t$ message blocks can be constructed that all lead to the same hash value.


An application of Joux's multicollisions (also given in~\cite{}) is the analysis of concatenated constructions. Assuming two hash functions of output size $n$ each whose outputs is concatenated, one would ideally expect a security of $2^n$ against birthday based collision search attacks. Generating a $2^{n/2}$ multicollision for one of the hash functions is however enough to find a collision in the concatenated construction. This has a total cost of $2^{n/2+log(n)}$.

As a historic note, it should be mentioned that Coppersmith's attack in 1985~\cite{DBLP:conf/crypto/Coppersmith85} on the Davies-Price variant~\cite{} of Rabin's scheme~\cite{} builds already on exactly this idea.

Mridul Nandi, Douglas R. Stinson - Multicollision Attacks on Some Generalized Sequential Hash

Functions

IEEE Transactions on Information Theory 53(2):759-767,2007
http://dx.doi.org/10.1109/TIT.2006.889721
Bibtex
Author : Mridul Nandi, Douglas R. Stinson
Title : Multicollision Attacks on Some Generalized Sequential Hash Functions
In : IEEE Transactions on Information Theory -
Address :
Date : 2007


2 Second Preimage Attacks

Discoveries about second preimage attacks on iterated hash functions span the last three decades. Merkle notes in 1979 that for messages of length $2^k$, the same number of different target hash values will speed-up the search for second preimages (of potentially different length) to $2^{n-k}$ trials. \todo{Winternitz,lai}.


One of the reasons to include the message length as part of the message to be hashed in constructions since then, is to prevent these type of attacks. However, Dean~\cite{} describes in 1999 a way to circumvent this measure by used so-called expandable messages. Expandable messages are a set of messages of different lengths that all yield the same intermediate hash value.

Dean's construction only works for compression functions that have easily constructed fixed-points, \ie where it is easy to find a message block and an input chaining value that results into the same output chaining value. Many popular hash function construction indeed do have this property. In 2005, Kelsey and Schneier managed to remove this constraint and gave an algorithm to construct expandable messages for any compression function with an $n$-bit intermediate value. Their idea is to construct multicollisions out of collisions between message blocks of different length. From that, again example messages can be constructed and hence the search for second preimages is again of order $2^{n-k+1}$ word.

John Kelsey, Bruce Schneier - Second Preimages on n-Bit Hash Functions for Much Less than $2^n$ Work.

EUROCRYPT 3494:474-490,2005
http://dx.doi.org/10.1007/11426639_28
Bibtex
Author : John Kelsey, Bruce Schneier
Title : Second Preimages on n-Bit Hash Functions for Much Less than $2^n$ Work.
In : EUROCRYPT -
Address :
Date : 2005


3 Preimage Attacks

Herding attacks are a special kind of preimage attack, in the sense that an additional assumption is being made for the attack to work. The basic scenario in which herding attacks are applicable is as follows. At the cost of a pre-computation step, an attacker can commit to a digest of a hash function without yet knowing the input. In \cite{Kelsey2005HerdingHashFunctionsa}, this attack is described and shows that for all iterated hash functions the complexity is less than one would expect from an ideal hash function.

\begin{definition}[Resistance against herding attacks] Given a hash function $h$, the attacker may choose a digest $H$. If she is given $P$, it should not be possible to find $S$ such that $h(P||S)=H$ is considerably faster than by $2^n$ invocations of $h$. \end{definition}

For short suffixes, the workfactor for a herding attack on an iterative hash functions as shown in \cite{Kelsey2005HerdingHashFunctionsa} is $2^{(2n-5)/3}$. First a so-called diamond structure is built in a precomputation phase that results in a particular digest $H$. After $P$ is given to the attacker, a linking message $S_1$ is searched that connects $P$ with one of the edges of the diamond structure. Let's denote the path between the found entry point in the diamond structure and the digest $H$ at its end $S_2$, then the result string $S$ such that $h(P||S)=H$ is $S=S_1 || S_2$.


Besides observing this theoretical weakness, we can also consider the feasibility in practice of this attack. In the case of SHA-1, and without partial knowledge of $P$, a pre-computation effort of $2^{107}$ would be needed to compute $H$. This requires about $2^{60}$ bits of storage for the required data-structure. Afterwards, $2^{107}$ effort would be needed to compute $S$ given a particular $P$, by search for a linking message block. This amounts to a total running time of $2^{108}$. If partial knowledge of $P$ exists (as is the case when facing the challenge of predicting the outcome of presidential elections when MD5 is used~\cite{Stevens2008}), the attack can be much faster.

In order to exploit dedicated collision-search attacks on SHA-1, a collision search which is faster than about $2^{55.5}$ would be needed. Such a fast collision search would need to find a pair $(m,m^*)$ such that $h_c(cv_1,m)=h_c(cv_2,m^*)$ where the attacker has little control over the chaining variables $cv_1$ and $cv_2$. Such an algorithm is not known to date.