
edit: ignore below, I misinterpreted what q meant in this context and thought it was the private key.

>The clever trick is to compute a secure hash whose input includes the message to be signed and also the private key [...]

> PuTTY's technique worked by making a SHA-512 hash, and then reducing it mod q, where q is the order of the group used in the DSA system. For integer DSA (for which PuTTY's technique was originally developed), q is about 160 bits; for elliptic-curve DSA (which came later) it has about the same number of bits as the curve modulus, so 256 or 384 or 521 bits for the NIST curves.

I know hindsight is 20/20, but why did PuTTY implement it this way to begin with? Given the description in the first paragraph, I'd have naively implemented it as

    SHA-512(message || private_key)[:number_of_bits_required]
or if I was being paranoid I would have done

    SHA-512(SHA-512(message) || SHA-512(private_key))[:number_of_bits_required]
Moduloing it by q makes no sense unless you're trying to save a few cycles.



SHA-512(...)[:521] would still produce the same vulnerability: there would be 9 unchanging bits (assuming the [:521] would pad the 512 bits to 521). Those 9 guessable bits are enough to recover the key from 60 signatures, as the post explained in detail.
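
A quick Python illustration of that, with a made-up key, message and 521-bit modulus (not PuTTY's actual inputs):

    import hashlib

    q = 2**521 - 2**260 + 1  # stand-in for a 521-bit group order, not the real P-521 value
    digest = hashlib.sha512(b"made-up private key" + b"message to sign").digest()

    k = int.from_bytes(digest, "big")  # at most 512 bits
    print(k.bit_length() <= 512)       # True: the top 9 bits of a 521-bit nonce are always zero
    print(k % q == k)                  # True: reducing a 512-bit value mod a 521-bit q changes nothing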

A more interesting question (while we are on the 20/20 hindsight express) is why the dsa_gen_k() function did not include an assert(digest_len <= 512).


It appears that some of this was designed in the late 90s/early aughts, when Windows didn't have a cryptographically-secure random number generator.

PuTTY carries its own formats from this time.


You don't need a CSPRNG to do SHA512, which their solution already incorporates.


This is the text from the original article.

"For this reason, since PuTTY was developed on Windows before it had any cryptographic random number generator at all, PuTTY has always generated its k using a deterministic method, avoiding the need for random numbers at all."


Both my proposed naive method and the PuTTY implementation are deterministic. The only difference is how the primitives (e.g. SHA-512) are combined.


The problem is that 521 > 512, not that they used modulo. To get a sufficiently large nonce out of the hash, you need to think in terms of expanding the number of bits, not reducing it.


Ah okay, after re-reading the message it looks like using the SHA-512 result directly (without the modulo) would still have the issue. I originally thought the problem was reducing mod the key, which was approximately 521 bits.

> In all of those cases except P521, the bias introduced by reducing a 512-bit number mod q is negligible. But in the case of P521, where q has 521 bits (i.e. more than 512), reducing a 512-bit number mod q has no effect at all – you get a value of k whose top 9 bits are always zero.


You'd want something like this:

  Truncate(
    SHA-512(0x01 || message || private_key)
      || SHA-512(0x02 || message || private_key),
    bitsNeeded
  )
Two separate hashes, with domain separation, that produce an output of at least n+64 bits (if it is to be reduced mod 2^n - k, for some small integer k). In this case, 1024 bits reduced mod 2^521-1 is safe.

Even better, though, is to just use RFC 6979 and not implement it yourself.
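
For illustration only, a rough Python sketch of the two-hash construction above, with placeholder inputs and a stand-in 521-bit group order:

    import hashlib

    def derive_k(message: bytes, private_key: bytes, q: int) -> int:
        wide = (hashlib.sha512(b"\x01" + message + private_key).digest()
                + hashlib.sha512(b"\x02" + message + private_key).digest())  # 1024 bits
        return int.from_bytes(wide, "big") % q  # bias ~2^-503 for a 521-bit q, i.e. negligible

    q = 2**521 - 2**260 + 1  # placeholder 521-bit group order, not the real P-521 value
    k = derive_k(b"message to sign", b"made-up private key", q)
    print(0 < k < q)  # True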


Presumably because 512 bits (a) seemed "random enough" and (b) was the nicest block size that fit comfortably in 521 bits of modulus. This is a common mistake.


From TFA it seems more like they already had the SHA512-based implementation for DSA (where it was fine), and reused it when implementing ECDSA without realizing that it wasn't suitable in situations with moduli larger than 512 bits.


I'd have to look at the code, but a DSA modulus is much larger than a P-521 modulus. Maybe it just happened to line up nicely?


The nonce is taken modulo the order of the prime-order subgroup. For DSA that's generally a 256ish-bit prime (e.g.: choose a 1024-bit prime p such that a 256-bit prime q divides p-1; then there exists an order-q subgroup of Zp).

For P-521, the base field is 2^521 - 1, but the modulus used when computing the nonce is not that value; it's the order of the P-521 curve. By Hasse's theorem, that's roughly p +- sqrt(p), which is essentially p for such large numbers (the cofactor of P-521 is 1, so the order of the group is prime).

So: both are 521-bit numbers, but the group order is less than 2^521-1. Its hex representation is 0x01fffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffa51868783bf2f966b7fcc0148f709a5d03bb5c9b8899c47aebb6fb71e91386409.
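
If you want to check the numbers, a quick Python sanity check using the value quoted above (assuming it was pasted correctly):

    p = 2**521 - 1  # P-521 base field prime
    n = 0x01fffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffa51868783bf2f966b7fcc0148f709a5d03bb5c9b8899c47aebb6fb71e91386409  # group order, as quoted above

    print(p.bit_length(), n.bit_length())  # expect 521 521: both are 521-bit numbers
    print(n < p)                           # expect True: the group order is below 2^521 - 1
    print((p - n).bit_length())            # expect roughly 260: the gap is on the order of sqrt(p), per Hasse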


Ahh. Thanks! (I don't think about FFDLP much, as you can see).


Isn't modulo the same as truncation when dealing with powers of two?


Caution here. If your modulus is too close to the maximum truncated value, there can be a bias in the upper bits, too. For example, if you reduce a number between 0 and 15 by the modulus 13, the values 0, 1 and 2 will be twice as likely as the values 3-12. This means that the highest bit will be 0 in 11 out of 16 cases. Even such a small bias might be exploitable (for example, sub 1 bit bias up to 160-bit ECDSA here: Gao, Wang, Hu, He https://eprint.iacr.org/2024/296.pdf)
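
The same example, spelled out in a couple of lines of Python:

    from collections import Counter

    print(Counter(x % 13 for x in range(16)))       # 0, 1 and 2 appear twice; 3..12 once each
    print(sum(1 for x in range(16) if x % 13 < 8))  # 11: the top bit is 0 in 11 of 16 cases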


Neither 15 nor 13 are powers of two.


The interval [0, 15] represents 16 possible values, which is a power of 2.

The correct way to get an unbiased distribution from a sample of 2^x values onto a modulus that is not a power of 2 is to use rejection sampling.

This is what RFC 6979 says to do https://datatracker.ietf.org/doc/html/rfc6979#section-3.2

But you can also see this technique in CSPRNG code, e.g. https://github.com/php/php-src/blob/d40726670fd2915dcd807673...
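
A small sketch of the rejection-sampling idea in Python (not RFC 6979 itself, which also pins down how the candidate bits are generated deterministically via HMAC):

    import secrets

    def sample_mod_q(q: int) -> int:
        """Uniform sample in [1, q-1] by rejection, avoiding modulo bias."""
        while True:
            candidate = secrets.randbits(q.bit_length())  # uniform over [0, 2^bits)
            if 1 <= candidate < q:                        # out-of-range draws are simply retried
                return candidate

    print(sample_mod_q(13))  # each of 1..12 is equally likely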


I missed that it was [0,15].

This doesn't make 13 a power of two. I'm aware of rejection sampling; my point was that if you have an N-bit value X and want M bits, truncating X to M bits and X MOD 2^M are the same. Neither solves the problem where M > N, which is what TFA is about.
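
Concretely, in Python:

    X, M = 0b1101_0110, 5
    print((X & ((1 << M) - 1)) == X % (1 << M))  # True: keeping the low M bits == X mod 2^M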


> This doesn't make 13 a power of two.

Where did I imply that it is?

> I'm aware of rejection sampling; my point was that if you have an N-bit value X and want M bits, truncating X to M bits and X MOD 2^M are the same.

Sure.

> Neither solves the problem where M > N, which is what TFA is about.

If you observe my other comments, you'll see I'm well aware of what the article is about.


> Where did I imply that it is?

You used 13 as an example in a response to my comment that was:

Isn't modulo the same as truncation when dealing with powers of two?


> You used 13 as an example

I don't see the number 13 in any of my comments on this thread (except this one, or where I quoted you). Perhaps you are confusing me with someone else?


Ah yes, that was lambdafu


But 16 is.


Good point. I misread the OP and thought q was the key, but really it just corresponded to the key length (e.g. 521 for P-521).



