edit: ignore below, I misinterpreted what q meant in this context and thought it was the private key.
>The clever trick is to compute a secure hash whose input includes the message to be signed and also the private key [...]
> PuTTY's technique worked by making a SHA-512 hash, and then reducing it mod q, where q is the order of the group used in the DSA system. For integer DSA (for which PuTTY's technique was originally developed), q is about 160 bits; for elliptic-curve DSA (which came later) it has about the same number of bits as the curve modulus, so 256 or 384 or 521 bits for the NIST curves.
I know hindsight is 20/20, but why did PuTTY implement it this way to begin with? Given the description in the first paragraph, I'd naively have implemented it as
SHA-512(...)[:521] would still produce the same vulnerability: there would be 9 unchanging bits (assuming the [:521] pads the 512 hash bits out to 521). Those 9 guessable bits are enough to recover the key from 60 signatures, as the post explained in detail.
A more interesting question (while we are on the 20/20 hindsight express) is why the dsa_gen_k() function did not include an assert(digest_len <= 512).
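In hindsight, a guard like that is a one-liner. A minimal sketch (hypothetical helper, not PuTTY's actual code) of a deterministic-nonce function with the check:

```python
import hashlib

def gen_k(q: int, digest: bytes) -> int:
    # Hypothetical guard: refuse to derive a nonce when the group order q
    # is wider than the hash output, because then `mod q` is a no-op and
    # the top bits of k are predictably zero.
    assert q.bit_length() <= 8 * len(digest), "digest too narrow for this q"
    return int.from_bytes(digest, "big") % q

digest = hashlib.sha512(b"message || private key").digest()

k = gen_k((1 << 160) + 7, digest)   # fine: 161-bit q, 512-bit digest
assert 0 <= k < (1 << 160) + 7

raised = False
try:
    gen_k((1 << 521) - 1, digest)   # a 521-bit modulus trips the guard
except AssertionError:
    raised = True
assert raised
```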
"For this reason, since PuTTY was developed on Windows before it had any cryptographic random number generator at all, PuTTY has always generated its k using a deterministic method, avoiding the need for random numbers at all."
The problem is that 521 > 512, not that they used modulo. To get a sufficiently large nonce out of the hash, you need to think in terms of expanding the number of bits, not reducing it.
Ah okay, after re-reading the message it looks like using the SHA-512 result directly (without the modulo) would still have the issue. I originally thought the problem was moduloing by the key, which was approximately 521 bits.
> In all of those cases except P521, the bias introduced by reducing a 512-bit number mod q is negligible. But in the case of P521, where q has 521 bits (i.e. more than 512), reducing a 512-bit number mod q has no effect at all – you get a value of k whose top 9 bits are always zero.
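This is easy to see directly: any 512-bit value is already smaller than a 521-bit q, so the reduction returns its input unchanged. A quick check in Python, with q hard-coded to the P-521 group order:

```python
import hashlib

# Order of the P-521 group (a 521-bit prime).
q = 0x01fffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffa51868783bf2f966b7fcc0148f709a5d03bb5c9b8899c47aebb6fb71e91386409

k = int.from_bytes(hashlib.sha512(b"message || private key").digest(), "big")

assert k % q == k        # the mod is a no-op: k < 2^512 < q
assert k >> 512 == 0     # so the top 9 of the nonce's 521 bits are always zero
```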
Two separate hashes, with domain separation, that produce an output of at least n+64 bits (if it is to be reduced mod 2^n - k, for some small integer k). In this case, 1024 bits reduced mod 2^521-1 is safe.
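An illustrative sketch of that construction (the domain-separation labels are made up; this is not PuTTY's actual fix): two domain-separated SHA-512 calls yield 1024 bits, and reducing a 1024-bit value mod a 521-bit q leaves a bias on the order of 2^-503, which is negligible.

```python
import hashlib

# Order of the P-521 group (a 521-bit prime).
q = 0x01fffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffa51868783bf2f966b7fcc0148f709a5d03bb5c9b8899c47aebb6fb71e91386409

def derive_k(data: bytes) -> int:
    # Two hashes with distinct (made-up) prefixes for domain separation.
    h1 = hashlib.sha512(b"nonce-part-1:" + data).digest()
    h2 = hashlib.sha512(b"nonce-part-2:" + data).digest()
    wide = int.from_bytes(h1 + h2, "big")  # 1024 bits of hash output
    return wide % q                        # now the reduction actually reduces

k = derive_k(b"message || private key")
assert 0 <= k < q
assert derive_k(b"a") != derive_k(b"b")   # different inputs, different nonces
```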
Even better, though, is to just use RFC 6979 and not implement it yourself.
Presumably because 512 bits (a) seemed "random enough" and (b) was the nicest hash output size that fit comfortably inside the 521-bit modulus. This is a common mistake.
From TFA it seems more like they already had the SHA512-based implementation for DSA (where it was fine), and reused it when implementing ECDSA without realizing that it wasn't suitable in situations with moduli larger than 512 bits.
The nonce is taken modulo the order of the prime-order subgroup. For DSA that's generally a 256ish-bit prime (e.g.: choose a 1024-bit prime p such that a 256-bit prime q divides p-1; then there exists an order-q subgroup of Zp).
For P-521, the base field is 2^521 - 1, but the modulus used when computing the nonce is not that value, it's the order of the P-521 curve. By Hasse's theorem, that's roughly p +- sqrt(p), which is essentially p for such large numbers (the cofactor of P-521 is 1, so the order of the group is prime).
So: both are 521-bit numbers, but the group order is less than 2^521-1. Its hex representation is 0x01fffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffa51868783bf2f966b7fcc0148f709a5d03bb5c9b8899c47aebb6fb71e91386409.
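Both claims are quick to verify (the order is written compactly here; it's the same constant as the hex above):

```python
p = 2**521 - 1   # the P-521 base field prime

# The group order quoted above: 0x01, then 65 f's, then the distinctive tail.
n = int("01" + "f" * 65
        + "a51868783bf2f966b7fcc0148f709a5d03bb5c9b8899c47aebb6fb71e91386409", 16)

assert n.bit_length() == 521   # same width as the field prime
assert n < p                   # but strictly smaller than 2^521 - 1
```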
Caution here. If your modulus is too close to the maximum truncated value, there can be a bias in the upper bits, too. For example, if you reduce a number between 0 and 15 by the modulus 13, the values 0, 1 and 2 will be twice as likely as the values 3-12. This means that the highest bit will be 0 in 11 out of 16 cases. Even such a small bias might be exploitable (for example, sub 1 bit bias up to 160-bit ECDSA here: Gao, Wang, Hu, He https://eprint.iacr.org/2024/296.pdf)
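The 0-15 mod 13 example is small enough to enumerate exhaustively:

```python
from collections import Counter

counts = Counter(x % 13 for x in range(16))

assert counts[0] == counts[1] == counts[2] == 2    # 13, 14, 15 wrap to 0, 1, 2
assert all(counts[v] == 1 for v in range(3, 13))   # 3..12 occur once each

# The high bit (value >= 8) of the reduced result is 0 in 11 of 16 cases.
high_bit_zero = sum(1 for x in range(16) if (x % 13) < 8)
assert high_bit_zero == 11
```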
This doesn't make 13 a power of two. I'm aware of rejection sampling; my point was that if you have an N-bit value X and want M bits, truncating X to M bits and X mod 2^M are the same. Neither solves the problem where M > N, which is what TFA is about.
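That equivalence, and its failure to do anything when M > N, can both be checked directly:

```python
import hashlib

M = 160
x = int.from_bytes(hashlib.sha512(b"example").digest(), "big")  # a ~512-bit value

# Keeping the low M bits and reducing mod 2^M are the same operation.
assert x % (1 << M) == x & ((1 << M) - 1)

# And when the modulus is wider than x (M > N), both leave x unchanged.
assert x % (1 << 521) == x
```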
I don't see the number 13 in any of my comments on this thread (except this one, or where I quoted you). Perhaps you are confusing me with someone else?
I know hindsight is 20/20, but why did PuTTY implement it this way to begin with? Given the description in the first paragraph, I'd naively have implemented it as the hash output used directly, or if I was being paranoid I would have done something wider. Moduloing it by q makes no sense unless you're trying to save a few cycles.