
Note that this is equivalent to computing `upperbound * randbits(nbits + 64) >> (nbits + 64)` combined with an early-exit optimization, albeit one that becomes less effective as `upperbound` increases: if `upperbound = 2^nbits - 1`, the probability of using the early exit is basically zero. (As the author points out, Lemire's optimization may be used to improve this exit test).
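To make the shape of that concrete, here's a rough sketch of the 64-bit case (my own code, not the PR's; the function name and the exact form of the exit test are my assumptions), where the early exit fires whenever the low word of the product can no longer receive a carry from later random words, and the computation truncates after two words as in the formula above:

    // Sketch of floor(upperBound * x), where x is a random real in [0, 1)
    // revealed 64 bits at a time and truncated after two words.
    func boundedRandomSketch<R: RandomNumberGenerator>(
      _ upperBound: UInt64, using rng: inout R
    ) -> UInt64 {
      precondition(upperBound > 0)
      // First word: product = high * 2^64 + low
      let m1 = rng.next().multipliedFullWidth(by: upperBound)
      // Early exit: later words add strictly less than upperBound to the low
      // word, so if that addition can't carry, the high word is already final.
      if m1.low <= 0 &- upperBound { return m1.high }
      // Otherwise draw one more word and fold in the carry, if any.
      let m2 = rng.next().multipliedFullWidth(by: upperBound)
      let (_, carry) = m1.low.addingReportingOverflow(m2.high)
      return m1.high &+ (carry ? 1 : 0)
    }

With `upperbound` close to 2^nbits, the test `m1.low <= 0 &- upperBound` almost never passes, which is exactly the "less effective" behavior described above.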

The early exit optimization, however, may leak information about the random output, since it tells someone with access to timing information whether one or two random words were used. With standard rejection sampling, the entire sample is discarded whenever it would introduce bias, so the output is independent of the number of samples drawn. Here, however, certain outputs cannot be produced by the early exit (for sufficiently large upper bounds), so an attacker who observes an early exit can narrow down the random number.

I think that, to avoid this timing attack vector, the early exit would need to be eliminated - in which case the algorithm would perform no better (and usually worse) than Lemire's approach in terms of expected calls to the random number generator.
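For reference, Lemire's nearly-divisionless method (from "Fast Random Integer Generation in an Interval") looks roughly like this - my own transcription, with a hypothetical name. It's unbiased at every iteration, usually makes a single RNG call, and only reaches the `%` on the slow path:

    // Lemire: unbiased, one RNG call in the common case, but with a
    // data-dependent rejection loop (and one division on the slow path).
    func lemireBounded<R: RandomNumberGenerator>(
      _ upperBound: UInt64, using rng: inout R
    ) -> UInt64 {
      precondition(upperBound > 0)
      var m = rng.next().multipliedFullWidth(by: upperBound)
      if m.low < upperBound {                          // possibly in the biased region
        let threshold = (0 &- upperBound) % upperBound // = 2^64 mod upperBound
        while m.low < threshold {                      // reject and redraw
          m = rng.next().multipliedFullWidth(by: upperBound)
        }
      }
      return m.high
    }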



Right, this is a generator that's used in contexts where a timing side-channel is not a concern, but it's still a great distinction to point out.

More broadly, the Swift language--like almost every other language--doesn't have any notion of constant-time operation, so it's not clear what you would do with an RNG free of timing side channels if you had one; use it in a computation that can't make any guarantees about side channels?*

You're correct that the expected number of calls collapses to 2 (matching Lemire's worst case) even in that case, but (for our purposes) it's a "better" 2 (at least with most random sources), since (a) it still never divides, (b) it is branch-free, so it never mispredicts, and (c) it's always exactly 2, instead of 1 with probability 1/2, 2 with probability 1/4, 3 with probability 1/8, and so on.
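Concretely, the "always 2" version is just the truncated computation with the early exit removed - consume exactly two words and compute floor(upperbound * r / 2^128) for the 128-bit r they form (a rough sketch under those assumptions, not the actual PR code):

    // Branch-free variant: exactly two 64-bit words per call, no division.
    func boundedRandomTwoWords<R: RandomNumberGenerator>(
      _ upperBound: UInt64, using rng: inout R
    ) -> UInt64 {
      precondition(upperBound > 0)
      let m1 = rng.next().multipliedFullWidth(by: upperBound)
      let m2 = rng.next().multipliedFullWidth(by: upperBound)
      // A carry out of (m1.low + m2.high) is the only way the second word can
      // change the result; m2.low sits below the precision we keep.
      let (_, carry) = m1.low.addingReportingOverflow(m2.high)
      return m1.high &+ (carry ? 1 : 0)
    }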

* I am generally very (probably too) negative on “best effort” constant-time operations in high-level languages; having seen innumerable schemes undone by compiler optimizations, I’m jaded. If you need to eliminate side channels, use hardware designed to do so.

[All that said, I added a note to the PR to make this clear.]


Cryptographic operations come to mind. I could see someone reaching for this function to get a uniform value below some prime, for example, which might be reasonable if it were backed by a good CSPRNG for the underlying bits - and a good CSPRNG might well be slow enough to make this timing channel easily observable.

Some of the good modern crypto libraries make fairly strong guarantees about constant-time or at least data-independent operations. Granted, these may not be 100% side-channel-free thanks to processor bugs, but at least those are fairly time-intensive to attack. In any case, the concern isn’t the variable number of calls itself, but the fact that the number of calls could leak information about the result.

EDIT: Well, two of the places your PR has been linked are RustCrypto (https://github.com/RustCrypto/utils/issues/453) and OpenSSL (https://github.com/openssl/openssl/issues/16500) - places where I definitely would not want to see a significant timing vulnerability. I wish that the folks posting this to crypto libraries were more aware that this timing issue exists!

To demonstrate the problem concretely: suppose the 64-bit upper bound is `2^64 * 3/4 = 13835058055282163712`. In that case, if the early exit is triggered, the result modulo 3 must be 0 or 2 (with 50% probability for each); numbers congruent to 1 mod 3 cannot come out of the early exit. By contrast, Lemire's method produces a uniform distribution over its outputs at every iteration. Lemire's method does have a smaller timing leak of its own, in that the division doesn't always happen - and since divisions are expensive, this could be significant, but it won't outweigh the cost of an unbuffered read of urandom, for example.
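A scaled-down version is easy to check exhaustively. Here's a hypothetical 8-bit analog (upperbound = 192 = 2^8 * 3/4), assuming the early-exit test is "the low byte of the product can never receive a carry from later bytes"; enumerating every possible first byte shows that residue 1 mod 3 never survives the early exit, while 0 and 2 each account for half of the early exits:

    // 8-bit analog of the 64-bit example above: upperBound = 192 = 2^8 * 3/4.
    let upperBound: UInt16 = 192
    var counts = [0, 0, 0]              // early exits per residue mod 3
    for w: UInt16 in 0...255 {
      let product = w * upperBound      // fits in UInt16 (max 255 * 192)
      let hi = product >> 8             // candidate result
      let lo = product & 0xFF           // fractional part, in units of 2^-8
      if lo <= 256 - upperBound {       // early exit: later bytes can't carry
        counts[Int(hi % 3)] += 1
      }
    }
    print(counts)                       // [64, 0, 64]: residue 1 never exits early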


Having worked on implementations of side-channel-resistant crypto libraries, I’d say processor bugs mostly don’t scare me, but uArch and compiler optimizations very much do.



