
"Nobody knows what entropy really is, so in a debate you will always have the advantage."

~ Claude Shannon

I've become more and more uncomfortable with the word "entropy" as used in cryptography. The word "random" has always been problematic. I try to use "unpredictable" instead, because I have half a clue what that word means.

I think "entropy" is used to mean "unpredictable in principle", which would exclude chaotic processes like CPU jitter. Chaotic processes like the weather are predictable in principle; but they are so complex that in practice they are unpredictable. Quantum processes such as radioactive decay or tunneling seem to be unpredictable in principle. If it's unpredictable, then to my mind it's sufficiently unpredictable for cryptography.
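To make "unpredictable" quantitative, cryptographers usually talk about min-entropy rather than Shannon entropy: the worst-case cost of guessing a symbol, not the average-case information content. A small sketch of the standard definitions (nothing Intel- or Linux-specific):

```python
import math

def shannon_entropy(probs):
    """Average-case entropy in bits per symbol."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def min_entropy(probs):
    """Worst-case (guessing) entropy: -log2 of the most likely symbol."""
    return -math.log2(max(probs))

# A biased bit source: 0 with probability 0.9, 1 with probability 0.1.
biased = [0.9, 0.1]
print(shannon_entropy(biased))  # ~0.469 bits per output bit
print(min_entropy(biased))      # ~0.152 bits per output bit
```

The gap matters: the biased source still delivers almost half a bit of Shannon entropy per bit, but an attacker guessing "0" every time is right 90% of the time, which is what min-entropy captures.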

The Intel HWRNG depends on chaos - the jitter in gate propagation times. I'd settle for that, if it weren't for the fact that the raw chaos from the process isn't available for inspection - you can only see the result of obfuscating it using AES. How much bias is there in the raw bitstream from the unencrypted HWRNG output? I have no way of finding out.

Entropy accounting in the Linux RNG seems to be somewhere between heuristics and guesswork. It seems to be awfully complicated, given that it comes down to finger-in-the-air estimates.

/me not a cryptographer, mathematician or physicist.



You need to get access to the raw entropy stream in order to characterize it and test it under a number of different situations. At Cryptography Research, we did a number of reviews of hardware entropy sources.

You have to look into behavior during very early startup (power-on reset), suspend/resume from low-power states, under high heat and thermal shutdown, as well as during stable operation. You examine different samples of chips to check for production variation. You build software models that try to simulate the underlying hardware behavior to see how close they get to predicting outputs (a bad sign if it works too well!)
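The "build a model and see how well it predicts outputs" test can be sketched in a few lines. This is only a toy illustration (a simple order-2 context predictor, not anything Cryptography Research actually used): a good entropy source should leave such a predictor stuck near a 50% hit rate, while any exploitable structure pushes it higher.

```python
import random
from collections import Counter, defaultdict

def predictor_hit_rate(bits, order=2):
    """Predict each bit from the preceding `order` bits using counts
    observed so far; return the fraction of correct predictions."""
    counts = defaultdict(Counter)
    hits = total = 0
    for i in range(order, len(bits)):
        ctx = tuple(bits[i - order:i])
        if counts[ctx]:
            guess = counts[ctx].most_common(1)[0][0]
            hits += (guess == bits[i])
            total += 1
        counts[ctx][bits[i]] += 1
    return hits / total

rng = random.Random(1234)
fair = [rng.getrandbits(1) for _ in range(20000)]
alternating = [i % 2 for i in range(20000)]
print(predictor_hit_rate(fair))         # near 0.5: no structure found
print(predictor_hit_rate(alternating))  # near 1.0: fully predictable
```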

You then review how the system processes this entropy since ideas that look good to hardware engineers (like a string filter) are actually really bad for entropy. You analyze the path for side channels or race conditions that could leak raw entropy across process boundaries.
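To see why a run-limiting "string filter" is bad for entropy, here's a toy demonstration (my own sketch, not the actual hardware design): after the filter, a pattern that a fair source emits regularly can never occur, so an attacker who has just seen a maximal run knows the next bit with certainty.

```python
import random

def run_filter(bits, max_run=4):
    """Hardware-style 'string filter': drop bits that would extend a run
    beyond max_run identical symbols (an idea that *reduces* entropy)."""
    out = []
    for b in bits:
        if len(out) >= max_run and all(x == b for x in out[-max_run:]):
            continue  # silently discard: long runs never appear downstream
        out.append(b)
    return out

rng = random.Random(42)
raw = [rng.getrandbits(1) for _ in range(100000)]
filtered = run_filter(raw)

# Count windows of 5 identical bits. After the filter there are none,
# so after any run of 4 the next bit is 100% predictable: it must differ.
runs5 = sum(
    1 for i in range(len(filtered) - 4)
    if len(set(filtered[i:i + 5])) == 1
)
print(runs5)  # 0: a pattern a fair source emits regularly never occurs
```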

Anyway, here are the reports:

https://www.rambus.com/wp-content/uploads/2015/08/IntelRNG.p... (1999)

https://www.rambus.com/wp-content/uploads/2015/08/VIA_rng.pd... (2003)

https://web.archive.org/web/20141230024150/http://www.crypto... (2012)


> ~ Claude Shannon

I misattributed that quote. It's from a conversation between Shannon and Von Neumann; Shannon attributed the quote to Von Neumann.

  I thought of calling it "information", but the word was overly used, so I decided to call it "uncertainty". [...] Von Neumann told me, "You should call it entropy, for two reasons. In the first place your uncertainty function has been used in statistical mechanics under that name, so it already has a name. In the second place, and more important, nobody knows what entropy really is, so in a debate you will always have the advantage."
https://en.wikipedia.org/wiki/Entropy#Entropy_of_a_system

Sorry.


> I think "entropy" is used to mean "unpredictable in principle", which would exclude chaotic processes like CPU jitter. Chaotic processes like the weather are predictable in principle; but they are so complex that in practice they are unpredictable.

Chaotic systems are practically unpredictable because they have an exponential dependence on their initial conditions. It has nothing to do with complexity -- a double pendulum is a very simple system and is still chaotic.

In other words, as time evolves, you need more and more bits of precision in the initial conditions to keep predicting the future state correctly. As soon as your estimate starts drifting away from the real state, the gap between prediction and reality grows exponentially.

Since there is a physical limit to the accuracy with which anything can be measured, at some point it becomes physically impossible to keep predicting the future state accurately.
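This is easy to demonstrate with the logistic map, an even simpler chaotic system than the double pendulum (chosen here only because it's one line of code): two trajectories that start a trillionth apart disagree completely within a few dozen steps.

```python
def logistic(x, r=4.0):
    """One step of the logistic map, a minimal chaotic system."""
    return r * x * (1.0 - x)

# Two initial conditions differing only in the 12th decimal place.
a, b = 0.3, 0.3 + 1e-12
gap = 0.0
for step in range(80):
    a, b = logistic(a), logistic(b)
    gap = max(gap, abs(a - b))

print(gap)  # the tiny initial difference has grown to order 1
```

With a Lyapunov exponent of about ln 2 per step, every iteration effectively consumes one more bit of initial-condition precision, so a 12-decimal-place measurement buys roughly 40 steps of prediction and no more.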


> I've become more and more uncomfortable with the word "entropy" as used in cryptography.

I don't like the word either. On the other hand, the meaning in the kernel seems clear: given a threat model of the attacker, it's the number of bits that could be generated from internal state and believed to be random, i.e. not predictable by the attacker.

> The word "random" has always been problematic. I try to use "unpredictable" instead, because I have half a clue what that word means.

Yes, "random" has many meanings. Sometimes it simply means that a variable has non-repeating values with a few desired statistical properties. That is not enough in cryptography, where we care about the attacker, so "random" there is much more demanding.

If a bit stream looks unpredictable but is known to be produced purely algorithmically, it is called not random but pseudo-random, because in principle, with enough knowledge of the internal state, it can be predicted.
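Python's Mersenne Twister makes this concrete (a toy demonstration, not an attack on any real system): its output passes statistical tests, yet anyone who captures the generator's internal state can reproduce every future output exactly.

```python
import random

gen = random.Random(2024)
observed = [gen.getrandbits(32) for _ in range(5)]

# An "attacker" who captures the internal state predicts everything that follows.
attacker = random.Random()
attacker.setstate(gen.getstate())
prediction = [attacker.getrandbits(32) for _ in range(5)]
future = [gen.getrandbits(32) for _ in range(5)]
print(prediction == future)  # True: given the state, nothing is unpredictable
```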

> I think "entropy" is used to mean "unpredictable in principle", which would exclude chaotic processes like CPU jitter.

"In principle" is too strong and too vague. We have no way to test and verify that a source is unpredictable in principle; we can only pronounce it so based on the current state of knowledge. In practice, the criterion is always "unpredictable in practice". A source like CPU jitter may be declared unpredictable in practice; that is a judgment call by the implementor.
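A toy version of a jitter-based source looks something like this (loosely in the spirit of jitterentropy, but my own sketch; a real design adds health tests and a justified entropy estimate before any conditioning):

```python
import time
import hashlib

def jitter_bytes(n_samples=2048):
    """Toy CPU-jitter collector: time a fixed busy loop repeatedly,
    keep the low byte of each delta, then condition with a hash.
    Illustration only -- no entropy claim is being made here."""
    deltas = bytearray()
    for _ in range(n_samples):
        t0 = time.perf_counter_ns()
        x = 0
        for i in range(100):   # fixed workload whose runtime still jitters
            x ^= i
        deltas.append((time.perf_counter_ns() - t0) & 0xFF)
    # Conditioning step: compress the biased raw samples into 32 bytes.
    return hashlib.sha256(bytes(deltas)).digest()

print(jitter_bytes().hex())
```

Whether those timing deltas actually carry unextractable-by-the-attacker uncertainty is exactly the judgment call being discussed; the code can only collect and condition, not prove.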

> Chaotic processes like the weather are predictable in principle; but they are so complex that in practice they are unpredictable.

Chaotic processes like the weather are, in theory, continuous processes, i.e. their description is based on real numbers. We can measure such numbers only with an unknown non-zero error, which makes predictions carry errors too, which makes fine enough details of these processes unpredictable. Quantum phenomena bring nothing qualitatively better in unpredictability here; classical thermal noise is already unpredictable.

> The Intel HWRNG depends on chaos - the jitter in gate propagation times. I'd settle for that, if it weren't for the fact that the raw chaos from the process isn't available for inspection - you can only see the result of obfuscating it using AES. How much bias is there in the raw bitstream from the unencrypted HWRNG output? I have no way of finding out.

How do they obfuscate it using AES? Maybe the AES obfuscation could be reversed?

> Entropy accounting in the Linux RNG seems to be somewhere between heuristics and guesswork. It seems to be awfully complicated, given that it comes down to finger-in-the-air estimates.

This seems to be the state of things. Kernel programmers are constantly dabbling in RNG science when they shouldn't be - they end up changing the kernel RNG design and breaking stuff. I don't expect the kernel to provide a high-quality RNG; I expect it to boot and get out of my way.
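For what it's worth, the kernel's running estimate is exposed to userspace, so you can at least watch the accounting (on recent kernels, since the 5.17/5.18 RNG rework, this typically just reads a constant 256 once the CRNG is ready):

```python
import os

# The kernel's estimate of available pool entropy, in bits.
path = "/proc/sys/kernel/random/entropy_avail"
if os.path.exists(path):
    with open(path) as f:
        entropy_bits = int(f.read())
else:
    entropy_bits = -1  # not on Linux, or /proc unavailable
print(entropy_bits)
```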


> whose description is based on real numbers.

Yes, I get it. But you mean real real numbers from the real world, not the representations used in computers. Real real numbers have infinite precision, which means you can't measure a real-world quantity exactly.

cyphar (parent's sibling) said:

> a double pendulum is a very simple system and is still chaotic

You're right, it's nothing to do with complexity. It's that immeasurably small changes in initial conditions lead to huge differences in the way a system develops. So a chaotic system that depends on real-world real quantities is in principle unpredictable, because you can't measure real-world reals precisely.

Thanks people, for nailing a stake through my misguided notion that you need a quantum process to be sure your RNG is immune to prediction by some hypothetical arbitrarily-powerful computer.



