
Von Neumann's solution is clever, but extremely inefficient in terms of how many flips you need per random bit produced.

More interesting methods produce a stream of unbiased random bits out of a stream of flips; and you can look at the asymptotic efficiency.

A simple method that does better than von Neumann's method:

Produce 1,000 coin flips. Say you produce k heads and 1,000-k tails.

Produce a table of all possible sequences of length 1,000 with k heads and 1,000-k tails.

Look up the position of your actual realized sequence in that table. The index of that position is a uniform random integer in a range from 1 to the size of the table.

Use your favourite method to extract unbiased bits from that unbiased integer.

The only requirement for the above method to work is that the coin flips are i.i.d., i.e. independent and identically distributed.
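
For concreteness, here's a toy Python sketch of the table method for a small number of flips, so the table stays tiny (the helper names and the n = 12 / p = 0.7 choices are mine, just for illustration):

  from itertools import combinations
  import random

  def flips(n=12, p=0.7):
      # n flips of a coin whose bias p is unknown to the extractor
      return tuple(random.random() < p for _ in range(n))

  def table_index(seq):
      # Index of seq within the table of all sequences that have the
      # same length and the same number of heads, plus the table size.
      n, k = len(seq), sum(seq)
      heads_at = tuple(i for i, is_head in enumerate(seq) if is_head)
      table = list(combinations(range(n), k))  # all ways to place k heads
      return table.index(heads_at), len(table)

  index, size = table_index(flips())
  # `index` is uniform on [0, size); extract unbiased bits from it
  # with your favourite integer-to-bits method.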

(In practice you probably don't want to actually construct the table, but instead compute the index directly, as if you had constructed the table.

You might also want to work with an arbitrary number of flips, instead of a fixed 1,000; or even adapt the method to an arbitrary-length stream of coin flips whose length you don't know in advance.

Also in practice, you don't need to store the whole sequence. What you do is keep a count of how many heads and tails you've seen so far, and feed the individual coin flips into an 'entropy pool'. Your head/tail count informs your estimate of how many bits of entropy you have in the pool. (It basically feeds into the formula for how many possible arrangements there are, similar to the fixed-size method suggested above.)

You generate your unbiased bits by draining the entropy pool.

The method of entropy pools described here is pretty much how /dev/random works on Linux.)
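
To make the bookkeeping concrete, here's a toy sketch of the entropy accounting described in the parenthetical above (a simplification of mine; a real pool also mixes the inputs cryptographically and is more involved):

  from math import comb, log2

  class EntropyCounter:
      # Tracks how many unbiased bits the flips seen so far are worth.
      def __init__(self):
          self.heads = 0
          self.tails = 0

      def feed(self, is_heads):
          # (A real pool would also mix the flip itself into the pool.)
          if is_heads:
              self.heads += 1
          else:
              self.tails += 1

      def bits_available(self):
          # log2 of the number of ways the observed heads/tails could be
          # arranged, i.e. log2(C(heads + tails, heads)).
          return log2(comb(self.heads + self.tails, self.heads))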




Interestingly, your method is a generalization of Von Neumann's method to more than two coin flips. With TT or HH the list has length one (i.e., no bits of information), and with TH or HT it has length two (exactly one bit of information).
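
A tiny illustration of that (my own toy enumeration of the length-2 tables, encoding each table entry as the set of head positions):

  from itertools import combinations

  # For 2 flips, one table per possible number of heads k.
  for k in range(3):
      table = list(combinations(range(2), k))
      print(k, table, "->", len(table), "entries")
  # k=0: [()]          -> 1 entry   (TT: no bits)
  # k=1: [(0,), (1,)]  -> 2 entries (HT vs TH: exactly one bit)
  # k=2: [(0, 1)]      -> 1 entry   (HH: no bits)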


That's an interesting way to look at it, yes!

I don't claim originality here either; the method is pretty well known.


There is a generalization that allows asymptotically optimal entropy extraction from Markov chain traces without requiring knowledge of the transition probabilities (just as the von Neumann trick doesn't require knowledge of the bias).

https://ieeexplore.ieee.org/document/1684974

Shameless plug of a paper on related topics: https://link.springer.com/article/10.1007/s00446-017-0311-5


I don't see how this works.

> The index of that position is a uniform random integer in a range from 1 to the size of the table. Use your favourite method to extract unbiased bits from that unbiased integer.

Why is that position unbiased and random? If I'm using a coin that produces heads every single time, then I'm always going to land on table position 0, so therefore the table index is biased.


Well, if your coin always lands heads, your table will only have a single entry. So you won't extract any bits from it.

Von Neumann's method won't extract any randomness from this coin either.

Do keep in mind that, crucially, you only build the table _after_ you figured out how many heads or tails you got in this particular instance of 1000 flips.

The randomness we extract is not in how _many_ heads or tails we got, but how those heads and tails are shuffled in your sequence.

For the math, have a look at https://news.ycombinator.com/item?id=30697559


I misinterpreted your algorithm. So you're creating a table of all permutations given the number of heads and tails you got, and then looking it up in the table.

This does feel like a generalization of Von Neumann's method from 2 tosses to 1000


Yes, you can view it like that. Basically we have 1001 tables, one for each possible number of heads.

If you are sufficiently clever, you can generalize it to an indefinite stream of tosses, and produce output as you go along. See some of the papers mentioned elsewhere in the comments.


> Produce a table of all possible sequences of length 1,000 with k heads and 1,000-k tails.

This is more efficient than Von Neumann?


This method generates approximately

  log_2(sqrt(1/(2*pi*1000*p*(1-p)))) - 1000*log_2(p^p * (1-p)^(1-p))
bits of entropy from 1000 coin flips (that's log_2 of the binomial coefficient C(1000, k) for a typical k ≈ 1000*p, via Stirling's approximation), where "p" is the probability of flipping heads.

Von Neumann's method generates 1000*p*(1-p) bits of entropy from 1000 coin flips (500 pairs, each yielding a bit with probability 2*p*(1-p)). The theoretical maximum is

  -1000*log_2(p^p * (1-p)^(1-p))
but it requires knowing "p" exactly. This method gets close to the theoretical maximum. I haven't analysed it in detail, but it's far better than von Neumann's method.
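
If you want to plug in numbers, here's a quick Python sketch of the comparison (the function names are mine; the first formula is the Stirling approximation above):

  from math import log2, pi

  def table_method_bits(p, n=1000):
      # ~log2(C(n, p*n)): n*H(p) - 0.5*log2(2*pi*n*p*(1-p))
      entropy_per_flip = -p * log2(p) - (1 - p) * log2(1 - p)
      return n * entropy_per_flip - 0.5 * log2(2 * pi * n * p * (1 - p))

  def von_neumann_bits(p, n=1000):
      # n/2 pairs, each yielding a bit with probability 2*p*(1-p)
      return (n / 2) * 2 * p * (1 - p)

  for p in (0.5, 0.7, 0.9):
      print(p, round(table_method_bits(p)), round(von_neumann_bits(p)))
  # roughly 995 vs 250, 876 vs 210, 464 vs 90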

One obvious drawback is that you have to flip the coin 1000 times to produce the first unbiased random bit, while von Neumann's method starts producing bits much earlier.


> One obvious drawback is that you have to flip the coin 1000 times to produce the first unbiased random bit, while von Neumann's method starts producing bits much earlier.

Definitely. Though you could fix that problem relatively easily.

I think you might even be able to run von Neumann's method first, but store the coin flips; and then once you've got enough stored, extract a few more bits from the already used flips.

Perhaps like this:

When you do two flips, you add one of three possible tokens to your list:

'double-heads', 'double-tails' or 'mixed'.

Crucially, you only store 'mixed', and not whether it was 'heads-tails' or 'tails-heads', because that information was already used to produce the von Neumann bit.

After your list has 1000 entries, you run an algorithm a bit like what I originally described to extract bits (see the sketch below). The complication is that the table you construct contains all possible shufflings of a fixed number of 'double-heads', 'double-tails' and 'mixed' tokens.
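
Here's a rough Python sketch of that two-stage idea as I understand it (my own naming; stage two just returns a uniform index, which you'd then turn into bits the same way as before):

  from collections import Counter
  from math import factorial

  def multinomial(counts):
      # Number of distinct arrangements of a multiset with these counts.
      total = factorial(sum(counts.values()))
      for c in counts.values():
          total //= factorial(c)
      return total

  def stage_one(flips):
      # Consume pairs of flips ('H'/'T'): emit von Neumann bits as you
      # go, and keep only the HH / TT / mixed tokens for later.
      bits, tokens = [], []
      for a, b in zip(flips[0::2], flips[1::2]):
          if a == b:
              tokens.append('HH' if a == 'H' else 'TT')
          else:
              bits.append(0 if (a, b) == ('H', 'T') else 1)
              tokens.append('M')  # which of HT/TH it was is already spent
      return bits, tokens

  def stage_two(tokens, order=('HH', 'TT', 'M')):
      # Lexicographic rank of the token sequence among all arrangements
      # with the same token counts, plus the number of such arrangements.
      counts = Counter(tokens)
      size = multinomial(counts)
      rank = 0
      for tok in tokens:
          for smaller in order:
              if smaller == tok:
                  break
              if counts[smaller]:
                  counts[smaller] -= 1
                  rank += multinomial(counts)
                  counts[smaller] += 1
          counts[tok] -= 1
      return rank, size  # rank is uniform on [0, size)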


It feels like an awkward middle ground. Like this method has a limited entropy output for the first 2000 coin flips (first 1000 double-heads/double-tails/mixed entries), and then suddenly it adds back a ton of lost entropy.

Another commenter linked to some papers on asymptotically optimal entropy generation; I wonder if there is more of a streaming method there. It feels like there has to be, maybe after a slow start. My naive intuition is that after 1000000 coin flips you have a good idea of what p is, and then you can basically do arithmetic coding from there. Of course a theoretically correct method can't do exactly this, but it might asymptotically approach it.


Oh, if you want something practical, the approach that Linux's /dev/random takes is probably the one to go with.

/dev/random being unbiased relies on some assumptions about cryptography; but in practice these assumptions are at least as well-founded as our assumption that our coin flips are independent.

Look at some of the papers mentioned in other comments on this submission. There are (near) optimal streaming methods.


More efficient if coin flips are your scarce resource, less efficient if computation is scarce. But nowhere near as inefficient as it sounds - you don't actually have to materialize the table.


As with so many algorithms, he's trading compute time for memory and pre-compute time. Generating that table is costly, but once you have that table it's pretty efficient.


You don’t need to store a big table with the table method, though. As long as there’s a pattern to it, you can figure out your position algorithmically.

For example, if H=1 and T=0 then you do something like “any result above 11010 is a real heads, others are tails.”

Edit: never mind, that’s a different method than OP. (Doesn’t require exactly k heads.) Doesn’t it get the best of both worlds though?

Edit2: Double never-mind, the problem assumes you don’t know the level of bias. But I think the OP threw that out as well.


The method I described is a bit weird and round-about, precisely because it has to work around not knowing the bias.

Essentially, it generates a thousand flips, and then carefully extracts the randomness only out of how those flips are shuffled, not out of how many heads there are.

However, the closer to 500/500 your distribution is, the more entropy the method I described can extract. Conversely, if you randomly hit 1000 heads and 0 tails, the method can't extract anything, and you'll have to sample again.

In contrast, a method that knows the bias and can exploit it, can extract the same number of random bits, no matter what you actually flipped. Eg in the special case of a known unbiased coin, you always get 1000 random bits for 1000 flips. Even if those flips happen to be 1000 heads.


But the point remains, your method shouldn’t need a full table cached in advance, right? You should be able to use a formula to compute its “effective index”.


Yes, definitely. I just didn't mention that, because I thought it was obvious that we agree on that.

And instead of computing the "effective index" as an abstract number and then extracting bits from that, you can work out directly how to generate the unbiased bits.
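
For what it's worth, a small sketch of that direct computation (my wording, and not necessarily the exact ordering OP had in mind): walk the sequence and count, for every head you see, how many same-count sequences with a tail in that spot would have come earlier.

  from math import comb

  def rank_and_size(seq):
      # Rank of seq (a string of 'H'/'T') among all sequences with the
      # same length and number of heads, ordering T before H, plus the
      # total number of such sequences -- no table required.
      n, k = len(seq), seq.count('H')
      rank, heads_left = 0, k
      for i, c in enumerate(seq):
          if c == 'H':
              # every sequence with a T here (and the same prefix) comes
              # earlier: its heads_left heads all sit in later positions
              rank += comb(n - i - 1, heads_left)
              heads_left -= 1
      return rank, comb(n, k)  # rank is uniform on [0, comb(n, k))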


Yes. Though in practice you wouldn't want to generate that specific table for 1000 flips. It's far too big.


More efficient in how many flips you consume to produce an unbiased bit.

Less efficient in terms of runtime (if you implement it as naively as I have described here).


> Produce a table of all possible sequences of length 1,000 with k heads and 1,000-k tails.

> Look up the position of your actual realized sequence in that table. The index of that position is a uniform random integer in a range from 1 to the size of the table.

I don’t see why it would be. The normal way to produce such a table is as hhh…hh, hhh…ht, hhh…th, … ttt…ht, ttt…th, ttt…tt. If heads is more likely than tails, the index is more likely to be in the lower half than the upper half.

Reading https://blog.cloudflare.com/ensuring-randomness-with-linuxs-..., I also don’t think that’s how Linux’ entropy pool works.


The parent comment was saying that once you condition on how many heads you got (the k), every arrangement of those k heads among the 1,000 positions is equally likely. That sounds correct.


> Reading https://blog.cloudflare.com/ensuring-randomness-with-linuxs-..., I also don’t think that’s how Linux’ entropy pool works.

Where does the description in that article differ materially from what I described in the parenthetical remark?

(Of course, it does differ from what I described in the main text of the comment.)


Cool idea! I wonder what you mean concretely by “extract unbiased bits from unbiased integer”?

Let’s say my experiment results in an index of 42, how do you convert that integer into information about a string of unbiased coin flips?



Arithmetic coding is the industrial strength way to do this, yes!

However, you can do much simpler and still be pretty efficient (and unbiased).

Suppose your range is the numbers from 1 to 49 (inclusive).

If you get a number between 1 and 32, you interpret that number minus 1 as 5 bits (32 == 2^5 possibilities). If you get a number between 33 and 48, you interpret that number minus 33 as 4 bits (because you have 16 == 2^4 possibilities between 33 and 48). If you hit 49, you just drop everything and sample again.

So if you got 42, like j7ake suggested, that would be 42 - 33 = 9 = 0b1001.

You can generalize this scheme to other numbers of possibilities.
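
In code, the same idea looks something like this (a small generalization of mine, using a zero-based x uniform on [0, n) rather than 1 to 49):

  def bits_from_uniform(x, n):
      # x is uniform on [0, n). Peel off the largest power-of-two chunk
      # that fits; if x lands in it, read off its bits, otherwise shift
      # x down and retry on the leftover range. An empty result means
      # "no luck, sample a fresh x".
      bits = []
      while n > 1:
          k = n.bit_length() - 1  # largest k with 2**k <= n
          if x < 2 ** k:
              bits.extend((x >> i) & 1 for i in reversed(range(k)))
              return bits
          x -= 2 ** k
          n -= 2 ** k
      return bits

  # bits_from_uniform(41, 49) == [1, 0, 0, 1], matching the example above
  # (the 42 of the 1-based description is x = 41 here).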



