Hacker Newsnew | past | comments | ask | show | jobs | submitlogin
How to pick a random number from 1-10 (torvaney.github.io)
362 points by torvaney on June 29, 2019 | hide | past | favorite | 173 comments


There is a simpler solution if you do not mind leaving entropy on the table. Break the people up into groups of two without regard for their selection. If two people provide the same number, ignore them. Otherwise, output "0" if the first person's number is smaller then the second, and output "1" if the first person's number is larger.

From this, you have a sequence of uniformly random bits, from which you can construct a uniformly random number between 1 and 10. A simple way (although, again, likely leaving entropy on the table) is to break these bits into groups of 4; interperate them as 4 bit unsigned integers, and discard any result that is not in the range 1-10.


That works and is perfectly uniform (if the people are i.i.d.), but requires quizzing around 20, 40, 60 or more people for a number before you can deliver one, while the algorithm described only requires one or two.

EDIT: Not quite that many, more like around 9, 18, 27 or so, see below.


I'm not sure I understand.

Humans aren't like weighted dice. For example, it's certainly conceivable that a few pairs of these humans misunderstand the goal such that they have a 0% chance of ever choosing numbers that change the respective ordering. Add to that the higher probability that many of these pairs of humans will just toggle their orders each round.

Edit: clarification-- the humans don't even have to misunderstand the rule, just the upshot of the process.


Maybe you don't even need to ask them to think of a number. The person whose name is sorted before the other person's name is person 1. And they compare their lengths instead of thinking of a number?


That gives you a single bit of entropy from those two people instead of a steady stream, though.


That's a good point, I guess you have to also randomly choose the humans?


I don't think so, if they were all lined up in alphabetical order and you took them in pairs from that ordering, it would be just as good.


But then you'll always get the same output from the same set of humans. It's a PRNG where the set of humans is the seed.


Nonsense. It requires roughly 1+log(10) bits in expectation and has a geometric tail.


Eh? 4 bits minimum to get a uniform 1-16, and each bit requires requires 2 numbers that are not identical. (Ah yeah, I thought that in turn requires a bit more than 4 numbers, but it requires only a bit more than 2 numbers (probability that numbers coincide is not 1/2, as with bits, but 1/10, or a bit more than that due to non-uniformity)).

Thus, I recant "around 20, 40, 60 or more" and replace it by "around 9, 18, 27 or more" numbers. Probability that you have to reject any resulting hex digit is obviously 3/8.


Rejection sampling isn't the best way to go.

With b bits you are subdividing the (0,1) interval into 2^b regions, and only need more bits if one of the 9 multiples of 0.1 land in the interval you've picked. As b increases this probability drops exponentially.

It's a fair point that the number of "numbers" depends on how low the entropy of the source is, but in the link the probability of collision wasn't massive.


But the original post described rejection sampling (once you got your bits from the larger/smaller trick).


Absolutely! I read the von Neumann extractor and presumed too much. For fun, you can reduce the numbers asked by getting a batch of k people, and if all numbers are dissimilar you get a draw from k!

The OP is silly though; if you know the distribution there are way better techniques. They don't seem to worry about how they measure it.


Parents algorithm always works, the one described has a massive failure mode - people slowly learn that they are likely to say 7 and self-censor by picking another number. Or maybe in a different culture the distribution changes (say China and the numbers 4, 8).


> the one described has a massive failure mode

Well, if you're really interested in generating random numbers using a group of people, then yes, this may be a problem.

But you should take this as an illustration of the more general problem of generating uniform random numbers from a distribution that is not.

If you think about it this way, author's solution can be applied in a number of practical scenarios. For example, if you want to generate a hash values for objects using properties that are not uniformly distributed. Or if you want to generate random numbers from a physical artifact that is not perfectly uniform.


I would guess that in China the probabilities of 4 and 8 goes down. Asking people to pick randomly drives them away from special numbers.


In the US 7 is a lucky # and a statistical outlier in that it was more frequently chosen in the article above.


You might be right, but my guess is that 7's reputation for luck isn't famous enough to make a difference, and in fact it's common because other numbers seem too "unrandom".

We could test by asking people to pick numbers less than 100. I bet people would focus on odd numbers, especially those greater than 50, not ending in 5 or not having both digits the same.


I decided to do that.

Data and light analysis is here: https://docs.google.com/spreadsheets/d/1Dh0wiTCRkBhckWGXtjZg...

I definitely overpaid on Mechanical Turk per response, given how lightning quickly the data came in. (I decided to pay $0.10/HIT for 50 responses and got 68 responses in 8m26s.) I suspect that offering a nickel would have gotten the survey filled in under half an hour still...


Maybe 7 cents?


Excellent point. The algorithm in the article is predicated on a specific distribution and iid, GP algorithm is predicated only on iid.


That's pretty good, as long as each person can't hear any of the previous numbers! The article has the same issue, and might even be biased by the second person knowing they're the second person.


Maybe have them write their numbers down and put them in a hat? Or hey, this is 2019, have them connect to a website or bluetooth beacon and enter a number on their phone?

Ooh, or I bet you could make a 10-digit keypad that wirelessly reports to a nearby Raspberry Pi for $2-5 in parts, sort of like those 'clickers' that universities use to quiz large lecture classes.

It's too bad that most cheap radio modules are so short-range, because it would be interesting to make something like that and nail them to telephone poles with notes asking people to press a button. You'd have to control for people doing things like hammering the same button repeatedly for a laugh, but it would be fun to see what happened.

Would you be allowed to use HAM bands in the US for that sort of thing if you made them like beacons which sent a callsign after the number value and device ID?


If you are picking numbers from a hat, why not just put in ten numbers and pick one?


Here, we see the difference between scientists and engineers ;)


Well, each group of two could use their own hat. Or they could just write the numbers down instead of saying it. I dunno, the point is they don't have to say them out loud.


Just use the unlicensed ISM bands.


Exactly. Not only aren't people random, they're predictable based on history[0]. Each person could only be asked for one number and they shouldn't even wait perceptibly different lengths of time to be asked.

[0] http://people.ischool.berkeley.edu/~nick/aaronson-oracle/ind...


I've tried this a few times (refreshing the page in between). After a hundred or so strokes the computer is consistently less than 50%, usually ~45%. Suggesting my brain has somehow spontaneously worked out whatever algorithm it's using to guess. I don't think I'm random but there's something too simplistic about the method on this page.


Works except when both people in every group give the same number


does this algorithm have a name?



thanks!


...or just add up the answers mod 10.

This has the property that if even a single person answers uniformly at random, then the final number you compute will be uniformly random, regardless of how everyone else answers.


Paradoxically, this may be the best answer because it is not the best answer.

In a room of 8500 people, it is likely that there is someone in the room smarter than you and knows a better way to pick random numbers. Come into the room and tell everyone that your objective is to pick a random number from 1-10. Maybe even go as far as to tell them your mod 10 idea. Wait a bit to let them to think about it, then start asking for the random numbers. It is likely that someone will have come up with a better way to pick a random number than your mod 10 solution, and thus their answer will be more uniformly random that what your solution can produce. And thus your solution becomes at least as uniformly random as theirs.


> ...or just add up the answers mod 10.

This is how my board game groups pick the first player.


But if you're playing a board game, don't you already have dice at hand?


It depends which game, many do not use dice.


How do you prevent the last person to select a number from deciding who goes first?


Just ask everyone to write their number down and reveal at the same time.


We actually do it with fingers instead of writing, but same idea.


I'd love to see something written up about how quickly this converges, assuming the variables are i.i.d and distributed according to the distribution in the article.


Empirically, from running a quick script on the data in the article, it seems like you only need to sample 5 numbers to get a distribution that looks uniform using this "sum mod 10" method.


If you're going for empirical, why not ask for between 1 and a very large number? May as well extract as much entropy per person as possible for post-processing.


Mod10, no matter how large the number you’re only keeping the last digit (assuming they choose base 10 numbers), which... is probably no more random than asking them for a single digit in the first place.


Don't do mod 10, sum the digits.


> This has the property that if even a single person answers uniformly at random

Has any human ever proven they're capable of this? Generating truly random sequences is more or less impossible for humans AFAIK. (E.g., see https://news.ycombinator.com/item?id=19336754 which challenges you to do just that. Spoiler: you will probably fail miserably.)

It's an interesting idea, but in practice I think relying on the assumption that "even" one person is truly answering randomly (let alone uniformly at random) is a non-starter. But perhaps if there are enough people, the resulting sequence blends enough entropy together to get something that looks almost like a uniform random variable anyway? It would be interesting to test empirically.


https://dilbert.com/strip/2001-10-25

Philosophically, no it is not possible, though neither can any natural phenomenon for which we reasonably rely on for randomness.

Practically I suppose your goal is just to generate numbers such that the next number cannot be predicted given only the previous numbers but also without considering any outside knowledge. Potentially possible. Potentially impossible to test. If you can beat the test, it probably just means your method beats that specific test


Aren't there algorithms which reliably beat humans at rock paper scissors?


There are algorithms that beat most humans, because humans don’t play randomly. Conversely there are humans that can beat the human-beating algorithms reliably because the algorithms don’t play randomly either.



No free will needed. You see the predictions. Let the model learn and then change your sequence. It takes some time for the model to adjust. This way you can get the prediction rate to 50% or below.


I found myself reliably beating it just by making the runs of one letter longer than I would naturally if I was just trying to be random.


can you point to me a proof of that property.


I think you need at least one person who is completely independent of all others. For example if the same random is added up twice (say if the room is just you and your evil clone), you can get stuck with just even numbers. Independence is a pretty strong property to ask for since we all share the same biology...

But if you do have independence, the proof is easy! Let S be the sum of everyone else and X be the discrete uniform random in [0, 9]. Then:

  Pr(X + S mod 10 = i) 
    = \sum_j Pr((X + j) mod 10 = i | S = j) Pr(S = j)
    = \sum_j Pr(X = i | S = j) Pr(S = j)
    = \sum_j Pr(X = i) Pr(S = j)
    = Pr(X = i) \sum_j Pr(S = j)
    = Pr(X = i)
Due to total probability, symmetry, independence, and total probability again respectively. The handwaved part is the mod where you can imagine the histogram columns in the pmf getting rotated/shifted around but ending up looking exactly the same afterwards since all columns are symmetrical.


thanks!


Am I missing something, or is this a textbook case of overfitting? It's a neat project, but at the bottom he prescribes what to do if the person says 1-3, or if they say 4 or 5, etc. Therefore, it's completely overfit to the Reddit data at the top.


The model could be extended by adding a penalty term to the objective. I guess the penalty would need to be nonlinear, as the current objective already minimizes the x_ij (for i != j) implictly? One could also introduce a fixed-cost offset for any nonzero redistribution amount, but that would change the problem type from linear program (LP) to mixed-integer program (MIP).


Overfitting typically happens when the number of training samples is small compared to the number of parameters of the model. It especially happens when the input space is high-dimensional.

Here the input space has a only one dimension. The model has 10 parameters. There are 8500 samples. So, the author safely assumes that no over-fitting occurs.


Wrong. You are assuming a good selection over the whole search space of humans. If you took HN I bet you would get a different distribution. The Reddit data only applies to sampling Redditors sampled using similar strategies, the generalization is limited by how general the collection strategy was


Yes, there is a possibility that the sample is biased. But that does not matter for this experiment.

The OP says that the model should be built with people in the same room than him, and then the human RNG should be executed with the same people.

As a consequence, OP has to capture the bias of the people in his room to make his RNG work.

The model would break only if the people in the sample change their strategy to pick random numbers after some time.

The data from Reddit is just used as an example of how to make a human RNG from redditer random picks.

As long as the training and testing are done from the same group of people, there is no issue, even if this group is biased.

Moreover, I don't see any prior reason to believe that redditors and HN readers would have different biases. That would be an interesting experiment, though. However, the predominance of the number 7 in article's data might be a western culture thing.


This doesn't exclude overfitting. You can take samples n much larger than parameters p in, for example, a political or household earnings survey performed in the San Francisco Bay Area, but obviously one cannot safely assume no overfitting occurs in that scenario.


I think the situation given is that you know the skewed input distribution.


> The easy thing to do is to ask someone “Hey, pick a random number from 1 to 10!”. The person replies “7!”.

Seven factorial is way bigger than 10. Just kidding, sorry.

But what I was actually going to say was, when I read the title of the post, before clicking through, I decided to think of a number myself and I chose the number seven. So it was fun to see that same number in the article.

10% chance in theory and in reality it’s about 25% likely for someone to pick the specific number I did. If only the odds in the lottery were this good.

Why, by the way, is it that seven is so popular?


From the Reddit comments someone wrote:

"Can't have the evens, they aren't random. 1 and 9 are too close to the outside. 5 isn't random, it's right in the middle! 3 is too low. Leaving 7"

It does feel like the type of biased logic that would run through your mind in the moment — interesting to think, if we have such 'fallacies' on simple stuff, what level of blindspots do we have on more complex split-second decisions.


Quite a few, anchoring is pretty interesting and can often be completely subconscious.


I think most people (or at least those who choose 7, which I definitely used to do) associate 'randomness' with 'uniqueness', and since lots people would consider 7 to be the most 'unique' number between 1 and 10 (When considering things such as divisors and how often they are used), they pick that number as it feels the most 'random'.


My impression is that 37 is a similar outlier if you ask for random numbers from 1-100.


I was going to pick seven as well.

Personal guess for why it’s a favorite... 1. I want to signal that I’m creative. If you say “random, between 1-10” picking 5 feels amateur. 2. You said “between 1-10”, which is inclusive of 1 and 10, but you positioned me to not choose the poles with the word “between.”

3. It feels random to iterate through options and settle on one. I iterate by counting up. Since 5 is out, that leaves 6,7,8,9 in my big pile’o’random.

From there, for whatever reason 7 tends to stand out.

Crazy how predictable we are.


My guess: it is all about culture. I guess asian people are more likely to choose 8 (lucky number) and avoid 4 (unlucky number).


I think 7s feel more random because they are so uncommon in normal life. We want to avoid multiplying or dividing by 7 whenever possible. Imagine if "a dozen" were 7 instead of 12.


We see 7 all the time, though, in a week.


It's large and prime, so it feels less "round".


Haha wow!

Ok so I thought of the same. Let me think of a number before clicking.

Then I thought let me try something fancier, I know I would go around 6-8 so let’s switch it up. Let’s put the scale like 5 6 7 8 9 10 1 2 3 4 5 in my mind. Ok now let’s pick from a weird place, like the first half, say the middle of it. What’s that number. Oh it’s 7. I arrived to the same number while trying not to. Different kind of “randomness” I guess.

Humans are weird! :p


This past week I was reading Neverwhere, by Neil Gaiman which has the same interaction:

> "Now me," said Mr. Vandemar. "What number am I thinking of?"

> "I beg your pardon?"

> "What number am I thinking of?" repeated Mr. Vandemar. "It's between one and a lot," he added, helpfully.

> "Seven," said the marquis. Mr. Vandemar nodded, impressed.

When I encountered the question the first number that popped into my head was also seven. This was immediately followed by the thought that if I were the one picking a number I should explicitly avoid seven. Does this question and answer pair give anyone else a strong feeling of déjà vu? I think this interaction is a meme [0].

It's also interesting that most people will limit their choice to a (whole number | positive integer). The best choice is probably an irrational number like the square root of two (√2) or pi (π). Although those are still relatively predictable since most people would probably make their selection from the small set of well-known irrational numbers, so you should go one step further and add to or multiply the value.

[0] https://en.wikipedia.org/wiki/Meme


Your best choice is probably to stick with rationals, and just list off an arbitrary number of digits to the right of the decimal.


I tried that once with 2 friends, I asked them to think of a number between 1 and 10 and I then asked if it was 7 and they both were amazed :).


AIUI 3 is the most common if you ask someone to pick between 1 and 5.

I was actually surprised to see 3 wasn't the second-most-common for a number from 1 to 10.


Think of a vegetable. I predict you think of a carrot.


Think of a number between 1 and 10. Multiply it by 9. Add the digits of that number. Subtract 5. Convert it to a letter using A = 1, B = 2, ...). Think of a country beginning with that letter. Take the second letter of the country. Think of an animal beginning with the new letter.

I predict you thought of (rot13) Ryrcunag.

Not exactly the same trick, and not guaranteed to work, but this one impressed me at a young age.


It works because there are remarkably few country names starting with D that don't start with DE. And everyone picks the same E.

Now try asking for an animal whose name starts with U.


woah, I thought of seven too, before reading the article or the comments.


There is a simple way to generate random numbers in your head: https://groups.google.com/forum/#!msg/sci.math/6BIYd0cafQo/U...


For ease of reference, here is the text at that link verbatim:

Choose a 2-digit number, say 23, your "seed".

Form a new 2-digit number: the 10's digit plus 6 times the units digit.

The example sequence is 23 --> 20 --> 02 --> 12 --> 13 --> 19 --> 55 --> 35 --> ...

and its period is the order of the multiplier, 6, in the group of residues relatively prime to the modulus, 10. (59 in this case).

The "random digits" are the units digits of the 2-digit numbers, ie, 3,0,2,2,3,9,5,... the sequence mod 10. The arithmetic is simple enough to carry out in your head.

This is an example of my "multiply-with-carry" random number generator, and it seems to provide quite satisfactory sequences mod 2^32 or 2^64 , particularly well suited to the way that modern CPU's do integer arithmetic.

You may choose various multipliers and moduli for examples of random selection of the types you ask about.

A description of the multiply-with-carry method is in the postscript file mwc1.ps, included in

The Marsaglia Random Number CDROM with

The DIEHARD Battery of Tests of Randomness,

available at

http://stat.fsu.edu/pub/diehard/

George Marsaglia


I usually just look at my watch and use the last digit of the seconds value. 0=10

reasonably random


Very easy to predict.


how so? if you ask me for a number from 1-10 and I use my watch, how do you have any predictive information? Even if you looked at your watch there is no reason it should have the same seconds value as mine.

Obviously this is not really something you could automate and if you were to use a clock on an ongoing basis you might be able to predict likely numbers, but that was not in the original scenario.


Ah but there is a reason for my watch to have the same seconds value as yours, and that is if our watches are kept synchronized to an accurate time source, for ex. via NTP. By assuming otherwise you're sidestepping the problem by claiming you already have a reliable RNG.

Heck, even granted you have an unsynchronized watch, your "random" numbers are easy to predict, I'll ask you for a random number once per second and after a few answers I don't need to ask any more to know your answer.


This is an example of a precise analysis that is unhelpful because it is irrelevant in the contexts in which the algorithm would be used. It's like saying "all algorithms are constant time-complexity in practice because computers are finite". True but useless.


> I'll ask you for a random number once per second and after a few answers I don't need to ask any more to know your answer.

You know that we aren't computers right? We don't need any fancy software to detect that kind of attack.


If I had to generate a random number I would do this:

As I ask each person I cycle a secret number through 0-9 in my head. (I increment mod 10 after I ask each person)

When I ask I get their answer, I secretly add my secret number modulo 10 (and consider 0 === 10) and record this in secret.

Then record a tally for each number, the one with the most wins.

This assumes the asked people will not know or guess my internal number. So I seed based on the first person's number (OK they know!) but everyone else wont.

Edit: someone came up with a simpler solution: https://news.ycombinator.com/item?id=20315835


How about not using 'pick a random number from 1-10' as the source of entropy? For example, have everyone (from a group of some size) select a natural language sentence of 5 words or more, sum the ASCII values of the upcased alphabetical characters of everyone's sentences, and take the last digit (then add one).


This looks to be the discrete random variable problem with the alias method:

https://en.wikipedia.org/wiki/Alias_method

but they don't seem to reduce each choice (1-10) to just two options.


Put everyone in a numbered sequence unknown to them. Add their chosen number to their sequence number, mod 10.

This re-maps everyone’s choices so they actually have no idea what number they are actually choosing and efficiently redistributes the bias in their choices without massively complicated functions. It is also robust to changes in the distribution pattern. However it would only work well if you had at least 10 people and the number of people is divisible by 10.


Wouldn't putting "everyone in a numbered sequence unknown to them" require a random number, which we don't have?


> Wouldn't putting "everyone in a numbered sequence unknown to them" require a random number, which we don't have?

Yes. The above answer just obfuscates the issue. Consider ten people overwhelmingly biased towards 7. The suggested approach in all likelihood gets you 1, 2, 3, 4, 5, 6, 7, 8, 9, 10. Then what? You're clearly not much closer to getting a single random number without a random way of picking one of them.


>You're clearly not much closer to getting a single random number without a random way of picking one of them.

That’s a characteristic of the original formulation of the question, not specifically in my solution. It’s implicit in the article that you can choose an arbitrary person in the room without bias.


It's a shortcoming specifically introduced by your proposal. If you've got a random way of picking one of ten people, then you've already got a random number between one and ten. Your answer essentially reduces to: ask one person to think of a number then add an actual random number and mod 10.


From the article:

>The easy thing to do is to ask someone “Hey, pick a random number from 1 to 10!”. The person replies “7!”. Great! Now you have a number. However, you start to wonder, is the number uniformly random?

So the problem reduces to transforming one form of randomness (picking a person) to another form (a uniform numeric distribution. That's implicit in the whole rest of the article.

>If you've got a random way of picking one of ten people, then you've already got a random number between one and ten.

You've got a source of randomness, but you still need to turn that into a number. The article gives one method and I've given another. You could also assign the people numbers from a stack of shuffled numbered tiles, then pick a person although that requires you have the tiles which is not really part of the setup. The conceit is that you have to use people's ability to pick numbers pseudo-randomly. There are probably numerous ways.


> Wouldn't putting "everyone in a numbered sequence unknown to them" require a random number, which we don't have?

No, because that ordering can be chosen by someone not in the sequence.


Assuming you would put them in 10 buckets, that order wouldn't have to be random, as long as people have similar biases.


Actually the proposal in the article requires that the group's biases be stable and uniform, where mine doesn't.


The question assumes we have the normal faculties of humans. You can choose whatever non random criteria for ordering them you like as long as it doesn’t affect their number choice.

For example height, or age, or distance from the door to the room, or how much I like them, or where in the sequence I arbitrarily decide to put them in. As long as that does not a priori influence their number choice we’re good.


How about 1 person labels 10 people 1 thru 10 and another person not knowing how they're labeled chooses 1 of those people

edit: the first person labels the 10 in a jumbled up way, then the 10 people not knowing how they are labeled then jumble themselves up


This seems like it could work pretty well. It would be interesting to try it out.

I see two possible sources of bias:

1. The way in which a person labels the others won't be random. And multiple people may exhibit the same pattern. For example, let's say that people are biased towards trying to label an "average" person as #1. This might be based on height, weight, attractiveness, etc. Then a given person may have an uneven distribution of labels attached to them. Further compounding this could be that the most "average" person is the one most likely to be chosen by the second person.

2. You would need to come up with a random way to pair up the first and second person. The worst case might be that the same two people are always paired with each other, and each person always chooses the same person in the room. Then you would end up with a series of 10 digits repeating themselves.


i don't know how strict it is with having no props. i think a blindfold could be acceptable because it can be used from the materials of the people in the room but i don't know about actual labels (maybe post-it notes) or a marker pen. so the first person can't see the people and can only like touch their back to put the label on.

edit: but then it gets to where you may as well just put the numbers in a hat :P


Top poker players have mental models for generating random numbers at the table, to avoid predictable play. I can’t find any link, but it doesn’t require a room full of people (or a wristwatch).


0-9: Look at the second hand on your watch, what is the last digit?

odd/even or heads/tails: is the seconds odd or even?


I bet that has sample bias as there’s a small amount of choice in when they reply


Only if they know the time.

For binary selection, keep the watch and tell them to make an indication (say something) in a few seconds. Note down whether the seconds were even or odd, then hand them the clock and swap roles. Keep the watch hidden until the moment you need to inspect it; record the time as soon as you can see the watch. Merge the two notepads to get the final stream of randoms.

For last digit of seconds, wait a few minutes between each query (making sure they can't see the clock during the rest period). "A few minutes" is imprecisely measured by you, who does not have access to a clock. They delay would foil any attempts at them keeping track of the time in their head. Bonus for distracting them to help with that. And instead of asking them for the last digit, have them flip the watch and then record the time that you [both] first saw.


The general method illustrated here (rearranging a sufficiently uniformly distribution over 100 into a very-close-to-uniform distribution over 10) is a special case of a topic in information theory called randomness extraction:

https://cs.haifa.ac.il/~ronen/online_papers/ICALPinvited.pdf

https://people.seas.harvard.edu/~salil/pseudorandomness/extr...

The problem being solved is trying to obtain a distribution arbitrarily close to uniform from sampling a known random distribution.


Maybe an easier solution would be to append all the answers, take a hash (e.g. sha1), then convert the last 2 digits from hex to decimal, and mod 10.

1 liner.

--

EDIT: I lied, this doesn't work, it biases towards 1-5. I guess you could convert to decimal, and divide by (FFFFFFFFFFFFF / 10)


"But, let’s say you have to do this without access to coins, computers, radioactive material, or other such access to traditional (pseudo) random number generators. All you have is a room of people."

Can you do a SHA1 in your head?


Not quite in his head, but Ken Shirriff did SHA-256 with pencil and paper:

http://www.righto.com/2014/09/mining-bitcoin-with-pencil-and...


I would call this "PHP hacker solution" :)


Yeah, there's a few types of "great solutions" in engineering. There are those that run really fast (big O) and those that can be written really fast.

99% of the time the job calls for engineers who can come up with the second type of solution.


Would be slightly biased, though (256 is not divisible by 10, you'd have to reject if >=250, say).

EDIT: and don't forget to add 1.


A few years ago I was stumped by a similar question about generating uniformly random numbers with less than ideal constraints. The helpful minds at MathOverflow solved it [0] but it wasn't something that you could guarantee would always work. (The question explains in more detail... Due to the miniscule probability of a never ending sequence of less than ideal random integers.)

I'm curious whether this technique could be applied to my original problem!

[0]: https://math.stackexchange.com/q/1273214/196899


You could use the biased distribution to build a lookup table for a less biased distribution.

Build a 10^n lookup table with an entry for each of the possible (ordered) n-tuples of values, give each entry a weight of the product of the probabilities of each of the n values making the entry.

Create a set of probabilities for each output and initialise to zero, run through each of the generated tuples in descending order of weight, set the lookup value to the output with the current lowest probability (or the first, or feed in another PRNG - but you have to stop somewhere) and increase the probability according to the weight associated with the tuple.

At the end of this process you'll have a PRNG that's at least as good as the input (and generally better) - although you'll need to query the seed PRNG n times per result. The higher the value of n the better the output (Although the lookup table will become quite large).


This solution gives no consideration to security only the distribution of outputs.


There's no need to solve the distribution balancing problem with linear programming. You can just use a greedy algorithm where you repeatedly give probability mass from numbers with more than 10% to numbers with less.


In similar situation (we were playing RPG while on a trip, and we needed to "throw dice") - we simply had the DM count silently and the player that was "throwing" say "STOP".

It was pretty much uniform.


Or have them give the last digit of the age of their oldest living family member.


Due to the distribution of people's age towards the "old" end of the scale, I'd guess that this is more likely to be 0 than 9?


Yeah, I'm curious to look at data, but I think this will be skewed towards lower digits.


What about Benford's Law?


Benford's concerns the first digit, not the last digit.


Except in a distribution that doesn’t go past one digit, of course (no, really, it isn’t that rare to be able to apply Benford’s irl to sequences that don’t go past ten). But, yeah, not applicable in this case.


> Ideally we want to preserve as much of the initial distribution (i.e. do as little chopping and changing) as possible.

I was thinking through the post that this sounds like a straightforward problem and it is: https://stackoverflow.com/a/5953133/86433

Except the introduction of an "ideally" optimization condition turns it from a straightforward transform into something requiring a linear programming solver. I wonder if the resultant algorithm is actually any simpler though...


The interesting part is the recursive application of the algorithm. I'm wondering why it isn't applied fully recursively but only up to one step. Surely, if the algorithm can generate a uniform probability distribution, it can also generate an arbitrary probability distribution, so why not use that arbitrary probability distribution in order to generate the uniform probability distribution?


If you have a pair of scissors or are willing to ask people to take a brief bit of pain...

Hair.

Cut (or pull) off a chunk of hair from everyone in the room. Put in a big pile. You'll have hundreds of thousands of hairs. It will be a glorious mess :)

Now ask people in the room to grab a handful of the hairs and count them. Take the last digit of the each count - there's your random number. You could maybe get multiple digits from a single handful - just be aware of Benford's law.


So this relies on that being a somewhat normal distribution (7 being the most common number).

What if the people were primed by the first person saying 7? What if the first person had said 3?

So really you'd need to get your 'random' numbers, check the distribution, than redistribute your 'random' numbers. Which doesn't seem random or 'random' to me.


The 8500 students are numbered off (assigned id's) from 1 to 10. Then you wait. Your first random number is the ID of the first person who dies. Second random is the ID of second person to die. Etc.

Fairly slow algorithm. And only good for 8500 random numbers. But I think it would give a good distribution.


I actually laughed out loud when I saw the image of the human RNG distributions. 7 is a random number. Who knew?


I feel like you could probably 'fix' it by the following protocol:

1. Ask your participant to write down a random number. 2. After they've written that number down, inform them you will guess at their number, and give them a dollar if you guess wrong. 3. Ask if they'd like to change their random number. 4. Allow them to write down their new number.

The before/after would probably change the distribution, and pretty much demonstrate that people can be more random when motivated.


I'm sure the distribution would change, but not sure it would be closer to uniform. It's now a completely different problem: choose a number most resistant to guessing. For example, if the participant hypothesises that your guess will be a "random" number chosen by you, she should always pick 10.


7 was what came into my mind right after reading the title. Seeing it later in the text felt like a magic trick.


If you can widen the answers the people can give you could first determine the distribution of a larger width (i.e., pick a number from 1 to 100) then map those to the 1-10 distribution based on their frequency.

This is effectively the same as the person proposing sampling tuples.


This can be generalized to having any distribution, simulating a die:

http://www.keithschwarz.com/darts-dice-coins/


Survey for analysis on this thread:

https://www.surveymonkey.com/r/XWML9JM


Data (including a batch I paid for on Mechanical Turk):

https://docs.google.com/spreadsheets/d/1Dh0wiTCRkBhckWGXtjZg...

(Amusingly, "69" is an over-represented response...)


This assumes independent events. If someone picks 7 and you ask them for another random number, it is highly unlikely (certainly not 28.1%) that they will pick 7 again.


"Ask another person for a random number"


So you need access to a uniform RNG to implement this as it takes the human bias and multiplies by a factor and the RNG... so why not just use the uniform RNG?


You do not need access to a uniform RNG for this - he uses more humans to get the redistribution chance.


You could use the month of birthday as random source,if it is bigger than 10,then ignore.


Use the middle square method:

https://en.wikipedia.org/wiki/Middle-square_method

It can be trivially calculated with pencil and a paper and will be random enough. You just need to pick a seed and off you go.

If it was good enough for Von Neumann it is good enough for you.


Didn't the world agreed already that the new Pseudocode Language was Python?


What number are people most likely to pick the second time if you ask them twice?


Maybe the most likely number of the ones they haven't picked? So 5 if you picked 7 and 7 if you picked anything else. This should be tested.


Or just carry a coin-like flippable object. (Cell phone?)


And if no objects are allowed in the room at all, then pick the lightest person and flip them :) Though flipping a person multiple times is difficult, so it could be biased towards 1 flip.


Or just use digits of pi; most people have at least 10 digits memorized, some have 20+ memorized. There's also a formula you can memorize for calculating the nth digit of pi, so you have infinite random numbers without memorizing an esoteric algorithm with a bazillion corner cases.

That said, I still thought this was an interesting article and I loved the animated bar graphs.


Most people have at least 10 digits memorized? That is one bold claim


I meant here on Hacker News. I would guess most of us here played with our TI-83s enough to know the first 10


I wonder how many people on HN have even seen a TI-83, I don’t think I’ve seen one since I sold mine when I finished the Danish equivalent of high school in 99. I mean, they were apparently discontinued in 2004, that’s 15 years ago.


In the US TI has established a multi-decade monopoly in schools. Basically, the teachers are only familiar with specifically them, so they require student only use them, so new teachers are only familiar with them.

As a result, a new TI-85 equivalent is only trivially upgraded from what I used 25 years ago and costs exactly as much minus inflation.


Wow, why aren’t students using laptops?


In college some students use laptops for taking notes. Not so much in high school. But in both during tests, electronics are strictly regulated to prevent cheating. Teachers know how to factory reset TI calculators. So, they and only they are allowed. Also, teachers know how to instruct how to use TI calculators. So, they are sometimes even required for certain classes.


Lol in Germany i was only ever allowed to use non-programmable calculators all the way through high school, college, university.


I'm from the US and graduated high school in 2014; TI-83s were ubiquitous. I'm not sure if they were the original model or the "plus," though.


Probably the plus. The TI-84 Plus was the calculator my high school (and maybe middle school) offered - I graduated in 2016.


I'm not sure that most people have memorized the first 10 digits of Pi. I know 3 for certain, any more than that and I can look them up.


I think most would know the first 5 because it's pretty accurate making 3.14159 into 3.1416 and you forget what's after the 9


3.14159 might even be a stretch though. I'm pretty sure the average person (or american at least) would only know 3.14


That is pretty arrogant to think that just because somebody posts here they have 10 digits of pi memorized. This is a public forum. Anybody can post. You and everybody else posting here are in all likelihood just average intelligence.

There is nothing exclusive about using this forum. You aren’t the first or the last random person to exist here.

Don’t let your own supposition of your intelligence get to your head. There is even a theory for that.... Dunning Kruger.


Memorizing digits of pi is easier than memorizing the algorithm in the article. 3.14159265, come on, stop making such a big deal of this.


I count only nine digits so it looks like you gatekept yourself out.


It's still an unsolved problem whether or not the digits of pi are uniformly distributed.


Just try it out for yourself. Analyze the distribution of the first billion digits and I think you'll find they will produce a more uniform distribution than this algorithm in the article given 1 billion human responses...


Even taking for granted that the digits are evenly distributed, which digit do you pick?


The next one in the sequence, anytime you need a new number?


It's not random if you're forcing it to be uniform. May as well skip the math and do this instead:

Divide all the people into groups of 10. Have the members of a group play each other at whatever, e.g. arm wrestling, to produce a rank from best to worst. Their rank becomes their answer.

Now everyone will reply with a number as close to uniform as the total number of people is divisible by the number of choices.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: