
> We're looking for a specific sequence, though, not a specific number of heads in a row. We don't even know what the sequence is since it hasn't been sent yet. Is that a problem? Not at all! We're looking for some sequence of length n, and given that both 0 and 1 are equally likely, the sequence 00110 is equally likely as 11111.

Interestingly enough, this isn't true!

First, let's test on a small example: how likely are the substrings "11" and "10" to appear in binary strings of length 3? Here's a table with the matches marked.

        "11" "10"
    000
    001
    010        *
    011   *
    100        *
    101        *
    110   *    *
    111   *
"10" can appear in four ways, but "11" can only appear in three. Why is this?

Say you're scanning through a bit string, looking for "11111". You've seen "111" so far -- that's three matches. Now you encounter a "0". Your counter resets to zero matches until you see a "1" again.

Now say you're looking for "00110". You've seen "001" so far. Just like before, you encounter a "0". You still need to reset your counter, but this time the "0" may actually be the start of a new "00110". So your counter resets to one, not zero. This means "00110" matches are easier to find, and happen more frequently!
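A quick brute-force sketch of that claim (my own check, not part of the original argument; the 20-bit string length is an arbitrary choice):

  n = 20
  patterns = ["11111", "00110"]
  counts = {p: 0 for p in patterns}
  for i in range(2 ** n):
      s = format(i, "020b")
      for p in patterns:
          if p in s:
              counts[p] += 1
  for p in patterns:
      # fraction of n-bit strings that contain the pattern at least once
      print(p, counts[p] / 2.0 ** n)

"00110" turns up in a noticeably larger fraction of strings than "11111", matching the counter-reset intuition above.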




> This means "00110" matches are easier to find, and happen more frequently!

They do not happen more frequently in a random bit stream. You are absolutely and completely wrong about this.

They only happen more frequently if you switch the problem from counting occurrences to counting bitstreams that contain these occurrences.

The reason is that the sequence 111 contains two occurrences of the substring 11. So if this shows up in a bitstream and you are counting bitstreams, you only get a count of 1, whereas if you are counting occurrences you would still get two.

This will occur any time a sequence can be overlapped with itself.


You're right, I didn't word that very clearly. I meant "more strings contain this substring", not "this substring occurs more times total" (which, as you say, would be the usual meaning of "matches happen more frequently").

Why care about strings containing a particular substring versus total number of occurrences? The article referenced this PDF: https://www.cs.cornell.edu/~ginsparg/physics/INFO295/mh.pdf, which is about how many coin flips it takes to see a run of N heads. That's really what the argument in the second half of my post is about, but it applies to the "how many strings contain this substring" problem too, and that one seemed simpler to draw a table for.
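For what it's worth, here's a tiny simulation sketch (mine, not from the PDF) of that "flips until a run of N heads" question; the well-known closed form for a fair coin is 2^(N+1) - 2:

  import random
  random.seed(1)

  def flips_until_run(n):
      # flip a fair coin until we've seen n heads in a row
      run = flips = 0
      while run < n:
          flips += 1
          run = run + 1 if random.random() < 0.5 else 0
      return flips

  N = 5
  trials = [flips_until_run(N) for _ in range(20000)]
  print(sum(trials) / len(trials), 2 ** (N + 1) - 2)  # both come out around 62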


You've answered a different question:

How likely are the substrings "11" and "10" to appear in binary strings of length 3?

rather than:

Is the sequence '111' equally likely to appear as '010', given that 1s and 0s are equally likely to appear?


Reminds me of a question an old tutor used to ask potential thesis candidates:

If I throw two dice, what's the probability I throw at least one six?


Am I being stupid, or is the answer 11/36? I.e., slightly less than 1/3?

Out of 36 possible throws of a pair of dice, 10 of them have a single six and 1 more has both sixes.


Seems correct. The probability of at least one 6 is one minus the probability that both aren't a 6, which is 5/6 * 5/6 = 25/36; so the result is (36-25)/36 = 11/36.
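Or as a quick enumeration check (my own sketch, same answer):

  from itertools import product
  rolls = list(product(range(1, 7), repeat=2))
  print(sum(1 for r in rolls if 6 in r), len(rolls))  # 11 36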


Why wouldn't it just be the probability of either die showing a 6, which would be 1/6 + 1/6 = 1/3? Genuinely curious; I was never super good at probability calculations.

edit: clarification of problem.


Picture flipping two coins and counting the H's: there are (2x2) = 4 outcomes, and ((2x2) x 1/2 x 2 flips) = 4 Heads in total.

HH, HT, TH, TT

However, if you count the outcomes with at least one H, there are (4-1) = 3 of them (HH, HT, TH), as the HH eats up 2 H's.

Now picture two 3-sided dice: that's (3x3) = 9 outcomes. You will see ((3x3) x 1/3 x 2 rolls) = 6 2's in total.

00, 01, 02, 10, 11, 12, 20, 21, 22

However, if you count the outcomes with at least one 2, there are (6 - 1) = 5: 02, 12, 20, 21, 22, as the 22 eats up an extra 2.

Now, what do you think happens with N-sided dice? What if N is 6?

PS: This seems less clear as I added more info...
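If it helps, here's the same counting written out as a quick sketch (mine) for an N-sided die:

  def compare(n, face):
      # all ordered outcomes of rolling two n-sided dice
      outcomes = [(a, b) for a in range(1, n + 1) for b in range(1, n + 1)]
      total_faces = sum((a == face) + (b == face) for a, b in outcomes)
      at_least_one = sum(1 for a, b in outcomes if face in (a, b))
      return total_faces, at_least_one

  print(compare(2, 1))  # (4, 3)   -- the coin example
  print(compare(3, 3))  # (6, 5)   -- the 3-sided example (counting the highest face)
  print(compare(6, 6))  # (12, 11) -- so P(at least one six) is 11/36, not 12/36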


Yeah, doing a brute-force count of the possibilities, the (6,6) option is double-counted by the 1/6 + 1/6 method as two distinct possibilities. Or that's what it looks like is happening, anyway. It's been too long since my college combinatorics class.


Imagine doing that with 7 dice... 7/6 probability!


Which suggests the probability of the event not happening would be -1/6, which suggests the quasiprobability of the event not not happening. Obviously no ambiguity there. /s


Because those are just numbers that work. The only way to roll a 2 (for example) is to roll 2 1's - which is obviously 1/36, not 1/3.


Sorry, the problem was the probability of rolling at least one 6, not of rolling a 6. Edited my original comment so it's clearer on its own, without the original post.


Because you're not guaranteed to roll a 6, even if you roll 6 times: 1/6 + 1/6 + 1/6 + 1/6 + 1/6 + 1/6 = 6/6 = 100% chance? Nope. :)


Nope, you're being clever.


could you explain? I got the same thing...


I think it was meant to be a direct answer to "Am I being stupid [...]?"


Yup... shows the value of HN votes...


Is the tutor fair and unbiased?


I wrote a quick Python snippet to compute the probabilities for all sequences of length 3 over 20 flips:

  [(x, sum(1 for i in range(2**20) if x in '{:020b}'.format(i)) / 2.0**20) for x in map('{:03b}'.format, range(8))]
The output shows a 79% chance for "000" and a 91% chance for "010". "000" and "111" are actually the least likely substrings, with "010" and "101" in second place.

Furthermore, "001" has a 97% chance of occurring. It's pretty easy to see why "001" is so much more likely than the others. Unless your string perfectly alternates 0s and 1s, you're going to get repeating pairs. What happens after you get a "00"? If there's ever a 1 again, there were at least two 0s before it.

So you're almost guaranteed a "001" after "00", unless it's near the very end of the string. You only have a 50% chance of getting "000" from these though. A 1 completely resets the search for "000", but a 0 keeps the search alive for "001".


Again: that looks for whether that sequence is in each tested string, which is a different problem.

It tests whether 'x' is 'in' the string at all, rather than counting the number of times it is seen.


But the question being tested is whether a given string is in the bits that have been transferred over the internet. The number of times it is seen is not what we're talking about here.


Yes, but the sequence of bits that has been transmitted over the Internet is finite. That makes panic’s argument technically true.

Since we’re dealing with a sequence that’s zettabytes long, I’d guess it falls under the category of technically true but irrelevant.


Wow. That's really nonintuitive to me. Like the Monty Hall problem: you have something that's obviously an irrelevant detail that can't affect the probability… except… somehow it does.


There are problems that I find even more counterintuitive than Monty Hall, such as the sister's paradox.

    Your neighbour has two children, one of which is a girl.
    What is the probability that the other child is a girl?
A. One may say 1/2, but actually, using basic Bayes' theorem: P(2 girls | at least 1 girl) = P(2 girls AND at least 1 girl) / P(at least 1 girl) = (1/4) / (3/4) = 1/3.

I think this problem is actually equivalent to the Monty Hall problem. But it can get much weirder.

    Your neighbour has two children, one of whom is a girl
    born on a Sunday. What is the probability that both
    children are girls?
A. Similarly, counting all possible cases, one does not get 1/2 or 1/3, but rather 13/27.
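Both answers are easy to confirm by brute force. A quick enumeration sketch of my own, assuming sexes and birth days are uniform and independent, with Sunday encoded as day 0:

  from itertools import product

  # Puzzle 1: two children, at least one is a girl.
  families = list(product(["boy", "girl"], repeat=2))
  with_girl = [f for f in families if "girl" in f]
  print(sum(f == ("girl", "girl") for f in with_girl) / len(with_girl))  # 1/3

  # Puzzle 2: two children, at least one is a girl born on a Sunday (day 0).
  kids = list(product(["boy", "girl"], range(7)))
  families = list(product(kids, repeat=2))
  with_sunday_girl = [f for f in families if ("girl", 0) in f]
  both_girls = sum(f[0][0] == "girl" and f[1][0] == "girl" for f in with_sunday_girl)
  print(both_girls / len(with_sunday_girl))  # 13/27, about 0.481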


  Your neighbour has two children, one of which is a girl.
  What is the probability that the other child is a girl?
This is oft-mentioned but with this wording it's ambiguous. Here are three possible interpretations:

1. Your neighbor has two children. We observed both and found that there was exactly one girl. What is the probability that both are girls? (Answer: 0)

2. Your neighbor has two children. We observed one and found that she was a girl. What is the probability that the other (unobserved) child is a girl? (Answer: 1/2)

3. Your neighbor has two children. We observed both and found that there was at least one girl. What is the probability that both are girls? (Answer: 1/3)

Interpretation #1 is kind of silly but it serves to illustrate the ambiguity of the plain English wording. Interpretation #2 is how most people interpret the problem, and it's bolstered by the use of the word "other" in your prompt. For there to be an "other" child, there must have been a specific girl observed originally, rather than a general observation that there was at least one girl. Interpretation #3 is how you intended the problem.


yup, you're right.


You are horribly butchering Bayes' Theorem. Both your examples are incorrect because your two events ('2 girls' and 'at least 1 girl') are not observed independently of one another, which is a necessary condition for using Bayes' Theorem with a valid result. Great tool but wrong job.


What? Bayes' theorem does not require anything about the events; if anything, it is most useful when the events are not independent. If the events are independent, Bayes' theorem isn't useful at all. https://en.wikipedia.org/wiki/Bayes%27_theorem#Statement_of_...

Edit: now that I look into it, though, I should have said "conditional probability" rather than Bayes'. I always mix the two up, my bad.


Looks like this is the Boy or Girl paradox [0], which has ambiguity problems.

[0] https://en.wikipedia.org/wiki/Boy_or_Girl_paradox


I love how Wikipedia helpfully informs us that 'Biologically, according to studies of the human sex ratio, it is more likely for any one person to be born into the male sex than the female sex.' as though it is somehow relevant or has anything to do with the actual paradox being discussed...

NOTE I've since edited the Wiki page, and deleted the sentence - since it isn't relevant and there was no citation for the claim!


> Your neighbour has two children, one of which is a girl. What is the probability that the other child is a girl?

The way you have phrased the question, the probability is 1/2.


Actually, it's ever so slightly higher than 1/2 because there is a non-zero chance of them being identical twins. If we assume an identical twin incidence rate of 0.4%, then the actual probability would be 63/125 that the second child is also a girl.


In the first case, I guess what's happening is that when we pick two people randomly with uniform distribution, we expect, with equal probability, any combination of boys and girls: MM MF FF FM. When we observe a girl, we can exclude the case of two boys, so the remaining cases to consider are MF FF FM, where obviously only 1/3 of the cases is two girls.

So I guess what makes this counterintuitive is that it looks as if independent variables are somehow affecting each other, but what's really happening is that we're drawing from a collection without replacement, and obviously the mix of the collection is going to change as we do so.

This should also apply at a world-wide, all-of-humankind scale: if you have met three women in a row, the fourth person you meet is (very slightly) more likely to be a man. Gambler's fallacy! Or just drawing without replacement. A roulette table is drawing with replacement. :)


But MF and FM in this case are the same?


No because children are distinguishable.

If you flip two coins, you get HH, HT, TH, and TT with equal probability. Try it.


But the sequence is irrelevant.

HT is equal to TH, because in this situation we only care whether the two coins are equal to each other or different from each other.


It's the percent chance that matters, so it doesn't matter if you can distinguish them. The cases still exist with equal chance.


To elaborate, the situation as stated in the question is currently Hx, so HH and HT are the only possible results.


You can calculate 50/50 but it's not by ignoring the difference between HT and TH.


The first child is not an influencing factor on the second child. It's either a boy or a girl, 50/50.


My favourite quote from the wonderful David Spiegelhalter, Winton Professor of the Public Understanding of Statistics:

> Why do people find probability unintuitive and difficult? Because it is unintuitive and difficult.


Another counterintuitive probabilistic property that may be relevant here:

- If you plot the distribution of the first digit of prices in a webshop, you will notice that 1 appears more frequently than the other digits.

- If you convert the prices to another currency and redo the distribution, the most frequent first digit will still be 1! (A quick simulation of both points is sketched at the end of this comment.)

So it's not a given that you can assume the properties of a random bit stream will hold for the calculation in the article.

(As an aside: this weird property explains why you need slightly fewer coins in a currency that has quarters than in one with 20-cent coins.)
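Here's that simulation, a sketch of my own that assumes hypothetical prices spread log-uniformly over a few orders of magnitude (which is what makes the Benford-like behaviour appear):

  import random
  from collections import Counter

  random.seed(0)
  # hypothetical prices spread over several orders of magnitude
  prices = [10 ** random.uniform(0, 4) for _ in range(100000)]

  def leading_digits(values):
      # values are all >= 1 here, so the first character is the leading digit
      return Counter(str(v)[0] for v in values)

  print(leading_digits(prices).most_common(3))
  print(leading_digits([p * 1.37 for p in prices]).most_common(3))  # arbitrary exchange rate
  # '1' is the most common leading digit both before and after conversion.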


The logic behind this is more or less the same idea behind the Knuth-Morris-Pratt algorithm (https://en.wikipedia.org/wiki/Knuth%E2%80%93Morris%E2%80%93P...), which might help with your intuition.
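In particular, KMP's prefix (failure) function is essentially the "how far does my counter fall back" table from the comment above. A minimal sketch (standard construction, my own code):

  def prefix_function(pattern):
      # pi[i] = length of the longest proper prefix of pattern[:i+1] that is
      # also a suffix of it, i.e. how much of a match survives a mismatch
      pi = [0] * len(pattern)
      for i in range(1, len(pattern)):
          k = pi[i - 1]
          while k > 0 and pattern[i] != pattern[k]:
              k = pi[k - 1]
          if pattern[i] == pattern[k]:
              k += 1
          pi[i] = k
      return pi

  print(prefix_function("11111"))  # [0, 1, 2, 3, 4] -- partial matches overlap heavily
  print(prefix_function("00110"))  # [0, 1, 0, 0, 1] -- only the trailing "0" can restart a match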


Aren't you miscounting though? 111 contains the 11 sequence twice.


Which is probably the key intuition here: the expected number of matches is exactly the same, but because the purely-repeated sequence can match multiple times within the same string, the total number of strings containing a match is smaller.
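That's easy to verify directly on the length-3 example from upthread (my own quick check):

  patterns = ["11", "10"]
  strings = [format(i, "03b") for i in range(8)]
  for p in patterns:
      occurrences = sum(s[i:i + 2] == p for s in strings for i in range(2))
      containing = sum(p in s for s in strings)
      print(p, occurrences, containing)
  # "11": 4 occurrences spread over 3 strings; "10": 4 occurrences over 4 strings.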


Sure, but the question is "how likely" not "how many times". Counting the expected number of times is a much easier problem because you can use the linearity of expectation.


But you're trying to find a bit-substring of length 2 that is least likely to have been transmitted.

If the bit strings of length 3 are chosen uniformly, 11 is less likely to appear than 10.


You are totally right. He is incorrect.


This is (almost) how Knuth–Morris–Pratt string search algorithm works: https://en.wikipedia.org/wiki/Knuth%E2%80%93Morris%E2%80%93P...


Another possible detail this misses - I learnt about De Bruijn sequences from Samy Kamkar here:

http://samy.pl/opensesame/

(scroll about 1/3rd way down to the "(U) The OpenSesame Attack" section for the bits relevant to this discussion...)

You don't need to enumerate every n-bit sequence, you just need to enumerate a (shorter - by 62% in Samy's 12 bit case) sequence that contains all the n-bit sequences.
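For anyone curious, here's one standard recursive construction of a binary De Bruijn sequence (my own sketch, not code from Samy's page); for n = 4 it packs all sixteen 4-bit windows into 19 bits instead of the 64 you'd need by naively concatenating every 4-bit string:

  def de_bruijn(n):
      # nodes are (n-1)-bit strings; each n-bit string is an edge traversed once
      seen = set()
      out = []
      def dfs(node):
          for bit in "01":
              edge = node + bit
              if edge not in seen:
                  seen.add(edge)
                  dfs(edge[1:])
                  out.append(bit)
      dfs("0" * (n - 1))
      return "".join(out) + "0" * (n - 1)

  s = de_bruijn(4)
  print(s, len(s))  # 19 bits
  print(all(format(i, "04b") in s for i in range(16)))  # True: every 4-bit string appears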


Interestingly, there was an old USAMTS problem that had De Bruijn sequences in it, though I wasn't aware of it at the time: http://www.usamts.org/Tests/Problems_25_1.pdf


This is the heart of the Boyer-Moore string searching algorithm too.


Does this still apply when your search space is on the order of thousands of petabytes, and the smallest unseen sequences are, according to the linked post, around 75 bits?


Wow, how surprising and, in a strange way, kind of beautiful! Thanks for this.


I'm reminded of constructing DFAs that represent regular expressions for class.


Yeah, you can see finite automata explicitly used in the PDF that's linked in the article itself: https://www.cs.cornell.edu/~ginsparg/physics/INFO295/mh.pdf


Yes, google "Penney ante".



