Nice article. It reminds me of my year living in London, and taking the bus ever...

asdkhadsj · on Oct 28, 2018

Wow, interesting idea!

Imagine if (in the future) some item like a phone could detect this information around you and automatically record it. Forming games on top of this life data would be weird, neat, fun and sad all at the same time. Imagine seeing a real example of where someone else is just more lucky than you are in stupid but impactful (on your morale) ways.

If it didn't seem so tedious to track, I'd love to implement an app to record this info. Unfortunately no one I know would care, and I'm sure I'd get too lazy to keep it accurate. Neat nonetheless, thanks for the cool thoughts :)

adrianN · on Oct 29, 2018

Similar to the recruiters that throw away the top half of the application stack because they don't want unlucky people in their company, I could see such data becoming valuable to some people.

bostonpete · on Oct 28, 2018

Your use of "of course" seems to imply that there's some statistical reason that the probability of the next bus being inbound vs outbound wouldn't be equal. Is there? If so, it seems like it must be a different reason than the one in the article. What am I missing...?

herodotus · on Oct 28, 2018

Because we might score -1, -2 or worse if 2 or 3 busses went in the other direction before ours came, but if ours came first, we score 1. We get on the bus and thus don’t know if another one or more arrives first on our side.

repsilat · on Oct 29, 2018

This reminds me of a mathematical paradox that makes me doubt your conclusion: "In this country, every couple wants to have one daughter. They keep having children until they have a daughter, and then they stop. What gender balance should we expect?"

Couples can have any number of sons, and every couple has exactly one daughter. Still, the accepted mathematical solution is an equal gender ratio for the couples' children.

Terr_ · on Oct 29, 2018

I think the "paradox" comes from how people implicitly assume "any number of sons" is somehow distributed or weighted in a way that favors numbers of 1 or above.

In contrast, "0 sons" is going to describe a full half of all marriages.

jon_richards · on Oct 29, 2018

Same situation with the bus.

pwagland · on Oct 29, 2018

Not really. In the son/daughter case, the calculations are:

expected daughters: 1

expected sons: 1/2 * 0 + 1/4 * 1 + 1/8 * 2 + 1/16 * 3 + 1/32 * 4 + …

So number of expected daughters = 1, number of expected sons = 1. In practice since women can't have an infinite number of children, then this wouldn't be an infinite series, so the real number of expected boys would be lower than one, but there you go…

Now, for the bus case, you get +1 if your bus turns up first, and -1 for every other bus that turns up first. Assume that it is completely random, then:

expected + score is: 1/2 * 1

expected - score is: 1/4 * -1 + 1/8 * -2 + 1/16 * -3 + …

Expected + is 0.5, expected - is -1.
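For anyone who wants to check the two series numerically, here is a minimal sketch. It assumes each bus (or child) is an independent 50/50 event, so "exactly k of the other kind come first" has probability (1/2)^(k+1). The function names are made up for illustration.

```python
from fractions import Fraction


def expected_sons(max_sons):
    """Partial sum of k * (1/2)^(k+1): k sons followed by one daughter."""
    return sum(Fraction(k, 2 ** (k + 1)) for k in range(max_sons + 1))


def expected_bus_score(max_wrong):
    """Return (expected +, expected -, total) for the per-bus scoring.

    +1 with probability 1/2 (our bus comes first), otherwise -k when
    k >= 1 wrong-way buses come first, with probability (1/2)^(k+1).
    """
    plus = Fraction(1, 2)
    minus = -sum(Fraction(k, 2 ** (k + 1)) for k in range(1, max_wrong + 1))
    return plus, minus, plus + minus


if __name__ == "__main__":
    print(float(expected_sons(60)))
    print([float(x) for x in expected_bus_score(60)])
```

The partial sums approach 1 for the sons and 0.5, -1 and -0.5 for the bus game as the cutoff grows.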

repsilat · on Oct 29, 2018

I guess it would be balanced if the rule was,

> 1 when your bus turns up, -1 for every bus going the other way.

selestify · on Nov 6, 2018

Isn't that the same as what the OP's rules were?

pwagland · on Oct 30, 2018

Correct.

DalekBaldwin · on Oct 29, 2018

The expected number of sons is 1, and the expected number of daughters is 1 (by the framing of the problem, in every possible scenario, there is exactly one daughter), but the expected value of the ratio is not 1:1. E[X]/E[Y] = E[X/Y] is not a valid identity.

http://www.thebigquestions.com/2010/12/22/a-big-answer-2/
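A rough simulation of the distinction, as a sketch only: it assumes independent 50/50 births and no cap on family size, and it looks at the per-family fraction of girls, which is a simpler illustration than the finite-population argument in the link. The function name is hypothetical.

```python
import math
import random


def simulate_families(n_families, seed=0):
    """Each family has children until the first daughter, then stops.

    Returns (mean sons per family,
             pooled fraction of girls over all children,
             mean of the per-family fraction of girls).
    """
    rng = random.Random(seed)
    total_sons = 0
    sum_of_fractions = 0.0
    for _ in range(n_families):
        sons = 0
        while rng.random() < 0.5:  # a boy with probability 1/2
            sons += 1
        total_sons += sons
        sum_of_fractions += 1 / (sons + 1)  # exactly one girl per family
    pooled = n_families / (n_families + total_sons)
    return total_sons / n_families, pooled, sum_of_fractions / n_families


if __name__ == "__main__":
    mean_sons, pooled, per_family = simulate_families(100_000)
    print(mean_sons, pooled, per_family, math.log(2))
```

Under these assumptions the mean number of sons comes out near 1 and the pooled fraction near 1/2, while the average per-family fraction of girls lands near ln 2 (about 0.69), which is the E[X/Y] vs E[X]/E[Y] gap in miniature.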

Pristina · on Oct 29, 2018

I read that and it seems wrong. The question asked "what fraction of the population is female", but his argument is that 3 families of 4 girls and 1 family of 12 boys make the fraction of girls in the average family 75% (the average of 100% x3 and 0% x1), which is nonsensical to me.

>E[X]/E[Y] = E[X/Y] is not a valid identity.

is completely irrelevant here because it is being used to point out that a non-answer is wrong.

DalekBaldwin · on Nov 1, 2018

He starts with a simpler problem to elucidate the principle that is key to the original problem. Do continue reading.

theoh · on Oct 28, 2018

It is impossible to catch more than one inbound bus on any given occasion, whereas any number of outbound buses might pass.

BTW, on a slightly unrelated point, if there's no timetable, but the interval between buses is maintained reliably, the waiting time is uniformly distributed over that interval.

If you have to get a second bus, you need to convolve two of those uniform distributions to find out the distribution of overall journey times. This is a trapezoidal distribution (triangular when the two intervals are equal), which is just about analytically manageable.

But a journey with two transfers (3 buses in total) results in a likely overall time distributed according to a uniform distribution convolved with a trapezoidal distribution, which is a very weird non-smooth shape. You can see why people choose to model distributions with Gaussians, which are well-behaved (convolve two Gaussians, get another Gaussian). The Gaussian just lends itself ideally to recursive applications, hence recursive filtering (e.g. Kalman filters).
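To make the convolution idea concrete, here is a small discrete sketch. It assumes waits are whole minutes and uses made-up intervals of 10 and 4 minutes; none of these numbers come from the article.

```python
def uniform_pmf(n):
    """Wait of 0..n-1 minutes, each equally likely."""
    return [1.0 / n] * n


def convolve(p, q):
    """Distribution of the sum of two independent discrete waits."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


one_wait = uniform_pmf(10)                    # hypothetical 10-minute interval
two_waits = convolve(one_wait, one_wait)      # equal intervals: triangle
mixed = convolve(one_wait, uniform_pmf(4))    # unequal intervals: trapezoid
three_waits = convolve(two_waits, one_wait)   # piecewise-quadratic bump

if __name__ == "__main__":
    for name, pmf in [("two", two_waits), ("mixed", mixed), ("three", three_waits)]:
        print(name, [round(v, 3) for v in pmf])
```

With equal intervals the two-bus total peaks at a single point, with unequal intervals it has a flat top, and the three-bus total is already visibly bell-like.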

LolWolf · on Oct 28, 2018

Also, gaussians are great approximations for large n, too, since the convolution of any distribution with finite variance with itself n times (for n "large enough") is close to gaussian (by the CLT). More generally, there are very nice error estimates for many distributions.

I suspect this analysis can be carried out and yield quite good results in the gaussian case (a careful analysis might even yield error bounds on the result).

theoh · on Oct 28, 2018

Yes. If you spend your whole life on one long multi-transfer bus journey, you'll end up with a gaussian.

It's a bit less clear that gaussians should be used when e.g. fitting a coordinate to an astronomical feature, which might not actually be symmetrical.

The other useful property that the gaussian has is its separability in the 2D case: it is the only rotationally symmetric function that factors into a product of 1D functions. That counts for a lot.

LolWolf · on Oct 29, 2018

Eh, I don’t think that many are required. Convergence to a Gaussian is pretty fast (you should check out page 299 of [0]); at four or five, a Gaussian is already quite a good approximation.

———

[0] https://www.dartmouth.edu/~chance/teaching_aids/books_articl...
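One way to get a feel for the speed of convergence, as a sketch and not taken from the linked book: compare the exact CDF of a sum of n independent Uniform(0, 1) waits (the Irwin-Hall distribution) with a normal CDF of matching mean n/2 and variance n/12. The grid size is an arbitrary choice.

```python
import math


def irwin_hall_cdf(x, n):
    """Exact CDF of the sum of n independent Uniform(0, 1) variables."""
    if x <= 0:
        return 0.0
    if x >= n:
        return 1.0
    total = sum(
        (-1) ** k * math.comb(n, k) * (x - k) ** n
        for k in range(math.floor(x) + 1)
    )
    return total / math.factorial(n)


def normal_cdf(x, mean, std):
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))


def max_cdf_gap(n, steps=2000):
    """Largest CDF difference on an evenly spaced grid over [0, n]."""
    mean, std = n / 2, math.sqrt(n / 12)
    return max(
        abs(irwin_hall_cdf(n * i / steps, n) - normal_cdf(n * i / steps, mean, std))
        for i in range(steps + 1)
    )


if __name__ == "__main__":
    for n in range(1, 6):
        print(n, max_cdf_gap(n))
```

The gap shrinks quickly as n goes from 1 to 5, which is at least consistent with "four or five" being enough for many practical purposes when the underlying waits are uniform.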

dmurray · on Oct 28, 2018

It's not the next bus, it's the number of buses going the other way, which will on average be greater than 0.5 by the same reasoning given in the article.

It's also possible for the result to be biased because of scheduling. If inbound buses pass every 10 minutes at 16.00, 16.10, 16.20,... and outbound buses at 16.01, 16.11, 16.21, ... you'll usually see an inbound bus first. Though I expect this was not the case here.
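A quick sketch of that scheduling effect, assuming you arrive at a uniformly random moment and both directions run every 10 minutes with the outbound bus one minute after the inbound one. The function names and defaults are illustrative only.

```python
import random


def first_bus(arrival, inbound_offset=0.0, outbound_offset=1.0, period=10.0):
    """Which direction shows up first for someone arriving at `arrival` minutes."""
    wait_in = (inbound_offset - arrival) % period
    wait_out = (outbound_offset - arrival) % period
    return "inbound" if wait_in <= wait_out else "outbound"


def inbound_first_fraction(trials=100_000, seed=0):
    """Fraction of uniformly random arrivals that see an inbound bus first."""
    rng = random.Random(seed)
    hits = sum(first_bus(rng.uniform(0.0, 10.0)) == "inbound" for _ in range(trials))
    return hits / trials


if __name__ == "__main__":
    print(inbound_first_fraction())
```

Under these assumptions only people arriving in the one-minute window between the two buses see the outbound one first, so the inbound bus wins about nine times out of ten.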

hanoz · on Oct 29, 2018

This is pretty counterintuitive. In the game described you should expect to see one 'wrong way' bus per play on average, not half as you might expect. On the other hand, you have an exactly even chance of catching your own bus before seeing a wrong one, so if your scoring system had been +1 for your bus and -1 for one or more wrong ones, then you would indeed score 0 over time. But with your point per bus scoring system your expected score turns out to be -0.5 per play.
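A small Monte Carlo sketch of both scoring systems, assuming every bus is independently 50/50 inbound or outbound (the seed and play count are arbitrary):

```python
import random


def wrong_buses_before_ours(rng):
    """Count wrong-way buses seen before our own bus arrives."""
    wrong = 0
    while rng.random() < 0.5:  # next bus goes the wrong way
        wrong += 1
    return wrong


def simulate(plays=200_000, seed=0):
    """Return (mean wrong buses, mean per-bus score, mean capped score)."""
    rng = random.Random(seed)
    wrong_total = 0
    per_bus = 0   # +1 if ours is first, otherwise -1 per wrong bus
    capped = 0    # +1 if ours is first, otherwise -1 regardless of count
    for _ in range(plays):
        wrong = wrong_buses_before_ours(rng)
        wrong_total += wrong
        per_bus += 1 if wrong == 0 else -wrong
        capped += 1 if wrong == 0 else -1
    return wrong_total / plays, per_bus / plays, capped / plays


if __name__ == "__main__":
    print(simulate())
```

With those assumptions the averages come out close to 1 wrong-way bus per play, about -0.5 for the point-per-bus score, and about 0 for the capped score.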