I'll add my own Bayesian analysis to the fray. Assuming a binomial likelihood with a uniform prior, in Julia:

    using Distributions

    b_old = Beta(66+1, 6392-66+1)   # posterior for the old variant: 66 conversions out of 6392
    b_yel = Beta(83+1, 6362-83+1)   # posterior for "yellow": 83 conversions out of 6362

    N = 1_000_000
    # Estimate P(old > yellow) as the fraction of posterior draws where old beats yellow
    sum(rand(b_old, N) .> rand(b_yel, N)) / N

This yields a 7.7% chance that the old one is better than "yellow". It's fascinating to see how we can get such different answers to a simple question.
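
As a cross-check, the same probability can be computed without sampling noise by integrating one posterior against the other's CDF. A minimal sketch, assuming QuadGK.jl is available (it isn't used above):

    using Distributions, QuadGK

    b_old = Beta(66+1, 6392-66+1)
    b_yel = Beta(83+1, 6362-83+1)

    # P(old > yellow) = ∫ pdf(b_old, x) * cdf(b_yel, x) dx over [0, 1]
    p, err = quadgk(x -> pdf(b_old, x) * cdf(b_yel, x), 0, 1)

This should land right around the sampled 7.7%, just without the Monte Carlo variance.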



Thanks for this analysis.

I did an A/B test on an older framework that didn't automate statistical significance at all, but the website was getting 2000-3000 orders per day, so after a single week we had enough data to see that sales had increased by 36% (we reduced the page's load time by almost half, changed the checkout to use Ajax, and made a few other small changes) without needing to quantify anything formally. In fact, at the time I didn't even know what "statistical significance" was... not that I know much more about statistics now than I did then.
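
Out of curiosity, the parent's Beta-posterior approach shows why that kind of volume settles the question so quickly. A rough sketch with purely invented counts (the real visitor numbers aren't given here), scaled to roughly a week at a few thousand orders a day and a ~36% relative lift:

    using Distributions

    # Invented, illustrative counts only: ~350k visitors per variant over a week,
    # with the new page converting about 36% better (4.0% vs ~5.4%)
    b_old_page = Beta(1 + 14_000, 1 + 350_000 - 14_000)
    b_new_page = Beta(1 + 19_000, 1 + 350_000 - 19_000)

    N = 1_000_000
    # P(new > old); at these volumes the posteriors barely overlap
    sum(rand(b_new_page, N) .> rand(b_old_page, N)) / N

With numbers anywhere in that range the answer is effectively 1.0, which matches the intuition that no formal test was needed.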

Anyway, in theory all of the exact p-values and so on matter, but in practice the bottom line is all that matters, because the p-values can change in a moment based on something I haven't factored in. That's where, at least for the time being, intuition still plays a large part in actually being correct, which is why we still have people with repeated successes.


Good point: there are always unmodeled factors and prior information, so the percentage has to be interpreted in context.
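
Folding prior information in is just a matter of changing the Beta parameters. A quick sketch with a purely hypothetical prior (say earlier data had suggested conversion rates near 1%):

    using Distributions

    # Hypothetical prior only: roughly "1% conversion, worth ~1000 prior observations",
    # i.e. Beta(10, 990) in place of the uniform Beta(1, 1) used above
    prior_a, prior_b = 10, 990

    b_old = Beta(prior_a + 66, prior_b + 6392 - 66)
    b_yel = Beta(prior_a + 83, prior_b + 6362 - 83)

    N = 1_000_000
    sum(rand(b_old, N) .> rand(b_yel, N)) / N

The specific numbers don't matter; the point is that the 7.7% shifts as soon as you encode what you already believed going in.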


Why is that answer so different? A 7.7% chance that the old one is better is the same as saying there's a 90%+ chance that the new one is better, which is also his answer.


I get about the same result using my favorite online Bayesian split test calculator: http://www.peakconversion.com/2012/02/ab-split-test-graphica...



