
If you aim to make inferences about which ideas work best, you should pick a sample size prior to the experiment and run the experiment until the sample size is reached.

That's not a very Bayesian thing to say. It doesn't matter what sample size you decided to pick at the beginning. A Bayesian method should yield reasonable results at every step of the experiment, and it allows you to keep on testing until you feel comfortable with the posterior probability distributions.

If 10 customers have converted so far and 30 haven't, then you would expect the conversion rate to be somewhere between 10% and 40%, as shown by this graph of the Beta(10, 30) distribution:

http://www.wolframalpha.com/input/?i=plot+BetaDistribution+1...
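
For anyone who would rather check the 10% to 40% range numerically than read it off a plot, here is a minimal sketch using only the Python standard library. It estimates a central 95% interval of Beta(10, 30) by sampling. The function name, the 95% level, the number of draws and the seed are my own choices for illustration. The parameters follow the comment above; starting from a uniform prior would add one to each of them.

```python
import random

def beta_credible_interval(successes, failures, mass=0.95, draws=100_000, seed=0):
    """Estimate a central credible interval of Beta(successes, failures) by sampling."""
    rng = random.Random(seed)
    # Draw from the Beta distribution and sort so quantiles can be read by index.
    samples = sorted(rng.betavariate(successes, failures) for _ in range(draws))
    tail = (1.0 - mass) / 2.0
    low = samples[int(tail * draws)]
    high = samples[int((1.0 - tail) * draws) - 1]
    return low, high

low, high = beta_credible_interval(10, 30)
print(f"95% interval for the conversion rate: {low:.2f} to {high:.2f}")
```

The interval is a Monte Carlo estimate, so the last digit can move slightly with a different seed or number of draws.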

You then do the same with method B, and stop testing once the overlap between the two probability distributions looks small enough.
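
If eyeballing the overlap feels too loose, one possible numerical stand-in is the posterior probability that B's rate exceeds A's. This is a sketch under my own assumptions, not the only way to measure overlap: the function name is made up, and the counts for method B (20 converted, 20 not) are hypothetical numbers chosen purely for illustration.

```python
import random

def prob_b_beats_a(conv_a, miss_a, conv_b, miss_b, draws=100_000, seed=0):
    """Monte Carlo estimate of P(rate_B > rate_A) under independent Beta posteriors."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(draws):
        # One plausible conversion rate for each method, drawn from its posterior.
        rate_a = rng.betavariate(conv_a, miss_a)
        rate_b = rng.betavariate(conv_b, miss_b)
        if rate_b > rate_a:
            wins += 1
    return wins / draws

# Method A: 10 converted, 30 did not (from the comment above).
# Method B: 20 converted, 20 did not (hypothetical).
print(prob_b_beats_a(10, 30, 20, 20))
```

A value near 0.5 means the two distributions overlap heavily; a value near 0 or 1 means they are well separated. What counts as "small enough" overlap is still a judgment call.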

Anscombe's rule is interesting, but it seems rather critically dependent on the number of future customers, which is hard to estimate. The advantage of the visual approach outlined above is that it's more intuitive, and people can use their best judgment to decide whether to keep on testing or not.

Disclaimer: I am not an A/B tester.




This way of framing the problem is known as the bandit problem. You can find lots of papers about it (Bayesian and frequentist). As others have mentioned in this thread, we have a startup providing bandit algorithms as SaaS: http://mynaweb.com/
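
To make the bandit framing concrete, here is a minimal sketch of Thompson sampling, one well-known Bayesian bandit strategy, for two page variants. It is an illustration only and makes no claim about what Myna actually runs. The true conversion rates (5% and 15%), the uniform Beta(1, 1) prior, the number of rounds and the function name are all assumptions of mine.

```python
import random

def thompson_sampling(true_rates, rounds=2000, seed=0):
    """Simulate a Bernoulli bandit and return how often each arm was shown."""
    rng = random.Random(seed)
    wins = [0] * len(true_rates)
    losses = [0] * len(true_rates)
    pulls = [0] * len(true_rates)
    for _ in range(rounds):
        # Sample a plausible rate per arm from its Beta posterior
        # (uniform prior, so the posterior is Beta(1 + wins, 1 + losses)).
        sampled = [rng.betavariate(1 + w, 1 + l) for w, l in zip(wins, losses)]
        arm = sampled.index(max(sampled))
        pulls[arm] += 1
        # Simulate whether this visitor converts on the chosen arm.
        if rng.random() < true_rates[arm]:
            wins[arm] += 1
        else:
            losses[arm] += 1
    return pulls

pulls = thompson_sampling([0.05, 0.15])
print(pulls)  # how many visitors saw each variant
```

The algorithm keeps testing and exploiting at the same time: an arm is shown roughly in proportion to the current posterior probability that it is the best one, so traffic drifts toward the better variant as evidence accumulates.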


I've looked at your website, and from what I gathered, I would make the same criticism as I made of Anscombe's rule: it's not at all easy to decide what the rewards should be, or how to put a price on exploration vs. exploitation. The more I think about it, the more I feel that an engineer looking at Beta distributions could weigh the trade-offs and make a better decision than a black-box algorithm with inadequate assumptions.

Granted, this doesn't really scale to testing many combinations of features, and I think I can see what you're shooting for. Best of luck with Myna.



