
I believe the right way to think about it isn't in terms of error bars, but in terms of the entire probability distribution: what's the probability that, if everybody voted, the upvote/downvote ratio would be 75/25, 80/20, 85/15, etc.? Once you've figured out the probability distribution, you can calculate error bars any way you like (e.g. a 95% credible interval).

The beta distribution is one model you can choose for that probability distribution, which happens to have some nice properties that make it easy to work with.
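To make the "easy to work with" part concrete, here's a rough sketch in Python. The function names and the 8 up / 2 down vote counts are made up for illustration, and the interval is approximated by sampling with the standard library rather than computed exactly. The key property is that a beta prior plus observed votes gives another beta distribution: you just add the upvotes and downvotes to the two parameters.

```python
import random

def posterior_params(upvotes, downvotes, prior_a=1.0, prior_b=1.0):
    """Beta prior + observed votes -> Beta posterior (conjugate update)."""
    return prior_a + upvotes, prior_b + downvotes

def credible_interval(a, b, level=0.95, n_samples=100_000, seed=0):
    """Approximate a central interval by sampling from Beta(a, b)."""
    rng = random.Random(seed)
    samples = sorted(rng.betavariate(a, b) for _ in range(n_samples))
    tail = (1.0 - level) / 2.0
    lo = samples[int(tail * n_samples)]
    hi = samples[int((1.0 - tail) * n_samples) - 1]
    return lo, hi

# Hypothetical example: 8 upvotes, 2 downvotes, flat beta(1,1) prior.
a, b = posterior_params(upvotes=8, downvotes=2)  # posterior is beta(9, 3)
mean = a / (a + b)                               # posterior mean upvote share
lo, hi = credible_interval(a, b)
print(f"beta({a:g}, {b:g}): mean {mean:.3f}, 95% interval ({lo:.3f}, {hi:.3f})")
```

If you have SciPy available, I believe `scipy.stats.beta.ppf` would give you the interval endpoints directly instead of sampling.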

The other question is, what's the "zero knowledge" probability distribution? I think your "0 votes with an error bar +/- the number of possible votes" would translate to "uniform probability of any result", which I think is beta(1,1).
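A quick check that beta(1,1) really is the flat distribution, using the standard beta density (the symbols p, α, β here are my notation, not something from the thread):

```latex
f(p;\alpha,\beta) = \frac{p^{\alpha-1}(1-p)^{\beta-1}}{B(\alpha,\beta)},
\qquad
B(\alpha,\beta) = \frac{\Gamma(\alpha)\,\Gamma(\beta)}{\Gamma(\alpha+\beta)}

B(1,1) = \frac{\Gamma(1)\,\Gamma(1)}{\Gamma(2)} = 1
\quad\Longrightarrow\quad
f(p;1,1) = p^{0}(1-p)^{0} = 1 \quad \text{for } 0 \le p \le 1
```

So every upvote fraction p between 0 and 1 gets the same density, which matches "uniform probability of any result".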

Depending on the scenario, though, you might look at the data and observe that extreme values are very uncommon, and therefore start with something like beta(2,2) instead (a symmetric hump peaked at 50/50 rather than a flat distribution). That has minimal impact once you have lots of real upvote/downvote data, but it makes a huge difference to how the first few votes are interpreted.
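Here's a small sketch of how much the choice of prior matters, comparing the posterior mean under beta(1,1) and beta(2,2). The vote counts are invented for illustration and `posterior_mean` is just a helper name I picked.

```python
def posterior_mean(upvotes, downvotes, prior_a, prior_b):
    """Mean of the beta posterior after adding the votes to the prior."""
    a = prior_a + upvotes
    b = prior_b + downvotes
    return a / (a + b)

# Hypothetical vote counts: a single vote, a few votes, and lots of votes.
for up, down in [(1, 0), (3, 0), (300, 100)]:
    flat = posterior_mean(up, down, 1, 1)    # beta(1,1) prior
    humped = posterior_mean(up, down, 2, 2)  # beta(2,2) prior
    print(f"{up} up / {down} down: flat {flat:.3f}, humped {humped:.3f}")
```

With a single upvote the two priors give 2/3 versus 3/5, while at 300 up / 100 down they differ by only about 0.001.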



Right, that sounds more like what I meant. Still familiarizing myself with the terminology, thanks!



