Hacker News

Besides complexity, a price you pay with multi-armed bandits (MABs) is that you learn less about the non-optimal options: as your confidence grows that an option is not the best, you run fewer samples through it. It turns out the people running these experiments are often not satisfied to learn "A is better than B." They want to know "A is 7% better than B," but a MAB system will only run enough B samples to support the first statement.
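A minimal Thompson-sampling sketch illustrates the point. The arm rates (0.10 vs. 0.07) and round count are assumptions for illustration; the bandit quickly starves the losing arm, so B's estimated rate ends up far noisier than a 50/50 A/B split would give:

```python
import random

random.seed(0)

# Assumed true conversion rates for illustration: A is the better arm.
RATES = [0.10, 0.07]
ROUNDS = 10_000

# Beta(1, 1) priors for each arm (Thompson sampling).
successes = [1, 1]
failures = [1, 1]
pulls = [0, 0]

for _ in range(ROUNDS):
    # Draw a plausible rate for each arm from its posterior,
    # then pull the arm whose draw is highest.
    draws = [random.betavariate(successes[i], failures[i]) for i in (0, 1)]
    arm = 0 if draws[0] >= draws[1] else 1
    pulls[arm] += 1
    if random.random() < RATES[arm]:
        successes[arm] += 1
    else:
        failures[arm] += 1

estimates = [successes[i] / (successes[i] + failures[i]) for i in (0, 1)]
print("pulls per arm:", pulls)        # B gets a small fraction of the traffic
print("estimated rates:", [round(e, 3) for e in estimates])
```

With so few pulls of B, its posterior stays wide: the bandit confidently tells you which arm won, but not by how much.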
