I'm impressed with how nicely this handles the whole A/B testing stack. But the statistics have some potentially important shortcomings -- I posted a GitHub issue about them:
https://github.com/maccman/abba/issues/3
(Full disclosure: I authored the ABBA library mentioned by matsiyatzy, so this is incidentally self-promotional. But I wrote that tool with the hopes of improving the use and interpretation of A/B test statistics in the wider community :)