
We aren't using hosted Postgres (much, yet). We provision EC2 instances and manage Postgres on them ourselves. Failover is scripted, and manually invoked as needed.

None of us trust any of the automated failover solutions enough to use them. We want human judgement in that loop, even if it means being woken at 3AM to push the button. It's that hard to get right.

Just one incident like The Fine Article's amounts to well more downtime than our entire infrastructure has had in the rolling year, and we have hundreds of Postgres instances.

Done wrong, automated failover is a net increase in risk. And, in case my thesis is somehow unclear, it's hard to get right.



