The server for our e-commerce website went down during a recent holiday and we lost a significant number of transactions.
What redundancy procedures have you implemented to protect against these kinds of issues?
After doing a bit of research on this topic, I'm left with even more questions. Switching servers via DNS would still leave our website down for a day or more for some users while the change propagates, and we operate on a relatively tight budget (a $50/mo. VPS).
I'm hoping someone can shed some light on this, and that I'm just ignorant of the obvious solution.
First, you can use a shared-IP (failover) setup. Basically, you have two VPSs (or servers), and if the first goes down, the second jumps in to take its place on the same IP address, so you don't have to change the DNS or anything.
Second, you can lower the TTLs on your DNS records so that when you do make changes, they take effect faster.
Third, you can have an offsite VPS mirror with a different hosting company ready to roll.
That's usually how I deal with the issue of high availability: I have two boxes with one company in a shared-IP setup so that if one goes down, the second just takes over. Then I have an off-site mirror with a different company, with my DNS TTLs set low enough that hopefully most people can get access within a couple of hours.
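To make the order of fallbacks concrete, here is a rough sketch in Python of how a monitoring script might decide what to do. It's only an illustration of the setup described above, not any hosting company's actual failover tooling: the function names, the action strings, and the choice of a plain TCP connect on port 443 as the health check are all my own assumptions. How you actually move the shared IP or update DNS depends entirely on your provider.

```python
import socket


def tcp_alive(host: str, port: int = 443, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    A plain TCP connect is an assumed, very basic health check; a real
    check would probably request a page and look at the response.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def choose_action(primary_up: bool, standby_up: bool, offsite_up: bool) -> str:
    """Pick the cheapest recovery step that should restore service.

    The returned strings are hypothetical labels, not real commands.
    """
    if primary_up:
        return "none"  # everything is fine, leave it alone
    if standby_up:
        # Same company, same IP: no DNS change needed.
        return "move-shared-ip-to-standby"
    if offsite_up:
        # Different company, different IP: this is where low TTLs matter.
        return "point-dns-at-offsite"
    return "all-down"
```

You would call it with the results of three health checks, e.g. `choose_action(tcp_alive(primary), tcp_alive(standby), tcp_alive(offsite))`, where the three hostnames are whatever your own boxes are called. The point of the ordering is that the DNS change is the last resort, because it's the slow one.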
The problem is that it all costs money. Rather than one server, you now have three. Costs have tripled to move you from, say, 99.5% reliability to 99.9%. So you're paying a lot more for a marginal improvement, and the question you have to ask is whether that improvement warrants paying triple. I think it's worth it - servers always seem to go down when I'm least available. So, that's my 2 cents.
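If it helps to see what those percentages mean in hours, here's a quick back-of-the-envelope calculation. The 99.5% and 99.9% figures are just the illustrative numbers from above, not measurements, and I'm assuming a 365-day year.

```python
HOURS_PER_YEAR = 365 * 24  # 8760, assuming a non-leap year


def downtime_hours_per_year(availability_percent: float) -> float:
    """Expected hours of downtime per year at a given availability."""
    return (1 - availability_percent / 100) * HOURS_PER_YEAR


for pct in (99.5, 99.9):
    print(f"{pct}% -> {downtime_hours_per_year(pct):.1f} hours/year")
```

That works out to roughly 43.8 hours a year at 99.5% versus roughly 8.8 hours at 99.9%, so the "marginal" 0.4% is about 35 fewer hours of downtime a year. Whether that's worth triple the cost depends on what an hour of downtime costs you, especially on a holiday.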