Well, at Linode you can't build a structure that's resilient to failure, as they have single points of failure within their own infrastructure - apart from anything else, all their London kit, for instance, lives in Telehouse East, in a few adjacent racks.
Once we'd upped sticks and moved to AWS, our first priority was to use their redundancy and failover to the fullest (six months of sleepless nights due to Linode made this foremost in our minds). As a result, nothing that's happened at AWS has ever been more than an inconvenience: we've managed five nines since the move, whereas before we managed one.
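To make the failover point concrete, one common AWS pattern is DNS failover via Route 53 health checks. The sketch below is a minimal illustration using boto3 and isn't necessarily what the parent commenter built; the hosted zone ID, record name, IPs and health check ID are all hypothetical placeholders.

```python
import boto3

# Minimal sketch of Route 53 DNS failover: a PRIMARY record backed by a
# health check, and a SECONDARY record served only when the primary fails.
# All identifiers below (zone ID, record name, IPs, health check ID) are
# hypothetical placeholders.
route53 = boto3.client("route53")

def upsert_failover_record(set_id, role, ip, health_check_id=None):
    record = {
        "Name": "app.example.com",
        "Type": "A",
        "SetIdentifier": set_id,
        "Failover": role,            # "PRIMARY" or "SECONDARY"
        "TTL": 60,
        "ResourceRecords": [{"Value": ip}],
    }
    if health_check_id:
        record["HealthCheckId"] = health_check_id
    route53.change_resource_record_sets(
        HostedZoneId="Z0000000EXAMPLE",
        ChangeBatch={"Changes": [{"Action": "UPSERT",
                                  "ResourceRecordSet": record}]},
    )

# Primary in one location, secondary in another; Route 53 flips DNS
# automatically when the primary's health check goes red.
upsert_failover_record("primary", "PRIMARY", "203.0.113.10", "hc-1234-example")
upsert_failover_record("secondary", "SECONDARY", "198.51.100.20")
```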
Only one nine? Even through this crap during the holidays I've managed three nines on my service hosted on several servers in Linode Dallas (the hardest-hit region in this DDoS attack). I would have moved to AWS by now if Linode's bandwidth weren't so much cheaper.
Yeah... our issues mostly arose from the fact that at the time, they were advertising 1Gbps node interconnect - which turned out to be 1Gbps HOST interconnect, with individual nodes throttled down to 50Mbps. We use memcached extensively, and this was absolutely crippling for performance. It didn't help that they furiously denied that they were throttling until we demonstrated it beyond all doubt.
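For anyone wanting to check this sort of thing themselves, a crude way to demonstrate a per-node cap is simply to push bulk traffic between two VPSes and time it (iperf does this properly; the pure-Python sketch below is a minimal stand-in, with the port and payload size picked arbitrarily).

```python
import socket, sys, time

# Crude node-to-node throughput test: run "python bw.py server" on one VPS
# and "python bw.py client <server-ip>" on another, then compare the
# measured rate against the advertised interconnect speed.
PORT = 5201
CHUNK = 64 * 1024
TOTAL = 512 * 1024 * 1024  # 512 MB of test traffic

def server():
    with socket.create_server(("0.0.0.0", PORT)) as srv:
        conn, addr = srv.accept()
        with conn:
            received = 0
            start = time.time()
            while True:
                data = conn.recv(CHUNK)
                if not data:
                    break
                received += len(data)
            secs = time.time() - start
            print(f"received {received / 1e6:.0f} MB from {addr[0]} "
                  f"at {received * 8 / secs / 1e6:.1f} Mbps")

def client(host):
    payload = b"\x00" * CHUNK
    start = time.time()
    with socket.create_connection((host, PORT)) as conn:
        sent = 0
        while sent < TOTAL:
            conn.sendall(payload)
            sent += len(payload)
    secs = time.time() - start
    print(f"sent {sent / 1e6:.0f} MB at {sent * 8 / secs / 1e6:.1f} Mbps")

if __name__ == "__main__":
    server() if sys.argv[1] == "server" else client(sys.argv[2])
```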
They did obligingly increase these caps when we begged them to, but by that point the writing was on the wall, and we kept bumping into other weird and wonderful limitations and issues, such as the fact that someone running an intensive job on the same host could bring our VPSes to an absolute crawl.
It really is a shame; we desperately wanted to make it work, as we liked Chris's hands-on approach (very much like ours). But by the time things started going genuinely wrong on their end, our confidence in them was so eroded that we had no choice but to leave.
As I said, we kept random small single-server stuff (WordPress sites mostly) there, since if you're not dealing with their networking, performance is generally OK - but the network limitations were the absolute clincher for us, and at one point it was literally every day that we'd find one of their switches had broken, or we couldn't ARP IPs for no apparent reason, and so on.
ATL was much worse than Dallas this time around. You did not get three nines in the last month if your service was located solely in ATL. They were completely null-routed for the better part of two days.
Strange how this works. I moved everything to cloud providers because of availability issues. Now a few years later, I'm moving everything back to dedicated hardware in several different datacenters because of availability issues. Thanks, Docker!! <3
We are leasing hardware, a lot of it from OVH (in addition to a number of smaller providers). OVH does a good job, and if you build your infrastructure to load-balance around failure (which doesn't happen often), you'll hardly notice it. Often we get an email from OVH telling us that support techs have been automatically dispatched to a node before our own monitoring has even alerted on it. Docker is basically a miracle drug for this kind of thing. Our hardware costs are less than half of what they were with EC2. Huge savings.
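As an illustration of "load-balance around failure", the sketch below shows the basic pattern in plain Python: health-check a pool of backend nodes and only route requests to the ones that respond. The hostnames, port and health endpoint are hypothetical, and in practice this job is usually handled by haproxy/nginx or a Docker orchestrator rather than hand-rolled code.

```python
import random
import urllib.error
import urllib.request

# Hypothetical backend pool spread across datacenters; any node can drop out.
BACKENDS = [
    "http://node1.dc1.example.net:8080",
    "http://node2.dc2.example.net:8080",
    "http://node3.dc3.example.net:8080",
]

def is_healthy(base_url, timeout=2):
    """A node counts as healthy if its /health endpoint answers 200 in time."""
    try:
        with urllib.request.urlopen(base_url + "/health", timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        return False

def pick_backend():
    """Route only to nodes that pass the health check; failed nodes are skipped."""
    healthy = [b for b in BACKENDS if is_healthy(b)]
    if not healthy:
        raise RuntimeError("no healthy backends - page someone")
    return random.choice(healthy)

if __name__ == "__main__":
    # Each request transparently avoids whichever node is currently down.
    print("routing request to", pick_backend())
```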