Power outage in Linode Fremont Datacenter

derrickpetzold · on Nov 21, 2010

My linode has been down for over an hour and still counting. They just went under three nines in one day...
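
For reference, a rough sketch of the arithmetic behind "three nines". It assumes the phrase means 99.9% availability and that the measurement window is either a single day or a 365-day year; the comment does not say which window it has in mind, so both are shown. The function names are made up for illustration.

```python
def allowed_downtime_hours(availability_pct, window_hours):
    """Hours of downtime permitted in a window at the given availability."""
    return (1 - availability_pct / 100) * window_hours


def availability_pct(downtime_hours, window_hours):
    """Availability over a window, expressed as a percentage."""
    return (1 - downtime_hours / window_hours) * 100


DAY = 24          # hours in the assumed one-day window
YEAR = 365 * 24   # hours in the assumed one-year window (non-leap year)

# Downtime budget at 99.9%
print(round(allowed_downtime_hours(99.9, DAY) * 60, 2), "minutes per day")
print(round(allowed_downtime_hours(99.9, YEAR), 2), "hours per year")

# Availability after a single one-hour outage
print(round(availability_pct(1, DAY), 2), "% over one day")
print(round(availability_pct(1, YEAR), 3), "% over one year")
```

Under these assumptions, three nines allows about 1.44 minutes of downtime per day or 8.76 hours per year, so an hour-long outage breaks the daily figure but not yet the yearly one.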

liuliu · on Nov 21, 2010

I noticed this an hour later because my monitor script was set up in the same datacenter. Lesson learned.
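
A minimal sketch of the kind of check that avoids this trap, assuming it is run from a machine outside the datacenter being monitored (a home box, or a VPS at another provider). The function name, the `opener` parameter, and the example URL are illustrative and not taken from anyone's actual script.

```python
import urllib.error
import urllib.request


def is_up(url, timeout=10, opener=urllib.request.urlopen):
    """Return True if the URL answers with a 2xx/3xx status within the timeout.

    `opener` is injectable so the function can be exercised without a network.
    """
    try:
        with opener(url, timeout=timeout) as response:
            return 200 <= response.status < 400
    except (urllib.error.URLError, OSError):
        # Covers DNS failures, refused connections, timeouts and,
        # via HTTPError, 4xx/5xx responses.
        return False
```

Called from cron on the external host, something like `is_up("http://example.com/")` returning False would be the trigger for an email or text message. The important part is where it runs, not what it does.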

theDoug · on Nov 21, 2010

I have zero affiliation with the group, apart from being a casual user but http://wasitup.com/ has come in handy for me quite often. Even for free, it'll do a check 20 times an hour.

uggedal · on Nov 21, 2010

Just keep in mind that I host http://wasitup.com in Linode's London facilities. Data center redundancy has been on my TODO list for some time now.

archon810 · on Nov 21, 2010

Another vote for wasitup.com. I use wasitup.com, host-tracker.com, and basicstate.com, and wasitup.com is consistently the first one to report an outage, followed by basicstate, which has a surprisingly gloomy and odd interface, yet good service, and then followed by host-tracker.com, which sucks pretty much on all levels. All are free btw.

jackowayed · on Nov 21, 2010

Pingdom also has a very good free plan. It's limited to one url, but for that URL, it's just as good as their paid plans. So you get checks from a ton of DC's, and it tracks response time as well as uptime.

akl · on Nov 21, 2010

It hasn't been well publicized after the original announcement (and isn't prominently placed on their plans page, either) but I've had great success with Cloudkick's free developer plan on my personal virt (which is still down..):

https://www.cloudkick.com/accounts/signup/developer/

CK sent me a text message as soon as my host went down earlier, which gave me time to file the ticket against Linode.

Only caveat - they don't provide a signed RPM, which can cause some administrative hardship for linux users using rpm-based distributions (most of the yum operations require you to add an extra cli option to allow working with unsigned rpm's), but it's a minor complaint given that their agent isn't mandatory and doesn't need a lot of updates.

archon810 · on Nov 21, 2010

So... they ever heard of redundant power generators? Does anyone know what happened exactly?

akira2501 · on Nov 21, 2010

I've had power transfer switches fail more than once. Just because you have a redundancy available doesn't mean you are reducing failure points in the system.

cvg · on Nov 21, 2010

I live right down the street from the datacenter (~ 0.5 miles). The whole area's power was out for a bit, but no more than an hour. My site is having errors.

jfb · on Nov 21, 2010

I had just brought up a new instance. Why not choose Fremont? It's right down the .... oh dammit.

delano · on Nov 21, 2010

That's a good case in point. In general you're better off choosing locations further away from where you live/work to prevent issues like this where your office/home and datacenter could be affected at the same time.

jfb · on Nov 21, 2010

It was especially dumb of me because my other instance was ... in Fremont.

sanswork · on Nov 21, 2010

We had just launched a test server about 40 minutes ago and had been waiting for it to boot for ages when I finally thought to check the status page. Talk about bad timing.

on Nov 21, 2010

[deleted]

drm237 · on Nov 21, 2010

Linode has multiple "availability zones" aka datacenters so you can achieve the same redundancy with them. I would guess that linodes have better uptimes than aws but I don't have any numbers to actually back that up.

initself · on Nov 21, 2010

Knowing the Linode team's development pathway and philosophy, it is likely this outage will force them to offer redundancy as their standard offering.

Backup was unavailable for a long time and now it is standard.

archon810 · on Nov 21, 2010

If by standard, you mean you have to pay for it, prorated to the Linode disk space size, then yeah, it's standard.

jinhow · on Nov 21, 2010

Still waiting for the reboot jobs. Maybe I should move to EC2.

npsomaratna · on Nov 21, 2010

Yeah .... our iPhone app downloads content from our server - so far we've got 3 1-star reviews saying 'Your app sucks - it can't download any content'

Damn .........

mike-cardwell · on Nov 21, 2010

http://aws.amazon.com/ec2-sla/

Their SLA says 99.95% uptime. What has your Linode uptime been?

jinhow · on Nov 21, 2010

my server is back.

le · on Nov 21, 2010

The Fremont datacenter, if I remember correctly, is at Hurricane Electric. They had power issues in 2009 as well.

http://blog.feedly.com/2009/11/03/24-minutes-of-unscheduled-...

jeffy · on Nov 21, 2010

Going on 4 hours of downtime here, this is a disaster. Linode used to be really honest about the status of things, this time, they claim "most linodes should be booted" which seems not to be true.

dchest · on Nov 21, 2010

3:30am (EST): Some hosts were damaged by the power outage and we are working on moving these Linodes to hot standby equipment.

initself · on Nov 21, 2010

When it went down, I quickly realized my addiction to Twitter via irssi.

pyre · on Nov 21, 2010

What are you using?

mike-cardwell · on Nov 21, 2010

I get access to Twitter via IRC by using Bitlbee.

pyre · on Nov 22, 2010

I don't use Bitlbee, but I had no idea that it supported micro-blogging. I thought that it was just IM protocols.

mike-cardwell · on Nov 22, 2010

Bitlbee currently supports Jabber, MSN, OSCAR (AIM/ICQ), Yahoo and Twitter

There is also a Twitter/Jabber gateway for those not using Bitlbee: https://www.tweet.im/

initself · on Nov 21, 2010

Twirssi

whakojacko · on Nov 21, 2010

I wonder if this was related to the crappy weather in the south bay? My power was out from 8 until 8:45ish (PST). Of course, I would hope their datacenter has a UPS and generators...

initself · on Nov 21, 2010

Latest News:

3:30am (EST): Some hosts were damaged by the power outage and we are working on moving these Linodes to hot standby equipment.

Please tell me that doesn't mean potential data loss!

sandaru1 · on Nov 21, 2010

>> Please tell me that does mean potential data loss!

Looks like it's not. Our server just went up - after the damage notification. Data seems to be intact.

mike-cardwell · on Nov 21, 2010

> Please tell me that doesn't mean potential data loss!

That's not a problem. You have backups of course.

bdonlan · on Nov 24, 2010

Typically by that they mean they take the hard drives out and slot them into some spare servers.

grovulent · on Nov 21, 2010

DOES? I'm beginning to doubt your commitment to sparkle motion...

bengtan · on Nov 21, 2010

Grrr... my one remaining down Linode has been down for a bit over 3 hours now (extrapolating from the Dashboard graphs). Sort of ... quite bad.

csmoak · on Nov 21, 2010

there was a brief outage at the hurricane electric datacenter in fremont, probably due to the storm that came through the bay area this evening. our servers (not part of linode) went down, too, but were back up a couple minutes later. i can't confirm the cause, though, b/c i can't reach anyone at he.net on the phone. =(

grovulent · on Nov 21, 2010

They are saying that linodes are on their way back up. Not mine though... alas. Anyone else have any more luck?

sandaru1 · on Nov 21, 2010

Still waiting. There is a queued task to restart the server about "36 years 10 months ago".

grovulent · on Nov 21, 2010

Yeah - that message really bugs me for some reason. It's as though it injects the fear that something is DEEPLY wrong straight into my chest...

Rantenki · on Nov 21, 2010

They just queued the restart at some very low value of a unix timestamp, which translates to 36 years ago. Strangely, not the unix epoch at midnight Jan 1 1970.
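
A quick way to check that observation, as a sketch: compute how old the unix epoch would have looked on the day of the outage. The "now" value (Nov 21, 2010, midnight UTC) and the year/month rounding are assumptions; how the Linode dashboard actually formats ages is not known here.

```python
from datetime import datetime, timezone


def age_in_years_months(then, now):
    """Return whole (years, months) elapsed between two datetimes."""
    months = (now.year - then.year) * 12 + (now.month - then.month)
    # Drop the last month if it has not fully elapsed yet.
    if (now.day, now.time()) < (then.day, then.time()):
        months -= 1
    return divmod(months, 12)


# Assumed moment the dashboard was being read.
now = datetime(2010, 11, 21, tzinfo=timezone.utc)

# Timestamp 0 is midnight, Jan 1 1970 UTC.
unix_epoch = datetime.fromtimestamp(0, tz=timezone.utc)

years, months = age_in_years_months(unix_epoch, now)
print(f"{years} years {months} months ago")
```

With these assumptions a timestamp of 0 would read as 40 years 10 months ago, so a display of "36 years 10 months ago" points to a date around January 1974 instead.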

mnordhoff · on Nov 27, 2010

Old post, but...

FYI, it uses the epoch of caker -- i.e. the birthday of caker, Linode's founder.

palish · on Nov 21, 2010

Oh boy. Screenshot that!

drm237 · on Nov 21, 2010

Half of mine are up, still waiting on the other half. More than 2.5 hours into it now.

grovulent · on Nov 21, 2010

That's encouraging at least.

npsomaratna · on Nov 21, 2010

Nope. Ours has been down for over 2 hours now - and it's nearly an hour since the 'linodes are on their way up' message :(

npsomaratna · on Nov 21, 2010

Just got a message saying that the server is booting up .... 3 hours :(

bengtan · on Nov 21, 2010

Most of mine are up. Just waiting for one more.

Andrenid · on Nov 21, 2010

Mine are all still down.

saikat · on Nov 21, 2010

Suggest also hanging out in #linode on irc.oftc.net. Perihelion is being very patient and answering questions.

initself · on Nov 21, 2010

I would, if I could ssh to my linode.

saikat · on Nov 21, 2010

That is an IRC channel. You shouldn't need your linode to use IRC?

jjcm · on Nov 21, 2010

I access irc via screen + irssi on my linode box. I'm in the same boat as him right now. Reverting to prgmr for irc for the time being.

initself · on Nov 21, 2010

True, except that's how I access IRC, server side.

JabavuAdams · on Nov 21, 2010

And certificate errors on https://linode.com.

RyanGWU82 · on Nov 21, 2010

Go to https://www.linode.com instead. They've got a wildcard SSL certificate for *.linode.com, which will not work (and has never worked) without a subdomain like "www".
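
To illustrate why, here is a simplified sketch of wildcard hostname matching in which `*` stands in for exactly one leftmost label. It is not a full certificate validator, and the function name is made up for the example. A certificate can also list the bare domain as an additional name, which this sketch does not model.

```python
def wildcard_matches(pattern, hostname):
    """Simplified check: '*' may replace exactly one leftmost DNS label."""
    p_labels = pattern.lower().split(".")
    h_labels = hostname.lower().rstrip(".").split(".")
    # A wildcard never absorbs zero labels or more than one label,
    # so both names must have the same number of labels.
    if len(p_labels) != len(h_labels):
        return False
    if p_labels[0] == "*":
        return h_labels[0] != "" and p_labels[1:] == h_labels[1:]
    return p_labels == h_labels


for host in ("www.linode.com", "linode.com", "a.b.linode.com"):
    print(host, wildcard_matches("*.linode.com", host))
```

Only `www.linode.com` matches here. The bare `linode.com` has one label too few, and `a.b.linode.com` has one too many.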

thegyppo · on Nov 21, 2010

I still have 2/3 servers down, one of them is still "Powered off" in the console.

initself · on Nov 21, 2010

Same here.

revicon · on Nov 21, 2010

And here as well. Am I justified in being pissed about this outage, or am I failing in my responsibilities by relying on a company like linode to keep my site up?

mike-cardwell · on Nov 21, 2010

If you get a single server/vps from anyone, you should assume that it might go down for a few hours at any point. If that's a problem for you, you should create your own redundancy. No matter how much redundancy your provider claims.

asnyder · on Nov 21, 2010

Glad my linode is in the New Jersey data center.

duskwuff · on Nov 21, 2010

Might this be related to current prgmr outages?