Power outage in Linode Fremont Datacenter (linode.com)
39 points by drm237 on Nov 21, 2010 | 63 comments


My linode has been down for over an hour and still counting. They just went under three nines in one day...
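For anyone checking the "nines" arithmetic, here is a rough sketch. The comment doesn't say which measurement window it means, so the 30-day month and 365-day year below are assumptions, and the function names are made up for illustration.

```python
def downtime_budget_minutes(target_pct, window_days):
    """Minutes of downtime allowed in window_days at target_pct availability."""
    window_minutes = window_days * 24 * 60
    return window_minutes * (100 - target_pct) / 100


def availability_pct(downtime_minutes, window_days):
    """Availability (in percent) after downtime_minutes of outage in window_days."""
    window_minutes = window_days * 24 * 60
    return 100 * (1 - downtime_minutes / window_minutes)


if __name__ == "__main__":
    # Assumed windows: a 30-day month and a 365-day year.
    for days in (30, 365):
        budget = downtime_budget_minutes(99.9, days)
        after_one_hour = availability_pct(60, days)
        print(f"{days} days: budget {budget:.1f} min, "
              f"availability after 1h down {after_one_hour:.4f}%")
```

Under those assumptions, a one-hour outage exceeds the roughly 43-minute monthly budget for 99.9%, but still fits inside the roughly 8.8-hour yearly budget.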


I noticed this an hour later because my monitor script was set up in the same datacenter. Lesson learned.


I have zero affiliation with the group, apart from being a casual user, but http://wasitup.com/ has come in handy for me quite often. Even for free, it'll do a check 20 times an hour.


Just keep in mind that I host http://wasitup.com in Linode's London facilities. Data center redundancy has been on my TODO list for some time now.


Another vote for wasitup.com. I use wasitup.com, host-tracker.com, and basicstate.com, and wasitup.com is consistently the first one to report an outage, followed by basicstate, which has a surprisingly gloomy and odd interface, yet good service, and then followed by host-tracker.com, which sucks pretty much on all levels. All are free btw.


Pingdom also has a very good free plan. It's limited to one url, but for that URL, it's just as good as their paid plans. So you get checks from a ton of DC's, and it tracks response time as well as uptime.


It hasn't been well publicized after the original announcement (and isn't prominently placed on their plans page, either) but I've had great success with Cloudkick's free developer plan on my personal virt (which is still down..):

https://www.cloudkick.com/accounts/signup/developer/

CK sent me a text message as soon as my host went down earlier, which gave me time to file the ticket against Linode.

Only caveat - they don't provide a signed RPM, which can cause some administrative hardship for linux users using rpm-based distributions (most of the yum operations require you to add an extra cli option to allow working with unsigned rpm's), but it's a minor complaint given that their agent isn't mandatory and doesn't need a lot of updates.


So... they ever heard of redundant power generators? Does anyone know what happened exactly?


I've had power transfer switches fail more than once. Just because you have a redundancy available doesn't mean you are reducing failure points in the system.


I live right down the street from the datacenter (~ 0.5 miles). The whole area's power was out for a bit, but no more than an hour. My site is having errors.


I had just brought up a new instance. Why not choose Fremont? It's right down the .... oh dammit.


That's a good case in point. In general you're better off choosing locations further away from where you live/work to prevent issues like this where your office/home and datacenter could be affected at the same time.


It was especially dumb of me because my other instance was ... in Fremont.


We had just launched a test server about 40 minutes ago and had been waiting for it to boot for ages when I finally thought to check the status page. Talk about bad timing.


[deleted]


Linode has multiple "availability zones" aka datacenters so you can achieve the same redundancy with them. I would guess that linodes have better uptimes than aws but I don't have any numbers to actually back that up.


Knowing the Linode team's development pathway and philosophy, it is likely this outage will force them to offer redundancy as their standard offering.

Backup was unavailable for a long time and now it is standard.


If by standard, you mean you have to pay for it, prorated to the Linode disk space size, then yeah, it's standard.


Still waiting for the reboot jobs. Maybe I should move to EC2.


Yeah .... our iPhone app downloads content from our server - so far we've got 3 1-star reviews saying 'Your app sucks - it can't download any content'

Damn .........


http://aws.amazon.com/ec2-sla/

Their SLA says 99.95% uptime. What has your Linode uptime been?


my server is back.


The Fremont datacenter, if I remember correctly, is at Hurricane Electric. They had power issues in 2009 as well.

http://blog.feedly.com/2009/11/03/24-minutes-of-unscheduled-...


Going on 4 hours of downtime here, this is a disaster. Linode used to be really honest about the status of things, this time, they claim "most linodes should be booted" which seems not to be true.


3:30am (EST): Some hosts were damaged by the power outage and we are working on moving these Linodes to hot standby equipment.


When it went down, I quickly realized my addiction to Twitter via irssi.


What are you using?


I get access to Twitter via IRC by using Bitlbee.


I don't use Bitlbee, but I had no idea that it supported micro-blogging. I thought that it was just IM protocols.


Bitlbee currently supports Jabber, MSN, OSCAR (AIM/ICQ), Yahoo and Twitter

There is also a Twitter/Jabber gateway for those not using Bitlbee: https://www.tweet.im/


Twirrsi


I wonder if this was related to the crappy weather in the south bay? My power was out from 8 until 8:45ish (PST). Of course, I would hope their datacenter has a UPS and generators...


Latest News:

3:30am (EST): Some hosts were damaged by the power outage and we are working on moving these Linodes to hot standby equipment.

Please tell me that doesn't mean potential data loss!


>> Please tell me that does mean potential data loss!

Looks like it's not. Our server just went up - after the damage notification. Data seems to be intact.


> Please tell me that doesn't mean potential data loss!

That's not a problem. You have backups of course.


Typically by that they mean they take the hard drives out and slot them into some spare servers.


DOES? I'm beginning to doubt your commitment to sparkle motion...


Grrr... my one remaining down Linode has been down for a bit over 3 hours now (extrapolating from the Dashboard graphs). Sort of ... quite bad.


there was a brief outage at the hurricane electric datacenter in fremont, probably due to the storm that came through the bay area this evening. our servers (not part of linode) went down, too, but were back up a couple minutes later. i can't confirm the cause, though, b/c i can't reach anyone at he.net on the phone. =(


They are saying that linodes are on their way back up. Not mine though... alas. Anyone else have any more luck?


Still waiting. There is a queued task to restart the server about "36 years 10 months ago".


Yeah - that message really bugs me for some reason. It's as though it injects the fear that something is DEEPLY wrong straight into my chest...


They just queued the restart at some very low value of a unix timestamp, which translates to 36 years ago. Strangely, not the unix epoch at midnight Jan 1 1970.
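A quick sanity check of that date arithmetic, as a sketch. It assumes "now" is the thread date (Nov 21, 2010) and that the dashboard rounds to whole months; both helper names are hypothetical.

```python
from datetime import date


def months_between(earlier, later):
    """Whole calendar months from earlier to later (day of month ignored)."""
    return (later.year - earlier.year) * 12 + (later.month - earlier.month)


def subtract_months(d, years, months):
    """Return (year, month) that lies `years` years and `months` months before d."""
    total = d.year * 12 + (d.month - 1) - (years * 12 + months)
    return total // 12, total % 12 + 1


if __name__ == "__main__":
    now = date(2010, 11, 21)  # assumption: the date this thread was posted

    # Where "36 years 10 months ago" lands.
    print(subtract_months(now, 36, 10))

    # How long ago the actual Unix epoch was, as (years, months).
    print(divmod(months_between(date(1970, 1, 1), now), 12))
```

With those assumptions the queued time falls around January 1974, while the real Unix epoch would have shown up as about 40 years 10 months ago, which matches the observation that the value is low but not zero.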


Old post, but...

FYI, it uses the epoch of caker -- i.e. the birthday of caker, Linode's founder.


Oh boy. Screenshot that!


Half of mine are up, still waiting on the other half. More than 2.5 hours into it now.


That's encouraging at least.


Nope. Ours has been down for over 2 hours now - and it's nearly an hour since the 'linodes are on their way up' message :(


Just got a message saying that the server is booting up .... 3 hours :(


Most of mine are up. Just waiting for one more.


Mine are all still down.


Suggest also hanging out in #linode on irc.oftc.net. Perihelion is being very patient and answering questions.


I would, if I could ssh to my linode.


That is an IRC channel. You shouldn't need your linode to use IRC?


I access irc via screen + irssi on my linode box. I'm in the same boat as him right now. Reverting to prgmr for irc for the time being.


True, except that's how I access IRC, server side.


And certificate errors on https://linode.com.


Go to https://www.linode.com instead. They've got a wildcard SSL certificate for *.linode.com, which will not work (and has never worked) without a subdomain like "www".
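To illustrate why the wildcard doesn't cover the bare domain, here is a simplified sketch of leftmost-label wildcard matching. It is not the code any browser or TLS library actually runs, and `wildcard_matches` is a made-up name; it only shows the rule that `*` stands in for exactly one label.

```python
def wildcard_matches(pattern, hostname):
    """Simplified check: '*' may replace only the single leftmost label."""
    p_labels = pattern.lower().split(".")
    h_labels = hostname.lower().rstrip(".").split(".")

    # The wildcard replaces exactly one label, so label counts must agree.
    if len(p_labels) != len(h_labels):
        return False

    if p_labels[0] == "*":
        # Leftmost label must be non-empty; the rest must match exactly.
        return bool(h_labels[0]) and p_labels[1:] == h_labels[1:]

    return p_labels == h_labels


if __name__ == "__main__":
    for host in ("www.linode.com", "linode.com", "a.b.linode.com"):
        print(host, wildcard_matches("*.linode.com", host))
```

The bare `linode.com` has one label fewer than `*.linode.com`, so it fails the match. A certificate can still cover the bare domain, but only by listing it as a separate name alongside the wildcard.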


I still have 2/3 servers down, one of them is still "Powered off" in the console.


Same here.


And here as well. Am I justified in being pissed about this outage, or am I failing in my responsibilities by relying on a company like linode to keep my site up?


If you get a single server/vps from anyone, you should assume that it might go down for a few hours at any point. If that's a problem for you, you should create your own redundancy. No matter how much redundancy your provider claims.


Glad my linode is in the New Jersey data center.


Might this be related to current prgmr outages?



