Hacker News
Heroku Isn't for Idiots (rdegges.com)
114 points by niallsmart on Aug 30, 2012 | 58 comments



I love this post, and hope it keeps coming up. Some day, developers will learn that even though they could do it as well as Heroku, given enough time and effort, it's just a lot cheaper not to!

This comes up all the time with my company. We make hosted Continuous Integration (https://circleci.com), and often hear "can't I just set up Jenkins?". And the answer is the same: "you could, but ...". Run it on EC2, where your tests fail because the IO is bad? What about when three people push at once and you want results ASAP? Are you going to manually compile Postgres 9.1? And again when you add a second box to your cluster?

I could go on.


... except Heroku doesn't do multi-region, which is where things start getting tricky. They make the easy immediate, but haven't really tackled the hard (yet).


We use heroku, and every now and again that terrifies us. I'm not sure which will come first: Heroku going multi-region, or us having an outage due to another AWS problem and switching providers.

:fingers crossed:


>or us having an outage due to another AWS problem and switching providers.

Why, because the other providers (or a roll-your-own solution) will not have outages?

Or because the customers cannot tolerate even the more or less half a day of outage per year AWS has?

Even critical businesses like banking have work stoppages far more often than that for tons of reasons, in both the physical world and the digital/networked one.


Good to know Circle is running on Heroku. At least now we'll know why our CI isn't up when there's another lightning storm in North Carolina.

Heroku does make a lot of sense: it's an abstraction. You're using a system designed to do a lot of the heavy lifting so you don't have to deal with those details, and for many startups and small operations, it's a great value proposition.

But you have to admit that it is only a value proposition, and fundamentally a tradeoff. The truth is, especially from the perspective of a seasoned operations manager, none of the things above are all that hard, nor are any of the complaints outlined in the OP article.

The more layers you put between you and your infrastructure, the harder it will be to control your availability. When EBS volumes start failing and you have no idea why, and all Amazon shows is a green dot on their status page with a cryptic message like "minor availability issues experienced in certain availability zones; investigating"? Now that is hard.

Cloning a server and keeping a mirrored backup in another data center if you really need the availability? On hardware you have full control over? Not that hard. http://whoownsmyavailability.com/


I tried out Circle CI... I have some feedback if you want.

First, the UI could use some work, but it was really awesome seeing it pull down a Flask app and know to activate the virtualenv and install dependencies with pip.

Also, not having an option to delete your account is a bit scary, seeing as I'm part of an organization on GitHub and I'm not comfortable with you having access to that code.


Cool, thanks for trying! The UI could definitely use work, and we're working on it.

There is an option to delete your account. One option is to contact us (I know, that's not great, but we delete same day). The other option is to rip out Circle's authentication via GitHub, if you're worried about us doing it properly.


Hey, BTW the join using github button at the bottom of the page doesn't work (links to undefined), the one at the top does. Probably want to fix that :)


Cool, thanks!


The persistence scaling story for Heroku seems pretty questionable to me. Once you've maxed out what they offer for MySQL and Postgres, what exactly are you supposed to do? Start using Amazon RDS?

Heroku seems to be like a more useful Google App Engine, a good place to host a blog or experimental project if you're not into dev-ops.

If you have a knack (at all) for dev-ops, you're not saving yourself anything.

The downtime is pretty bleh too. The moment you start doing multi-provider to offset this, you'll end up doing all the dev-ops work you would have had to do anyway. Except now you have to do it all at once, at a time of probably high stress.

If you do the dev-ops/automation yourself from the start, you can start small/simple and grow that as you go, deploying your services to arbitrary hosting providers (EC2, Linode, dedicated boxes, whatever).

This is why whenever anybody asks me my opinion of Heroku, I respond, "it's a great place to host that blog engine you wrote in Haskell/Clojure/{hipster_language_of_choice}".


There are definitely reasons that you would want to be on your own managed metal infrastructure, but let's not exaggerate so much and say that its only use is for blogs and experimental projects. Why do people always have to be so extreme to try and make a point?

There are a ton of real sites that never see more than a couple hundred thousand visitors a month: e-commerce sites, mobile APIs, SaaS apps, etc. I've used Heroku for higher-traffic sites than this without any issues at all. Incredibly easy, actually.

Saying it's only good for your hipster blog? C'mon man really? Have you even done anything commercial / critical on their platform?


So Heroku is good for higher margin sites. You couldn't run Facebook, because the value of each customer is marginal, and they are expensive customers.

You could host a B2B app no problem, because they are paying (high margin) customers, and will probably not use too many resources.


No platform could run Facebook. There is a certain scale where you have to do it yourself because no one else has done it that way before.


> The persistence scaling story for Heroku seems pretty questionable to me.

Agreed. The people who tell horror stories about Heroku/EC2 usually give solid numbers: this is how much I spent, this is how much I saved by moving away, and our response time is now X% faster.

On the other hand, we have articles like this that show a pretty graph for a web app serving 16.2 requests per minute and make bold claims that everything will scale.


Here's my pretty graph for serving ~500,000 requests per minute on Heroku:

https://api.playtomic.com/load.html

Savings aren't exclusively because of Heroku, I also switched the underlying architecture during that migration from C# / ASP.NET to NodeJS which is exceptionally well suited to what I'm doing there.

Previously: Dedicated servers, ~$1600 a month

8x dedicated servers at ~$200 each, each running 3 to 5 instances of the API depending on how many IP addresses they were provisioned with. Uploading was done via a simple hand-rolled script that'd just FTP everything to each server.

Now: Heroku, current usage $400 - $500 a month

With Heroku I don't have to worry about concurrent connections (typically 200 - 400 thousand people at once), I don't have to maintain all those servers and I don't have to fuck around with all the stupid things that can go wrong when you're operating at scale.

It was a lot of work to get to this point and I made a lot of mistakes, like having a heavy redis pub/sub outside the EC2 network that cost me $350 in excess bandwidth providing inter-dyno communication, and saturating database connections lots of times because in the old days those dedicated servers each had a local mongodb that could keep up with ordinary connection pooling. But it was totally worth it.


So let's assume that's $500 in dynos; that's approximately 14 dynos. You have 30k concurrent users per dyno using NodeJS?

It's not that I don't believe you, I just think you don't understand what you're saying.

(Actually, it is that I don't believe you)

That said, if you're doing 500 requests/sec (that's very different from 500 concurrent users) per dyno, good for you. My main bottleneck wasn't so much CPU on the web machines (I hit memory limitations) as the database layer.
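A quick sanity check on the dyno math above. The $0.05 per dyno-hour figure is an assumption based on Heroku's 2012-era pricing, not something stated in the thread:

```python
# Back-of-envelope check of the "14 dynos for $500" estimate.
# Assumes 2012-era Heroku pricing of $0.05/dyno-hour (an assumption).
DYNO_HOUR_USD = 0.05
dyno_month_usd = DYNO_HOUR_USD * 24 * 30        # ≈ $36/month per dyno

dynos = 500 / dyno_month_usd                    # ≈ 13.9 dynos for $500/month
users_per_dyno = 400_000 / dynos                # ≈ 28,800 concurrent users/dyno

print(round(dynos, 1), round(users_per_dyno))
```

At ~$36/dyno-month, $500 does come out to roughly 14 dynos, and 400k concurrent users over 14 dynos is the ~30k-per-dyno figure being questioned.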


Just over 20,000,000 people hit my API yesterday 700,749,252 times, playing the ~8,000 games my analytics platform is integrated in for a bit under 600 years in total play time. That's just yesterday.

There are lots of different bottlenecks waiting for people operating at scale. Heroku and NodeJS, for my use case, eventually alleviated a whole bunch of them very cheaply.


    >>> 20000000 / 24 / 60 / 60.0
    231.48148148148147
~231 requests a second?


That's people, not requests. :)
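Using both figures from the grandparent comment (20M people, 700,749,252 requests in one day), the two rates differ by a factor of about 35:

```python
# Distinguish people/sec from requests/sec using the figures quoted upthread.
SECONDS_PER_DAY = 24 * 60 * 60                  # 86,400

people = 20_000_000
requests = 700_749_252

people_per_sec = people / SECONDS_PER_DAY       # ≈ 231 people/sec
requests_per_sec = requests / SECONDS_PER_DAY   # ≈ 8,111 requests/sec

print(round(people_per_sec), round(requests_per_sec))
```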


Is your graph showing concurrent (as in active connection) users?


The graph shows http requests, each individual person sends 'event.general...' requests approximately 2x per minute.

In the last minute ~460,000 of those requests were made which means somewhere around 230,000 people sent data although there's room for that to be higher or lower depending on sessions starting, sessions ending, and just what period of time you want 'concurrent' to live within.


So, another pretty graph for Heroku, yet no comparison to alternatives due to the total architecture change. The numbers are useless for comparison purposes.


Your app isn't "serving" anything up, it's an event API. 100% different use-case than what I was talking about.


You keep using that word "serving". I don't think it means what you think it means. Parent is spot on.


>The persistence scaling story for Heroku seems pretty questionable to me.

I said this and was talking exclusively about this. Read my original comment.


>16.2 requests per minute

Was it written in TCL? ;)


Just because you can do the devops work doesn't mean that's the best use of your time, particularly on a small team (or in my current situation, as a sole developer). Heroku makes a lot of sense in that case. And since they for the most part follow standard development practices, there's little to no lock-in when you actually have the money to hire a sysadmin.


Some people may be defending the Platonic ideal of Heroku (multi-AZ, NewSQL, etc.) rather than Heroku as it stands today.


Pretty sure you're right; most seem like they haven't done anything with Heroku other than host their 200-uniques-a-month blog.


heretotroll: Heroku provides plenty of tunable capacity.


Considering their only support for sharding or replication is their async read-only slaves mechanism, I'm going to assume you don't know what you're talking about and think the difference between "1/10th of an overloaded EC2 machine running postgres" and "a whole slow EC2 machine running postgres" somehow constitutes "scalability".


Cool bro.

Who said I'm using postgres? And if you don't like the postgres setup, sure, roll your own. EC2 has those badass SSD instances for you.

Apps that need more capacity than what Heroku offers can surely drop the dev-ops dollars. Until then, why waste your time?


> Heroku seems to be like a more useful Google App Engine, a good place to host a blog or experimental project if you're not into dev-ops.

Considering Heroku is single-homed, I find it hilarious that you say it's less of a toy than GAE.


I was giving it the benefit of the doubt since there wasn't as much platform specific lock-in as GAE.

But yeah, the single-homing is pants-on-head.


> there wasn't as much platform specific lock-in as GAE

You can always host your App Engine application on top of an AppScale or Typhoonae install.

I wish they were easier to set up, but, if your Google bill starts getting high, you can dedicate the resources required to do it. And multi-homing them is a whole different game.



I'd love to see an example of someone saying Heroku is for idiots. Although it's a bit pricey[1], I love it to pieces because it lets me focus on what I like to do: shipping product.

That said, I'd be delighted if Heroku would introduce a high-memory dyno. I've been working on something for the past few days where their soft'ish 512MB cap has been biting me in the ass.

[1] Assuming you value your time at something around $0/hour.

edit: thanks for the link!


This rant is a reply to http://justcramer.com/2012/06/02/the-cloud-is-not-for-you/ which is indeed trashing Heroku.


I wasn't aiming to trash Heroku specifically, but the post by Randall is very uninformed.


It's not trashing Heroku specifically, it's actually a pretty good read and highlights real problems.


This is just like the garbage collection versus manual memory management debate.


I also save significant time and money using Heroku, and it's awesome being able to scale up and down automatically using HireFireApp. Most of this article rings true with my experience, though some of it doesn't feel that valid for my high-volume NodeJS app, and the entire "Let's talk about bad ideas" section is just lame.

1) If your app crashes, it takes ages to restart dynos "automatically"; there's nothing at all instant about it, and if it's a bug and you have a high volume of requests, it's going to hit every dyno, which means you are offline.

2) Performance can be variable, and it can be hard to be sure an optimization has done anything. This will be easier when New Relic supports NodeJS, but right now you're stuck using less elegant solutions. It is shared hosting, and it's not necessarily anybody's fault if something is slow; I know this because the same requests frequently show order-of-magnitude differences in response time for me.

    dyno=web.10 queue=0 wait=0ms service=3ms status=200 bytes=25
    dyno=web.7 queue=0 wait=0ms service=207ms status=200 bytes=28
Between Heroku and NodeJS I run my API server on usually just 8 dynos doing 6,000 - 10,000 requests per second, and having come from C# and dedicated Windows servers it is a dream: nothing to maintain, easy deployment, and easy debugging on Heroku's side. NodeJS is just amazing once you start realizing what's possible with it.
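The per-dyno throughput implied by those figures (8 dynos handling 6,000 - 10,000 requests/sec in total) is easy to check:

```python
# Per-dyno throughput implied by the comment above:
# 8 dynos serving 6,000 - 10,000 requests/sec in aggregate.
dynos = 8
low, high = 6_000, 10_000

print(low / dynos, high / dynos)   # requests/sec per dyno, low and high
```

That works out to roughly 750 - 1,250 requests/sec per dyno, which is plausible for a thin NodeJS event-collection endpoint.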


I use Hirefire as well for several sites. For the most part I like it, however it can be awfully slow to react to traffic spikes. I highly recommend leaving a bit of a buffer (few extra dynos) during prime time traffic.


You are completely correct, the default settings are not good and it will aggressively scale your app down too far if you let it.


Can you shed some light on how you are using HireFireApp with node.js? I thought this was only good for Rails/rack frameworks.

Maybe I misunderstood and your main app is rails with some of the other functionality in node.


I couldn't get it to work initially, it just would not recognize my app at all and I couldn't add it.

I sent an email and got sent to this new link (not sure if it's launched) which I believe covers the entire Cedar stack rather than anything particular to NodeJS, it pings a provided URL.

http://manager.hirefireapp.com/


Awesome. I'll check it out.


> Heroku is Just Unix

> At its core, Heroku is just a simple unix platform; specifically, Ubuntu 10.04 LTS.

I just threw up a little


I saw someone else have a bad reaction to this statement, but I might be to blame for this. See:

http://news.ycombinator.com/item?id=4062983

But let me try to frame it more carefully: at the time (and maybe still) a lot of providers expect you to use special APIs endemic to their platform, and you can't get the real-deal implementation or a drop-in replacement if you want to model the production environment more closely or move off the platform. Even if you get the real-deal implementation, running it may be infeasibly hairy.

Heroku very assiduously tries to avoid this, and there is a downside: we compromise with the problems in existing software that people like to use as-is, and to fix those problems we have to somehow get that software fixed, too.


Why?


I'm going to guess it's because Ubuntu was called "unix" (which actually raised my eyebrow as well). Although similar, unix != linux, and Ubuntu especially != unix, as opposed to a distro like Slackware, which is more or less unix with a linux kernel slapped in.

I got what the author meant though, and it's not that big a deal.


There is "Unix", an operating system originated at AT&T and "unix", a generic name that usually refers to a family of operating systems that are based more or less on the same ideas. Linux is a lot closer to AT&T's ideas of Unix than other certified Unixes like OSX and AIX. Linux is not Unix, but it certainly is a unix.

When I really want to annoy my BSD friends, I call it a "Linux-like" operating system. It never fails.


"Each instance (Heroku calls them dynos), has:

512MB of RAM, 1GB of swap. Total = 1.5GB RAM. 4 CPU cores (Intel Xeon X5550 @ 2.67GHz)."

That is very interesting. So you get 200% of the CPU of a c1.medium but only 1/6 the memory? For 1/8 the price that Amazon charges? Maybe this is the new math I keep hearing about.


I would guess that Heroku runs ~29 dynos per m1.xlarge, so each dyno gets a minimum of ~1/7th of a core or ~366 MHz. Just because you can see 4 cores doesn't mean you can use them.
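The per-dyno share in that guess works out as follows, assuming the 29 dynos are spread evenly across the m1.xlarge's 4 cores (all of these figures come from the comment above, not from Heroku):

```python
# CPU share per dyno under the parent's assumption: 29 dynos on an
# m1.xlarge (4 cores, Intel Xeon X5550 @ 2.67 GHz), scheduled fairly.
CORES = 4
CLOCK_MHZ = 2670
dynos = 29

dynos_per_core = dynos / CORES              # ≈ 7.25 dynos contending per core
mhz_per_dyno = CLOCK_MHZ / dynos_per_core   # ≈ 368 MHz minimum per dyno

print(round(dynos_per_core, 2), round(mhz_per_dyno))
```

That lands at roughly 1/7th of a core, matching the ~366 MHz figure in the comment (the small difference is just rounding of the clock speed).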


One day I noticed that Heroku gives you access to netstat, so I started trying to figure out the actual number.

Running this:

    heroku run "netstat -l | grep lxc | wc -l"
...seems to imply that it's actually 100 ± 25 dynos per instance.

(It's possible I could be entirely misunderstanding the netstat output, in which case someone please speak up.)


Yes, I know that. The OP doesn't seem to know that. This is a problem, because the piece he is responding to is in large part concerned with the meager resources provided by a single dyno.


Honest question: Why do startups continue to setup their own infrastructure on AWS with ops? Scaling? Cost?


"Heroku Isn't for Idiots." But going by current trends, there could soon be a "Heroku for Idiots" :-)



