Web Framework Benchmarks (techempower.com)
438 points by pfalls on March 28, 2013 | 396 comments



One of the most interesting things this comparison brings out, to me, is not so much the differences between the various frameworks (although the differences between options on the same platform are definitely very useful information) as an issue few of us seem to think about these days: the cost of any of these frameworks above the bare standard library of the platform it's hosted on.

There's a consistent, considerable gap between the "raw" benchmarks (things like netty, node, plain php, etc.) and frameworks hosted on those same platforms. I think this is something we should keep in mind when we're tuning performance-sensitive portions of APIs and the like. We may actually need to revisit our framework choice and implement selected portions outside of it (just as Ruby developers sometimes write performance-critical pieces of gems in C), or optimize the framework further.

I'd like to crunch these numbers further to get a "framework optimization index" which would be the percentage slowdown or ratio of performance between the host platform and the performance of the framework on top of it. I might do this later if I get a chance.


I think this is a much needed and excellent point to make. Just take a look at how Go dips down when using Webgo.


I used Go's benchmarking tool to compare raw routing performance of various frameworks. The handlers all return a simple "Hello World" string. Here are the results:

  PASS
  Benchmark_Routes           100000	     13945 ns/op
  Benchmark_Pat              500000	      6068 ns/op
  Benchmark_GorillaHandler   200000	     11042 ns/op
  Benchmark_Webgo            100000	     26350 ns/op
  ok  	github.com/bradrydzewski/routes/bench	12.605s
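For reference, each of these benchmarks was shaped roughly like this (a minimal sketch using Go's testing package; the pat case is shown, and the route and handler body are assumptions, not the exact benchmark code):

  package bench

  import (
      "net/http"
      "net/http/httptest"
      "testing"

      "github.com/bmizerany/pat"
  )

  // Benchmark_Pat times routing plus a trivial "Hello World" response.
  func Benchmark_Pat(b *testing.B) {
      mux := pat.New()
      mux.Get("/hello/:name", http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
          w.Write([]byte("Hello World"))
      }))
      req, _ := http.NewRequest("GET", "/hello/world", nil)
      for i := 0; i < b.N; i++ {
          mux.ServeHTTP(httptest.NewRecorder(), req)
      }
  }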
I then ran the same benchmark, but this time I modified the handler to serialize a struct to JSON and write to the response. Here are the results:

  Benchmark_Routes              100000	     21446 ns/op
  Benchmark_Pat                 100000	     14130 ns/op
  Benchmark_GorillaHandler      100000	     17735 ns/op
  Benchmark_Webgo                50000	     33726 ns/op
  ok  	github.com/bradrydzewski/routes/bench	9.805s
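The JSON variant of the handler was roughly of this shape (a sketch; the struct and field names are assumptions):

  import (
      "encoding/json"
      "net/http"
  )

  type Message struct {
      Message string `json:"message"`
  }

  func jsonHandler(w http.ResponseWriter, r *http.Request) {
      w.Header().Set("Content-Type", "application/json")
      // The Encode call alone accounts for roughly 8000ns of each op.
      json.NewEncoder(w).Encode(&Message{Message: "Hello World"})
  }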
In the first test, Pat is almost twice as fast as the Gorilla framework. In the second test, when we added a bit more logic to the handler (marshaling a struct to JSON), Pat was only about 18% faster than Gorilla. In fact, it turns out it takes longer to serialize to JSON (8000ns) than it does for Pat to route and serve the request (6000ns).

Now, imagine I created a third benchmark that did something more complex, like executing a database query and serving the results using the html/template package. There would be a negligible difference in performance across frameworks because routing is not going to be your bottleneck.

I would personally choose my framework based not just on performance but also on productivity: one that helps me write code that is easier to test and easier to maintain in the long run.


rorr, you appear to be hellbanned. Here's your comment, since it seemed like a reasonable one:

> Now, imagine I created a third benchmark that did something more complex, like executing a database query and serving the results using the html/template package. There would be a negligible difference in performance across frameworks because routing is not going to be your bottleneck. If you're performing a DB query on every request, you're doing something wrong. In the real world your app will check Memcached, and if there's a cached response, it will return it. Thus making the framework performance quite important.


OK, so I added a third benchmark where the handler gets an item from memcache (NOT from a database). Here are the results:

  PASS
  Benchmark_Routes             10000	    234063 ns/op
  Benchmark_Pat                10000	    233162 ns/op
  Benchmark_GorillaHandler      5000	    265943 ns/op
  Benchmark_Webgo               5000	    348349 ns/op
  ok  	github.com/bradrydzewski/routes/bench	10.062s
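For the curious, the memcache handler looked something like this (a sketch assuming the bradfitz/gomemcache client and a pre-populated key named "item"):

  import (
      "net/http"

      "github.com/bradfitz/gomemcache/memcache"
  )

  var mc = memcache.New("127.0.0.1:11211")

  func memcacheHandler(w http.ResponseWriter, r *http.Request) {
      // One lightweight TCP round-trip to memcached dominates the
      // ~230-350µs/op numbers above; routing cost is noise by comparison.
      it, err := mc.Get("item")
      if err != nil {
          http.Error(w, err.Error(), http.StatusInternalServerError)
          return
      }
      w.Write(it.Value)
  }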
Notice the top 3 frameworks (Pat, routes and Gorilla) have almost identical performance results. The point is that routing and string manipulation are relatively inexpensive compared to even the most lightweight TCP request, in this case to the memcache server.


By the way:

https://plus.google.com/u/0/115863474911002159675/posts/L3o9...

More SPEEEEEEEEEEEEED coming down the pipe in 1.1 for Go's net/http. :)


I think Go is almost the ideal example here, you're right. Go provides a pretty rich "standard library" for writing web serving stuff so it's a good place where you could really imagine writing your performance critical stuff just on the base platform even if you use something like Webgo for the rest of your app.

Some of the other platforms are much less amenable to that since the standard primitives the platform exposes are very primitive indeed (servlets api, rack api, etc.). Perhaps there's some value in looking at how your favorite framework stacks up against its raw platform and trying to contribute some optimizations to close the gap a bit.


I'm curious about that - because there's so little to webgo I suspect the answer is something really trivial. I haven't really looked at it before, but the framework itself is just 500 lines or so unless I'm looking at the wrong one...

Given that the JSON marshalling and server components would be exactly the same between Go and webgo, I'm curious whether changing the URL recognised in the webgo tests to be just /json would make any difference. Any reason it was different?


Just had a look at the tests and the urls responded to differ:

http://localhost:8080/json

http://localhost:8080/(.*)

Shouldn't all these examples at least be trying to do the same sort of work? For such a trivial test differences like that could make a huge difference to the outcome.

It's great to see replicable tests like this which show their working, but they do need to be testing the same thing. I also think they should test something more substantial than JSON marshalling on all the platforms, like serving an HTML page built from a template with the message, as that'd give a far better indication of speed for the tasks web servers typically perform.

Still, it's a great idea and really interesting to see this sort of comparison, even if it could be improved.


One of the next steps we'd like to take is to have a test that does cover a more typical web request, and is less database heavy than our 20 query test, just like you describe. Ultimately, we felt that these tests were a sufficient starting point.


I was a little confused by the different URLs used in the tests, as for this sort of light test, particularly in Go, where all the serving stuff is common between frameworks, you're mostly going to be testing routing. Any reason you chose a different route here (/json versus /(.*))?

I can't think of much else that this little web.go framework does (assuming the fcgi bits etc. are unused now and it has moved over to net/http). I don't think many people use web.go; Gorilla and its mux router seem to be more popular as a bare-bones option in Go, so it'd possibly be interesting to use that instead. It'd be great to see a follow-up post with a few changes to the tests to take in the criticisms or answer questions.

While you may come in for a lot of criticism and nitpicking here for flaws (real or imagined) in the methodology, I do think this is a valuable exercise if you try to make it as consistent and repeatable as possible - if nothing else it'd be a good reference for other framework authors to test against.


Webgo... I'll stick with "net/http" and Gorilla thanks.

Also, they used Go 1.0.3... I hope they update to 1.1 next month. Most everyone using Go for intensive production use is running Go tip (the branch due to become the 1.1 RC next month).


This is great to know. We were hesitant to use non-stable versions (although we were forced to in certain cases), but knowing that this is common practice for production environments would change our minds.


We switched to using tip after several Go core devs recommended that move to us. The folks on go-nuts IRC agreed, and we tested it and found it to be more stable than 1.0.3.


Wow, directly on tip? That seems to speak very highly of the day-to-day development stability of Go.


A good tip build tends to be more stable than 1.0.3 and has hugely improved performance (most importantly for large applications, in garbage collection and garbage generation).

To select a suitable tip build we use http://build.golang.org/ and https://groups.google.com/forum/?fromgroups#!forum/golang-de... . My recommendation would be to find a one- or two-week-old build that passed all checks, do a quick skim of the mailing list to make sure there weren't any other issues, and use that revision. Also, you will see that some of the builders are broken.

Of course if your application has automated unit tests and load tests, run those too before a full deployment.


Thanks, this comment really helped me in my evaluation of Go today. I had been playing around with 1.0.3 for a couple days, but tip is definitely where it's at.


I'm glad I could help. Go 1.1 RC should be out early next month. So if you want you could wait for that (for production use).


Or it could speak poorly of their release process, which is more accurate. The stable release is simply so bad compared to tip that everyone uses tip. There should have been multiple releases since the last stable release so that people could get those improvements without having to run tip.


Why not both? Insisting on a very stable API can result in long times between releases, which can mean more people using tip. That's distinct from how stable tip is.


Given the frequent complaints that the previous stable release isn't very stable, I think trying to interpret it as "tip is super stable" is wishful thinking. Tip is currently less bad than stable. The fact that stable releases are not stable is a bad thing, not a good thing.


What does stable mean? If stable means there are not unexpected crashes, then Go 1.0.3 is extremely stable.

If stable means suitable for production, Go tip's vastly improved performance, especially in regards to garbage collection, makes it more suitable than 1.0.3 for large/high-scale applications in production.


I'm not sure many people would use webgo in real life. I don't know... maybe some people... certainly not pros.

Also, the 1.0.3 thing is probably dragging on the numbers a bit. 1.1 would boost it a little. Not enough to get it into the top tier... but a little.

Also, for Vert.x, they seem to be only running one verticle. Which would never happen in real life.

Play could be optimized a bit... but not much. What they have is, to my mind, a fair ranking for it.

Small issues with a few of the others but nothing major. I think Go and Vert.x are the ones that would get the biggest jumps if experts looked at them. And let's be frank... does Vert.x really need a jump?

So what they have here is pretty accurate... I mean... just based on looking through the code. But Go might fare better if it used tip. And Vert.x would DEFINITELY fare better with proper worker/non-worker verticles running.


The Play example was totally unfair since it blocks on the database query which will block the underlying event loop and really lower the overall throughput.


Well... to be fair...

the Vert.x example, as configured, blocks massively as well waiting for mongodb queries.


Could you point me at an example of an idiomatic, non-trivial Go REST/JSON API server? I've been trying for a while to find something to read to get a better handle on good patterns and idiomatic Go, but I haven't really come up with anything. I've found some very good examples of much lower-level type stuff, but I think I have a decent handle on that type of Go already. What I really would like is a good example of how people are using the higher level parts of the standard library, particularly net/http etc.


Sorry... I'm not really a book kind of guy when it comes to this stuff. The golang resources are mostly what I use.


For Vert.x, we specified the number of verticles in App.groovy, rather than on the command line, which we think is a valid way to specify it.


OK... I ran the Vert.x test... it runs a bit faster here with 4 workers instead of 8. I suspect what is happening is that at times all 8 cores can be pinned by workers while responses wait to be sent back on the 8 non-workers. But it's not that big a change in speed, actually. One thing more: when you swap in a Couchbase persister for the Mongo persister, it's faster yet. That difference is actually much larger than the one you get balancing the number of worker vs. non-worker verticles. I also think that swapping Gson in for Jackson would improve things... but I don't think those are fair changes (well... the Couchbase one may be fair).

Also tested Cake just because it had been a while since I have used it... and I couldn't believe it was that much worse than PHP. Your numbers there seem valid though given my test results. That's pretty sad.

Finally, tried to get in a test of your Go stuff. I'm making what I think are some fair changes ... but it did not speed up as much as I thought. In fact, initially it was slower than your test with 1.1.

So after further review... well done sir.


That first line is truly one of the best comments I've seen when discussing languages. I've clipped it and will use it from now on:

>> I'm not sure many people would use [xxx] in real life. I don't know... maybe some people... certainly not pros.


The Play app uses MySQL in a blocking way, while Node.js uses Mongo. It's not comparable.


We have a pull request that changes the Play benchmark (thank you!) so we will be including that in a follow-up soon.

We tested node.js with both Mongo and MySQL. Mongo strikes us as the more canonical data-store for node, but we wanted to include the MySQL test out of curiosity.


That is bad benchmarking!


I'd love to see your framework optimization index. Honestly, all of this would be a wonderful thing to automate and put in a web app - a readily-accessible, up-to-date measure of the current performance of the state of the art in languages and frameworks. I bet it would really change some of the technology choices made.


Here's a quick version of the framework optimization index. Higher is better (ratio of framework performance to raw platform performance, multiplied by 100 for scale):

  Framework      Index
  Gemini         87.88
  Vert.x         76.29
  Express        68.85
  Sinatra-Ruby   67.88
  WebGo          51.08
  Compojure      45.69
  Rails-Ruby     31.75
  Wicket         29.33
  Rails-Jruby    20.09
  Play           18.02
  Sinatra-Jruby  15.96
  Tapestry       13.57
  Spring         13.48
  Grails          7.11
  Cake            1.17
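To be explicit about the arithmetic, the index is just the framework's peak throughput over the raw platform's, times 100 (a trivial sketch in Go; the example figures are the EC2 peak responses/second quoted elsewhere in this thread):

  // frameworkIndex scales the framework-to-platform throughput ratio to 0-100.
  func frameworkIndex(frameworkRPS, platformRPS float64) float64 {
      return frameworkRPS / platformRPS * 100
  }

  // e.g. Express vs. raw node: frameworkIndex(7258, 10541) ≈ 68.85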


In the same vein, I was curious to compare the max responses/second on dedicated hardware vs ec2 on a per framework basis. The following is percentage throughput of ec2 vs dedicated (in res/s):

  framework      ec2/dedicated  (dedicated vs ec2 res/s)
  cake           18.9%          (312 vs 59)
  compojure      12.1%          (108588 vs 13135)
  django         16.8%          (6879 vs 1156)
  express        16.9%          (42867 vs 7258)
  gemini         12.5%          (202727 vs 25264)
  go             13.3%          (100948 vs 13472)
  grails          7.1%          (28995 vs 2045)
  netty          18.0%          (203970 vs 36717)
  nodejs         15.6%          (67491 vs 10541)
  php            11.6%          (43397 vs 5054)
  play           20.6%          (25164 vs 5181)
  rack-jruby     15.6%          (27874 vs 4336)
  rack-ruby      22.7%          (9513 vs 2164)
  rails-jruby    22.7%          (3841 vs 871)
  rails-ruby     20.7%          (3324 vs 687)
  servlet        13.4%          (213322 vs 28745)
  sinatra-jruby  21.2%          (3261 vs 692)
  sinatra-ruby   22.2%          (6619 vs 1469)
  spring          7.1%          (54679 vs 3874)
  tapestry        5.2%          (75002 vs 3901)
  vertx          22.3%          (125711 vs 28012)
  webgo          13.5%          (51091 vs 6881)
  wicket         12.7%          (66441 vs 8431)
  wsgi           14.8%          (21139 vs 3138)

I found it interesting that something like tapestry took a 20x slowdown when going from dedicated to ec2, while others only took ~5x slowdown.

Edit: To hopefully make it clearer what the percentages mean - if a framework is listed at 20%, this means that the framework served 1 request on ec2 for every 5 requests on dedicated hardware. 10% = 1 for every 10, and so on. So, higher percentage means a lower hit when going to ec2.

Disclosure: I am a colleague of the author of the article.


You're saying that running a query across the internet to ec2 is 5 times faster than running it on dedicated hardware in the lab? I find that hard to believe.


Sorry, maybe my original post was not entirely clear. Let's take tapestry, for example. On dedicated hardware, the peak throughput in responses per second was 75,002. On ec2, it was 3,901 responses per second.

So, in responses per second, the throughput on ec2 was 5.2% that of dedicated hardware, or approximately 20 times less throughput. The use of the word slowdown was possibly a bad choice, as none of my response had to do with the actual latency or roundtrip time of any request.


These could probably be further broken down into micro-frameworks (like Express, Sinatra, Vert.x etc.) and large MVC frameworks (like Play and Rails).

Gemini is sort of an outlier that doesn't really fit either category well, but the micro-frameworks have a fairly consistently higher framework optimization index than the large MVC frameworks which is as expected.

Express and Sinatra really stand out here as widely-used frameworks that retain a very high percentage of platform performance. I've never used Vert.x, but I will certainly look into it after seeing this. I'm very impressed that Express is so high on this list when it is relatively young compared to some of the others, and the Node platform is also relatively young.

Play seems particularly disappointing here since it seems any good performance there is almost entirely attributable to the fast JVM it's running on. Compojure is also a bit disappointing here (I use it quite a bit).


The play test was written totally incorrectly since it used blocking database calls. Since play is really just a layer on top of Netty it should perform nearly as well if correctly written.


I believe they're encouraging pull requests to fix that sort of thing. It will be interesting to see if it helps to that degree; I hope so!


But Play's trivial JSON test was much slower than Netty's.


That's because they inexplicably use the Jackson library for the simple test, rather than Play's built in JSON support (they use the built-in JSON for the other benchmarks).


Both Netty and Play use Jackson, though the Netty version uses a single ObjectMapper and the Play version creates a new ObjectNode per request (through Play's Json library).


I hope this gets fixed!


No, they use Play's JSON lib. It's kind of a moot point because Play's lib is in fact a wrapper for Jackson.

Here's the source: https://github.com/TechEmpower/FrameworkBenchmarks/blob/mast...

So the question stands: If Play & Netty are using the same JSON serialization code, why is Play seven times slower?


I'd imagine that being relatively young is an advantage in a test like this. You're not utilizing any features, and features are what slow down requests. The fewer features something has, the faster it should perform in these trivial tests.


That's a very good point; I hadn't thought of it that way. Maybe this is some small part of why we seem to keep flocking to the new kids on the block.


It's funny you should put this together because in an earlier draft of this blog entry I had created a tongue-in-cheek unit to express the average cost per additional line of code. Based on our Cake PHP numbers, I wanted to describe PHP as having the highest average cost per line of code. But we dropped this because I felt it ultimately wasn't fair to say that based on the limited data we had and it could be easily interpreted as too much editorializing. Nevertheless, as you point out, it's interesting to know how using a framework impacts your performance versus the underlying platform.

I too wanted Play to show higher numbers. There's certainly a possibility we have something configured incorrectly, so we'd love to get feedback from a Play expert.


Good thing you opted against sensational journalism.

About the cost per additional line of code for PHP, it mainly comes from not having an opcode cache and having to load and interpret files on every visit. mod_php was and will always be trash. I commented earlier about it too.

In case of Ruby, and talking about Rails, even when using Passenger, the rails app is cached via a spawn server. That's not the case with PHP.

Similarly, Python has opcode cache built-in (.pyc files). Also, I am not sure about gunicorn but others do keep app files in-memory.

Servlets do the same, keeping apps in memory. You get the idea.

Frameworks definitely have an impact but it's very hard for one person to know the right configuration for every language. You had done some good work there, but it will take a lot of contributions before the benchmark suite becomes fair for every language/framework.


We've received a lot of great feedback already and even several pull requests. Speaking of which, we want to express an emphatic "thank you" to everyone who has submitted pull requests!

We're hoping to post a follow up in about a week with revised numbers based on the pull requests and any other suggested tweaks we have time to factor in. We're eager to see the outcome of the various changes.


I completely agree with you. This is not proper benchmarking, as opcode caching is missing in PHP; the benchmarks should be recalculated with APC configured.


For Play, you'll want to either 1) handle the (blocking) database queries using Futures/Actors that use a separate thread pool (this might be easier to do in Scala) or 2) bump the default thread pool sizes considerably. The default configuration is optimized purely for non-blocking I/O.

See e.g. https://github.com/playframework/Play20/blob/master/document... and https://gist.github.com/guillaumebort/2973705 for more info.


Why would the JSON serialization test perform so poorly though?


I'm pretty shocked that Play scored so low. One would think that being built on Netty would put Play in a higher rank. Database access for sure needs to be in an async block.


It's not just the DB test. The JSON test was way slower than Netty, too.


Agreed. We need to get the Play benchmark code updated to do the database queries in an asynchronous block. Accepting pull requests! :)


Not quite sure I understand your point re: the cost of the frameworks, above the bare standard library.

Do you mind breaking it down for me a bit please?

Cost in dollars or cost in hardware utilization or some other cost?


If you have developers who for one reason or another prefer a given platform, then the most important performance comparisons are about how close various frameworks on that platform get to the performance of the platform itself.

Knowing how much I'm giving up in performance in order to get the features a given framework gives me is an important consideration. Also understanding when it's worthwhile to work outside the framework on the bare platform given the speedup I'll get versus the cost I'll incur by doing so is a very important optimization decision-making tool.


How do you derive how much performance you are giving up from these benchmarks? There is not a neat relationship between the two.


There are performance numbers for a framework (Cake PHP, for example) and for the raw primitives of the platform it runs on (PHP, in that case). By finding the ratio between the two one can arrive at the performance loss attributable primarily to the framework you've chosen.

See my "framework optimization index" in comments below for a rundown on all these ratios which I was able to back out from this set of benchmarks.


"Among the many factors to consider when choosing a web development framework, raw performance is easy to objectively measure."

Oh really? Then why did Zed write such an angry rant about how you are doing it wrong?

http://zedshaw.com/essays/programmer_stats.html

Can we please see some standard deviations, at least?


What's wrong with that guy?


It's a pretty serious problem with how we benchmark though.

EDIT: when looking up what I vaguely remembered I somehow managed to come across a similar article that was published just today[1], even though I was referring to an older one[2] which was about microstuttering (basically: a high standard deviation in frame rate). The point still stands - in fact it applies to both cases in somewhat different ways.

To give an example: Crossfire and SLI graphics card setups a few years ago[2]. It turned out that while both gave a similar performance increase in average framerates, one of them had a significantly lower minimum framerate than the other. A high minimum framerate is probably more important in shooters than peak performance, but that's not what we've been testing all these years, is it? That's exactly the problem highlighted in the article by Zed.

I know this is a gaming example, but I'm sure that in user perception of the performance this matters just as much for the responsiveness of webpages.

[1] http://www.tomshardware.com/reviews/graphics-card-benchmarki... [2] http://www.tomshardware.com/reviews/radeon-geforce-stutter-c...


Indeed, it's important to look not just at the average performance and performance extremes, but also the distribution of performance.

Standard deviation helps with this. Also, oftentimes looking at latency at the 50th, 90th, and 99th percentiles is valuable, as you can see the events that would make your users unhappy. They're a very tangible set of metrics.
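For instance, a nearest-rank percentile over collected per-request latencies is only a few lines (a sketch, assuming latencies are recorded in milliseconds):

  import (
      "math"
      "sort"
  )

  // percentile returns the p-th percentile (0 < p <= 100) of latencies
  // using the nearest-rank method. It sorts its input in place.
  func percentile(latencies []float64, p float64) float64 {
      if len(latencies) == 0 {
          return 0
      }
      sort.Float64s(latencies)
      rank := int(math.Ceil(p / 100 * float64(len(latencies))))
      return latencies[rank-1]
  }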


He's a fantastic software engineer, but he is _very_ abrasive. I've read some of his posts claiming he would fight another developer in person at a conference if they stepped up. He would rent out a ring and put his yellow belt to practice.

Really, I'm not making this up. He sounds like a jerk to work with.


He's actually a really nice person. He helped me out (to him I was just some stranger on the phone) when I was trying to decide what to do with my career when I was in NYC.

Zed Shaw is probably one the best people you can know in the developer community, a very good guy.

Your sensationalism based on some of the stuff he says on Twitter and blogs is amusing, though.


> Your sensationalism

well he doesn't try to be a nice guy in public...


He is actually a very nice person to be around and work with. Besides being professional and experienced he is also outspoken with a firm opinion.

You might think that's being a jerk, I think that's honest and reliable.


I am among those appreciative of his contribution to my education. At the same time I wish he'd let the word 'fuck' regain its undoubted impact by not using it on what seems to be every conceivable occasion (excuse the pun).


He's just a bit too honest and direct for the average American.

From that perspective at least, I expect people like Zed to do great in Northern Europe. There's a reason Americans think DHH is rude. He isn't; he's just Danish.


It seems like some context might be missing here.


Context: http://web.archive.org/web/20080103072111/http://www.zedshaw...

" I mean business when I say I’ll take anyone on who wants to fight me. You think you can take me, I’ll pay to rent a boxing ring and beat your fucking ass legally. Remember that I’ve studied enough martial arts to be deadly even though I’m old, and I don’t give a fuck if I kick your mother fucking ass or you kick mine. You don’t like what I’ve said, then write something in reply but fuck you if you think you’re gonna talk to me like you can hurt me."


He's a lifestyle ranter. He used to have a section of his website called /rants/ which you can still read here: http://web.archive.org/web/20080105054424/http://www.zedshaw...


Seems like a bad case of ADD :\


Ha! Gotta love Zed


I'm torn about this. On the one hand, while I've known my framework of choice (Rails) is slow, I didn't know how much slower it could be in the grand scheme of things. But on the other hand, I'm more shocked by the difference between EC2 and dedicated hardware (10x improvement with rails), and even 89 requests per second (20 query benchmark on EC2) is still a decent amount of traffic. (Plus this doesn't count any optimizations I would make anyway, like caching).

Either way, good architectures usually move the high-traffic or high-CPU areas away from a scripted language anyway.

Thanks for the really informative post! Go seems to be a good balance as a high performance language without having to go back to my traumatic Java days.


Don't be too disappointed about Rails.

As a rule I like to divide this world into "Featureless" and "Featurefull" products.

When you use Rails, you're aiming to pile up features. You want to react to product managers and to users; you want to work fast and satisfy the needs of customers, or else you won't have anyone to build for.

In this reality, the fact that you're doing 20 req/s is OK. In fact, I'm betting that even if you take Go or Node.js, pile up all of the infrastructure and features that exist in Rails, and pile up a ton of your own code, buggy and not buggy, you'll get around the same kind of satisfaction index from users.

This is because your product can be perceived as slow even though your servers are blazingly fast.

On the other side of the spectrum there are "Featureless" products. These are infrastructural products. A logging service. An analytics service. A full-text search. A classification and recommendation engine.

These you don't want to build in Rails. I'm sure you haven't even considered it. These you want to build with one of the top-notch libraries that this survey indicates.


Also, there are certain options around Rails, like Thin or Unicorn, that can drastically increase your overall performance. So in that sense, I think it's a lot more complicated to determine.


Thanks for the feedback, atonse! We had a great amount of fun putting this together, as you can imagine.

I agree, a remarkable take-away for us was how dramatically our i7s outperformed the EC2 instances. Admittedly, those were EC2 Large and not Extra Large instances.

A previous draft of this blog entry had a long-winded analysis of hosting costs--discussing the balance between ease and peace-of-mind provided by something like AWS versus the raw performance of owned hardware--but we elected to remove that since it wasn't really the point of the exercise.


Were these the new 2nd-generation large instances or the original ones?


They were m1.large.


EC2 was a hard platform to test on, only because our i7 hardware would give us results fairly instantaneously, but we became impatient when we had to wait upwards of 10x as long for the data on the EC2 large instances.

We're actually very interested in how the large/newer instances perform.


I feel obliged to ask what constitutes 10x of instantaneous.


I worked with Pat (pfalls) on this effort. He pulled the benchmarks together and built the script to automate the tests. We aimed to deploy each framework/platform according to best practices for a production environment and then stress-test common operations: JSON serialization of objects and database connectivity. We were surprised by the wide spectrum of performance we observed and hope it is interesting to you as well. Four orders of magnitude in one of our tests!

If you have any questions or see something stupid we did, please let us know. We'd like to correct any mistakes straight away, especially since we're certainly not experts on all of these frameworks and platforms.


I'm no expert, but I think certain languages/frameworks are better suited to be behind certain servers when high concurrency is tested. E.g. from http://nichol.as/benchmark-of-python-web-servers it seems django would be better served behind gevent.


There are a lot of variables and tweaking that can be done, and it would be nearly impossible to optimize each.

Similarly, I was wondering what sort of an effect connection pooling would have, as the out of the box django distribution doesn't do that. It really didn't perform too well in their tests.


At LEAST gevent with session write-through caching, psycopg2pool, PostgreSQL (hello, excellent South support?), and no unnecessary middlewares or applications that rely on them (if it's a speed-oriented use of Django, we're hosting on a specific API sub-domain, right?). At most, THEN you tune the settings to keep an optimum number of Postgres threads alive and tweak some gunicorn/nginx max-connections parameters for your site. If running all locally, use UNIX sockets. This article is trash when it comes to providing any useful data, other than for a Django that's barely been configured beyond not using SQLite (and who the hell uses that in production?), so I don't buy the "oh well, we just wanted to see what it'd do out of the box" rhetoric. Might as well benchmark ./manage.py runserver. I wish they'd done it right or not published at all, let alone published to shill their company that doesn't provide what they advertise.


Thanks for your pull requests, knappador. We will try to get a revised post out in roughly a week or so with as many of the tweaks we've received (as pulls and tips) as we can muster.


I'd be interested to see performance for Vert.x on its other hosts (this is the JVM version, I believe).


I think you may misunderstand Vert.x's polyglot features:

While Vert.x supports many programming languages, all of them run on the JVM. This means when you use the Ruby Vert.x API, you're using JRuby; likewise, JavaScript runs through Rhino, Python through Jython, and Groovy/Scala through their own compilers.

That said, it would definitely be interesting to see the performance implications of using one of those languages and vert.x on the JVM.


Interesting point. You're correct, we've only tested Vert.x as a Java/JVM platform.


I agree, seeing Vert.x with its other language options would be interesting.


Rhino with RingoJS would be another good JVM-based test.


Erlang would be nice to see, with, say, Chicago Boss.


Agreed. As you can imagine, we had to stop adding additional frameworks somewhere or we'd never get this posted. :)


Since these tests seem to be all about JSON serialization, it would've been interesting to see the tests with rails-api instead of the standard Rails stack:

https://github.com/rails-api/rails-api

What webserver were you using on JRuby? Was it Trinidad? Did you try Jetpack?


I concur, and Rails 4 may not be officially released yet but it's stable enough to run these tests against.


rails 4 with some concurrency would be interesting to see


Where is ASP.Net MVC? Odd that you list obscure frameworks like Wicket and leave out one of The Big Four frameworks.

(The big four in my book are: ASP.Net MVC, Rails, Django and CakePHP)


We'd love to have ASP.Net MVC included. One minor gotcha is that to do it justice, we'd need to spin up a Windows EC2 instance and figure out how to script that. It's on our to-do list!

We did briefly test ASP.Net on Mono (see another comment in this thread) but didn't include it since we didn't believe that qualifies as a "production" grade ASP.Net MVC deployment.


You should include it; lots of shops are deploying their products on .NET MVC with Mono.


I agree... Let's see what the numbers show for Mono, and on Windows.


Just use AppHarbor.


I came here to say this. I don't understand why some of the development community likes to act like .Net doesn't exist....

It pains me to see charts and reporting done like this while leaving out my favorite framework.


Since these benchmarks are so wide-ranging, I agree. But that means setting up an entire new testbed on Windows, and then trying to make it comparable to the other platform testbed; possibly tuning. You need a Windows expert to do this. My question is, why aren't Windows experts setting these up?


If you can cover ASP.Net MVC, then I'd recommend including ServiceStack. My own tests of their JSON implementation have shown it to be 5-10x faster than ASP.net MVC.

Drop me a line if you need a hand with either :)


Just curious, but is CakePHP seen as a major framework by people outside the PHP Community?


Big four without any Java framework, but with a .NET one? And CakePHP as a major one? I wasn't aware it's still used.


I'd like to see how .NET MVC would compare. I realize you'd have to spin it up on a Windows EC2 instance, and there would definitely be some variance in the performance of that box vs. the *nix EC2 instances, but I'd still be interested in seeing how it fares in comparison.


We agree. We aim to provide a .NET MVC test soon. We did briefly test it on our i7 hardware and, if memory serves me correctly, it clocked in at around the same position as Spring.

But don't quote me on that! :)


I would very much love to see:

(win | mono) + (httphandler | asp.net mvc | webapi | servicestack | nancyfx)

it would hopefully compare with java stacks!


Yep, would also be interesting to see Web Api, hosted on Windows and on Mono.


I would also suggest ServiceStack


And synchronous and asynchronous (using C# 5 async) versions.


I think you're about right on that one. Of course we were running on mono.


> This exercise aims to provide a "baseline" for performance across the variety of frameworks. By baseline we mean the starting point, from which any real-world application's performance can only get worse.

I disagree with the implication here (that this is a good point for comparison because "real-world application's performance can only get worse."). Yes it can only get worse but how much worse (per unit of "features") is both significant and unaddressed.

This isn't the best example but look at the gap between the top and bottom of the scale in the Database access test (single query) and Database access test (multiple queries) charts: In the first, Gemini is ~340x faster than Cake, in the second, only ~23x faster. There is still a big gap but it closed by an order of magnitude once you stepped past the most trivial possible DB access test.

So nodejs or php-raw is faster than cake at a single DB access, but what about when you create a real world scenario with authentication, requirement to be able to update features faster (i.e. use an ORM), env. portability requirement, etc.? It seems to me this would look like a little slower, a little slower, a little slower in the {raw} versions, and already included, already included, already included in Rails or Cake. The full featured frameworks take a lot of their performance penalty up-front, with less of a hit as features are added (maybe? :P).

My point is that it's not reasonable to assume that hackernews-benchmarks will actually reflect production use. That said I think the article is cool, and agree that it's good to keep framework authors' feet to the fire regarding performance!


This is completely loaded. Your implication is that the only viable test is one which exercises all of the functionality of the most feature-rich framework. How would that be a) viable and b) meaningful?

We know that there is a set of common features, and the benchmark's goal is to test least-common-denominator stuff across the frameworks. Authentication and portability are not LCD. The argument that they are is capricious. What if we made the requirement be that the framework is a Lisp? Now we've completely changed the intent.


I meant to suggest that comparing php-raw to Rails is apples & oranges, not "you must benchmark in a way that benefits larger frameworks", just "please acknowledge that LCD tests like this inherently cast Railsy frameworks in a bad light."

It's like condemning a Swiss Army knife because it's not as efficient as a fixed blade at cutting apples. Well, yeah, that's true, but what about when you need to drive a screw or pull a cork? One is a multitool; it doesn't make sense to compare it to a specialized tool unless all you plan to do is cut apples.


Would've loved to see http://servicestack.net on this list which has great performance on .NET and Mono: https://github.com/ServiceStack/ServiceStack/wiki/Real-world...

And also maintains .NET's fastest JSON and Text Serializers: http://theburningmonk.com/2011/11/performance-test-json-seri...


Thanks for the tips on these. We'll add ServiceStack to our to-do list. As you might imagine, that list is getting long as a result of some great community feedback. Pat (pfalls) is diving into the pull requests this morning.


pfalls - amazingly, I spent the last 2 days of my holiday doing the same thing for a future open source project. I was just stumped when I saw you guys did the same (could have saved me a couple of days!)

I wanted to find the leanest Web framework on any kind of platform; but the difference from your approach - I already knew the kind of code that would run on it.

I tested: Go, Java (servlet, dropwizard), Scala (scalatra), Ruby, Node.js (connect).

For me it was:

* Scala

* Java

* Clojure (equal to Java - big surprise here)

* Node.js

* Go (almost equal to Node.js)

* Ruby (far far down)

Scala took the lead with amazing results. Moreover, a good metric was latency, where Scala was the only one to reach microsecond resolution.

I'm not a fan of Scala because of its surrounding tools, which is why I'm still considering going for either Clojure or Node.js.

I think the most positive surprise was Clojure, given that it is a dynamic language. The most negative surprise was Go: by itself it is impressive, but when given real work (web handling, Redis/MongoDB) it goes bad quickly. Happy to see this correlates with your findings too; I'm assuming this is a symptom of library maturity..?

I'd be happy to see how Scala fares on your tests.

You've done an awesome job!


Thanks for the comment!

This started out as a small exercise, that quickly ballooned because we were curious about every framework and platform. Obviously we had to stop somewhere, but we're very interested in adding more tests in the future. In fact, we're hoping the community will help us out as well!


Yes, I know the feeling :)

What started as a couple of hours of exercise for myself ended up as 2 days of hacking and barely sleeping, as surprises in my assumptions kept unfolding, and as I wrote and rewrote POCs just to validate that Clojure is as fast as the numbers say, that Scala is faster than Java, etc.


Dropwizard looks awesome, just the kind of project I've been looking for, also it links to JDBI which is very similar to a sql lib I maintained for myself all these years and looks awesome. Thanks for posting!


Would you consider releasing some of the benchmark code/setup? (better yet integrated into OP's project) I am very interested in seeing clojure's performance. What kind of framework did you use for clojure?


Scala really would have been nice. Especially a Scala/Akka/Spray combo :)

I'm working with a setup like this and just love it!


1) The Python version has some basic newbie coding errors. This sort of code is what Python programmers call "Java written in Python". It may be a valid algorithm in Java, but it's the wrong way to do it in Python. Code like this will work, but it will be slow. Depending on the size of "queries", you are potentially allocating gobs of memory in two different places for no reason, and then throwing it away without using it. I wouldn't be surprised if the examples in other languages had similar problems.

2) The JSON serializer in Django 1.4 uses a method which is known to be very slow, but which is easily portable across different platforms and works with older versions of Python. They no doubt included it for easy bundling. In a real application you would probably want to simply use the normal JSON serializer from the standard library (which is many times faster).

3) The examples are little more than "hello world". I did some benchmark tests with several Python async frameworks, Pypy, and Node.js for an application I was working on. With small JSON objects there wasn't much difference in performance. Once you started using large JSON objects the performance lines for all versions were indistinguishable from each other. The performance bottlenecks were in libraries, and those standard libraries were all written in 'C', so interpreter versus compiler versus JIT made little difference.

4) The problem with "toy" examples is that in real life there are two performance factors which must be taken into account. Think of as y = mx + b. With a toy example you are probably only measuring "b". With most real life applications it's "m" that matters. There are often different optimization approaches that are best for varying ratios of "b" and "m". You have to know your application intimately and benchmark using data which is realistic for that application.

Python has a reputation for being "easy to learn". However, it is "easy" in the sense of being able to hack something together that works without knowing very much. There can be several different ways of doing things and doing it one way versus another way can mean a difference in performance of several orders of magnitude. The same may be true for some of the other languages, but I haven't examined them in enough detail to say.


Numerous irregularities plus a strong vested interest in the JVM make me doubt they have given adequate shrift to Go, here.

Given the amount of interest in Haskell and Yesod around here, it is strange that it is missing.


Could you elaborate? Their website seems to indicate that they are a very polyglot shop, not someone pushing a JVM agenda:

"On the back-end, we use Java, Ruby, Python, .NET, PHP and others based on what makes sense balancing server performance, scalability, hosting costs, development efficiency, and your internal development team's capabilities."


"We have included our in-house Java web framework, Gemini, in our tests. ... "

Then you see results for some languages that are completely out of whack with most other such benchmarks (they themselves mention the weirdness of Sinatra vs. Rails, for example).

Then you see on a couple platforms that more performant mainstream options have been excluded, for no good reason.

Then if you look at the repo, there are deployment choices and code mistakes in some of the other languages which go well beyond elementary incompetence...


Feel free to issue a pull request to help our testing. We are not trying to push Java-based frameworks over any others and we believe we are being fair across the board. That being said, if there are "code mistakes... which go well beyond elementary incompetence", then we would love to correct and retest these.


Maybe you should've had the controllers execute raw SQL for a better comparison. I see you executing a regular query in Rails whereas your Java servlet is using prepared statements.
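(For comparison, the prepared form in Go's database/sql looks like this; a sketch only, and the table/column names are assumptions rather than the benchmark's actual schema:)

  import "database/sql"

  // Prepared once and reused across requests, so the statement is parsed
  // a single time instead of on every query.
  var worldStmt *sql.Stmt

  func prepare(db *sql.DB) (err error) {
      worldStmt, err = db.Prepare("SELECT id, randomNumber FROM World WHERE id = ?")
      return
  }

  func randomWorld(id int) (n int, err error) {
      err = worldStmt.QueryRow(id).Scan(&id, &n)
      return
  }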


We love Go here! But admittedly, we have not yet deployed an actual Go web application to a production environment, so the tests demonstrate our first attempt at creating a Go production environment. We based the approach on whatever material we could find on Go reference sites.

That said, we'd love to hear what we did wrong in the Go tests so that we can fix those up.

We'll be posting follow ups as we've had a chance to go through all the recommended tweaks.


I hope no one contributes a Haskell solution to this farce.


>Given the amount of interest in Haskell

I see more people making uninformed "haskell sucks" posts than expressing interest in it.

>and Yesod

Really? Yesod is the anti-haskell haskell framework.


> Really? Yesod is the anti-haskell haskell framework.

Can you please elaborate? Being interested in Haskell web development and trying to choose a web framework makes me wish for more information.


I wouldn't call Yesod "anti-haskell". By default, it relies on QuasiQuotes and TemplateHaskell a lot [1], which are extensions to the GHC. So by default, you'd have a hard time running Yesod applications on anything else but GHC (the Glasgow Haskell Compiler). These extensions allow you to write in an EDSL that generates Haskell for you. IMO, Yesod's use of these extensions are a benefit, as it allows the user to get stuff like type-safe URLs in HTML for free (e.g. you put href=@{Home} on your HTML element and Yesod will ensure that the value interpolates to a route that exists at compile time).

Haskell libraries often depend on language extensions, whether it is overloaded strings or type families or whatnot... so I think it's strange that Yesod gets picked on for doing the same: taking advantage of the tools provided by GHC to create a better environment for the developer.

[1] http://www.yesodweb.com/book/haskell#template-haskell-14


Hugs is now defunct (last release in September 2006, doesn't even support the 2010 language standard), so there is no reason that being GHC-only should be a consideration in selecting a Haskell framework. It's the only real option.


I'm guessing, but I think it's because yesod uses a lot of magic such as templates and the like. The other frameworks like Snap use more idiomatic Haskell.


Yesod is designed to try to replicate rails style frameworks, which as an approach, doesn't work well in a static language. It is also designed to try to hide any traces of haskell. Rather than provide a framework to write haskell code in, they use quasi-quotation to provide a bunch of totally different syntaxes for different parts of the app, which get compiled to haskell behind the scenes. Most haskell users prefer to write haskell rather than specialized, single use languages with limited functionality and poor error reporting.

Then on top of that, the marketing behind Yesod is essentially deliberate mistruths that suggest weaknesses in Haskell which do not exist. See how pilgrim689 thinks that the EDSL Yesod uses for routing "gets you type safe urls"? Type-safe URLs are also available in Happstack and Snap, but written in Haskell rather than a weird custom pre-processor language. The EDSL just gives you a different syntax (and makes error messages complex and hard to understand); it does not give you the type safety. Pointing out that creating custom languages which are inferior to Haskell and have no benefits is a bad idea results in whining of "stop picking on yesod just because we use extensions, everyone uses extensions!", despite the use of extensions never being brought up.

As for more information about web frameworks in haskell: I've tried all three and can give you my thoughts. I ended up using snap, so consider me biased when reading this. Yesod is rails-like in that it pushes a misinterpretation of MVC on you that encourages writing redundant code. Happstack and snap aren't really frameworks in that sense, they don't say "give me some code following my conventions and I'll run it", they say "here's how you get access to the request, have fun". More like libraries than frameworks.

The DB access layer Yesod provides is of the "dumb everything down to the lowest common denominator" variety, except that it adds even more limitations beyond that. So you end up having to use something else that is not integrated at all. Happstack and Snap don't provide a DB access layer, but do provide integration with several DB libraries off Hackage (hdbc, haskelldb, postgresql-simple, acid-state).

Happstack has the best documentation of them, and it and snap are very similar design wise. Porting an application from one to the other is pretty straight forward. The only reason we settled on snap instead of happstack is that snap includes a development mode that works well, and happstack does not. Meaning with snap you just change your code and it picks up the changes, recompiles and reloads it automatically, and shows you any errors in the browser when you refresh. With happstack you either need to work out your own way to deal with that, or keep recompiling manually all the time.


You and tikhonj are all over the place in here. If you were being downvoted into the gray, you would know. There is no shortage of people praising Haskell every day, this is what is in fashion today.

There are also regular posts about Yesod.

I conclude that you know perfectly well that Haskell and Yesod are regularly mentioned on HN, but find it inconvenient to have mentioned for some reason I do not fathom.


> There is no shortage of people praising Haskell every day, this is what is in fashion today.

I really don't like this characterization of interest in Haskell. It implies that it's no different from any other language and is just arbitrarily picked up because it's trendy. Learning Haskell is a very substantial investment of time and effort, it is very different from languages that most programmers have used before. It practically tells people “don't even try to like me, I'm high maintenance.”


>You and tikhonj are all over the place in here

I am all over the place in here for the exact reason I mentioned. Go look at my posts, for every post about haskell by me, it is in response to someone posting some absurd nonsense like "haskell can't do real world" and "functional programming is great except you can't really do it because state". If people were interested in haskell, they would express interest, not strawman dismissals.


But that is like claiming nobody wants gay marriage because look how loud those Westboro people are screaming.

I'm interested in Haskell. I find it to be frustrating sometimes, and sometimes I vent my frustrations. It is hard to learn. But out of all the opinionated languages out there, Haskell is the one that I agree with the most.

There are plenty of people here that are obviously interested. Why does it matter that the naysayers say nay?


>But that is like claiming nobody wants gay marriage because look how loud those Westboro people are screaming.

That analogy would only be accurate if those Westboro people were in the majority.

>Why does it matter that the naysayers say nay?

I'm not sure how to answer this, given the context. I simply pointed out that I don't think the idea that there's a lot of interest in haskell here is accurate, and cited all the uninformed crap spewed about haskell all the time as evidence.


Would love to see how these results compare to some of the web frameworks for concurrent functional languages like Erlang/Haskell: Nitrogen, Chicago Boss, Snap, Yesod, etc.


Please don't encourage them. Even if Haskell comes out on top, I would still be unsatisfied because the rest of the benchmarks are unfair. Lies by confusion are still lies.


ditto. I hear Warp (the server behind Yesod) is a beast.


I'm not familiar with Warp. Would one of you guys be willing to help us put together a test for Yesod?


#haskell, #yesod, #snapframework on freenode are very helpful.


Right, where is Zotonic?? It's actually focused on speed/performance.


This is exactly why I decided to use PHP for my startup. I have something along these lines that I hope to blog about in the coming weeks (I tested php-fpm on nginx/go/node.js/silk.js and php won by a landslide when it came to speed).

I would love to see php-fpm on nginx included in this test.


When it comes to speed, considering you are using nginx, the way to go is using nginx as an app server, not just a FastCGI frontend. The lua-nginx-module combined with a proper database module (async, with connection-pool support, like ngx_drizzle or ngx_postgres) can give you speed. OpenResty provides a simplified, preconfigured way to try it and adds some features too. http://agentzh.org/misc/slides/libdrizzle-lua-nginx


The problem with php is that it looks great on (some) micro-benchmarks, but on real apps under real sustained load it certainly turns to cold dog shit from time to time for no apparent reason.


What are you basing this on? I've been using PHP for well over a decade in high load environments and never experienced it turn "to cold dog shit" .. any issues I have experienced had a good reason, not "no apparent reason".

But then, I've never used a PHP framework in all the time I've used it .. maybe that has something to do with me never having had negative issue with PHP.


As a Rails developer and admirer, this is eye-opening. I love the framework (and Ruby especially), but these numbers bear some serious consideration.

30-50x performance difference gets really... real, no? The standard refrain of "throw more hardware at it" must reconcile with the fact that a factor of 30-50x means real dollars for the same amount of load. Is the developer productivity really that much greater?


Preface: This post is going to come across as a Rails apologist piece, but please read the entire thing before you reach a conclusion. Please also consider that you could apply these same arguments to just about any of the high-level language based frameworks on the list. I use Ruby on Rails in my comparisons, but I'm a huge fan of Node.js, Python/Django, and Go.

I fully respect the JVM family of languages as well. I just think that Mark Twain said it best when he said: "There are three kinds of lies: lies, damned lies, and statistics." It's not that the numbers aren't true, it's that they may not matter as much, and in the way, that we initially perceive them.

Performance is certainly something you should consider when selecting a language/framework, but it is not the only thing.

========================

You should undertake a detailed examination of these statistics before making any decisions.

Issue #1) The 30-50x performance difference only exists in a very limited scenario that you're unlikely to encounter in the real world.

Look carefully at the tests performed. The first test is an extraordinarily simple operation: take this string, serialize it, and send it to the client. This is the test in which we see massive differences:

Gemini vs Rails

25,264/687 (gemini/rails-ruby) = 36.774

25,264/871 (gemini/rails-jruby) = 29.000

Node.js vs Rails

10,541/687 (nodejs/rails-ruby) = 15.343

10,541/871 (nodejs/rails-jruby) = 12.102

That's a 37x performance win for Gemini, and 15x for Node.js.

Side note: You might be wondering why I didn't compare to the top performer, Netty. Netty is more like Rack. You build frameworks on top of Netty, not with Netty. As a Ruby dev, you could think of this in the same context as comparing Ruby on Rails with Rack; not a good comparison. Hence we won't compare Rails to Netty.

The error would be in extrapolating that a move to Gemini or Node.js would give you a 37x or 15x performance increase in your application. To understand why this is an error, we jump down to the "Database access test (multiple queries)" benchmark.

Issue #2) Performance differences for one task don't always correlate proportionally with performance differences for all tasks.

In the multi-query database access test, we start to see the top JSON performers slow down significantly when compared to the slow down for Rails:

Gemini vs Rails

663/89 (gemini/rails-ruby) = 7.449

663/108 (gemini/rails-jruby) = 6.138

Node.js vs Rails

116/108 (nodejs-mysql-raw/rails-jruby) = 1.074

60/108 (nodejs-mysql/rails-jruby) = 0.555

In this scenario -- which is arguably much closer to the real world -- Ruby on Rails closes the gap and even beats some of the hip new kids.

But why? The in-depth answer to this question would require a lot of space, but the really, really short version is kind of a "what's the sound of one hand clapping" response: Ruby isn't actually all that slow.

To understand what the hell that means, check out this presentation from Alex Gaynor (of rdio/Topaz fame):

https://speakerdeck.com/alex/why-python-ruby-and-javascript-...

Ruby is just about as fast as C, provided you're comparing it to C that does exactly the same operations on the hardware as the Ruby code. Don't get me wrong, that's a HUGE provision. But it warrants close examination.

The real benefit of lower-level languages like C is that they give you the flexibility to drill down into your actual bare-metal operations and optimize the way the program executes on the hardware. As Alex points out, we don't currently have that level of flexibility in languages like Ruby (without dropping down to inline C), so we suffer a performance penalty.

This penalty is huge for simple tasks because they involve only a handful of operations that execute extremely quickly. As you add complexity, however, the benefits of micro-optimizations get lost in the vastness of the overall execution time.

Look at it like this. When Gemini hits 36,717 req/s in the JSON test, each request only lasts about 1.6 ms (inverting the throughput at the benchmark's apparent concurrency of about 60 in-flight requests: 60 / 36,717 req/s ≈ 1.6 ms). This is only possible because of the simplicity of the operations being done on the hardware. Ruby loses big here because there is a lower boundary to the way you can optimize without dropping down to C.

gemini: 1.6 ms per request

rails-ruby: 87.3 ms per request

When we look at the multi-query database access test, we can see how the optimization at the low level gets lost in the sea of time taken to process the request.

gemini: 90.5 ms per request

rails-ruby: 674.2 ms per request

Granted, that is still over a 7x performance win for Gemini, but this is where the Ruby arguments about programmer efficiency come in to play. I don't know Gemini, so it may very well beat Rails in that comparison too. Ruby is getting more performant with every release though, so it's easier to justify on the basis of preference alone when we're this close.


Don't conflate Ruby with Rails. Ruby _is_ slow:

http://benchmarksgame.alioth.debian.org/u32/benchmark.php?te...

http://benchmarksgame.alioth.debian.org/u32/benchmark.php?te...

and so is Python:

http://benchmarksgame.alioth.debian.org/u32/benchmark.php?te...

The only time Ruby or Python are fast is when the program is not running Ruby or Python but running some C code underneath. If your programs only consist of that, you can very well say "this ‘Ruby/Python’ code is fast". But as soon as you have something that isn't in your standard library, welcome to the actual language, and welcome to performance problems.

Elaborating on the implications of this: whenever you actually _use_ the language to do some abstraction, you pay heavily for it: http://bos.github.com/reaktor-dev-day-2012/reaktor-talk-slid...


You should really check out Alex Gaynor's slide deck. Nothing I said disagrees with what you've said here, provided you take the entire thing in context.


That's a very detailed and thoughtful response. You make some valid points. Maybe I'll try to craft some more complicated benchmarks that replicate normal CRUD operations found in most webapps.


That's kind of missing the point. In statistics, there is a notion of confounding variables: factors that affect your outcome but are hidden or outside your control. As your example becomes more complex, the opportunity for confounding to impact your result goes up significantly.

I believe the multi-query database access test is actually a good example of a "complex enough" test, but not too susceptible to confounding. In this test, we see that Rails isn't so far behind.

Basing your choice of framework on speed alone is a pretty bad idea. When you select a framework, you need to optimize for success, and the factors that determine the success of a project are often less related to the speed of a framework and more related to the availability of talent and good tools.

That's not to say you should ignore speed entirely, but that you have to weight your factors. There is a tendency to believe that you will need rofflescale when you really won't. Keep that in mind when you're weighting your factors.


Really depends on which kind of app you're working on. My main work app is 99% cached content so it would probably work just fine with almost anything. Developer time is certainly the biggest expense in my case so high-level it is.


It comes as a surprise to me that this comes as a surprise to you. Really, you didn't know Ruby is pretty much as slow as it gets?


How about Lift? By the way, is the Play framework you tested Java or Scala based?

Either way, I'm shocked to see Play perform so slowly comparatively. Although it's easily 10x faster than Rails on most tests, I'm shocked to see Node.js faster than Play! (by 2x in most cases) Wow!!

Maybe Node.js critics should start appreciating it after all..


It is probably worth noting that, while we strove to make the tests as fair as possible, we followed the official tutorials for each framework when building out the tests, and we fully expect there are small instances where minor tweaks would improve a given test. Given that I am no Play expert, it would be of great value to have one who is (and it sounds like you could lend a hand there) check out the code on the Github page. If we did anything wrong with the setup or in general, we will gladly rerun the tests. Again, we followed the official 'getting started' posts for each framework, so we believe we have best practices in place.

Disclaimer: I am a colleague of the author of the linked article.


There are a few problems with your Play code that are causing it to be unnecessarily slow.

First-- what you're really testing here is the Jackson library. A majority of the cycles used in your application are being burned in that toJson call of an array of objects. This isn't a fair test compared to the servlet implementation because you're calling Jackson against a map in the Play example, versus against a simple String in the servlet example.

Second-- you are running database calls serially, and those are blocking. Considering that you're using the more-or-less default Play/Akka configuration, there are only as many threads as you have available processor cores. I would start by increasing the parallelism-factor and parallelism-max so you'll have more available threads. More importantly though, the database access should be wrapped in a Future, and you should be returning asynchronously. This should speed up the application by a huge amount.
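
In application.conf, that looks something like this (config paths as in the Play 2.1 thread-pool documentation; the values are purely illustrative and should be tuned to your core count):

    # Tuning Play's default Akka dispatcher (values illustrative)
    play {
      akka {
        actor {
          default-dispatcher = {
            fork-join-executor {
              parallelism-factor = 3.0  # threads per available core
              parallelism-max = 64      # upper bound on pool size
            }
          }
        }
      }
    }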


Do you think such a configuration could outrun the Vert.x configuration they've posted? I'm not challenging you, I'm just genuinely curious! Because if Play+Akka can outrun Vert.x, then it would be an interesting game altogether...


I think Play, with well-written asynchronous code, could approach the Netty/Vert.x speed. In other words, I'd be willing to trade the ease-of-use of Play for the slight speed impairment vs. writing directly to Netty/Vert.x/etc.


We'd love to test that theory. Can you or any Play expert rewrite our Play code and submit a pull request?


Not sure if your theory can really account for the 7X difference between Play and Netty in the JSON test. They both create a trivial name/value pair and pass it to the JSON serializer. In Play's case, it gets passed to their Jerkson wrapper for Jackson. What would you even change about that code?

(Note: I'm not necessarily saying the Jerkson wrapper is the culprit. Could be Play's routing framework, or something else.)


The problem is that the database queries were done in a blocking fashion. The test essentially blocks the main event loop which is of course going to kill performance.


This seems to be a nice benchmark. For the Python group, I would suggest two things: (1) include a lightweight framework like Bottle, and (2) Try pypy.


Thanks. Agreed, we'd love to get a Python micro-framework in the test. If you have some free time and feel like putting together a test for Bottle as a Github pull request, we'd really appreciate that.


(3) use bjoern

seriously, in my rough hello-world and SQLite increment-by-1 benchmark, a Bjoern + WSGI app runs 2x as fast as node.js.


I'd be curious about a gevent-worker-driven server, too. Presumably they're running Django on either a thread- or process-driven concurrency server, and gevent can show some pretty major gains. You could also do permutations of these: gevent on pypy (using pypycore), etc.


I'm most surprised that PHP seems to be around an order of magnitude faster than Ruby on Rails. I knew it was faster, but didn't think it would be that much.


The comparison would be php-raw vs ruby. The actual "php" benchmark did just as poorly as rails.


What kind of server did you guys use for your Rails test? Thin, Puma, Unicorn? Are you sure you ran it in a production environment?

Update:

Looks like Passenger in development mode. Good job, you benchmarked a web server that no one uses while reloading all code between requests.

Update2:

OK, it seems to run in production mode, but still, Passenger is not an idiomatic choice.


For the sake of curiosity, I happen to have recently done a benchmark of a "hello world" rack app (literally just responds "hello world" to every request) on a number of Ruby servers (mostly JRuby, but also Puma on MRI).

They were all run in production mode with logging disabled, etc.

http://polycrystal.org/~pat/scratch/microbenchmark.png

Note that a difference of 10k requests per second vs 3k seems huge, but if you invert it, you get 100 and 333 microseconds per request, respectively. In a real, non-"hello world" app, these differences are going to be negligible.

Though perhaps it would be more interesting if instead of just responding "hello, world", the app parsed some query parameters or something. But I was mostly interested in the overhead of different JRuby servers, not comparing different app servers (i.e. overhead from Sinatra should be more or less identical whether you're on Puma, Trinidad, or whatever).


Sheesh, you can clearly see the GC pauses in the Java versions.


We used Phusion Passenger, although we have plans to add additional servers (such as Unicorn). We tried to spend time with various server choices for all platforms, and for ruby, in our short test, Passenger won out against the others.

Our understanding is that when running Passenger, simply passing '-e production' to the command line is sufficient to run in production, but if that's incorrect, we'll gladly update the test.


Please make sure you're setting higher GC limits for the Ruby tests. Ruby's defaults are awful for a framework, and result in a LOT of GC thrash. It's not uncommon to see an order of magnitude improvement in performance when they're tuned properly. (edit: I'll just send a pull request, I found the setup file!)

Something else you might consider is the OJ gem rather than just the stock Ruby json gem. The latter is notoriously slow and memory-hungry (which will compound the GC issues!)


> make sure you're setting higher GC limits for the Ruby tests

Could you elaborate on this, or point me in the right direction? I'm learning Rails and curious.


For anyone still lurking, this user replied to me via email:

Ruby allocates heaps for its objects, and sets GC thresholds based on those heap sizes. Ruby allows you to change those settings via environment variables, which means that you can end up doing fewer allocations and less aggressive GC, which makes sense when using a full framework like Rails, which is going to allocate a lot of objects.

There's a more complete answer here to get you started: http://stackoverflow.com/questions/13387664/ruby-gc-executio...
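
In practice, tuning means exporting a few environment variables before starting the app server. A sketch for MRI 1.9/2.0 (the variable names are the standard ones; the values are commonly circulated starting points, not gospel -- measure before and after):

    # MRI 1.9/2.0 GC tuning (values illustrative)
    export RUBY_HEAP_MIN_SLOTS=800000     # allocate a larger object heap up front
    export RUBY_FREE_MIN=100000           # require more free slots after GC before growing
    export RUBY_GC_MALLOC_LIMIT=79000000  # GC less eagerly on malloc'd memory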


Pretty sure the passenger+nginx configuration is the most common Rails deployment. Not the fastest, but the most common, much like Apache is the most common web server for PHP.


If that's true, it's not really a fair comparison. Running in development mode slows things considerably. They do have a question in the FAQ about that point:

"You configured framework x incorrectly, and that explains the numbers you're seeing." Whoops! Please let us know how we can fix it, or submit a Github pull request, so we can get it right."

Perhaps you/we should submit a pull request?


Passenger wouldn't be my choice personally, but I don't think there's anything non "idiomatic" about it. Engine Yard, Cloud 66, etc. use passenger in their PaaS configs, it's been very popular (on the wane now, but still), etc. It seems fair enough and the differences aren't going to be the sort of order of magnitude change which would really matter on this sort of thing regardless.


Are you sure? This looks like the right file to me, and it says '-e production', last changed 6 days ago: https://github.com/TechEmpower/FrameworkBenchmarks/blob/mast... (unless that's an incorrect switch / something else overrides it - I haven't tried it, not too familiar with Passenger)


This (from [1]) looks like passenger in production mode to me.

> rvm ruby-2.0.0-p0 do bundle exec passenger start -p 8080 -d -e production --pid-file=$HOME/FrameworkBenchmarks/rails/rails.pid --nginx-version=1.2.7 --max-pool-size=24

https://github.com/TechEmpower/FrameworkBenchmarks/blob/mast...


I'm not familiar with Rails. Could you explain what you mean in layman terms? Why is code being reloaded on every request by default?


In development mode Rails will reload code on each request to pick up on changes you have made to your application. That way you can interact with the application to help verify that your code is working properly.

As of Rails 3.2 development mode watches for file changes and attempts to only reload those files, but it's still a significant performance issue.

By default all Rails applications start in development mode, so one gotcha of benchmarking Rails is that some people will forget to set the mode correctly. That said, from the setup code[1] (line 14) it looks like they were running passenger in production mode. The max pool size seems excessive, especially when running on large ec2 instances, but I'm not fully convinced that it's out of line.

[1] https://github.com/TechEmpower/FrameworkBenchmarks/blob/mast...


In development mode, it's nice to be able to edit Ruby code, then hit reload in the browser to see it in action without restarting the whole app. Sure, there are other ways to accomplish this, but reloading code is typically how it's done in Ruby.


The Rails framework is written to reload (reevaluate) most application code between requests in development mode, so that changes during active dev are reflected. This is not the (default) behavior in production.


Interesting - C# should have also been added. Java still rules the roost. If you absolutely must have a scripting language, old PHP with its warts is better. More magic in the framework means more abstraction, indirection, and code inefficiency, and that can be the spoilsport when it comes to performance.


We agree and want to get a C# test in there. It's among the top priorities for us.


What I can take from this is that using an ORM slows things down considerably. Also, looking at the Rails example, you didn't use ActiveRecord, which is really wrong.

I think you should tweak your tests to use more real world like examples. I realize it would be hard to do this across frameworks.

For example, let's have a database query pull a user record from 100,000 users by username, and maybe do an md5 on the password.


How can you write a web framework benchmark and not include some of the non-mainstream languages with probably the most performant frameworks, like Erlang (Cowboy, Mochiweb) or Haskell (Yesod, Snap Framework)? That's just wrong, anyway.


Oh... and you think Erlang or Haskell frameworks are mainstream while you fail to mention ASP.net?


There is a huge difference between raw PHP and CakePHP. I'd be curious to see other PHP frameworks (such as Zend, or Slim) in there-- is Cake just particularly slow, or is that simply what happens when you have a PHP framework?


Per my experience, CakePHP is probably one of the slowest PHP frameworks. It has a large overhead and lots of legacy code that slow down the whole system. Generally, frameworks targeting PHP 5.3+ are faster. PHP 5.4 has a further performance boost in production, and PHP 5.5 has Zend Optimizer built in, which is supposed to speed things up further, but I have never tried that in production.


We have Zend in mind as the next PHP framework we'd like to include in the tests.


Use either Symfony 2 or Silex. There is really zero reason to use Zend Framework 2.


CakePHP is always at the bottom end of these "framework performance" tests, even when comparing PHP only. It's the everything-but-the-kitchen-sink of PHP frameworks, with bells and whistles never used by these "hello world" performance tests. Just the wrong tool for the job if you want to spit out some JSON.


I'm curious if they configured php with apc or zend optimizer. The huge difference is easy to explain as parse overhead for the framework's code, which happens on each request if you're not using a bytecode cache.


Yep, this is exactly what I was wondering. Opcode caching is a key part of production PHP environments - in this case, since the code isn't changing, they should even disable the "change check" (apc.stat=0) as one would do in a production environment.

And if this is the case with their configuration of PHP, it makes me wonder what other platforms are not configured for production in this benchmark :)


PHP's relation to everything else in this benchmark is so unusual among benchmarks that I strongly suspect a failure of parity with the other configurations. Hopefully that is unintentional.


Also, it would be beneficial to see Phalcon PHP - it's implemented as a C extension, so it should theoretically be faster. http://phalconphp.com.


If we're going that route, the fastest php performance you would probably get from facebook's HHVM JIT compiling php engine: https://github.com/facebook/hiphop-php

It's no accident that zend optimizer (bytecode cache) is being bundled into php 5.5 as open source, as mere bytecode caching is now not fast enough to charge money for when hhvm is open source. I expect the next version of zend's commercial php server product to contain a JIT engine to match what hhvm can do.


That would certainly be very interesting, especially as PHP is being nudged into that area (Twig also now offers a c-module).

I really hope the author picks up on this and compares it. It could provide some tremendous insights into pushing PHP further into this space.


Utter crap.

"Sadly Django provides no connection pooling and in fact closes and re-opens a connection for every request. All the other tests use pooling."

But it's free, open-source software, and we provide asynchronous database connection pooling for PostgreSQL:

https://github.com/iiilx/django-psycopg2-pool


Your comment is valid and would have been vastly improved without the first two words.


I'll elaborate on the first two words:

They're shilling for their company with a config and tools I wouldn't be caught dead using. No idea what other craziness lurks in the other daemon configs. It's irresponsible and misleading. They're misrepresenting the framework I use to do my work and probably others while hoping I or someone else is going to do their work for them. "Outsourced CTO services?" Their trash. My lawn.

Someone's going to eventually ask me to develop the rest of xyz in node and I'll have to repeat myself about articles like this. Bad enough when it's bloggers. Worse when it's self-shilling company that's obviously not willing to put the time in to be what they claim to offer.


Obviously if they optimized each and every one of these benchmarks we would see different results, but it would take a massive amount of time to learn the ins and outs of each framework to the point where you can do so effectively.

For one person to do a benchmark over this many samples, they have to just go with the out-of-the-box setup for each.


This is really the point.

As our blog post suggests, where we are not experts we had to rely on the tutorials provided by each framework's authors to build a test setup. If a specific framework seems low on the list, it could be due to the fact that the best practices guides we found for getting set up were not correctly configured for production use.

Draw what conclusions you would like from this statement, but we did aim to be as fair and unbiased as possible.


We aim to do Postgres testing soon. As you can imagine, the feedback from this has been awesome. Looking forward to seeing Django on Postgres.


Great to hear you're planning to do some Postgres benchmarks!

To improve fairness, you might want to consider using pgbouncer (setup to only offer simple session pooling) in between the Postgres db and any framework that doesn't have internal support for connection pooling.

E.g. I'd love to see how Flask performs using just the psycopg2 driver (i.e. raw db access, no ORM) and pgbouncer to handle the pooling.
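
The pgbouncer side is only a few lines of pgbouncer.ini. A minimal sketch (directive names from the pgbouncer docs; values illustrative, auth settings omitted):

    [databases]
    hello_world = host=127.0.0.1 port=5432 dbname=hello_world

    [pgbouncer]
    listen_addr = 127.0.0.1
    listen_port = 6432
    pool_mode = session        ; simple session pooling, as suggested above
    default_pool_size = 20
    max_client_conn = 256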


I'm curious to see Django results when using the gevent worker for gunicorn. For these type of quick JSON calls, you can see huge performance increases.


Along with gevent, it would be good to throw in psycogreen to improve DB (well, postgresql) evented connections.


It looks as if we've got a Github pull request including these changes, so we'll be able to revise the Python-Django numbers soon.


I would love to see .NET (C#) included in this test: ASP.NET Web API (synchronous and asynchronous), ASP.NET MVC (synchronous and asynchronous), and ASP.NET HTTP handlers (synchronous and asynchronous).


Here's a fairly recent end-to-end ServiceStack vs WebApi benchmark: https://twitter.com/anilmujagic/status/272544925478973440

    ServiceStack      9615ms
    WebApi           30607ms
GitHub project for benchmarks used: https://github.com/anilmujagic/ServiceBenchmark


Also ServiceStack vs ASP.NET MVC vs NancyFX vs Fubu, Mono vs IIS/windows, default serializer vs JSON.NET vs ServiceStack's, etc. There are tons of variations that could be done.


ServiceStack vs WCF would probably be a more apt comparison, but I suppose there are people using ServiceStack for web apps as well.


haha, all those hipster developers using rails can now eat the php guys' shorts :)

Seriously though, this isn't news to anyone that does this professionally. The further up the abstraction curve you climb, the less performant the code will be. Ease of development vs run-time performance.


Exactly. And since human time is much more expensive than CPU time, very few people are writing their web apps in machine code.


What I'd love to see paired with this data is a cost comparison. At what scale does performance of ruby/python/php become cost prohibitive? Twitter made the move from Ruby to Java some years back, did they ever post a comparison of their numbers before and after?

Also, the difference between EC2 and local i7 hardware is glaringly obvious. At what scale does owning the server hardware become imperative?

I know these questions are beyond the scope of a performance review, but inquiring minds would like to know.


As you might imagine, during this exercise, we've had a lot of conversations about the points you raise. We have our own opinions, but we ultimately removed most of that content from the blog post because we didn't want it to be too editorial. We will be posting follow ups with some of our opinions.

Some things are really difficult to answer in a vacuum. If you already have a competent devops staff, hosting your own hardware is probably beneficial. The increased performance per "server" is substantial. But no devops staff? Then it's either very risky or cost-prohibitive to own hardware.


I too was shocked at the deltas between Amazon and dedicated hardware. I think AWS runs on Xen, so I wonder if you could strike a more optimal performance/flexibility balance using lighter virtualization (LXC or OpenVZ) with either in-house hardware or another VPS provider.


While it is true that EC2 runs on a customized version of Xen, it's very unlikely that latency is being introduced by the hypervisor itself. With paravirtualized kernel and drivers, Xen introduces negligible overhead, and thus LXC would likely be no better.

The reason that the virtualized setup performs more poorly than the dedicated setup is that you are fighting for CPU time with other AWS customers, so those other customers are introducing latency into your application. Any shared/virtualized host will have this problem.

Interestingly, where Netflix really wants to squeeze CPU performance out of EC2 instances, they allocate the largest instance type so that they know that there's nobody else on the underlying machine.


Regarding performance of Amazon, this post from 2009 is (still) very interesting:

http://uggedal.com/journal/vps-performance-comparison/

Amazon performance is surprisingly low, but also surprisingly consistent. For some use cases, consistency (knowing what you get you for what you pay) might be worth a considerable hit in performance.


Actually, we had an early revision of the benchmark tests that had exactly that analysis, but we really felt it is a very interesting subject that deserves more focus.


Since Amazon always takes a cut, it stands to reason that if you can fully load a server on a continuous basis, owning is cheaper than renting, regardless of how efficient EC2 is. You should only rent for the peak load, not the base load.


What would be much more interesting than "peak responses per second," which is a weird metric to begin with, is the actual histogram of sampled throughputs. Or at the very least a box-and-whisker plot (http://en.wikipedia.org/wiki/Box_plot).

Most folks who have run Rails at scale, for example, find that the untuned garbage collector in MRI (Ruby's default interpreter) introduces a large amount of variance.


We don't have a box plot, but we have line charts for all of the data (performance at multiple client concurrency levels). Just click the "All samples (line chart)" tab on the data panels.

(Note that the very first data panel in the intro is an image and doesn't have tabs.)


No love for Flask? I would have loved to see how it compares to Django and RoR.


I'd love to see how PHP 5.4 compares. In my own app, I saw a noticeable speedup and RAM usage per request dropped by half.


One glaring omission: DOS on Dope [1]

[1] http://dod.codeplex.com/


It would be nice to see raw Python like you have raw PHP. I would expect Django to perform very poorly unless you optimize its caching.


What was the parallelization like across these tests? Were they all running single thread/process mode or did you try to take advantage of threading/multiple-processes/etc. to optimize a production-like performance across all the different frameworks? If the latter, that's a really impressive amount of work!


The short answer is that we attempted to use the CPU as fully as possible across all frameworks. So for those that had tunable parallelism, we used the settings that seemed best given the number of cores available on each platform (2 for EC2 Large, 8 for i7).

We posted the deployment approach for each framework to the Github page.


After having a look at the code on Github, it looks like they did set up multiple processes/threads to really test the scalability of these platforms. That's really impressive, way beyond the single-threaded experiment I would have anticipated something like this being! Wow!


We attempted to take advantage of threading/multiple-processes as best we could (see the nodejs code for an example of using the cluster module). But we suspect there are additional areas of improvement here.


This is super awesome. I wish Yesod/Warp was available though :)


I've just sent a pull request[1] with a Yesod implementation.

It runs pretty well, scoring similar to webgo on the JSON pong benchmark, almost at the same level as Grails on the 1-query benchmark, and slightly faster than Play on the last benchmark.

So, Yesod is in the same performance band as Play or Grails, and is 3~4x faster than Django or Rails.

But I've tested those on a single-core VirtualBox VM, and I know Yesod scales pretty well in a multi-threaded environment.

Also, keep in mind that most of the top-performing frameworks are not fully featured web frameworks but asynchronous I/O libraries (netty, go, nodejs, vertx, ...) whose implementations just mindlessly write the raw response directly to the socket, whatever the HTTP request was.

[1] https://github.com/TechEmpower/FrameworkBenchmarks/pull/39


Thanks, Raphaelj! Pat (pfalls) will be in touch if he has any questions. Really appreciate the contribution!


What's the difference between "php" and "php-raw" in some of the data? Maybe I'm still in my morning fog, but having trouble thinking of what "php-raw" might mean. Sigh.


The code says php-raw refers to using PDO (which all projects should be using) vs using an ORM or ActiveRecord.

I am completely stunned by the performance cost of using ORM/AR, and will be using this to shame our team lead into giving it up and going for raw queries.


Don't be so hasty. As others pointed out, caching often plays a factor on larger projects. The degree of portability provided by ORMs is nice (test mode hits SQLite, prod is pg or mysql, etc). Also, the readability of ORM-oriented code shouldn't be overlooked, especially as new people come in to the team. Two or three expressive lines of ORM are easy to read; 15-20 lines of nested SQL hitting weird table names and aliased column names and such aren't easy to decipher the intent of (especially when there's a problem).

I typically use ORM for about 95% of a project, falling back to a few explicitly native SQL calls when performance can be shown to be a bottleneck in those locations.


Some of the ORM cost can be mitigated by using caching. In most cases this is essential in a production deployment.


Agreed. However, for our database tests we expressly wanted to stress-test the ORM and, where configurable, disabled caching.

We plan a subsequent test, time permitting, that enables caching.


Important to note that OP doesn't mention using APC with this, something that any production code would be using in a real world case. This would have an impact on the numbers he uses and on the diff we are seeing.


Gotcha. Thanks!


The best way to understand it is to look at the source:

php - https://github.com/TechEmpower/FrameworkBenchmarks/blob/mast...

php-raw - https://github.com/TechEmpower/FrameworkBenchmarks/blob/mast...

For "php", they used an ORM. For php-raw they used the stdlib (pdo).


Hi Chasing. We put a note about that suffix in the "Environment Details" section. The "raw" suffix means there is no ORM used. If there is no "raw" suffix, you can assume some type of ORM is used.


FYI on the PHP-DB-Raw approach: https://github.com/TechEmpower/FrameworkBenchmarks/blob/mast...

By default, PDO does not do true prepare()s; it just does string interpolation. You need to pass this parameter with the options:

    PDO::ATTR_EMULATE_PREPARES => false
And then you'll actually be using MySQL prepared statements. You'll see a noticeable performance improvement for large amounts of queries.


Thank you! We'll get this modified and re-tested.


Isn't it that Django and Rails have flat curves because they are by nature blocking? It wouldn't matter how many requests you throw at them, as they are limited in how much they can handle at once.

And on top of this, comparing database requests is meaningless, as the blocking nature of the framework itself is the major bottleneck, not the database.

Now, for a normal web application, the largest number of requests would come from static content, or cached content within the web app, so the real gain would be a tiny fraction between Python/Ruby and Go/Java-based frameworks.

That said, if you want to handle static content (images and such) from within your app, or build a JavaScript-centric application with lots of tiny requests, or even persistent ones... neither Rails nor Django would do.


I roughed up a flask benchmark but I haven't been able to test locally.

If you want to help, check out this pull request: https://github.com/TechEmpower/FrameworkBenchmarks/pull/14/f...


Awesome, thank you! We'll aim to get this incorporated into a follow up post soon.


A few additions/tweaks that would be great to see:

1) Add a raw HTTP test -- no template compilation or HTML, just return "OK". That would give a relative idea of the cost to "just turn the thing on".

2) Don't JSON-encode after the database tests. By JSON encoding, you're doing two things but saying you're only testing one.

I come from the Python/Django world, and I know that different Python JSON packages have orders-of-magnitude differences in their performance. From that I can infer that there are probably similar or greater cross-language differences -- I suspect Node.js having JSON as a native object notation helps immensely.


Why are there no Go results for the DB access benchmarks?


I'd be interested to know this as well. I can understand time constraints in compiling these kinds of comparisons, but as I'm about to embark on a web project in Go, I'd be interested to know if there were any technical constraints that prevented benchmarking Go.


We recognize that deficiency and we do aim to add database tests for Go. If you could quickly draft that up and submit a pull request, we'd love to add it.
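
To give potential contributors a head start, the shape we'd expect is roughly this (an untested sketch on my part: the go-sql-driver/mysql import and the DSN are assumptions, and World is the table from our schema):

    package main

    import (
        "database/sql"
        "encoding/json"
        "log"
        "math/rand"
        "net/http"

        _ "github.com/go-sql-driver/mysql" // driver choice is an assumption
    )

    type World struct {
        Id           int `json:"id"`
        RandomNumber int `json:"randomNumber"`
    }

    var db *sql.DB

    // dbHandler fetches one random row and serializes it, like the other db tests.
    func dbHandler(w http.ResponseWriter, r *http.Request) {
        var world World
        // database/sql pools connections; each query borrows one from the pool.
        err := db.QueryRow("SELECT id, randomNumber FROM World WHERE id = ?",
            rand.Intn(10000)+1).Scan(&world.Id, &world.RandomNumber)
        if err != nil {
            http.Error(w, err.Error(), http.StatusInternalServerError)
            return
        }
        w.Header().Set("Content-Type", "application/json")
        json.NewEncoder(w).Encode(&world)
    }

    func main() {
        var err error
        db, err = sql.Open("mysql", "benchmark:secret@tcp(localhost:3306)/hello_world")
        if err != nil {
            log.Fatal(err)
        }
        http.HandleFunc("/db", dbHandler)
        log.Fatal(http.ListenAndServe(":8080", nil))
    }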


Sorry, but as much as I'd love to contribute, I'm still learning how to write database apps in Go myself (that's actually part of the reason I wanted to see your tests). So I'd be the wrong person to write code for that particular test.


Very nice benchmark. As the developer of a toy "framework" (but a powerful HTTP server library), I wanted to check on my project. I did a quick blog entry at http://bit.ly/10l9Smj. I got ~58288.10 req/s for simple JSON, 8594.39 on the DB test using SQLite, and 503.23 for 20 requests.

Anyway for me its more important speed of development than performance on the server. Maybe my servers do not get that many visits.


Oh this is awesome! Very nice performance for a mobile CPU.

Pull request, maybe? :)


Would love to see how modern Perl frameworks (Mojolicious/Dancer) ranks in such a test.


Interesting thought. Would you be able to put together a test for one of those? pfalls can give you the details for how we expect the test for a new framework to work.


Catalyst too please.


[deleted]


No, they're not. They can run via CGI (so, yes, compiled for each request) but that's slow as ass. They run via Plack/PSGI - they can be deployed in a multitude of ways, including via FastCGI or running standalone using built-in webservers, or via Starman - the latter in particular is very fast indeed. With a simple "hello world" type app, > 6000 requests/sec can be easily handled. Obviously an app of more realistic complexity won't be quite that fast, but you'll still handle many requests every second with no trouble. http://stackoverflow.com/a/4770406/4040 contains some basic benchmarks.

EDIT: also, mod_perl is generally best avoided these days; it's old, not very pleasant to work with, and ties you to Apache. Writing an app with Dancer / Mojolicious etc means you can deploy in various different ways with ease.
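
For instance, running a PSGI app standalone under Starman is a one-liner (worker count and filename illustrative):

    # preforking PSGI server with 8 worker processes
    plackup -s Starman --workers 8 app.psgi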


Funny enough that SO link was the same one I got my misinformation from as the top answer pointed to CGI benchmarks.

Thanks for the correction though, but that still doesn't answer my original question: how does Plack compare with mod_perl?


Plack/PSGI (Perl) == WSGI (Python) == Rack (Ruby)

These are all abstraction layers (for each language) which then can be run seamlessly on top of CGI, SCGI, mod_(perl|python|ruby), etc.

- http://plackperl.org

- http://en.wikipedia.org/wiki/Plack_(software)

- http://en.wikipedia.org/wiki/PSGI

- http://en.wikipedia.org/wiki/Wsgi

- http://en.wikipedia.org/wiki/Rack_(web_server_interface)


"Oh, they're just frameworks for CGI/Perl and thus severely crippled by CGI's mode" - This isn't true.

Dancer uses PSGI (not CGI), and most modern Perl webapp frameworks (Mojo & Catalyst) use PSGI/Plack (http://plackperl.org/). The PSGI model is very similar to WSGI, which allows for different types of persistent servers.

Dancer's deployment POD is good reference on options: https://metacpan.org/module/Dancer::Deployment


Cake was an interesting choice for a PHP framework to test. I wonder why they didn't choose Symfony, which is arguably the leading PHP framework.


Symfony and Zend might perform even worse. At least Zend's bootstrapping is even heavier. The sheer number of files that have to be loaded from disk on each hit is appalling. Hence I mentioned in another comment the opcode cache (ex. APC) requirement and how every serious PHP app utilizes it.


Zend and Symfony are probably the leading OO MVC frameworks for PHP.

I'd also like to see a "light" framework meant for building APIs, like Slim, for instance.


Yes, I'd definitely like to see Slim tested as a light framework. People keep mentioning Silex as a "light" framework, but I've found it to be almost as heavy as Symfony2 (which isn't much of a surprise considering it's just Symfony2 components strapped together).


It was simply what we have seen most often in our experience. We've got Zend as a next target for PHP frameworks. We can also put Symfony on the list.

We'd also love to have anyone who has production experience with those to contribute a test for them. The test should be fairly quick to write.


I'm surprised to see that Sinatra on JRuby often performs worse than Sinatra on MRI, while Rack on the other hand performs much better.


We were surprised by the sinatra-jruby tests as well. If you read our "expected questions" section, it's not clear to us why Sinatra's performance on JRuby was weak. We'd love to hear from a JRuby expert about how to address Sinatra's "wrong" looking numbers.


Maybe the web-server choice (I didn't look to see what that was). But last I looked at it, the Sinatra router really is very naive/awful.

For the size of the framework, that may be the right choice ultimately.

On the other end of the spectrum is Rails, which has basically written its own regexp engine for routing. It seems to be generally quite a bit faster.

All major Ruby frameworks have just awful routers though really.


Having had a look at the Gemfiles, it might be the choice of HTTP server used. I believe it defaults to WEBrick if no other server is present, and I believe WEBrick is single-threaded.

I don't have any experience with JRuby, but this might be a possibility.


Yeah, the Sinatra numbers really ought to be more in line with the straight Rack numbers. Something's amiss.

Glad to see that on Rack, where I'd expect us to be fast...we are doing ok. On par with PHP (not a big thing to brag about, perhaps) and Play (worth bragging about, since it's Java/Scala end to end).

Obviously there's more work needed, not just in JRuby but in the servers that serve it and the frameworks that run on it. The slow performance of Rails here, for example, is largely Rails' fault.

But yeah...the sinatra numbers are wack.


It would be very interesting to see OpenResty (Nginx + Lua) in there, since it's so different from other approaches.


I would love to see OpenResty tested; for simple JSON and databases, OpenResty is a magnificent tool.


Agreed! Especially built with luajit.


I would love to contribute a code for a specific PHP framework: Silex. Do you have requirements you'd like to hit?


That would be fantastic. Get in touch with pfalls on Github and he can give you the information. It's fairly simple.


+1, this would be interesting given that there are a lot of PHP frameworks besides Cake, namely:

Silex/Symfony2

Yii

Zend

Kohana

Fuel

Laravel


I'd be happy to see benchmarks with PHP frameworks written as a C extension: 1. Yaf PHP http://www.yafdev.com/ , 2. PHP-ActiveRecord++ https://github.com/roeitell/php-activerecord-plusplus


Very interesting! But isn't there a disadvantage for node.js, considering it's single-threaded? Did you cluster it to use every core on the box? (http://stackoverflow.com/a/8685968/1909827 - "Scaling throughput on a webservice")


From the OP's response to a similar question:

>We attempted to take advantage of threading/multiple-processes as best we could (see the nodejs code for an example of using the cluster module). But we suspect there are additional areas of improvement here.


Thanks


Since a few people have asked about Haskell, here are some Haskell benchmarks in comparison: http://www.yesodweb.com/blog/2011/03/preliminary-warp-cross-...


Still, it would be nice to have these results independently verified and put alongside all the rest.

What do you think, bhauer? :)


Our to-do list is getting unwieldy. :) We would love to get this added, though.

Any chance a Haskell expert could create a test and submit it as a pull request?


It sure sounds like it is!

I'm a far cry from a Haskell expert unfortunately, so I'll have to leave it to someone else.


What I've also found is that Node (V8) tends to have much higher variance in response times when compared to Java Servlets (JVM): https://github.com/olegp/common-node#benchmarks


In the FAQ for the benchmarks, could you also add a bit about how you configured the JVM (OpenJDK?). Curious about heap settings, etc.? Did you just use the default install? e.g. http://planet.jboss.org/post/rhel_openjdk_performance_tuning http://www.jaspersoft.com/sunopenjdk-jvm-garbage-collection-...


>And let us simply draw the curtain of charity over the Cake PHP results.

Something about that phrase "Let us simply draw the curtain of charity..." resulted in an immediate spit-take. Coffee everywhere.


Thank you for such an informative comparison. You've helped shed light beyond the frameworks that are known to most of us - Ruby on Rails, Node.js, Django, Express, etc...

Thanks again! :)


In case it's not obvious, you can hide frameworks/platforms of no interest to you to narrow the view. For example, here's Node.js versus Express:

http://www.techempower.com/blog/2013/03/28/framework-benchma...

(Note the initial chart in the introduction section is an image and it won't be affected.)


Quickly taking a glance -- these benchmarks (like most benchmarks) seem like they might be highly misleading.

For instance, in the Express example code, they're sending JS objects rather than serializing them to raw data that the socket can just send. Instead, serialization/copying are happening on each request, which is a significant overhead.

I don't have domain knowledge of many of the others, but I suspect similar problems might exist with them.


For JSON serialization, this benchmark seems to indicate Netty is twice as fast as Golang. I don't think that's right; the Go benchmark code is not exactly equivalent to the Netty benchmark code.

The Netty code creates the ObjectMapper once and uses it for all requests, whereas the Go code creates the JSON encoder for every request (enc := json.NewEncoder(w)). Just getting rid of that would make this trivial code so much faster.
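
One wrinkle: a json.Encoder wraps the per-request ResponseWriter, so unlike Jackson's ObjectMapper it can't simply be hoisted out and shared. The nearest equivalent is marshaling to a byte slice and writing that. A quick sketch of the two variants (the Message type is hypothetical):

    package main

    import (
        "encoding/json"
        "net/http"
    )

    type Message struct {
        Message string `json:"message"`
    }

    // As benchmarked: a fresh Encoder is built on every request.
    func helloEncoder(w http.ResponseWriter, r *http.Request) {
        json.NewEncoder(w).Encode(&Message{"Hello, World!"})
    }

    // Alternative: marshal to bytes and write them out directly.
    func helloMarshal(w http.ResponseWriter, r *http.Request) {
        b, err := json.Marshal(&Message{"Hello, World!"})
        if err != nil {
            http.Error(w, err.Error(), http.StatusInternalServerError)
            return
        }
        w.Header().Set("Content-Type", "application/json")
        w.Write(b)
    }

    func main() {
        http.HandleFunc("/json", helloMarshal)
        http.ListenAndServe(":8080", nil)
    }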


Also, I don't think evaluating JSON based on a simple 'Hello World' is a valid test. In fact, 'hello world' should never be the standard for testing the speed of any process. If all your script does is output 'hello world' to the screen, why bother using a server-side programming language at all? You are better off just writing it in HTML; it's faster than using any server-side or client-side language, lol.


It just goes to show that your framework of choice is pretty unimportant. The developers (and their skills) who are actually developing on and configuring the framework outweigh the performance "baseline" by multiple orders of magnitude. From that I'd say that developer friendliness (easily approachable concepts) and sensible defaults are by far the most important things when choosing a framework.


These sorts of "benchmarks" are so subjective that they really don't provide any value. Looking at the framework lineup, a lot of the frameworks at the bottom have so many (tightly coupled) features (which also lead to bloat, like excessive middleware layers) that they're obviously not going to perform anywhere near as well as some of the barebones frameworks at the top. Neat post, but yeah, zero value.


Go's performance is great, and I'm hearing good things about it. Does anyone have any experience with the types of applications this language is meant for?


We need a project similar to TodoMVC [1] for benchmarking different languages and frameworks.

http://todomvc.com/


I think this benchmark is not that useful.

If you seriously thought that the difference between Netty and Django would be 4x, then you simply don't understand what these frameworks do in the first place.

I would have guessed something like less than 100x slower, and that would still have been fine, since the cost of all the kinds of latencies in the system usually far outweighs the speed of the framework itself.


These are some brutal results for CakePHP (which I use). However, in practice no intermediate developer would issue queries like that in a loop. They would loop through a set of data building an IN statement, for example: WHERE field IN ('x','y','z'), thus sending only a single query to the database. Still, the Cake developers really need to improve the speed of their framework.


I keep seeing numbers like this in benchmarks over the last few years, it would be great to see CakePHP's numbers "rehabilitated" using techniques like you discuss.


It's a fair way of testing, it's just not the way a developer would write it in practice. It clearly disregards the "Big O".


Send a pull request, then.


I would like to point out that there is no reason you would benchmark a raw SQL query in PHP, and then only use ActiveRecord on the ruby side. At the very least you could have used the Sequel gem to just build a MySQL query for the Sinatra app. ActiveRecord is going to give you the same performance failure that it did in PHP too.


Regarding PHP: well, isn't CakePHP by far the slowest major PHP framework? Why don't you test the other PHP frameworks? How about Yii, CodeIgniter, Lithium, and Symfony? I'd say the result is kinda biased with respect to PHP frameworks, since you picked the slowest PHP framework available instead of the fastest, or at least an average one.


I see a lot of people saying certain configs are missing/better/not tested. So what about a "crowdsourced benchmark" instead?

Maybe a few elaborate scenarios are posted and people can simply submit their best setup/config/code to be benchmarked. I imagine devs would improve on it over time and eventually the most optimized would surface?


We are doing precisely that: if there are any mistakes or misunderstandings about the production-level best practices, we are accepting pull requests on our github (set up with all these tests, in case you want to run them yourself) to fix them, rerun them, and then report our updated findings.


Not to nitpick, but that (and another popular one http://www.techempower.com/blog/2013/03/26/everything-about-...) was a clever way to get people to techempower.com :)

Useful comparison anyway. Seems Go struck a good balance.


So this means I should stop using CakePHP?


Why would you? If you write clean, maintainable code and are productive using it, there's no problem; there are tons of ways of tuning Cake (as with most frameworks). The default configuration is meant for 'bootstrapping'.

However, if you are interested, you can always check out other PHP frameworks (Yii, Slim), or, if you like all the bells and whistles from Cake and are willing to dive into another language, you can always experiment with Ruby (RoR, Sinatra), Node (Express), and Python (Flask).


Not really. Take this benchmark with a grain of salt unless they benchmark it again with APC at minimum. Nobody serious about a web app would run a PHP app without APC.

Similarly, relatively modern deployment standards (not really cutting-edge) like nginx and php-fpm OR apache 2.4 with worker mpm + php-fpm should be added to the mix.


Asadkn, we'll try to get things revised based on your feedback. Admittedly, we have some learning to do with tuning production PHP environments.


I posted a note above that might help:

apc.enabled=1 turns on opcode caching (when php-apc is installed), and apc.stat=0 turns off "stat" checks - this means that once a file is opcode-cached, PHP won't even have to touch the file on disk to execute it. The I/O gains from this, as well as the execution gains from not having to parse the file, should help quite a bit.
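
Concretely, that's two lines of php.ini once the php-apc extension is installed:

    ; php.ini, with the php-apc extension installed
    apc.enabled=1   ; opcode cache on
    apc.stat=0      ; never stat() source files once cached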


If you're serving only json snippets from memory and might have hundreds of simultaneous users, yes!

If you have existing sites built with it which work fine, then this benchmark doesn't tell you a great deal other than that PHP is relatively slow, and frameworks built upon it slower. For many websites which have a caching strategy sitting behind a server like Apache or nginx and fewer than a few thousand users a day, this really doesn't matter, and other things like features are more important. This holds even for bigger sites too - Facebook for example still runs PHP (compiled to one big binary). Personally I wouldn't use PHP because of the language/std lib, but this sort of performance shoot-out should not put you off it.


Obviously, the data should not steer you towards rewriting applications using a different framework. Really, it should just show as another metric to help deciding which framework is best suited for your needs. If you plan on writing up a blog application for yourself and you might get 100 concurrent users, then you could use CakePHP for the sake of familiarity. Then again, if you are trying to write the next Twitter and have the requirement of needing to be able to service 20,000 concurrent requests, then I would say "yes, this means you should stop using CakePHP."


So this means I should stop using CakePHP

I always strongly advise not using CakePHP, purely because its model system is so broken, it makes my teeth itch.


How is it broken?


How is it broken?

All data is returned as an associative array, so there is no way to call something like:

  $cows = $this->Cows->findByStatus(COW_STATUS_NOT_MOOING);
  foreach($cows as $cow)
  {
     $cow->moo();
  }
But that's pretty minor.

The big issue is: all data is returned as an associative array! There is no lazy-loading of the data as you e.g. seek through it, which means it's very easy for people to pass around huge arrays with thousands upon thousands of rows and associations without even thinking about it.

/rant over


I don't use Cake, so don't read this as a defence of Cake.

Lazy loading seems to me a kludge trying to fix poor SQL querying. If you can filter at the app layer, why not pass the information down and filter at the DB layer? You'll be transferring less data and may even read less data from disk (with proper indexes)


Lazy loading seems to me a kludge trying to fix poor SQL querying.

I very much agree with you, lazy loading should not be used as substitute for only querying and/or filtering the data you actually need.

However, there's always going to be a minimum; eventually, the data needs to be processed or displayed! Why load all of it immediately into memory (and potentially pass that around), rather than when it's actually needed?

I think of it a bit like using pointers; passing around the location of the data, rather than the data itself, makes life good for all concerned.

I don't use Cake, so don't read this as a defence of Cake.

Thanks for the disclaimer, as I'm guessing you've experienced before, it's all too easy to get into accidental flamewars about this stuff :)


If you care about performance, the thinner the framework, the better.


It doesn't matter if you compile it, like Facebook did with their PHP at one point when they needed to scale.


To some extent, yes, but the bloated-framework issue will remain even if you compile to C++, despite the compiler's best-effort optimization.


Some of the choices are unfair in an apples/oranges sort of way: They are testing Rails and Sinatra against Java servlets, but the Ruby equivalent of servlets is not a high-level framework like Rails, but Rack. (Also, the Ruby tests use Passenger, which is probably not a great choice for performance.)


Probably not a major factor but I'm curious to know why mongoose was thrown into the mix for the node.js test rather than going with the native mongo driver.

It might be more of a real world test to include mongoose with the node+express test, but for the node-only test the native driver might be more appropriate.


We wanted to use an ORM in all cases; it was only recently that we started working on native MySQL access. We hope to add native MongoDB results as well.


The WSGI benchmark uses gunicorn, which is fine. But if you are surprised by the so-so performance, know that the worker class can be changed via the -k flag (-k gevent), which may improve performance. Of course, other WSGI servers, like uWSGI, are also available.
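
For reference, switching workers is a one-flag change on the gunicorn command line (the module path is hypothetical):

    # default synchronous workers
    gunicorn -w 8 hello.wsgi:application

    # gevent-based workers, same app
    gunicorn -k gevent -w 8 hello.wsgi:application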


Are they not even using gevent? Someone please get these guys off the internet. They are worse than some personal blogs at everything except the comprehensiveness of their low-quality testing.


Nothing for Flask?


Yeah, I was disappointed to see a lack of Flask as well, especially since it advertises itself as a microframework and all.


We received a Github pull request with a Flask test! So we'll aim to include that in a follow up post soon.


It's interesting to note the time per request when you invert these. Even some of the most drastic differences (like Spring vs. Django) turn out to be the difference of a millisecond or two. That's pocket change in terms of user experience.


It is a good and interesting observation, but as the benchmarks approach real-world scenarios, the difference is more than a millisecond or two.

For instance, if you're running on EC2 hardware (which is common enough) and you're executing ~20 DB queries/request (which is probably, unfortunately, common enough), the difference between Java servlets and Rails is more like 10ms.

Then what happens when more than 89 real-world users start to hit your Rails server each second?

(Note: I really like Ruby and Rails. Much more so than Java and its offerings.)


How much of the difference is due to the random number generator calls? They aren't free or equal cost across platforms by any stretch. I really have a hard time trusting these numbers with how casually calls to RNGs are bandied about...


For anyone else curious about how Vert.x stacks up against the famous Node.js:

http://vertxproject.wordpress.com/2012/05/09/vert-x-vs-node-...


You haven't read the discussion, right?


I posted this way before the discussion happened.


How much does 'asynchronous' matter for the performance of a web framework? I mean, Netty and node.js are both async frameworks as I understand it, but the source code for Go and Compojure seems to use a thread/process-per-request model?


I kinda miss ASP.NET Web API (okay, it's on Windows, but it's also a framework :))


Why is the Play benchmark running on Resin? Shouldn't it run directly on top of Netty?


Play is not running on Resin. Did we accidentally say that somewhere? If so, we'll correct it.


Here: https://github.com/TechEmpower/FrameworkBenchmarks/tree/mast...

The tests were run with:

  Java OpenJDK 1.7.0_09
  Resin 4.0.34
  Play 2.1.0


Whoops! Fixing that up presently.

Thanks for the catch.


I'd be curious to see how Ruby fares using a setup other than Apache + Passenger; I have a feeling that nginx with Unicorn or Puma would provide better results, and I might give it a go when I have a bit more time.


Hi, great work! I only have two suggestions:

a) Please publish not only the requests per second, but also the memory and CPU usage of the machine for each framework.

b) For Java systems, can you publish the heap configuration of the JVM?

Cheers!


Wow, these results and the performance of Go are making me consider rewriting my Tornado app again. I had put off doing it in order to build more stuff into the Python version. Maybe I'll reprioritize.


If I wanted to return a bunch of _ids from a mongo database, I would do a _id: {$in: [array of ObjectIDs]} and then stream the resulting cursor to res. Streams are core to node. You should use them.
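Roughly like this with the native driver of that era (a sketch; collection is assumed to be an open collection handle and ids an array of ObjectIDs):

  // stream matching docs straight to the response instead of buffering them
  var stream = collection.find({ _id: { $in: ids } }).stream();
  res.writeHead(200, { 'Content-Type': 'application/json' });
  stream.on('data', function (doc) { res.write(JSON.stringify(doc) + '\n'); });
  stream.on('error', function () { res.end(); });
  stream.on('end', function () { res.end(); });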


The JVM can use many cores at the same time; node/rails use only one core. The test is total nonsense.

You should show us the CPU usage. If you had done so, you would have seen how absurd a mistake you had made.


The raw JSON output merely tells you about the performance of the language itself. Nice, but not that interesting if you need to get the job done and don't happen to know 10+ languages/frameworks.


I haven't looked at this in detail yet, but it should be noted that we haven't even optimised Vert.x yet, so there should be plenty of scope for further improvement :)

(Disclaimer: I'm the Vert.x project lead)


I guess it is a good thing we are finally migrating away from Cake :/


I wonder how a CGI binary run from xinetd would perform?


Java (JVM) > JavaScript (Node.JS) > Ruby-Rails


If all you're doing is trying to serve JSON as fast as possible.


I still believe launching a usable web application next week is preferable to launching a really fast web application 3 months from now, if ever.


That's a straw man. Competent programmers should be able to write fast code quickly.


Different languages and platforms yield very different productivity levels.


I am 99% sure that you haven't used contain => false for CakePHP; without it, all related tables are loaded to serve the related data.
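Something along these lines, assuming CakePHP 2.x with the Containable behavior attached (model and constant names borrowed from the rant above, otherwise hypothetical):

  // fetch only the Cows rows; skip all associated models
  $cows = $this->Cows->find('all', array(
      'conditions' => array('Cows.status' => COW_STATUS_NOT_MOOING),
      'contain' => false,  // or 'recursive' => -1 without the behavior
  ));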


Can't believe you didn't include Zend in there :(


Just read this above:

"We've got Zend as a next target for PHP frameworks."


Thanks. We are reconsidering our stack and are in the middle of evaluating; I am sure this post is going to be of great help.


You're using PDO in the PHP raw test - use the direct mysqli_* API; it's a lot faster than the PDO abstraction layer.
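A rough sketch of what that would look like (connection parameters and query are hypothetical, modeled on the benchmark's World table):

  // procedural mysqli instead of the PDO abstraction layer
  $link = mysqli_connect('localhost', 'user', 'pass', 'hello_world');
  $result = mysqli_query($link, 'SELECT id, randomNumber FROM World WHERE id = ' . (int)$id);
  echo json_encode(mysqli_fetch_assoc($result));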


Wow, this benchmark is like comparing apples, oranges, mangos and cherries. Just because they are all fruit.


It's fine as long as people take it for what it is. I still find the fruit salad interesting :-)


Very useful info. Thanks for sharing.


Was Play 1 or Play 2 tested? I thought Play ran on Netty, so I'm surprised it didn't fare better.


They're using Play 2.1 with Java, without using futures for database queries. Not sure how representative of real usage that would be, though.


We've got a bunch of feedback that the Play database test needs to be asynchronous, so we'll make that a priority for our next run. Perhaps I can entice you to submit a pull request to improve the Play configuration?


But the JSON test was 7X slower too.


Whoa... PHP is faster than Ruby in a lot of results? I wonder why that is the case?


Wasn't it agreed a long time ago that benchmarking against a VM was a bad idea?


It is a great idea when your production environment is a VM.


Possibly, although we believe it represents precisely what we wanted to test: a realistic production environment.

However, we also tested on our physical i7 hardware. Did you happen to scroll down? :)


I did, but didn't see a pretty graph for it. Full production results are interesting, but benchmarks should test a single variable. A cloud VM adds several to the mix.


This is awesome. Thank you.


node.js is single-threaded and an EC2 large instance has 4 CPUs. So to be fair you must either:

a) bench on a single-core processor (a small EC2 instance), or

b) configure node.js as a cluster with as many instances as processors (see the sketch below).
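Option b) is only a few lines with the core cluster module; a minimal sketch:

  // fork one worker per CPU; the workers share the listening socket
  var cluster = require('cluster');
  var http = require('http');
  var numCPUs = require('os').cpus().length;

  if (cluster.isMaster) {
    for (var i = 0; i < numCPUs; i++) cluster.fork();
  } else {
    http.createServer(function (req, res) {
      res.writeHead(200, { 'Content-Type': 'application/json' });
      res.end(JSON.stringify({ message: 'Hello, World!' }));
    }).listen(8000);
  }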


Needs more NancyFX + Mono! Nice post tho, very interesting.


Could you kindly add error bars to your test results?


I would love to see Dropwizard in these benchmarks!


What about C++ Web frameworks?


Don't think we didn't consider it! :) If you are willing to craft a comparable test for a C++ framework, we'd definitely like to get one into the next set of numbers.

In particular, we don't feel confident enough in our C++ chops to do a C++ framework justice.


...Flask?


I'm not a PHP nazi, but you can't compare PHP and a framework.


You guys should feel bad about generating such awful benchmarks.

How the hell can you compare accessing a MySQL database to accessing a MongoDB database?

It's like comparing apples to piles of poop.

Also, when you're testing things like Django in web requests, you're testing gunicorn, not Django.



