Web Framework Benchmarks (techempower.com)
438 points by pfalls on March 28, 2013 | 396 comments



One of the most interesting things this comparison brings out, to me, is not so much the differences between the various frameworks (although the differences between options on the same platform are definitely very useful information) as an issue few of us seem to think about these days: the cost of any of these frameworks above the bare standard library of the platform it's hosted on.

There's a consistent, considerable gap between the "raw" benchmarks (things like netty, node, plain php, etc.) and frameworks hosted on those same platforms. I think this is something we should keep in mind when we're tuning performance-sensitive portions of APIs and the like. We may actually need to revisit our framework choice and implement selected portions outside of it (just as Ruby developers sometimes write performance-critical pieces of gems in C), or optimize the framework further.

I'd like to crunch these numbers further to get a "framework optimization index" which would be the percentage slowdown or ratio of performance between the host platform and the performance of the framework on top of it. I might do this later if I get a chance.


I think this is a much needed and excellent point to make. Just take a look at how Go dips down when using Webgo.


I used Go's benchmarking tool to compare raw routing performance of various frameworks. The handlers all return a simple "Hello World" string. Here are the results:

  PASS
  Benchmark_Routes           100000	     13945 ns/op
  Benchmark_Pat              500000	      6068 ns/op
  Benchmark_GorillaHandler   200000	     11042 ns/op
  Benchmark_Webgo            100000	     26350 ns/op
  ok  	github.com/bradrydzewski/routes/bench	12.605s
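For reference, each of these benchmarks was shaped roughly like this (a minimal sketch using Go's testing package; the pat case is shown, and the route and handler body are assumptions, not the exact benchmark code):

  package bench

  import (
      "net/http"
      "net/http/httptest"
      "testing"

      "github.com/bmizerany/pat"
  )

  // Benchmark_Pat times routing plus a trivial "Hello World" response.
  func Benchmark_Pat(b *testing.B) {
      mux := pat.New()
      mux.Get("/hello/:name", http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
          w.Write([]byte("Hello World"))
      }))
      req, _ := http.NewRequest("GET", "/hello/world", nil)
      for i := 0; i < b.N; i++ {
          mux.ServeHTTP(httptest.NewRecorder(), req)
      }
  }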
I then ran the same benchmark, but this time I modified the handler to serialize a struct to JSON and write to the response. Here are the results:

  Benchmark_Routes              100000	     21446 ns/op
  Benchmark_Pat                 100000	     14130 ns/op
  Benchmark_GorillaHandler      100000	     17735 ns/op
  Benchmark_Webgo                50000	     33726 ns/op
  ok  	github.com/bradrydzewski/routes/bench	9.805s
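The JSON variant of the handler was roughly of this shape (a sketch; the struct and field names are assumptions):

  import (
      "encoding/json"
      "net/http"
  )

  type Message struct {
      Message string `json:"message"`
  }

  func jsonHandler(w http.ResponseWriter, r *http.Request) {
      w.Header().Set("Content-Type", "application/json")
      // The Encode call alone accounts for roughly 8000ns of each op.
      json.NewEncoder(w).Encode(&Message{Message: "Hello World"})
  }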
In the first test, Pat is almost twice as fast as the Gorilla framework. In the second test, when we added a bit more logic to the handler (marshaling a struct to JSON), Pat was only about 18% faster than Gorilla. In fact, it turns out it takes longer to serialize to JSON (8000ns) than it does for Pat to route and serve the request (6000ns).

Now, imagine I created a third benchmark that did something more complex, like executing a database query and serving the results using the html/template package. There would be a negligible difference in performance across frameworks because routing is not going to be your bottleneck.

I would personally choose my framework based not just on performance but also on productivity: one that helps me write code that is easier to test and easier to maintain in the long run.


rorr, you appear to be hellbanned. Here's your comment, since it seemed like a reasonable one:

> Now, imagine I created a third benchmark that did something more complex, like executing a database query and serving the results using the html/template package. There would be a negligible difference in performance across frameworks because routing is not going to be your bottleneck. If you're performing a DB query on every request, you're doing something wrong. In the real world your app will check Memcached, and if there's a cached response, it will return it. Thus making the framework performance quite important.


OK, so I added a third benchmark where the handler gets an item from memcache (NOT from a database). Here are the results:

  PASS
  Benchmark_Routes             10000	    234063 ns/op
  Benchmark_Pat                10000	    233162 ns/op
  Benchmark_GorillaHandler      5000	    265943 ns/op
  Benchmark_Webgo               5000	    348349 ns/op
  ok  	github.com/bradrydzewski/routes/bench	10.062s
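For the curious, the memcache handler looked something like this (a sketch assuming the bradfitz/gomemcache client and a pre-populated key named "item"):

  import (
      "net/http"

      "github.com/bradfitz/gomemcache/memcache"
  )

  var mc = memcache.New("127.0.0.1:11211")

  func memcacheHandler(w http.ResponseWriter, r *http.Request) {
      // One lightweight TCP round-trip to memcached dominates the
      // ~230-350µs/op numbers above; routing cost is noise by comparison.
      it, err := mc.Get("item")
      if err != nil {
          http.Error(w, err.Error(), http.StatusInternalServerError)
          return
      }
      w.Write(it.Value)
  }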
Notice the top 3 frameworks (Pat, routes and Gorilla) have almost identical performance results. The point is that routing and string manipulation are relatively inexpensive compared to even the most lightweight TCP request, in this case to the memcache server.


By the way:

https://plus.google.com/u/0/115863474911002159675/posts/L3o9...

More SPEEEEEEEEEEEEED coming down the pipe in 1.1 for Go's net/http. :)


I think Go is almost the ideal example here, you're right. Go provides a pretty rich "standard library" for writing web serving stuff so it's a good place where you could really imagine writing your performance critical stuff just on the base platform even if you use something like Webgo for the rest of your app.

Some of the other platforms are much less amenable to that since the standard primitives the platform exposes are very primitive indeed (servlets api, rack api, etc.). Perhaps there's some value in looking at how your favorite framework stacks up against its raw platform and trying to contribute some optimizations to close the gap a bit.


I'm curious about that - because there's so little to webgo I suspect the answer is something really trivial. I haven't really looked at it before, but the framework itself is just 500 lines or so unless I'm looking at the wrong one...

Given that the JSON marshalling and server components would be exactly the same between Go and webgo, I'm curious whether changing the URL recognised in the webgo tests to be just /json would make any difference. Any reason it was different?


Just had a look at the tests and the urls responded to differ:

http://localhost:8080/json

http://localhost:8080/(.*)

Shouldn't all these examples at least be trying to do the same sort of work? For such a trivial test differences like that could make a huge difference to the outcome.

It's great to see replicable tests like this which show their working, but they do need to be testing the same thing. I also think they should test something more substantial than JSON marshalling on all the platforms, like serving an HTML page built from a template with the message, as that'd give a far better indication of speed for the tasks web servers typically perform.

Still, it's a great idea and really interesting to see this sort of comparison, even if it could be improved.


One of the next steps we'd like to take is to have a test that does cover a more typical web request, and is less database heavy than our 20 query test, just like you describe. Ultimately, we felt that these tests were a sufficient starting point.


I was a little confused by the different URLs used in the tests, as for this sort of light test, particularly in Go, where all the serving stuff is common between frameworks, you're mostly going to be testing routing. Any reason you chose a different route here (/json versus /(.*))?

I can't think of much else that this little web.go framework does (assuming the fcgi bits etc. are unused now and it has moved over to net/http). I don't think many people use web.go; Gorilla and its mux router seem to be more popular as a bare-bones option in Go, so it'd possibly be interesting to use that instead. It'd be great to see a follow-up post with a few changes to the tests to take in the criticisms or answer questions.

While you may come in for a lot of criticism and nitpicking here for flaws (real or imagined) in the methodology, I do think this is a valuable exercise if you try to make it as consistent and repeatable as possible - if nothing else it'd be a good reference for other framework authors to test against.


Webgo... I'll stick with "net/http" and Gorilla thanks.

Also, they used Go 1.0.3... I hope they update to 1.1 next month. Most everyone using Go for intensive production use is running Go tip (the branch due to become the 1.1 RC next month).


This is great to know. We were hesitant to use non-stable versions (although we were forced to in certain cases), but knowing that this is common practice for production environments would change our minds.


We switched to using tip after several Go core devs recommended that move to us. The folks on go-nuts IRC agreed, and we tested it and found it to be more stable than 1.0.3.


Wow, directly on tip? That seems to speak very highly of the day-to-day development stability of Go.


A good tip build tends to be more stable than 1.0.3 and has hugely improved performance (most importantly for large applications, in garbage collection and garbage generation).

To select a suitable tip build we use http://build.golang.org/ and https://groups.google.com/forum/?fromgroups#!forum/golang-de... . My recommendation would be to find a one- or two-week-old build that passed all checks, do a quick skim of the mailing list to make sure there weren't any other issues, and use that revision. Also, you will see that some of the builders are broken.

Of course if your application has automated unit tests and load tests, run those too before a full deployment.


Thanks, this comment really helped me in my evaluation of Go today. I had been playing around with 1.0.3 for a couple days, but tip is definitely where it's at.


I'm glad I could help. Go 1.1 RC should be out early next month. So if you want you could wait for that (for production use).


Or it could speak poorly of their release process, which is more accurate. The stable release is simply so bad compared to tip that everyone uses tip. There should have been multiple releases since the last stable release so that people could get those improvements without having to run tip.


Why not both? Insisting on a very stable API can result in long times between releases, which can mean more people using tip. That's distinct from how stable tip is.


Given the frequent complaints that the previous stable release isn't very stable, I think trying to interpret it as "tip is super stable" is wishful thinking. Tip is currently less bad than stable. The fact that stable releases are not stable is a bad thing, not a good thing.


What does stable mean? If stable means there are not unexpected crashes, then Go 1.0.3 is extremely stable.

If stable means suitable for production, Go tip's vastly improved performance, especially in regards to garbage collection, makes it more suitable than 1.0.3 for large/high-scale applications in production.


I'm not sure many people would use webgo in real life. I don't know... maybe some people... certainly not pros.

Also, the 1.0.3 thing is probably dragging on the numbers a bit. 1.1 would boost it a little. Not enough to get it into the top tier... but a little.

Also, for Vert.x, they seem to be only running one verticle. Which would never happen in real life.

Play could be optimized a bit... but not much. What they have is, to my mind, a fair ranking for it.

Small issues with a few of the others but nothing major. I think Go and Vert.x are the ones that would get the biggest jumps if experts looked at them. And let's be frank... does Vert.x really need a jump?

So what they have here is pretty accurate... I mean... just based on looking through the code. But Go might fare better if it used tip. And Vert.x would DEFINITELY fare better with proper worker/non-worker verticles running.


The Play example was totally unfair since it blocks on the database query which will block the underlying event loop and really lower the overall throughput.


Well... to be fair...

the Vert.x example, as configured, blocks massively as well waiting for mongodb queries.


Could you point me at an example of an idiomatic, non-trivial Go REST/JSON API server? I've been trying for a while to find something to read to get a better handle on good patterns and idiomatic Go, but I haven't really come up with anything. I've found some very good examples of much lower-level type stuff, but I think I have a decent handle on that type of Go already. What I really would like is a good example of how people are using the higher level parts of the standard library, particularly net/http etc.


Sorry... I'm not really a book kind of guy when it comes to this stuff. The golang resources are mostly what I use.


For Vert.x, we specified the number of verticles in App.groovy, rather than on the command line, which we think is a valid way to specify it.


OK... I ran the Vert.x test... it runs a bit faster here with 4 workers instead of 8. I suspect what is happening is that at times all 8 cores can be pinned by workers while responses wait to be sent back on the 8 non-workers. But it's not that big a change in speed, actually. One thing more: when you swap in a Couchbase persister for the Mongo persister, it's faster yet. That difference is actually much larger than the one you get balancing the number of worker vs. non-worker verticles. I also think that swapping Gson in for Jackson would improve things... but I don't think those are fair changes (well... the Couchbase one may be fair).

Also tested Cake just because it had been a while since I have used it... and I couldn't believe it was that much worse than PHP. Your numbers there seem valid though given my test results. That's pretty sad.

Finally, tried to get in a test of your Go stuff. I'm making what I think are some fair changes ... but it did not speed up as much as I thought. In fact, initially it was slower than your test with 1.1.

So after further review... well done sir.


That first line is truly one of the best comments I've seen when discussing languages. I've clipped it and will use it from now on:

>> I'm not sure many people would use [xxx] in real life. I don't know... maybe some people... certainly not pros.


The Play app uses MySQL in a blocking way, while Node.js uses Mongo. It's not comparable.


We have a pull request that changes the Play benchmark (thank you!) so we will be including that in a follow-up soon.

We tested node.js with both Mongo and MySQL. Mongo strikes us as the more canonical data-store for node, but we wanted to include the MySQL test out of curiosity.


That is bad benchmarking!


I'd love to see your framework optimization index. Honestly, all of this would be a wonderful thing to automate and put in a web app - a readily-accessible, up-to-date measure of the current performance of the state of the art in languages and frameworks. I bet it would really change some of the technology choices made.


Here's a quick version of the framework optimization index. Higher is better (ratio of framework performance to raw platform performance, multiplied by 100 for scale):

  Framework      Index
  Gemini         87.88
  Vert.x         76.29
  Express        68.85
  Sinatra-Ruby   67.88
  WebGo          51.08
  Compojure      45.69
  Rails-Ruby     31.75
  Wicket         29.33
  Rails-Jruby    20.09
  Play           18.02
  Sinatra-Jruby  15.96
  Tapestry       13.57
  Spring         13.48
  Grails          7.11
  Cake            1.17
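To be explicit about the arithmetic, the index is just the framework's peak throughput over the raw platform's, times 100 (a trivial sketch in Go; the example figures are the EC2 peak responses/second quoted elsewhere in this thread):

  // frameworkIndex scales the framework-to-platform throughput ratio to 0-100.
  func frameworkIndex(frameworkRPS, platformRPS float64) float64 {
      return frameworkRPS / platformRPS * 100
  }

  // e.g. Express vs. raw node: frameworkIndex(7258, 10541) ≈ 68.85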


In the same vein, I was curious to compare the max responses/second on dedicated hardware vs ec2 on a per framework basis. The following is percentage throughput of ec2 vs dedicated (in res/s):

  framework      ec2/dedicated  (dedicated vs ec2 res/s)
  cake           18.9%          (312 vs 59)
  compojure      12.1%          (108588 vs 13135)
  django         16.8%          (6879 vs 1156)
  express        16.9%          (42867 vs 7258)
  gemini         12.5%          (202727 vs 25264)
  go             13.3%          (100948 vs 13472)
  grails          7.1%          (28995 vs 2045)
  netty          18.0%          (203970 vs 36717)
  nodejs         15.6%          (67491 vs 10541)
  php            11.6%          (43397 vs 5054)
  play           20.6%          (25164 vs 5181)
  rack-jruby     15.6%          (27874 vs 4336)
  rack-ruby      22.7%          (9513 vs 2164)
  rails-jruby    22.7%          (3841 vs 871)
  rails-ruby     20.7%          (3324 vs 687)
  servlet        13.4%          (213322 vs 28745)
  sinatra-jruby  21.2%          (3261 vs 692)
  sinatra-ruby   22.2%          (6619 vs 1469)
  spring          7.1%          (54679 vs 3874)
  tapestry        5.2%          (75002 vs 3901)
  vertx          22.3%          (125711 vs 28012)
  webgo          13.5%          (51091 vs 6881)
  wicket         12.7%          (66441 vs 8431)
  wsgi           14.8%          (21139 vs 3138)

I found it interesting that something like tapestry took a 20x slowdown when going from dedicated to ec2, while others only took ~5x slowdown.

Edit: To hopefully make it clearer what the percentages mean - if a framework is listed at 20%, this means that the framework served 1 request on ec2 for every 5 requests on dedicated hardware. 10% = 1 for every 10, and so on. So, higher percentage means a lower hit when going to ec2.

Disclosure: I am a colleague of the author of the article.


You're saying that running a query across the internet to ec2 is 5 times faster than running it on dedicated hardware in the lab? I find that hard to believe.


Sorry, maybe my original post was not entirely clear. Let's take tapestry, for example. On dedicated hardware, the peak throughput in responses per second was 75,002. On ec2, it was 3,901 responses per second.

So, in responses per second, the throughput on ec2 was 5.2% that of dedicated hardware, or approximately 20 times less throughput. The use of the word slowdown was possibly a bad choice, as none of my response had to do with the actual latency or roundtrip time of any request.


These could probably be further broken down into micro-frameworks (like Express, Sinatra, Vert.x etc.) and large MVC frameworks (like Play and Rails).

Gemini is sort of an outlier that doesn't really fit either category well, but the micro-frameworks have a fairly consistently higher framework optimization index than the large MVC frameworks which is as expected.

Express and Sinatra really stand out here as widely-used frameworks that retain a very high percentage of platform performance. I've never used Vert.x, but I will certainly look into it after seeing this. I'm very impressed that Express is so high on this list when it is relatively young compared to some of the others, and the Node platform is also relatively young.

Play seems particularly disappointing here since it seems any good performance there is almost entirely attributable to the fast JVM it's running on. Compojure is also a bit disappointing here (I use it quite a bit).


The play test was written totally incorrectly since it used blocking database calls. Since play is really just a layer on top of Netty it should perform nearly as well if correctly written.


I believe they're encouraging pull requests to fix that sort of thing. It will be interesting to see if it helps to that degree; I hope so!


But Play's trivial JSON test was much slower than Netty's.


That's because they inexplicably use the Jackson library for the simple test, rather than Play's built in JSON support (they use the built-in JSON for the other benchmarks).


Both Netty and Play use Jackson, though the Netty version uses a single ObjectMapper and the Play version creates a new ObjectNode per request (through Play's Json library).


I hope this gets fixed!


No, they use Play's JSON lib. It's kind of a moot point because Play's lib is in fact a wrapper for Jackson.

Here's the source: https://github.com/TechEmpower/FrameworkBenchmarks/blob/mast...

So the question stands: If Play & Netty are using the same JSON serialization code, why is Play seven times slower?


I'd imagine that being relatively young is an advantage in a test like this. You're not utilizing any features, and features are what slow down requests. The fewer features something has, the faster it should perform in these trivial tests.


That's a very good point; I hadn't thought of it that way. Maybe this is some small part of why we seem to keep flocking to the new kids on the block.


It's funny you should put this together because in an earlier draft of this blog entry I had created a tongue-in-cheek unit to express the average cost per additional line of code. Based on our Cake PHP numbers, I wanted to describe PHP as having the highest average cost per line of code. But we dropped this because I felt it ultimately wasn't fair to say that based on the limited data we had and it could be easily interpreted as too much editorializing. Nevertheless, as you point out, it's interesting to know how using a framework impacts your performance versus the underlying platform.

I too wanted Play to show higher numbers. There's certainly a possibility we have something configured incorrectly, so we'd love to get feedback from a Play expert.


Good thing you opted against sensational journalism.

About the cost per additional line of code for PHP, it mainly comes from not having an opcode cache and having to load and interpret files on every visit. mod_php was and will always be trash. I commented earlier about it too.

In case of Ruby, and talking about Rails, even when using Passenger, the rails app is cached via a spawn server. That's not the case with PHP.

Similarly, Python has opcode cache built-in (.pyc files). Also, I am not sure about gunicorn but others do keep app files in-memory.

Servlets do the same, keeping apps in memory. You get the idea.

Frameworks definitely have an impact but it's very hard for one person to know the right configuration for every language. You had done some good work there, but it will take a lot of contributions before the benchmark suite becomes fair for every language/framework.


We've received a lot of great feedback already and even several pull requests. Speaking of which, we want to express an emphatic "thank you" to everyone who has submitted pull requests!

We're hoping to post a follow up in about a week with revised numbers based on the pull requests and any other suggested tweaks we have time to factor in. We're eager to see the outcome of the various changes.


I completely agree with you. This is not proper benchmarking, as opcode caching is missing in PHP; the benchmarks should be recalculated with APC configured.


For Play, you'll want to either 1) handle the (blocking) database queries using Futures/Actors that use a separate thread pool (this might be easier to do in Scala) or 2) bump the default thread pool sizes considerably. The default configuration is optimized purely for non-blocking I/O.

See e.g. https://github.com/playframework/Play20/blob/master/document... and https://gist.github.com/guillaumebort/2973705 for more info.


Why would the JSON serialization test perform so poorly though?


I'm pretty shocked that Play scored so low. One would think that being built on Netty would put Play in a higher rank. Database access for sure needs to be in an async block.


It's not just the DB test. The JSON test was way slower than Netty, too.


Agreed. We need to get the Play benchmark code updated to do the database queries in an asynchronous block. Accepting pull requests! :)


Not quite sure I understand your point re: the cost of the frameworks, above the bare standard library.

Do you mind breaking it down for me a bit please?

Cost in dollars or cost in hardware utilization or some other cost?


If you have developers who for one reason or another prefer a given platform, then the most important performance comparisons are about how close various frameworks on that platform get to the performance of the platform itself.

Knowing how much I'm giving up in performance in order to get the features a given framework gives me is an important consideration. Also understanding when it's worthwhile to work outside the framework on the bare platform given the speedup I'll get versus the cost I'll incur by doing so is a very important optimization decision-making tool.


How do you derive how much performance you are giving up from these benchmarks? There is not a neat relationship between the two.


There are performance numbers for a framework (Cake PHP, for example) and for the raw primitives of the platform it runs on (PHP, in that case). By finding the ratio between the two one can arrive at the performance loss attributable primarily to the framework you've chosen.

See my "framework optimization index" in comments below for a rundown on all these ratios which I was able to back out from this set of benchmarks.


"Among the many factors to consider when choosing a web development framework, raw performance is easy to objectively measure."

Oh really? Then why did Zed write such an angry rant about how you are doing it wrong?

http://zedshaw.com/essays/programmer_stats.html

Can we please see some standard deviations, at least?


What's wrong with that guy?


It's a pretty serious problem with how we benchmark though.

EDIT: when looking up what I vaguely remembered I somehow managed to come across a similar article that was published just today[1], even though I was referring to an older one[2] which was about microstuttering (basically: a high standard deviation in frame rate). The point still stands - in fact it applies to both cases in somewhat different ways.

To give an example: Crossfire and SLI graphics card setups a few years ago[2]. It turned out that while both gave a similar performance increase in average framerates, one of them had a significantly lower minimum framerate than the other. A high minimum framerate is probably more important in shooters than peak performance, but that's not what we've been testing all these years, is it? That's exactly the problem highlighted in the article by Zed.

I know this is a gaming example, but I'm sure that in user perception of the performance this matters just as much for the responsiveness of webpages.

[1] http://www.tomshardware.com/reviews/graphics-card-benchmarki... [2] http://www.tomshardware.com/reviews/radeon-geforce-stutter-c...


Indeed, it's important to look not just at the average performance and performance extremes, but also the distribution of performance.

Standard deviation helps with this. Also, oftentimes looking at latency at the 50th, 90th, and 99th percentiles is valuable, as you can see the events that would make your users unhappy. They're a very tangible set of metrics.
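For instance, a nearest-rank percentile over collected per-request latencies is only a few lines (a sketch, assuming latencies are recorded in milliseconds):

  import (
      "math"
      "sort"
  )

  // percentile returns the p-th percentile (0 < p <= 100) of latencies
  // using the nearest-rank method. It sorts its input in place.
  func percentile(latencies []float64, p float64) float64 {
      if len(latencies) == 0 {
          return 0
      }
      sort.Float64s(latencies)
      rank := int(math.Ceil(p / 100 * float64(len(latencies))))
      return latencies[rank-1]
  }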


He's a fantastic software engineer, but he is _very_ abrasive. I've read some of his posts claiming he would fight another developer in person at a conference if they stepped up. He would rent out a ring and put his yellow belt to practice.

Really, I'm not making this up. He sounds like a jerk to work with.


He's actually a really nice person. He helped me out (to him I was just some stranger on the phone) when I was trying to decide what to do with my career when I was in NYC.

Zed Shaw is probably one the best people you can know in the developer community, a very good guy.

Your sensationalism based on some of the stuff he says on Twitter and blogs is amusing, though.


> Your sensationalism

well he doesn't try to be a nice guy in public...


He is actually a very nice person to be around and work with. Besides being professional and experienced he is also outspoken with a firm opinion.

You might think that's being a jerk, I think that's honest and reliable.


I am among those appreciative of his contribution to my education. At the same time I wish he'd let the word 'fuck' regain its undoubted impact by not using it on what seems to be every conceivable occasion (excuse the pun).


He's just a bit too honest and direct for the average American.

From that perspective at least, I expect people like Zed to do great in Northern Europe. There's a reason Americans think DHH is rude. He isn't; he's just Danish.


It seems like some context might be missing here.


Context: http://web.archive.org/web/20080103072111/http://www.zedshaw...

" I mean business when I say I’ll take anyone on who wants to fight me. You think you can take me, I’ll pay to rent a boxing ring and beat your fucking ass legally. Remember that I’ve studied enough martial arts to be deadly even though I’m old, and I don’t give a fuck if I kick your mother fucking ass or you kick mine. You don’t like what I’ve said, then write something in reply but fuck you if you think you’re gonna talk to me like you can hurt me."


He's a lifestyle ranter. He used to have a section of his website called /rants/ which you can still read here: http://web.archive.org/web/20080105054424/http://www.zedshaw...


Seems like a bad case of ADD :\


Ha! Gotta love Zed


I'm torn about this. On the one hand, while I've known my framework of choice (Rails) is slow, I didn't know how much slower it could be in the grand scheme of things. But on the other hand, I'm more shocked by the difference between EC2 and dedicated hardware (10x improvement with rails), and even 89 requests per second (20 query benchmark on EC2) is still a decent amount of traffic. (Plus this doesn't count any optimizations I would make anyway, like caching).

Either way, good architectures usually move the high-traffic or high-CPU areas away from a scripted language anyway.

Thanks for the really informative post! Go seems to be a good balance as a high performance language without having to go back to my traumatic Java days.


Don't be too disappointed about Rails.

As a rule I like to divide this world into "Featureless" and "Featurefull" products.

When you use Rails, you're aiming to pile up features. You want to react to product managers and to users; you want to work fast and satisfy the needs of customers, or else you won't have anyone to build for.

In this reality, the fact that you're doing 20 req/s is OK. In fact, I'm betting that even if you take Go or Node.js, pile up all of the infrastructure and features that exist in Rails, and pile up a ton of your own code, buggy and not buggy, you'll get around the same kind of satisfaction index from users.

This is because your product can be perceived as slow even though your servers are blazingly fast.

On the other side of the spectrum there are "Featureless" products. These are infrastructural products. A logging service. An analytics service. A full-text search. A classification and recommendation engine.

These you don't want to build in Rails. I'm sure you haven't even considered it. These you want to build with one of the top-notch libraries that this survey indicates.


Also, there are certain options around Rails, like Thin or Unicorn, that can drastically increase your overall performance. So in that sense, I think it's a lot more complicated to determine.


Thanks for the feedback, atonse! We had a great amount of fun putting this together, as you can imagine.

I agree, a remarkable take-away for us was how dramatically our i7s outperformed the EC2 instances. Admittedly, those were EC2 Large and not Extra Large instances.

A previous draft of this blog entry had a long-winded analysis of hosting costs--discussing the balance between ease and peace-of-mind provided by something like AWS versus the raw performance of owned hardware--but we elected to remove that since it wasn't really the point of the exercise.


Were these the new 2nd-generation large instances or the original ones?


They were m1.large.


EC2 was a hard platform to test on, only because our i7 hardware would give us results fairly instantaneously, but we became impatient when we had to wait upwards of 10x as long for the data on the EC2 large instances.

We're actually very interested in how the large/newer instances perform.


I feel obliged to ask what constitutes 10x of instantaneous.


I worked with Pat (pfalls) on this effort. He pulled the benchmarks together and built the script to automate the tests. We aimed to deploy each framework/platform according to best practices for a production environment and then stress-test common operations: JSON serialization of objects and database connectivity. We were surprised by the wide spectrum of performance we observed and hope it is interesting to you as well. Four orders of magnitude in one of our tests!

If you have any questions or see something stupid we did, please let us know. We'd like to correct any mistakes straight away, especially since we're certainly not experts on all of these frameworks and platforms.


I'm no expert, but I think certain languages/frameworks are better suited to be behind certain servers when high concurrency is tested. E.g. from http://nichol.as/benchmark-of-python-web-servers it seems django would be better served behind gevent.


There are a lot of variables and tweaking that can be done, and it would be nearly impossible to optimize each.

Similarly, I was wondering what sort of an effect connection pooling would have, as the out of the box django distribution doesn't do that. It really didn't perform too well in their tests.


At LEAST gevent with session write-through caching, psycopg2pool, PostgreSQL (hello, excellent South support?), and no unnecessary middlewares or applications that rely on them (if it's a speed-oriented use of Django, we're hosting on a specific API sub-domain, right?). At most, THEN you tune the settings to keep an optimum number of Postgres threads alive and tweak some gunicorn/nginx max-connections parameters for your site. If running all locally, use UNIX sockets. This article is trash when it comes to providing any useful data, other than for a Django that's barely been configured beyond not using SQLite (and who the hell uses that in production?), so I don't buy the "oh well, we just wanted to see what it'd do out of the box" rhetoric. Might as well benchmark ./manage.py runserver. I wish they'd done it right or not published at all, let alone published to shill their company that doesn't provide what they advertise.


Thanks for your pull requests, knappador. We will try to get a revised post out in roughly a week or so with as many of the tweaks we've received (as pulls and tips) as we can muster.


I'd be interested to see performance for Vert.x on its other hosts (this is the JVM version, I believe).


I think you may misunderstand Vert.x's polyglot features:

While Vert.x supports many programming languages, all of them run on the JVM. This means when you use the Ruby Vert.x API, you're using JRuby; likewise, JavaScript runs through Rhino, Python through Jython, and Groovy/Scala through their own compilers.

That said, it would definitely be interesting to see the performance implications of using one of those languages and vert.x on the JVM.


Interesting point. You're correct, we've only tested Vert.x as a Java/JVM platform.


I agree, seeing Vert.x with its other language options would be interesting.


Rhino with RingoJS would be another good JVM-based test.


Erlang would be nice to see, with, say, Chicago Boss.


Agreed. As you can imagine, we had to stop adding additional frameworks somewhere or we'd never get this posted. :)


Since these tests seem to be all about JSON serialization, it would've been interesting to see the tests with rails-api instead of the standard Rails stack:

https://github.com/rails-api/rails-api

What webserver were you using on JRuby? Was it Trinidad? Did you try Jetpack?


I concur, and Rails 4 may not be officially released yet but it's stable enough to run these tests against.


rails 4 with some concurrency would be interesting to see


Where is ASP.Net MVC? Odd that you list obscure frameworks like Wicket and leave out one of The Big Four frameworks.

(The big four in my book are: ASP.Net MVC, Rails, Django and CakePHP)


We'd love to have ASP.Net MVC included. One minor gotcha is that to do it justice, we'd need to spin up a Windows EC2 instance and figure out how to script that. It's on our to-do list!

We did briefly test ASP.Net on Mono (see another comment in this thread) but didn't include it since we didn't believe that qualifies as a "production" grade ASP.Net MVC deployment.


You should include it; lots of shops are deploying their products on .NET MVC with Mono.


I agree... Let's see what the numbers show for Mono, and on Windows.


Just use AppHarbor.


I came here to say this. I don't understand why some of the development community likes to act like .Net doesn't exist....

It pains me to see charts and reporting done like this while leaving out my favorite framework.


Since these benchmarks are so wide-ranging, I agree. But that means setting up an entire new testbed on Windows, and then trying to make it comparable to the other platform testbed; possibly tuning. You need a Windows expert to do this. My question is, why aren't Windows experts setting these up?


If you can cover ASP.Net MVC, then I'd recommend including ServiceStack. My own tests of their JSON implementation have shown it to be 5-10x faster than ASP.net MVC.

Drop me a line if you need a hand with either :)


Just curious, but is CakePHP seen as a major framework by people outside the PHP Community?


Big four without any Java framework, but with a .NET one? And CakePHP as a major one? I wasn't aware it's still used.


I'd like to see how .NET MVC would compare. I realize you'd have to spin it up on a Windows EC2 instance, and there would definitely be some variance in the performance of that box vs. the *nix EC2 instances, but I'd still be interested in seeing how it fares in comparison.


We agree. We aim to provide a .NET MVC test soon. We did briefly test it on our i7 hardware and, if memory serves me correctly, it clocked in at around the same position as Spring.

But don't quote me on that! :)


I would very much love to see:

(win | mono) + (httphandler | asp.net mvc | webapi | servicestack | nancyfx)

it would hopefully compare with java stacks!


Yep, would also be interesting to see Web Api, hosted on Windows and on Mono.


I would also suggest ServiceStack


And synchronous and asynchronous (using C# 5 async) versions.


I think you're about right on that one. Of course we were running on mono.


> This exercise aims to provide a "baseline" for performance across the variety of frameworks. By baseline we mean the starting point, from which any real-world application's performance can only get worse.

I disagree with the implication here (that this is a good point for comparison because "real-world application's performance can only get worse."). Yes it can only get worse but how much worse (per unit of "features") is both significant and unaddressed.

This isn't the best example but look at the gap between the top and bottom of the scale in the Database access test (single query) and Database access test (multiple queries) charts: In the first, Gemini is ~340x faster than Cake, in the second, only ~23x faster. There is still a big gap but it closed by an order of magnitude once you stepped past the most trivial possible DB access test.

So nodejs or php-raw is faster than cake at a single DB access, but what about when you create a real world scenario with authentication, requirement to be able to update features faster (i.e. use an ORM), env. portability requirement, etc.? It seems to me this would look like a little slower, a little slower, a little slower in the {raw} versions, and already included, already included, already included in Rails or Cake. The full featured frameworks take a lot of their performance penalty up-front, with less of a hit as features are added (maybe? :P).

My point is that it's not reasonable to assume that hackernews-benchmarks will actually reflect production use. That said I think the article is cool, and agree that it's good to keep framework authors' feet to the fire regarding performance!


This is completely loaded. Your implication is that the only viable test is one which exercises all of the functionality of the most feature-rich framework. How would that be a) viable and b) meaningful?

We know that there is a set of common features, and the benchmark's goal is to test least-common-denominator stuff across the frameworks. Authentication and portability are not LCD. The argument that they are is capricious. What if we made the requirement be that the framework is a Lisp? Now we've completely changed the intent.


I meant to suggest that comparing php-raw to Rails is apples & oranges, not "you must benchmark in a way that benefits larger frameworks", just "please acknowledge that LCD tests like this inherently cast Railsy frameworks in a bad light."

It's like condemning a Swiss Army knife because it's not as efficient as a fixed blade at cutting apples. Well, yeah, that's true, but what about when you need to drive a screw or pull a cork? One is a multitool; it doesn't make sense to compare it to a specialized tool unless all you plan to do is cut apples.


Would've loved to see http://servicestack.net on this list which has great performance on .NET and Mono: https://github.com/ServiceStack/ServiceStack/wiki/Real-world...

And also maintains .NET's fastest JSON and Text Serializers: http://theburningmonk.com/2011/11/performance-test-json-seri...


Thanks for the tips on these. We'll add ServiceStack to our to-do list. As you might imagine, that list is getting long as a result of some great community feedback. Pat (pfalls) is diving into the pull requests this morning.


pfalls - amazingly, I spent the last 2 days of my holiday doing the same thing for a future open source project. I was just stumped when I saw you guys did the same (could have saved me a couple of days!)

I wanted to find the leanest Web framework on any kind of platform; but the difference from your approach - I already knew the kind of code that would run on it.

I tested: Go, Java (servlet, dropwizard), Scala (scalatra), Ruby, Node.js (connect).

For me it was:

* Scala

* Java

* Clojure (equal to Java - big surprise here)

* Node.js

* Go (almost equal to Node.js)

* Ruby (far far down)

Scala took the lead with amazing results. Moreover, a good metric was latency, where Scala was the only one to reach microsecond resolution.

I'm not a fan of Scala because of its surrounding tools, which is why I'm still considering going for either Clojure or Node.js.

I think the most positive surprise was Clojure, given that it is a dynamic language. The most negative surprise was Go: by itself it is impressive, but when given real work (web handling, Redis/MongoDB) it goes bad quickly. Happy to see this correlates with your findings too; I'm assuming this is a symptom of library maturity..?

I'd be happy to see how Scala fares on your tests.

You've done an awesome job!


Thanks for the comment!

This started out as a small exercise, that quickly ballooned because we were curious about every framework and platform. Obviously we had to stop somewhere, but we're very interested in adding more tests in the future. In fact, we're hoping the community will help us out as well!


Yes, I know the feeling :)

What started as a couple of hours of exercise for myself ended up as 2 days of hacking and barely sleeping, as surprises in my assumptions kept unfolding, and as I wrote and rewrote POCs just to validate that Clojure is as fast as the numbers say, that Scala is faster than Java, etc.


Dropwizard looks awesome, just the kind of project I've been looking for, also it links to JDBI which is very similar to a sql lib I maintained for myself all these years and looks awesome. Thanks for posting!


Would you consider releasing some of the benchmark code/setup? (better yet integrated into OP's project) I am very interested in seeing clojure's performance. What kind of framework did you use for clojure?


Scala really would have been nice. Especially a Scala/Akka/Spray combo :)

I'm working with a setup like this and just love it!


1) The Python version has some basic newbie coding errors. This sort of code is what Python programmers call "Java written in Python". It may be a valid algorithm in Java, but it's the wrong way to do it in Python. Code like this will work, but it will be slow. Depending on the size of "queries", you are potentially allocating gobs of memory in two different places for no reason, and then throwing it away without using it. I wouldn't be surprised if the examples in other languages had similar problems.

2) The JSON serializer in Django 1.4 uses a method which is known to be very slow, but which is easily portable across different platforms and works with older versions of Python. They no doubt included it for easy bundling. In a real application you would probably want to simply use the normal JSON serializer from the standard library (which is many times faster).

3) The examples are little more than "hello world". I did some benchmark tests with several Python async frameworks, Pypy, and Node.js for an application I was working on. With small JSON objects there wasn't much difference in performance. Once you started using large JSON objects the performance lines for all versions were indistinguishable from each other. The performance bottlenecks were in libraries, and those standard libraries were all written in 'C', so interpreter versus compiler versus JIT made little difference.

4) The problem with "toy" examples is that in real life there are two performance factors which must be taken into account. Think of as y = mx + b. With a toy example you are probably only measuring "b". With most real life applications it's "m" that matters. There are often different optimization approaches that are best for varying ratios of "b" and "m". You have to know your application intimately and benchmark using data which is realistic for that application.

Python has a reputation for being "easy to learn". However, it is "easy" in the sense of being able to hack something together that works without knowing very much. There can be several different ways of doing things and doing it one way versus another way can mean a difference in performance of several orders of magnitude. The same may be true for some of the other languages, but I haven't examined them in enough detail to say.


Numerous irregularities plus a strong vested interest in the JVM make me doubt they have given adequate shrift to Go, here.

Given the amount of interest in Haskell and Yesod around here, it is strange that it is missing.


Could you elaborate? Their website seems to indicate that they are a very polyglot shop, not someone pushing a JVM agenda:

"On the back-end, we use Java, Ruby, Python, .NET, PHP and others based on what makes sense balancing server performance, scalability, hosting costs, development efficiency, and your internal development team's capabilities."


"We have included our in-house Java web framework, Gemini, in our tests. ... "

Then you see results for some languages that are completely out of whack with most other such benchmarks (they themselves mention the weirdness of Sinatra vs. Rails, for example).

Then you see on a couple platforms that more performant mainstream options have been excluded, for no good reason.

Then if you look at the repo, there are deployment choices and code mistakes in some of the other languages which go well beyond elementary incompetence...


Feel free to issue a pull request to help our testing. We are not trying to push Java-based frameworks over any others and we believe we are being fair across the board. That being said, if there are "code mistakes... which go well beyond elementary incompetence", then we would love to correct and retest these.


Maybe you should've had the controllers execute raw SQL for a better comparison. I see you executing a regular query in Rails whereas your Java servlet is using prepared statements.
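(For comparison, the prepared form in Go's database/sql looks like this; a sketch only, and the table/column names are assumptions rather than the benchmark's actual schema:)

  import "database/sql"

  // Prepared once and reused across requests, so the statement is parsed
  // a single time instead of on every query.
  var worldStmt *sql.Stmt

  func prepare(db *sql.DB) (err error) {
      worldStmt, err = db.Prepare("SELECT id, randomNumber FROM World WHERE id = ?")
      return
  }

  func randomWorld(id int) (n int, err error) {
      err = worldStmt.QueryRow(id).Scan(&id, &n)
      return
  }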


We love Go here! But admittedly, we have not yet deployed an actual Go web application to a production environment, so the tests demonstrate our first attempt at creating a Go production environment. We based the approach on whatever material we could find on Go reference sites.

That said, we'd love to hear what we did wrong in the Go tests so that we can fix those up.

We'll be posting follow ups as we've had a chance to go through all the recommended tweaks.


I hope no one contributes a Haskell solution to this farce.


>Given the amount of interest in Haskell

I see more people making uninformed "haskell sucks" posts than expressing interest in it.

>and Yesod

Really? Yesod is the anti-haskell haskell framework.


> Really? Yesod is the anti-haskell haskell framework.

Can you please elaborate? Being interested in Haskell web development and trying to choose a web framework makes me wish for more information.


I wouldn't call Yesod "anti-haskell". By default, it relies on QuasiQuotes and TemplateHaskell a lot [1], which are extensions to the GHC. So by default, you'd have a hard time running Yesod applications on anything else but GHC (the Glasgow Haskell Compiler). These extensions allow you to write in an EDSL that generates Haskell for you. IMO, Yesod's use of these extensions are a benefit, as it allows the user to get stuff like type-safe URLs in HTML for free (e.g. you put href=@{Home} on your HTML element and Yesod will ensure that the value interpolates to a route that exists at compile time).

Haskell libraries often depend on language extensions, whether it is overloaded strings or type families or whatnot... so I think it's strange that Yesod gets picked on for doing the same: taking advantage of the tools provided by GHC to create a better environment for the developer.

[1] http://www.yesodweb.com/book/haskell#template-haskell-14


Hugs is now defunct (last release in September 2006, doesn't even support the 2010 language standard), so there is no reason that being GHC-only should be a consideration in selecting a Haskell framework. It's the only real option.


I'm guessing, but I think it's because yesod uses a lot of magic such as templates and the like. The other frameworks like Snap use more idiomatic Haskell.


Yesod is designed to try to replicate rails style frameworks, which as an approach, doesn't work well in a static language. It is also designed to try to hide any traces of haskell. Rather than provide a framework to write haskell code in, they use quasi-quotation to provide a bunch of totally different syntaxes for different parts of the app, which get compiled to haskell behind the scenes. Most haskell users prefer to write haskell rather than specialized, single use languages with limited functionality and poor error reporting.

Then on top of that, the marketing behind Yesod is essentially deliberate mistruths that suggest weaknesses in Haskell which do not exist. See how pilgrim689 thinks that the EDSL Yesod uses for routing "gets you type safe urls"? Type-safe URLs are also available in Happstack and Snap, but written in Haskell rather than a weird custom pre-processor language. The EDSL just gives you a different syntax (and makes error messages complex and hard to understand); it does not give you the type safety. Pointing out that creating custom languages which are inferior to Haskell and have no benefits is a bad idea results in whining of "stop picking on yesod just because we use extensions, everyone uses extensions!", despite the use of extensions never being brought up.

As for more information about web frameworks in haskell: I've tried all three and can give you my thoughts. I ended up using snap, so consider me biased when reading this. Yesod is rails-like in that it pushes a misinterpretation of MVC on you that encourages writing redundant code. Happstack and snap aren't really frameworks in that sense, they don't say "give me some code following my conventions and I'll run it", they say "here's how you get access to the request, have fun". More like libraries than frameworks.

The DB access layer Yesod provides is of the "dumb everything down to the lowest common denominator" variety, except that it adds even more limitations beyond that. So you end up having to use something else that is not integrated at all. Happstack and Snap don't provide a DB access layer, but do provide integration with several DB libraries off Hackage (hdbc, haskelldb, postgresql-simple, acid-state).

Happstack has the best documentation of them, and it and snap are very similar design wise. Porting an application from one to the other is pretty straight forward. The only reason we settled on snap instead of happstack is that snap includes a development mode that works well, and happstack does not. Meaning with snap you just change your code and it picks up the changes, recompiles and reloads it automatically, and shows you any errors in the browser when you refresh. With happstack you either need to work out your own way to deal with that, or keep recompiling manually all the time.


You and tikhonj are all over the place in here. If you were being downvoted into the gray, you would know. There is no shortage of people praising Haskell every day, this is what is in fashion today.

There are also regular posts about Yesod.

I conclude that you know perfectly well that Haskell and Yesod are regularly mentioned on HN, but find it inconvenient to have mentioned for some reason I do not fathom.


> There is no shortage of people praising Haskell every day, this is what is in fashion today.

I really don't like this characterization of interest in Haskell. It implies that it's no different from any other language and is just arbitrarily picked up because it's trendy. Learning Haskell is a very substantial investment of time and effort, it is very different from languages that most programmers have used before. It practically tells people “don't even try to like me, I'm high maintenance.”


>You and tikhonj are all over the place in here

I am all over the place in here for the exact reason I mentioned. Go look at my posts, for every post about haskell by me, it is in response to someone posting some absurd nonsense like "haskell can't do real world" and "functional programming is great except you can't really do it because state". If people were interested in haskell, they would express interest, not strawman dismissals.


But that is like claiming nobody wants gay marriage because look how loud those Westboro people are screaming.

I'm interested in Haskell. I find it to be frustrating sometimes, and sometimes I vent my frustrations. It is hard to learn. But out of all the opinionated languages out there, Haskell is the one that I agree with the most.

There are plenty of people here that are obviously interested. Why does it matter that the naysayers say nay?


>But that is like claiming nobody wants gay marriage because look how loud those Westboro people are screaming.

That analogy would only be accurate if those Westboro people were in the majority.

>Why does it matter that the naysayers say nay?

I'm not sure how to answer this, given the context. I simply pointed out that I don't think the idea that there's a lot of interest in haskell here is accurate, and cited all the uninformed crap spewed about haskell all the time as evidence.


Would love to see how these results compare to some of the web frameworks for concurrent functional languages like Erlang/Haskell: Nitrogen, Chicago Boss, Snap, Yesod, etc.


Please don't encourage them. Even if Haskell comes out on top, I would still be unsatisfied because the rest of the benchmarks are unfair. Lies by confusion are still lies.


ditto. I hear Warp (the server behind Yesod) is a beast.


I'm not familiar with Warp. Would one of you guys be willing to help us put together a test for Yesod?


#haskell, #yesod, #snapframework on freenode are very helpful.


Right, where is Zotonic?? It's actually focused on speed/performance.


This is exactly why I decided to use PHP for my startup. I have something along these lines that I hope to blog about in the coming weeks (I tested php-fpm on nginx/go/node.js/silk.js and php won by a landslide when it came to speed).

I would love to see php-fpm on nginx included in this test.


When it comes to speed, considering you are using nginx, the way to go is using nginx as an app server, not just a FastCGI frontend. The lua-nginx-module combined with a proper database module (async, with connection-pool support, like ngx_drizzle or ngx_postgres) can give you speed. OpenResty provides a simplified, preconfigured way to try it and adds some features too. http://agentzh.org/misc/slides/libdrizzle-lua-nginx


The problem with php is that it looks great on (some) micro-benchmarks, but on real apps under real sustained load it certainly turns to cold dog shit from time to time for no apparent reason.


What are you basing this on? I've been using PHP for well over a decade in high load environments and never experienced it turn "to cold dog shit" .. any issues I have experienced had a good reason, not "no apparent reason".

But then, I've never used a PHP framework in all the time I've used it .. maybe that has something to do with me never having had negative issue with PHP.


As a Rails developer and admirer, this is eye-opening. I love the framework (and Ruby especially), but these numbers bear some serious consideration.

30-50x performance difference gets really... real, no? The standard refrain of "throw more hardware at it" must reconcile with the fact that a factor of 30-50x means real dollars for the same amount of load. Is the developer productivity really that much greater?


Preface: This post is going to come across as a Rails apologist piece, but please read the entire thing before you reach a conclusion. Please also consider that you could apply these same arguments to just about any of the high-level language based frameworks on the list. I use Ruby on Rails in my comparisons, but I'm a huge fan of Node.js, Python/Django, and Go.

I fully respect the JVM family of languages as well. I just think that Mark Twain said it best when he said: "There are three kinds of lies: lies, damned lies, and statistics." It's not that the numbers aren't true, it's that they may not matter as much, and in the way, that we initially perceive them.

Performance is certainly something you should consider when selecting a language/framework, but it is not the only thing.

========================

You should undertake a detailed examination of these statistics before making any decisions.

Issue #1) The 30-50x performance difference only exists in a very limited scenario that you're unlikely to encounter in the real world.

Look carefully at the tests performed. The first test is an extraordinarily simple operation: take this string, serialize it, and send it to the client. This is the test in which we see massive differences:

Gemini vs Rails

25,264/687 (gemini/rails-ruby) = 36.774

25,264/871 (gemini/rails-jruby) = 29.000

Node.js vs Rails

10,541/687 (nodejs/rails-ruby) = 15.343

10,541/871 (nodejs/rails-jruby) = 12.102

That's a 37x performance win for Gemini, and 15x for Node.js.

Side note: You might be wondering why I didn't compare to the top performer, Netty. Netty is more like Rack. You build frameworks on top of Netty, not with Netty. As a Ruby dev, you could think of this in the same context as comparing Ruby on Rails with Rack; not a good comparison. Hence we won't compare Rails to Netty.

The error would be in extrapolating that a move to Gemini or Node.js would give you a 37x or 15x performance increase in your application. To understand why this is an error, we jump down to the "Database access test (multiple queries)" benchmark.

Issue #2) Performance differences for one task don't always correlate proportionally with performance differences for all tasks.

In the multi-query database access test, we start to see the top JSON performers slow down significantly when compared to the slow down for Rails:

Gemini vs Rails

663/89 (gemini/rails-ruby) = 7.449

663/108 (gemini/rails-jruby) = 6.138

Node.js vs Rails

116/108 (nodejs-mysql-raw/rails-jruby) = 1.074

60/108 (nodejs-mysql/rails-jruby) = 0.555

In this scenario -- which is arguably much closer to the real world -- Ruby on Rails closes the gap and even beats some of the hip new kids.

But why? The in-depth answer to this question would require a lot of space, but the really, really short version is kind of a "what's the sound of one hand clapping" response: Ruby isn't actually all that slow.

To understand what the hell that means, check out this presentation from Alex Gaynor (of rdio/Topaz fame):

https://speakerdeck.com/alex/why-python-ruby-and-javascript-...

Ruby is just about as fast as C, provided you're comparing it to C that does exactly the same operations on the hardware as the Ruby code. Don't get me wrong, that's a HUGE provision. But it warrants close examination.

The real benefit of lower-level languages like C is that they give you the flexibility to drill down into your actual bare-metal operations and optimize the way the program executes on the hardware. As Alex points out, we don't currently have that level of flexibility in languages like Ruby (without dropping down to inline C), so we suffer a performance penalty.

This penalty is huge for simple tasks because they involve only a handful of operations that execute extremely quickly. As you add complexity, however, the benefits of micro-optimizations get lost in the vastness of the overall execution time.

Look at it like this. When Gemini hits 36,717 req/s in the JSON test, each request only lasts about 1.6 ms (inverting the throughput at the benchmark's apparent concurrency of about 60 in-flight requests: 60 / 36,717 req/s ≈ 1.6 ms). This is only possible because of the simplicity of the operations being done on the hardware. Ruby loses big here because there is a lower boundary to the way you can optimize without dropping down to C.

gemini: 1.6 ms per request

rails-ruby: 87.3 ms per request

When we look at the multi-query database access test, we can see how the optimization at the low level gets lost in the sea of time taken to process the request.

gemini: 90.5 ms per request

rails-ruby: 674.2 ms per request

Granted, that is still over a 7x performance win for Gemini, but this is where the Ruby arguments about programmer efficiency come in to play. I don't know Gemini, so it may very well beat Rails in that comparison too. Ruby is getting more performant with every release though, so it's easier to justify on the basis of preference alone when we're this close.


Don't conflate Ruby with Rails. Ruby _is_ slow:

http://benchmarksgame.alioth.debian.org/u32/benchmark.php?te...

http://benchmarksgame.alioth.debian.org/u32/benchmark.php?te...

and so is Python:

http://benchmarksgame.alioth.debian.org/u32/benchmark.php?te...

The only time Ruby or Python are fast is when the program is not running Ruby or Python but running some C code underneath. If your programs only consist of that, you can very well say "this ‘Ruby/Python’ code is fast". But as soon as you have something that isn't in your standard library, welcome to the actual language, and welcome to performance problems.

Elaborating on the implications of this: whenever you actually _use_ the language to do some abstraction, you pay heavily for it: http://bos.github.com/reaktor-dev-day-2012/reaktor-talk-slid...


You should really check out Alex Gaynor's slide deck. Nothing I said disagrees with what you've said here, provided you take the entire thing in context.


That's a very detailed and thoughtful response. You make some valid points. Maybe I'll try to craft some more complicated benchmarks that replicate normal CRUD operations found in most webapps.


That's kind of missing the point. In statistics, there is a notion of confounding variables: factors that affect your outcome but are hidden or outside your control. As your example becomes more complex, the opportunity for confounding to impact your result goes up significantly.

I believe the multi-query database access test is actually a good example of a "complex enough" test, but not too susceptible to confounding. In this test, we see that Rails isn't so far behind.

Basing your choice of framework on speed alone is a pretty bad idea. When you select a framework, you need to optimize for success, and the factors that determine the success of a project are often less related to the speed of a framework and more related to the availability of talent and good tools.

That's not to say you should ignore speed entirely, but that you have to weight your factors. There is a tendency to believe that you will need rofflescale when you really won't. Keep that in mind when you're weighting your factors.


Really depends on which kind of app you're working on. My main work app is 99% cached content so it would probably work just fine with almost anything. Developer time is certainly the biggest expense in my case so high-level it is.


It comes as a surprise to me that this comes as a surprise to you. Really, you didn't know Ruby is pretty much as slow as it gets?


How about Lift? By the way, is the Play framework you tested Java or Scala based?

Either way, I'm shocked to see Play perform so slowly comparatively. Although it's easily 10x faster than Rails on most tests, I'm shocked to see Node.js faster than Play! (by 2x in most cases) Wow!!

Maybe Node.js critics should start appreciating it after all..


It is probably worth noting that, while we strove to make the tests as fair as possible, we followed the official tutorials for each framework when building out the tests, and we fully expect there are small instances where minor tweaks would improve a given test. Given that I am no Play expert, it would be of great value to have one who is (and it sounds like you could lend a hand there) check out the code on the Github page. If we did anything wrong with the setup or in general, we will gladly rerun the tests. Again, we followed the official 'getting started' posts for each framework, so we believe we have best practices in place.

Disclaimer: I am a colleague of the author of the linked article.


There are a few problems with your Play code that are causing it to be unnecessarily slow.

First-- what you're really testing here is the Jackson library. A majority of the cycles used in your application are being burned in that toJson call of an array of objects. This isn't a fair test compared to the servlet implementation because you're calling Jackson against a map in the Play example, versus against a simple String in the servlet example.

Second-- you are running database calls serially, and those are blocking. Considering that you're using the more-or-less default Play/Akka configuration, there are only as many threads as you have available processor cores. I would start by increasing the parallelism-factor and parallelism-max so you'll have more available threads. More importantly though, the database access should be wrapped in a Future, and you should be returning asynchronously. This should speed up the application by a huge amount.
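
In application.conf, that looks something like this (config paths as in the Play 2.1 thread-pool documentation; the values are purely illustrative and should be tuned to your core count):

    # Tuning Play's default Akka dispatcher (values illustrative)
    play {
      akka {
        actor {
          default-dispatcher = {
            fork-join-executor {
              parallelism-factor = 3.0  # threads per available core
              parallelism-max = 64      # upper bound on pool size
            }
          }
        }
      }
    }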


Do you think such a configuration could outrun the Vert.x configuration they've posted? I'm not challenging you, I'm just genuinely curious! Because if Play+Akka can outrun Vert.x, then it would be an interesting game altogether...


I think Play, with well-written asynchronous code, could approach the Netty/Vert.x speed. In other words, I'd be willing to trade the ease-of-use of Play for the slight speed impairment vs. writing directly to Netty/Vert.x/etc.


We'd love to test that theory. Can you or any Play expert rewrite our Play code and submit a pull request?


Not sure if your theory can really account for the 7X difference between Play and Netty in the JSON test. They both create a trivial name/value pair and pass it to the JSON serializer. In Play's case, it gets passed to their Jerkson wrapper for Jackson. What would you even change about that code?

(Note: I'm not necessarily saying the Jerkson wrapper is the culprit. Could be Play's routing framework, or something else.)


The problem is that the database queries were done in a blocking fashion. The test essentially blocks the main event loop which is of course going to kill performance.


This seems to be a nice benchmark. For the Python group, I would suggest two things: (1) include a lightweight framework like Bottle, and (2) Try pypy.


Thanks. Agreed, we'd love to get a Python micro-framework in the test. If you have some free time and feel like putting together a test for Bottle as a Github pull request, we'd really appreciate that.


(3) use bjoern

seriously, in my rough hello-world and SQLite increment-by-1 benchmark, a Bjoern + WSGI app runs 2x as fast as node.js.


I'd be curious about a gevent-worker-driven server, too. Presumably they're running Django on either a thread- or process-driven concurrency server, and gevent can show some pretty major gains. You could also do permutations of these: gevent on pypy (using pypycore), etc.


I'm most surprised that PHP seems to be around an order of magnitude faster than Ruby on Rails. I knew it was faster, but didn't think it would be that much.


The comparison would be php-raw vs ruby. The actual "php" benchmark did just as poorly as rails.


What kind of server did you guys use for your Rails test? Thin, Puma, Unicorn? Are you sure you ran it in a production environment?

Update:

Looks like Passenger in development mode. Good job, you benchmarked a web server that no one uses while reloading all code between requests.

Update2:

OK, it seems to run in production mode, but still, Passenger is not an idiomatic choice.


For the sake of curiosity, I happen to have recently done a benchmark of a "hello world" rack app (literally just responds "hello world" to every request) on a number of Ruby servers (mostly JRuby, but also Puma on MRI).

They were all run in production mode with logging disabled, etc.

http://polycrystal.org/~pat/scratch/microbenchmark.png

Note that a difference of 10k requests per second vs 3k seems huge, but if you invert it, you get 100 and 333 microseconds per request, respectively. In a real, non-"hello world" app, these differences are going to be negligible.

Though perhaps it would be more interesting if instead of just responding "hello, world", the app parsed some query parameters or something. But I was mostly interested in the overhead of different JRuby servers, not comparing different app servers (i.e. overhead from Sinatra should be more or less identical whether you're on Puma, Trinidad, or whatever).


Sheesh, you can clearly see the GC pauses in the Java versions.


We used Phusion Passenger, although we have plans to add additional servers (such as Unicorn). We tried to spend time with various server choices for all platforms, and for ruby, in our short test, Passenger won out against the others.

Our understanding is that when running Passenger, simply passing '-e production' to the command line is sufficient to run in production, but if that's incorrect, we'll gladly update the test.


Please make sure you're setting higher GC limits for the Ruby tests. Ruby's defaults are awful for a framework, and result in a LOT of GC thrash. It's not uncommon to see an order of magnitude improvement in performance when they're tuned properly. (edit: I'll just send a pull request, I found the setup file!)

Something else you might consider is the OJ gem rather than just the stock Ruby json gem. The latter is notoriously slow and memory-hungry (which will compound the GC issues!)


> make sure you're setting higher GC limits for the Ruby tests

Could you elaborate on this, or point me in the right direction? I'm learning Rails and curious.


For anyone still lurking, this user replied to me via email:

Ruby allocates heaps for its objects, and sets GC thresholds based on those heap sizes. Ruby allows you to change those settings via environment variables, which means that you can end up doing fewer allocations and less aggressive GC, which makes sense when using a full framework like Rails, which is going to allocate a lot of objects.

There's a more complete answer here to get you started: http://stackoverflow.com/questions/13387664/ruby-gc-executio...
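
In practice, tuning means exporting a few environment variables before starting the app server. A sketch for MRI 1.9/2.0 (the variable names are the standard ones; the values are commonly circulated starting points, not gospel -- measure before and after):

    # MRI 1.9/2.0 GC tuning (values illustrative)
    export RUBY_HEAP_MIN_SLOTS=800000     # allocate a larger object heap up front
    export RUBY_FREE_MIN=100000           # require more free slots after GC before growing
    export RUBY_GC_MALLOC_LIMIT=79000000  # GC less eagerly on malloc'd memory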


Pretty sure the passenger+nginx configuration is the most common Rails deployment. Not the fastest, but the most common, much like Apache is the most common web server for PHP.


If that's true, it's not really a fair comparison. Running in development mode slows things considerably. They do have a question in the FAQ about that point:

"You configured framework x incorrectly, and that explains the numbers you're seeing." Whoops! Please let us know how we can fix it, or submit a Github pull request, so we can get it right."

Perhaps you/we should submit a pull request?


Passenger wouldn't be my choice personally, but I don't think there's anything non "idiomatic" about it. Engine Yard, Cloud 66, etc. use passenger in their PaaS configs, it's been very popular (on the wane now, but still), etc. It seems fair enough and the differences aren't going to be the sort of order of magnitude change which would really matter on this sort of thing regardless.


Are you sure? This looks like the right file to me, and it says '-e production', last changed 6 days ago: https://github.com/TechEmpower/FrameworkBenchmarks/blob/mast... (unless that's an incorrect switch / something else overrides it - I haven't tried it, not too familiar with Passenger)


This (from [1]) looks like passenger in production mode to me.

> rvm ruby-2.0.0-p0 do bundle exec passenger start -p 8080 -d -e production --pid-file=$HOME/FrameworkBenchmarks/rails/rails.pid --nginx-version=1.2.7 --max-pool-size=24

https://github.com/TechEmpower/FrameworkBenchmarks/blob/mast...


I'm not familiar with Rails. Could you explain what you mean in layman terms? Why is code being reloaded on every request by default?


In development mode Rails will reload code on each request to pick up on changes you have made to your application. That way you can interact with the application to help verify that your code is working properly.

As of Rails 3.2 development mode watches for file changes and attempts to only reload those files, but it's still a significant performance issue.

By default all Rails applications start in development mode, so one gotcha of benchmarking Rails is that some people will forget to set the mode correctly. That said, from the setup code[1] (line 14) it looks like they were running passenger in production mode. The max pool size seems excessive, especially when running on large ec2 instances, but I'm not fully convinced that it's out of line.

[1] https://github.com/TechEmpower/FrameworkBenchmarks/blob/mast...


In development mode, it's nice to be able to edit Ruby code, then hit reload in the browser to see it in action without restarting the whole app. Sure, there are other ways to accomplish this, but reloading code is typically how it's done in Ruby.


The Rails framework is written to reload (reevaluate) most application code between requests in development mode, so that changes during active dev are reflected. This is not the (default) behavior in production.


Interesting - C# should have also been added. Java still rules the roost. If you absolutely must have a scripting language, old PHP with its warts is better. More magic in the framework means more abstraction, indirection, and code inefficiency, and that can be the spoilsport when it comes to performance.


We agree and want to get a C# test in there. It's among the top priorities for us.


What I can take from this is that using an ORM slows things down considerably. Also, looking at the Rails example, you didn't use ActiveRecord, which is really wrong.

I think you should tweak your tests to use more real world like examples. I realize it would be hard to do this across frameworks.

For example, let's have a database query pull a user record from 100,000 users by username, and maybe do an md5 on the password.


How can you write a web framework benchmark and not include some of the non-mainstream languages with probably the most performant frameworks, like Erlang (Cowboy, Mochiweb) or Haskell (Yesod, Snap Framework)? That's just wrong, anyway.


Oh... and you think Erlang or Haskell frameworks are mainstream while you fail to mention ASP.net?


There is a huge difference between raw PHP and CakePHP. I'd be curious to see other PHP frameworks (such as Zend, or Slim) in there-- is Cake just particularly slow, or is that simply what happens when you have a PHP framework?


Per my experience, CakePHP is probably one of the slowest PHP frameworks. It has a large overhead and lots of legacy code that slow down the whole system. Generally, frameworks targeting PHP 5.3+ are faster. PHP 5.4 has a further performance boost in production, and PHP 5.5 has Zend Optimizer built in, which is supposed to speed things up further, but I have never tried that in production.


We have Zend in mind as the next PHP framework we'd like to include in the tests.


Use either Symfony 2 or Silex. There is really zero reason to use Zend Framework 2.


CakePHP is always at the bottom end of these "framework performance" tests, even when comparing PHP only. It's the everything-but-the-kitchen-sink of PHP frameworks, with bells and whistles never used by these "hello world" performance tests. Just the wrong tool for the job if you want to spit out some JSON.


I'm curious if they configured php with apc or zend optimizer. The huge difference is easy to explain as parse overhead for the framework's code, which happens on each request if you're not using a bytecode cache.


Yep, this is exactly what I was wondering. Opcode caching is a key part of production PHP environments - in this case, since the code isn't changing, they should even disable the "change check" (apc.stat=0) as one would do in a production environment.

And if this is the case with their configuration of PHP, it makes me wonder what other platforms are not configured for production in this benchmark :)


PHP's relation to everything else in this benchmark is so unusual among benchmarks that I strongly suspect a failure of parity with the other configurations. Hopefully that is unintentional.


Also, it would be beneficial to see Phalcon PHP - it's implemented as a C extension, so it should theoretically be faster. http://phalconphp.com.


If we're going that route, the fastest php performance you would probably get from facebook's HHVM JIT compiling php engine: https://github.com/facebook/hiphop-php

It's no accident that zend optimizer (bytecode cache) is being bundled into php 5.5 as open source, as mere bytecode caching is now not fast enough to charge money for when hhvm is open source. I expect the next version of zend's commercial php server product to contain a JIT engine to match what hhvm can do.


That would certainly be very interesting, especially as PHP is being nudged into that area (Twig also now offers a c-module).

I really hope the author picks up on this and compares it. It could provide some tremendous insights into pushing PHP further into this space.


Utter crap.

"Sadly Django provides no connection pooling and in fact closes and re-opens a connection for every request. All the other tests use pooling."

But it's free, open-source software, and we provide asynchronous database connection pooling for PostgreSQL:

https://github.com/iiilx/django-psycopg2-pool


Your comment is valid and would have been vastly improved without the first two words.


I'll elaborate on the first two words:

They're shilling for their company with a config and tools I wouldn't be caught dead using. No idea what other craziness lurks in the other daemon configs. It's irresponsible and misleading. They're misrepresenting the framework I use to do my work and probably others while hoping I or someone else is going to do their work for them. "Outsourced CTO services?" Their trash. My lawn.

Someone's going to eventually ask me to develop the rest of xyz in node and I'll have to repeat myself about articles like this. Bad enough when it's bloggers. Worse when it's self-shilling company that's obviously not willing to put the time in to be what they claim to offer.


Obviously if they optimized each and every one of these benchmarks we would see different results, but it would take a massive amount of time to learn the ins and outs of each framework to the point where you can do so effectively.

For one person to do a benchmark over this many samples, they have to just go with the out-of-the-box setup for each.


This is really the point.

As our blog post suggests, where we are not experts we had to rely on the tutorials provided by each framework's authors to build a test setup. If a specific framework seems low on the list, it could be due to the fact that the best practices guides we found for getting set up were not correctly configured for production use.

Draw what conclusions you would like from this statement, but we did aim to be as fair and unbiased as possible.


We aim to do Postgres testing soon. As you can imagine, the feedback from this has been awesome. Looking forward to seeing Django on Postgres.


Great to hear you're planning to do some Postgres benchmarks!

To improve fairness, you might want to consider using pgbouncer (setup to only offer simple session pooling) in between the Postgres db and any framework that doesn't have internal support for connection pooling.

E.g. I'd love to see how Flask performs using just the psycopg2 driver (i.e. raw db access, no ORM) and pgbouncer to handle the pooling.
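
The pgbouncer side is only a few lines of pgbouncer.ini. A minimal sketch (directive names from the pgbouncer docs; values illustrative, auth settings omitted):

    [databases]
    hello_world = host=127.0.0.1 port=5432 dbname=hello_world

    [pgbouncer]
    listen_addr = 127.0.0.1
    listen_port = 6432
    pool_mode = session        ; simple session pooling, as suggested above
    default_pool_size = 20
    max_client_conn = 256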


I'm curious to see Django results when using the gevent worker for gunicorn. For these type of quick JSON calls, you can see huge performance increases.


Along with gevent, it would be good to throw in psycogreen to improve DB (well, postgresql) evented connections.


It looks as if we've got a Github pull request including these changes, so we'll be able to revise the Python-Django numbers soon.


I would love to see .NET (C#) included in this test: ASP.NET Web API (synchronous and asynchronous), ASP.NET MVC (synchronous and asynchronous), and ASP.NET HTTP handlers (synchronous and asynchronous).


Here's a fairly recent end-to-end ServiceStack vs WebApi benchmark: https://twitter.com/anilmujagic/status/272544925478973440

    ServiceStack      9615ms
    WebApi           30607ms
GitHub project for benchmarks used: https://github.com/anilmujagic/ServiceBenchmark


Also ServiceStack vs ASP.NET MVC vs NancyFX vs Fubu, Mono vs IIS/windows, default serializer vs JSON.NET vs ServiceStack's, etc. There are tons of variations that could be done.


ServiceStack vs WCF would probably be a more apt comparison, but I suppose there are people using ServiceStack for web apps as well.


haha, all those hipster developers using rails can now eat the php guys' shorts :)

Seriously though, this isn't news to anyone that does this professionally. The further up the abstraction curve you climb, the less performant the code will be. Ease of development vs run-time performance.


Exactly. And since human time is much more expensive than CPU time, very few people are writing their web apps in machine code.


What I'd love to see paired with this data is a cost comparison. At what scale does performance of ruby/python/php become cost prohibitive? Twitter made the move from Ruby to Java some years back, did they ever post a comparison of their numbers before and after?

Also, the difference between EC2 and local i7 hardware is glaringly obvious. At what scale does owning the server hardware become imperative?

I know these questions are beyond the scope of a performance review, but inquiring minds would like to know.


As you might imagine, during this exercise, we've had a lot of conversations about the points you raise. We have our own opinions, but we ultimately removed most of that content from the blog post because we didn't want it to be too editorial. We will be posting follow ups with some of our opinions.

Some things are really difficult to answer in a vacuum. If you already have a competent devops staff, hosting your own hardware is probably beneficial. The increased performance per "server" is substantial. But no devops staff? Then it's either very risky or cost-prohibitive to own hardware.


I too was shocked at the deltas between Amazon and dedicated hardware. I think AWS runs on Xen, so I wonder if you could strike a more optimal performance/flexibility balance using lighter virtualization (LXC or OpenVZ) with either in-house hardware or another VPS provider.


While it is true that EC2 runs on a customized version of Xen, it's very unlikely that latency is being introduced by the hypervisor itself. With paravirtualized kernel and drivers, Xen introduces negligible overhead, and thus LXC would likely be no better.

The reason that the virtualized setup performs more poorly than the dedicated setup is that you are fighting for CPU time with other AWS customers, so those other customers are introducing latency into your application. Any shared/virtualized host will have this problem.

Interestingly, where Netflix really wants to squeeze CPU performance out of EC2 instances, they allocate the largest instance type so that they know that there's nobody else on the underlying machine.


Regarding performance of Amazon, this post from 2009 is (still) very interesting:

http://uggedal.com/journal/vps-performance-comparison/

Amazon performance is surprisingly low, but also surprisingly consistent. For some use cases, consistency (knowing what you get you for what you pay) might be worth a considerable hit in performance.


Actually, we had an early revision of the benchmark tests that had exactly that analysis, but we really felt it is a very interesting subject that deserves more focus.


Since Amazon always takes a cut, it stands to reason that if you can fully load a server on a continuous basis, owning is cheaper than renting, regardless of how efficient EC2 is. You should only rent for the peak load, not the base load.


What would be much more interesting than "peak responses per second," which is a weird metric to begin with, is the actual histogram of sampled throughputs. Or at the very least a box-and-whisker plot (http://en.wikipedia.org/wiki/Box_plot).

Most folks who have run Rails at scale, for example, find that the untuned garbage collector in MRI (Ruby's default interpreter) introduces a large amount of variance.


We don't have a box plot, but we have line charts for all of the data (performance at multiple client concurrency levels). Just click the "All samples (line chart)" tab on the data panels.

(Note that the very first data panel in the intro is an image and doesn't have tabs.)


No love for Flask? I would have loved to see how it compares to Django and RoR.


I'd love to see how PHP 5.4 compares. In my own app, I saw a noticeable speedup and RAM usage per request dropped by half.


One glaring omission: DOS on Dope [1]

[1] http://dod.codeplex.com/


It would be nice to see raw Python like you have raw PHP. I would expect Django to perform very poorly unless you optimize its caching.


What was the parallelization like across these tests? Were they all running single thread/process mode or did you try to take advantage of threading/multiple-processes/etc. to optimize a production-like performance across all the different frameworks? If the latter, that's a really impressive amount of work!


The short answer is that we attempted to use the CPU as fully as possible across all frameworks. So for those that had tunable parallelism, we used the settings that seemed best given the number of cores available on each platform (2 for EC2 Large, 8 for i7).

We posted the deployment approach for each framework to the Github page.


After having a look at the code on Github, it looks like they did set up multiple processes/threads to really test the scalability of these platforms. That's really impressive, way beyond the single-threaded experiment I would have anticipated something like this being! Wow!


We attempted to take advantage of threading/multiple-processes as best we could (see the nodejs code for an example of using the cluster module). But we suspect there are additional areas of improvement here.


This is super awesome. I wish Yesod/Warp was available though :)


I've just sent a pull request[1] with a Yesod implementation.

It runs pretty well, scoring similar to webgo on the JSON pong benchmark, almost at the same level as Grails on the 1-query benchmark, and slightly faster than Play on the last benchmark.

So, Yesod is in the same performance band as Play or Grails, and is 3~4x faster than Django or Rails.

But I've tested those on a single-core VirtualBox VM, and I know Yesod scales pretty well in a multi-threaded environment.

Also, keep in mind that most of the top-performing frameworks are not fully featured web frameworks but asynchronous I/O libraries (netty, go, nodejs, vertx, ...) whose implementations just mindlessly write the raw response directly to the socket, whatever the HTTP request was.

[1] https://github.com/TechEmpower/FrameworkBenchmarks/pull/39


Thanks, Raphaelj! Pat (pfalls) will be in touch if he has any questions. Really appreciate the contribution!


What's the difference between "php" and "php-raw" in some of the data? Maybe I'm still in my morning fog, but having trouble thinking of what "php-raw" might mean. Sigh.


The code says php-raw refers to using PDO (which all projects should be using) vs using an ORM or ActiveRecord.

I am completely stunned by the performance cost of using ORM/AR, and will be using this to shame our team lead into giving it up and going for raw queries.


Don't be so hasty. As others pointed out, caching often plays a factor on larger projects. The degree of portability provided by ORMs is nice (test mode hits SQLite, prod is pg or mysql, etc). Also, the readability of ORM-oriented code shouldn't be overlooked, especially as new people come in to the team. Two or three expressive lines of ORM are easy to read; 15-20 lines of nested SQL hitting weird table names and aliased column names and such aren't easy to decipher the intent of (especially when there's a problem).

I typically use ORM for about 95% of a project, falling back to a few explicitly native SQL calls when performance can be shown to be a bottleneck in those locations.


Some of the ORM cost can be mitigated by using caching. In most cases this is essential in a production deployment.


Agreed. However, for our database tests we expressly wanted to stress-test the ORM and, where configurable, disabled caching.

We plan a subsequent test, time permitting, that enables caching.


Important to note that OP doesn't mention using APC with this, something that any production code would be using in a real world case. This would have an impact on the numbers he uses and on the diff we are seeing.


Gotcha. Thanks!


The best way to understand it is to look at the source:

php - https://github.com/TechEmpower/FrameworkBenchmarks/blob/mast...

php-raw - https://github.com/TechEmpower/FrameworkBenchmarks/blob/mast...

For "php", they used an ORM. For php-raw they used the stdlib (pdo).


Hi Chasing. We put a note about that suffix in the "Environment Details" section. The "raw" suffix means there is no ORM used. If there is no "raw" suffix, you can assume some type of ORM is used.


FYI on the PHP-DB-Raw approach: https://github.com/TechEmpower/FrameworkBenchmarks/blob/mast...

By default, PDO does not do true prepare()s; it just does string interpolation. You need to pass this parameter with the options:

    PDO::ATTR_EMULATE_PREPARES => false
And then you'll actually be using MySQL prepared statements. You'll see a noticeable performance improvement for large amounts of queries.


Thank you! We'll get this modified and re-tested.


Isn't it that Django and Rails have flat curves because they are by nature blocking? It wouldn't matter how many requests you throw at them, as they are limited in how much they can handle at once.

And on top of this, comparing database requests is meaningless, as the blocking nature of the framework itself is the major bottleneck, not the database.

Now, for a normal web application, the largest number of requests would come from static content, or cached content within the web app, so the real gain would be a tiny fraction between Python/Ruby and Go/Java-based frameworks.

That said, if you want to handle static content (images and such) from within your app, or build a JavaScript-centric application with lots of tiny requests, or even persistent ones... neither Rails nor Django would do.


I roughed up a flask benchmark but I haven't been able to test locally.

If you want to help, check out this pull request: https://github.com/TechEmpower/FrameworkBenchmarks/pull/14/f...


Awesome, thank you! We'll aim to get this incorporated into a follow up post soon.


A few additions/tweaks that would be great to see:

1) Add a raw HTTP test -- no template compilation or HTML, just return "OK". That would give a relative idea of the cost to "just turn the thing on".

2) Don't JSON-encode after the database tests. By JSON encoding, you're doing two things but saying you're only testing one.

I come from the Python/Django world, and I know that different Python JSON packages have orders-of-magnitude differences in their performance. From that I can infer that there are probably similar or greater cross-language differences -- I suspect Node.js having JSON as a native object notation helps immensely.


Why are there no Go results for the DB access benchmarks?


I'd be interested to know this as well. I can understand time constraints in compiling these kinds of comparisons, but as I'm about to embark on a web project in Go, I'd be interested to know if there were any technical constraints that prevented benchmarking Go.


We recognize that deficiency and we do aim to add database tests for Go. If you could quickly draft that up and submit a pull request, we'd love to add it.
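
To give potential contributors a head start, the shape we'd expect is roughly this (an untested sketch on my part: the go-sql-driver/mysql import and the DSN are assumptions, and World is the table from our schema):

    package main

    import (
        "database/sql"
        "encoding/json"
        "log"
        "math/rand"
        "net/http"

        _ "github.com/go-sql-driver/mysql" // driver choice is an assumption
    )

    type World struct {
        Id           int `json:"id"`
        RandomNumber int `json:"randomNumber"`
    }

    var db *sql.DB

    // dbHandler fetches one random row and serializes it, like the other db tests.
    func dbHandler(w http.ResponseWriter, r *http.Request) {
        var world World
        // database/sql pools connections; each query borrows one from the pool.
        err := db.QueryRow("SELECT id, randomNumber FROM World WHERE id = ?",
            rand.Intn(10000)+1).Scan(&world.Id, &world.RandomNumber)
        if err != nil {
            http.Error(w, err.Error(), http.StatusInternalServerError)
            return
        }
        w.Header().Set("Content-Type", "application/json")
        json.NewEncoder(w).Encode(&world)
    }

    func main() {
        var err error
        db, err = sql.Open("mysql", "benchmark:secret@tcp(localhost:3306)/hello_world")
        if err != nil {
            log.Fatal(err)
        }
        http.HandleFunc("/db", dbHandler)
        log.Fatal(http.ListenAndServe(":8080", nil))
    }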


Sorry, but as much as I'd love to contribute, I'm still learning how to write database apps in Go myself (that's actually part of the reason I wanted to see your tests). So I'd be the wrong person to write code for that particular test.


Very nice benchmark. As the developer of a toy "framework" (but a powerful HTTP server library), I wanted to check on my project. I did a quick blog entry at http://bit.ly/10l9Smj. I got ~58288.10 req/s for simple JSON, 8594.39 on the DB test using SQLite, and 503.23 for 20 requests.

Anyway for me its more important speed of development than performance on the server. Maybe my servers do not get that many visits.


Oh this is awesome! Very nice performance for a mobile CPU.

Pull request, maybe? :)


Would love to see how modern Perl frameworks (Mojolicious/Dancer) ranks in such a test.


Interesting thought. Would you be able to put together a test for one of those? pfalls can give you the details for how we expect the test for a new framework to work.


Catalyst too please.


[deleted]


No, they're not. They can run via CGI (so, yes, compiled for each request) but that's slow as ass. They run via Plack/PSGI - they can be deployed in a multitude of ways, including via FastCGI or running standalone using built-in webservers, or via Starman - the latter in particular is very fast indeed. With a simple "hello world" type app, > 6000 requests/sec can be easily handled. Obviously an app of more realistic complexity won't be quite that fast, but you'll still handle many requests every second with no trouble. http://stackoverflow.com/a/4770406/4040 contains some basic benchmarks.

EDIT: also, mod_perl is generally best avoided these days; it's old, not very pleasant to work with, and ties you to Apache. Writing an app with Dancer / Mojolicious etc means you can deploy in various different ways with ease.
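
For instance, running a PSGI app standalone under Starman is a one-liner (worker count and filename illustrative):

    # preforking PSGI server with 8 worker processes
    plackup -s Starman --workers 8 app.psgi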


Funny enough that SO link was the same one I got my misinformation from as the top answer pointed to CGI benchmarks.

Thanks for the correction though, but that still doesn't answer my original question: how does Plack compare with mod_perl?


Plack/PSGI (Perl) == WSGI (Python) == Rack (Ruby)

These are all abstraction layers (for each language) which then can be run seamlessly on top of CGI, SCGI, mod_(perl|python|ruby), etc.

- http://plackperl.org

- http://en.wikipedia.org/wiki/Plack_(software)

- http://en.wikipedia.org/wiki/PSGI

- http://en.wikipedia.org/wiki/Wsgi

- http://en.wikipedia.org/wiki/Rack_(web_server_interface)


"Oh, they're just frameworks for CGI/Perl and thus severely crippled by CGI's mode" - This isn't true.

Dancer uses PSGI (not CGI), and most modern Perl webapp frameworks (Mojo & Catalyst) use PSGI/Plack (http://plackperl.org/). The PSGI model is very similar to WSGI, which allows for different types of persistent servers.

Dancer's deployment POD is good reference on options: https://metacpan.org/module/Dancer::Deployment


Cake was an interesting choice for a PHP framework to test. I wonder why they didn't choose Symfony, which is arguably the leading PHP framework.


Symfony and Zend might perform even worse. At least Zend's bootstrapping is even heavier. The sheer number of files that have to be loaded from disk on each hit is appalling. Hence I mentioned in another comment the opcode cache (ex. APC) requirement and how every serious PHP app utilizes it.


Zend and Symfony are probably the leading OO MVC frameworks for PHP.

I'd also like to see a "light" framework meant for building APIs, like Slim, for instance.


Yes, I'd definitely like to see Slim tested as a light framework. People keep mentioning Silex as a "light" framework, but I've found it to be almost as heavy as Symfony2 (which isn't much of a surprise considering it's just Symfony2 components strapped together).


It was simply what we have seen most often in our experience. We've got Zend as a next target for PHP frameworks. We can also put Symfony on the list.

We'd also love to have anyone who has production experience with those to contribute a test for them. The test should be fairly quick to write.


I'm surprised to see that Sinatra on JRuby often performs worse than Sinatra on MRI, while Rack on the other hand performs much better.


We were surprised by the sinatra-jruby tests as well. If you read our "expected questions" section, it's not clear to us why Sinatra's performance on JRuby was weak. We'd love to hear from a JRuby expert about how to address Sinatra's "wrong" looking numbers.


Maybe the web-server choice (I didn't look to see what that was). But last I looked at it, the Sinatra router really is very naive/awful.

For the size of the framework, that may be the right choice ultimately.

On the other end of the spectrum is Rails, which has basically written its own regexp engine for routing. It seems to be generally quite a bit faster.

All major Ruby frameworks have just awful routers though really.


Having had a look at the Gemfiles, it might be the choice of HTTP server used. I believe it defaults to WEBrick if no other server is present, and I believe WEBrick is single-threaded.

I don't have any experience with JRuby, but this might be a possibility.


Yeah, the Sinatra numbers really ought to be more in line with the straight Rack numbers. Something's amiss.

Glad to see that on Rack, where I'd expect us to be fast...we are doing ok. On par with PHP (not a big thing to brag about, perhaps) and Play (worth bragging about, since it's Java/Scala end to end).

Obviously there's more work needed, not just in JRuby but in the servers that serve it and the frameworks that run on it. The slow performance of Rails here, for example, is largely Rails' fault.

But yeah...the sinatra numbers are wack.


It would be very interesting to see OpenResty (Nginx + Lua) in there, since it's so different from other approaches.


I would love to see OpenResty tested; for simple JSON and databases, OpenResty is a magnificent tool.


Agreed! Especially built with luajit.


I would love to contribute a code for a specific PHP framework: Silex. Do you have requirements you'd like to hit?


That would be fantastic. Get in touch with pfalls on Github and he can give you the information. It's fairly simple.


+1, this would be interesting given that there are a lot of PHP frameworks besides Cake, namely:

Silex/Symfony2

Yii

Zend

Kohana

Fuel

Laravel


I'd be happy to see benchmarks with PHP frameworks written as a C extension: 1. Yaf PHP http://www.yafdev.com/ , 2. PHP-ActiveRecord++ https://github.com/roeitell/php-activerecord-plusplus


Very interesting! But isn't there a disadvantage for node.js, considering it's single-threaded? Did you cluster it to use every core on the box? (http://stackoverflow.com/a/8685968/1909827 - "Scaling throughput on a webservice")


From the OP's response to a similar question:

>We attempted to take advantage of threading/multiple-processes as best we could (see the nodejs code for an example of using the cluster module). But we suspect there are additional areas of improvement here.


Thanks


Since a few people have asked about Haskell, here are some Haskell benchmarks in comparison: http://www.yesodweb.com/blog/2011/03/preliminary-warp-cross-...


Still, it would be nice to have these results independently verified and put alongside all the rest.

What do you think, bhauer? :)


Our to-do list is getting unwieldy. :) We would love to get this added, though.

Any chance a Haskell expert could create a test and submit it as a pull request?


It sure sounds like it is!

I'm a far cry from a Haskell expert unfortunately, so I'll have to leave it to someone else.


What I've also found is that Node (V8) tends to have much higher variance in response times when compared to Java Servlets (JVM): https://github.com/olegp/common-node#benchmarks


In the FAQ for the benchmarks, could you also add a bit about how you configured the JVM (OpenJDK?). Curious about heap settings, etc.? Did you just use the default install? e.g. http://planet.jboss.org/post/rhel_openjdk_performance_tuning http://www.jaspersoft.com/sunopenjdk-jvm-garbage-collection-...


>And let us simply draw the curtain of charity over the Cake PHP results.

Something about that phrase "Let us simply draw the curtain of charity..." resulted in an immediate spit-take. Coffee everywhere.


Thank you for such an informative comparison. You've helped shed light beyond the frameworks that are known to most of us - Ruby on Rails, Node.js, Django, Express, etc...

Thanks again! :)


In case it's not obvious, you can hide frameworks/platforms of no interest to you to narrow the view. For example, here's Node.js versus Express:

http://www.techempower.com/blog/2013/03/28/framework-benchma...

(Note the initial chart in the introduction section is an image and it won't be affected.)


Quickly taking a glance -- these benchmarks (like most benchmarks) seem like they might be highly misleading.

For instance, in the Express example code, they're sending JS objects rather than serializing them to raw data that the socket can just send. Instead, serialization/copying are happening on each request, which is a significant overhead.

I don't have domain knowledge of many of the others, but I suspect similar problems might exist with them.


For JSON serialization, this benchmark seems to indicate Netty is twice as fast as Golang. I don't think that's right; the Go benchmark code is not exactly equivalent to the Netty benchmark code.

The Netty code creates the ObjectMapper once and uses it for all requests, whereas the Go code creates the JSON encoder for every request (enc := json.NewEncoder(w)). Just getting rid of that would make this trivial code so much faster.
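
One wrinkle: a json.Encoder wraps the per-request ResponseWriter, so unlike Jackson's ObjectMapper it can't simply be hoisted out and shared. The nearest equivalent is marshaling to a byte slice and writing that. A quick sketch of the two variants (the Message type is hypothetical):

    package main

    import (
        "encoding/json"
        "net/http"
    )

    type Message struct {
        Message string `json:"message"`
    }

    // As benchmarked: a fresh Encoder is built on every request.
    func helloEncoder(w http.ResponseWriter, r *http.Request) {
        json.NewEncoder(w).Encode(&Message{"Hello, World!"})
    }

    // Alternative: marshal to bytes and write them out directly.
    func helloMarshal(w http.ResponseWriter, r *http.Request) {
        b, err := json.Marshal(&Message{"Hello, World!"})
        if err != nil {
            http.Error(w, err.Error(), http.StatusInternalServerError)
            return
        }
        w.Header().Set("Content-Type", "application/json")
        w.Write(b)
    }

    func main() {
        http.HandleFunc("/json", helloMarshal)
        http.ListenAndServe(":8080", nil)
    }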


Also, I don't think evaluating JSON based on a simple 'Hello World' is a valid test. In fact, 'hello world' should never be the standard for testing the speed of any process. If all your script does is output 'hello world' to the screen, why bother using a server-side programming language at all? You are better off just writing it in HTML; it's faster than using any server-side or client-side language, lol.


It just goes to show that your framework of choice is pretty unimportant. The developers (and their skills) who are actually developing on and configuring the framework outweigh the performance "baseline" by multiple orders of magnitude. From that I'd say that developer friendliness (easily approachable concepts) and sensible defaults are by far the most important things when choosing a framework.


These sorts of "benchmarks" are so subjective that they really don't provide any value. Looking at the framework lineup, a lot of the frameworks at the bottom have so many (tightly coupled) features (which also lead to bloat, like excessive middleware layers) that they're obviously not going to perform anywhere near as well as some of the barebones frameworks at the top. Neat post, but yeah, zero value.


Go's performance is great, and I'm hearing good things about it. Does anyone have any experience with the types of applications this language is meant for?


We need a project similar to TodoMVC [1] for benchmarking different languages and frameworks.

http://todomvc.com/


I think this benchmark is not that useful.

If you seriously thought that the difference between Netty and Django would be 4x, then you simply don't understand what these frameworks do in the first place.

I would have guessed something like less than 100x slower, and that would still have been fine, since the cost of all the kinds of latencies in the system usually far outweighs the speed of the framework itself.


These are some brutal results for CakePHP (which I use). However, in practice no intermediate developer would issue queries like that in a loop. They would loop through a set of data building an IN statement, for example: WHERE field IN ('x','y','z'), thus sending only a single query to the database. Still, the Cake developers really need to improve the speed of their framework.


I keep seeing numbers like this in benchmarks over the last few years, it would be great to see CakePHP's numbers "rehabilitated" using techniques like you discuss.


It's a fair way of testing, it's just not the way a developer would write it in practice. It clearly disregards the "Big O".


Send a pull request, then.


I would like to point out that there is no reason you would benchmark a raw SQL query in PHP, and then only use ActiveRecord on the ruby side. At the very least you could have used the Sequel gem to just build a MySQL query for the Sinatra app. ActiveRecord is going to give you the same performance failure that it did in PHP too.


Regarding PHP: well, isn't CakePHP by far the slowest major PHP framework? Why don't you test the other PHP frameworks? How about Yii, CodeIgniter, Lithium, and Symfony? I'd say the result is kinda biased with respect to PHP frameworks, since you picked the slowest PHP framework available instead of the fastest, or at least an average one.


I see a lot of people saying certain configs are missing/better/not tested. So what about a "crowdsourced benchmark" instead?

Maybe a few elaborate scenarios are posted and people can simply submit their best setup/config/code to be benchmarked. I imagine devs would improve on it over time and eventually the most optimized would surface?


We are doing precisely that: if there are any mistakes or misunderstandings about the production-level best practices, we are accepting pull requests on our github (set up with all these tests, in case you want to run them yourself) to fix them, rerun them, and then report our updated findings.


Not to nitpick, but that (and another popular one http://www.techempower.com/blog/2013/03/26/everything-about-...) was a clever way to get people to techempower.com :)

Useful comparison anyway. Seems Go struck a good balance.


So this means I should stop using CakePHP?


Why would you? If you write clean, maintainable code and are productive using it, there's no problem; there are tons of ways of tuning Cake (as with most frameworks). The default configuration is meant for 'bootstrapping'.

However, if you are interested, you can always check out other PHP frameworks (Yii, Slim), or, if you like all the bells and whistles from Cake and are willing to dive into another language, you can always experiment with Ruby (RoR, Sinatra), Node (Express), and Python (Flask).


Not really. Take this benchmark with a grain of salt unless they benchmark it again with APC at minimum. Nobody serious about a web app would run a PHP app without APC.

Similarly, relatively modern deployment standards (not really cutting-edge) like nginx and php-fpm OR apache 2.4 with worker mpm + php-fpm should be added to the mix.


Asadkn, we'll try to get things revised based on your feedback. Admittedly, we have some learning to do with tuning production PHP environments.


I posted a note above that might help:

apc.enabled=1 turns on opcode caching (when php-apc is installed), and apc.stat=0 turns off "stat" checks - this means that once a file is opcode-cached, PHP won't even have to touch the file on disk to execute it. The I/O gains from this, as well as the execution gains from not having to parse the file, should help quite a bit.
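
Concretely, that's two lines of php.ini once the php-apc extension is installed:

    ; php.ini, with the php-apc extension installed
    apc.enabled=1   ; opcode cache on
    apc.stat=0      ; never stat() source files once cached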


If you're serving only json snippets from memory and might have hundreds of simultaneous users, yes!

If you have existing sites built with it which work fine, then this benchmark doesn't tell you a great deal other than that PHP is relatively slow, and frameworks built upon it slower. For many websites which have a caching strategy sitting behind a server like Apache or nginx and fewer than a few thousand users a day, this really doesn't matter, and other things like features are more important. This holds even for bigger sites too - Facebook for example still runs PHP (compiled to one big binary). Personally I wouldn't use PHP because of the language/std lib, but this sort of performance shoot-out should not put you off it.


Obviously, the data should not steer you towards rewriting applications using a different framework. Really, it should just show as another metric to help deciding which framework is best suited for your needs. If you plan on writing up a blog application for yourself and you might get 100 concurrent users, then you could use CakePHP for the sake of familiarity. Then again, if you are trying to write the next Twitter and have the requirement of needing to be able to service 20,000 concurrent requests, then I would say "yes, this means you should stop using CakePHP."


So this means I should stop using CakePHP

I always strongly advise not using CakePHP, purely because its model system is so broken, it makes my teeth itch.


How is it broken?


How is it broken?

All data is returned as an associative array, so there is no way to call something like:

  $cows = $this->Cows->findByStatus(COW_STATUS_NOT_MOOING);
  foreach($cows as $cow)
  {
     $cow->moo();
  }
But that's pretty minor.

The big issue is: all data is returned as an associative array! There is no lazy-loading of the data as you e.g. seek through it, which means it's very easy for people to pass around huge arrays with thousands upon thousands of rows and associations without even thinking about it.

/rant over


I don't use Cake, so don't read this as a defence of Cake.

Lazy loading seems to me a kludge trying to fix poor SQL querying. If you can filter at the app layer, why not pass the information down and filter at the DB layer? You'll be transferring less data and may even read less data from disk (with proper indexes)


Lazy loading seems to me a kludge trying to fix poor SQL querying.

I very much agree with you, lazy loading should not be used as substitute for only querying and/or filtering the data you actually need.

However, there's always going to be a minimum; eventually, the data needs to be processed or displayed! Why load all of it immediately into memory (and potentially pass that around), rather than when it's actually needed?

I think of it a bit like using pointers; passing around the location of the data, rather than the data itself, makes life good for all concerned.

I don't use Cake, so don't read this as a defence of Cake.

Thanks for the disclaimer, as I'm guessing you've experienced before, it's all too easy to get into accidental flamewars about this stuff :)


If you care about performance, the thinner the framework, the better.


It doesn't matter if you compile it, like Facebook did with their PHP at one point when they needed to scale.


To some extent, yes, but the bloated-framework issue will remain even if you compile to C++, despite the compiler's best-effort optimization.


Some of the choices are unfair in an apples/oranges sort of way: They are testing Rails and Sinatra against Java servlets, but the Ruby equivalent of servlets is not a high-level framework like Rails, but Rack. (Also, the Ruby tests use Passenger, which is probably not a great choice for performance.)


Probably not a major factor but I'm curious to know why mongoose was thrown into the mix for the node.js test rather than going with the native mongo driver.

It might be more of a real world test to include mongoose with the node+express test, but for the node-only test the native driver might be more appropriate.


We wanted to use an ORM in all cases; it was only recently that we started working on native MySQL access. We hope to add native MongoDB results as well.


The WSGI benchmark uses gunicorn, which is fine. But if you are surprised by the so-so performance, know that the worker class can be changed via the -k flag (-k gevent), which may improve performance. Of course, other WSGI servers, like uWSGI, are also available.
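
For reference, switching workers is a one-flag change on the gunicorn command line (the module path is hypothetical):

    # default synchronous workers
    gunicorn -w 8 hello.wsgi:application

    # gevent-based workers, same app
    gunicorn -k gevent -w 8 hello.wsgi:application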


Are they not even using gevent? Someone please get these guys off the internet. They are worse than some personal blogs at everything except the comprehensiveness of their low-quality testing.


Nothing for Flask?


Yeah, I was disappointed to see a lack of Flask as well, especially since it advertises itself as a microframework and all.


We received a Github pull request with a Flask test! So we'll aim to include that in a follow up post soon.


It's interesting to note the time per request when you invert these. Even some of the most drastic differences (like Spring vs. Django) turn out to be the difference of a millisecond or two. That's pocket change in terms of user experience.


It is a good and interesting observation, but as the benchmarks approach real-world scenarios, the difference is more than a millisecond or two.

For instance, if you're running on EC2 hardware (which is common enough) and you're executing ~20 DB queries/request (which is probably, unfortunately, common enough), the difference between Java servlets and Rails is more like 10ms.

Then what happens when more than 89 real-world users start to hit your Rails server each second?

(Note: I really like Ruby and Rails. Much more so than Java and its offerings.)


How much of the difference is due to the random number generator calls? They aren't free or equal cost across platforms by any stretch. I really have a hard time trusting these numbers with how casually calls to RNGs are bandied about...


For anyone else curious about how Vert.x stacks up against the famous Node.js:

http://vertxproject.wordpress.com/2012/05/09/vert-x-vs-node-...


You haven't read the discussion, right?


I posted this way before the discussion happened.


How much does 'asynchronous' matter for the performance of a web framework? I mean, Netty and node.js are both async frameworks as I understand it, but the source code for Go and Compojure seems to use a thread/process-per-request model?


I kinda miss ASP.NET Web API (okay, it's on Windows, but it's also a framework :))


Why is the Play benchmark running on Resin? Shouldn't it run directly on top of Netty?


Play is not running on Resin. Did we accidentally say that somewhere? If so, we'll correct it.


Here: https://github.com/TechEmpower/FrameworkBenchmarks/tree/mast...

The tests were run with:

  Java OpenJDK 1.7.0_09
  Resin 4.0.34
  Play 2.1.0


Whoops! Fixing that up presently.

Thanks for the catch.


I'd be curious to see how Ruby fares using a setup other than Apache + Passenger; I have a feeling that nginx with Unicorn or Puma would provide better results, and I might give it a go when I have a bit more time.


Hi, great work! I only have two suggestions:

a) Please publish not only the requests per second, but also the memory and CPU usage of the machine for each framework.

b) For Java systems, can you publish the heap configuration of the JVM?

Cheers!


Wow, these results and the performance of Go are making me consider rewriting my Tornado app again. I had put off doing it in order to build more stuff into the Python version. Maybe I'll reprioritize.


If I wanted to return a bunch of _ids from a mongo database, I would do a _id: {$in: [array of ObjectIDs]} and then stream the resulting cursor to res. Streams are core to node. You should use them.
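Roughly like this with the native driver of that era (a sketch; collection is assumed to be an open collection handle and ids an array of ObjectIDs):

  // stream matching docs straight to the response instead of buffering them
  var stream = collection.find({ _id: { $in: ids } }).stream();
  res.writeHead(200, { 'Content-Type': 'application/json' });
  stream.on('data', function (doc) { res.write(JSON.stringify(doc) + '\n'); });
  stream.on('error', function () { res.end(); });
  stream.on('end', function () { res.end(); });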


The JVM can use many cores at the same time; node/rails use only one core. The test is total nonsense.

You should show us the CPU usage. If you had done so, you would have seen how absurd a mistake you had made.


The raw JSON output merely tells you about the performance of the language itself. Nice, but not that interesting if you need to get the job done and don't happen to know 10+ languages/frameworks.


I haven't looked at this in detail yet, but it should be noted that we haven't even optimised Vert.x yet, so there should be plenty of scope for further improvement :)

(Disclaimer: I'm the Vert.x project lead)


I guess it is a good thing we are finally migrating away from Cake :/


I wonder how a CGI binary run from xinetd would perform?


Java (JVM) > JavaScript (Node.JS) > Ruby-Rails


If all you're doing is trying to serve JSON as fast as possible.


I still believe launching a usable web application next week is preferable to launching a really fast web application 3 months from now, if ever.


That's a straw man. Competent programmers should be able to write fast code quickly.


Different languages and platforms yield very different productivity levels.


I am 99% sure that you haven't used contain => false for CakePHP; without it, all related tables are loaded to serve the related data.
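Something along these lines, assuming CakePHP 2.x with the Containable behavior attached (model and constant names borrowed from the rant above, otherwise hypothetical):

  // fetch only the Cows rows; skip all associated models
  $cows = $this->Cows->find('all', array(
      'conditions' => array('Cows.status' => COW_STATUS_NOT_MOOING),
      'contain' => false,  // or 'recursive' => -1 without the behavior
  ));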


Can't believe you didn't include Zend in there :(


Just read this above:

"We've got Zend as a next target for PHP frameworks."


Thanks. We are reconsidering our stack and are in the middle of evaluating; I am sure this post is going to be of great help.


You're using PDO in the PHP raw test - use the direct mysqli_* API; it's a lot faster than the PDO abstraction layer.
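A rough sketch of what that would look like (connection parameters and query are hypothetical, modeled on the benchmark's World table):

  // procedural mysqli instead of the PDO abstraction layer
  $link = mysqli_connect('localhost', 'user', 'pass', 'hello_world');
  $result = mysqli_query($link, 'SELECT id, randomNumber FROM World WHERE id = ' . (int)$id);
  echo json_encode(mysqli_fetch_assoc($result));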


Wow, this benchmark is like comparing apples, oranges, mangos and cherries. Just because they are all fruit.


It's fine as long as people take it for what it is. I still find the fruit salad interesting :-)


Very useful info. Thanks for sharing.


Was Play 1 or Play 2 tested? I thought Play ran on Netty, so I'm surprised it didn't fare better.


They're using Play 2.1 with Java, without using futures for database queries. Not sure how representative of real usage that would be, though.


We've got a bunch of feedback that the Play database test needs to be asynchronous, so we'll make that a priority for our next run. Perhaps I can entice you to submit a pull request to improve the Play configuration?


But the JSON test was 7X slower too.


Whoa... PHP is faster than Ruby in a lot of results? I wonder why that is the case?


Wasn't it agreed a long time ago that benchmarking against a VM was a bad idea?


It is a great idea when your production environment is a VM.


Possibly, although we believe it represents precisely what we wanted to test: a realistic production environment.

However, we also tested on our physical i7 hardware. Did you happen to scroll down? :)


I did, but didn't see a pretty graph for it. Full production results are interesting, but benchmarks should test a single variable. A cloud VM adds several to the mix.


This is awesome. Thank you.


node.js is single-threaded and an EC2 large instance has 4 CPUs. So to be fair you must either:

a) bench on a single-core processor (a small EC2 instance), or

b) configure node.js as a cluster with as many instances as processors (see the sketch below).
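Option b) is only a few lines with the core cluster module; a minimal sketch:

  // fork one worker per CPU; the workers share the listening socket
  var cluster = require('cluster');
  var http = require('http');
  var numCPUs = require('os').cpus().length;

  if (cluster.isMaster) {
    for (var i = 0; i < numCPUs; i++) cluster.fork();
  } else {
    http.createServer(function (req, res) {
      res.writeHead(200, { 'Content-Type': 'application/json' });
      res.end(JSON.stringify({ message: 'Hello, World!' }));
    }).listen(8000);
  }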


Needs more NancyFX + Mono! Nice post tho, very interesting.


Could you kindly add error bars to your test results?


I would love to see Dropwizard in these benchmarks!


What about C++ Web frameworks?


Don't think we didn't consider it! :) If you are willing to craft a comparable test for a C++ framework, we'd definitely like to get one into the next set of numbers.

In particular, we don't feel confident enough in our C++ chops to do a C++ framework justice.


...Flask?


I'm not a PHP nazi, but you can't compare PHP and a framework.


You guys should feel bad about generating such awful benchmarks.

How the hell can you compare accessing a MySQL database to accessing a MongoDB database?

It's like comparing apples to piles of poop.

Also, when you're testing things like Django in web requests, you're testing gunicorn, not Django.



