Web Framework Benchmarks Round 7 (techempower.com)
227 points by mangeletti on Oct 31, 2013 | 181 comments




Summary: if you want performance, use Java, Scala, Go, Clojure, Lua, or C++.

Honestly, now with all the great Scala frameworks, Clojure, and the ability to run Rails, plus Cassandra, Storm, etc, I'm a little creeped out that I'm actually strongly considering building my current new project completely on the JVM.


I'm a fan of Go and an even bigger fan of Lua, and I'm not sure I agree with your summary. Go and OpenResty can push 300,000 "Hello World" plaintext responses a second, and some Java stuff can do 2x that, so JVM frameworks are strictly better? And something called cpoll_cppsp trumps a sane language choice? Don't be cute. :)

It is perfectly fine to not squeeze every drop from your hardware and pick Go or Lua. In the same vein, Flask may be 20x slower at a meaningless benchmark, but you'd note that for the benchmarks involving actual work, you know, the kind your complex app will actually perform, the more involved the test is, the more the difference shrinks. All the way down to 4x, in other words not an order of magnitude.

The real takeaway summary should probably be: a well-made framework like Flask doesn't carry so much overhead that its productivity gains stop being worth trading a bit of performance for.

A terribly made framework (I won't name names) will bite you in the ass. Choose carefully.


I think the idea of a "meaningless benchmark" has become a dogma. Can you possibly know all the complexities that an app will contain when benchmarking a framework? No, but that doesn't make baseline benchmarks useless. Part of the reason for including the most basic (CRUD-ish) operations as separate benchmarks in their tests is so that you can make comparisons between basic "Hello, World" operations and more complex operations, which gives you a bit more insight into the overhead of each framework.

I'll take that any day, over guessing which framework is fastest.


Agreed, I should have avoided that particular phrasing.


Actually, it's the other way around. In 'JSON encoding' tornado and flask-pypy are 4x slower than compojure, for example, but in 'Multiple queries' they are 14x slower.


Just for the record, my comment didn't intend to imply that JVM frameworks are strictly better in all cases, and I don't believe that at all. In fact, I'd argue that for the majority of applications, choosing any modern framework purely for benchmark performance is very foolish since all of them perform more than adequately.

Rails, for example, despite always having been "slow," performs far more than adequately for the vast majority of web applications, scaling patterns are well established, and it's obviously hugely productive for many, many developers and companies.


> A terribly made framework (I won't name names) will bite you in the ass. Choose carefully.

Facebook and Twitter seem to be doing alright. Care to elaborate?


Well, it really depends on the application type. We are a Java and Ruby on Rails software house (http://codedose.com), and we also have experience working for investment banks. For some cases Java with Spring or Play is the best way to go, especially for apps with a complex back-end.

We have recently released an online market for physical gold trading capable of handling 10k+ concurrent users with horizontally scalable architecture and complex trading engine/accounting logic, and it's all Java. It has to be fast and there is not enough static or almost-static content to make caching effective.


Whenever I feel like that, a few hours with either the Java or Scala persistence frameworks rapidly cures me of it.


This. In my last project I tried to use Scalatra with Slick, and while I really liked Scalatra, Slick made me go nuts. I had to jump through so many hoops that it was just a pain.

I'm currently working on a new project (still research phase) where I was pondering going with Spray and a lightweight PostgreSQL wrapper, because I primarily need to read data from a database, do some transformations on it, and write it out as JSON as fast as possible. I had it working in Spray, but I had issues with server crashes and the speed wasn't what I'd expected. I fiddled with it for one day and then, out of frustration, decided to give OpenResty a try.

I've never written much Lua in my life, but after only a couple of hours I had it working, and it was far, far faster than the Spray implementation. I did some research, and it seems that the database work took a whole lot longer in Scala/Spray than in Lua. Now, of course, I lose type safety, so there may be hidden issues in there, but since I'm really just doing simple data transformations, I think I'm fine with Lua/OpenResty.

As a first verdict, I really, really like what I've seen of OpenResty so far.


Honestly, JPA is not that bad.


mybatis.


Too bad memory is not measured/displayed as well. For many cloud-hosted servers, you will end up paying for memory, rather than CPU.

And Java/Scala are disqualified pretty quickly.


Java and Scala would have large base memory requirements, but not necessarily large per-concurrent-request memory requirements.


Or PHP, as you're most likely hitting some kind of DB / cache and not just serving plaintext or cached JSON out of your backend... backs away slowly


Actually, I was quite surprised that the multiple-queries benchmark showed more than one entry for PHP in the top 10. Take-away: if you don't do much sophisticated data processing, then PHP is an OK choice. gets ready to run


That's because PHP is a really thin wrapper around the underlying C libraries. The thing that makes it ugly is what makes it fast. PHP is still interpreted and not JITed, even with the opcode cache used in this benchmark, so it loses out in anything that exercises the PHP-code side of the equation, which is why the big PHP frameworks (Symfony, Cake, and Laravel) are 50 times slower than raw PHP. I wonder how Facebook's HHVM engine would stack up, as it is a JITing PHP engine and supposedly an order of magnitude faster.


That's my understanding of the situation as well.

As a related aside, we have an intention to eventually capture some additional statistics about the implementations such as source lines of code, total number of commits, and possibly lines of code of libraries (where available).

These are just additional data points to use as the reader sees fit, but they may provide some insights. For example: (a) developer efficiency, assuming you are comfortable using sloc as a proxy for developer efficiency, which is admittedly a hotly debated matter; (b) commits are a proxy for the level of attention the particular test implementation has received in our project, since a test with only 1 commit may be unrefined and in need of tuning and a test with dozens of commits may be considered highly volatile; (c) where applicable, the performance impact of additional code, perhaps even as granular as a calculated average per-line cost in rps. The last point might be particularly illuminating for languages such as PHP.
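To make (c) concrete, the calculation would be something like this hypothetical Python sketch (made-up numbers; we don't compute this today):

    # hypothetical: average rps "lost" per extra line of code, comparing a
    # framework implementation of a test to the raw implementation of the same test
    def per_line_cost(raw_rps, framework_rps, raw_sloc, framework_sloc):
        extra_lines = max(framework_sloc - raw_sloc, 1)
        return (raw_rps - framework_rps) / float(extra_lines)

    print(per_line_cost(180000, 9000, 20, 400))  # ~450 rps per added line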


http://blog.liip.ch/archive/2013/10/29/hhvm-and-symfony2.htm...

Doesn't look to be "an order of magnitude faster" - 200% faster in some cases - certainly nice numbers, but not a massive game changer for many (yet?).


php-raw performed quite well.


As soon as you throw a php framework on top though, it sinks to the bottom of the benchmarks.


I'd run.


If you ignore Erlang and run your database on a ramdisk.

Sorry, round 7 isn't helping much. Wait for round 8.


or Haskell :)


[deleted]


Here is the environment info:

"As of Round 7, three Intel Sandy Bridge Core i7-2600K workstations with 8 GB memory each (early 2011 vintage) for the i7 tests; database server equipped with Samsung 840 Pro SSD Two Amazon EC2 m1.large instances for the EC2 tests Switched gigabit Ethernet"

So 8GB workstations. Even assuming full memory usage, the JVM, Go and C++ frameworks destroyed all comers.


Just for reference, the deleted comment used to say:

Raw performance is nothing without memory usage information.


That said - memory is dirt cheap


There are still practical limits, and sometimes you have to work within those. I've got a project on a 16G server - the data center doesn't have anything bigger. Moving everything to a custom server or different data center to get to 24G or 32G is... a lot of work. Some time spent finding memory reductions is still worth it.

And what if we went to 32G or 64G, but still needed more? "RAM is cheap" doesn't scale. The hardware needed to host a 128G system (or multiple 16/32G systems) isn't cheap.


Either you're wrong about the cost or we have a different definition of "cheap". The sweet spot for RAM pricing right now seems to be 16GB DIMMs, which cost $150 to $250. I can go to dell.com and configure a R320 with a Xeon CPU and 96GB of RAM for $2,500. That's less than a high-end Macbook Pro! A R420 with two Xeon CPUs and 192GB of RAM costs $4,300.

I recently upgraded two Dell Poweredge 2950s from 16 to 32GB of RAM. The cost per server was $300. If I'd needed to, I could have upgraded them to 64GB. That's on a five year old server that sells refurbished for $800.


Not really. On a desktop it is; on Amazon it's very expensive.


So don't use EC2 with your java apps.


I was surprised not to see Revel included in the round 7 results. You can filter results by it but nothing is shown.

This is significant because, on round 6, Revel was performing better than raw Go in at least some of the test scenarios, which is not something you can say about many frameworks.

Was there ever an explanation why Go without any frameworks was performing worse than Revel? Simply because not much effort had been put into optimizing the Go code?


I was also confused that when I switched to EC2 instances I didn't see ANY Go.


Good question, I would like to know this as well.


I have the dubious honor of the slowest framework in the test on page #1! For some reason the i7 tests perform really badly for Phreeze, whereas in the EC2 tests it performs near the top of all the PHP frameworks.

I've been too busy with work to look at the tests but my goal is to make a proper showing in round #8!


Thanks, jakejake. Your participation is always appreciated, especially your good spirit.

Yes, if you have time, please do help us resolve the issue (and I am guessing it is some configuration issue) for Round 8.


Slightly glad to see ASP.NET improved a little (at least the stripped/raw/barebones version).

Slightly sad to see the Mono performance is still abysmal.

Edit: link to the blog. It was submitted earlier, but apparently got killed:

http://www.techempower.com/blog/2013/10/31/framework-benchma...


The .NET benchmarks, the HTTP listener one in particular, use thread-pool tuning that actually hurts performance.

In my own testing there was another ~60% performance to gain with better tweaks.


I implemented the threadpool tuning originally first for the ASP.NET tests (https://github.com/TechEmpower/FrameworkBenchmarks/issues/39...) because it improved performance 5-10%. If you've seen data showing that it actually hurts performance, please open an issue or send a pull request (especially for the additional ~60% performance gain). Thanks.


ASP.NET's performance can be improved further, but that requires application-specific tuning. That said, it's pretty damn good for a full framework.

I'm impressed by the Java options. Definitely something to consider if absolute performance is the requirement.


I agree - I wonder if the Entity Framework results could be improved by using Compiled Queries.


Performance should improve massively with the use of async.


There were a variety of experiments with async and I don't think there has yet been any data to show that it improved performance of this benchmark.

[0] https://github.com/TechEmpower/FrameworkBenchmarks/pull/272#... [1] https://github.com/TechEmpower/FrameworkBenchmarks/issues/31... [2] https://github.com/TechEmpower/FrameworkBenchmarks/pull/339#...


Benchmarks aside, I'm really impressed with the whole One ASP.NET thing. Seeing MVC, Web API, and SignalR all coming together now is making me finally feel excited about developing on the MS stack again.


Maybe there is one and I'm just missing it, but I'd love to see an OWIN-based benchmark in that list.


I'm not familiar with OWIN. However, a GitHub user named @damianh is working on an OWIN implementation. Perhaps reach out to him and offer to help? :)

[1] https://github.com/damianh/FrameworkBenchmarks/blob/nowin/no...


I'd like to see ASP.NET MVC natively on Windows, as well as ServiceStack. I think they'd compete. Mono is just not a thing for those of us in the .NET world, frankly.


We are running several .NET tests on Windows. Some are using SQL Server as their database, too. See: http://www.techempower.com/benchmarks/#section=data-r7&hw=i7...


http://www.techempower.com/benchmarks/#section=data-r7&hw=i7...

If you look at the specific filters I've set, PHP actually outperforms Go in many cases. That's actually nice to see (PHP gets better with each release). Well, if there were something like CoffeeScript, but for PHP, then there's no better time than now to make it.

I really love the simplicity of Ruby, the performance of Golang/JVM, and the massive popularity of PHP and its compatibility with budget hardware (shared hosting, etc). If there was a "Write your code in Ruby and we'll compile it to PHP" kind of generator/converter/service, that would be so awesome :D

(If anyone knows something like that that already exists, then I would be extremely thankful).


This is interesting, but why is there such a huge gap between raw PHP and most of the PHP frameworks?

The one PHP framework that seems to do well is Yaf. However, Yaf is actually written in C and is a PHP extension. I would expect it to do better than all the PHP frameworks, but that doesn't account for the huge gap between raw PHP and the other frameworks.


Assuming that the PHP frameworks are not just all terribly written (which seems unlikely), what this shows us is that the PHP interpreter is very slow, but the libraries that PHP uses are fast. The raw PHP benchmark doesn't actually have very much PHP code, so the slowness of the interpreter doesn't matter much, while the PHP frameworks (other than Yaf) all have a great deal of PHP code.


Yeah, me neither; I don't understand why the gap is so huge. Would be interesting to see some technical background on this, though.


haxe might be similar to what you're asking for?


Wow, thanks a lot Martin, Haxe looks very very promising :)


From the "JSON serialization" requirements:

> For each request, an object mapping the key message to Hello, World! must be instantiated.

Why mandate how the benchmark is implemented? What if some frameworks could be faster by not instantiating an object? For maximum speed, you'd want to stream-process the input into output while materializing as few intermediate objects as possible.

Benchmarks should be defined in terms of their observable inputs/outputs. Otherwise it's like defining a car's 0-60 measurement with a rule that says "the car must then take gas from its tank and inject it into the engine." And then a Tesla could only compete by adding a gas engine that it doesn't need or want.


I want the test to require the work of creating an object or allocating the memory for the response and then use of a JSON serializer.

These tests are intended to be stand-ins for realistic application behaviors. They are not realistic applications, but they should make use of the same functionality that a real application would. I want the tests to establish realistic high-water marks for the types of operations they exercise.

Running a single query per request? The high-water mark you might expect is what you see as our single-query test. It would be very difficult to make a single-query operation in a real application more trivial than what we have.

If I treated all tests as a black box, none of the database tests would actually query the database; the fortunes test would not add an ephemeral object and re-sort on every request; and so on. Everything would just send static responses back.

Bottom line: nothing beats benchmarking your own application. But the tests in this project are intended to give a first-pass filter of sorts, assuming performance is on your list of requirements.


> If I treated all tests as a black box, none of the database tests would actually query the database; the fortunes test would not add an ephemeral object and re-sort on every request; and so on. Everything would just send static responses back.

A better way to avoid this would be to make the output depend on the input. Require that the reply include some JSON that contains portions of the request.

Likewise for databases, require that the reply contain contents of a previous request. If you want to, make the request contain an ID and make the reply return the data keyed under that ID.

If you want to require that the data is persisted durably, require that the benchmark continue to function even if all processes are killed and restarted in the middle of a run.

What really matters about a system are its inputs and outputs. If an innovative system comes along that works differently inside, it shouldn't be disqualified from a benchmark just for being innovative.
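
For illustration, a minimal Flask-style sketch of such an echo test (hypothetical route and field names, just to show the shape of the idea):

    from flask import Flask, request, jsonify

    app = Flask(__name__)

    @app.route("/echo", methods=["POST"])
    def echo():
        # the reply embeds data from the request, so a pre-encoded
        # static response can never be correct
        payload = request.get_json(force=True)
        return jsonify(message="Hello, %s!" % payload.get("name", "World"))

    if __name__ == "__main__":
        app.run()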


That wouldn't work sufficiently well; unless the benchmark is quite complex, it would almost always be better to implement an in-memory custom datastore rather than use a traditional backend. Even if you require persistence, the intentionally small dataset means you can get away quite well with raw memory dumps or even an mmap variant. You wouldn't use a full-fledged JSON serializer but a limited string concatenation.

In any case, the way I read it, the requirement doesn't mean you have to make anything particularly heavy, just that you need to use a representation that's general enough to be usable in other scenarios without rewriting the entire app.

Of course, a little bit of extra encouragement in the form of a benchmark that's slightly harder to game wouldn't be bad.


Custom datastores (custom at the framework level, not necessarily at the application level) have been very useful for me in the past for getting high performance.

For example, I wrote a little in-memory DB for use in a .NET framework to avoid having to deserialize an object graph per request. Instead, it navigated a byte array and plucked out just the data needed.
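
Something along these lines, sketched here in Python with a hypothetical fixed-width record layout:

    import struct

    # hypothetical layout: each record is (id: uint32, price_cents: uint32)
    RECORD = struct.Struct('<II')

    def lookup_price(buf, index):
        # read one field straight out of a shared bytes buffer instead of
        # deserializing the whole object graph per request
        _id, price_cents = RECORD.unpack_from(buf, index * RECORD.size)
        return price_cents

    buf = RECORD.pack(1, 499) + RECORD.pack(2, 1250)
    print(lookup_price(buf, 1))  # 1250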

Another time, I wrote a compact in-memory representation of a trie, using bit-packing and a few other tricks, to get an order of magnitude reduction in memory usage, making it possible to cache a lot more of a data set.

Why go to the database and instantiate a classic object if you can avoid it?


> Why mandate how the benchmark is implemented?

Otherwise you won't be able to compare the results.

For example, the Benchmarks Game [1] specifies how the algorithms in those benchmarks should work. For some of the benchmarks, there are faster algorithms. However, it's not about comparing algorithms, it's about comparing language implementations.

[1] http://benchmarksgame.alioth.debian.org/


You're going to stream a Hello, World value? Why use JSON, at all, when we can just print the resulting JSON text! Benchmarks should be defined, that's all.


> Why use JSON, at all, when we can just print the resulting JSON text!

Indeed. But if you want to actually force the server to encode some JSON, then make the reply include portions of the request. Then you don't have to have artificial rules about how it is implemented to get interesting/meaningful results: https://news.ycombinator.com/item?id=6650499


No, you still need to, because you can otherwise do a trivial string concat in most cases, even though in reality that would be most unwise for security, robustness, and maintainability reasons.


Since you don't know the inputs ahead of time, the benchmark would be incorrect if it could not handle the corner cases like special characters that need to be escaped in JSON strings.
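
For example, compare Python's json module with naive concatenation:

    import json

    msg = 'She said "hi"\nand left'
    print(json.dumps({"message": msg}))      # quotes and newline properly escaped
    print('{"message": "' + msg + '"}')      # naive concatenation: invalid JSON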

If you wanted you could also make the request specify the key at which a value must be written.

Be creative. :) But don't think that apps always have to be written the way they are now to be "legitimate."


For JSON serialization, I'd assume it's because they're testing object serialization.


Right. Related to this, I feel like a couple of frameworks are cheating on the plaintext test because they aren't actually testing string... serialization. They encode the "Hello, World!" string into bytes once, retain those bytes in memory, and send those bytes on each request just to avoid encoding the string again.

It's such a small, silly optimization. It doesn't bother me because it's not going to affect the overall results much, but I do feel it's against the spirit of the test. That kind of thing would be prevented if we made the test echo a request parameter, as was suggested elsewhere in the comments.
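
The shortcut looks roughly like this (a generic WSGI-style sketch, not code taken from any particular framework):

    # encode the body once at import time; every request reuses the same bytes,
    # so the string is never actually serialized per request
    HELLO = "Hello, World!".encode("utf-8")

    def app(environ, start_response):
        start_response("200 OK", [("Content-Type", "text/plain"),
                                  ("Content-Length", str(len(HELLO)))])
        return [HELLO]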


Clients of a system don't know or care if the server actually constructed an object and serialized it. They care that they get the reply they expect, and do not care how it was constructed.


Right, but then you're just testing how fast a framework can vomit out a small string and some headers.

I'm not saying it's an especially useful benchmark, but it's intended to test something besides raw response speed.



I wholeheartedly agree. But like you said:

> I'm not saying it's an especially useful benchmark, but it's intended to test something besides raw response speed.


In an era when VMs are bought by RAM size, this benchmark still does not show RAM usage, or performance at a specific RAM limit. In such a benchmark the JVM gets a pass on its weakness. :)


We do have a fairly long-standing issue to add collection of resource statistics while tests are running [1]. It has not yet been implemented, however. I do want to get that implemented sometime soon because I too would like to see CPU utilization (some frameworks do not successfully saturate all of the cores!), memory utilization, and IO utilization.

[1] https://github.com/TechEmpower/FrameworkBenchmarks/issues/10...


One more thing: I love the benchmark! I love to see every new issue...


I'm rather unimpressed by the implementations of the Python benchmarks, and surprised there isn't a single mention of gevent - that alone would have sped up, say, Bottle by nearly an order of magnitude.

We use uWSGI + Bottle + gevent for REST APIs, and a relatively humble box can flood a gigabit Ethernet link with _useful_ replies. And, of course, any decent implementation will cache computed results, etc. (but toss Varnish in - which we haven't, yet, since there's no need - and you'll outperform anything else for cacheable replies).
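
For reference, a minimal version of that kind of setup looks something like this (standalone gevent server shown instead of uWSGI, just to keep the sketch self-contained):

    from gevent import monkey; monkey.patch_all()  # patch before other imports
    import bottle

    app = bottle.Bottle()

    @app.get('/json')
    def hello():
        return {"message": "Hello, World!"}  # Bottle serializes dicts to JSON

    if __name__ == '__main__':
        bottle.run(app, server='gevent', host='0.0.0.0', port=8080)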


With all the nice alternative frameworks available for the JVM these days, including some pretty micro/bare-bones alternatives, I wonder if Java will find new favour with startups or if they'll stick with Rails. The old "the Java ecosystem is too complex/over-engineered" excuse falls down with stuff like Play or Grizzly.

I fully realise the language itself is annoying and clunky and that Scala is not necessarily the answer. I'm just saying that there are some serious upsides to consider beyond language aesthetics.


Which is more expensive, servers or developers? Which is more expensive, server or opportunity cost?

In a lot of cases, performance of the server isn't the determining variable in the overall cost calculation.

In addition, the relationships between those variables likely changes over time.

I'm guessing that for most startups, HR expenses and opportunity costs are pretty important relative to actual server performance.

Obviously, if the point of your startup is some kind of high-performance computing thing, the equation is tipped more towards server performance. But, you probably aren't using off-the-shelf frameworks in that case, either.


Not to beat a dead horse, but ask Twitter. I expect their answer would be something like "Migrating off a non-performant solution to Java was more expensive than just doing it right in the first place."

Also, my point was that the newer JVM frameworks aren't much more complex or time-consuming than Rails, so developer cost isn't such an issue. That said, not much compares to Rails for getting stuff up and running quickly.


Cost is relative to the maturity of the business. They may not have had the funds or time to do things the right way the first time, and if they had tried, they might not have ever made it into near-unlimited-budget-land.


Fair enough; back then the newer, sleeker JVM frameworks may not have existed. But now, perhaps there's no such excuse.

Maybe these options will become really compelling once Kotlin is released. It compiles to both the JVM and JS, so it will hit the performance, convenience, and "fun to write" sweet spots all at once...or so I hope.


The quickstart for grizzly[0] contains a staggering amount of boilerplate code. I don't think anyone's going to be running back to the JVM for that.

https://grizzly.java.net/quickstart.html


Ok, but that's not code for an HTTP app, it's a raw socket echo client and server.


Yes, and a lot of it is comments.


Rails is so 2008... node.js is the new hotness.


I don't like Rails, however....Bitch please -_-.


That's okay, what's old is new again. Haven't you heard that Perl is hip again?


Of course it is =)

  # WebSocket echo service
  websocket '/echo' => sub {
    my $self = shift;
    $self->on(message => sub {
      my ($self, $msg) = @_;
      $self->send("echo: $msg");
    });
  };


I said that in jest not because it's ridiculous, but because I wish it were true. I use Mojolicious and love it.

My only complaint is that I started with Mojolicious::Lite way back in the beginning of the project, and I really needed to port to Mojolicious proper with a better source file layout quite a while back, but Mojolicious makes it too easy to throw in another lite-style route. At least the templates were easy enough to separate that I did it long ago.


You got to be kidding me -_-.


Why were Erlang-based frameworks removed in Round 7?


We were unable to correct a freezing issue that we ran into with the Rebar package manager. We removed Cowboy and Elli until this can be resolved.

See https://github.com/TechEmpower/FrameworkBenchmarks/issues/49...

This less forgiving stance with respect to glitches comes from our intent to publish a round of tests every month from this point forward (assuming we have the manpower to do so). In order to pull that off, we need to be less forgiving with problems that we don't have the know-how to fix immediately. We investigated Rebar for a bit but ultimately conceded defeat, skipped the tests, and posted the issue linked above. We'll revisit it again for Round 8.


Looks like they had some issues building with rebar: https://github.com/TechEmpower/FrameworkBenchmarks/issues/49...


Go's numbers are quite impressive considering how young the language is. That combined with the concurrency primitives and ease of deployment makes Go very appealing.


Be very suspicious of the results - for instance, in Python land you have:

wsgi:

    import ujson
    ...
    response = {"message": "Hello, World!"}
    ...    
tornado:

    obj = dict(message="Hello, World!")
    self.write(obj)
    
What you have is the slower way to construct a dictionary, which is then passed to the native Python JSON serializer vs. a C-optimized JSON serializer for output.
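
You can see the serializer side of that difference with a quick comparison like this (assuming ujson is installed):

    import json
    import timeit

    import ujson  # the C-optimized serializer used by the wsgi test

    obj = {"message": "Hello, World!"}
    print(timeit.timeit(lambda: json.dumps(obj), number=100000))   # standard library
    print(timeit.timeit(lambda: ujson.dumps(obj), number=100000))  # C extension, typically faster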


Although this is Round 7, we have no delusions that there is not a lot of room for improvement (particularly in the toolset, although that's a separate matter). If there are tuning tweaks, we'd love to receive a pull request.


Just to expand on that (I had to check it for myself) - in my ipython (ageing core2, two cores at around 6k "bogomips"):

    %timeit d={"message": "Hello, world"}
    10000000 loops, best of 3: 103 ns per loop

    %timeit d=dict(message="Hello, world")
    1000000 loops, best of 3: 298 ns per loop

    sys.version
    Out[24]: '2.7.3 (default, Jan  2 2013, 13:56:14) \n[GCC 4.7.2]'

So, pretty big difference.


Yes, but!

At 103ns per iteration, you can do 9.7 million iterations per second. At 298ns per iteration, you can do 3.4 million iterations per second.

Let's look at the wsgi numbers for json serialization:

  wsgi-nginx-uWSGI: 109,882 rps
  bottle-nginx-uWSGI: 65,793 rps
  wsgi: 65,755
  ..

Now, let's look at the best case comparison: 109,882 rps vs 3.4 million iterations per second. Is cutting iteration time down to 1/3 significant (298ns vs 103ns)? Yes. Is it significant in the overall context? No.

With a 31x difference between 109,882 and 3.4 million* (and, since this is comparing to an i7, that 31x is probably closer to the ballpark of 100x), this simply isn't likely to be the place where optimization will help much. Put simply, the {"message": "Hello, world"} vs dict(message="Hello, world") cost difference is likely insignificant when compared to the cost of the rest of the request.
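
A back-of-envelope version of that comparison:

  # how much of one request does the dict-vs-literal difference account for?
  rps = 109882
  per_request = 1.0 / rps            # ~9.1 microseconds per request
  saving = 298e-9 - 103e-9           # dict() vs {} literal, per request
  print(100 * saving / per_request)  # ~2.1 (percent of total request time)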

With that said, this sounds like a change that should be committed as part of a pull request! Why? A couple of reasons:

- It may still help the performance in a minor way (at the very least, it shouldn't hurt it)

- We prefer idiomatic code and, assuming the literal {} notation - vs dict() - is idiomatic for such a case, this would be a good change.

Your point about this is also interesting:

  What you have is the slower way to construct a dictionary,
  which is then passed to the native Python JSON serializer
  vs. a C-optimized JSON serializer for output.

Please do consider a pull request for both. We love community contributions.

* Yes, I know these aren't directly comparable numbers, but they compare well enough in this case.

Edit: Fixed the literal {} vs dict() mixup that e12e pointed out in my comment.


Note, I'm not the one that noticed this, I just tried the difference between literal construction "{}" and using dict().

Good points on the overall (likely) impact on the benchmark(s).

> We prefer idiomatic code and, assuming dict() is idiomatic for such a case, this would be a good change.

I think you mix up two things here: dict() is slower (presumably method look up and maybe class instantiation? Just guessing here) -- and I'd say using the literal notation is in general more idiomatic:

http://docs.python.org/2/tutorial/datastructures.html#dictio...


Oops, yes. I did mix up dict vs the literal version in the comparison. Thank you for the correction.

And yes, I noticed that you weren't the one that noticed the original possible performance issue. What I do appreciate is that you actually went and tested the performance difference.


I'm curious as to why Django was paired with Gunicorn and only Gunicorn for its tests. I would have loved to see it benchmarked with uWSGI, gevent, or something with a little more gusto.


Probably because the testers aren't experts in all frameworks and languages and deployment options, and don't have infinite time and resources to try all combinations of options to test stuff.

Maybe suggest some updates to the people doing the benchmarking. Or, god forbid, do the testing yourself and post some results for people.


Looking through all your comments, I see one vein of similarity between a majority of your words: you're especially, unnecessarily, and consistently impolite. Please, a little internet civility. My question was merely curiosity, and I'm (gasp) well aware that the testers at techempower don't possess infinite time or resources. Your reply was useless and in no way appropriate. Have a bit of respect for yourself.


Ok?

You wanted to know why they didn't test whatever you thought was the optimal setup for your use case, and the answer should've been obvious - there's dozens of frameworks across multiple languages and OS platforms. If you look around on this thread and those for previous iterations of the benchmarks, a consistent response from the benchmarkers is that they didn't test some specific scenario because they lack the expertise to do so.

The tests are on github. Go write one that suits you and send a pull request, or, god forbid, do it yourself and make a post about it on HN and get yourself some karma. But as it stands, your post was lazy and trivial. I've said similarly dumb things in the past, gotten downvoted or called out, and we all moved on.

I'm sorry your fee-fees got hurt because you said something dumb, and I pointed it out in terms that failed to indicate respect for you. I upvoted your response as a gesture of love and understanding.


Are there enough HN readers interested in contributing new benchmark scenarios for their favourite language?

Ideas I had were:

- Generating a nested JSON structure weighing in at 30 KB (from a database)

- Other, more real-world scenarios which I haven't thought up yet ;)


I like your way of thinking. You may want to read and contribute to our "Additional Test Types" issue at GitHub [1]. A test type with a larger JSON payload is on the list.

[1] https://github.com/TechEmpower/FrameworkBenchmarks/issues/13...


For Round 8, it would interesting to see Cognitect's (http://cognitect.com/) new Pedestal framework (http://pedestal.io/) for Clojure. Pedestal is Ring compatible but it doesn't use Ring.

Instead, Pedestal's new interceptors abstraction decouples http requests from threads. This provides better concurrency support because it enables processing a single request across multiple threads (http://pedestal.io/documentation/service-interceptors/).


I agree. I'd like to see Pedestal added. I've created a new GitHub issue to let contributors know we're looking for an implementation [1].

Incidentally, the Pedestal team gave their fans a free idea that I really liked in August [2]. Do it for fun and glory.

[1] https://github.com/TechEmpower/FrameworkBenchmarks/issues/58...

[2] https://twitter.com/pedestal_team/status/364945233814884352


I have honestly never heard of any of those fastest frameworks. I thought I knew a lot, but most of these are pretty obscure.


Is it sad that I all I care about is if Django beats Rails?


Honestly, that's what I began looking at also. Django and Rails are compared frequently - even if those comparisons aren't hostile.

This is the first time I've seen this benchmark, and it was really, really interesting to see how Django's performance degraded as more queries were executed. They mention that the lack of connection pooling is likely a big factor. I never realised how much that could affect an application.

These results are really interesting!


Yes. Don't define yourself by the tools you use. It creates silly divides that don't matter, and only harm.


Now that is so 2008!


84 frameworks and no GWT? The original cross-compiling web framework, being used for Adwords, Adsense, Google Play store, Amazon AWS, Google Groups, and lots of other huge deployments?

JRebel's web framework comparison this summer had GWT as the front runner, pitted against Spring MVC, Play, Grails, etc.:

http://zeroturnaround.com/rebellabs/the-curious-coders-java-...

Glaring omission for a comprehensive comparison, IMHO.


See my reply elsewhere in this thread concerning Meteor [1]. Some frameworks are opinionated, for lack of a better term, to a degree that makes it difficult to shoehorn them to provide implementations of our very fundamental/simple test types.

If you or anyone is willing to give it a try, however, we would gladly accept a pull request.

[1] https://news.ycombinator.com/item?id=6650262


Then send a pull request


Just saw Matthius give a talk at Scala.io in Paris on his meisterwerk, Spray.

Surprised it came in so high (6th); no wonder Spray was acquired by Typesafe...and why Spray will take the place of Netty in the Play stack.


Do they list the ruby versions they use? Also, what's up with the "errors" column?


Updating the versions in the Environment details became a game of whack-a-mole, so I removed that information from the Environment section for now. Maybe some day I'll come up with a nice way to automatically pull it from the GitHub repo or meta-data about each test. For now, you can find some details in the Rack script [1]: ruby-2.0.0-p0.

We'd be happy to receive a PR to update these versions if anyone is interested in crafting one.

The errors column is a sum of non-200 HTTP responses and socket connect, read, and write errors. In practice, it tends to be predominantly 500-series HTTP responses, meaning the web server acknowledged the request but was too busy to assign it to a worker process or thread.

This can cause the latency of some frameworks to appear artificially low because the 500 responses tend to come back very quickly once they start.

Incidentally, if you hover over the error values, you'll get a success rate percentage.

[1] https://github.com/TechEmpower/FrameworkBenchmarks/blob/mast...


I'm impressed by bottle-pypy, and a bit surprised about the gap between it and flask-pypy.


I use bottle in production and it's fantastic.

I'm curious to see what happens to performance when v.12 is released.


ASP.NET/C# definitely improving.

Rails is disappointing. Slow is to be expected for an interpreted-language framework, but that is really, really slow.


I have a feeling it is less about ASP.NET/C# improving and more likely their tests improving. Baseline perf has not gone up much in years.


That is correct. We have received many contributions from the community (I would call them subject matter experts, but they seem very humble to me, so I'm not sure what they'd call themselves) to improve the C# tests. It looks as if they have helped quite a bit, as I pointed out in the blog entry that accompanies Round 7.


Again, a totally useless benchmark. I really feel all these benchmark tests are useless if they are not in the realm of any real-world app. So why not create a benchmark mimicking a CRUD app, testing writes, reads, and calculations instead? In the real world it comes down to the ease of development, deployment, and maintenance. This is where the traditional web languages still win, and I'll bet you can ship software faster than ever these days when using a proven web framework.

Speed is not the only factor, and usually not even a factor that has to be accounted for. Let's be reasonable: 99% of web apps don't reach the state where they even need to scale to tens of thousands of user actions per second.

I'd be interested to see how HHVM/PHP would compare, though.


Useless? Not at all, but it's just data. It has to be interpreted, and how you do that depends on your needs. If nothing else, it's interesting from a technical standpoint to see the huge difference in performance.

To think that a benchmark like this tells the whole story of the "best" framework is wrong. Of course it is. It's also wrong to think that it tells the whole performance story. Of course it doesn't. But that does not make it useless.

> I'd be interested to see how HHVM/PHP would compare, though

Now you're just arguing with yourself :)


Aren't you assuming that the benchmark is targeted at all web-apps? They are just creating a benchmark. If an app you design doesn't need to scale, then you can ignore the benchmark.

To the rest who are building apps that need to scale, the benchmarks are definitely useful.


Grails was also tested in previous rounds; I don't see it in this round anymore?


See my reply elsewhere in this thread [1] concerning a new less-forgiving approach we've adopted to glitches/errors. It's not the most polite way to run the project, and I apologize for any oversights and errors, but it's just a necessity since we don't have expertise in everything.

With Grails specifically, a very late PR arrived that unfortunately broke the test implementation for us, so we removed it from the Round 7 test. We aim to push new rounds on a monthly cycle, so I hope that anything that is missing in one round won't be missing very long.

[1] https://news.ycombinator.com/item?id=6650136


Perhaps you could add a second table on each page of "Did Not Complete" entries.

Because people just aren't going to read the notes. They'll skip to the results.


That is a good idea. I'll add that to the GitHub issues.


I am absolutely shocked that Ruby kicked so much ass-- in the latency department. It's actually a pretty significant margin when you think about how long some of those simpler requests probably took the server to complete and then the latency to return the request.

Also, anyone out there have any good real-world experience with other Scala persistence frameworks? I was surprised to see play-anorm do so "relatively" poorly compared to other Scala frameworks. Though I'm admittedly quite naive about how "big" each of the other frameworks is (I've only used Play and Scalatra).

Please forgive any misspellings or grammar errors. This was typed on my iphone.


PHP and node.js neck and neck?

I'd really like to see the chart showing RAM usage for these tests.


This has puzzled/surprised me over multiple benchmarks. I had assumed that node.js would be faster at pretty much everything than PHP (especially for concurrency).

Time and time again though, these benchmarks show PHP outperforming node.js. I'm sure node.js will have lower RAM usage than PHP but as I'm more interested in reqs/s, I'm going to reconsider using node.


PHP is just a thin layer on top of C. Of course it will perform decently. PHP frameworks are usually designed in a dumb way and thus perform badly. Usually they load all or most of their files into memory with every request, even though only a couple of files would be enough.


I'd really like to see the Yii framework included in the test, as it claims to only load what's needed.


So the first framework ever (Java Servlets) is still pretty much the fastest? I wonder what that says about where development is heading.

In my previous project I've actually used straight servlets and found them quite RESTful and elegant - HTTP GET calls a function, you run a SQL query, render results using a template. What else does a framework really need? I much prefer that to the beast that is Spring. In fact, that's pretty much what people use tornado+express for these days - map a URL to a function.
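
A minimal Tornado sketch of that pattern, with a made-up route and stand-in data just to show the shape:

    import tornado.ioloop
    import tornado.web

    class ItemsHandler(tornado.web.RequestHandler):
        def get(self):
            rows = [{"id": 1, "name": "example"}]  # stand-in for a SQL query
            self.write({"rows": rows})             # Tornado JSON-encodes dicts

    app = tornado.web.Application([(r"/items", ItemsHandler)])

    if __name__ == "__main__":
        app.listen(8888)
        tornado.ioloop.IOLoop.instance().start()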


Caching is always faster than any web framework, so I don't think a faster web framework adds much value to most companies. Choose a framework that makes it easy to add caching.


I assume you mean reverse-proxy caching and not back-end caching. Not all use cases are suitable for reverse-proxy caching. Blogs, news sites, and the like are very suitable. Applications that are heavily personalized or work with private data are unsuitable for reverse proxying.

Our test cases are explicitly concerned with exercising performance when reverse proxying is not suitable, for whatever reason. If reverse proxying works for your use-case, definitely consider using it.


Can someone tell me what the difference is between php and php-raw?


In the tests that use a database, this distinguishes using an ORM vs. direct SQL queries. From the guide on the filters panel:

We classify object-relational mappers as follows:

- Full, meaning an ORM that provides wide functionality, possibly including a query language.

- Micro, meaning a less comprehensive abstraction of the relational model.

- Raw, meaning no ORM is used at all; the platform's raw database connectivity is used.


One issue I have with these frameworks is the scaling pattern/path.

With Rails I might do:

1. Fresh rails app

2. Add page caching / CDN

3. Add database/query caching

(I have no apps large enough to require anything past here)

4. Sharding the DB?

Rails is awesome because each step is concise (via the language, ruby) and many other people have demos / documentation on how to do them.

Sure, your framework of choice may be fast at stage 1, but how easy is it to go to 2, 3 and so on when I need to?


Those are all basic optimizations that most make as their apps grow; I doubt there is a single framework in that benchmark that couldn't handle all of them easily.


Nice work, but a bit useless without Erlang. :-(


I'd like to see what this test looks like once you add event-machine to the ruby frameworks.


It's strange that Vert.x has very poor performance when it performs database queries.


I noticed the same, and I suspect this has more to do with the implementation of the Mongo driver it's using than with Vert.x itself. Vert.x is a very high-performance server.


I thought so too, so I looked at the source. It's communicating with a MongoDB persistor over the event bus rather than querying the database directly. So the poor performance is because there's an extra layer of communication happening, which is kind of misleading for a benchmark like this.


Vert.x was in our original batch of tests (round 1). At the time, its documentation showed that using the persistor and event bus was the standard intended mode of communicating with a database. If someone can point us to documentation showing that is no longer how you're supposed to use a database with Vert.x, we'd be happy to accept a pull request that changes the test. Otherwise, I think it's valid as is, and not misleading.


> We use the word framework loosely to refer to any HTTP stack—a full-stack framework, a micro-framework, or even a web platform such as Rack, Servlet, or plain PHP.

Yep, that's the problem of these benchmarks. Apples and oranges.


Wait, why is this a problem? Nobody is suggesting you blindly pick the #1 framework on the list without thought. It IS however very interesting to see how things compare on one benchmark. It also provides some empirical evidence that JVM languages are still king if you care primarily about performance.


You can open the filters panel to slice the data however you see fit, including by removing all of the platforms from the view.

I have answered this sort of question many times. I eventually wrote a blog entry about it: http://tiamat.tsotech.com/unfair-comparisons


According to the charts, PHP performs well when there are multiple queries and in the case of data updates.

I wonder if they set up PHP with the APC cache (opcode cache), which is a basic and easy way to speed up PHP.


Yes: http://www.techempower.com/benchmarks/#section=motivation

"Have you enabled APC for the PHP tests?" Yes, the PHP tests run with APC and PHP-FPM on nginx.


What happened to the Go on EC2 data? It appears to be missing.


With Round 7 we have adopted a less forgiving approach to the test runs. In previous rounds, we would expend a non-trivial amount of effort attempting to ensure all of the tests worked correctly on both i7 and EC2.

But when we're busy with other projects, this becomes a real bottleneck. We wanted to get Round 7 out and be able to do future rounds on a monthly basis. In order to pull that off, we simply need to reduce our own bottleneck and ask the community to do more of that work. Round 7 was the first with a preview round and the community submitted a bunch of PRs prior to the final run.

To be as clear as possible, I absolutely do not blame the missing Go data on EC2 on the community. It's just the nature of attempting to herd all of these cats, er frameworks. :)

I'm fairly sure it's a configuration problem either in the Go test implementation -or- our test-suite. And this part may be obvious, but it bears repeating: except in very rare cases, test absence should not reflect poorly on the framework itself.


Please post a link to where we can go to help contribute to your benchmark work. Thank you.


No JSF or plain JavaEE?


JSF may not be a good fit for these benchmarks, as discussed in this issue on the project: https://github.com/TechEmpower/FrameworkBenchmarks/issues/30...


There are several servlet containers listed; that is as plain as Java EE can get, isn't it?


I would like to see Casablanca in the next rounds as a good C++ contender:

http://casablanca.codeplex.com/


I just read the word "microsoft" and instantly kill my tab.


According to this, node.js is barely faster than PHP. Does this have more to do with the node.js HTTP server vs. nginx, or with the language itself?


That's node vs. bare PHP, which has limited utility. Performance drops like a rock if you use any kind of framework - see Cake, CI, and Symfony, all at the very bottom.


Plain PHP without any frameworks is just a really thin layer on top of C. So it's not a surprise that it performs well.


Well, if every language had a sufficiently smart compiler, the world would be fine.


Can you add Meteor for Round 8?


There is a category of web frameworks that imply specific web application architectures which make them somewhat difficult to shoehorn to our test types. Meteor is one of those, as are JSF and GWT. We'd welcome an implementation that attempts to match Meteor to these test types, but acknowledge that might be awkward.


We'd love to have a Meteor test. Any Meteor developers interested in submitting a pull request with a Meteor test?

https://github.com/TechEmpower/FrameworkBenchmarks


TreeFrog is also a framework that should be included. It is based on C++ and follows Rails thinking, so you get both the Rails way of doing things and C++ performance.


Treefrog is included. It's one of the new frameworks in Round 7.

http://www.techempower.com/benchmarks/#section=data-r7&hw=i7...


PHP benchmarks are extremely outdated and not even using PHP 5.5 with OPcache (which is now native). Python and Ruby implementation choices are also really weird.

- Laravel Version 3.2.14

- PHP Version 5.4.13 with FPM and APC

Don't waste your time with this crap.


Not to be an ass - I don't know much about their approaches or PHP (APC in particular)... But wouldn't enabling caching sort of defeat a portion of the test?

Wouldn't it just be better to compare caches/caching mechanisms?


While other sensible language interpreters have a feature like byte-code caching built into them, PHP for quite a while did not, because someone was monetizing a proprietary cache add-on named ZendOptimizerPlus!

They released it as part of the language in 5.5 because the alternative APC cache got just good enough and everyone was installing it by default.

So, to answer your question: yes, APC should be used on any PHP benchmark if performance is in question.


They seem pretty open to pull requests, so feel free to fix the "crap", as you call it.


What's wrong with rails data update?


why no template rendering?


The Fortunes test exercises server-side templates. You can see the requirements for each test in the "Source code & requirements" section: http://www.techempower.com/benchmarks/#section=code

I think we used to have a brief description of each test in the results section as well, and they somehow got lost in this round. We should add them back; how is a new reader supposed to know what "Fortunes" means?


The requirement summary is still there for each test, but it has been moved below the results table. I am experimenting with this re-arrangement to see what people think.


Such a waste of time and money; I really don't get it...


It is an SEO booster for that company, but don't tell anyone.


I even got punished with bad karma for that :D



