Web Framework Benchmarks Round 7

rallison · on Oct 31, 2013

For all those interested in hackernews comments from the previous rounds, enjoy:

Round 6: https://news.ycombinator.com/item?id=5979766

Round 5: https://news.ycombinator.com/item?id=5727012

Round 4: https://news.ycombinator.com/item?id=5644880

Round 3: https://news.ycombinator.com/item?id=5573532

Round 2: https://news.ycombinator.com/item?id=5498869

Round 1: https://news.ycombinator.com/item?id=5454775

Lots of solid discussion.

integraton · on Oct 31, 2013

Summary: if you want performance, use Java, Scala, Go, Clojure, Lua, or C++.

Honestly, now with all the great Scala frameworks, Clojure, and the ability to run Rails, plus Cassandra, Storm, etc, I'm a little creeped out that I'm actually strongly considering building my current new project completely on the JVM.

recuter · on Oct 31, 2013

I'm a fan of Go and an even bigger fan of Lua, and I'm not sure I agree with your summary. Go and OpenResty can push 300,000 "Hello World" plaintext responses a second, and some Java stuff can do x2 that, so JVM frameworks are strictly better? And something called cpoll_cppsp trumps a sane language choice? Don't be cute. :)

It is perfectly fine to not squeeze every drop from your hardware and pick Go or Lua. In the same vain, Flask may be x20 slower at a meaningless benchmark but you'd note that the ones that involve actual work, you know, the kind your complex app will actually perform, the more involved it is the more the difference shrinks. All the way down to x4, in other words not an order of magnitutde.

The real takeaway summary should probably be: a well made framework like flask doesn't have so much overhead that the productivity gains it offers are not worth trading a bit of performance for.

A terribly made framework (I won't name names) will bite you in the ass. Choose carefully.

mangeletti · on Oct 31, 2013

I think the idea of a "meaningless benchmark" has become a dogma. Can you possibly know all the complexities that an app will contain when benchmarking a framework? No, but that doesn't make baseline benchmarks useless. Part of the reason for including the most basic (CRUD-ish) operations as separate benchmarks in their tests is so that you can make comparisons between basic "Hello, World" operations and more complex operations, which gives you a bit more insight into the overhead of each framework.

I'll take that any day, over guessing which framework is fastest.

recuter · on Oct 31, 2013

Agreed, I should have avoided that particular phrasing.

piranha · on Oct 31, 2013

Actually, it's other way around. In 'JSON encoding' tornado and flask-pypy are 4x slower than compojure, for example, but in 'Multiple queries' they are 14x slower.

integraton · on Oct 31, 2013

Just for the record, my comment didn't intend to imply that JVM frameworks are strictly better in all cases, and I don't believe that at all. In fact, I'd argue that for the majority of applications, choosing any modern framework purely for benchmark performance is very foolish since all of them perform more than adequately.

Rails, for example, despite always having been "slow," performs far more than adequately for the vast majority of web applications, scaling patterns are well established, and it's obviously hugely productive for many, many developers and companies.

mbesto · on Nov 1, 2013

> A terribly made framework (I won't name names) will bite you in the ass. Choose carefully.

Facebook and Twitter seem to be doing alright. Care to elaborate?

mgl · on Nov 1, 2013

Well, it really depends on the application type. We are a Java and Ruby on Rails software house (http://codedose.com), we also have experience working for investment banks, and for some cases Java with Spring or Play is the best way to go, especially if you think about apps with complex back-end.

We have recently released an online market for physical gold trading capable of handling 10k+ concurrent users with horizontally scalable architecture and complex trading engine/accounting logic, and it's all Java. It has to be fast and there is not enough static or almost-static content to make caching effective.

TylerE · on Oct 31, 2013

When ever I feel like that, a few hours with either the Java or Scala persistance frameworks rapidly cures me of it.

terhechte · on Nov 1, 2013

This. In my last project I tried to use scalatra with slick, and while I really liked scalatra, slick made me go nuts. I had to jump over so many hoops that it was just a pain.

I'm currently working on a new project (still research phase) where I was pondering going with Spray and a lightweight Postgresql wrapper because I primarily need to read data from a database, do some transformations on it, and write it out as JSON as fast as possible. I had it working in Spray, but I had issues with server crashes and the speed wasn't as I'd have expected. I fiddled with it for one day, and then, out of frustration, decided to give openresty a try. I've never written much Lua in my life, but after only a couple of hours, I had it working, and it was far, far faster than the Spray implementation. I did some research there, and it seems that the database stuff took a whole lot longer in Scala/Spray than in Lua. Now, of course, I loose type safety, so there may be hidden issues in there, but since I'm really just doing simple data transformations, I think I'm fine with lua / openresty.

As a first verdict, I really, really like what I've seen of Openresty so far.

EtienneK · on Nov 1, 2013

Honestly, JPA is not that bad.

_nodg · on Nov 1, 2013

mybatis.

hbbio · on Oct 31, 2013

Too bad memory is not measured/displayed as well. For many cloud-hosted servers, you will end up paying for memory, rather than CPU.

And Java/Scala are disqualified pretty quickly.

rsynnott · on Oct 31, 2013

Java and Scala would have large base memory requirements, but not necessarily large per-concurrent-request memory requirements.

thezilch · on Oct 31, 2013

Or PHP, as you're most likely hitting some kind of DB / cache and not just serving plaintext or cached JSON out of your backend... backs away slowly

jol · on Nov 1, 2013

Actually I was quite surprised that multiple queries benchmark showed more than 1 entry for PHP in top 10. take-away: if you don't do much sophisticated data processing then PHP is ok choice. gets ready to run

Joeri · on Nov 1, 2013

That's because php is a really thin wrapper around the underlying C libraries. The thing that makes it ugly is what makes it fast. Php is still interpreted and not jit'ed, even with the opcode cache used in this benchmark, so it loses out in anything that exercises the php code side of the equation, which is why the big php frameworks (symfony, cake and laravel) are 50 times slower than raw php. I wonder how facebook's hhvm engine would stack up, as it is a jit'ing php engine and supposedly an order of magnitude faster.

bhauer · on Nov 1, 2013

That's my understanding of the situation as well.

As a related aside, we have an intention to eventually capture some additional statistics about the implementations such as source lines of code, total number of commits, and possibly lines of code of libraries (where available).

These are just additional data points to use as the reader sees fit, but they may provide some insights. For example: (a) developer efficiency, assuming you are comfortable using sloc as a proxy for developer efficiency, which is admittedly a hotly debated matter; (b) commits are a proxy for the level of attention the particular test implementation has received in our project, since a test with only 1 commit may be unrefined and in need of tuning and a test with dozens of commits may be considered highly volatile; (c) where applicable, the performance impact of additional code, perhaps even as granular as a calculated average per-line cost in rps. The last point might be particularly illuminating for languages such as PHP.

mgkimsal · on Nov 1, 2013

http://blog.liip.ch/archive/2013/10/29/hhvm-and-symfony2.htm...

Doesn't look to be "an order of magnitude faster" - 200% faster in some cases - certainly nice numbers, but not a massive game changer for many (yet?).

ohashi · on Nov 1, 2013

php-raw performed quite well.

andypants · on Nov 1, 2013

As soon as you throw a php framework on top though, it sinks to the bottom of the benchmarks.

fixxer · on Nov 1, 2013

I'd run.

stesch · on Nov 1, 2013

If you ignore Erlang and run your database on a ramdisk.

Sorry, round 7 isn't helping much. Wait for round 8.

arocks · on Nov 1, 2013

or Haskell :)

on Oct 31, 2013

[deleted]

cgh · on Oct 31, 2013

Here is the environment info:

"As of Round 7, three Intel Sandy Bridge Core i7-2600K workstations with 8 GB memory each (early 2011 vintage) for the i7 tests; database server equipped with Samsung 840 Pro SSD Two Amazon EC2 m1.large instances for the EC2 tests Switched gigabit Ethernet"

So 8GB workstations. Even assuming full memory usage, the JVM, Go and C++ frameworks destroyed all comers.

rallison · on Oct 31, 2013

Just for reference, the deleted comment used to say:

Raw performance is nothing without memory usage information.

TomasEkeli · on Oct 31, 2013

That said - memory is dirt cheap

mgkimsal · on Oct 31, 2013

There are still practical limits, and sometimes you have to work within those. I've got a project on a 16G server - the data center doesn't have anything bigger. Moving everything to a custom server or different data center to get to 24G or 32G is... a lot of work. Some time spent finding memory reductions is still worth it.

And what if we went to 32G or 64G, but still needed more? "RAM is cheap" doesn't scale. The hardware needed to host a 128G system (or multiple 16/32G systems) isn't cheap.

sciurus · on Nov 2, 2013

Either you're wrong about the cost or we have a different definition of "cheap". The sweet spot for RAM pricing right now seems to be 16GB DIMMs, which cost $150 to $250. I can go to dell.com and configure a R320 with a Xeon CPU and 96GB of RAM for $2,500. That's less than a high-end Macbook Pro! A R420 with two Xeon CPUs and 192GB of RAM costs $4,300.

I recently upgraded two Dell Poweredge 2950s from 16 to 32GB of RAM. The cost per server was $300. If I'd needed to, I could have upgraded them to 64GB. That's on a five year old server that sells refurbished for $800.

Thaxll · on Oct 31, 2013

Not really, on a desktop it is, on Amazon it's very expensive.

meepmorp · on Oct 31, 2013

So don't use EC2 with your java apps.

raveli · on Oct 31, 2013

I was surprised not to see Revel included in the round 7 results. You can filter results by it but nothing is shown.

This is significant because, on round 6, Revel was performing better than raw Go in at least some of the test scenarios, which is not something you can say about many frameworks.

Was there ever an explanation why Go without any frameworks was performing worse than Revel? Simply because not much effort had been put into optimizing the Go code?

iends · on Nov 1, 2013

I was also confused that when I switched to EC2 instances I didn't see ANY go.

wheaties · on Nov 1, 2013

Good question, I would like to know this as well.

jakejake · on Oct 31, 2013

I have the dubious honor of the slowest framework on the test on page #1! For some reason the i7 tests perform really badly for Phreeze, whereas the EC2 tests it performs near the top of all the PHP frameworks.

I've been too busy with work to look at the tests but my goal is to make a proper showing in round #8!

bhauer · on Oct 31, 2013

Thanks, jakejake. Your participation is always appreciated, especially your good spirit.

Yes, if you have time, please do help us resolve the issue (and I am guessing it is some configuration issue) for Round 8.

sker · on Oct 31, 2013

Slightly glad to see ASP.NET improved a little (at least the stripped/raw/barebones version).

Slightly sad to see the Mono performance is still abysmal.

Edit: link to the blog. It was submitted earlier, but apparently got killed:

http://www.techempower.com/blog/2013/10/31/framework-benchma...

bigdubs · on Nov 1, 2013

The .net benchmarks, the http listener one in particular, uses threadpool tuning that actually hurts performance.

In my own testing there was another ~60% performance to gain with better tweaks.

MalcolmEvershed · on Nov 1, 2013

I implemented the threadpool tuning originally first for the ASP.NET tests (https://github.com/TechEmpower/FrameworkBenchmarks/issues/39...) because it improved performance 5-10%. If you've seen data showing that it actually hurts performance, please open an issue or send a pull request (especially for the additional ~60% performance gain). Thanks.

cmircea · on Oct 31, 2013

ASP.NET`s performance can be improved further, but that requires application specific tuning. That said, it's pretty damn good, for a full framework.

I'm impressed by the Java options. Definitely something to consider if absolute performance is the requirement.

platz · on Nov 1, 2013

I agree - I wonder if the Entity Framework results could be improved by using Compiled Queries.

radicalbyte · on Nov 1, 2013

Performance should improve massively with the use of async..

MalcolmEvershed · on Nov 2, 2013

There were a variety of experiments with async and I don't think there has yet been any data to show that it improved performance of this benchmark.

[0] https://github.com/TechEmpower/FrameworkBenchmarks/pull/272#... [1] https://github.com/TechEmpower/FrameworkBenchmarks/issues/31... [2] https://github.com/TechEmpower/FrameworkBenchmarks/pull/339#...

redact207 · on Nov 1, 2013

Benchmarks aside, I'm really impressed with the whole One ASP.NET thing. Seeing the whole MVC with web api and SignalR all now coming together now is making me finally feel excited about developing on the MS stack again.

omphalos · on Nov 1, 2013

Maybe there is one an I'm just missing it, but I'd love to see an OWIN-based benchmark in that list.

bhauer · on Nov 1, 2013

I'm not familiar with OWIN. However, a GitHub user named @damianh is working on an OWIN implementation. Perhaps reach out to him and offer to help? :)

[1] https://github.com/damianh/FrameworkBenchmarks/blob/nowin/no...

mwsherman · on Nov 1, 2013

I’d like to see ASP.net MVC natively on Windows, as well as ServiceStack. I think they’d compete. Mono is just not a thing for those of us in the .Net world, frankly.

mhixson · on Nov 1, 2013

We are running several .NET tests on Windows. Some are using SQL Server as their database, too. See: http://www.techempower.com/benchmarks/#section=data-r7&hw=i7...

neya · on Nov 1, 2013

http://www.techempower.com/benchmarks/#section=data-r7&hw=i7...

If you notice the specific filters I've set, PHP actually outperforms Go for many cases. That's actually nice to see (PHP gets better with each release version). Well, if there was something like Coffeescript, but for PHP, then there's no better time than now to make it.

I really love the simplicity of Ruby, the performance of Golang/JVM and the massive popularity of PHP and it's compatibility with budget hardware (Shared hosting, etc). If there was a "Write your code in Ruby and we'll compile it to PHP" kind of generator/converter/service, that would be so awesome :D

(If anyone knows something like that that already exists, then I would be extremely thankful).

iends · on Nov 1, 2013

This is interesting, but why is there such a huge gap between raw php, and most of the php frameworks?

The one PHP framework that seems to do well is yaf. However, Yaf is actually written in C and is a php extension. I would expect it to do better than all the php frameworks, but that doesn't account for the huge gap between raw php and the other frameworks.

plorkyeran · on Nov 1, 2013

Assuming that the PHP frameworks are not just all terribly written (which seems unlikely), what this shows us is that the PHP interpreter is very slow, but the libraries that PHP uses are fast. The raw PHP benchmark doesn't actually have very much PHP code, so the slowness of the interpreter doesn't matter much, while the PHP frameworks (other than Yaf) all have a great deal of PHP code.

neya · on Nov 1, 2013

Yeah, me neither, I don't understand why the huge gap. Would be interesting to see some technical background on this, though.

martinml · on Nov 1, 2013

haxe might be similar to what you're asking for?

neya · on Nov 1, 2013

Wow, thanks a lot Martin, Haxe looks very very promising :)

haberman · on Oct 31, 2013

From the "JSON serialization" requirements:

> For each request, an object mapping the key message to Hello, World! must be instantiated.

Why mandate how the benchmark is implemented? What if some frameworks could be faster by not instantiating an object? For maximum speed, you'd want to stream-process the input into output while materializing as few intermediate objects as possible.

Benchmarks should be defined in terms of their observable inputs/outputs. Otherwise it's like defining a car's 0-60 measurement with a rule that says "the car must then take gas from its tank and inject it into the engine." And then a Tesla could only compete by adding a gas engine that it doesn't need or want.

bhauer · on Oct 31, 2013

I want the test to require the work of creating an object or allocating the memory for the response and then use of a JSON serializer.

These tests are intended to be stand-ins for realistic application behaviors. They are not realistic applications, but they should make use of the same functionality that a real application would. I want the tests to establish realistic high-water marks for the types of operations they exercise.

Running a single query per request? The high-water mark you might expect is what you see as our single-query test. It would be very difficult to make a single-query operation in a real application more trivial than what we have.

If I treated all tests as a black box, none of the database tests would actually query the database; the fortunes test would not add an ephemeral object and re-sort on every request; and so on. Everything would just send static responses back.

Bottom line: nothing beats benchmarking your own application. But the tests in this project are intended to give a first-pass filter of sorts, assuming performance is on your list of requirements.

haberman · on Oct 31, 2013

> If I treated all tests as a black box, none of the database tests would actually query the database; the fortunes test would not add an ephemeral object and re-sort on every request; and so on. Everything would just send static responses back.

A better way to avoid this would be to make the output depend on the input. Require that the reply include some JSON that contains portions of the request.

Likewise for databases, require that the reply contain contents of a previous request. If you want to, make the request contain an ID and make the reply return the data keyed under that ID.

If you want to require that the data is persisted durably, require that the benchmark continue to function even if all processes are killed and restarted in the middle of a run.

What really matters about a system are its inputs and outputs. If an innovative system comes along that works differently inside, it shouldn't be disqualified from a benchmark just for being innovative.

emn13 · on Oct 31, 2013

That wouldn't work sufficiently well; unless the benchmark is quite complex it would almost always be better to implement an in-memory custom datastore rather than use a traditional backend. Even if you require persistance, since the intentionally small dataset means you can get away quite well with raw memory dumps or even an mmap variant. You wouldn't use a full-fledged JSON serializer but a limited string concatenation.

In any case, the way I read it, the requirement doesn't mean you have to make anything particularly heavy, just that you need to use a representation that's general enough to be usable in other scenarios without rewriting the entire app.

Of course, a little bit of extra encouragement in the form of a benchmark that's slightly harder to game wouldn't be bad.

barrkel · on Nov 1, 2013

Custom datastores (custom at the framework level, not necessarily at the application level) have been very useful for me in the past for getting high performance.

For example, I wrote a little in-memory DB for use in a .NET framework to avoid having to deserialize an object graph per request. Instead, it navigated a byte array and plucked out just the data needed.

Another time, I wrote a compact in-memory representation of a trie, using bit-packing and a few other tricks, to get an order of magnitude reduction in memory usage, making it possible to cache a lot more of a data set.

Why go to the database and instantiate a classic object if you can avoid it?

ahoge · on Nov 1, 2013

> Why mandate how the benchmark is implemented?

Otherwise you won't be able to compare the results.

For example, the Benchmarks Game [1] specifies how the algorithms in those benchmarks should work. For some of the benchmarks, there are faster algorithms. However, it's not about comparing algorithms, it's about comparing language implementations.

[1] http://benchmarksgame.alioth.debian.org/

thezilch · on Oct 31, 2013

You're going to stream a Hello, World value? Why use JSON, at all, when we can just print the resulting JSON text! Benchmarks should be defined, that's all.

haberman · on Oct 31, 2013

> Why use JSON, at all, when we can just print the resulting JSON text!

Indeed. But if you want to actually force the server to encode some JSON, then make the reply include portions of the request. Then you don't have to have artificial rules about how it is implemented to get interesting/meaningful results: https://news.ycombinator.com/item?id=6650499

emn13 · on Oct 31, 2013

No, you still need to, because you can otherwise do a trivial string concat in most cases, even though in reality that would be most unwise for security, robustness, and maintainability reasons.

haberman · on Oct 31, 2013

Since you don't know the inputs ahead of time, the benchmark would be incorrect if it could not handle the corner cases like special characters that need to be escaped in JSON strings.

If you wanted you could also make the request specify the key at which a value must be written.

Be creative. :) But don't think that apps always have to be written the way they are now to be "legitimate."

meepmorp · on Oct 31, 2013

For JSON serialization, I'd assume it's because they're testing object serialization.

mhixson · on Nov 1, 2013

Right. Related to this, I feel like a couple of frameworks are cheating on the plaintext test because they aren't actually testing string... serialization. They encode the "Hello, World!" string into bytes once, retain those bytes in memory, and send those bytes on each request just to avoid encoding the string again.

It's such a small, silly optimization. It doesn't bother me because it's not going to affect the overall results much, but I do feel it's against the spirit of the test. That kind of thing would be prevented if we made the test echo a request parameter, as was suggested elsewhere in the comments.

haberman · on Oct 31, 2013

Clients of a system don't know or care if the server actually constructed an object and serialized it. They care that they get the reply they expect, and do not care how it was constructed.

meepmorp · on Oct 31, 2013

Right, but then you're just testing how fast a framework can vomit out a small string and some headers.

I'm not saying it's an especially useful benchmark, but it's intended to test something besides raw response speed.

haberman · on Oct 31, 2013

See: https://news.ycombinator.com/item?id=6650499

meepmorp · on Oct 31, 2013

I wholeheartedly agree. But like said:

> I'm not saying it's an especially useful benchmark, but it's intended to test something besides raw response speed.

cies · on Nov 1, 2013

In an era when VMs are bought by the RAM size, this benchmark still does not show RAM usage, or performance at a specific RAM limit. In such a benchmark JVM is allowed to take advantage of its weakness. :)

bhauer · on Nov 1, 2013

We do have a fairly long-standing issue to add collection of resource statistics while tests are running [1]. It has not yet been implemented, however. I do want to get that implemented sometime soon because I too would like to see CPU utilization (some frameworks do not successfully saturate all of the cores!), memory utilization, and IO utilization.

[1] https://github.com/TechEmpower/FrameworkBenchmarks/issues/10...

cies · on Nov 1, 2013

One more thing: I love the benchmark! I love to see every new issue...

rcarmo · on Nov 1, 2013

I'm rather unimpressed by the implementations of the Python benchmarks, and surprised there isn't a single mention of gevent - that alone would have speeded up, say, Bottle by nearly an order of magnitude.

We use uWSGI + bottle + gevent for REST APIs and a relatively humble box can reach flood a gig Ethernet link with _useful_ replies. And, of course, any decent implementation will cache computed results, etc. (but toss Varnish in - which we haven't, yet, since there's no need - and you'll outperform anything else for cacheable replies).

cgh · on Oct 31, 2013

With all the nice alternative frameworks available for the JVM these days, including some pretty micro/bare-bones alternatives, I wonder if Java will find new favour with startups or if they'll stick with Rails. The old "the Java ecosystem is too complex/over-engineered" excuse falls down with stuff like Play or Grizzly.

I fully realise the language itself is annoying and clunky and that Scala is not necessarily the answer. I'm just saying that there are some serious upsides to consider beyond language aesthetics.

numbsafari · on Oct 31, 2013

Which is more expensive, servers or developers? Which is more expensive, server or opportunity cost?

In a lot of cases, performance of the server isn't the determining variable in the overall cost calculation.

In addition, the relationships between those variables likely changes over time.

I'm guessing that for most startups, HR expenses and opportunity costs are pretty important relative to actual server performance.

Obviously, if the point of your startup is some kind of high-performance computing thing, the equation is tipped more towards server performance. But, you probably aren't using off-the-shelf frameworks in that case, either.

cgh · on Oct 31, 2013

Not to beat a dead horse, but ask Twitter. I expect their answer would be something like "Migrating off a non-performant solution to Java was more expensive than just doing it right in the first place."

Also, my point was that the newer jvm frameworks aren't much more complex or time-consuming than Rails so developer cost isn't such an issue. That said, not much compares to Rails for getting stuff up and running quickly.

hackula1 · on Oct 31, 2013

Cost is relative to the maturity of the business. They may not have had the funds or time to do things the right way the first time, and if they had tried, they might not have ever made it into near-unlimited-budget-land.

cgh · on Oct 31, 2013

Fair enough, back then the newer, sleeker jvm frameworks may not have existed. But now, perhaps there's no such excuse.

Maybe these options will become really compelling once Kotlin is released. It compiles to both the jvm and JS so it will hit the performance, convenience and "fun to write" sweet spots all at once...or so I hope.

ollysb · on Oct 31, 2013

The quickstart for grizzly[0] contains a staggering amount of boilerplate code. I don't think anyone's going to be running back to the JVM for that.

https://grizzly.java.net/quickstart.html

meepmorp · on Oct 31, 2013

Ok, but that's not code for an HTTP app, it's a raw socket echo client and server.

cgh · on Oct 31, 2013

Yes, and a lot of it is comments.

tracker1 · on Oct 31, 2013

Rails is so 2008... node.js is the new hotness.

jbeja · on Nov 1, 2013

I don't like Rails, however....Bitch please -_-.

kbenson · on Nov 1, 2013

That's okay, what's old is new again. Haven't you heard that Perl is hip again?

gizzlon · on Nov 1, 2013

Of course it is =)

  # WebSocket echo service
  websocket '/echo' => sub {
    my $self = shift;
    $self->on(message => sub {
      my ($self, $msg) = @_;
      $self->send("echo: $msg");
    });
  };

kbenson · on Nov 1, 2013

I said that in jest not because it's ridiculous, but because I wish it were true. I use Mojolicious and love it.

My only complaint is that I started with Mojolicious::Lite way back in the beginning of the project, and I really needed to port to Mojolicious proper with a better source file layout quite a while back, but Mojolicious makes it too easy to throw in another lite-style route. At least the templates were easy enough to separate that I did it long ago.

jbeja · on Nov 1, 2013

You got to be kidding me -_-.

nivertech · on Oct 31, 2013

Why Erlang-based frameworks removed in Round 7?

bhauer · on Oct 31, 2013

We were unable to correct a freezing issue that we ran into with the Rebar package manager. We removed Cowboy and Elli until this can be resolved.

See https://github.com/TechEmpower/FrameworkBenchmarks/issues/49...

This less forgiving stance with respect to glitches comes from our intent to publish a round of tests every month from this point forward (assuming we have the manpower to do so). In order to pull that off, we need to be less forgiving with problems that we don't have the know-how to fix immediately. We investigated Rebar for a bit but ultimately conceded defeat, skipped the tests, and posted the issue linked above. We'll revisit it again for Round 8.

strmpnk · on Oct 31, 2013

Looks like they had some issues building with rebar: https://github.com/TechEmpower/FrameworkBenchmarks/issues/49...

mayhew · on Oct 31, 2013

Go's numbers are quite impressive considering how young the language is. That combined with the concurrency primitives and ease of deployment makes Go very appealing.

koblas · on Oct 31, 2013

Be very suspect of the results - for instance in python land you have:

wsgi:

    import ujson
    ...
    response = {"message": "Hello, World!"}
    ...

tornado:

    obj = dict(message="Hello, World!")
    self.write(obj)

What you have is the slower way to contruct a dictionary and then passing to the Python native JSON vs. a C optimized JSON for output.

bhauer · on Oct 31, 2013

Although this is Round 7, we have no delusions that there is not a lot of room for improvement (particularly in the toolset, although that's a separate matter). If there are tuning tweaks, we'd love to receive a pull request.

e12e · on Nov 1, 2013

Just to expand on that (I had to check it for myself) - in my ipython (ageing core2, two cores at around 6k "bogomips"):

    %timeit d={"message": "Hello, world"}

10000000 loops, best of 3: 103 ns per loop

    %timeit d=dict(message="Hello, world")

1000000 loops, best of 3: 298 ns per loop

    sys.version
    Out[24]: '2.7.3 (default, Jan  2 2013, 13:56:14) \n[GCC 4.7.2]'

So, pretty big difference.

rallison · on Nov 1, 2013

Yes, but!

At 103ns per iteration, you can do 9.7 million iterations per second. At 298ns per iteration, you can do 3.4 million iterations per second.

Let's look at the wsgi numbers for json serialization:

  wsgi-nginx-uWSGI: 109,882 rps
  bottle-nginx-uWSGI: 65,793 rps
  wsgi: 65,755
  ..

Now, let's look at the best case comparison: 109,882 rps vs 3.4 million iterations per second. Is cutting iteration time down to 1/3 significant (298ns vs 103ns)? Yes. Is it significant in the overall context? No.

With a 31x difference between 109,882 and 3.4 million* (and, since this is comparing to an i7, that 31x is probably closer to the ballpark of 100x), this simply isn't likely to be the place where optimization will help much. Put simply, the {"message": "Hello, world"} vs dict(message="Hello, world") cost difference is likely insignificant when compared to the cost of the rest of the request.

With that said, this sounds like a change that should be committed as part of a pull request! Why? A couple of reasons:

- It may still help the performance in a minor way (at the very least, it shouldn't hurt it)

- We prefer idiomatic code and, assuming the literal {} notation - vs dict() - is idiomatic for such a case, this would be a good change.

Your point about this is also interesting:

  What you have is the slower way to contruct a dictionary
  and then passing to the Python native JSON vs. a C 
  optimized JSON for output.

Please do consider a pull request for both. We love community contributions.

* Yes, I know these aren't directly comparable numbers, but they compare well enough in this case.

Edit: Fixed the literal {} vs dict() mixup that e12e pointed out in my comment.

e12e · on Nov 1, 2013

Note, I'm not the one that noticed this, I just tried the difference between literal construction "{}" and using dict().

Good points on the overall (likely) impact on the benchmark(s).

> We prefer idiomatic code and, assuming dict() is idiomatic for such a case, this would be a good change.

I think you mix up two things here: dict() is slower (presumably method look up and maybe class instantiation? Just guessing here) -- and I'd say using the literal notation is in general more idiomatic:

http://docs.python.org/2/tutorial/datastructures.html#dictio...

rallison · on Nov 1, 2013

Oops, yes. I did mix up dict vs the literal version in the comparison. Thank you for the correction.

And yes, I noticed that you weren't the one that noticed the original possible performance issue. What I do appreciate is that you actually went and tested the performance difference.

shijie · on Oct 31, 2013

I'm curious as to why Django was paired with gUnicorn and only gUnicorn for its tests. I would have loved to see it benchmarked with uWSGI, gEvent or something with a little more gusto.

meepmorp · on Nov 1, 2013

Probably because the testers aren't experts in all frameworks and languages and deployment options, and don't have infinite time and resources to try all combinations of options to test stuff.

Maybe suggest some updates to the people doing the benchmarking. Or, god forbid, do the testing yourself and post some results for people.

shijie · on Nov 1, 2013

Looking through all your comments, I see one vein of similarity between a majority of your words: you're especially, unnecessarily, and consistently impolite. Please, a little internet civility. My question was merely curiosity, and I'm (gasp) well aware that the testers at techempower don't possess infinite time or resources. Your reply was useless and in no way appropriate. Have a bit of respect for yourself.

meepmorp · on Nov 1, 2013

Ok?

You wanted to know why they didn't test whatever you thought was the optimal setup for your use case, and the answer should've been obvious - there's dozens of frameworks across multiple languages and OS platforms. If you look around on this thread and those for previous iterations of the benchmarks, a consistent response from the benchmarkers is that they didn't test some specific scenario because they lack the expertise to do so.

The tests are on github. Go write one that suits you and send a pull request, or, god forbid, do it yourself and make a post about it on HN and get yourself some karma. But as it stands, your post was lazy and trivial. I've said similarly dumb things in the past, gotten downvoted or called out, and we all moved on.

I'm sorry your fee-fees got hurt because you said something dumb, and I pointed it out in terms that failed to indicate respect for you. I upvoted your response as a gesture of love and understanding.

porker · on Nov 1, 2013

Are there enough HN readers interested in contributing new benchmark scenarios for their favourite language?

Ideas I had were: - Generating a nested JSON structure weighing in at 30kb (from a database) - Other more-real-word scenarios which I haven't thought up yet ;)

bhauer · on Nov 1, 2013

I like your way of thinking. You may want to read and contribute to our "Additional Test Types" issue at GitHub [1]. A test type with a larger JSON payload is on the list.

[1] https://github.com/TechEmpower/FrameworkBenchmarks/issues/13...

espeed · on Nov 1, 2013

For Round 8, it would interesting to see Cognitect's (http://cognitect.com/) new Pedestal framework (http://pedestal.io/) for Clojure. Pedestal is Ring compatible but it doesn't use Ring.

Instead, Pedestal's new interceptors abstraction decouples http requests from threads. This provides better concurrency support because it enables processing a single request across multiple threads (http://pedestal.io/documentation/service-interceptors/).

bhauer · on Nov 1, 2013

I agree. I'd like to see Pedestal added. I've created a new GitHub issue to let contributors know we're looking for an implementation [1].

Incidentally, the Pedestal team gave their fans a free idea that I really liked in August [2]. Do it for fun and glory.

[1] https://github.com/TechEmpower/FrameworkBenchmarks/issues/58...

[2] https://twitter.com/pedestal_team/status/364945233814884352

mcv · on Oct 31, 2013

I have honestly never heard of any of those fastest frameworks. I thought I knew a lot, but most of these are pretty obscure.

hiddentaco · on Oct 31, 2013

Is it sad that I all I care about is if Django beats Rails?

jsmeaton · on Nov 1, 2013

Honestly, that's what I began looking at also. Django and Rails are compared frequently - even if those comparisons aren't hostile.

This is the first time I've seen this benchmark, and it was really really interesting to see how Djangos performance degraded as more queries were executed. They mention that the lack of connection pooling is likely a big factor. I never realised how much that could affect an application.

These results are really interesting!

steveklabnik · on Oct 31, 2013

Yes. Don't define yourself by the tools you use. It creates silly divides that don't matter, and only harm.

jbeja · on Nov 1, 2013

Now that is so 2008!

sfjailbird · on Nov 1, 2013

84 frameworks and no GWT? The original cross-compiling web framework, being used for Adwords, Adsense, Google Play store, Amazon AWS, Google Groups, and lots of other huge deployments?

JRebel's web framework comparison this summer had GWT as the front runner, pitted against Spring MVC, Play, Grails, etc.:

http://zeroturnaround.com/rebellabs/the-curious-coders-java-...

Glaring omission for a comprehensive comparison, IMHO.

bhauer · on Nov 1, 2013

See my reply elsewhere in this thread concerning Meteor [1]. Some frameworks are opinionated, for lack of a better term, to a degree that makes it difficult to shoehorn them to provide implementations of our very fundamental/simple test types.

If you or anyone is willing to give it a try, however, we would gladly accept a pull request.

[1] https://news.ycombinator.com/item?id=6650262

axelf · on Nov 1, 2013

Then send a pull request

virtualwhys · on Nov 1, 2013

Just saw Matthius give a talk at Scala.io in Paris on his meisterwerk, Spray.

Surprised it came in so high (6), no wonder Spray was acquired by Typesafe...and why Spray will take place of Netty in Play stack.

phillmv · on Oct 31, 2013

Do they list the ruby versions they use? Also, what's up with the "errors" column?

bhauer · on Oct 31, 2013

Updating the versions in the Environment details became a game a whack-a-mole, so I removed that information from the Environment section for now. Maybe some day I'll come up with a nice way to automatically pull it from the GitHub repo or meta-data about each test. For now, you can find some details in the Rack script [1]: ruby-2.0.0-p0.

We'd be happy to receive a PR to update these versions if anyone is interested in crafting one.

The errors column is a sum of non-200 HTTP responses and socket connect, read, and write errors. In practice, it tends to be predominantly 500-series HTTP responses, meaning the web server acknowledged the request but was too busy to assign it to a worker process or thread.

This can cause the latency of some frameworks to appear artificially low because the 500 responses tend to come back very quickly once they start.

Incidentally, if you hover over the error values, you'll get a success rate percentage.

[1] https://github.com/TechEmpower/FrameworkBenchmarks/blob/mast...

Spiritus · on Oct 31, 2013

I'm impressed by bottle-pypy, and a bit surprised about the gap between it and flask-pypy.

cvburgess · on Oct 31, 2013

I use bottle in production and it's fantastic.

I'm curious to see what happens to performance when v.12 is released.

gesman · on Oct 31, 2013

ASP.NET/C# definitely improving.

Rails is disappointing. Slow to be expected for interpreted language framework but that is really sh slow.

keithwarren · on Nov 1, 2013

I have a feeling it is less about ASP.NET/C# improving and more likely their tests improving. Baseline perf has not went up much in years.

bhauer · on Nov 1, 2013

That is correct. We have received many contributions from the community (I would call them subject matter experts, but they seem very humble to me, so I'm not sure what they'd call themselves) to improve the C# tests. It looks as if they have helped quite a bit, as I pointed out in the blog entry that accompanies Round 7.

stormpat · on Nov 1, 2013

Again, a totally useless benchmark. I really feel all these benchmark tests are useless, if they are not in the realm of any real world app. So why not create a crud app mimicing benchmark, test writes and reads, and calculations instead. In the real world it comes down to the ease of development, deployment and maintenance. This is where the traditional web languages still win, and ill want to bet you can ship software faster than ever these days when using a proved web framework.

Speed is not the only factor, and usually not even a factor that has to be accounted for, lets be reasonable, 99% of any web apps dont reach the state that they even need to scale to 10000s of user actions per second.

Id be interested to see how hhvm/php whould compare though.

gizzlon · on Nov 1, 2013

Useless? Not at all, but it's just data. It has to be interpreted, and how you do that depends on your needs. If nothing else, it's interesting from a technical standpoint to see the huge difference it performance.

To think that a benchmark like this tells the whole story of the "best" framework is wrong. Of course it is. It's also wrong to think that it tells the whole performance story. Of course it doesn't. But that does not make it useless.

Id be interested to see how hhvm/php whould compare though

Now you're just arguing yourself :)

hrjet · on Nov 1, 2013

Aren't you assuming that the benchmark is targeted at all web-apps? They are just creating a benchmark. If an app you design doesn't need to scale, then you can ignore the benchmark.

To the rest who are building apps that need to scale, the benchmarks are definitely useful.

tjeerdnet · on Oct 31, 2013

In the previous tests Grails was also tested, I don't see it back in this round anymore?

bhauer · on Oct 31, 2013

See my reply elsewhere in this thread [1] concerning a new less-forgiving approach we've adopted to glitches/errors. It's not the most polite way to run the project, and I apologize for any oversights and errors, but it's just a necessity since we don't have expertise in everything.

With Grails specifically, a very late PR arrived that unfortunately broke the test implementation for us, so we removed it from the Round 7 test. We aim to push new rounds on a monthly cycle, so I hope that anything that is missing in one round won't be missing very long.

[1] https://news.ycombinator.com/item?id=6650136

jacques_chester · on Oct 31, 2013

Perhaps you could add a second table on each page of "Did Not Complete" entries.

Because people just aren't going to read the notes. They'll skip to the results.

bhauer · on Oct 31, 2013

That is a good idea. I'll add that to the GitHub issues.

rubyn00bie · on Nov 1, 2013

I am absolutely shocked that Ruby kicked so much ass-- in the latency department. It's actually a pretty significant margin when you think about how long some of those simpler requests probably took the server to complete and then the latency to return the request.

Also, anyone out there have any good real world experience with other scala persistence frameworks? I was surprised to see play-anorm do so "relatively" poor compared to other scala frameworks. Though I'm quite, admittedly, naive in how "big" each of the other frameworks are (only used play and scalatra).

Please forgive any misspellings or grammar errors. This was typed on my iphone.

babuskov · on Oct 31, 2013

PHP and node.js neck to neck?

I'd really like to see the chart showing RAM usage for these tests.

nacs · on Nov 1, 2013

This has puzzled/surprised me over multiple benchmarks. I had assumed that node.js would be faster at pretty much everything than PHP (especially for concurrency).

Time and time again though, these benchmarks show PHP outperforming node.js. I'm sure node.js will have lower RAM usage than PHP but as I'm more interested in reqs/s, I'm going to reconsider using node.

hpaavola · on Nov 1, 2013

PHP is just a thin layer on top of C. Of course it will perform decently. PHP frameworks are usually designed in a dumb way and thus perform badly. Usually they load all or most of their files into memory with every request, even though only couple files would be enough.

babuskov · on Nov 1, 2013

I'd really like to see Yii framework included in the test as they claim to only load what's needed.

akrymski · on Nov 1, 2013

So the first framework ever (Java Servlets) is still pretty much the fastest? I wonder what that says about where development is heading.

In my previous project I've actually used straight servlets and found them quite RESTful and elegant - HTTP GET calls a function, you run a SQL query, render results using a template. What else does a framework really need? I much prefer that to the beast that is Spring. In fact, that's pretty much what people use tornado+express for these days - map a URL to a function.

kasperbn · on Nov 1, 2013

Caching is always faster than any web framework, so I don't think a faster web framework add much value to most companies. Choose a framework that makes it easy to add caching.

bhauer · on Nov 1, 2013

I assume you mean reverse-proxy caching and not back-end caching. Not all use cases are suitable for reverse-proxy caching. Blogs, news sites, and the like are very suitable. Applications that are heavily personalized or work with private data are unsuitable for reverse proxying.

Our test cases are explicitly concerned with exercising performance when reverse proxying is not suitable, for whatever reason. If reverse proxying works for your use-case, definitely consider using it.

nkg · on Oct 31, 2013

Can someone tell me what is the difference between php and php-raw?

krg · on Oct 31, 2013

In the tests that use a database, this distinguishes using an ORM vs. direct SQL queries. From the guide on the filters panel:

We classify object-relational mappers as follows: Full, meaning an ORM that provides wide functionality, possibly including a query language. Micro, meaning a less comprehensive abstraction of the relational model. Raw, meaning no ORM is used at all; the platform's raw database connectivity is used.

hayksaakian · on Nov 1, 2013

One issue I have with these frameworks is the scaling pattern/path

With Rails I might do:

1. Fresh rails app

2. Add page caching / CDN

3. Add database/query caching

(I have no apps large enough to require anything past here)

4. Sharding the DB?

Rails is awesome because each step is concise (via the language, ruby) and many other people have demos / documentation on how to do them.

Sure, your framework of choice may be fast at stage 1, but how easy is it to go to 2, 3 and so on when I need to?

nikcub · on Nov 1, 2013

Those are all basic optimizations that most make as their apps grow, I doubt there is a single framework in that benchmark that couldn't handle all of them easily.

stesch · on Nov 1, 2013

Nice work, but a bit useless without Erlang. :-(

rhornberger · on Oct 31, 2013

I'd like to see what this test looks like once you add event-machine to the ruby frameworks.

buremba · on Oct 31, 2013

It's strange that Vertx has very poor performance when it performs database queries.

bhauer · on Oct 31, 2013

I noticed the same, and I suspect this has more to do with the implementation of the Mongo driver it's using than with Vert.x itself. Vert.x is a very high-performance server.

clwk · on Oct 31, 2013

I thought so too, so I looked at the source. It's communicating with a MongoDB persistor over the event bus rather than querying the database directly. So the poor performance is because there's an extra layer of communication happening, which is kind of misleading for a benchmark like this.

mhixson · on Nov 1, 2013

Vert.x was in our original batch of tests (round 1). At the time, its documentation showed that using the persistor and event bus was the standard intended mode of communicating with a database. If someone can point us to documentation showing that is no longer how you're supposed to use a database with Vert.x, we'd be happy to accept a pull request that changes the test. Otherwise, I think it's valid as is, and not misleading.

ExpiredLink · on Nov 1, 2013

> We use the word framework loosely to refer to any HTTP stack—a full-stack framework, a micro-framework, or even a web platform such as Rack, Servlet, or plain PHP.

Yep, that's the problem of these benchmarks. Apples and oranges.

iends · on Nov 1, 2013

Wait, why is this a problem? Nobody is suggesting you blindly pick the #1 framework on the list without thought. It IS however very interesting to see how things compare on one benchmark. It also provides some empirical evidence that JVM languages are still king if you care primarily about performance.

bhauer · on Nov 1, 2013

You can open the filters panel to slice the data however you see fit, including by removing all of the platforms from the view.

I have answered this sort of question many times. I eventually wrote a blog entry about it: http://tiamat.tsotech.com/unfair-comparisons

Continuous · on Oct 31, 2013

According to the charts, PHP performs well when there are multiple queries and in the case of Data updates.

I wonder of they setup PHP with APC cache (opcode cache) which is basic and easy setup to speed up PHP.

krg · on Oct 31, 2013

Yes: http://www.techempower.com/benchmarks/#section=motivation

"Have you enabled APC for the PHP tests?" Yes, the PHP tests run with APC and PHP-FPM on nginx.

TheMagicHorsey · on Nov 1, 2013

What happened to the Go on EC2 data? It appears to be missing.

bhauer · on Nov 1, 2013

With Round 7 we have adopted a less forgiving approach to the test runs. In previous rounds, we would expend a non-trivial amount of effort attempting to ensure all of the tests worked correctly on both i7 and EC2.

But when we're busy with other projects, this becomes a real bottleneck. We wanted to get Round 7 out and be able to do future rounds on a monthly basis. In order to pull that off, we simply need to reduce our own bottleneck and ask the community to do more of that work. Round 7 was the first with a preview round and the community submitted a bunch of PRs prior to the final run.

To be as clear as possible, I absolutely do not blame the missing Go data on EC2 on the community. It's just the nature of attempting to herd all of these cats, er frameworks. :)

I'm fairly sure it's a configuration problem either in the Go test implementation -or- our test-suite. And this part may be obvious, but it bears repeating: except in very rare cases, test absence should not reflect poorly on the framework itself.

TheMagicHorsey · on Nov 5, 2013

Please post a link to where we can go to help contribute to your benchmark work. Thank you.

ddnfcodx · on Oct 31, 2013

No JSF or plain JavaEE?

krg · on Oct 31, 2013

JSF may not be a good fit for these benchmarks, as discussed in this issue on the project: https://github.com/TechEmpower/FrameworkBenchmarks/issues/30...

fedesilva · on Nov 1, 2013

There are several servlet containers listed, that is as plain as JEE can get, isn't it?

oscargrouch · on Nov 1, 2013

I would like to see Casablanca in the next rounds as a good C++ contender:

http://casablanca.codeplex.com/

jbeja · on Nov 1, 2013

I just read the word "microsoft" and instantly kill my tab.

dubcanada · on Oct 31, 2013

According to this nodejs is barely faster then PHP. Does this have to do with the nodejs http server vs nginx or more to do with the language itself.

ricardobeat · on Nov 1, 2013

That's node vs bare PHP, which has limited utility. Performance drops like a rock if you use any kind of framework - see cake, CI, symfony all at the very bottom.

hpaavola · on Nov 1, 2013

Plain PHP without any frameworks is just a really thin layer on top of C. So it's not a suprise that it performs well.

Vektorweg · on Nov 1, 2013

Well, if every language would have a sufficiently smart compiler, the world would be fine.

jsnk · on Oct 31, 2013

Can you add Meteor for Round 8?

bhauer · on Oct 31, 2013

There is a category of web frameworks that imply specific web application architectures which make them somewhat difficult to shoehorn to our test types. Meteor is one of those, as are JSF and GWT. We'd welcome an implementation that attempts to match Meteor to these test types, but acknowledge that might be awkward.

rallison · on Oct 31, 2013

We'd love to have a Meteor test. Any Meteor developers interested in submitting a pull request with a Meteor test?

https://github.com/TechEmpower/FrameworkBenchmarks

kvinnako · on Nov 1, 2013

tree-frog is also a framework that needs to be included. It is based on c++ and follows rails thinking. So you get both rails way of doing things and c++ performance.

bhauer · on Nov 1, 2013

Treefrog is included. It's one of the new frameworks in Round 7.

http://www.techempower.com/benchmarks/#section=data-r7&hw=i7...

HN_Master_Race · on Oct 31, 2013

PHP benchmarks are extremelly outdated and not even using PHP 5.5 with OPCache (that is now native). Python and Ruby implementation choices are also really weird.

- Laravel Version 3.2.14

- PHP Version 5.4.13 with FPM and APC

Don't waste your time with this crap.

rubyn00bie · on Nov 1, 2013

Not to be an ass, I don't know much about their approaches and Php (APC in particular)... But wouldn't enabling caching sort of defeat a portion of the test?

Wouldn't it just be better to compare caches/caching mechanisms?

nercury · on Nov 2, 2013

While other sensible language interpreters have a feature like byte-code caching built-in in them, PHP for quite a while did not, because someone was monetizing a proprietary cache add-on named ZendOptimizerPlus!

They released it as part of the language in 5.5 because alternative APC cache got just good enough and everyone was installing it by default.

So, to answer your question: yes, APC should be used on any PHP benchmark if performance is in question.

jmagoon · on Oct 31, 2013

they seem pretty open to pull requests. so feel free to fix the "crap", as you call it.

waitingkuo · on Nov 1, 2013

What's wrong with rails data update?

arahaya · on Nov 1, 2013

why no template rendering?

mhixson · on Nov 1, 2013

The Fortunes test exercises server-side templates. You can see the requirements for each test in the "Source code & requirements" section: http://www.techempower.com/benchmarks/#section=code

I think we used to have a brief description of each test in the results section as well, and they somehow got lost in this round. We should add them back; how is a new reader supposed to know what "Fortunes" means?

bhauer · on Nov 1, 2013

The requirement summary is still there for each test, but it has been moved below the results table. I am experimenting with this re-arrangement to see what people think.

tbarbugli · on Nov 1, 2013

such a waste of time and money, I really dont get it...

nercury · on Nov 2, 2013

It is a SEO booster for that company, but don't tell anyone.

tbarbugli · on Nov 3, 2013

I even got punished with bad karma for that :D