Debunking the Erlang and Haskell hype for servers (codexon.com)
43 points by ash on June 21, 2010 | 42 comments



This guy is incredibly hostile to the commenters.

If you read the whole thing, many from the Erlang community point out that he skews the benchmarks in favor of Python, and then he will...not...listen when they try to respectfully suggest changes to his tests. It's quite difficult to read all the way to the end, but by the end it makes me proud of the Erlang community for the by-and-large mature and respectful way they deal with aggressive and prejudicial attacks.


On the Haskell end, dons politely pointed out that the author's Haskell code was far from idiomatic in a benchmark-significant way. The response he got was pretty blindly defensive.


I am the author of this article. Do you not think the commenters are the ones being incredibly hostile?

Here they are on my blog, convinced that I am wrong, and when I don't agree, they start calling me names like "naive" and "stupid", or dismissing my replies as "funny". Let me give you a selection of some of their latest behavior.

- "Then you type some absolutely incoherent stuff about python objects and their state as if anyone still doubted your knowledge level in these matters."

- "And to top it off we get to see a raving “I’m never wrong” lunatic on the Internet."

- "You’re _such_ a dick"

Anyway, I don't expect to get a lot of support here, since everyone here loves exotic functional languages like Erlang. But I am still not budging from my position.


"Dear the internet, I know that serious companies have used certain cool languages to do amazing things. However, my benchmark of a trivial HTTP server demonstrates that BASIC is still the language of the future."


I thought the point of Erlang was that it was easier to code for distributed systems, not that it was necessarily faster as a language for simple benchmarks.


The main point is usually about reliability and fault-tolerance. Distribution and concurrency are a bit of a lucky side-effect from that (and the standards that were in the telecom world back then).

I'm exaggerating, but distribution came into the language much, much later and wasn't exactly a design goal when the language was started, as far as I know :)


Macrobenchmarks are made up of microbenchmarks.

More seriously, if your code is serially 10x faster, you can grow 10x further before you need to worry about horizontal scaling.


Ok, but the problem is that part of the speed Erlang gives up in this benchmark is something you get back, with interest, in programmer time and program complexity once you build a more complex system. I suppose it's like comparing C with Python. C is simply faster, but you're making a tradeoff because it's slower to code with, generally.


> C is simply faster, but you're making a tradeoff because it's slower to code with, generally.

Exactly. And you need to decide on a case-by-case basis whether having a longer runway (because C gives you more time before you run into scalability problems) compensates for needing longer before you can take off (because C is a harder language).


> (because C gives you more time before you run into scalability problems)

That's true for implementing the same algorithm. But C is so hard to get right that you will probably be able to use only the simplest algorithms in your C code. (Or, the other way round: you can scale by using better algorithms in a higher-level language like Python much more easily, and for much longer, than you can in C.)

That makes the comparison more complicated. Also, Python (and most other languages) works quite nicely together with C. So you can start with Python and replace the hotspots with C. (And be sure to identify the hotspots with a profiler, lest you guess wrong.)
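
As a rough illustration of that last point, here is a minimal sketch of hotspot-hunting with the standard-library profiler (the slow_path/fast_path names and the workload are made up for the example):

    import cProfile
    import pstats

    def slow_path(n):
        # deliberately does real work so it shows up in the profile
        return sum(i * i for i in range(n))

    def fast_path(n):
        return n

    def work():
        for _ in range(100):
            slow_path(10000)
            fast_path(10000)

    # profile the workload, dump the stats to a file, then print the
    # five most expensive call paths (the candidates for a C rewrite)
    cProfile.run("work()", "profile.out")
    pstats.Stats("profile.out").sort_stats("cumulative").print_stats(5)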


> But C is so hard to get right, that you will probably be able to use only the simplest algorithms in your C code

This is a gross exaggeration. It's not that hard to get C code right (C++ is a different story). I am unaware of any effort undertaken by skilled C programmers that failed because of limits C placed on algorithmic complexity. I am not arguing with your preference for higher level languages, just your statement that C is so difficult that it limits algorithmic expression.

C is substantially less compact and requires you to write code for things you get for free from other languages. Longer code takes more time to write and more time to read. Each feature or function point will, on average, take significantly longer to develop. On the other hand, a developer trying to write an OS in Python would also have some productivity challenges in other dimensions.

I am aware of the paradigmatic challenge C presents for many developers trained in the last 15 years. Trying to write in an OO style in C is neither fun nor advisable. Fortunately, most non-UI development is equally agreeable to other styles (although the developer may not be).

I'm not a C bigot and I like or love a number of high level languages (Python, Lisp, Haskell). I just don't think people should be afraid of C. Its closer-to-the-metal nature is an opportunity as well as a cost.

I'll close with a pointer to a great site written in C: http://www.halfbakery.com.


I agree. And I should have chosen different words. What you say is pretty much what I wanted to express.

The original comment said that with Python you run into scalability problems earlier than with C.

And I wanted to add that with C you run into (solvable but hard) 'scalability' problems in terms of the effort needed to cope with algorithmic complexity, much sooner. And more clever algorithms are often the key to solving scalability problems.

(P.S. I do not like OOP, either. State is ugly.)


> And I wanted to add that with C you run into (solvable but hard) 'scalability' problems in terms of the effort needed to cope with algorithmic complexity, much sooner.

I have certainly seen this effect. In retrospect, I wonder if this could be somewhat mitigated by real refactoring for C?


Perhaps. What also seems to work nicely, at least for me: prototype in, say, Python, and then translate to C (either the hotspots or everything, in case you need a pure-C solution).


Which is true if and only if the problem can be solved at all by a centralized system. This is not true for the major use cases of Erlang.


Of course. If you need your system to be distributed from the start, these benchmarks are irrelevant.

But most people aren't writing telecommunication software and can handle having a few single points of failure.


I think there's some unintentional benchmark sleight of hand here. I note that the slope of the first segment for Erlang and Haskell is almost the same as ideal, but Python deviates quite a bit. If I were the author, I'd be curious about this and try to analyze it. I suspect this would reveal something about his benchmark. (Probably that it's too small!)


The slope is equal to one, i.e., "100% of the incoming connections result in a request being successfully handled".

This is a dumb way to graph performance -- usually people look at either (parallel requests, requests per second) or (requests per second, request latency) -- but he seems to have done it correctly.


In the first segment, the slope for Python is not equal to one.


You must have better eyes than me. It looks to me like the slopes are all equal to one until the languages hit bottlenecks (for Haskell, at 6000; for Erlang, at 1000; and for Python, at 12000).


Ah, I misunderstood the graph. So the plots for all languages start at the left; the graph makes that hard to tell.


I don't get it... He's using select/epoll in Python but not in Haskell/Erlang. I call FUD on this.
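
For anyone who hasn't seen it, this is roughly what "using select/epoll in Python" amounts to: a single-threaded epoll accept loop that answers every connection with a canned response. A minimal sketch, not the article's actual benchmark code; the port and response body are invented, and it skips partial reads/writes and error handling:

    import socket
    import select

    RESPONSE = b"HTTP/1.0 200 OK\r\nContent-Length: 2\r\n\r\nok"

    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    server.bind(("0.0.0.0", 8080))
    server.listen(1024)
    server.setblocking(False)

    poller = select.epoll()                      # Linux-only; select.select elsewhere
    poller.register(server.fileno(), select.EPOLLIN)
    connections = {}

    while True:
        for fd, event in poller.poll():
            if fd == server.fileno():            # new client
                conn, _ = server.accept()
                conn.setblocking(False)
                connections[conn.fileno()] = conn
                poller.register(conn.fileno(), select.EPOLLIN)
            elif event & select.EPOLLIN:         # request arrived
                conn = connections.pop(fd)
                conn.recv(4096)                  # read (and ignore) the request
                conn.send(RESPONSE)              # answer and close
                poller.unregister(fd)
                conn.close()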


To be fair, he did say that enabling epoll in Erlang had no significant effect on performance.


It has a pretty huge impact on Haskell: http://www.serpentine.com/bos/files/ghc-event-manager.pdf (see page 5 for the graphs).


Why don't you try it?

Epoll for Haskell was heavily experimental and failed to compile when I wrote the article. Enabling epoll in Erlang changed the results by ~1%.


I think this is a strawman argument. Of course Haskell and Erlang have features that make them suitable for handling concurrency efficiently. But I'd take fast code in a slow language over slow code in a fast language any day.

That said, the author's core conclusion is correct: "DO NOT WRITE A SERVER IN ERLANG JUST BECAUSE YOU HEARD ERLANG IS THE FASTEST AND MOST CONCURRENT LANGUAGE".

EDIT: Could someone please explain the downmods? Perhaps something I said didn't come off the way I meant it.


> EDIT: Could someone please explain the downmods?

Hacker News has a very strong functional-language fanbase, which you could see last year in the number of Erlang articles; it has since promptly moved on to Node.js.


Upvoted for the "then they fight you" aspect. Fringe language practitioners need to hang together :)


Do you include Python in the fringe? It seems that Python has gone quite mainstream recently. I'd still count Erlang and Haskell in the fringe.


I do not include Python in the fringe, no.


I can haz profile plz? Where is the code spending most of its time? What happens when the response takes a non-trivial amount of processing?


It'd be interesting to see node.js thrown in there. IMO, it and Scala are likely to be the biggest competitors for some of what Erlang's good at.


Unlikely. They are providing a small amount of competition for the massive-concurrency sweet spot that Erlang accidentally found itself in, but they do not even begin to provide the basics necessary to play in the reliable/fault-tolerant sphere that Erlang well and truly owns. Node and Scala will definitely pick up mindshare as "like Erlang, but easier if you know [JavaScript | Java]", but I have a strong suspicion that they are going to end up feeding people _into_ Erlang in the long run.


Like Java feeds people into Smalltalk? My experience is that some pioneering language does things in a certain way, and then mainstream languages borrow enough of that to be an improvement on what's gone before. Maybe a tiny portion go look up what came before, but mostly not really.

"reliable/fault-tolerant sphere that Erlang well and truly owns." - that's not the "some" I was referring to, and it's likely that Erlang will continue to be strong there. However, concurrency is what people are most interested in. People mostly don't care if web apps are as reliable as phone switches, but care a lot about easier models of concurrency.


More like how Java feeds people into that popular Smalltalk variant known as Ruby. Sometimes mainstream languages can borrow enough features to pass themselves off as "close enough", but it is also frequently the case that attempts to make this move never really catch on. Twisted tried to pull off this same trick for Python, and IMHO it never really made the grade until the enhanced generators and yield expressions in recent versions of Python allowed people to write code that was not a complicated mass of callback hell. Node.js might thread the needle, but it seems equally likely that the role played by node.js will be subsumed by a better runtime, with JavaScript used to write functions and handlers that execute on the Java or Erlang VMs; to the users/coders the system will appear the same, but they will gain the benefits of a stronger set of concurrency primitives in the runtime.
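
For anyone who missed that era, here is a minimal sketch of the difference, using Twisted's Deferred API; twisted.web.client.getPage just stands in for any Deferred-returning call, and the function names are invented. The callback version scatters the logic across functions; the inlineCallbacks version, built on the enhanced generators, reads straight down:

    from twisted.internet import defer
    from twisted.web.client import getPage

    # Callback style: each step of the logic lives in a separate
    # function chained onto the Deferred.
    def fetch_with_callbacks(url):
        d = getPage(url)
        d.addCallback(lambda body: len(body))
        d.addErrback(lambda failure: 0)
        return d

    # Generator style (enhanced generators, PEP 342): the same logic,
    # written as if it were synchronous.
    @defer.inlineCallbacks
    def fetch_with_yield(url):
        body = yield getPage(url)        # suspends until the Deferred fires
        defer.returnValue(len(body))     # generators could not `return` a value back then

Both return a Deferred, so callers can't tell which style was used; the win is purely in how the body reads.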


The "problem" with node.js is that it's single threaded, so you need to be careful not to block that thread with long-running operations. Erlang has a scheduler built-in so you don't have to deal with that yourself.


Yes, I know that, as I've pointed it out multiple times in the past. Erlang is "better", but if people writing code for node.js pay attention to how they write it, doing the long-running stuff via callbacks as they have been, it will be "good enough".


"Long-running" does not always imply "I/O".


Yes, I know. I know how Erlang works and what makes it better. But I think that something like node.js really encourages people to not do stupid things like while(true) in actual code, which is why I maintain that, while it's not as good as Erlang, it may be "good enough".


I agree with you. Node.js is practical in cases where your app is mostly glue between some client and some other service. CPU-intensive work can be offloaded to other processes (not the Erlang kind), making the driver glue again. So, yes, it works.


I don't think that many people take these hip Erlang projects seriously =) Sure, a lot of bloggers try out CouchDB, but who cares. These projects are usually not serious enough for money to change hands. E.g., I recently talked to a client who was looking for a distributed database, and he couldn't even load his test dataset into CouchDB. Or, a simple 20-line script shows that Riak is 1000x slower to SET data than a C wrapper for BerkeleyDB (Keyspace in this case, but it almost doesn't matter).


The majority of the data for the smarkets.com betting exchange is stored in CouchDB. Asynchronous calls and cache purges are handled by RabbitMQ. The entire backend is written in Erlang. This is far from a pet project. Online betting is a highly regulated industry, and reliability is key.




