I'm not convinced that parallelism is something that needs support at the language level. The only languages that do this are Go and Erlang (off the top of my head).
Others like Clojure and Scala have more support at the standard library level.
Erlang gets a lot of good mileage out of it, though. You start to really miss the OTP system when you try to write actually-important code in any other language. And some aspects are hard to get right later if they're not in the language from day one, cross-process messaging in particular, which is a primitive used to build a lot of other features. I don't know whether Clojure or Scala have a generic cross-process messaging ability, and I'm not saying it's impossible by any means, but it's much easier if you start from scratch with that in mind.
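To make that concrete: Erlang's send primitive is location-transparent, so anything built on it works unchanged whether the target is local or on another node. A minimal sketch (module and names are just illustrative):

    -module(msg_sketch).
    -export([notify/2]).

    %% The same "!" works for a local pid and for a registered process
    %% on another node; nothing in the caller has to change.
    notify(Target, Event) ->
        Target ! {event, Event}.

    %% notify(Pid, started).                    %% local process
    %% notify({logger, 'log@host'}, started).   %% process on another node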
"I do not know where Clojure or Scala have a generic cross-process messaging ability, and I'm not saying it's impossible by any means, but it's much easier if you start from scratch with that in mind."
I don't understand how. The only difference I see is whether the interprocess communication parts are coded into the language itself or into a library. Putting them in the language itself could theoretically make the problem more difficult: instead of only affecting the programs that use that library, your concurrency features could now potentially affect any program written in your language.
Could you give me an example of a concurrency feature that really, truly benefits from being in the language and not the standard library? I simply can't think of any examples.
For the real answer to this question, attempt to implement a generic serialization syntax for Haskell that requires no work by the user to use. Not even declaring typeclasses. Not even requiring Typeable to be implemented. Just feed it a datatype, and even if the other end has a different version of the software installed (which can happen in a distributed system, after all), it'll all Just Work to some degree.
Erlang has that, because it has its datatypes and a defined serialization for them, and there is nothing in the language that is not those datatypes. It also doesn't permit you to layer any type-level assertions about those types into a user type. In fact, Erlang basically has no concept of user types. (Records are syntax sugar around the built-in tuple.) Since the Erlang data types are so weak, an automated serialization can be built that requires no work to use. But that doesn't come without cost.
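For instance, in the Erlang shell, any term round-trips with no user-written code, no typeclasses, no schema (output abbreviated; the tuple is arbitrary):

    1> Bin = term_to_binary({user, <<"alice">>, [1,2,3]}).
    <<131,104,3,...>>
    2> binary_to_term(Bin).
    {user,<<"alice">>,[1,2,3]}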
If you don't start with that, you have to use some sort of introspection to examine data types, and you probably don't have a good story for what to do if two ends have totally different ideas about those datatypes, or how to upgrade running processes where you literally want to change the datatype without shutting the process down. The stronger your type system, the harder that gets. The easier it is, the weaker your type system must be. Erlang's capabilities don't come for free; they exist because its designers wrote their (non-)type system so that their data structures never have any guarantees in them that don't trivially travel across the network. This has its own issues; Erlang has just about the weakest types you can have without actually having arbitrary casts, and that has consequences too.
There are a handful of characteristics in a language, like its type system, that have radical impacts throughout the language and no amount of library jiggery-pokery can completely paper over. Another example not entirely relevant to cross-process concurrency (but not entirely irrelevant) is immutability; no amount of library work can turn a mutable language into an immutable language.
This comment resonates with me especially well today.
I've been attempting to work with GWT, and the serialization issues are entertaining (did you implement IsSerializable? Is there a no-arg constructor? Do all of the classes your class depends on have the same? Did you remember to recompile the GWT/JS code to update the serialization whitelist? Etc.).
It makes me appreciate even more how simple it was to shoot data around when I played with Erlang a while back.
> Could you give me an example of a concurrency feature that really, truly benefits from being in the language and not the standard library?
Isn't that a bit of a loaded question? In some academic sense, an ideal programming system might have a kernel language that is probably rather small and certainly clean, flexible and extensible in its feature set, and then almost everything else built on top of that kernel in libraries. In practical programming systems, the perfect kernel has proven rather elusive, though.
It's rather like the question of whether a standard library should be small, clean and extensible or comprehensive with a good-enough version of everything. Many of us might be inclined toward the former approach from an academic/theoretical/intuitive point of view, but in practice, just about every widely used programming language from the past two decades has come with a batteries-included library. While those libraries are often criticised for various technical weaknesses, and many of those weaknesses are never addressed because it's too much work, these systems are still good enough for most users out of the box.
Fortress makes a convincing argument that it must be supported at the language level. It has a smart work splitting algorithm that does for managing parallelism granularity what garbage collection did for managing memory allocation. It works much better when it's baked into the core and is available from the ground up.
Not to detract from anything you said (which I agree with), but when Guy Steele gave a guest lecture at my university on Fortress a few months ago, he said that the work splitting algorithm still needed work, in particular the part that decided the right amount of granularity for the given task (i.e. when to stop splitting the task into smaller subtasks).
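To see why that's hard, here's the knob in its crudest form: a hand-picked cutoff in a divide-and-conquer sum. (Plain Erlang as a sketch, nothing to do with Fortress's actual algorithm; the 1024 cutoff is arbitrary and is exactly what you'd want chosen for you.)

    -module(psum).
    -export([psum/1]).

    %% Below the cutoff, splitting further costs more than it saves.
    psum(List) when length(List) =< 1024 ->
        lists:sum(List);
    psum(List) ->
        {L, R} = lists:split(length(List) div 2, List),
        Parent = self(),
        Pid = spawn(fun() -> Parent ! {self(), psum(L)} end),
        SumR = psum(R),
        receive {Pid, SumL} -> SumL + SumR end.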
Shouldn't it also be set up so that different algorithms can be swapped in? I think it's pretty accepted at this point that different application scenarios perform better with differently tuned schedulers.
But the beauty is that because it's a Lisp, it's all implemented in terms of macros - Clojure's complete set of concurrency tools could be written as a library.
The only exception is the deref reader macro "@", since Clojure doesn't allow user-defined reader macros by default.
True, the nice thing about libraries is that they can compete and make your ecosystem stronger. But the nice thing about building it into the language is that it becomes a common layer of compatibility.
We've gotten comfortable enough with our concurrency model that we felt it deserved to be in the language.
As an aside, some features that benefit parallelism can be tricky to add retroactively, although possible (like an N:M threading model).
I believe race conditions can still occur in pure message-passing languages like Erlang (Joe Armstrong says so in Programming Erlang, p. 173). This doesn't get much press, perhaps because it's much easier to get right than shared memory, and because not that many people are actually doing it yet.
You can build message passing code that suffers from race conditions. In practice, though, it's almost always obvious if you're about to do something dangerous.
On p.173 Joe writes:
When we say things such as this:
    Pid = register(...),
    on_exit(Pid, fun(X) -> ... end),
there is a possibility the process dies in the gap between these two statements.
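For what it's worth, Erlang closes that particular gap with erlang:monitor/2: monitoring an already-dead process still delivers a 'DOWN' message, with reason noproc. A sketch, where handle/1 stands in for whatever the cleanup fun did:

    Pid = spawn(fun() -> ... end),
    Ref = erlang:monitor(process, Pid),
    receive
        %% Delivered even if Pid died before the monitor was set up.
        {'DOWN', Ref, process, Pid, Reason} -> handle(Reason)
    end.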
Regarding process death, we allow multiple inter-process relationships to be specified at spawn time. Hence any number of processes may be configured to observe a newly-spawned process before it has even started. Additionally, all children are linked to (die with) their parent, which both eliminates the risk of orphaned processes and has other benefits.
So in Spin you wouldn't be tempted to write something like the above (and probably not in Erlang either?).
It might not need language level support, but anything to make it easier to leverage is a plus in my book. If the programmer doesn't have to concern themselves with extra parallelism-specific library calls and is still able to write a binary that can take advantage of multiple cores, that seems like a good idea to me.
Let's face it - it's not the language that is the problem. It would help you, but it would help you only with some percentage of the problems out there.
For example - there's no practical language (or design) in which you can implement LZ compression (ZLIB, others) in parallel so that it gives the same results as the sequential C version.
It's just that certain algorithms, hence protocols, data structures, standards are not suited for parallel processing that well.
Okay, in the first case, maybe you can split the incoming data into 128 KB chunks and process each chunk individually ... but that's not the same - you can't reuse the LZ window.
Really the problem is the 13 dwarfs that university folks have identified - 13 stereotypical problems that cover 99% of what's being done with a programming language. Some of the dwarfs are speedy parallel gnomes; some of them are old, slow, and stubborn, like Gimli from LOTR.
I understand that parallelism isn't suited for a lot of algorithms but from what I can tell, there's been heaps of successful work in compression. Could you give us some more information?
It would really help me out - I'm doing an undergraduate class where I have to manually make an open-source project parallel. I've been looking into compression algorithms because I thought they were well suited. Please help me out if I'm going down the wrong path!
The key to malkia's strawman is that pigz does not give the same results as the sequential C version of gzip. AFAIK gzip is not parallelizable as-is, because every symbol depends on the previous symbol; pigz breaks this dependency to get parallelism but gives up a little compression efficiency. In practice this does not matter, which is why LZ is not a good example.
Compression algorithms are poorly suited to parallelism, because they remove everything that isn't a data dependency in the input, and parallelism is nothing but a lack of data dependencies.
The trick is to start at the largest chunk possible and go down until you find where they have left in some, uh, non-dependencies - like bzip2 which has independent x*100KB blocks, and video which (usually) has independent frames. You should be able to get 2-4 separate tasks out of that, which is good enough for CPUs.
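If you want to see the shape of it, here's a toy pigz-style version in Erlang: each chunk is compressed independently, which is exactly why the output differs from a sequential pass. (A sketch; the module name and 128 KB chunk size are just illustrative, and the pieces have to be decompressed chunk-by-chunk too.)

    -module(pcompress).
    -export([compress_chunks/1]).

    -define(CHUNK, 131072).  %% 128 KB

    %% Compress each chunk in its own process; the LZ window never
    %% spans a chunk boundary, so we trade a little ratio for parallelism.
    compress_chunks(Bin) ->
        Parent = self(),
        Pids = [spawn(fun() -> Parent ! {self(), zlib:compress(C)} end)
                || C <- split(Bin)],
        [receive {Pid, Z} -> Z end || Pid <- Pids].

    split(<<Chunk:?CHUNK/binary, Rest/binary>>) -> [Chunk | split(Rest)];
    split(Bin) -> [Bin].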
> Let's face it - it's not the language that is the problem. It would help you, but it would help you only with some percentage of the problems out there.
Let's not underestimate the help that a language can provide; writing interpreters is almost trivial in a Lisp because you can reuse pretty much every piece of its machinery for your own ends. Similarly, there is an entire class of problems that is nearly trivial in Prolog but pretty difficult to get right in other languages, simply because Prolog makes it easy to express rules and specifications that need to be met. Just look at an implementation of a sudoku solver in Prolog and compare it with some other language.
I feel that a language designed with concurrency in mind would make an entire class of problems much simpler to write. These languages are just gaining traction, so we have yet to see bigger and more significant examples. However, the "ants.clj" demo that Rich Hickey wrote in Clojure and some of the Erlang demos in Joe Armstrong's book have made me a believer.
certain algorithms, hence protocols, data structures, standards are not suited for parallel processing that well.
Sure, but for problems that can be parallelized you want a language to be able to express that parallelism. That may sound obvious, but many popular languages cannot do it. Let's not just give up on parallelism because it can't be used everywhere.
2 combined with 4 is a secret weapon when parsing record-oriented files from old mainframes or minicomputers. Replace 500 lines of Java with one--very long--line of Erlang.
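Assuming 2 and 4 are the pattern matching and bit syntax features, the trick looks something like this (a hypothetical 80-byte fixed-width record layout):

    -module(fixed_rec).
    -export([parse/1]).

    %% Hypothetical 80-byte record: 6-byte id, 30-byte name,
    %% 44-byte payload, all destructured in one pattern match.
    parse(<<Id:6/binary, Name:30/binary, Payload:44/binary>>) ->
        {Id, Name, Payload}.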
It's worth noting we (the team that's building Skynet) didn't set out to build a parallel language, we set out to build a distributed one. It turns out that a lot of the things that make Spin a good distributed language also make it a good parallel one.
tophercyll, can you give me an invite to Skynet?
Nice blog post, but you need to give references/links for Spin and Skynet. Otherwise it's just an ad trick...
I'm sharing one of the lessons we learned building our own programming language. I love how many language designers there are on HN!
We're working on an invite system, so everyone can benefit from Skynet! Skynet is better when you use it with others, so the invites will work for groups of friends.