Parallelism and the Limits of (programming) Languages (whilefalse.blogspot.com)
62 points by gtani on April 9, 2012 | 22 comments



"Languages can force developers to think about applying logic over their data (and thus help with parallelism as a mindset), but they will never solve the problem of automated parallelism without the VM to tune it in the background."

Unfortunately, the evidence that anything at all can save us via automated parallelism, with or without VM assistance, isn't very good. Studies that apply automatic parallelization to real code written without parallelism in mind consistently find that "normal" code simply isn't parallelizable, even in theory, beyond single-digit factors.

There Is No Escape.


> Studies that apply automatic parallelization to real code written without parallelism in mind consistently find that "normal" code simply isn't parallelizable, even in theory, beyond single-digit factors.

If you assume "normal code" is traditional imperative code with destructive state update, that is probably correct (although I'd be curious to see which studies you're referring to). But there are important examples of domain-specific languages that are relatively simple to automatically parallelize. For example, mere mortals can easily write SQL that can be efficiently executed on thousands of machines (given a sufficiently large data set, of course) -- the database system takes care of data partitioning, replication, and fault tolerance automatically.
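
To make the intuition concrete, here's a toy Python sketch (nothing like a real database engine, just the shape of the idea) of why something like SUM parallelizes so well: each partition computes a partial result independently, and the combine step is associative.

    # Toy sketch: how "SELECT SUM(x) FROM t" can be split across machines.
    partitions = [[1, 2, 3], [4, 5], [6, 7, 8, 9]]   # made-up data shards

    # Each partial sum is independent, so a real system runs these in
    # parallel on separate machines; addition is associative, so the
    # order of combining doesn't matter.
    partials = [sum(p) for p in partitions]
    total = sum(partials)   # cheap final combine step
    print(total)            # 45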

Another example is the work of Pat Hanrahan and others on DSLs for solving PDEs over meshes: http://liszt.stanford.edu/liszt_sc2011.pdf

Obviously, whether these techniques can be extended to tackle more general-purpose programming tasks is open to question.


I've seen such studies on functional programming languages, too.

If you switch programming languages, in the context of this argument you've "already lost". The point of automatic parallelization is basically to give us speedups without having to radically change how we program. Changing languages, using lots of mini-DSLs, etc. all count as radical change as much as having to manually parallelize.

Personally I absolutely agree that the only way this is going to work is with new languages that make it easier, but at the same time it's never going to be fully automatic... and those new languages aren't going to look much like C(#/++/). Haskell is closer and even that probably isn't different enough.


I tend to agree.

Let's assume a sufficiently smart compiler can parallelize 90% of our hot code. By Amdahl's Law, we cannot achieve a speedup greater than 10 and are therefore stuck. Beyond a certain point, adding more cores barely improves our running time, because it is dominated by the sequential code.
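
To put numbers on it, Amdahl's Law says the speedup for parallel fraction p on n cores is 1 / ((1 - p) + p/n); with p = 0.9 that's capped at 10 no matter how many cores you throw at it. A quick Python check:

    # Amdahl's Law: speedup with parallel fraction p on n cores.
    def speedup(p, n):
        return 1.0 / ((1.0 - p) + p / n)

    for n in (2, 10, 100, 1000):
        print(n, round(speedup(0.9, n), 2))
    # 2 1.82
    # 10 5.26
    # 100 9.17
    # 1000 9.91   (approaching the 1 / (1 - 0.9) = 10 ceiling)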


I have a hunch that the trick to writing the sufficiently smart compiler of legend is in deriving nondeterministic parallel cousins to sequential algorithms. In simple terms, if you could quickly compute potential solutions and verify them, then you could get a speedup in the average case. I doubt NC=P (that no problem is inherently sequential), but we could perhaps tackle many real-world problems with this line of thinking.
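
Something like this, as a very hedged Python sketch (random_candidate and is_valid are hypothetical problem-specific hooks, not real library calls): race a batch of cheap random guesses across workers and keep the first one that verifies.

    # Sketch: nondeterministic guess-in-parallel, verify-cheaply search.
    # random_candidate(rng) and is_valid(c) are hypothetical hooks you'd
    # supply for your particular problem.
    from concurrent.futures import ProcessPoolExecutor, as_completed
    import random

    def attempt(seed):
        rng = random.Random(seed)
        candidate = random_candidate(rng)    # cheap nondeterministic guess
        return candidate if is_valid(candidate) else None

    def parallel_search(workers=8, tries=1000):
        with ProcessPoolExecutor(max_workers=workers) as pool:
            futures = [pool.submit(attempt, s) for s in range(tries)]
            for f in as_completed(futures):
                if f.result() is not None:
                    return f.result()        # first verified guess wins
        return None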


This sounds somewhat similar to (software) transactional memory.


Most people have 64-bit, GHz-class CPUs; even if you just use one core, that's a lot of processing power. So I don't think most code needs to be parallel until multiple users, AI, or large data sets show up. And better yet, those are generally the three easiest cases for multi-threaded code.

That said, there is a lot of slow and terrible code out there, but dumping processing power on a terrible algorithm is rarely the best solution.


The key to writing highly parallel programs in my opinion is to have a set of primitive, highly parallelizable operations (like MapReduce) and languages that make it natural to use those operations, either implicitly or explicitly (see http://static.googleusercontent.com/external_content/untrust... for an example of such a language).
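
As a rough illustration (a Python sketch, not Sawzall), here's word count written only in terms of an embarrassingly parallel map and an associative reduce:

    # Word count using only parallelizable primitives.
    from collections import Counter
    from functools import reduce
    from multiprocessing import Pool

    docs = ["the cat sat", "the dog sat", "the cat ran"]

    def count_words(doc):          # "map" step: independent per document
        return Counter(doc.split())

    if __name__ == "__main__":
        with Pool() as pool:
            partials = pool.map(count_words, docs)     # parallel map
        totals = reduce(lambda a, b: a + b, partials)  # associative combine
        print(totals)   # Counter({'the': 3, 'cat': 2, 'sat': 2, ...})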

For a very different example of a more general-purpose language designed for large amounts of parallelism, see Go. (Many of the same people were involved.) Note that it is much easier in that language to create accidental bottlenecks that are hard to track down than in Sawzall. (That's the trade-off when you let the programmer have more control: they can do more interesting things, but they also get the rope to hang themselves.)


ANI tries to do this, although I still haven't figured out if it's a joke or not.

http://code.google.com/p/anic/


> I've always thought that the next truly big leap in VMs and languages will be those that can do smart things with threading without making developers sweat too much.

I'm skeptical we'll see something like this for _nontrivial_ applications, especially highly tuned applications (search, financial, etc).

I think we will have examples of toy applications that can be automatically parallelized.

Parallel LINQ (PLINQ) is a good example. You can easily make coarse-grained operations concurrent. However, for truly fine-grained stuff, the compiler isn't "smart enough." (see On Being Sufficiently Smart - http://prog21.dadgum.com/40.html)
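
For what it's worth, here's a rough Python analogue of the same point (not PLINQ, obviously): coarse-grained tasks amortize the scheduling overhead, while fine-grained per-element tasks drown in it.

    # Coarse-grained parallelism: each task is big enough to pay its way.
    from concurrent.futures import ProcessPoolExecutor

    def heavy(chunk):               # substantial work per task
        return sum(x * x for x in chunk)

    data = list(range(1_000_000))
    chunks = [data[i:i + 100_000] for i in range(0, len(data), 100_000)]

    if __name__ == "__main__":
        with ProcessPoolExecutor() as pool:
            print(sum(pool.map(heavy, chunks)))   # coarse: parallelism wins
        # Shipping each element to a worker just to compute x * x would
        # spend far more time on overhead than on the actual work.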

I hope I'm wrong and we see some major advances in magic err I mean compiler design.


I'm wondering if the solution to the Sufficiently Smart Compiler is to have the Sufficiently Stupid Language. Automated parallelism becomes easier if you only give users parallelizable operators. That sounds terrible, but maybe it isn't that bad (I don't know). Perhaps users should always be working on sets of data (a number doesn't exist, just a set with one number in it); then the compiler only has to be smart enough to turn that set operator into an operation on a single integer. That is, deparallelize it. Is this why array languages tend to work well for these things?
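
Roughly like this, as a toy Python sketch (not a real such language, just the semantics): every value is a collection and every operator is a whole-collection operator, so the compiler's only job is spotting singletons and "deparallelizing" them.

    # Toy "everything is a set" semantics: operators exist only on collections.
    def plus(xs, ys):
        return [x + y for x, y in zip(xs, ys)]   # elementwise, trivially parallel

    a = [2]     # "the number 2" is just a one-element collection
    b = [3]
    print(plus(a, b))                    # [5]; a smart compiler would lower
                                         # this singleton case to a plain add
    print(plus([1, 2, 3], [4, 5, 6]))    # [5, 7, 9]; the general parallel case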


Arrays usually cause problems in the parallelism world, at least if you're trying to modify them. The standard alternative is the actor model: you restructure your program so it never touches shared data directly, only acts as if it does. Objects handle messages asynchronously to modify themselves, so a message can come from anywhere.
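
A minimal actor sketch in Python (a thread plus a mailbox queue standing in for a real actor runtime): the counter's state is only ever touched by its own message loop, so messages can safely come from anywhere.

    # Minimal actor: private state, mutated only by its own message loop.
    import threading
    import queue

    class CounterActor:
        def __init__(self):
            self.mailbox = queue.Queue()
            self.count = 0    # private: only the actor's thread touches this
            threading.Thread(target=self._run, daemon=True).start()

        def _run(self):
            while True:
                msg = self.mailbox.get()   # messages arrive from any thread
                if msg == "incr":
                    self.count += 1        # handled serially, so no locks

        def send(self, msg):
            self.mailbox.put(msg)

    actor = CounterActor()
    for _ in range(100):
        actor.send("incr")    # safe from anywhere; no shared-state races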

That's one solution, but the biggest hurdle most parallel code faces is that most programmers think sequentially. When they try to think in parallel it doesn't come naturally, and parallelism degrades into a magic box you throw work at. IMHO that's usually because, even though parallelism is an intuitive concept, the APIs for taking advantage of it usually suck.


You wouldn't have array mutation in this case, since that isn't a very parallelizable operation. My question is whether we could construct a usable language where every operator is a parallel operator, at which point the optimizing compiler is trying to find situations where it can treat values as sequential code, rather than situations where it can treat sequential code as parallelizable.


I'm curious whether recent changes at Google (indexing special characters) will change the TIOBE index (which is based, among other signals, on search engine results). Maybe C gets extra points that should go to C# or Objective-C? ;-)


Just an aside, “whilefalse” is a great name for a programming blog.


So it can be optimized away?!


There was a time when I'd have loved to see a Niagara-based desktop workstation. Making Gnome and Firefox run well on a dozen underpowered cores would have done a lot towards making parallelism more mainstream. I think, however, that train has left the station.

I wish the GPGPU crowd good luck.


Hysterically annoying typo in the single word that (it seems) was added to the title as editorial. Not a win.


Sorry. Somebody, not me, fixed it.


If you consider JVM startup and JIT compilation to be part of the boot procedure of a widely parallel cluster, then we're in a pretty good spot these days for doing parallelism with just a little code.


What does that even mean?


Have you messed with Hadoop?



