What's faster, a sportscar or a truck? It depends on what you are trying to move.
A load of bricks will be moved faster by truck, even though in an absolute sense the sportscar is faster...
If you are doing vector processing and have a hard-to-parallelize problem then a supercomputer is probably the only way to go.
Otherwise EC2 will possibly be faster (it still depends on lots of subtle factors, such as how compute intensive vs communications intensive your application is).
The only way to know for sure is to do a limited benchmark on the core of your problem in order to figure out which architecture works best.
Reminds me of the joke about the highest bandwidth available being a truckload of drives barreling down the highway. While true, not necessarily useful.
In fact, it could be quite useful. I pay for a seedbox for my torrent downloading, and it would be great to be able to pay to have a hard drive from the seedbox shipped to me when I fill it up.
I think a more appropriate analogy is how could we move a hundred tons of earth more efficiently? With a dump-truck or with N commodity pick-up trucks? How hard is it to separate a load among the N trucks and how many are there?
Does a 25 second process on a 32 node cluster really count as a "supercomputer" task? By definition, you can run that on your desktop in 15 minutes. If that's really your task, and these are your numbers, and latency-to-answer is your only metric (as opposed to other stuff, like "cost" that people in the real world worry about) then sure: EC2 makes sense. But these look like some terribly cooked numbers to me. I mean, how much time are you really going to spend optimizing something you can run while you trot to Starbucks for coffee?
I'd be curious what the answers (including cost) are when you scale the problem to days of runtime on hundreds of nodes. I rather suspect the answers would be strongly tilted in favor of the institutional supercomputer facilities, who after all have made a "business" out of optimizing this process. EC2 has other goals.
Exactly. This is a total misapplication of the power of EC2.
EC2 comes into its own when you have a compute bound task that would normally run for days or weeks on a single computer or small cluster. You want your answer significantly sooner and you want to pay for that.
15 minutes? Go for a coffee break. If you run hundreds of those every day then it starts to make sense again.
Another (very) valid reason to use EC2 is to avoid going through 'purchasing': if it is just a service, you'll probably be able to slip it in under the radar.
The NAS Parallel Benchmarks are used to represent tasks that appear often in scientific computing. They were designed so that running them on a computing platform would tell us how well suited that platform is to scientific computing.
Yes, but it's a benchmark of the hardware environment. This is more than anything else a test of resource allocation latency. The poster posits that EC2 can get the boxes working faster and get you your answers sooner.
But the result here is that the benchmark will complete with "high probability" within about 6.5 minutes on EC2 for a task that only takes 15 minutes to run on your desktop CPU. That model is wildly overestimating the impact of latency on the computation cost.
<rant>These Seymour Cray quotes have been obsoleted by changes in technology, and trotting them out yet again just displays ignorance of those changes.</rant>
There are no "strong oxen" in today's world; both supercomputers and EC2 are clusters using more or less the same processors.
Some problems demand shared-memory architectures. The fact that those who work on such problems lack the funding to support a large market in shared-memory machines does not negate the problems' existence.
I assume that Cray was talking about a uniprocessor, not SMP. Besides, it's irrelevant to this article which was comparing NCSA's shared-nothing cluster against EC2's shared-nothing cluster.
There is a -limited- market for 'real' supercomputers, and with that I mean computers that have been geared towards a special class of problems. They too exist in SMP versions, but it would be a serious mistake to think of the individual machines in those arrays as comparable to a machine in a run-of-the-mill beowulf arrangement:
The reason these machines still have a right to exist is that not all problems are solvable in a massively parallel fashion. Some parallelization may be possible, but the speedup from such rearranging of the problem has an upper limit. That's where these machines come into their own.
Cray didn't make uniprocessors. The reason his supercomputers were faster (and why Cray Computer Corporation is still a darn good supercomputer company) was advances in moving data among lots of processors.
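The upper limit on speedup mentioned above is usually stated as Amdahl's law. Here is a minimal Python sketch of it; the function names and the 90% parallel fraction in the demo are illustrative assumptions, not numbers from this thread.

```python
def amdahl_speedup(parallel_fraction: float, workers: int) -> float:
    """Speedup predicted by Amdahl's law when `parallel_fraction` of the
    runtime can be spread evenly across `workers` processors."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)


def amdahl_limit(parallel_fraction: float) -> float:
    """Speedup ceiling as the number of workers grows without bound."""
    serial_fraction = 1.0 - parallel_fraction
    return float("inf") if serial_fraction == 0.0 else 1.0 / serial_fraction


if __name__ == "__main__":
    # Hypothetical workload: 90% of the runtime parallelizes.
    for n in (1, 8, 32, 1024):
        print(n, round(amdahl_speedup(0.9, n), 2))
    print("ceiling:", round(amdahl_limit(0.9), 2))
```

Under this model, adding nodes past a certain point buys almost nothing, and only a faster serial part (faster individual processors or interconnect) moves the ceiling.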
I question the premise of this question and analysis! There is an enormous breadth of work done on clusters vs supercomputers already. It turns out that sufficiently parallelizable ( http://en.wikipedia.org/wiki/Embarrassingly_parallel ) tasks can be accomplished much more efficiently on clusters. They've been en vogue in academia for years, I remember playing with the cluster at ND in 2002. Just wow.
Depends on utilisation. If you own a cluster and are able to keep it busy 24x7 then it pays off pretty quickly. Amazon have to have enough spare idle capacity to handle unexpected customer loads and make a profit - you are paying for this.
Good point. I wonder at what % utilization a cluster beats EC2 in terms of $/computation. Actually I'm sure there are all sorts of variables, and it becomes an optimization problem, but I think it would be cool to see an analysis of this in terms of $, computation power (time/jobsize or something), etc.
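A first-order version of that break-even question can be sketched in a few lines of Python. Everything here is an assumption for illustration: the function names, the cost model (an owned node costs the same whether busy or idle, a cloud node is paid for only while used), and the prices in the demo, which are made up and not real EC2 or hardware prices.

```python
HOURS_PER_YEAR = 365 * 24


def owned_cost_per_node_hour(purchase_price: float,
                             lifetime_years: float,
                             yearly_operating_cost: float) -> float:
    """Cost of one wall-clock hour of an owned node, busy or idle."""
    amortized = purchase_price / (lifetime_years * HOURS_PER_YEAR)
    operating = yearly_operating_cost / HOURS_PER_YEAR
    return amortized + operating


def break_even_utilization(owned_hourly: float, cloud_hourly: float) -> float:
    """Fraction of time the owned node must be busy for its cost per
    *useful* hour to match the cloud price. A value above 1.0 means the
    owned node never catches up under this model."""
    return owned_hourly / cloud_hourly


if __name__ == "__main__":
    # Hypothetical numbers: a 3000-dollar node kept for 3 years, 500 dollars
    # per year for power and administration, versus 0.40 dollars per hour.
    owned = owned_cost_per_node_hour(3000.0, 3.0, 500.0)
    print("owned cost per hour:", round(owned, 3))
    print("break-even utilization:", round(break_even_utilization(owned, 0.40), 2))
```

This ignores many of the variables mentioned above (queue wait, data transfer, interconnect quality, staff time), so treat it as a starting point for the optimization problem rather than an answer.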
Cluster scheduling is a huge area. I used to work on MPI clusters and it is an art to balance CPU, bandwidth and propagation time to pick the optimum number of processors for a particular algorithm.
Especially on commodity Ethernet-based MPI: it doesn't do broadcast, so shipping a Gb common dataset to 64 nodes can take a lot longer than actually doing the calculation.
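To get a feel for why shipping the common dataset can dominate, here is a rough Python model comparing a root that sends the data to every node one after another with a binomial-tree relay, where every node that already has the data forwards it in each round. The function names, the 1 GB size and the 1 Gbit/s link are assumptions for illustration, and the model ignores latency, contention and whatever a particular MPI implementation actually does.

```python
def sequential_unicast_seconds(size_bytes: float, nodes: int,
                               link_bytes_per_s: float) -> float:
    """Root sends the full dataset to each of the other nodes in turn."""
    return (nodes - 1) * size_bytes / link_bytes_per_s


def binomial_tree_seconds(size_bytes: float, nodes: int,
                          link_bytes_per_s: float) -> float:
    """Holders double each round, so ceil(log2(nodes)) rounds are needed."""
    rounds = (nodes - 1).bit_length()  # integer ceil(log2(nodes)) for nodes >= 1
    return rounds * size_bytes / link_bytes_per_s


if __name__ == "__main__":
    size = 1e9      # assumed 1 GB dataset
    link = 125e6    # assumed 1 Gbit/s link, i.e. 125 MB/s
    print("sequential:", sequential_unicast_seconds(size, 64, link), "s")
    print("tree:      ", binomial_tree_seconds(size, 64, link), "s")
```

Either way the distribution cost grows with the node count while the per-node compute time shrinks, which is why picking the optimum number of processors is a balancing act.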
Strange -- I always just sort of assumed that since they are making big clusters, they could spend the extra $$ for a good multicast switch, and that MPI did IP multicast. (A quick googling shows me to be wrong...)
My understanding was that a lot of research clusters actually have pretty low utilization. I believe the cloud computing whitepaper David Patterson put out in the spring had data on that.