Does a 25-second process on a 32-node cluster really count as a "supercomputer" task? By simple arithmetic (25 s × 32 nodes ≈ 800 node-seconds), you can run that on your desktop in about 15 minutes. If that's really your task, and these are your numbers, and latency-to-answer is your only metric (as opposed to other stuff, like "cost", that people in the real world worry about) then sure: EC2 makes sense. But these look like some terribly cooked numbers to me. I mean, how much time are you really going to spend optimizing something you can run while you trot to Starbucks for coffee?
I'd be curious what the answers (including cost) are when you scale the problem to days of runtime on hundreds of nodes. I rather suspect the answers would be strongly tilted in favor of the institutional supercomputer facilities, who after all have made a "business" out of optimizing this process. EC2 has other goals.
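To make the scaling question concrete, here's a back-of-envelope model of what the EC2 bill looks like as the job grows. Every number in it is an illustrative assumption (the $0.10/node-hour rate, the node counts, the round-up-to-the-hour billing), not measured data:

```python
import math

def ec2_cost(nodes, compute_s, rate_per_node_hour):
    """Estimated EC2 cost, assuming per-node billing rounded up
    to whole hours (the classic on-demand billing model)."""
    hours = math.ceil(compute_s / 3600)
    return nodes * hours * rate_per_node_hour

# The small job from this thread: 25 s of compute on 32 nodes.
# You still pay for a full node-hour on each node.
small = ec2_cost(nodes=32, compute_s=25, rate_per_node_hour=0.10)

# A scaled-up job of the kind asked about: 3 days on 256 nodes.
large = ec2_cost(nodes=256, compute_s=3 * 24 * 3600, rate_per_node_hour=0.10)
```

At that point the comparison stops being about allocation latency and becomes a straight price-per-node-hour contest, which is exactly the game the institutional facilities have optimized.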
Exactly. This is a total misapplication of the power of EC2.
EC2 comes into its own when you have a compute-bound task that would normally run for days or weeks on a single computer or small cluster. You want your answer significantly sooner and you're willing to pay for that.
15 minutes? Go for a coffee break. If you run hundreds of those every day then it starts to make sense again.
Another (very) valid reason to use EC2 is to avoid going through 'purchasing'. If it's billed as just a service, you'll probably be able to slip it in under the radar.
The NAS Parallel Benchmarks suite is used to represent tasks that appear often in scientific computing. It was designed so that running it on a computing platform tells you how amenable that platform is to scientific computing.
Yes, but it's a benchmark of the hardware environment. This is more than anything else a test of resource allocation latency. The poster posits that EC2 can get the boxes working faster and get you your answers sooner.
But the result here is that the benchmark completes with "high probability" within about 6.5 minutes on EC2, for a task that takes only 15 minutes to run on your desktop CPU. That model wildly overestimates the impact of allocation latency relative to the cost of the computation itself.
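A quick sanity check on those figures (the 25 s, 6.5 min, and 15 min numbers come from this thread; the breakdown is just my own arithmetic):

```python
# Time-to-answer breakdown for the benchmark discussed above.
compute_s = 25          # actual parallel compute on the 32 EC2 nodes
total_s   = 6.5 * 60    # "high probability" completion time on EC2
desktop_s = 15 * 60     # same job run serially on one desktop

# Everything that isn't compute is provisioning/allocation overhead.
overhead_s = total_s - compute_s
frac_overhead = overhead_s / total_s   # well over 90% of time-to-answer
```

So for this workload, spinning up the cluster dominates the time-to-answer, and the whole exercise saves roughly 8.5 minutes over just using the desktop.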