
My counterexample to this would be CUDA. It is so much more successful than OpenCL (for many reasons), and so much carefully tuned library code and so many dev tools exist for CUDA, that other options are chosen only for mobile platforms, where a duplicate port is required.

It is conceivable that a company like WD could implement a Linux BSP and pay for ports and tuning of high-level tools, but it would be a significant task.

The performance analysis of synchronous systems like MPI and MapReduce over 4k cores is relatively obvious, but for next-generation data-intensive tasks and asynchronous compute it isn't.



