
There are probably a lot of factors. I worked on CUDA code for around a year, and used to understand the landscape pretty well, but if I were to start a high-performance computing project today I'd probably take my lumps and go with OpenCL. There would be a lot of lumps.

First, CUDA is just more mature: there is a very large and well-established set of libraries for a lot of common operations, there is a decent-sized community, and Nvidia even produces specialized hardware (Tesla cards) designed just for CUDA.

Second, all that generic-ness of OpenCL doesn't come for free. With Nvidia you're only targeting one architecture: CUDA cards. Optimizing your kernels is much easier. OpenCL is just generically parallel, so you could end up with any sort of crazy heterogeneous high-performance computing environment to fiddle with (any number of CPUs and GPUs, each with different chipsets).
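
Just to give a flavour of the per-device fiddling involved, here's a rough sketch (the function name is mine and purely illustrative, error checking omitted) of asking the OpenCL runtime for a sensible work-group size instead of hard-coding one, since the preferred values differ between CPUs and the various GPU vendors:

    #include <CL/cl.h>

    /* Pick a work-group size per device instead of hard-coding one:
       the preferred multiple differs between CPUs and GPU vendors. */
    static size_t pick_local_size(cl_kernel kernel, cl_device_id device)
    {
        size_t preferred = 1, max_wg = 1;

        clGetKernelWorkGroupInfo(kernel, device,
            CL_KERNEL_PREFERRED_WORK_GROUP_SIZE_MULTIPLE,
            sizeof(preferred), &preferred, NULL);
        clGetKernelWorkGroupInfo(kernel, device,
            CL_KERNEL_WORK_GROUP_SIZE,
            sizeof(max_wg), &max_wg, NULL);

        /* Largest multiple of the preferred size that still fits. */
        size_t local = (max_wg / preferred) * preferred;
        return local ? local : 1;
    }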

I haven't used OpenCL myself, but almost purely anecdotally I have heard many people say that CUDA is often slightly faster[1] and the code is easier to write.

TL;DR: CUDA sacrifices flexibility for ease of development and performance gains. OpenCL wants to be everything for everyone, and comes with the typical burdens.

[1]: Maybe this is a result of OpenCL being more generic and so harder to optimize.




So why would you go with OpenCL? Is it portability? For a while I thought that was worth it but I am having a hard time remaining convinced of that.


I've been working on a rather large computation library using OpenCL. OpenCL is useful for providing an abstraction over multiple device types. If you are only interested in producing highly-tuned parallel code to execute on NVidia hardware, I suggest sticking to CUDA for the above reasons.

I utilised the OpenCL programming interface to write code that runs the same kernel functions on CPU and/or GPU devices (using heuristics to trade off latency/throughput), which afaik is not possible with the CUDA toolchain.
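
For illustration, a rough sketch of what that looks like on the host side (function and kernel names here are just placeholders, error checking omitted). Only the device-type flag changes between the CPU and GPU paths; the kernel source stays the same, and the latency/throughput heuristic just decides which type to request:

    #include <CL/cl.h>

    /* Build the same kernel source for whichever device type is requested
       (CL_DEVICE_TYPE_CPU or CL_DEVICE_TYPE_GPU); only the flag changes. */
    static cl_kernel build_scale_kernel(cl_device_type type,
                                        cl_context *ctx_out,
                                        cl_command_queue *q_out)
    {
        const char *src =
            "__kernel void scale(__global float *x, float a) {"
            "    x[get_global_id(0)] *= a;"
            "}";

        cl_platform_id platform;
        cl_device_id device;
        clGetPlatformIDs(1, &platform, NULL);
        clGetDeviceIDs(platform, type, 1, &device, NULL);

        *ctx_out = clCreateContext(NULL, 1, &device, NULL, NULL, NULL);
        *q_out = clCreateCommandQueue(*ctx_out, device, 0, NULL);

        cl_program prog = clCreateProgramWithSource(*ctx_out, 1, &src, NULL, NULL);
        clBuildProgram(prog, 1, &device, NULL, NULL, NULL);
        return clCreateKernel(prog, "scale", NULL);
    }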

TL;DR YMMV and horses for courses.


FYI regarding highly-tuned code -- an ex-ATI/AMD GPU core designer told me that the price you pay for writing optimized code in OpenCL versus the device-specific assembler is roughly 3x. Something to keep in mind if you're targeting a large enough system with OpenCL and you find spots that can't be pushed any faster.


Unlike previous versions, OpenCL 2.0 has been shown to be only about 30%[1] slower than CUDA and can approach comparable performance given enough optimisation.

Since I am working on code generation of kernels to perform dynamic tasks, I can't afford to write at the lowest level available. (I'm accelerating Python/Ruby routines, though, so OpenCL gives a significant bonus without much pain at all.)
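
Roughly what that looks like: the generated kernel is just a source string built at run time and handed to the OpenCL runtime compiler (names here are purely illustrative, error checking omitted):

    #include <stdio.h>
    #include <CL/cl.h>

    /* Generate kernel source at run time (here just baking a constant into
       it) and let the OpenCL driver compile it for whatever device we have. */
    static cl_kernel make_axpy_kernel(cl_context ctx, cl_device_id dev, float a)
    {
        char src[256];
        snprintf(src, sizeof(src),
            "__kernel void axpy(__global float *y, __global const float *x) {"
            "    size_t i = get_global_id(0);"
            "    y[i] += %ff * x[i];"
            "}", a);

        const char *p = src;
        cl_program prog = clCreateProgramWithSource(ctx, 1, &p, NULL, NULL);
        clBuildProgram(prog, 1, &dev, "-cl-fast-relaxed-math", NULL, NULL);
        return clCreateKernel(prog, "axpy", NULL);
    }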

[1] http://dl.acm.org/citation.cfm?id=2066955 (Sorry about the paywall, I access through University VPN)


OpenCL is a standards compliant compute API that is supported by Nvidia, AMD, Intel, IBM, Sony, Apple, and several other companies.

Nvidia is in the slow process of eventually discontinuing further CUDA support, and it is recommended to write new code in OpenCL only.


> Nvidia is in the slow process of eventually discontinuing further CUDA support, and it is recommended to write new code in OpenCL only.

[Citation needed]

Their OpenCL support is still limited to v1.1 (released in 2010), while just a few months ago they released a new major version of CUDA with tons of features nowhere to be seen in (any vendor's) OpenCL.


Yeah, you're going to have to back that up. CUDA is meant to be supported on all future nVidia cards.


How come, given that only CUDA has direct support for C++ and FORTRAN compilers that target the GPU?


Furthermore Python[1], Matlab[2], and F#[3], plus parallel device debuggers (TotalView, Allinea) and profilers (NVIDIA). OpenCL has a long way to go to catch up, if it ever does (there might be a better standard coming further down the line).

[1] http://www.techpowerup.com/181585/NVIDIA-CUDA-Gets-Python-Su...

[2] http://www.mathworks.com/discovery/matlab-gpu.html

[3] https://www.quantalea.net/media/pdf/2012-11-29_Zurich_FSharp...


Portability, accessibility, etc. "Because it's hard" is never a good excuse to not do something.


> "Because it's hard" is never a good excuse to not do something.

Well that's certainly not true in the general case.


On the contrary, I'd argue that it's not true in specific cases.

"Because it's hard" is a cop-out.

"It's too hard to accomplish given constraint [X]" where X is a deadline, financial constraints, or other real/tangible resource limitations might be one thing. But if you're working on your own timeline on some sort of open-source project, or there is nothing external preventing you from acquiring the expertise/resources to conquer the hard problem, then "Because it's hard" is an absolutely shitty excuse to not do something.


I suggest you read "It's too hard" when written by other developers as, "It's too hard [given that I spend N hours a week on this and would rather actually accomplish something in the next two months than learn the 'right' API]." Or, "It's too hard [given various constraints that I'm not going to explain to you but are valid to me.]" It'll save you having to give speeches about shitty excuses.

That said, if it makes sense for your project, make it happen! :)


Even if the long-term goal is more portable GPU support, it still makes some sense to get a CUDA implementation up first if it is easier to get to. That allows real-world testing sooner, and they can always move to OpenCL later once they know more.


Just out of curiosity: how often have you seen that happen (not only with GPUs, but with technology decisions overall)? In my (limited) experience the later change never happens, most of the time because management has a new idea/project you have to attend to.


It's often a great excuse to do something else instead. If you can't get what you need done without the more difficult option, sure, do it. But there's no sense in going down the harder path needlessly.

I'm not trying to convince you that you don't need it or shouldn't do it; I was looking for a data point about what you find valuable in OpenCL.


Well, the portability can be a killer feature. I've been writing quite a bit of OpenCL code lately. I have an AMD GPU, so CUDA is a non-starter. I'll eventually replace the AMD card with an NVIDIA one, so it won't be as big of a problem, but my OpenCL code will still be fine then.


CUDA code is GPU-only; OpenCL can run on both CPU and GPU. There are limits, so it isn't all win, but it means you don't have to implement twice.


You go with OpenCL so that you can use AMD's Fusion processors, which will soon allow a GPU and CPU to share main memory.


You can do this already, and also with Intel's Ivy Bridge (at least on Windows).
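
For example, something like this zero-copy pattern (illustrative only, error checking omitted): on an APU or integrated GPU the driver can back the buffer with memory both the CPU and the GPU address directly, so mapping it on the host avoids an explicit transfer:

    #include <CL/cl.h>

    /* Allocate a buffer the host can map directly; on shared-memory parts
       the map/unmap is (ideally) zero-copy rather than a real transfer. */
    static cl_mem make_shared_buffer(cl_context ctx, cl_command_queue q, size_t n)
    {
        cl_mem buf = clCreateBuffer(ctx, CL_MEM_ALLOC_HOST_PTR | CL_MEM_READ_WRITE,
                                    n * sizeof(float), NULL, NULL);

        float *host = (float *)clEnqueueMapBuffer(q, buf, CL_TRUE, CL_MAP_WRITE,
                                                  0, n * sizeof(float),
                                                  0, NULL, NULL, NULL);
        for (size_t i = 0; i < n; ++i)   /* CPU writes through the mapping... */
            host[i] = (float)i;
        clEnqueueUnmapMemObject(q, buf, host, 0, NULL, NULL);
        return buf;   /* ...and a kernel then reads the same allocation. */
    }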


It is the same story as OpenGL vs. DirectX: it is all about the support developers get from the vendors.



