OpenCL is a very sound design on a technical level. Unfortunately, it's mostly designed to be a good target for code generation/libraries, and is awkward to use directly. The problems are that everything is very explicit and spelled out, and the kernel language itself is based on C, not C++, and segmented from the rest of the code base. In practice, you use OpenCL API calls to submit strings containing code to the OpenCL backend, which then compiles it and hands you back an opaque compiled program that you can execute. This makes it hard to write modular and maintainable code with OpenCL directly.
The issues go away if you use a good OpenCL frontend. PyOpenCL for Python goes a long way towards this, and is not really any more awkward than the corresponding PyCUDA. Higher-level languages that generate OpenCL code, like Lift[0] or Futhark[1] (tooting my own horn here), remove the awkwardness completely.
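To make the "submit a string, get back an opaque program" workflow concrete, here is a minimal sketch of a vector add through PyOpenCL. It assumes PyOpenCL and NumPy are installed and that some OpenCL device is available; the names `KERNEL_SOURCE` and `vector_add` are made up for illustration. Note that the kernel is still just a string of C handed to the driver at runtime, even though the frontend trims most of the host-side boilerplate.

```python
# Kernel code lives in a plain string, separate from the host language.
KERNEL_SOURCE = """
__kernel void vector_add(__global const float *a,
                         __global const float *b,
                         __global float *out)
{
    int i = get_global_id(0);
    out[i] = a[i] + b[i];
}
"""


def vector_add(a, b):
    """Add two contiguous float32 NumPy arrays of equal length on an OpenCL device."""
    # Imported lazily so this module still loads on machines without OpenCL.
    import numpy as np
    import pyopencl as cl

    ctx = cl.create_some_context()   # picks (or asks for) a platform/device
    queue = cl.CommandQueue(ctx)

    mf = cl.mem_flags
    a_buf = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=a)
    b_buf = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=b)
    out_buf = cl.Buffer(ctx, mf.WRITE_ONLY, a.nbytes)

    # The string is compiled at runtime; `program` is the opaque result.
    program = cl.Program(ctx, KERNEL_SOURCE).build()
    program.vector_add(queue, a.shape, None, a_buf, b_buf, out_buf)

    out = np.empty_like(a)
    cl.enqueue_copy(queue, out, out_buf)   # copy the result back to the host
    return out
```

The equivalent host code against the raw C API is several times longer, which is roughly the awkwardness being described above.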
You're not wrong there. However, with the advent of SPIR-V it is possible to write code in whatever language you please (with the caveat that at the moment you need an LLVM backend, using https://github.com/thewilsonator/llvm-target-spirv or the Khronos repo I forked it from).
Then comes the issue of making the code-generator-friendly interface user-friendly, which I have done for D so that you get the ease of use of CUDA.
I think that has been the major failure from Khronos: being stuck with a C mentality for their APIs.
OpenGL ES only took off thanks to gaming on the iPhone, and now is deprecated on Apple platforms.
Vulkan still lives in a C world, and the semi-official C++ bindings only exist thanks to NVIDIA.
OpenCL waited too long to support C++ and Fortran, and to provide an infrastructure for compiler writers to add GPU support to their own languages. And two years later the majority of drivers are not there yet.
Khronos' problem is that they want to do everything, or not do it at all. SYCL would be 'real' C++ support, way beyond the 'OO wrapper around the C API' that already exists in the OpenCL headers. NVIDIA is more 'good enough? Ship it', which has shown itself many times in the past to be a dominant strategy. (As much as I hate to admit that.)
It seems like a C API is a much better choice if you want your library to be callable from many different languages. What is a reasonable alternative without giving that up?
Having a bytecode and having a C API are completely orthogonal. And thankfully we now have both with SPIR-V, although the C API needs to be wrapped unless you like writing C-like code in higher-level languages that could otherwise abstract away most of the tedium.
Yes, driver support is a bit lacking, although I hope that I can convince the OpenCL working group of the need to get a backend (such as https://github.com/thewilsonator/llvm-target-spirv) into mainline LLVM so that writing drivers becomes easier for vendors.
D3D, Vulkan, OpenGL (optionally) etc. do use a bytecode format, but that only covers shader modules, which are a small part of the overall API. I'm not that familiar with CUDA, but it looks like a large portion of it is delivered as a C API, judging by the bindings I see online.
The OpenCL kernel language is very similar to the CUDA one, before they started adding more C++ support. It's pretty easy to translate between the two if you don't use hw intrinsics.
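As a rough illustration of how close the two kernel dialects are, here is a toy source-to-source sketch that rewrites a simple CUDA kernel into OpenCL C by text substitution. The rule table and the `translate` helper are hypothetical and deliberately incomplete: they only cover the x dimension, `__shared__`, `__syncthreads()`, and pointer parameters in the kernel signature (which are all assumed to live in global memory), and they would not survive templates, hardware intrinsics, or anything else beyond plain C.

```python
import re

# Hypothetical, partial mapping of CUDA spellings to OpenCL C spellings.
RULES = [
    (r"\b__global__\b", "__kernel"),
    (r"\b__shared__\b", "__local"),
    (r"\b__syncthreads\(\)", "barrier(CLK_LOCAL_MEM_FENCE)"),
    (r"\bthreadIdx\.x\b", "get_local_id(0)"),
    (r"\bblockIdx\.x\b", "get_group_id(0)"),
    (r"\bblockDim\.x\b", "get_local_size(0)"),
    (r"\bgridDim\.x\b", "get_num_groups(0)"),
]


def translate(cuda_src):
    """Rewrite a simple, C-only CUDA kernel into OpenCL C (toy sketch)."""
    out = cuda_src
    for pattern, replacement in RULES:
        out = re.sub(pattern, replacement, out)

    # OpenCL requires an address-space qualifier on kernel pointer arguments;
    # this sketch assumes every pointer parameter points to global memory.
    def qualify(match):
        params = [p.strip() for p in match.group(2).split(",")]
        params = ["__global " + p if "*" in p else p for p in params]
        return match.group(1) + ", ".join(params) + ")"

    return re.sub(r"(__kernel\s+void\s+\w+\s*\()([^)]*)\)", qualify, out)


CUDA_SRC = """
__global__ void scale(float *data, float factor, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;
}
"""

if __name__ == "__main__":
    print(translate(CUDA_SRC))
```

A real port still has to deal with the host side (buffers, queues, launch configuration), which is where the two APIs differ far more than the kernel languages do.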
And segmenting the codebase is a MAJOR feature. With CUDA you are stuck on an old compiler until NVIDIA issues an update. How anyone can think this is a good idea...
People still write regular shaders in languages that are much more C like.
People writing GPGPU code are few. Most of the DL GPU use is in Python through several layers, and in the end you are running hand-written SASS assembly sitting in an NVIDIA DLL or whatever.
I guess some people must think it's handy to have C++ support in GPU kernels, or they wouldn't have added the feature. But that it drives the technology is hard to believe.
Also, the fact that OpenCL lost to CUDA for being stuck in C for so long shows what most GPU devs actually prefer.
Aw come on, you're sure it has nothing to do with the largest GPGPU vendor pushing CUDA very heavily and intentionally gutting their OpenCL tools? Or putting out a ton of very high performance libraries with no OpenCL equivalent? Putting out a shitton of marketing and tutorial videos for CUDA only?
Yeah, that surely was totally unrelated.
Also pushing a proprietary standard goes faster than a standardized one. No surprise there.
If AMD, Intel and embedded OEMs actually produced quality OpenCL drivers, debugger and IDE support, and libraries that could match CUDA productivity, maybe devs would bother to use C with OpenCL.
Even Google decided to create their Renderscript dialect instead of supporting OpenCL on Android.
I am saying that if the other GPU vendors had bothered to actually provide a competing technology stack that was worth the pain of using plain C, maybe GPU devs would have bothered.
It's purely political. If NVIDIA had a team of 3-5 people working on their OpenCL drivers/tooling, OpenCL would be as good as CUDA. NVIDIA would be stupid to do that, of course. Hence the Khronos play to fold OpenCL into Vulkan. We'll see how that plays out.