Being a C++ dev, I think you're on-point about the reason people still choose C+...

yuushi · on March 9, 2018

> SIMD

Intrinsics are directly callable from C++.
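As a rough illustration of what "directly callable" looks like, here is a minimal sketch assuming an x86 target with SSE2. The function name `add_floats` and the `HAVE_SSE2` macro are made up for this example, and the scalar fallback is only there so the snippet still builds elsewhere.

```cpp
#include <cstddef>

#if defined(__SSE2__) || defined(_M_X64)
#include <emmintrin.h>  // SSE/SSE2 intrinsics
#define HAVE_SSE2 1     // hypothetical feature macro for this sketch
#else
#define HAVE_SSE2 0
#endif

// Hypothetical helper: out[i] = a[i] + b[i] for n floats.
void add_floats(const float* a, const float* b, float* out, std::size_t n) {
    std::size_t i = 0;
#if HAVE_SSE2
    // Four floats per iteration, using unaligned loads and stores.
    for (; i + 4 <= n; i += 4) {
        __m128 va = _mm_loadu_ps(a + i);
        __m128 vb = _mm_loadu_ps(b + i);
        _mm_storeu_ps(out + i, _mm_add_ps(va, vb));
    }
#endif
    // Scalar tail (or the whole loop when SSE2 is unavailable).
    for (; i < n; ++i) {
        out[i] = a[i] + b[i];
    }
}
```

The intrinsics are ordinary function calls as far as the C++ source is concerned, but the code is tied to one instruction set unless you guard it like this.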

> likely/unlikely branches

Most compilers have extensions that will allow you to do this (__builtin_expect and so on).

> in-lining can't be forced when you know it gives better performance

Again, most compilers have this, not just GCC, e.g. MSVC's __forceinline (GCC and Clang spell it __attribute__((always_inline))).

> the compiler has a lot of trouble knowing when lines of code are independent and can be done in parallel (b/c const =/= immutable)

This is true, as aliasing is a real issue. The hardware itself has some say over this anyway, depending on its instruction scheduling and out-of-order execution capabilities.
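To make the aliasing point concrete, here is a small sketch using the non-standard `__restrict` extension, which GCC, Clang and MSVC all accept. The `RESTRICT` macro and the function `scale_into` are hypothetical names for this example only.

```cpp
#include <cstddef>

#if defined(__GNUC__) || defined(_MSC_VER)
#define RESTRICT __restrict  // compiler extension, not standard C++
#else
#define RESTRICT             // unknown compiler: no aliasing promise
#endif

// Without RESTRICT the compiler has to assume that a store to dst[i]
// might change *scale or a later src[j], even though both are const
// pointers: const only restricts access through that pointer. With
// RESTRICT we promise the three regions do not overlap, so *scale can
// be kept in a register and the loop is easier to vectorize.
void scale_into(float* RESTRICT dst,
                const float* RESTRICT src,
                const float* RESTRICT scale,
                std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) {
        dst[i] = src[i] * *scale;
    }
}
```

Breaking the promise (passing overlapping buffers) is undefined behaviour, so this is a contract the caller has to uphold, not something the compiler checks.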

What you don't mention, however, is the fact that almost no other languages offer any of these, let alone all of them. Rust may be the exception here, although some of this is still in the works (SIMD; I'm not sure about the status of likely/unlikely intrinsics).

For GPU programming, if you're using CUDA, you're almost certainly using C or C++, or calling something that wraps C/C++ code. Not everything is suited to GPU processing anyway; there's still a lot of code that needs to be performant and isn't moving off the CPU any time soon.

geokon · on March 9, 2018

Right, so things that are not part of the language, not cross-platform and not cross-compiler. That's called fighting the language in my book :)

I'm not saying you can't get C++ to output the assembly you want - it just sucks trying to coerce it to do things that are honestly not that complicated. And even when you do get what you want you find you can't use the code anywhere else. To me that feels like a language failure...

> is the fact that almost no other languages offer any of these

I guess you missed my point. It seems to me that we're at a point where you no longer need these features as part of your core application language. The idea is that with OpenCL/SPIR-V we'll be able to

1- be more explicit and not fight the language (so even if you're 100% on the CPU it makes sense)

2- target every platform (you can finally write code for your GPU)

3- call the code from any parent language

You're right that not all performance-critical problems boil down to tight shared-memory loops that can be thrown onto an OpenCL kernel - but my experience so far tells me that that's the vast majority of performance problems. So C++'s usefulness will shrink. But maybe my experience is biased and I'm off base. I haven't done much OpenCL myself - but I'm definitely planning to use it more in the future.

cma · on March 9, 2018

> right, so things that are not part of the language, not crossplatform and not crosscompiler

You just have a header with different #defines for the different platforms you are going to ship on, or use a premade open source one.
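Something along these lines, as a minimal sketch. The macro names `LIKELY`, `UNLIKELY` and `FORCE_INLINE` and the example function `clamp_to_byte` are arbitrary choices for illustration, not from any particular library.

```cpp
// portability.h (hypothetical file name)
#if defined(__GNUC__) || defined(__clang__)
// GCC and Clang: branch hints via __builtin_expect, forced inlining
// via the always_inline attribute.
#define LIKELY(x)    __builtin_expect(!!(x), 1)
#define UNLIKELY(x)  __builtin_expect(!!(x), 0)
#define FORCE_INLINE inline __attribute__((always_inline))
#elif defined(_MSC_VER)
// MSVC: no __builtin_expect, so the hints compile away to nothing.
#define LIKELY(x)    (x)
#define UNLIKELY(x)  (x)
#define FORCE_INLINE __forceinline
#else
// Unknown compiler: fall back to plain standard C++.
#define LIKELY(x)    (x)
#define UNLIKELY(x)  (x)
#define FORCE_INLINE inline
#endif

// Example use: the out-of-range cases are marked as the cold paths.
FORCE_INLINE int clamp_to_byte(int v) {
    if (UNLIKELY(v < 0)) return 0;
    if (UNLIKELY(v > 255)) return 255;
    return v;
}
```

The calling code stays the same on every platform; only the header knows which compiler it is talking to, and on a compiler it doesn't recognize you simply lose the hint.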

If you want to ship on everything, you won't get every optimization everywhere. It would be better if some of these features were in the standard, but in practice it isn't such a big issue for those two in particular.

emtel · on March 9, 2018

These are all good points, but I'd say two things:

1. Whatever C++'s weaknesses in this area are, it's superior to Go, so C++ programmers aren't going to switch to Go because of this.

2. Not everything is about raw throughput. You can't do anything latency-sensitive on a GPU. Consider a game: the pixels get drawn on the GPU, and the physics might happen on the GPU, but you still have a ton of highly latency-sensitive things that have to be done by the CPU, such as input handling and networking. Also, even with low-driver-overhead APIs like Vulkan, you still need something on the CPU telling the GPU what to do. Finally, GPUs aren't good at branch-heavy workloads in general.