> -Ofast, -ffast-math and -funsafe-math-optimizations will no longer add startup code to alter the floating-point environment when producing a shared object with -shared.
Well, that's a relief. Definitely won't fix anything already built, but it's a nice step in the right direction. The idea of finding out that a shared library is breaking unrelated code is a scary proposition.
Bitwise multiplication is bitwise ‘and’, so exponentiation (repeated multiplication) is the identity, so aⁿ+bⁿ=cⁿ iff a+b=c. I have a truly remarkable proof of this, but my filesystem is full.
Can anyone please share their experience with OpenMP offload? In terms of the performance improvement you observed, how much effort you had to put in, comparisons with other GPGPU paradigms, etc.
I'm just getting started with it. My _impression_ is that it isn't quite mature yet. On a toy example I was working on, I had trouble beating host-only OpenMP performance. CUDA Thrust was about 5x faster while taking about a fifth of the effort. Just one data point.
> -fanalyzer is still only suitable for analyzing C code. In particular, using it on C++ is unlikely to give meaningful output.
Not this time; maybe next time?