
SIMD-like code: auto-vectorization has come a long way, but it still isn't as fast as doing it by hand.

IoT devices with memory measured in single-digit KB.

Plus, someone needs to write the Assembly that those compilers generate.




It depends. If your target is x86/x86_64 and the compiler has an LLVM or Intel backend, I've found that beating the compiler is almost impossible and largely futile. It took me a long time to accept this, since I know asm and the x86 instructions well.


That is my point of view as well, but then again the kind of code I write also does quite well with bounds checking enabled.

So I never had to take a deep dive into stuff like writing AVX by hand; I'm just basing my remark on comments that I occasionally read.


Compiler auto-vectorization is fairly easy to beat as it won’t even try when loop invariants are mildly complicated.
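
To illustrate the kind of loop I mean, here is a minimal sketch in C. The function names and the early-exit condition are made up for this example, and whether either loop actually gets vectorized depends on your compiler, version and flags, so treat it as something to try rather than a benchmark.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical example 1: plain reduction.
   The trip count is known on entry (n) and there is no early exit,
   which is the loop shape auto-vectorizers generally handle. */
int32_t sum_all(const int32_t *v, size_t n) {
    int32_t total = 0;
    for (size_t i = 0; i < n; i++) {
        total += v[i];
    }
    return total;
}

/* Hypothetical example 2: same reduction, but the loop condition
   also depends on the data being read (stop at the first negative).
   The trip count is no longer known on entry, and a compiler may
   decline to vectorize a loop like this. */
int32_t sum_until_negative(const int32_t *v, size_t n) {
    int32_t total = 0;
    for (size_t i = 0; i < n && v[i] >= 0; i++) {
        total += v[i];
    }
    return total;
}
```

You can ask the compiler what it did instead of guessing: clang reports with -Rpass=loop-vectorize and -Rpass-missed=loop-vectorize, and gcc with -fopt-info-vec-missed.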



