
The other issue is instruction-level parallelism, as another poster in TFA pointed out. Even within a single loop iteration, the "unoptimized" code is more likely to keep multiple ALUs busy, if the CPU has them, whether or not vector instructions are used.
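To make the ILP point concrete, here is a rough sketch in C. It is not the code from TFA; the function names `sum_chained` and `sum_independent` are made up for illustration. The first loop has a single accumulator, so every add waits on the previous one. The second keeps four independent accumulators, so the four adds in one iteration have no dependency on each other and a superscalar core can, in principle, issue them on separate ALUs in the same cycle, with no vector instructions involved.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical example: one accumulator. Each add depends on the
 * result of the previous add, forming a serial dependency chain. */
uint64_t sum_chained(const uint32_t *x, size_t n) {
    uint64_t acc = 0;
    for (size_t i = 0; i < n; i++) {
        acc += x[i];
    }
    return acc;
}

/* Hypothetical example: four accumulators. The four adds inside one
 * iteration are independent of each other, so the hardware is free
 * to execute them in parallel if it has enough ALUs. */
uint64_t sum_independent(const uint32_t *x, size_t n) {
    uint64_t a0 = 0, a1 = 0, a2 = 0, a3 = 0;
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        a0 += x[i];
        a1 += x[i + 1];
        a2 += x[i + 2];
        a3 += x[i + 3];
    }
    /* Handle the leftover 0..3 elements. */
    for (; i < n; i++) {
        a0 += x[i];
    }
    return (a0 + a1) + (a2 + a3);
}
```

Both functions return the same sum. Whether the second is actually faster depends on the CPU and on the compiler, which may apply this kind of transformation (or vectorize) on its own at higher optimization levels, so measure before drawing conclusions.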

