
Not memory speed but extremely fast: https://github.com/powturbo/TurboHist


Yeah, it's nice, and about the best you can achieve. I'd be happy if it were about 8 times faster; less than 0.2 cycles per byte would be good.

Unfortunately, that's just not achievable with the current x86 instruction set.


There is a new AVX-512 conflict-detection instruction (VPCONFLICTD, available as the intrinsic _mm512_conflict_epi32) that is supposed to solve this, but it can't make histogram construction faster than the scalar functions.
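
For anyone wondering why this is hard to speed up: in a byte histogram every input byte becomes a read-modify-write on a counter, and when neighbouring bytes are equal each increment has to wait for the previous one to the same address. A common scalar workaround is to spread the counts over several tables and sum them at the end. Below is a minimal sketch of that idea in C. It is not taken from TurboHist, and the function names (hist_naive, hist_4tables) and the choice of four tables are assumptions made for illustration only.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Baseline: one table, one increment per input byte.
   Runs of equal bytes serialize on a single counter. */
static void hist_naive(const uint8_t *in, size_t n, uint32_t out[256]) {
    memset(out, 0, 256 * sizeof out[0]);
    for (size_t i = 0; i < n; i++)
        out[in[i]]++;
}

/* Hypothetical variant: four independent sub-tables, so that four
   consecutive bytes never update the same counter in memory. */
static void hist_4tables(const uint8_t *in, size_t n, uint32_t out[256]) {
    uint32_t t[4][256];
    memset(t, 0, sizeof t);

    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        t[0][in[i]]++;
        t[1][in[i + 1]]++;
        t[2][in[i + 2]]++;
        t[3][in[i + 3]]++;
    }
    /* Tail: fewer than four bytes left. */
    for (; i < n; i++)
        t[0][in[i]]++;

    /* Merge the sub-tables into the final histogram. */
    for (int b = 0; b < 256; b++)
        out[b] = t[0][b] + t[1][b] + t[2][b] + t[3][b];
}
```

Both functions produce the same counts; whether the second one is actually faster depends on the CPU and on how skewed the input is, so measure before relying on it.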



