It's not the best-documented option, but it's the smallest implementation while still being one of the most performant, so you can learn more than just SSE from it.