
Most decent BLAS implementations use some variation of the approach in K. Goto and R. van de Geijn, "Anatomy of High-Performance Matrix Multiplication", with the inner kernels written in assembly.
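
For anyone who hasn't seen it, here is a rough sketch of the loop structure as that approach is usually presented: block the operands so they fit in cache, pack the blocks into contiguous buffers, and hand small tiles to a micro-kernel. This is plain Python for readability, not code from the paper or from any BLAS library. The block sizes (`MC`, `KC`, `NC`, `MR`, `NR`) and the function names `gemm` and `micro_kernel` are made up for illustration; real implementations tune the sizes per CPU, and the micro-kernel is the part that typically ends up in assembly.

```python
# Sketch of a Goto-style blocked GEMM: C += A * B on row-major lists of lists.
# All block sizes below are assumed, illustrative values, not tuned ones.
MC, KC, NC = 4, 4, 8   # cache-block sizes (hypothetical)
MR, NR = 2, 2          # register-tile sizes (hypothetical)


def micro_kernel(kc, a_panel, b_panel, C, i0, j0, mr, nr):
    """Rank-kc update of one mr x nr tile of C.

    This is the piece a real BLAS would hand-write in assembly.
    """
    acc = [[0.0] * nr for _ in range(mr)]  # stands in for vector registers
    for p in range(kc):
        for i in range(mr):
            a = a_panel[p][i]
            for j in range(nr):
                acc[i][j] += a * b_panel[p][j]
    # Write the accumulated tile back into C.
    for i in range(mr):
        for j in range(nr):
            C[i0 + i][j0 + j] += acc[i][j]


def gemm(A, B, C):
    """Compute C += A * B in place using cache blocking and packing."""
    m, k, n = len(A), len(B), len(B[0])
    for jc in range(0, n, NC):              # block over columns of B and C
        nc = min(NC, n - jc)
        for pc in range(0, k, KC):          # block over the shared dimension
            kc = min(KC, k - pc)
            # Pack a kc x nc block of B into a contiguous buffer.
            Bp = [B[pc + p][jc:jc + nc] for p in range(kc)]
            for ic in range(0, m, MC):      # block over rows of A and C
                mc = min(MC, m - ic)
                # Pack an mc x kc block of A, stored as kc rows of mc values
                # so the micro-kernel walks it one p-step at a time.
                Ap = [[A[ic + i][pc + p] for i in range(mc)]
                      for p in range(kc)]
                for jr in range(0, nc, NR):
                    nr = min(NR, nc - jr)
                    b_panel = [row[jr:jr + nr] for row in Bp]
                    for ir in range(0, mc, MR):
                        mr = min(MR, mc - ir)
                        a_panel = [row[ir:ir + mr] for row in Ap]
                        micro_kernel(kc, a_panel, b_panel, C,
                                     ic + ir, jc + jr, mr, nr)
    return C
```

Everything outside `micro_kernel` is bookkeeping that a compiler handles fine. The performance comes from keeping the packed blocks resident in cache and from how tightly the innermost tile update is scheduled, which is why that part gets the assembly treatment.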

