
Not sure why you are being downvoted. We already know the performance numbers from the M4 Max chips, which are limited by memory bandwidth, and the same constraint would apply here.

Going from 525GB/s to 1000GB/s will double the TPS at best, which is still quite low for large LLMs.
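To put rough numbers on that, here is a back-of-the-envelope sketch. It assumes token generation is purely memory-bandwidth bound, i.e. every generated token requires streaming the active weights from memory once, and it ignores compute, KV cache traffic, and real-world efficiency. The function names and the 100GB weight size are illustrative assumptions, not measurements.

```python
def bandwidth_speedup(old_bw_gbs: float, new_bw_gbs: float) -> float:
    """Best-case TPS multiplier when decode is purely bandwidth bound."""
    return new_bw_gbs / old_bw_gbs


def bandwidth_bound_tps(bw_gbs: float, weights_read_per_token_gb: float) -> float:
    """Upper bound on tokens/s: bandwidth divided by bytes streamed per token."""
    return bw_gbs / weights_read_per_token_gb


if __name__ == "__main__":
    # Bandwidth figures quoted in the comment above.
    speedup = bandwidth_speedup(525, 1000)
    print(f"best-case speedup: {speedup:.2f}x")

    # Hypothetical model that reads 100 GB of weights per token (assumption).
    for bw in (525, 1000):
        print(f"{bw} GB/s -> at most {bandwidth_bound_tps(bw, 100):.2f} t/s")
```

Under these assumptions the multiplier is a bit under 2x, and a model that streams 100GB per token tops out around 10 t/s even at 1000GB/s. Real throughput will be lower than the bound.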



DeepSeek R1 (full, Q1) runs at 14t/s on an M2 Ultra, so this should be around 20t/s.



