
I generally find time to first token more important than tok/s, since these models wait an ungodly amount of time before streaming results. It looks like the claims are true based on the M5 benchmarks: https://www.macstories.net/stories/ipad-pro-m5-neural-benchm... so this might work great.

