Hacker News
Art9681 | 41 days ago | on: MacBook Pro with M5 Pro and M5 Max
It's going to be faster no matter what. My M3 Max prints tokens faster than I can read with the new MoE models. It's the prompt processing that kills it once the context grows beyond a threshold, which is easy to hit in modern agentic loops.
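A rough back-of-the-envelope sketch of why prompt processing ends up dominating. This is not from the comment itself: the function name and both throughput numbers are hypothetical placeholders, not M3 Max measurements. It also assumes a constant prompt-processing rate and no prompt caching, so real behaviour at long contexts will differ.

```python
def response_latency(prompt_tokens, output_tokens, prefill_tps, decode_tps):
    """Return (prefill_seconds, decode_seconds) for a single model call.

    Simplified model: the whole prompt is processed at a fixed rate
    before the first output token appears, then tokens are generated
    at a fixed rate.
    """
    prefill_s = prompt_tokens / prefill_tps
    decode_s = output_tokens / decode_tps
    return prefill_s, decode_s


# Hypothetical rates, chosen only for illustration (not benchmarks).
PREFILL_TPS = 500.0  # prompt tokens processed per second
DECODE_TPS = 50.0    # output tokens generated per second

# An agentic loop re-sends a growing context on each step,
# while each reply stays short.
for prompt_tokens in (2_000, 20_000, 100_000):
    prefill_s, decode_s = response_latency(
        prompt_tokens, 300, PREFILL_TPS, DECODE_TPS
    )
    print(
        f"{prompt_tokens:>7} prompt tokens: "
        f"prefill {prefill_s:6.1f}s, decode {decode_s:4.1f}s"
    )
```

Under these assumed rates the decode time stays the same on every step, while the wait before the first token grows in proportion to the context, which is the effect described above.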
fulafel | 41 days ago
If your computer was faster at it, you could run more capable models at the same token rate.