
While the M5 is impressive, its capacity to compete against enterprise models like Opus, Sonnet, etc. is off by a couple of orders of magnitude. Even the most sophisticated open-source models top out at 100-200 billion parameters, whereas Claude and OpenAI's models are reportedly north of a trillion parameters.

While SoC is definitely the future of local AI, even if you get an M5 that's jacked to the tits, you still won't be able to store an entire model of that class in unified memory on top of the OS and whatever other applications you have running. 128GB is the upper limit for unified memory on the M5, which on paper could support a model like gpt-oss:120b, but only quantised and with a nerfed context size at that. Furthermore, a maxed-out MacBook Pro M5 Max costs between $8-10k depending on your storage option, screen size, etc., so we can safely assume that the M5 Ultra will cost even more. There's also no guarantee that the Ultra will offer double the unified memory; it may only offer more cores, but cores aren't the current bottleneck for local AI, memory is.
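The 128GB ceiling can be checked with back-of-envelope arithmetic: weight memory is just parameter count times bits per weight, and the KV cache and OS overhead come on top of that. A minimal sketch (the 120B figure is from the model name above; the bit widths are standard quantisation levels, not measured numbers):

```python
# Rough weight-memory estimate for a 120B-parameter model at common
# quantisation widths. Ignores KV cache, activations, and OS overhead,
# all of which must also fit in unified memory.

def weights_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate weight memory in GB (1 GB = 10**9 bytes)."""
    return params_billion * 1e9 * bits_per_weight / 8 / 1e9

for bits in (16, 8, 4):
    print(f"120B @ {bits}-bit: ~{weights_gb(120, bits):.0f} GB")
# 16-bit weights alone (~240 GB) blow past 128GB; even 8-bit (~120 GB)
# leaves almost nothing for the KV cache, hence the nerfed context.
```

Only around 4-bit (~60 GB) does the model fit with room for a usable context window, which is why "quantised with a nerfed context size" is the realistic operating point.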

If you consider what you'd be paying above and beyond what you'd need if not for local AI, it adds $5-6k to the price tag at a minimum. That equates to five years' worth of a Claude Code subscription! Even if you shouldered that cost with your NFT fortunes, you likely wouldn't achieve performance parity with CC.
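The five-year figure follows from dividing the hardware premium by a monthly plan price. A quick sanity check, assuming the $6k premium from above and a $100/month subscription (an assumed round number, not a quoted price):

```python
# Payback comparison: hardware premium for local AI vs a monthly
# subscription. Both inputs are the assumptions stated above.

hardware_premium_usd = 6000       # extra spend attributable to local AI
subscription_usd_per_month = 100  # assumed Claude Code-style plan

months = hardware_premium_usd / subscription_usd_per_month
print(f"{months:.0f} months ≈ {months / 12:.0f} years of subscription")
# 60 months ≈ 5 years of subscription
```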

I'm as excited as you are about the future of local AI, and I'm actively working in this space every day to improve it, but we're still a long way from matching model size, context size, tokens/sec, TTFB, etc. A single H100 is already so powerful that a data centre hosting thousands of them should be expected to remain unrivalled.

The area where I'm having success with local AI is pairing local models with supportive technologies like databases to compensate for smaller context sizes. There are still many inroads to be made in this area, and that bodes well for the future of local AI as models become more efficient and sophisticated.
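The pattern described above boils down to: keep long-term knowledge in a store, and pull only the relevant snippets into the local model's small context window per query. A minimal sketch, with naive keyword-overlap scoring standing in for a real embedding index, and `run_local_model` as a hypothetical call into whatever local runtime you use:

```python
# Retrieve-then-prompt sketch: compensate for a small context window by
# fetching only the documents relevant to the question. Scoring here is
# naive word overlap; a real setup would use an embedding index.

def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    """Rank documents by word overlap with the query, return the top k."""
    q = set(query.lower().split())
    scored = sorted(docs, key=lambda d: len(q & set(d.lower().split())),
                    reverse=True)
    return scored[:k]

docs = [
    "The invoice table stores one row per customer payment.",
    "Deploys run from the main branch via CI.",
    "Payments are reconciled nightly against the invoice table.",
]
context = "\n".join(retrieve("how are customer payments reconciled", docs))
prompt = f"Context:\n{context}\n\nQuestion: how are payments reconciled?"
# run_local_model(prompt)  # hypothetical call into your local runtime
print(prompt)
```

Only the top-k snippets enter the prompt, so the database can grow without the context window having to.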


