14-billion parameter model with 4-bit quantization seems rather small

derefr · 2026-03-03T20:52:32 1772571152

I think these aren't meant to be representative of arbitrary userland-workload LLM inferences, but rather the kinds of tasks macOS might spin up a background LLM inference for. Like the Apple Intelligence stuff, or Photos auto-tagging, etc. You wouldn't want the OS to ever be spinning up a model that uses 98% of RAM, so Apple probably considers themselves to have at most 50% of RAM as working headroom for any such workloads.
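As a rough sanity check on the sizes being discussed, here is a back-of-the-envelope sketch in Python. It assumes the weights dominate memory use and ignores the KV cache, activations, and per-block quantization scales. The 16 GB and 8 GB machines and the helper names are illustrative assumptions, and the 50% headroom figure is only the guess above, not anything Apple has stated.

```python
def weight_bytes(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of the weights alone, in bytes."""
    return n_params * bits_per_weight / 8


def fits_in_headroom(n_params, bits_per_weight, ram_gb, headroom_fraction=0.5):
    """True if the weights fit in the given fraction of RAM (1 GB = 1e9 bytes)."""
    budget = ram_gb * 1e9 * headroom_fraction  # assumed working headroom
    return weight_bytes(n_params, bits_per_weight) <= budget


size_gb = weight_bytes(14e9, 4) / 1e9
print(f"14B params at 4-bit: ~{size_gb:.0f} GB of weights")
print("fits in half of 16 GB:", fits_in_headroom(14e9, 4, ram_gb=16))
print("fits in half of 8 GB:", fits_in_headroom(14e9, 4, ram_gb=8))
```

Under these assumptions, a 14B model at 4 bits needs about 7 GB for weights, which squeezes into half of a 16 GB machine but not half of an 8 GB one.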

duskwuff · 2026-03-04T06:09:01 1772604541

Also: they're advertising the degree of improvement ("4x faster"), not an absolute level of performance.

giancarlostoro · 2026-03-03T18:54:40 1772564080

On my 24GB RAM M4 Pro MBP, some models run very quickly through LM Studio to Zed; I was able to ask it to write some code. Of course, my fan starts spinning like the world's ending, but it's still impressive what I can do 100% locally. I can't imagine what's possible on a more serious setup like the Mac Studio.

jbellis · 2026-03-04T03:42:48 1772595768

Your limitation after prefill is memory bandwidth. A maxed-out Studio has less memory bandwidth than a single 3090 (really).
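To make the bandwidth point concrete: during decode, each generated token has to stream roughly the whole set of weights from memory, so bandwidth divided by model size gives an upper bound on tokens per second. A minimal sketch, assuming a dense model, batch size 1, and no cache effects. The ~7 GB model size (14B parameters at 4-bit) and the spec-sheet bandwidths (936 GB/s for the RTX 3090, 819 GB/s for an M3 Ultra Studio) are assumptions for illustration, not measurements.

```python
def max_decode_tokens_per_sec(bandwidth_gb_s: float, model_gb: float) -> float:
    """Upper bound on decode speed when every token reads all weights once."""
    return bandwidth_gb_s / model_gb


MODEL_GB = 7.0  # assumed: 14B parameters at 4 bits per weight

# Assumed spec-sheet bandwidths in GB/s; real throughput will be lower.
devices = {"RTX 3090": 936.0, "Mac Studio (M3 Ultra)": 819.0}

for name, bw in devices.items():
    print(f"{name}: at most ~{max_decode_tokens_per_sec(bw, MODEL_GB):.0f} tok/s")
```

The bound only says the ceiling scales with bandwidth; it says nothing about whether the model fits in memory in the first place, which is where the Studio's larger RAM matters.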

sroussey · 2026-03-04T07:07:03 1772608023

Yeah, the 3090 has faster memory, but not by a lot.

The 5090 is at 1,792GB/sec, and a potential M5 Ultra would be at 1,230GB/sec with 512GB of RAM. Maybe 1TB. Not 32GB.

thejazzman · 2026-03-05T13:04:05 1772715845

You’re suggesting that a difference of the entirety of the M5 Max’s bandwidth is an insignificant gap!

veidr · 2026-03-05T14:55:32 1772722532

No, that difference is the 5090, not the 3090.

efxhoy · 2026-03-03T20:06:47 1772568407

How is the output quality of the smaller models?

elsombrero · 2026-03-04T00:00:23 1772582423

not good enough for coding anything more than simple scripts.

generally, the fewer parameters, the less knowledge they have.

kraig911 · 2026-03-04T02:09:25 1772590165

what model were you using?

giancarlostoro · 2026-03-04T21:14:33 1772658873

Wrote about it here:

https://news.ycombinator.com/item?id=47191915

simlevesque · 2026-03-03T18:47:57 1772563677

It's not much for a frontier AI but it can be a very useful specialized LLM.

bilbo0s · 2026-03-03T19:06:55 1772564815

It is.

That's how they make loot on their 128GB MacBook Pros. By kneecapping the cheap stuff. Don't think for a second that the specs weren't chosen so that professional developers would have to shell out the 8 grand for the legit machine. They're only gonna let us do the bare minimum on a MacBook Air.

butILoveLife · 2026-03-03T18:59:19 1772564359

For anyone who has been watching Apple since the iPod commercials, Apple really, really has a grey area in the honesty of their marketing.

And not even diehard Apple fanboys deny this.

I genuinely feel bad for people who fall for their marketing thinking they will run LLMs. Oh well, I got scammed on runescape as a child when someone said they could trim my armor... Everyone needs to learn.

zitterbewegung · 2026-03-03T19:14:45 1772565285

Yesterday I ran qwen3.5:27b on an M1 Max with 64 GB of RAM. I even ran Llama 70B when llama.cpp came out. These run sufficiently well, if somewhat slowly, but the improvements in the M5 Max should make for a much faster experience.

giwook · 2026-03-03T19:31:09 1772566269

I don't know that there would be a huge overlap between the people who would fall for this type of marketing and the people who want to run LLMs locally.

There definitely are some who fit into this category, but if they're buying the latest and greatest on a whim then they've likely got money to burn and you probably don't need to feel bad for them.

Reminds me of the saying: "A fool and his money are soon parted".

mptest · 2026-03-04T00:38:25 1772584705

In retrospect, was there a better place to learn about the cruelty of the world than runescape? Must've got scammed thrice before I lost the youthful light in my eye

jki275 · 2026-03-04T03:46:38 1772595998

I run local models on my M1 Max. There are a number of them that are quite useful.

jtbaker · 2026-03-04T04:47:58 1772599678

my mac mini m4 is getting to be a good substitute for claude for a lot of use cases. LM Studio + qwen3.5, tailscale, and an opencode CLI harness. It doesn't do well with super long context or complexity but it has gotten production quality code out for me this week (with some fairly detailed instructions/background).

nine_k · 2026-03-03T23:19:49 1772579989

There used to be a polite way to call this out: Steve Jobs's "reality distortion field".

hamdingers · 2026-03-03T23:37:08 1772581028

Now that every CEO has their own reality distortion field I wonder if it's even worth calling out any more.

jimbokun · 2026-03-04T04:06:21 1772597181

No current CEO has an RDF comparable to Jobs's.

Musk is probably closest, but he’s become so involved in partisan politics it makes his field far less effective at distorting reality.

seec · 2026-03-04T05:11:23 1772601083

Musk is leading the build of the biggest objects we have ever sent to space. It does give him some sort of aura that is hard to dismantle, let's be honest.

He can do and say a lot of shit because he will still be viewed as real-life Iron Man, because in some ways he kind of is.

__patchbit__ · 2026-03-04T07:20:43 1772608843

Elon Musk would have put the money Apple has had sloshing about over the years to better use than failing to build a single battery electric vehicle at a cost of $1 billion a year over many years.

He doesn't have an RDF but has Kardashev Scale Intent (KSI).

The lobbyists in the political fray are out to steal his value for money lunch despite his demonstrated effectiveness, over and over again.

Jobs couldn't even get the politicians on board with giving away or discounting the Apple ][ for education.

Nevermark · 2026-03-04T03:13:42 1772594022

Somehow Tim Cook's many-year position that the Lightning port was very important to Apple vs. USB-C fell as flat as a parsec-wide pancake.

(It didn't help that they couldn't point to a single user facing feature.)

Or that the App Store lock-in is for our safety, when anyone who wanted that particular safety could choose to continue using their store exclusively.

Etc.

He just does not have it. No field. No spiraling eyes. Perhaps he should grow a beard and wave around a tobacco pipe. Works for some.

nine_k · 2026-03-04T01:21:35 1772587295

Most are not nearly as smooth and successful at the distorting.