
14-billion parameter model with 4-bit quantization seems rather small



I think these aren't meant to be representative of arbitrary userland-workload LLM inferences, but rather the kinds of tasks macOS might spin up a background LLM inference for. Like the Apple Intelligence stuff, or Photos auto-tagging, etc. You wouldn't want the OS to ever be spinning up a model that uses 98% of RAM, so Apple probably considers themselves to have at most 50% of RAM as working headroom for any such workloads.

Also: they're advertising the degree of improvement ("4x faster"), not an absolute level of performance.
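For a rough sense of scale, here is a back-of-envelope sketch of how much memory the weights of a 14-billion-parameter, 4-bit model take, and whether that fits under a 50% working-headroom rule. The 50% figure is my guess from above, not anything Apple has stated; the RAM sizes are just illustrative; and the estimate ignores KV cache, activations, and quantization metadata, so real usage will be higher.

```python
def weight_gb(params: float, bits_per_weight: float) -> float:
    """Approximate size of the weights alone, in GB (10^9 bytes)."""
    return params * bits_per_weight / 8 / 1e9


def fits_in_headroom(params: float, bits_per_weight: float,
                     ram_gb: float, headroom: float = 0.5) -> bool:
    """True if the weights fit in the assumed fraction of total RAM.

    `headroom` is a hypothetical budget (50% by default), not a documented limit.
    """
    return weight_gb(params, bits_per_weight) <= ram_gb * headroom


# 14B parameters at 4 bits per weight: 14e9 * 4 / 8 bytes = 7 GB of weights.
size = weight_gb(14e9, 4)
for ram in (8, 16, 24):  # illustrative RAM sizes, chosen for the example
    verdict = "fits" if fits_in_headroom(14e9, 4, ram) else "does not fit"
    print(f"{size:.1f} GB of weights {verdict} in 50% of {ram} GB")
```

So under these assumptions the weights alone come to about 7 GB, which is in the range you would pick if you only wanted to budget half the RAM of a modestly specced machine.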

On my 24GB RAM M4 Pro MBP some models run very quickly through LM Studio to Zed, and I was able to ask it to write some code. Of course, my fan starts spinning like the world's ending, but it's still impressive what I can do 100% locally. I can't imagine what's possible on a more serious setup like the Mac Studio.

Your limitation after prefill is memory bandwidth. A maxed-out Studio has less memory bandwidth than a single 3090 (really).

Yeah, the 3090 has faster memory, but not by a lot.

The 5090 is at 1,792 GB/s, and a potential M5 Ultra would be at 1,230 GB/s with 512GB of RAM, maybe 1TB. Not 32GB.
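To put the bandwidth numbers in context: during decode, a dense model has to stream roughly all of its weights from memory for every generated token, so bandwidth divided by weight size gives a crude upper bound on tokens per second. A minimal sketch, assuming a 14B model at 4 bits (about 7 GB of weights) and ignoring KV cache reads, compute limits, and batching. The M5 Ultra figure is the speculative number from this thread, not a shipping spec.

```python
def decode_tokens_per_sec_upper_bound(bandwidth_gb_s: float, model_gb: float) -> float:
    """Crude ceiling on single-stream decode speed for a dense model.

    Assumes every token requires reading all weights once from memory,
    so the bound is simply bandwidth / bytes read per token.
    """
    if model_gb <= 0:
        raise ValueError("model_gb must be positive")
    return bandwidth_gb_s / model_gb


# Weights for 14e9 parameters at 4 bits each, in GB (10^9 bytes).
model_gb = 14e9 * 4 / 8 / 1e9

bandwidths_gb_s = {
    "RTX 3090 (published spec)": 936,
    "RTX 5090 (figure quoted above)": 1792,
    "M5 Ultra (speculative, from this thread)": 1230,
}

for name, bw in bandwidths_gb_s.items():
    bound = decode_tokens_per_sec_upper_bound(bw, model_gb)
    print(f"{name}: at most ~{bound:.0f} tokens/s")
```

Real throughput lands well below these ceilings, but the ratio between machines tends to track the bandwidth ratio, which is why the bandwidth gap matters more than raw compute once prefill is done.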


You’re suggesting that a difference of the entirety of the M5 Max’s bandwidth is an insignificant gap!

No, that difference is the 5090, not the 3090.

How is the output quality of the smaller models?

not good enough for coding anything more than simple scripts.

generally, the fewer parameters, the less knowledge they have.


what model were you using?


It's not much for a frontier AI but it can be a very useful specialized LLM.

It is.

That's how they make loot on their 128GB MacBook Pros. By kneecapping the cheap stuff. Don't think for a second that the specs weren't chosen so that professional developers would have to shell out the 8 grand for the legit machine. They're only gonna let us do the bare minimum on a MacBook Air.


For anyone who has been watching Apple since the iPod commercials, Apple has really operated in a grey area when it comes to honesty in their marketing.

And not even diehard Apple fanboys deny this.

I genuinely feel bad for people who fall for their marketing thinking they will run LLMs. Oh well, I got scammed on runescape as a child when someone said they could trim my armor... Everyone needs to learn.


Yesterday I ran qwen3.5:27b on an M1 Max with 64 GB of RAM. I even ran Llama 70B when llama.cpp came out. These run sufficiently well, if somewhat slowly, but the improvements in the M5 Max should make for a much faster experience.

I don't know that there would be a huge overlap between the people who would fall for this type of marketing and the people who want to run LLMs locally.

There definitely are some who fit into this category, but if they're buying the latest and greatest on a whim then they've likely got money to burn and you probably don't need to feel bad for them.

Reminds me of the saying: "A fool and his money are soon parted".


In retrospect, was there a better place to learn about the cruelty of the world than RuneScape? Must've got scammed thrice before I lost the youthful light in my eye.

I run local models on my M1 Max. There are a number of them that are quite useful.

My Mac mini M4 is getting to be a good substitute for Claude for a lot of use cases. LM Studio + qwen3.5, Tailscale, and an opencode CLI harness. It doesn't do well with super long context or complexity, but it has gotten production-quality code out for me this week (with some fairly detailed instructions/background).

There used to be a polite way to call this out, the "Steve Jobs's reality distortion field".

Now that every CEO has their own reality distortion field I wonder if it's even worth calling out any more.

No current CEO has an RDF comparable to Jobs's.

Musk is probably closest, but he’s become so involved in partisan politics it makes his field far less effective at distorting reality.


Musk is leading the build of the biggest objects we have ever sent to space. It does give him some sort of aura that is hard to dismantle, let's be honest.

He can do and say a lot of shit because he will still be viewed as real-life Iron Man, because in some ways he kind of is.


Elon Musk would have put the money Apple has had sloshing about over the years to better use than failing to build a single battery electric vehicle at a cost of $1 billion a year over many years.

He doesn't have an RDF but has Kardashev Scale Intent (KSI).

The lobbyists in the political fray are out to steal his value-for-money lunch despite his demonstrated effectiveness, over and over again.

Jobs couldn't even engage the politicians to let Apple give away the Apple ][, or offer it at a discount, to education.


Somehow Tim Cook's many-year position that the Lightning port was very important to Apple versus USB-C fell as flat as a parsec-wide pancake.

(It didn't help that they couldn't point to a single user facing feature.)

Or that the App Store lock-in is for our safety, when anyone who wanted that particular safety could choose to continue using their store exclusively.

Etc.

He just does not have it. No field. No spiraling eyes. Perhaps he should grow a beard and wave around a tobacco pipe. Works for some.


Most are not nearly as smooth and successful at the distorting.


