Because it seems like, regardless of the announcement, there will always be someone who has the most niche issue with it and manages to make assertions for an entire group of people while only really referencing their personal experience ("and all of the people they know").
I mean, I am the strongest local LLM advocate you will find. I have my GPU loaded with a model pretty much all day, for recreation and work. My job, my livelihood, involves running local LLMs.
But it's intense, even with a finicky, highly efficient runtime on a strong desktop. Local LLM hosting is not something you want to impose on users unless they are acutely aware of it, or unless it's a full-stack hardware/software platform (like the Google Pixel) where the vendor can "hide" the undesirable effects on system performance.
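To make that concrete, here's roughly the gating I'd want any app to do before turning on local inference. This is just a sketch; the memory threshold, model path, and helper name are placeholders I made up, not anyone's actual implementation:

```python
# Rough sketch of opt-in gating for local inference; the threshold,
# model path, and flag name are placeholders, not any real product's code.
import psutil  # pip install psutil

MIN_FREE_BYTES = 8 * 1024**3  # arbitrary headroom requirement; tune per model

def maybe_load_local_model(user_opted_in: bool):
    """Return a local model handle, or None to fall back to a hosted API."""
    if not user_opted_in:
        return None  # never silently impose local inference on the user
    if psutil.virtual_memory().available < MIN_FREE_BYTES:
        return None  # machine is already under memory pressure; back off
    from llama_cpp import Llama  # lazy import: the check itself costs nothing
    return Llama(model_path="models/small-model.gguf", n_ctx=2048)

llm = maybe_load_local_model(user_opted_in=True)
print("running locally" if llm else "falling back to a hosted API")
```

The point is just that the remote fallback path has to exist, and has to be the default.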
I think that's a reasonable generalization to make.
Fair, but Google does _supposedly_ have a Gemini model (Gemini Nano) meant to run on phones, so it'd presumably be small enough that it wouldn't necessarily be a massive problem. Not arguing at this point; you're right. I just think we could get there eventually.