Because it seems like, regardless of the announcement, there will always be someone who has the most niche issue with it and manages to make assertions for an entire group of people while only really referencing their personal experience ("and all of the people they know").
I mean, I am the strongest local LLM advocate you will find. I have my GPU loaded with a model pretty much all day, for recreation and work. My job, my livelihood, involves running local LLMs.
But it's intense, even with a finicky, highly efficient runtime on a strong desktop. Local LLM hosting is not something you want to impose on users unless they are acutely aware of it, or unless it's a full-stack hardware/software platform (like the Google Pixel) where the vendor can "hide" the undesirable effects on system performance.
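To make that concrete, here's roughly the gating I'd want any app to do before turning on local inference. This is just a sketch; the memory threshold, model path, and helper name are placeholders I made up, not anyone's actual implementation:

```python
# Rough sketch of opt-in gating for local inference; the threshold,
# model path, and flag name are placeholders, not any real product's code.
import psutil  # pip install psutil

MIN_FREE_BYTES = 8 * 1024**3  # arbitrary headroom requirement; tune per model

def maybe_load_local_model(user_opted_in: bool):
    """Return a local model handle, or None to fall back to a hosted API."""
    if not user_opted_in:
        return None  # never silently impose local inference on the user
    if psutil.virtual_memory().available < MIN_FREE_BYTES:
        return None  # machine is already under memory pressure; back off
    from llama_cpp import Llama  # lazy import: the check itself costs nothing
    return Llama(model_path="models/small-model.gguf", n_ctx=2048)

llm = maybe_load_local_model(user_opted_in=True)
print("running locally" if llm else "falling back to a hosted API")
```

The point is just that the remote fallback path has to exist, and has to be the default.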
I think that's a reasonable generalization to make.
Fair, but Google does _supposedly_ have a Gemini model (Gemini Nano) meant to run on phones, so it'd presumably be small enough that it wouldn't necessarily be a massive problem. Not arguing at this point; you're right. I just think we could get there eventually.