Only way to have hardware reach this sort of efficiency is to embed the model in...

tclancy · 2026-03-23T16:35:52 1774283752

I think you're ignoring the inevitable march of progress. Phones will get big enough to hold it soon.

tren_hard · 2026-03-23T18:34:44 1774290884

Instead of slapping on an extra battery pack, it will be an onboard LLM. Could have lifecycles just like phones.

Getting bigger (foldable) phones without losing battery life, while running usable models in the same form factor, is a pretty big ask.

RALaBarge · 2026-03-23T17:34:45 1774287285

I think the future is the model becoming lighter, not the hardware becoming heavier.

Tade0 · 2026-03-23T17:51:47 1774288307

The hardware will become heavier regardless, I'm afraid.

TeMPOraL · 2026-03-24T10:06:41 1774346801

Good. It's ridiculously tiny and lightweight these days.

Especially with phones; the first thing everyone does after buying their new uber-thin iPhone is buy a case for it, which doubles its thickness.

intrasight · 2026-03-23T16:26:44 1774283204

I think for many reasons this will become the dominant paradigm for end user devices.

Moore's law will shrink it to 8mm soon. I think it'll be like a microSD card you plug in.

Or we develop a new silicon process that can mimic synaptic weights in biology. Synapses have plasticity.

bigyabai · 2026-03-23T16:30:54 1774283454

One big bottleneck is SRAM cost. Even an 8B model would probably end up costing hundreds of dollars to run locally on that kind of hardware. Especially unpalatable if model quality keeps advancing year by year.

> Or we develop a new silicon process that can mimic synaptic weights in biology. Synapses have plasticity.

It's amazing to me that people consider this to be more realistic than FAANG collaborating on a CUDA-killer. I guess Nvidia really does deserve their valuation.

intrasight · 2026-03-23T16:48:02 1774284482

> bottleneck is SRAM cost

Not for this approach

ottah · 2026-03-23T16:38:36 1774283916

That's actually pretty cool, but I'd hate to freeze a model's weights into silicon without having an incredibly specific and broad use case.

patapong · 2026-03-23T17:39:00 1774287540

Depends on cost IMO - if I could buy a Kimi K2.5 chip for a couple of hundred dollars today I would probably do it.

whatever1 · 2026-03-23T17:42:04 1774287724

I mean if it was small enough to fit in an iPhone why not? Every year you would fabricate the new chip with the best model. They do it already with the camera pipeline chips.

superxpro12 · 2026-03-23T17:46:50 1774288010

Sounds like just the sort of thing FPGAs were made for.

The $$$ would probably make my eyes bleed tho.

chrsw · 2026-03-23T18:04:45 1774289085

Current FPGAs would have terrible performance. We need some new architecture combining ASIC-level LLM perf with sparse reconfiguration support, maybe.

0x457 · 2026-03-23T18:49:46 1774291786

Wouldn't it be the opposite of freezing weights?