I have a Chromebook (Samsung 4) with a Celeron N4020: 2 cores / 2 threads, 1.1GHz ("up to 2.8GHz"), 6W TDP. It's not going to blow anyone away with its performance, but it's entirely adequate, and I love that it has all-day battery life from a $100 device (mine was $92.44 delivered and taxed on sale; the typical street price is $119).
I don't want the lowest end offerings to become 15W TDP chips in $300 laptops. I think there's a perfectly valid place for 6W chips in $100 devices, which brings computing access to more people and places.
It's running ChromeOS with the Linux dev system installed. I did last year's Advent of Code in Clojure on this device, including some airplane trips. (I didn't solve every puzzle, but the limitation on the ones I didn't get was me, not the Chromebook. It runs Emacs, cider, and the Clojure REPL just fine.)
I'm typing on it right now and it's fine for casual use. (It gets a fair amount of weekend use because I neither want to undock my work laptop nor carry around something that large, expensive, and heavy.)
Would I run it as my only computer if $500 wouldn't pressure my family finances? Probably not. If my choice was between this and nothing, that's an even easier choice.
The bottleneck is usually RAM. It looks like his device has 4GB, which is similar to what a cell phone has.
It doesn't take much manufacturer-supplied bloatware, bad drivers, or background processes to use up that much RAM, but a Chromebook is basically just the kernel and Chrome, so you get a lot of bang for your buck.
There shouldn't need to be such a large tradeoff between efficiency and core count, and in the world of ARM CPUs it's not. It's entirely possible to pack 4+ reasonably performant cores into a 6W TDP, and likely at non-extravagant prices.
How can this be true? I'm not saying it isn't, but this makes it sound like ARM is just objectively better - lower power and higher performance? I assume it's more complicated.
Most of the time you can increase performance either by raising the clock frequency or by doing more per clock. Raising the clock costs a disproportionate amount of power: dynamic power scales with voltage squared times frequency, and higher clocks generally need higher voltage, so power grows roughly with the cube of the clock speed. On a desktop this is usually the strategy anyway, because we can put decent cooling rigs on them.
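To put rough numbers on that, here's a minimal sketch in C using the standard CMOS dynamic-power relation (P ≈ C·V²·f). The frequency/voltage pairs are made-up illustration values loosely inspired by the N4020's 1.1GHz base and 2.8GHz burst, not measurements of any real part:

    /* Rough illustration, not vendor data: CMOS dynamic power is
     * roughly P = C * V^2 * f, and reaching higher f generally needs
     * higher V, so power grows much faster than the clock does. */
    #include <stdio.h>

    int main(void) {
        /* Hypothetical operating points: {frequency in GHz, core voltage in V} */
        const double low[2]  = {1.1, 0.70};   /* low-power point */
        const double high[2] = {2.8, 1.00};   /* burst-style point */
        const double c = 1.0;                 /* arbitrary units; only ratios matter */

        double p_low  = c * low[1]  * low[1]  * low[0];
        double p_high = c * high[1] * high[1] * high[0];

        printf("~%.1fx the clock costs ~%.1fx the dynamic power\n",
               high[0] / low[0], p_high / p_low);
        return 0;
    }

With those assumed numbers it prints roughly "2.5x the clock costs 5.2x the dynamic power", which is the whole reason low-clock, wide designs are attractive in a 6W envelope.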
Doing more per clock is harder on x86 than on ARM. x86's instruction set is a hodgepodge collection of instructions with variable lengths and addressing modes. ARM64, on the other hand, has far fewer addressing modes and a fixed 32-bit instruction length. When an x86 core is trying to decode ahead of the instruction stream, it can't tell where the next instruction starts until it knows the length of the current one, so it either decodes each instruction in order or needs special logic to get around that, which makes it harder to stay ahead of the execution units. Normally you see an x86 chip described as having a certain number of complex decoders and a certain number of simple decoders, because some instructions are just pigs of things to decode. Simple decoders handle the instructions that decode to only a uop or so, while the complex decoder handles most of the rest. Some real pigs of instructions might even be sent to the microcode sequencer, which generates a whole heap of uops and takes a while.
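To illustrate the boundary problem, here's a toy sketch (not a real decoder; instruction_length is a hypothetical stand-in for the prefix/opcode/ModRM parsing a real front end has to do):

    /* Toy model of the x86 front-end problem. On a variable-length ISA
     * you can't know where instruction N+1 starts until you've at least
     * length-decoded instruction N, so finding instruction boundaries is
     * inherently serial unless you add predecode/length-marking hardware. */
    #include <stddef.h>
    #include <stdint.h>

    /* Hypothetical helper: returns the encoded length (1..15 bytes on x86)
     * of the instruction starting at p. Working this out means parsing
     * prefixes, the opcode, ModRM/SIB, displacement, immediate... */
    size_t instruction_length(const uint8_t *p);

    size_t find_boundaries_x86(const uint8_t *code, size_t len,
                               size_t *starts, size_t max_starts) {
        size_t n = 0, offset = 0;
        while (offset < len && n < max_starts) {
            starts[n++] = offset;
            offset += instruction_length(code + offset); /* depends on the previous step */
        }
        return n;
    }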
In the case of ARM64, every 4 bytes is an instruction, come hell or high water. On a chip like the M1, the front end takes 32 bytes of instructions, splits them 4 bytes apiece across its 8 decoders, and each one spits out uops in parallel. From there the chip issues the decoded instructions to the necessary execution ports. Because of the less complicated decoding, the huge increase in decode throughput, and the huge reorder buffers, an M1 can keep more of its execution ports busy. If twice as many execution ports can be kept full, you can do the same amount of work in half as many clock cycles, and because you're only running at half the clock speed, your power usage is way lower.
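For contrast, a sketch of the fixed-width case (the 32-byte / 8-decoder widths are taken from public descriptions of the M1's front end; the function itself is just illustrative):

    /* Fixed-width ISA like ARM64: every instruction is exactly 4 bytes,
     * so a 32-byte fetch block splits into 8 slots with no dependency
     * between them; in hardware all 8 decodes happen in the same cycle. */
    #include <stdint.h>

    #define FETCH_BYTES 32
    #define INSN_BYTES   4
    #define DECODERS    (FETCH_BYTES / INSN_BYTES)   /* 8-wide */

    void split_fetch_block_arm64(const uint8_t *block, uint32_t slots[DECODERS]) {
        for (int i = 0; i < DECODERS; i++) {
            const uint8_t *p = block + i * INSN_BYTES;
            /* Each decoder's input is known up front, independent of the others. */
            slots[i] = (uint32_t)p[0]
                     | (uint32_t)p[1] << 8
                     | (uint32_t)p[2] << 16
                     | (uint32_t)p[3] << 24;   /* little-endian instruction word */
        }
    }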
The code density of ARM64 is not that much worse than x64, especially for anything generated by a modern compiler. You may get some small-scale gains in hand-tuned x86 code with careful instruction and register selection (i.e., where the REX prefix can be more easily avoided), but average binary density doesn't overcome the aforementioned differences in efficiency.
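One concrete (and deliberately tiny) data point: a 64-bit register-register add is 3 bytes on x86-64 versus a fixed 4 bytes on ARM64, and the x86 encoding only grows as prefixes pile up, which is why average density ends up in the same ballpark. The byte patterns below are my own encodings and worth double-checking against a disassembler:

    /* Illustrative only: encoded sizes of one equivalent instruction. */
    #include <stdio.h>
    #include <stdint.h>

    int main(void) {
        /* add rax, rbx  ->  48 01 d8  (REX.W prefix, ADD opcode, ModRM) */
        const uint8_t x86_add[] = {0x48, 0x01, 0xD8};
        /* add x0, x0, x1 -> one fixed-width 32-bit word */
        const uint32_t a64_add = 0x8B010000u;

        printf("x86-64 add: %zu bytes, ARM64 add: %zu bytes\n",
               sizeof x86_add, sizeof a64_add);
        return 0;
    }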
There is nothing intrinsically different about computing now versus, say, 10 years ago. You might still want to read some text on a webpage, click a few buttons and have them do something, maybe fill out some text fields, or even upload/download a few files.
Instead, you get "visual experiences": overcomplicated UIs built on similarly overcomplicated underlying technologies (edit: not to say there aren't benefits to technologies like Vue/Angular/React, but they aren't necessary to get things done most of the time; in many cases even server-side rendering without JS would be enough). All of it wastes whatever resources you give it, especially if you don't run an ad-blocker, in which case you get bogged down with dozens if not hundreds of malicious scripts.
Of course, this is a bit akin to shouting at clouds, but nobody should be too proud of the state the modern web is in, or use it to justify wasteful hardware and software requirements: https://idlewords.com/talks/website_obesity.htm
With an ad blocker, I've found the next bottleneck for web browsers is the GPU. I'd guess a Pentium 4 with the equivalent of a five-year-old, well-supported integrated Intel GPU would be fine.
I'd rather they put the extra $200 into the keyboard, display, passive cooling, audio and webcam.
Even "garbage" arm CPUs are fine these days. All my real work happens when I remote into a machine that's faster than a $3500 workstation laptop. The remaining heavy workload is the web browser (and, perhaps, corporate email client / other crapware)