> But the Ryzen 5000 mobile chips really don't look bad at all in comparison, and a 5nm version of those would level the field.
"Not too bad" is the best you can say.
Even if we allow for a one-node advantage, we still have a ~15 watt chip making 20W-30W chips sweat.
If you downclock Zen 3 to match a 15W TDP, the M1 may well beat it by a double-digit margin.
The M1 is simply a far more efficient chip than any x86 chip can be, because of 40 years of architectural advantage.
- the latest x86 chips have patently gigantic decoders metastasising into the backend. They are very tightly coupled to pretty much every other piece of logic.
- the x86 memory model and its blocking behaviour cost a huge transistor count to work around
- better SMP efficiency because of a laxer memory model, with SMP logic more deeply integrated into the cores rather than being an afterthought
- generally better register utilisation at a lower transistor count, with the upstream software ecosystem historically having a better understanding of how to work with a large register count
- on x86 cores, prefetching both from main memory and between caches is more expensive and less efficient, because it has to rely on more complex logic and do more guesswork than on ARM.
The list can go on for a few more screens.
Remember: even though the M1 is an SoC, it still manages to beat Ryzen, which has little besides cores, a memory controller, PCIe, and USB. If you give Ryzen a one-node handicap and compare only the cores, the M1 still comes out with almost a 3 to 4 times advantage in performance per square mm.
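For anyone who wants to check a performance-per-area claim like this against their own measurements, here is a minimal sketch of the arithmetic. The function names, the `node_shrink` factor, and every number in it are hypothetical placeholders, not measured die areas or benchmark scores.

```python
def perf_per_area(score, core_area_mm2):
    """Benchmark score per square millimetre of core area."""
    return score / core_area_mm2


def area_efficiency_ratio(score_a, area_a_mm2, score_b, area_b_mm2, node_shrink=1.0):
    """How many times more perf/mm^2 chip A delivers than chip B.

    node_shrink is an assumed area scaling factor applied to chip B to
    model giving it a newer process node (1.0 means no adjustment,
    0.6 would mean its cores shrink to 60% of their current area).
    """
    scaled_area_b = area_b_mm2 * node_shrink
    return perf_per_area(score_a, area_a_mm2) / perf_per_area(score_b, scaled_area_b)


# Placeholder inputs only: plug in your own scores and core areas.
ratio = area_efficiency_ratio(score_a=100, area_a_mm2=2.0,
                              score_b=100, area_b_mm2=6.0,
                              node_shrink=0.6)
print(f"chip A has {ratio:.2f}x the perf/mm^2 of chip B")
```

The result is only as good as the inputs: the area figures have to cover the same things on both chips (cores only, with or without private caches), otherwise the ratio is meaningless.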
It's really sad to see x86 development digging itself into a ditch with "40 years of binary compatibility at any cost."
With all due respect to Lisa Su, I believe they understand all that very well, but have nevertheless signed on to the idea that the x86 market will never go anywhere.
I believe it would be trivial for both AMD and Intel to slash x86 transistor counts while increasing performance by double digits, if they could break with x86 ISA convention on just a few of the most egregious anachronisms.
Trust me, it does not have 40 years of architectural advantage, just an advantage. It's not going to take Intel "40 years to catch up", give me a break. I have an M1 mini and it's a great little desktop machine for browsing and light development, but it's not 40 years ahead of Intel.
The 40 years of architectural advantage comes from x86 electing to forgo the results of 40 years of advances and improvements that every other sane ISA adopted, and instead trying to add them through increasingly complex "workarounds."
Giant transistor counts go towards letting x86 cores keep ISA compatibility with a 40-year-old chip, while trying to make new ISA features live alongside it.
The 5800U (Zen 3 mobile) and the M1 are actually very close in multithreaded performance, despite one being an 8-core part and the other a 4+4 part. And Zen 3 uses more power, too.
What hurts Zen 3 there is that while the M1 maintains its full clock with all cores busy, Zen 3 has to clock down from its max turbo clocks.
Ok, but that's not what I'm talking about. I'm talking about the 5800U@15W.
On Cinebench, the 5800U scores much higher than the M1 in multicore and slightly higher in single core. It even edges out the M1 on Geekbench, though Geekbench is a poor benchmark.
Cinebench is a rendering load that isn’t very well optimised. (It doesn’t even use the newer AVX levels when available, and isn’t properly optimised for Arm either.)
Cinema 4D, the program that Cinebench benchmarks, normally does the renders on NVIDIA GPUs, not CPUs. As such, it’s a very poor benchmark.
And those 5800U results were probably obtained at much more than 15W. (That’s only the base config; OEMs are free to ship with higher TDPs.)
The examples I took were at 15W. The M1 is also run at much more than 15W in some models.
Geekbench in general is very heavily biased towards ARM, as it does not run the same code on both. Cinebench doesn't have this issue. And while, yes, a lot of C4D rendering is done on GPUs nowadays, the class of programs is path tracers, and they are often run on CPUs for many scenes.
Geekbench 5 runs the same tests on both platforms, and always did. Your claim that it is heavily biased towards Arm comes with no evidence.
> The M1 is also run at much more than 15W in some models.
Nope, it's the same M1 for both, with the same voltage/frequency curves and top clocks. That's the whole point of having a chip that's named the same.
No modern laptop CPU has a headline power use number. They all try to use the headroom that they've been given.
Cinebench is very far from being the be-all and end-all of benchmarking that you make it out to be here. And there are far more optimised renderers around if you want to benchmark that. (RTX makes the matter moot nowadays anyway.)
It does not use AVX-512, or older AVX levels much for that matter, for a workload that is SIMD-friendly. For Arm, they leave lots of perf on the table too. (Cinebench uses Intel's Embree renderer, with AVX-512 disabled)
Geekbench 5 is designed to be a composite index of multiple benchmarks, which makes it somewhat more realistic than using just one. You can also access the scores of the individual subtests.
Area isn't what affects power usage, switching activity is. [1] Besides, the cache size doesn't negatively influence or constrain the design of the core logic, but highly complex instruction decoding absolutely does.
[1] Well, mostly. There is such a thing as static power draw, but I suspect that the transistors in the L3 cache are optimized to have lower static leakage than the transistors in the core logic, which are optimized to be fast.
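To make the switching-versus-leakage distinction concrete, here is a rough sketch using the standard first-order CMOS approximations: dynamic power ≈ α·C·V²·f and static power ≈ V·I_leak. The function names and all the input values are hypothetical, chosen only to show the shape of the relationship, not taken from any real chip.

```python
def dynamic_power(activity, capacitance_f, voltage_v, frequency_hz):
    """First-order switching power in watts: alpha * C * V^2 * f."""
    return activity * capacitance_f * voltage_v ** 2 * frequency_hz


def static_power(voltage_v, leakage_current_a):
    """First-order leakage power in watts: V * I_leak."""
    return voltage_v * leakage_current_a


# Hypothetical blocks with the same switched capacitance, voltage and clock.
# Only the activity factor differs: busy core logic versus a cache whose
# cells mostly sit idle on any given cycle.
core_logic = dynamic_power(activity=0.20, capacitance_f=1e-9,
                           voltage_v=1.0, frequency_hz=3e9)
cache = dynamic_power(activity=0.02, capacitance_f=1e-9,
                      voltage_v=1.0, frequency_hz=3e9)

# Leakage is paid regardless of activity; the current here is a placeholder.
cache_leakage = static_power(voltage_v=1.0, leakage_current_a=0.05)

print(f"core logic dynamic: {core_logic:.3f} W")
print(f"cache dynamic:      {cache:.3f} W")
print(f"cache leakage:      {cache_leakage:.3f} W")
```

In this model, area only enters indirectly through the capacitance and leakage terms, which is why a large but mostly idle block can still draw little dynamic power.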
Leakage is actually a significant part of idle power consumption, especially at smaller process sizes, so much so that some CPUs have the ability to turn off parts of their caches when idle.
I repeat my stance that x86 instruction decoding is a tiny part of a processor, and things like vector units (which are also often powered down when idle) and reordering logic take far more power.
There's a paper about this that compares the efficiency of different ISAs, and basically concludes that ARM and x86 are no different in that respect. Only MIPS is an awful outlier.
That paper doesn't show what you think it shows. There are too many variables between these CPU implementations to support the conclusion that the ISA doesn't matter for power consumption.