> But the Ryzen 5000 mobile chips really don't look bad at all in comparison, an...

gameswithgo · on April 23, 2021

> If you downclock Zen 3 to match 15W TDP, you may well get M1 beating it by a double-digit margin.

Nah, people have done that experiment and it is competitive.

stjohnswarts · on April 23, 2021

Trust me, it does not have 40 years of architectural advantage, just an advantage. It's not going to take Intel "40 years to catch up", give me a break. I have an M1 mini and it's a great little desktop machine for browsing and light development, but it's not 40 years ahead of Intel.

baybal2 · on April 23, 2021

This is not the point.

The "40 years of architectural advantage" is X86 electing to forgo the results of 40 years of advances and improvements that every other sane ISA adopted, and instead trying to add them through increasingly complex "workarounds."

Giant transistor counts go toward letting X86 cores keep ISA compatibility with a 40-year-old chip, while trying to make new ISA features live alongside it.

ThrowawayR2 · on April 23, 2021

Nope. If that were actually true, Alpha, PA-RISC, UltraSPARC, or even Itanium would've killed x86 earlier.

baybal2 · on April 24, 2021

All of the above were beating X86 on transistor count/performance for their time. Beating X86 on that was their very point.

Their commercial demise had nothing to do with their hardware.

leokennis · on April 24, 2021

> great little desktop machine for browsing and light development

Oh come on. You’re making it sound like a toy computer that’s nice for little Debbie who can now watch YouTube videos a little faster.

While my MacBook Air is unplugged, I can edit 4K video in full frame rate for hours and it doesn’t even get warm, let alone hot.

sudosysgen · on April 23, 2021

That's not how downclocking chips works. Reducing the wattage reduces performance by proportionally much less.

In multicore workloads Zen 3 slaps the M1 even per watt.

my123 · on April 23, 2021

Nope, about Zen 3.

5800U (Zen 3 mobile) and M1 are actually very close multithread perf-wise, despite one being an 8-core part and the other a 4+4 part. And with more power use for Zen 3 too.

What hurts Zen 3 there is that while the M1 maintains the full clock with all cores busy, Zen 3 has to downclock away from its max turbo clocks.

sudosysgen · on April 23, 2021

Which benchmarks are we talking about?

The 5800u is a 15w part, you might be confusing the normal part with some overclocked implementation.

But the 5800u at 15w is still faster than the M1 in multicore.

Yes, Zen has to downclock, but it has more cores.

my123 · on April 23, 2021

It's far from being as definitive as you say.

Trying to find one of the best 5800U scores: https://browser.geekbench.com/v5/cpu/compare/7449909?baselin...

It's for 5800H which is at 45W where Ryzen can get a tiny edge.

sudosysgen · on April 23, 2021

Ok, but that's not what I'm talking about. I'm talking about the 5800U@15W.

On Cinebench, the 5800U gets much more than the M1 in multicore and slightly more in single core. It even edges out the M1 on Geekbench, though Gb is a poor benchmark.

my123 · on April 23, 2021

Cinebench is a rendering load that isn’t very well optimised. (Doesn’t even use the newer AVX levels when available, and isn’t properly optimised for Arm either).

Cinema 4D, the program that Cinebench benchmarks, normally does the renders on NVIDIA GPUs, not CPUs. As such, it’s a very poor benchmark.

And those 5800U results are probably at much higher than 15W. (Because that’s the base config, OEMs are free to ship with higher TDPs)

sudosysgen · on April 24, 2021

The examples I took were 15W. The M1 is also run at much more than 15W in some models.

Geekbench in general is very heavily biased toward ARM, as it does not run the same code on both. Cinebench doesn't have this issue. And while yes, nowadays a lot of rendering is done on GPUs for C4D, the class of programs is path-tracers, and they are often run on CPUs for many scenes.

my123 · on April 24, 2021

Geekbench 5 runs the same tests on both platforms, and always did. Your claim that it is heavily biased toward Arm comes with no evidence.

> The M1 is also run at much more than 15W in some models.

Nope, it's the same M1 for both, same voltage/frequency curves and top clocks. It's the whole point of having a chip that's named the same.

No modern laptop CPU has a headline power use number. They all try to use the headroom that they've been given.

Cinebench is very far from being the end-all of benchmarking that you say it is here. And there are far more optimised renderers around if you want to benchmark that. (RTX makes the matter moot nowadays anyway)

It does not use AVX-512, or older AVX levels much for that matter, for a workload that is SIMD-friendly. For Arm, they leave lots of perf on the table too. (Cinebench uses Intel's Embree renderer, with AVX-512 disabled)

Geekbench 5 is designed to be a composite index of multiple benchmarks, to be more realistic to some extent than using just one. You can also access the scores of the subtests.

baybal2 · on April 23, 2021

CMOS power = Leakage + Average switching energy * frequency, and the switching energy itself grows with voltage^2, which has to rise along with frequency
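
To put rough numbers on that relation, here is a minimal sketch of the usual textbook model, P = P_leak + a * C * V^2 * f. Every constant in it (the leakage, the effective capacitance, the linear voltage/frequency curve) is a made-up illustrative value, not a measurement of Zen 3 or the M1, and leakage is treated as constant for simplicity.

```python
def voltage_for(freq_ghz, v_min=0.7, slope=0.1):
    """Hypothetical linear V/f curve: volts needed to run at freq_ghz."""
    return v_min + slope * freq_ghz


def cmos_power(freq_ghz, leak_w=1.0, c_eff_nf=2.0, activity=1.0):
    """Textbook CMOS power in watts: leakage + activity * C * V^2 * f.

    With C in nF and f in GHz the units multiply out to watts.
    leak_w, c_eff_nf and activity are illustrative assumptions.
    """
    volt = voltage_for(freq_ghz)
    return leak_w + activity * c_eff_nf * volt ** 2 * freq_ghz


if __name__ == "__main__":
    for f in (2.0, 3.0, 4.0):
        print(f"{f:.1f} GHz -> {cmos_power(f):.2f} W")
```

Under these assumed constants, halving the clock from 4 GHz to 2 GHz cuts power by well over half, which is why capping a chip's wattage costs proportionally less performance than the power saved.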

yjftsjthsd-h · on April 23, 2021

> It's really sad to see X86 development digging itself into a ditch with "40 years binary compatibility at any cost."

Probably some of that is because they did try to do something different once - and Itanium was an unmitigated disaster.

jeswin · on April 23, 2021

> ...you will still get M1 having almost 3 to 4 times advantage at performance per square mm.

400%? Do you have any sources to back up this somewhat extraordinary claim?

baybal2 · on April 23, 2021

I used die shots from https://www.tomshardware.com/news/apple-m1-vs-apple-m14-floo...

And 119 mm² total die area claim

For Zen 3 I used https://cdn.wccftech.com/wp-content/uploads/2020/11/AMD-Ryze...

0.5mm² per core on M1 vs 2.2mm² on Zen 3 sans caches

If you add caches, it gets much worse

17.5mm² vs 65mm²

And I may have inadvertently included non-core parts in the M1 calculation, because there is no official die floor map
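
For anyone checking the arithmetic, a small sketch of the ratios those area figures imply. The four numbers are the estimates quoted above, taken at face value; the variable and function names are just for illustration. An area ratio only becomes a performance-per-mm² ratio if you also assume roughly equal performance on both sides, which is an assumption and not something the die shots show.

```python
# Area estimates quoted in the comment above, in mm^2 (taken at face value).
M1_CORE_MM2 = 0.5
ZEN3_CORE_MM2 = 2.2
M1_WITH_CACHE_MM2 = 17.5
ZEN3_WITH_CACHE_MM2 = 65.0


def area_ratio(smaller_mm2, larger_mm2):
    """How many times larger the second area is than the first."""
    return larger_mm2 / smaller_mm2


core_ratio = area_ratio(M1_CORE_MM2, ZEN3_CORE_MM2)
cache_ratio = area_ratio(M1_WITH_CACHE_MM2, ZEN3_WITH_CACHE_MM2)

if __name__ == "__main__":
    print(f"cores only:  {core_ratio:.1f}x")
    print(f"with caches: {cache_ratio:.1f}x")
```

Both ratios land around 4x, which is where the "3 to 4 times" figure comes from, given the equal-performance assumption.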

canadianfella · on April 23, 2021

> Not look too bad best to say.

What does this mean?

watersb · on April 24, 2021

This sounds like a Chinese proverb; there's a whole tradition of pithy groupings of four words.

"Didn't look, too bad" might be a mode of "buyer beware" -- if you are not careful at the market, you might come home with trash.

userbinator · on April 23, 2021

> latest x86 chips have patently gigantic decoders metastasising into backend.

It's still a tiny percentage of the area. The caches are the biggest.

adwn · on April 23, 2021

Area isn't what affects power usage, switching activity is. [1] Besides, the cache size doesn't negatively influence or constrain the design of the core logic, but highly complex instruction decoding absolutely does.

[1] Well, mostly. There is such a thing as static power draw, but I suspect that the transistors in the L3 cache are optimized to have lower static leakage than the transistors in the core logic, which are optimized to be fast.

userbinator · on April 23, 2021

Leakage is actually a significant part of idle power consumption, especially at smaller process sizes, so much so that some CPUs have the ability to turn off parts of caches when idle.

I repeat my stance that x86 instruction decoding is a tiny part of a processor, and things like vector units (which are also often powered down when idle) and reordering logic take far more power.

There's a paper about this that compares the efficiency of different ISAs, and basically concludes that ARM and x86 are no different in that respect. Only MIPS is an awful outlier.

https://www.extremetech.com/extreme/188396-the-final-isa-sho...

adwn · on April 24, 2021

That paper doesn't show what you think it shows. There are too many variables between these CPU implementations to support the conclusion that the ISA doesn't matter for power consumption.