AMD Ryzen 7000 Series performs better with Spectre V2 Mitigations enabled

MBCook · on Oct 8, 2022

“As for why the Ryzen 7000 series performance is actually slower if disabling the Spectre V2 mitigations, that's likely something only AMD can effectively answer but presumably…”

Just ask!

Seriously. You can ask AMD. Maybe they won’t tell you, but they might. It might be really good info. Why not ask someone who is really knowledgeable about this stuff like a kernel developer who works on x86-64 or worked on the mitigations?

This is what I never understand about Phoronix. People link to them all the time but they run a bunch of benchmarks and then end on “there you go”. I’d like investigation into why. You won’t always get an answer but you should try.

p-e-w · on Oct 8, 2022

> Seriously. You can ask AMD. Maybe they won’t tell you, but they might.

Seriously? By far the most likely outcome is that they don't even bother responding.

That's how corporations operate in 2022. If you're a journalist from the Washington Post, you might get a three-line statement from a spokesperson. Everyone else gets a canned reply from a bot, or nothing at all.

I don't blame Michael for not even bothering to try. Phoronix is the only game in town for many of the topics it covers. The fact that it exists at all, and is effectively run by a single person, is nothing short of amazing.

worthless-trash · on Oct 8, 2022

> That's how corporations operate in 2022. If you're a journalist from the Washington Post, you might get a three-line statement from a spokesperson. Everyone else gets a canned reply from a bot, or nothing at all.

This is not true. I write to the AMD security team about Spectre/Meltdown/CPU flaws and get reasonable responses. The Intel security team is also quite good. The usual wait time is 3-4 days.

Admittedly I do write from my work address.

inglor · on Oct 8, 2022

I answer engineers from other companies all the time but never answer journalists. Engineers give a vibe of “cares about what you do and wants to collaborate”.

Talking to journalists is like talking to the police - as an engineer you don’t understand the situation well enough and you need to involve an (expensive) professional to navigate it.

The question going through my head when I get journalist emails is “can I ignore this and save the hours of work with legal/pr?”.

marcosdumay · on Oct 8, 2022

In most places only the PR team is even allowed to talk to journalists, exactly because of that "is like talking to the police" issue.

It's a very sensible and correct policy.

worthless-trash · on Oct 12, 2022

That's a good point, I had never thought of that.

spookie · on Oct 8, 2022

True that

p-e-w · on Oct 8, 2022

That's great, but I suspect the address you write from indeed plays a big part here. Direct responses from engineers of that caliber within a matter of days are most certainly not standard.

throwawaybutwhy · on Oct 9, 2022

Probably somewhere in A.A. County.

stingraycharles · on Oct 8, 2022

What’s your work, out of curiosity? I suppose security researcher?

worthless-trash · on Oct 12, 2022

Red Hat Kernel security team.

TheRealPomax · on Oct 8, 2022

> By far the most likely outcome is that they don't even bother responding

So, "likely" the same outcome as when you don't ask, except unlike when you don't ask, there is a different outcome when you do ask? Sounds like you just made the argument in favour of always asking questions.

Blikkentrekker · on Oct 8, 2022

Under the assumption that time is not valuable. Writing such an email is still an investment of time.

theow7384iri · on Oct 8, 2022

I don't think Phoronix has the time and budget for that. It is a single man, a few scripts and heavy automation. Sometimes he finds bugs and reports them, but there is no time to follow such a rabbit hole!

getcrunk · on Oct 8, 2022

if anything, industry should reach out to him imo. he has a lot of mind share in the IT community

albertopv · on Oct 8, 2022

Phoronix is a one man show literally doing everything: hw management, tests, articles, news reports, forum admin, premium users management...

woliveirajr · on Oct 8, 2022

Reminds me of the "supercow powers" question at StackOverflow (Unix StackExchange): everybody was just guessing, I dared to ask Jason Gunthorpe and he didn't answer me... he made a post about it:

https://web.archive.org/web/20190322061230/https://plus.goog...

hericium · on Oct 8, 2022

> Just ask! Seriously. You can ask AMD.

But it's Phoronix - short reprints, not well-done benchmarks, 1-3 paragraphs long "articles" and backlink spam on forums, wikis and wherever possible. They may ask AMD but that will be another "article" and another wave of spam.

Ristovski · on Oct 7, 2022

An interesting comment made in the Phoronix forums:

> My theory is that fixing the Spectre V2 vulnerability on a hardware level would lead to fundamental architecture changes that AMD is not willing to make, because it may add so much more complexity to the architecture or it may just be too inconvenient. They probably realized that optimizing the code paths that the Linux kernel utilizes in the default mitigations mode is faster and simpler and may involve fewer deep changes, while still being secure.

> As far as I know, pretty much every CPU architecture that implements speculative execution is vulnerable to some version of Spectre, so note that this is not a fundamental flaw of AMD64.

Sirened · on Oct 8, 2022

I am terrified to think what AMD's predictor structure is if it's easier for them to do _this_ than it is to simply add privilege tags to their predictors. I don't personally buy this explanation anyways; trying to optimize retpolines in hardware would be an absolute pain in the ass and require an insane amount of synchronization with the backend since retpolines always trash the RAS.

sounds · on Oct 8, 2022

I would guess it's probably physics. Specifically, the signal path routing for the predictor core must be pretty heavily optimized, and is probably where AMD (and Intel) have invested heavily in advanced design software - their secret sauce - to push the chip right to the edge of what semiconductor physics can achieve.

The branch predictor is one of the most highly optimized pieces of the CPU core. Lots of discussion has been had about how the arm architecture's frontend is simpler, so for example Apple's chips have way more execution units. Intel and AMD's latest designs have also expanded the number of execution units, but the frontend instruction decode and dispatch is the "serial" part of the process, reading the incoming instruction stream. And the x86 instruction set is hard to decode, with a lot of variation in the number of bytes per instruction. So for the instruction decoder to even know there's a branch coming up is a "hard problem," and then it predicts which way the branch will go.

ajross · on Oct 8, 2022

That seems like a disaster brewing. The whole spectre family of vulnerabilities is a side effect of CPUs keeping state around to optimize things and leaking that data between privilege levels.

I mean, in the humorous extreme: imagine if some enterprising group at AMD got together and realized they could "optimize" all that retpoline code by making the RET instruction aware of the branch prediction cache!

"Fundamental architecture changes" are, in fact, what is actually required here.

tadfisher · on Oct 8, 2022

Occam's Razor would suggest a Linux bug.

iforgotpassword · on Oct 8, 2022

Or maybe if we want to keep in the realm of AMD doing something on purpose here, maybe they can detect that the kernel is run with mitigations and then just let the CPU do all the unsafe speculation, while without the mitigations, they disable a lot of the speculative stuff which is somehow even slower than the former case.

Sesse__ · on Oct 8, 2022

Occam's Razor would suggest that Phoronix' benchmarks are broken.

galaxyLogic · on Oct 7, 2022

https://www.pcgamer.com/intel-amd-and-arm-cpus-hit-by-new-sp...

dundarious · on Oct 8, 2022

That's the Spectre V2 from the OP title. What's the point this link is meant to convey?

aliqot · on Oct 8, 2022

Probably that they had a feeling they wanted to refute it somehow but figured an article would be more authoritative, except the one they picked was chosen for its headline rather than its content.

dundarious · on Oct 8, 2022

Yes, it certainly seems that way.

ZiiS · on Oct 8, 2022

This is basically just saying: the super clever AMD designers and Linux kernel developers have optimised for the setting they recommend and most people use. An insecure setting they recommend against isn't yet well optimised on brand new hardware.

staticassertion · on Oct 8, 2022

Yes, the interesting question is how they did this.

stevefan1999 · on Oct 8, 2022

meh, maybe they just flush the TLB every time at the hardware level if you disable the mitigations, and disable the hardware flush if the software side can handle it

staticassertion · on Oct 8, 2022

This is basically my guess. In the event of no mitigations you have to pessimistically flush TLB. In the case of mitigations the flush only has to happen in specific instances, which PCID significantly helps with but requires cooperation from both hardware and software.

So maybe even something like "if PCID is not set, flush the cache" vs "if PCID changes, flush the cache".

I think that would certainly account for this, and they could have improved the performance with PCID by increasing the PCID cache size.
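The two flush policies guessed at above can be compared with a toy model. This is only a sketch of the comment's wording ("flush on every switch" versus "flush only when the PCID changes"), not a description of how Zen 4 or the Linux kernel actually behaves. The function names and the sample trace are made up for illustration.

```python
from typing import Sequence


def flushes_without_pcid(switches: Sequence[int]) -> int:
    """Pessimistic policy: every context switch flushes the TLB."""
    return len(switches)


def flushes_with_pcid(switches: Sequence[int]) -> int:
    """Tagged policy from the comment: flush only when the PCID changes."""
    flushes = 0
    current = None  # no tag loaded before the first switch
    for tag in switches:
        if tag != current:
            flushes += 1
            current = tag
    return flushes


# Hypothetical trace: the PCID active after each successive switch.
trace = [1, 1, 2, 2, 2, 1, 1]

if __name__ == "__main__":
    print("flush on every switch:", flushes_without_pcid(trace))
    print("flush on PCID change: ", flushes_with_pcid(trace))
```

On this made-up trace the pessimistic policy flushes 7 times and the tagged policy 3 times, which is the kind of gap the guess relies on.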

bmacho · on Oct 8, 2022

So mitigations = off when they do flush, and mitigations = on when they use previous data and weird shenanigans?

Cloudef · on Oct 8, 2022

Yeah it seems like hw mitigation is slower than software

amelius · on Oct 8, 2022

Perhaps their Spectre path has benchmark detection :)

staticassertion · on Oct 8, 2022

Interesting. I wonder if the KPTI path leveraging PCID has 'tipped' into a performance improvement? Maybe a larger PCID cache on the CPUs and optimized codepaths for Spectre usage?

Havoc · on Oct 8, 2022

Interesting. When I first heard this I assumed it was bad benchmarking, but Phoronix saying the same suggests otherwise

sylware · on Oct 8, 2022

this smells really bad... as it does not make a lot of sense, the devil hides in the details.

simlevesque · on Oct 8, 2022

It seems logical that if Zen 4 is immune to Spectre, the mitigations are a waste and therefore slow the CPU.

pdpi · on Oct 8, 2022

Except that that's the opposite of what's going on. The CPU performs better with the "wasted" mitigations turned on.
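For readers who want to check which of the two configurations discussed in this thread their own machine is running, Linux reports the Spectre V2 state in sysfs. A minimal sketch, assuming a Linux system that exposes the file below; the `classify` and `read_status` helpers are hypothetical names, and the coarse categories are a simplification of the kernel's status line.

```python
from pathlib import Path
from typing import Optional

# Standard Linux sysfs entry for the kernel's Spectre V2 report.
SPECTRE_V2 = Path("/sys/devices/system/cpu/vulnerabilities/spectre_v2")


def classify(status: str) -> str:
    """Map the kernel's status line to a coarse category (a simplification)."""
    status = status.strip()
    if status.startswith("Not affected"):
        return "not affected"
    if status.startswith("Mitigation:"):
        return "mitigated"
    if status.startswith("Vulnerable"):
        return "vulnerable"
    return "unknown"


def read_status(path: Path = SPECTRE_V2) -> Optional[str]:
    """Return the raw status line, or None if the file cannot be read."""
    try:
        return path.read_text().strip()
    except OSError:
        return None


if __name__ == "__main__":
    raw = read_status()
    if raw is None:
        print("Spectre V2 status not available on this system")
    else:
        print(classify(raw), "|", raw)
```

Booting with the kernel parameter `mitigations=off` is the usual way to get the "disabled" case, so running this before a benchmark confirms which side of the comparison is actually being measured.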