“As for why the Ryzen 7000 series performance is actually slower if disabling the Spectre V2 mitigations, that's likely something only AMD can effectively answer but presumably…”
Just ask!
Seriously. You can ask AMD. Maybe they won’t tell you, but they might. It might be really good info. Why not ask someone who is really knowledgeable about this stuff like a kernel developer who works on x86-64 or worked on the mitigations?
This is what I never understand about Phoronix. People link to them all the time but they run a bunch of benchmarks and then end on “there you go”. I’d like investigation into why. You won’t always get an answer but you should try.
> Seriously. You can ask AMD. Maybe they won’t tell you, but they might.
Seriously? By far the most likely outcome is that they don't even bother responding.
That's how corporations operate in 2022. If you're a journalist from the Washington Post, you might get a three-line statement from a spokesperson. Everyone else gets a canned reply from a bot, or nothing at all.
I don't blame Michael for not even bothering to try. Phoronix is the only game in town for many of the topics it covers. The fact that it exists at all, and is effectively run by a single person, is nothing short of amazing.
> That's how corporations operate in 2022. If you're a journalist from the Washington Post, you might get a three-line statement from a spokesperson. Everyone else gets a canned reply from a bot, or nothing at all.
This is not true. I write to the AMD security team about Spectre/Meltdown/CPU flaws and get reasonable responses. The Intel security team is also quite good. The usual wait time is 3-4 days.
I answer engineers from other companies all the time but never answer journalists. Engineers give a vibe of “cares about what you do and wants to collaborate”.
Talking to journalists is like talking to the police - as an engineer you don’t understand the situation well enough and you need to involve an (expensive) professional to navigate it.
The question going through my head when I get journalist emails is “can I ignore this and save the hours of work with legal/pr?”.
That's great, but I suspect the address you write from plays a big part here. Direct responses from engineers of that caliber within a matter of days are most certainly not standard.
> By far the most likely outcome is that they don't even bother responding
So, "likely" the same outcome as when you don't ask, except unlike when you don't ask, there is a different outcome when you do ask? Sounds like you just made the argument in favour of always asking questions.
I don't think Phoronix has the time or budget for that. It is a single man, a few scripts and heavy automation. Sometimes he finds bugs and reports them, but there is no time to follow such a rabbit hole!
Reminds me of the "supercow powers" question on StackOverflow (Unix StackExchange): everybody was just guessing, I dared to ask Jason Gunthorpe and he didn't answer me... he made a post about it:
But it's Phoronix - short reprints, not well-done benchmarks, 1-3 paragraphs long "articles" and backlink spam on forums, wikis and wherever possible. They may ask AMD but that will be another "article" and another wave of spam.
An interesting comment made in the Phoronix forums:
> My theory is that fixing the Spectre V2 vulnerability on a hardware level would lead to fundamental architecture changes that AMD is not willing to make, because it may add so much more complexity to the architecture or it may just be too inconvenient. They probably realized that optimizing the code paths that the Linux kernel utilizes on the default mitigations mode is faster, simpler and may involve fewer deep changes, while still being secure.
> As far as I know, pretty much every CPU architecture that implements speculative execution is vulnerable to some version of Spectre, so note that this is not a fundamental flaw of AMD64.
I am terrified to think what AMD's predictor structure is if it's easier for them to do _this_ than it is to simply add privilege tags to their predictors. I don't personally buy this explanation anyways; trying to optimize retpolines in hardware would be an absolute pain in the ass and require an insane amount of synchronization with the backend since retpolines always trash the RAS.
I would guess it's probably physics. Specifically, the signal path routing for the predictor core must be pretty heavily optimized, and is probably where AMD (and Intel) have invested heavily in advanced design software - their secret sauce - to push the chip right to the edge of what semiconductor physics can achieve.
The branch predictor is one of the most highly optimized pieces of the CPU core. Lots of discussion has been had about how the arm architecture's frontend is simpler, so for example Apple's chips have way more execution units. Intel and AMD's latest designs have also expanded the number of execution units, but the frontend instruction decode and dispatch is the "serial" part of the process, reading the incoming instruction stream. And the x86 instruction set is hard to decode, with a lot of variation in the number of bytes per instruction. So for the instruction decoder to even know there's a branch coming up is a "hard problem," and then it predicts which way the branch will go.
That seems like a disaster brewing. The whole spectre family of vulnerabilities is a side effect of CPUs keeping state around to optimize things and leaking that data between privilege levels.
I mean, in the humorous extreme: imagine if some enterprising group at AMD got together and realized they could "optimize" all that retpoline code by making the RET instruction aware of the branch prediction cache!
"Fundamental architecture changes" are, in fact, what is actually required here.
Or maybe if we want to keep in the realm of AMD doing something on purpose here, maybe they can detect that the kernel is run with mitigations and then just let the CPU do all the unsafe speculation, while without the mitigations, they disable a lot of the speculative stuff which is somehow even slower than the former case.
Probably that they had a feeling they wanted to refute it somehow but figured an article would be more authoritative, except the one they picked was picked by the headline rather than content.
This is basically just saying: the super clever AMD designers and Linux kernel developers have optimised for the setting they recommend and most people use. An insecure setting they recommend against isn't yet well optimised on brand new hardware.
Meh, maybe they just flush the TLB every time at the hardware level if you disabled the mitigations, and disable the hardware flush if the software side can handle it.
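For anyone who wants to see which setting their own machine is actually running, here is a minimal sketch. It assumes a Linux kernel that exposes the usual sysfs vulnerability files; the helper name `spectre_v2_status` is made up for illustration, and the function returns `None` when the file is not there.

```python
from pathlib import Path

# Standard sysfs location on Linux kernels that report CPU vulnerabilities.
SPECTRE_V2_PATH = Path("/sys/devices/system/cpu/vulnerabilities/spectre_v2")


def spectre_v2_status(path=SPECTRE_V2_PATH):
    """Return the kernel's Spectre V2 status line, or None if unavailable.

    Hypothetical helper: it only reads the file and does not interpret
    the contents, since the exact wording varies by kernel and CPU.
    """
    try:
        return Path(path).read_text().strip()
    except OSError:
        # Not Linux, an old kernel, or a restricted sandbox.
        return None


if __name__ == "__main__":
    print(spectre_v2_status() or "status not available on this system")
```

The output is whatever the kernel reports, so it is only a starting point for comparing a default boot against one with the mitigations turned off.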
This is basically my guess. In the event of no mitigations you have to pessimistically flush TLB. In the case of mitigations the flush only has to happen in specific instances, which PCID significantly helps with but requires cooperation from both hardware and software.
So maybe even something like "if PCID is not set, flush the cache" vs "if PCID changes, flush the cache".
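To make the difference between those two rules concrete, here is a toy model in Python. It is purely a sketch of the guess above, not a description of what AMD hardware or the Linux kernel actually does; the function name, the policy names and the trace format are all invented for illustration.

```python
def count_flushes(pcid_trace, policy):
    """Count flushes over a sequence of context switches.

    pcid_trace: the PCID seen at each switch, or None when PCID is not in use.
    policy: "flush_if_unset"  -> flush whenever no PCID is set
            "flush_on_change" -> flush only when the PCID differs from the last one
    Both policies are hypothetical simplifications of the comment above.
    """
    if policy not in ("flush_if_unset", "flush_on_change"):
        raise ValueError(f"unknown policy: {policy}")

    flushes = 0
    previous = None  # nothing has run yet
    for pcid in pcid_trace:
        if policy == "flush_if_unset":
            if pcid is None:
                flushes += 1
        elif pcid != previous:
            flushes += 1
        previous = pcid
    return flushes


# Six switches with no PCID in use: the pessimistic rule flushes every time.
print(count_flushes([None] * 6, "flush_if_unset"))        # 6

# Six switches between two tagged address spaces: only the changes flush.
print(count_flushes([1, 1, 2, 2, 1, 1], "flush_on_change"))  # 3
```

In this toy version the tagged case flushes half as often on the same number of switches, which is the kind of gap the guess relies on. Whether real hardware behaves anything like this is exactly the open question.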
I think that would certainly account for this, and they could have improved the performance with PCID by increasing the PCID cache size.
Interesting. I wonder if the KPTI path leveraging PCID has 'tipped' into a performance improvement? Maybe a larger PCID cache on the CPUs and optimized codepaths for Spectre usage?