Unexpected Page Fault In Virtualized Environment Advisory

zelon88 · on Dec 10, 2019

Someone should make a graph quantifying the number of "mitigations" and performance impacting patch work for popular Intel SKU's since release.

It would be interesting to see how many times they've patched the same processor and how much slower they are now than when they were made due to all the mitigations.

kardos · on Dec 10, 2019

It's workload dependent, so you're asking for a series of graphs. Not a small ask...

Phoronix has a bunch of articles where they do this but they are on a per-mitigation basis as I recall

freeopinion · on Dec 10, 2019

Do you spell "someone" A-M-D?

olyjohn · on Dec 10, 2019

I was under the impression that AMD has their own fair share of these vulnerabilities, too. They just aren't as common in the datacenter.

sp332 · on Dec 10, 2019

AMD never had one as boneheaded as Meltdown. Intel keeps having more and more uncovered, and according to researchers who went on the record in the NYT, they are not handling reported vulnerabilities quickly or thoroughly. https://www.nytimes.com/2019/11/12/technology/intel-chip-fix...

dmead · on Dec 10, 2019

IIRC intel stopped doing as much validation like 10ish years ago (so, 2009/2010). it would be nice to see then publish a paper about how those decisions lead into these problems...

ShroudedNight · on Dec 11, 2019

My understanding is that chip QA at Intel took somewhat of a nosedive post Haswell. From my ignorant but interested outsider perspective, everything from Broadwell on seemed to be a mess execution-wise compared to Haswell (modulo TSX), and _especially_ compared to Ivy Bridge.

Some of the recorded comments on https://danluu.com/cpu-bugs/ (First update section) mesh with my observations, but I wouldn't know enough to tell if I was on to something, or just confirming my own biases.

xvector · on Dec 10, 2019

Intel willingly admitting mistakes is just about as likely as hell freezing over

dnautics · on Dec 11, 2019

It really is. I got hit pretty bad by an ubuntu intel-microcode package regression, which has this annoying property that soft reboots fail (and hard reboots are fine). I lost about 3 days of work to this[0], and our mitigation (pinning the package to an earlier version) is still painful, because you have to go through one OS installation cycle and still manually reboot (we do a lot of manual OS installations, and debugging "first installs").

Anyways I was bitching about this to my roommate, and she remarked that hey you know acquaintance X we know works in Intel software security division. I told her to give him crap about it and apparently his response was something like "we should have closed comments on that github issue". I feel like this is not a really appropriate response, even between friends.

AFACIT the package still hasn't been fixed and the official ubuntu solution is to roll back to the nonbroken version.

[0] admittedly slightly poor internal communication is also responsible, since this was observed by our support staff for our customers which didn't make it known to R&D - me

teamsforlinux · on Dec 10, 2019

Which happened earlier today with Microsoft teams for linux

m00x · on Dec 10, 2019

Intel is also the most used processor in the market at the moment, once that balance shifts, more attention will be paid to AMD processor so we'll potentially have more vulns uncovered.

farisjarrah · on Dec 10, 2019

I'm inclined to believe that this is actually the case and not that AMD wrote more secure software. All software has security vulnerabilities, the more eyes on the software the more of them are found.

JackRabbitSlim · on Dec 10, 2019

This isn't "Software" It's hardware (well both but lets not get too pedantic) and everyone was throwing cache and speculative exploits by the shovelful at both AMD and Intel. AMDs are indeed at the very least, "Less insecure."

Way more shit stuck to Intel for one reason. The speed advantage Intel had been lording over AMD (Besides compiler shenanigans) was all the corners they were cutting with there speculative execution, et al.

Amazing timing that; AMD closing those benchmark gaps and the mass meltdown mitigations in Intel products... all in the same decade Intel was court ordered to fix their unfair C compiler. Intel's domination is simply over...

dreamcompiler · on Dec 11, 2019

The inverse question is relevant as well: How many performance enhancements over the last 10 years were only possible because Intel ignored security?

bonzini · on Dec 10, 2019

This seems like the usual processor erratum causing potentially very bad things, but only in very rare conditions that no one really understands. It's not another L1TF or similar.

monocasa · on Dec 10, 2019

Hence why we need open source CPUs pretty badly.

vajrabum · on Dec 10, 2019

I'm pretty sure that high performance open source CPUs will have their own obscure problems. Too much complexity, too many dependencies, too many possible feature interactions.

monocasa · on Dec 10, 2019

They will, but you'll be able to understand the problem, the fix, and how it combines with other fixes.

I can def see a world soon where all of Intel's woes have combined to the point that they've run out of patch space for their microcode updates, and you have to pick and choose what you want mitigations for.

hinkley · on Dec 11, 2019

"Good news, our brand new processor has twice as much space for microcode as the old versions!"

MayeulC · on Dec 10, 2019

That is undoubtedly true, but at least you will have more engineers that are able to dig into them to identify the root cause of these behaviours and fix it.

If you are badly impacted by a bug and no one else is, you are the only one with an incentive to find and fix it. You might pay the CPU manufacturer to share the incentive with them, but you'd need quite deep pockets for this.

I wouldn't be surprised if widespread open source CPUs also had better debugging tools at their disposal.

ddtaylor · on Dec 10, 2019

At least with an open source one more than a handful of engineers at a single company could work on the problem.

thu2111 · on Dec 10, 2019

Is that actually better in this case? Intel found the issue internally. Nobody knows what it is. The advisory isn't sufficient information to figure it out. People can patch at their leisure, fairly sure that nobody is about to pop up with a 1-day exploit for it.

With an open source CPU, by now someone would have looked at the commits that fixed the Verilog/microcode, figured out what the bug is, and there'd be a convenient command line tool to get root on the hypervisor uploaded to GitHub within an hour.

This is one of those times when from a practical perspective proprietary seems to win.

bonzini · on Dec 10, 2019

> This is one of those times when from a practical perspective proprietary seems to win.

I'm not sure about that, but I must say that Intel is being surprisingly candid. Similar errata have been swept under a rug and published a dozen at a time with no workarounds for years.

monocasa · on Dec 10, 2019

Yeah, this feels like active exploitation.

ddtaylor · on Dec 10, 2019

Many open source systems exist that fix vulnerabilities in a timely manner without falling apart. Linux, BSD, etc.

tenebrisalietum · on Dec 10, 2019

You don't know that such a command doesn't exist for the Intel vulnerabilities, but hasn't been released to the wild or public.

thu2111 · on Dec 11, 2019

I can be fairly sure, as otherwise Intel wouldn't be claiming the issue was found via their own internal audits, and likely someone else would be writing about it.

sillysaurusx · on Dec 10, 2019

Is that actually better in this case? Intel found the issue internally. Nobody knows what it is. The advisory isn't sufficient information to figure it out. People can patch at their leisure, fairly sure that nobody is about to pop up with a 1-day exploit for it.

That's pretty bad, actually. It means a determined adversary can simply look at the patch to figure out how to exploit vulnerable systems. (Presumably there exists a way to look at the actual unencrypted bytes being modified; if so, you can work out what it's doing.)

And since people can patch at their leisure, a determined adversary will have lots of targets to choose from after they analyze the patch.

To be fair, I don't know much about CPU microcode. But while it's true that lonewolf hackers are less likely to be a threat here, a threat does exist: governments are increasingly turning to industrial espionage-type practices (apparently even the NSA https://en.wikipedia.org/wiki/ECHELON#Examples_of_industrial...) and this type of exploit seems, at a glance, pretty lucrative: unauthenticated users can achieve privilege escalation.

It's easy to imagine some facility somewhere of industrious Chinese reverse engineers who are pretty darn good at this, and that it's their full-time job to find and weaponize such exploits. In fact, swap out "Chinese" with "American" and you get the NSA.

EDIT: It turns out that I am mistaken: Intel microcode updates are encrypted. https://en.wikipedia.org/wiki/Intel_Microcode#Microcode_upda...

> With the Pentium there are two layers of encryption and the precise details explicitly not documented Intel, instead being only known to less than ten employees.

I guess I'll leave the comment up, since... well, I was formerly a pentester, and it seemed like a logical sequence of arguments. That's where I learned about the technique of looking at binary diffs to work out what security patches were doing.

It's very strange to me that this is possible to encrypt, though. Isn't it "just" a matter of getting your hands on a processor + the update? Why is it impossible to dump the microcode as it's being decrypted? Sure, you won't be able to analyze the patch before it's decrypted, but are we just relying on the idea that it's too much work for someone to figure out how to listen in on the decrypting process?

Following that Wikipedia citation, the quote about it being in the heads of less than 10 employees is from 1997, so it's ancient information. I'm curious what the current state of the art is.

monocasa · on Dec 10, 2019

They're encrypted with a key that's shipped on every processor they ship. A combo of classic espionage and electron microscopes means that we should assume state actors can know the exact mechanism of microcode update changes.

sillysaurusx · on Dec 10, 2019

Thanks! In that case, they seemed to have a good point: if it was an open source CPU, it seems like security might be an issue.

One way to do it: ship security updates using the same technique as intel, and don't release the source code for the fix until much later. I think I remember an open source project doing something similar. But of course, it seems pretty hard to manage that complexity: what if the fix introduces code changes that future commits depend on?

Interesting problem...

klyrs · on Dec 11, 2019

I'm by no means an expert, but open design and verification of secure enclaves seems quite feasible -- keys would differ between different chip makers, I imagine. Folks could write patches, but perhaps not sign them for hardware they own. Though I'd expect most maintainers to work with the community to get bugs fixed.

strstr · on Dec 10, 2019

Anyone know what conditions are required? The advisory is sparse on details.

The errata lists the same vague info: https://www.intel.com/content/dam/www/public/us/en/documents...

erk__ · on Dec 11, 2019

>November 2910

They seem to have gotten information about the bug from the future :P

This could also show that trhey released it in a hurry since they did not fix that typing mistake.

ars · on Dec 11, 2019

It's not a typo, it's in Middle-endian format. :)

hinkley · on Dec 11, 2019

It looks more like a concurrency bug to me.

_Codemonkeyism · on Dec 11, 2019

It feels like the only mails I get from DigitalOcean are about Intel processors.

nopurpose · on Dec 10, 2019

Intel now runs bug bounty program with up to $100k payouts (https://www.intel.com/content/www/us/en/security-center/bug-...), where one of the requirements is not to leak vulnerability details.

cortesoft · on Dec 10, 2019

Isn't that a pretty standard bug bounty requirement? The idea is that you submit the bug to the company and they fix it before it is disclosed.

throwawaymath · on Dec 10, 2019

It is standard in the sense that it's not uncommon. But about as frequently it's not a requirement. Many companies allow complete or partial vulnerability disclosure once resolution is complete. It's often on a case by case basis and requires approval.

cortesoft · on Dec 11, 2019

Oh, I thought that was what you meant (until resolution).. didn't realize they block disclosure forever

ysleepy · on Dec 10, 2019

Skylake and newer.

Is Broadwell and before not affected or are those not mentioned since their support cycle has ended? I'd be surprised with Intel spinning up Haswell production for lower grade CPUs on 22nm, but I can't be sure.

bdibs · on Dec 10, 2019

There seem to be quite a few security updates today from Intel: https://www.us-cert.gov/ncas/current-activity/2019/12/10/int...

_Codemonkeyism · on Dec 11, 2019

Intel feels so much like Boeing now.