As a result of our work, we now believe that speculative vulnerabilities on today’s hardware defeat all language-enforced confidentiality with no known comprehensive software mitigations, as we have discovered that untrusted code can construct a universal read gadget to read all memory in the same address space through side-channels. In the face of this reality, we have shifted the security model of the Chrome web browser and V8 to process isolation.
Processes are pretty heavyweight as a way to perform this sort of isolation. I can't help thinking that something like sthreads and tagged memory from Andrea Bittau's wedge system would be great OS primitives to have right now.
> Processes are pretty heavyweight as a way to perform this sort of isolation.
I've been thinking that we don't really need all the features offered by processes. My current understanding is that we just need a different address space. Why can't we, for example, switch from one set of pages to another when we switch from running trusted browser code to running JIT-ed untrusted code? (Leaving of course a small piece of trampoline code mapped, like KPTI.) On a simple level, this could just be calling mprotect() at certain key locations that result in flipping a few bits in the kernel-maintained page tables. With some good design, perhaps the addresses for untrusted code and data can be so far away from trusted code and data that maybe just one bit flip is needed in a PML4E.
The wedge work did pretty much this - it allowed multiple page tables to be used within one process, and memory tags indicated which regions should be accessible to each "sthread". In this case, an sthread would run the JITed untrusted code. There are a lot of details to get right though, such as callgates between untrusted and trusted code so it's possible to call OS or process support functions. Anyway, although wedge was not intended to guard against spectre, it should do so nicely. Andrea had all this working in Linux.
> Processes are pretty heavyweight as a way to perform this sort of isolation.
How do you figure? Processes on some OS's, such as Linux, are pretty cheap. So what are you considering heavy? And how do you imagine a "process switching in userspace" type thing to not just have the same weight as real processes? What's the expensive thing you're trying to eliminate?
You can do these things as a mitigation, but it doesn't fully solve the problem because untrusted programs can try to get timing info from other sources. For example, incrementing a counter inside an infinite while loop can give you a good estimate of time, measured in CPU clock cycles.
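To make the counter trick concrete, here is a minimal sketch in Python (chosen only for illustration; in a browser the analogous approach would be a worker incrementing a value in shared memory). The `CounterClock` class and its method names are made up for this example, and the tick rate depends entirely on the machine and runtime, so treat the numbers as relative, not as real cycle counts.

```python
import threading
import time


class CounterClock:
    """Hypothetical homemade clock: a background thread that just counts."""

    def __init__(self):
        self.ticks = 0
        self._running = False
        self._thread = None

    def _spin(self):
        # The "infinite while loop": every iteration is one tick.
        while self._running:
            self.ticks += 1

    def start(self):
        self._running = True
        self._thread = threading.Thread(target=self._spin, daemon=True)
        self._thread.start()

    def stop(self):
        self._running = False
        self._thread.join()

    def now(self):
        # Reading the counter needs no timer API at all.
        return self.ticks


clock = CounterClock()
clock.start()
before = clock.now()
time.sleep(0.05)  # stand-in for the operation being timed
after = clock.now()
clock.stop()
print("elapsed ticks:", after - before)
```

The point is that removing or coarsening the official timer API does not remove the ability to measure time, as long as the code can run two things concurrently and share a value between them.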
Purity is not enough to prevent timing attacks. A famous example is string equality. The running time of a typical implementation is proportional to the length of the longest common prefix between the two strings. To prevent this information leak the implementation must be written to always iterate until the end.
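A small sketch of the string equality example, assuming byte strings and counting loop iterations as a stand-in for running time. Both function names are invented for this illustration. Each function is pure, yet the first one does an amount of work that depends on how much of the secret the guess got right.

```python
def early_exit_equal(a: bytes, b: bytes):
    """Typical comparison: stops at the first mismatch.

    Returns (equal, number of byte comparisons performed).
    """
    steps = 0
    if len(a) != len(b):
        return False, steps
    for x, y in zip(a, b):
        steps += 1
        if x != y:
            return False, steps
    return True, steps


def full_scan_equal(a: bytes, b: bytes):
    """Always iterates to the end, accumulating differences with OR."""
    steps = 0
    if len(a) != len(b):
        return False, steps
    diff = 0
    for x, y in zip(a, b):
        steps += 1
        diff |= x ^ y
    return diff == 0, steps


secret = b"hunter2hunter2"
for guess in (b"xxxxxxxxxxxxxx", b"huntxxxxxxxxxx", b"hunter2huntxxx"):
    # Early exit does more work the longer the matching prefix is;
    # the full scan always does len(secret) comparisons.
    print(guess, early_exit_equal(secret, guess)[1], full_scan_equal(secret, guess)[1])
```

In real Python code you would reach for `hmac.compare_digest` from the standard library rather than hand-rolling this; the loop version is only here to show where the leak comes from.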
But spectre is even worse than that as it attacks at the hardware level. If the CPU does any sort of speculative optimization, through things like caches or branch prediction buffers, then it likely can be the target of a spectre attack. You can try to add spectre mitigations to your language's compiler but, as the article discusses, this approach is an uphill battle.
Purity is enough, but purity is not useful. If JavaScript were entirely pure it couldn't do any IO. You couldn't do anything with the string compare result, because acting on it requires IO, and that IO is not allowed.
Pure algorithms are a useful thing in programming. However, pure algorithms are not useful on their own: you always need something impure to get the output out.
I think it is the same. However he is right when he wonders if this is enough for a browser scripting language. A browser scripting language requires some impure things to be useful.
An attacker could probably use a remote server and measure the timing differences between HTTP packets.
It doesn't matter if your timing source is high noise with lots of jitter; the attacker can repeat the measurements over and over again and filter the noise out.
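A quick simulation of why repetition beats jitter. Everything here is assumed for illustration: the two "true" costs (100 and 101 units), Gaussian noise with a standard deviation of 50, and the helper names. Real timing noise is not Gaussian, but the averaging argument is the same.

```python
import random
import statistics


def noisy_timer(true_cost, rng, jitter=50.0):
    """One measurement: the real cost buried under much larger random noise."""
    return true_cost + rng.gauss(0.0, jitter)


def estimate(true_cost, samples, rng):
    """Average many noisy measurements of the same operation."""
    return statistics.fmean(noisy_timer(true_cost, rng) for _ in range(samples))


rng = random.Random(0)
fast, slow = 100.0, 101.0  # a 1-unit difference, 50x smaller than the jitter
for n in (10, 1000, 200_000):
    gap = estimate(slow, n, rng) - estimate(fast, n, rng)
    print(n, "samples -> measured gap", round(gap, 2))
```

The error on the averaged gap shrinks roughly with the square root of the sample count, so with few samples the measured gap is dominated by noise, and with enough of them it settles near the true difference of 1. Adding jitter raises the attacker's cost but does not close the channel.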
Even if the attacker can only pull out a few bytes per second, that might be enough to leak something critical like an encryption key or an ASLR offset.
There is a known simple mitigation. Don't JIT random JavaScript, back to interpretation. Of course that means there is no use for V8, which is why it's not in this paper.
V8 has a JIT-free mode now, and it's still pretty damn fast in real world situations. It looks like in synthetic benchmarks they saw up to 80% decrease in performance, but in real-world applications they saw as little as a 6% decrease.
There are going to be use cases that would break, for example with the Web Audio APIs, because it would make it impossible to precisely time or trigger events.