Can someone explain how this works? What would trigger a misbranch and how can the OS tell the difference? I read the page and the link but I don't understand.
As a ROP mitigation measure some recent CPU models have a feature that can require code to branch to a special instruction signifying a valid target; jumps to an address that doesn't contain that instruction result in a CPU fault. On OpenBSD such a fault is apparently translated to SIGILL, a signal that can also be raised for other issues, like an invalidly encoded instruction. This patch differentiates the new type of fault so it's easier to identify within a process's signal handlers (sketched below), by adding a new reason code, ILL_BTCFI, to the SIGILL info (siginfo_t).
I'm not sure if this patch is noteworthy on its own, but perhaps it was posted because it implies this CPU feature is now supported or soon to be supported by OpenBSD.
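To make that concrete, a process could tell the new fault apart roughly like this (just a sketch; it assumes the ILL_BTCFI code from the patch ends up visible through <signal.h>):

    /* Sketch: telling a branch-target CFI fault apart from other SIGILLs.
       Assumes the ILL_BTCFI si_code from the patch is exposed via <signal.h>. */
    #include <signal.h>
    #include <string.h>
    #include <unistd.h>

    static void sigill_handler(int sig, siginfo_t *info, void *ctx)
    {
        (void)sig;
        (void)ctx;
    #ifdef ILL_BTCFI
        if (info->si_code == ILL_BTCFI) {
            /* An indirect branch landed somewhere without endbr64/BTI. */
            static const char msg[] = "branch target CFI violation\n";
            write(STDERR_FILENO, msg, sizeof(msg) - 1);
            _exit(1);
        }
    #endif
        /* Any other SIGILL, e.g. an invalidly encoded instruction. */
        static const char other[] = "illegal instruction\n";
        write(STDERR_FILENO, other, sizeof(other) - 1);
        _exit(1);
    }

    int main(void)
    {
        struct sigaction sa;

        memset(&sa, 0, sizeof(sa));
        sa.sa_sigaction = sigill_handler;
        sa.sa_flags = SA_SIGINFO;
        sigemptyset(&sa.sa_mask);
        sigaction(SIGILL, &sa, NULL);

        /* ... rest of the program ... */
        return 0;
    }

The handler sticks to async-signal-safe calls (write, _exit), which is about all you can portably do inside a signal handler anyway.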
> I'm not sure if this patch is noteworthy on its own, but perhaps it was posted because it implies this CPU feature is now supported or soon to be supported by OpenBSD.
OpenBSD has supported mandatory BTI/IBT on Intel CPUs (11th Gen+) and arm64 (Apple M2) since 7.4 (Oct 16, 2023).
OpenBSD recently disabled retpolines by default because they are incompatible with IBT, so there has been some related work recently that this patch could help with, e.g. adding missing indirect branch target instructions (endbr64) that were previously hidden by retpolines. OpenBSD has done a lot of work to push these changes into the entire software ecosystem.
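For anyone who wants to see what a valid target looks like (an illustrative sketch, not OpenBSD-specific): building with -fcf-protection=branch makes GCC and Clang start indirectly-reachable functions with endbr64, which you can confirm in the disassembly:

    /* cfi_demo.c -- sketch, not tied to any particular OS.
       Build:   cc -fcf-protection=branch -o cfi_demo cfi_demo.c
       Inspect: objdump -d cfi_demo   (hello begins with endbr64)
       Under enforced IBT the indirect call below is only allowed because
       the compiler placed endbr64 at hello's entry point; hand-written
       assembly that omits it becomes an invalid target. */
    #include <stdio.h>

    void hello(void)
    {
        puts("reached a valid indirect branch target");
    }

    int main(void)
    {
        /* volatile keeps the compiler from folding this into a direct call */
        void (*volatile fp)(void) = hello;

        fp();
        return 0;
    }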
For more context, CET and friends don't actually act as a ROP mitigation in and of themselves. CET is a coarse forward-edge CFI technique. For completeness, you must pair it with both some software function-type assertion (to prevent argument type confusion by swapping function pointers, etc.; sketched below) as well as a backward-edge CFI mitigation (such as a secure shadow stack for storing link addresses). Both are quite hard problems.
ARM's Pointer Authentication extension is an interesting alternative which does all of this in one feature by cryptographically signing pointers in memory.
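To make the forward-edge gap concrete (a hypothetical example, not code from the patch): IBT only checks that an indirect call lands on an endbr64-marked instruction, not which function that is or whether the signature matches, so a corrupted function pointer can still be pointed at any other legitimately marked function:

    /* Sketch: why coarse forward-edge CFI wants a type check on top.
       Both functions begin with endbr64 under IBT, so the swapped call
       still passes the hardware check even though the argument types
       differ.  The cast and call below are undefined behaviour in ISO C
       and are shown purely to illustrate the confusion; on x86-64 SysV
       the pointer argument is simply reinterpreted as a long. */
    #include <stdio.h>

    static void greet(const char *name)     { printf("hi %s\n", name); }
    static void set_priv_level(long level)  { printf("privilege level %ld\n", level); }

    struct handler {
        void (*fn)(const char *);  /* imagine this next to an overflowable buffer */
        const char *arg;
    };

    int main(void)
    {
        struct handler h = { greet, "world" };

        h.fn(h.arg);                                   /* intended call */

        /* Simulated memory corruption: the pointer now names a different,
           type-confused (but IBT-valid) target. */
        h.fn = (void (*)(const char *))set_priv_level;
        h.fn(h.arg);

        return 0;
    }

Something like Clang's -fsanitize=cfi-icall is one example of the software-side type check that catches this class of swap.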
CET is both Indirect Branch Tracking and Shadow Stacks; IBT is the coarse forward-edge CFI, while the shadow stack is a direct ROP mitigation, enforcing that any address returned to must have been pushed by a CALL.
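For intuition, here is the property the shadow stack enforces, written out as a purely illustrative software sketch (the real thing is maintained by CALL/RET against a protected shadow-stack region, with no instrumentation in your code):

    /* Sketch of the shadow-stack idea in software (illustrative only).
       A frame records its return address on entry and verifies it hasn't
       been tampered with before returning -- the same check CET's shadow
       stack performs in hardware on RET.  Uses GCC/Clang builtins. */
    #include <stdio.h>
    #include <stdlib.h>

    #define SHADOW_DEPTH 1024
    static void *shadow_stack[SHADOW_DEPTH];
    static int shadow_top;

    #define SHADOW_PUSH() \
        (shadow_stack[shadow_top++] = __builtin_return_address(0))

    #define SHADOW_POP_CHECK()                                                \
        do {                                                                  \
            if (shadow_stack[--shadow_top] != __builtin_return_address(0)) {  \
                fprintf(stderr, "return address tampered with\n");            \
                abort();                                                      \
            }                                                                 \
        } while (0)

    static long parse_input(const char *s)
    {
        long v = 0;

        SHADOW_PUSH();                  /* remember where we must return to */
        while (*s)
            v = v * 10 + (*s++ - '0');  /* imagine a bug smashing the stack here */
        SHADOW_POP_CHECK();             /* a ROP'd return address would mismatch */
        return v;
    }

    int main(void)
    {
        printf("%ld\n", parse_input("1234"));
        return 0;
    }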
Protecting JIT-compiled code, e.g. within browsers, is at least as important a concern. Some would argue much more important, especially as usage of services like Cloudflare Workers increases.
Techniques like CHERI could replace MMUs in some environments, potentially improving performance.
There's no way to express an out of bounds memory read/write in JavaScript. So you don't actually need an MMU, or CHERI, or pointer signing, or any of the stack smashing mitigations put in place for C.
You do have to generate correct code, but that's actually a solvable problem; WebAssembly did it (a toy illustration of the idea is below).
In theory this could unlock tons of performance (and free up some silicon) if we could ever get the rest of the system away from C.
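A toy version of the "generate correct code" part (a hypothetical mini-runtime, not any real engine's API): Wasm-style linear memory is isolated by a bounds check, or an equivalent guard-page trick, that the code generator emits around every access, so the generated code simply has no way to address anything outside its sandbox:

    /* Sketch: how a Wasm-style runtime keeps generated code inside its
       linear memory without relying on the MMU for isolation. */
    #include <stdint.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    struct linear_memory {
        uint8_t *base;
        size_t   size;   /* multiple of the 64 KiB Wasm page size */
    };

    static uint32_t checked_load32(const struct linear_memory *m, uint32_t addr)
    {
        uint32_t v;

        /* The compiler emits this check (or arranges guard pages with the
           same effect) before every load/store it generates. */
        if ((uint64_t)addr + sizeof(v) > m->size) {
            fprintf(stderr, "trap: out-of-bounds memory access\n");
            exit(1);
        }
        memcpy(&v, m->base + addr, sizeof(v));
        return v;
    }

    int main(void)
    {
        struct linear_memory mem = { calloc(1, 65536), 65536 };

        printf("%u\n", checked_load32(&mem, 16));      /* fine */
        printf("%u\n", checked_load32(&mem, 70000));   /* traps */
        free(mem.base);
        return 0;
    }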
> As a ROP mitigation measure some recent CPU models have a feature that can require code to branch to a special instruction signifying a valid target; jumps to an address that doesn't contain that instruction result in a CPU fault.
How much of a performance hit does this mitigation cause, 3-5%?
Compiler-based CFI has low single digits impact. Hardware with ENDBR will be hard to notice, "none" being a reasonably accurate estimate of the impact, unless your code is weird and unusually dense with valid indirect branch targets.
In fast paths almost all the code you're running is in icache, and even the decoded micro-ops are probably cached, so I bet it's even cheaper than that.
The article mentions both 64-bit ARM and x86. Intel's ENDBR64 (part of CET) and ARMv8.5-A's BTI work largely the same way. The instructions are NOPs on older processors, or when the feature isn't enabled by the OS.
BTW. A similar extension is also about to be approved for RISC-V: "Zicfilp". It also repurposes an instruction that was previously a NOP.
So, ROP itself is a (frequently very effective) workaround for W^X memory protection features, where executable code itself should be unmodifiable at runtime. I wonder if there's a "next" CFI attack that calls existing functions that are actually defined as such, but in a weird order, or with weird arguments.
The ability to do that effectively would depend on lots of stuff, including the calling convention (as the stack or heap are easier to corrupt than registers).
Maybe someone is already writing an advanced ROP gadget-search that looks for "real functions that can be combined to make ROP gadgets by calling them in a bizarre way". (This is already much less likely to occur in a real program ... I think ...! But maybe the return-to-libc phenomenon is still a rich source of vulnerability?)
Interestingly, RISC-V's Zicfilp proposal includes an optional, 20-bit label operand so that call sites and targets can be paired more strictly.
I'm not sure how or if it addresses trampolines and similar thunks that interpose callee and caller, but which don't necessarily touch arguments. Such thunks are quite common in dynamically linked code as ELF and Mach-O do lazy loading by default--dynamic function symbols are initially small thunks that load the dependency and then forward the call, restoring registers and the stack without having to know the call signature.