One (non-generalizable) solution is to avoid using writable function pointers, to help maintain control flow integrity.
For example, in libopus we have platform-specific SIMD optimizations that are chosen at runtime.
The common solution is just to have a function pointer that points to the accelerated routine. But there are only a limited number of choices for each routine. So we put pointers to them in a static const array, which gets stored on a read-only page. Then instead of storing a pointer, we just store an index into this array. This index is chosen based on the available instruction sets detected at runtime. This lets us use the same index for every accelerated function.
Then, we pad the array size out to a power of two. To call an accelerated function, we mask off the upper bits of the index. Even if the index gets corrupted, we still call one of the functions in the array. So there's only so much damage it can do.
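A minimal sketch of that pattern (hypothetical names, not the actual libopus code):

#include <stddef.h>

/* Hypothetical implementations of the same routine; in real code each lives
   in its own compilation unit built with different instruction-set flags. */
static void scale_c(float *x, size_t n)    { for (size_t i = 0; i < n; i++) x[i] *= 2.0f; }
static void scale_simd(float *x, size_t n) { for (size_t i = 0; i < n; i++) x[i] *= 2.0f; /* pretend this is SIMD */ }

typedef void (*scale_fn)(float *, size_t);

/* Table padded to a power of two; static const puts it on a read-only page. */
#define IMPL_MASK 3 /* 4 entries */
static const scale_fn scale_impl[IMPL_MASK + 1] = {
    scale_c,    /* index 0: portable fallback       */
    scale_simd, /* index 1: chosen if SIMD detected */
    scale_c,    /* padding entries fall back to C   */
    scale_c,
};

/* 'arch' is the single integer chosen at startup from CPU feature detection. */
static inline void scale(int arch, float *x, size_t n)
{
    /* Even if 'arch' is corrupted, the mask keeps the call inside the table. */
    scale_impl[arch & IMPL_MASK](x, n);
}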
Obviously this doesn't apply if the set of functions you need to call is open-ended (e.g., a callback, like nginx was using). But it seems like a good pattern to follow when it applies.
Is there a reason you did it that way other than to avoid writable function pointers?
Also, that just sounds like how switch/case is sometimes implemented. Have you considered just using a switch/case statement instead of manually managing function pointers and the like?
It also means that you don't need to pass around (a pointer to) a big table of function pointers throughout all of your functions that need to use accelerated routines. Instead you just pass a single integer index. But that is pretty minor.
It can be pretty similar to a switch in practice, but I think there are a few differences. One is function argument handling. Each version of an accelerated routine has to be an independent function, compiled in a separate compilation unit, because you have to limit the instruction sets available. So you will duplicate all of the function call setup overhead in each branch of the switch. This probably gets optimized away by a decent compiler. Even if it does, you still wind up essentially duplicating the function table at every call site. That does not seem likely to get optimized away. You could dispatch to a common thunk which selects the accelerated routine to call, but now you have two function calls instead of one. I am also not sure that switch statements handle the default case as cheaply as doing a single bitwise AND with a small, compile-time constant.
But if you have a use of function pointers that can be replaced by a simple switch statement, that is probably the better approach.
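For comparison, the common-thunk version mentioned above might look something like this (reusing the hypothetical scale_c/scale_simd from the sketch further up):

#include <stddef.h>

void scale_c(float *x, size_t n);    /* hypothetical, as above */
void scale_simd(float *x, size_t n);

enum arch { ARCH_C = 0, ARCH_SIMD = 1 };

/* Every call site calls the thunk, and the thunk makes a second call
   to the selected routine. */
void scale_thunk(int arch, float *x, size_t n)
{
    switch (arch) {
    case ARCH_SIMD: scale_simd(x, n); break;
    case ARCH_C:
    default:        scale_c(x, n);    break; /* default case absorbs corrupted values */
    }
}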
Even in the dynamic case, you can still use a possibly malloc'd array and verify the index is valid, which is a nice improvement over leaving function pointers lying around in the vicinity of buffers.
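A rough sketch of what that could look like for an open-ended set of callbacks (hypothetical names):

#include <stdlib.h>

typedef void (*callback_fn)(void *arg);

/* Heap-allocated table of callbacks; callers hold small indices instead of
   raw function pointers. */
struct callback_table {
    callback_fn *fns;
    size_t count;
};

/* Returns the new callback's index, or (size_t)-1 on allocation failure. */
size_t callback_register(struct callback_table *t, callback_fn fn)
{
    callback_fn *grown = realloc(t->fns, (t->count + 1) * sizeof *grown);
    if (!grown)
        return (size_t)-1;
    t->fns = grown;
    t->fns[t->count] = fn;
    return t->count++;
}

/* Validate the index before calling through the table. */
int callback_invoke(const struct callback_table *t, size_t idx, void *arg)
{
    if (idx >= t->count)
        return -1; /* corrupted or stale index: refuse to call */
    t->fns[idx](arg);
    return 0;
}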
I don’t think they actually found an exploit for Nginx: they state that “we assume such a memory bug exists”, rather than saying outright what bug they exploited. It almost sounds like they planted a bug and then exploited it in a binary hardened with this non-standard hardening technique, i.e. they were not exploiting a binary that was ever used in a real deployment of Nginx.
Exploit != vulnerability. You develop an exploit to take advantage of a specific vulnerability. An exploit for a hypothetical vulnerability is a valid concept, and their conjecture that such a vulnerability may exist is justified in the paper.
I always thought this was an example of a basic buffer overflow in C - what's different about this example? If somebody asked me to write a buffer overflow example in C, I'd write something that looks exactly like this. This must have been known for years - what am I missing here?
The "classic" buffer overflow involves overwriting the return address by overflowing an array on stack. Low-fat pointers can protect against that problem [1].
You're suggesting we replace C, which isn't necessarily a bad idea, but there are considerable hurdles, as it's a lingua franca. C has an excellent suite of compilers for every platform you can name, excellent development tools, every serious programmer already knows the language, and it easily hooks in to platform-specific functionality.
If we want to replace C, we have a few languages to choose from: BetterC, Zig, Rust.
If BetterC were a strict subset of C, things would be quite different, but it's not, so it has the just another language problem.
If we want to go the subset route, we already have MISRA C, which blocks various categories of bugs and undefined behaviour. Crucially, it's possible to mechanically verify that a program is MISRA-compliant.
No, it's more than that. While that struct might be a generic overrun, the question is how you check bounds on filling buf. The paper is arguing that schemes which constrain buffer size using global, address-based size constraints in hardware or software ("low-fat" pointers) as bounds checks are less safe than claimed.
That will never happen with UNIX based OSes around, so what we really need, as much as I would like to nuke C, is to have developers accept some variant of "Safe C".
Solaris has it thanks to SPARC tagged memory.
Problem is when would we ever get it as mainstream, always enabled, in other platforms.
Rust is getting some traction, and much web infrastructure has moved away from C/C++ to C# or Go or Java. After decades of little progress, we're seeing forward motion.
True, I myself moved definitively into the Java/.NET worlds back in 2006, although C++ is still an option for what is unavoidable to do without it. Maybe in the future their canonical implementations can even be fully bootstrapped or have the C++ layer replaced with something else.
But even if all userspace developers move to safer languages, we still have the UNIX derived kernels, PIC and other microcontrollers, routers and plenty of other embedded stuff to worry about.
Even Microsoft with their Azure Sphere project, which is supposed to be all about secure IoT, still bundles a Linux kernel with it.
There's no way to "fix C" in a way that saves you from the cost of fixing all those broken programs written in C. The best a language-level solution can do for an existing project is make it easier to figure out what specifically needs fixing. Or blow up your code and run time with checks, I guess, but if that was technically feasible, to say nothing of political challenges, those projects wouldn't have been in C to start.
I feel like Elrond at the council of Rivendell: "you have only one choice." Except throwing a ring into a hole is a comparatively well-defined task relative to rewriting an unknown number of crappy pieces of software. Ed: also there's no conveniently situated group of people to whom to assign the project.
> Or blow up your code and run time with checks, I guess, but if that was technically feasible, to say nothing of political challenges, those projects wouldn't have been in C to start.
Most of the infrastructural projects you're referring to predated modern languages. There is no reason, for example, for an MTA to be written in C today. We use MTAs written in C because MTAs are old.
They had Pascal and friends, didn't they? Which are safer, safe-ish, right? But I guess today you could eat the cost of automatically inserted runtime checks on software that was designed to push the limits of 30-year-old hardware. Still won't fly for Linux or embedded systems...
Schemes for memory safety of C code have already been applied to UNIXy OS's. SVA-OS with SAFEcode used FreeBSD.
As far as embedded goes, there were some done in Ada and a Java subset. There are even real-time GCs for that kind of stuff. One or two companies sell Java processors that run it natively. CompSci keeps producing CPU extensions that could work with MCUs.
The versions of Pascal that are practical aren't any safer than C.
With a properly designed programming environment, you don't need many bounds checks to begin with. Almost every embedded system can easily afford the cost of the few you can't eliminate.
While they aren't Rust level safe, they eliminate like 80% of the common C safety errors.
Thanks to explicit conversions, proper enums, reference parameters, actual string and vector types, memory allocation based on types instead of arbitrary numbers, and type-safe linkage thanks to modules.
So while use after free or double frees would still be an issue, many other classes of typical C errors would be gone.
Conversions and such aren't where the memory-safety-related vulnerabilities in C++ come from nowadays. They come from memory management problems, which Pascal doesn't help with.
He probably meant to type C/C++ or something but typo'd it. The set of vulnerabilities is exactly the same. If Pascal still lets you do heap corruption, it's only a matter of time before RCEs become common. So yeah, I'm convinced on that point.
Pure userspace software is the easiest to fix; there has already been a huge shift to {safer,easier} languages for most new(-ish) software. C/C++ is mostly used for existing software where other options were not {possible,practical}. Porting to another language will probably happen now that we have options like Rust, but it will take a massive amount of work.
However, one place where we need more (and better) alternatives to C is the library ABI. If you want to make a library that can be used as widely as possible, you either use C or something that can generate libraries with a standard C-style ABI. That's what the FFI tools of most other languages expect. (The C++ ABI is also used, but name-mangling complicates interfacing with other languages.) Rust's #[no_mangle] helps, but what we really need is a standardized ABI with safer calling conventions and memory handling semantics that is still very generic, without any language-specific baggage or other complications. Until that exists, key parts of the infrastructure will stay with C out of necessity.
> kernels
That's one of the infrastructure pieces I was referring to. Other parts that depend critically on generic C-like ABI interoperability might include driver-based stuff like libGL.so.
> microcontrollers
While I needed the low-level properties of C (w/ assembly) when I was implementing {802.3,IP,TCP,HTTP,SNMP} on a Z80-clone with 16kB of RAM, modern "microcontrollers" often aren't very "micro" and are basically lightweight Linux hosts. They will move to other languages, following after regular userspace software.
However, this is assuming there is even a need for safer languages on the micro, which is true when it's accepting data from a hostile source like a network. A much better idea is to simply not connect your embedded device to a network. A safer language isn't necessary if you aren't controlling anything dangerous, your input is limited to reading button states over GPIO, and any bugs can be fixed with a reset from a hardware watchdog timer.
> other embedded stuff to worry about.
There is definitely a lot of embedded crap that needs to be junked and replaced asap, but that's not really a C problem. Buffer overrun safety is great, but it doesn't do much when the device has a backdoor password clearly visible in the firmware, ships with the admin password "admin", or grants remote admin access when the password is "" (empty string).
We do need to fix C, but there are, unfortunately, far worse problems with embedded devices.
Regarding ABIs, ChromeOS, Android, Windows (COM/UWP/.NET) and the old guard mainframes are living proof that one can have a library ABI completely unrelated to C.
On the microcontroller side, there are a couple of vendors that I usually advocate, which do sell Basic, Pascal, Oberon, and Java implementations. It is surprising and refreshing as well that they still manage to stay in business, given how hard it is to move most embedded devs away from the C89 + extensions/Assembly duo.
Talks about how to advocate C++ to such devs are now a recurring theme at C++ conferences.
Even Microsoft, which uses a mix of C# and Rust for IoT Edge, went with Linux and C on Azure Sphere, which is supposed to be their mark on secure IoT, while I was expecting that at the very least they would use the opportunity to push their Checked C project, given that Azure Sphere is being pushed by the same MSR group behind Singularity and friends.
You're referring to canaries, which are special values used to detect if overflows occur.
Canaries are usually checked on heap deallocation (resp. stack canaries are checked on stack frame deallocation, i.e. function return). Checking them at any other time incurs high overhead. They also don't catch all overflows - an overflow with an attacker-controlled index can easily skip the canary and overwrite whatever lies past it.
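For example, a single out-of-bounds write with a bad index can land past the canary entirely (sketch only, not any real allocator's layout):

#include <stddef.h>
#include <stdint.h>

struct chunk {
    char buf[64];
    uint64_t canary;       /* verified when the chunk is freed */
    void (*handler)(void); /* the field the attacker actually wants */
};

void set_byte(struct chunk *c, size_t i, char v)
{
    /* With attacker-controlled i (e.g. i == 72 on a typical 64-bit layout),
       the write lands on 'handler' without touching a single canary byte,
       so the check at free() never fires. */
    c->buf[i] = v; /* missing bounds check */
}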
The problem of memory bounds checking is solvable if you decide to forego memory, processing time, and/or compatibility with existing code. Trying to achieve memory safety while maintaining those three properties is very hard.
Most higher-level languages simply perform bounds checks rather than relying on canaries for safety. Although they can be combined, if your canaries are hit it means your HLL isn’t doing a good enough job of enforcing bounds.
They underflow lowcase_header (or rather, assume the existence of an underflow bug) to overwrite log_handler, a function pointer that is called when an error occurs.
I do find the authors' presentation a little disingenuous, because the purported underflow bug does not actually exist - the authors assume it does, then proceed as if they have full control over the function pointer (including the ASLR leak necessary to obtain proper gadget addresses).
I mean, sure, fat pointers are the "right" solution, but the point of "low-fat" pointers in the original paper [1] is for greater compatibility with existing C code, including at the ABI level. Adding fat pointers to C (which is what your proposal is) is incompatible with those goals. Page 2 of the paper discusses this.
Yes, and that's a laudable goal, but it's not the same problem as the one the "low-fat" pointers proposal is trying to solve. The "low-fat" pointers proposal attempts to improve the safety of existing code.
Is your complaint that because we don't have a perfect automatic solution we shouldn't bother offering any solutions, not even incremental opt-in ones?
I never said fat pointers shouldn't be added to C. The point is that they're irrelevant to this paper, which demonstrates a weakness in a mitigation designed to improve safety of existing code.
I still think that the approach of the Deputy project from UC Berkeley (https://barnowl.org/research/pubs/07-hotos-linux.pdf) is the most practical for adding bounds checks to existing C code. Basically, annotate functions and data structures with the usually-available bounds information, e.g.:
f(int *a, int n) -> f(int *count(n) a, int n);
or
struct { int len; int *values; }
->
struct { int len; int *count(len) values; }
And other annotations for unions, etc.
This actually works quite well in practice - the cited paper involved us applying these annotations to a complete, bootable Linux kernel. You do have to be willing to tolerate limited code changes and a few "trust me" casts.
Disclaimer: I worked on the overall project, though not significantly on the Deputy part.
int s[3];
int *p;
f(s, sizeof(s)/sizeof(s[0]));
f(p, n);
f(int *count(n) a, int n);
(which has a mistake, n should be size_t not int), or:
int s[3];
int* p;
int[] a;
f(p[0..n]); // convert pointer to phat pointer
f(a); // or use existing phat pointer
f(s); // static arrays just work
f(int[] a); // the declaration of f()
The point was not to design the "nicest" way of adding safe arrays to C, but a way that captures existing uses and preserves binary compatibility, including preserving memory layouts. For instance:
uint16_t count;
uint16_t *COUNT(count) buffer;
uint16_t *BND(buffer, buffer + count) pos; // pos is within the specified bounds
which captures that buffer is a known-size array, and that pos ranges within buffer's contents. Of course, writing from scratch, one might prefer
uint16_t[] buffer;
uint16_t pos; // An offset within buffer
but that's not much help when incrementally modifying an existing code base, or interacting with some external library one either doesn't have source for or doesn't wish to modify.
> that's not much help when incrementally modifying an existing code base
I am doing exactly that now that I have converted the Digital Mars C++ and the DMD D programming language compilers to D - converting the pointer-based array code to []-based arrays. Doing it incrementally is very doable and easy.
It's not viral like const is. A slice can be trivially created from a pointer and a length, e.g. p[0 .. n], without touching the surrounding code.
I agree with the framing here, and I agree a small bit of syntax would go a long way.
FWIW, you will see a class called StringPiece copied all over a lot of open source Google projects, including Chrome, Ninja, RE2, probably Android, etc.
That is essentially a pointer-length pair. I agree it would be nice to have a dedicated syntax for it, and not a class that everybody has to copy around.
C++17 names this string_view (long overdue!)
I just Googled and apparently there was a problem with array_view? That's a shame.
The compiler does this with other “library” functions like malloc and printf. There’s no reason a compiler couldn’t convert to a built-in for this new type.
How does this fix it? It automatically includes a size_t with all array types but does it also rely on the programmer setting the size_t value correctly? If so there will be the same problems. It seems like this is exactly the same as just passing a size argument along with all array-type arguments, which is already common practice.
> does it also rely on the programmer setting the size_t value correctly?
For static arrays, no, the compiler does it. For turning pointers into phat pointers, yes, the user has to supply the length. But he only has to do that once.
> exactly the same
It isn't, for the reasons:
1. The compiler knows about it, so it can be used for optimization purposes, and to optionally insert bounds checking or not.
2. Consider why C has structs. It's because it's a lot less error prone to group related values together, rather than having a collection of independent variables. It makes the code easier to read, too.
I remember when C added function prototypes. It was an unmitigated and welcome improvement. If I was to add two features with the most impact, simple to implement, and no backwards compatibility issues, it would be:
1. phat pointers for arrays
2. nested functions
(Nested functions have a surprising ability to untangle ugly control flow, including getting rid of most uses of goto's.)
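For example, the usual "goto fail" cleanup pattern can become something like this (a sketch using the existing GNU C nested-function extension; the function itself is hypothetical):

#include <stdio.h>
#include <stdlib.h>

int load_file(const char *path, size_t size)
{
    FILE *f = NULL;
    char *buf = NULL;

    int fail(void) /* nested function: sees the enclosing f and buf */
    {
        free(buf);
        if (f) fclose(f);
        return -1;
    }

    if ((f = fopen(path, "rb")) == NULL) return fail();
    if ((buf = malloc(size)) == NULL)    return fail();
    if (fread(buf, 1, size, f) != size)  return fail();

    /* ... use buf ... */

    free(buf);
    fclose(f);
    return 0;
}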
They do not want to break single-pass compilers. Nested functions with no forward declarations will require a multi-pass compiler. I think your D compiler is multi-pass already.
> In this paper, we have illustrated a new type of attack that can bypass bounds checking spatial memory safety techniques that protect allocations.
No they have not. Bounds checking only the outer struct doesn't do bounds checks on inner buffers.
They haven't analyzed traditional bounds checks in C - ASan, fortify, C11 Annex K, Intel MPX - only a broken application, which does none of these and uses an incorrect bounds check.
tl;dr: the authors overflow from one struct field into another struct field, thus avoiding any allocation protection scheme that protects individual heap allocations from each other. Most of the paper is just spent implementing basic ROP-style gadget hopping.
I’m not actually clear on what’s new or novel here. Allocation protection mechanisms exist to prevent one allocated block from overflowing into another allocated block. They usually don’t protect allocated blocks from themselves.
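In other words, the pattern is roughly this (a sketch, not the actual nginx structures):

#include <string.h>

/* The whole struct is one heap allocation, so schemes that only protect
   allocations from each other (low-fat pointers, inter-chunk canaries, ...)
   see nothing wrong with a write that stays inside it. */
struct request {
    char header[32];
    void (*log_handler)(struct request *); /* later field in the same object */
};

void copy_header(struct request *r, const char *src, size_t len)
{
    /* With an attacker-controlled len of up to 40, this overruns 'header'
       into 'log_handler' while remaining inside the struct's allocation. */
    memcpy(r->header, src, len);
}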
We've got to get more code out of C and into something with subscript checking. None of the kludges for fixing C work.