You're right. Just compiled the following program with clang from Xcode 14 (Apple clang version 14.0.0 (clang-1400.0.29.102)) without issue.
    #include <stdio.h>
    #include <stdint.h>   /* needed for uintptr_t */

    struct foo {
        uintptr_t p __attribute__((xnu_usage_semantics("pointer")));
    };

    int main(void) {
        struct foo f = { 100 };
        printf("%lu\n", (unsigned long)f.p);
        return 0;
    }
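For what it's worth, a more direct way to check is clang's __has_attribute, rather than relying on unknown attributes only producing a warning - a quick sketch using just that standard feature-test idiom (nothing Apple-specific):

    #include <stdio.h>

    /* Compilers without the feature-test macro just report "not recognized". */
    #ifndef __has_attribute
    #define __has_attribute(x) 0
    #endif

    int main(void) {
    #if __has_attribute(xnu_usage_semantics)
        puts("xnu_usage_semantics is recognized by this compiler");
    #else
        puts("xnu_usage_semantics is not recognized; clang just warns and ignores it");
    #endif
        return 0;
    }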
It doesn't seem that -Wxnu-typed-allocators (which I forgot to mention above) is present in my version. I didn't test the others.
I also just realized that the latest tagged release of https://github.com/apple-oss-distributions/clang is 800.0.38, my clang is reporting 1400.0.29.102. Is Apple no longer releasing the source for their compilers?
Apple never did release everything their compilers do; that's why cppreference has a separate column for Apple clang, and why watchOS can use bitcode as a binary format even though official LLVM bitcode isn't stable.
I’m a bit curious how this confers a security advantage. Isn’t the kernel clearing free pages before handing them out? Or does it not bother when it’s a kernel allocation?
If the latter, wouldn’t that be an obvious risk mitigation without even needing to segment by type (ie only hand out zeroed pages for allocations)?
If they’re being zeroed out, then I’m not sure I understand how grouping by type improves UAF security since the attacker couldn’t control the contents.
I’m sure I’m just ignorant here since there’s so much research into this type of hardening. Genuinely curious.
It’s not pages, it’s individual allocations. When an allocation is freed, it goes back onto a free list, to be popped off the next time an allocation of the appropriate size comes along. Some implementations add a stochastic element to randomize which freelist entry gets returned to the next alloc. A type-segregated heap mitigates many classes of type-confusion exploitation by preventing objects from being confused in use-after-free scenarios. It’s also incredibly expensive to zero out freed allocations every time.
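To make the freelist part concrete, here’s a minimal userspace sketch - nothing like XNU’s actual zone allocator, just the shape of the idea - where each type gets its own freelist, so a freed slot can only ever come back as the same type:

    #include <stdlib.h>

    /* Freed slots are threaded into a singly-linked list through their own storage. */
    struct freelist { struct freelist *next; };

    struct type_zone {
        struct freelist *free;   /* per-type freelist head */
        size_t           size;   /* element size for this type (assumed >= sizeof(struct freelist)) */
    };

    static void *zone_alloc(struct type_zone *z) {
        if (z->free) {                    /* pop a previously freed slot of this type */
            void *p = z->free;
            z->free = z->free->next;
            return p;
        }
        return malloc(z->size);           /* otherwise grab fresh memory */
    }

    static void zone_free(struct type_zone *z, void *p) {
        struct freelist *node = p;        /* push onto this type's freelist only */
        node->next = z->free;
        z->free = node;
    }

With one such zone per type, a freed timespec can only ever be reused as another timespec, which is exactly the property the type-confusion example further down relies on being absent.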
The big issue you're trying to address with type-based segregation isn't so much the leak of uninitialized data, but rather an attacker managing to get the executing code to hold more than one reference to a single piece of memory, where each reference thinks that memory is a different type.
They give an example in the "Type isolation" section with two different types: one has a controllable value, and the other has a pointer. In this scenario the attacker has some way to read the time field. What the attacker does is get you to heap-allocate a timespec struct, free it but keep the pointer (a classic dangling reference), and then get your code to allocate an iovec. Assuming the attacker has done things "right", the iovec they allocate will land at the address of the timespec that was freed.
Now, it's reasonable to believe that the attacker has some way to read the time described by a timespec, so that is what the attacker now does, except the dangling timespec pointer is now pointing at what is actually an iovec, so the tv_sec field is no longer a real time value but is actually a pointer. When the attacker reads that time out, they can convert it back into an address in memory, which gives them the ability to bypass address space randomization.
Now, if the attacker has complete control of the tv_sec they can also read arbitrary memory by setting the time on the ostensibly dangling pointer, and then getting the kernel to read values from the iovec.
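Here's a rough userspace sketch of that overlap (it deliberately invokes undefined behaviour, and whether the second allocation actually lands on the freed slot depends entirely on the allocator, so treat it as an illustration of the layout aliasing rather than anything reliable):

    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>       /* struct timespec */
    #include <sys/uio.h>    /* struct iovec */

    int main(void) {
        /* "Victim" object: allocated, then freed, but the pointer is kept around. */
        struct timespec *ts = malloc(sizeof *ts);
        free(ts);                               /* ts is now dangling */

        /* Same-size allocation of a different type; a non-segregated allocator
           may well hand back the slot that ts still points at. */
        struct iovec *iov = malloc(sizeof *iov);
        static char buf[16];
        iov->iov_base = buf;                    /* in the kernel this would be an interesting pointer */
        iov->iov_len  = sizeof buf;

        if ((void *)iov == (void *)ts) {
            /* Both structs start with an 8-byte field on LP64, so the "seconds"
               read through the stale timespec pointer is really iov_base:
               an address leak that undermines ASLR. */
            printf("leaked address: 0x%lx\n", (unsigned long)ts->tv_sec);
        } else {
            puts("allocator didn't reuse the slot on this run - no overlap to show");
        }
        return 0;
    }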
It should be fairly easy to see how separating where different types of object are stored breaks this path - the most the attacker can do with a dangling timespec is point it at a different timespec, ditto for iovec.
Obviously this is not some magical Security Is Fixed panacea - the underlying use after free bug is the issue - but it does make actually exploiting that UaF harder.
The post references a few other implementations of type segregating allocators, which also have blog posts that may provide some different details (although obviously the core type segregation portions also have a lot of overlap).
I haven’t read the whole thing yet, but just zeroing allocations (see the comment below on allocations versus pages) is not a full fix, because a UAF can come through a dangling pointer. What you need to mitigate that is to prevent allocations from being reused. That’s infeasible to do perfectly, because it just means you leak everything, but in isolated cases you can do things like prevent different types from being given the same allocation (and thus block shenanigans when code does a type confusion), or do other kinds of segregation and randomization to make it difficult to predict when an allocation will come back.
I pondered that myself and then realized we’d intersected on some iMessage security stuff the other day, so it’s reasonable for them to ask “is this a marketing shill?”. Which, to be clear, I’m not; I think this stuff is interesting, just as I think the Google security posts are.
(I just googled and it does look like it’s not as obvious anymore, apparently DJ Olliej is much more popular :D)
It just would seem like a more interesting question if this post was a link promoting your private thoughts rather than a generic link to the public blog of one of the largest companies on earth. But maybe the question wasn’t really related to the post in particular.
Like I said, we were talking about iMessage security in the comments yesterday (or maybe this morning?), so presumably when the next place they saw my nick was an Apple blog submission, they became suspicious that I was a shill. Given that specific context I don’t think it’s wholly unreasonable to question things. The phrasing of the question is obviously unpleasant to me, as it does come off as accusatory (due to the conspiratorial implication you get from the “no weaseling” text). But again, if someone were a shill you probably would want a question like that. That said, I’d expect a shill to just not acknowledge the question - which is, afaict, being fairly heavily downvoted, and I don’t think that’s reasonable either (because everyone loves fake internet points :) ), as burying it would only benefit a shill/marketing person.
Anyway to be super clear again: anything I say is my personal thoughts and opinions and in no way reflects what any of my employers, past or present may be thinking or doing.
I guess I could put that in my HN bio? I hadn’t previously because I do try to separate my identity from my job, as when I first started out in tech I did not do that, and it was unhealthy.
I actually expected this to have already been posted and couldn’t find it - so the submission went through.
Anyway, I haven’t ever gone out of my way to hide my employers, and I’m a tech worker in the Bay Area, so I’ve worked at multiple companies, including Google and Apple.
That said if you’d rather I delete this and wait for someone else to post it I can do that?
Obviously anything I say is my personal view and not reflective of my employers, past or present, and I’m only ever going to submit things that /I/ think would be of interest to HN.
[edited to make the sentence conform to silly societal rules like “must follow basic rules of English grammar” and “must not have absurd amounts of ambiguity”]
I get the innuendo, but it’s a reasonable thing to think about. The problem, of course, is that plenty of small companies, startups, and I guess blogs exist where someone might be proud of their work and want to share it on HN, which I don’t think should be outright banned.
But then you also have the periodic content-free, clearly-marketing posts that end up on the HN front page, which I always find deeply suspicious, so..?
It's reasonable and if you think something is wrong you mail hn@ycombinator.com. But there's no hunting woozles on the forum itself otherwise it would be an infinite woozle hunt, as you know.
That’s not what this is about.
This is largely about mitigating the damage that can be done once someone finds a memory safety error.
Things like pointer authentication, memory tagging, CHERI, aslr, stack cookies, etc, etc are all mitigations to limit what can be done once someone finds a memory error, and all of these things are relatively expensive costs incurred as a result of the lack of safety of the code being written. I am somewhat curious as to whether anyone has done a like-for-like benchmark of “real” code with and without these mitigations (and I mean the entire system, libraries and kernel included, because otherwise the “without mitigations” build still picks up those perf hits).
We can then compare that cost to the performance gain you get from the pointer nonsense, etc. that people do for performance, versus comparable Rust/Go/Swift code that is memory safe.
Obviously the mitigations can’t be removed (yet?), but I would like to be able to point to some comparison when people start talking about how much slower safe languages are.
> Things like pointer authentication, memory tagging, CHERI, aslr, stack cookies, etc, etc are all mitigations to limit what can be done once someone finds a memory error, and all of these things are relatively expensive costs incurred as a result of the lack of safety of the code being written
Presumably CHERI/MTE/etc. hardware support would help (at the expense of die area and 128-bit pointers); it would be nice to see something like this in Apple Silicon.
CHERI, MTE, etc. still have costs even with hardware support.
MTE consumes huge amounts of memory (if you have an N-bit tag for every M bytes, you burn (total RAM / M) * N bits of tag storage). Imagine you have, say, a 4-bit tag for every 16 bytes - that works out to about 3% of memory being used. But there are additional costs: to read from a pointer you now have the load cost of the pointer itself, but also the tag. I assume CPU hardware folk know how this kind of stuff could be made faster than the obvious thing I would try, but no matter what happens this ends up using more memory and more processing time - a cost that would be avoided if everything was in a safe language.
CHERI has the same set of problems, but I never looked into it too much as the costs seemed like they’d be huge, and conceptually it’s very “obvious” (it took me a bit of time to understand that MTE is intended to be probabilistic, and how you would then make use of it - CHERI is much easier to comprehend). But the same thing happens: you need more memory, and it requires more CPU time. Extra fun is that with CHERI the hardware burns resources on bounds checks for pointer accesses, even though in safe languages those checks have already happened.
Pointer authentication has CPU-time costs at least - I don’t think there is meaningful memory overhead, but clearly it has enough cost that simply “authenticate every pointer” isn’t something that can be done, otherwise someone would have written a post about doing so.
But this also misses the point: I did not say “software cost”, I said “cost” without qualifiers. The reason is that it doesn’t matter whether any particular mitigation is implemented in software or in hardware, the only reason the mitigation is present is to make code written in unsafe languages less trivially exploitable.
If for the sake of argument someone wrote everything from the kernel to all of userspace solely in memory safe languages, then none of these mitigations would be necessary. So you’d get back the ram, the cpu time, the die space, etc that is all burned solely for the benefit of unsafe languages.
Solaris SPARC ADI is doing just fine; it was actually one of the first UNIX systems with proper memory tagging, and this was done under Oracle's stewardship.
I am not saying these don't work, and I don't know why people seem hell bent on interpreting everything I'm saying as "X does not work" when that was not anywhere in anything I said.
I said, very clearly, that these mitigations are necessary because of code written in unsafe languages, and that these mitigations are not free.
Some basic googling says that on an M7 there are 4-bit tags with a 64-byte granularity. So before any other costs you are burning 0.8% of your memory on these tags. The other costs are along the lines of increased chip complexity (caches, logic, lookahead/OoO, etc).
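For anyone who wants to sanity-check the arithmetic, here's the back-of-the-envelope version for both granularities mentioned in this thread (these are just the assumed example figures from the comments above, not vendor-published specs):

    #include <stdio.h>

    /* N tag bits per M-byte granule costs N / (M * 8) of total RAM, whatever the RAM size. */
    static double tag_overhead_pct(int tag_bits, int granule_bytes) {
        return 100.0 * tag_bits / (granule_bytes * 8);
    }

    int main(void) {
        printf("4-bit tag per 16 bytes: %.3f%% of RAM\n", tag_overhead_pct(4, 16));  /* ~3.1% */
        printf("4-bit tag per 64 bytes: %.3f%% of RAM\n", tag_overhead_pct(4, 64));  /* ~0.8% */
        return 0;
    }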
It may be that we're OK with these costs, but that doesn't change the fact that the costs exist, and that they exist largely to mitigate exploits in unsafe code.
You can start with the DoD security assessment of Multics, where they explicitly refer to PL/I as the reason why Multics isn't susceptible to typical exploits found on UNIX systems.
Then there's C.A.R. Hoare's Turing Award speech, about their customers clearly not wanting bounds checking disabled.
Unisys still sells Burroughs (naturally somewhat modernized) as ClearPath MCP. The sales pitch: a mainframe OS for business whose top priority is security.
We know how it goes; it just happened that while most computers weren't a target, no one cared about security beyond the occasional virus picked up from pirated software.
> That’s not what this is about. This is largely about mitigating the damage that can be done once someone finds a memory safety error.
Hmm, seems like Apple should do better when there are perfectly good languages available which could prevent memory safety issues altogether. Seems like Apple would want to compete with Linux rather than watch it race away into the memory safe future, leaving XNU behind with legacy memory safety problems.
Ok, so rather than writing a comment where you shit on the people who work at Apple and actually do care about security, I suggest you actually read the article, because it lists those safe languages and explains why their existence alone does not solve security.
The engineering world has long since moved on from CMU/OSFMK. Rewrite the internals outward using formally-verified, capabilities-secure (seL4 with MAC tagging added) and zero-copy message passing RPC in thread-safe Rust. Take IOKit and make it entirely user-space. And make almost everything run in user-space. The need to worry about placing and cleaning up individual allocations goes by the wayside when there are fewer "cooks" throwing spinning fragile plates around and trusting that their aim is perfect.
Apple could rewrite it with the same syscall ABI but make it immensely more secure, compartmentalized, and efficient. They won't because of the performance pressures and rewards their employees are under. This is how big companies fail repeatedly and are bested by startups.
The kernel is massive, and rewriting the entire thing takes a serious investment of time (and therefore capital)
Zircon is the closest attempt I’ve seen and it has taken years to be shipped on a single, battery-less device. It might be close to ready, but it was likely expensive for Google, which is now scrambling to rein in R&D spending.
Meanwhile, the incremental security improvements that Apple and Google have made have driven up the cost of zero days sharply on their devices
Fuchsia/Zircon is a fractional ray of hope but it's not formally-verified and it still uses methodologies from the 1990's.
You misunderstand the approach to migrating: not a rip-and-replace rewrite of the world all at once, but refactoring internal areas section by section. There's no law that says syscalls have to change.
There are few incentives in corporate land to rewrite rather than keep duct-taping together the past without addressing the antique, brittle software engineering underneath.
seL4, Amoeba, Plan 9, MINIX 3, and DragonFly show what's possible. Neptune OS is interesting. Capabilities-secure, DAC, microkernels are the future. Ideally, kernel modules shouldn't exist and everything apart from memory mapping, security, and IPC should be a "userland-ish" process. Any complicated operations involving multiple components should be software transactional across multiple "processes".
Monolithic kernels with file servers and every conceivable library built-in, written in C, with enormous wads of firmware blobs and millions of lines of fragile drivers without any internal protections are the past. Patching together piles of mud is still patching mud.
Right. Don’t let the perfect be the enemy of the good. If anything, Apple has demonstrated on several occasions that they are able to commit to long-term changes and the sustained effort they may require. It depends on their analysis, but if they think they should do it, they probably already have a team working on it somewhere and a plan to bring it progressively over ~5 years, with a series of incremental steps.
The strategy that was communicated at WWDC was that every time a kext model has a userspace alternative, there will be a one-year transition, and then the following OS version won't support the former kext variant.
The long term roadmap is to keep moving the infrastructure to userspace this way, until no one else besides Apple themselves can touch kernel memory space.
True, the point is that they are taking multiple paths to make macOS/iOS more secure and (we already discussed this to death) more micro-kernel-like, even if not pure.
A kernel with Apple-only code is more stable and secure than letting everyone into the party.
Yep. It does too much and it's still stuck in the 1980s. If you want security: a formally-verified microkernel that doesn't do much besides memory mapping, IPC, and context switching. Everything else needs to run either in a VM like BEAM or as formally-verifiable, statically-compiled, safe "userland"-ish binaries.
Going beyond PnP and ACPI, per-OS hardware drivers could be made unnecessary if vendors provided "ACPI"-like APIs that exposed standard, introspectable functions and data structures to the OS in a unified way, independent of bus and architecture - e.g., a GPU, a block device, a serial port, a sensor. A universal "IOKit" would require less integration and development effort for everyone to support, and every OS would get support for free.
Also, as others have pointed out, Apple is migrating all kernel extensions into userspace; their long-term roadmap is to transition to a micro-kernel-like experience, even if not a pure one.
I can see references to these builtins in the latest xnu source[1], but not in upstream llvm[2], or Apple's forks of llvm[3] and clang[4].
It's been possible to build your own XNU[5] for a long time. Is that still possible now?