L4 microkernels: The lessons from 20 years of research and deployment (data61.csiro.au)
218 points by snvzz on Sept 20, 2016 | 95 comments



The paper doesn't mention QNX at all. QNX is similar to early L4, but they've taken slightly different paths.

L4 got rid of "long message passing", in favor of shared memory and interrupt-like IPC. This is great for the kernel - no copying delays and no buffering problems. But it means that communicating processes have to share some memory pages and cooperate properly. If the shared memory page is something like a chain of linked buffers, one side may be able to screw up the other side. If that's how you talk to the file system, it may be possible to attack the file system process that way. This won't crash the kernel.
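To make that concrete, here's a minimal C sketch (a hypothetical layout, not any particular L4 system's protocol) of a consumer popping from a ring buffer in a shared page. Without the index checks, a misbehaving producer on the other side could drive the consumer's reads outside the shared region:

    /* Both sides map this structure from a shared page. The consumer must
     * treat producer-written fields as untrusted input. */
    #include <stdint.h>
    #include <string.h>

    #define RING_SLOTS 64
    #define SLOT_SIZE  256

    struct ring {
        volatile uint32_t head;     /* advanced by the producer */
        volatile uint32_t tail;     /* advanced by the consumer */
        uint8_t slots[RING_SLOTS][SLOT_SIZE];
    };

    /* Copy out the next slot; return -1 if empty or if the other side is
     * misbehaving. */
    int ring_pop(struct ring *r, uint8_t out[SLOT_SIZE])
    {
        uint32_t head = r->head;    /* snapshot: may change under us */
        uint32_t tail = r->tail;
        if (head == tail)
            return -1;              /* empty */
        if (head >= RING_SLOTS || tail >= RING_SLOTS)
            return -1;              /* corrupt indices: refuse, don't crash */
        memcpy(out, (const uint8_t *)r->slots[tail], SLOT_SIZE);
        r->tail = (tail + 1) % RING_SLOTS;
        return 0;
    }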

Because L4's API is so limited, it's used mostly as a VM, with the usual bloated Linux kernel on top. QNX has only a few more kernel features than modern L4, but it can implement the POSIX API with just support in the C library. When you call "write()" from C/C++ in QNX, the C library makes a MsgSend request which is directed to a file system service, device service, or network service. The code for this is tiny. When you call "write()" from C/C++ on L4, you're usually just making a Linux system call to an ordinary big Linux in a VM. Eventually that Linux talks to other processes which do the physical disk I/O. The problem is that the giant Linux kernel is still there, just a bit more isolated from the hardware. The total amount of code that can break has not been reduced significantly.
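For a rough idea of what the QNX path looks like, here's a sketch (not actual QNX libc source; the header struct is a simplified stand-in for the real _IO_WRITE message in <sys/iomsg.h>, and the exact MsgSendv() types should be checked against <sys/neutrino.h>):

    #include <sys/neutrino.h>   /* MsgSendv, iov_t, SETIOV (QNX only) */
    #include <sys/types.h>
    #include <stddef.h>

    /* Simplified stand-in for QNX's real _IO_WRITE message header. */
    struct write_hdr {
        unsigned short type;        /* an _IO_WRITE-like message tag */
        size_t nbytes;              /* payload length */
    };

    /* On QNX a file descriptor is also a connection id (coid) to whichever
     * server owns the file: filesystem, device, or network manager. */
    ssize_t sketch_write(int fd, const void *buf, size_t nbytes)
    {
        struct write_hdr hdr = { 1 /* hypothetical tag */, nbytes };
        iov_t iov[2];
        SETIOV(&iov[0], &hdr, sizeof hdr);      /* header part */
        SETIOV(&iov[1], (void *)buf, nbytes);   /* payload, no extra copy */
        /* Blocks until the server replies with a byte count or an error. */
        return MsgSendv(fd, iov, 2, NULL, 0);
    }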


There are efforts to build OSes directly on L4 kernels as well. Most notably Genode. http://genode.org/

They recently added support for running on seL4 too: https://sel4.systems/pipermail/devel/2016-August/000967.html


L4 can be used that way, but in purpose-built systems it can also just be used as a simple stratum on which to build applications directly. If you were using L4 to build a baseband package, for instance, you probably wouldn't run a full OS on top of it.

My intuition, not carefully checked:

I'd rather be using QNX than Linux on L4, but I'd rather be using just L4 than QNX.


How do these compare to Mach IPC in OS X?


Here's a Linux/L4/QNX comparison.[1]

My favorite table:

    Kernel         Source Lines of Code (SLOC)

     OKL4                 13.8k
     QNX Neutrino v6.3.2  23k
     Linux Kernel 2.6.0   5.2M
     Windows XP           Kernel size unknown but totals at approx 40M
     Windows Vista        Kernel size unknown but totals at approx 50M
Modern microkernels are so small that they can be thoroughly debugged, or formally verified.

[1] http://www.gelato.unsw.edu.au/IA64wiki/JamieLennox/QNXvL4?ac...


On a related note, QNX was open source for a while, before it was bought by BlackBerry. Do you know if anyone managed to snag a copy of it?


It was available for free on a floppy, but never open source.


It was never open source in the sense of the OSI (Open Source Initiative) definition, but the source code was available to the public for a while:

> https://en.wikipedia.org/w/index.php?title=QNX&oldid=7308275...

"In September 2007, QNX Software Systems announced the availability of some of its source code. [http://www.qnx.com/news/pr_2471_1.html]

On April 9, 2010, Research In Motion announced they would acquire QNX Software Systems from Harman International Industries. On the same day, QNX source code access was restricted from the public and hobbyists."


Exactly this. So someone must have grabbed a copy of the source at that time, and I'd love to have a read through it one day.


Yes, and much of the QNX development community abandoned QNX when they did that.


Kind of off-topic, but where can I find the OKL4 source code?

All I have found is a really old version that doesn't seem to have much in common with the current one.


This is from 2015, not sure if it's that old.

https://github.com/tomga/genode/commit/c8a64b5c3465cde1723b2...


I think the latest version is 5 or 6. The one you pointed to seems to be pretty old:

    OKL4 Release 2.1
    ================
    
    Date: 15 April 2008


L4 was created partly in response to Mach's failures in performance and elsewhere. The first generation, by Liedtke, had something like one fifth the overhead of Mach-based solutions hosting Linux. I haven't timed the recent ones, but there are Mach vs. L4 numbers in here:

http://ts.data61.csiro.au/publications/papers/Heiser_10:iids...


Note OS X mostly doesn't use Mach IPC for anything. Calling, e.g., write() results in a system call into the BSD portion of the XNU kernel, not all that different from what happens on Linux.


That is completely untrue. While the BSD syscalls are supported, Mach syscalls and Mach IPC are also heavily used, especially internally at Apple.

Just one simple example: run "sample <process>" (where <process> is any Cocoa application) on OS X. Pretty much every thread will have an event loop powered by CFRunLoop, and they'll all be stuck in mach_msg_trap waiting for an event to occur.

Also, pretty much any sort of semi-complicated IPC on OS X is done over Mach IPC (e.g. passing a large bitmap copy-on-write from one process to another). Take a look at the ipc directory in the Chromium source, for example.


Yes, I'd assume it's more heavily used in the higher-level application layer. I was talking about basic UNIX functions.


Even Unix signals are delivered as Mach exceptions first. If the process doesn't have a Mach exception handler set up, the kernel's exception handler converts them to Unix signals and tries delivery that way: https://github.com/opensource-apple/xnu/blob/27ffc00f33925b5... and https://github.com/opensource-apple/xnu/blob/27ffc00f33925b5...
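For reference, registering a task-level Mach exception port looks roughly like this (these are real Mach calls, but this is a sketch with error handling elided):

    #include <mach/mach.h>

    int install_exception_port(void)
    {
        mach_port_t exc_port;
        /* Create a receive right we can later service with mach_msg(). */
        mach_port_allocate(mach_task_self(), MACH_PORT_RIGHT_RECEIVE,
                           &exc_port);
        mach_port_insert_right(mach_task_self(), exc_port, exc_port,
                               MACH_MSG_TYPE_MAKE_SEND);
        /* Ask the kernel to deliver bad-access faults here first; if no
         * handler answers, the kernel converts them to SIGSEGV/SIGBUS. */
        return task_set_exception_ports(mach_task_self(),
                                        EXC_MASK_BAD_ACCESS, exc_port,
                                        EXCEPTION_DEFAULT,
                                        THREAD_STATE_NONE);
    }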


OS X doesn't use Mach IPC to emulate POSIX, but it uses (or did last I checked, which was a while ago) Mach IPC directly for things like launchd.


I would really love to see more commentary from high-level systems people on how suitable seL4 is as the basis for a general-purpose OS. It seems like its security guarantees should be driving a lot more outside investment than it has received.


I have really big expectations for seL4. The modularity of a microkernel combined with the security of formal methods offers hope of solving some really fundamental problems with the "IoT scene": having good security out of the box (which is horrible in IoT as of now), while still maintaining good flexibility regarding platforms (which will become more and more of a problem in the future as IoT devices actually proliferate).

Edit: oh, and you can still run seL4 as a hypervisor and run Linux on top of it to get the usual embedded Linux stack where your CPU has enough juice, and still keep the critical systems safe behind seL4's capability system. Best of both worlds? Dunno, hopefully seL4 will get there someday in production quality.


L4 is great, but it doesn't solve application security problems. L4 by itself doesn't do much, and whatever you build on top of it --- if you're not yourself using formal methods, which very few real product teams do --- isn't going to inherit its security.


True, seL4's verification doesn't solve the application-level security problem, but it makes it more approachable in several ways.

- The verification of an application no longer needs to make assumptions about the semantics of the operating system; instead, concrete and verified semantics are available. This makes application specification and verification easier and safer. Fortunately, verification is cumulative.

- The seL4 project has driven the state of the art in verification tools forward by a considerable degree.

- The seL4 project has led the thinking about proof engineering [1], the emerging field that is to verification what software engineering is to programming, addressing the question of how to develop, maintain, and evolve large-scale proofs.

- The existence of something like seL4 also puts pressure on CPU manufacturers to provide usable formal specifications to their customers, so that we can verify against rigorous CPU specs. CPU manufacturers have been loath to do this (for various reasons).

The full verification of seL4 came a lot earlier (by about a decade) than I thought possible.

[1] G. Klein, Proof Engineering Considered Essential.


What's your sense of the number of IoT vulnerabilities that are due to misconstrued OS semantics? Mine --- and I've done a bit of work here, but not that much --- is that there aren't that many.

Really I think it depends on what you're doing. If the work you're doing is fundamentally kernel work --- if it's all about interacting with the security boundaries of the hardware (not just "interacting with hardware", like getting the bits from an RF codec, but manipulating hardware shared by attackers, like the iPhone SEP) then L4 is a pretty huge security win.

Otherwise: the problem you have isn't your OS, but the programming language you're using to build with.


IoT isn't really the target for these things; language- or spec-level security is more important, and I agree with you there. Isolation mechanisms are still important for many embedded devices, though, if security is a concern. Here are a few I've seen in various products or academic programs:

1. Device driver isolation. Wacky drivers can take down the system; better to take down and restart just the driver. QNX was the first I know of that did this, with excellent reliability benefits. Many RTOSes do it now. MINIX 3 takes it to desktops and servers.

2. Monitors, as in the Copilot system or in recovery-oriented computing, which expect input to crash or subvert the main process. An outside process is necessary to detect anomalous behavior and recover from it. Periodic restarts are another trick used in that field, clearing out both subtle errors that build up over time and persistent malware. Best to keep these in different address spaces.

3. A dedicated process, as in Nizza and Turaya, for containing application secrets, where external processes can call in to have random numbers generated, signatures performed, etc., but can't actually access the internals.

4. The same thing for logging, where the interface between the main app and the logging component is write-only (a minimal sketch follows this list). This prevents accidental or malicious elimination of the audit trail. I did that myself many times; Shapiro et al. did it for repo security.

5. Finally, separation kernels like INTEGRITY-178B can ensure predictability of certain operations and statically enforce a scheduling policy. That's important in real-time systems where a number of tasks are running and one can screw with the others. So they use things like ARINC schedulers with fixed-size partitions operating in fixed intervals. Watchdog timers are also popular here. It's also a strategy for eliminating covert timing channels at the partition level, for the select few systems that need that.
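Here's the sketch promised in point 4: a plain-POSIX version of the write-only logging interface, assuming a pipe as the IPC channel. The app holds only the write end; even fully compromised, it can append to the audit trail but never read, truncate, or delete it:

    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    int main(void)
    {
        int fds[2];
        if (pipe(fds) != 0) return 1;

        if (fork() == 0) {                  /* logger process */
            close(fds[1]);                  /* keep only the read end */
            int log = open("audit.log",
                           O_WRONLY | O_CREAT | O_APPEND, 0600);
            char buf[512];
            ssize_t n;
            while ((n = read(fds[0], buf, sizeof buf)) > 0)
                write(log, buf, (size_t)n); /* append-only sink */
            return 0;
        }

        close(fds[0]);                      /* app keeps only the write end */
        dprintf(fds[1], "user logged in\n");
        close(fds[1]);
        return 0;
    }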


The problem is it takes more than just implementing the "kernel" (as in low-level access aka HAL) work under L4. You need to apply that compartmentalization all the way through the stack, and even subdivide applications into smaller chunks of responsibility.

What you typically get is a neat subdivision of all the HAL bits, but everyone stops once they're plugging in applications. This is partly because there's normally a large amount of shared/legacy code which needs importing, and partly because people don't recognize the benefits of partitioning an app.

It's not all bad - at least you can be reasonably confident that one compromised app, or part of the HAL, can't be trivially used to compromise the rest of the system. But, as you say elsewhere here, if there's just one thing you're doing, then all that effort didn't really improve things.


I wonder if anyone will manage to create a tool for Rust that can map out code partitions, or near partitions, based on access patterns of object ownership. At some tipping point a little bit of data sharing could be refactored into message passing and finish the job right.


If done well, formal verification of kernel level services and how these use runtime protection built in hardware can absolutely reduce the attack surface of application level code. The key is to move critical services that attackers would wish to exploit into a formally verified sandbox. That is the real power of systems like seL4, even though seL4 isn't itself really geared toward the sort of chipsets commonly used in consumer grade IoT products. However, a system _like_ seL4 can absolutely improve application level security, as long as the application is constrained by the architecture of the OS.


Can you describe this with an example of a realistic use of L4-enabled sandboxing for a single-function IoT device?

(I've done some L4 work so you don't need to spend a lot of time explaining.)

General purpose OSs like iOS? No question: L4 is a major win. But that's not what the discussion here is really about.


Case in point: cryptography, authentication, and authorization are each very sensitive operations which are prone to both gadget attacks and sidechannel attacks. Sandboxing these operations by creating well-defined boundaries between application code and these algorithms and protocols can ensure that application level bugs cannot creep into these critical services.

Consider that it is possible to separate TCP/IP or wireless protocol stacks from authentication code so that, for instance, a packet fragmentation bug can't be exploited to influence authentication-level decisions. This is a classical defense-in-depth strategy, but enforced through both runtime mechanisms and formal methods.
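As a sketch of that boundary (names and message format hypothetical): the network-facing process can hand the authentication process only a fixed-size request over, say, a socketpair, and read back a one-byte verdict. A packet-parsing bug can mangle the request fields, but it can never reach the auth process's memory:

    #include <stdint.h>
    #include <string.h>
    #include <unistd.h>

    /* The entire attack surface the network process has on the auth side. */
    struct auth_request {
        uint8_t user[32];
        uint8_t token[32];
    };

    /* Runs in the network process; auth_fd is its end of the socketpair. */
    int check_token(int auth_fd, const char *user, const uint8_t token[32])
    {
        struct auth_request req;
        memset(&req, 0, sizeof req);
        strncpy((char *)req.user, user, sizeof req.user - 1);
        memcpy(req.token, token, 32);
        if (write(auth_fd, &req, sizeof req) != sizeof req)
            return -1;
        uint8_t verdict;                    /* auth process replies 0 or 1 */
        if (read(auth_fd, &verdict, 1) != 1)
            return -1;
        return verdict;
    }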


You wouldn't; seL4 isn't designed for embedded systems. That's what NICTA's eChronos [1] is for : )

1: http://ts.data61.csiro.au/projects/TS/echronos/


seL4 is designed for embedded systems. That's why it mainly targeted ARM; Gernot Heiser even said so. I believe eChronos is just targeted at embedded devices with more constrained hardware.


> I believe eChronos is just targeted for embedded devices with more constrained hardware.

I guess it all depends on which meaning of IoT and embedded you are using.


No MMU is usually the differentiating constraint between the two RTOSes. I don't use "IoT" because it's a BS buzzword.


I'm happy to believe that almost all IoT vulnerabilities are application level problems.

The emerging availability of verified kernels and compilers makes it much more worthwhile to invest in formal approaches to application-level vulnerabilities.


See my comment; it lists the top approaches to that, which have already delivered implementations.


'Not that many' is not really acceptable for many applications, and I'm not talking only about IoT, but self-driving cars, surgery robots, rockets, etc.


> The full verification of seL4 came a lot earlier (by about a decade) than I thought possible.

Is it a full verification yet? I am by no means an expert, but IIRC they were still using a simplified MMU model in their proofs.


It's a simplified model, but it's well validated. Without connecting the proofs to a formally verified chip, it's about the best you can do.


But it does give you a proven isolation boundary. We've been arguing for software sandboxes for ages, but their impact has been lessened because our OSes fall pretty readily to skilled adversaries. As a computing industry we've put our trust in VMs as another isolation boundary, but the regular advisories from Xen show that this isn't as solid a boundary as we would like.


First, it depends on how you use it. L4 is pretty close to a "libOS". You can build sandboxes on it, or you can build something more like a monolithic OS on it.

Second, in IoT devices, sandboxing is a lot less interesting, because there aren't that many use cases for sandboxed sensor inputs (you're not RFing or button-pushing whole PDF documents).

I like L4! A lot! But I would be very wary of an IoT device claiming to have inherited security from it.


Are seL4 capabilities transferable?


Yes. See the manual or the tutorials.
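For the curious, transferring a cap over an endpoint looks roughly like this. This is a sketch from memory of the seL4 API; exact names vary between API versions, so defer to the manual:

    #include <sel4/sel4.h>

    /* Sender: attach one cap to the message and call through the endpoint. */
    void send_cap(seL4_CPtr ep, seL4_CPtr cap)
    {
        seL4_SetCap(0, cap);                 /* cap slot 0 of the message */
        seL4_MessageInfo_t info =
            seL4_MessageInfo_new(0 /* label */, 0, 1 /* extraCaps */, 0);
        seL4_Call(ep, info);
    }

    /* Receiver: name a free CNode slot to receive into, then wait. */
    void recv_cap(seL4_CPtr ep, seL4_CPtr cnode, seL4_Word slot,
                  seL4_Word depth)
    {
        seL4_SetCapReceivePath(cnode, slot, depth);
        seL4_Word badge;
        seL4_Recv(ep, &badge);  /* on return, the cap sits in (cnode, slot) */
    }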


Please correct my misunderstanding here: doesn't that imply that flaws in the capability holder can obviate the "isolation boundary" guarantees? (For example, if app X has the capability to address a certain memory block and then hands that off to another application.)


That's not what isolation means here: if there is a connection between applications that allows cap transfer, they aren't isolated at the cap level. See www.nicta.com.au/pub?doc=7371 for more details, page 35 ("Authority Confinement and Integrity"). If there is any shared memory, or potential for shared memory via the vspace caps, there can still be information flows (see page 38, "Confidentiality: Information-flow security").

The theorems are somewhat technical, but your intuition is correct.


Thanks.

Skimming that, I gather that (a) PHigh -> PLow flows via overt channels will be degraded, and (b) seL4 reduces the attack surface to "covert channels" only, which seem to be confined to H/W "platform" concerns.


There are more papers coming out soon about eliminating covert channels at the design/kernel level!


This is more true than many here may realize. If you're not familiar with how microkernels work, remember that everything is moved from kernel to user space if it's at all feasible to do so. In L4 and related systems, that usually means that the kernel doesn't even include the memory manager (the "pager"). So even page faults are handled by application code, which IIRC is not part of the current verification.


Delivery of page faults is; handling of them isn't (although the kernel does abstract the exact mechanism into message delivery).
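A user-level pager then looks schematically like this. It's a sketch: the constant and call names are approximate and vary across seL4 versions, and alloc_frame() is a hypothetical stand-in for whatever frame allocator the system provides:

    #include <sel4/sel4.h>

    extern seL4_CPtr alloc_frame(void);   /* hypothetical frame allocator */

    void pager_loop(seL4_CPtr fault_ep, seL4_CPtr vspace)
    {
        for (;;) {
            seL4_Word badge;
            seL4_Recv(fault_ep, &badge);   /* block until a fault arrives */
            /* The faulting address arrives in a message register. */
            seL4_Word addr = seL4_GetMR(seL4_VMFault_Addr);

            /* Back the page and map it at the faulting address (ARM names;
             * x86 has equivalents). */
            seL4_ARM_Page_Map(alloc_frame(), vspace, addr & ~0xfffUL,
                              seL4_AllRights,
                              seL4_ARM_Default_VMAttributes);

            /* An empty reply resumes the faulting thread. */
            seL4_Reply(seL4_MessageInfo_new(0, 0, 0, 0));
        }
    }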


But capabilities do solve real application security problems, and this capability system is proven correct. You can get the same thing on Linux [1], but there is no guarantee that it can't be circumvented or broken.

[1]: http://www.cl.cam.ac.uk/research/security/capsicum/
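Concretely, the Capsicum model (as implemented in FreeBSD, where it originated; the Linux work behind the link is a port) looks like this. After cap_enter(), the process can only exercise rights it already holds on existing descriptors:

    #include <sys/capsicum.h>
    #include <fcntl.h>
    #include <unistd.h>

    int main(void)
    {
        int fd = open("input.dat", O_RDONLY);

        cap_rights_t rights;
        cap_rights_init(&rights, CAP_READ, CAP_SEEK);
        cap_rights_limit(fd, &rights);   /* fd is now read/seek only */

        cap_enter();   /* capability mode: no new access to global
                        * namespaces; open() etc. now fail with ECAPMODE */

        char buf[64];
        read(fd, buf, sizeof buf);       /* still allowed */
        close(fd);
        return 0;
    }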


For me, seL4's verification is important because it can actually provide formal real-time guarantees. Before seL4, if you wanted to write a hard real-time system, you pretty much had to either forego an OS or forego formal verification (or, usually, both). Having the option to write an RTOS without also writing a bunch of boilerplate (and probably incorrect) scheduler/resource-allocation code, or dedicating a core to a task because you couldn't reason about resource consumption otherwise, is pretty exciting! Sure, it may not help much with securing normal userland applications, but in many embedded systems meeting deadlines is more important than any security functionality could be; the confinement properties are in some sense just consequences of having to reliably hit deadlines and isolate resources.


Is Mirage on L4 something that, if realized, would provide application developers with a more secure basis for building their own applications? I'm but a simple application developer, but I do care about security. If there were a platform I could develop against that gave me confidence my code was far less likely to be undermined by kernel or TCP-stack vulnerabilities, I think I'd be encouraged to do a better job of security myself. This might not be the ideal human attitude toward secure programming, but I might not be alone in feeling that my best efforts rest on shaky foundations, and that's somewhat demoralizing.


Do you mean "seL4 is great"? I agree it won't do much for application-level security without adding some formally verified code on top (perhaps as simple as setting up isolation between VMs), but it looks great if you do want to use formal methods. For the simplest thing, just starting out with a formal semantics of the OS and reason to trust that semantics would save a lot of work (of course, a lot may remain).


Honestly, I don't much care about the formal verification of L4; it's the L4 design and implementation strategy that I find compelling.


Sure, it's not a panacea. But at least it gives a much better foundation than what we usually have now.


Comments in this thread also illustrate why it's hard (and frustrating) to do constructive work in security. It's a big, complicated problem, and nobody's happy unless you solve it all at once.

You could do the equivalent of solving world hunger and world peace, but unless you also give everyone in the world a free puppy, you're going to get bad reviews complaining about the lack of puppies. And there's always someone who wanted a kitten instead...


I hope I'm not in this instance coming across that way! I like L4!


How many of the exploits for IoT devices are related to the kernel? I have the impression that it's mostly poor protocols with default passwords and zero consideration for security that are the problem.


I won't pretend to having any kind of statistics on the matter.

But the whole point is that in embedded systems there is usually no separation between "application" and "kernel", at least at the low end of the CPU-power scale. Being able to isolate application-level problems from the kernel would already be a huge boon.

As a highly publicized anecdote, the Jeep hack by Miller and Valasek was done by attacking through the wireless interface and replacing the CAN driver code to suit their needs. That's not possible with proper isolation between critical system drivers and the application layer.


What does separation between kernel and application really matter for single-function devices, like most IoT things are?


If they're written in languages like C and C++, an attacker gets to own the complete device just by exploiting application errors.

It can also happen when using unsafe code with the Ada, Java, Pascal and Basic variants available for such devices, but the probability is lower.


Right, but if the device exists more or less to run a single program or small set of programs, what do I care if I own the box once I've taken control of that program?


The Jeep hack (IIRC) got code exec in userland remotely, but relied on talking to drivers to be able to flash the CAN chip.

Of course, the whole thing was broken anyhow, as everything was running as root.

But in almost everything else, I agree. I don't care if you have ring-0 on my Nest camera, because I'm more worried about network-level attacks or an attacker being able to read from the camera which (I'm guessing) is available via user space.


What if you can't take control of the program, as with Ironsides? It seems your premise is that it's too complex to verify the application layer. However, you don't need to model every behavior in a program, and this can be leveraged to reduce proof overhead. A recent paper [1] produced <100-line security proofs for an SSH server, an HTTP server, and the SOP of a web browser.

1: https://pdfs.semanticscholar.org/bc9c/491e215d4abad1be7de944...


True, but with everything in the same memory space, unless a memory-safe language is being used, the number of possible exploits is much higher, leading to a higher probability of owning the device.

The isolation helps reduce the attack surface.


You could run other programs on the box.


Without some kind of formally assured way of building your application, you can do that anyway.


The issue you raise is highly relevant: it's like having an IoT power plug which is, indeed, UL-rated not to burst into flames and explode. That's great, but it doesn't mean that the software running on it won't send every packet on your home LAN to a router in Russia.


Only because that's still the low hanging fruit. Why would most hackers bother with kernel exploits, when hundreds of thousands of routers use their manufacturer's default password?


This is almost tautological. Application vulnerabilities will always be the low-hanging fruit, because that's where the attack surface is. 20 years ago, an OS had juicy attack surface, because the bugs hadn't been shaken out of TCP/IP stacks and OS's shipped with stupid services enabled that nobody used. But it hasn't been 1998 for quite some time now.


Where does one go to learn these formal methods?


Depends on what level of detail you're interested in. You could check out Ada/SPARK or Dafny. Or investigate something like Isabelle/HOL or Coq, perhaps via http://concrete-semantics.org/ or http://adam.chlipala.net/cpdt/.


I'm not very knowledgeable in this area and haven't used these tools myself, but you could start by looking into theorem-proving tools like Coq and Agda.


At DEF CON, someone gave a talk investigating the wireless security of some drones. Some of them had a Telnet port open for anyone to log in. The problem here isn't a lack of formal verification; it's a lack of people caring.

There is only one thing that will fix the problem, and that is when corporations get hit in the wallet for having security flaws, see for example: https://medium.com/@xParXnoiAx/irresponsible-disclosure-52d0...


Yeah: the feeds are sent unencrypted, command centers keep getting infected because they run Windows instead of a least-privilege architecture, the software is probably written in unsafe languages without the parser/protocol toolkits that reduce 0-days, and I'm sure there's more I'll find out soon. And yeah, the companies' financial incentive is to ignore the stuff, since they'll get the contracts anyway. They're just paying for capabilities rather than capabilities with an expected quality level.

I'd understand if the autopilot's AI or whatever wasn't perfect due to the complexity of the job, or if the graphics stack occasionally had artifacts. These systems not having basic security measures that budget startups pull off indicates it's not that such a baseline was too difficult: they just don't give a shit.


If you read about the OPM hack, you get an idea of the absolute ineptness and decision paralysis at work in these large organisations, and you begin to understand why you can connect to the drone with Telnet and why the command centre relies on 13-year-old SharePoint and ActiveX.

OPM https://news.ycombinator.com/item?id=12457786


To the point that sometimes, it seems they go out of their way to make things insecure...


A number of comments here cover how one would use something like this to benefit the security of real-world systems. After all, it's a building block you have to combine with other things. This was demonstrated by the team behind Nizza in the paper below, with examples including digital signatures, VPNs, and so on.

https://os.inf.tu-dresden.de/papers_ps/nizza.pdf

The first to get certified to high assurance under recent models and delivered in products was INTEGRITY-178B. Like Nizza, they implemented desktops that virtualized the machine, with Windows/Linux partitions side by side with native apps running directly on the separation kernel. Runtimes for Ada and Java subsets let you write critical components without the common errors from C. Special middleware applies security policies to interactions between components.

http://www.ghs.com/products/safety_critical/integrity-do-178...

A recent product that's more accessible is the Genode OS architecture, which builds a hierarchical desktop system on top of components like seL4, NOVA, and the Nitpicker GUI. It's dual-licensed, with open source available. Work in progress.

https://genode.org/

It should be noted that seL4 itself is aimed at embedded. Many of the top teams are also focusing on language- and spec-level models for verification of holistic properties. The isolation approach isn't enough for the level of correctness they're aiming for. Anyone interested in such work should check out Galois's project or CertiKOS.

http://smaccmpilot.org/

http://flint.cs.yale.edu/certikos/mcertikos.html


If the goal is to provide a verifiably correct kernel, why not build that kernel in something like OCaml, so you can leverage a better type system and use the existing verification infrastructure in that language?


They did. There's an Isabelle spec, a Haskell implementation, and a C implementation which I believe is mechanically generated from the Haskell implementation.


The C code is manually written.


I think you would also have to verify the resulting binary, compiler, libraries...

It seems more manageable to verify a few KB of assembly or C.


It would seem so, but it's actually the opposite with existing tooling, thanks to COGENT. Amateurs did a filesystem with a fraction of the work it took the pros to do the kernel:

https://ts.data61.csiro.au/projects/TS/cogent.pml

Note: See "Cogent: Verifying high-assurance file system..."

They leverage the same tools used for seL4 verification. Also worth noting that Myreen et al.'s toolkit basically converts HOL specifications to machine code without the need for an external compiler. The "C" that was compiled was an embedding of it in HOL called Simpl, which the aforementioned process verifies and converts to verified code. This is called translation validation. That's my non-specialist understanding of what the papers said. COGENT builds on this process to convert a functional language and easier specs into that form, which gets translation-validated into machine code. With less effort. :)

Note: Myreen et al. are doing both verification of HOL itself and HOL-to-hardware translation next. These further reduce the TCB of the provers and the hardware, respectively, to almost nothing but the specs.


Not really: to verify C code you need to set the compiler and its corresponding version in stone for the verification process, as UB can change even between versions of the same compiler.


Not really true. See https://www.nicta.com.au/publications/research-publications/... for how the compiler and its internal semantics are completely removed from the chain for l4v.


Thanks for the link, I will have a look into it.


Exactly, and then one has to deal with the runtime and GC.


seL4 was specified in Haskell and then converted to C.


The "real" specification (if it can be said to exist) is the abstract spec, in Isabelle/HOL.


Where can I get the source for these to look at them? I'd like to know why they are all so long.


Here's the famous verified one: https://github.com/seL4/seL4


This is a republish of an old paper that was a pure, uncritical review of the authors' triumphs.

Look for the part where they identify their responsibility for the fact that, after 20 years and thousands of engineer-hours at their disposal, they still don't have a microkernel-based operating system worth a pinch of anything. I'm not trolling; I'd love it if that statement were false.

"For a successful technology, reality must take precedence over public relations, for Nature cannot be fooled." --Richard Feynman


In 2011, supposedly L4 passed a billion installs, including being the basis for the iPad 2, the Motorola Evoke, and some Qualcomm phones: https://stackoverflow.com/questions/8405505/is-there-any-app...

I don't have personal knowledge of these environments, so I wouldn't know if I were wrong about this.


In my opinion, the lesson is that microkernels are too small (e.g. lacking per-user/group security schemes), and monolithic kernels are too big.


Microkernels are designed to let users implement those things on top of them.



