
I have really big expectations for seL4. The modularity of a microkernel with the security of formal methods is something that would give hope of solving some really fundamental problems with the "IoT scene": having good security out of the box (which is horrible in IoT as of now), while still maintaining good flexibility regarding platforms (which will become more and more of a problem in the future when IoT devices start actually proliferating).

Edit: oh, and you can still run seL4 as a hypervisor and run Linux on top of it to get the usual embedded Linux stack where your CPU has enough juice, and still keep the critical systems safe behind seL4's capability system. Best of both worlds? Dunno, hopefully seL4 will be there someday in production quality.




L4 is great, but it doesn't solve application security problems. L4 by itself doesn't do much, and whatever you build on top of it --- if you're not yourself using formal methods, which very few real product teams do --- isn't going to inherit its security.


True, seL4 verification doesn't solve the application level security problem but it makes it more approachable in several ways.

- Verification of an application no longer needs to make assumptions about the semantics of the operating system. Instead, concrete and verified semantics are available. This makes application specification and verification easier, and safer. Fortunately, verification is cumulative.

- The seL4 project has driven the state of the art in verification tools forward by a considerable degree.

- The seL4 project has led the thinking about proof engineering [1], the emerging field that is to verification what software engineering is to programming, addressing the question of how to develop, maintain and evolve large-scale proofs.

- The existence of something like seL4 also puts pressure on CPU manufacturers to provide usable formal specifications to their customers, so that we can verify against rigorous CPU specs. CPU manufacturers have been loath to do this (for various reasons).

The full verification of seL4 came a lot earlier (by about a decade) than I thought possible.

[1] G. Klein, Proof Engineering Considered Essential.


What's your sense of the number of IoT vulnerabilities that are due to misconstrued OS semantics? Mine --- and I've done a bit of work here, but not that much --- is that there aren't that many.

Really I think it depends on what you're doing. If the work you're doing is fundamentally kernel work --- if it's all about interacting with the security boundaries of the hardware (not just "interacting with hardware", like getting the bits from an RF codec, but manipulating hardware shared by attackers, like the iPhone SEP) then L4 is a pretty huge security win.

Otherwise: the problem you have isn't your OS, but the programming language you're using to build with.


IoT isn't really the target for these things. Language or spec-level security is more important; I agree with you there. Isolation mechanisms are still important for many embedded devices, though, if security is a concern. Here are a few I've seen in various products or academic programs:

1. Device driver isolation. Wacky drivers can take down the system. Better to take down and restart the driver. QNX was the first I know of that did this, with excellent reliability benefits. Many RTOSes do it now. MINIX 3 takes it to desktops and servers.

2. Monitors, as in the Copilot system or in recovery-oriented computing, which expect input to crash or subvert the main process. An outside process is then necessary for detecting anomalous behavior and recovering. Periodic restarts are another trick used in that field to clear out subtle errors that build up over time, or persistent malware. Best to be in different address spaces.

3. A dedicated process, as in Nizza and Turaya, for containing application secrets, where external processes can call in to have random numbers generated, signatures performed, etc., but can't actually access the internals.

4. The same thing for logging purposes, where the interface between the main app and the logging component is write-only. Prevents accidental or malicious elimination of the audit trail. I did that myself many times. Shapiro et al. did it for repo security. (There's a rough sketch of the idea after this list.)

5. Finally, separation kernels like INTEGRITY-178B can ensure predictability of certain operations and statically enforce a scheduling policy. That's important in real-time systems where a number of tasks are running and one can screw with another. So they use things like ARINC schedulers with fixed-size partitions operating in fixed intervals. Watchdog timers are also popular here. It's also a strategy for eliminating covert timing channels at the partition level, for the select few that need that.
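
To make No. 4 concrete, here's a minimal POSIX sketch of my own (plain C, made-up setup, not how Nizza/Turaya or Shapiro's work actually implement it): a separate logger process owns the log file, and the main app only holds the write end of a pipe, so it can append entries but never read, rewrite, or truncate the audit trail.

    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>
    #include <fcntl.h>
    #include <sys/wait.h>

    int main(void) {
        int fds[2];
        if (pipe(fds) < 0) { perror("pipe"); return 1; }

        pid_t pid = fork();
        if (pid == 0) {                      /* logger process */
            close(fds[1]);                   /* drop the write end */
            int log = open("audit.log", O_WRONLY | O_CREAT | O_APPEND, 0600);
            if (log < 0) _exit(1);
            char buf[256];
            ssize_t n;
            while ((n = read(fds[0], buf, sizeof buf)) > 0)
                write(log, buf, (size_t)n);  /* append-only from the app's view */
            close(log);
            _exit(0);
        }

        close(fds[0]);                       /* main app keeps a write-only handle */
        const char *msg = "boot: sensor initialised\n";
        write(fds[1], msg, strlen(msg));
        close(fds[1]);
        waitpid(pid, NULL, 0);
        return 0;
    }

On something like seL4 the pipe becomes an IPC endpoint and the file descriptor becomes a capability, but the shape of the boundary is the same.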


The problem is it takes more than just implementing the "kernel" (as in low-level access aka HAL) work under L4. You need to apply that compartmentalization all the way through the stack, and even subdivide applications into smaller chunks of responsibility.

What you typically get is a neat subdivision of all the HAL bits, but everyone stops once they're plugging in applications. This is partly because there's normally a large amount of shared/legacy code which needs importing, and partly because people don't recognize the benefits of partitioning an app.

It's not all bad - at least you can be reasonably confident that one compromised app, or part of the HAL, can't be trivially used to compromise the rest of the system. But, as you say elsewhere here, if there's just one thing you're doing, then all that effort didn't really improve things.


I wonder if anyone will manage to create a tool for Rust that can map out code partitions, or near-partitions, based on access patterns of object ownership. At some tipping point, a little bit of data sharing could be refactored into message passing to finish the job right.


If done well, formal verification of kernel level services and how these use runtime protection built in hardware can absolutely reduce the attack surface of application level code. The key is to move critical services that attackers would wish to exploit into a formally verified sandbox. That is the real power of systems like seL4, even though seL4 isn't itself really geared toward the sort of chipsets commonly used in consumer grade IoT products. However, a system _like_ seL4 can absolutely improve application level security, as long as the application is constrained by the architecture of the OS.


Can you describe this with an example of a realistic use of L4-enabled sandboxing for a single-function IoT device?

(I've done some L4 work so you don't need to spend a lot of time explaining.)

General purpose OSs like iOS? No question: L4 is a major win. But that's not what the discussion here is really about.


Case in point: cryptography, authentication, and authorization are each very sensitive operations which are prone to both gadget attacks and sidechannel attacks. Sandboxing these operations by creating well-defined boundaries between application code and these algorithms and protocols can ensure that application level bugs cannot creep into these critical services.

Consider that it is possible to separate TCP/IP or wireless protocol stacks from authentication code so that, for instance, a packet fragmentation bug can't be exploited to influence authentication-level decisions. This is a classical defense-in-depth strategy, but enforced through both runtime mechanisms and formal methods.
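
As a toy illustration (my own, hypothetical names, with the auth side stubbed in-process; in an seL4-style design auth_check() would be an IPC call into an isolated server that alone holds the secret), the point is that the only things crossing the boundary are a fixed-size token and a one-bit verdict:

    #include <stdbool.h>
    #include <stdint.h>
    #include <stddef.h>
    #include <stdio.h>

    #define TOKEN_LEN 16

    /* The secret lives only on the auth side of the boundary. */
    static const uint8_t expected[TOKEN_LEN] = "0123456789abcde";

    /* Constant-time comparison, so the verdict leaks no timing information. */
    static bool auth_check(const uint8_t token[TOKEN_LEN]) {
        uint8_t diff = 0;
        for (size_t i = 0; i < TOKEN_LEN; i++)
            diff |= token[i] ^ expected[i];
        return diff == 0;
    }

    int main(void) {
        uint8_t candidate[TOKEN_LEN] = "0123456789abcde";
        /* The network-facing code only ever sees this one-bit answer. */
        printf("granted: %d\n", auth_check(candidate));
        return 0;
    }

A fragmentation bug in the stack can then corrupt at most its own copy of the token; it never reaches key material or the decision logic.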


You wouldn't; seL4 isn't designed for embedded systems. That's what NICTA's eChronos [1] is for : )

1: http://ts.data61.csiro.au/projects/TS/echronos/


seL4 is designed for embedded systems. That's why it has mainly targeted ARM. Gernot Heiser even said so. I believe eChronos is just targeted for embedded devices with more constrained hardware.


> I believe eChronos is just targeted for embedded devices with more constrained hardware.

I guess it all depends on which meaning of IoT and embedded you are using.


No MMU is usually the differentiating constraint when there are two RTOSes. I don't use "IoT" because it's a BS buzzword.


I'm happy to believe that almost all IoT vulnerabilities are application level problems.

The emerging availability of verified kernels and compilers makes it much more worthwhile to invest in formal approaches to application-level vulnerabilities.


See my comment: it lists top approaches to that which have already delivered implementations.


'Not that many' is not really acceptable for many applications, and I'm not talking only about IoT, but self-driving cars, surgery robots, rockets, etc.


> The full verification of seL4 came a lot earlier (by about a decade) than I thought possible.

Is it a full verification yet? I am by no means an expert, but IIRC, they were still using a simplified model of the MMU in their proofs.


It's a simplified model, but it's well validated. Without connecting the proofs to a formally verified chip, it's about the best you can do.


But it does give you a proven isolation boundary. We've been arguing for software sandboxes for ages, but their impact has been lessened since our OSs fall pretty readily to skilled adversaries. As a computing industry we've put our trust in VMs as another isolation boundary, but the regular advisories from Xen show that this isn't really as solid a boundary as we would like.


First, it depends on how you use it. L4 is pretty close to a "libOS". You can build sandboxes on it, or you can build something more like a monolithic OS on it.

Second, in IoT devices, sandboxing is a lot less interesting, because there aren't that many use cases for sandboxed sensor inputs (you're not RFing or button-pushing whole PDF documents).

I like L4! A lot! But I would be very wary of an IoT device claiming to have inherited security from it.


Are seL4 capabilities transferable?


Yes. See the manual or the tutorials.
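
Roughly, with the libsel4 API it looks like this (a sketch of mine assuming the non-MCS API; the endpoint cap on the sending side needs the Grant right, and the slot/cnode arguments are whatever your CSpace layout provides):

    #include <sel4/sel4.h>

    void send_cap(seL4_CPtr ep, seL4_CPtr cap_to_send) {
        /* One capability attached to the message, no data words. */
        seL4_MessageInfo_t info = seL4_MessageInfo_new(0, 0, 1, 0);
        seL4_SetCap(0, cap_to_send);
        seL4_Send(ep, info);
    }

    void recv_cap(seL4_CPtr ep, seL4_CPtr recv_cnode, seL4_Word free_slot) {
        /* Tell the kernel where to deposit an incoming cap in our CSpace. */
        seL4_SetCapReceivePath(recv_cnode, free_slot, seL4_WordBits);
        seL4_Word badge;
        seL4_Recv(ep, &badge);
        /* The transferred cap now sits in free_slot of recv_cnode. */
    }

You can also copy or mint caps directly between CNodes with seL4_CNode_Copy / seL4_CNode_Mint if you already have authority over both CSpaces.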


Please correct any misunderstanding on my part here: doesn't that imply that flaws in the capability holder can obviate the "isolation boundary" guarantees? (For example, if app x has the capability to address a certain memory block and then hands that off to another application.)


That's not what isolation means here: if there is a connection between applications that allows cap transfer, they aren't isolated at the cap level. See www.nicta.com.au/pub?doc=7371 for more details, page 35 ("Authority Confinement and Integrity"). If there is any shared memory or potential for shared memory via the vspace caps, there can still be info flows (see page 38, "Confidentiality: Information-flow security").

The theorems are somewhat technical, but your intuition is correct.


Thanks.

Skimming that, I got that (a) PHigh -> PLow via overt channels will be degraded, and that (b) seL4 reduces the attack surface to "covert channels" only which seem to be confined to H/W "platform" concerns.


There are more papers coming out soon about eliminating covert channels at the design/kernel level!


This is more true than many here may realize. If you're not familiar with how microkernels work, remember that everything is moved from kernel to user space if it's at all feasible to do so. In L4 and related systems, that usually means that the kernel doesn't even include the memory manager (the "pager"). So even page faults are handled by application code, which IIRC is not part of the current verification.


Delivery of page faults is; handling of them isn't (although the kernel does abstract the exact mechanism into message delivery).
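
For the curious, a user-level pager loop looks roughly like this (my sketch, non-MCS API; map_frame_at() stands in for the architecture-specific seL4_*_Page_Map call and the bookkeeping around it):

    #include <sel4/sel4.h>

    extern void map_frame_at(seL4_Word vaddr);   /* hypothetical helper */

    void pager_loop(seL4_CPtr fault_ep) {
        for (;;) {
            seL4_Word badge;
            seL4_Recv(fault_ep, &badge);            /* a VM fault arrives as a message */
            seL4_Word addr = seL4_GetMR(seL4_VMFault_Addr);
            map_frame_at(addr);                     /* back the faulting address */
            seL4_Reply(seL4_MessageInfo_new(0, 0, 0, 0));  /* resume the faulting thread */
        }
    }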


But capabilities do solve real application security problems, and this capability system is proven correct. You can get the same thing on Linux [1], but there is no guarantee that it can't be circumvented or broken.

[1]: http://www.cl.cam.ac.uk/research/security/capsicum/
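
For reference, the FreeBSD incarnation of Capsicum is only a few calls; a minimal sketch of my own (open a file, then enter capability mode):

    #include <sys/capsicum.h>
    #include <fcntl.h>
    #include <unistd.h>
    #include <stdio.h>

    int main(void) {
        int fd = open("/etc/hostname", O_RDONLY);
        if (fd < 0) { perror("open"); return 1; }

        cap_rights_t rights;
        cap_rights_init(&rights, CAP_READ);              /* this descriptor: read only */
        if (cap_rights_limit(fd, &rights) < 0) { perror("cap_rights_limit"); return 1; }

        if (cap_enter() < 0) { perror("cap_enter"); return 1; }
        /* In capability mode, open()/connect()/etc. on global namespaces fail;
           the process can only exercise the descriptors it already holds. */
        char buf[128];
        ssize_t n = read(fd, buf, sizeof buf);
        if (n > 0) write(STDOUT_FILENO, buf, (size_t)n);
        return 0;
    }

The difference being pointed at above: on seL4 the analogous property is backed by the proofs, whereas here you're trusting the kernel's implementation.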


For me, seL4's verification is important because it can actually provide formal real-time guarantees. Before seL4, if you wanted to write a hard real-time system, you pretty much had to either forego an OS, or forego formal verification (or, usually, both). Having the option to write an RTOS without also writing a bunch of boilerplate (and probably incorrect) scheduler / resource allocation code, or dedicating a core to a task because you couldn't reason about resource consumption otherwise, is pretty exciting! Sure, it may not help much with securing normal userland applications, but in many embedded systems meeting deadlines is more important than any security functionality could be; the confinement properties are in some sense just consequences of having to reliably hit deadlines and isolate resources.


Is Mirage on L4 something that, if realized, would provide application developers with a more secure basis for building their own applications? I'm but a simple application developer, but I do care about security, and if there were a platform I could develop against that gave me confidence my code was far less likely to be undermined by kernel or TCP stack vulnerabilities, I think I'd be encouraged to do a better job of security myself. This might not be the ideal human attitude toward secure programming, but I might not be alone in feeling like my best efforts rest on shaky foundations, and that's somewhat demoralizing.


Do you mean "seL4 is great"? I agree it won't do much for application-level security without adding some formally verified code on top (perhaps as simple as setting up isolation between VMs), but it looks great if you do want to use formal methods. For the simplest thing, just starting out with a formal semantics of the OS and a reason to trust that semantics would save a lot of work (of course, a lot may remain).


Honestly, I don't much care about the formal verification of L4; it's the L4 design and implementation strategy that I find compelling.


Sure, it's not a panacea. But at least it gives a much better foundation than what we usually have now.


Comments in this thread also illustrate why it's hard (and frustrating) to do constructive work in security. It's a big, complicated problem, and nobody's happy unless you solve it all at once.

You could do the equivalent of solving world hunger and world peace, but unless you also give everyone in the world a free puppy, you're going to get bad reviews complaining about the lack of puppies. And there's always someone who wanted a kitten instead...


I hope I'm not in this instance coming across that way! I like L4!


How many of the exploits for IoT devices are related to the kernel? I have the impression that it's mostly poor protocols with default passwords and zero consideration for security that are the problem.


I won't pretend to have any kind of statistics on the matter.

But the whole point is that usually in embedded systems, there is no separation between "application" and "kernel", at least on the low end of the CPU-power scale. Being able to isolate application-level problems from the kernel would already be a huge boon.

As a highly publicized anecdote, the Jeep hack of Miller and Valasek was done by attacking through the wireless interface and replacing the CAN driver code to suit their needs. That wouldn't be possible with proper isolation between critical system drivers and the application layer.


What does separation between kernel and application really matter for single-function devices, like most IoT things are?


If written in languages like C and C++, you just get to own the complete device when exploiting application errors.

It can also happen when using unsafe code with the Ada, Java, Pascal and Basic variants available for such devices, but the probability is lower.


Right, but if the device exists more or less to run a single program or small set of programs, what do I care if I own the box once I've taken control of that program?


The Jeep hack (IIRC) got code exec in userland remotely, but relied upon talking to drivers to be able to flash the CAN chip.

Of course, the whole thing was broken anyhow, as everything was running as root.

But in almost everything else, I agree. I don't care if you have ring-0 on my Nest camera, because I'm more worried about network-level attacks or an attacker being able to read from the camera which (I'm guessing) is available via user space.


What if you can't take control of the program, as with Ironsides? It seems as if your premise is that it's too complex to verify the application layer. However, you don't need to model every behavior in a program, and this can be leveraged to reduce proof overhead. A recent paper [1] produced <100-line security proofs for an SSH server, an HTTP server, and the SOP of a web browser.

1: https://pdfs.semanticscholar.org/bc9c/491e215d4abad1be7de944...


True, but having everything in the same memory space, unless a memory-safe language is being used, means the number of possible exploits is much higher, thus leading to a higher probability of owning the device.

The isolation helps reduce the attack surface.


You could run other programs on the box.


Without some kind of formally assured way of building your application, you can do that anyways.


The issue you raise is highly relevant — it's like having an IoT powerplug which is, indeed, UL-rated not to burst into flames and explode. That's great, but that doesn't mean that the software running on it won't send every packet on your home LAN to a router in Russia.


Only because that's still the low hanging fruit. Why would most hackers bother with kernel exploits, when hundreds of thousands of routers use their manufacturer's default password?


This is almost tautological. Application vulnerabilities will always be the low-hanging fruit, because that's where the attack surface is. 20 years ago, an OS had juicy attack surface, because the bugs hadn't been shaken out of TCP/IP stacks and OS's shipped with stupid services enabled that nobody used. But it hasn't been 1998 for quite some time now.


Where does one go to learn these formal methods?


Depends on what level of detail you're interested in. You could check out Ada/SPARK or Dafny. Or, investigate something like Isabelle/HOL or Coq - perhaps via http://concrete-semantics.org/ or http://adam.chlipala.net/cpdt/.


I'm not very knowledgeable in this area and haven't used these tools myself, but you could start by looking into theorem-proving tools like Coq and Agda.


At DefCon, someone did a talk investigating the wireless security of some drones. Some of them had a TELNET port open for anyone to log in. The problem here isn't a lack of formal verification, it's a lack of people caring.

There is only one thing that will fix the problem, and that is when corporations get hit in the wallet for having security flaws, see for example: https://medium.com/@xParXnoiAx/irresponsible-disclosure-52d0...


Yeah, the feeds are sent unencrypted, the command centers keep getting infected because they run Windows instead of a least-privilege architecture, the software is probably written in unsafe languages and doesn't use parser/protocol toolkits that reduce 0-days there, and I'm sure there's more I'll find out soon. Yeah, the companies' financial incentive is to ignore the stuff since they'll get the contracts anyway. They're just paying for capabilities rather than capabilities with an expected quality level.

I'd understand if the autopilot's AI or whatever wasn't perfect due to the complexity of the job, or if the graphics stack occasionally had artifacts in it. The systems not having basic security measures that budget startups pull off indicates it's not that such a baseline was too difficult: they just don't give a shit.


If you read about the OPM hack you get an idea of the absolute ineptness and decision paralysis at work in these large organisations, and you begin to understand why you can connect to the drone with telnet and why the command centre relies on 13-year-old SharePoint and ActiveX.

OPM https://news.ycombinator.com/item?id=12457786


To the point that sometimes, it seems they go out of their way to make things insecure...



