Redox OS Crash Challenge (github.com/redox-os)
179 points by dragostis on Jan 21, 2018 | 80 comments



Rust being advertised as a safe language (which is true) got so much into users' heads that they think if they write it in Rust, it's safe and crash-free by default. There are plenty of security/safety bugs that aren't about dereferencing a null or accessing an invalid pointer.

No cynicism intended, just an observation from what I see around.


> There are plenty of security/safety bugs that aren't about dereferencing a null or accessing invalid ptr.

Sure, but there are also plenty of improvements in Rust that aren't about memory safety. For instance, there's the improved type system over C or C++ (tuples, enums, traits, etc.), related features like pattern-matching on enums and trait-based error handling (the question-mark operator), and a macro system that works at the syntax-tree level and not the source level. All of these mean that you're less likely to write code with logic bugs. This isn't as clearly true as the memory-safety claim, and it's harder to argue or to measure empirically, so it isn't made as often. But a lot of stupid bugs in C come from things like forgetting to check error returns, confusing two things that are both represented as char *, accessing a resource without proper serialization (which isn't strictly speaking a memory-safety bug), or having side effects in an expression you pass to a macro. Those bugs are harder to write in Rust because there are more readable ways to write the same constructs that don't let you make the same mistakes. You certainly can write safe Rust code that makes the same mistakes; you're just less likely to, because it's the more cumbersome way to do it.
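The error-handling point can be sketched with a small, hypothetical example: `?` propagates the `Err` case to the caller automatically, so there is no silent equivalent of "forgetting to check the error return".

```rust
use std::num::ParseIntError;

// Hypothetical helper: parse a number and double it. The `?` operator
// returns the Err case to the caller immediately, so there is no way
// to silently continue past a failed parse.
fn double(input: &str) -> Result<i64, ParseIntError> {
    let n: i64 = input.parse()?;
    Ok(n * 2)
}

fn main() {
    assert_eq!(double("21"), Ok(42));
    assert!(double("not a number").is_err());
}
```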


> like forgetting to check error returns

One of the most common bugs in C (in my experience) is forgetting to check the error return on a malloc call. Kind of the worst of both worlds.


Yes! In fact I'd argue that null-pointer dereferences are not a memory-safety bug the way buffer overflows / use-after-frees / etc. are, they're a type-safety bug. It's not that NULL is a currently-invalid pointer, it's that NULL is not a pointer at all - it's a value that's been stuffed into the pointer type because C has no better way to represent this. The actual return type of malloc is the set of all possible pointers along with NULL, a separate thing. Passing it to code that only expects something from the set of all possible pointers should require you to check for NULL and do something else instead.

If you want to optimize memory layout by reserving memory address zero, sure (and Rust's Option<&T> does exactly that), but at the language level, you shouldn't be able to use NULL as a pointer any more than you should be able to use 0.0 or '\0' or false as a pointer.
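A one-line check of that layout claim, assuming a recent Rust toolchain: `Option<&T>` occupies no more space than `&T` itself, because the all-zeros bit pattern (invalid for a reference) is reused for `None`.

```rust
use std::mem::size_of;

fn main() {
    // The niche optimization: None is represented by the forbidden
    // null bit pattern, so no separate discriminant is stored.
    assert_eq!(size_of::<Option<&u8>>(), size_of::<&u8>());
}
```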


Why do you think NULL isn't a pointer? It most definitely is a pointer; it's the kernel that prevents a pointer to 0 from being used. In fact, you can use it if mmap_min_addr is set to 0.


NULL is not a pointer—what does it point to?

0x00000000 is a pointer, sure, and it points to the memory at address zero. But that's not the same concept as NULL, even if it happens to have the same representation. If malloc(32) returns NULL, it doesn't mean the memory between 0x00000000 and 0x00000020 is available for me to use.

false also has the same representation, but no one would claim that false is a pointer.

mmap_min_addr exists because C is unable to distinguish NULL and a pointer to address zero, and the Linux kernel is written in C, and too much code wrongly treats NULL as a pointer to 0x00000000 and attempts to read or execute the contents of memory there. If code did not confuse the two, mmap_min_addr would not need to exist.

And it is relatively recent; it was added as a security measure in June 2007 for Linux 2.6.23 in https://github.com/torvalds/linux/commit/ed0321895182ffb6ecf... , about 40 years after the invention of NULL.


The point is that there are two very fundamentally different things that can be stored in a C pointer:

1. A valid address to an object of some type.

2. Null.

One of these can be dereferenced, the other cannot; the valid operations are not the same, because they are not the same "type". Now, C of course will happily store both the address of an object and "NULL" in the same "type", and that's the problem.

There are a great many places in code in C where one would like to take a pointer that must contain an object's address; that is, a non-null pointer. The type system offered by C has no way to indicate this, and so the compiler cannot catch passing NULL to such a function.

(And I honestly would bet that the kernel limits it more due to C, than C uses 0 because the kernel limits it. That is, yes, the kernel limits allocating address 0, but the arrow of causality is the other way around.)


> One of the most common bugs in C (in my experience) is forgetting to check the error return on a malloc call.

Very unlikely.

1. It won't happen in regular use, only if you are asking for a huge amount of memory (possibly by mistake), or if the system is already full, in which case you will likely experience problems with the whole system (freezes, instability, processes being killed, etc.) and not just with your program.

2. It will never happen on Linux, because in the classical setup, malloc() never returns NULL, even when there is no memory available.

So you have to have those conditions + a return value unchecked for the bug to have a chance to appear. There are thousands of other bug sources.


The 'classical' setup is horribly broken, and relying on it just produces more broken code.


I think much worse is forgetting to check the size returned by a read() or write() call, especially when dealing with sockets.

Unchecked malloc() crashes very easily, data chopped off in the middle usually triggers problems on the /other/ side.
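The short-read point can be sketched as a small loop (a hypothetical `read_exact_loop`, similar in spirit to what `std::io::Read::read_exact` does): a single `read` call is allowed to return fewer bytes than requested, so an unchecked call silently drops data.

```rust
use std::io::{self, Read};

// Keep calling read() until the buffer is full; one call may legally
// return fewer bytes than requested, especially on sockets.
fn read_exact_loop<R: Read>(r: &mut R, buf: &mut [u8]) -> io::Result<()> {
    let mut filled = 0;
    while filled < buf.len() {
        match r.read(&mut buf[filled..])? {
            0 => return Err(io::Error::new(io::ErrorKind::UnexpectedEof, "short read")),
            n => filled += n,
        }
    }
    Ok(())
}

fn main() {
    let mut src: &[u8] = b"hello world";
    let mut buf = [0u8; 5];
    read_exact_loop(&mut src, &mut buf).unwrap();
    assert_eq!(&buf, b"hello");
}
```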


Using exceptions solves that problem and many others besides; it's a shame they're not currently in fashion. But you do end up needing them, so in Rust, we end up with error codes _and_ exceptions (but spelled "panic").


I wouldn't call panic the same thing as exceptions - panics are quite explicitly meant to not be caught, except at fairly hefty boundaries like processes, threads, or FFI. Using std::panic::catch_unwind to catch, say, accessing an out-of-bounds element in a vector would be super un-idiomatic, both because that's what .get() -> Option<T> is for, and also because even if you catch the panic, the message gets printed to stderr: https://play.rust-lang.org/?gist=587f976dc4bcf4010eb0026c9ed...
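A minimal sketch of the contrast being drawn, using only the standard library: `catch_unwind` does stop the panic, but the default panic hook still prints to stderr first, while `.get()` expresses the fallible lookup in the return type.

```rust
use std::panic;

fn main() {
    // Catching the out-of-bounds panic works, but the default hook
    // still prints the panic message to stderr first.
    let result = panic::catch_unwind(|| {
        let v = vec![1, 2, 3];
        v[10]
    });
    assert!(result.is_err());

    // The idiomatic alternative: .get() returns an Option instead.
    let v = vec![1, 2, 3];
    assert_eq!(v.get(10), None);
}
```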

Rust doesn't have exceptions in the sense that C++/Java/Python/etc. have exceptions, i.e., things that unwind the stack and are part of a function's expected API. And I think the specific reason they're out of style is the inherent contradiction in that statement: either all the exceptions in the API of any of the functions you call are also part of your public API, or you're carefully filtering exceptions in any of the functions you call that raise them, and so you might as well not use unwinding.

Panics unwind the stack, but are not part of the API; they're for erroneous conditions where the usual right thing to do is to kill the process, but maybe you only want to kill e.g. the current HTTP request. Errors as return values / the Result<T, E> type do not automatically unwind - they're just normal data types returned from a function - but they have syntax (the question mark operator) for explicitly unwinding them one step, and there are proposals in progress to introduce syntax that use "throw" or "catch" to refer to returning the error case or handling such returns in a block of code, so it seems like people think Result more closely matches exceptions in other languages.


In other words, Rust _does_ have exceptions! That some people don't use them all that much is a matter of convention in a particular community, not a matter of language feature set. You could implement exactly the same model in C++. The fact is that Rust has exceptions to exactly the same extent C++ has them.


Kind of? I mean, the fact that panics get printed to stderr makes it cumbersome to use it. (I mean, yes, you can do things to suppress the exceptions. You can also write some C macros to implement tagged enums and make a libc whose malloc returns Option.)

I don't really see a huge distinction between a language and its community, for the primary reason that feature evolution in a language - e.g., that Results recently got the question-mark operator, and whether Results will get "throw"/"catch" syntax - is driven by the language community and what sorts of things are or aren't common practice. A Rust community that made heavy use of panics in normally-operating code would probably want to fork Rust just to optimize panicking and catching panics, to fix the fact that catch_unwind is documented to not necessarily catch all panics, etc., and would eventually make deeper language changes to improve the syntax around doing panicking and not merge the corresponding changes to improve the syntax around Result. Which is historically what's happened with languages that have developed multiple communities - there are lots of Lisp dialects, lots of BASIC dialects, etc. Whether BASIC has a feature isn't a well-formed question; whether GW-BASIC or VB.NET or your TI-83 has a feature is well-formed.


Well, error codes are not special constructs in any way. And exceptions can be easily overused. So you have to draw the line somewhere. Rust just chose the balance where facing an exception is truly exceptional :)


"Exceptions are for exceptional conditions" is a meme that I wish would just die. Its origin lies in 1990s C++ compilers, which were extremely inefficient when dispatching exceptions, leading programmers, as a pragmatic measure, to use error codes for "expected" errors and exceptions only for cases thought to occur infrequently.

We've long since passed the time when we had to worry about such concerns. "Exceptions for exceptional errors" just means that you have to write code that both cares about exception safety and propagates error codes from subroutines. It's the worst of both worlds.

Just use exceptions for all errors. It's elegant.


It's not elegant - it means that every single exception raised by any function you call, or any function they might call, and so forth, is now part of your API. If you're an HTTPS library, and you're using some OpenSSL bindings for certificate validation, OpenSSLCertificateValidationError is part of your API because callers are now catching that. If you switch to BoringSSL or NSS or whatever, callers won't be expecting BoringSSLCertificateValidationError or NSSMismatchedCertsException.

Java has a particularly inelegant and ugly solution here involving declaring what types of exceptions might be thrown. But that still doesn't change the fact that changing that list is an API change, it just makes it more explicit.

So if you care about API stability (and I firmly believe that no solution that ignores API stability is "elegant" - it is at best "cute"), you're basically required to catch the vast majority of exceptions your own dependencies generate and translate them to your own exception types. You'll need to make a MyHTTPLibCertificateValidationError, unwrap the contents of OpenSSLValidationError, and put them in the new object, or you can never switch away from OpenSSL without an API break. And you want your dependencies to follow the same discipline.
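The wrap-and-translate discipline described here can be sketched with hypothetical error types (`TlsError` standing in for a dependency's error, `HttpLibError` for the library's stable public type); a `From` impl plus `?` does the translation at each layer boundary.

```rust
// Hypothetical error types: TlsError stands in for a dependency's
// error (e.g. an OpenSSL binding's); HttpLibError is our stable API.
#[derive(Debug, PartialEq)]
struct TlsError(String);

#[derive(Debug, PartialEq)]
enum HttpLibError {
    CertificateValidation(String),
}

impl From<TlsError> for HttpLibError {
    fn from(e: TlsError) -> Self {
        HttpLibError::CertificateValidation(e.0)
    }
}

fn validate() -> Result<(), TlsError> {
    Err(TlsError("expired certificate".into()))
}

// Callers only ever see HttpLibError; `?` applies the From conversion,
// so the dependency's error type never leaks into the public API.
fn fetch() -> Result<(), HttpLibError> {
    validate()?;
    Ok(())
}

fn main() {
    assert_eq!(
        fetch(),
        Err(HttpLibError::CertificateValidation("expired certificate".into()))
    );
}
```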

At that point, as I said above, why use unwinding? None of the exceptions in your program can safely pass more than one level of the call stack at a time; each level has to explicitly approve raising it another level or wrap the exception in its own type (or handle it). The only ones that can really unwind are standard library ones like OutOfMemoryError that are expected to go all the way up the call stack to the top of the program or at best the top of the current request, print or otherwise log a backtrace, and abort the entire thing in progress - i.e., exceptional conditions. Exceptions for expected conditions are a different thing entirely, precisely because you don't want unwinding, you want step-by-step propagation.

This has nothing to do with efficiency. This has to do with correctness and robustness.

And you get your syntactic elegance with a library for translating and wrapping error objects, like https://docs.rs/error-chain , combined with syntax for immediately translating and returning errors from dependencies, like https://doc.rust-lang.org/book/second-edition/ch09-02-recove... .


Except that exceptions need RTTI information and so you'll see binary size increase as you start using them.

Some compilers even won't let you use them without turning on full-blown RTTI(or doing it implicitly for you) which is even worse.


Not in C (which does not have exceptions) and not in C++ when doing what the parent suggested (calling malloc, not new).


I don't see anything in the link that implies they think Redox OS is going to be crash-free. In fact, it seems to be the opposite: it looks to me like they're looking for bugs to squash.


I agree. Particularly "There are a few ways already that I believe lockups or program crashes can be triggered, but I am looking for the experiences of others."


The OP isn't making a claim about anyone involved in Redox, but rather about the general perception of Rust, particularly among non-systems programmers.

I tend to agree. I've shaken hands with a lot of people who see Rust as a panacea rather than a mitigation strategy.


To my mind Rust is neither a panacea nor a mere mitigation strategy. It (or another language like it) is a necessary condition for anything like generally safe (not perfect) computing.

Please don't jump to the conclusion that someone who thinks that Rust isn't just one more mitigation strategy is an addled fanboi who thinks it's a total panacea and snake oil. That is NOT what they're saying. They are saying it's necessary for robust computing and I think they're right about that.

I'm quite tired of being instantly and vehemently misunderstood on this point so often, personally. So many people seem eager to put not just words but whole chapters in my mouth, even at the cost of using rather rotten logic to accomplish that. Those of us with considerable respect for Rust DO indeed understand that a necessary condition for genuinely safer computing is not a sufficient condition - but I do sometimes wonder if those who have less interest in Rust and sneer at anything resembling enthusiasm for it, have recognized this distinction, or have gotten that far in their thinking.


The idea that something can be "necessary but not sufficient" comes up a lot, and is very often misunderstood.

Rust-like guarantees are necessary but not sufficient for safe systems.


As DJB and other excellent programmers have demonstrated, they're not strictly necessary either.

(and I really like Rust)


The sheer complexity of many of the systems we're building now really does make them necessary; so many programmers are not just building but maintaining such complex systems over such long periods of time now. The only way to be safe is to be safe.

There was a time when good drivers argued that seatbelts weren't strictly necessary if you knew what you were doing in a car, so there shouldn't be laws insisting on them.

So if by "strictly" you mean logically possible and sometimes within human ability: yeah, but we now just have to be more practical.

No doubt I should have narrowed the scope of the first sentence of my first post; but I'll leave it for now.


Not strictly necessary, but as the vast legions of merely adequate programmers have demonstrated they are practically necessary. They're necessary for scaling the number of programmers up.


> Rust being advertised as a safe language (which is true) got so much into the user's heads that they think if they write it in Rust it's safe and crash free by default. There are plenty of security/safety bugs that aren't about dereferencing a null or accessing invalid ptr.

> No cynicism intended, just an observation from what I see around.

The difference is what happens when a bug is encountered. If it is caught and panicked upon, then that is safe, predictable behavior that cannot be exploited into something like privilege escalation.


I remember when all the same arguments were made about Java, which led people to ignore the possibility of memory leakage and privilege escalation, even though both those things are common problems with idiomatic Java.

For instance:

Swing recommends you register callbacks all over the place, but those cause otherwise-unreferenced windows to stay live from the GC’s perspective (this is a general problem with the callback pattern, not Swing).

Java serialization surfaces all sorts of private methods to outside processes via the reflection APIs, which is even easier to exploit than arbitrary memory stomping in C.

Practically everything can throw a null pointer exception, and all generics code can throw ClassCastExceptions. Writing all the error handling logic for this is at least as hard as restricting yourself to a memory-safe subset of C++ templates. If you miss an error handling case, an attacker can use that to escalate up to increasingly high level invariant violations in your code.

Basic things like “final” have ill-defined semantics with multithreaded code (final fields can change value, even without reflection).

I don’t know Rust well enough to know which classes of these bugs it has (and I doubt the Rust community really does either—it took the Java world a decade to notice some issues like the above).

With the exception of the “final” problems, I think fixing any of the things I listed reduces to solving the halting problem, or giving up on using turing complete languages.

This makes me skeptical of many claims coming from Rust proponents at the moment.


> I don’t know Rust well

That is one problem, and I don't want to attack you. This is something Rust needs to work on! Rust is indeed a relatively hard language to learn and understand, and it is not really clear at the beginning, nor at the intermediate level, whether it's worth the effort. I can only speak from an empirical standpoint. Time showed – to me, and maybe I am the only one – that I have fewer bugs in general, especially in the late stage of development. Rust tends to feel a little viscous in the beginning, but it catches up later, when you don't have to hunt for a race condition in a 100k-LOC program. Rust gives you a good way to reason about your program, especially in a multi-threaded environment. And this is where Rust is fundamentally different from what we – as programmers – have experienced in the wild. Yes, there are many efforts in academia and research languages, but none of them are used outside it.

Rust does nothing new; it has no concepts that are not strongly researched in academia. Rust is trying to bring those findings into the wild.

In Rust, practically nothing can throw null pointer exceptions or class cast exceptions. Rust doesn't let you express ill-defined semantics in multi-threaded code – you can't have data races.
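A minimal sketch of the no-data-races guarantee: safe Rust only compiles cross-thread mutation when it goes through a synchronization type, e.g. a `Mutex` behind an `Arc`; handing threads a bare `&mut` to shared data is rejected at compile time.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

// Spawn n threads that each increment a shared counter through a Mutex.
fn count_with_threads(n: usize) -> usize {
    let counter = Arc::new(Mutex::new(0));
    let handles: Vec<_> = (0..n)
        .map(|_| {
            let counter = Arc::clone(&counter);
            // The closure only gets the Arc<Mutex<..>>; sharing a bare
            // &mut to the integer across threads would not compile.
            thread::spawn(move || {
                *counter.lock().unwrap() += 1;
            })
        })
        .collect();
    for h in handles {
        h.join().unwrap();
    }
    let total = *counter.lock().unwrap();
    total
}

fn main() {
    assert_eq!(count_with_threads(4), 4);
}
```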

And this implies – of course – that the set of programs you can write in safe Rust is smaller than in C/C++/Java/C#: you can't possibly write programs with data races or dangling/null pointers, making something like 60% of all CVEs (I pulled this number out of my ____) impossible to write.

Rust can't prevent you from making logical errors, but it can help you prevent various categories of bugs that we see in the wild right now.

Rust has problems of its own kind (immaturity in its ecosystem due to its short lifetime, a steep learning curve, and many more), but none of the ones you listed above. And it can already deliver very well in certain disciplines today.


Do you recommend any resources to learn Rust?


The official Rust book [1] is a great place to start, as are the exercism tutorials [2].

[1] https://doc.rust-lang.org/book/

[2] http://exercism.io/languages/rust/about


Version 1 or 2 of the Rust book?


Author here. 2 is much better. And it’s almost done; we’re mostly in editing and bikeshedding the opening paragraph.


Java remains a memory-safe language, though, right? Leaving aside poor deserialization APIs, it's pretty hard to provide malicious input to non-malicious Java source code that takes control over the interpreter.

Most of Java's security reputation, I think, comes from Java applets, which were a very different model that involved letting malicious actors write the Java code. Java continues to see success in both web application software and Android apps, both of which do not use the applet security model, and it seems that Java's security record there is far better than C or C++ would have been in the same roles.


Apache Struts and Equifax might disagree that the handling of exceptions doesn't cause security issues in Java.


If I'm understanding right, the Equifax vulnerability was CVE-2017-5638, where if an invalid Content-Type header is given, it gets passed to an error-message-printing function that searches the entire provided message for OGNL code (a domain-specific language that apparently can deserialize arbitrary Java objects, or something), so an attacker can put OGNL code that spawns an external process in the Content-Type header?

I'd argue that this isn't a language-specific issue at all. It happens to be the case that this is implemented using exception handling in Java, by throwing an object containing the request headers, and having it caught by something that parses the provided message for code. But a language without this type of exception handling, like C or Rust, could have the same issue as easily:

    if (strcmp(request->content_type, "multipart/form-data") != 0) {
        char *error;
        asprintf(&error, "Bad Content-Type header %s", request->content_type);
        log(error);
        free(error);
        return -1;
    }

    void log(char *message) {
        char *ognl = find_valid_ognl(message);
        if (ognl)
            message = evaluate(ognl);
        ...
    }
I doubt that any language can really, fundamentally, save you from wrongly deciding to evaluate untrusted strings. (I do think that the vulnerability here was made more likely by OGNL being able to instantiate arbitrary Java objects and by not distinguishing TrustedString and UntrustedString types, and a good language can gently steer you away from those mistakes, but it won't be able to completely save you.)
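The TrustedString/UntrustedString idea mentioned in passing can be sketched with hypothetical newtypes; the `sanitize` step here is a stand-in for real validation, but the point is that the type system refuses to log an `UntrustedString` directly.

```rust
// Hypothetical newtypes: the type system steers you away from
// evaluating or logging untrusted input directly.
struct TrustedString(String);
struct UntrustedString(String);

// Stand-in for real escaping/validation logic.
fn sanitize(s: UntrustedString) -> TrustedString {
    TrustedString(s.0.replace('%', ""))
}

// Only accepts already-sanitized input.
fn log_message(msg: &TrustedString) -> &str {
    &msg.0
}

fn main() {
    let raw = UntrustedString("Bad Content-Type %{ognl}".into());
    // log_message(&raw) would not compile: UntrustedString != TrustedString.
    let clean = sanitize(raw);
    assert_eq!(log_message(&clean), "Bad Content-Type {ognl}");
}
```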


I agree that it isn't Java-specific, but Java is affected despite being memory safe and "it's pretty hard to provide malicious input to non-malicious Java source code that takes control over the interpreter" may not be as true a statement as it appears.


That's a fair objection, and I already had to add a caveat about deserialization libraries that can instantiate arbitrary objects. (Which is a problem that affects other memory-safe dynamically-typed languages too, e.g., Python's pickle and yaml modules have the same problems.)

I think there is a meaningful distinction, still, between something like OGNL where the program / library is providing its own interpreter of the EDSL that is excessively capable (which can be done in any language, including Turing-incomplete languages, capability systems, etc.: a library can always dispatch input to whatever functions are available to it) -- i.e., where the programmer intended the functionality that was implemented, they just didn't think through what they were doing -- and things like buffer overflows and ROP where the course of program execution, on the existing interpreter / runtime / platform, is subverted to something the programmer did not intend. I'm not quite sure how to phrase it. (And I'm not sure on which side of the line arbitrary-object deserialization lands.)


I'd phrase it as "Programmers can write vulnerabilities in any language" :-)


Rust is significantly different with regards to this stuff:

* There is no reflection API

* Exceptions don't exist

* There's no subtyping, so ClassCastExceptions can't exist. (There's subtyping with lifetimes, and you can't cast them, so that doesn't count)

* final doesn't exist

That said, every language has things that you don't realize; we did have a sort of similar moment to this stuff before 1.0 with the "leakapocalypse." Rust will have its own issues. But it doesn't really inherit the problems you're talking about above.


> Exceptions don't exist

They're called panics.

> There's no subtyping, so ClassCastExceptions can't exist.

Any sort of polymorphism opens you up to the possibility that code that's written to take abstract type A, implemented concretely in B and C, implicitly expects a B and blows up when it gets a C. Rust doesn't solve this problem.

> final doens't exist

Sure it does. It's just implied, and not-final has to be spelled out. This is one thing Rust did that I do agree was a good move. The other stuff? I remain unconvinced, especially with respect to structural subtyping.


> > Exceptions don't exist

> They're called panics.

Nope.

https://doc.rust-lang.org/std/panic/fn.catch_unwind.html


It's non-local control flow with deterministic unwinding and defined stop points. It's an exception system.

Sure, it's not "recommended", but recommendations don't define semantics.


I'd say they do—a Rust user will never encounter a panic from a reasonable library that they're expected to catch.

If hypothetical unreasonable libraries are in scope, they can just open /proc/self/mem for writing (in safe Rust!) and violate memory safety, and that's not an argument that Rust isn't memory safe. So why should hypothetical unreasonable libraries be an argument that Rust has exceptions?


Because the language doesn't define what happens if you write to memory behind the runtime's back. It _does_ define, in detail, what occurs on panic. It's a legal thing to do. That some consider panicking in bad taste or whatever does not change the fact that panic is a supported feature of the language.


It also defines that it may abort, and so culturally, people use them for only what’s intended, as otherwise, it limits your audience.


> Any sort of polymorphism opens you up to the possibility that code that's written to take abstract type A, implemented concretely in B and C, implicitly expects a B and blows up when it gets a C. Rust doesn't solve this problem.

Can you explain this? Do you mean that the type C does not correctly implement A? Or do you mean something that expects a specific value from a method in A that only B will return? Both of those, while possible, just seem like complaints of "X doesn't prevent me from purposely breaking it".


One example is something like enum A { B, C }: a function might have a precondition that it requires an A::B, and crashes/misbehaves when given an A::C. In Rust, this can be seen in Option::unwrap (requires a Some, panicking if given a None). Technically this is the same sort of thing as in Java, but, at least to me, it seems different/more easily controlled.


Unwrap is essentially asking for it to fail under that condition, so that's not exactly broken.

Unwrap should always be suspect. It's easy to turn it into a pattern match instead, which explicitly forces you to handle the None case.

I'm not saying you're wrong, but I don't think it's an example of an error created by a polymorphic type. (Instances of the same type can do this as well.)
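The pattern-match alternative suggested above, as a minimal sketch:

```rust
fn main() {
    let maybe: Option<i32> = None;

    // match forces the None case to be handled explicitly, where
    // .unwrap() would simply panic on None.
    let value = match maybe {
        Some(n) => n,
        None => 0, // explicit fallback
    };
    assert_eq!(value, 0);

    // The same thing, more tersely:
    assert_eq!(maybe.unwrap_or(0), 0);
}
```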


One can look at Option<T> as an interface/base class that can be satisfied by a T or by (). Indeed, in Java-style languages, one occasionally sees implementations that have an Option abstract base class, with two subclasses, Some and None. Both enums and subclassing are forms of polymorphism, the main conceptual difference is the polymorphism being closed or open (respectively). The fact that Rust adds extra layers of types/syntax on top of the raw polymorphism is syntactic sugar, just a disguise.

One could actually even express the Option concept as a trait, with manual downcasts:

  trait OptionLike {
      type Contained;
      fn is_some(self) -> bool;
      fn as_some(self) -> Some<Self::Contained>;
  }
  struct Some<T> { value: T }
  impl<T> OptionLike for Some<T> { ... }
  struct None<T> { _phantom: PhantomData<T> }
  impl<T> OptionLike for None<T> { ... }
Under this scheme, Option<T> is expressed as a trait object like Box<OptionLike<Contained=T>>, and the whole thing looks a lot more like the OOP/subclassing approach.

In any case, the fact that unwrap is clearly documented to fail in the None case is orthogonal: it's an invariant about the (sub)types of its arguments that isn't expressed in the signature.


See above about panics and exceptions; just because they’re similar doesn’t mean they’re the same.

I agree Rust doesn’t solve all bugs.


Crash bugs caused by a panic can't necessarily be used for privilege escalation, but they can certainly be used for DoS attacks.


My point is that privilege escalation can happen from logical bugs.


Panics can of course result in exploitable system behaviour.


Panics may be safe and unwind nicely, but they can still, for example, be triggered in an untested code path to DoS a system.

That's one reason that people run AFL against safe Rust.


Rust is not just about null pointers. It's also about race conditions in the face of mutation. See this recent thread [1] for a concrete example in the context of systems programming.

[1] https://news.ycombinator.com/item?id=16189088


Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?

( proggit's got the cynical side going: https://www.reddit.com/r/programming/comments/7ryiih/redox_o... )


No, but the machine shouldn't do something stupid, like burn down.


The quote is about data, and now our code is data. The computer indeed should not burn down – even if the kernel faults.


The kernel faulting is effectively burning down. In modern systems there is the concept of "isolation": if you give bad input to one program (e.g. the shell), the kernel shouldn't fault (taking down unrelated programs with it).


There is (most likely) no perfect isolation.

If given no alternatives, a kernel should burn down rather than permitting a privilege escalation. Ideally there is no privilege escalation either.

Given bad input the kernel should not fault but sometimes it must.


Capability-safe languages in the E family, like E and Monte, have perfect isolation, or as it's known in the community, "working encapsulation." In a capability-safe system, only bugs in the implementation, such as runtime or CPU bugs, can violate isolation.

Here's an introduction to E: http://www.erights.org/talks/promises/paper/tgc05-submitted....

The Monte language's manual starts with a tutorial and an explanation of what capability-safe languages do: http://monte.readthedocs.io/


That's all nice but what if you have a CPU bug like Spectre?

Isolation is nice in userspace but in kernelspace the rules are somewhat different, especially considering that it is the task of kernelspace to mediate and abstract hardware access to arbitrary programs running on your system. Any number of hardware bugs may break isolation. Any number of wrong assumption or subtle logic bugs can happen and break the isolation.

The documentation on Monte also mentions that it is rather slow and requires memory management, two properties that simply do not exist in kernel space.

Writing a kernel is like writing a modern web browser, except that if you get it wrong the malicious program can access everything: all webpages, all the history, and send your mom rude jokes on Facebook after literally setting your computer on fire.

So you can't simply say "we will never break isolation"; you must ask "in the infinitesimally small chance that isolation breaks, what do we do?". A kernel must be prepared to deal with such a possibility, even if it "cannot happen".


That's why high assurance systems (like L4/seL4) minimize privileged code to a tiny fraction of what you find in a regular system, which absolutely makes sense and provably improves security.

Apart from the obvious advantages, minimal privileged code also means that it becomes far more feasible to formally verify the privileged code, like it has been done for seL4, which in turn is an important cornerstone to ensure that isolation (or more generally, specification invariants) does not break in any reachable system state.


Even a perfect kernel will have to crash.

Something like Rowhammer can turn perfectly written code into malicious gadgets for an attacker. Until ECC or other preventative measures become widespread in consumer hardware, this attack will remain viable.

Additionally, cosmic rays may at any time write arbitrary data into memory. In these situations it is perfectly reasonable to crash the machine if essential data structures were affected.

Or suppose you just received a MACHINE CHECK exception, the type of interrupt which essentially just tells the OS that the machine is no longer capable of operating in its current state.

This isn't all about an attacker gaining privilege: crashing the kernel in response to a hostile act or other event (cosmic rays, voltage fluctuation) that corrupts or compromises the continued operation of the machine is perfectly acceptable and normal.
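The "crash rather than continue on corruption" policy can be sketched as well. Everything here (PageTableRoot, the XOR checksum) is made up for illustration and is not Redox's actual code; a real kernel would panic where this returns Err, and would use a stronger checksum.

```rust
// Sketch: a critical kernel structure carries a checksum so that a bit
// flip (Rowhammer, cosmic ray) is detected instead of silently trusted.

struct PageTableRoot {
    entries: [u64; 4],
    checksum: u64,
}

fn checksum(entries: &[u64; 4]) -> u64 {
    // Trivial XOR checksum over the entries; illustrative only.
    entries.iter().fold(0x5a5a_5a5a_5a5a_5a5a, |acc, e| acc ^ *e)
}

impl PageTableRoot {
    fn new(entries: [u64; 4]) -> Self {
        let checksum = checksum(&entries);
        PageTableRoot { entries, checksum }
    }

    /// Ok means the structure is intact; Err means it was corrupted and
    /// continuing would be unsafe. A kernel would panic/halt on Err.
    fn verify(&self) -> Result<(), &'static str> {
        if checksum(&self.entries) == self.checksum {
            Ok(())
        } else {
            Err("page table root corrupted; halting is the only safe option")
        }
    }
}

fn main() {
    let mut root = PageTableRoot::new([0x1000, 0x2000, 0x3000, 0x4000]);
    assert!(root.verify().is_ok());
    root.entries[2] ^= 1 << 17; // simulate a single bit flip
    assert!(root.verify().is_err()); // detected: crash, don't corrupt data
}
```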


"That's all nice but what if you have a CPU bug like Spectre?"

Use VAMP [1] or AAMP7G [2] with techniques [3] that prevent leaks if you're worried about that. As far as RAM/Rowhammer goes, either use techniques that trust it less like Edmison's [4] or actually do QA when you build your own. A RAM engineer told me they were intentionally under-designed for cost-cutting and/or rushing to market. There's also an open cell library available. Then, you get some analog and RF engineers that don't trust each other to look at the rest of the mixed-signal chip. You should be good on what requirements you've stated so far. If worried about kernel security, modify that same chip to use CHERI [5], which already runs FreeBSD, but you'll want a separation kernel. Throw in asynchronous or masked execution for timing channel mitigation.

I still agree you want containment, detection, and recovery mechanisms just in case where you can put them in. Mainly in case of hardware failures but also unknown attack classes.

[1] http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.217...

[2] http://www.ccs.neu.edu/home/pete/acl206/slides/hardin.pdf

[3] https://pastebin.com/ajqxDJ3J

[4] https://vtechworks.lib.vt.edu/bitstream/handle/10919/29244/e...

[5] http://www.cl.cam.ac.uk/research/security/ctsrd/cheri/


Imperfect isolation is always treated as a bug. As a kernel developer if you try to argue otherwise you'll get laughed out of the room and/or fired. Hopefully both.


I'm not saying imperfect isolation isn't a bug, rather that bugs are inevitable and the proper response to some bugs being exploited (maliciously or not) is to crash the kernel if there is no other option.


Hardware can misbehave. If the kernel can detect that, it is reasonable to shut down the machine to prevent data corruption. I cannot think of a kernel developer who would laugh at that.


Depends. Did you tell it to burn down?


Related: the Changelog podcast did a really good interview with the developer[1]. It's pretty long but worth the listen IMO.

[1]: https://changelog.com/podcast/280


I listened to this last night; it’s great.


Just listened to it too. That podcast has some other interesting interviews, like one with Miguel de Icaza.


I'd rather trust something with a formal proof, such as seL4.

https://sel4.systems


It's being worked on! Part of libstd was formally verified as well. Lots more work to be done, though. http://plv.mpi-sws.org/rustbelt/ was even just presented at POPL!



The two aren't mutually exclusive.


This was the output I got when I followed the instructions from the book. What should I have seen? Can you suggest what I can do to make it work?

  (venv3.5) root@ubuntu-s-1vcpu-1gb-nyc1-01:~# qemu-system-x86_64 -serial mon:stdio -d cpu_reset -d guest_errors -smp 4 -m 1024 -s -machine q35 -device ich9-intel-hda -device hda-duplex -net nic,model=e1000 -net user -device nec-usb-xhci,id=xhci -device usb-tablet,bus=xhci.0 -enable-kvm -cpu host -drive file=redox_0.3.4.bin,format=raw  -nographic
  pulseaudio: pa_context_connect() failed
  pulseaudio: Reason: Connection refused
  pulseaudio: Failed to initialize PA contextaudio: Could not init `pa' audio driver
  ALSA lib confmisc.c:768:(parse_card) cannot find card '0'
  ALSA lib conf.c:4292:(_snd_config_evaluate) function snd_func_card_driver returned error: No such file or directory
  ALSA lib confmisc.c:392:(snd_func_concat) error evaluating strings
  ALSA lib conf.c:4292:(_snd_config_evaluate) function snd_func_concat returned error: No such file or directory
  ALSA lib confmisc.c:1251:(snd_func_refer) error evaluating name
  ALSA lib conf.c:4292:(_snd_config_evaluate) function snd_func_refer returned error: No such file or directory
  ALSA lib conf.c:4771:(snd_config_expand) Evaluate error: No such file or directory
  ALSA lib pcm.c:2266:(snd_pcm_open_noupdate) Unknown PCM default
  alsa: Could not initialize DAC
  alsa: Failed to open `default':
  alsa: Reason: No such file or directory
  ALSA lib confmisc.c:768:(parse_card) cannot find card '0'
  ALSA lib conf.c:4292:(_snd_config_evaluate) function snd_func_card_driver returned error: No such file or directory
  ALSA lib confmisc.c:392:(snd_func_concat) error evaluating strings
  ALSA lib conf.c:4292:(_snd_config_evaluate) function snd_func_concat returned error: No such file or directory
  ALSA lib confmisc.c:1251:(snd_func_refer) error evaluating name
  ALSA lib conf.c:4292:(_snd_config_evaluate) function snd_func_refer returned error: No such file or directory
  ALSA lib conf.c:4771:(snd_config_expand) Evaluate error: No such file or directory
  ALSA lib pcm.c:2266:(snd_pcm_open_noupdate) Unknown PCM default
  alsa: Could not initialize DAC
  alsa: Failed to open `default':
  alsa: Reason: No such file or directory
  audio: Failed to create voice `dac'
  ALSA lib confmisc.c:768:(parse_card) cannot find card '0'
  ALSA lib conf.c:4292:(_snd_config_evaluate) function snd_func_card_driver returned error: No such file or directory
  ALSA lib confmisc.c:392:(snd_func_concat) error evaluating strings
  ALSA lib conf.c:4292:(_snd_config_evaluate) function snd_func_concat returned error: No such file or directory
  ALSA lib confmisc.c:1251:(snd_func_refer) error evaluating name
  ALSA lib conf.c:4292:(_snd_config_evaluate) function snd_func_refer returned error: No such file or directory
  ALSA lib conf.c:4771:(snd_config_expand) Evaluate error: No such file or directory
  ALSA lib pcm.c:2266:(snd_pcm_open_noupdate) Unknown PCM default
  alsa: Could not initialize ADC
  alsa: Failed to open `default':
  alsa: Reason: No such file or directory
  ALSA lib confmisc.c:768:(parse_card) cannot find card '0'
  ALSA lib conf.c:4292:(_snd_config_evaluate) function snd_func_card_driver returned error: No such file or directory
  ALSA lib confmisc.c:392:(snd_func_concat) error evaluating strings
  ALSA lib conf.c:4292:(_snd_config_evaluate) function snd_func_concat returned error: No such file or directory
  ALSA lib confmisc.c:1251:(snd_func_refer) error evaluating name
  ALSA lib conf.c:4292:(_snd_config_evaluate) function snd_func_refer returned error: No such file or directory
  ALSA lib conf.c:4771:(snd_config_expand) Evaluate error: No such file or directory
  ALSA lib pcm.c:2266:(snd_pcm_open_noupdate) Unknown PCM default
  alsa: Could not initialize ADC
  alsa: Failed to open `default':
  alsa: Reason: No such file or directory
  audio: Failed to create voice `adc'
  qemu-system-x86_64: terminating on signal 15 from pid 8287
  (venv3.5) root@ubuntu-s-1vcpu-1gb-nyc1-01:~#


Remove some of the extra options from the list; the ich9-intel-hda device, for example, probably can't be used on your machine (the ALSA/pulseaudio errors are the host failing to find a working audio backend on a headless server).
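For example, here is the same invocation with the audio devices (ich9-intel-hda, hda-duplex) dropped; the remaining options are copied from the command above. Untested sketch, assumes the redox_0.3.4.bin image is in the current directory:

```shell
# Same command minus the host-audio devices that fail on a headless server.
qemu-system-x86_64 \
  -serial mon:stdio -d cpu_reset -d guest_errors \
  -smp 4 -m 1024 -s -machine q35 \
  -net nic,model=e1000 -net user \
  -device nec-usb-xhci,id=xhci -device usb-tablet,bus=xhci.0 \
  -enable-kvm -cpu host \
  -drive file=redox_0.3.4.bin,format=raw \
  -nographic
```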



