That used to be true, but now we have reasonable model checking tools for C. It'...

kelnos · 2024-10-25T21:39:03 1729892343

So two things:

1. This is perhaps just my preference, and so is more subjective, but I don't want to have to pick and choose among various options for doing linting or static analysis or or address sanitizing or model checking after the fact. I want compilation to fail if any of these invariants don't hold. Rust can do that; C never will be able to do that. (Sure, if I write unsafe Rust, I'm going to want to run MIRI on it, but if can stick to safe Rust, the compiler should be sufficient.)

2. I'm probably not as well-versed in the topic as you are, but my understanding is that model checking tools like this cannot prove that every single program that a compiler will accept will also be free of the issues that the model checker is looking for. Again, the Rust compiler can do that. Yes, that does mean that the Rust compiler might reject some programs that would turn out to be safe and sound, but I'm ok with that trade off.

nanolith · 2024-10-26T02:32:57 1729909977

1. Of course it can. My model checks on function contracts and invariants run for every function. If a function fails this contract or invariant, the overall build fails. Does it matter that this is a separate build step than the compilation? Not in practice. Think of the model check step as being a semantic analysis step that happens after the type check and syntax check from the compiler.

2. You're not interested in proving that every program that the compiler accepts will be free of issues. You're interested in whether YOUR program that you write is free of those issues. The Rust compiler CAN'T do this, because the Rust compiler is only looking at a SUBSET of the possible things that you can build model checks against. This is why Kani -- a model checker for Rust -- exists. You can model check unsafe code in Rust, as well as safe code in Rust against user assertions and function contracts that are otherwise not possible to check in vanilla Rust.

Model checking isn't just for C, but model checking, as a practical form of formal methods, brings the same and even better safety to C. In fact, with Kani, you can get similar safety in Rust as well.

If you like Rust, use it. But, as was the point of my comment, it is possible and practical to get similar safety in C.

Cyph0n · 2024-10-25T20:07:39 1729886859

I am sure it is “possible”, but we are talking about practicality here.

Why doesn’t the Linux kernel embrace model checking instead of experimenting with Rust?

nanolith · 2024-10-25T20:29:01 1729888141

It is quite practical. I'm actually planning a book on the subject.

The reason why some Rust enthusiasts have been experimenting with Rust in the Linux kernel is because they are passionate about Rust, and kernel maintainers are looking to find younger people. It's neither an endorsement of Rust nor an argument against model checking in C.

The reality is that this tooling isn't yet well known about. As it becomes better known, it will be adopted.

Cyph0n · 2024-10-25T21:22:14 1729891334

Unless you have some insider info, I think you’re oversimplifying why Linux is giving Rust a chance.

Also, model checking - at least as you’re portraying it - sounds too good to be true. If it was anywhere close to the realm of practicality for large C codebases (without maintaining a model separate from code), we would be hearing its praises being sung by C devs all over.

nanolith · 2024-10-25T21:31:32 1729891892

> Unless you have some insider info, I think you’re oversimplifying why Linux is giving Rust a chance.

Not really. Linus Torvalds has been quite open about this topic, and it has been covered extensively on LWN.

> Also, model checking - at least as you’re portraying it - sounds too good to be true.

> If it was anywhere close to the realm of practicality... we would be hearing its praises being sung by C devs all over.

Like any technology, model checking takes effort to use and learn. But, it does work quite well. Again, you're conflating the popularity of something with its effectiveness, which is a poor argument.

Cyph0n · 2024-10-25T21:54:08 1729893248

In my opinion, vaguely handwaving that model checking makes C as effective as Rust invites nothing but skepticism to anyone who knows C and Rust.

Typically, if you’re advocating for a relatively unknown technology that you want others to adopt, the onus is on you to describe how it is better and to be upfront about its limitations. Good luck with your book!

nanolith · 2024-10-25T22:15:07 1729894507

I appreciate your opinion. It's not possible to hold a nuanced conversation about model checking in general or in C in particular in the comments of HN. I'd need much more space. But, I can summarize.

CBMC's abstract machine can detect memory safety violations and integer related UB. This includes things like use-after-free, buffer overruns, heap/stack corruption, thread races, fence post errors, integer conversion and promotion errors, and signed integer overflow. When it detects a violation, it provides a counter-example demonstrating this. It also provides the user to build custom assertions, which allows function contracts to be built and enforced. Any function can be defined as the entry point for the model checker, which allows function-by-function analysis. Shadow methods can be substituted, which provides the abstraction necessary to model check entire code bases. A shadow method uses non-determinism to cover all possible inputs, outputs, and side-effects that the real function could perform. This abstraction also allows modeling of third party libraries, user/kernel code, and hardware. So far, I've model checked code bases of around half a million lines of code. It will easily scale to cover a code base the size of the Linux kernel, as long as you understand how to use it. It's an engineering problem at this point. Tracking that abstraction matches implementation is actually not that hard and can be done by writing good function contracts which are verified by the model checker. If the shadow and the original code follow the same function contract, which includes all possible inputs, outputs, and side-effects, the abstraction can be substituted.

The biggest limitation really comes down to large recursive data structures, which is also a pain point for Rust for that matter. There are ways to deal with this, but that's probably the most significant place where any code base customization is required. It's possible to refactor this code to be just as fast in modern C, but in a way that is easier for the model checker to verify quickly.

It's impossible to convert the trillions of lines of code written in C to Rust or any other language without blowing the entire budget of the tech sector for 30 years. Rewrites are prohibitively expensive. Tooling and automation for this tooling is not nearly that expensive.

Cyph0n · 2024-10-26T13:18:59 1729948739

So is it a form of property checking? That is, your model defines certain properties which your program is checked against.

Or is it some form of symbolic execution? This I doubt because I believe the performance is not there yet.

I will read up a bit more on CBMC.

nanolith · 2024-10-26T16:34:52 1729960492

It's a form of abstract interpretation. The code is compiled to a constraint problem in an SMT solver.

There is definitely a performance impact here, which is why it is important to decompose the program and verify it function by function. This decomposition is sound as long as the model checking scenario covers the complete function contract. To improve performance further, I use abstraction in the form of shadow methods. These are sort of like mocks for the model checker. They provide the same function contract -- inputs, outputs, and side effects -- as the original function, but using built-in non-determinism provided by the model checker. This simplifies the overall SMT equation while maintaining an approximation of the overall program. By defining external function contracts, I can use the model checker to verify that both the original function and the shadow function follow the same contract, which keeps the two in sync. The shadow functions are used to replace functions called by the function under model check in order to isolate this function and simplify the overall SMT equation.

The tool provides the mechanism, but it has taken me six years of work and research to develop a practical way to scale it. The book will cover the tool, but it is documenting this "cheat sheet" that is the real purpose for it.

For what it's worth, I'm also considering an edition that covers Rust and Kani.

binary132 · 2024-10-26T02:10:02 1729908602

Now I’m curious

OtomotO · 2024-10-25T18:50:34 1729882234

Great now give me tooling of this millenium (no, I am not going back to vendor everything manually and I am not reinventing every basic data structure I wrote in University in ever project I work on) and we have a deal!

Oh, also get rid of header files, they are archaic. And I want fearless concurrency... And sum types!

nanolith · 2024-10-25T19:04:36 1729883076

If you want those things, you don't want C. Pick a reasonable higher level language you like that makes those decisions for you.

My comment was not to imply that somehow C is superior to X, Y, or Z, but rather to point out that the safety problem with C does have a practical solution.

OtomotO · 2024-10-25T19:16:14 1729883774

Fair point, sorry (non native speaker here, for what it's worth) have a wonderful day!