COGENT: Certified Compilation for a Functional Systems Language (arxiv.org)
60 points by luu on Jan 24, 2016 | 14 comments



Really good stuff. Here I am, waiting to see what Leroy et al. will deliver on a mini-ML compiler, or more from the Habit team, when Klein et al. deliver a systems-oriented, certified compiler. Heck yeah! Interesting, too, in that it splits the supporting argument between the functional language and the parts in C. Its focus on being essentially an executable spec with verified translations will help in high assurance, where spec-to-implementation correspondence is quite important. It could be a shortcut when it produces acceptable implementations.


I'm having trouble finding information on the language they are talking about. It doesn't seem like the language (COGENT) is presented in this paper, just a compiler for the language. Is there a page on the language itself? A quick Google search shows only completely unrelated results. There is another language, CoGenT[0], at UMass, but it is not the same thing.

[0] https://people.cs.umass.edu/~trekp/cogent-manual.pdf


It is defined in this paper itself; see section 3 titled "Language," beginning "In this section we formally define Cogent, including its linear type system, its two dynamic semantics — update and value — mentioned earlier in §2, and the refinement theorem between them." Also see figure 5, which is its formal syntax after desugaring.
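
For a rough flavor of the linear typing: every value of linear type must be used exactly once, which is what lets the in-place "update" semantics safely refine the pure "value" semantics. This isn't Cogent syntax, but GHC's LinearTypes extension shows the same discipline (the names here are just illustrative):

    {-# LANGUAGE LinearTypes #-}

    -- Accepted: the buffer is consumed exactly once, so a compiler
    -- is free to implement this as a destructive in-place update.
    update :: buf %1 -> (buf %1 -> buf) -> buf
    update b f = f b

    -- Rejected by the type checker: 'b' would be used twice,
    -- so aliasing can never be introduced.
    -- alias :: buf %1 -> (buf, buf)
    -- alias b = (b, b)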


Thanks for that; I should have read a little further.


It looks like they have an unverified compiler which, assuming the compiler code is correct, will produce a proof of correctness for each program it compiles. This is significantly less cool and useful than proving that the compiler itself is correct.


No, the proof that the compiled code is "correct" does not depend on the compiler being correct - it can be checked independently.

Of course, if the compiler is incorrect it might generate incorrect code, but then the "proof" that the code is correct would not check.

Mind you, it's ambiguous to describe what their compiler is doing as producing a proof of "correctness". Any time someone says "correct", you should ask: correct with respect to what? In this case, AIUI, their compiler produces: (a) a formal description of what the program does (what they call "a high-level shallow embedding of the program's semantics"), and (b) a proof that this is in fact what the program does (its "correctness").

The idea is that if you want to show that your program is actually correct with respect to the specification you really care about, you do the proof on the formal description (a), and then (b) automatically guarantees that your proof carries over to the actual program.
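
In types, the division of labor looks roughly like this (a sketch with hypothetical names; the real toolchain emits Isabelle/HOL theorems, not Haskell values):

    -- Placeholder types, just to pin down the shape of the guarantee.
    data Cogent      -- the source program
    data CSource     -- the generated C
    data Embedding   -- shallow embedding of the semantics: the (a) above
    data Proof       -- machine-checkable certificate: the (b) above

    -- The compiler is untrusted; it emits code plus a certificate.
    compile :: Cogent -> (CSource, Embedding, Proof)
    compile = undefined

    -- Only the checker is trusted. If the compiler is buggy, the
    -- certificate simply fails to check; bad code is never accepted.
    check :: Embedding -> CSource -> Proof -> Bool
    check = undefined

Nothing anywhere requires trusting compile: a bug there can only make check return False.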


Yes, that's exactly what I meant. But compare this to CompCert. CompCert itself is verified, which means it doesn't have to output a proof for each program it compiles.

Even if you don't care about the theoretical distinction between "for all programs" and "for each program", there are two practical concerns with this kind of system: 1) the compiler could simply fail to output a valid proof for some program, and you're out of luck until the compiler bug is fixed; 2) proof checking can get pretty time-intensive, and it'd be better to do it once for the compiler than every time you run the compiler.


Actually, if you take a look at the semantic preservation theorems that CompCert proves, you can see that it allows compilation to fail, even in optimization and code generation:

http://compcert.inria.fr/doc/html/Compiler.html

As one particular example, register allocation is performed by an untrusted oracle that may not even terminate, and then verified a posteriori:

http://compcert.inria.fr/doc/html/Allocation.html
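
The oracle-plus-checker pattern is easy to sketch. This is a toy, not CompCert's actual validator (which checks the whole transformed function), but it shows why the oracle itself needs no proof:

    import qualified Data.Map as M

    type Var = String
    type Reg = String

    -- Untrusted oracle: any heuristic at all, possibly wrong.
    allocate :: [Var] -> M.Map Var Reg
    allocate vars = M.fromList (zip vars (cycle ["r0", "r1"]))

    -- Trusted checker (verified, in CompCert's case): interfering
    -- variables must never share a register.
    checkAlloc :: [(Var, Var)] -> M.Map Var Reg -> Bool
    checkAlloc interference asgn =
      and [ M.lookup a asgn /= M.lookup b asgn | (a, b) <- interference ]

    -- Compilation fails, rather than miscompiling, on a bad answer.
    allocChecked :: [Var] -> [(Var, Var)] -> Maybe (M.Map Var Reg)
    allocChecked vars interference
      | checkAlloc interference asgn = Just asgn
      | otherwise                    = Nothing
      where asgn = allocate vars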

They proved a traditional register allocator correct in a paper (https://www.cs.princeton.edu/~appel/papers/regalloc.pdf), but I guess there are practical reasons why they are not using it in CompCert by default.


Indeed, but proving a whole compiler correct is really hard. :P


Not if you start with a small compiler


No, it's still hard. The best approach I've seen to this outside CompCert is a project that independently tried my own recommendation: break the compiler into stages, then use whatever verification works best on a per-stage basis. So one prover and approach works best for stage 1, another maybe for stage 2, and some compositional method ties them together.

They were building a MIPS compiler. They got pretty far in the short time the paper covered, with a fraction of CompCert's effort. Nonetheless, the untrusted-compiler-plus-trusted-checker method has the most supporting evidence in the literature for getting the job done and for ease of building. So even my recommendation should take that approach where possible.
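
To make "compose the stages" concrete, here is one shape that idea could take (the Stage record and names are mine, not from that MIPS project):

    -- Hypothetical IRs for a staged pipeline.
    data Source
    data IR1
    data IR2
    data Mips

    -- A stage pairs an (untrusted) transform with its own checker,
    -- however that checker was verified: one prover per stage is fine.
    data Stage a b = Stage
      { transform :: a -> b
      , validate  :: a -> b -> Bool
      }

    -- A stage's output is accepted only if its own check passes.
    run :: Stage a b -> a -> Maybe b
    run (Stage t v) x = let y = t x in if v x y then Just y else Nothing

    -- Composing stages composes the per-stage guarantees.
    pipeline :: Stage Source IR1 -> Stage IR1 IR2 -> Stage IR2 Mips
             -> Source -> Maybe Mips
    pipeline s1 s2 s3 src = run s1 src >>= run s2 >>= run s3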


It's hard, but I've thought for a while that developing these tools in a more bottom-up way would help. And of course orthogonal components, as you say, are a good idea for just about anything.


I don't agree. Matter of fact, I recommended it here in an answer to the reproducible build meme that pops up. See my conversation with jeffreyrogers here:

https://news.ycombinator.com/item?id=10182282


Looking back at it, I meant to say "I don't disagree. I had made similar recommendations before."

Totally changes the meaning of the comment till you click the link, haha.



