I'm 35. Does that help? Can you try to explain it to me then? :) (I looked at th...

jerf · on March 25, 2015

Coq is a proof assistant, where "proof" is in the mathematical sense of the term. Thanks to Curry-Howard isomorphism [1], it happens that there's a mapping between proofs and programs, meaning that the "proof assistant" can do things like "prove a sorting algorithm correctly sorts all inputs" in a relatively straightforward manner. I mean, still fairly complicated and not something that you'll encounter in an undergrad computer science education in general, but feasible. For instance, a similar tool was recently used to prove that the sorting algorithm in Python and Java had a bug [2], which was a bit surprising given how well-exercised they were.

Something that a computer science program should cover is that a logical implication is always true when the antecedent is false. X -> Y is always true when X is false. Thus, if you can convince your system to prove a false term, you can set that as the X and write anything you want for the Y, and prove anything. (That statement is actually glossing over a bit of discussion on how you can legally introduce statements into your system, and there's variance between the logics on that point anyhow, but that's what it comes out to in most if not all systems.) Therefore, if you can prove a false statement, you can literally prove anything.

This turns out to be an unhelpful characteristic for a proof system to exhibit. In the case of Coq, it means you can now convince it to take any program as satisfying any property. Any program, satisfying any property. You want to prove that "return []" is a proper sorter for any input? This can do that. Heck, you want to prove that "return []" is a valid fully HTTP5 rendering engine? We can do that too!

In this particular case, it exploits a bug in Coq to do that, but the fear mathematicians have developed over the course of the 20th century is that while the bug may be obvious, deeper, more subtle problems in Coq could conceivably be used to prove "False" oven the course of a proof, accidentally and buried in potentially hundreds or thousands of statements, subtly rendering all such proofs flawed. Such are the things that keep mathematicians up at night.

As long as I'm explaining things, let's also explain the joke in Falso. There are, as the page says, a number of different axiom systems. Some of them are simply wildly different; for instance, Euclid's axioms create Euclidean geometry, whereas axioms about vertices and edges create graph theory. Some are more related... the 20th century resulted in several "set theories", which are ultimately characterized by different axiom sets that lead to different places. Axioms are neither right nor wrong, they are the definition of a branch of math, so you can choose them as you like, then see where they take you. Per the previous paragraphs, you prefer not to have an axiom system that allows you to prove false because it reveals your system doesn't actually say anything because it just says "Yes!" no matter what you ask. In some sense this isn't "invalid" or "wrong" either, it's just useless. (And in some sense, there's only one axiom set that just says "return true", there's just more and less complicated ways of spelling it, so there's not much value in trying to "study" one because it's identical to "return true" anyhow.)

The joke in Falso is that it simply takes as an axiom "For all P, the proof of P is the null statement", which, for the purposes of this post, is essentially "Assume false is true." From this, Falso can therefore prove... anything! In one statement! At this point the "comparison table" ought to start making sense. But the page is full of other little mathematician jokes... for instance:

    Let us prove that P = NP by contradiction. Assume that P ≠ NP.
    Then, by the axiom of the Falso system, we have a contradiction.
    Therefore, P = NP.

The joke here being that under the Falso system, "proof by contradiction" isn't actually necessary; you want "P = NP"? Just set it as the "P" in the axiom and read it off in one step, "The empty statement proves P = NP". Proof by contradiction is unnecessary frippery providing the thinnest, thinnest of covers over the triviality of the proof, which is itself the joke. As opposed to a huge proof of dozens of steps, which could be funny in its own way, it is also mathematician-funny to provide the smallest possible obfuscation. (The remainder of the joke is a slightly disguised allusion to the fact that "P ≠ NP" is equally trivially provable under Falso. Estatis Inc. need not wait anxiously by their mailbox for their million dollars from the Clay institute.)

Mathematicians have a strange sense of humor. Strange, but wonderful.

As I said, the last time I encountered this page [3], which HN says was 2 years ago, it didn't have any implementations, it was just an extended joke. The fact that it is now actually "available" is also sort of funny.

Oh, and finally, I can't read or use Coq either. Plus, even if you could, this is ultimately a bug exploit, not a "real proof", which would make it even harder to understand how what is being done relates to the final result; it's like seeing some random-looking assembler that turns out to produce arbitrary code execution due to some quirky combination of bugs, there isn't any apparently relationship between what the assembler looks like it's doing and the result it actually produces. Haskell's close enough for me for now. I'm also willing to wait and see if homotopy type theory follows through on its promises to make proofs simpler... at the moment, as I am ultimately a software engineer and not a mathematician, I can't help but see current proof technology as too much work for too little gain. I wish all those working to change that assessment all the best, though.

[1]: http://en.wikipedia.org/wiki/Curry%E2%80%93Howard_correspond...

[2]: http://envisage-project.eu/proving-android-java-and-python-s...

[3]: https://news.ycombinator.com/item?id=5677646

tel · on March 25, 2015

The way the exploit works isn't actually that bad. It just has to do with exploiting Coq's assumption that its bytecode VM evaluator is equivalent to normal CBV reduction. Turns out that's false---though that was certainly always a potential weakness in that assumption.

jerf · on March 25, 2015

Yes, if I did not make it clear, what was actually demonstrated here was not the sort of scarybad mathematical weakness that can only be extracted from a long proof that I was talking about, it's basically just a bug in Coq, and in the long term has little consequence other than providing a bit of fun.

tel · on March 25, 2015

When proving there are choices of "proof systems" to use. Each have different properties. One really vital property is non-contradiction. "Proving False" (note the capital 'F') is terminology for showing contradiction. From False you can (trivially) prove anything regardless of whether it is true.

Falso is a joke proof system which assumes False. This makes proofs in the system very short since they are all of the form "If False, [conclusion] is true; False is true; Therefore [conclusion] is true; QED."

Practically, proving False undermines the value of a proof system. Thus Coq, like all real proof systems, would like to ensure that does not contain contradiction.

The first joke is that this bug calls itself an implementation of Falso in Coq---e.g., a demonstration that Coq has a bug sufficiently dangerous to admit contradiction.

The second joke is that Falso's site---which is tongue-in-cheek hyperbolic on the supposed "benefits" of Falso as a proof system---lists "implementations" of Falso. In other words, that section is a "wall of shame" for proof systems which have flaws sufficient to "prove False".

---

To explain the exploit it has to do with a weakness in the `vm_compute` instruction under large sum types.

Essentially, Coq is based on the idea that "generalized lambda terms" are useful as proofs in a proof system. Part of this usefulness arises from the fact that their "proof system" meaning is invariant under evaluation. Normally evaluation of large lambda terms is slow so `vm_compute` compiles them to bytecode in a VM [0] and executes them there.

Apparently the VM chokes on sums larger than 256 variants. Since Coq assumes that "VM computation" is equivalent to "standard lambda term reduction". The VM bug breaks that equivalence and allows us to state that `true = false` which is contradiction.

---

Impact-wise this is pretty minor. It ought to be fixed since accelerating verification is important for UX purposes, but it doesn't show a weakness in the core calculus---the weakness arises from the assumption that VM evaluation is substitutable for standard, theory-driven lambda term reduction. Clearly this is false.

A much worse weakness, for example, was Girard's Paradox which showed that theories of a certain simplifying form were contradictory [1].

[0] https://coq.inria.fr/distrib/current/refman/Reference-Manual...

[1] https://coq.inria.fr/library/Coq.Logic.Hurkens.html