> The basic approach of this paper is to generalize the problem until it becomes unsolvable
This is simply not true. The talk doesn't say it's unsolvable. It explains why it is hard. It is based on results of researchers in the software verification community. Unfortunately, I don't know how to teach a complex subject without starting with the general and then moving on to the specifics -- which is exactly how the talk is structured: from decidability to complexity, to total programs, to FSMs, and finally to verification methods. How can I explain the idea of reducing the state space by abstract interpretation and type systems, before explaining why the size of the state space is what matters most? If you have a better way of introducing a theory, I'd love to hear it. Honestly, I have no idea how you'd understand verification tools if you don't know that, say, verifying an FSM language is PSPACE-hard. That's at the very core of verification and the design of the tools.
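To make that concrete, here's a toy sketch (my own illustration, not from the talk and not how any real model checker is implemented): compose k trivial counters, each with only m states, and the product system already has m**k reachable states -- and that product is what a checker must, in the worst case, explore.

```python
# Toy illustration (my own, not any particular tool): explicit-state
# reachability over a composition of k counters, each counting modulo m.
# Every component FSM has only m states, but the composed system has m**k
# reachable states -- the blow-up that makes verification expensive.

def reachable_states(k, m):
    """Enumerate every state reachable from (0, ..., 0) when any one
    counter may be incremented (mod m) at each step."""
    start = (0,) * k
    seen = {start}
    frontier = [start]
    while frontier:
        state = frontier.pop()
        for i in range(k):                 # one transition per component
            nxt = state[:i] + ((state[i] + 1) % m,) + state[i + 1:]
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return seen

print(len(reachable_states(3, 10)))   # 1000 states
print(len(reachable_states(5, 10)))   # 100000 states: exponential in k
```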
The last two (out of four) chapters of the post explore what is solvable and how. This is how chapter 3 begins:
Now we know why writing correct programs is hard: because it has to be. But while we cannot verify all programs all the time – regardless of how they’re written – there’s nothing stopping us from verifying some programs some of the time for some properties. As we’ll see, none of the software verification methods manage to overcome the limitations we have explored, and it is precisely those limitations that affect the design and tradeoffs of every verification approach.
The idea is that if you understand the problem -- namely, that the size of the state space is what determines the difficulty of verification -- you'll understand how those methods work, when they succeed, and why they fail. Very broadly, there are two approaches: reduce the state space in a sound way (this is called abstraction) and hope that the property you wish to verify is preserved by the abstract state space (this is how SLAM works), or don't abstract, and work really hard.
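A minimal sketch of what I mean by abstraction, assuming a toy sign domain (my own illustration of the general idea, not SLAM's actual predicate-abstraction machinery):

```python
# Minimal sketch of sound state-space reduction via a sign abstraction
# (a toy example, not SLAM's predicate abstraction). Concrete integers
# collapse into four abstract values, so the abstract state space is tiny;
# a property is verifiable this way only if it survives that collapse.
NEG, ZERO, POS, TOP = "neg", "zero", "pos", "top"   # TOP = "could be anything"

def alpha(n):
    """Abstraction function: map a concrete integer to its sign."""
    return NEG if n < 0 else ZERO if n == 0 else POS

def abs_mul(a, b):
    """Abstract multiplication over signs -- a sound over-approximation."""
    if ZERO in (a, b):
        return ZERO
    if TOP in (a, b):
        return TOP
    return POS if a == b else NEG

x = alpha(7)              # every positive x abstracts to POS
print(abs_mul(x, x))      # 'pos': "x * x > 0" proved for all positive x at once
print(abs_mul(TOP, TOP))  # 'top': "x * x is never negative" is lost -- the
                          # domain forgot that both factors are the same variable
```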
If you claim that verification is easy, then you're wrong both in theory and in practice. If you claim that some useful properties are easy and some may be verifiable with a lot of effort, then you're in complete agreement with experience and what I said (which is no more than a summary of known results in verification).
> Others less inclined to claim the problem is impossible have succeeded in verifying some useful systems.
In my job, I formally specify and verify large, complex distributed systems. My success comes from an understanding of the difficulty, which helps guide the way towards what's possible, rather than from wishful thinking. Just as you can't succeed in programming if you don't understand the difficulty (and therefore the importance of readability and structure, for example), if you don't understand the difficulty in verification, you simply will not succeed. You'll either bang your head against the wall or fail to understand how to help the tools.
> L4 microkernel verification
The effort that went into writing seL4 -- a mere 9KLOC C program -- was, IIRC, over 20 person-years. Someone on that team told me they could have done it in 5 if they had had all the tooling in advance, but we're still talking about significant cost (5x instead of 20x), carried out by a team of (rare!) experts trained in the task. Plus, seL4 was "dumbed down" significantly so that verification would be viable at all, even at that effort. They didn't verify a general-purpose microkernel, but wrote a much simplified one so that it would be amenable to verification.
> For seL4, as I understand it, their trials and tribulations also included false starts, like trying CompCert before moving to Magnus Myreen's approach of generating logic from binary code and then meeting in the middle.
> And they've released all this, GPLed, so hopefully others (including maybe myself, when I get the time) can continue doing this more efficiently.
Absolutely, but if you ask them, they will tell you that they don't think their process could be done efficiently enough for general purpose software. More importantly: it shouldn't be. End-to-end verification is absolutely unnecessary for all but the most critical pieces of software. No one believes this can be done easily.
When doing software verification we must keep our eyes open and know that it is hard, and why:
* There is no general solution.
* The difficulty is not a result of code structure, organization or syntactic abstractions.
* The difficulty cannot be avoided with languages -- through totality or abstractions (although some aspects regarding local properties can certainly be aided by language design; see below).
* The difficulty cannot be modularized away.
* The main issue is the size of the state space.
So the main idea is reduction of the state space (along with heuristic approaches -- like SAT solvers -- that allow us to tackle larger state spaces). For reduction of the state space to be useful, we need to find properties that survive the abstraction and are still important (memory safety is a classic example). The more local the property, the greater the chance that it survives the reduction. Which local properties are important is an empirical question that must be studied.
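Here's a hedged toy sketch (again my own, not a real analyzer) of what a local, memory-safety-like property surviving a reduction can look like, using a standard interval abstraction:

```python
# Hedged toy sketch (not a real analyzer): an interval abstraction proving a
# local, memory-safety-like property -- that an array index stays in bounds --
# without enumerating any concrete execution.

def loop_counter_interval(start, stop):
    """Abstract the counter of 'for i in range(start, stop)' as [low, high]."""
    return (start, stop - 1)

def access_in_bounds(index_interval, array_length):
    """The local property: every concrete index lies within [0, array_length - 1]."""
    low, high = index_interval
    return low >= 0 and high <= array_length - 1

buf_len = 10
print(access_in_bounds(loop_counter_interval(0, 10), buf_len))  # True: safe on every iteration
print(access_in_bounds(loop_counter_interval(0, 11), buf_len))  # False: possible out-of-bounds at i == 10
```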
As for the rest, it's up to careful design, use of simplified algorithms, a lot of grunt work, and a lot of experience, which often involves a manual reduction of the state space.