Software Foundations by Benjamin Pierce et al.

cmrx64 · on Dec 13, 2015

This is a lovely book. I gathered a group of students to work through this book last year (and even roped in a professor). If you're interested in formal software verification or interactive proof assistants, this is an excellent and approachable introduction.

Another free book, using the Isabelle proof assistant, is http://concrete-semantics.org/. It's also very good.

nextos · on Dec 13, 2015

It looks very nice. The Little Prover, recently published is also a neat small intro.

I wonder how does Software Foundations compare against other more specialized classics in the field like [1-3].

[1] https://mitpress.mit.edu/books/principles-model-checking

[2] http://www.springer.com/fr/book/9783540654100

[3] https://www.cis.upenn.edu/~bcpierce/tapl/

cmrx64 · on Dec 14, 2015

I've worked through TaPL too. I'm not even sure they are comparable :) It's possible to formalize the contents of TaPL in these systems (and, indeed, Pierce has taught it like that). But it's still a very different focus than Software Foundations.

I've not read or worked through the other books (wow they are expensive), but I'm in general not very keen on model checking as an approach. Tools like Alloy or NuSMV can be useful when designing or investigating algorithms, but they always do a bounded search and can't be applied directly to implementations. I personally don't confer that much trustworthyness to systems that appeal to model checking tools for their correctness.

I've seen The Little Prover, but am somewhat wary of it. On the other hand, I enjoyed The Little Schemer very much. I might take a look at it next month.

pron · on Dec 14, 2015

> and can't be applied directly to implementations

There are model checkers that work directly on implementations, like NASA's JPF.

> I personally don't confer that much trustworthyness to systems that appeal to model checking tools for their correctness.

Given that the going rate for proofs is roughly 1-1.5 days per one line of code, which grows superlinearly with the code size, I think we have no choice...

In any event, pretty much all safety-critical software that human lives depend on (even many, many lives) is verified with model-checking in the best case. Use of proofs in verifying safety-critical systems is very rare, as even those domains can rarely afford it.

Also, remember that no one really needs 100% proof, as no other, non-software component in the system guarantees 100% reliability (not that model-checking can give us probability bounds).

tel · on Dec 14, 2015

100% certainty in proof doesn't mean that you're aiming to get a 100% proven system. It means that you can stop worrying about the proven part forevermore. So it reduces long term running overhead.

pron · on Dec 14, 2015

If your proven component is so well-isolated from others and doesn't undergo changes (and neither do its assumptions about input), it seems like the same would apply to a model-checked component. Besides, the cost of proofs is well over 1Kx if not 10Kx over model checking, so I don't see how this consideration is important.

Proofs are useful -- because of their total coverage -- when verifying components that will be used in many different projects, like compilers, kernels and libraries, and even then only when they are sufficiently small to be doable within, say, 5-10 years by a dedicated team. A one-time gigantic job will pay for itself by being used (and hopefully paid for) by many customers over many years.

tel · on Dec 14, 2015

I've got no problem with model checkers, though I'm not sure I agree with your effort ratings. Different tools for different jobs, eh?

I think I also have a different conception of proof than you're naming. You don't need to prove your entire program. 10 small proofs strung together can go an enormous distance with low weight.

pron · on Dec 14, 2015

I don't know. The spec I'm working on now is ~1300 lines of TLA+, and that's just the essential core of the thing -- nowhere near the whole program. By comparison, the Raft TLA+ spec is under third of that, yet the Verdi proof[1] is ~50K LOC, took a whole team of experts months of work, and didn't prove liveness.

I have a very small sub-algorithm that I might try proving in TLA+ just to see how it goes.

[1]: I couldn't find the original TLA+ proof online, but the author of the Raft protocol said that he struggled even with partial machine-checked proofs.

tel · on Dec 15, 2015

I don't think this is really surprising. I don't know a ton about distributed systems, but the last time I read up on formalizing them and proving properties it seemed like an incredibly daunting task. There's an exponential blowup problem in that you end up having to follow every state in the state spaces of every actor against every other state over time. I saw some interesting proofs which used topological invariants to keep a hold of that whole complex more delicately... but to actually just dive in and try exhausting it cannot be simple.

pron · on Dec 15, 2015

Well, formalizing them isn't too hard, and model-checking them is quite easy (once you have a good formal spec), but proving them is another matter altogether...

People prove specific algorithms (distributed, concurrent or sequential) with TLA+, but once the spec is of a real system, things get much, much harder to prove.

nextos · on Dec 14, 2015

As to comparability of those books, well, I asked because theorem proving is the most general approach (vs abstract interpretation, data/control flow analysis, model checking and type systems). So I was wondering if Software Foundations referred to these, so it could be regarded as a intro book which could then be followed up by [1-3].

I think I disagree a bit wrt model checking, which is very useful in some specialized domains. I have had quite a lot of success with it when verifying complex real time systems, specified as timed automata. We then generated code from this specifications. It'd be challenging to use any other technique to achieve such an exhaustive verification.

Both [1-2] can be found online, and [2] is particularly enlightening! I followed a course by the authors, and implemented a big chunk of the book. It's really awesome to implement static analyzers by interpreting programs with a more abstract set of types.

cmrx64 · on Dec 14, 2015

Ah, I see what you mean. Software Foundation hints ever so slightly at more specific approaches. Looking into the other books a little bit more, I'm not so certain Software Foundations would serve as a good introduction to them. The reason being that it focuses very heavily on the theorem-proving aspects. Concrete Semantics, at least its second half, would be an excellent primer. Software Foundations puts more weight on proving properties about languages than it does verifying particular programs. Whereas Concrete Semantics culminates in Hoare logic and abstract interpretation.

That sounds like an excellent use of model checking tools. Most approaches I've seen involve creating a model of some system separate from the implementation, which I think is risky. Still better than nothing, by far.

I'll definitely have to check those books out now. Thanks for the recommendations!

nextos · on Dec 14, 2015

Thanks for the Concrete Semantics recommendation!

coolsunglasses · on Dec 14, 2015

>The Little Prover, recently published is also a neat small intro.

Really? I know a lot of programmers that were excited for TLP and were very disappointed. I haven't even gotten a lukewarm review from anyone yet.

nextos · on Dec 14, 2015

Well, it's OK. I don't like any of the Little books or HtDP that much except The Little Schemer (neé as The Little Lisper).

mafribe · on Dec 14, 2015

I'm thinking of doing something similar with my PhD students who have no backgroud in maths, but are strong programmers.

How did you organise reading such a book? Any recommendations?

I wonder if I should go Isabelle, Coq or Agda. The former is more suitable to my personal prover needs, as I like classical logic and nominal techniques, but I have a feeling that Coq is easier to get into because there is more literature. Agda is the cleanest system, but lacks proof automation.

cmrx64 · on Dec 14, 2015

We were a group of mostly undergrads. Computer Science students.

What we did was meet once a week to discuss the reading and the exercises, or work through the exercises together. During the week we'd read the next few sections and try the exercises. Especially at first it was really slow and took a lot of effort as we weren't very comfortable with that form of logic (most of us had only previously taken an undergrad intro to discrete math class).

I personally like Isabelle very much, and if I were to start this with a new group of people I'd do it in Isabelle. Its Prover IDE in the form of Isabelle/jEdit is much nicer than Proof General or CoqIDE. I also think its presentation is a bit more immediately understandable for someone familiar with classical logic but not necessarily type theory. Isabelle/HOL is very natural from that perspective, imo.

mafribe · on Dec 15, 2015

Thanks.

I agree that Isabelle's jEdit integration is nice. But what would be a polished and up-to-date introduction? I think "Concrete Semantics" is a bit heavy going for a complete beginner. No detailed explanation of proof rules, tactics etc.

cmrx64 · on Dec 16, 2015

Isabelle's tutorial is quite comprehensive, and, I think, rather approachable to an ITP beginner: http://isabelle.in.tum.de/dist/Isabelle2015/doc/tutorial.pdf

It assumes some knowledge of basic functional programming, but builds up proving-related jargon quite quickly through examples with some interspersed explanations.

It doesn't discuss structured proofs with Isar, and instead focuses on the tactical style, but I think this is an advantage - it's simpler that way, and structured proofs really only benefit human readers. They're somewhat confusing to write if you don't already have a good grasp of the entire system and what is possible.

The length of that tutorial was really intimidating to me at first, but it's quite approachable and isn't all that dense. It does take some time to work through it enough to become comfortable with the system.

mafribe · on Dec 16, 2015

Thanks. When I taught myself Isabelle/HOL a couple of years ago, there was a mismatch between the Isabelle/HOL system and its description in the old version of the tutorial you point to.

What was really hampering my grasping Isabelle/HOL was the lack of a description of the overall structure of the Isabelle/HOL system. In particular, there wasn't really a good account that explains the distinction between the rules of HOL and the meta-rules of Isabelle, and how they relate. This is hand-waved away in the tutorial. Clearly the authors don't want to burden the beginner with this material. Maybe that's the right choice ... they have a lot of experience teaching this material.

A consequence of this choice is that the whole tutorial did not (and still doesn't as far as I can gather from a quick browse) include a complete listing of the logical rules (and meta rules)! As the structure of Isabelle/HOL and its (meta-)rules are not explicitly given, it becomes hard to see how tactics work. Under what conditions does it make sense to use tactic X or tactic Y or Z? Without saying what the rules are and how tactics work on them, how do you know how to proceed when Isabelle/HOL gets stuck?

If you stick only to the examples given in the tutorials and in "Concrete Semantics" this is not so much of a problem, but as soon as you work on your own proofs, you are your own. I found this to be a major stumbling block.

I wonder if Coq is better documented in this regard. Maybe not, ITPs are just complex beasts.

bkase · on Dec 13, 2015

Certified Programming with Dependent Types (available online as well http://adam.chlipala.net/cpdt/) is also a good resource for learning Coq (note: I just started working through the material this weekend).

davidgrenier · on Dec 13, 2015

Whatever your views on that stuff... it is fun to play in Coq.

On the other hand, playing with J-Bob, the toy "proof assistant" used in The Little Prover is tedious at best. I'm torn between just giving up with the last two chapters of the book undone.

Pils · on Dec 13, 2015

Working through this right now as a directed reading with a professor and a couple of other students. Would definitely recommend this as a follow up to an intro PL course.

krat0sprakhar · on Dec 14, 2015

Can someone familiar with this kind of stuff lay out a few benefits of learning this material for a practicing programmer? I'm quite interested (since I've heard a lot about Coq and Pierce) but since there are so many things to learn, I want to prioritize appropriately.

tel · on Dec 14, 2015

It's hard to say exactly. It will teach you to think of programming in an entirely new way.

Will that make it easier to write good code elsewhere? Almost certainly if you're not familiar with mathematical proof (maybe you took a discrete math course before). If you know what it is to prove then you'll learn an interesting deep connection between programming and proving.

You probably don't want to prove things in your programs. Not really. Even after you learn Coq it's too difficult to do day-to-day.

But proofs do not work without being completely consistent in the same way that good programs are. So you can learn a lot of techniques for good programs from techniques for good proofs. Coq rejects bad proofs, so it can be good practice.

Finally, Coq is unashamedly higher-order. The chops you gain working with it will translate more or less directly to better functional programming comprehension.

nextos · on Dec 15, 2015

I moved to Scandinavia and got a MSc which was very strong in formal methods: static analysis, model checking and theorem proving; some big names in the field teaching there.

I was able to access highly technical programming jobs, those where you are using formal or semi-formal methods to build critical and/or real-time systems. These are very well paid and very nice to work on. You end up designing and implementing very elegant systems with nice guarantees. It feels like high-end engineering. So from a pragmatic point of view, you can access better jobs.

From a more idealistic perspective, you will understand computation better.

javajosh · on Dec 13, 2015

Uh, how do you pronounce Coq?

tel · on Dec 14, 2015

As is traditional:

You pronounce it with an understanding of how difficult it must be to speak French and have most of the CS world talk about bit twiddling.

pron · on Dec 14, 2015

cmrx64 · on Dec 13, 2015

https://translate.google.com/#fr/en/coq, click "listen". Or, in French IPA (https://en.wikipedia.org/wiki/Help:IPA_for_French), kɔk (roughly transcribed to English, "couk")

But, everyone I know (including me) who I've heard pronunce it just pronunces it "cock".

auggierose · on Dec 14, 2015

Looks like the book should have been called 'Playing with Coq' instead.

coolsunglasses · on Dec 14, 2015

Is there some better resource for learning programs-are-proofs people should be using?

auggierose · on Dec 14, 2015

In my opinion, many people are curry Howard blind. It's like they found this one correspondence between certain proofs (constructive ones) and programs and are hell-bend on using it. Personally I like not to confuse the two things. Proofs are proofs, programs are programs, programs can generate proofs, proofs can verify programs. And proofs are classical.

mafribe · on Dec 14, 2015

Identifying proofs with programs undeniably gives a beautiful economy of concepts, best experienced among mature tools maybe in Agda.

But there is a price to be paid, and that's the restriction to constructive logic. If you need classical logic, use an LCF-style prover like Isabelle/HOL.

pron · on Dec 15, 2015

> But there is a price to be paid, and that's the restriction to constructive logic.

That's the price on the proof side. The price on the program side is the need to encode complexity in the types because C-H is about abstract programs, not computations.

Otherwise, the existence of a computation is entirely ignored except where it may break the logic's consistency in the case of non-termination, but from the computation's perspective, a long computation and a non-terminating one are the same. The result is that you can "prove" the correctness of programs that are effectively non-computable, i.e. programs that don't encode effective computations (by effective computation I mean one that terminates before the end of the universe, but it could be exchanged with more practical definitions).

tel · on Dec 14, 2015

Yeah, I agree that it should be routinely called the C-H correspondence.

mafribe · on Dec 14, 2015

What's a suitable resource depends on your background. If you like books, "Proofs and Types" by Girard, Lafont, Taylor [1] is a great read, albeit idiosyncratic.

[1] http://www.paultaylor.eu/stable/prot.pdf