Catching integer overflows with template metaprogramming (2015)

DannyBee · on March 27, 2017

"Some day, I would like to design a language that gets this right. But for the moment, I remain focused on Sandstorm.io. Hopefully someone will beat me to it. Hint hint. "

Some language already did beat him to it: Ada

http://www.ada-auth.org/standards/12rat/html/Rat12-2-4.html

Ada's rules for type invariants check them in all the places you would expect, and are straightforward to use.

It's about as correct as you can get without it being impossible to compute :)

kentonv · on March 27, 2017

I thought Ada did only runtime checks? I am asking for (and my template hackery implements) compile-time checks. Runtime checks are slow and don't fully mitigate security bugs, though they probably downgrade the bugs to DoS attacks. But if these are actually checked by the compiler, that's pretty cool and I might actually have to play with Ada at some point!

DannyBee · on March 27, 2017

First, generalized static invariants that are either correct all the time or computable all the time are, impossible, of course. Even of the simple form you are trying. Second, Ada 2012 has static type invariants (enforced at compile time) and dynamic type invariants (enforced at runtime)

You should be able to do, at the very least, what you are doing, in a much simpler way.

Plus, it has procedure level pre and post conditions that are checked

In any case, the tools ADA has allows it to go further than what you want, actually. https://en.wikipedia.org/wiki/SPARK_(programming_language)

There is a formally verifiable language that is just a subset of ada (IE compilable as normal ada) but can formally verify what you want here even if static predicates can't.

(Now all of the above said, i'm not sure i'd go this route, but yeah)

nickpsecurity · on March 27, 2017

Ada does a mix of both compile-time and run-time checks. It was systematically designed to eliminate errors in most parts of the development process as the book below shows. SPARK, an Ada variant, can prove absence of all kinds of errors in code with automated provers. Since Ada supports cross-language programming, the typical route is to do most of the key logic in Ada, anything static/simple in SPARK, and anything platform-related in native language if too hard to do in Ada. They periodically update them with recent versions at 2012 for Ada and 2014 for SPARK.

http://www.adacore.com/uploads_gems/Ada_Safe_and_Secure_Book...

Note: See table of contents especially.

http://cowlark.com/2014-04-27-ada/index.html

Note: This "random walk through Ada" has a nice summary of all kinds of features it has from perspective of an outsider instead of an Ada developer.

https://en.wikipedia.org/wiki/SPARK_(programming_language)

http://www.adacore.com/uploads/technical-papers/SPARKSkein_S...

Note: Link to SPARK and an example by Rod Chapman on doing Skein in it. Just converting it from C to SPARK found an error in original code. Because of course it did haha...

jackmott · on March 27, 2017

Idris or F*? I think they are not practical because compile times end up extremely slow. I think want you want will necessarily have this effect of very slow compile times.

indolering · on March 27, 2017

Came here to suggest Idris, as it compiles to C. F* is still experimental and Idris is just coming up on a RC. They could also drop down into C and manually verify it.

geofft · on March 27, 2017

What's the story for interop between Ada and C on modern systems? That is, if I want to write a library that C programmers on Ubuntu or something can use in a straightforward manner, can I write it in Ada without too much difficulty?

nickpsecurity · on March 27, 2017

Ada was designed for cross-language development. It was ahead of the curve on that. I'd be surprised if it was hard to integrate C today. The others I recall off top of head that went big in industry were OpenVMS (standard calling conventions, etc) and .NET CLR. I can't find my original article on "mixed, language development" in Ada but the marketing piece below gives you an idea of what developers were doing in 2005. Mixed language development, debugging, and traces. It's probably better now.

https://www.rti.com/hubfs/docs/2005Q1_2.pdf

tjalfi · on March 29, 2017

Ada95 added the Interfaces.C package for interop with C. http://www.ada-auth.org/standards/rm12_w_tc1/html/RM-B-3.htm... shows the interface for Ada 2012. If you want your library to run anywhere then you probably still need to use C.

jbapple · on March 27, 2017

Does Ada check these type of invariants at compile-time or run-time?

DannyBee · on March 27, 2017

It can do both.

milesrout · on March 27, 2017

Ada is not the right solution.

ky738 · on March 27, 2017

So what is the right solution? Don't leave us hanging like that, so unpolite.

michaelt · on March 27, 2017

  Why don’t programming languages do this?

I worked on a contribution to clang-analyser a few months ago - one of the things I learned was (1) No-one likes code analysers that raise false alarms and (2) it's really difficult not to raise false alarms. For example, consider checking this program for divide-by-zero bugs:

  for (int i=-99999 ; i<99999 ; i+=2) {
    printf("%d\n", 100/i);
  }
  for (int i=-100000 ; i<100000 ; i+=2) {
    printf("%d\n", 100/i);
  }

A programmer can clearly see that in the first loop there is no divide-by-zero as i will jump directly from -1 to +1 - whereas in the second loop the counter will hit 0 and trigger a divide-by-zero.

But for a static analyser, your options are:

1. Model i as a single value, and simulate all ~400,000 passes through the loops. Slow - you basically have to run the program at compile time.

2. Same but truncate the simulation after, say, 20 loops on the assumption that most bugs would have been found on the first few passes. This misses the bug in the second loop.

3. Model i as "Integer in the range -99,999 to +99,999" and generate a false alarm on the first loop.

4. Support arbitrarily complex symbolic values. Difficult - as the value of a variable might depend on complicated business logic.

I guess the benefit of starting a new programming language is you can choose option 3 and say "Working code is illegal, deal with it" from the start.

kentonv · on March 27, 2017

For my purposes, I would be happy with a way that I can specify constraints specifically in certain critical code sections, where I'd be happy to work around some false positives. Most other code could go on ignoring overflow as it does today.

staticassertion · on March 27, 2017

That's a pretty impressive response - committing to fuzzing with AFL, improving current fuzzing test coverage, adding new coding guidelines, and a paid security audit.

In regards to 4 (valgrind) perhaps utilizing other sanitizers would be a good idea.

Awesome that they tested whether those methods would actually have caught this bug - and multiple did. That's a smart metric.

And a writeup about TMP and compile time safety to boot.

I'm struggling to come up with anything more that I'd want in a disclosure.

kentonv · on March 27, 2017

https://twitter.com/BenLaurie/status/575079375307153409 :)

The sad part of the story is that Cap'n Proto actually hasn't had a release since then, so (so far) I've effectively gotten out of doing any of those release-oriented things I promised. Many changes have gone into git master, but Sandstorm used git master directly, meaning there wasn't sufficient incentive to do the work of a release. (And as an understaffed startup we could never quite justify spending the time.)

The good news is that I joined Cloudflare two weeks ago, who is a big Cap'n Proto user, and it looks like I'll have more time and justification to do releases there. (And yes, I do intend to fulfill all the promises in doing so.)

staticassertion · on March 28, 2017

> The good news is that I joined Cloudflare two weeks ago, who is a big Cap'n Proto user, and it looks like I'll have more time and justification to do releases there

That's great news, thanks for the disclaimer/ info.

jbapple · on March 27, 2017

Two other ways of checking integer bound invariants:

1. Undefined behavior sanitizer, which checks integer overflows at run-time, like INT_MAX + 1. Can optionally check other invariants; see https://gcc.gnu.org/onlinedocs/gcc/Instrumentation-Options.h... and https://clang.llvm.org/docs/UndefinedBehaviorSanitizer.html.

2. Abstract interpretation - using a lot of constraint solver cycles and enough math to choke a dinosaur, you can check some integer bounds at compile time "relationally", like knowing that x < y. See, for example, "Pentagons: A Weakly Relational Abstract Domain for the Efficient Validation of Array Accesses": https://www.microsoft.com/en-us/research/publication/pentago.... I'm not sure, but that code might be available in https://github.com/Microsoft/CodeContracts.

kentonv · on March 27, 2017

#1 doesn't get you very far since it can only catch cases that are locally detectable -- where the compiler can prove that some lines of code will always result in an overflow.

Consider the code:

    int f(int x) {
      return x + 1;
    }

This code could overflow, but surely the compiler won't complain about it -- it would instead assume that you don't intend to pass a value for x that is too large. In order for the compiler to have enough information to complain, it would need you to specify explicitly what the range of x is -- so that it can detect violations both within the function definition and at call sites. There is no built-in way to specify this.

But, this is exactly what my template hacking adds -- a way to specify the actual range of each variable.

I haven't looked closely at your #2. Presumably it also requires some constraint annotations to be added to your code. But if they've provided tools that can process those annotations and detect violations, that's pretty cool.

nickpsecurity · on March 27, 2017

The best ones you can straight-up buy are SPARK and Astree that I'm aware of. I've already linked to SPARK. Astree is limited in application but pretty amazing in what it does:

https://www.di.ens.fr/~cousot/publications.www/CousotEtAl-ES...

My idea was to subset as many components as possible to static-ish components it could handle then focus more on integration testing. Then one could reap the benefits in a larger project. Similar idea for SPARK, which is probably cheaper.

touisteur · on March 27, 2017

SPARK is indeed very adapted to some kinds of code (protocols, serialization, deserialization, index manipulation, masks...). Proving the absence of runtime errors in such code is easy (not much help needed for the provers, just type your data structures well) and makes you think of every case. Fun. Even better : once AoRTE is proved, remove all runtime checks and get C-like performance. You can also prove some isolated functional properties ('the answer to message X can only be message Y or Z'... just sprinkle some asserts, post-conditions or check the nice Contract_Case syntax). Or you can go to 'platinum'-level and prove your complete protocol/algorithm. The tech is great and it keeps improving, it gets easier every year...

nickpsecurity · on March 27, 2017

Oh yeah, Rod Chapman and others at Praxis did amazing work. It's probably the single, most-practical tools in engineered software since abstract, state machines for specification or Cleanroom for building. Their error rate was so low.

The traditional problem, though, was that some tools were good for specifying full correctness of high-level models while others were good for verifying properties of code itself. They rarely did both well. Still true with SPARK. Before DeepSpec, I was already pushing for mixing them up a lot with different tools on different components with some sort of unified logic. You might find it interesting that one of those was with SPARK and the successful Event-B method. I'll give you a paper show how hard it is to use Event-B by itself down to low-level code followed by how a combo improves things.

http://eb2all.loria.fr/html_files/files/landingsystem.pdf

http://journal.ub.tu-berlin.de/eceasst/article/view/785/782

One last thing before I head out is that the landing system paper is great for illustrating difficulty of correctness. It shows a small number of straight-forward requirements and design specs. Then, verification conditions are derived down to the low level code that must all be maintained true for total correctness. As in, it makes explicit all the stuff a developer must get right in their head to correctly solve this simple problem. I find it's a nice reality check for people even if they don't know the notation on intrinsic complexity of software & how that affects verification. Especially why you want to use simpler, automated methods than hope to test your way out of that level of complexity w/ corner cases. ;)

touisteur · on March 27, 2017

Thanks for the paper.

It's indeed a game-changer to be able to mix Ada2012, Spark, and C in an industry-grade programming environment. You can gradually convert a legacy codebase. I've been replacing some crufty C parts with Ada-with-Spark-restrictions then full-Spark-with-AoRTE-proof and even some functional proof with great success and can now stop worrying about crashes in security-sensitive attack-surface parts. Network/file interfaces, etc. Most of my experiments are very encouraging ! Coupled with CodePeer and/or afl-fuzz for Ada code, almost-bug-free code becomes somewhat more accessible.

Rod Chapman is also a great showman with great success stories, also very funny... and Yannick Moy at AdaCore is an amazing teacher and doing great work leading Spark further, including on small embedded targets.

I think you're doing great work around here reminding everyone once in a while about Ada & Spark, especially in articles that devolve into 'Rust-would-have-caught-it' :-).

nickpsecurity · on March 27, 2017

Appreciate the kind words. I plan to get back to experimenting myself. One of my ideas is coding in both Rust and Ada/SPARK to let borrow checker catch temporal errors while Ada/SPARK handles the rest. Optionally a C version or use an Ada-to-C compiler if Taft's is still available to throw every free analyzer or tester for C at it. Far as apps, check out Ironsides DNS that was done in SPARK. Borrow-checking and testing it will be my first project if it get back into Ada.

Note: There's also work right now on a verified front-end for SPARK to integrate it into CompCert. Then, the ASM is verified as well. Exciting times. :)

EDIT to add: Yeah, that Rust would've caught it was a bit annoying. Worst was them saying it was first, safe, systems language. pjmpl and I demolished that one, though. Now we just get the other one which is at least accurate if it's memory bugs. Progress. (Shrugs)

touisteur · on March 28, 2017

Well then you're going to be happy : there's a new Spark-to-C compiler. http://blog.adacore.com/spark-tetris-on-the-arduboy Maybe that's the one you're thinking about ?

Also, I heard Rod Chapman (at the last AdaCore tech day) say they had great success using 'the' Ada-to-C compiler some years ago. Looks like AdaCore still have the code for it, not doing much with it. You might get the code if you ask nicely (maybe T.Taft directly now that he works for AdaCore) :-D.

If you get any new idea on some combination of Spark, Ada and other tech, I'd send yannick moy an email :-D. Always helpful and insightful.

If you plan on doing your borrow checker as some kind of new static analysis pass for Ada, I'd also look into libadalang, there's some great (open source !) progress there. I also played a lot with gnat2xml but I don't recommend : it works but it was painful...

I also wondered some time ago about a Spark-to-rust compiler. But I'm waiting for my rust skills to improve first.

Anyway, it's very strange to see rust every so often mentionned for embedded software. I get the interest for 'normal' unconstrained software. But after 10 years of large-scale real-time embedded-systems programming in Ada and I maybe faced once a memory safety bug that the borrow checker would have prevented. Mostly the bugs you get are constraint_errors (overflow, underflow, array bounds,...) most already caught by the type system at compile or static-analysis time. The rest are mostly 'real' bugs : logic or temporal. You never doubt of the state of the program when reading the code, such a lower cognitive load... Going back to C from time to time is so painful, especially, when you think of all the help you're missing.

When in need of pointer go for not null access types, when in need for dynamic allocation go for controlled types (scoped deallocation) and storage pools... Everyone (except haskellers...) is missing on so much, it feels like living in the future where the compiler works (a lot !) for you. Sorry for the rant...

By the way I found one of the ways to catch temporal errors (other than uninitialized variables that are caught) in Spark is to use an enumerated state variable and check it all the time (post/pre or in invariants). It's almost always a good idea anyway to have an explicit state instead of some 'nah none will call this code in such sequence...'.

Also interesting : like functional programming or the borrow checker, or strong typing, Spark changes the way you think about your code. You think about how 'provable' your design and code is and it makes code more readable, simple. You can also just start with data flow annotations (also great for safety and security-critical code), makes reading code easier.

I'll be looking out for your next dive back into Ada then !

jbapple · on March 27, 2017

While I agree with your essential point about #1, I think we agree for different reasons.

UBSan is a set of run-time checks and has low compile-time cost. The example function you give, f, would be complained about if you passed it INT_MAX, but that complaint would come at run-time and it would never be triggered if you never passed it INT_MAX.

I also agree that it is not possible to enforce non-INT_MAX limits, like x < 10.

kentonv · on March 27, 2017

Oh, for some reason I thought Clang's sanitizers were strictly compile-time, but looking closer I see that's not correct. Hmm, will have to look closer at that.

norswap · on March 27, 2017

> Some day, I would like to design a language that gets this right. But for the moment, I remain focused on Sandstorm.io. Hopefully someone will beat me to it.

Two language that do it: Whiley (http://whiley.org/) and Dafny (http://rise4fun.com/dafny).

I expect (but not sure) that the proving power is about the same as what can be achieved by template meta-programming.

winstonewert · on March 27, 2017

I'm unclear on how this would work out practically. Cases where my integers belong to a fixed range seems rare. I'd like to have seen more discussion of how it caught the bugs it did.

kentonv · on March 27, 2017

Cap'n Proto by its nature is doing a lot of work with integers that represent sizes, indices, offsets, addresses, etc. These tend to have well-defined ranges -- and unusually bad failure modes. So it works unusually well here.

danschumann · on March 27, 2017

Boo I thought it was a reference to fictional Tom Parris's holodeck program