The range of issues discovered (they mention null pointer access and resource and memory leaks) is much narrower than what a tool like Coverity will find (I use it). And they analyze C and Java, two languages supported by Coverity, a very mature tool...
I am not certain of the proposed value, except that it's free to everyone other than Facebook - but not to Facebook, which pays engineers to develop it...
Is this some kind of NIH syndrome at Facebook, or is there something I missed?
Coverity is great, but for example on the mid-size service (10s but not 100s of kloc) that my team works on the analysis still takes hours. Therefore we only do it for prod releases, not on every commit or CI deployment.
If you want to make static analysis part of the everyday development process, it has to be 1) very quick, ideally seconds; minutes at most 2) preferably something the developer can just run locally before pushing a change. If it's fast and easy enough, it simply becomes another code hygiene tool like a code formatter that you'll run continuously, perhaps even directly integrated into something like IntelliJ.
To me Infer sounds like a nice complement to Coverity to catch issues as close to where they are introduced as possible. It might even be Good Enough for many projects to be the only tool, since Coverity is pretty expensive.
> If you want to make static analysis part of the everyday development process [...]
Just to nitpick, I think your use of the term "static analysis" is a bit too broad. Every (or almost every) production compiler or JIT does static analysis intraprocedurally, that is, confined within a function / method / procedure. On the other hand, whole program / interprocedural static analysis quickly gets very expensive, usually because an alias / pointer analysis is involved, and that's what you need for null pointer checks and stuff.
So I guess my point is, there is plenty of static analysis going on all the time, just not expensive whole program bug-finding analysis. Cheap bug-finding static analysis stuff is common, for example in GCC all those warning options to catch undefined behavior.
(Infer dev here)
One strength of Infer is that it is inter-procedural, yet not whole-program: each procedure gets analyzed independently. So it's cheap enough to run on large codebases while still able to find deep inter-procedural bugs.
If you're analyzing procedures independently, why is it interprocedural? Interprocedural just means that you use some information about another procedure. This is expensive because if the information about one procedure changes during the analysis, you have to go and reanalyze all the dependent procedures. There are cheap but less accurate pointer analyses, is that why it's fast?
Infer does bottom-up analysis: it starts at the bottom of the call graph and analyzes each procedure once independently of its callers. Analyzing the procedure produces a concise summary of its behavior that can be used in each calling procedure. This means that the cost of the analysis is roughly linear in the number of nodes in the call graph, which is not true for a lot of other interprocedural analysis techniques.
It's true that if a procedure changes you may have to re-analyze all dependent procedures (and calling procedures!) in the worst case. However, in the bottom-up scheme you only need to re-analyze a procedure when the code change produces a change in the computed summary, and in practice summaries are frequently quite stable.
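In pseudo-OCaml, the scheme looks roughly like this (a toy sketch of the idea only, not Infer's actual code; the summary type, analyze_proc, and the bottom-up ordering are all made up for illustration):

    (* Toy sketch of bottom-up, summary-based analysis. *)
    module SummaryMap = Map.Make (String)

    type summary = { may_return_null : bool }  (* a deliberately tiny "summary" *)

    (* Analyze one procedure body, given the summaries of its callees. *)
    let analyze_proc (callees : summary SummaryMap.t) (proc : string) : summary =
      ignore callees; ignore proc;
      { may_return_null = false }  (* the real abstract interpretation goes here *)

    (* Walk the call graph bottom-up: leaves first, then their callers.
       Each procedure is analyzed once and its summary is reused at every
       call site, which is why the cost stays roughly linear in the number
       of procedures. *)
    let analyze_program (bottom_up_order : string list) : summary SummaryMap.t =
      List.fold_left
        (fun summaries proc ->
           SummaryMap.add proc (analyze_proc summaries proc) summaries)
        SummaryMap.empty bottom_up_order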
Cool, thanks for the details. So... what do you do about cycles?
By "change" I didn't mean code change, I meant change in the information about the procedure collected during an iteration of the fixed point computation. But from the sounds of things you aren't computing a fixed point.
For example: A calls B and B calls A. You have information A0 and B0 about A and B. Analyze B, you have information B1 about B. Then you go and analyze A using B1. This gives you A1. Now you have to redo B, and compute B2. Use this to compute A2. This carries on until the information stops changing, i.e. A_(n+1) = A_n and B_(n+1) = B_n.
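In code, the naive version of that loop is something like this (toy OCaml; analyze_a and analyze_b stand in for the analysis' transfer functions, and termination assumes the information can only change finitely often):

    (* Iterate the two mutually recursive procedures to a fixed point. *)
    let rec iterate analyze_a analyze_b (a, b) =
      let b' = analyze_b a in   (* recompute facts about B using A's facts *)
      let a' = analyze_a b' in  (* recompute facts about A using B's facts *)
      if a' = a && b' = b then (a', b')  (* nothing changed: fixed point *)
      else iterate analyze_a analyze_b (a', b')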
Blatant sales pitch: Klocwork (which I rep) is what you want. Our static analysis tool is fast enough so that your developers can run it just before a commit, but for many environments it will run when they save code, or even as they type. In a CI process you can run it on every commit. It's like a spellchecker, except that it does highly sophisticated static source code analysis.
We cover C/C++/Java/C#. The tool runs on the order of 1-2x the speed of your build, and is integrated into Eclipse, IntelliJ, and Visual Studio. It's very, very fast, and it covers much of the same ground as Coverity, and more. Mail me at larry.edelstein@roguewave.com to arrange a demo.
You might want to check the desktop analysis in Coverity. It runs a complete analysis the first time, but afterwards it can run analysis on just the changes - against a version present on your build server. I had set this up for a code base in C (>100Kloc) and desktop analysis was fast enough.
Have you tried Coverity Desktop analysis? It'll let you run Coverity on your code incrementally (only for code changes), assuming you already have a baseline.
Coverity was evaluated a number of times at Google, and IIRC we decided it wasn't going to scale to the codebase size we needed it to. A separate and unrelated effort ended up with us building Tricorder [1].
Often perceived NIH at large companies for this sort of thing is simply a byproduct of scale that is unreasonable for external companies to have to worry about supporting.
Your second point is very good; I get the feeling that NIH is simply a buzzword at the moment. Especially since a cursory glance at Infer's source code will show that it was, in fact, invented somewhere else and purchased by Facebook.
These sorts of tools can get very annoying with their excessive warnings, but when they work, they save tons of time. We faced one bug at work where, when multiple users were making a search at the same time, you would get bogus results back. With one user it would work consistently. We spent weeks tracking this down, and the problem was:
== instead of .equals() in Java
The thing is, we were used to ignoring find bugs, but lo and behold it had pointed this problem out.
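For the curious: == compares object identity while .equals() compares contents, so two distinct String objects holding the same characters can fail ==. OCaml, incidentally, has exactly the same trap, which makes for a compact illustration:

    let a = "search" ^ "term"
    let b = "search" ^ "term"
    let _ = a = b   (* true:  structural equality compares contents *)
    let _ = a == b  (* false: physical equality compares identity,
                       like Java's == on objects *)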
Some of these tools (like Coverity) can be very comprehensive yet slow and expensive. We do our checking in layers: lint and memory-check tools on every checkin, and the slower, more expensive tools on an hourly/nightly basis.
* Facebook has recently hired a number of expert language theorists and practitioners. Doing it in-house
1) Gives them something to do
2) Serves to cement Facebook's language-expertise-brand recognition and dominance.
The very fact that hiring is focusing on this group signals to me that this is an area which Facebook takes seriously and wants to be taken seriously in.
I'd be interested in this as well; all I could find on their site is that it uses "patented techniques." Considering the first separation logic paper was published in 2001 [1] and Coverity was founded in 2002 [2], they probably did not (at least originally) use the same techniques.
I'm guessing their site is designed to sell to people and not have the details.
Facebook employs thousands of developers and has a product that is pretty much done (or, IDK what they spend all their development budget on atm); they have the room to create new tools. TBF though, if this was a hobby project that wasn't linked to Facebook, it wouldn't get the attention it is getting right now.
It's interesting to see how building tools with languages like OCaml can reduce bugs for teams, without them having to change the language itself. I do wonder what things would be like if such languages were used directly more widely.
Legend has it there is a small room at FBHQ, containing a quorum of OCaml committers, all of them French for some reason, hacking away at a level of abstraction beyond the ken of mortal man.
That's not a legend, it's true: they have a partnership with INRIA, where OCaml was born. Those French computer scientists know something that American companies don't, namely that theory is an important underlying foundation for extraordinary results.
I'd actually expand that to European - in my grad studies, the European CS world had a much more mathematical bent. Notice the Glasgow Haskell Compiler, Coq, etc.
You can always go from math -> pragmatism, but the reverse is nearly impossible. People get set in their ways, and it takes time to develop mathematical rigor, even if you want to.
So when you need to get mathematical expertise, you wind up needing to hire it.
A pretty sizable chunk of the OCaml world is French. It was developed at INRIA, yeah, and maintains a lot of presence there. From time to time it's actually a tiny bit difficult to find English resources for OCaml.
We use Pfff internally to generate a code graph of large polyglot code-bases (think AOSP). This graph powers the querying engine that finds all tests to be run given a diff. It is incredibly useful. My only gripe is that Pfff isn't being maintained. Pfff can barely support Java 7, let alone Java 8. C/C++ support isn't as extensive as it is for PHP. How I wish Pfff were being actively maintained...
I've found Sourcegraph's srclib.org (Go) and Google's kythe.io (Cpp, Go) make some interesting strides in the static analysis field as well.
IMO, treating code as query-able data can open up a lot of possibilities, and OCaml suits the field like a glove.
This appears† to be the result of Facebook having purchased a UK company called Monoidics†† in 2013. It's nice to see these types of acquisitions resulting in code getting open-sourced.
"Based on the success of these research advances, we formed a startup company, Monoidics, in 2009. Monoidics joined Facebook in 2013, and since then we have adopted the continuous development and deployment style practiced at Facebook and improved our analyzer based on an iterative feedback loop between our analyzer's development team and Facebook's mobile software engineers. We have demonstrated that verification technology can indeed thrive when applied to codebases at Facebook's scale and pace of change."
Can someone explain-it-like-I'm-a-90s-programmer (ELi90s?) why so much symbolic evaluation stuff gets done in OCaml? What does OCaml do that makes it so well suited for this problem domain? (I know a very little bit about symbolic evaluation and have done a very very little bit of it).
In addition to what the other commenter said, pattern matching is really nice for this kind of thing. Of all of the functional programming languages, OCaml has probably the most sophisticated pattern matching engine around (and we basically copied it into Rust, incidentally), supporting or-patterns, multiple bindings, guards, and so forth. Pattern matching lets you essentially match on the shape of subtrees of arbitrary data structures with complex predicates.
If you're familiar with old-school compiler construction, this is like having a souped-up BURG built into the language. For example, pattern matching lets you say things like "if I have Load(Var, Add(Var, Constant)) where constant is a small power of two, fold it into the x86 indexed addressing mode" in one line. Unsurprisingly this is useful not only for compiler construction but for any kind of term rewriting/symbolic manipulation.
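Concretely, that one-liner could look like the following (a made-up toy IR, not any real compiler's, just to show the shape of such a rule):

    type expr =
      | Var of string
      | Constant of int
      | Add of expr * expr
      | Load of expr * expr   (* Load (base, offset) *)

    (* "Load(Var, Add(Var, Constant)) where the constant is a small power
       of two folds into the x86 indexed addressing mode", as one arm: *)
    let select_addressing = function
      | Load (Var base, Add (Var index, Constant scale))
        when scale = 1 || scale = 2 || scale = 4 || scale = 8 ->
          Printf.sprintf "[%s + %s*%d]" base index scale
      | _ -> "<fall back to the general case>"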
Ok, tangent question: if I wanted an interesting project to learn Rust with, would a symbolic evaluation checker for (say) C code be a really good fit? In the same sense as emulators turned out to be a fantastic fit for Golang?
If that's true, what are the features of Rust that make this so, and roughly how would they apply to that problem domain? (I could answer that question for Golang and emulators pretty quickly).
(Hopefully this question comes across the way I intend it to, which is: I have no plans on using OCaml any time soon, believe the comments that say you want a language with pattern matching to do this in, and would love to tinker more with both Rust and symbolic evaluation.)
Sure, I think it'd be a fun project to try! We use Rust for compiler construction, obviously, and it works great for us. Bear in mind, though, that Rust is manually memory managed, and there is significant cognitive overhead of having a compiler that checks that you're doing the manual memory management properly as opposed to just using a GC.
I think if you want a really fast symbolic evaluation checker—say, the kind that you're going to run on $BIG_COMPANY's codebase on every checkin—Rust would be a really good fit, because you get pattern matching and excellent performance. Other than OCaml and (maybe?) F#, I don't know of a language that has as sophisticated a pattern matcher as Rust does, which helps a lot with this stuff. But, if you're building a one-off tool, a more dynamic language with a GC might be more convenient.
Would it be likely that I could get around that problem just by using arena allocation? Do you think the allocation patterns of a symbolic checker fit that sort of "just allocate everything and forget about it, then free it all at once" pattern? Does Rust make it easy to punt that way?
I'm mildly surprised, why isn't this a thing in Haskell? Can't it be easily added, considering an or pattern is trivially expanded to two (or more) patterns?
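For reference, what an or-pattern buys you in OCaml (toy token type, just for illustration):

    type token = Plus | Minus | Times | Divide | Ident of string

    (* One arm with an or-pattern... *)
    let is_operator = function
      | Plus | Minus | Times | Divide -> true
      | Ident _ -> false

    (* ...is just shorthand for repeating the right-hand side: *)
    let is_operator_expanded = function
      | Plus -> true
      | Minus -> true
      | Times -> true
      | Divide -> true
      | Ident _ -> false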
I've been surprised/annoyed by this as well. My conclusion was that there are no practical problems that would make this hard, and it is most likely held back simply by the theoretical trivialness of it (what academic wants to work on something that doesn't make a good paper?).
As far as I know, that's never really been a reason as to why we haven't implemented it in GHC. It's much more likely just that nobody has ever gotten around to implementing it. Shouldn't be especially hard, I imagine.
As a lot of other commenters say, pattern matching, ADTs, and higher-order functions are ideal for symbolic manipulation. Nowadays many languages have them (Scala, Rust, F#, Haskell), but OCaml is old and was probably the first language that offered these features and had a really fast, rock-solid compiler producing really fast, high-quality code (since the 1990s). Programming language researchers have known for a long time that ML-style languages are pretty sweet, and started using them as soon as they became pragmatically viable with OCaml in the 1990s, and a lot of them just stuck with OCaml.
On top of that, Cristiano Calcagno, one of the co-founders of Monoidics, the startup that Facebook bought to get the Infer technology, is one of the people behind MetaOCaml, and has coauthored a paper with Xavier Leroy, OCaml's creator.
Another reason is the close relationship with theorem provers, in particular Coq. You can build certified and relatively efficient functional programs by "pressing a button" in Coq, extracting programs from either Coq functions or Coq proofs of specifications. This now also works for Haskell IIRC, but I think originally it was just for OCaml.
Symbolic reasoning involves a lot of term-matching and term-rewriting. These tasks are best accomplished in programming languages that have first-class support for recursive terms (aka. algebraic data types). Popular candidates these days are OCaml, Haskell, Scala.
While I am not certain the premise holds (OCaml dominance), one possible explanation is that many programmers were introduced to ML/OCaml through the compiler construction course and Andrew Appel's book.
I think one part of this is the OCaml type system. It just seems to be very suitable for symbolic manipulations. You can do a lot with algebraic data types.
The other feature that I personally think is great in OCaml is pattern matching. It helps you express computation on symbols, where usually there are only a few possible shapes of input to handle. See the following:
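Something along these lines, a toy symbolic simplifier (illustrative only, not code from Infer or pfff):

    type expr =
      | Num of int
      | Sym of string
      | Add of expr * expr
      | Mul of expr * expr

    (* You dispatch on the shape of the expression, not the values inside,
       and there are only a handful of shapes to consider. *)
    let rec simplify = function
      | Add (Num 0, e) | Add (e, Num 0) -> simplify e  (* e + 0 = e *)
      | Mul (Num 1, e) | Mul (e, Num 1) -> simplify e  (* e * 1 = e *)
      | Mul (Num 0, _) | Mul (_, Num 0) -> Num 0       (* e * 0 = 0 *)
      | Add (a, b) -> Add (simplify a, simplify b)
      | Mul (a, b) -> Mul (simplify a, simplify b)
      | (Num _ | Sym _) as e -> e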
I can sort of make sense of this, but I don't know enough OCaml or FbInfer to really grok it. Why would this be a particularly good example of how OCaml simplifies symbolic evaluation?
Thank you, by the way! Actual code examples are kind of exactly what I'd like to see.
If I oversimplify, I would say that thanks to pattern matching and the functional nature of OCaml you can avoid much of the boilerplate code needed for building tree nodes or finding the node you want, allowing you to focus on the high-level task you actually want to achieve.
In fact one thing that you might notice in the beginning when reading OCaml is that the code seems more "dense".
In C I was used to reading code by skipping over large chunks of it (again I'm oversimplifying here) like the one below, and one of the things I had to get used to when learning OCaml was to slow down and avoid skipping large chunks of code:
if (some_complex_condition == failed) {
/* ... large block of code for error handling to ignore on first read .. */
}
result = malloc(...);
result->... = ...;
result->... = ....;
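A rough OCaml counterpart of that shape (an illustrative sketch, every name made up): the error path becomes one match arm rather than a block to skim past, which is part of why the code reads as dense:

    (* Illustrative: the error case is a constructor to match on,
       not a large block to skip over on first read. *)
    let with_result check build use handle_error =
      match check () with
      | Error e -> handle_error e   (* the whole error path lives here *)
      | Ok input -> use (build input)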
Here are some examples of what is possible with symbolic manipulation in OCaml, although I would recommend learning a bit of OCaml syntax and concepts from a book first (such as Real World OCaml):
Not from pfff/FbInfer, but Benjamin Pierce's excellent book on types (Types and Programming Languages) uses OCaml as well, with heavy use of pattern matching on AST.
    | TmIf(fi, t1, t2, t3) ->
        if (=) (typeof t1) TyBool then
          let tyT2 = typeof t2 in
          if (=) tyT2 (typeof t3) then tyT2
          else error fi "arms of conditional have different types"
        else error fi "guard of conditional not a boolean"
In addition to pattern matching, SML and OCaml are popular languages for this type of work because some of the graph algorithms used in static analysis are more easily expressed with eager evaluation and mutability. I am guessing there are commonly accepted idioms and libraries around the use of functors, monads, applicatives, etc. for doing these things in Haskell; there is a Haskell version of the TAPL examples (as well as a Scala one). Here's a talk from Intel about their use of SML in this field, which (in a collegial manner) mocked Haskell by saying in a slide "yeah, I am sure Simon Peyton Jones has a paper on it somewhere..." (note: I attended this talk at CUFP in 2010... IIRC Simon Peyton Jones was in the room when this remark was made :-))
A great deal of this also seems to be convention/pragmatism: e.g., pfff and Hack are in OCaml -- and predate the use of Haskell at FB -- hence FbInfer is as well. Coq, which is often used for proofs of correctness of type systems, is in OCaml so someone who already uses/hacks on Coq might as well use OCaml for implementation work.
Yes, you're right, I wasn't exactly clear. My understanding of symbolic manipulation is the following (from Wikipedia):
"In mathematics and computer science, computer algebra, also called symbolic computation or algebraic computation is a scientific area that refers to the study and development of algorithms and software for manipulating mathematical expressions and other mathematical objects."
Now obviously if you have types in your language that make this easier, then it is a win. I thought algebraic data types make this easier. I might be wrong. On the pattern matching side, you are not concerned with the actual values of the variables in your expressions, but rather with the patterns they match. That was my intention to show with the second URL in the previous comment, but again I might be wrong on that too.
Static FP languages just feel so natural for compilers - the AST can often be perfectly naturally expressed with algebraic data types. Then you want to traverse the tree and do certain things when subtrees or nodes are of a certain shape... which, it turns out, pattern matching expresses very nicely.
I doubt there are nicer general-purpose languages than the likes of OCaml for compiler work, and, somewhat by extension, symbolic execution. Maybe languages with more powerful matching or even term rewriting, but I don't know if they would be considered general purpose (they are at least fairly fringe).
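For example (toy language, illustrative only): the AST is an algebraic data type, and constant folding is a traversal that fires on subtrees of the right shape:

    type ast =
      | Int of int
      | Neg of ast
      | Add of ast * ast

    let rec fold = function
      | Add (a, b) ->
          (match fold a, fold b with
           | Int x, Int y -> Int (x + y)   (* both operands are literals *)
           | a', b' -> Add (a', b'))
      | Neg e ->
          (match fold e with
           | Int x -> Int (-x)
           | e' -> Neg e')
      | Int _ as e -> e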
(obviously it supports Java as well, but I assume Android Studio comes with some sort of static analyzer as well, so same question?)
It specifically calls out null pointer exceptions but those... aren't a thing... in Objective-C, messages passed to nil return all 0 bits, and that's okay (unless they mean null dereferences...).
On iOS there is the Clang Static Analyzer. Infer does some things differently, in particular reasoning that spans across multiple files. But CSA checks for more kinds of issues and is also more mature than Infer when it comes to iOS: we send big respect to CSA! Infer has only recently got started there.
Really, these tools complement one another and it would even make sense to use both. Indeed, that's what we do inside FB!
About null dereferences: they are still a problem in ObjC. If you dereference a nil block it will crash; if you access an instance variable directly, or try to pass nil to arrays or dictionaries, it will crash. We try to find those kinds of bugs.
Sorry to hijack your comment but it sounds like you are one of the devs of Infer. I am working on static analysis as part of my PhD and I am going to be an intern at Facebook MPK this summer. Are you located at MPK as well? Any chance we could meet up for some coffee at some point?
Yes, Android Studio does come with a static checker called Lint, and I'm wondering the same. What are Infer's advantages over Lint or even FindBugs?
Infer finds deeper inter-procedural bugs, sometimes spanning multiple files. Linters, in contrast, typically implement simple syntactic checks that are local within one procedure. But they are valuable, and Infer doesn't try to duplicate what they are good at. At Facebook we run both Infer and a collection of Android linters.
Findbugs can be useful too; it is more akin to linters.
I am always interested in what powers these tools under the hood. I had to learn the hard way, you do not write a program analysis tool from scratch, if you can help it. I know I have tried. It is too much for one person to do.
So what is powering this thing?
1. http://sawja.inria.fr/ - an OCaml library for parsing .class files into OCaml data structures. It also comes with some built-in analyses.
2. Clang and LLVM, which are the popular things to build your C-family analysis framework on.
I use https://github.com/Sable/soot for Java analysis myself. It is extremely powerful out of the box and can analyze: java source code, jvm bytecode and dalvik bytecode. I recommend taking a look at that if you are interested in that sort of thing.
The innovation in the released tool seems to be the incremental checking. Haven't had a lot of time to dig into that but that seems to be the important part. In general it is great that they created something useful and practical, that is always a challenge.
>I use https://github.com/Sable/soot for Java analysis myself. It is extremely powerful out of the box and can analyze: java source code, jvm bytecode and dalvik bytecode. I recommend taking a look at that if you are interested in that sort of thing.
Soot is also really slow if you're using SSA on large codebases, and the code is a mess.
I am extremely happy to see Facebook using OCaml; it is good to get some more traction in that community. I hope it gains velocity over time and becomes a viable option, especially for startups with no technical debt. It has amazing features, and as you can see, even very complex problems can be solved in a concise, terse way. Kudos to Facebook on this one.
As has been pointed out, OCaml has support for algebraic data types and symbolic evaluation; doesn't this make it an excellent language for natural language processing? Are there any more examples anyone is aware of?
I'm going to completely ignore the tool itself and focus on the fact that it is written in OCaml. IMHO it's a great language that's much underused, and as such there's a need for a larger library ecosystem (what's out there is generally very good, but there isn't much). Hopefully adoption of OCaml at Facebook will grow and we'll see some interesting general-purpose open-source libraries!
For Java, it's Resource leaks and Null dereferences
For C and Objective C, the list is
Resource leak
Memory leak
Null dereference
Parameter not null checked
Ivar not null checked
Premature nil termination argument
> At present, the analyzer reports problems caused by null pointer access and resource and memory leaks, which cause a large percentage of app crashes.
Did you mean you were looking for something more specific?
Gave it a quick trial for iOS... doesn't seem great. It doesn't run at all when giving it a whole project (no response, no CPU usage, not even when you feed it BS arguments), it gives a "Starting Analysis" and nothing else for other (simple) files, it doesn't understand the newish 'nullable' keyword, and it will quit with a fatal error if it can't resolve an import (like UIKit), so pretty unusable on single files. I'm not convinced by the bug types it checks for either - doesn't xcode / the Clang analyser do those same checks itself and then some?
Just opened the example in Xcode; the analyzer itself already highlights a lot of issues, like the nil dereferencing: http://i.imgur.com/CdiYRYX.png. If it's written in more modern Objective-C and uses the nullable/nonnull annotations, the compiler will also warn / fail to build when trying to assign nil to a nonnull type.
To paint these tools with an overly broad brush, they are linter-like in that they perform shallow intra-procedural analysis to identify common bug patterns, e.g.:

    if (x != null) {
      y = x.f;
    }
    z = x.f;  // possible NPE; x was previously checked for null

or

    foo(String s) {
      if ("x" == s)  // oops, should use .equals() for Java String comparison
    }
By contrast, Infer performs deeper inter-procedural reasoning that can track the flow of values across long chains of procedure calls to identify subtle bugs that are hard to see with the naked eye. Infer doesn't support as many bug patterns as these existing tools do yet, but it can find some deep bugs that these tools will miss.
Layman query: can anyone kindly explain its pros/cons when compared to already existing tools, which already have quite a lot of features compared to this?
I clicked through to the description of separation logic - http://fbinfer.com/docs/separation-logic-and-bi-abduction.ht... - and I'm having a hell of a time understanding the first couple of paragraphs. Is there a typo in there? How is z↦y∗y↦x "x points to y and separately y points to x"?
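(For anyone else stuck on the notation, the standard reading as I understand it: x↦y says the heap consists of exactly one cell, at location x, holding the value y, and P∗Q says the heap splits into two disjoint parts, one satisfying P and the other satisfying Q. Under that reading, "x points to y and separately y points to x" is x↦y∗y↦x, so the z in the doc's formula does look like a typo.)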
It's interesting they're using OCaml for some of their open source projects when they have some of the top Haskell devs working there. I wonder if these OCaml projects they've recently open-sourced are all coming from the same team.
Are there examples of the types of bugs this finds? The embedded video has a single null dereferencing example, but I assume it does more sophisticated analysis than just that.
In this case, Infer is correct: the method test() by itself is harmless, because it could be the case that "s" has been initialised before it is called.
Infer does a bottom-up analysis (callees before callers), and will infer that test() expects "s" to be allocated to run correctly.
To get an actual error, you need to call test() without initialising s, or with s being set to null by some other means prior to the call.
Here is a modified version of your example that will get Infer to complain.
    class Hello {
      private String s;

      int test() { return s.length(); }

      int foo() { Hello a = new Hello(); return a.test(); }

      int bar() { Hello a = new Hello(); a.s = null; return a.test(); }
    }
Running "infer -- javac Hello.java" will show an error in bar(). Infer also finds the error in foo() but doesn't report it as it considers it lower-probability. This is a trade-off made in Infer to try and report only high-probability bugs. In this case it could be improved. To see that Infer finds the error in foo(), run "infer --no-filtering -- javac Hello.java".
Yes, it can be used with most Java code you can build on the command line. Just run "infer -- <your_build_command>". Currently, this works with javac, Ant, Maven, and Gradle. See http://fbinfer.com/docs/hello-world.html#hello-world-android for a Gradle example.
I just used it to evaluate a rather large (50k LOC) Java 8 application deployed as a war to Tomcat 8 ("infer -- gradle build"), and it found a bunch of null refs that IntelliJ and humans had all missed.