KLEE LLVM Execution Engine

hdespiritu · on April 17, 2018

One of the contributors wrote a memoir of his time spent working on this project as a PhD student and gives an interesting perspective on the challenges encountered:

http://pgbovine.net/PhD-memoir.htm

k4st · on April 17, 2018

You can also run KLEE on x86(-64) or AArch64 binaries using McSema to lift the binary to LLVM bitcode. An example is here: https://github.com/trailofbits/mcsema/tree/master/examples/M...

mslate · on April 16, 2018

Anyone using this in practice?

TheNewAndy · on April 16, 2018

I use it - I am writing a subset-of-yaml parser, and I've littered the code with a bunch of abort()/assert() statements for things I don't think should be possible. Running klee over my code spits out ~100K test cases (I let it use any 10 byte YAML input), and so while I don't inspect the output of any of the cases - I do make sure that none of them crash.

Similarly, if I make a change which I don't expect to change the output of the parser, then I can use this test suite to be quite sure that my change hasn't changed behaviour.

One thing I am unclear about is what it means for klee to be "done" on my binary. When it has generated all the test cases, if none of them crash my binary, then does that mean it is impossible to crash my binary in 10 bytes of input?

withzombies · on April 17, 2018

Have you checked out DeepState (https://github.com/trailofbits/deepstate), it's a symbolic unit testing framework that has an interface similar to google test.

TheNewAndy · on April 17, 2018

I haven't - thanks.

Writing test harnesses specific to the different tools (so far, just AFL and KLEE) has not been particularly difficult - all of them fit easily on one screen. So I'm not sure that deepstate really brings much to the table for what I'm doing.

xnum · on April 16, 2018

if your program calls system call or random function which KLEE couldn't handle, some parts of code may not be tested. KLEE uses its special libc (uclibc) to deal with that cases in glibc. To solve this problem perfectly, we usually uses S2E instead of KLEE in large and complex softwares (http://s2e.systems/)

TheNewAndy · on April 17, 2018

Right - but being a parser, I don't have those problems. My problem domain is quite well specified and there isn't any place for random numbers of system calls.

wyldfire · on April 17, 2018

Can you share more details about this? Is the work already public?

TheNewAndy · on April 17, 2018

The work won't be public. What details are you interested in?

nickpsecurity · on April 17, 2018

You can find both interesting uses (esp extensions) and users on the page below:

http://klee.github.io/publications/

There's a large amount of work building on KLEE. Similarly, it builds on LLVM that has its own huge ecosystem. The potential of all that together is what interests me about it.

rurban · on April 17, 2018

More people are using cbmc, the non-llvm variant, but otherwise very similar tool. I find cbmc much easier to use.

xvilka · on April 17, 2018

Problem of KLEE is lagging behind the latest LLVM version , which changed and improved a lot meanwhile.

kren1 · on April 17, 2018

There is support up to atleast llvm 3.8 now. See: http://klee.github.io/build-llvm38/

xvilka · on April 19, 2018

3.8 is super old already.

cztomsik · on April 17, 2018

I wonder if we could somehow use the work done in prepack to generate similar tests for javascript.

BTW: I don't want to oversimplify it, but it basically just looks at each branch and generates data which would pass the branch, right?

senatorobama · on April 16, 2018

What is CS 101 explanation for this project?

modeless · on April 17, 2018

I found a blog post with a succinct explanation.

> KLEE [runs] a program considering its input(or some other variables) to be symbols instead of concrete values like 100 or “cacho”. In very few words, a symbolic execution runs through the code propagating symbols and conditions; forking execution at symbol dependant branches and asking the companion SMT solver for path feasibility or counter-examples.

https://feliam.wordpress.com/2010/10/07/the-symbolic-maze/

cryptonector · on April 17, 2018

Finally, a succint, pithy explanation.

This is great. Makes me want to use it.

doomjunky · on April 18, 2018

KLEE is a symbolic execution engine. You can use it to analyse your program. A symbolic execution considers all possible values for all variables. E.g.:

   int x = readIntFromIO(); // -2147483648 to 2147483647

On branches constraints are added.

    if(x < 3) {
        // -2147483648 to 2
    } else {
        // 3 to 2147483647
    }

In this way all possible paths of the program are considered. When the execution terminates, e.g. on reaching a abort() or fail() statemant, a constraint solver calculates the input values to reach this statement. This values could be considers as test cases.

You can add assert() statements to your code for expressions (e.g. assert(x != 3) ) that you assume to be an inconsistent states and solver will give you test cases for this.