
With fuzzing it's possible to find distinct bugs (or at least bugs that trigger in distinct code locations) without ever investigating them by hand.

Your bug report can simply consist of "this input file causes a compiler crash".
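
To make the "distinct code locations" idea concrete, here is a minimal sketch, assuming a toy parser in place of a real compiler. `toy_parser`, its two planted bugs, and `crash_signature` are all hypothetical names invented for illustration. Crashes are bucketed by exception type and the innermost crash location, and one triggering input is kept per bucket.

```python
import random
import traceback

def toy_parser(data: bytes) -> int:
    """Hypothetical stand-in for a compiler front end, with two planted bugs."""
    depth = 0
    for b in data:
        if b == ord("("):
            depth += 1
        elif b == ord(")"):
            depth -= 1
            if depth < 0:
                raise ValueError("unbalanced close")  # planted bug 1
        elif b == ord("/"):
            return 1 // depth  # planted bug 2: ZeroDivisionError when depth == 0
    return depth

def crash_signature(exc: BaseException) -> tuple:
    """Bucket key: exception type plus the innermost frame's function and line."""
    frame = traceback.extract_tb(exc.__traceback__)[-1]
    return (type(exc).__name__, frame.name, frame.lineno)

def fuzz(seed: int = 0, iterations: int = 2000) -> dict:
    """Throw random inputs at the parser; keep one input per crash signature."""
    rng = random.Random(seed)
    buckets = {}
    for _ in range(iterations):
        data = bytes(rng.choice(b"()/a") for _ in range(rng.randint(0, 8)))
        try:
            toy_parser(data)
        except Exception as exc:
            buckets.setdefault(crash_signature(exc), data)
    return buckets

if __name__ == "__main__":
    for signature, data in fuzz().items():
        print(signature, "triggered by", data)
```

Each bucket then maps directly onto a "this input file causes a crash" report, with no manual triage needed to tell the two bugs apart.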




Crashes are the easier bugs to report, but often the most important compiler bugs are silent miscompilations that go unnoticed. With C compilers in particular, it's not easy to conclusively demonstrate a bug and convince the developers that the compiler is actually buggy (a lot of C standard lawyering is at play when it comes to subtle undefined behaviors).

[disclosure: I have been involved with the initial phase of the "EMI" compiler validation work that's linked] One of the great strengths of EMI has been its ability to identify a lot of miscompilations in particular. In fact, while we did not fix the bugs ourselves, I would say the majority of the time spent was probably in reducing the bugs to a short program that manifested the issue and engaging with the developers to prove the bug is not undefined behavior. I remember from my days that the majority of bugs we reported were miscompilations, not crashes, which took a lot of time to report. Even with simple crashes, you probably need to reduce the bug, as the developers don't usually appreciate a 7MB source file attached to your bug report!
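
For readers who haven't done test-case reduction, a rough sketch of the idea follows. It is a greedy chunk-removal loop, much simpler than what creduce actually does, and the `interesting` predicate is a hypothetical stand-in for "the compiler still shows the bug on this input".

```python
from typing import Callable, List

def reduce_lines(lines: List[str],
                 interesting: Callable[[List[str]], bool]) -> List[str]:
    """Greedily delete chunks of lines while the input stays 'interesting'."""
    assert interesting(lines), "the starting input must reproduce the bug"
    chunk = max(len(lines) // 2, 1)
    while chunk >= 1:
        i = 0
        while i < len(lines):
            candidate = lines[:i] + lines[i + chunk:]
            if candidate and interesting(candidate):
                lines = candidate  # smaller input still reproduces; retry at i
            else:
                i += chunk         # this chunk is needed; move on
        chunk //= 2                # try finer-grained deletions next
    return lines

if __name__ == "__main__":
    # Hypothetical bug that needs exactly two lines of a 200-line file.
    program = [f"int v{n};" for n in range(200)]
    program[50] = "int y;"
    program[120] = "x = 1 / 0;"
    still_buggy = lambda ls: "int y;" in ls and "x = 1 / 0;" in ls
    print(reduce_lines(program, still_buggy))
```

For miscompilations the predicate is the hard part: it has to check that the reduced program still produces the wrong output and has not drifted into undefined behavior, which is where most of the reporting time goes.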


Indeed! On the Mill project we leave boxes crunching hours and hours of csmith/creduce, and we don't watch them do it :)

SPE looks to be very nice too.


I haven't seen a Mill press release hit the front page for a while.

I'm not good at subtly hinting.


What's the Mill project? Is that the "new computer" thing, or am I missing something?


Yep, that's the project :)

We have our own LLVM backend, which is a front end to our own "specialiser".

We use csmith to fuzz for compiler bugs and creduce to reduce them. This starts with C and we even validate the output of the sim against clang x86 output.
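
The "validate one output against another" step is differential testing. Here is a minimal sketch of the idea, assuming two toy expression evaluators rather than a simulator and clang. Every name in it is hypothetical, and the bug in `eval_under_test` is planted on purpose.

```python
import random

def gen_expr(rng: random.Random, depth: int = 3):
    """Random expression: an int literal or a tuple (op, left, right)."""
    if depth == 0 or rng.random() < 0.3:
        return rng.randint(0, 9)
    return (rng.choice("+-*"), gen_expr(rng, depth - 1), gen_expr(rng, depth - 1))

def apply_op(op, x, y):
    return x + y if op == "+" else x - y if op == "-" else x * y

def eval_reference(e):
    """Trusted implementation (plays the role of the known-good compiler)."""
    if isinstance(e, int):
        return e
    op, a, b = e
    return apply_op(op, eval_reference(a), eval_reference(b))

def eval_under_test(e):
    """Implementation being validated, with one planted 'miscompilation'."""
    if isinstance(e, int):
        return e
    op, a, b = e
    if op == "-" and a == 0:
        return eval_under_test(b)  # planted bug: "0 - b" loses its negation
    return apply_op(op, eval_under_test(a), eval_under_test(b))

def differential_test(seed: int = 1, trials: int = 5000) -> list:
    """Return (expression, expected, got) for every disagreement found."""
    rng = random.Random(seed)
    mismatches = []
    for _ in range(trials):
        e = gen_expr(rng)
        expected, got = eval_reference(e), eval_under_test(e)
        if expected != got:
            mismatches.append((e, expected, got))
    return mismatches

if __name__ == "__main__":
    found = differential_test()
    print(len(found), "mismatches, e.g.", found[0])
```

Nothing crashes here; the bug only shows up because two implementations disagree, which is why this style of checking catches silent wrong-code bugs that a crash-only fuzzer misses.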


How should someone go about contributing to your LLVM backend?



So it is meant to be a different architecture? What market is it targeting?

Is Mill low power, high performance, embedded, radiation hardened, etc.? Will it come in small packages with RAM and ROM on the die? Will it support a different architectural view (maybe all the memory will be non-volatile)?

A lot of questions, and very few of them answered on the page.


Give a man a bug report, he will fix his program one day. Give a man a fuzzer, he will be fixing bugs for the rest of his life.


Alternatively, you could do what I do and just not write bugs.


This works if you don't have coworkers.

I'm kidding! It doesn't work even then.


In those kinds of environments, typically I just will the program into existence, bug-free.


I agree with your cynicism when dealing with a software vendor like Oracle, where bug reports are fixed only for the reported instance (if at all).

For a FOSS project like LLVM, I think getting >100 parser bugs might cause a slight re-evaluation of the current code structure. Maybe I'm a bit too optimistic, but the LLVM team has done a fine job of fixing outstanding issues.

Lastly, fuzzers like AFL (american fuzzy lop) attempt to model the input, doing a sort of lazy search for bugs (taking parser run-time into account). So when it finds a cluster of bugs, they're normally clustered in the source code as well.
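
As a rough illustration of how a feedback-driven fuzzer of this kind works: AFL (american fuzzy lop) keeps any mutated input that reaches code it has not seen before and mutates from there. The sketch below is a toy version of that loop, not AFL's actual algorithm. The hand-instrumented `target`, its branch ids, and the single-byte mutation are all assumptions made for illustration.

```python
import random
import string

ALPHABET = string.ascii_uppercase.encode()

def target(data: bytes) -> set:
    """Hypothetical instrumented parser: returns the ids of branches it took."""
    cov = {0}
    if len(data) > 0 and data[0] == ord("F"):
        cov.add(1)
        if len(data) > 1 and data[1] == ord("U"):
            cov.add(2)
            if len(data) > 2 and data[2] == ord("Z"):
                cov.add(3)
                if len(data) > 3 and data[3] == ord("Z"):
                    raise RuntimeError("planted crash")
    return cov

def fuzz(seed: int = 0, max_iters: int = 50_000):
    """Return (crashing input, iterations used), or (None, max_iters)."""
    rng = random.Random(seed)
    corpus = [b"AAAA"]
    seen = set(target(corpus[0]))
    for i in range(max_iters):
        child = bytearray(rng.choice(corpus))
        child[rng.randrange(len(child))] = rng.choice(ALPHABET)  # mutate 1 byte
        child = bytes(child)
        try:
            cov = target(child)
        except RuntimeError:
            return child, i + 1
        if not cov <= seen:       # reached a branch never seen before
            seen |= cov
            corpus.append(child)  # keep it as a parent for later mutations
    return None, max_iters

if __name__ == "__main__":
    print(fuzz())
```

Because each kept input sits one step deeper in the same code path, the crashes such a loop finds tend to sit near each other in the source, which matches the clustering described above.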


I don't think the comment was meant to be cynical. I think it was meant to show how much time and effort fuzzing can save when trying to generate reproducible test cases.



