I'm new to fuzzers and fuzz testing in general, so I apologise for my ignorance about the purpose of fuzzing. My understanding is that fuzzing tests the user-facing side (which is what is important for most programs). Does there exist similar tooling for testing the system-facing side (i.e. the stack below your application), for example to check your application's error handling and uncover corner cases? What I'm getting at is something like syzkaller but for userspace, where library functions beneath your application would return wrong values and you get to see how your application responds to them.
It sounds like you already have a good understanding of fuzzing.
You might want to take a look at libFuzzer (http://llvm.org/docs/LibFuzzer.html), which you can use to test library functions. You implement a function called LLVMFuzzerTestOneInput that calls whatever library function you want to test; its parameters are a uint8_t array and its length, which you transform into whatever kinds of parameters the library function expects. libFuzzer then provides the main function that generates the data and calls LLVMFuzzerTestOneInput, repeating the process until there is a crash.
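A minimal target might look something like this (parse_message here is a made-up stand-in for whatever library function you want to exercise):

    #include <cstddef>
    #include <cstdint>

    // Hypothetical library function under test; replace with the real API.
    extern bool parse_message(const uint8_t *data, size_t size);

    // libFuzzer calls this repeatedly with generated/mutated inputs.
    extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
        parse_message(data, size);  // crashes and sanitizer errors abort the run
        return 0;                   // non-zero return values are reserved
    }

Built with something like clang++ -fsanitize=fuzzer,address target.cpp library.cpp, the resulting binary drives itself; no separate harness is needed.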
The one downside is that you need to recompile the code (the library) you are trying to fuzz.
I started writing a fuzzer for the JVM once. It looks at the bytecode and adds instrumentation to check whether a label has been hit, i.e. whether a code path has been traversed, after calling methods.
It uses genetic algorithms to spawn agents that call an API with all kinds of arguments and reproduce when they find unique code paths.
Before anyone asks: the code is far from complete, it's really more a fiddle than anything.
I've been thinking about this for fuzzing my own ssh server; there are various paths that depend on external stimuli other than the main network stream: TCP data incoming or writable, TCP sockets closing, processes exiting, etc.
It seems that wrapping select()/read()/write() to produce random readability/writability/closing seeded by a fuzz input might exercise those code paths. I'm yet to implement it, though; I'd be interested to hear of prior work. Getting the fuzz input corpus to correlate with program actions might be problematic.
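For what it's worth, here is a rough sketch of how an LD_PRELOAD shim for read() could look (FUZZ_SEED is a made-up environment variable, and a real harness would drive these decisions from the fuzz input rather than a PRNG):

    // shim.cpp: compile with  g++ -shared -fPIC -D_GNU_SOURCE shim.cpp -o shim.so -ldl
    #include <dlfcn.h>
    #include <unistd.h>
    #include <cerrno>
    #include <cstdlib>

    extern "C" ssize_t read(int fd, void *buf, size_t count) {
        static ssize_t (*real_read)(int, void *, size_t) =
            reinterpret_cast<ssize_t (*)(int, void *, size_t)>(dlsym(RTLD_NEXT, "read"));
        static unsigned int seed = [] {
            const char *s = std::getenv("FUZZ_SEED");
            return s ? static_cast<unsigned int>(std::atoi(s)) : 1u;
        }();

        switch (rand_r(&seed) % 4) {
        case 0: errno = EINTR; return -1;          // spurious interruption
        case 1: return 0;                          // premature EOF / peer closed
        case 2: if (count > 1) count /= 2; break;  // short read
        default: break;                            // pass through unchanged
        }
        return real_read(fd, buf, count);
    }

Running the server as LD_PRELOAD=./shim.so ./server then makes the wrapped call fail in ways the normal environment rarely produces; select() and write() could be interposed the same way.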
Yes! Static and dynamic code analysis are what you're looking for, I think (for the JVM a popular analysis tool is FindBugs, which is incidentally built into IntelliJ; another one that's okay and that I've used before is Coverity Code Advisor), rather than API endpoint analysis, which is what fuzzing usually boils down to (i.e. what can I pass to this program to make it barf?).
The post notes that they disabled reading config files so the fuzzing runs against the default setup. I assume that with more time it would be wise to try and fuzz as many configuration options as possible as well?
Yes, that is what I was trying to get at in the second paragraph under the "What can I do to help with fuzzing Irssi?…" heading: that there may be certain bugs that require certain non-default configuration options as part of the criteria for triggering them.
I actually did do some fuzzing of the config file (just loading Irssi to see if the config file caused a crash) and found a couple of bugs there (for example: https://github.com/irssi/irssi/issues/563). I chose instead to fuzz network traffic, as done in the blog post, because it is generally more interesting: it is easier for a malicious person to exploit network-based bugs than bugs that require the user to load a bad config.
But you are right that the configuration can be part of the fuzzed input. It should be possible to take part of the data fed into Irssi by AFL and use that as the config file, then use the rest as the network traffic.
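Just as a sketch of the idea (not how the actual Irssi harness works): reserve a small length prefix in the fuzz buffer and split on it, e.g.

    #include <algorithm>
    #include <cstddef>
    #include <cstdint>
    #include <string>

    struct SplitInput {
        std::string config;   // bytes to write out as the config file
        std::string traffic;  // bytes to replay as network traffic
    };

    // Treat the first two bytes as a big-endian length of the config part;
    // everything after it is the simulated network traffic.
    static SplitInput split_fuzz_input(const uint8_t *data, size_t size) {
        SplitInput out;
        if (size < 2) return out;
        size_t cfg_len = (static_cast<size_t>(data[0]) << 8) | data[1];
        cfg_len = std::min(cfg_len, size - 2);
        out.config.assign(reinterpret_cast<const char *>(data) + 2, cfg_len);
        out.traffic.assign(reinterpret_cast<const char *>(data) + 2 + cfg_len,
                           size - 2 - cfg_len);
        return out;
    }

AFL would then mutate both halves of the same file, so crashes that need a particular configuration option plus particular traffic become reachable.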
This fuzzing is interesting stuff. Does anyone know of an in-process or other fuzzing library for the JVM? FindBugs is mentioned in here, but I'm not sure that does fuzzing (maybe via a plugin?).
It seems to me to be a nice complement to achieving code coverage with testing, i.e. whereas unit/integration testing might exercise the various code paths with a few good/bad values, fuzzing throws a huge range of input values at them to see what breaks.
There is PIT [1]; it calls this mutation testing, but I guess it is roughly the same idea.
To put it another way - PIT runs your unit tests against automatically modified versions of your application code. When the application code changes, it should produce different results and cause the unit tests to fail. If a unit test does not fail in this situation, it may indicate an issue with the test suite.
Sorry for the bad English.
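As a toy illustration of the surviving-mutant idea described above (PIT itself works on JVM bytecode and your existing JUnit tests; this is only the concept in miniature, in C++ rather than Java):

    #include <cassert>

    // Original code under test.
    static bool is_adult(int age) { return age >= 18; }

    // The kind of mutant a tool like PIT generates automatically: ">=" flipped to ">".
    static bool is_adult_mutant(int age) { return age > 18; }

    // A weak "test suite": it passes for the original and the mutant alike,
    // so the mutant survives, hinting that the boundary case is untested.
    static bool weak_tests(bool (*f)(int)) { return f(30) && !f(5); }

    // Adding the boundary case kills the mutant: true for the original, false for the mutant.
    static bool strong_tests(bool (*f)(int)) { return weak_tests(f) && f(18); }

    int main() {
        assert(weak_tests(is_adult) && weak_tests(is_adult_mutant));      // mutant survives
        assert(strong_tests(is_adult) && !strong_tests(is_adult_mutant)); // mutant killed
        return 0;
    }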