I typed in the examples from the paper and tried them out with the Digital Mars C compiler. They all passed, as well as http://blog.regehr.org/archives/482. I hope the authors will try to break dmc with their test suite generator, and look forward to fixing any issues they find.
regehr@home:~/volatile/bugs/tmp007$ cparser --version
cparser (0.9.11) using libFirm (1.18)
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
regehr@home:~/volatile/bugs/tmp007$ cparser -c small.c
cparser: ast2firm.c:3197: conditional_to_firm: Assertion `get_irn_mode(false_val) == mode' failed.
Aborted
regehr@home:~/volatile/bugs/tmp007$ cat small.c
unsigned char foo (int ui1, unsigned char ui2)
{
return ui2 ? : ui1 + ui2;
}
The remarks about coverage are interesting, and lends support to something I was thinking about a while ago (maybe this is common knowledge):
If I'm writing test cases with the goal of increasing (line) coverage, I first look for large sections of code that aren't yet covered, then test the most obvious path to reach those lines (usually its some sort of special case or error handler). Sometimes this catches really simple mistakes, but generally those lines were written specifically to handle the the path I'm testing. The problem is that once I've tested that path I'm inclined to move on to un-covered lines, and don't test a lot of paths that exercise the same code, but that aren't necessarily obvious to me. Such paths, of course, tend to be more likely to contain bugs because they weren't as obvious to, or considered as critical by whoever originally wrote the code.
Look into Microsoft's CHESS proejct which analyzes the code to quickly converge on error cases as well as a system by which thread interleaving is predictable and thus testable. It actually tests every possible combination of thread interleaving through a code path.
As someone who works in generative algorithms for expressive/art/content purposes, the phrase "generating random programs that are expressive" is interestingly evocative. I wonder if there are any lessons from the past ~5 years of "fuzzing" algorithms for generative-art / generative-AI to mine.