The problem is that if you combine all the various flags that affect the compiler, across all the architectures, across all the platforms, in all its variants (cross compiler, native, the many libc and barebones variants), you're looking at too many tests to run, no matter how huge an infrastructure you have to run them on.
Another problem is that optimization depends a lot on context. Given the amount (basically infinite) of C code that could surround any other piece of C code and affect the result, it's quite a hard task.
> The problem is that if you combine all the various flags that affect the compiler, across all the architectures, across all the platforms, in all its variants (cross compiler, native, the many libc and barebones variants), you're looking at too many tests to run, no matter how huge an infrastructure you have to run them on.
As someone who at one point maintained such a compiler test system, I'll say that it isn't possible to cover all combinations, but you can hit a reasonable percentage of them.
A good compiler test run ends up going through millions of tests. It isn't for the faint of heart, but it is perfectly doable.
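To put rough numbers on that trade-off, here is a minimal sketch in Python. The axes and their values are made up for illustration (they are not any real compiler's test matrix), and `total_configs` and `sample_configs` are hypothetical helper names. It counts the full configuration space and then draws a random subset, which is one simple way to cover a slice of the combinations when the full product is out of reach.

```python
import math
import random

# Hypothetical configuration axes; values are illustrative assumptions only.
AXES = {
    "arch": ["x86_64", "aarch64", "riscv64", "armv7"],
    "opt": ["-O0", "-O1", "-O2", "-O3", "-Os"],
    "libc": ["glibc", "musl", "newlib", "none"],
    "mode": ["native", "cross"],
}
# Pretend there are 20 independent on/off code generation flags.
for i in range(20):
    AXES[f"flag_{i}"] = [False, True]


def total_configs(axes):
    """Size of the full cross product of all axes."""
    return math.prod(len(values) for values in axes.values())


def sample_configs(axes, k, seed=0):
    """Draw k random configurations (with replacement, so repeats are possible)."""
    rng = random.Random(seed)
    return [{name: rng.choice(values) for name, values in axes.items()}
            for _ in range(k)]


if __name__ == "__main__":
    print("full matrix:", total_configs(AXES))
    for config in sample_configs(AXES, 3):
        print(config["arch"], config["opt"], config["libc"], config["mode"])
```

Even with these small made-up axes the full matrix is 160 * 2^20 configurations, and that is before multiplying by the number of test programs. Uniform random sampling is just the simplest choice here; a real test system would presumably weight the configurations people actually ship.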
One interesting approach is Csmith, http://embed.cs.utah.edu/csmith/, which generates random C programs and uses them to look for bugs.
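The usual way to use randomly generated programs is differential testing: build the same program under several configurations and flag any disagreement in its output. Below is a rough Python sketch of that idea, not Csmith's own harness. It assumes a C compiler is available on the PATH as `cc` and accepts the usual `-O0` to `-O3` flags, and that you already have a generated source file; the function names are hypothetical.

```python
import subprocess
import tempfile
from collections import defaultdict
from pathlib import Path


def run_at_opt_level(source, opt, compiler="cc", timeout=10):
    """Compile `source` at one optimization level and return the program's stdout.

    Assumes `compiler` is on the PATH and accepts `opt` as a flag.
    """
    with tempfile.TemporaryDirectory() as tmp:
        exe = Path(tmp) / "prog"
        subprocess.run([compiler, opt, str(source), "-o", str(exe)],
                       check=True, capture_output=True)
        # Random programs may not terminate, so always run with a timeout.
        result = subprocess.run([str(exe)], capture_output=True,
                                text=True, timeout=timeout)
        return result.stdout


def group_by_output(outputs):
    """Group configuration names by the output they produced."""
    groups = defaultdict(list)
    for config, out in outputs.items():
        groups[out].append(config)
    return dict(groups)


def differential_test(source, opt_levels=("-O0", "-O1", "-O2", "-O3"),
                      compiler="cc"):
    """Return the output groups if the builds disagree, otherwise None."""
    outputs = {opt: run_at_opt_level(source, opt, compiler)
               for opt in opt_levels}
    groups = group_by_output(outputs)
    return groups if len(groups) > 1 else None
```

A disagreement only points at the compiler if the test program has no undefined or unspecified behavior, which is why the generator has to be careful about what it emits. Calling `differential_test("random.c")` on a generated file returns `None` when every optimization level agrees.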