Crater's an embarrassingly parallel problem though; it's only a matter of how much hardware you throw at it. Microsoft already donates the hardware used by Crater, and it would have no problem allocating 10x as much for its own purposes.
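To make the "embarrassingly parallel" part concrete, here's a minimal sketch of the shape of such a run--this is not Crater's actual architecture, and the package names and worker count are made up. Every package builds and tests independently of every other, so the only coordination needed is handing out work, and throughput scales with however many workers (or machines) you add:

```rust
use std::process::Command;
use std::sync::{Arc, Mutex};
use std::thread;

fn main() {
    // Hypothetical work list; a real run would pull the full package index.
    let packages: Vec<String> = vec!["pkg-a", "pkg-b", "pkg-c", "pkg-d"]
        .into_iter()
        .map(String::from)
        .collect();
    let queue = Arc::new(Mutex::new(packages));

    // One worker per core here; at scale these would be separate machines,
    // since no package's result depends on any other package's result.
    let workers: Vec<_> = (0..4)
        .map(|_| {
            let queue = Arc::clone(&queue);
            thread::spawn(move || loop {
                let pkg = match queue.lock().unwrap().pop() {
                    Some(p) => p,
                    None => break,
                };
                // Each unit of work: build and test one package in isolation.
                let status = Command::new("cargo")
                    .args(["test", "--manifest-path"])
                    .arg(format!("{pkg}/Cargo.toml"))
                    .status();
                let ok = status.map(|s| s.success()).unwrap_or(false);
                println!("{pkg}: {}", if ok { "pass" } else { "fail" });
            })
        })
        .collect();

    for w in workers {
        w.join().unwrap();
    }
}
```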
There are certainly more things written in C than in Rust--the advantage of being fifty years old--but the standardization of the build system in Rust means it would be difficult for any C compiler (or OS, or libc, etc.) to assemble a comparable corpus of C code to automatically test against (crates.io currently has 90,000 crates). But that's fine, because for the purpose of this thread that just means that Microsoft's theoretical Crater-like run for Windows compatibility just takes even less time and resources to run.
If you want to compile a large fraction of C/C++ code, just take a distro and rebuild it from scratch--Debian actually does this reasonably frequently. Every distro already has to solve the problem of figuring out how to compile and install everything it packages, though some are better than others at letting you change the build environment for testing. (From what I understand, Debian and Nix are the best bets here.)
But what that doesn't solve is making sure the resulting builds actually work. Cargo makes running some form of tests relatively easy for Rust, and Rust is new enough that virtually every published package contains at least some unit tests. But for a random open-source C package? Not so much. Pick a random numerics library--for something like a linear programming solver, this is the most comprehensive automated test suite I've seen: https://github.com/coin-or/Clp/tree/master/test
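For contrast with that C example, here's why cargo test works so uniformly across nearly all of crates.io: the test harness is built into the language and the build tool, so even a small crate tends to pick it up for free. A minimal, hypothetical crate might look like this (the function and test below are made up; the #[cfg(test)]/#[test] convention is the standard one):

```rust
// lib.rs of a hypothetical crate; `cargo test` discovers and runs the test
// module below with zero per-project configuration.
pub fn add(a: i32, b: i32) -> i32 {
    a + b
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn adds_small_numbers() {
        assert_eq!(add(2, 2), 4);
    }
}
```

A Crater-style run can treat the exit status of cargo test as a uniform pass/fail signal across the whole corpus; there's no equivalent uniform signal for arbitrary C projects.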
> But that's fine, because for the purpose of this thread that just means that Microsoft's theoretical Crater-like run for Windows compatibility just takes even less time and resources
Huh? I don't follow. There are more libs to test and they aren't standardized. How does that mean a theoretical Crater run would take fewer resources?
Did you mean excluding non-testable code? That wouldn't prevent a future glibc/EAC-style incompatibility.
The manual labor would be greater, yes, and that's a problem. But the original point of this thread was dismissing the idea of Crater at scale, and that dismissal is unwarranted: 1) it's an embarrassingly parallel problem, and 2) you're probably not going to have a testable corpus larger than crates.io anyway, so the hardware resources required aren't exorbitant for a company of Microsoft's means. Even if they could only cobble together 10,000 testable C apps, that's a big improvement over having zero.