Crater's an embarrassingly parallel problem though; it's only a matter of how much hardware you throw at it. Microsoft already donates the hardware used by Crater, and it would have no problem allocating 10x as much for its own purposes.
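To make the "embarrassingly parallel" part concrete, here's a minimal sketch of the shape of such a run--this is not Crater's actual architecture, and the package names and worker count are made up. Every package builds and tests independently of every other, so the only coordination needed is handing out work, and throughput scales with however many workers (or machines) you add:

```rust
use std::process::Command;
use std::sync::{Arc, Mutex};
use std::thread;

fn main() {
    // Hypothetical work list; a real run would pull the full package index.
    let packages: Vec<String> = vec!["pkg-a", "pkg-b", "pkg-c", "pkg-d"]
        .into_iter()
        .map(String::from)
        .collect();
    let queue = Arc::new(Mutex::new(packages));

    // One worker per core here; at scale these would be separate machines,
    // since no package's result depends on any other package's result.
    let workers: Vec<_> = (0..4)
        .map(|_| {
            let queue = Arc::clone(&queue);
            thread::spawn(move || loop {
                let pkg = match queue.lock().unwrap().pop() {
                    Some(p) => p,
                    None => break,
                };
                // Each unit of work: build and test one package in isolation.
                let status = Command::new("cargo")
                    .args(["test", "--manifest-path"])
                    .arg(format!("{pkg}/Cargo.toml"))
                    .status();
                let ok = status.map(|s| s.success()).unwrap_or(false);
                println!("{pkg}: {}", if ok { "pass" } else { "fail" });
            })
        })
        .collect();

    for w in workers {
        w.join().unwrap();
    }
}
```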
There are certainly more things written in C than in Rust--the advantage of being fifty years old--but the standardization of the build system in Rust means it would be difficult for any C compiler (or OS, or libc, etc.) to assemble a comparable corpus of C code to automatically test against (crates.io currently has 90,000 crates). But that's fine, because for the purpose of this thread that just means that Microsoft's theoretical Crater-like run for Windows compatibility just takes even less time and resources to run.
If you want to compile a large fraction of C/C++ code, just take a distro and rebuild it from scratch--Debian actually does this reasonably frequently. Every distro already has to solve the problem of figuring out how to compile and install everything it packages, though some are better than others at letting you change the build environment for testing. (From what I understand, Debian and Nix are the best bets here.)
But what that doesn't solve is making sure the resulting builds actually work. Cargo makes running some form of tests relatively easy for Rust, and Rust is new enough that virtually every published package contains at least some unit tests. But for a random open-source C package? Not so much. Pick a random numerics library--for something like a linear programming solver, this is the most comprehensive automated test suite I've seen: https://github.com/coin-or/Clp/tree/master/test
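For contrast with that C example, here's why cargo test works so uniformly across nearly all of crates.io: the test harness is built into the language and the build tool, so even a small crate tends to pick it up for free. A minimal, hypothetical crate might look like this (the function and test below are made up; the #[cfg(test)]/#[test] convention is the standard one):

```rust
// lib.rs of a hypothetical crate; `cargo test` discovers and runs the test
// module below with zero per-project configuration.
pub fn add(a: i32, b: i32) -> i32 {
    a + b
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn adds_small_numbers() {
        assert_eq!(add(2, 2), 4);
    }
}
```

A Crater-style run can treat the exit status of cargo test as a uniform pass/fail signal across the whole corpus; there's no equivalent uniform signal for arbitrary C projects.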
> But that's fine, because for the purpose of this thread that just means that Microsoft's theoretical Crater-like run for Windows compatibility just takes even less time and resources
Huh? I don't follow. There are more libs to test and they aren't standardized. How does that mean a theoretical Crater run would take fewer resources?
Did you mean excluding non-testable code? That wouldn't prevent a future glibc/EAC-style incompatibility.
The manual labor would be greater, yes, and that's a problem. But the original point of this thread was dismissing the idea of Crater at scale, and that dismissal is unwarranted: 1) it's an embarrassingly parallel problem, and 2) you're probably not going to have a testable corpus larger than crates.io anyway, so the hardware resources required aren't exorbitant for a company of Microsoft's means. Even if they could only cobble together 10,000 testable C apps, that's a big improvement over having zero.