The issue is more subtle than that. The GNU and glibc people believe that they provide a very high level of backwards compatibility. They don't have an aversion towards stability and in fact, go far beyond most libraries by e.g. providing old versions of symbols.
The issue here is actually that app compatibility is something that's hard to do purely via theory. The GNU guys do compatibility on a per function level by looking at a change, and saying "this is a technical ABI break so we will version a symbol". This is not what it takes to keep apps working. What it actually takes is what the commercial OS vendors do (or used to do): have large libraries of important apps that they drive through a mix of automated and manual testing to discover quickly when they broke something. And then if they broke important apps they roll the change back or find a workaround regardless of whether it's an incompatible change in theory or not, because it is in practice.
Linux is really hurt here by the total lack of any unit testing or UI scripting standards. It'd be very hard to mass test software on the scale needed to find regressions. And, the Linux/GNU world never had a commercial "customer is always right" culture on this topic. As can be seen from the threads, the typical response to being told an app broke is to blame the app developers, rather than fix the problem. Actual users don't count for much. It's probably inevitable in any system that isn't driven by a profit motive.
I think part of the problem is that by default you build against the newest version of symbols available on your system. So it's real easy when you're working with code to commit yourself to some symbols you may not even need; there's nothing like Microsoft's "target a specific version of the runtime".
I really, really miss such a feature with glibc. There are so many times when I just want to copy a simple binary from one system to another and it won't work simply because of symbol versioning and because the target has a slightly older glibc. Just using Ubuntu LTS on a server and the interim releases on a development machine is a huge PITA.
You actually can do that using inline assembly, it's just a very obscure trick. Many years ago I wrote a tool that generated a header file locking any code compiled with it to older symbol versions. It was called apbuild but it stopped being maintained some years after I stopped doing Linux stuff. I see from the comments below that someone else has made something similar, although apbuild was a comprehensive solution that wrapped gcc. It wasn't just a symbol versioning header, it did all sorts of tricks. Whatever was necessary to make things Just Work when compiling on a newer distro for an older distro.
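The general shape of the trick is a file-level `.symver` directive, which binds a call to a specific older version of a glibc symbol instead of the newest one on the build machine. A minimal sketch (not apbuild's actual generated header; the GLIBC_2.2.5 string is the x86-64 baseline and the right value differs per symbol and architecture):

```c
/*
 * Minimal sketch of the inline-assembly trick: pin memcpy to the old default
 * version rather than the newer memcpy@GLIBC_2.14 that exists on x86-64.
 * Look up available versions with: objdump -T /lib/x86_64-linux-gnu/libc.so.6
 *
 * Build with: gcc -fno-builtin-memcpy pin.c -o pin
 * (-fno-builtin-memcpy so gcc actually emits a call instead of inlining it)
 */
#include <stdio.h>
#include <string.h>

__asm__(".symver memcpy, memcpy@GLIBC_2.2.5");

int main(void) {
    char dst[16];
    memcpy(dst, "hello", 6);
    puts(dst);
    return 0;
}
```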
> What it actually takes is what the commercial OS vendors do (or used to do): have large libraries of important apps that they drive through a mix of automated and manual testing to discover quickly when they broke something.
There are already sophisticated binary analysis tools for detecting ABI breakages, not to mention extensive guidelines.
> And, the Linux/GNU world never had a commercial "customer is always right" culture on this topic.
Vendors like Red Hat are extremely attentive towards their customers. But if you're not paying, then you only deserve whatever attention they choose to give you.
> As can be seen from the threads, the typical response to being told an app broke is to blame the app developers, rather than fix the problem.
This is false. Actual problems get fixed, and very quickly at that.
Normally the issues are from proprietary applications that were buggy to begin with, and never bothered to read the documentation. I'd say to a paying customer that if a behaviour is documented, it's their problem.
> Normally the issues are from proprietary applications that were buggy to begin with, and never bothered to read the documentation. I'd say to a paying customer that if a behaviour is documented, it's their problem.
… But that's exactly why Win32 was great; Microsoft actually spent effort making their OS compatible with broken applications. Or at least, Microsoft of long past did; supposedly they worked around a use-after-free bug in SimCity for Windows 3.x when they shipped Windows 95. Windows still has infrastructure to apply application-specific hacks (the Application Compatibility Database).
I have no reason to believe their newer stacks have anything like this.
The issue I see most often is someone compiled the application on a slightly newer version of Linux and when they try to run it on a slightly older machine it barfs saying that it needs GLIBC_2.31 and the system libc only has up to GLIBC_2.28 or something like that. Even if you aren't using anything that changed in the newer versions it will refuse to run.
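You can see which versioned symbols a binary asks for with `objdump -T ./app | grep GLIBC_`, and for the other side of the comparison, a minimal sketch like this (using the glibc-specific gnu_get_libc_version() extension, so it only builds against glibc) prints what the older machine actually provides:

```c
/*
 * Print the glibc version of the machine that refuses to load the binary,
 * to compare against the GLIBC_x.y version the loader error complains about.
 * gnu_get_libc_version() is a glibc extension, not POSIX.
 */
#include <stdio.h>
#include <gnu/libc-version.h>

int main(void) {
    printf("this system provides glibc %s\n", gnu_get_libc_version());
    return 0;
}
```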
That's an ergonomic deficiency. In practice, you probably need more than glibc, so then you have to make sure that other bits are available in this chroot. And if it so happens that one of the build tools that you rely on needs a newer version of glibc than the one you're building against, it still breaks down.
On Windows, you specify the target version of the platform when building, and you get a binary that is guaranteed to load on that platform.
Do you build software on your desktop machine and ship it? Do you not build in a chroot (or container, as the cool kids call them nowadays) to make sure the build is actually using what you think it should be using?
You have to build in a chroot or similar in any case. Just use the CORRECT one.
> On Windows, you specify the target version of the platform when building, and you get a binary that is guaranteed to load on that platform.
Except if you need a C++ redistributable… then you must ship the redistributable .exe setup inside your setup, because Windows. Let's not pretend shipping software on Windows is easier.
Anyway, all of this only applies to proprietary software. For free software the distribution figures it all out, does automatic rebuilds if needed, and so on.
Really, just stick to free software, it's much easier.
You don't need chroot to make sure that your build uses correct versions of all dependencies; you just need a sane build system that doesn't grab system-wide headers and libs. Setting up chroot is way overkill for this purpose; it's not something that should even require root.
In the case of the Windows SDK, each version ships headers that are compatible going all the way back to Win9x, and you use #defines to select the subset that corresponds to the OS you're targeting.
With respect to the C runtime, Windows has been shipping its equivalent of glibc as a built-in OS component for 7 years now. And prior to that, you could always statically link.
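A rough sketch of that #define mechanism (0x0601, i.e. Windows 7, is just an example target; the values for other versions are listed in sdkddkver.h):

```c
/*
 * Defining _WIN32_WINNT/WINVER before including <windows.h> hides many
 * declarations that are guarded for newer OS releases, so accidentally
 * using an API the target platform lacks tends to fail at compile time
 * instead of at load time on the older machine.
 */
#define _WIN32_WINNT 0x0601
#define WINVER       0x0601
#include <windows.h>

int WINAPI WinMain(HINSTANCE hInstance, HINSTANCE hPrevInstance,
                   LPSTR lpCmdLine, int nShowCmd) {
    (void)hInstance; (void)hPrevInstance; (void)lpCmdLine; (void)nShowCmd;
    MessageBoxA(NULL, "Built against the Windows 7 API surface", "demo", MB_OK);
    return 0;
}
```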
Flatpak throws the baby out with the bathwater though. For example, on Ubuntu 22 if you install Firefox and don't have snaps enabled it installs the Flatpak, but if you do it can't access ffmpeg in the system so you can't play a lot of video files. It also fails to read your profile from Ubuntu 20, so you lose all of your settings, passwords, plugins, etc... It also wants to save files in some weirdass directory buried way deep in the /run filesystem. System integration also breaks, so if a Gnome app tries to open a link to a website the process silently fails.
> it can't access ffmpeg in the system so you can't play a lot of video files.
That's a typical packaging bug. Libraries should be bundled in the build or specified as a runtime dependency.
> It also fails to read your profile from Ubuntu 20, so you lose all of your settings, passwords, plugins, etc... It also wants to save files in some weirdass directory buried way deep in the /run filesystem. System integration also breaks, so if a Gnome app tries to open a link to a website the process silently fails.
All of this sucks and the user experience here needs a ton of work. I get the technical issues with co-mingling sandboxed and unsandboxed state, but there needs to be at least user-friendly options to migrate or share state with the sandbox.
A replacement for the xdg-open/xdg-mime/gio unholy trinity that offers runtime extension points for sandboxes might be nice. Maybe I could write a prototype service.
I think the issue with ffmpeg is that it needs to access GL driver libraries in order to support hardware acceleration. The GL libraries are dependent on your hardware and have to match what you have installed on the system. So you'd end up having to support a hundred different variations of Firefox Flatpaks, and users would have to make sure they match when installing, and remember to uninstall and reinstall when they update their graphics drivers.
Well, you talk about Windows, but that was true in the pre-Windows 8 era. Have you used Windows recently?
I bought a new laptop and decided to give Windows a second chance. With Windows 11 installed, there were a ton of things that didn't work. To me that was not acceptable on a $3000 laptop. Problems with drivers, blue screens of death, applications that just didn't run properly (and commonly used applications, not something obscure). I never had these problems with Linux.
I mean, we say Windows is stable mostly because we use Windows versions five years after they come out, once most of the problems have been fixed. Right now companies are finishing the transition to Windows 10, not Windows 11, after staying with Windows 7 for years. In ten years they will probably move to Windows 11, when most of its bugs have been fixed.
If you use a rolling-release Linux distro such as Arch Linux, some problems with new software are expected. It's the equivalent of using an insider build of Windows, with the difference that Arch Linux is mostly usable as a daily OS (it requires some knowledge to solve the problems that inevitably arise, but I used it for years). If you use, say, Ubuntu LTS, you don't have these kinds of problems, and it mostly runs without any issue (fewer issues than Windows, for sure).
By the way, maintaining compatibility has a cost: have you ever wondered why a full installation of Ubuntu, a complete system with all the programs that you use, an office suite, drivers for all the hardware, multimedia players, etc., is less than 5 GB, while a fresh install of Windows is at minimum 30 GB, and nowadays I think even more?
> And then if they broke important apps they roll the change back or find a workaround regardless of whether it's an incompatible change in theory or not, because it is in practice.
I never saw Microsoft do that: they will simply say that it's not compatible and the software vendor has to update. That is not a problem, by the way... OS developers should move along and can't maintain backward compatibility forever.
> The GNU and glibc people believe that they provide a very high level of backwards compatibility.
That is true. It's mostly backward compatible; 100% backward compatibility is not possible. Problems are fixed as they are detected.
> What it actually takes is what the commercial OS vendors do (or used to do): have large libraries of important apps that they drive through a mix of automated and manual testing to discover quickly when they broke something.
There is one issue: GNU can't test non-free software, for obvious licensing and policy reasons (an association that endorses free software can't buy licenses of proprietary software to test it). So a third party has to test it and report problems in case of broken backward compatibility.
Keep in mind that binary compatibility is not fundamental on Linux, since it's assumed that you have the source code of everything and can recompile the software if needed. GNU/Linux was born as a FOSS operating system and was never designed to run proprietary software. There are edge cases where you need to run a binary for other reasons (you lost the source code, or compiling it is complicated or takes a lot of time), but they are surely edge cases, and not a lot of time should be spent addressing them.
Besides that, glibc is only one of the possible libcs you can use on Linux: if you are developing proprietary software, in my opinion you should use musl libc. It has an MIT license (so you can statically link it into your proprietary binary) and it's 100% POSIX compliant. Surely glibc has more features, but your software probably doesn't use them.
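A minimal sketch of that approach, assuming the musl toolchain is installed (the musl-gcc wrapper ships with musl; a cross compiler works too):

```c
/*
 * Statically linking against musl produces a binary with no runtime libc
 * dependency at all, sidestepping GLIBC_x.y versioned symbols entirely.
 *
 * Build with: musl-gcc -static hello.c -o hello
 * (or a musl cross toolchain, e.g. x86_64-linux-musl-gcc)
 */
#include <stdio.h>

int main(void) {
    puts("statically linked against musl: no libc version dependency");
    return 0;
}
```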
Another viable option is to distribute your software with one of the new packaging formats that are in reality containers: Snap, Flatpak, AppImage. That allows you to distribute the software along with all its dependencies and not worry about ABI incompatibility.
I literally run Windows Insider builds on two of my laptops: the primary one is on the beta channel and the auxiliary laptop is on the alpha channel. Both are running Windows 11 and ran 10 before. The auxiliary one has lived on Insider for I think 5 years, if not 6, and has definitely had issues, like Intel wifi stopping working and some other minor ones, but the main one has had, I guess, 3-4 BSODs over 2 years and around 10 failures to wake up from sleep. That's pretty much all of the issues.
For me that's impressive, and I cannot complain about stability.
I believe that AppImage still has the glibc compatibility issue. The AppImage creation guides I've read suggest compiling on the oldest distro possible, since a binary built against an older glibc runs on newer ones, but not the other way around.
> Linux is really hurt here by the total lack of any unit testing or UI scripting standards.
> standards
I've been very impressed reading how the Rust developers handle this. They have a tool called crater[1], which runs regression tests for the compiler against all Rust code ever released on crates.io or GitHub. Every front-facing change that is even slightly risky must pass a crater run.
Surely Microsoft has internal tools for Windows that do the same thing: run a battery of tests across popular apps and make sure changes in the OS don't break any user apps.
Where's the similar test harness for Linux you can run that tests hundreds of popular apps across Wayland/X11 and Gnome/KDE/XFCE and makes sure everything still works?
> Surely Microsoft has internal tools for Windows that do the same thing: run a battery of tests across popular apps and make sure changes in the OS don't break any user apps.
And hardware: they actually deploy to hardware they buy locally from retailers to verify things still work, too, last I checked. Because there is always that "one popular laptop" that has stupid quirks. I know they try to focus on a spectrum of commonly used models based on the telemetry too.
And crater costs a bunch, runs for a week, and it's not a guarantee things won't break. I'm not sure it runs every crate or just the top 1 million; it used to, but I could see that changing.
And in the case of closed-source software that isn't publicly available, crater wouldn't work.
Crater's an embarrassingly parallel problem though, it's only a matter of how much hardware you throw at it. Microsoft already donates the hardware used by Crater, it would have no problem allocating 10x as much for its own purposes.
There are certainly more things written in C than in Rust--the advantage of being fifty years old--but the standardization of the build system in Rust means that it would be difficult for any C compiler (or OS, or libc, or etc.) to produce a comparable corpus of C code to automatically test against (crates.io currently has 90,000 crates). But that's fine, because for the purpose of this thread that just means that Microsoft's theoretical Crater-like run for Windows compatibility just takes even less time and resources to run.
If you want to compile a large fraction of C/C++ code, just take a distro and rebuild it from scratch--Debian actually does this reasonably frequently. All of the distros have to somehow solve the problem of figuring out how to compile and install everything they package, although some are better at letting you change the build environment for testing than others. (From what I understand, Debian and Nix are the best bets here.)
But what that doesn't solve is making sure that the resulting builds actually work. Cargo, for Rust, makes running some form of tests relatively easy, and Rust is new enough that virtually every published package is going to contain some amount of unit tests. But for random open-source packages? Not really. Pick a random numerics library--for something like a linear programming solver, this is the most comprehensive automated test suite I've seen: https://github.com/coin-or/Clp/tree/master/test
> But that's fine, because for the purpose of this thread that just means that Microsoft's theoretical Crater-like run for Windows compatibility just takes even less time and resources
Huh? I don't follow. There are more libs to test and they aren't standardized. How does that mean theoretical Crater will take less resources?
Did you mean excluding non-testable code? That doesn't prevent future glibc-EAC incompatibility.
The manual labor would be greater, yes, and that's a problem. But the original point of this thread was about dismissing the idea of Crater at scale, which is unnecessary 1) because it's an embarrassingly parallel problem, and 2) because you're probably not going to have a testable corpus larger than crates.io anyway, so the hardware resources required are not exorbitant for a company of Microsoft's means. Even if they could only cobble together 10,000 C apps to test, that's a big improvement over having zero.
This still has nothing to do with Linux. Unit testing isn't standardized in most languages. Even in Rust people have custom frameworks!
The Linux Kernel does have such a project doing batteries of tests. Userspace may not, but that's not a "unit test" problem. In fact it's the opposite, it's integration tests.
Right, but Linux (the OS) doesn't have unit tests to ensure that changes to the underlying system don't break the software on top. Imagine if MS released a new version of Windows and tons of applications stopped functioning. Everyone would blame MS. The Linux community does it all the time and just says that it's the price of progress.
I think the problem is that there isn't really a thing like "Linux the OS"; there's Debian, Ubuntu, Gentoo, Red Hat, and more than I can remember, and they all do things differently: sometimes subtly so, sometimes not so subtly. This is quite different from the Windows situation, where you have one Windows (multiple editions, but still one Windows) and that's it.
This is why a lot of games now just say "tested on Ubuntu XX LTS" and call it a day. I believe Steam just ships with half an Ubuntu system for their Linux games and uses that, even if you're running on Arch Linux or whatnot.
This has long been both a strong and weak point of the Linux ecosystem. On one hand, you can say "I don't want no stinkin' systemd, GNU libc, and Xorg!" and go with runit, musl, and Wayland if you want and most things still work (well, mostly anyway), but on the other hand you run in to all sort of cases where it works and then doesn't, or works on one Linux distro and not the other, etc.
I don't think there's a clean solution to any of these issues. Compatibility is one of the hard problems in computing, because there is no solution that will satisfy everyone and there are multiple reasonable positions, all with their own trade-offs.
So, I very much agree with mike_hearn, their description of how glibc is backwards compatible in theory due to symbol versioning matches my understanding of how glibc works, and their lack of care to test if glibc stays backwards compatible in practice seems evident. They certainly don't seem to do automated UI tests against a suite of representative precompiled binaries to ensure compatibility.
However, I don't understand where unit testing comes in. Testing that whole applications keep working with new glibc versions sounds a lot like integration testing. What's the "unit" that's being tested when ensuring that the software on top of glibc doesn't break?
I've used Linux since the Slackware days. I also spent years working on Wine, including professionally at CodeWeavers. My name can still be found all over the source code:
Some of the things I worked on were the times when the kernel made ABI changes that broke Wine, like here, where I work with Linus to resolve a breakage introduced by an ABI incompatible change to the ptrace syscall:
I also did lots of work on cross-distribution binary compatibility for Linux apps, for example by developing the apbuild tool which made it easy to "cross compile" Linux binaries in ways that significantly increased their binary portability by controlling glibc symbol versions and linker flags:
So I think I know more than my fair share about the guts of how Win32 and Linux work, especially around compatibility. Now, if you had finished reading to the end of the sentence you'd see that I said:
"Linux is really hurt here by the total lack of any unit testing or UI scripting standards"
... unit testing or UI scripting standards. Of course Linux apps often have unit tests. But to drive real world apps through a standard set of user interactions, you really need UI level tests and tools that make UI scripting easy. Windows has tons of these like AutoHotKey, but there is (or was, it's been some years since I looked) a lack of this sort of thing for Linux due to the proliferation of toolkits. Some support accessibility APIs but others are custom and don't.
It's not the biggest problem. The cultural issues are more important. My point is that the reason Win32 is so stable is that for the longest time Microsoft took the perspective that it wouldn't blame app developers for changes in the OS, even when theoretically it could. They also built huge libraries of apps they'd purchased and used armies of manual testers (+automated tests) to ensure those apps still seemed to work on new OS versions. The Wine developers took a similar perspective: they wouldn't refuse to run an app that does buggy or unreasonable things, because the goal is to run all Windows software and not try to teach developers lessons or make beautiful code.
> But to drive real world apps through a standard set of user interactions, you really need UI level tests and tools that make UI scripting easy. Windows has tons of these like AutoHotKey, but there is (or was, it's been some years since I looked) a lack of this sort of thing for Linux due to the proliferation of toolkits.
This made me remember a tool that was quite popular in the Red Hat/GNOME community in 2006-2007 or so: