As Linus points out, the utility of shared libraries depends on the situation. Widely used libraries like libxcb, libGL, and zlib that tend to have reasonably stable ABIs are good candidates for sharing, while the "couple of libraries that nobody else used. Literally nobody" should obviously be statically linked. Software engineering is mostly about learning how to choose which method is appropriate for the current project.
However, the important advantage of using .so files instead of building into a big statically linked binary is NOT that a library can be shared between different programs! The important benefit is dynamic linking. The ability to change which library is being used without needing to rebuild the main program is really important. Maybe rebuilding the program isn't practical. Maybe rebuilding the program isn't possible because it's closed/proprietary or the source no longer exists. If the program is statically linked, then nothing can happen - either it works or it doesn't. With dynamic linking, changes are possible by changing which library is loaded. I needed to use this ability last week to work around a problem in a closed, proprietary program that is no longer supported. While the workaround was an ugly LD_LIBRARY_PATH hack, at least it got the program to run.
I don't think that being able to replace a library at runtime is a useful enough feature to justify the high maintenance cost of shared libraries. Like I complain about in a comment below, the cost of shared libraries is that an upgrade is all-or-nothing. If one program on your system depends on a quirk of libfoobar-1.0.3, and another program on your system depends on a quirk of libfoobar-1.0.4, you're fucked. You can't have both programs on your system, and Linux distributors will simply stop updating one of them until the author magically fixes it, which happens approximately never. And, the one they stop updating is the one that you want to update to get a new feature, 100% of the time.
It's just not worth it. A binary is a closure over the entire source text of the program and its libraries. That's what CI tested, that's what the authors developed against, and that's what you should run. Randomly changing stuff because you think it's cool is just going to introduce bugs that nobody else on Earth can reproduce. Nobody does it because it absolutely never works. You hear about LD_PRELOAD, write some different implementation of printf that tacks on "OH COOL WOW" to every statement, and then never touch it again. Finally, dynamic loading isn't even the right surface for messing with the behavior of existing binaries. It has the notable limitation of not being able to change the behavior of the program itself; you can only change the results of library calls.
I never want to see a dynamic library again. They have made people's lives miserable for decades.
>Nobody does it because it absolutely never works.
As an intern at a data science company, I noticed that a certain closed-source shared library we were using was calling the same trigonometric functions with the same arguments over and again. So, I tried to use LD_PRELOAD to "monkey-patch" libm with caching and that reduced the running time of the data pipeline by 90%.
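(Not the code I actually used, just a minimal sketch of the technique for anyone curious; the interposed function, the one-entry cache, and the build commands are illustrative, and a real shim would want a bigger, thread-safe cache.)

    /* cachedsin.c -- sketch of an LD_PRELOAD shim that memoizes sin().
     * Build:  gcc -shared -fPIC -o libcachedsin.so cachedsin.c -ldl
     * Run:    LD_PRELOAD=$PWD/libcachedsin.so ./pipeline
     */
    #define _GNU_SOURCE
    #include <dlfcn.h>

    static double (*real_sin)(double);

    double sin(double x)
    {
        static double last_x, last_y;
        static int have_last;

        if (have_last && x == last_x)
            return last_y;                        /* repeat argument: skip libm */

        if (!real_sin)
            real_sin = (double (*)(double))dlsym(RTLD_NEXT, "sin");

        last_x = x;
        last_y = real_sin(x);
        have_last = 1;
        return last_y;
    }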
>I never want to see a dynamic library again.
And replace it with what? Completely statically linked libraries? How much disk space would linux userland require if every binary was to be statically linked?
I like the anecdote, but I don't see what argument it offers.
Usually, the core maintainers are the ones with the strongest context on the project, and thus would have the best knowledge of where optimizations belong.
Telling maintainers to dynamically link just in case someone wants to change the libraries is like having replaceable CPUs on motherboards. Useful for 0.1% of users, but requiring a lot of extra work to support.
I was just giving a real-world example where dynamic patching via LD_PRELOAD was quite useful, not offering it as an argument for maintainers to use dynamically linked libraries.
Of course, there are many independent arguments for using dynamically linked libraries, which is why most linux distributions use them.
> As an intern at a data science company, I noticed that a certain closed-source shared library we were using was calling the same trigonometric functions with the same arguments over and again. So, I tried to use LD_PRELOAD to "monkey-patch" libm with caching and that reduced the running time of the data pipeline by 90%.
This is a very edgy edge case, though.
> How much disk space would linux userland require if every binary was to be statically linked?
Megabytes more?
I have never been on a project or a machine since the 70s where executables took a majority of the disk space. On this disk, about 15% of it is executables, and I have a lot of crap in my executable directory and just deleted a lot of junk data.
Also, no one's seriously proposing that core libraries that everyone uses be statically linked.
Megabytes more in total, or megabytes more per binary?
> On this disk, about 15% of it is executables, and I have a lot of crap in my executable directory and just deleted a lot of junk data.
Is there any study on how this might change if we moved from dlls to statically linking everything?
> no one's seriously proposing that core libraries that everyone uses be statically linked
If we agree that shared libraries are fine for the common case, and there are n number of existing solutions for distributing applications with tricky dependencies, what are we even talking about in this thread? :\
No, into common block maps. Filesystem compression and deduplication work great at a level below libraries. I use them with ZFS and btrfs; works wonders.
That’s a dependency problem, which is orthogonal. A model like Nix can make this work splendidly. The programs that share an exact version of a shared lib do share it, but you are free to use a custom dynamic lib for only one package. In a way, Nix makes your notion of “closure” the way of life everywhere (not just binaries, but scripts as well, referring to interpreters and other programs by explicit hash of derivation).
> I don't think that being able to replace a library at runtime is a useful enough feature to justify the high maintenance cost of shared libraries.
We’re moving to a world where every program is containerized or sandboxed. There is no more real “sharing” of libraries, everything gets “statically linked” in a Docker image anyway.
If I do an `apt-get install` of the same packages in different containers, with anything different at all before it in the Dockerfiles, there's no way to share a common layer that includes the package contents between those images.
You could volume mount /usr and chunks of /var to help, but I'm not sure that would be sane.
The flip side is that it is easy to track security vulnerabilities at the level of a shared library - once the distribution pushes an update, all the dependent applications are fine.
Imagine you have 10 programs statically linked against the same library. How many will be promptly updated? How do you, as a system admin, track which ones were not?
You solve two unrelated problems (tracking library dependencies and security patching) with one complex interdependent solution and then you pretend no other solution can possibly exist. This is a trap.
In production you very often end up maintaining stuff that isn't from the distribution; you have to track and rebuild that anyway.
At least when the distro updates its shared libraries it fixes it for all users, imagine how many rebuilds + time + effort it would take for every docker container that had the same shitty small bug.
> You solve two unrelated problems (tracking library dependencies and security patching) with one complex interdependent solution (...)
Linking to a library is not what I would describe as "complex", or have any meaningful challenge regarding "interdependence".
This issue makes no sense on Windows, where you are free to drop any DLL into the project folder, and for different reasons it also makes no sense on UNIX-like distributions, which already provide officially maintained packages.
I'm sure that there are plenty of problems with distributing packages reliably, but dynamic libs ain't one of them.
It seems to me like the better way to solve this would be for distributions to publish which versions of dependencies each package uses, and provide an audit tool that can analyze this list and notify of any vulnerabilities.
>If one program on your system depends on a quirk of libfoobar-1.0.3, and another program on your system depends on a quirk of libfoobar-1.0.4, you're fucked.
Why? You can have both, e.g. /lib/libfoobar-1.0.4.so and /lib/libfoobar-1.0.3.so
Usually you would have /lib/libfoobar-1.so -> /lib/libfoobar-1.0.4.so, but it doesn't prevent you from linking the problematic program directly with libfoobar-1.0.3
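A rough sketch of the mechanics for anyone who hasn't set this up by hand (the library names and stand-in code are made up, the -l: trick assumes GNU ld, and pinning like this only works cleanly when the old copy keeps its own SONAME):

    /* foobar.c -- stand-in library code, just so there is something to build. */
    int foobar_frobnicate(int x) { return x + 1; }

    /* Current version: programs record the SONAME, not the full file name.
     *   gcc -shared -fPIC -Wl,-soname,libfoobar.so.1 -o libfoobar.so.1.0.4 foobar.c
     *   ln -s libfoobar.so.1.0.4 libfoobar.so.1   # what the loader resolves at run time
     *   ln -s libfoobar.so.1     libfoobar.so     # what -lfoobar resolves at link time
     *
     * Older, incompatible version kept around under its own SONAME (a "compat" copy):
     *   gcc -shared -fPIC -Wl,-soname,libfoobar.so.1.0.3 -o libfoobar.so.1.0.3 old_foobar.c
     *
     * Linking the two programs:
     *   gcc main.c -o normal_prog -L. -lfoobar                # DT_NEEDED: libfoobar.so.1
     *   gcc main.c -o quirky_prog -L. -l:libfoobar.so.1.0.3   # DT_NEEDED: libfoobar.so.1.0.3
     */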
> but it doesn't prevent you from linking the problematic program directly with libfoobar-1.0.3
I feel that patchelf[0] is an ingenious little tool for exactly that purpose that isn't getting enough love. Not as useful for FOSS stuff, but it's been really useful for the times I had to relink proprietary things on our cluster in order to patch a security vulnerability or two.
That's incidental and not by principle, isn't it? You should be able to have multiple versions of a library installed, and the dynamic linker would pick the appropriate version.
Heck you could even have multiple versions in the same process (if you depend on A and B, and B uses an older version of A internally). Unfortunately I think the current dynamic linker on Linux has a shared namespace for symbols.
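(That said, glibc does ship an escape hatch for exactly this: dlmopen(3) loads a library into a fresh link-map namespace, so two versions can coexist in one process. A minimal sketch, with made-up library names and a made-up foobar_version() symbol; dlmopen has known rough edges and is rarely used in practice.)

    /* two_versions.c -- load two versions of the same library side by side.
     * Build:  gcc two_versions.c -o two_versions -ldl
     */
    #define _GNU_SOURCE
    #include <dlfcn.h>
    #include <stdio.h>

    int main(void)
    {
        /* Each LM_ID_NEWLM call creates a separate symbol namespace. */
        void *older = dlmopen(LM_ID_NEWLM, "libfoobar.so.1.0.3", RTLD_NOW);
        void *newer = dlmopen(LM_ID_NEWLM, "libfoobar.so.1.0.4", RTLD_NOW);
        if (!older || !newer) {
            fprintf(stderr, "dlmopen: %s\n", dlerror());
            return 1;
        }

        /* Same symbol name, two independent definitions. */
        const char *(*v_old)(void) = (const char *(*)(void))dlsym(older, "foobar_version");
        const char *(*v_new)(void) = (const char *(*)(void))dlsym(newer, "foobar_version");
        if (!v_old || !v_new) {
            fprintf(stderr, "dlsym: %s\n", dlerror());
            return 1;
        }
        printf("%s vs %s\n", v_old(), v_new());
        return 0;
    }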
I also disagree that it is not useful to be able to update libraries separately. The example that I like to give is a PyGtk app written against an earlier Gtk2. Without modifications, it ran against the very last Gtk2, but now with support for the newest themes, draggable menubars, better IME, window size grippers, and all kinds of other improvements to Gtk.
In the days of ActiveX (and I realize this is a bad example because in reality you often had breakage), you could update a control and suddenly your app would have new functionality (like being able to read newer file format versions or more options to sort your grids and so on).
>If one program on your system depends on a quirk of libfoobar-1.0.3, and another program on your system depends on a quirk of libfoobar-1.0.4, you're fucked. You can't have both programs on your system, and Linux distributors will simply stop updating one of them until the author magically fixes it, which happens approximately never.
I've not experienced that. If the package is maintained at all, then it will get updated. People are still somehow maintaining Perl 5 libraries and putting out bugfix releases, which are functionally equivalent to dynamic linking.
If it's not maintained, then of course it will get removed from the distribution, for that and for a number of other reasons.
> People are still somehow maintaining Perl 5 libraries and putting out bugfix releases, which are functionally equivalent to dynamic linking.
Not to be pedantic, but having multiple versions of any Perl library, or multiple versions of Perl, on a single server without things getting stepped on is trivially easy. There are utils to switch Perl versions as well. I mean, it's Perl.
Perl has a long, long, long history and culture of testing, backwards compatibility, Kwalitee Assurance, and porting to every system imaginable behind it as well, so if some Perl script from the 90's still runs with little to no modification, that should not exactly be seen as a rarity.
I don't mean modules installed from cpan, this is specifically for things like Perl and Python modules packaged in Debian. On purpose, there is usually no way to install more than one version of those with the system package manager. Maintainers seem to be doing a fine job of keeping them up-to-date. One of the major reasons to do this is so packages that depend on those libraries actually DO get tested with updates and don't get neglected and stuck on some old buggy version.
There are a lot of reasons to prefer static linking, but to me the argument of "it means you don't need to keep your package up to date with bug and security fixes" never held up, from a distro point of view anyway.
If libfoobar-1.0.4 is not a drop-in upgrade for libfoobar-1.0.3 then it should have a different SONAME, which would allow distributions to package both of them.
But a breaking change probably doesn't impact all programs that depend on it, so that approach still ends up with many installed versions that could probably have been shared.
If you have gifted your code under a FOSS license, then people are going to run it in ways that you don't like.
The distributions will take all front-line bug reports and the maintainer will forward them to upstream only if applicable. Or at least, that's how Debian operates.
> Like I complain about in a comment below, the cost of shared libraries is that an upgrade is all-or-nothing.
It really isn't. At all. For example, let's take a notoriously egregious example of C++, which doesn't really have a stable ABI, and a major framework like Qt, which provides pretty much the whole platform. Since the introduction of Qt5, you can pretty much upgrade even between LTS releases without risking any problem: replace libs built with the same compiler tool chain, and you're done
This is not an isolated case. Shared libraries and semantic versioning are not a major technical hurdle, and in practice they only have upsides.
> you can pretty much upgrade even between LTS releases without risking any problem: replace libs built with the same compiler tool chain, and you're done
as someone who develops Qt apps for a living, this is not true at all. Yes, Qt promises no ABI / API break, sure. But it doesn't matter at all when a new release introduces a bug that breaks my app. It's super common to have to pin software to specific Qt versions if one wants to minimize bugs and, more importantly, get a precise behaviour for all the users of the software. For instance, the way hidpi has been implemented has changed something like 3 times across the Qt5 lifetime, and there are still things to fix; this means that on a hidpi screen the software won't look the same when linked against Qt 5.6 / 5.10 / 5.15, which is pretty bad, since assets may have been made specifically to fit the "Qt 5.6" look and may appear broken / incorrectly positioned / etc. on later versions (and conversely).
> But it doesn't matter at all when a new release introduces a bug that breaks my app.
I fail to see the relevance of your case. Semantic versioning and shared libraries is a solution to the problem of shipping bugfixes by updating a single component that's deployed system-wide. Semantic versioning and shared libraries do not address the problem of a developer shipping buggy code. Shared libraries ensure that you are able to fix the bug by updating a single library, even without having to wait for the package or distro maintainers to notice the bug exists.
> It's super common to have to pin software to specific Qt versions if one wants to minimize bugs and, more importantly, get a precise behaviour for all the users of the software. For instance, the way hidpi has been implemented has changed something like 3 times across the Qt5 lifetime, and there are still things to fix;
I've worked on a long-lived project based on a Qt application that we supported on all platforms, and we've upgraded seamlessly from Qt 5.4 up to 5.12 without touching a single line of code.
In fact, the only discussions we had to have regarding rebuilding the app was due to a forced update to Visual Studio, which affected only Windows.
YMMV of course, especially if no attention is paid to the contract specified by the library/framework, but I disagree that your personal anecdote is representative of the reality of using shared libraries, even in languages which are notoriously challenging to work with such as C++.
> Shared libraries ensure that you are able to fix the bug by updating a single library,
but that assumes that the only thing that changes between library_v1.0.0.so and library_v1.0.1.so is the bug. There are always tons of regressions and unrelated changes between minor Qt versions - hell, I even had patch versions break, for instance I was hit by this which was introduced in 5.11.1: https://bugreports.qt.io/browse/QTBUG-70672 or this which worked before 5.9 and failed afterwards (for good reasons) : https://bugreports.qt.io/browse/QTBUG-60804 or this in 5.4: https://bugreports.qt.io/browse/QTBUG-44316
> and we've upgraded seamlessly from Qt 5.4 up to 5.12 without touching a single line of code.
and that worked on mac, windows, linux with X11 & wayland, in low-dpi and high-dpi alike ? I don't see how if you care about your app's looks (e.g. your UI designers send you precise positioning & scale of every widget, text, image, etc).
hell, if you have a QtQuick app today, resizing its window has been broken since Qt 5.2 (since then, layouting happens asynchronously which gives the impression the app wobbles - and that is the case for every Qt Quick app under Linux, even the simplest hello world examples ; this worked much better in 5.0 / 5.1, see e.g. https://bugreports.qt.io/browse/QTBUG-46074 ).
>If one program on your system depends on a quirk of libfoobar-1.0.3, and another program on your system depends on a quirk of libfoobar-1.0.4, you're fucked. You can't have both programs on your system, and Linux distributors will simply stop updating one of them until the author magically fixes it, which happens approximately never.
It's not true at all. Libfoobar-1.0.4 is backward compatible with libfoobar-1.0.3, so both programs will work. If one of the programs or the lib has a bug, then it's easier to just fix the bug than to add burden on maintainers.
Moreover, both libfoobar.so.1.0.3 and libfoobar.so.1.0.4 can be installed at the same time, e.g. libfoobar.so.1.0.4 can be packaged as libfoobar-1.0.4, while libfoobar.so.1.0.3 can be packaged as compat-libfoobar-1.0.3.
If you have more questions, try to find answers in packaging guidelines for your distribution, or ask maintainers for help with packaging of your app.
It seems to me that several circumstances have changed since the idea of dynamic linking first came about:
- A whole lot of software is now open and freely available
- The code-changes-to-binary-updates pipeline has been greased by a couple orders of magnitude, from internet availability to CI to user expectations around automatic updates (for both commercial and non-commercial software)
- Disk space, and arguably even RAM, have become very cheap by comparison
Given these, it seems worthwhile to re-evaluate the tradeoffs that come with dynamic linking
> Given these, it seems worthwhile to re-evaluate the tradeoffs that come with dynamic linking
Perhaps this is just a Linux distro thing, but as someone who closely monitors the Void packages repo, dynamic linking burdens distro maintainers with hundreds of additional packages which need to be individually tracked, tested and updated.
(Packages which could otherwise be vendored by upstream and statically linked during the build.)
Dynamic linking also adds complexity to distro upgrades, because dependent packages often need to be rebuilt when the libraries they dynamically link to are changed or upgraded. For example, Void’s recent switch from LibreSSL to OpenSSL borked my install script, which wasn’t aware of the change. It resulted in a built system whose XBPS package manager couldn’t verify SSL certs. On Arch, AUR packages were notoriously prone to dynamic linking errors (back when I still used it).
Personally, I don’t find the bandwidth efficiency and CVE arguments for dynamic linking to be all that convincing [1]:
Will security vulnerabilities in libraries that have been statically
linked cause large or unmanagable updates?
Findings: not really
Not including libc, the only libraries which had "critical" or "high"
severity vulnerabilities in 2019 which affected over 100 binaries on
my system were dbus, gnutls, cairo, libssh2, and curl. 265 binaries
were affected by the rest.
The total download cost to upgrade all binaries on my system which
were affected by CVEs in 2019 is 3.8 GiB. This is reduced to 1.0
GiB if you eliminate glibc.
It is also unknown if any of these vulnerabilities would have
been introduced after the last build date for a given statically
linked binary; if so that binary would not need to be updated. Many
vulnerabilities are also limited to a specific code path or use-case,
and binaries which do not invoke that code path in their dependencies
will not be affected. A process to ascertain this information in
the wake of a vulnerability could be automated.
> Perhaps this is just a Linux distro thing, but as someone who closely monitors the Void packages repo, dynamic linking burdens distro maintainers with hundreds of additional packages which need to be individually tracked, tested and updated.
> (Packages which could otherwise be vendored by upstream and statically linked during the build.)
The effort is approximately the same for distro maintainers. Shared libraries might mean more individual packages but the same number of projects need to be actively maintained because if a vulnerability is patched in libssl (for example) you'd need to rebuild all projects that are statically linked rather than rebuilding libssl.
Upstream vendoring doesn't really help here. If anything, it adds more issues than it solves with regards to managing a distro, where you want to ensure every package has the latest security patches (vendoring is more useful for teams managing specific applications to protect them against breaking changes happening upstream than it is at a distro level, where you can bundle different versions of the same .so if you absolutely need to and link against those)
Additionally, I would say at least `dbus`, `gnutls`, `libssh2` and maybe?? `curl` are the kind of libraries which count as "widely used system libraries".
I.e. the few for which Linus thinks dynamic linking still can make sense ;=)
I don't think all of those qualify for that exemption (at least two nearly surely don't). On the vast majority of systems I bet libcurl, for example, is probably linked against by 1 other piece of software - curl(1). “Really useful” isn’t the qualifier here. I bet ~same for libssh. dbus and gnutls I’d have to see more examples to understand.
I think that the kind of "knowledge" you are correcting is informing much of the discussion here. And it seems like the motivation is "I want to make MY thing as easy to ship as possible!".
There are hundreds of binaries in a Linux system and no one wants to rebuild all of them when an important library is updated.
That's with shared libraries and debug symbols stripped. If I upgrade libc, why would I want to also waste a massive amount of time rebuilding all 5616 binaries?
Of course, even if we ignore the insane build time, I don't have the source for some of those binaries, and in a few cases the source no longer exists. In those cases, should I be forced to only use those programs with the (buggy) versions of the libraries they were originally built with?
1.3 GB? It's 2021; I can read that from SSD into memory in less than the blink of an eye. Disk is fast and cheap enough that it's worth keeping build trees around to incrementally rebuild against new libraries. Also, tests typically require a build tree, and you are going to rerun tests on the relinked binaries, aren't you?
Libcurl is the only sane way I know to access the web from a C or C++ environment. I expect that most C programs needing to do web requests will link in libcurl.
Thank you! I had noticed the same problem, but was not aware of this excellent solution. I just applied it, and it works perfectly... and I removed the dozens of Haskell libraries that were there just for Pandoc.
For anyone else interested, the same solution applies to shellcheck - just install shellcheck-bin from AUR.
I discovered that quite quickly after they switched.
More importantly (for any haskell developers using Arch), there are also static builds for stack, cabal and ghcup too although you then need to maintain your own set of GHC and libraries.
I can't imagine what it must be like to use the official packages since the haskell library ecosystem is quite fast moving.
The Haskell maintainers in Arch have an almost fanatical fascination with dynamic linking. Dynamic linking is otherwise not the norm in Haskell.
Haskell has poor support for dynamic linking. The Haskell (GHC) ABI is known to break on every compiler release. This means that every dynamically linked executable in the system needs to be recompiled. Any programs you compiled locally against dynamic system libs will break, of course.
This stance makes it much harder than necessary for Arch users to get started with Haskell.
This is due to the maintainers on Arch and I've complained about it before too. They also update them almost daily so they just add noise to system updates. I just end up running the official Pandoc Docker container.
I'm assuming they already are using the package manager. The issue is that there are so many dependencies for a single package, and almost every system update consist of updating a bunch of Haskell packages which just clutters maintenance.
The distribution does its job: it distributes upstream changes. It's like blaming a zip file because it contains many small files instead of a few big ones, so it clutters maintenance. If you see this as a problem, then fix upstream. For me, it's important to receive all upstream changes in one update.
No, I didn't. But before uninstalling Pandoc, I was wasting time with frequent useless updates of minor Haskell dependencies that should be statically linked in the first place, as no other package, at least in general, makes use of them.
As someone who runs a Linux distribution, please don't vendor your dependencies for the benefit of distribution maintainers, it makes it much more difficult to build a coherent system. Running different versions of common libraries causes real annoyances.
It's a double-edged sword, though. I find that with Debian I often can't have up-to-date versions of programs I care about, because some program that nobody cares about won't build against an upgraded library version that they mutually depend on.
The requirement that every binary on the system depend on the same shared library basically led to things like app stores and Snap, which I think we can all agree that nobody wants.
Yeah, and I get my free ride by just downloading the binary from the project's website. Linux distributions add a bunch of arbitrary rules that I don't care about and that don't help me.
Yes this can't be emphasized enough. static linking is fine, I don't really care either way. But please, please don't vendor.
The point of software is to be composable, and vendoring kills that. Vendoring will cause free software to drown in its own complexity and is completely irresponsible.
It's when a package contains its dependencies, and often their dependencies too (transitive deps).
Packages are supposed to be nodes in a dependency graph (or better, partial order), and there are nice composition rules. But once the nodes themselves represent graph/order closures (or almost-closure) we use that nice property.
People that are very application oriented tend to like vendoring --- "my application is its own island, who gives a shit". But the distro's point isn't just to gather the software --- that's an anachronism --- but to plan a coherent whole. And applications that only care about themselves are terrible for that.
The larger point is free and open source software really should be library-first. Siloed applications are a terrible UI and holdover from the economics of proprietary shrink-wrapped software which itself is a skeuomorphism from the physical world of nonreplicable goods.
> Siloed applications are a terrible UI and holdover from the economics of proprietary shrink-wrapped software which itself is a skeuomorphism from the physical world of nonreplicable goods
Siloed applications are UI neutral and represent decoupling, which enables progress and resilience. Excessive interdependence is how you lose Google Reader because other services have changed requirements.
What? Getting rid of Google Reader was a business decision.
> Siloed applications are UI neutral and represent decoupling, which enables progress and resilience.
The medieval manorial economy is resilient. Is that progress?
What happened to unified look and feel? And rich interactions between widgets? The Smalltalk and Oberon GUI visions? All that requires more integration.
I get that we need to make the here and now work, but that is done via the en-masse translation of language-specific package manager packages to distro packages (static or shared, don't care), not vendoring.
> What? Getting rid of Google Reader was a business decision
It has been widely reported that a significant factor in that business decision was Google’s massively coupled codebase and the fact that Reader had dependencies on parts of it that were going to have breaking changes to support other services.
> The medieval manorial economy is resilient.
It’s not, compared to capitalist and modern mixed economies, and unnecessary and counterproductive coupling between unrelated functions is a problem there, too.
> Is that progress?
Compared to its immediate precedents, generally, yes.
Fewer and fewer people are interested in the benefits provided by "coherent whole" distributions. And more and more people are interested in the benefits provided by "it's own island" applications.
The future is statically linked and isolated applications. Distros need to adapt.
1. Have the users been asking for everything to be a laggy electron app? I don't think so.
2. Within these apps are language-specific package managers that don't encourage vendoring; it's just when people go to package for distros that they vendor things away. Distros do need to make it easier to automatically convert language-specific package manager packages.
The future is making development builds and production builds not wildly different, and both super incremental, so one can easily edit any dependency no matter how deep and then put their system back together.
Again, I am fine with static linking. My distro, NixOS, is actually great at that and I've helped work on it. But vendoring ruins everything. I hate waiting for everyone to build the same shit over and over again with no incrementality, and I don't have time to reverse-engineer whatever special snowflake incremental dev build setup each package uses.
The number of people who consider the system/distribution the atomic unit rather than the application, is probably about equal to the number of people who "edit dependencies and put their system back together" -- they are in total statistically zero. The overriding concern is the user need for a working application. Everything else is a distant secondary concern.
I'm not trying to convince you of anything, here, I'm just describing the facts on the ground. If you're not into them, that's fine!
The number of people who have a "theory of distribution" one way or the other is pretty low.
But
- many people seem to like unified look and feel
- many people complain about per-app auto-update
- many people love to complain software is getting worse
Are these people connecting these complaints to the debate we're having? Probably not. Can these views be connected to that? Absolutely.
---
I work on distros and do edit the root dependencies; I also contribute to many libraries I use at work during work; finally, I use the same distro at work and on my own, and everything is packaged the same way. So yes, it's a "unified workflow for yak shaves" and I quite like it.
I hope there can be more people like me if this stuff weren't so obnoxiously difficult.
Yeah, sorry, I saw that later. (May have seen it earlier too, but not connected it to your name as I was replying.)
But that still doesn't make my comment a strawman (because I was talking about this one specific comment), or AFAICS yours less of one: Why would you jump to "bloated Electron apps"? Sure, they may suck, but the comment you replied to was about statically linked apps; no mention of Electron at all. Unless you're saying there was, originally, and had been edited out before I saw it? If not, your reply was... OK, more charitably, at least a non sequitur.
The "coherent whole" is more in demand than ever before. Just look at the Android and iOS ecosystems with super hard rules how things have to behave and look in order to be admitted. They just put the burden on the app dev instead of a crew of distro maintainers.
If you define "demand" as hard orders from the warden of your walled garden, yes. But that's not how the concept is normally used.
Personally, for instance, I'd have been perfectly happy if at least a few apps had stayed with the Android UI of a few versions back[1], before they went all flat and gesture-based. There was no demand from me, as a consumer, to imitate Apple's UI.
___
[1]: And no, that's not outmoded "skeuomorphism". That concept means "imitation of physical objects", like that shell on top of Windows that was a picture of a room, with a picture of a Rolodex on top of a desktop etc etc. In the decades since ~1985 a separate visual grammar has developed, where a gray rounded rectangle with "highlights" on the upper and left edges, "shadows" on the lower and right edges, and text in the middle meant "button" in the sense of "click here to perform an action", not any more in the original skeuomorphic sense of "this is a picture of a bit of a 1980s stereo receiver".
> Packages are supposed to be nodes in a dependency graph (or better, partial order), and there are nice composition rules. But once the nodes themselves represent graph/order closures (or almost-closure) we use that nice property.
A) "Lose", not "use", right?
B) Sounds like a lot of gobbledy-gook... Who cares about this?
C) Why should I?
> People that are very application oriented tend to like vendoring --- "my application is its own island, who gives a shit".
No, people that are very application oriented tend to like applications that work without a lot of faffing around with "dependencies" and shit they may not even know what it means.
> But the distro's point isn't just to gather the software --- that's an anachronism --- but to plan a coherent whole.
A) Sez who?
B) Seems to me it would be both easier and more robust to build "a coherent whole" from bits that each work on their own than from ones that depend on each other and further other ones in some (huge bunch of) complex relationship(s).
> And applications that only care about themselves are terrible for that.
As per point B) above, seems to me it's exactly the other way around.
> The larger point is free and open source software really should be library-first.
Again, sez who?
> Siloed applications are a terrible UI
WTF does the UI have to do with anything?
> and holdover from the economics of proprietary shrink-wrapped software
[Citation needed]
> which itself is a skeuomorphism from the physical world of nonreplicable goods.
Shrink-wrapped diskettes or CDs are physical goods, so they can't be skeuomorphic.
That's easy to say, but sometimes I need to fix bugs in downstream packages, and I am not willing to wait for 6 months (or forever in some cases) for a fix to be released.
Then the Linux distros ship a year-out-of-date copy of my patched version and link to the buggy upstream, and I have to keep telling bug reporters not to use the distro version.
A useful thing to do when forking software is to give it a new name, to make it clear that it's a separate thing. It sounds like you copied some specific version of software you depend on, then forked it, but left the name the same -- which caused confusion by package builders at the distribution since it's a bit of work to determine if you forked the dependency or are just including it for non-distribution user's convenience.
> (...) dynamic linking burdens distro maintainers with hundreds of additional packages which need to be individually tracked, tested and updated.
To me this assertion makes no sense at all, because the whole point of a distro is to bundle together far more than hundreds of packages, individually tracked, tested, and updated. By default. That's the whole point of a distro, and the reason developers target specific distros. A release is just a timestamp that specifies a set of packages bundled together, and when you target a distro you're targeting those hundreds of packages that are shipped together.
Also, compilers nowadays are smarter and can perform link-time optimizations, meaning that if you only use a single function from a library, the final executable only contains that single function. In practice, code that uses static linking can be more efficient than code that uses dynamic linking.
And you have to consider some performance penalty when using shared libraries. First, there is a time penalty when loading the executable, since you first have to run the interpreter (ld-linux) and then your actual code. But also, each call into the library has to go through an indirect jump.
One tradeoff is security. If you're patching vulnerabilities, then just a single .so needs to be patched. With static linking every binary needs to be investigated and patched.
You can also argue that it is impossible to update dynamic libraries because they are used by multiple applications and you can't afford that any application breaks. So instead of being able to patch that one application where the security is needed, you now have to patch all of them.
> You can also argue that it is impossible to update dynamic libraries because they are used by multiple applications and you can't afford that any application breaks.
That's where maintenance branches come in. You fix only the security issue, and push out a new version.
Isn’t this especially true in the world of containerization? We literally ship entire images or configurations of OS’s to run a single application or system function.
Although, I have mixed feeling about containers, because I fully appreciate that Unix is the application and the software that runs on it are just the calculations. In that world, sure, a shared library makes sense the same way a standard library makes sense for a programming language. Thus, a container is “just” a packaged application.
Regardless, this concept is so out of the realm of “performance” that it’s worth noting that the idea of trying to reduce memory use or hard disk space is not a valid argument for shared libraries.
> Disk space, and arguably even RAM, have become very cheap by comparison
CPU frequency and CPU cache sizes remain limited, so smaller binaries, which fit in the cache, run faster, use less energy, and waste fewer resources overall.
> Given these, it seems worthwhile to re-evaluate the tradeoffs that come with dynamic linking
Create your own distribution and show us benefits of static linking.
To underscore the gp's point, I have had to work with a 3rd party-provided .so before. No sources. A binary. And strict (if unenforceable) rules about what we could do with the binary.
I dealt with a situation like this for a ROS driver for a camera, where the proprietary SO was redistributable, but the headers and the rest of the vendor's "SDK" was not. The vendor clarified for me repeatedly that it would not be acceptable for us to re-host the SDK, that it was only accessible from behind a login portal on their own site.
The solution we eventually came to was downloading the SDK on each build, using credentials baked into the downloader script:
The vendor was fine with this approach, and it didn't violate their TOU, though it sure did cause a bunch of pain for the maintainers of the public ROS CI infrastructure, as there were various random failures which would pop up as a consequence of this approach.
I think eventually the vendor must have relented, as the current version of the package actually does still download at build time, but at least it downloads from a local location— though if that is permissible, I don't know why the headers themselves aren't just included in the git repo.
To be honest, I haven't really tracked it— the product I work on dropped stereo vision in favour of RGBD, so I don't really know where it's landed. I suppose it's not a great sign that the current generation SDK still requires a login to access:
And at least one spinnaker-based driver seems to have inherited the "download the SDK from elsewhere" approach, though who knows if that's due to genuine need or just cargo-culting forward what was implemented years ago in the flycapture driver:
The "proper" approach here would of course be for Open Robotics (the ROS maintainers) to pull the debs and host them on the official ROS repos, as they do for a number of other dependencies [1], but that clearly hasn't happened [2].
I think a lot of hardware vendors who cut their teeth in the totally locked down world of industrial controls/perception still think they're protecting some fantastic trade secret or whatever by behaving like this.
I think this was precisely the point. Shared libraries create seams you can exploit if you need to modify behavior of a program without rebuilding it - which is very useful if you don't want to or can't rebuild the program, for example because it's proprietary third-party software.
> The ability to change which library is being used without needing to rebuild the main program is really important.
Having spent many hours avoiding bugs caused by this anti-feature I have to disagree. The library linked almost always must be the one that the software was built against. Changing it out is not viable for the vast majority of programs and libraries.
Just as an example, there is no good technical reason why I shouldn't be able to distribute the same ELF binary to every distro all the time. Except the fact that distros routinely name their shared objects differently and I can't predict ahead of time what they will be, and I can't feasibly write a package for every package manager in existence. So I statically link the binary and provide a download, thereby eliminating this class of bug.
Despite the rants of free software advocates, this solution is the one preferred by my users.
Not defending it for general use, but dynamic linking can be very useful for test and instrumentation. Also sometimes for solvers and simulators, but that's even more niche.
Can you please elaborate? Why would the ability to change the library version at runtime be useful for testing? and what aspect of this is useful for simulators and solvers?
You compile a version of the dependency which intentionally behaves differently (e.g. introduces random network errors, random file parsing errors, etc.) and "inject" it into an otherwise functional setup to check that all other parts, including error reporting and similar, work.
There are always other ways to achieve this, but using (abusing?) dynamic linking is often the simpler way to set it up.
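(A minimal sketch of what such an injected dependency can look like, here abusing LD_PRELOAD to wrap a libc call rather than rebuilding the dependency itself; the failure rate and errno are arbitrary, and a real shim would be configurable.)

    /* flaky_recv.c -- make ~10% of recv() calls fail to exercise error paths.
     * Build:  gcc -shared -fPIC -o libflaky.so flaky_recv.c -ldl
     * Run:    LD_PRELOAD=$PWD/libflaky.so ./program_under_test
     */
    #define _GNU_SOURCE
    #include <dlfcn.h>
    #include <errno.h>
    #include <stdlib.h>
    #include <sys/types.h>
    #include <sys/socket.h>

    ssize_t recv(int fd, void *buf, size_t len, int flags)
    {
        static ssize_t (*real_recv)(int, void *, size_t, int);
        if (!real_recv)
            real_recv = (ssize_t (*)(int, void *, size_t, int))dlsym(RTLD_NEXT, "recv");

        if (rand() % 10 == 0) {          /* inject a failure on roughly 1 in 10 calls */
            errno = ECONNRESET;
            return -1;
        }
        return real_recv(fd, buf, len, flags);
    }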
> and what aspect of this is useful for simulators and solvers?
Dunno, but some programs allow plugging in different implementations of the same performance-critical functionality, so that you can distribute a binary version and then only re-compile that part with platform-specific optimizations. If that is what the author is referring to, I would argue today there are better ways to do it. But it still works out either way and can be much easier to set up/maintain in some cases. (And it probably falls into the "shared libraries as extension system" category Linus excluded from his commentary.)
I should clarify - it's at the start of runtime when the library is initially loaded. Not afterwards. You'd have to restart the program to swap a library.
For testing and simulation, it's a way to mock dependencies. You get to test the otherwise exact binary you'll deploy. And you can inject tracing and logging.
Solver installations tend to be long-lived, but require customizations, upgrades, and fixes over time. Dynamic linking lets you do that without going through the pain of reinstalling it every time. Also with solvers you're likely to see unconventional stuff for performance or even just due to age.
You're forgetting about one common case where libraries are replaced.
This is security vulnerabilities. If your application depends on a common library that had a vulnerability, I can fix it without you having to recompile your app.
With glibc or the X libraries, a vulnerability would essentially require reinstalling the entire OS.
You could but you would be doing yourself two disservices by trusting vendors that aren't providing security updates for dependencies in a timely manner and running applications on top of dependencies they haven't been tested with.
Vendors could ship applications with dependencies and package managers could tell which of those applications and dependencies have vulnerabilities. This would clarify the responsibility of vendors and pressure them to provide security updates in a timely manner.
One big obstacle is that it's fairly common for vendors to take a well known dependency and repackage it. It's difficult to keep track of repackaged dependencies in vulnerability databases.
You seem to be assuming rebuilding is possible. What about the (still very useful) proprietary binaries from a company that hasn't existed in a decade? What about the binaries where the original source no longer exists?
I say that presents market opportunities. Every piece of technology faces a time of obsolescence.
If the original source no longer exists and rebuilding is no longer possible then replacing dependencies is no longer feasible without manual intervention to mitigate problems. ABIs and APIs are not as stable as you'd think.
> Every piece of technology faces a time of obsolescence.
That is true for most technologies that experience entropy and the problems of the real world. Real physical devices of any kind will eventually fail.
However, anything built upon Claude Shannon's digital circuits does not degrade. The digital CPU and the software that runs on it are deterministic. Some people see a lack of updates in a project as "not maintained", but for some projects that lack of updates means the project is finished.
> obsolescence
What you label as obsolete I consider "the last working version". The modern attitude that old features need to be removed results in software that is missing basic features.
> replacing dependencies is no longer feasible without manual intervention to mitigate problems
This is simply not true. Have you even tried? I replaced a library to get proprietary software working last week. In the past, I've written my own replacement library to add a feature to a proprietary program. Of course this required manual intervention; writing that library took more than a week of dev time. However, at least it was possible to replace the library and get the program working. "Market opportunities" suggests you think I should have bought replacement software? I'm not sure that even exists?
> ABIs and APIs are not as stable as you'd think.
I'm very familiar with the stability of ABIs and APIs. I've been debugging and working around this type of problem for ~25 years. Experience suggests that interface stability correlates with the quality of the dev team. A lot of packages have been very stable; breaking changes are rare and usually well documented.
> there is no good technical reason why I shouldn't be able to distribute the same ELF binary to every distro
Oh, your app also works on every single kernel, every different version of external applications, and supports every file tree of every distro? Sounds like you added a crap-ton of portability to your app in order to support all those distros.
> I can't feasibly write a package for every package manager in existence.
But you could create 3 packages for the 3 most commonly used package managers of your users. Or 3 tarballs of your app compiled for 3 platforms using Docker for the build environment. Which would take about a day or two, and simultaneously provide testing of your app in your users' environments.
Yes, this is not that hard if you are statically linking as much as possible. The magic of stable syscalls. Variance between glibc versions is the biggest headache, but musl libc solves a lot of the problems.
Have you ever actually tried that last step that you're suggesting? It's actually really time consuming and expensive to maintain that infrastructure due to oddities between distros, like glibc or unsupported compiler versions. Statically linking is easier than redoing all the work of setting up and tearing down developer environments just because one platform has a different #define in libc. It's also cheaper when your images are not small and you're paying for your bandwidth/storage.
Most libraries do not have stable ABIs; even in C there are many ways you can mess that up. Even "seemingly clear cut cases" like some libc implementations have run into accidental ABI breakage in the past.
And just because the ABI didn't change doesn't mean the code is compatible.
It's not seldom that open source libraries get bug reports because dynamic linking is used to force them to run against versions of dependencies which happen to be ABI compatible (enough) but don't actually work with them / have subtle bugs. It sometimes gets to the point where it's a major annoyance/problem for some open source projects.
Then there is the thing that LD_LIBRARY_PATH and similar are MAJOR security holes, and most systems would be best off using hardening techniques to disable them (not to be confused with `/etc/ld.so.conf`).
Though yes, without question, for not properly maintained closed source programs it is helpful.
But then for them, things like container images being able to run older versions of Linux (besides the kernel) in a sandbox can be an option, too. Though a less ergonomic one. And not applicable in all cases.
> Then there is the thing that LD_LIBRARY_PATH and similar are MAJOR security holes, and most systems would be best off using hardening techniques to disable them (not to be confused with `/etc/ld.so.conf`).
I do not consider LD_LIBRARY_PATH or LD_PRELOAD any more of a security hole than PATH itself.
There are two scenarios:
- you control exactly how your program is launched (environment variables, absolute path) and it's a non-issue
- you do not control the environment properly, and then everything is a security hole.
That said: DT_RUNPATH and RPATH are, however, beautiful security holes.
They allow hardcoding a load path in the binary itself, even with a controlled environment.
And many build tools unfortunately leave garbage in these paths (e.g. /tmp/my_build_dir).
From a desktop point of view, Linux needs some major improvements in how it handles applications.
It also has all the tools to do so, but it would break a lot of existing applications.
In the past I thought Flatpak and Snap were steps in the right direction. But now I'm not so sure about that anymore (Snap made some steps in the right direction but also many in wrong directions; Flatpak seems to not care about anything but making things run easier; in both cases, moving from a kinda curated repo to a not-really-curated one turned out horrible).
From a server point of view things matter much less, especially wrt. modern setups (containers, VMs in the cloud, cloud providers running customized and hardened Linux container/VM hosts, etc.).
And in the end most companies paying people to work on Linux are primarily interested in server-ish setups, and only secondarily in desktop setups (for the people developing the server software). Some exception would be Valve, I guess, for which Linux is an escape hatch in case bad lock-in patterns from phone app stores take hold on Windows.
For comparison, AmigaOS was built on the assumption of binary compatibility and people still replace and update libraries today, 35 years later.
It's a cultural issue, not a technical one - in the Amiga world breaking ABI compatibility is seen as a bug.
If you need to, you add a new function. If you really can't maintain backwards compatibility, you write a new library rather than break an old one.
As a result 35 year old binaries still occasionally get updates because the libraries they use are updated.
And using libraries as an extension point is well established (e.g. datatypes to allow you to load new file formats, or xpk, which lets any application that supports it handle any compression algorithm there's a library for).
Oh man, that brings back memories. It's so sad that things like datatypes or xpk didn't make it to modern OSes (well, there's just a fraction of it; I guess video codecs are the closest thing, but that just targets one area).
I also wanted to point out that this standardization made it possible to "pimp" your AmigaOS to make individual desktops somewhat unique. There were basically libraries that substituted for system libraries and changed how the UI looked or even how it worked. I kind of miss that. Now the only personalization I see is what the terminal prompt looks like :)
It's a side effect of abstraction. Even a language like C makes it extremely hard to figure out the binary interfaces of the compiled program. There's no way to know for sure the effects any given change will create in the output.
The best binary interface I know is the Linux kernel system call interface. It is stable and clearly documented. It's so good I think compilers should add support for the calling convention. I wish every single library was like this.
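(To illustrate how small and stable that interface is, here's a sketch of calling it directly with no libc at all. x86-64 only, and obviously not how you'd normally write things; the register constraints follow the documented x86-64 syscall convention.)

    /* hello.c -- talk to the kernel ABI directly: write(2) is syscall 1,
     * exit(2) is syscall 60 on x86-64.
     * Build:  gcc -static -nostdlib -o hello hello.c
     */
    static long syscall3(long n, long a1, long a2, long a3)
    {
        long ret;
        __asm__ volatile ("syscall"
                          : "=a"(ret)
                          : "a"(n), "D"(a1), "S"(a2), "d"(a3)
                          : "rcx", "r11", "memory");
        return ret;
    }

    void _start(void)
    {
        syscall3(1, 1, (long)"hello\n", 6);   /* write(fd=1, buf, count=6) */
        syscall3(60, 0, 0, 0);                /* exit(status=0) */
    }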
We have an entire language-on-top-of-a-language in the C++ preprocessor, but we could not figure out something to specify to the compiler what we want in an ABI?
I think an abstraction is when a tool takes care of something for you, this situation is just neglect.
I have maintained some mini-projects which try to have strong API stability.
And even though keeping API stability is much easier than ABI stability, I already ran into gotchas.
And that was simple stuff compared to what some of the more complex libraries do.
So I don't think ABI/FFI stability ever had a good chance (outside of some very well maintained big system libraries where a lot of work is put into making it work).
I think the failure was to not realize this earlier and instead move to a more "message passing" + "error kernel" based approach for libraries where this is possible (which are surprisingly many) and use API stability only for the rest (except system libraries).
EDIT: Like using pipes to interconnect libraries, with well-defined (but potentially binary) message passing between them. Being able to reload libraries, resetting all global state, or run multiple versions of them at the same time, etc. But without question this isn't nice for small utility libraries or language frameworks and similar.
> I think the failure was to not realize this earlier and instead move to a more "message passing" + "error kernel" based approach for libraries where this is possible (which are surprisingly many) and use API stability only for the rest (except system libraries).
Sounds pretty sweet as far as composability is concerned, but there is the overhead caused by serialization and the loss of passing parameters in registers.
Maybe this is in-line with what Linus said (very standard libraries), but I think when there's a vulnerability in libssl.so being able to push a fix instead of having to fix many things is a huge win.
The dependency hell thing seems to come up when libraries change their APIs and then you have to scramble. Last one I recall struggling with was libpng.so but there are plenty of others.
If you need to change a statically built executable, you can always patch it manually. This was how no-CD cracks worked back in the day.
Yes, dynamic linking makes this process easier, but it makes it so easy that both users and distro maintainers regularly break software without even realising it (hence why Red Hat uses 5-year-old versions of everything).
> If the program is statically linked, then nothing can happen - either it works or it doesn't. With dynamic linking, changes are possible by changing which library is loaded.
These problems are almost purely caused by dynamic linking, though. The devs release something that was tested on some old version of Ubuntu, and now on your new Fedora things work differently and the program is broken. While with static linking, it all just works.
You seem to be assuming that a statically linked library won't have a subtle, edge-case bug, when in fact, it doesn't just work.
Even that could be env-related, so maybe it surfaced once you moved to a new Fedora release.
Basically, the problems are pretty much the same with either approach: one has a more complex runtime environment, the other has a more complex upgrade story.
While I'd agree with Linus' take on shared libraries, I'd still say that "statically linked libraries are not a good thing in general either" (the stress is on general).
Latest features and security patches rarely go hand-in-hand: it's usually latest features and new security bugs instead.
Developers want to maintain a single branch because that's much cheaper, not because it's the ultimate solution to all the problems of maintaining software.
> The important benefit is dynamic linking. The ability to change which library is being used without needing to rebuild the main program is really important.
Or just having the option of loading the library at all. If you don’t need the functionality offered by libxyz, you’re not required to use it. One then has no end of language extensions that can be loaded into a generic interpreter to fit a script to whatever job they have at hand.
Edit: Linus touched on this as a last point:
> Or, for those very rare programs that end up dynamically loading rare modules at run-time - not at startup - because that's their extension model.
Most applications don't have runtime extensions. The few that do really need them in order to be useful. I'm convinced there's a Pareto distribution here: 20% of libraries really benefit from being dynamically loaded, while the remaining 80% are better served by static linking. Given that, it seems to me that dynamic linking should be both opt-in and properly supported, but the one who decides that should be the library maintainer.
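For the runtime-extension case, a rough sketch of what that opt-in model usually looks like in C; the `plugin_init` entry-point name is hypothetical:

```c
/* Sketch of a runtime extension model: the host dlopen()s a plugin and
 * calls a well-known entry point. "plugin_init" is a made-up name. */
#include <dlfcn.h>
#include <stdio.h>

typedef int (*plugin_init_fn)(void);

int load_plugin(const char *path) {
    void *handle = dlopen(path, RTLD_NOW | RTLD_LOCAL);
    if (!handle) {
        fprintf(stderr, "skipping %s: %s\n", path, dlerror());
        return -1;
    }
    plugin_init_fn init = (plugin_init_fn)dlsym(handle, "plugin_init");
    if (!init) {
        fprintf(stderr, "%s is not a plugin: %s\n", path, dlerror());
        dlclose(handle);
        return -1;
    }
    return init();  /* the plugin registers its features with the host */
}
```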
> If the program is statically linked, then nothing can happen
Static libs are embedded into most object-file formats (e.g. ELF) as isolated compilation units. Could you not just replace the static library within the object-file at the linkage level, in about the same way that old "resource editors" could replace individual resources within object-files, or the same way that programs like mkvmerge can replace individual streams within their target media-container format?
Static vs. dynamic linking is just about whether the top-level symbol table of the executable can be statically precomputed for that linkage. It doesn't impede you from re-computing the symbol table. That's what the linker already does, every time it links two compilation units together!
More and more often they're not just isolated units. Link-time optimisation can break that assumption and, for example, inline things across library boundaries.
Perhaps then the OS should figure out which parts of an executable are common, so we can still have shared libraries except they are smarter and not visible to the user.
Except "being visible to the user" is a useful thing to have, as GP explained :).
Or put in a different way: the problem is the "shared" part, not the "dynamic linking" part. Instead of static linking, you can avoid the issue of library versions by just shipping your shared libraries along with your program[0] - and this way, users still retain the ability to swap out components if the need arises.
--
[0] - Well, you could, if you could rely on LD_LIBRARY_PATH to be set to "." by default. Windows is doing it right, IMO, by prioritizing program location over system folders when searching for DLLs to load.
Thanks! I forgot all about it, in particular about the '$ORIGIN' thing! Yes, I'm happy to see the person building the executable has at least some control over this.
It's actually relevant to a project I'm working on (proprietary, Windows/Linux, uses shared libraries for both mandatory components and optional plugins) - I'm gonna check if and how we're setting RPATH for the Linux builds, it might need some tweaking.
It's definitely technically doable: you'd have to checksum every read-only page of a program and see if you already have one in cache.
But is it worth it though? Even "big" statically linked Rust programs are a few dozen MBs of executable (and not all of that is read-only, and even less will be shared with any other executable at any given time). With things like LTO identical source files can result in different machine code as well.
In the end it would be a lot of trouble for potentially very little gains.
At first I was annoyed that Rust defaulted to static linking; it felt inefficient and overkill. But the more I think about it, the less I can really justify doing it any other way; there are very few benefits to dynamic linking these days. The only one I can think of is overriding an application's behaviour through LD_PRELOAD, but few people know how to do that these days.
DLLs and shared libraries are different things though, no? Isn't it common in windows to use DLLs and simply pack in everything you need? Is that not the best of both worlds minus a bit of ram and disk?
> DLLs and shared libraries are different things though, no?
No; they're just different names for the same concept.
> Isn't it common in windows to use DLLs and simply pack in everything you need?
Yes.
> Is that not the best of both worlds minus a bit of ram and disk?
No; it's more like the worst of both worlds. When it is feasible, static linking is strictly superior to this approach. (You've pointed out that it may not be feasible in some cases in sibling comments, and I agree.)
DLLs are the same kind of file as ELF shared objects. Thing is that in Linux/BSD world they are indeed also commonly shared system-wide, on a package management level.
Bundling shared libraries is kinda the worst of both worlds: not getting any sharing between applications (or system-wide security patches), but not getting the performance benefits of static linking either. The only good thing is that you could manually replace e.g. a vulnerable version of a library inside this bundle.
And yeah… Flatpak/AppImage/snap distribution is kinda like that Windows-style one.
I do the same with good ol' cmake: I have a "developer" preset which builds my app with clang, lld, and PCH, split into small shared libraries, and a release preset where everything is statically linked.
The now predominant style. Windows DLLs didn't always use to be named Whatever_5_0_1.DLL; it used to be just Whatever.DLL.
Which led to "DLL Hell": the version of Whatever.DLL you had installed was not the same one the developer of your app had, so the app wouldn't work, and if you switched to the one he had, some other app that used Whatever.DLL stopped working instead. Which led to the "exclusive version" / version-named DLL situation we have on Windows today... And, increasingly, on Linux.
I don't know exactly how shared libraries are loaded -- there are some tantalising mentions in sibling comments -- but it all seems to point towards two opposite solution paths:
1) Version-named libraries like on Windows; and/or, possibly, some file-link magic where /std_shared_libs/somelibrary/somelib_x points to /actual_libs/somelibrary/somelib_x_y_z, etc, etc... Then you could also have .../somelibrary/somelib_latest point to, well, the actual latest version installed, and somelibrary/somelib_default point to the one you want as default, etc, etc. (Hmm... Shouldn't stuff like this have been hammered out decades ago by, Idunno, some kind of Linux standardisation initiative... if this doesn't exist already, then WTF is the LSB for?!?)
2) Just fucking static-link everything already.
Friar William of Ockham seems to be pointing more towards one than the other of the above.
Anyway, what certainly doesn't look like a great solution to me is
3) Containerization. That just feels like "Instead of fixing DLL Hell on Linux, let's just ignore it, replicate ~half the OS -- but with our preferred version of every library -- inside a (pseudo-)VM, and pretend that running boxes within boxes within boxes is a solution". No, that's not a solution, that's a kludge.
Several times last week when I needed to use an old proprietary program that used several outdated libraries.
Every single time I want to run a program (usually a game) that has a runtime dependency on pulseaudio[1]. (apulse[2] usually works to translate the libpulse ABI back to ALSA).
One time I had to write my own version of a library that specifically emulated the ABI used by an important program from an outside vendor. Obviously this didn't "just work"; it required a couple weeks of work to write the new version. The point isn't that future versions of a library will magically "just work". With a dynamically linked dependency replacement is at least possible. If the program in question was statically linked, nothing could be changed.
The question isn't about which method is less work or easier to maintain. The question that matters in the long run is whether you want the basic ability to modify a program's dependencies in the future to fix an important problem.
[1] As of a few years ago, the stupid design decisions in pulseaudio make it highly incompatible with my needs. Just having it installed makes runtime linking issues even worse. It also adds insane amounts of latency (>5ms is bad. >50ms is insane) by design.
I'm not in the pro-shlib camp (it's what brought us the whole Docker->k8s fiasco in the first place), but it might be worth noting that the LGPL requires shared libs, i.e. requires that you as an end user be able to swap LGPL libs for newer versions, alternate implementations, or your own.
Also, there are legitimate use cases for shlibs-as-plugins, such as ODBC.
I agree with the core of your comment, but I would note that containerization has other advantages over simple static binaries. In particular, sandboxing even at the network layer. Also, containers often bundle multiple programs together to ensure compatibility, not just libraries, which is a step that is almost never discussed. Applications often depend on system utilities whose API is even more poorly defined/maintained than system libraries, so bundling your own ls or shell may be a safer option.
Even further, k8s really has nothing to do with this: it is a tool for submitting workloads to a group of computers through a single API, and it critically depends on containerization for network and storage isolation. K8s would have looked more or less identical to how it does today even if shared libraries had never existed (though probably the exact format of OCI container images could have been much simpler).
> If you dynamically link against an LGPLed library already present on the user's computer, you need not convey the library's source. On the other hand, if you yourself convey the executable LGPLed library along with your application, whether linked with statically or dynamically, you must also convey the library's sources, in one of the ways for which the LGPL provides.
Heads up to gnu.org: site won't load via https in FF due to HSTS policy (broken certificate chain?).
You'd not need to do that if all (or all besides libc maybe) were statically linked though.
So that's actually an argument for static linking, as doing that every day is just a PITA and not really an option for those who need to actually get work done while spending paid time on such a system.
I successfully upgraded the OpenSSL-DLLs used by a no longer maintained Windows program when some servers started refusing connections (presumably due to outdated SSL/TLS versions and/or encryption schemes used by the old DLLs). After upgrading to a recent version of OpenSSL, everything worked fine again.
You are right on track here. The complexity is real, but so are the benefits of a modular executable vs. a huge monolith.
I fully understand why languages like Rust demand static linking (their compilation model doesn't allow dynamic linking). But once you encounter a C-API boundary, you may as well make that a dynamic C-ABI boundary.
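For what it's worth, a C-ABI boundary of the kind being described usually boils down to a header like the hypothetical sketch below: an opaque handle plus plain functions, so whatever sits behind the .so (Rust, C++, anything) can change freely without breaking the ABI.

```c
/* thing.h -- hypothetical sketch of a narrow, C-ABI-stable boundary:
 * only an opaque handle and plain functions cross the library border,
 * so the implementation language and internals can change freely. */
#ifndef THING_H
#define THING_H

#include <stddef.h>

#ifdef __cplusplus
extern "C" {
#endif

typedef struct thing thing;   /* opaque: the layout is never exposed */

thing *thing_new(void);
int    thing_process(thing *t, const unsigned char *buf, size_t len);
void   thing_free(thing *t);

#ifdef __cplusplus
}
#endif

#endif /* THING_H */
```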
It's not the default, but Rust absolutely allows dynamic linking [1]. For example, Bevy encourages building the engine as a dynamic library during development to improve iteration turnaround times for the app code [2].
If I am not completely mistaken, if you produce a dynamic library with rust, you are limited to the C-ABI. For instance, you cannot import a polymorphic function from a dynamic library.
The Rust ABI isn't stable between compiler versions, but it does exist. Bevy can get away with using it because it's just to speed things up in development.
Polymorphism is a separate issue. One way to do polymorphism in Rust is monomorphization, where your polymorphic function is compiled into a specific version with no generics, per caller. If you don't know the callers ahead of time, this can't work. Another way is dynamic dispatch, where you have one polymorphic function that chooses what code to run per type at runtime. This can work with dynamic linking.
I haven't used it, but I don't believe that's the case. I think what you're describing is `#[crate_type = "cdylib"]` ("used when compiling a dynamic library to be loaded from another language"), whereas `#[crate_type = "dylib"]` produces a "dynamic Rust library".
> However, there important advantage of using .so files instead of building into a big statically linked binary is NOT that a library can be shared between different programs! The important benefit is dynamic linking.
One could reasonably view the kernel use of loadable modules as an example of this utility.
> The ability to change which library is being used without needing to rebuild the main program is really important.
In 40 years in the field, I've never needed that. Every single time, we just emitted an entirely new build - because it's _much easier._
> Maybe rebuilding the program isn't practical. Maybe rebuilding the program isn't possible because it's closed/proprietary or the source no longer exists
These are edge cases.
Nearly all the time when we develop, we are making changes in an existing codebase.
Aside from rare bugs (which get fixed), glibc does not break ABI. However, glibc does extend the ABI with new versions, so compatibility only goes one way - your runtime glibc (generally) needs to be at least as new as the one used for linking.
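As a small, glibc-specific illustration of that one-way compatibility, you can print both the glibc you built against and the one you're running on:

```c
/* Sketch: comparing link-time and run-time glibc versions.
 * gnu_get_libc_version() and the __GLIBC__ macros are glibc-specific. */
#include <gnu/libc-version.h>
#include <stdio.h>

int main(void) {
    printf("built against glibc %d.%d, running on glibc %s\n",
           __GLIBC__, __GLIBC_MINOR__, gnu_get_libc_version());
    return 0;
}
```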
Another important consideration is security. If there is an exploit in the library code and the original program authors do not release updates regularly, then it is really important to be able to patch the security issue.
> (...) while the "couple of libraries that nobody else used. Literally nobody" should obviously be statically linked.
I disagree. The main value proposition of shared libraries is being able to upgrade them without requiring a rebuild of the application. Sure, libraries need to be competently maintained to ensure that patch releases are indeed compatible, but that still opens the door to fixing vulnerabilities on the fly by simply upgrading the lib that sneaked in the vulnerability, which shared libs allow even end-users to do by simply dropping in a file.
Note to 1-bit people: saying that "X is not a good thing in general", is not the same as saying that "X is a bad thing in general". All that's being said here is "tradeoffs exist" and highlighting some issues some people apparently like to ignore.
One of my favorite libraries is SDL. Of course, when SDL2 came out, they moved away from the LGPL license to a more liberal zlib license, which allows static linking without infecting the license of your project. Great, except now as years go by and games are abandoned, they stop working in new environments, or at least don't work as well, and you can't just force in a new dynamic version of SDL to save things like you used to be able to.
The solution they came up with: even if the program is static linked, the user can define an environmental variable to point to a dynamic lib and that will be used instead. Best of both worlds, at least going forward for programs that use a version of SDL2 that incorporates the feature. (Some more details: https://www.reddit.com/r/linux_gaming/comments/1upn39/sdl2_a...)
I think this is still the hallmark of API volatility.
I personally think with the stalling of serial speed in CPUs, we will eventually have to optimize and stabilize interfaces.
I think some of that will require stabilization of many APIs and libraries such that they don't change over the span of a decade.
This era is basically unthinkable in the current age, but I think that is because we have been living in the "free lunch" era of faster better CPUs every year for too long.
Much like all of the 20th century was an era of resource availability. The 21st will be an era of succeeding within resource constraints: with efficiency.
Those libraries will be excellent candidates for shared libraries.
As the reply to Linus's post points out, the increase in storage/network (~9MB vs 40+MB) can be multiples of the shared library cost, and memory cost can be multiples too if you're not calling the same binary over and over again like Clang.
A middle ground could be for distros to keep control over the version that's being statically linked at build time. It would do away with the dependency conflicts that happen now and result in a better user experience. The app/libraries can be updated together as needed. This is what's happening in a venv or container too: once you've built the venv or container, it's effectively static.
You do need to keep track of the library versions ultimately at build time. This effectively means handing things off to the build system, and distros like Nix have effectively wrapped build systems to some success I think.
Nix doesn't really "wrap" anything besides compilers, or when it's absolutely necessary for an application. And the wrapping is only done so that there's a way to shoehorn certain CFLAGS, some nix-specific behavior, and other configuration options which an upstream may not have a great way to configure.
But for most mature build systems we largely just use the toolchain as-is. CMake, pkg-config, autotools, and other toolchains or technologies generally allow enough flexibility to enable non-FHS builds to be done in a consistent and easy manner. Actually, creating nix packages for upstreams which have good release management discipline is trivial once you're familiar with nix.
What makes nix special is that when you go to build the package, you reference the dependencies through their canonical name, but nix translates these dependencies to unique paths which are hashed to capture all inputs (including configuration flags) used to create the dependency. So you can use multiple conflicting versions of a dependency in a package's dependency tree without issue. Dependencies can be shared if a dependency matches exactly, but can differ if needed. And that's something that FHS distros cannot do.
Actually, since nix never makes any assumptions about what's installed on the system, you're free to use it on any distro. And it's hermetic, so the host system wouldn't be changed outside of the `/nix` directory. You can even use it on macOS, although it's not as well supported; it should work fine for common packages.
> A middle ground could be for distros to keep control over the version that's being statically linked at build time.
Nixpkgs already does this, even if the dependency is dynamically linked. So there's no additional overhead with switching it out to the static variants; other than upstreams may not support that scenario as well as "traditional" dynamic linking.
I was trying to get at cargo2nix, gradle2nix, etc..
Static linking along with non-FHS does enable multiple versions of apps to co-exist easier, but then this is what AppImage/Flatpak enable too.
I've played around with Nix and source-based atomic upgrades, and modules are more impressive to me than non-FHS, though I see how it enables the first. I think shared system libs with AppImage for apps would be fine for any distro, if only everyone could agree.
The main supposed benefit is not disk or memory economy, but central patching.
The theory is as follows: If a security flaw (or other serious bug) is found, then you can fix it centrally if you use shared libraries, as opposed to finding every application that uses the library and updating each separately.
In practice this doesn't work because each application has to be tested against the new version of the library.
> then you can fix it centrally if you use shared libraries, as opposed to finding every application that uses the library and updating each separately.
This comes up a lot, but how often do you end up in a scenario where there's a critical security hole and you _can't_ patch it because one program somewhere is incompatible with the new version? Maybe even a program that isn't security critical.
Plus what you mentioned about testing. If you update each application individually, you can do it piecemeal. You can test your critical security software and roll out a new version immediately, and then you can test the less critical software second. In some systems you can also do partial testing, and then do full testing later because you're confident that you have the ability to roll back just one or two pieces of software if they break.
It's the same amount of work, but you don't have to wait for literally the entire system to be stable before you start rolling out patches.
I don't think it's 100% a bad idea to use shared libraries, I do think there are some instances where it does make sense, and centralized interfaces/dependencies have some advantages, particularly for very core, obviously shared, extremely stable interfaces like Linus is talking about here. But it's not like these are strict advantages and in many instances I suspect that central fixes can actually be downsides. You don't want to be waiting on your entire system to get tested before you roll out a security fix for a dependency in your web server.
> This comes up a lot, but how often do you end up in a scenario where there's a critical security hole and you _can't_ patch it because one program somewhere is incompatible with the new version? Maybe even a program that isn't security critical?
Talking about user use cases: every time I play a game on Steam. At the very least there are GnuTLS versioning problems. That's why Steam packages its own library system containing multiple versions of the same library -- thus rendering all of the benefits of shared libraries completely null.
One day, game developers will package static binaries, and the compiler will be able to rip out all of the functions that they don't use, and I won't have to have 5 - 20 copies of the same library on my system just sitting around -- worse if you have multiple steam installs in the same /home partition because you're a distro hopper.
There's a big difference between production installs and personal systems or hobbyist systems. Think, there are lots of businesses running Cobol on mainframes, there are machine shops running on Windows XP, there are big companies still running java 8 and python2. When you have a system that is basically frozen, you end up with catastrophic failure where to upgrade X you need to upgrade Y which requires upgrading Z, etc. You'd be surprised what even big named companies are running in their datacenters, stuff that has to work, is expensive to upgrade, and by virtue of being expensive to upgrade it ends up not being upgraded to the point where any upgrade becomes a breaking change. And at the rate technology changes, even a five year old working system quickly becomes hopelessly out of date. Now imagine a 30 year old system in some telco.
These are such different use cases that I think completely different standards and processes as well as build systems are going to become the norm for big critical infrastructure versus what is running on your favorite laptop.
Well, not really. The compiler is able to optimize the contents of the library and integrate it with the program. i.e. some functions will just be inlined, and that means that those functions won't exist in the same form after other optimizations are applied (Like, maybe the square root function has specific object code, but after inlining the compiler is able to use the context to minify and transform it further).
Yes, but LTO doesn't apply across shared object libraries. Suppose I write a video game that uses DirectX for graphics, but I don't use the DirectX Raytracing feature at all. Because of DLL hell, I'm going to be shipping my own version of the DirectX libraries, ones that I know my video game is compatible with. Those are going to be complete DirectX libraries, including the Raytracing component, even though I don't use it at all in my game. No amount of LTO can remove it, because theoretically that same library could be used by other programs.
On the other hand, if I am static linking, then there are no other programs that could use the static library. (Or, rather, if they do, they have their own embedded copy.) The LTO is free to remove any functions that I don't need, reducing the total amount that I need to ship.
Good point (and shows that I am not a video game developer). I had tried to pick DirectX as something that would follow fao_'s example of game developers. The point still holds in the general case, though as you pointed out, not in the case of DirectX in particular.
Even without LTO, linker will discard object files that aren't used (on Linux a static library is just an AR archive of object files). It's just a different level of granularity.
I doubt that you did since OpenGL implementations are hardware-specific. Perhaps you mean utility libraries building on top of OpenGL such as GLEW or GLUT.
For some libraries (OpenGL, Vulkan, ALSA, ...) the shared library provides the lowest stable cross-hardware interface there is, so statically linking the library makes no sense.
> This comes up a lot, but how often do you end up in a scenario where there's a critical security hole and you _can't_ patch it because one program somewhere is incompatible with the new version? Maybe even a program that isn't security critical.
That’s not the point. The point is having to find and patch multiple copies of a library in case of a vulnerability, instead of just one.
Giving up the policy to enforce shared libraries would just make the work of security teams much harder.
In practice this works really well as long as the change is to eliminate an unwanted side effect rather than to change a previously documented behavior.
But it doesn't really matter. What matters is that whatever system is in use needs to have a control system that can quickly and reliably tell you everything that uses a vulnerable library version, and can then apply the fixes where available and remind you of the deficiencies.
That metadata and dependency checking can be handled in any number of ways, but if it's not done, you are guaranteed not to know what's going on.
If a library is used inside a unikernel, inside a container, inside a virtual machine, inside venvs or stows or pex's or bundles, the person responsible for runtime operations needs to be able to ask what is using this library, and what version, and how can it be patched. Getting an incomplete answer is bad.
I strongly agree that the reporting and visibility you're talking about are important.
But there's one other advantage of the shared library thing, which is that when you need to react fast (to a critical security vulnerability, for example), it is possible to do it without coordinating with N number of project/package maintainers and getting them all to rebuild.
You still do want to coordinate (at least for testing purposes), but maybe in emergencies it's more important to get a fix in place ASAP.
>In practice this works really well as long as the change is to eliminate an unwanted side effect rather than to change a previously documented behavior
...and then you deploy and discover that somebody was depending on that "unwanted" side effect.
> In practice this works really well as long as the change is to eliminate an unwanted side effect rather than to change a previously documented behavior.
I fully agree with that, didn't understand the rest.
Let's say you are building a web-based file upload/download service. You're going to write some code yourself, but most components will come from open-source projects. You pick a base operating system, a web server, a user management system, a remote storage system, a database, a monitoring system and a logging system. Everything works!
Now it's a month later. What do you need in ongoing operations, assuming you want to keep providing reasonable security?
You need to know when any of your dependencies makes a security-related change, and then you need to evaluate whether it affects you.
You need to know which systems in your service are running which versions of that dependency.
You need to be able to test the new version.
You need to be able to deploy the new version.
It doesn't matter what your underlying paradigm is. Microservices, unikernels, "serverless", monoliths, packages, virtual machines, containers, Kubernetes, OpenStack, blah blah blah. Whatever you've got, it needs to fulfill those functions in a way which is effective and efficient.
The problem is that relatively few such systems do, and of those that do, some of them don't cooperate well with each other.
It's plausible that you have operating system packages with a great upstream security team, so you get new releases promptly... and at the same time, you use a bunch of Python Packages that are not packaged by your OS, so you need to subscribe individually to each of their changefeeds and pay attention.
The whole "shared libraries are better for security" idea is basically stuck in a world where we don't really have proper dependency management and everyone just depends on "version whatever".
This is interestingly also holding back a lot of ABI breaking changes in C++ because people are afraid it'll break the world... which, tbf, it will in a shared library world. If dependencies are managed properly... not so much.
I wish distros could just go back to managing applications rather than worrying so much about libraries.
EDIT: There are advantages with deduplicating the memory for identical libraries, etc., but I don't see that as a major concern in today's world unless you're working with seriously memory-constrained systems. Even just checksumming memory pages and deduplicating that way might be good enough.
So if there is 'proper dependency management' (what do you propose? are we too fixed in versioning, too loose?) how will you fix the next Heartbleed? Pushing updates to every single program that uses OpenSSL is a lot more cumbersome (and likely to go wrong because there is some program somewhere that did not get updated) than simply replacing the so/dll file and fixing the issue for every program on the system.
And in case your definition of proper dependency management is 'stricter', then you simply state that you depend on a vulnerable version, and fixing the issue will be far more cumbersome as it requires manual intervention as well, instead of an automated update and rebuild.
If it is looser, then it will also be far more cumbersome, as you have to watch out for breakage when trying to rebuild, and you need to update your program for the new API of the library before you can even fix the issue at all.
No, it is not cumbersome to reinstall every program that relies on OpenSSL. My /usr/bin directory is only 633 MB. I can download that in less than a minute. The build is handled by my distro's build farm and it would have no problem building and distributing statically linked binaries if they ever became the norm.
That is going back to the same issues with containers, where everything works just fine... as long as you build it from your own statically-configured repo and you rebuild the whole system every update. It's useless once you try to install any binaries from an external package source. And IMO, a world where nobody ever sends anyone else a binary is not a practical or useful one.
Yes? Rebuilding and (retesting!) the system on every major update is not a bad idea at all. I rarely install binaries from out-of-repo sources so that is not a great problem for me. And those I do install tend to be statically linked anyway.
> In practice this doesn't work because each application has to be tested against the new version of the library.
In practice it works: see autopkgtest and https://ci.debian.net, which reruns the tests for each reverse dependency of a package, every time a library gets updated.
I know for a fact that other commercial, corporate-backed distributions are far away from that sophistication, though.
> I know for a fact that other commercial, corporate-backed distributions are far away from that sophistication, though.
No, they're not. Both Fedora/RHEL and openSUSE/SLE do similar checks. Every update submission goes through these kinds of checks.
Heck, the guy who created autopkgtest works at Red Hat and helped design the testing system used in Fedora and RHEL now. openSUSE has been doing dependency checks and tests for almost a decade now, with the combination of the openSUSE Build Service and OpenQA.
Having worked with both systems, Debian CI does not focus on system integration. Fedora CI and Debian CI are more similar than different, but Fedora also has an OpenQA instance for doing the system integration testing as openSUSE does. openSUSE's main weakness is that they don't do deeper inspections of RPM artifacts, the dependency graph, etc. They don't feel they need it because OBS auto-rebuilds everything on every submission anyway. The Fedora CI tooling absolutely does this since auto-rebuilds on build aren't a thing there, and it's done on PRs as well as update submissions.
If you can retest everything, you can rebuild everything. But that doesn't help the network bandwidth issue. Debian doesn't have incremental binary diff packages (delta debs) in the default repos anyway, so there's room for improvement there.
> If you can retest everything, you can rebuild everything
No, rebuilding is way heavier.
Even worse, some languages insist on building against fixed versions of their dependencies.
It forces distributions to keep multiple versions of the same library, and this leads to a combinatorial explosion of packages to fix, backport, compile and test.
It's simply unsustainable and it's hurting distributions already.
So if the CVE says that you need to update library L, and program A uses the thing that's broken and it's ok to update, but program B doesn't use the broken thing, but the L upgrade breaks it, CI would let you know that you're screwed before you upgrade... but you're still screwed.
It's worse if the program that needs the update also is broken by the update, of course.
Now it's your choice... you either lose B but protect the rest of the infrastructure from hackers... or you think the CVE doesn't apply to your usecase (internal thing on a testing server), and don't upgrade L to keep B working.
You can also install both versions of L. You can also patch the broken part of L out at the old version, if it's not mission critical. There's a lot of things you can do.
Having one giant binary file with everything statically compiled in is worse in every way, except for distribution-as-a-single-file (but you can already do this now, by putting the binary and the libraries in a single zip, dumping everything in /opt/foo, and letting the user find the vulnerable library manually... which again, sucks).
If it were static libraries, you'd upgrade the package for A (which would need to be recompiled with updated L) and leave B alone. As a low priority followup, fix either B or L so they can work together again (or wait for someone else to fix and release).
Installing both versions of L is usually hard. It's one thing if it's OpenSSL[1] 1.1 vs 1.0, but if 1.0.0e is needed for security and 1.0.0d is needed for other applications, how do you make that work in Debian (or any other system that's at least somewhat mainstream)?
[1] Not to pick on OpenSSL, but it's kind of the poster child for important to pick up updates that also break things; but at least they provide security updates across branches that are trying to stay compatible.
For a rough measure of how many packages will break if you perform a minor update to a dependency, try searching “Dependabot compatibility score” on GitHub (or just author:dependabot I suppose); each PR has a badge with how many public repos’ CI flows were successful after attempting to apply the patch. By my eye it seems to be generally anywhere from 90-100%. So the question is: would you be happy if, every time a security vulnerability came out, approximately 1 in 20 of your packages broke but you were “instantly” protected, or would you rather wait for the maintainers to get around to testing and shipping the patch themselves? Given that the vast majority of these vulnerabilities only apply in specific circumstances and the vast majority of my packages are only used in narrow circumstances with no external input, I’d take stability over security.
Security patches are typically much smaller scoped than other library updates. Also, Dependabot does not support C or C++ packages so its stats are not that useful for shared libraries, which are most commonly written in C.
Rebuilding and running tests on that amount of packages every time there's a security update in a dependency is completely unsustainable for Linux distribution as well as the internal distributions in large companies.
I worked on these systems in various large organizations and distros. We did the math.
On top of that, delivering frequent very large updates to deployed systems is very difficult in a lot of environments.
Works the exact same way without shared libraries, just at the source level instead of the binary level. "Finding every application" is simple. The problem is the compile time and added network load. Both are solvable issues, via fast compiling and binary patching, but the motivation isn't there as shared libraries do an OK job of mitigating them.
> In practice this doesn't work because each application has to be tested against the new version of the library.
Debian's security team issues thousands of patched shared libraries every year without testing them against every program, and without causing failures. They do that by backporting the security fix to the version of the library Debian uses.
I gather you are a developer (as am I), and I'm guessing that scenario didn't occur to you as no developer would do it. But without it Debian possibly wouldn't be sustainable. There are 21,000 packages linked against libc in Debian. Recompiling all of them when a security problem happens in libc may well be beyond the resources Debian has available.
In fact while it's true backward compatibility can't be trusted for some libraries, it can be for many. That's easily verified - Debian ships about 70,000 packages, yet typically Debian ships just one version of each library. Again the standout example is libc - which is why Debian can fearlessly link 21,000 packages against the same version.
I'm guessing most of the people here criticising shared libraries are developers, and it's true shared libraries don't offer application developers much. But they weren't created by application developers or for application developers. They were created by the likes of SUN and Microsoft, who wanted to ship one WIN32.DLL so they could update just that when they found a defect in it. In Microsoft's case recompiling every program that depended on it was literally impossible.
But is this still as important if the executable in question is part of the Linux distribution? In theory, Fedora knows that Clang depends on LLVM and could automatically rebuild it if there was a change in LLVM.
To me that is an argument that doesn't make any sense, at least on Linux. It could make sense if we talk about Windows or macOS, where you typically install software by downloading it from a website and you have to update it manually.
On Linux the only thing it would change is that if a vulnerability is discovered, let's say in OpenSSL, all the software that depends on OpenSSL must be updated, and that could potentially be half the binaries on your system. It's a matter of download size, which in reality doesn't matter that much (and in theory can be optimized with package managers that apply binary patches to packages).
It's the maintainers of the distribution who note the vulnerability in OpenSSL and decide to rebuild all packages that are statically linked to the vulnerable library. But for the final user the only thing he has to do is still an `apt upgrade` or `pacman -Syu` or whatever, and he would still get all the software patched.
That's on the assumption that all software on Linux comes through the official repos of the distribution. I would bet that there are almost no systems where this holds entirely true, as I've seen numerous software packages whose installation instructions for Linux are either `curl | sudo bash` or `add this repo, update, install`.
Actually now that I think about it, building by yourself might put you at a disadvantage here, as you'd have to remember to rebuild everything. I'm kinda lazy when it comes to updates so not sure if I like the idea anymore with having to rebuild all the software I built myself lol, but it probably could be solved by also shipping .so versions of libraries.
Automatic updates are themselves a security risk, which is something that I rarely hear talked about. For example, the FBI's 2015/2016 dispute with Apple about unlocking phones. The FBI's position relied on the fact that Apple was technically capable of making a modified binary, then pushing it to the phones through automatic updates. If Apple were not capable of doing so (e.g. if updates needed to be approved by a logged-in user), then that vector of the FBI's attack wouldn't be possible.
I don't have the best solution for it, but the current trend I see on Hacker News of supporting automatic updates everywhere, sometimes without even giving users an opt-out let alone an opt-in, is rather alarming.
I don't argue for automatic updates. It's pretty much whatever we already have, but instead of updating a single library, you'd have to update every package that depends on that library.
I'm just throwing ideas around so you should definitely take what I'm saying with a grain of salt. It just would be interesting to see a distro like that and see what the downsides of this solution are. Chances are that there probably already is something like this and I'm just not aware of it and I'm reinventing the wheel.
Ah, got it. Sorry, I misinterpreted "update everything regularly" to imply developers forcing automatic updates on every user.
I'm in the same boat, as somebody who isn't in the security field. I try to keep up with it, but will occasionally comment on things that I don't understand.
sure, some packages in some of the popular distros are indeed like that. If the package is important enough (like firefox) and the required dependencies are a bit out of step with what's currently used by the rest of the distribution you will sometimes see that for at least some of the dependencies.
Most distros dislike doing so, and scale it back as soon as it becomes technically possible.
But they dislike just packages requiring different versions of libraries, right? My point is to do literally everything as it is right now, but simply link it statically. You can still have the same version of a library across all packages, you just build a static library instead of a dynamic one.
This is odd to me, because surely they have to maintain that list anyway so they know which dependent packages need to be tested before releasing bugfixes?
Or is that step just not happening?
I just feel like, one of the big points of a package manager is that I can look up "what is program X's dependencies, and what packages rely on it?"
Another issue with security patches is that specific CVEs are not of equal severity for all applications. This likewise changes the cost/benefit ratio.
You also get bugs introduced into a program by changes to shared libraries. I've even seen a vulnerability introduced when glibc was upgraded (glibc changed the copy direction of memcpy, and the program was using memcpy on overlapping memory).
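For anyone who hasn't hit this class of bug, a tiny sketch of the pattern: the memcpy only ever "worked" because the old glibc happened to copy front-to-back, while memmove is the call that's actually defined for overlapping buffers.

```c
/* Sketch: overlapping copies. The commented-out memcpy is undefined
 * behaviour; memmove is the defined way to do it. */
#include <stdio.h>
#include <string.h>

int main(void) {
    char buf[] = "abcdef";
    /* memcpy(buf + 2, buf, 4);   undefined: source and destination overlap */
    memmove(buf + 2, buf, 4);     /* defined: handles overlap correctly */
    printf("%s\n", buf);          /* prints "ababcd" */
    return 0;
}
```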
The memcpy API says that is undefined behavior; that program was never valid. Not much different from bit-banging specific virtual addresses and expecting they never change.
The C language makes a distinction between undefined behavior and implementation-defined behavior. In this case it's the former (n1570 section 7.24.2.1).
Any code that invokes UB is not a valid C program, regardless of implemention.
More practically, ignoring the bug that did happen, libc also has multiple implementations of these functions and picks one based on HW it is running on. So even a statically linked glibc could behave differently on different HW. Always read the docs, this is well defined in the standard.
The source code may have had UB, but the compiled program could nevertheless have been bug-free.
> Any code that invokes UB is not a valid C program, regardless of implemention.
I disagree. UB is defined (3.4.3) merely as behavior the standard imposes no requirements upon. The definition does not preclude an implementation from having reasonable behavior for situations the standard considers undefined.
This nuance is very important for the topic at hand, because many programs are written for specific implementations, not specs, and doing this is completely reasonable.
> You also get bugs introduced into a program by changes to shared libraries. I've even seen a vulnerability introduced when glibc was upgraded (glibc changed the copy direction of memcpy, and the program was using memcpy on overlapping memory).
Did glibc change the behavior of memcpy (the ABI symbol) or memcpy (the C API symbol) which currently maps to the memcpy@GLIBC_2.14 ABI symbol?
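For context, a sketch of how the two can diverge: with the GNU toolchain a program can pin its references to a particular versioned ABI symbol via the .symver directive. This assumes x86-64 glibc, where GLIBC_2.2.5 is the baseline version tag; other architectures use different tags, so treat it as illustrative only.

```c
/* Sketch: binding the memcpy API name to a specific versioned glibc
 * ABI symbol. GLIBC_2.2.5 is the x86-64 baseline tag; adjust per arch. */
#include <string.h>

__asm__(".symver memcpy, memcpy@GLIBC_2.2.5");

void copy_header(char *dst, const char *src, unsigned n) {
    memcpy(dst, src, n);   /* resolves to memcpy@GLIBC_2.2.5, not the default */
}
```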
Central patching had greater utility before operating systems had centralized build/packaging systems connected via the internet. When you had to manually get a patch and install it via a tape archive or floppy drive, or manually copy files on a lan, only having to update one package to patch a libc issue was a big help. Today it would require a single command either way, and so doesn't really have any significant benefit.
It strikes me that the Linux position of "we won't provide a stable binary driver interface to force you to mainline your device drivers" is a kind of spiritual opposite to dynamically linking shared libraries. One of the most common arguments in favor of shared libraries is that you can patch libraries without rebuilding everything under the sun, but the mechanism used for that is in fact a stable binary interface that the Linux kernel itself refuses to provide. So, if nothing else, Linus taking this position on shared libraries is extremely consistent.
I do think usermode and kernelmode concerns are different, because the kernel drivers are basically the libraries or extensions, not the program itself. If you statically link a driver with the kernel, you get a kernel. If you statically link Qt with your GUI application, you get your GUI application.
The kernel moves fast, because it has to. The usermode does not. The libc interface and relevant syscalls are backwards compatible to a greater degree than would ever be reasonable with kernel drivers. Windows does the best here arguably and drivers still had to break a few times in the past couple decades — not All drivers, but certainly some.
I guess the fuller picture to me would need to acknowledge the difference between modular plugins, library dependencies, and programs, as well as the difference between kernel and userland. The latter is definitely important: it’s very rare in userland that your CPU vendor matters over just the ISA itself, but in privileged/supervisor mode, often there are more differences directly exposed, as one example of potential pitfall. I think this sortof exemplifies the increased responsibility of the kernel to keep drivers working through refactors and changing hardware, especially on Linux where the hardware support is extremely diverse.
I do still agree with Linus regarding dynamic libraries, but I also think the two cases make sense for different reasons.
Somewhere around that mark (it was 3 when I responded; I believe that is closer to the truth), maybe more for graphics drivers. Of course, that’s with Microsoft’s legendary commitment to backwards compatibility, more limited scope in which devices Windows targets, and a driver model specifically designed to provide a stable interface.
Linux doesn’t attempt to provide a stable driver model, but based on the fact that many users are stuck on old kernels due to unmaintained drivers, I’m not sure if it would really help that much if it did. Desktop PCs are the place where drivers are most likely to be maintained long term and thats a small part of the Linux install base. The dynamics differ more in the wide range of supported devices.
Nah, it’s just obstinacy. Providing a stable kernel driver ABI would not actually be difficult, but when you’ve dug in your heels about it for 25 years, admitting just how obviously and blatantly wrong you were can be difficult.
If I don't use any features that require libbz2 or whatever a piece of software shouldn't complain if libbz2 isn't there on the system. If I add it to the system I should be able to use those features without recompiling and relinking everything.
Half of the features of autoconf such as "is this library with this method signature available?" should be moved into the dynamic linker.
Software would need to be rewritten to be more lazy and modular, but it'd make dynamic shared libraries actually dynamic. Having to make all those decisions at compile time instead of run time isn't very dynamic.
And the reason we have "dependency hell" in packaging is because software developers never take advantage of this. If applications used dlopen() with versioning (a real thing that has existed for a while now) to load the specific version of the library they want, we could install 1,000 different versions of a library on our systems today, no problem at all.
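A rough sketch of both ideas (the optional libbz2 feature above, and loading a specific versioned soname at runtime): open the library if it's there and degrade gracefully if it isn't. BZ2_bzlibVersion is a real libbz2 entry point, but the soname and the surrounding structure are illustrative and differ between distros.

```c
/* Sketch: treating libbz2 as an optional, runtime-loaded dependency.
 * The exact soname ("libbz2.so.1.0" vs "libbz2.so.1") varies by distro. */
#include <dlfcn.h>
#include <stdio.h>

static void *bz2_handle;

static int bzip2_available(void) {
    if (!bz2_handle)
        bz2_handle = dlopen("libbz2.so.1.0", RTLD_NOW | RTLD_LOCAL);
    return bz2_handle != NULL;
}

void report_bzip2_support(void) {
    if (!bzip2_available()) {
        puts("bzip2 features disabled (libbz2 not installed)");
        return;
    }
    const char *(*version)(void) =
        (const char *(*)(void))dlsym(bz2_handle, "BZ2_bzlibVersion");
    if (version)
        printf("bzip2 features enabled, libbz2 %s\n", version());
}
```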
For different apps to use different versions of dependencies but still interact with each other, they would need strict rules about how to use interfaces to different versions. You obviously don't want two different programs that use two different versions of a library to talk to one another, because what if version A has a different schema than version B? Unless there was a very specific way to pass along interface information between two programs, so that they could independently handle changes in their interfaces.
This is the reality on Darwin-based platforms: If you use an API that was introduced before a certain OS version, you can weak-link the framework or dylib that it’s in, and just not call it, and your code will load and execute just fine without it present.
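A minimal sketch of that weak-linking pattern; the function name is hypothetical, and `weak_import` is the Mach-O spelling (the ELF equivalent is plain `weak`):

```c
/* Sketch: weak-linking an optional symbol. If the library on the target
 * system doesn't provide fancy_new_api, the binary still loads and the
 * symbol's address is simply NULL. "fancy_new_api" is a made-up name. */
extern void fancy_new_api(void) __attribute__((weak_import));

void maybe_use_new_api(void) {
    if (fancy_new_api != 0)    /* check availability before calling */
        fancy_new_api();
}
```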
> Software would need to be rewritten to be more lazy and modular, but it'd make dynamic shared libraries actually dynamic.
This does exist--see the Vulkan API, for example.
And mostly everybody ignores it. Most programmers effectively write a wrapper to load everything and then treat it like the dynamic linker pulled everything in.
The alternative is when the functionality isn't core. We call those plugins and people use them quite a bit.
The problem is now you get a lot of bug reports of "Feature <X> doesn't work." "Well you don't have plugin <Y>. Closed."
Static linkage is goodness.
And, I'm tired of the dynamic library people beating the "security upgrade" dead horse. The biggest problem with the "security upgrade" argument is that a significant number of people refuse to upgrade because an upgrade always breaks something because everything is dynamically linked.
If everything was statically linked, their pet program wouldn't break and they'd be more likely to upgrade.
The problem is that not having the library usually means you also don’t have the headers, so you can’t compile the C code that would use it. What you’re talking about isn’t impossible, but it probably requires either a binary distribution built from a system aware of these libraries (this is the Darwin+availability attributes approach) or some massive header repository that you can pull from to build things.
Linus is a brilliant guy. Underneath his quirky, cold persona, and the short temper, is a guy I admire not just for his technical brilliance, but his ability to put it into words so concisely and correctly. That’s the one superpower people don’t realize makes great engineers (now remember that English isn’t Linus’ first language).
I’ve felt this with Linus, Dave Cutler and other “10x” engineers. Their code really isn’t that much better than anyone else with similar level of experience. Lot of engineers come up with neat solutions. But listening to these men or reading something they’ve written on their topic of interest, it feels as if you’d figured out the thing they’d just explained by yourself.
> (now remember that English isn’t Linus’ first language).
Nor even his second; I'm fairly sure it's his third (at least chronologically).
Why? As a Finland-Swede -- one of the Swedish-speaking minority in Finland -- he would have learnt Swedish as his mother tongue at home (and probably kindergarten / pre-school, if he went to any), and Finnish as his second, at the latest in elementary school. Perhaps in kindergarten / pre-school -- bilingual ones were rarer in his day, but some people put their kids in that of the "opposite" language explicitly to give them a head start on that. Then in around second to fourth grade he would have taken his first foreign language, which for an overwhelming majority is English (though it could also be German or French or something, and nowadays even more alternatives, like Spanish and Russian and whatnot, are available at least at central schools in larger cities). After that he may well have taken another foreign language in parallel, I have no idea (in primary school, up to and including grade 9, that is). I think in his school days, the 1970s / early 80s, that would statistically most likely have been German.
English being his third instead of second language is not necessarily a disadvantage; on the contrary, I suspect it's actually an advantage for him to be at least tri- instead of just bilingual: knowing more languages in and of itself may make you more eloquent in all of them.
Not knowing him personally[1], so wild-ass guess: I'd tend to think he regards English as his second-but-close-to-first language now, having lived in the USA for so long. Probably still thinks -- to the extent one does that in any language at all -- and dreams in Swedish, and speaks it with his wife, but perhaps at least as much English with his kids, who were born there.
At least that's how it works for me. Disclosure: second-hand knowledge of the Finnish school system by having a kid in it, first-hand experience of immigration twice over.
___
[1]: Met him once, about twenty years ago, when he signed a copy of Just for Fun for me.
this really feels like one of those roundabouts that CS/IT does.
"every program the system runs depends on library foo, so why is this loaded into memory x times? Let's load it once and share it"
"there are 10 versions of library foo, none of them are backwardly compatible. Every application loads its own version of foo. There are now at least 5 versions of foo in memory at once. Why is the OS trying to manage this? We should statically link every application and stop worrying about it"
This has already gone back and forth at least once since I started working in the industry.
I think Unison's core concept[0] is the right way to address this: content-addressable packages/libraries. You can then get the best of both worlds - apps that need the same version of a dependency automatically share it, instead of duplicating it, but there's no chance of mistakenly substituting incompatible versions (barring a hash collision).
What about when a security vulnerability in a shared library is found? Each dependent app would need to be explicitly changed to accept the patched version, once compatibility has been established. The problem with trying to force-update all apps by patching the sole version of a shared dependency is that it may break some apps. This runs afoul of the maxim "security at the expense of usability comes at the expense of security." At best, you end up with some hacky dependency-management system, at worst the patch remains all-or-nothing so that the slowest apps veto the update.
You can also imagine that the list of versions of a shared library could be maintained in an implicitly declarative fashion, so that old versions with no dependent apps could be garbage-collected using reference-counting.
I think the big reason Linux has mostly dodged "DLL hell" is that a large chunk of Linux applications are either built from source on the target system, or installed via packages that were built specifically for the target system.
However, if you try to build an application binary, then ship it on more than a handful of Linux distros (and even versions within distros), you'll quickly start running into issues with GLIBC symbols etc.
Containers make this incredibly easy, as you're basically shipping exactly what you built your code with.
I think this is also a place where languages with package managers & the corresponding ecosystems really shine (e.g. Golang, Rust etc). These do well because they are building from source against runtimes that don't involve a pile of shared libraries as you would be doing in C/C++.
> I think the big reason Linux has mostly dodged "DLL hell" is that a large chunk of Linux applications are either built from source on the target system, or installed via packages that were built specifically for the target system.
Correspondingly, that's a non-technical thing to appreciate about shared libraries: they ended up creating the environment where building programs from source or installing them from maintainer-provided packages was the only way to do things. If we end up moving to languages where static linking is the norm (or the only option), an ecosystem shift away from the maintainer + source based approach might be unavoidable.
To my mind, that's pretty clearly a bad thing. It's how you end up with the node ecosystem's security problems, not something as reliable as most Linux desktops have been for decades.
From what I observe, most people use dynamic libraries simply because they cannot get stuff setup correctly for linking statically.
So they're usually not shared, but merely use the shared library file format. For a practical example, most software on Windows will bring its own folder with DLLs.
One step further, this is probably why we're now seeing docker everywhere: it's a Linux equivalent to Windows and Mac app store apps.
> Pretty much the only case shared libraries really make sense is for truly standardized system libraries that are everywhere, and are part of the base distro.
I dare say this also adds credit to the approach of FreeBSD, where the base system itself includes some basic libraries such as openssl, etc. Updating the system patches the libraries uniformly, so any ports using them are automatically patched. And there is no problem with security patches breaking compatibility, because the version of libraries in base remains mostly stable for a particular version of FreeBSD and security patches are backported as necessary. Which is of course what happens in popular long-term support Linux distributions too, albeit at the package level.
It is also possible to have ports that use a different version of a library too, though that often requires building custom packages locally or via a centralized Poudriere builder.
Linus works for the LF, where most of the key members are server, PC, and high-end CPU vendors; resources there are plentiful, so you can run, say, 100 statically linked programs.
There is a different market called embedded devices, where 64MB-256MB of RAM is still very common, and there are probably 100x more devices in this segment than all the LF members' fat ones combined. Without shared libraries, you will quickly run out of RAM/storage space on them.
Yes and in fact, the majority of embedded systems are actually running not with 64-256MB of RAM, but 64K-256K of RAM.
For example, the STM32 line of Cortex M microcontrollers (MCUs) usually have between 32-256K of RAM, and 128K to 2MB of on-chip flash. Different world from a multi-core 1.5 GHz infotainment system (which is arguably embedded), which in turn is different from a laptop with 32GB of DDR4 or a server with .... well, you know....
Embedded systems are usually purpose-built, unlike general desktop usage where users install whatever they want, leading to a much smaller number of processes running simultaneously and effectively removing the benefits of dynamic linking.
Just as Linus says for PCs, I think it's sensible to dynamically link libc and other core libraries on embedded systems too, but it probably makes even less sense for other software there.
Shared libraries are a tradeoff from a time (80's and 90's) where big projects like GUI systems needed to be run on small computers.
Today with the cheap memory resources and fast interfacing you are often better off with programs that are statically linked as they have no run dependencies other than the OS.
Well, on Windows and macOS you need libraries like kernel32.dll etc. to sit in between userland and the kernel, because the kernel ABI is not stable, unlike Linux's.
Also, Linus talks about this sort of exception in his post: "unless it's some very core library used by a lot of things (ie particularly things like GUI libraries like gnome or Qt or similar)".
Even desktop libraries are moving towards IPC architectures, so not even they require dynamic linking. E.g. you don't dynamically link a library for the trashcan subsystem, you communicate with some service on the dbus thingy. This makes hot updates possible, which you don't get with dynamic linking.
Demo scene windows programs essentially revolve around the fact there is an enormous amount built in to windows that can be relied on. You can make a tiny windows GUI binary that is a few kilobytes and has no dependencies other than what comes with windows.
GUI is something like an API. Take React Native: you just have an API for rendering (via a bridge); same for the X server, same for almost any GUI framework.
This is a very very tiny part of an app. UI libraries with prebuilt controls are already almost always bundled with apps.
It's not all the Windows libraries, but often the OS creates just one interface library to wrap the syscalls. Those don't tend to be big libraries where you should worry about statically linking them into your program.
The problem on Linux also has to do with licensing: the GPL can get some programs into trouble if they statically link GPL code.
There are 3 cases for me that can force one to use shared libraries: licensing, dynamically loading a DSO at runtime (like in Apache modules), and when you are developing a big project -- but once you need to ship the production code, go back to statically linking everything.
Other than that, I don't see why you would want this.
> Today with the cheap memory resources and fast interfacing you are often better off with programs that are statically linked as they have no run dependencies other than the OS.
This might be the case for developers, and for some people in the first world, but much of the world is using older hardware with relatively smaller amounts of memory and storage.
I've noticed developers tend to buy new computers often, but a lot of people just have one computer they bought several years ago and have no reason to upgrade it because it works well enough for Facebook, YouTube and email.
>much of the world is using older hardware with relatively smaller amounts of memory and storage.
If you read Linus' email, you'll see that for most libraries it doesn't save any memory or disk space.
What linking statically WOULD do is save the extra I/O needed to load shared libraries (somewhat -- the same code would be loaded at program startup instead of dynamically).
If that library is statically linked, that one function is compiled into your program. The rest of the library is ignored.
If that library is dynamically linked, the entire library is loaded into memory, and the function address is shared with your program. More memory used.
The advantage of dynamic linking is if that specific version of the library is already in memory (because another application is using it) then it takes up no additional memory. Less memory used.
> If that library is dynamically linked, the entire library is loaded into memory, and the function address is shared with your program. More memory used.
False. Shared library pages, like pages in executables, are loaded only on demand. You pay a memory cost for what you use, not what's available.
I'm pretty sure Linus was talking about the fixups to the function addresses that happen at load time. Because those pages are modified, they aren't shared with other instances of the same process (unless those processes forked from same parent after the shared library was loaded?), causing more memory usage and less cache locality than static linking:
> And the memory savings are often actually negative
If you consider inlining and dead code elimination from inter-procedural optimizations, this can sometimes offset the savings from a library shared across 2~5 binaries.
Fixups are done by usermode code, which means the pages containing them can't be shared between different processes. The kernel has no visibility into what is being fixed up.
(Also, the fixups themselves could actually be different between different processes due to address space layout randomization.)
Fixups are only needed for position-dependent code - Fedora (and maybe other distros too) set -fPIC -pie in default CFLAGS for all shipped binary *.so's.
I get this now. The fixups are runtime patches for call addresses in the application code to point to where the shared library function was mmap()'d. Those patches, although they will all be the same, are in fact COW writes which will in turn prevent page sharing. The solution that Linus doesn't mention is prelink.
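A quick way to see why those fixed-up values end up in private pages is to print where a shared library actually landed in a given process. A minimal sketch (assuming glibc's "libm.so.6" soname; build with cc demo.c -ldl): run it twice under ASLR and the printed address differs, so any GOT/PLT data patched with that address is necessarily a private, copy-on-write page in each process.

#include <dlfcn.h>
#include <stdio.h>

int main(void) {
    void *libm = dlopen("libm.so.6", RTLD_NOW);      /* assumes glibc's soname */
    if (!libm) { fprintf(stderr, "%s\n", dlerror()); return 1; }
    /* Address of cos() in this process; under ASLR it changes every run,
     * so pages holding this pointer cannot be shared across processes. */
    printf("cos resolved at %p\n", dlsym(libm, "cos"));
    dlclose(libm);
    return 0;
}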
I quite like shared libraries. What I don’t like is actually sharing them.
I firmly believe that all applications should ship their dependencies. Launching an application should never be more complicated than “unzip and double-click executable”.
Docker only exists because run-time environments are too complicated, convoluted, and mutually incompatible. It’s insane.
On desktops this is 100% correct. On mobile it sucks to have to ship stuff statically linked (or even dynamically linked) but included in the .apk or .ipa. But mobile is a different world, it is more like the past where every resource is constrained. You have limited storage (and storage is relatively expensive), limited memory, limited bandwidth (despite what the carriers want the consumers to believe), and the app stores impose limits too.
I don't disagree with Linus' sentiment. What I find interesting, though, is that some people here are like "we don't need no shared libs, away with them!"
Let's suppose this happens. I'm wondering how all those Python, Ruby, Lua and whatnot programs would use database drivers, xml parsers, protocol interfaces, GUI frameworks, etc - all that stuff they are currently loading as shared libs? What are the alternatives?
Yeah, not to mention interpreted/jitted C++ is a thing (with cling/ROOT). It's very useful to be able to load random shared libraries and do things with them.
Shared libraries aren't the same thing as dynamic linking. They're a specific subset of dynamically linked libraries. You can have a single-use .so, and just not install it in the system libs directory.
Windows eventually got this right. There are a set of OS-provided DLLs, that are true shared libraries. Then applications wanting to use other libraries bundle their own DLLs. Applications don't generally need to screw with the system libraries, and shouldn't be allowed to (exceptions for some drivers and similar).
So, basically the mechanism stays the same (after all, .so stands for Shared Object), but the method of using it is different, right? I like your point about system DLLs in Windows.
Dynamic linking is a Windows term (thus .DLL). In Linux (and *BSD) the same concept is known as shared libraries.
If you force everything to be static in a system then yes -- you need to statically link everything you need into your interpreter too. Things are going to grow in size a lot (and spinning up your interpreter will take longer too).
I personally think that shared libraries are the way to go in most cases. And there are distributions which handle this quite well. Of course simplification and general gravity towards minimalism will help too.
Using FFI is extremely handy (very fast to experiment against some dynamic library when the need arises). I personally think your point is perfectly valid.
Keep in mind Torvalds is replying to a distro person, and distro people like it that way. They're never going to see eye to eye. This piece of software was developed on Fedora 30. Therefore to run the binary you need to install Fedora 30 in a Docker container beforehand. Who doesn't want to ride on coattails? DSOs are like open-lock-in and no native software can escape its grip. These days I'm not even sure if most software can be considered a thing in and of itself that's separable from the specific version of the specific distro on which it's written, due to these libertine development practices using unportable magic .so files as implicit dependencies, since no one wants to write a cmake config vendoring the transitive closure of some C++ library that takes 30 minutes to build.
...so you build it and then cp it into the Docker image, and cease caring about whether a target platform has a copy because it's all in some container layer, no?
That's a big ask for someone who doesn't use Docker. If someone tells me I need it, then I set up a virtual machine. I write binaries that run fine on seven operating systems. Using a single file. If someone comes to me and says their program can only run on a single version of a single distro of Linux then I think that they might benefit from learning my way of doing things.
I can fully agree with Linus there (in general bad; exceptions are very widely used libraries, and maybe .so's as a dynamic extension system).
Even if we look at security, in my experience it doesn't help much in practice (besides the same exceptions of widely used system libraries). Instead it can even reduce security by introducing subtle bugs due to unexpected interactions between the dependency and the dependent, because those versions were never meant to be used with each other.
And at least wrt. open source it's always possible to just recompile the software with patches.
Which brings me to another kind of software where shared libraries make sense: proprietary software that is unlikely to be properly maintained but has non-proprietary dependencies.
If an app is actively maintained, then the publisher should be able to release a patch to fix a security hole, regardless of whether it is in the app's own source code or in the libraries it uses.
If it is abandonware, then hoping that shared libraries will save you is at least naive.
Currently shared libraries only cause pain to the linux desktop users. I always postpone updating my system because I don't want to deal with the broken dependencies. It is insane how bad the situation is.
I've used Debian as a daily driver on my laptop for several months now and I haven't observed issues with broken dependencies. I've observed that issue with other distros in the past (as well as the *BSDs) due to package managers that don't really have a good approach for dealing with complex changes to dependencies (e.g. the recent change in python default version in FreeBSD could not be handled by the package manager and needed to be done manually following instructions). I believe apt basically solves this problem surprisingly well, even in the face of complex changes. I think shared libraries are worth the small pain, since it means that when updating a library, only the specific library needs to be updated instead of everything that happens to statically link it (assuming the upstream maintainer even updates it in a timely manner, or at all!). A world where everything is statically linked would be a world where we were stuck with whatever the upstream maintainer decides is best for them and not what is best for the users.
As Serge Guelton said in a followup email, one of the reasons that Linux distros use dynamic linking is to reduce the disk usage and bandwidth requirement for updates.
Technically speaking, would it be possible to design a Linux distro so that the installed executables are statically linked but an update for a library doesn't require redownloading large executables for every package that depends on the library?
It would still not be pleasant for disk space. Ideally I want my OS as well as the VMs to be small and require as little ram as possible. I don't want to update my entire OS just because glibc had a critical bug.
> As Serge Guelton said in a followup email, one of the reasons that Linux distros use dynamic linking is to reduce the disk usage and bandwidth requirement for updates.
There are a number of ways to solve the latter problem, though: dnf, for example, includes the ability to download only the diff between your installed RPM and the version you're upgrading to.
Absolutely possible, with source distribution. It keeps downloads small, and upgrades (notionally) transparent. The downsides are that it's computationally expensive on the user end, in addition to storage bloat. Does the latest firefox depend on the latest llvm? Depending on how awful your hardware is, you might wait a day for those two packages to build.
There are whole industries based around shipping plugins. While some of these might work as external processes, they possibly still require some bridge to be loaded as a .dll, .so, or .dylib.
Also security updates. Other than that, static libs for the win.
Other examples, speeding up "link" times - Chrome uses component builds - e.g. bundle a bunch of C++ code as .dll and compile/load only that, rather than relink the whole .exe - slower to start and use, but apparently allows you faster development.
For what we do, we are using .dlls to enforce that certain C++ code (libs) isn't suddenly using symbols from another lib when this was not intended (granted, this could also have been checked if we compiled a single .exe for the said lib, but you get the point). We also have a mode where we use these .dlls instead of them being statically linked (we keep both).
If linking has such a huge impact on Chromium dev cycles, maybe they should make their codebase less of a bloated hog :)
Linking all of Firefox into one libxul.so with LLD takes basically no time (on a good machine) for a release build. Slightly noticeable with all the debug info but still pretty quick.
Ah he gets it. Shared libraries are a good idea if you only consider performance and idealized scenarios. In the real world, where users have to use programs that they want to not be broken, vendoring absolutely everything (the way Windows apps do it) is essential.
IMO the best reason to use shared libraries has always been cases where updates are likely critical and there's a well-defined interface that is unlikely to have breaking changes. Like libssl. If there's a security fix, I'd love to have `apt update` apply that fix for everything, and not worry about verifying that everything that might have statically linked it gets updated too. Although ideally the latter would be easy anyway.
It's also nice for licensing - being able to have dependencies that you don't have to redistribute yourself.
There are some shared dll's too, perhaps most notably the OS APIs. But 99% of the dll's on my system sit in Programs\specificprogram\x.dll
"Dynamically linked", doesn't imply "Shared between multiple executables". The trend on windows just like on Linux and Mac is for bundled-everything even for the thing you'd think was last to get there on windows: things like C++ runtimes and .NET frameworks are now bundled with applications.
The tradeoff between patching in one place, and being forced to maintain backwards compatibility and effectively make the runtime an OS component was won hands down by the bundle everything strategy.
To expand on the other answer you got, there's nowhere to put DLLs by default. You install what you need in your directory in Program Files and at that point you may as well statically link them.
Or you may as well not. DLLs sitting next to your executable are pretty much like static linking, except with one crucial difference - they can still be swapped out by end user if the need arises. For instance, to apply a fix, or to swap out or MITM the DLL for any number of reasons. It's a feature that's very useful to have on those rare occasions when it's needed.
- An installer you used could've been compromised. For example, the attacker swaps out a DLL for a bad one, uploads the modified installer to a file sharing site, and gets you to download it from there.
- The application has its DLL swapped/modified on the fly before or during installation by pre-existing malware in your system.
- DLL is replaced at some point post installation.
All of these attack vectors can be pulled against a statically-linked program too, and the privileges they require also allow for more effective attacks - like modifying a system component, or shipping in a separate malware process. Crypto miner will be more effective if it's not tied to execution of Super Editor 2013, even if it's delivered by its installer :).
Problems with malware have little to do with dynamic linking. They stem from the difficulty in managing execution of third-party code in general.
> All of these attack vectors can be pulled against a statically-linked program too
Yeah, but then the attacker would have to pull them against a bazillion apps, in stead of just infecting a bunch of more or less generic DLLs and then just replace all copies of those wherever he finds them.
Which is why I said, "the privileges they require also allow for more effective attacks". If you can scan my system and replace popular DLLs in every application that bundles them, you may as well drop a new Windows service running your malware. Or two, and make them restart each other. Or make your malware a COM component and get the system to run it for you - my task manager will then just show another "svchost.exe" process, and I'll likely never notice it.
> If there's a security fix, I'd love to have `apt update` apply that fix for everything, and not worry about verifying that everything that might have statically linked it gets updated too. Although ideally the latter would be easy anyway.
While it is one failure mode, and a well-understood one, to find that you've been owned by an application with a hidden dependency that had a security hole you didn't know about, the "apt update" approach means that your ability to fix a security hole is now waiting on the time it takes to update thousands or tens of thousands of programs, 99% of which you don't and never will use, against the One True Copy of the library.
Do security fixes break general usage of stable APIs often? That is not my experience. I mean that's the whole point of semantic versioning: you should be able follow a maintenance release line (or even a minor release line) without any incompatibilities. I don't know that I can remember an unknown security issue or other critical fix, certainly not many, where I've had to wait for a major release or break the contract.
Shared libraries are still very important for memory efficiency on mobile systems and in other constrained environments. Android could not work as it does today without the memory savings that sharing libraries gives us. Megabytes matter when you're talking about a device with only a few gigabytes of memory and a hundred processes that all need to run some bit of code or other.
It's important to avoid equating "Linux" with "{Desktop,Server} {Ubuntu,Debian,Fedora,SUSE}". There's an entire universe of Linux-using systems out there that have operating constraints very different from your typical anonymous cloud server machine.
A modern smartphone has more RAM than most computers that people use to do actual work. I see a lot of people around who use computers with 4GB of RAM, while the standard nowadays is 8GB for Android phones. Even with disk space: nowadays a smartphone typically has 128GB of rather fast SSD, while a lot of people still use computers with spinning disks.
Also, are shared libraries all that used on Android? The part of the system that is written in native code is very small, so it probably doesn't make all that much sense to use shared libraries for it.
Smartphone memory is used for lots of things, e.g. modem carve-outs, graphics buffers, and page cache. The fraction left over for actual code is pretty small, especially on the smaller devices common in the developing world. Shared libraries are essential for using this memory efficiently.
Shared libraries amplify future shock. Combined with the modern rapid pace of new features being added to libraries and languages themselves, even distros that haven't been released yet are already out of date and unable to compile and run $latest application. The symptom of this amplified future shock is the rise of containers on the desktop.
The problems of updating applications individually because they're statically compiled are insignificant compared to the problems that come from having to do everything within containers.
> The problems of updating applications individually because they're statically compiled are insignificant compared to the problems that come from having to do everything within containers.
Bundling dependencies isn't the only reason containers are used, though. We still want to sandbox everything, and those sandboxes need to provide dependencies.
At least with dynamic linking you can decouple and share things like "runtimes" in the flatpak vernacular, which does have some advantages, like dependency security updates.
Maybe it’s because I don’t do any C/C++ programming but when you’re building your application can’t you just link your libraries via an arbitrary folder? Why are you dependent on system versions at all? To me that seems horrid. Unless I’m coding directly on the system (i.e. working on the OS itself) can’t you just skirt issues with different versions by just downloading your libraries to an arbitrary folder and pointing your tool chain there?
Exactly. I don't see the benefits of shared libraries in terms of software distribution. It makes sense for them to be used as standard system libraries, like in macOS with System.dylib or libc.so in many distros.
For general apps using third-party libraries, nowadays static libraries are the way to go, as I said months ago [0].
I agree that shared libraries are mostly unneeded complexity, as far as free & open source programs are concerned.
I kinda like how they open up opportunities for fixing or tweaking the behavior of broken proprietary binary-only garbage without requiring you to RE and patch a massive binary blob. Of course, in a better world, that garbage wouldn't exist and we'd fix the source instead.
Are there downsides when library writers build and publish both .so and .a files, and the library users can pick if they want to link statically or dynamically?
If so, which downsides are those?
I can't tell if they exist, and it seems to me that this is a flexible solution that gives the power of choice to the consumer of the library.
I certainly don't want each installed desktop app to have a copy of base gnome/kde runtime and everything down to libc. And the implication is even the graphics would be duplicated, for example the Adwaita icon pack is huge. So if I have a full set of gnome applications (say 50) would I have 50 copies of Adwaita icon set? Suddenly disk space isn't cheap. Shared libs are good and we could do better than flatpaks and containers and static linking.
And while shared libs are a PITA, it's not just because of their nature; it's the lack of tooling from supposedly modern languages, lack of guidance, lack of care for versioning and API stability, and the lack of a distro-agnostic convention. Each of these problems can be solved by not sweeping them under the rug.
I don't know what your point is. He literally says:
>Yes, it can save on disk use, but unless it's some very core library used by a lot of things (ie particularly things like GUI libraries like gnome or Qt or similar), the disk savings are often not all that big - and disk is cheap
He's literally making the point you're arguing. He says, core libraries should be shared.
Tell me how much of the system libraries are written in C or C++, and how much of them are being written in newer languages.
C is the default choice because of its ABI, and the tooling around it is made for using shared libraries.
What can you say about modern languages? Each of them is designed to work in a silo, with not much cooperation for, let's say, system plumbing. Their own package managers make it easy to pull in code, but only produce code for that language. You can't just create a package that works as a shared library for other programs without mucking around with C FFI. They make it hard by default, which creates resistance in developers to making a piece of code usable outside their own language. This trend is pretty alarming, especially when hidden and manipulative language fanboyism is rearing its ugly head everywhere.
You should explain what's wrong with the argument instead of being a passive-aggressive asshole. It was a continuation of why it happens nowadays that people swing to static linking and then posts like this get shamelessly upvoted.
OK, if being an active-aggressive asshole is better (which, judging from your comment, certainly seems to be your opinion): You made a stupid comment. It was pointed out to you that your "big counter-argument to what Linus wrote" was actually exactly what he had written. Instead of graciously acknowledging, or admitting by even so much as a hint, that you were wrong (which was so obvious that one would have to be a total blithering idiot not to get it), you went off gibbering about some other tangent. That makes you the primary passive-aggressive asshole here. Now you've graduated to active-aggressive assholery, which makes you just simply an asshole.
Nix(OS) solved this problem by hashing all packages based on their inputs (including other package hashes) all the way down in a merkle tree. You would have one copy of the icon pack, for example. But if any common libraries are built with different inputs for a particular program it will be duplicated instead of shared. Nix can then go through your store and hard-link any duplicate files between similar packages to save some more space.
> So if I have a full set of gnome applications (say 50) would I have 50 copies of Adwaita icon set?
No. Icons are easier to load with regular open/read than with dlopen. But in any case they go into separate files; they are not in the binary.
Dynamic loading could be used to load data into the process, but it would be a very strange way to do it.
> And just because shared libs are PITA it's not just because of their nature, it's the lack of tooling from supposedly modern languages, ...
It is more complex than this. Dynamic linkage is limited in what it can do. It can't do type parametrization, for example. All it can do is fill gaps in the code with function addresses. And that is not enough -- far from enough. For instance, you wouldn't want to dynamically link a C++ vector, because it is meant to be inlined and heavily optimized. A dynamic linker cannot inline or optimize.
So you are forced to do a lot of inlining at the stage of statically linking the application binary, but then you get binary incompatibility between an app and a lib when the lib is rebuilt with different optimizations.
So I'd say, that modern linux distributions should adapt to modern languages (like c++, lol), not vice versa.
> Each of these problems can be solved by not sweeping them under the rug.
For what end? What might we possibly gain from solving these problems? Dynamic linkage is a runtime cost. Why should we prefer runtime costs to compile-time ones?
Well, surely individual applications only use a few icons, not the entire set, so they can statically link in only the resources they actually depend on, right?
A good reason to use dynamic libraries is for plug-in systems. The application can then dynamically find out what plugins are on the machine and run them. If a plugin isn't installed, nothing breaks. It makes it possible to have an open API to a closed application. Say you have a paint application and you have plugins for file loading; then users can write/install additional file formats as plugins, without the app changing. It can also be helpful in development, since you can move dependencies to separate modules. If a file loader requires libpng, then you need that dependency to rebuild the plugin but not the core application. A minimal loader sketch follows below.
In general I agree with Linus: an application should not be separated into multiple parts unless it serves the user.
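For illustration, a dlopen-based plugin host can be as small as the following sketch (hypothetical names such as plugin_init, not any particular application's API; build with cc host.c -ldl). Plugins that aren't installed are simply skipped, which is the "nothing breaks" property described above.

#include <dlfcn.h>
#include <stdio.h>

typedef int (*plugin_init_fn)(void);

static int load_plugin(const char *path) {
    void *handle = dlopen(path, RTLD_NOW | RTLD_LOCAL);
    if (!handle) {
        fprintf(stderr, "skipping %s: %s\n", path, dlerror());
        return -1;                      /* plugin missing or broken: nothing breaks */
    }
    plugin_init_fn init = (plugin_init_fn)dlsym(handle, "plugin_init");
    if (!init) {
        fprintf(stderr, "%s has no plugin_init: %s\n", path, dlerror());
        dlclose(handle);
        return -1;
    }
    return init();                      /* hand control to the plugin */
}

int main(int argc, char **argv) {
    for (int i = 1; i < argc; i++)      /* e.g. ./host plugins/*.so */
        load_plugin(argv[i]);
    return 0;
}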
I read it as, modules that are loaded and unloaded dynamically, facilitating things like the ability to update a running application. On second reading you are probably right that plug-in interfaces even if plugins are only loaded at startup, would fall in to the category he describes.
Shared libraries are a necessity on Linux, because Linux users pride themselves on the efficiency of their systems, and if two programs used separate copies of the same library, that would be wasteful.
The exception that everyone could get onboard with would be a common API that all programs use. So, for instance, if everyone could agree to use GTK instead of Qt, then GTK could just be shipped as standard on all Linux distros and there would be no problem of bloat - all programs need GTK.
But that goes against the Linux philosophy of modularity & configurability, so that will never happen either.
So you need to strike a happy medium. Make an attempt to use a shared library, but if your users don't have it, download it live.
Forgetting the past, and making the same mistakes...
Not Linus, but a lot of young devs and newcomers are discovering statically linked builds with Go and co., and think that it is a revolution and that it solves all their problems.
But they are probably forgetting, or never knew, the world of static libs before dynamic ones became a thing.
A big part of Linux's success on servers and elsewhere comes from the magic of a full system built on shared dynamic libraries.
Does anyone remember having to install 1GB of software to set up an HP printer on Windows?
The full afternoon or day spent waiting for Visual Studio to install?
The MS Office installation needing multiple gigabytes when OpenOffice could be downloaded and installed in a few hundred megabytes at most?
IMHO it's really the way Linux and other Unix-likes do dynamic linking that's "not a good thing", as well as the "hypermodularisation" that Linus alludes to (the left-pad fiasco would be an extreme example of that.)
Windows prioritizes DLLs sitting next to the application over ones installed system-wide, so you can ship your DLLs with your app, gaining all the benefits of static linking while not losing the benefits of dynamic linking.
True, but I have seen software where the DLL files are in the directory where the EXE lies. So copy+paste from another computer works just fine (in most cases).
Left-pad wasn't really a fiasco. It briefly broke a couple of people's builds. If anything, it showed how resilient the system is. People are actually starting to use NPM as a system package manager.
Meanwhile, dependency hell on Linux distributions keeps people stuck on outdated versions of software, because some piece somewhere is either too old or too new.
Shared libraries would be good if there were better semantics in how they're used. Saying "shared libraries bad" instead of working on fixing those semantics is unfortunate.
I think one area where shared libraries are a huge toll is C++. ABI concerns have crippled a lot of improvements to the standard library. In addition, because C++ development by default uses the system C++ standard library, many times the latest C++ language and library features can't be used.
In addition, with regards to saved disk space and central patching, Docker negates a lot of those benefits. Basically, instead of static linking, you end up shipping a whole container with a bunch of shared libraries.
> We hit this in the subsurface project too. We had a couple of libraries that nobody else used. Literally nobody. But the Fedora policy meant that a Fedora package had to go the extra mile to make those other libraries be shared libraries,
I'm a Fedora packager and I'm not sure what he's talking about here. If a single package uses some code then there's no requirement to make that code into a shared library.
But that's if the libraries are going to be used by another package (and even then it's not a prohibition - packagers should use common sense). Linus said this applied to a single package so it wouldn't need to ship *.a files at all, as everything would presumably be linked to binaries. I'm going to check out this subsurface diving program to find out what really happened here.
To be fair, LLVM is not just clang. It's also clang++, lld, lldb, clang-format, clang-tidy, clangd and dozens of other utilities that rely on the LLVM library.
For anyone considering contributing to tools and distros that support static linking, check out my StackOverflow answer "Static linking is back on the rise" [1].
Shared libraries and dynamic linking make writing FFI wrappers in other languages possible. If every library were statically linked, I wouldn't be able to write things like this: https://github.com/mike-bourgeous/mb-sound-jackffi
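For example, the C side of such an FFI-friendly library can be as small as this hedged sketch (hypothetical names, not the API of the linked JACK wrapper; build with cc -shared -fPIC -o libmixer.so mixer.c). Because it ships as a .so with a plain C ABI, a Ruby or Python FFI binding can load it at runtime and call the function directly, which a fully statically linked program would not expose.

#include <stddef.h>

/* Apply a gain factor to a buffer of audio samples in place. */
void mix_gain(float *samples, size_t count, float gain) {
    for (size_t i = 0; i < count; i++)
        samples[i] *= gain;
}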
"Yes, it can save on disk use, but unless it's some very core library
used by a lot of things (ie particularly things like GUI libraries
like gnome or Qt or similar), the disk savings are often not all that
big - and disk is cheap."
How much savings? Has this ever been measured? Just curious.
The era where disk space is a rare commodity and the savings gained by dynamic libraries are worth the tradeoffs ended a decade ago.
Heck, the heavy usage of Docker or similar in industry really signals that developers are more than happy to trade off disk usage, often a lot of it comparatively speaking, for reliability.
What about memory mapping, shared memory? Are those not a thing on Linux? I'd say err on the side of dynamic linking whenever there's a chance at least 2 programs that load the library can run at the same time.
Does no one seem to care about code hot-loading at all? Shared libraries enable that. It's a HUGE iteration bonus that seems to get glossed over in favor of onerous processes and dogma.
This again. Yes, static linking is better in every way except semantics, because the bloody linker is stuck with 1982 semantics while ELF has gotten way more interesting.
you watch. everything will go static and then some new runtime hot patch mechanism will be introduced.
what does ram usage look like if you build gnome, qt, x, wayland apps statically? seems like you might end up with weird stuff like huge rss for things like calculators and desktop tools. moreover, doesn't it just push compatibility problems to the network protocol/ipc level anyhow? i suppose maybe that's more stable... for now?
I think he has a good point overall, but at the same time I think there can be an equal benefit of shared libraries when it is likely that the library will be shared among multiple applications. It doesn't have to be a tremendous amount.
On Arch, users pick their display manager and windows system, and many Qt applications or GTK applications may share multiple libraries and I can see the benefit of smaller packages and being able to update those libraries separately.
He hinted that it makes sense for base installations. GTK or Qt frameworks and libraries aren't part of the base, and they are highly dependent on what app you're installing. Usually if you are just a casual Linux user you wouldn't understand this. In these instances they aren't commonly shared across all applications, just maybe 2 or 3.
The Rust ABI is unstable, the C ABI is stable. You can always dynamically link against a C object file, even one written in Rust. I would love to see a restricted stable Rust ABI for types without generics and trait objects and nothing else. That would enable library authors to easily express/provide a dynamic linking surface, while the default remains static linking. I tend to agree that things that "are part of the OS" like QT and GTK should be linked dynamically, while most other things shouldn't.
Go is by default statically linked. You can however create shared libraries from Go which can be called like any C library and call any shared library.
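On the consuming side, such a Go-built library looks like any other C library. A hedged sketch (assuming a hypothetical libadder.so produced with go build -buildmode=c-shared, whose Go source exports an Add function via cgo's //export mechanism; build the consumer with cc use_adder.c -L. -ladder):

#include <stdio.h>

/* Prototype matching the hypothetical exported Go function (the generated
 * libadder.h would declare it as GoInt64 Add(GoInt64, GoInt64), where
 * GoInt64 is a typedef for long long). */
extern long long Add(long long a, long long b);

int main(void) {
    printf("%lld\n", Add(2, 3));
    return 0;
}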
Go already has several other deal-breakers for systems programmers (garbage collected everything, runtime-provided greenthreading, etc). Rust is a lot closer to what Linus would approve for in-kernel development, but it still has a few problems, such as failed allocations panicking the system instead of returning an error that can be handled normally.
Interestingly, there are folks working on turning Linux into a unikernel, which means you would statically link each application and its libraries with the Linux kernel and then boot that application as a VM.
While I'm not sure I agree 100% with Fedora's policy that everything(?) must by dynamically linked... I think it has a lot of benefits besides the actual reasons for dynamic linking. I'm not an expert but I have observed a few things over the years using and building packages.
First off, dynamic linking does not seem to make updating software impossible. There are several rolling release Linux distros (Archlinux, Debian-unstable, Void, Gentoo, etc.) and basically everything works while releasing upstream updates within 2-4 weeks typically. Most are maintained by volunteers. The number of core packages that break due to shared libraries is single digits in my experience of Archlinux over the last 5-8 years.
Sure some things are shipped as pre-compiled packages which just come with their own library directories so they can load the versions they need.
I will say that I have encountered software that was library hell. This software tended to be academic in nature and use really poor build processes, like having dozens of makefiles while being developed on ancient versions of Ubuntu. The software was not vaporware or abandoned; it was just poorly developed. Even if static linking would help, the developers chose not to use it out of ignorance.
Other software that has issues with libraries tend to be very large applications where the overhead of shipping an alternative LD_LIBRARY_PATH or 'Docker' container is minimal and has other benefits as well.
I cannot imagine how many projects will actually keep their libraries up to date. Electron apps are already often using old Electron versions, causing issues where the Electron version won't support Wayland as well as the browser version does. I have stopped using the desktop versions of apps like Microsoft Teams and Slack because of this issue. If large companies can't be bothered to update their flagship desktop applications' libraries, how can we expect small, open source projects to do better? They no doubt use community feedback from distro maintainers & users to maintain compatibility with upstream libraries.
Plus, if static linking is used to an extreme amount, then there will be N-~1 more versions to 'maintain'. Any popular library/dependency will become a de facto supported version; just look how long it took apps to move to Python 3. Heck, the 'chromium' package seems to still require Python 2 as a build dependency. Even 'Cellebrite', as evil as the program is, was hacked in part because it shipped with out-of-date libraries, and it probably has a multi-million dollar price tag.
TLDR: Developers of all levels suck at keeping dependencies updated. The distro system & build process provides a much-needed barrier between application development and sanity. It's basically the final code review for the Linux & open source ecosystem. That said, static linking certainly has its place but also has its own set of issues.
Doesn't OpenSSL fall into the "truly standardized system libraries that are everywhere, and are part of the base distro" category? You just proved his idea right yourself ;-)
> If OpenSSL has a hole in it, the fastest way to patch every system in the world is to patch the OpenSSL library. If every OpenSSL app in the world were statically compiled, then every single OpenSSL-using app in the world would need to be patched, tested, released, and upgraded, independently.
Even if we upgrade a shared library and the reverse dependencies do not require a rebuild, we should still re-test everything (I think Debian does this already).
Not sure if people are getting the reference here... The "suckless" tools authors are known to prefer static linking, they even tried to make a linux distro "stali" with the philosophy of static linking everything.
In Windows, "DLL hell" used to be a thing. It's mostly solved by (among other things) Microsoft forbidding installers from overwriting system DLLs. Also today most apps bundle the DLLs along with the apps and don't really share them.
On macOS, brew, the popular package manager, recently took to using shared libs for everything. While it mostly works, it can cause some version incompatibilities, you tend to accumulate garbage on your disk after uninstalling stuff, and now you have to routinely update 3x more packages...
"Don't abstract your code into functions! Just copy and paste it all over the place. That way, when you need to change the behavior of one of them, the rest don't break!"
/s
Abstraction is good. Surely anyone can see that having one copy of code is superior - technically, and from a standpoint of elegance - than duplicating it once per program. A huge amount of effort has gone into making this infrastructure work. This isn't just about disk space, or memory, or the ease of patching - it's about composability. I like that I can override behavior with LD_PRELOAD. If we think of our systems as a collection of unrelated, opaque blobs, then static linking looks seductively simple; but if we think of them as a network of interacting code that transcends application boundaries - vivisected and accessible, like a Lisp machine - then static linking is a repulsive act of violence against deeply held aesthetic principles.
If dynamic linking doesn't work well - if we've messed it up - that's a sign we should try harder, not give up.
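As a concrete (and hedged) example of that LD_PRELOAD composability, a minimal interposer can wrap a libc call and forward to the real implementation. The sketch below logs every fopen() a dynamically linked program makes (build with cc -shared -fPIC -o trace_fopen.so trace_fopen.c -ldl, run with LD_PRELOAD=./trace_fopen.so ./someprogram); this kind of transparent override is exactly what a fully static binary forecloses.

#define _GNU_SOURCE       /* for RTLD_NEXT */
#include <dlfcn.h>
#include <stdio.h>

FILE *fopen(const char *path, const char *mode) {
    /* Find the next fopen in the lookup order, i.e. the real one in libc. */
    FILE *(*real_fopen)(const char *, const char *) =
        (FILE *(*)(const char *, const char *))dlsym(RTLD_NEXT, "fopen");
    fprintf(stderr, "fopen(%s, %s)\n", path, mode);
    return real_fopen(path, mode);
}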
You're forgetting the dream of MULTICS. Modern MMUs grant each program dominion over its virtual address space and white picket fence, like the American Dream. It doesn't inhibit cooperation between program processes, which can totally happen over IPC or network mechanisms. Sort of like how you don't need a farm in your backyard to eat food. You just go to the food store and pay them money. What you appear to want is a collective farm where code is reorganized into a communal state where no boundary is sacred and that MMU you paid good money for isn't worth the silicon on which it's printed.
Which reinvented 5ESS MERT which was the intended purpose of UNIX, which reinvented MULTICS. It's about more than real time reliable systems though. It's about software freedom too. Ask yourself: "Am I authorized to pick an arbitrary pointer address, and store what I want there?" If the answer is "no, you can't, because some DLL or SO or DYLIB file might have been randomly loaded there" then you are not using a Free as in Freedom operating system.
In its own way quite a cool and valid vision... For a Lisp machine, or some other future micro-kernel (and pretty much micro-everything?) operating system.
Not for Linux, though, because that's not the kind of operating system Linux is.
> Torvalds: [hot take that almost no one else agrees with]
Considering the popularity of vendoring and/or static linking dependencies, I'm not sure where you're getting the "almost no one else agrees with" from.
glibc's hostility towards static linking is arguably a substantial part of why musl has attracted so much interest. Static linking support is explicitly stated as one of musl's core tenets [0].
One can also argue the popularity of containers has in part been due to glibc's dynamic linking requirements, and containers have become the bloated "modern" version of static linked binaries, bundling all the dependencies without eliding all the dynamic linking overhead.
Torvalds is definitely not alone in this particular opinion.
> Considering the popularity of vendoring and/or static linking dependencies, I'm not sure where you're getting the "almost no one else agrees with" from.
Vendoring is hugely popular for projects in year zero, because it makes it easier to ship. Until they learn that packaging specific versions of all your dependencies with your project implies taking on responsibility for backporting all security updates to that version of every dependency when upstream is only actively maintaining the latest version. Or you have a project riddled with known published security vulnerabilities, which is what most of those turn into.
You use a library, you've already taken on responsibility for it. If your application has bugs due to bugs in a library, your users aren't going to accept "we reported it to the library and they said it wasn't important" as the resolution.
At which point you either submit a patch to the library or switch to a different library. That much of the work isn't avoidable, except maybe by choosing a better library at the outset.
But having to fix it yourself when the library maintainers fail once is a far cry from having to backport every security update in every library yourself even if the library maintainers are properly doing it for a newer version than you're using.
The problem is that doing the backporting is typically more work than updating your software to use the newer version of the library, but not maintaining the library version you use at all is invisible in the way that security vulnerabilities are until somebody exploits them.
Vendoring libraries isn't really different from keeping a version lock file around. You need to review and update the dependencies regularly and monitor them for security issues. Only difference is that instead of a file with the versions you have the full libraries.
I personally don't like to vendor as it bloats the revision control system, but as far as dealing with bugs in the dependencies it doesn't really make a difference.
> Vendoring libraries isn't really different from keeping a version lock file around. You need to review and update the dependencies regularly and monitor them for security issues. Only difference is that instead of a file with the versions you have the full libraries.
In the first case, using a version lock with dynamic linking, the maintainer of the library package is the one paying attention to and shipping patches for the library. And if the version of the library you're using stops being supported, your software breaks and you have to fix it immediately.
That's the thing vendoring "makes easier" -- you're not dependent on the third party to maintain that library version. Except that now it's on you to do it, and the breakage that would have told you that the version you're still using may no longer be secure is now absent. So the developer has to pay more attention. Which they regularly do not.
You have two choices here. Keep updating as you go along and amortize the price of so many open source deps' inability to avoid breaking changes; or don't do that, and then eight years later lead a heroic update-the-deps project that takes ten people and millions of dollars and some availability hits. You keep the principle of changing only one thing at a time the first way, and get promotions the second way.
Static vs dynamic is just: do you want the code owners to be responsible, or some centralized ops group? Usually ops doesn't have the insight to do a good job with deps, so it's better to have dev do it, e.g. statically linked Go binaries, statically linked C/C++, frozen package versions in Python (I think mvn has a similar capability), also version-pinned base container images, etc. Hence the fact that all the packaging systems add new versions and only rarely remove things, as removal breaks it all worse than a security hole.
As a dev person with a lot of ops experience I like static because I don’t want some janky dependency change to come into the execution path when I am not aware of making a change. On the other hand, people usually just freeze everything and then don’t worry about it, which is wrong. But your security team needs to review source and source deps not just deployed versions. So if code is still owned, the bad deps can be found and fixed either way.
Ooor... you just upgrade your library versions. Backporting fixes is something you do as a distro maintainer to avoid having to update each application separately. It almost never makes sense as a thing to do as an application author. If you find yourself in a position where you don't want to upgrade to the version being maintained by upstream then you're in trouble anyway.
If you are using upstream libxyz 1.3 but they don’t provide security updates for it then you should do something even if you are using dynamic linking. Imagine that redhat provides security updates but Debian uses 1.4, are you going to tell your users that they must use redhat?
I don’t see that dynamic linking makes a lot of difference here.
Typically with dynamic linking, the library is maintained as a separate package by a separate party who takes distribution-wide responsibility for backporting anything.
So you have two alternatives. One is that you make your software compatible with both version 1.3 and version 1.4, so it can be used on both Redhat and Debian with dynamic linking. This is a bit of work up front which is why everybody grumbles and looks for an alternative.
The other is that you vendor in version 1.3 so that you're using version 1.3 even on Debian. But now that library is your responsibility to patch instead of the distribution's. So instead of doing that work up front, you either maintain the library yourself on an ongoing basis (which may be more work than supporting two only slightly different library versions up front), or you neglect to do it and have a security vulnerability.
As a software author, the first alternative doesn't sound very attractive to me. Because as an author, I will want the upstream version of libxyz, and if the distribution provides security updates past the upstream date, this means that I need to use that exact distribution to benefit.
And vendoring in is pretty much equivalent to what happens with static linking.
My parent had suggested that dynamic linking is some great solution, and I don't see it's that great.
> Because as an author, I will want the upstream version of libxyz
Why would that be true in general? It would only be the case if the upstream version contains some feature you need that isn't present in the version packaged by the distribution, and then you would only need to do it until the next version of the distribution packages the newer library version.
> and if the distribution provides security updates past the upstream date, this means that I need to use that exact distribution to benefit.
Or support the multiple versions of the library, which for non-ridiculous libraries is often no more than refraining from using the features not present in the older versions.
>> Because as an author, I will want the upstream version of libxyz
> Why would that be true in general?
Because I want to write software that works on various Unices, not just one Linux distribution. So it seems safer to rely on software that my users (the people who install the software I wrote) will be able to install from source, if necessary.
If I were to require whatever version Debian had at the moment, then I'd leave Fedora, Arch, OpenBSD, FreeBSD, Illumos aficionados in the lurch.
You can also automate the upgrade process with good tooling.
Detect a new version? Update your local copy, run your CI, if it passes, you can automatically merge. Or you could ask for someone of your team to review the changes, knowing your tests passed.
And if it doesn't, you can warn your team, who can then either fix the build manually (hopefully a small incremental change) or open a bug report to the library.
The point is to do that continuously, so that the deltas are smaller and easier to handle. I do that with some pretty critical libraries at work and it's mostly fine; it usually takes a few minutes each week to approve changes. Doing an upgrade after 6+ months? That'd require a long rollout.
I'm not sure how dynamic linking even makes this process easier. Either you test every update or you don't. With static linking you could easily automate updating all the dependencies. You just risk randomly breaking things, which is the normal case with dynamic linking anyway.
> glibc's hostility towards static linking is arguably a substantial part of why musl has attracted so much interest.
Another reason is that glibc is equally bad for dynamic linking. Often a binary compiled against a recent glibc doesn't work with an older glibc. It is certainly possible, but quite challenging, to compile portable executables on Linux.
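One commonly used workaround, for what it's worth: pin individual symbols to an old glibc version with a .symver directive, so a binary built on a new glibc still resolves on older ones. The version tag below is the x86-64 baseline; the right tag depends on the architecture and the symbol, so treat this as a sketch rather than a recipe:

    /* Force memcpy to bind to the old symbol version instead of
     * memcpy@GLIBC_2.14, which newer toolchains pick by default. */
    #include <string.h>

    __asm__(".symver memcpy, memcpy@GLIBC_2.2.5");

    void copy_header(char *dst, const char *src, size_t n)
    {
        memcpy(dst, src, n);   /* now satisfiable by a pre-2.14 glibc */
    }

Doing this for every versioned symbol is tedious, which is why people end up building in old containers or linking against musl instead.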
Maybe I hyperbolized that a bit. I think there are pros and cons to each approach. The use cases for static vs dynamic are also different. There are compile-time disadvantages to static libraries, and there are security disadvantages to shared libraries. But calling one method better than the other is inane.
Are you aware of the runtime differences? Both time and space.
Shared libraries (at least with ELF) require fixups, which involve copy-on-write and a considerable number of per-process pages. Compiling with -fPIC also reserves one register, which isn't a problem unless you're on a register-starved CPU, which is all of them.
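A rough way to see where both costs come from, using nothing more exotic than gcc -S (the file and symbol names are made up, and the exact output differs per compiler and target):

    /* pic.c - compare:  gcc -O2 -S pic.c   vs   gcc -O2 -fPIC -S pic.c */
    extern int counter;      /* defined in some other module */

    int bump(void)
    {
        /* Without -fPIC this is a direct reference to `counter`.
         * With -fPIC the address is loaded through the GOT; on 32-bit
         * x86 a register (conventionally %ebx) is kept pointing at the
         * GOT, which is the reserved register mentioned above.  The GOT
         * itself lives in writable pages that the dynamic linker patches
         * at load time, which is where the COW'd per-process pages
         * come from. */
        return ++counter;
    }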
AArch64 isn't particularly short of registers, but IIRC even it is short enough that they implemented register renaming. I agree that taking a register away is a much less pressing issue there than on the x86.
The purpose of register renaming in the AArch64 microarchitectures is to support out-of-order execution. By comparison, you don't find register renaming in the in-order Cortex-A53.
Suppose you had enough registers, i.e. so many that there's no real reason for compilers to spill. In that case, you could "support" out-of-order execution by adding a couple of sentences to the compiler guidance documentation for your CPU, telling compiler writers that they'll get better performance on out-of-order cores if they wait as long as practical before reusing registers.
They've tried variations of this. It takes instruction bits to describe these registers: 32 registers is 5 bits, so target + src1 + src2 is 15 bits already. They've tried scratchpads. The Mill CPU's Belt is yet another approach.
https://en.wikipedia.org/wiki/Scratchpad_memory
https://millcomputing.com/
How to Use 1000 Registers
https://caltechconf.library.caltech.edu/200/
1000 registers would need 10 bits per operand. So they tried register windows.
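Making the bit budget explicit (assuming the common fixed 32-bit instruction word):

    \lceil \log_2 32 \rceil = 5,\quad 3 \times 5 = 15 \text{ bits};\qquad \lceil \log_2 1000 \rceil = 10,\quad 3 \times 10 = 30 \text{ bits}

so a three-operand instruction with 1000 architectural registers would spend 30 of its 32 bits just naming registers, which is what pushes designs toward register windows and the other tricks above.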
These are all good ideas but they get to compete. So far, I'm a huge AArch64 fan but I now see the purpose of RISC-V. I watched Chris Lattner's ASPLOS talk and it finally clicked.
I know... and it's quite interesting really, but for the purpose of this thread, I think it's safe to say that the CPU-wallahs spend considerable effort on ways to lighten or avoid register pressure, and from that I infer that all of the currently widely used CPUs do have significant pressure.
Much more on the x86 than on sane architectures, of course.
Until the day comes when an application isn't updated with a patched library and people get hacked. This is why I'm not so keen on statically linked applications. I'd rather my applications focus on their own concerns and link to shared libraries for stuff like SSL and so on. That way the SSL people can focus on shipping a secure SSL library, and application people can focus on shipping applications built on top of it.
This is an unlikely scenario. Most vulnerabilities are not in shared libraries. If you don't update your software, either it doesn't matter, or you eventually run into security issues.
Optimizing for the unlikely scenario is not a worthwhile tradeoff. Focusing on shared libraries can indirectly lead to less security overall: dependency hell leads people to defer upgrades, so they end up running outdated software.
While I want static linking, I would still use a package manager for convenience and because it provides some form of curation. (Packages in repos are usually not malicious.)
Sure, but again, if these packages were simple binaries that "just work", you wouldn't need your package manager to be strongly coupled to your distribution.
I don't think this is the case here. It's not hard to agree that shared libraries are meant to be _shared_ among different applications and when they're not (or not shared enough, which is a subjective measure) it's best to statically link.