This is far too narrow a view. It's not a question of whether to dynamically link or not, but WHERE and HOW to dynamically link.
Think about it: If you were to force absolutely everything to be statically linked, your KDE app would have to include the ENTIRE KDE library suite, as well as the QT libraries it's based on, as well as the X window libraries those are based on, etc etc. You'd quickly end up with a calculator app that's hundreds of megabytes.
But let's not stop there, because the linkage to the kernel is also dynamic, which is a no-no. So every app would now need to be linked to a specific kernel, and include all the kernel code.
Now imagine you upgraded the kernel. You'd have to rebuild EVERY SINGLE THING on the system to build with that new kernel, or a new version of KDE or QT or X or anything in between.
The kernel membrane is a form of dynamic linkage for a reason. Same goes for IPC. Dynamic linkage is useful and necessary; just not to such a microscopic level as it once was due to size constraints.
The key is not to eliminate dynamic linkage, but rather to apply it with discretion, at carefully defined boundaries.
> Think about it: If you were to force absolutely everything to be statically linked, your KDE app would have to include the ENTIRE KDE library suite, as well as the QT libraries it's based on, as well as the X window libraries those are based on, etc etc. You'd quickly end up with a calculator app that's hundreds of megabytes.
That is false. Static linking only links in the parts that are actually used, which would be a small fraction of the total.
Almost all parts end up being used, though - for the vast majority of the code it's not possible to prove that it's never used.
KCalc would for instance pull in HTML rendering for the Help dialog, which would in turn add all available picture format plugins, PDF writing support, etc.
And as you begin to flesh out some kind of calling convention, you pretty quickly end up with what could very easily be called dynamic linkage of sorts, even if it's not strictly the linking of object files, etc.
Yeah, I figured that'd be a response, but if you're gonna call mime type handlers dynamic linking we may as well give up on using the term at all.
I just don't think "I need to let the user open an HTML document, though not as any core part of my program's functionality" is a strong case for "well I guess I'll have to embed an entire HTML rendering engine if I statically link my deps".
It feels like you're missing his point. OP is pointing out that statically linking everything is unworkable. You're arguing that static linking isn't unworkable because you can just dynamically link instead, which doesn't respond meaningfully to the OP.
I may be totally misunderstanding asark, but the model I thought they were suggesting was something like how COM works in Windows. You can, for example, talk to other COM components and pass data back and forth using an efficient binary message passing protocol while still letting them live in their own process space.
It allowed some interesting things. For example, you could write applications that could read, display and modify the contents of Excel files, except they didn't do it directly; they delegated all the actual work of opening, reading, and modifying the *.xls file to Excel itself.
I wouldn't personally consider consuming a COM interface like that to be a form of linking.
That said, getting back to the broader context, I've no idea how you'd make something like that work for a GUI toolkit. But WinRT is supposedly based on COM, so maybe they got something figured out?
No, I'm arguing that including a manpage doesn't mean your program is now dynamically linked to the man command, or that receiving a .doc attachment in my email client doesn't mean my email client is now dynamically linked to a word processor. If you're going to call that dynamic linking then we've officially reductio'd this out of the realm of usefulness.
Maybe. But for an application opening html documentation, the convention already exists.
xdg-open [URL]
Opens the URL in the user's default browser, which is a wholly reasonable and expected way for a program to behave. More so, I'd say, than using shared libraries to pull up some sort of kludgy KHTMLPart thing.
No, by default, everything is included, at least in GCC. Go ahead and statically link against some unused .a of your choice and run objdump if you don't believe me. There are some special compiler flags that can be used to prune unused functions, to some extent, but they aren't on by default.
I mean static linking is not on by default either so you have to flip a flag as well. Adding -ffunction-sections -fdata-sections to the compiler and --gc-sections to the linker is not hugely difficult, although of course you have to make sure that the static libraries were built that way as well.
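Concretely, the invocation looks something like this (a minimal sketch; app.c and libfoo are just placeholder names):
gcc -Os -ffunction-sections -fdata-sections -c app.c
gcc -static -Wl,--gc-sections app.o -lfoo -o app    # the linker drops any section nothing references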
That being said, I agree that even with that you'd still end up with a massive binary, because of all the code that's actually used in one way or another and all of the code that the compiler and linker can't statically mark as dead. In particular, anything called indirectly through a function pointer is almost certainly going to end up in the final binary regardless of whether it's used or not.
And even if a lot of the code gets removed it's a bit silly not to reuse the shared code in RAM instead of allocating pages containing the same thing over and over again. It's wasteful and cache-unfriendly.
> Think about it: If you were to force absolutely everything to be statically linked, your KDE app would have to include the ENTIRE KDE library suite, as well as the QT libraries it's based on, as well as the X window libraries those are based on, etc etc.
No, because we've had working LTO for a long time - every KDE app would contain only the exact code that it needs, down to the individual member function. Sure, all KDE apps would have a copy of QObject / QString / QWhatever - but most C++ code nowadays is in headers anyway.
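For reference, that's just a normal static build with -flto added (app.cpp is a placeholder here, and the static versions of the libraries obviously have to be available):
g++ -O2 -flto -c app.cpp
g++ -O2 -flto -static app.o -o app    # the real codegen happens here, across all objects at once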
> You'd quickly end up with a calculator app that's hundreds of megabytes.
The DAW I'm working on (340kloc), which can be built statically linked to Qt, LLVM, libclang, ffmpeg and a few others, is a grand total of 130 megabytes in that case.
It's actually heavier when distributed as a set of dynamically linked things (unless shipped by Linux distros of course, and most of that size is useless LLVM things that I haven't found how to disable yet):
only accounting for the dependencies I have to ship, it would be 176 megabytes - plus 17 megabytes for the actual software. I'd argue that in that case static linking actually takes less disk space overall, because most people don't have LLVM installed on their machines and don't need to.
The code needed is more than you might think. The transitive usage graph is surprisingly dense. FYI I'm one of the original Qt developers, and I remember how it was to try to slim down Qt for embedded use.
Java is like that too. The smallest possible program (just a single line to exit) requires things like java.time.chrono.JapaneseChronology. Because argv is a String array, and making the String array requires calling a function or two that can throw exceptions, which requires the Throwable class, which has a static initialiser that requires... the chain is long, but at the end something has a SomethingChronology member, and so AbstractChronology.initCache() is called and mentions JapaneseChronology.
A friend of mine tells a story about how he accidentally FUBARed the tests, and then discovered that running one unit test gave 50% line coverage in Rails.
We have big libraries nowadays, and we use them...
This is one of the things I like most about HN. You read a random answer to a comment and end up discovering it's from one of the original authors of Qt. Thanks for the nice work!
Regular LTO for large projects (think Chromium-sized) is, in my experience, by far the biggest bottleneck in the build process. It's a big part of the reason so much is being invested into improving parallel linking and into techniques like ThinLTO, which aim at reducing link times: LTO linking often takes around 60-70% of the total build time, despite heavy use of C++ (albeit with no exceptions or RTTI).
Unless you have build servers capable of rebuilding all of Qt, WebKit etc. and performing an LTO link (which pulls in all build artifacts in the form of bitcode archives/objects) in a reasonable amount of time - a big reason build labs exist; it takes a long time - LTO is not likely to be suitable. It's an extremely expensive optimization that essentially defers all real compilation to the link step, at which point the linker calls back into libLLVM/libLTO and has them do all the heavy lifting.
At the very least you need a workstation-grade machine to be able to do that kind of thing on a regular basis; you really can't expect everyone to have that. And there's a reason libLLVM.so is usually dynamically linked: it cuts a massive amount of time spent on builds, which is especially useful while developing. It's a middle ground between building all the LLVM and Clang libraries as shared objects and waiting for the static linking of the various LLVM modules into every LLVM toolchain binary (which tends to result in a much, much bigger toolchain). The build cache with a shared libLLVM.so for Clang/LLVM/LLD builds is around 6-7 GB (Asserts/Tests builds); statically linking the LLVM modules blows that up to 20 GB. God forbid you actually do a full debug build with full debug information on top of that.
That's a terrible argument against dynamic linking. That's not to say static linking is bad; in fact, it's recently been making a comeback for exactly that reason - LTO and next-generation optimizers. But saying LTO makes static linking viable for everyone, including consumers, is somewhat far-fetched.
> At the very least you need a workstation-grade machine to be able to do that kind of thing on a regular basis; you really can't expect everyone to have that. And there's a reason libLLVM.so is usually dynamically linked: it cuts a massive amount of time spent on builds, which is especially useful while developing
I'm of course not arguing for doing LTO while developing; it seemed clear to me that the context of the whole discussion is what gets released to users.
> your KDE app would have to include the ENTIRE KDE library suite, as well as the QT libraries it's based on, as well as the X window libraries those are based on, etc etc. You'd quickly end up with a calculator app that's hundreds of megabytes.
If you look at the transitive dependencies and then only link in the code that is actually reachable, I doubt that static linking has a significant impact. (I don't know to what degree this is possible with today's toolchains, but I suspect that many libraries have some inefficiencies there due to missing dependency information.) As an uneducated estimate (from someone who has written compilers and GUIs entirely from scratch), I'd say 100K of additional statically linked object code should be enough for almost any desktop application.
> the linkage to the kernel is also dynamic
Not an expert here, but you don't link to the kernel at all; the interface is via syscalls. At least on Linux, to help reduce inefficiencies at the syscall boundary, there is some magic involving the so-called vDSO, which is indeed dynamically linked, but it shouldn't be very large either. (On my system the [vdso] and [vsyscall] mappings (both executable) each have size 8K. There is also a non-writable, non-executable [vvar] of size 8K, which I guess maps live variables written by the kernel.)
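If you're curious, you can see those mappings for any process on a Linux box (sizes and addresses will differ between systems):
grep -E 'vdso|vvar|vsyscall' /proc/self/maps
ldd /bin/ls | grep vdso    # shows up as linux-vdso.so.1, with no backing file on disk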
> As an uneducated estimate (from someone who has written compilers and GUIs entirely from scratch), I'd say 100K of additional statically linked object code should be enough for almost any desktop application.
This is completely misguided. For a modern UI library it is very very hard to figure out what code is actually used.
Say we load the ui description from an xml file. What controls get used in the app then? Maybe the xml has a webview control in it? Then we can't drop the webview dependency.
Then your app needs to load an icon, which typically goes via some generic image loading API. What types of loaders are there? Can we get rid of the gif loader (and dependencies)? If any single image file reference is not hardcoded in the binary, again we can't.
Want to script your application in lua/python/js, and let it call into the UI library? Then we have to assume that basically any code can be reached.
Yes, I agree. It does not work with deeply nested dependency chains where some choices are only decided at runtime (after loading additional text files).
Maybe it's possible, though, if we remove a little flexibility and flatten the dependencies, such that the application hardcodes a little more of what actually happens (e.g. specifies the file formats for image loading, the format of the UI description, etc).
This is the way I've done my own GUI application, where I simply coded what should happen, and created widgets as I needed them, and all the important decisions were known statically. But maybe this amount of specification is not what people expect from a standalone GUI library.
It is possible to do it. And it probably happens naturally for a handrolled toolkit. However, when the (one) developer of the GUI is different from the (many) app developers using it, things typically end up differently.
For instance you end up with things like ui designer tools that produce manifest files, where you just expect that some non-programmer can edit it and load it into your app without having to rebuild the toolkit.
Statically linking the OS is quite common nowadays [1], better known as containerization.
For quite a while there has been a movement toward every program or "executable" being statically linked.
In your example, the "OS" is a program of its own; it's controlled by a single vendor and released/updated as a unit. So dynamic linking is fine there.
For a downloadable program, however, statically linking Qt would seem prudent; otherwise an incompatible version on the OS would render the program unusable.
You see this design decision nearly everywhere legacy is not too big a concern: Go and Rust statically link their runtimes and dependencies into executables, JVM consumer apps are usually bundled with a specific JVM, containerization, etc.
[1] edit: masklinn correctly pointed out that usually the kernel is still shared between containers and the host OS
> Statically linking the kernel and OS is quite common nowadays, better known as containerization. For quite a while there has been a movement toward every program or "executable" being statically linked.
Containers specifically do not statically link the kernel, the entire point is to have multiple isolated userlands use the same underlying kernel. That's why they're so cheap.
Ok, yes, but that's just the kernel, and it's cheap only relative to a full VM. A container may or may not be cheap depending on the language runtime and whatever helper apps you're bundling. Bundling the python interpreter or a Java VM into every app isn't cheap.
This reminds me of early web servers and CGI's. A CGI can be relatively cheap to run as long as you write it in C and don't link in too much; not so much for a scripting language.
Redundancy can happen at multiple layers, so deduplicating at one layer might not be enough, depending on the situation.
> Fair enough, although it is the case that if the host OS is different than the container's, VM would be a better name then, I guess.
If the host OS is different than the container's you need either an intermediate hypervisor / VM (pretty sure that's what happens for docker on OSX, and why it's much more expensive than on linux) or for the host to provide an ABI compatible with the VM's OS/expectations (WSL, smartos's branded zones).
Either adds even more "dynamic linking" to the pile.
> So maybe not kernel, but the entire userland/OS of the app is still statically linked with containers.
Yes, a container's userland should be statically linked.
Do you have any evidence to support your claim? My understanding is that containerized processes are theoretically the same as any other process on the system in terms of runtime performance.
In Plan 9 they are exactly the same, not just theoretically. Isolation happens at the file system level (everything is a file, but truly so, not a "special device file").
While it's not really possible to compare performance in practice, the fact that this is part of the initial design and core of Plan 9, versus an ad-hoc feature (actually ported from Plan 9 to Linux), is enough to speculate about performance.
> Linux namespaces were inspired by the more general namespace functionality used heavily throughout Plan 9 from Bell Labs.
> While it's not really possible to compare performance in practice, the fact that this is part of the initial design and core of Plan 9, versus an ad-hoc feature (actually ported from Plan 9 to Linux), is enough to speculate about performance.
It really doesn't. For all we know, the Plan 9 version is much slower than the ad-hoc version precisely because it has to be more general.
> Statically linking the OS is quite common nowadays, better known as containerization.
Shipping prepackaged images with self-supplied libraries is functionally equivalent to shipping a statically linked blob. But the implementation is different: in a container, each program still loads the self-supplied libraries through the dynamic loader, and this difference matters. Only dynamically loaded libraries combined with PIE executables enjoy the full benefit of Address Space Layout Randomization; statically linked executables cannot, and they leave fixed offsets that can be exploited directly by an attacker.
> Security measures like load address randomization cannot be used. With statically linked applications, only the stack and heap address can be randomized. All text has a fixed address in all invocations. With dynamically linked applications, the kernel has the ability to load all DSOs at arbitrary addresses, independent from each other. In case the application is built as a position independent executable (PIE) even this code can be loaded at random addresses. Fixed addresses (or even only fixed offsets) are the dreams of attackers. And no, it is not possible in general to generate PIEs with static linking. On IA-32 it is possible to use code compiled without -fpic and -fpie in PIEs (although with a cost) but this is not true for other architectures, including x86-64.
Unless we stop writing major programs in C one day, and/or formally verify all programs in the system and provide a high-assurance system that most people would use, using exploit mitigation techniques is still the last line of defense to limit the damage created by a memory-safety bug.
After thinking for a while, I think I should change my stance and support containerized applications such as Docker or Flatpak, despite their being ugly blobs. A buggy and ugly container with prepackaged applications dynamically linked to a vulnerable blob with PIE and ASLR is less of an evil than a buggy and ugly executable statically linked to a vulnerable blob at fixed addresses.
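If you want to check which case a given binary falls into, these two commands do it (the path is just an example):
file /usr/bin/kcalc                      # prints "pie executable" for a PIE build
readelf -h /usr/bin/kcalc | grep Type    # DYN means PIE/relocatable, EXEC means a fixed load address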
You are conflating a lot of terms that have nothing to do with each other.
Static linking never crosses process boundaries, only brings in the modules that are actually called from the program the libraries are being linked into, and has absolutely nothing to do with IPC.
Dynamic linking is useful, but not always necessary and not always appropriate - and forgoing it does not result in 'calculator apps that are hundreds of megabytes'; it might be 10 MB or so, which is still large but not comparable.
If I'm not mistaken, shared libraries are also shared (for the read-only sections) in RAM. Saving 10 MB on disk is mostly negligible nowadays, but 10 MB in RAM less so, and 10 MB in cache is very much significant.
I really think that the linked post throws the baby out with the bathwater; library versioning issues can be handled by the package manager to the point of becoming effectively a non-issue. DLL hell is a thing on Windows because it lacks a proper package manager. On a case-by-case basis, applications that can't be packaged properly (proprietary binaries for instance) can be statically linked.
In my experience that's how it works on Linux and for the BSDs and as far as I'm concerned it hasn't caused many issues. Maybe I've just been lucky.
Shared libraries are shared in RAM between processes. But I believe the cache for each process is not shared. If three processes currently have the same shared lib code (or any shared memory) in their working set, they will each work it into cache independently. This sounds inefficient but is rare in practice and maintains process isolation. The added complexity of cache sharing would not be worth it.
KCalc doesn't use the whole Qt library, the whole X library, the whole KDE library, or the whole C library. When you statically link, only the functions that are actually called are included. This can trim a lot of space.
> When you statically link, only the functions that are actually called are included.
This is not true in general.
For this code:
#include <stdio.h>
int main(void)
{
puts("hello world\n");
return 0;
}
both GCC 7.3.0 and Clang 6.0.0[1] on my system generate an 828 KB binary when compiling with -static. The binary includes a whole bunch of stuff, including all the printf variants, many of the standard string functions, wide string functions, lots of vectorized memcpy variants, and a bunch of dlsym-related functions.
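Easy to reproduce, assuming the snippet above is saved as hello.c (exact sizes and symbol counts will vary with the toolchain and glibc version):
gcc -static -O2 hello.c -o hello
size hello                  # text/data/bss breakdown
nm hello | grep -c ' T '    # rough count of code symbols that got linked in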
That's great, but can you now demonstrate the same with Qt and KDE, as you claimed above?
FWIW, I believe (but could very well be wrong) the reason that this works this well with MUSL is that every function is defined in its own translation unit. I doubt that Qt and KDE do that.
The library/project has to be designed right. Musl has dummy references that make static linking work without pulling in a bunch of junk. This is very much a language/project issue and I wish more languages actually looked at the huge mess of dep graphs that are needed for simple programs.
I think you will have better luck static linking against a different libc, like uClibc or musl libc. Statically linking against glibc doesn't work very well, and isn't supported by upstream.
uClibc and musl libc have much better support for static linking, and you will be able to make a much smaller binary with them than you would static linking against glibc.
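For instance, with musl's gcc wrapper installed (package names vary by distro), the hello.c from upthread can be built both ways and compared:
musl-gcc -static -O2 hello.c -o hello-musl
gcc -static -O2 hello.c -o hello-glibc
ls -l hello-musl hello-glibc    # the musl binary comes out far smaller; exact numbers depend on versions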
It's not that complicated. -ffunction-sections -fdata-sections arguments to gcc coupled with --gc-sections argument to the linker will get rid of most of the fat.
> The key is not to eliminate dynamic linkage, but rather to apply it with discretion, at carefully defined boundaries.
Question: In the right system, does discretion even matter? Eg, NixOS - where linked libraries and dependencies are defined by content addressing. Should we use discretion with linked libs in a system like that?
When I started developing in golang I used to prefer static binaries without cgo 100% of the time. Now that I have moved to NixOS, where each binary can either use the system libc or provide its own if a different one is needed, I feel much better about it. I don't think we need to use discretion in a system like that.
Several comments have already pointed out many ways that this take is incorrect, but I'll also mention that software like KDE is the antithesis of plan9 design and I bet I'll be 6 feet under before we see it ported.
> But let's not stop there, because the linkage to the kernel is also dynamic, which is a no-no. So every app would now need to be linked to a specific kernel, and include all the kernel code.
> Now imagine you upgraded the kernel. You'd have to rebuild EVERY SINGLE THING on the system to build with that new kernel, or a new version of KDE or QT or X or anything in between.
You could, though; that's pretty much what unikernel systems do.
You'll still likely have dynamic linkage of a sort to the hypervisor or (ultimately) the hardware.
As a converse, it's very "nice" to have the programs in /sbin and some in /bin statically linked.
And when I say nice, I mean absolutely essential. I've had glibc go south before... and then nothing works. It's a "hope you can boot with rescue media and pray the old glibc is in your apt/rpm cache" situation.
But having the rest of the system being dynamic linked makes a great deal of sense for the reasons you stated.