This might be the most mundane topic I've found myself naturally extremely excited about.
If you're a developer working full time in only one or two languages, you may never experience just how good/bad you have it.
When you do, it's really eye-opening.
Every time I transition to a new language professionally it can be like opening a bag of Bertie Bott's Every Flavor Beans when you look into the packaging story.
* Go's binary release story is great, but the GOPATH method for dependencies is annoying
* Elixir has lockfiles and built-in package docs, but the release story deviates too much
* JavaScript, now that everything has settled into npm, is a delight, but the lack of a stdlib, painful local aliasing and the extremely heavy node_modules folder can be off-putting
* Python just sucks (let's hope Poetry can bring the promised land of deterministic builds)
As an old school server side Java dev, I boggle at the hoops folks jump through to get stuff done when it comes to packaging.
We had two large dependencies: the JDK and the app server (our container, like Tomcat). Maven, which we’ve had forever, handled library dependencies. The resulting WAR files were effectively self-contained.
JDKs were trivial to install: explode one somewhere (anywhere) and set JAVA_HOME. App servers were the same: self-contained directory trees. Install as many as you like, just change the port they use. Or just add another WAR to the one you have already. Pros and cons.
WAR files were essentially statically linked, carrying all of their libraries: a self-contained blob you could drag and drop into a directory and watch the server shut down your old version and fire up the new one.
Sure, we had our bit of DLL Hell, though rarely when building the app. I’m certainly not going to suggest we never had class loader issues.
And enterprise Java has a notoriety all its own, but packaging was pretty low on the list. But we didn’t need Docker, or dedicated VMs or anything like that. Our OS dependencies were all handled by the JDK. Java didn’t have any shared library concerns. I honestly never questioned it, either it was statically linked, or just entirely self contained outside of the C library. Everything else was in Java. OpenSSL, for example, was never an issue.
I’m on board though, I loathe packaging. I’m not a big fan of arcane parameters passed on over the council fire and tea. Just never been my drive.
It definitely does have such concerns, but the community has just accepted that the JDK will never help with this and so every JAR that uses a native library hacks around the lack of proper support in its own unique and special way. Usually some project-specific ad-hoc code that extracts native libraries into some directory in the user's home directory.
Is the "cache" versioned properly? Maybe.
Can you control where it is? Maybe.
Code signing? Probably not.
Can you ship only native code for the machines you actually care about? Maybe.
Does it work when you run as a dedicated server UNIX user that doesn't have a home directory? Nope.
Hydraulic Conveyor (https://hydraulic.dev/ - disclosure, my company) fixes a lot of these problems automatically. But it's not like there are no problems to fix, and of course writing a library that uses native code is still a big pain. It's a major blind spot of the JDK unfortunately, and the community has never risen to the challenge of fixing it.
I'm currently working on Java bindings for my own native library. (I'd like to use the Panama FFM API, but unfortunately, my first user is stuck on JDK 11 for now, so I'm stuck with JNI.) Do you have any recommendations on how I should handle the packaging and library loading problem, so I don't make things worse for Conveyor (though my first user isn't using that)? Any reference implementation that I should look at and copy?
So it's really easy. Just run System.loadLibrary("foo") before doing anything else. On developer setups that line will throw because the JVM won't find the shared library, so then you can go down the road of extracting things to the homedir or whatever else you want to do (or do it manually and check in the results). Deployed installs will find the library in the right platform specific directories inside the app package and pass.
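For what it's worth, the pattern described above might look roughly like this. It is only a sketch: `NativeLoader` is a made-up helper name, and the `/native/<os>-<arch>/` resource layout is an assumption for illustration, not something Conveyor or any particular library prescribes. It sticks to APIs available on JDK 11 and extracts to a temp directory rather than the home directory, so it also has a chance of working for service users without one.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Locale;

/** Hypothetical helper: try the library path first, extract from the JAR only as a fallback. */
final class NativeLoader {
    /** Where the library was eventually loaded from. */
    enum Source { LIBRARY_PATH, EXTRACTED }

    private NativeLoader() {}

    static Source load(String name) {
        try {
            // Packaged installs: the library already sits in a directory on java.library.path.
            System.loadLibrary(name);
            return Source.LIBRARY_PATH;
        } catch (UnsatisfiedLinkError notOnLibraryPath) {
            // Typical developer setup: fall through to the extraction path below.
        }

        String resource = resourcePath(name);
        try (InputStream in = NativeLoader.class.getResourceAsStream(resource)) {
            if (in == null) {
                throw new UnsatisfiedLinkError("no bundled native library at " + resource);
            }
            Path dir = Files.createTempDirectory("native-" + name + "-");
            Path file = dir.resolve(System.mapLibraryName(name));
            Files.copy(in, file, StandardCopyOption.REPLACE_EXISTING);
            // Best-effort cleanup; deletion runs in reverse registration order, so the file goes first.
            dir.toFile().deleteOnExit();
            file.toFile().deleteOnExit();
            System.load(file.toAbsolutePath().toString());
            return Source.EXTRACTED;
        } catch (IOException e) {
            throw new UncheckedIOException("could not extract " + resource, e);
        }
    }

    /** Assumed (not standard) layout: /native/<os>-<arch>/<platform-specific file name>. */
    static String resourcePath(String name) {
        String os = System.getProperty("os.name").toLowerCase(Locale.ROOT);
        String family = os.contains("mac") ? "macos" : os.contains("win") ? "windows" : "linux";
        return "/native/" + family + "-" + System.getProperty("os.arch")
                + "/" + System.mapLibraryName(name);
    }
}
```

You would call `NativeLoader.load("foo")` once, e.g. from a static initializer in the class that declares the native methods, before anything else touches them.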
* Rust is awesome, and nothing can beat it. Cargo (the package manager that ships with Rust) is the best thing since sliced bread! Every other language can suck a bag of burritos.
I'm sure we can all agree that Cargo is the best thing since sliced arrays, but I've finally reached the point where we have enough different Rust projects at work that rebuilding the world from scratch all the time and having ~40 GiB `target/` directories scattered around the place is getting old.
I'm not meaning to criticize Rust or Cargo here; they're both doing their jobs just fine. But I do find myself craving a different compromise for much of what I/we use Rust for today. And I'm really hoping that Wasm Components (and WASI) will be that different compromise — e.g.:
- Don't rebuild an entire HTTP stack from scratch for every tool that happens to use HTTP.
- Therefore most projects won't have enormous `target/` dirs.
- Reuse components built in CI for Linux directly on a Mac dev machine (cross platform).
- Mix and match new processor architectures available in AWS without building two versions of everything.
- Also reuse some components in the web browser (yes, we actually have real, boring use cases for this).
- Wait, I'll be able to do all of this and still be writing Rust? Shut up and take my money.
I do wonder how much sharing could be done at a project level for libraries. Would it be possible to have at least debug builds with matching compilation settings be compiled down into a shared place?
Definitely. Much of this can be mitigated by using a shared target directory, which is at least somewhat supported in Cargo. I should probably start doing that and promote it at work if I don't run into any problems.
Some clever garbage collection would help, too, but I imagine different people would have different and very strong opinions about how that should work.
The fact that two projects that use the same library may enable different features also complicates things. Again, a lot of this can be mitigated.
I'm looking for a step change for application development, though — i.e. not 5 minute build reduced to 2 minutes, but 5 minutes reduced to 5 seconds (and a correspondingly tiny target directory). That's what excites me about WASI in this context at least.
Leaving WASI aside for a minute, I do wonder how much more could be saved in local disk space and compilation time across projects (and hosts, a la sccache) if this was a high priority goal for the Rust project. E.g. even if the MIR for a crate with two different sets of feature flags enabled ends up substantially different, would they still compress well against each other if a lot remained common?
Once upon a time I briefly looked at symlinking rubygem install dirs from global to project-specific directories (because a project-specific $GEM_HOME avoided functionally all the problems with just about any other approach, and is still what I use today). Functionally it worked, it just needed some tooling to make it easy.
It sounds like what you might want is a shared global dir that uses hashes of feature flags to separate crate installs in the same target directory, then some after-the-fact GC to hardlink matching files across different builds of the same crate. Then you can symlink from there into your local project target.
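To make the "hashes of feature flags" part concrete, here is a rough sketch of how such a shared directory name could be derived. This is not how Cargo actually lays out `target/`; `SharedBuildCache` and `dirName` are hypothetical names, and a real key would presumably also need to cover things like compiler version, target and profile. The code is in Java only because that's what the other snippet in this thread uses.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Collection;
import java.util.TreeSet;

/** Hypothetical: derive one shared directory name per (crate, version, feature set). */
final class SharedBuildCache {
    private SharedBuildCache() {}

    static String dirName(String crate, String version, Collection<String> features) {
        // Sort and de-duplicate so the order in which features were listed does not matter.
        String canonical = crate + "\n" + version + "\n"
                + String.join(",", new TreeSet<>(features));
        byte[] digest;
        try {
            digest = MessageDigest.getInstance("SHA-256")
                    .digest(canonical.getBytes(StandardCharsets.UTF_8));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 should be available on every JVM", e);
        }
        // The first 8 bytes (16 hex characters) are plenty for a directory suffix.
        StringBuilder hex = new StringBuilder();
        for (int i = 0; i < 8; i++) {
            hex.append(String.format("%02x", digest[i]));
        }
        return crate + "-" + version + "-" + hex;
    }
}
```

Two projects that enable the same features of the same crate version would land on the same directory, and anything else gets its own, which is what would make the after-the-fact hardlinking pass safe to run.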
FYI - Go no longer uses GOPATH for managing dependencies since the official module system was released in Go 1.11. There are still plenty of tutorials and the like out there that mention GOPATH, but it shouldn't be needed any more for basic scenarios.
All that old information is terribly confusing, but it's not entirely useless. GOPATH is still critical, as it's how you can run fully-offline builds of Go programs and override dependencies without changing the root project itself.
This isn't accurate: `go mod vendor` will dump the dependencies into the `vendor` directory these days, so you can build totally offline. There's also a mechanism for renaming dependencies (so you can replace them basically).
Not quite: vendoring and the 'replace' directive in go.mod files (which I presume is the mechanism for renaming dependencies you are referring to) both require modifying the root package.
With GOPATH, I can package a single library and use that package to fulfil the dependency on the library of any other Go application offline. Conversely, with the vendoring mechanism, the Go tooling will need to download (or copy from cache) the library when you originally create the vendor directory or when you add a new dependency to the root project.
The Python world won't improve as long as programmers add dependencies on libraries written in other languages and (here's the important part) attempt to compile those packages themselves within a Python build process. Poetry is a nice chapter in the Python package definition story, but it is only a tentative step toward fixing the wider problem.
I firmly believe that languages should not manage packages. While it makes the simple cases easy for beginners, the trade-off is that mixing languages becomes harder. There is no perfect language and often mixing should be the right answer and we don't want any more friction there.
I hope that fad dies. It assumes all the world is x86-64 Linux. Maybe you get a few who acknowledge the Raspberry Pi. However there is a whole world of other new processors, *BSD, and others that the fad makes difficult to use.
Nobody that currently exists - at least to my knowledge. This is a hard, mostly thankless problem. Distros solved a similar problem, but they have different motivations and so didn't fully solve everything. Languages like Rust have a partial solution, but they don't play well with other languages and the full complexity that can result.
I quite agree! And would it even be harder for beginners? When trying a new program written in a language I don't use frequently, I usually spend a good half-hour working out what command to run to install the dependencies and going through the logs working out what implicit requirement wasn't in the README. A holistic package, even one written for a different software distribution than the one I use (Debian at present), would immediately get me 99% of the way there.
PS. If you find yourself in the same situation I do, Repology is your friend: https://repology.org/
I do believe this is an issue of not having explicit dependencies. Julia takes the approach of, we build and ship everything for every OS, which means Pkg (the package manager) knows about binary dependencies as well. Making things more reproducible in language
Linux distros often do things to force packages to declare all their dependencies: Nix and Guix use unique prefixes and sandboxed builds, openSUSE builds all their packages in blank slate VMs that only install what is declared, Fedora has standard tools for running builds in minimal chroot environments, etc.
I'm not aware of any language ecosystem package managers taking similar measures to ensure that dependency declarations in their packages are complete.
The problem with system packaging is there are so many systems. For example if you only package something for Debian, how should a Fedora, Arch, Gentoo or even Mac and Windows user use that?
The problem with systems packages is they solve a slightly different problem from language packages, and so while they are closer to what is needed, they are not right either.
Debian has by far the most rigorous packaging standards of that list, so subsequent packaging for the other distributions should not be very difficult. Bundling the dependencies into an OCI container or AppImage to run them on non-Debian systems is also trivial once you have a Debian package, but of course that comes with disadvantages. Neither Mac nor Windows has a proper first-party package manager (although winget makes some inroads, thanks of course to Keivan Beigi), so comparison becomes rather a moot point for those platforms.
Agreed. It's unbelievable to see all these languages inventing packaging over and over again. It's just an archive with some metadata and a hash/signature and a transport mechanism.
One package management feature that not even the really good languages seem to have is synchronizing dependency versions across multiple packages, usually in the context of a monorepo.
Like if my codebase has webapp A, library A and library B, then rather than separately defining that they all use third party library foo v3.1.4, it would be really nice to have a single source of truth.
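Short of real package manager support, the nearest thing I can picture is a check in CI that complains when packages disagree. A minimal sketch, assuming each package's manifest has already been parsed into a map of dependency name to version (the parsing itself, and the names `VersionDrift` and `find`, are made up for illustration):

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

/** Hypothetical monorepo check: which dependencies are declared with more than one version? */
final class VersionDrift {
    private VersionDrift() {}

    /**
     * @param manifests package name -> (dependency name -> declared version)
     * @return dependency name -> every declared version, only for dependencies with two or more
     */
    static Map<String, Set<String>> find(Map<String, Map<String, String>> manifests) {
        Map<String, Set<String>> versionsByDependency = new TreeMap<>();
        for (Map<String, String> dependencies : manifests.values()) {
            for (Map.Entry<String, String> dep : dependencies.entrySet()) {
                versionsByDependency
                        .computeIfAbsent(dep.getKey(), k -> new TreeSet<>())
                        .add(dep.getValue());
            }
        }
        // Dependencies that every package agrees on are not interesting.
        versionsByDependency.values().removeIf(versions -> versions.size() < 2);
        return versionsByDependency;
    }
}
```

If webapp A and library A declare foo 3.1.4 while library B declares foo 3.0.0, the result has a single entry for foo listing both versions. That only detects drift after the fact, though, which is exactly why a proper single source of truth would be nicer.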
It's one part of the problem. Reliably managing those environments at scale can be tricky. Plus dealing with hybrid environments that include pip and conda packages.