It's really exciting to see the Nix ecosystem start to gain traction here and there. I expect and look forward to greater adoption over the next half decade or so.
Let's say you're building Python. You write a rule to build Python, which is something like "Get a tarball from python.org/... with hash abcd1234; depend on glibc and zlib and OpenSSL; ./configure && make install." The libraries each have their own rules like that, e.g. "Get a tarball from gnu.org/glibc/... with hash dbca4321; ./configure && make install."
You can hash these build descriptions themselves, and that hash is available to you before you start building. So the hash of that glibc description is 98765432 - which includes the hash of the tarball. So you install glibc into /nix/store/98765432-glibc-2.30. (The name and version there are just for human convenience.) Then, when you install Python, you point it at the glibc in that directory and build it into e.g. /nix/store/aaaabbbb-python-3.9.
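A build description like the one above can be sketched as a Nix derivation. This is a hypothetical, heavily simplified version of what Nixpkgs actually contains; the hashes and exact store paths are placeholders:

```nix
# Hypothetical, simplified Python derivation; real Nixpkgs definitions
# are more involved. The sha256 below is a placeholder.
{ stdenv, fetchurl, glibc, zlib, openssl }:

stdenv.mkDerivation {
  pname = "python";
  version = "3.9.0";

  # The source tarball is pinned by hash; changing the tarball
  # changes the hash of this whole description, and therefore
  # the output store path.
  src = fetchurl {
    url = "https://www.python.org/ftp/python/3.9.0/Python-3.9.0.tar.xz";
    sha256 = "0000000000000000000000000000000000000000000000000000";
  };

  # Each dependency is itself a store path like
  # /nix/store/98765432...-glibc-2.30; swapping in a different
  # glibc yields a different Python store path.
  buildInputs = [ glibc zlib openssl ];

  # The default builder effectively runs ./configure && make install,
  # installing into $out, e.g. /nix/store/aaaabbbb...-python-3.9.0.
}
```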
If anything changes - tarball hashes, build steps, dependencies - the Nix hashes of everything affected also change. So the two versions, with and without the change, can be installed side by side.
There are separate mechanisms in Nix to do things like get the hash of the "current" version of Python and put it on your PATH.
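One such mechanism is a dev shell; a minimal sketch, assuming a `shell.nix` in your project directory:

```nix
# shell.nix — running `nix-shell` here resolves python39 to whatever
# store path your pinned nixpkgs produces and puts its bin/ on your
# PATH for the duration of the shell.
{ pkgs ? import <nixpkgs> {} }:

pkgs.mkShell {
  buildInputs = [ pkgs.python39 ];
}
```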
(The downside is that Nix does not make use of the ordinary dynamic-library behavior where you can upgrade OpenSSL without recompiling Python. It does use dynamic linking, but it always points at exact paths and the thing being pointed to never changes versions. But there are good reasons to prefer this anyway, and the only downside is compilation time, and computers have gotten a lot faster since Linux distros and dynamic linking itself were originally designed.)
I'm not sure exactly where your focus is, so I'll just give you a mix of thoughts and see if I accidentally help :)
- Building takes space, but the build takes place in a temporary directory that is discarded afterwards, and only the output artifacts are kept.
- ~Normal software is usually served pre-built from a cache, which short-circuits the above as long as the package is available from the cache.
- One of the nicer things about the process Nix uses here is that it's also tracking what's ~in-use (via "GC roots"), and you can trivially garbage-collect everything in /nix/store that isn't.
- On macOS I use Nix for just about everything that isn't a desktop app. My /nix mount had 9.7G used when you asked this question, and 3.6G after I garbage-collected, and 7.1G after I re-built the dev environment for a server project.
- The ~3.4G I mentioned above is a really big fraction of my total, but as a reference point, Nix is natively replacing, in that ~3.4G, the Vagrant/VirtualBox VM I used to need for this development.
- In my experience, the only time I'd describe the storage load as "a lot" is when I've been iterating on something that is fairly large. This can add up as the store accumulates slightly-different copies, but these can be readily GCed.
- Multiple versions of something (say, multiple different pythons or rubies or llvms) will consume more space, but the ~Nix way also makes it possible to avoid vendoring. At system scale I would _guess_ these are a wash?
- At least when you're using Nixpkgs, "multiple versions" generally just means a few major/minor version variants of common dependencies (i.e., at a single commit, most things in Nixpkgs that use Python 3.7 reference the same build).
Each package does use about the same amount of space as it would with other packaging methods, but when you upgrade things, Nix won't garbage-collect the old versions automatically. You can run collection manually or on a schedule, and there's also an option to have Nix run it on every invocation (kind of like git does).
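On NixOS, for example, scheduled collection can be configured declaratively; a sketch using the NixOS module options:

```nix
# configuration.nix fragment: run the garbage collector weekly and
# delete store paths that have been dead for more than 30 days.
# (On non-NixOS systems you'd run `nix-collect-garbage
# --delete-older-than 30d` from cron or launchd instead.)
{
  nix.gc = {
    automatic = true;
    dates = "weekly";
    options = "--delete-older-than 30d";
  };
}
```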
Anyway, I also wanted to comment on the GP's explanation. What's written there is how Nix currently addresses each package (the hash is computed from the source code, dependencies, system, compile options, etc.).
In many cases, different options or a change in the source files can still produce the exact same output, just under a new hash. Nix has an optimise-store command that scans the store and, when two files have the exact same contents, replaces them with hard links. That can likewise be run on a schedule. This ensures that duplicates don't occupy extra space.
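Concretely, the manual command is `nix-store --optimise`; on NixOS there are also options to do it automatically (a sketch — I believe these option names are current, but check your NixOS version's manual):

```nix
# configuration.nix fragment: deduplicate identical store files
# via hard links.
{
  # Optimise on the fly, as new paths are added to the store:
  nix.settings.auto-optimise-store = true;

  # Or run a periodic optimisation pass instead:
  nix.optimise.automatic = true;
}
```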
Though in the context of Nix, what's referred to as "content-addressed" is the new way Nix is planning to address packages. Instead of computing the hash from the package's build inputs and dependencies, it computes the hash from the package's contents. That solves many problems with the old way, but of course the question is how it's done, because it's kind of impossible to get the hash of something you haven't produced yet. Like everything of this kind, though, it's possible if you cheat a bit: the package is first compiled with dummy paths, then hashed, then the references to those paths are substituted.
There's more about this at [1].
Having said that, when the blog article says "content addressable", it looks like they still mean the current way of doing it (the proper term appears to be "input-addressed"), not the new way. But if you search for "nix content addressable" you're likely to find articles about the new way, and things might get confusing.
Each package in /nix/store is in a directory named $hash-$pkg-$vers (technically $vers isn't even required, since the $hash already distinguishes them). Therefore each package can have its own independent dependency tree. For a more detailed explanation see https://shopify.engineering/what-is-nix#:~:text=Language-,th....