So, basically, replace yum, apt, etc. with a 'stateless package management syste...

cwp · on March 11, 2014

No.

Well, yes, replace yum, apt, etc. But once you have a functional package management system, you don't need Puppet, Chef or Ansible, because the same stateless configuration language can be used to describe cluster configurations as well as packages. So build a provisioning tool based on that, instead.

That provisioning tool is called NixOps. The article links to it, but doesn't really go into detail about NixOps as a replacement for Puppet et al.

lmm · on March 11, 2014

That's just as true without Nix. I've worked somewhere that applied changes to its clusters by building debs; all you need is something that regularly executes apt-get and you're golden.

(Of course, whether something's a good language for expressing particular kinds of tasks is another question)

jes5199 · on March 11, 2014

am I going to switch distros just to use a different provisioning tool?

sparkie · on March 11, 2014

Yes... Eventually.

The Nix model is the only sane model going into the future where complexity will continue to increase. Nobody is forcing you to switch distros now, but hinting that you'll probably want to look into this, because the current models might well be on their way to obsolescence.

westurner · on March 12, 2014

* Unique PREFIX; cool. Where do I get signed labels and checksums?

* I fail to see how baking configuration into packages can a) reduce complexity; b) obviate the need for configuration management.

* How do you diff filesystem images when there are unique PREFIXes in the paths?

A salt module would be fun to play around with.

How likely am I to volunteer to troubleshoot your unique statically-linked packages?

berkay · on March 11, 2014

Author's point is that we are focusing on the wrong place (puppet, chef, etc.), hence potentially making the problem worse by attempting to deal with the symptoms instead of addressing the root cause. Puppet/Chef etc. may indeed still be relevant and potentially even more widely used as it would become much simpler to develop recipes, etc.

geerlingguy · on March 11, 2014

I think the takeaway is that package management is ripe for improvement, but CM tools (and more flexible tools like Ansible) do so many more things besides automating package management that I question the author's mention of them for anything besides a hook to get more pageviews.

This article has little to do with config management. A better title would be "apt, yum and brew aren't good enough; we can do better".

girvo · on March 11, 2014

No, NixOS and NixOps tackle both, and you can't have the latter without the former. They are all intertwined, hence talking about both in the article :)

bryanlarsen · on March 11, 2014

True, but:

`sudo apt-get install nginx` just works. Perhaps they're doing it "wrong", but there are thousands of people who are making sure it just works.

I have some of the problems described in the article, but it only happens when I can't use the package manager. It happens when I have to compile from scratch, or move things around or mess with config files, et cetera.

For me, phoenix servers and docker are the solutions. Maybe they're not as pretty as what he describes, but there is a solution that works.

mcguire · on March 11, 2014

"`sudo apt-get install nginx` just works. Perhaps they're doing it 'wrong', but there are thousands of people who are making sure it just works."

True, but that is only half of the problem. Once you have nginx installed you have to configure it, which requires another layer of automation. Package installation is the trivial part of the stuff done by Ansible, Puppet, etc.

You could package up pre-configured .deb's, but you lose the "thousands of people" aspect and are back to "compile it from scratch". It's significantly less entertaining, particularly if you're trying to do it repeatably.

I think the real point of Nix is that you use the same mechanisms for both package management and local configuration.

sparkie · on March 11, 2014

> but there are thousands of people who are making sure it just works.

Problem identified!

Nix doesn't just solve user-facing problems, it's first and foremost a solution for developers to package applications that should just work. You won't need thousands of people making sure a package works - but one. If it builds for him, it should be for everyone, by virtue of the fact he specifies exact dependencies and build instructions, and performs the build in a chrooted environment which should be reproducible exactly on a users machine.

josegonzalez · on March 11, 2014

`sudo apt-get install nginx` does not just work if:

- The repo is down - External network doesn't work - You are missing a dependency not in your apt-cache - It conflicts with another package due to a dependency

All of which are possible and happen.

0xbadcafebee · on March 11, 2014

Oh yeah, Debian's package management is certainly not the example i'd use of a bulletproof system. On the other hand:

  upgradepkg --install-new /dir/*.tgz

Works pretty much infallibly. Nobody wants to admit it, but the most reliable package management system is 'tar' (or cpio, really). Everything else just introduces new points of failure. If you just want to get something installed, there is nothing that works more effectively than completely ignoring dependencies and metadata.

turrini · on March 11, 2014

That's simple not true.

apt-get ensure that application will get the right version of a library for its execution.

'Untaring' doesn't mean the application will work OK with their dependencies (since there's no verification).

So anyone could do a dpkg -i --force-all *.deb. Its the same thing.

0xbadcafebee · on March 11, 2014

Actually I believe it's more like

  dpkg -i -E -G -B --force-overwrite --force-breaks --force-conflicts --force-depends --force-depends-version *.deb

with one or two extra things, like installpkg/upgradepkg will prompt you for what you want to do with config files, and adds one or two extra heuristics. But, yes, dpkg could totally be used in a similar way as upgradepkg. It's just more complicated so it doesn't work as well ;)

icebraining · on March 11, 2014

You're not solving the problem, you're just pushing it out to the steps that get the right tgz files to /dir and claiming it's magically solved. If that step fails, how is upgradepkg going to work?

0xbadcafebee · on March 11, 2014

All apt-get does is download the right .deb files from /dir and attempt to install them. They can be just as messed up or incorrect as a tgz file, in which case if the .deb files are messed up, installation fails. So both cases are identical in respect to the steps required before install. The difference is, upgradepkg will virtually never fail on install, while apt-get has about a hundred things that can fail on install.

Apt-get gives you some insurance in that it (mostly) won't screw up your system if you try to do something wrong. But it also adds levels of complexity that can make it very difficult to get anything done, even if you know what you're doing. Both systems will work, but only one is more likely to do what you want it to do without extra effort. And just besides maintenance woes, it's much more difficult to recover a broken apt system than it is to recover an installpkg system.

If you ask an admin "What's more likely to succeed: rsync and upgradepkg, or apt-get", the answer is the former, because the level of complexity of the operations is so much smaller. As long as your packages are correct, everything else is determined to succeed. With apt-get, you have many more conditions to pass before you get success.

__david__ · on March 11, 2014

> The difference is, upgradepkg will virtually never fail on install, while apt-get has about a hundred things that can fail on install.

That's because dpkg does more. It's designed to ask you questions interactively when configs change and get you to look at things. If you ignore config files and just blindly install new binaries (even yum/rpm do this!) then you end up with an upgrade that "worked" except that the new binaries won't run at all because the config files are now invalid.

Failing silently like that is hardly better. I would say it's objectively worse.

icebraining · on March 11, 2014

What are the hundreds of things that apt-get (or actually, dpkg, which is the software that actually installs debs) does that can fail? Dpkg isn't that complicated.

waffle_ss · on March 11, 2014

Potentially worse than all of those, it might install just fine, but actually install a newer version of nginx than the one that you tested locally or on your test environments.

This is one of the things that Docker solves, as you are able to test and deploy using exact filesystem snapshots.

enjo · on March 11, 2014

But you can install specific versions:

apt-get install nginx=x.x.x

I'm not aware of a case where that would install anything but what you ask for.

bmurphy1976 · on March 11, 2014

Which works great until you want to run two dependent services, one that requires no greater than v1.5 and the other which requires no less than v3.0.

borntyping · on March 11, 2014

And most of these would not be solved by the package manager described in the article.

rmc · on March 11, 2014

There are lots of people ensuring that doesn't happen.

Sure it's possible, but it's unlikely. There are loads of debian/ubuntu apt mirrors. apt-get (or aptitude) downloads dependencies, package maintainers ensure that there aren't those conflicts.

haberman · on March 11, 2014

This reminds me of a particularly devious C preprocessor trick:

    #define if(x) if ((x) && (rand() < RAND_MAX * 0.99))

Now your conditionals work correctly 99% of the time. Sure it's possible for them to fail, but unlikely.

Now you might object that C if() statements are far more commonly executed than "apt-get install". This is true, but to account for this you can adjust "0.99" above accordingly. The point is that there is a huge difference between something that is strongly reliable and something that is not.

Things that are unreliable, even if failure is unlikely, lead to an endless demand for SysAdmin-like babysitting. A ticket comes in because something is broken, the SysAdmin investigates and found that 1 out of 100 things that can fail but usually doesn't has in fact failed. They re-run some command, the process is unstuck. They close the ticket with "cron job was stuck, kicked it and it's succeeding again." Then go back to their lives and wait for the next "unlikely but possible" failure.

Some of these failures can't be avoided. Hardware will always fail eventually. But we should never accept sporadic failure in software if we can reasonably build something more reliable. Self-healing systems and transient-failure-tolerant abstractions are a much better way to design software.

lmm · on March 11, 2014

That difference goes away at the point where other risk factors are higher. How high is my confidence that there isn't a programming bug in Nix? Above 99%, perhaps, but right now it's less than my confidence that apt-get is going to work.

Most of us happily use git, where if you ever get a collision on a 128-bit hash it will irretrievably corrupt your repository. It's just not worth fixing when other failures are so much more likely.

haberman · on March 11, 2014

The point of my post wasn't "use Nix", it was "prefer declarative, self-healing systems."

Clearly if Nix is immature, that is a risk in and of itself. But all else being equal, a declarative, self-healing system is far better than an imperative, ad hoc one.

Other risk factors don't make the difference "go away", because failure risks are compounding. Even if you have a component with a 1% chance of failure, adding 10 other components with a 0.1% chance of failure will still double your overall rate of failure to 2%.

This is not to mention that many failures are compounding; one failure triggers other failures. The more parts of the system that can get into an inconsistent state, the more messed up the overall picture becomes once failures start cascading. Of course at that point most people will just wipe the system and start over, if they can.

File hashes are notable for being one of the places where we rely on probabilistic guarantees even though we consider the system highly reliable. I think there are two parts to why this is a reasonable assumption:

1. The chance of collision is so incredibly low, both theoretically and empirically. Git uses SHA1, which is a 160 bit hash actually (not 128), and the odds of this colliding are many many orders of magnitude less likely than other failures. It's not just that it's less likely, it's almost incomparably less likely.

2. The chance of failure isn't increased by other, unrelated failures. Unlike apt-get, which becomes more likely to fail if the network is unavailable or if there has been a disk corruption, no other event makes two unrelated files more likely to have colliding SHA1s.

Spooky23 · on March 11, 2014

There are solutions for that too: Ubuntu/Debian: apt-mirror RHEL: Create a local RHEL repo. http://kb.kristianreese.com/index.php?View=entry&EntryID=77

njharman · on March 11, 2014

That specific case works.

But, it's "yum install nginx" some places. And some package names vary by distribution and even versions of same distro. And there is post install configuration that has to happen. And there is coordination with the proxy cluster, the memcache cluster, the db cluster, monitoring system that has to happen. And many sophisticated users will need a custom nginx or extension which requires building locally (and all those dependencies) or private repo. And you're glossing over configuring sudoers so the user you are logged in as can even sudo in the first place.

tldr; Provisioning is fucking hard.

Argorak · on March 11, 2014

An the blog post argues for adding another command and a new set of names ;). As much as I love the idea of Nix, it would only make that problem go away if it managed to take over the whole world - which it won't.

joevandyk · on March 11, 2014

"It happens when I have to mess with config files."

It's very rare I don't have to change configuration files for any software that doesn't come with the default OS install. It's also pretty rare I can use the default packages that come with Ubuntu, I very often need different versions or ones that are custom compiled.

I'm looking forward to Docker, but can't use it quite yet.

NyxWulf · on March 11, 2014

Yeah that xkcd came to my mind as well. I'm amazed at the timeless truths he captures in those comics. Creating another version which may be better isn't the hard part. Gaining consensus and getting people to give up the other ones is the hard part.

talkingquickly · on March 11, 2014

And in a way which so many people relate to. As soon as I saw "xkcd" I knew which one was going to be linked to...

saraid216 · on March 12, 2014

I still want to do a thing where response codes are returned the way they are for HTTP.

Query: Why are you screwing around? Response: XKCD 303.

_vya7 · on March 11, 2014

As existing solutions get more painful, people quickly adopt less painful (i.e. generally better) solutions. So it's tied just as much to how much the old ways suck as to how good the new ways are.