This kinda makes the mistake that nearly everyone makes with regard to containers: They view the container as a black box into which complexity disappears.
A container doesn't consume complexity and emit order. The complexity is still in there; you still have to build your containers in a way that is replicable, reliable, and automatable. I'm not necessarily saying configuration management is the only way to address that complexity, but it does have to be addressed in a container-based environment.
Now, I understand that in many cases some of the complexity is now being outsourced to others who maintain the containers (in the same way we've been outsourcing package management to operating system vendors for a couple decades), and so maybe some of the complexity has been pushed outside of your organization, but you have something being deployed that is internal to your org, so you have to deal with that complexity. Container management tools just don't do things at that level.
There's always a point where what you're doing is unique to your environment, and you have to be able to manage/reproduce/upgrade it somehow.
> Now, I understand that in many cases some of the complexity is now being outsourced to others who maintain the containers
We have a winner.
Containers in the end look to management and developers like a means of outsourcing those pesky sysadmins that always get in the way of dreaming big.
It's not without reason that devops and containerization seem to go hand in hand, as both are pretty much about neutering or sidelining the sysadmin role.
The sysadmin role isn't being neutered or sidelined, it is being moved from something every company has to do to something that primarily will be performed by large providers at massive scale. System infrastructure and the management of it is becoming commoditized, and containers are one part of that story, just as the increasing level of abstraction of cloud platforms is another.
The more a complex system increases in complexity (parts) the more likely some part will fail and the less likely that failure will create a catastrophic failure. (Mark's Law)
Any sufficiently advanced technology is indistinguishable from magic. (Arthur C. Clarke)
With the advent of AI we will soon reduce IT into magic.
I guess that's true if you're treating containers and the environment they run in as "another layer" of stuff that you use configuration management to deploy on top of your existing automation you maybe already had prior to adopting them. I do suppose that is probably true of many users still today. There is also the use case of updating your base image or AMI and then replacing slaves/followers/your servers that run the containers, more or less eliminating the need for a job that is constantly running, or running on a schedule to keep the host system up to date. It's true that in the case of container hosting systems, that is mostly being abstracted away from you, and this post was written with that mindset as we do just that. Good comment and point of view though, I think it's still something a lot of folks struggle to wrap their heads around!
This article reads as very developer-centric, is my first thought. When comparing Docker and Configuration Management, he uses deploying code and scripts as his comparison, which is far from extensive when thinking through a list of all changes on a system.
As a former operations engineer himself, the author didn't touch on a suite of other day-to-day complexities to consider: how to handle emergency change management and track that (most configuration management tools can revert to baseline if they detect a change they don't expect), keeping systems patched, etc.
While not a bad article, I feel like we've talked about just deployment, and quite frankly there are a zillion ways to solve the automation of it.
That's fair. This was definitely written with deployments in mind so I can't disagree with your assessment. With that said, I think the real downside of config management, and the real upside of containers really comes down to deployments. There is definitely still a use case for system level changes and patching that are best served by config management. How long until the container based config management? :serious_question:
> There is definitely still a use case for system level changes and patching that are best served by config management.
This is only true for bare-metal deployments where virtualization is not an option, which isn't too common these days. When you have virtualization, deployments can and should be viewed as immutable, which gets rid of the need for any of the config management tooling.
If you want to make a system-level change, you update the script that builds the image, CI builds a new copy of the image and you deploy a new, immutable version of your infrastructure. Not having immutable infrastructure components is just opting into a world of pain that's completely unnecessary these days.
Why were people horrified? I liked the points you made in your slides, specifically patching openssl in containers is a great use case. When will this connector be beta? By the way this connector along with something like "clair" would be a good foundation for vulnerability management system for containers in production:
Agreed. There's lots of developer-centric posts lately that "config management" is dead, "no ops", "serverless" etc and they are sort of shortsighted. Sure it might work in a small greenfield environment but at any kind of scale you will still have a need for config management, servers and someone that pays attention to the ops side of things.
The role of config management has changed slightly in the era of micro services and containers but it is still essential.
I see some parallels between J2EE and containers. In both cases it is a platform that runs a "bundled application" with the hope of interoperability and ease of use. Download and deploy the .war (and edit five .xml config files) and you're done. Pull the image and run CMD in it, and you're done.
No matter how good the "easy app runner" platform is, one still has to specify the network between apps, logical dependencies, and --volume dependencies. And also "runtime config" to the individual apps. I really like the docker compose approach, but does a docker compose recipe work on every docker hosting provider? Also, the 12factor app approach is good, but is just a bunch of ENV vars sufficient for every config one might want to do for an app?
When facing this duality a few years ago, I had to look at the facts that encourage Config Management in favour of a Dockerfile approach:
* Dockerfiles do not work for configuring host systems, only Cfg Management is applicable
* Configuration management systems usually have a very declarative approach, easier to extend and maintain than Dockerfiles and bash scripts
* Dockerfiles contain too many arbitrary choices that do not work for everyone, starting with the choice of the distribution and OS version: companies like standardizing around one distribution and would have to fork Dockerfiles based on other systems, every single time.
The best solution I see at the moment is to set up containers using configuration management.
I'd add that Dockerfiles only allow extension by inheritance, leading to a lot of cases where people extend for construction.
Put another way: you can't safely compose Dockerfiles. If you want clean, narrow, special-purpose containers, you have two alternatives:
1. Build Dockerfiles for each use case precisely. This will lead to duplicated code and all the excitement thereof.
2. Build a giant shaky tower of Dockerfiles with FROMs, of which most are essentially 'abstract' Dockerfiles, not intended to be actually used. This will lead to documentation and all the excitement thereof.
Stackable filesystems are a triumph in some regards, but they create ugly backpressure on the design of libraries of Dockerfiles.
In this respect I see configuration management tools as vastly superior, as they generally have a concept of assembling goals and converging to them, rather than a concrete stacking logic.
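To make "assembling goals and converging to them" a bit more concrete, here is a rough Python sketch. It is not how any particular tool works; the `Goals` model, the resource names, and both function names are made up for illustration. The idea is only that goal sets from independent roles can be merged in any order, and that applying them changes nothing when the host already matches.

```python
from typing import Dict, List

# Hypothetical model: resource name -> desired state.
Goals = Dict[str, str]


def assemble(*roles: Goals) -> Goals:
    """Merge goal sets from several roles; refuse contradictory goals."""
    merged: Goals = {}
    for role in roles:
        for name, state in role.items():
            if merged.get(name, state) != state:
                raise ValueError(f"conflicting goals for {name!r}")
            merged[name] = state
    return merged


def converge(desired: Goals, actual: Goals) -> List[str]:
    """Change only what differs from the goals; return what was changed."""
    changed = []
    for name, state in sorted(desired.items()):
        if actual.get(name) != state:
            actual[name] = state  # stand-in for the real install/restart step
            changed.append(name)
    return changed
```

Running `converge` a second time against the same host returns an empty list, which is the property a chain of FROMs does not give you.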
Option 2 does not work for many scenarios. You end up having to build all the FROM combinations, which quickly becomes impractical.
3. use any templating system you like
For example, I make mixins out of little bash scripts (mostly heredocs) that output Dockerfile lines. Mixins are shared between projects via git submodules. Voilà.
I don't like configuration management mainly because the syntax is plain stupid, making it difficult to read. Check out Ansible and you'll know what I mean... although I do have some hope in Terraform once they roll out their feature of "write out the config files out of whatever I already have in that AWS account" (not having to write the config scripts from scratch gives me a tailored example to learn from).
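For anyone who wants to picture the mixin idea, here is a minimal sketch of the same approach in Python instead of bash heredocs. The function names (`base`, `apt_packages`, `app`, `render`) are hypothetical, and the FROM-first check is a simplification of this sketch, not a Dockerfile rule in general.

```python
import json
from typing import List


def base(image: str) -> List[str]:
    """Mixin: the base image line."""
    return [f"FROM {image}"]


def apt_packages(*names: str) -> List[str]:
    """Mixin: install Debian packages in a single layer."""
    pkgs = " ".join(sorted(names))
    return [
        "RUN apt-get update && "
        f"apt-get install -y --no-install-recommends {pkgs} && "
        "rm -rf /var/lib/apt/lists/*"
    ]


def app(src: str, dest: str, cmd: List[str]) -> List[str]:
    """Mixin: copy the application in and set its command (exec form)."""
    return [f"COPY {src} {dest}", f"WORKDIR {dest}", f"CMD {json.dumps(cmd)}"]


def render(*fragments: List[str]) -> str:
    """Concatenate mixin output into one Dockerfile."""
    lines = [line for fragment in fragments for line in fragment]
    # Simplification for this sketch: insist on FROM as the first line.
    if not lines or not lines[0].startswith("FROM "):
        raise ValueError("this sketch expects the first mixin to emit FROM")
    return "\n".join(lines) + "\n"
```

Something like `render(base("debian:12"), apt_packages("curl"), app(".", "/app", ["python3", "main.py"]))` then produces a flat Dockerfile, so each project picks the fragments it needs instead of inheriting a tower of images.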
Whenever some bash script gets big (say, 100LOC), it gets rewritten in ruby.
I think of Dockerfiles as a very static piece of code, it urges you to use templates on it.
Just like for a nginx.conf
And 100LOC, well, I've gotten very much used to bash. It is so comfortable when the terminal is your REPL. To tell you the truth my main deploy script is actually 500LOC of bash. It reads top to bottom with a lot of conventions (I was surprised to find myself following Google's bash conventions), and short circuiting abuse.
> their feature of "write out the config files out of whatever I already have in that AWS account"
Whenever I tell people about my own config mgmt solution (http://holocm.org), this is immediately the most-requested feature. They say "I would like to start using configuration management for my system, but I don't want to reinstall my system." So there needs to be some way to serialize all the changes one has made to the system since its installation.
I understand the point about Ansible being hard to read, I find SaltStack much more decent regarding syntax, or PyInfra regarding new kids in the game.
Regarding building Dockerfiles using bash scripts or Ruby, I find the concept interesting but using 'scripts' feels wrong. Building a 'standard' library to generate Dockerfiles on the other hand might be an interesting project that would allow collaboration.
The author seems to think configuration management is about installing packages. That's something the package subsystem does, with a reasonably mature handling of dependencies too. That makes it look like a glorified shell script. (Which may be a case of when all you've got is a hammer everything looks like a nail.)
But configuration management concerns centralized management (hence "management") of decentralized systems. It can answer questions such as "why is this scheduled job running on nodes of type x but not y and who put it there?", and provide guarantees such as "application w and z are always in lock step concerning parameter p".
I've seen organizations move to containers, and they all inevitably end up with more and more containers and increasing complexity. Centralized configuration management is more important in that environment, not less. Modern tools such as Ansible and Puppet have grown up in a devops world and have good support for managing containers (even if they are a bit of a moving target) and there is no reason to be scared of them.
> All of the logic that used to live in your cookbooks/playbooks/manifests/etc now lives in a Dockerfile that resides directly in the repository for the application it is designed to build. This means that things are much more organized and the person managing the git repo can also modify and test the automation for the app.
It sounds like the author didn't have experience with larger systems - or maybe did, but my experience contradicts this.
Let's say you have everything you can in the containers. Now you want to deploy test and production environments. How do containers know which environment they're running in? Or specifically, things like what's the database user/password, what queue to connect to, where to find local endpoint for service X?
That still needs to live outside of containers. And at some point etcd and similar solutions have a problem of "what if I don't want to share all the data with all the infrastructure all the time"? Well... you can solve that with a config management service. Edit: just noticed etcd actually gained ACL last year - good for them. But how do you configure/deploy etcd in the first place?
> It sounds like the author didn't have experience with larger systems
In actual _large_ systems like Facebook or Google you explicitly keep runtime application configuration info out of the config management system and use another system entirely to manage that. This lets you easily manage test/deploy, shard services easily, and do rolling deployments with config info updated instantly across the fleet if necessary.
Have a look at ConfigMaps and Secrets in Kubernetes. We use these to run the same container images in staging and production, without additional config mgmt.
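On the application side this usually boils down to reading settings from the environment, since Kubernetes can expose ConfigMap and Secret values to a container as environment variables. A minimal sketch, where the variable names (`DB_USER`, `DB_PASSWORD`, `QUEUE_URL`) and the `Settings` class are assumptions for illustration only:

```python
import os
from dataclasses import dataclass
from typing import Mapping

# Hypothetical variable names; use whatever your deployment actually injects.
REQUIRED = ("DB_USER", "DB_PASSWORD", "QUEUE_URL")


@dataclass(frozen=True)
class Settings:
    db_user: str
    db_password: str
    queue_url: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build settings from the environment and fail fast if any are missing."""
    missing = [key for key in REQUIRED if key not in env]
    if missing:
        raise RuntimeError("missing required settings: " + ", ".join(missing))
    return Settings(env["DB_USER"], env["DB_PASSWORD"], env["QUEUE_URL"])
```

The image stays identical between staging and production; only the injected values differ, and a missing value stops the container at startup rather than at first use.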
> How do containers know which environment they're running in? ... But how do you configure/deploy etcd in the first place?
This is where you wind up installing a platform like Cloud Foundry or OpenShift, or building your own out of lower-level components like Kubernetes or Mesos.
Disclaimer: I work at Pivotal, we provide the majority of engineering effort on Cloud Foundry.
> All of the dependencies of the application are bundled with the container which means no need to build on the fly on every server during deployment. This results in much faster deployments and rollbacks. ... Not having to pull from git on every deployment eliminates the risk of a github outage keeping you from being able to deploy.
Obviously there are other benefits, but it's funny how much of the motivation for containers comes from "now we don't have to do git clone && npm install!", which was always the case.
Eliminating running applications which talk over the network from deployment is a huge gain. Making things happen once rather than per-server is also a huge gain.
On one server, cloning and installing things from the internet is all fine. Installing/upgrading hundreds at a time means you get to see how unreliable package management can be. You also have to guarantee enough bandwidth to the repo/package server to handle everything, or implement orchestration so servers don't install the same thing at the same time. At scale this matters.
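The "orchestration so servers don't install the same thing at the same time" part can be as simple as capping how many hosts run the install step at once. A rough sketch, assuming a hypothetical `install(host)` callable that returns True on success (SSH, an agent call, whatever you use):

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Sequence


def install_everywhere(
    hosts: Sequence[str],
    install: Callable[[str], bool],
    max_parallel: int = 10,
) -> Dict[str, bool]:
    """Run install(host) on every host, at most max_parallel at a time."""
    if max_parallel < 1:
        raise ValueError("max_parallel must be at least 1")
    # The pool size is the cap on simultaneous hits to the package server.
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        results = list(pool.map(install, hosts))  # results keep host order
    return dict(zip(hosts, results))
```

With a prebuilt image most of this pressure moves to the registry instead, but the same cap applies there if hundreds of hosts pull at once.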
>Instead you can write a block of code with Chef, for example, that abstracts away the differences between distributions. You can execute this same Chef recipe and it will work anywhere that a libxml2 package exists.
But this doesn't really work. What if the package is differently named on different distributions? What if one distribution's version of the package isn't compatible with your use of it? Besides, how often do you switch between distributions on your servers?
> What if the package is differently named on different distributions?
You write a case/when block that chooses the right name.
> What if one distribution's version of the package isn't compatible with your use of it?
Rarely happens, but then you just compile/package your own. Or solve the incompatibility from the other side.
> Besides, how often do you switch between distributions on your servers?
Rarely. More often you want to reuse configuration management code from a common repository which installs X, but abstracts the distro differences because many people collaborated on it. Also, often you're running multiple distros within the same company but it helps to have the same deploy/management process.
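For the naming case, the logic is small. Here it is sketched in Python rather than Chef's Ruby; the table, the family labels ("debian", "redhat"), and the logical name "libxml2-headers" are illustrative assumptions, with the -dev/-devel split being the usual example:

```python
from typing import Dict

# Hypothetical lookup table: logical name -> distro family -> real package name.
PACKAGE_NAMES: Dict[str, Dict[str, str]] = {
    "libxml2-headers": {"debian": "libxml2-dev", "redhat": "libxml2-devel"},
}


def package_name(logical: str, family: str) -> str:
    """Resolve a logical package name; fall back to the name itself."""
    return PACKAGE_NAMES.get(logical, {}).get(family, logical)
```

Only the packages whose names actually differ need an entry; everything else falls through unchanged.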
> You write a case/when block that chooses the right name.
I think this starts to get to the heart of the issue. This is why Configuration Management gets so complex - because your CM scripts need all of this special purpose logic to work around these kinds of environmental differences.
The CM scripts become their own source of complexity.
Abstract / concrete is a spectrum ;) It abstracts over all the things which are standardised, known ahead of time, and known at the time of writing the system. Things that people just come up with are always going to be unstable.
It's not that big of an issue in practice though. About the only package I can remember out of under a hundred I dealt with was mysqld/mysql-server. Names match up in almost every case, especially on user-facing things (less on -dev/-devel and similar ones)
I decided to use docker since a particular client I am working with mandated that we use bare metal servers they own. A problem that I haven't quite solved yet is that I have several distinct hosts running docker daemons, with certain apps designated for certain hosts. I wish there were a way for docker compose to know about multiple docker hosts. Docker swarm could maybe help, but the client specifically wants certain things to be redundant across two specific hosts.
Sounds like you might want CoreOS or just fleetd or Shipyard (which is Docker Swarm based). Is there any reason Docker Swarm can't support your redundancy needs?
No mention of how to debug code in containers... or shared containers created so developers can share libraries.... or a multitude of other things which do happen when you start letting developers directly push things to production
If you are using docker you need to ensure the docker container author takes the pager for the services he provides :)
> Not having to pull from git on every deployment eliminates the risk of a github outage keeping you from being able to deploy. Of course it’s still possible to run into this issue if you rely on DockerHub or another hosted image registry.