I feel like the long-term architectural implications of virtual machines and now containers haven't quite sunk in. I'm not talking about the administrative advantages, which I think everyone is across these days. I mean the implications for the design of new applications.
As far as I am aware, folk aren't really writing distributable applications that target the VM up. You can get preconfigured stacks, or you can get standalone apps that you install in your environment.
But nobody's said: "Hey, if we control the app design from the OS up, we can make it much more intelligent, robust and at the same time sweep away a lot of unnecessary inner platform nonsense".
In terms of the slides, my approach is to reduce the NxN matrix by eliminating a lot of the choices. Why write your blog engine to support 5 different web servers when you can select and bundle the web server? Repeat for other components.
It gets better. Why write a thin, poorly-featured database abstraction layer when you can take serious advantage of a particular database's features?
You can't do this if you write under old shared-hosting assumptions. You can do this if you target the VM or container as the unit of design and deployment.
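As a rough sketch of what "select and bundle the web server" might look like with something like Docker (the image, paths and app layout here are all hypothetical):

    # Build an image that ships the blog engine together with the one web
    # server it was designed for (nginx here), instead of supporting five.
    cat > Dockerfile <<'EOF'
    FROM ubuntu:12.04
    RUN apt-get update && apt-get install -y nginx python
    ADD . /opt/blog
    ADD nginx.conf /etc/nginx/nginx.conf
    EXPOSE 80
    CMD ["/opt/blog/run.sh"]
    EOF

    docker build -t myblog .
    docker run -p 80:80 myblog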
A step forward was made by RoomKey. You should read what their CTO wrote. At RoomKey, they made several radical decisions that gave them a very unusual architecture:
--------------
Decision One: I put relational data on one side and "static", non-relational data on the other, with a big wall of verification process between them.
This led to Decision Two. Because the data set is small, we can "bake in" the entire content database into a version of our software. Yep, you read that right. We build our software with an embedded instance of Solr and we take the normalized, cleansed, non-relational database of hotel inventory, and jam that in as well, when we package up the application for deployment.
We earn several benefits from this unorthodox choice. First, we eliminate a significant point of failure - a mismatch between code and data. Any version of software is absolutely, positively known to work, even fetched off of disk years later, regardless of what godawful changes have been made to our content database in the meantime. Deployment and configuration management for differing environments becomes trivial.
Second, we achieve horizontal shared-nothing scalability in our user-facing layer. That’s kinda huge. Really huge.
But then, you get to take responsibility for that entire stack. This is a bad thing.
Remember PHP register globals? Rails 2.x? Perl 4? I'd bet a lot of lazy devs would still be using those if they could just wrap it all up into a container and say "run this!" That's what commercial products do. And they're much worse for security as a result.
Fundamentally, I'd say the solution is to automate testing and installation. Make it extremely easy for a dev to test app A against a matrix of language implementations B, C, D, databases E, F, G, and OS platforms H, I, J. Make it easy to make packages that install natively on each platform, with the built-in package management tools. FPM and similar help with this. Nearly every platform will allow you to create your own package repos. Better tools = better code = more flexibility = less ecosystem dependency
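For instance (a rough sketch; the package name, version and test script are invented), fpm can turn one build tree into native packages for several distros, and the same brute-force loop extends to the test matrix:

    # Build native .deb and .rpm packages from the same application tree,
    # to be installed through each platform's own package manager.
    for target in deb rpm; do
      fpm -s dir -t "$target" \
          -n myapp -v 1.2.3 \
          --prefix /opt/myapp \
          ./build
    done

    # Same idea for the matrix: run the test suite against each database
    # you claim to support (run_tests.sh is a hypothetical entry point).
    for db in postgres mysql sqlite; do
      DB_BACKEND="$db" ./run_tests.sh
    done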
Containerization as a logical separation for security (a la chroot/jails before it) makes sense, but doing it so you can shove your whole OS fork in there and then fail to maintain it seems foolhardy and myopic.
What I notice about all your examples is that they are stack problems that an application could not, on its own, have fixed. In the current model, each part of the stack has an independent lifecycle, creating shear points and hidden security flaws.
If the application can control the whole stack, then the application author can fix it.
Automating test and install just puts you back where you started: with a gigantic test matrix that will impose non-trivial drag on the whole application's development.
And it's not necessary. It's just ... not. necessary.
> If the application can control the whole stack, then the application author can fix it.
You are right, but the other point is that it becomes the application author's responsibility to fix it.
If you're bundling apache httpd with your app, and there's a security flaw and a new version released, it becomes your responsibility to release a new version of your app with the new version of httpd.
If there are 1000 apps doing this, that's 1000 apps that need to release a new version, instead of the current common situation, where you just count on the OS-specific package manager to release a new version.
Dozens of copies of httpd floating around packaged in application-delivered VMs means dozens of different upgrades the owner needs to make, after dozens of different app authors provide new versions. (And what if one app author doesn't? Because they are slow or too busy or no longer around? And how does the owner keep track of which of these app-delivered VMs even needs an upgrade?)
You're describing what you see as the advantages of the shared hosting scenario and in the blog post I linked, I explain why I think that business will be progressively squeezed out by VPSes and SaaS.
In any case, there's no difference in kind between relying on an upstream app developer and an upstream distribution. You still need to trigger the updates.
And you might have noticed that stuff is left alone to bitrot anyhow.
I am not talking about an app that is distributed to be installed "on" OS X, BSD, Illumos etc.
I am talking about an app that is packaged to run "on" Xen, VMWare, or maybe docker (LXC) for some cases. Or zones for others. Or jails. Whatever.
The point is that you, the application designer, ask yourself, "what happens if I have total architectural discretion over everything from the virtual hardware up?"
But, rather than the panacea you envision, what I think would actually happen is that you end up with a lot of people doing substandard OS release engineering jobs, neglecting security patches, etc.
Or...
Cargo culting around a small number of "thin OS distributions", which is substantially the same as what we have today.
Heck, "total architectural discretion over everything from the (virtual) hardware up" is pretty much the definition of an OS distribution. Am I missing the point here? Is there something about this other than the word "virtual" slapped on there that's unique from what we have now?
Consider that a lot of applications, when shipped in VMs or containers, need very, very thin slivers of a full OS. Especially in something like an LXC container, which can easily be set up to share a subset of the filesystem of the host.
E.g. many apps can throw away 90%+ of userland. So while they need to pay attention to security patches, the attack surface might already be substantially reduced.
And LXC can, if your app can handle it, execute single applications. There doesn't need to be a userland there at all other than your app.
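For example (container name and paths hypothetical), lxc-execute will run a single binary inside a container without booting a full userland:

    # Run one application inside an LXC container: no init, no full
    # distribution userland, just the binary and whatever it needs.
    lxc-execute -n myapp -- /opt/myapp/bin/server --port 8080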
Now, it brings its own challenges. But so does trusting users to set up their environments in anything remotely like a sane way.
> Is there something about this other than the word "virtual" slapped on there that's unique from what we have now?
Yes: virtual machines and VPS hosting make it possible to bypass shared hosting. That means you needn't write apps which have to aim for lowest common denominator.
Edit: I agree that the approach I'm advocating introduces new problems. But obviously I think that it's still better than the status quo, which is largely set by path dependency.
I think you just end up moving the work around. Not sure the current concentration of security at a few points (distros) has scaled. Most web application developers do not use a distro stack anyway for much. Most of the security issues in a distro apply to stuff you don't use, although it may be installed. Traditional Unix was a much more minimal thing.
> The application author can provide an updated container if there's a security problem.
Yes, but now it's their responsibility to stay on top of updates for the entire stack, and push out updated containers whenever any part of it changes. Whereas in the traditional model staying on top of updates to anything other than the application itself is the responsibility of the user or his/her sysadmin.
Not saying containerization of apps isn't a promising concept, just that it does require the app developer to take on a lot of additional responsibility.
But doesn't that ignore the move toward IaaS we are seeing? Where a customer is buying compute time, instead of access to install an app onto a managed OS?
It's getting to the point where very soon we will have complete clouds on demand - we pretty much already have them, but soon that will be a trivially selected level of granularity.
We can deploy OpenStack clusters extraordinarily easily now with Fuel. Fully deployed app clouds are pretty much already here, if not just around the corner.
> "As far as I am aware, folk aren't really writing distributable applications that target the VM up. "
We are. You might want to check out Mirage [1] microkernels, which is an OS that targets the Xen Hypervisor. You write your application code and select the appropriate libs, then compile the whole thing into an 'appliance'. We have big plans for the kinds of systems we want to build using this, and if you'd like to keep up with it, please drop me an email. The devs are presenting a Developer Preview at OSCON so there'll be more activity soon.
Agreed, sort of. I actually referred to Mirage in my honours proposal and it was one of my inspirations.
I had in mind to use the facilities of an existing OS and its tools rather than reinventing them. My other major inspiration was the OK Webserver (and, through it, "Staged Event-Driven Architecture").
I host blogs and one of my pain points is that slow plugins hold up rendering.
What if, for instance, rendering is a graph of pipelines, and there's known logic for failing to meet a rendering deadline? So if you have the blog page and the %#%^^ "Popular Posts" plugin is running slow again, it doesn't slow down the whole site. That <div> merely shows old content, or is excluded.
You can then use standard operating system facilities to ensure that, for example, the "posts" and "page" modules get top priority. Then "comments". Somewhere way down the list might be "complicated plugin that talks to five remote servers which crash half the time".
You can do some or all of this within a programming environment. For the common case, only some. But why not use operating system facilities? They're already there, they're battle-hardened, they enjoy universal coverage of the system and are closer to the metal, real or virtual.
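As a crude sketch of the deadline idea using nothing but stock userland tools (the render_* commands are hypothetical): give the core renderers normal priority, and give the flaky plugin a niced-down process with a hard deadline, falling back to its last cached output if it misses.

    # Core content renders at normal priority.
    ./render_posts > out/posts.html
    ./render_page  > out/page.html

    # The slow third-party plugin runs niced down, with a 2-second
    # deadline; if it blows the deadline we keep its old output.
    if ! nice -n 19 timeout 2s ./render_popular_posts > out/popular.tmp; then
        echo "popular-posts missed its deadline, serving cached fragment" >&2
    else
        mv out/popular.tmp out/popular.html
    fi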
Mirage is, I think, more of a programming-language environment. Some of the facilities I'm pointing out already exist; some don't. I don't feel like writing all the missing bits from scratch when they're already available off-the-shelf.
I have been experimenting along these lines, by making a scripting interface to Linux that lets you do the basic stuff that you need to bring up a VM/container/hardware [1]. This includes bringing up networks, configuring addresses and routing. There is a lot to do (I still need to do iptables), but you end up with a script over Linux that configures it, using a scripting language and ffi, which you can compile into a single binary and e.g. run as your init process. Linux is a pretty decent API if you wrap some scripting around it rather than a lot of C libraries and shell scripts.
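The operations behind that script layer are the familiar iproute2 ones; in shell terms, bringing up a container's network is roughly this (names and addresses invented):

    # Create a veth pair, hand one end to a container's network namespace,
    # and give it an address and a default route.
    ip link add veth-host type veth peer name veth-guest
    ip netns add guest0
    ip link set veth-guest netns guest0
    ip addr add 10.0.3.1/24 dev veth-host
    ip link set veth-host up
    ip netns exec guest0 ip addr add 10.0.3.2/24 dev veth-guest
    ip netns exec guest0 ip link set veth-guest up
    ip netns exec guest0 ip route add default via 10.0.3.1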
You do not of course end up with a standardised interface from this, as it is dependent on the Linux environment you are running in (although a VM can standardise this). So I am also experimenting with userspace OS components, like the NetBSD rump kernel [2] in the same framework.
Also helping here is the shift (gradually) from applications (like web servers) to libraries that you can link in to your application, or full scripting inside (like openresty for Nginx), which addresses your bundling issue. If you are building single function applications that are then structured into larger distributed applications this is much simpler.
Back when I was preparing to take an honours year, I wrote and circulated several project proposals. One was to explore the argument above with a constructive project -- writing a blog [1].
An example I gave for the inner platform effect was the Wordpress file wp-cron.php.
It gets called on every request made to Wordpress because WP has no other way to arrange for scheduled tasks to be carried out. So you get a performance hit and your scheduling relies on stochastic sampling. Oh, and it stops working very well when (as inevitably happens) you slather Wordpress with caching.
In an OS-up design, you just delegate this to cron.
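Concretely (paths and schedule are only illustrative): turn off the in-request scheduler and let the system's cron hit wp-cron.php on a predictable schedule instead.

    # In wp-config.php, set: define('DISABLE_WP_CRON', true);
    # so Wordpress stops piggy-backing scheduled work on page requests.

    # Then let real cron fire it every five minutes:
    ( crontab -l 2>/dev/null; \
      echo '*/5 * * * * php /var/www/blog/wp-cron.php >/dev/null 2>&1' ) | crontab -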
Or plugins. In a standard current design, these can't be isolated. In an OS-up design, you can make them standalone programs with separate accounts that can't reach into and interfere with the core code. No more broken sites from a PHP error in a hastily-installed plugin. Similarly, you can control their access to the database (instead of a single shared login used by all code running in the application).
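A rough sketch of what that isolation could look like with ordinary OS and database facilities (the account, plugin and table names are invented):

    # Run the plugin as its own unprivileged system account...
    useradd --system --shell /usr/sbin/nologin plugin_popular
    sudo -u plugin_popular /opt/blog/plugins/popular_posts \
        --output /var/cache/blog/popular.html

    # ...and give that account read-only access to the one table it needs,
    # instead of the application's all-powerful database login.
    mysql -e "CREATE USER 'plugin_popular'@'localhost' IDENTIFIED BY 'secret';
              GRANT SELECT ON blog.posts TO 'plugin_popular'@'localhost';"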
And so on, and so forth.
[1] I'm happy to forward copies of the proposal. My email is in my profile.
Sounds like Erlang. (You ship a copy of the Erlang VM emulator ("ERTS") with your Erlang application "release", and upgrade the release as a whole, rather than upgrading just the application code.)
Of course, you could argue that the whole design of Smalltalk with its VM is already abstracting away the OS...
Indeed, one of the advantages of Java on the server side is arguably that it has its own VM, so that you target the Java platform rather than the OS platform.
Everywhere I've worked, the developers have needed to figure out the dependencies required to get the software working. Sometimes with the assistance of a dev-ops or ops guy.
All this does is say "while you figure that out, put it in a script". First, it means we can test the dependencies easily by re-running the script to see that it actually accurately reflects what needs to be done.
Secondly, when you're done, you have a reproducible deployment environment that massively simplifies ops and dev: Ops can decide on upgrades, re-run the scripts, have QA run their tests and know the upgrade won't break stuff in production. Dev can make code changes and be confident that what they hand off will actually work in the production environment because they've test deployed it on VMs built from identical templates.
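Even the crudest version of "put it in a script" pays off. Something along these lines (packages, paths and the app name are placeholders), kept in version control and re-run against a clean VM whenever anything changes:

    #!/bin/sh
    # Reproducible deployment for myapp: run on a freshly provisioned
    # Ubuntu VM; re-running should converge to the same state.
    # (Assumes the application tree has already been unpacked to /opt/myapp.)
    set -e

    apt-get update
    apt-get install -y nginx postgresql python-virtualenv

    id -u myapp >/dev/null 2>&1 || useradd --system myapp

    virtualenv /opt/myapp/env
    /opt/myapp/env/bin/pip install -r /opt/myapp/requirements.txt

    cp /opt/myapp/deploy/nginx.conf /etc/nginx/sites-enabled/myapp
    service nginx restart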
As long as your team can figure out how to deploy the software they write, they can do this. If they can't, you have bigger problems.
Here's a rather contrived example, but it illustrates the idea I'm getting at: "Will the inventory agent software be able to log in to audit this container environment when they're done building their release?"
The point is: a PaaS gives you the advantages of shared hosting, but a good PaaS properly isolates all the apps. It's definitely a good way to let developers focus on dev and leave ops and its constraints to trained teams.
I think docker or similar is a great step forward... One question in my mind though: what about the databases?
Say you have a web app and a reporting app that use the same database (and probably a communications framework - zmq server rabbitmq etc in the middle there as well). How does docker deal with the following:
* The data? I can see postgres, redis or whatever being packaged up into containers, but what about the data that they use? Will there be attachable storage? Will you share some exposed resource on the host? Will the data be another container on top of the database app container?
* Routing. How do you tell your reporting/web app containers "this is where your message bus and database live?"
* Coordination. I'm used to using something like supervisord to control my processes - what's the equivalent in docker land? Replace the scripts in supervisord with the equivalent dockerized apps? A docker file for the host specifying what to run? How do you know if your app that you've run inside docker has crashed?
* Or do you just package the whole lot above up into one container?
> * Coordination. I'm used to using something like supervisord to control my processes - what's the equivalent in docker land?
The question doesn't really make sense: The equivalent is supervisord running inside the lxc container.
> * Routing. How do you tell your reporting/web app containers "this is where your message bus and database live?"
How do you tell them in a cluster? This is a problem anyone who's ever needed to scale a system beyond a single server has already had to deal with, whether or not the application is packaged up in a container, and there's a plethora of solutions, ranging from hardcoding it in config files, to LDAP or DNS based approaches, to things like Zookeeper or Doozer, to keeping all the config data in a database server (and hardcoding the dsn for that), to rsyncing files with all the config data to all the servers, and lots more.
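The simplest of those options translates directly to containers; for example (addresses and image name invented), just hand the endpoints to the container as environment variables or a mounted config file:

    # Tell the app container where its database and message bus live,
    # the same way you would on a bare cluster: via environment/config.
    docker run \
      -e DB_HOST=10.0.0.12 -e DB_PORT=5432 \
      -e AMQP_URL=amqp://10.0.0.15:5672 \
      -v /etc/myapp/production.conf:/etc/myapp.conf \
      myorg/reporting-app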
I'm not sure there's going to be a bullet-proof solution to this. I believe that Docker is the right step forward, especially for the application containers. What I have been tinkering with is the idea of having smaller, configurable pieces of infrastructure and then providing a simple tool on top of that (e.g. 'heroku' CLI).
Once you are past the procurement and provisioning steps you really need a way to describe how to wire everything together. I definitely haven't solved it yet but I sure hope to! :)
Take a look at "juju". Canonical is doing a bunch of stuff in this area. Juju does service orchestration across containers. I don't particularly like how they've done it, but it shows a fairly typical approach (scripts that expose a standard interface for associating a service to another, coupled with a tool that lets you mutate a description of your entire environment and translates that into calls to the scripts of the various containers/vms that should be "connected" in some way to add or remove the relations between them)
To be fair, it's been a while since I've looked at it, so it could have matured quite a bit. I should give it more of a chance. My impression was probably also coloured by not liking Python... Other than that, the main objection I had was that writing charms seemed over-complicated (which might have been coloured by the Python examples...), and that there seemed to be too much "magic" in the examples. But I looked right after it was released, so it's definitely time for another look.
(EDIT: It actually does look vastly improved over last time; not least the documentation)
Specifically, I run a private cloud at work across a few dozen servers and a bit over a hundred VMs currently, and we very much need control over which physical servers we deploy to because the age and capabilities vary a lot - ranging from ancient single-CPU servers to 32-core monstrosities stuffed full of SSDs. They're also in two data centres.
When I last looked at juju it seemed to lack a simple way to specify machines or data centres. I just looked at the site again and it has a "set-constraint" command now that seems to do that.
The second issue is/was deployment. OpenStack or EC2 used to be the only options for deploying other than locally. Local deployment was possible via LXC. EC2 is ludicrously expensive, and OpenStack is ridiculous overkill to us compared to our current system (which is OpenVz - our stack predates LXC - managed via a few hundred lines of Ruby script) .
I don't know if that has changed (will look, though, when I get time away from an annoying IP renumbering exercise...), but we'd need either a lighter option ("bare" LXC or something like Docker on each host would both be ok) or an easy way to hook in our own provisioning script.
(EDIT: I see they've added support for deployment via MAAS at least, which is great)
Docker will soon expose a simple API for service discovery and wiring. This will standardize 1) how containers look each other up and 2) how sysadmins specify the relationship. The actual wiring is done by a plugin, so you are free to choose the best mechanism - dns, juju, mesos/zookeeper, or just manual configuration.
> The question doesn't really make sense: The equivalent is supervisord running inside the lxc container.
So the solution is to "batch up" a load of apps into one container and run with supervisor or something as per my last bullet? I had pretty much envisaged a one docker container per application type of model...
You can do that too. But that is a very different setup. If you build "single application" containers, then the container will stop if the application stops, and you can run supervisord from the host, configured to bring up the container.
If you build full containers, with a single application, you probably still want supervisord inside each container. EDIT: This is because the container will remain up as long as whatever is specified as the "init" of the container stays up, so in this case your app can die without bringing down the container, and something needs to be able to detect it.
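A sketch of the "full container" case (paths invented): supervisord inside the container acts as its "init", restarting the app if it dies without taking the container down.

    # Inside the container image: a minimal supervisord config that keeps
    # the app itself running.
    cat > /etc/supervisor/conf.d/myapp.conf <<'EOF'
    [program:myapp]
    command=/opt/myapp/bin/server
    autorestart=true
    stdout_logfile=/var/log/myapp.log
    redirect_stderr=true
    EOF

    # The container's entry point then just runs supervisord in the foreground.
    supervisord -n -c /etc/supervisor/supervisord.conf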
> The data? I can see postgres, redis or whatever being packaged up into containers, but what about the data that they use? Will there be attachable storage? Will you share some exposed resource on the host? Will the data be another container on top of the database app container?
Docker supports persistent data volumes, which can be shared between containers, between the underlying host and the container, or both.
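For example (paths and image names are illustrative), a host directory can be mounted into the database container so the data outlives the container, or one container's volumes can be shared with another:

    # Keep the data on the host, mounted at the path the database expects;
    # the container can be destroyed and recreated without touching it.
    docker run -v /srv/pgdata:/var/lib/postgresql/data myorg/postgres

    # Or expose one container's volumes to another.
    docker run --volumes-from pgdata-container myorg/backup-job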
> Routing. How do you tell your reporting/web app containers "this is where your message bus and database live?"
For now, docker leaves that to companion tools - which typically pass the information via command-line argument, environment variable, or some sort of network service (dns or some other custom protocol).
In the near future Docker will include a more standard way for containers to discover each other. Just like the rest of Docker, we will try hard to make it simple and easy to customize.
> Coordination. I'm used to using something like supervisord to control my processes - what's the equivalent in docker land? Replace the scripts in supervisord with the equivalent dockerized apps? A docker file for the host specifying what to run? How do you know if your app that you've run inside docker has crashed?
Docker itself monitors the process, so you could use it as a process monitor - but it's probably better to couple it with a more standard process supervisor, like supervisord, runit, upstart, systemd etc.
There are two ways to combine docker with a supervisor: docker on the inside, or docker on the outside. Both are used in the wild, but I think "docker on the inside" makes more sense. That is, have your init process monitor docker at boot, and make sure it's restarted if necessary.
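"Docker on the inside" can be as simple as a program entry in the host's supervisor that runs the container in the foreground, so the supervisor can track and restart it (image name hypothetical):

    # On the host: supervisord (or upstart/systemd) owns the container's
    # lifecycle and restarts "docker run" if it ever exits.
    cat > /etc/supervisor/conf.d/myapp-container.conf <<'EOF'
    [program:myapp-container]
    command=docker run myorg/myapp
    autorestart=true
    EOF
    supervisorctl reread && supervisorctl update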
> Or do you just package the whole lot above up into one container?
You can also do that (and some people do), which leads to the "docker on the outside" scenario, with supervisord or runit inside managing X processes. I would only recommend this as a stopgap, if you are unable or unwilling to break down your stack into multiple containers.
In the long run multiple containers is the way to go. It makes your stack easier to scale, maintain and upgrade.
I like Docker and I use it in dev/sandpit quite a bit lately... but I must admit, I don't quite follow the metaphor being pushed here.
Standard workloads don't require containers. You can have standard workloads on your physical or virtualised hardware. The choice between these is going to depend on a bunch of factors.
I don't see why they need to be bound together, as appears to be the case with Docker? Plus I don't see how that provides any special leverage (as opposed to having the choice).
Virtualized hardware and containers are roughly equivalent here - the reason for containers is lower resource usage, not some other fundamental difference vs. vm's.
But physical servers will not have "standard workloads" unless you wipe them every time you reinstall, or run a system that can guarantee the server is put in an identical state. That's not practical for most people to do.
The reason for binding containers or vm's together with "standard workloads" is thus because the key to standard workloads for most people is down to functionality for rapid, reproducible deployment coupled with isolation.
That Docker has so far been tied to LXC is a separate matter, but I frankly don't see that as a bad thing - too many virtualization / cloud solutions are hopelessly over-engineered.
1. For developer:
- your application works exactly the same way in your development, test and production environments, because it uses exactly the same OS/libs/configuration
- very fast snapshot/restore simplifies automatic tests and makes them practical (see the sketch after this list)
2. For system admin:
- configure your system once in order to make docker work, then run any docker image (program) with one and the same command
- run many applications which do not influence each other in a quite safe way
- download and run application without worrying about its dependencies
- very low footprint in terms of disk/memory/cpu usage in comparison with standard virtualization
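On the snapshot/restore point, a minimal sketch (image and script names invented) of how a test run can snapshot a container's state and restore it for the next run:

    # Prepare a container with fixture data, snapshot it as an image...
    docker run --name fixtures myorg/app ./load_test_fixtures.sh
    docker commit fixtures myorg/app:fixtures

    # ...then every test run starts from that identical snapshot.
    docker run myorg/app:fixtures ./run_tests.sh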
1. Agree that's handy in dev. For development a container is probably a good choice - in other environments virtualization or physical hardware may make sense.
"More consistent" dev/test/prod is great, but I don't need nor want identical. My needs in dev are different from production - e.g. Everything from basic settings, URLs, networking, reloading behaviour to performance tuning. My needs for my dev DB are quite different from those in production.
2. Agree, but this isn't unique to containers (which is probably what I was aiming at).
Low footprint is somewhat irrelevant to "configure once". You can manage this on physical hardware if needs be.
Low footprint is great when you're resource constrained, somewhat irrelevant at scale. Plus a VM gives you other features/tradeoffs in return for that cost. Different circumstances will tend to favour one over the other.
I find Docker useful. Containers and Standardised Workloads are great. I see the utility in both, my argument is I see them as orthogonal. Particularly referring to the "alternatives" at the end of the deck, which seemed unnecessary distinctions for me.
> My needs in dev are different from production - e.g. Everything from basic settings, URLs, networking, reloading behaviour to performance tuning. My needs for my dev DB are quite different from those in production.
But you'll need a lot of things to be configured identically, and you'll need it to be reproducible (say for when a server fails). That's the point - nobody forces you to have just one identical script.
The containers vs. vm's distinction otherwise makes a difference for one reason only: Because containers allow higher density, it is acceptable to employ them in situations where the cost of vm's would be too high.
This is how it hangs together with standardised workloads: They're lightweight enough that standardised workloads becomes viable for many more use cases.
(In fact, on Linux, LXC is built on top of cgroups/namespaces, which are far more flexible than that: instead of building full containers, you can run individual applications in their own containers, or you can have servers with built-in cgroups support that could e.g. do stuff like take an incoming connection, authenticate the user, look that user up and see "ok, this user's connection is allowed to use 5% CPU and 10% of disk IO and 100MB memory", fork() and assign the forked process to cgroups accordingly, while at the same time isolating it so the process in question can only see a certain directory, can't see other processes etc. Heck, you could even give the process its own network, and firewall it so it can only connect to certain specific network resources.)
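The raw cgroup interface behind this is just a filesystem; a rough sketch (cgroup v1 mount layout assumed, group name and limits invented):

    # Create a control group, cap it at ~5% of one CPU and 100MB of RAM,
    # then move a process into it.
    PID=1234   # the process you want to confine (illustrative)

    mkdir /sys/fs/cgroup/cpu/user42 /sys/fs/cgroup/memory/user42

    echo 100000 > /sys/fs/cgroup/cpu/user42/cpu.cfs_period_us
    echo   5000 > /sys/fs/cgroup/cpu/user42/cpu.cfs_quota_us
    echo $((100 * 1024 * 1024)) > /sys/fs/cgroup/memory/user42/memory.limit_in_bytes

    echo "$PID" > /sys/fs/cgroup/cpu/user42/tasks
    echo "$PID" > /sys/fs/cgroup/memory/user42/tasks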
I love Docker as much as the next pseudo-sysadmin (dev forced to do admin work), but I think the analogy with shipping containers is a bit of a stretch. LXC works only on Linux. It's not gonna help you with the BSDs, OpenSolaris or Windows.
Additionally, a lot of this feels like Docker is taking credit for LXC. Docker is a brilliant abstraction of LXC's obscure native interface, but LXC came before Docker and does most of the heavy lifting.
To your second point - I think you're right, but I think that pattern - the productization of existing technology - is a very powerful, positive thing. And picking the right abstraction, so that it feels - as you put it - brilliant (a view I share totally), is not trivial.
I don't get the hype. BSD and Solaris have had proper containers for years now. Just because Linux hasn't had any all these years and LXC is maturing only now, it is wrong to think that containers are the next big thing.
I also think it's strange. FreeBSD jails are essentially the same thing. When I describe Docker, I would call it "FreeBSD jails for Linux" rather than "lightweight virtual machine" or even "iPhone apps-style isolation".
But I do get the hype. BSD and Solaris, although they have many interesting benefits, just aren't that popular for a variety of reasons. I won't turn this into a discussion of why they aren't popular, but it's a fact that Linux is much more popular. Most people aren't going to switch even if BSD and Solaris provide more benefits. But now that an important feature has become available for Linux, I can see why people get excited.
It's like Node.js. Servers with evented I/O have existed for years, if not decades. But the fact that Node.js is Javascript made it mainstream, and that's why there's hype.
Actually the Linux "container" implementation (namespacing) is based on Plan 9, and is much more modular than FreeBSD jails. Not that LXC/Docker use that; they just use it as a jail...
Not in stock Linux - OpenVZ has been a bit like Xen that way, requiring a custom kernel. Which for most people means a dedicated host machine is needed.
Unrelated: Why are people familiar with Hacker News (e.g. the Docker folks) still using slideshare.net? That site feels like it's stuck around 2005. I thought everyone here knew about http://slid.es by now -- its UX is far superior.
I think http://speakerdeck.com/ is even better. IMO the UI is very cool and the entire site very easy to navigate; I actually find it hard to believe it's not a very popular slides sharing platform.
cgroup-type interfaces are not very portable; I haven't really looked at the interfaces. You could use a tun/tap interface and a userspace network stack (e.g. the NetBSD rump kernel is portable).
Right. You could have a container which has universal dylibs and static libs for x86_64/i386 compiled for Darwin in one directory and ELF shared objects and static libraries compiled for x86_64/i386 for Linux, logic to detect the platform and the main application binaries compiled for multiple platforms. And why not throw Windows in there too?
This would create a universal container, assuming all major OSes acquire facilities for process control groups, namespaces and chroot.
Disk space is no longer a consideration. The containers can be as big as we want - why not make them run natively everywhere?
Why bother? It'd be far simpler, and more resource efficient, to run whatever the user prefers of Xen/Virtualbox/Vmware or "bare" Linux as the base and not have to create monstrous franken-containers.
Well, speaking from personal experience, I develop on Mac OS X and deploy to Linux. It would be helpful to be able to run the same container on both for testing purposes.
There is one fallacy here: The 'thin' mobile client as being the future. Even though the world has moved towards thin client in the form of web clients, I firmly believe that the tides will turn - either by having the browser becoming a thick client itself (with local storage, lots of local logic) or by reintroducing native clients again - for which Containerization could play an essential role.
Imagine if we'd have a standardized container format on top of Linux, the BSDs including OSX and Windows with Cygwin - hello cross platform client applications with a single installer. Apple and Microsoft could integrate container technology in their AppStores and we could submit the same package everywhere.
The browser is already a thick client with local storage and lots of local logic. The difference is that the entire client is shipped to the browser on demand.
Ok, that's an interesting point. I'd argue that local storage hasn't really taken hold yet, i.e. the large majority of sites and apps don't work offline, but yeah, it already stretches the notion of a thin client (hence my beef about using this term on where we're headed in the future).
Would you say it's one of those things you feel the warm fuzzy feeling with after using it? I've been wanting to give it a try with all the hype surrounding it, but I don't see what it's going to do that provisioning a $10 Digital Ocean server wouldn't (albeit with a little bit of hassle).
I could literally type for days about the benefits of docker over a $10 digital ocean server, but no one would read it.
What I will say is this. If you're writing a trivial application that you and only you will ever need to work with, in an environment completely controlled by you, and you have a recipe that works - you're right, Docker probably isn't for you.
If you, like me, work with a huge product suite with many buildtime and runtime dependencies (services and applications), with many different runtime configurations, where even automated installation can take up to 15-20 minutes because of the sheer amount of work that's going on, there's a massive massive amount of efficiency to be gained in the dev/test/release/packaging process, let alone the massive amount of efficiency to be gained by the ops team in working in foreign environments.
There are certainly lots of other use cases (PaaS/SaaS are easy,) and those are valuable business building tools, but less interesting to me personally.
Docker and Digital Ocean servers are different things that can be used together.
Docker allows you to distribute applications that come bundled with their own OS-level packages/configuration.
Imagine you wanted to run Wordpress on your DO server. Instead of configuring a LAMP stack and setting up Wordpress on it, you could download and run a Wordpress docker container that came bundled with its own LAMP stack. It would be as simple for you as calling "docker run wordpress". Most importantly, its configuration would be completely isolated from that of the host machine.
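In practice the one-liner grows a couple of flags (the port mapping and data path here are only illustrative), but it stays a single command:

    # Map the container's web port onto the host and keep the blog's data
    # on the host so upgrades don't lose it.
    docker run -p 80:80 -v /srv/wordpress-data:/var/lib/mysql wordpress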
But that's not quite true is it? You would still have to open up ports on your host, configure security, and somehow do the meta-config for the dockerfile to make sure that if the host goes down, `docker run wordpress` is called on startup right?
I really want to get excited about docker, but I guess I just don't understand it. Any links to more specific use cases?
Stuff like apt-get works great until it doesn't. You've run into this no doubt. Jails/LXC abstract dependencies away in an attempt to avoid this class of problems.
In other words: this helps to avoid mutating your system state with every command you run from the shell. :)
How does Docker handle deploying to machines with wildly varying capabilities? Every machine you deploy on may have different configurations and performance tunings. Is there a Docker container that can run a DB well on both an EC2 small instance and a 16-core, 128GB RAM dedicated server?
Yep, jails, containers, Solaris zones, all the way back to IBM mainframes, have been similar core technologies. Docker itself isn't containerization; Docker builds on Linux containers but adds a layer to dramatically simplify the build, distribution and execution of containers, and that's what makes it game changing.
docker run github.com/some/project
That command will clone, build and execute a container based on the contents of a repository by simply dropping a Dockerfile in the root of the project. That's a fundamentally different level of usage beyond any existing container technology.
> "You can't do this if you write under old shared-hosting assumptions. You can do this if you target the VM or container as the unit of design and deployment."
Yes, this is one of my bonnet-bees, since at least 2008: http://clubtroppo.com.au/2008/07/10/shared-hosting-is-doomed...