I'm not saying those looking to fork Docker are wrong. I don't think that they are. But I think Docker's approach to Swarm is more useful than the roadmap that those organizations considering a fork wish to pursue.
Kubernetes, Mesos, etc. appear to be great orchestration tools for an organization with a few [or many] engineers dedicated to operations. They're not so great for a small team [or individuals] who are just trying to deploy some software.
As I see it, Swarm seeks to solve orchestration analogously to the way Docker seeks to solve containers. Before Docker, LXC was around and the Googles of the world had the engineer-years on staff to make containers work. Docker came along and improved deployment for the ordinary CRUD on Rails developer who just wants to go home at night without worrying about the pager going off.
To put it another way, it looks to me like the intent of Swarm is to provide container orchestration for people who don't run a data center. Like Docker, it is an improvement for those scaling up toward clusters not down from the cloud.
None of which is to say that moving fast with Swarm isn't a business strategy at Docker. There's a whole lotta' hole in the container market and part of that is because the other organizations currently supporting development of container orchestration tools have business interests at a much larger scale... Google doesn't see a business case for pushing Kubernetes toward the Gmail end of the ease of use spectrum.
The desire to fork is based on the needs of the cathedral not those in the bazaar.
> Kubernetes, Mesos, etc. appear to be great orchestration tools for an organization with a few [or many] engineers dedicated to operations. They're not so great for a small team [or individuals] who are just trying to deploy some software.
In my opinion, the opposite is true about Kubernetes (though it may be true about the Mesos stack). It's a great way to reduce the need for dedicated ops engineers.
Once you have a Kubernetes cluster running, it's essentially a "platform as a service" for developers. They can deploy new pods and configure resources all with a single client. You generally no longer need to think about the servers themselves. You only need to think in terms of containers. No package management, no configuration management (in the Salt/Puppet/Ansible/Chef sense), no Unix shell access needed.
It's hugely liberating to treat your cluster as a bunch of containers on top of a virtualized scheduling substrate.
Kubernetes itself is very lightweight, and requires minimal administration. (If you're on GKE, K8s is managed for you.)
Swarm Engine is very much knocking off Kubernetes' features. Swarm offers the convenience of building some orchestration primitives into Docker itself, whereas Kubernetes is layered on top of Docker. Other than that, they're trying to solve the same thing. I'd say Kubernetes' design is superior, however.
You still need to provision the Kubernetes/Docker servers? You still need Ansible/Puppet for that part.
There aren't a lot of "Here's an API to throw up your containers and provision IPs" services out there. I see that Amazon has one (I avoid Amazon; they're the new Wal-Mart), but no one else really. I mean DigitalOcean provides CoreOS, so in theory you can provision one of those droplets and toss up some containers, but there's not a real API for "use this endpoint to deploy your container."
In the corporate world, yes: we have devops teams to try out and create provisioning systems to throw up Mesos or CoreOS or OpenStack clusters. Once they're up and your devs are using them, they're lower maintenance than provisioning individual servers for individual services. For home devs, it'd be nice to have a plug-and-forget option for Docker containers (other than Amazon).
The other thing I still don't get about Docker: storage. MySQL containers have instructions for mounting volumes, but it doesn't feel like there's a good, standard way to handle storage. I'm sure that makes it more versatile, but like I said previously: plug and forget for home devs and startups. If you want a database as a service, get ready to shell out $80/month minimum for Amazon RDS or create a set of scripts around your Docker containers that ensure you have slaves and/or backups.
> You still need to provision the Kubernetes/Docker servers? You still need Ansible/Puppet for that part.
Also true about Docker Swarm.
> There aren't a lot of "Here's an API to throw up your containers and provision IPs" services out there.
I strongly believe this (in particular, "hosted Kubernetes as a service") will start arriving soon. It's a great business opportunity.
> The other thing I still don't get about Docker: storage. [...]
Kubernetes solves this. For example, if you're on GCE or AWS, you can just declare that your container needs a certain amount of disk capacity, and it will automatically provision and mount a persistent block storage for you. Kubernetes handles the mounting/unmounting automatically. Kill a pod and it might arise on some other node, with the right volume mounted.
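To make "declare that your container needs a certain amount of disk capacity" concrete, here is a minimal sketch that builds a PersistentVolumeClaim manifest as a plain Python dict and prints it as JSON. The claim name `mysql-data` and the `20Gi` size are made up for illustration, and whether a disk actually gets provisioned automatically depends on the cluster having a suitable storage class configured.

```python
import json


def make_volume_claim(name: str, size: str) -> dict:
    """Build a PersistentVolumeClaim manifest as a plain dict."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            # ReadWriteOnce: the volume is mounted read-write by one node.
            "accessModes": ["ReadWriteOnce"],
            # The only thing we declare is how much capacity we need.
            "resources": {"requests": {"storage": size}},
        },
    }


# Hypothetical name and size, chosen only for the example.
claim = make_volume_claim("mysql-data", "20Gi")
manifest_json = json.dumps(claim, indent=2)
print(manifest_json)
```

Since kubectl accepts JSON as well as YAML, output like this could be piped to `kubectl apply -f -`, and a pod would then reference the claim by name in its volumes section.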
Don't try to put storage into containers at this point. You'll just be in for a bad time.
Either run a separate VPS for MySQL or use a hosted solution.
Containerization will get there, but I think putting some types of old-school apps in containers is a bad idea.
E.g. redis, beanstalk, and similar fit nicely into clusters, and nodes can be restarted without too much issue or downtime.
MySQL is different: if your install is large enough, you never want it to go down. That pretty much goes against everything containerization is about, and your developers gain no benefit from it being in a container (apart from maybe running a small db in development only, where provisioning a VM would be too painful).
I don't disagree. I think the differences lie in reducing versus eliminating a requirement for operations engineers and the level at which abstractions are surfaced. To me, Swarm looks headed toward abstractions at a higher level than Kubernetes much like Docker surfaces higher level abstractions than LXC.
Maybe that's the difference between a product and a solution. I don't know.
> To me, Swarm looks headed toward abstractions at a higher level than Kubernetes
Kubernetes admittedly takes a slightly different, modular, layered approach, whereas Swarm is simple to the point of naivety.
This simplicity is potentially a threat to its future ability to adapt to different requirements, whereas Kubernetes offers a separation of primitives that allows it to scale from "simple" to "complex".
For example, in Kubernetes, a deployment is a separate object from the pods themselves. You create a deployment, which in turn creates a replica set, which in turn creates or tears down pods as needed.
You don't really need to work directly with replica sets or know exactly how they work, but they're there, and can be used outside of a deployment. If all you care about is deploying apps, then you only need to deal with the deployment.
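The layering described above can be sketched as a toy model. This is not how Kubernetes is implemented; the class names mirror the Kubernetes objects, but the `reconcile` methods and the pod naming are assumptions made up for the illustration, and a real deployment does rolling updates rather than swapping the replica set in one step.

```python
from dataclasses import dataclass, field
from itertools import count
from typing import Optional

_pod_ids = count(1)  # hypothetical pod name generator


@dataclass
class ReplicaSet:
    image: str
    replicas: int
    pods: list = field(default_factory=list)

    def reconcile(self) -> None:
        """Create or tear down pods until the count matches `replicas`."""
        while len(self.pods) < self.replicas:
            self.pods.append(f"pod-{next(_pod_ids)}:{self.image}")
        while len(self.pods) > self.replicas:
            self.pods.pop()


@dataclass
class Deployment:
    image: str
    replicas: int
    replica_set: Optional[ReplicaSet] = None

    def reconcile(self) -> None:
        """Own a replica set for the current image and keep it sized."""
        if self.replica_set is None or self.replica_set.image != self.image:
            # Toy behaviour: a new image gets a fresh replica set at once.
            self.replica_set = ReplicaSet(self.image, self.replicas)
        self.replica_set.replicas = self.replicas
        self.replica_set.reconcile()


# The user only ever touches the deployment.
app = Deployment(image="web:1.0", replicas=3)
app.reconcile()
print(len(app.replica_set.pods))

app.replicas = 1  # scale down; the replica set tears pods down
app.reconcile()
print(len(app.replica_set.pods))
```

The point of the separation is that `ReplicaSet` is usable on its own, while someone who only wants to deploy an app never has to look past `Deployment`.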
Exactly. To me, that approach is what allowed Docker to make containers on the laptop ubiquitous. Kubernetes is unlikely to be the tool that takes schedulers to that universe. Swarm might and I think that's the goal.
I mean, I don't really want to care about pods and replica sets. They're obstacles in between me and what I want when what I want is more horsepower behind an app. For the same reasons I probably am better off with garbage collection than malloc [1]. I've only got so many brain cells.
It's not useful though. You can't take Docker "laptop containers" and use them in production. At best it's a development tool; at worst it creates yet another environment to support, with its own differences to test for.
The nice thing about Kubernetes is that you don't really need to know much about those things, but if you do need to "level up" in terms of production complexity, those tools are there to use.
I'm not sure Swarm has its place yet, but rather that it is actively looking for it and that's the basis for the complaints. Kubernetes and Mesos have market segments that are distinctly different from that which Swarm appears to target. It's roughly those capable of moderately sophisticated devops versus anyone who can use Docker.
Those two are just examples. My point is that if you need to dig in, you can. You can do this because those higher-level behaviors are built on a solid foundation of well-thought-out primitives.
What I expect to happen is that we will have a diverse set of things built on top of Kubernetes. Some of it will get folded into Kubernetes core. Some will not. I think the center of gravity for containers has been shifting towards Kubernetes.
Kubernetes deployments are as high level as you can get while still having anything to do with actual operational environments. You don't have to use the lower level constructs if you don't want to. If you don't understand what either is doing, you're not going to be able to operate it anyway, and all of these still require operators.
When you say K8s is lightweight, I presume you're purely talking about GKE deployments, and not on-prem or alternate cloud deployments?
My experience of K8s so far is that GKE is the happy path and most of the demonstrations/documentation are focused on that use case. When you step off that path, you can either go for something scripted which does a lot of things in the background, or what seems like quite an involved manual setup (etcd, controller node setup, certificate setup, networking, etc.).
If you don't feel like watching a video, here are the commands to get a fully production-ready cluster going:
master# apt-get install -y kubelet
master# kubeadm init master
Initializing kubernetes master... [done]
Cluster token: 73R2SIPM739TNZOA
Run the following command on machines you want to become nodes:
kubeadm join node --token=73R2SIPM739TNZOA <master-ip>
node# apt-get install -y kubelet
node# kubeadm join node --token=73R2SIPM739TNZOA <master-ip>
Initializing kubernetes node... [done]
Bootstrapping certificates... [done]
Joined node to cluster, see 'kubectl get nodes' on master.
All certs, all configurations, etc - done. No bash, no mess. All you need is a package manager and docker (or your container runtime of choice) installed.
I think one of the challenges for me in learning Kubernetes is that a lot of the tutorials take quite a "run this script and things happen" approach, which is great in that it gets you up and running quickly, but is less good for understanding what's under the hood, which is really needed before some people will feel fully comfortable with the technology.
Obviously there's Kelsey Hightower's great Kubernetes the hard way tutorial, but even there some details are quite GKE specific.
It'd be nice to see some tutorials that make no assumptions about running on a specific cloud provider or using automated tools to bootstrap the cluster.
Oh, for sure! Being totally script free (and cloud neutral) is the goal of this effort. As you see above, there's absolutely ZERO script/bash/automation/etc.
My suggestion is to design an interface for Kubernetes in ways less reflective of Conway's Law. That's how the needle moves from 'solution' toward 'product'. In essence, that's what I think 'Gmail Easy' boils down to. Gmail has user stories that are based on people (much)+ less technically sophisticated than a fresh-from-school Google SRE.
I think that the call to action suggests why Docker is pursuing Swarm.
In a decade of container orchestration development, creating a good onboarding experience with the diverse demographic of Docker users has never been a priority within Google. If it had been a priority, Gmail-easy Kubernetes integration might have long since been a pull request from Google. That's what it says right on the label of the open-source software can.
The talk about forking Docker seems consistent with the absence of pull requests. It's just another business decision.
People on the bazaar who just want to deploy a CRUD web app you mentioned shouldn't be getting into the containers/orchestration in the first place. Just go for some PaaS or rent a server or two, write shell or python scripts to set them up and save yourself all the hassle.
The benefits of containerization are not limited to production deployment. Docker makes it trivial to run the exact same container in QA/dev as well - it's a 10 line Dockerfile and three commands (create, build, run) vs having to build custom shell/python scripts. Don't underestimate how useful that is, especially in smaller offices where you don't have the staff dedicated to automating everything. Even small shops also benefit from the ephemeral nature of containers - redeploying a container is a lot quicker and easier than redeploying an entire VM/physical server. PaaS isn't without its own issues either (you have to learn the AWS/Google/OpenShift/Azure/Heroku way of doing things) and can be cost prohibitive.
What does Docker bring to this? This is basically the argument "use your production deployment mechanism to create your development environment". There are lots of ways other than Docker to do this, Vagrant being one of the most prominent.
One of the differences between a containerized deployment and scripted provisioning with tools like Vagrant is the state of a node following a deployment failure.
Deployment of a container to a node is roughly an atomic transaction. If it fails due to a network partition, server crash, etc., the container can just be deployed to the node again. By comparison, a partially executed provisioning script can leave the target node in an arbitrary state, and what needs to happen next to bring the node online depends on when and how the deployment script failed as well as the nature of any partial deployment... and whatever state the server was in prior to deployment.
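A rough sketch of that difference, with a node modeled as a plain dict. Everything here (the step functions, the `pull` callback, the simulated network failure) is hypothetical and only meant to show where the node ends up after a failure, not how Docker itself works internally.

```python
def provision_in_place(node: dict, steps: list) -> None:
    """Mutate the node step by step, the way a provisioning script does."""
    for step in steps:
        step(node)  # if this raises, earlier changes stay applied


def deploy_container(node: dict, image: dict, pull) -> None:
    """Fetch the whole image first, then swap it in with one assignment."""
    fetched = pull(image)      # may raise; the node is untouched so far
    node["running"] = fetched  # the only mutation of the node


def install_runtime(node: dict) -> None:
    node["runtime"] = "installed"


def write_config(node: dict) -> None:
    raise ConnectionError("simulated network partition")


def start_app(node: dict) -> None:
    node["app"] = "started"


def failing_pull(image: dict) -> dict:
    raise ConnectionError("simulated network partition")


def working_pull(image: dict) -> dict:
    return dict(image)


# Scripted provisioning: the failure leaves a half-configured node.
script_node = {}
try:
    provision_in_place(script_node, [install_runtime, write_config, start_app])
except ConnectionError:
    pass
print(script_node)

# Container deployment: the failure leaves the node as it was, so retry.
image = {"runtime": "installed", "config": "written", "app": "started"}
container_node = {"running": None}
try:
    deploy_container(container_node, image, failing_pull)
except ConnectionError:
    pass
state_after_failure = dict(container_node)
deploy_container(container_node, image, working_pull)
print(state_after_failure, container_node)
```

After the failed script run, recovery depends on knowing which steps already happened; after the failed container deploy, the retry is the same call as the first attempt.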
That's a reason why Docker is great for production deployment. If you don't use Docker for production your production deployment scripts have to deal with that.
But if you use your production deployment scripts for development deployments, then that problem has been dealt with one way or another.
Absolutely. One can reinvent any part of the Docker [+ GKE] setup from first principles on either dev or prod axis. That being said, it's nice to spend a day to set up Docker [+ GKE], and have [most of] the "babysit apt-get" and "babysit prod runtimes" taken care of.
Not being a containerization user currently, I cannot vouch for the accuracy of the above assertions, but this reasoning is why I've been pondering moving from VMs to containers at our consultancy. We regularly have to "approximate" customer setups for development, and distributing an environment definition to peer developers seems more appropriate than the umpteenth VM clone.
I can very much recommend that approach. Working in a very small team we're using docker solely for reproducible development environments with very small footprint. Our projects are usually smaller low range backoffice sites.
We're still hesitant to deploy docker in production though, but it's on our list. For now we rebuild the production environment from the recipes in the Dockerfile and docker-compose.yml.
Before that we were using VMs with vagrant. I never really liked how resource-intensive and slow they were. And without VMs I also don't see the need for Vagrant anymore if I can use docker directly.
So after 2 years of using docker I'm really very happy with our development setup.
It's just adding yet another layer to the stack, so another interface to learn. And if things don't work you have to deal with the "low-level" docker setup anyway.
Seriously, I've never looked back at Vagrant ever since.
> People on the bazaar who just want to deploy 3 node web app with some web server, some sql database and some caching layer shouldn't be getting into the containers/orchestration in the first place. J
But that goes against the main tenet of hype-driven development. If it is on the front page of HN, CTOs should quickly scoop up the goodies and force their team to rewrite their stack without understanding how the technology works. Things will break, but hey, they'd be able to brag to everyone at every single meetup about how they are using <latestthing>. Everyone will be impressed, and it also looks good on the resume.
/s (but I am only half-joking, this is what usually happens).
Really? I use kubernetes right now for a small use case (that hopefully grows) like this. I find it to be pretty exceptional, although the initial cluster setup can be difficult.
I also like using multiple vendors for a tool stack. Helps keep concerns separated.
People in the bazaar know the math and have figured out that they can get a second-hand Xeon server or two from eBay, put it into a cabinet with AC and electricity, and within a few months it will be cheaper than continuing to rent PaaS.
For internal apps, that might be way better and cheaper than to run on AWS or GCP.
In my comment I considered mentioning that Google might be seen as having a commercial interest in a trend toward CRUD on Rails developers paying for PaaS/IaaS on its proprietary platforms as a reason for not making Kubernetes as dumb-easy as Docker for an average developer.
On the other hand, I don't think that roll-my-own shell scripts are going to solve the problem Swarm/Kubernetes/Mesos address -- scheduling -- as well as those tools will. The reason I think that is that many scheduling problems are NP-complete [1] and dynamically optimizing a scheduler for a particular workload is a non-trivial algorithmic challenge even at the CS PhD level.
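To give a feel for the kind of problem involved: placing containers onto nodes by resource demand is a form of bin packing, and in practice it gets approximated with heuristics. Below is a minimal sketch of one classic heuristic, first-fit decreasing. It is not what Swarm, Kubernetes or Mesos actually run; the container names, the demands, and the single "capacity" number per node are all made up for the example.

```python
def first_fit_decreasing(demands: dict, node_capacity: int) -> list:
    """Place containers on as few equal-sized nodes as the heuristic finds.

    Containers are sorted by demand, largest first, and each one goes on
    the first node that still has room. A new node is opened when none fits.
    """
    nodes = []  # each node: {"free": remaining capacity, "containers": [...]}
    ordered = sorted(demands.items(), key=lambda kv: kv[1], reverse=True)
    for name, need in ordered:
        if need > node_capacity:
            raise ValueError(f"{name} needs {need}, more than one node holds")
        for node in nodes:
            if node["free"] >= need:
                node["free"] -= need
                node["containers"].append(name)
                break
        else:
            # No existing node had room, so open a new one.
            nodes.append({"free": node_capacity - need, "containers": [name]})
    return nodes


# Hypothetical workload: resource units needed per container.
demands = {"web": 4, "db": 7, "cache": 3, "worker": 5, "cron": 1}
placement = first_fit_decreasing(demands, node_capacity=10)
for index, node in enumerate(placement):
    print(index, node["containers"], "free:", node["free"])
```

Even this simple version ignores everything that makes real scheduling hard: multiple resource dimensions, affinity constraints, workloads that change over time, and nodes that fail.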
Sometimes CRUD on Rails turns into Twitter. Containers might have pushed the switch to Microservices down the road and allowed an earlier focus on monetization.
> Kubernetes, Mesos, etc. appear to be great orchestration tools for an organization with a few [or many] engineers dedicated to operations. They're not so great for a small team [or individuals] who are just trying to deploy some software.
I disagree. I worked with Kubernetes on a small team, and its core abstractions are brilliant. Brilliant enough that Docker Swarm is more or less copying the ideas from Kubernetes. Even so, Docker Swarm just isn't there yet.
> As I see it, Swarm seeks to solve orchestration analogously to the way Docker seeks to solve containers.
That's how I see it too. However, multi-node orchestration is MUCH harder. The Kubernetes folks know that. Setting up K8S from scratch on a multi-node setup is still difficult, and the community acknowledges it. Contrast that with the way Docker is handling it. Docker seems to be trying to sweep difficult issues under the rug. Look at the osxfs and host container threads on the Docker forum as examples. People had to ask the developers for more transparency before it started showing up.
I don't blame Docker for how they got to "sweep difficult issues under the rug", since that is part of their product design DNA. It's just that things are coming apart at the seams.
> Google doesn't see a business case for pushing Kubernetes toward the Gmail end of the ease of use spectrum.
That's correct. That is what GKE is for. On the other hand, that is what Deis/Helm is for. Turning the project over to the Cloud Native Computing Foundation, listening to the developer community ... there are a lot of things Docker can learn from the Kubernetes project when it comes to running an open-source project.
So what should cathedrals do? Rely on the unpredictable bazaar?
> They're [Kubernetes, Mesos, etc] not so great for a small team [or individuals] who are just trying to deploy some software
The industry consists not only of small teams and individuals; we (by this I mean all of IT) also use services provided by cathedrals (AWS), which have to have something stable to rely on.