Right tool for the right job. Is K8s too complicated? For some use cases it is.
They probably should do a better job of discouraging certain use cases, but calling their elevator pitch “bullshit” is hyperbolic.
There are exceptions to every rule, but a good rule of thumb is cluster size. If you’re managing fewer than 25 servers, then K8s is probably overkill. As you start to creep north of 40 servers, K8s really starts to shine. The other place K8s really shines is dynamic load. I manage anywhere from 600-1000 16GB VMs, and I can’t imagine doing it without K8s.
If cluster size isn’t a good rule of thumb then application architecture probably is. If no one person in your company has a complete mental model of the application architecture or it is impossible because it is so complex, again container orchestration might be a good way to go.
Final point:
If you’re struggling with K8s swallow your pride and buy a managed solution, then learn as you go.
Even though I fully agree with you, running a little 3-node cluster just for fun is amazing. Thanks to Rook and an Nginx ingress controller with kube-lego, I’m able to deploy applications leveraging distributed storage and getting TLS-secured endpoints without a single ssh session. This, from my point of view, is absolutely powerful.
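For anyone curious what that looks like in practice, it boils down to one small Ingress manifest, roughly like the sketch below (hostnames and service names are placeholders, and the tls-acme annotation is, if I remember correctly, what kube-lego watches to go and fetch the certificate):

    apiVersion: extensions/v1beta1
    kind: Ingress
    metadata:
      name: myapp
      annotations:
        kubernetes.io/ingress.class: "nginx"
        kubernetes.io/tls-acme: "true"    # asks kube-lego to provision a cert
    spec:
      tls:
        - hosts:
            - myapp.example.com
          secretName: myapp-tls           # the issued cert lands in this Secret
      rules:
        - host: myapp.example.com
          http:
            paths:
              - path: /
                backend:
                  serviceName: myapp      # a plain ClusterIP Service in front of the pods
                  servicePort: 80

Apply it with kubectl apply -f, and the ingress controller plus kube-lego take care of routing and TLS.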
Shameless plug: I've been working on a project explaining how to run small-scale clusters in the cloud for more than a year: https://github.com/hobby-kube/guide
Not really; I had an issue that silently broke renewals. When I found the open issue that corresponded to it the maintainers were herding people to cert-manager. There are a lot of issues where they are doing that (besides very obviously at the top of the README with warning symbols).
At the very least it's eventually going to break on newer versions of Kubernetes. Maintenance-only != LTS.
Maintenance-only usually means deprecated. There are some areas in the software world where maintenance-only can last a long time and can be de facto LTS. The Kubernetes ecosystem is not one of them.
Depends on where you come from. In the JS world, stable means there's a fork in it, and it's grayer than last week's meat. In the server-side world, it means you're good to build a business on it for 15 years.
Top voted reply: You had to use a thing to do what you want, and that thing isn't even supported any more. [Either I was hallucinating at the time and imagined it, or there was a reply that got deleted]
I think you've just quite nicely demonstrated the original author's point.
The words have meaning. If you don't understand them you could just ask for them to be explained rather than throwing out insults. I suspect you actually do understand the meaning of what was written though so I suppose that means you're just trying to start a flamewar.
The problem is that "leveraging" can be replaced with "using" every time, with the added benefit that it won't leave a bad taste in the reader's mouth.
I'd say it's actually just a convention vs configuration argument.
Kubernetes isn't giving you anything you couldn't have already built with configuration management. It just happens to be a standard written by a bunch of people with a background in the problem domain.
Personally, I think the abstractions are thoughtful and the system isn't really inherently more complicated than what you'll eventually build anyway, but as the article said, we all have a bias towards that crap we ourselves invented, because we already know how that works, and we find learning something like Kubernetes a chore.
That said, I wouldn't say it's about how many servers you manage. If you have a single monolith, written and run by a single team, it doesn't necessarily get more complex as you throw more servers at it. But when you have several teams, writing and hosting several systems, it's nicer to have a convention framework instead of an undocumented snowflake platform.
I have to agree with your final point though. If you find running Kubernetes complicated, go to GKE and treat it like you do your IaaS provider, which is more complicated but isn't something you typically deal with.
And there are two sides to the equation: even if you have a world-class orchestration platform, if your application architecture is extremely stateful and can’t tolerate certain services being down, the platform can not help. Most places I’ve observed (large and small) simply do not write software in a manner that can be containerized effectively. The abstractions and patterns developers default to in most languages tend toward (emergent) system designs where people as a rule write and cache files locally, datastores are expected to stay up forever and never change hostnames, and service discovery is deemed “too complicated” to ever use.
Configuration management absolutely can do this. Note that I'm a builder of kubernetes clusters (on-premise) but also have contributed 400+ patches to the salt configuration management tool.
Salt has an "orchestrate layer"[1] which allows running states on sets of minions. One of those layers can be to configure a service, and another can be to update the load balancers when said service is health-checking green. Saying these things simply can't be done with configuration management is utterly false. Kubernetes just makes it easier and more approachable for those less skilled in under-the-covers systems and infrastructure work. Kubernetes is a tool that allows you to build things quickly, but it isn't for everyone.
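To give a flavour of that, here's a minimal orchestrate sketch (state IDs, grain targets and SLS names below are hypothetical): roll the app tier first, then refresh the load balancers once that state has applied. Gating on a live health check would take a bit more (e.g. a salt.function step), but this is the general shape.

    # /srv/salt/orch/deploy.sls -- run with: salt-run state.orchestrate orch.deploy
    deploy_app_servers:
      salt.state:
        - tgt: 'role:app'
        - tgt_type: grain
        - sls:
          - app.deploy

    update_load_balancers:
      salt.state:
        - tgt: 'role:haproxy'
        - tgt_type: grain
        - sls:
          - haproxy.backends
        - require:
          - salt: deploy_app_servers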
It's not about skill and approachability; it's about not having tens of thousands of systems in different companies built out of hacks and science experiments, where experience isn't portable and every system's hacks and idiosyncrasies need to be learned anew. K8s raises the abstraction level on deployment of a multi-node application architecture.
Further, as it builds out standard abstractions for external services, it should enable applications to be written agnostic as to the specific cloud provider. That's pretty valuable. It has the potential to reduce AWS's lock-in advantage.
Yes, I'm aware and you're preaching to the choir. I'm literally the lead on $employer's k8s deployment. I was just pointing out that things like k8s can absolutely be done reliably with old school bare metal and config management + some orchestration glue. Only it isn't as simple for most and generally is done poorly. I'm a huge fan of k8s.
You're trading operational complexity in other areas for arguable operational functionality that CM of haproxy (or perhaps Microsoft's load balancer if you need UDP LB services) already provides, in software that is still under active development and has sharp edges. Your systems should be as simple as possible, but no simpler than that.
Kubernetes is not turnkey and requires a significant operational time and resource commitment. Prepare accordingly.
> If you have a single monolith, written and run by a single team, it doesn't necessarily get more complex as you throw more servers at it.
In which world does our monolith only connect to a database that is hosted on a single node ;)
Most often we have at least 3 database nodes, 3 redis nodes, whatever. Even when we run everything on VMs instead of bare-metal instances (for which k8s even works), we would still need to manage them, and that creates an insane amount of work to manage them and keep them running. In the worst case people would ssh into them.
> If no one person in your company has a complete mental model of the application architecture or it is impossible because it is so complex,
One caveat to this: architectures that are complex because of poor design should probably be redone rather than hidden behind yet another layer of complexity.
Or find someone to train you. The Kubernetes stack is deep. There's no shame in being taught this stuff. Nobody is born already knowing all the nuances.
As an OpenShift consultant for Red Hat, in a single day I've run a tcpdump trace to figure out why a particular NFS vendor had trouble with traffic over a new network VLAN, then spent an hour or two with a dev in Chrome DevTools to help them get session management cookies straightened out. Some people call that a day spent as a "Full Stack Engineer"; I call that Tuesday. And I can't tell you how many obscure corner-case scenarios in which I've found bugs in Kubernetes, OpenShift, or a client's app.
I think it's actually good as soon as you're doing microservices and don't want to use any managed vendor lock-in service like App Engine.
The setup is easy if you use a managed kubernetes offering and it provides so many things out of the box you'd otherwise have to take care of yourself. Service discovery, service lifecycles, updates, storage management, logs, load balancing (inter-service), all those things you should automate even if you have 5 servers, and k8s makes this darn easy in my opinion.
The declarative abstraction is also very easy to use and intuitive in my opinion.
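As a rough illustration of how little you write to get a replicated, discoverable, load-balanced service (image and names below are placeholders): the Deployment keeps three replicas running and rolls them on update, and the Service gives them one stable in-cluster DNS name that balances across the pods.

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: api
    spec:
      replicas: 3
      selector:
        matchLabels:
          app: api
      template:
        metadata:
          labels:
            app: api
        spec:
          containers:
            - name: api
              image: registry.example.com/api:1.0.0   # placeholder image
              ports:
                - containerPort: 8080
    ---
    apiVersion: v1
    kind: Service
    metadata:
      name: api
    spec:
      selector:
        app: api
      ports:
        - port: 80          # other pods reach this at http://api
          targetPort: 8080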
I routinely take app engine apps and run them on the FOSS AppScale platform unchanged. For the things you can't port, like using Google as your auth, there are 3rd party solutions for that already.
Choosing Google's App Engine is more than just choosing a managed solution, too. You choose it because of Google's reach: consider, for a moment, how you send a push notification to an Android device. When you do it from Google's cloud, your metal is co-located with their push infra. Also, caching, logging, tracing and even a global CDN are integrated into the platform. Think of the cost of having to sort this yourself.
The feature gap between compose and k8s is starting to add up though. Compose is nice for tutorials and POC within a containerized environment but IMO not that good for full-fledged production.
What if you can't use hosted k8s?
What if you're on a 3-node VMware setup (no vSphere)?
What if you at least need consul/etcd, postgresql, haproxy + two nodes (it should be as highly available as possible)?
How do you keep everything up to date, and how do you deploy (with no downtime) without a complex ansible config?
For myself, I use k8s either bootkubed or with kubeadm. I create ignition configs via cloud-config (coreos) and create coreos nodes via the vmware GUI. After that I ssh once into either one node (bootkube) or into all nodes and call kubeadm or bootkube, and after that I have a working cluster. Maintenance is done via the coreos update operator, and I can upgrade k8s via either kubeadm or via kubectl on bootkube. (So basically the most complex thing on k8s is updating k8s; everything else isn't as hard as people think. P.S. I do not have a bachelor's/major in tech.)
Running into bugs, and needing to ssh into the nodes ;) (serious answer)
I can basically automate nearly everything: OS updates (coreos), deployment updates (ci/cd). However, updating etcd or k8s is a manual operation on bare metal. Either I'm on bootkube, then I need to log in and update kubectl, then I need to update all daemonsets/deployments in kube-system (it's not always easy and bootkube has strange bugs, https://github.com/kubernetes-incubator/bootkube/issues/977)
So one should actually use kubeadm. But once again I still need to update kubelet, so I need to ssh into the machine anyway; I also need to update kubeadm and run it, again a manual operation.
*ssh means I need to do it on my own; for small clusters this can be done manually, for bigger clusters it can be scripted via ansible or, even better, with your own software that does it. My point is, that's the only thing which doesn't "just work" on k8s (yet).
Edit: I also ran into a bug where I could not upgrade k8s with kubeadm (but that's gonna be fixed soon: https://github.com/kubernetes/kubeadm/issues/727)
Edit2: the good thing is that reloading kubelet won't kill anything.
Similar to a couple of the other replies, updating kubelet or the like tends to touch enough dependencies that it starts looking sensible to do updates as rolling VM re-provisioning and rescheduling tasks, so you just sidestep the lifecycle bit of a kubelet upgrade gone pear shaped.
I haven't had to touch any of that since February though, but I will be again soon. I hope kubeadm and its ilk have gotten better at handling the lifecycle of a long lived cluster. I've done all that in terraform (from scratch), and I'd really rather have a more standardized option that's not buy $HOSTED or $VENDOR solution.
Nomad (www.nomadproject.io) is what I would recommend.
Nomad is ONLY a task scheduler across a cluster of machines. Which is why it’s not rocket science, nor is it operationally complex.
You say I need X cpu and X memory and I need these files out on disk(or this docker image) and run this command.
It will enforce that your task gets exactly X memory, X cpu and X disk, so you can’t over-provision at the task level; Docker and friends don't do this.
It handles batch (i.e. cron) and spark workloads, system jobs (run on every node) and services (any long-running task). For instance, with nomad batch jobs you can almost entirely replace Celery and other distributed task queues, in a platform and language agnostic way!
You can do load balancing and all the other things k8s does, but you use specialized tools for these things:
* For HTTPS traffic you can use Fabio, Traefik, HAProxy, Nginx, etc.
* For TCP traffic you can use Fabio, Relayd, etc.
These are outside of Nomad’s scope, except that you can run those jobs inside of Nomad just fine.
> You can do load balancing and all the other things k8s does
Nomad can do many of the things that k8s does, and even some things that k8s can't, but it does not do all the things that k8s does. For example, nomad doesn't have network policies.
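For anyone unfamiliar, a network policy is a small declarative firewall rule attached to pod labels; a minimal sketch (label names are made up) that only lets frontend pods reach backend pods on port 8080 looks like this:

    apiVersion: networking.k8s.io/v1
    kind: NetworkPolicy
    metadata:
      name: backend-allow-frontend-only
    spec:
      podSelector:
        matchLabels:
          app: backend          # the pods being protected
      ingress:
        - from:
            - podSelector:
                matchLabels:
                  app: frontend # the only pods allowed in
          ports:
            - port: 8080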
Also, just to approach the base level of functionality and security in k8s you need to setup consul as well and bootstrap all the tls stuff, and at that point it's not much more work to get k8s going.
Nomad is pretty cool, and is rapidly catching up to k8s, it will be interesting to see what happens in the long run.
Yes, true enough. Arguably, network policies are outside of Nomad's scope. It's a resource scheduler. Nomad doesn't turn up interfaces for you, or do routing of IP traffic.
Nomad is not a one-stop shop, like k8s tries to be, it does resource scheduling in a nice declarative manner, that's about it.
It's much more in the Unix toolset philosophy: let one tool do one thing well, and make integration as painless as possible.
k8s is much more in line with the systemd way of thinking: it owns all the things and can be the only shiny in the room.
I'm not sure I'd agree that turning up k8s is as easy as turning up consul, nomad and vault. K8s in my experience tends to require lots of babysitting. Consul can require some babysitting, but vault and nomad require basically no babysitting, except the occasional, mostly painless upgrade.
For me it's really about operational simplicity. I have to keep this stuff running tomorrow and next year. Shiny, super-fast moving be all and do all stuff like k8s, Openstack, etc tend to be a giant pain that requires loads of babysitting(most notably at upgrade time), as the edge cases tend to hurt a lot, when you hit them.
I don't really see how nomad is operationally simpler than k8s. To run a service behind something like traefik on k8s I would:
bootstrap a CA for cluster tls
run etcd
run k8s (apiserver,etc,kubelets)
run your application
run the traefik ingress controller
To run a service behind something like traefik on nomad you would:
bootstrap a CA for cluster tls
run consul
run nomad (servers, clients)
run your application
run traefik somehow
I think the only thing that is really more complicated in k8s is the networking stuff, but that's only because it has more features, like having cluster dns automatically configured and being able to give every pod its own ip address which means every service can bind to port 80 without conflicts, and policy enforcement.
We are talking about different things. I'm talking about keeping Kubernetes/Nomad alive and breathing and happy. The ops part of devops. You are talking about running stuff under them. I agree they are similar in running applications under them.
Operationally simple:
* 1 binary, for both servers and agents.
* 1 config file. For consul & nomad 2 config files.
* Upgrades are simple, bring down a node, replace binary, start it back up.
The docs are straightforward, and it's easy to understand how Nomad works; it's not complicated, and you can probably get your head around the server/agent split and nodes and everything in an hour (for both nomad AND consul). k8s is a very complex beast, with many, many binaries and configs; there are helpers that get you set up, but they all have their own pros and cons, and sharp edges. Chances are you would not want to use a helper in a production setup, which means you have to understand all those moving parts.
Keeping a k8s cluster running nicely and upgraded consistently requires many full-time admins. Keeping a nomad cluster running requires very little work(I do it part time, maybe an hour a month on a busy month).
Arguably for dev/testing under consul & nomad, you would do this:
consul agent -dev &
nomad agent -dev &
nomad run myapp.nomad
nomad run traefik.nomad
Adding vault to the mix for secrets: vault server -dev &
For production use it's obviously more involved, but most of that is just around understanding your operational requirements and getting them working with nomad, not really about nomad itself.
No, I'm talking about operations too. If your consul and nomad deployments are only one binary and one config file, then you're not using TLS. Half the effort of setting k8s up is bootstrapping the CA and certs for etcd and the k8s components.
> k8s is a very complex beast, with many, many binaries and configs
Because it's much more in the unix toolset philosophy, let 1 tool do 1 thing well. Is that a bad thing now?
hyperkube does put all the server-side components in a single binary; there's still a bit of configuration though. A lot of the options are repetitive. I bet one could wrap hyperkube with a single config file and some defaults, and the end result would look a lot like nomad.
You keep going on about setting k8s up, and not about maintenance. How much time in a week do you take to babysit your k8s cluster? Do you have an HA setup?
OK, TLS takes 3 files: 2 for the key and cert and 1 for the config. If you get your TLS certs out of the vault PKI backend, it's very, very simple (https://www.vaultproject.io/docs/secrets/pki/index.html); the linked page covers the complete steps.
Again, I keep talking about maintaining Nomad/k8s for years. I've been running nomad in production for a few years now; I've had no downtime from nomad, and I spend about an hour doing upgrades every once in a while. I don't worry about nomad; it's just there and works. I run 3 nomad servers per data center for an HA setup. k8s doesn't even test their HA setup in development (source: https://kubernetes.io/docs/admin/high-availability/building/). There is no way it works out well in real life if they don't even test it yet.
Nobody I know that runs k8s pretends it's easy to keep running for years. Most places that run k8s have dedicated engineers to babysit k8s. I babysit our nomad, and lots of other infrastructure, and I do development of applications as well.
> How much time in a week do you take to babysit your k8s cluster? Do you have an HA setup?
I don't have a k8s cluster... so zero :-)
I don't have a nomad cluster either, because every time I look at it and start planning out what I would need to do to bootstrap consul+nomad and secure it, it starts to look more like a k8s install.
> There is no way it works out well in real life,
except that every cluster on GKE or created using kops, kubespray, or even kubernetes the hard way is HA, so it's not like no one is running an HA cluster. I think from k8s point of view, there isn't much to test as etcd is doing all the work.
Setup and install is the least of your issues when running something like nomad/k8s in production. The part that matters more, is what's it like to babysit it, and keep it running.
I agree people are running k8s HA in production, but there is a reason those people are dedicated k8s engineers. It's because it's a giant pain in the ass to keep it running. Hence what I mean when I say it's "operationally complex".
Most people using GKE don't actually operate the k8s cluster, they let GKE run it for them. They just use it.
Using k8s and using nomad are similar from a developer perspective. Operationally they are night and day different.
Anyway, I suggest you go play with both systems and try them out; put some non-important stuff in production under both of them.
Hosted k8s on google can be as little as $50/month. If your project has really little load, I think serverless with docker containers is the way to go, which might be under $1/month.
However that's not supported yet by Google or AWS, so you have to find another provider.
If you're on AWS, I highly recommend Convox [1]. They use AWS ECS instead of k8s, but EKS support will be coming soon.
Their rack project is free and open-source [2]. They also have a hosted console that can manage your racks, deploy from GitHub webhooks, and send slack notifications. But you can use convox without that.
It wasn’t a hard rule. It depends on the app and the use case, but probably just docker on all the machines with some static cloud load balancers in between everything. Or just straight config management.
At less than 25 machines, throw it on Heroku. Yes, it’ll be more expensive hardware-wise, but you’ll probably save that much by not operating your own servers.
Or mesos/marathon, though I'm in no way qualified to make comparisons vs k8s.
I am qualified to make that comparison, and Mesosphere all the way. Rapid, no bullshit deploy, not having to deal with key stack components being alpha, simple management, wide support (including running k8s on mesos if you are so inclined). There is a very long list of why mesosphere can be considered to be a stable and mature product.
Every time I work with k8s I think "this is cool and all, but absolutely not ready for production use" - and in my book production use means not needing a small army of sysadmins (sorry, SMEs) to keep it running, not changing core components every change of the moon phase, the majority of core components being stable and battle-tested, and a learning curve that doesn't look like an altitude-vs-time plot of the latest SpaceX launch.
If you follow that link on Twitter, there are a lot of reasonable answers to his (rhetorical?) question - "In a single tweet — can you name a technical benefit you and your team have gained by switching to Kubernetes?"
"Sensible configuration, fast deployments, awesome community, and flexible control plane...among others"
"single root of truth for configuration"
"predictable deploys"
"Standardized orchestration, which makes talent easier to find."
"the capability to deploy on more than one platform, more than one cloud."
"Not called in at midnight when an entire node segment went down due to hardware failure."
"The ability to deploy hundreds of services within minutes."
I would argue that those were benefits of switching to any containerised infrastructure with an orchestrator, though. You could probably obtain most of those benefits with Nomad or Mesos.
It’s kinda like all those “we rewrote our Ruby app in Go and it’s 1000x faster so Go is awesome” articles. Containerisation is good for those things, but it’s still possible that Kubernetes is an excessively complex solution to the problem!
Fair enough. To your original comment, I was part of a large transition from bare metal servers and custom rolled CI, deployment, monitoring, etc over to using Mesos and the end result was infinitely more pleasurable and productive to work with as an engineer.
Fair enough (though I was referring to the blog author, not the tweet author, but I can see how that could be confusing given the nested references).
The tweet author seemed sincere in their desire to answer that question, whereas the blog author seemed to be quoting it ironically (eg, he felt there are no ways to express k8s values in a tweet)
I posted this in another thread, but check this out: https://stackoverflow.com/questions/50195896/how-do-i-get-on.... That's the amount of crap I waded through trying to rubber-ducky myself into figuring out how to get two pods to talk to each other. In the end, I copied a solution my friend had gotten, and it's still not great. I'd love to be able to use Ingress or Calico or Fabric or something to get path routing to work in Kubernetes, but unfortunately all the examples I've seen online suffer from too much specificity. Which is the Kubernetes problem - it can do everything so trying to get it to do the one thing you want is hard.
I think part of the problem is that I can't immediately understand what is actually being done. You say you want, say, a React frontend to talk to a Node.js backend. But that's not really a pod-to-pod communication issue; both frontend and backend will be communicating with the user's browser, outside the cluster.
Secondly, you deployed an Nginx ingress controller. You don't need to deploy more than one of these in your whole cluster, so you can go ahead and separate this from your program's deployment manifests. Typically, cluster add-ons are installed by running kubectl apply -f with a GitHub raw URL, or, if you want to be much cleaner, using Helm (basically, a package manager. It installs with one command and then you can use it to install things into your Kubernetes cluster easily, such as an Nginx ingress controller.)
If you're wondering why the process is such a mess, it's probably just because Ingress is still new. In the future, support for more environments will probably come by default, without needing to install third party controllers. Already, in Google Cloud, GKE clusters come ready with an Ingress Controller that creates a Google Cloud load balancer.
As a side note, I found that the nginx ingress controller was not working by default in my cluster. I noticed some errors in the logs and had to change a parameter with Helm. Don't recall what it was, unfortunately.
The problem with adding the Ingress controller via Helm (and with a lot of other Kubernetes abstractions) is that it spits out a lot of code that is then difficult or impossible to reason about. `Helm Ingress --whateversyntaxdefault` spits out 1000+ lines of Ingress controller code that is essentially two deployments with a health check and auto spin-up, but it's complicated. In production can I use this, or is there a security hole in there? What if the ports the health check is using overlap with other ports I have assigned somewhere else? What if something equally silly?
Maybe Kubernetes is new so that's why it's so wild west, but it really feels like a pile of bandaids right now.
I have read through the nginx ingress controller code in Helm before deploying it into production.
What you're saying is pretty much the result of my biggest gripe with Kubernetes, though it's one I don't have a lot of ideas of how to fix; there's too much damn boilerplate. 1000 lines of YAML to store maybe 100 relevant lines.
That being said, can you trust that there is not a security vulnerability when you deploy e.g. NGINX alone? Your answer should not be yes. Even if you read through every single line of configuration and understand it, it doesn't mean something isn't wrong. Google "nginx php vulnerability" for an example of what I mean: innocent, simple configuration was wrong.
I read the Helm chart for nginx ingress because I wanted to understand what it was doing. But did I have to? Not really. I trust that the Helm charts stable folder is going to contain an application that roughly works as described, and that I can simply pass configuration in. If I want to be very secure, I'm going to have to dig way, way deeper than just the Kubernetes manifests, unfortunately. There's got to be some code configuring Nginx in the background, and that's not even part of the Helm chart.
> What you're saying is pretty much the result of my biggest gripe with Kubernetes, though it's one I don't have a lot of ideas of how to fix; there's too much damn boilerplate. 1000 lines of YAML to store maybe 100 relevant lines.
I think that's more a helm issue than a k8s issue. I've been using helm in production for over a year and k8s for almost three years. Prior to adopting helm we rolled our own yaml templates and had scripts to update them with deploy-time values. We wanted to get on the "standard k8s package manager" train so we moved everything to helm. As a template engine it's just fine: takes values and sticks them in the right places, which is obv not rocket science. The issues come from its attempt to be a "package manager" and provide stable charts that you can just download and install and hey presto you have a thing. As a contributor to the stable chart repo I get the idea, but in practice what you end up doing is replacing a simple declarative config with tons of conditionally rendered yaml, plug-in snippets and really horrible naming, all of which is intended to provide an api to that original, fairly simple declarative config. Add to that the statefulness of tiller and having to adopt and manage a whole new abstraction in the form of "releases." At this point I'm longing to go back to a simpler system that just lets us manage our templates, and may try ksonnet at some point soon.
The stable chart thing is so weird. Internally we use some abstractions, but I look at stable charts and it requires so much time just to understand all of what's going on. Everything is a variable pointing to values, and you can't reason about any of it.
It seems like the hope is: just ignore it all, the docs are good, just follow them. But I don't live in any kind of world where I can do that.
And the commits, and the direction of all of them, seem to move toward more and more impossible-to-read conditionally rendered symbols.
I've had such a challenge understanding and using helm well enough. Small gotchas everywhere that can just eat up tons of time. This doesn't feel like the end state to me.
> It seems like the hope is: just ignore it all, the docs are good, just follow them. But I don't live in any kind of world where I can do that.
Yep, agreed, we've used very few charts from stable, and in some cases where we have we needed to fork and change them, which is its own special form of suck. The one I contributed was relatively straightforward: a deployment, service and a configMap to parameterize and mount the conf file in the container at start. Even so I found it a challenge to structure the yaml in such a way that the configuration could expose the full flexibility of the binary, and in the end I didn't come anywhere near that goal. You take something like a chart for elasticsearch or redis and its just so much more complicated than that.
Right, I'm in particular working on charts for ELK, and it's just a mess. I just took down all my data (in staging, so all good) due to a PVC. The charts won't update without deleting them when particular parts of the chart change, but if you delete them, you lose your PVC data.
So I find the note in an issue somewhere stating that this is... intentional?... and that of course you need some annotation that will change it.
Let alone the number of things like, xpack, plugins, the fact that java caches the DNS so endpoints don't work on logstash, on and on.
It seems like everyone is saying operators are going to be the magical way to solve this, but if anything it seems like one set of codified values, that don't address any of the complexity.
You're using a statefulset? Here's a tip: you can delete a statefulset without deleting the pods with `kubectl delete statefulset mystatefulset --cascade=false`. The pods will remain running, but will no longer be managed by a controller. You can then alter and recreate the statefulset and as long as the selector still selects those pods the new statefulset will adopt them. If you then need to update the pods you can delete them one at a time without disturbing the persistent volume claims, and the controller will recreate them.
The Kubernetes creators never intended this verbose YAML format to be the long-term format for humans to work with directly. Heptio's ksonnet is where they want to go: https://ksonnet.io
No, this is not replacing the YAML under the hood, it's just more convenient for humans as a higher layer.
I found ksonnet by actually looking for smarter JSON: jsonnet (which ksonnet is based on). I had a little experience with borgcfg while at Google, and while it's not the same, it's very similar in spirit, and it even has easier-to-understand evaluation rules (unlike borgcfg, which I could never fully get; or I would understand them when focusing, and then if I hadn't used them in a while I would completely forget again).
> In production can I use this or is there a security hole in there?
What if there's a bug in nginx? That has a lot more lines of code than the controller code. As always, feel free to audit the code, but as with any environment to eventually have to trust someone's code.
> What if the ports the health check are using overlap with other ports I have assigned somewhere else?
Each container can bind to any port; only those that are exposed can conflict (similar to how Docker works).
Honestly, kubernetes might not solve your use case. I use it because it solves mine (Self-healing, declarative configuring that works seamlessly across multiple nodes - aka accessing multiple nodes as one big computer).
You should not use Ingress.
Use Nginx or HAProxy and do it on K8s like you would do it normally, and you can scale your nginx/haproxy with kubectl scale --replicas=2 deploy nginx.
On the outside, use MetalLB, which then gets you a single IP that is highly available either via L2 or with BGP (if you have BGP gear), if you are not on the cloud.
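On the k8s side, the MetalLB part is just a Service of type LoadBalancer; a minimal sketch (assuming MetalLB is already installed with an address pool; names are placeholders) that puts a single floating IP in front of the scaled nginx deployment:

    apiVersion: v1
    kind: Service
    metadata:
      name: nginx
    spec:
      type: LoadBalancer      # MetalLB answers this with an IP from its pool
      selector:
        app: nginx            # matches the pods of the nginx deployment
      ports:
        - port: 80
          targetPort: 80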
What people do wrong with k8s is that they think differently, which is silly. k8s just exposes a "managed VM" where you can build stuff like you would do on vmware vApps.
Why not? It allows you to route your applications automagically with Kubernetes objects. Instead of writing nginx configurations that do what you want, you can just describe how you want your routing to work. I don't see why that isn't useful.
> k8s just exposes a "managed VM" where you can build stuff like you would do on vmware vApps.
Pods aren't even containers, let alone VMs. They're namespaces with containers in them.
Secondly, while you can use those pods like VMs and boot systemd or whatever in them, that's not really the way you're intended to use Docker. Just to quote an official source:
> It is generally recommended that you separate areas of concern by using one service per container.
Instead of treating Kubernetes like a VM manager, the actual intended way to use it is to treat it like a task manager, like systemd or what have you. The pods are meant to represent individual services, and containers individual processes.
The problem Kubernetes solves is managing applications, not machines. The difference is not merely semantic rambling; it's a paradigm shift.
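A minimal sketch of that idea (image names are placeholders): one pod, one logical service, with the app process and a log-shipping sidecar as separate containers sharing a volume and a network namespace.

    apiVersion: v1
    kind: Pod
    metadata:
      name: web
      labels:
        app: web
    spec:
      containers:
        - name: app
          image: example/web-app:1.0        # hypothetical app image
          ports:
            - containerPort: 8080
          volumeMounts:
            - name: logs
              mountPath: /var/log/app       # the app writes its logs here
        - name: log-shipper
          image: example/log-shipper:1.0    # hypothetical sidecar image
          volumeMounts:
            - name: logs
              mountPath: /var/log/app       # the sidecar reads the same files
      volumes:
        - name: logs
          emptyDir: {}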
You have to accept that Kubernetes is a platform, and any platform, no matter how simple or complex, will come with its own set of technical challenges. Complexity isn't in itself an evil. Unix is complex.
Just imagine the complexity of something like APT on Debian/Ubuntu, or RPM on Red Hat/Centos. You could run into a problem with installing a package with apt-get or yum, perhaps some configuration script written in Bash that misbehaves during installation. To fix it, you have to understand how it's put together. The same applies to Kubernetes. You have to know the layers in order to work with them. Someone who doesn't know shell scripts or how init scripts work will not be able to work on Unix. Kubernetes is kind of like an operating system in the sense that it's a self-contained abstraction over something lower-level; the complexity of Unix isn't different, it's just that the design and implementation are different.
Helm "just" installs parameterized YAML manifests. But Helm doesn't pretend to be an abstraction that simplifies Kubernetes. It simplifies the chore of interacting with Kubernetes, but in order to really use Helm, you have to understand what it is doing. Specifically, you do have to understand the "1000+ lines" of ingress declaration that it spits out. The notion that you can get around the complexity of Kubernetes with Helm is simply false.
To start with Kubernetes, take a step back, forget about Helm, and simply use Kubectl. You can accomplish absolutely everything you need with "kubectl apply -f". Learn each basic building block and how all of them fit together. Learn about pods before you learn about anything else. Deployments build on pods and are the next step. Then learn about services, configmaps and secrets. These are all the primitives you need to run stuff.
Ingresses are arguably the worst part of Kubernetes, since it's a pure declarative abstraction — unlike pods, for example, an ingress doesn't say anything about how to serve the ingress, it just expresses the end goal (i.e. that some paths on some hosts should be handled by some services). Ingress controllers are probably mysterious to beginners because they're an example of a "factory" type object: An ingress controller will read an ingress and then orchestrate the necessary wiring to achieve the end goal of the ingress.
Moreover, you don't need ingresses. Ingresses were invented a little prematurely (in my opinion) as a convenience to map services to HTTP endpoints and make these settings portable across clouds, but what most people don't tell you is that you can just run a web server with proxying capabilities, such as Nginx. This gist [1], which can be applied with "kubectl apply -f nginx.yml", describes an Nginx pod that will forward /service1 and /service2 to two services named service1 and service2, and will respond on a node port within the cluster (use "kubectl describe endpoints nginx" to see the IP and port). Assuming a vanilla Kubernetes install, it will work.
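Since I can't paste the whole gist here, a rough sketch of the same idea (not the gist verbatim; service names are placeholders): an Nginx config in a ConfigMap that proxies /service1 and /service2 to two Services, mounted into a plain nginx pod and exposed with a NodePort Service.

    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: nginx-proxy-conf
    data:
      default.conf: |
        server {
          listen 80;
          location /service1/ { proxy_pass http://service1/; }   # Service named "service1"
          location /service2/ { proxy_pass http://service2/; }   # Service named "service2"
        }
    ---
    apiVersion: v1
    kind: Pod
    metadata:
      name: nginx
      labels:
        app: nginx
    spec:
      containers:
        - name: nginx
          image: nginx:stable
          ports:
            - containerPort: 80
          volumeMounts:
            - name: conf
              mountPath: /etc/nginx/conf.d   # overrides the default server block
      volumes:
        - name: conf
          configMap:
            name: nginx-proxy-conf
    ---
    apiVersion: v1
    kind: Service
    metadata:
      name: nginx
    spec:
      type: NodePort
      selector:
        app: nginx
      ports:
        - port: 80
          targetPort: 80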
> the amount of crap I waded through trying to rubber-ducky myself into figuring out how to get two pods to talk to each other
well, it's more the amount of crap you waded through trying to figure out that you were not actually trying to get two pods to talk to each other at all.
Path routing should work, if it's not, what you should do is exec into the nginx pod and inspect the nginx config that the nginx ingress controller generated.
Traefik has an example of this that is basically what you are doing:
I only glanced at your repo, but it sounds like you have an ingress problem. That is, you don't really have two pods communicating, but you need to hit your backend service from outside the cluster. Under the hood this is always accomplished with a node port on a service: a single port on any node in your cluster will forward to said service.
k8s has integration with cloud providers like AWS to hook all this up for you, but all it's doing is setting up an ELB which load-balances to that port on every node in your cluster.
I suppose we have to assume that we should use the right tool for the right job, and all that. And I'm sure that the Kubernetes folk know what they're doing. But I definitely think that it's too complicated without a cutting-edge Kubernetes expert in place to manage it. And even then, it's just a building block for a larger system.
I've tried maybe half a dozen times to get started for relatively small workloads – let's say between 5 to 50 servers. There seems to be a lack of any good documentation on how to get started – sure, I can spin up a couple of pods on a cluster or whatever, but then actually taking that to an easy-to-deploy, load-balanced, publicly-accessible infrastructure seems to be much harder.
It's maybe a lack of some kind of official, blessed version of a "this is how you should build a modern infrastructure from container to deployment" guide. There are so many different moving parts that it's hard to figure out what the optimal strategy is, and I've not found much to make it easier.
Maybe there's actually a niche for a nice and simple "infrastructure in a box" kind of product. I'd love something like Heroku that I could run on my own bare metal, or on AWS – the various solutions I've tried were all lacking.
In the end I've fallen back to using the Hashicorp stack of Nomad and Consul. This seems to work in a much simpler way that I can actually wrap my brain around – I get a nice cluster running balanced across machines, deploying and scaling my applications as required, and it was super easy to set up with good documentation.
I've been nervous about this approach because of how Kubernetes-mad the industry seems to have gone – so maybe it'll be worth another look when there are some more comprehensive solutions in place that make it easier to get started!
I’ve worked as a developer on 2 major managed k8s providers. Using k8s is easy but operating it is not. The biggest reason is that every cloud provider is using different underlying technologies to deploy k8s clusters. See for example all the different kinds of ingress controllers.
And it doesn’t help that docs sometimes mention how you do things assuming you’re using Google Cloud, other times with examples for Google, AWS, Azure, or bare metal, other times minikube or some other smaller scale wrapper, and other times it’s not clear what kind of platform the commands should ‘just work’ on as-written.
I had a ‘fun’ time figuring out the basics of things like ingress controllers in a bare metal setup because of this mishmash of docs.
I'm not the GP. But if I were setting up infrastructure for a fledgling startup today, I'd want a PaaS-in-a-box that worked like this:
Given these inputs:
* A minimum of 3 Linux servers: These could be VMs or bare-metal servers. They're all in the same data center. They each have 1 public IP address. Other than that, there's no network between them. Each of these nodes has at least 4 GB of RAM and a healthy amount of SSD storage. Note that I'm perfectly comfortable manually provisioning a fixed number of nodes, and manually adding more if truly necessary, because predictable cost is important.
Edit: The platform should assume that each node has a bare installation of Ubuntu or CentOS. Don't make me install something custom. And ideally, don't be picky about the kernel, because some dedicated server hosts provide their own.
Edit 2: All nodes should be treated equally. Any of them may become the master when needed. There should be no dedicated master; all nodes should be available to run applications. After all, resources are tight, and I want to get the most out of those three servers.
* An API for creating and updating DNS records, e.g. Amazon Route 53, DNSimple, etc.
* Edit: An off-site object store (e.g. Amazon S3, Backblaze B2) where the cluster can automatically send backups of all durable storage.
The PaaS-in-a-box should give me an installation script to run on each node. During installation, I provide my DNS API credentials and the domain(s) I want the platform to manage. Edit: I'd also provide credentials for the off-site backup object storage. And I forgot that when installing on the second and subsequent nodes, I'd provide the public IP of an existing node during installation.
And that's it. The cluster then manages itself, distributing durable storage among the nodes using something like Ceph for file storage, and DB-level replication for the supported database(s). HTTP/HTTPS traffic can be load-balanced among nodes using round-robin DNS, with unhealthy nodes automatically removed from the DNS records by one of the remaining nodes. If I need to run an outward-facing non-HTTP service, I should be able to reserve one or more ports for it and run it in either a round-robin or active/standby configuration with automatic failover.
Perhaps this is a tall order. Is it impossible to do this on top of something as last-decade as a handful of manually provisioned dedicated servers, with only the public network between them (but in the same data center)? I hope not, because that kind of server, while now out of fashion, is attractive to a company on a shoestring budget that nevertheless doesn't want to compromise performance.
Anyway, if a product meeting these requirements exists, I don't yet know about it. Cloud Foundry certainly doesn't market itself for that kind of deployment.
> Cloud Foundry certainly doesn't market itself for that kind of deployment.
Historically customers of Pivotal, IBM, etc focused on high availability requirements, which requires more machines. I've seen the default deployment for megacorps and it is, essentially, "bring all your dollars". But that is what they want -- no single points of failure. That means multiple VMs for everything. In fact it usually means multiple AZs for everything -- every component replicated thrice in at least two widely separated locations.
Even so, we've done work to split the difference. There's a "small footprint" version of PCF which is 4-9 VMs depending on how much risk you feel like taking. There's also cfdev if you just want to kick tires.
For the rest I can point to this and that. Service brokers for stuff like the DNS, traffic is directed by Gorouter or TCPRouter currently (with plans to switch to Istio), backups by BOSH Backup & Restore (BBR), running on raw hardware if the provider uses RackHD or can give you a BOSH Cloud Provider Interface (CPI).
BOSH is probably where you'd need to make your deepest peace. It has a very emphatic model of operations, which is that you are building a distributed system from known-good states, so individual machines are there solely to be paved whenever necessary. BOSH manages everything down to the operating system on the machine. You give it keys to an API that can provision compute, disks and networks and it will do the rest.
The uniform node thing would be tricky, and my hunch is that, requiring a bunch of de novo engineering, it would be less reliable on average than current arrangements.
The OP linked to a set of tweets from Joe Beda (author of Kubernetes) that read "When you create a complex deployment system with Jenkins, Bash, Puppet/Chef/Salt/Ansible, AWS, Terraform, etc. you end up with a unique brand of complexity that you are comfortable with. It grew organically so it doesn't feel complex."
Kubernetes is a convention over configuration framework for operations. And by doing that it allows you to easily build on its abstractions.
At GitLab the Kubernetes abstractions allowed us to make Auto DevOps, which automatically builds, runs tests, checks performance, diagnoses security, and provisions a review app. We could have made that with GitLab CI and bash, but it would be much harder for people to understand, maintain, and extend.
Having an abstraction for a deployment is something that doesn't even come by default with Jenkins as far as I know. Now that we have that as an industry we can finally build the next step of tools with the usability of Heroku but the control, price, flexibility, and extensibility of self-hosted software.
I started with k8s beginning this year and from my point of view, the documentation is not good - and a major pain point when trying to get started.
Each part on its own is good and well written, but it lacks the overall picture and does not connect the pieces well enough. For example, the schema definitions for all the configuration files are not linked from the official docs (at least I wasn’t able to find them). The description of how to get started with an on-premise setup is scattered over multiple pages from multiple tools.
So from my point of view, things could be improved for beginners.
This! So many parts of the docs are excellent in isolation, but if you jump from one page on Ingress Controllers to a page on Pod replicas, to a page on something else, the pages are obviously written by different authors who use different services (e.g. Google Cloud vs. bare metal vs. Azure), and some of the docs assume you’re also using said service, but don’t explicitly call it out. And then you wonder why some command or example isn’t working the same in minikube or some test-bed environment.
Too much of my ‘getting up and running’ time was spent figuring out things on my own based on Stack Exchange, blog posts, and experimentation.
The Kubernetes: Up and Running book gives a better big-picture view than the documentation online does. It does a thorough job of developing motivation and context for using a broad swath of the system in a cohesive way.
K8s is a beast, and it has a fairly specific workflow, which for most people is not a good fit.
For example, you need to understand that certain workloads don't fit well on the same box (DBs and anything IO/memory-sensitive, for example).
Then there is the "default" network setup where each node is _statically_ assigned a /24. (because macvtap + dhcp is "unreliable" and inelastic apparently.)
Now, I've heard a lot of talk about how you either use k8s or ssh to manage a fleet of machines. That's pretty much a wrong comparison.
K8s provides two things: a mechanism for shared state (i.e. I am service x and I can be found on IPs y & z) and a scheduler that places containers on hosts (and manages health checks).
If your setup has a simple config scheme (using a simple shared-state mechanism like a DB, a filesystem, or DNS), or you have no issues with creating highly automated deployments using tools like CF, chef, ansible, cloudformation, $other, then k8s has vanishing returns (it sure as hell doesn't scale to the 50,000-node count, because it's so chatty).
Basically it's a poor man's mainframe, where all the guarantees of a process running correctly regardless of what is happening to the hardware have been taken away.
The best tools work at multiple levels and operator skills. Imagine a tool that can do everything from pound in a nail to launch a spacecraft. In an ideal situation, a carpenter can grab it and pound some nails and a rocket scientist can launch a Mars mission with it.
In the worst case, the tool forces you to learn how to launch spacecraft before you can pound nails.
In my limited experience (having twice now made attempts to get up to speed and actually accomplish something with K8s) it felt an awful lot like learning rocketry to pound nails. I may have had an entirely different experience if I had come looking to do a moon shot.
Having worked with Docker Swarm previously - Kubernetes seems like they added 5 layers of abstractions and at the same time made it feel more low-level.
I think Kubernetes has all the building blocks to be great but desperately needs a simplified flow/model and UI for developers who just want to run their apps.
I feel like managed Kubernetes solutions are a completely different beast from on-prem physical deployments, and that when people talk about how Kubernetes isn't complicated, they are probably running a webapp on GKE or Minikube on a DO droplet, or something similar.
Running Kubernetes on your own private, on-prem infrastructure - integrating services that live outside the cluster, exposing your cluster services, rolling your own storage providers, adding a non-supported LoadBalancer provider, managing the networking policies, etc., etc. - can quickly become an incredibly messy and complex endeavor.
Couldn't agree more with - "Like a lot of other tech that has ostensibly come out of google, it will likely have at least one major source of complexity that 95% of people do not need and will not want. "
We really need to understand our applications, their architecture, and then the right way to build and run them. I've talked with many people focused on k8s and losing sight of what they are trying to build and why.
"In a single tweet — can you name a technical benefit you and your team have gained by switching to Kubernetes?"
"The missing step in a comprehensive OSS devops strategy, from code (git) build (Docker) test (Docker) to deploy (Kube), which enables push-button CICD like no one has gotten right since Heroku."
If they'd got it as right as Heroku then deployment (or CD configuration) would be a one liner once a server is set up and scaling etc. would be configurable via a very simple web console or CLI.
Unless I missed something, last time we tried k8s (admittedly ~6 months ago) it was woefully far from that goal.
Although it didn't come out of the box, my deploys are now one-liners. The command builds, tests and packages my app into a docker image, then bumps the version number on the k8s deployment, which triggers a rolling update of production pods. I'm very happy with my current deployment flow, mostly because it works reliably and I can forget about what it's doing under the hood.
Suppose I'm an application developer who is only interested in infrastructure because there needs to be some to run my stuff.
In this scenario, would I actually learn Kubernetes, or would it make more sense to go to straight to a PaaS solution? Like OpenShift (which uses k8s in the backend, I believe), or Cloud Foundry, Stackato etc.
I always get the impression that k8s has a lot of good ideas, but doesn't provide everything out of the box for actually deploying complex applications. How true is that?
My advice is to go directly to a PaaS. I work for Pivotal R&D in and around Cloud Foundry, so that's my personal horse. But I'd rather that you used any PaaS -- Cloud Foundry, OpenShift, Rancher -- than roll your own.
Building a platform is hard. It's really really hard. Kubernetes commoditises some of the hard bits. The community around it will progressively commoditise other aspects in time.
But PaaSes already exist, already work and either already base themselves on Kubernetes or have a roadmap to doing so.
To repeat myself: building PaaSes is hard. Hard hard hard. Collectively, Pivotal, IBM, SAP and SUSE have allocated hundreds of engineers in dozens of teams to work on Cloud Foundry. We've been at it non-stop for nearly 5 years. Pivotal spends quite literally millions of dollars per year testing the everliving daylights out of every part of it [0][1]. (Shout out to Concourse here)
I fully expect Red Hat can say the same for OpenShift.
Building PaaS abstractions on top of Kubernetes is an order of magnitude easier than doing what you guys did with Cloud Foundry. Building something that can scale to 250k Containers is monumentally hard, but with K8s, it is taken care of for you: https://kubernetes.io/docs/admin/cluster-large/
If you are a large enough organization, it is quite feasible to set up Kubernetes, choose an ingress solution and then build templated configurations that generate K8s yaml files and run your deployments with Jenkins. I am not saying it is easy, but you don’t need any expertise with bin-packing algorithms and control loops, and it really is in the sweet spot of “devops” engineers.
> Building PaaS abstractions on top of Kubernetes is an order of magnitude easier than doing what you guys did with Cloud Foundry
It's worth noting Kubernetes didn't exist when Cloud Foundry started. Neither did Docker. The reason Cloud Foundry built two generations of container orchestration technology (DEA/Warden and Diego/Garden) was because it was partly inspired by direct experiences of Borg, as Kubernetes was. Folks had seen the future and decided to introduce everyone else to it.
The point here is not whether sufficiently large organisations are able to build their own PaaSes. They absolutely can. Pivotal's customer roster is full of companies whose engineering organisations absolutely dwarf our own.
The question is: should you build your own? This is not a new question. Should I build my own OS? My own language? My own database? My own web framework? My own network protocol? My own logging system? My own ORM?
The general answer is: no, not really. It's not the most effective use of your time, even if it's something you'd be perfectly able to achieve.
I know Kubernetes wasn’t around when Cloud Foundry was started. That wasn’t my point. Some of your argument was that building Cloud Foundry was hard (and I agree!), therefore you need a vendor’s PaaS. That isn’t true.
If an engineering organization takes Kubernetes and adds their own tooling around it to turn it into a PaaS for their org, that isn’t in the same league as building their own Database or what you did with Cloud Foundry originally.
> Suppose I'm an application developer who is only interested in infrastructure because there needs to be some to run my stuff.
Yes, you are the perfect candidate for a PaaS solution.
In general, unless you are spending thousands of dollars in infra per month, managing your own services is a waste of money.
Full Disclosure: I work for Red Hat Consulting in the Container and PaaS Practice, an OpenShift/Kubernetes expert group.
OpenShift's value (and that of its open source upstream, Origin) is not just that it is a PaaS. It's a PaaS that you can install and run on many different infrastructure bases: bare metal, VMs, public & private cloud, and others very soon. I personally run a couple of flavors on my laptop for demo and training purposes. So maybe upstream Kubernetes is not so simple. But that is why we (Red Hat) call OpenShift "Kubernetes for the Enterprise". It means we make some sensible defaults, decisions, and architecture choices, and make installation supportable to get you started. However, the fact that OpenShift & Kubernetes allow you to define your application architecture in reusable and portable object definitions is their biggest benefit, IMO. And you may not need to spend thousands of dollars in infra per month to enjoy that benefit.
You should start with a managed solution and see if it makes sense to write/move your code in k8s. If it does, awesome, keep using the managed solution. Roll your own only when you see your traffic/load increasing your costs to prohibitive levels or if you want to do something very unique.
K8s comes with an inherent assumption that you are running on cloud infrastructure with a lot of the setup already done right.
Most importantly: ingress. Getting ingress working on k8s is still a big issue.
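To make the ingress point concrete, here is a minimal sketch of an Ingress resource, expressed as a Python dict and dumped to YAML. The name and host are hypothetical, it uses the networking.k8s.io/v1 API of current Kubernetes versions, and it assumes an ingress controller such as ingress-nginx is already installed and working in the cluster, which is exactly the part people tend to struggle with:

    import yaml  # PyYAML

    # Hypothetical service name and host.
    ingress = {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "Ingress",
        "metadata": {"name": "web"},
        "spec": {
            "ingressClassName": "nginx",
            "rules": [{
                "host": "app.example.com",
                "http": {"paths": [{
                    "path": "/",
                    "pathType": "Prefix",
                    "backend": {"service": {"name": "web", "port": {"number": 80}}},
                }]},
            }],
        },
    }

    print(yaml.safe_dump(ingress))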
I'm not sure about all of these statements that networking in k8s is superior to Swarm. For the longest time, it was a huge mess to configure Weave vs Flannel vs Calico. Arguably that was because of the third-party implementations themselves, but then I would argue that comparing the superiority of k8s to Docker Swarm is an apples-to-oranges comparison, since Swarm's networking was always built in and opinionated.
"I suspect that a significant source of programmers flooding into management is the decreasing wavelength of full scale re-education on how to accomplish the exact same thing slightly differently with diminshing returns on tangible benefits."
Wow. I'm curious to see how others are interpreting this sentence.
I've got:
I suspect many developers opt to move on to management largely because of frustration with having to learn new abstractions with increasing frequency and decreasing benefit.
I agree. Most of software engineering has turned into resume driven development, and engineers can only keep up at it for some time.
Kubernetes did not have to be so complicated. Docker Swarm proved it. If it takes 10 engineers to really program and manage your Kubernetes clusters, what problem did it solve again?
>In a single tweet — can you name a technical benefit you and your team have gained by switching to Kubernetes?
- Continuous delivery made easier
- Infrastructure-as-code you can deploy anywhere (Pi, GCP, AWS...)
- Ability to pack VMs tighter, viewing your cluster as a pool of resources.
Kubernetes may seem too complicated if you miss the point. It's throwing the baby out with the bathwater, but it's doing so with purpose. Kubernetes didn't become popular by accident.
The benefit is hard to explain for the same reason that it's hard to learn: it's a complicated tool that solves a problem most people don't understand. That problem is scheduling. Most people view systems administration as the practice of managing machines that run programs. Google flipped the script: they began managing programs that run on clusters, probably about as early as any other company, if not earlier.
The key insight here is that with Kubernetes, you are free from the days of SSHing into machines, apt-get installing some random things, git cloning stuff, and setting up some git hook for deployments. No matter how much more advanced your process is, whether you have God's gift of a Chef script, or you have the greatest Terraform setup in the world, you're still managing boxes first, and applications second, in the traditional model. You have to repeat the same song and dance every time.
To be fair, Kubernetes is not the only platform to provide this sort of freedom. Obviously, it's based on Google's famous Borg, and Docker Swarm also exists in this realm, as well as Apache Mesos. I think Kubernetes is winning because it picked the right abstractions and the right features to be part of itself. Docker Swarm did not care enough about the networking issues that came with clustering until recently. Specifically, one of the first problems becomes "What if I need multiple applications that need to be exposed on port 80?" Kubernetes IMMEDIATELY decided that networking was important, providing pods and services with their own IP addresses first thing, including LoadBalancer support early on. In my opinion, pods and services are the sole reason why Kubernetes crushed everything else, and now that other solutions are catching up and implementing better abstractions, and other solutions for networking are appearing, the problem now is the massive headstart Kubernetes had. Kubernetes let you forget about managing ports the same way you forget about managing machines. Docker Swarm wasn't offering that.
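As an illustration of the port 80 point: because every Service gets its own cluster IP, two applications can both expose port 80 without fighting over host ports. A minimal sketch, with hypothetical app names and container ports:

    import yaml  # PyYAML

    def service(name, target_port):
        """A ClusterIP Service exposing port 80 for one app."""
        return {
            "apiVersion": "v1",
            "kind": "Service",
            "metadata": {"name": name},
            "spec": {
                "selector": {"app": name},
                # Both Services listen on port 80 on their own cluster IPs,
                # forwarding to whatever port the pods actually bind.
                "ports": [{"port": 80, "targetPort": target_port}],
            },
        }

    # Two hypothetical apps, dumped as a multi-document YAML stream.
    print(yaml.safe_dump_all([service("frontend", 8080), service("admin", 9090)]))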
Yeah, it took a few paragraphs to explain why it makes sense, but once you "get" it, it's hard to unget.
That does not mean that Kubernetes is not too complicated. It's probably way too complicated for most of us, and a lighter solution with similar properties would be fine. But that doesn't mean the complexity is all a waste; it's just not useful to all of us.
Disclosure: I'm an engine mechanic with a lot of interest in Linux as a hobby.
Kubernetes seems like a great alternative to stuff like OpenStack, which seems to require an entire datacenter to get going properly, but I feel like the hype (k8s? really?) is outliving the reality.
You're also bumping up against a problem where, on smaller scales, it just seems easier to use something else. Maybe not "agile" and all that nonsense, but certainly easier.
OpenStack is used mostly for the layer beneath Kubernetes. We use it to provision the VMs/networking for the Kubernetes cluster itself.
Kubernetes might not make sense for your small scale projects but it's great when used for enterprise scale microservices. It makes deployments easier, faster and more secure (with things like network policies, namespacing and RBAC).
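As an example of the network-policy point: a default-deny policy is only a few lines. This is a minimal sketch with a hypothetical namespace, and it assumes a network plugin that actually enforces policies (e.g. Calico):

    import yaml  # PyYAML

    # Default-deny for a hypothetical "payments" namespace: the empty
    # podSelector matches every pod, and listing Ingress with no rules
    # means no inbound traffic is allowed until further policies add some.
    deny_all_ingress = {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": "default-deny-ingress", "namespace": "payments"},
        "spec": {
            "podSelector": {},
            "policyTypes": ["Ingress"],
        },
    }

    print(yaml.safe_dump(deny_all_ingress))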
It depends on your context. Having moved from ECS on AWS to K8s on GCP I found kubernetes simpler despite its vast number of concepts. It has the correct abstractions for a microservice architecture, and it made us write less boilerplate code to manage and deploy services.
We just gave up on Mesos. It has been impossible to hire a team to support Mesos, which, in many ways, is simpler than k8s while delivering greater scale[1]. But it's been me and one other guy building the whole thing when we planned on having ten. We're planning on replacing it with something else, and New Guy wants that something else to be k8s. Or wanted. Then we spent an hour or so asking "So, how will you handle failure mode (that happened) X?" or "What about the teams that need Y?" and now ECS is looking real good to this guy.
At the literal end of the day, when a bug causes masters to lose quorum at 2am, do you want to be fixing that, or do you want an AWS or Azure team to be fixing that? When you have a hundred apps happily running but the coordination framework is down, so that if you lose an agent, or get lots of traffic, the system that would add or replace capacity is down, how fun is that? Well, we didn't have an outage, because all the apps were up and running, but thank goodness we didn't lose an agent or get a spike in traffic. You live in constant terror, because it never gets so bad (site down), but you have lots of near-death experiences where you race against time to fix whatever fun way the system failed. God forbid you upgraded to the release (of Marathon) that has a failure mode of "shut everything down" (we didn't, because I'm "overly cautious", but others did).
Now for some companies, where they have the need for a huge cluster, and the cash and reputation to be able to hire for it, then Mesos or k8s seem like a great idea. But for everyone else, use ECS and have a team at AWS keep that thing running.
[1] I suspect it's probably easier to hire people for k8s, and at the same time more difficult to hire people who can actually do the work. k8s looks simpler than mesos, but is far more complicated. Mesos has a smaller following, but those I've met know what they're doing and respect the problem. And Mesos didn't get everything right, but it's "righter" than k8s IMHO. There's a need for a Mesos 2. =)
There are certainly many moving parts, but there's nothing quite like running kubectl create -f <somefile> and sitting back to watch your entire application stack spin up.
K8s does a relatively good job at tackling a very large problem space, but it never tried to tackle the other problem space: learning and using complicated things quickly and easily.
The fact that there is no tool to walk a human through building commonly reproduced configuration files is proof that humans were an afterthought.
Yes. There wouldn't be so many different vendor solutions if it wasn't. I wouldn't say that the complexity isn't warranted, and it obviously can be wrangled; once you know it, it's not all that bad.
Conversely, it wouldn't be as popular as it is if the value it provided wasn't clear.
I’ve always sorta felt they solved the problem backwards, focusing on the infrastructure and not on running applications. Now we have all these competitors and all this confusion around orchestrating running an app. They could have just copied Heroku’s CLI as a base starting point to ease adoption.
For a system that needs to take care of a great deal of different use cases on a wide variety of infrastructure, k8s is surprisingly simple in context.
However, I also think that for a great deal of “every day” use cases, solutions like GKE, EKS, Rancher, Cloud 66, ... are good enough.
Interesting that the article never mentions the word “container” (outside of a quoted tweet).
If you don’t want a cluster management system that can schedule your many containers over a pool of many nodes, then yes Kubernetes is not aiming to solve your problems.
Yes it's too complicated. But I think it's a low level tool that most people will use via an abstraction built on top. We use Rancher which makes using kubernetes a breeze.
Side note: I stumbled across an attempt to rebuild k8s from scratch in Python (it looked like a learning exercise), but I forgot to bookmark it. Has anyone seen something similar?
Maybe the most concise statement I've seen on the meaning of employment in our times... really reminds me of PG's refragmentation essay http://paulgraham.com/re.html
You need to hire people who are competent with a new, quickly evolving technology. That means it's hard to hire someone good and it's expensive.
The upside is that you can replace multiple "normal" ops people with just a few K8s folks, but that also requires a bit of a culture change in how you do development.
I started a year ago on a team, and there are so many different technologies, things to know, gotchas everywhere, that even competent people can struggle outside of taking a single domain and owning it.
It is not remotely easy to actually use and understand all parts of a massive system, at least ours.
(I've been using k8s for several years now, and have a couple dozen contributions so far)
Two big drawbacks.
The first, with k8s itself, is _stability_. K8s was a design-by-committee attempt to make Borg 2.0 as open source. It had massive feature development (which continues), but little actual real usage of those features. So it has taken a long time for things to get into good shape. Things are much, much better now because of the popularity: "given enough eyeballs, all bugs are shallow". But expect to debug things, especially if you use a feature not used by GKE.
The other big downside is that it takes a lot of domain knowledge to manage it and keep it up to date (there is no LTS version). There isn't a standard way of deploying it on-prem either, so that stuff needs to be built.
After three years of running it in production I don't really have any horror stories, but we have definitely run into odd and puzzling things. For example, when we were experimenting with auto-scaling node pools on GKE, the cluster would scale a node pool up and then fail to scale it back down, because there are daemonsets on every node of a GKE cluster. This left it in a state where it flopped back and forth between two nodes every ten minutes: evicting the entire workload, trying to delete the node, failing, and resetting.
Thing is, we have also run into really bad things in our Puppet-managed infrastructure that we run on VMs and manage with Terraform. Like the time I accidentally ran a command in the wrong folder and deleted 120 VMs :). Well, maybe that was just me and a deficiency of our tooling, but anyway, one thing is for sure: k8s makes it far easier to recover from situations that arise. Once you understand what the problem is and update the cluster with the right state, the controllers will set things right.
They recently added support for setting the namespace in specs, and I think they should let you pin the kubectl context as well.
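To spell that out: a manifest can pin its namespace in metadata, but nothing in the spec can pin which cluster (kubectl context) it gets applied to, which is how the mix-up described in the next comment happens. A minimal sketch with hypothetical names:

    import yaml  # PyYAML

    # The namespace lives in the manifest itself, so an apply always lands
    # in "prod"; there is no equivalent field that restricts which cluster
    # (kubectl context) the manifest may be applied to.
    configmap = {
        "apiVersion": "v1",
        "kind": "ConfigMap",
        "metadata": {"name": "app-config", "namespace": "prod"},  # hypothetical
        "data": {"LOG_LEVEL": "info"},
    }

    print(yaml.safe_dump(configmap))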
Not really a horror story, but we ran into some mysterious behavior when one of our services appeared to alternate between two different instances of an app. It was amusing to find that someone had deployed prod specs to both the prod and dev clusters, the only difference being their DNS. The result was a fight between the two clusters' DNS controllers to sync the A record to their respective endpoints.
Like @fear91 almost said, but didn't quite get there: it's operationally VERY complex. It has its reasons for being so complex, but jeez! It doesn't help that tooling and docs are all over the map. Right now I'd say it's useful only if you can afford to hire a handful of very talented, k8s-knowledgeable people to babysit it, because it definitely needs babysitting.
Maybe in a few years it will be nicely stable. Otherwise, have someone like Google, Microsoft, etc. run it for you.
Or skip the complexities and use something operationally simple and sane like Nomad or FreeBSD Jails.
Not sure if this was an accidental key press or something, but generally here we try to be constructive in our comments. "?" is a fairly ambiguous post to make; I'm not sure many people understand what it is you're questioning.