
To be honest, I haven't gotten super into the weeds on Kubernetes. A coworker is our main k8s guy, but I have used the cluster he's configured, deployed a few containers on it, and had to troubleshoot a few nodes. A lot of these complaints may be things that are already solved and we just don't know how/where/why yet. I think we're also using a relatively "old" version of k8s (in a young technology, "old" is anything more than a few months old), so some of these issues may have already been addressed.

First issue for me: the recommended way to run k8s for local testing, etc., is minikube. Since June I've been running a hybrid Windows-Linux desktop (after 10+ years of full-time Linux), where Windows is the host OS and my Linux install runs as a VirtualBox guest with raw disk passthrough. Essentially, Windows acts like a Linux DE that can run Photoshop and play games, while I do all my real work through an SSH session into the local VM, which is my actual Linux install. I can still boot it natively if I want, but dual-booting always impairs workflow, which is why I switched to this setup in the first place; previously I'd reboot into Windows maybe once a year, even though there were games and things I wanted to try, and photo editing in VMs hosted on my Linux box was painfully slow.

This means that minikube, which itself depends on a VM to spin up a fake cluster member, won't work: VirtualBox's virtual CPU doesn't expose the hardware virtualization extensions needed to run a nested VM. So that's the first hurdle that has stopped me from tinkering more seriously with k8s clusters. I know there's "Kubernetes the Hard Way" and the like, but it'd be really nice to have a semi-easy way to get a test/local k8s up and running without requiring VM extensions, since I imagine (but don't actually know) that most cloud rentals don't support nested VMs either.
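
For what it's worth, newer minikube versions apparently have a "none" driver that skips the nested VM entirely and runs the Kubernetes components straight on the Linux host (it needs Docker installed locally and root). I haven't tried it in this setup, but it looks roughly like:

  # untested here: the "none" driver runs the k8s components directly on the
  # host instead of inside a minikube VM, so no VT-x passthrough is needed
  sudo minikube start --vm-driver=none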

Besides this big hurdle to starting out, many of the issues are high-level complexity things that create a barrier to entry more than things that actively get in the way of daily use once you understand them.

For example, we have 3 YAML files per service that need to be edited correctly before something can be deployed: [service]-configmap.yaml, [service]-deployment.yaml, and [service]-service.yaml. We have dozens of services deployed on this cluster, so we have hundreds of these things floating around. They're well-organized, but this alone is a headache. The specific keys have to be looked up, and they have to sit in the right type of config; if something that's supposed to be in the configmap ends up in the deployment file, k8s will be unhappy, the right env variable won't get set (more dangerous than it sounds sometimes), or the shared resource you wanted won't get mounted correctly (and in my experience it's not always obvious when that's happened, and the mount behavior is not always consistent). Object names also have to be valid DNS names, because things like Service names get exposed through cluster DNS and the API validates them accordingly. That means no underscores. There's nothing wrong with any of that per se, but it's a lot to wield/remember.
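
To make the shape of this concrete, here's roughly what that trio looks like for a made-up `billing` service (the names, image, and keys are all illustrative, not our real configs):

  # billing-configmap.yaml -- plain key/value settings
  apiVersion: v1
  kind: ConfigMap
  metadata:
    name: billing-config   # names must be DNS-compatible: lowercase, hyphens, no underscores
  data:
    log-level: "info"
  ---
  # billing-deployment.yaml -- how the container runs; env vars pull from the ConfigMap
  apiVersion: apps/v1
  kind: Deployment
  metadata:
    name: billing
  spec:
    replicas: 2
    selector:
      matchLabels:
        app: billing
    template:
      metadata:
        labels:
          app: billing
      spec:
        containers:
        - name: billing
          image: example.com/billing:1.2.3
          env:
          - name: LOG_LEVEL
            valueFrom:
              configMapKeyRef:
                name: billing-config
                key: log-level
  ---
  # billing-service.yaml -- a stable name and port in front of whatever pods match the selector
  apiVersion: v1
  kind: Service
  metadata:
    name: billing
  spec:
    selector:
      app: billing
    ports:
    - port: 80
      targetPort: 8080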

I also remember mostly thinking the errors related to k8s configurations and commands were unhelpful. For example, it took me a long time (a frustrating 60-90 minutes, probably) to realize that `kubectl create configmap ... --from-file` wasn't reading my file in as config structure, but as one literal string. This seems like something that should be flagged with a warning on import: "--from-file imports your file as a literal value; if you want the contents parsed as a config, use `apply -f`". And `apply -f` means "apply the config read and parsed from this file", not "apply with force", while `create --from-file` means "create a resource whose value is this file's literal contents". Be careful with `kubectl apply`, though, because it will silently merge existing configs with new values, which is sometimes helpful and sometimes drives you nuts if you forget about that behavior. I don't know whether `kubectl delete configmap <name>` and recreating is always feasible, or whether that would give dependency conflicts, or what.
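
In hindsight, the distinction looks something like this (the file and object names here are hypothetical):

  # Treats the file as an opaque blob: you get a ConfigMap with a single key,
  # "my-configmap.yaml", whose value is the whole file as one literal string.
  kubectl create configmap my-config --from-file=my-configmap.yaml

  # Parses the file as a ConfigMap manifest, so the entries under data:
  # become individual keys, and merges with whatever object already exists.
  kubectl apply -f my-configmap.yaml

  # If the merge behavior bites you, the blunt reset (assuming nothing minds
  # the object disappearing briefly) is delete-and-reapply:
  kubectl delete configmap my-config
  kubectl apply -f my-configmap.yaml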

To deploy: `kubectl apply -f changed-yaml.yaml`, which sometimes does and sometimes doesn't clean up the running pod (a service configuration thing? or does it depend on which config type I'm applying: cm, deployment, or service?). If the old pod isn't automatically reaped, `kubectl delete pod old-pod-id`; under our config a replacement comes back automatically after a delete, which I'd guess is configurable too. Then you have to `kubectl get pods | grep service_name` to find the new pod id, and `kubectl logs pod_id` to make sure everything started up normally, though that only shows whatever the container writes to stdout, not necessarily the relevant/necessary logs. Container-level issues won't show up in `kubectl logs` at all; they require `kubectl describe pod pod_id`, which dumps the events and per-container state.
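
So the edit-deploy-verify loop looks roughly like this (the pod name is made up; yours will have the generated hash suffixes):

  kubectl apply -f billing-deployment.yaml        # push the changed manifest
  kubectl get pods | grep billing                 # find the new pod's generated name
  kubectl logs billing-7d9c4b5f6-x2k8p            # whatever the container wrote to stdout/stderr
  kubectl describe pod billing-7d9c4b5f6-x2k8p    # events: image pulls, scheduling, restarts, OOM kills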

Then you have to `kubectl exec -it pod_id -- /bin/whatever` to get a shell in the right container if you need to poke around (and I know, you're not supposed to need to do this often). Side note: tons of people containerizing apps that run on Ubuntu or Debian today feel compelled to move them onto Alpine, another mostly-unnecessary distraction, and in practice it seems to mean grabbing a random image from Docker Hub that claims to provide a good Ruby-on-Alpine runtime or whatever, without ever reading the Dockerfile to confirm, which IMO is a much larger security risk than just running a full Ubuntu container.

Lots of the extended options, like `kubectl get pods -o [something]`, are non-intuitive; the `-o` formats turn out to be JSONPath expressions (or go-templates, or a handful of canned formats like `wide` and `yaml`). That probably makes sense, but it's pretty unwieldy, and you often end up needing `kubectl get pod pod_id -o wide` or `kubectl describe pod pod_id` to get useful container state detail.
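
A few of the forms I ended up needing (pod name made up again):

  kubectl get pods -o wide                                   # adds node and pod IP columns
  kubectl get pod billing-7d9c4b5f6-x2k8p -o yaml            # the full object, as stored
  kubectl get pods -o jsonpath='{.items[*].metadata.name}'   # pluck one field out of every pod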

When a running pod was going bananas, we had to `kubectl describe nodes`, again a long and unwieldy output format, and try to decipher from the four numbers given there what kind of performance profile a pod is seeing. That leads into setting resource requests and limits so that pods on the same node don't starve each other out, which is something I know the main k8s guy has had to tinker with a lot to get reasonably workable.
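
The relevant knob ends up being per-container requests/limits in each deployment; something like this sketch (the numbers are illustrative, and tuning them is the part that takes all the tinkering):

  # snippet from [service]-deployment.yaml, under spec.template.spec
  containers:
  - name: billing
    image: example.com/billing:1.2.3
    resources:
      requests:          # what the scheduler sets aside for this pod on a node
        cpu: 250m
        memory: 256Mi
      limits:            # hard ceiling; blowing past memory gets the container OOM-killed
        cpu: 500m
        memory: 512Mi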

Yes, we have frontend visualizers like Datadog that help smooth some of this over with near-real-time performance graphs, but there's still a lot of requisite kubectl-fu before we can get anything done. I also know there are a ton of k8s and container-ecosystem startups that claim to offer a sane GUI on top of all of this, but I haven't tried many yet, probably because I'm not really convinced any of this is necessary, as opposed to just cool -- which it undoubtedly is, but cool isn't how engineers are supposed to run production environments.

I mean, all of this doesn't even scratch the surface, and I know these aren't huge complaints individually, but they speak to the complexity of the thing, and a reasonable person needs some incentive to take it on besides "it makes us more like Google". I haven't even talked about configuring logging (which requires cooperation from the container to dump to the right place), the inability to set a reliable, specific hostname for a container in a pod that persists through deployments, the YAML/JSON naming and syntax peculiarities in the deployment configs, getting load balancing right, crash recovery, pod deployments breaking bill-by-agent services like New Relic and Datadog and making account execs mad, misguided people desperately trying to stuff things like databases into a system that throws away all changes to a container whenever it gets poked (because everything MUST be on k8s, since you already promised the boss you were Google Jr. and he'll accept nothing less), and a whole bunch of other stuff.

All of this ON TOP OF the immaturity and complexity of Docker, which itself is no small beast, on top of EC2.

That's QUITE the scaffolding to get your moderate-traffic system running when, to be honest, straightforward provisioning with more conventional tooling like Ansible would be more than sufficient -- it would be downright sane!

SOOOOOOOOO ok. Again, I'm not saying there's anything wrong with how any of this is done per se, and I'm sure some organizations really do need to deal with all of this and build custom interfaces and glue code and visualizers to make it grokable and workable, and of course Google is among them as this is the third-generation orchestration system in use there. None of this should be taken as disrespectful to any of the engineers who've built this amazing contraption, because it truly is impressive. It's just not necessary for the types of deployments we're seeing everyone doing, which has nothing to do with the k8s team itself.

I'm sure that given the popularity of k8s, people will develop the porcelain on top of the plumbing and make it pretty reasonable here in the not-so-distant future (3-5 years). However, like I said in my original post in this thread, I don't think this is benefiting many of the medium-sized companies that are using it. I think, to be completely frank, most deployments are engineers over-engineering for fun and resume points. And there's nothing wrong with that if their companies want to support it, I guess. But there's no way it's necessary for non-billion-user companies unless you REALLY want to try hard to make it that way.

I could write something extremely similar to this about "Big Data". Instead of concluding with suggesting Ansible, we could conclude with suggesting just using a real SQL server instead of Hadooping it up with all of those moving parts and quirky Apache-Something-New-From-Last-Week gadgets and then installing Hive or something so you can pretend it's still a SQL database.

Is there a way to make over-engineering unsexy? That's the real problem technologists who value their sanity should be focusing on.


