I wouldn't say it's cargo-culting, but it's definitely silly (and intended to be).
Not having to manage systems anymore and fully relying on Kubernetes to configure and manage everything is great, especially being able to lean on its redundancy and reliability.
People are so quick to look down on anything using k8s these days, often citing complexity and some hand-wavy statement about "you're not web scale" or similar. These statements are usually accompanied by complex suggestions presented as "simpler" alternatives.
On one hand you can manually provision a VM, configure the VM's firewall, update the OS, install several packages, write a bash script to be run from cron, configure cron, set up backups, and more, followed by the routine maintenance required to keep this system running and secure. This dance has to be done every time you deploy the app... oh, and don't forget that hack you did one late night to get your failing app back online but forgot to document!
Or, on the other hand, you can write 1-3 relatively simple YAML files, just once, that explicitly describe what your app should look like within the cluster, and then it just does its thing forever.
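To make "relatively simple" concrete, here's a minimal sketch of what such a manifest looks like - the names, image and ports are made-up placeholders, not anything from a real setup:

```yaml
# Minimal Deployment + Service sketch; all names and the image are placeholders.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp
spec:
  replicas: 2
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
    spec:
      containers:
        - name: myapp
          image: registry.example.com/myapp:1.0
          ports:
            - containerPort: 8080
---
apiVersion: v1
kind: Service
metadata:
  name: myapp
spec:
  selector:
    app: myapp
  ports:
    - port: 80
      targetPort: 8080
```

`kubectl apply -f` that once and the cluster keeps two replicas running behind a stable Service address, restarting them when they die - no cron, no hand-rolled restart scripts.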
k8s is very flexible and very useful, even for simple things. "Why would you not use it?" should be the default question.
Setting it up and maintaining it along the happy path is not an issue; it's well documented and there are existing automations if desired, so you can set up a cluster in just a few minutes (if you already have the experience).
And YAML files are nearly trivial in most cases. Boilerplate is not pretty but it’s easy.
The only Problem (with a capital P) with K8s is when it suddenly and unexpectedly breaks down - something fails and you need to figure out what is going on. That's where its complexity bites you, hard - because it's a whole new OS on top of another OS, a whole new giant behemoth to learn and debug. If you're lucky it's already documented by someone so you can just follow the recipe, but if your luck runs out (it always does eventually) and it's not a common case, you're going to have a very unpleasant time.
I'm a lazy ass; I hate having to debug extremely complex systems when I don't really need to. All those DIY alternatives are there not because people are struggling with NIH syndrome but because they are orders of magnitude simpler. YMMV.
Or if you want to make K8s do something it isn't designed for or presently capable of, so you need to hack on it. Grokking that codebase requires a lot of brainpower and time (I tried and gave up, deciding I don't want to until the day finally comes when I must).
This is the other common pushback on using k8s, and it's usually unfounded. Very rarely will you ever need to troubleshoot k8s itself, if ever.
I highly doubt most situations, especially ones like this article, will ever encounter an actual k8s issue, let alone discover a problem that requires digging into the k8s codebase.
If you are doing complex stuff, then ya, it will be complex. If you are not doing complex stuff and find yourself thinking you need to troubleshoot k8s and/or dig into the codebase - then you are very clearly doing something wrong.
A homebrewed alternative will also be complex in those cases, and most likely more complex than k8s because of all the non-standard duct tape you need to apply.
Even using managed k8s such as GKE, at $DAYJOB, we have run into issues where the default DNS configuration with kube-dns just falls over, and DNS resolution within workloads starts hanging/timing out. We were doing absolutely nothing special. Debugging it was challenging, and GCP support was not helpful.
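When it happens, the first sanity checks we reach for (a sketch, assuming the stock kube-dns/CoreDNS setup; the busybox image is just a convenient stand-in) are to look at the DNS pods and resolve a cluster name from a throwaway pod:

```sh
# Is the cluster DNS deployment itself healthy?
kubectl -n kube-system get pods -l k8s-app=kube-dns

# Does resolution work from inside the cluster at all?
kubectl run -it --rm dns-test --image=busybox:1.36 --restart=Never -- \
  nslookup kubernetes.default.svc.cluster.local
```

That tells you whether the symptom is cluster-wide, but not why kube-dns is falling over - which is the part that was so hard to pin down.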
The first time I had to look under the hood (and discover it's not exactly easy to get through the wiring) was when I set up my first homelab cluster. I wanted to run it over an IPv6-only network I had, and while the control plane worked, the data plane did not - it turned out that some places were AF_INET only. That was quite a while ago, and that's all totally okay. But it was documented exactly nowhere except for some issue tracker, and it wasn't easy to get through the CNI codebase to find enough information to even find that issue.
So I wasn't doing anything wrong, but I still had to look into it.
So I ran that cluster over IPv4. But it was a homelab, not some production-grade environment - so, commodity hardware, flaky networks and chaos monkeys. And I had it failing in various ways quite frequently. Problems of all kinds, mostly network-related - some easy and well-documented, like nf_conntrack capacity issues; some just weird (spontaneous intermittent unidirectional connectivity losses that puzzled me, while the underlying network was totally okay - it was something in the CNI, and I had to migrate off Weave to, IIRC, Calico, as Weave was nicer to configure but too buggy and harder to debug).
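The conntrack one, at least, is easy to confirm and to paper over once you know about it - a sketch, run on the affected node, with an illustrative value:

```sh
# How full is the conntrack table right now?
sysctl net.netfilter.nf_conntrack_count net.netfilter.nf_conntrack_max

# Bump the limit (value is illustrative; persist it via sysctl.d to survive reboots)
sudo sysctl -w net.netfilter.nf_conntrack_max=262144
```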
I also ran a production GKE cluster, and - indeed - it failed much less often, but I still had to scrap and rebuild it once, because something had happened to it, and by then I knew I simply wasn't capable of properly diagnosing it (creating new nodes and discarding old ones so all the failing workloads get rescheduled is easy - except, of course, that it solves nothing in the long term).
In the end, for the homelab, I realized I don't need the flexibility of K8s. I briefly experimented with Nomad, and realized I don't even need Nomad for what I'm running. All I essentially wanted was failover, some replicated storage, and a private network underneath. And there are plenty of robust, time-proven tools for doing all of those. E.g. I needed a load balancer, but not specifically a K8s load balancer. I know nginx and traefik a little, including some knowledge of their codebases (I've debugged both, successfully), so I just picked one. Same for the networking and storage: I just built what I needed from well-known pieces with a bit of Nix glue, much simpler and direct/hardcoded rather than managed by a CNI stack. Most importantly, I didn't need fancy dynamic scaling capabilities at all, as all my loads were and still are fairly static.
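To give an idea of the scale of "simple" I mean: the failover part of the load balancer boils down to something like this if you go the nginx route (addresses are placeholders, and the idea is the same with traefik):

```nginx
# Plain nginx failover: a primary plus a backup that only receives traffic
# once the primary is marked down. Addresses are made up.
upstream app {
    server 10.0.0.11:8080 max_fails=3 fail_timeout=10s;
    server 10.0.0.12:8080 backup;
}

server {
    listen 80;
    location / {
        proxy_pass http://app;
    }
}
```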
As for that GKE cluster - it worked, until the company failed (for entirely non-technical reasons with upper management).
If you know what I did wrong, in general - please tell. I'm not entirely dismissing K8s (it's nice to have its capabilities), but I currently have a strong opinion that it's way too complex to be used when you don't have a team or person who can actually understand it. Just like you wouldn't run GNU/Linux or FreeBSD on a server back in the '00s without having an on-call sysadmin who knew how to deal with it. (I was one, freezing my ass off in the server room, debugging a NIC driver crashing the kernel at 2am - fun times (not really)! And maybe I've grown dumber, but a lot of Linux source code is somewhat comprehensible to me, while K8s is much harder to get through.)
This is all just my personal experience and my personal attitude to how I do things (I just hate not knowing how things in my care/responsibility work, because I feel helpless). As always, YMMV.
1) If you're dipping into Calico and friends then I'd argue your setup is not simple, so it is not surprising your config/experience were also not simple. Configuring a VPC and setting up routes etc for any cloud provider by hand will also be quite complicated. In my opinion, this is not a k8s issue, but rather just a complexity issue with your setup. Sometimes complexity is necessary...
2) IPv6 is an issue even today with many systems. There's also not a large need to run IPv6 inside your cluster, but I have not actually tried and cannot comment on how well it works. It's possible it was a "round hole, square peg" issue at the time.
3) Regarding your GKE cluster, I find it improbable that k8s itself was borked, especially if it was "managed" k8s from the cloud provider (meaning they provide the control plane, etc). It seems to me it's much more likely something inside the cluster broke (i.e. one of your apps/services) and in the heat of the moment it was easier to just throw everything out. Cycling nodes has the effect of redeploying your apps/services (they are not migrated the way something like Xen or VMware does it), which probably indicates they were the issue if you did not also modify the manifests at the same time. Were your services configured with the relevant health endpoints k8s uses to determine whether your app is still alive and ready to receive requests? Without them, a failing app will stay failed, and k8s won't necessarily know to cycle it.
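For reference, those health endpoints are just a few lines on the container spec - a sketch, with placeholder paths and port; the endpoints themselves have to exist in your app:

```yaml
# Illustrative probes; path and port are placeholders.
containers:
  - name: myapp
    image: registry.example.com/myapp:1.0
    readinessProbe:
      httpGet:
        path: /healthz/ready
        port: 8080
      periodSeconds: 10
    livenessProbe:
      httpGet:
        path: /healthz/live
        port: 8080
      initialDelaySeconds: 15
      periodSeconds: 20
```

A failing readinessProbe takes the pod out of Service rotation; a failing livenessProbe gets the container restarted. Without them, a wedged process just sits there looking "Running".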
That was my homelab (all bare metal, except for one node that was a VPS), so a full DIY setup without any clouds, meant to try things out and learn how they work. I tried a bunch of CNI options, I remember at least Weave, Calico and Flannel. I just opened the K8s page about networking and looked at the presented options, trying them out.
The idea was to learn stuff. I've yet to see a complex system that never fails, so I wanted to see how it works (and what the limitations are, thus the v6 experiment, the heterogeneous node architecture with one aarch64 node, etc.), how it can fail (accelerated through less homogeneous and less reliable conditions), and what it takes to debug and fix it. On my terms, when I can tolerate downtime and have time to research, not when a real production system decides it wants to satisfy Murphy's law at the worst possible moment. And - no surprise - it had its failures, and I learned that debugging them isn't easy at all. It was nice when it worked, of course (or I wouldn't have bothered at all).
> Regarding your GKE Cluster, I find it improbable that k8s itself was borked
Well, what can I say? It did, and it certainly wasn’t an application-level error.
Actually, now I remember (it was about five years ago - quite a while, so my memory is blurry) it did that not once but twice. One time I diagnosed the issue - it was a simple conntrack table overflow, so I had to bump it up. The other time I have no idea what was wrong - I just lost database connectivity, but I'm certain it wasn't the application or the database but something in the infra.
> Actually, now I remember (it was about five years ago - quite a while, so my memory is blurry) it did that not once but twice. One time I diagnosed the issue - it was a simple conntrack table overflow, so I had to bump it up. The other time I have no idea what was wrong - I just lost database connectivity, but I'm certain it wasn't the application or the database but something in the infra.
Neither of these are k8s issues though. Where were you playing with `conntrack`? On the backplane?
The issues you describe here are issues you created for the most part. They are not issues people run into in production with k8s, I can assure you of that.
> I just lost database connectivity, but I’m certain it wasn’t the application or the database but something in the infra
Most likely something with your cloud provider that you did not understand fully and therefore blamed k8s, the thing you understood the least at the time.
> Neither of these are k8s issues though. Where were you playing with `conntrack`? On the backplane?
Yes, on the host (a GKE-provisioned node VM) where the application container ran.
While it's certain this is not in K8s itself, I'm not really sure where to draw the line. I mean, IIRC, K8s relies on the kernel's networking code quite a lot (e.g. kube-proxy is all about that), so... I guess it's not precisely clear whether it's in or out of scope.
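For instance, with kube-proxy in iptables mode (the default at the time, IIRC), Service VIPs are literally just NAT rules programmed into the node's kernel - you can see them right there on the node, which is why the line gets blurry:

```sh
# On a node, assuming kube-proxy in iptables mode: Service handling is plain NAT rules.
sudo iptables -t nat -L KUBE-SERVICES | head
```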
But either way, they're still certainly GKE issues, because the whole thing was provisioned as a GKE K8s cluster, where I don't think I was really supposed to SSH into individual nodes and poke around.
> The issues you describe here are issues you created for the most part. They are not issues people run into in production with k8s, I can assure you of that.
Entirely irrespective of K8s or anything else... people don't create weird issues for themselves in production? ;-) I honestly suspect making sub-optimal decisions and reaping their unintended consequences is one thing that makes us human :-) And I'm sure someone out there right now tries some weird stuff in production because they thought it would be a good idea. Maybe even with K8s (but not exactly likely - people hack on complex systems less than on simple systems).
By the way, if you say connectivity hiccups aren't a thing in production-grade K8s, I really wonder what kind of issues people run into?
> Most likely something with your cloud provider
I remember that node-to-node host communication worked and the database was responsive, but the container had connection timeouts, which is why I suspect it was something with K8s.
But, yes, of course, it's possible it wasn't exactly a K8s issue but something with the cloud itself - given that I don't know what the problem was back then, I can't really confirm or refute this.
> Or if you want to make K8s do something it isn't designed for or presently capable of, so you need to hack on it.
I ask totally non-argumentatively: can you think of any examples?
I’ve twisted and contorted Kubernetes into some frankly horrifying shapes over the years and never run into a situation where it wouldn’t do something I needed it to do.
It's, essentially, just a whole big pile of primitives, and most of the functionality is implemented on top of the platform anyway (not as part of the platform itself).
The furthest I’ve had to go is some custom resources, setting up some webhooks, writing a controller, and integrating against the API.
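For context on what I mean by custom resources: even that part is mostly declarative. A minimal sketch, with a made-up group, kind and fields:

```yaml
# Minimal CustomResourceDefinition sketch; group, kind and fields are invented.
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: ircclients.example.com
spec:
  group: example.com
  names:
    kind: IrcClient
    plural: ircclients
    singular: ircclient
  scope: Namespaced
  versions:
    - name: v1
      served: true
      storage: true
      schema:
        openAPIV3Schema:
          type: object
          properties:
            spec:
              type: object
              properties:
                server:
                  type: string
                nick:
                  type: string
```

The controller watching those objects is where the real work (and the webhooks, and the API integration) lives.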
Wondering what gaps I have here and what I might try or run into that's gonna run me up against that wall.
Running everything over an IPv6-only network, I believe, was not possible five years ago. IIRC I had some issues with the data plane, but it was a while ago so I don't really remember the details. Maybe it is possible now, maybe there are still some issues - I honestly haven't looked since then.
Thanks for replying--that's actually really good to know right now. Was just embarking on a project to finally try and get IPv6 running on my network... I'll maybe plan to do a PoC with IPv6 in the cluster before getting myself committed.
If you stood up k8s just for an IRC client that would definitely be silly, but honestly if you already have it (in this context presumably as the way you self-host stuff at home) then I don't really think there's anything silly about it.
If anything it would be silly to have all your 'serious' stuff in k8s and then a little systemd unit on one of the nodes (or a separate instance/machine) or something for your IRC client!
(Before I'm accused of cargo-culting: I don't run it at home, and just finished moving off it at work last week.)