Unironically using Kubernetes for my personal blog (mbuffett.com)
266 points by tate on March 18, 2021 | 227 comments



Kubernetes is... fun. It's fun because things that used to require hundreds of hours to figure out now only require dozens. This can give superpowers to the right kind of engineer. I spent a bunch of time hand-rolling my own cluster with its own Wireguard network, on personal hardware, with one node in the cloud because I thought I needed one.

What's neat about it is, you generally solve one problem, and once it's solved, it stays solved. Your solution is in a yaml file somewhere or in a command line option that you've persisted to a script or whatever. And everything accumulates! If you manage to get a storage cluster installed, boom, all of a sudden you have a ton of options available to you.

Once the cluster is stable and you are comfortable bringing it up and tearing it down with everything working right, you can start pulling other parts of your infrastructure into k8s. Need a DNS server? Run it on your cluster! K8s wants a docker container registry; you really don't want to run that on your cluster in the beginning, but once your cluster's secure, why not! K8s starts eating everything in your life like the effing Borg and it's great!

If you're the kinda cat that will spend 80 hours on a Factorio world, then dive into the mods and make the tweaks you really wish the authors would make, and you have enough admin experience to do general troubleshooting of complex systems, I can't recommend Kubernetes enough for the sheer fun factor.

The downside, if you want to do anything serious with it, is the same as it always was for running your own infrastructure: network connectivity. Hardware is cheap. Last-mile network connectivity isn't. The solution to that is, of course, colocation.


> What's neat about it is, you generally solve one problem, and once it's solved, it stays solved. Your solution is in a yaml file somewhere or in a command line option that you've persisted to a script or whatever. And everything accumulates! If you manage to get a storage cluster installed, boom, all of a sudden you have a ton of options available to you.

Can you expand on how this is different from traditional networking? I used to run a FreeBSD router with a ZFS storage setup and various other stuff on my home network, and it wasn't like I had to keep tinkering with it. Once I had network storage, I could use it from lots of different things, etc. But eventually you need software patches, updates, etc, and that was where the ongoing pain would sometimes crop up. Is that so different here? From some coworkers who are closer to the cluster, keeping up with K8s version changes doesn't seem like a small effort.


There isn't any gain for home networking. The gain is for enterprise applications that can now migrate a software-defined network to a completely different infrastructure provider without having to change the way they do monitoring, log collection, or storage provisioning; DNS migrates with it for free, and so on.

Maybe, if you're lucky, you had some set of really good and reliable Sys Admins that figured out a robust way to script and configure the setup process of your original on-premise data center and they captured that in very good, well-maintained documentation. If you're even luckier, maybe those same guys still work for you.

If not, well, Kubernetes provides an open standard that a lot of different people know how to use, and the documentation for how everything is set up lives directly in your source control. There's nothing exactly unique about this, but it is an actual open standard. It hasn't been owned by Google for years. It's owned by a non-profit. It's open source. If your deployments work on a cluster provided by one Kubernetes engine implementation, they'll work on any of them. You can roll your own, use GKE, EKS, whatever Microsoft offers, and it'll work in exactly the same way wherever you go.

For one person, this means nothing. This guy is just doing it for fun. But for organizations that used to lose millions of dollars a year and sometimes their entire market position thanks to vendor lock in or had to experience months of downtime for a data center migration, they may find something like this useful. Or maybe you just don't want to rely too heavily on Dennis Nedry and want to be able to bring in anyone with a couple years of experience using an open source, open standard toolset who can come in and be reasonably expected to understand how your system works pretty quickly without needing ten years of tribal knowledge.


> There isn't any gain for home networking. The gain is for enterprise applications that can now migrate a software-defined network to a completely different infrastructure provider without having to change the way they do monitoring, log collection, or storage provisioning; DNS migrates with it for free, and so on.

It sounds like you're making the argument that deploying k8s for a startup is an extreme case of premature optimization.


Nah, it's par for the course for startups, for the following reasons...

You can get tons of credits for your startup, typically hundreds of thousands from Microsoft, Amazon, etc. Eventually you run out of credits, so you switch providers. I did this 3 times at a startup and got three years of free infrastructure.

If you are selling enterprise software, then you use k8s so you can deploy at enterprise without having to integrate with all the wacky requirements.

You can build a SaaS that offers isolated service nodes on k8s infra pretty easily if you just give everyone a small VM with a k8s cluster on it.

If you use a scale-to-zero model, your infra costs are a lot cheaper. Similarly, the auto-scaling k8s capabilities are really nice, and it's awesome to be able to easily build systems that can scale up massively. You never know when your startup will get popular.


Depends on what you mean by "deploy".

If you mean assembling your own k8s infrastructure by spinning up machines, then you're right.

If you mean deploying your software on managed k8s infrastructure, then this is an outstanding use of your startup's time. You will be able to find plenty of developers already familiar with the workflow, and you'll have a straightforward time growing and deploying your app on different providers. It cleanly side-steps many production pitfalls that burned our time 10 years ago.


It's very frustrating that on HN of all places there is still this constant belief that all startups are the same and are building basic web apps.

Many types of startups will benefit from K8s right from day one.


My first, original production kubernetes cluster existed precisely because it lowered our costs - ridiculously so.

Because all that above? It comes with a system that will actively help you binpack your workloads as much as possible into the compute you give it.

Your usual "throw separate instances"? Can get expensive quick, especially if you're not aggressively modifying instance sizes. As for "lol just use Heroku"... I call that kind of company "bankrupt", but that's probably because the difference between cost of infrastructure and cost of engineer time is wildly different in my area to SV.


Maybe. I guess you could argue that before you find product-market fit, while your survivability is at stake, it's better to be as ruthless as possible and cut down on any infra costs.

All startups are not the same though. If the startup has obtained any kind of reasonable Series A, it totally makes sense to invest in kubernetes and allow your engineering team to essentially pull in literally any kind of dependency, test it quickly, and find out whether it works. It acts as a catalyst that lets you churn out new products really quickly, and as such it's an invaluable tool for allowing your startup to move fast.

If you work in enterprise environments and are able to use kubernetes, you are set to really shock your organization with how quickly you can move. I've seen this same situation play out in a few orgs, and it's really amusing how stupidly productive it allows engineers who learn it to be, and how quickly they get things done and earn more responsibilities, promotions, etc.


Startups can use operators, which are like extra headcount.


> There isn't any gain for home networking. The gain is for enterprise applications that can now migrate a software-defined network to a completely different infrastructure provider without having to change the way they do monitoring, log collection, or storage provisioning; DNS migrates with it for free, and so on.

This is an unbelievable amount of gate-keeping hogwash. I don't know who this person is to think they can arbitrate what counts as good usage, or which cases this is too powerful, too interesting, too useful to bother using it in.

There is so, so, so much fear & doubt & scaremongering in this post. Screw this gate-keeping crap.

> Maybe, if you're lucky, you had some set of really good and reliable Sys Admins that figured out a robust way to script and configure the setup process of your original on-premise data center and they captured that in very good, well-maintained documentation. If you're even luckier, maybe those same guys still work for you.

"Only us good right & virtuous & amazing engineers can handle this! This is too pure, too amazing for mortals! They're wasting their time! They'll get the configuration wrong! They're bad people. Only professionals are qualified to play with Kubernetes!"

UGH ENOUGH. Stop this terrible attitude. This is so down-talk-y.

Please don't assume; please don't dictate your limited terms to the world. Let the world try. Let us not be cowed & afraid to use good tech because of these scare words.

As it turns out, it's just not that hard. It's a better environment, a better world. There are lots of home users using Kubernetes. It doesn't take a colossal investment. It's fairly secure out of the box, at least if you're not trying to run a multi-tenant home. It just works. This blog post shows that! It's really simple.

See? Look. Lots of projects: https://github.com/k8s-at-home/awesome-home-kubernetes . Lots. Good people, just trying. Not taking in the poison words telling them to be afraid, that this is too hard.

"For one person, this means nothing. This guy is just doing it for fun."

What a BAD ATTITUDE. Snarking & being mean to people out there who are trying to find better ways to do things, to create shared, meaningful value. With good, autonomic systems. With reasonably competent, free-to-everyone utility-scale / cloud computing. Don't accept such words as these. Do not be afraid to involve yourselves. Do not be gate-kept like this. Run Kubernetes. Run good systems. Stop being sold on second- and third-rung systems. Believe in yourselves. Don't exclude yourself; don't be afraid. You are not saving yourselves any hardship by choosing lesser technology.

You can run K3S in <20 minutes, and you can start loading amazing manifests & Charts seconds after. This blog write-up shows that. IT MAKES PEOPLE AFRAID to think how democratic technology could be. Please allow yourself a moment of un-doubt where you consider that maybe this has amazing value for the home, that it's already possibly incredibly robust; please consider that applying some manifests might be super easy. Blog posts were the canonical way to share work before Kubernetes, but now I can link you to a repo full of people sharing manifests & charts & works that stand a decent chance of running on any cloud or at home. There are a lot of sophisticated under-the-hood boons to running Kubernetes too, but as for what the home user gains: it's amazing. It's easy to understand, eventually, and shared.

Skills learned building one thing in Kubernetes parlay much better into other Kubernetes tasks, related or not so much. Kubernauts are growing & learning & exploring in a far more communal, shared operational way than the loose, daemon-monkey-wrenching, spread-out weirdness that came before. It's not hard, it's not weird, it's not just for the enterprise. It runs out of the box, nicely. It's for people who want a good means to think about & control a wide variety of digital things. People who want a reasonably consistent underpinning to their technical operations. The alternative, what we had been doing, is having a lot of different things that each had their own means of being thought about & controlled; dis-unified, chaotic, piecemeal, inconsistent. Come try a better, more unified way of computing. Come try utility computing. It's for people, all people. Many of us believe it serves us all better.


>Can you expand on how this is different from traditional networking?

Not OP, but I can chime in here.

It's not that it's solved in this specific deployment; it's solved in yaml files that you can re-deploy to any environment. Once I have my bundle of yaml files, I can just point it at any cluster (or namespace) et voilà, it'll be deployed. It's "infrastructure-as-code"[0]. To be clear, Kubernetes doesn't have a monopoly on this approach, but it certainly follows it.
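For instance, a minimal sketch (the context names and directory here are hypothetical; the same manifests apply anywhere):

    kubectl --context home-lab apply -f ./manifests/
    kubectl --context gke-prod -n blog apply -f ./manifests/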

Another idea is treating your servers like cattle, not pets. I used to have Linux VMs in my house named after Greek myths – these are pets. I named and loved each of 'em. And yes, after being set up, they worked. In Kubernetes, they're cattle – and you kill your cattle. You don't name and grow to love and nurture a specific VM. You can wipe your deployments and reprovision new ones in a blink.

>I used to run a FreeBSD router with a ZFS storage setup and various other stuff on my home network

>From some coworkers who are closer to the cluster, keeping up with K8s version changes doesn't seem like a small effort.

You're not wrong here though. Sometimes there are maintenance tasks you have to do on the whole cluster (make sure there's enough storage, everything is updated, certs are good, etc). But this is orthogonal to having your infrastructure-as-code in yaml files. The yaml files assume a healthy cluster/namespace, but besides that any cluster should be fungible. This is treating your servers like cattle, instead of pets.

[0] https://docs.microsoft.com/en-us/azure/devops/learn/what-is-...


> Another idea is treating your servers like cattle, not pets. I used to have Linux VMs in my house named after Greek myths – these are pets. I named and loved each of 'em. And yes, after being set up, they worked. In Kubernetes, they're cattle – and you kill your cattle. You don't name and grow to love and nurture a specific VM. You can wipe your deployments and reprovision new ones in a blink.

Very poetic, I love it! These kinds of internet conventional wisdom are gold. I also like:

Batteries included but swappable

Free as in Beer vs. free as in speech

Are there other contenders?


From a sysadmin/config management side of things, and since I "dabble" with networking, I've always been rather jealous of the network appliances. All the serious network devices had a single configuration file. Often the lines in the config file were equivalent to shell commands. For any network device, you could put the list of all the commands in notepad, as a script. But, if you exported the config file, it would look like a nicer version of your "script".

With config management, be it puppet, chef, salt, or ansible, you start to approach that. Except these only manage the resources you define. If your config management manages apache resources and someone wanders in and installs nginx, that's out of scope. If you've got a list of machine-local usernames that you want to exist, you might not be thinking about ensuring that other usernames are absent.

With kubernetes, it gets a bit closer. Let me preface this by saying that standing up and running your own kubernetes cluster is a separate level of complexity, but once you have a k8s cluster, you can dump out a list of all resources (namespaces, pods, volumeclaims, ingress controllers, services...), modify the resulting file, and apply it. Most people work in chunks, having a yaml file for each set of associated resources, or template those into a helm chart. With that, you can put all associated resources into a release, which you can delete and recreate as you desire.
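Something like this, as a rough sketch (the resource list is abbreviated, and round-tripping dumped state has caveats):

    # dump a chunk of cluster state to a file...
    kubectl get deployments,services,ingresses,pvc --all-namespaces -o yaml > state.yaml
    # ...edit it, then push the desired state back
    kubectl apply -f state.yaml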

There's still lots of gotchas, especially around persistence. It's not uncommon to find an orphaned volume claim that prevents your release upgrade from continuing until you dig in with 3 coworkers watching you as you try to remember if it's safe to delete that or not.


What's different is that the applications and the infrastructure are all managed and stored in the same location. There's a lot less you have to keep track of. Sure, you can get it all done and have it stable the traditional way, but k8s makes it fun!


Now smash that machine to bits with a hammer. How long do you need to get the exact same system back up?


In my case, it's 15 minutes.

    1. Get a new server
    2. Connect ethernet and IPMI ports
    3. Add node definition to XCAT
    4. Set boot to network, arm node for installation, power-on
    5. Profit.
    6. Get a coffee.
It's 15 minutes for [1-150] servers. Slightly longer for 150+. Because, network.

Any further setup is done via Salt, if necessary. With a single state file.

Fun fact: I provision K8S nodes that way too.


Just use https://github.com/fluxcd/flux, "The GitOps Kubernetes operator".


Let's assume I have backups for both my kube configs and my standalone machine... and spare parts for the cluster(s). I'm trying to understand better the maintenance story, because my impression is that kube itself has spec changes, and I have to stay on top of all the dockerfiles I'm using to keep up with patches there?


I use Kubernetes for work. We utilize GKE in google cloud. There have been a few times things have required some editing of our YAML files, such as when we had to change the first line of our yaml from:

    apiVersion: apps/v1beta1
    kind: Deployment
to:

    apiVersion: apps/v1
    kind: Deployment
Because in K8s 1.16, they stopped backwards compatibility for the beta API on Deployments. But they had left it in for backwards compatibility for quite some time, and it was a simple sed script to change. We have never changed our Dockerfiles except to change the FROM line when we update a base image. They often introduce new features, but try to keep things as stable as possible for existing setups.
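That kind of fix is a one-liner along these lines (the file paths are hypothetical):

    sed -i 's|apps/v1beta1|apps/v1|' deploy/*.yaml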

In our case, we store our YAML in git, and can deploy a new cluster with all our microservices in about 10 min (most of that is delays in the google global load-balancer setting up a TLS certificate for us, the machines are up and running after just a few minutes).

There can be a bit more scripting needed to do updates of the nodes to newer builds with no downtime, if your pods aren't totally stateless, but compared to what it used to be at an old job, with C apps running on Linux, behind load-balancers that had to be removed, upgraded, and added, a rolling update is like black magic.


> if your pods aren't totally stateless

This is exactly the bit I care about: what do I do with "stateful" systems (my db storage, mail server data,...)?

If I just mount those from external volumes, I lose a lot of idempotency and I now have to worry about whether I am trying to attach that volume to PG 9.3.1 or 9.3.2 or 12.0 (some are fine, others might cause data corruption).

I know idempotent deploys are all the rage, but all those deployed apps are there to serve some data which is as stateful as it can be.

I'd like a system where those external volumes are automatic snapshots on LVM (or ZFS/btrfs) when attached, but that introduces a whole another level of what-now if you need to go back to an older, now slightly stale data set.

How does k8s solve this problem for me?


This question always gets asked. I'm hoping for an answer one day too!


> and can deploy a new cluster with all our microservices in about 10 min

I know nothing about k8s yet, but I'm trying to solve the same issue with Docker Swarm, and one thing I'm unable to get my head around is this: as soon as yaml files are stored in apps' repos and a deploy is initiated by committing to them, how do you let the repos know about the new deployment? Or is there a tool, present in k8s but missing in Docker Swarm, that knows about the scattered yaml files' locations (git repos) and pulls them on a new deployment?


People generally keep their kubernetes spec files in Git. Be it either the kubernetes .yaml files they apply directly, or helm charts.

These files can idempotently be re-applied to a cluster to restore, or upgrade, the running applications.

When you lose your cluster and you restore to a new cluster, you will of course initially deploy the same version of kubernetes as you were using previously.

Then once you choose to upgrade to a new version of kubernetes, it can happen that some of the APIs you were using have been deprecated or removed. This is a very slow process, so you have plenty of time to upgrade before being forced to.

But let's say you ignore all that, have chosen to upgrade to a new kubernetes major version, and are now forced to upgrade all your yamls. This happens rarely, and recently only because a few beta APIs became stable and people are now expected to use the stable versions. So you go ahead and make those few changes, re-run your deploy, and you're done.


The people claiming it's easy are, well, lying. Our deployments fail all the time for things like Google moved their Helm images to a new repo.

Kubernetes is still early tech and thus has more moving parts than a more well established stack. Keeping up to date is crucial and no small feat.


At least the first 10 minutes is waiting for my corporate laptop to boot up ;)


The big difference is that kubernetes YAML is in a change control system, and specifies what you want rather than how you get it.


I think one big, often overlooked aspect of "use the right tool for the job" is how good you already are at wielding said tool.

I use Terraform to manage the few personal cloud resources I have. It is overkill, but I already learned it on the job so it ended up being quicker than setting stuff up by hand. If not, I wouldn’t have bothered learning it. At least in this context.

Same when building an MVP, I’m not going to pick some shiny tools I have never used before, but tools I know I can get the job done with.

Sure, I do like learning new things and tinkering around with them, but it is not always the right time to do so, and the list of things I'd like to get good at, engineering-related or not, is already long enough for a lifetime, so one needs to prioritize.


Not only how well do you use the tool, but how useful is it to learn the tool? Once you can use it well, what else can you do with it?

I recently saw a video via twitter that mocked using kubernetes for your blog by showing someone making a sandwich with woodworking tools.

Overkill, yeah. But if you stretch the metaphor to the way it translates in k8s, if you get good at making a sandwich in 5 minutes with the tools, you're fairly close to also stamping out dressers and end tables in 5 minutes. At the very least you can make them, which you can't say of a bread knife. And once you do it a few times you can do it with incredible speed.


> Once the cluster is stable and you are comfortable bringing it up and tearing it down

> K8s wants a docker container registry; you really don't want to run that on your cluster in the beginning, but once your cluster's secure, why not!

I'm betraying my ignorance here, but how does this work? If you're running the registry in your cluster, and you tear down your cluster (and the registry with it), how do you rebuild the cluster without being able to pull images?


Usually the important images are hosted somewhere else (e.g. you're probably not hosting the controller-manager, which is normally pulled from k8s.gcr.io), but for the ones that were hosted in your cluster, the pods will basically fail (and start to back off) until your local registry comes up, and you'll see pull-related errors. The rest of the pods, the ones that don't reference images from your cluster-local registry, will start up just fine.

Assuming that you rebuild your cluster and restore your registry from backup (or simply reconnect it to some object store that it was connected to before, like S3), your registry pod (in a Deployment/StatefulSet/DaemonSet) will come up, the cluster will be able to connect to it, and the pods that were failing to start will start succeeding; you're back where you started.


You can always run it in the cloud and at least theoretically be able to switch providers and take your cluster with you.


Expensive. I've got all these machines sitting here at home, why not use those?


Hard to beat the uptime of a cloud provider with on prem, if you literally mean “at home” even harder, my Comcast goes out a few times a day. Even in offices you don’t benefit from the economies of scale and redundancies in the power and network systems that cloud providers have.


If you're really trying to serve the outside world, then you want colocation. Cloud providers are way more expensive.


Colocation isn’t a binary on-or-off thing. As a crappy example, Kubernetes pods maintain the invariant that all containers within a pod exist on the same Kubernetes node. So you can avoid the most egregious waste of resources/time that comes from having no collocation.

To your point, when talking to your db or [insert shared infra here], then you may want to think about real collocation. But it very much depends on your application's needs - and remember that cloud providers can offer guarantees about which AZ or region you're in.


I did mean real colo. If I ever run a serious public-facing website, I have a colo facility picked out, just drive down there with my custom server. Only a few hundred bucks a month to serve data to a CDN, along with whatever services I want to directly expose.

With the magic of VPN, I can put my colo box on the same k8s cluster as the rest of my home stuff, in case I want to live dangerously.


Depends on what you’re doing and at what scale. Especially for smaller orgs it’s really hard to duplicate the level of service the big boys can provide yourself.


I just thought of another reason. Rolling your own cluster forces you to understand what Kubernetes is at a basic level. If you never get that experience and just roll with cloud services, you'll tend to get mystified by errors that you'd have a much better idea how to localize if you'd rolled your own.

Kubernetes is very complicated, but the complexity can be managed if you approach it from the bottom up. Using a cloud service robs you of that ability.


> Kubernetes is very complicated, but the complexity can be managed if you approach it from the bottom up.

s/Kubernetes/Computers


The pitch for Kubernetes is kind of supposed to be “infrastructure as a service,” isn’t it?


No, that's the pitch that Amazon / Google makes. Kubernetes has a much more complicated value proposition that you have to grasp the hiring dynamics of Google to really understand. Google hires really smart people but smarts isn't necessarily skills. So it builds tech to leverage that workforce.

Other companies hire skilled people that aren't necessarily super-smart, because super-smart people can make a lot more money at Google. So other companies use Google / Amazon, they don't really care what underlying techs it uses, so long as they know they'll be able to hire people with the skills.

If you're rolling your own Kubernetes, you're putting your faith in your own smarts, because you're largely on your own. K8s rewards a certain kind of developer, and badly punishes someone thinking running their own k8s is going to be as seamless a process as consuming GCP resources.


Yeah but that’s my point, not everyone needs to actually set it up. The intent is for most to just be users and concern themselves with editing yml files to bring up their services.


It's certainly not for everybody.


> my Comcast goes out a few times a day

I’m really sorry to read this. Given that you’re a Comcast customer it’s probably because you have no other viable high speed choices, but please know that regular outages at home are not normal in other parts of the world.


Lol, this is great to know but I live in downtown Dallas within spitting distance of AT&T world headquarters and literally the only "high speed" provider available to me is Spectrum (I can't explain why AT&T itself can't provide service to someone in walking distance of their own corporate headquarters compound - do they even provide service to themselves?). It goes out on me all the damn time. Heck, I lost service a few weeks back because someone moving into the house next door accidentally gave the wrong address when signing up for a new service account and Spectrum canceled my account, assuming I'd moved out since apparently someone else was moving into my house.

I have never been madder at a service provider of any service. But there is nothing I can do, at least not for six more months until Starlink finally goes online in my area and sends me a dish.

This in what? The heart of the 4th largest metro area in the richest nation on the planet? Maybe this is legitimately not a problem in most of the developed world, but U.S. residential communications infrastructure is amazingly pathetic.


Mine has never gone out, but when I had AT&T DSL it dropped whenever it rained and it was impossible to fix. (the other mystery being why we only had 1.5MBit DSL in the middle of Atlanta.)


What’s he meant to do with this information?


Lament alongside the rest of us Comcast customers paying our "protection money".


> like the effing Borg and it's great!

Don't know if that was an intended nod but I liked it :)

https://storage.googleapis.com/pub-tools-public-publication-...


I really like K3s. It comes with an ingress controller out of the box, which makes it easier to get up and running. Add Let's Encrypt, and it's pretty trivial to stand up new services on their own subdomains.
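For reference, the documented k3s quick-start really is a one-liner:

    curl -sfL https://get.k3s.io | sh -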


Also worth noting that it's a lot less resource hungry than the official Kubernetes distro. Kubernetes documentation recommends 2 CPUs and 2GB of RAM per node just to have enough left over to actually run any workloads.

k3s is specifically designed for resource constrained environments (e.g. Edge, IoT, CI/CD) and can get away with as little as 0.05 CPU and 256M RAM (on worker nodes).

I don't recommend this as a way to save money on infrastructure but it does open the door to further use cases.


I'm struggling to configure new subdomains that easily using GKE. The managed certificates system in GCloud is straight-up user-hostile. I'll have a look at that.


It really doesn't take much experience with kubernetes before it becomes fairly easy. And then using it for trivial things like this isn't a big deal.

My home "server" runs plex, sab, mongo, unifi, and various other things in a single node k8s with a local zfs volume provisioner. The previous revision of this server I'd switched to using docker for everything and was annoyed with the upgrade process with just docker alone. With k8s, I just use :latest for most images and an upgrades happen every time I restart a pod or reboot the machine.

I've been working with k8s since petSets, so this is all NBD to me. Linux had a learning curve too, and y'all got over it.


Kubernetes is incredibly daunting for someone who only used VMs, but after spending some time with it, it's not really that hard. I also unironically run my personal stuff on K8s just because it's familiar to me, and it is also a good exercise for a very useful skill. People complain about the price, but you can easily get a 3-node cluster on GCP for less than $15/month.


Yeah, I've been learning Kubernetes recently and a simple setup isn't too difficult. However, there are a lot of concepts involved with Kubernetes to grasp, and I've definitely not learned everything there is. For example, Jobs still baffle me in regard to what their purpose is outside of CronJobs.


I use Jobs for helm hooks like integration testing and DB migrations associated with deployments. They work great for one-off workloads like that.
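A minimal sketch of such a migration Job wired up as a helm hook (the names, image, and command are all placeholders):

    apiVersion: batch/v1
    kind: Job
    metadata:
      name: db-migrate
      annotations:
        "helm.sh/hook": pre-upgrade              # run before the release is upgraded
        "helm.sh/hook-delete-policy": hook-succeeded
    spec:
      template:
        spec:
          restartPolicy: Never                   # Jobs require Never or OnFailure
          containers:
          - name: migrate
            image: myapp:1.2.3
            command: ["bundle", "exec", "rake", "db:migrate"]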


You just beat me to it!


The term "batch job" (as opposed to "service") appears frequently in the Google K8s-like scheduler papers:

https://research.google/pubs/pub41684/ https://research.google/pubs/pub43438/

In other words, jobs seem like the go-to way to take advantage of unused CPU/memory.


I developed a simple wrapper around vegeta, called vajra, that can be given the URL to a file containing test URLs (in vegeta format), a target QPS/duration, and then easily run a load test.

Gets deployed as a k8s `Job` via GitLab (can be scheduled/on-demand), with a simple script echoing back the vegeta status every minute. Jobs don't yet support automatic cleanup, so another simple GitLab job deletes the job from k8s upon completion. The execution log is available in GitLab/Kibana anyway.

All in all, an extremely simple way to run load tests against a service deployed in k8s.

What grew as a personal utility is now being used by many teams for quick load tests.


Jobs can get useful in more complex workflows. The most common use I've had for them is running a smoke test suite or a database migration as part of the larger workflow for a big Rails app.


But why wouldn't I just run a database migration as part of my CI flow? I don't understand why I would add a persistent element to my Kubernetes cluster for just a one-off task, since Jobs do not actually automatically remove themselves from the cluster. And as far as I know, you can't even reuse the already created Job, so if I want to migrate the database again, I need to make it from scratch.


Well, CI doesn't necessarily mean CD. In more than a few cases, I've found it advantageous to use CI to do automatic container builds, but manually attend to the deployment upgrades, particularly when things like db migrations are happening.

> Jobs do not actually automatically remove themselves from the cluster.

set `ttlSecondsAfterFinished` and they will

> And as far as I know, you can't even reuse the already created Job, so if I want to migrate the database again, I need to make it from scratch.

If you need to run a job on demand, you can always 'kubectl apply -f' the metadata, or you can create a cronjob and `kubectl create job --from` your cronjob. Or, if it needs to run at a particular part of your installation/upgrade workflow, use helm hooks. Jobs are intended to be a one-time thing, so the lack of functionality to repeatedly run them is intentional.
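Concretely, the on-demand re-run is (names hypothetical):

    kubectl create job db-migrate-manual --from=cronjob/db-migrate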


There's a TTLAfterFinished feature you can enable since 1.12. It's been in alpha forever. :/ Lets you add a ttl to your job spec to automatically clean up completed jobs.


One more unironic personal user here - I use k3s to keep things as cheap as possible.


I actually compared K3s with Docker Swarm in my master's thesis, where I also wrote a simple Python tool to automate setting up servers for deploying containerized workloads on them.

While I can't really share the text (because it's all in Latvian), I did find that K3s is a pretty good option for when you want to use Kubernetes but have pretty limited hardware resources. So much so, actually, that it used only slightly more resources than Docker Swarm (and, as a consequence, the deployments running on the nodes could serve almost equal amounts of requests under load).

Personally, I do believe that Kubernetes is sometimes overrated for simpler deployments, but if people don't mind the manifest syntax and are comfortable with the tool and its ecosystem, I think that K3s is probably the way to go in those situations (maybe with something like Portainer/dashboard running for graphical administration, if necessary). It's such a good distribution that you could probably even run it in prod with an HA setup.


> but you can easily get a 3-node cluster on GCP for less than $15/month.

Just wondering, how are you exposing your services from the cluster? (I assume you aren't using a load-balancer ingress)


Did you use Docker Swarm in the previous revision?

I've been quite happy with it for a few years now on a single-node cluster and a handful of services. I could easily add another node, but haven't had the need to. The setup was quite simple and it requires practically no maintenance. If I have to reboot everything starts up automatically, image upgrades are a breeze, and I really have no issues with it.

Sure it doesn't have all the bells and whistles of a k8s cluster, but it's perfectly fine for personal use.

I'm still partly annoyed that Swarm is mostly dead in this space and k8s has undoubtedly "won". It's only a matter of time before Docker Inc. fully abandons it. Such a shame.


Agreed. While I've dabbled in Kubernetes, most of my homelab and hybrid cloud infrastructure (think ~10-20 nodes) runs in Docker Swarm with Portainer for graphical administration.

A while back I also tried Nomad, and while it felt polished, it was also a bit lacking in features (manual encryption config, mostly read-only UI, though that was a while back).

It's a shame that Docker Swarm never got more popular, because in my mind it's basically the sweet spot between running containers directly (or with Docker Compose) and full-blown Kubernetes distributions. The Compose manifest format that it uses is really clear and great in my mind, compared to Kubernetes with all of its selectors, object types, and so on.


You can use Flux and it will just poll your source control and image registries to see if any updates to the images or configs are available and reconcile automatically. No need to even restart pods. Totally commodity continuous deployment. It's wonderful.


A friend and I have recently stood up homelabs for funsies. Mine's a cluster of x64 boxes, his is a small pile of raspi4s. He went for a k8s cluster off the bat--having worked with cloud stuff for a while, he had the familiarity to hit the ground running.

I was immediately overcome by all the abstractions. I had no idea where to look to figure things out, and was mostly relying on advice from my friend. I didn't know what an ingress controller was, much less how to configure one--but I knew I wanted each 'service' to have its own IP I could route to from my network.

Overall it felt like I had SO MUCH TO LEARN at *every point* it was difficult to get even close to my actual goals (CI/CD for a personal site + some hosted game servers).

I eventually went with the philosophy of "build something now, and move towards perfection later" / "don't let best be the enemy of good" and began spooling up LXCs and VMs to do the work I needed, planning to move things into k8s later when I better understood the actual things I wanted to move.

(Plus then I got some satisfaction out of actually accomplishing the goals I wanted, instead of just banging my head on k8s documentation and learning all the abstractions.)

As an example, I've not used docker in any meaningful capacity. For anything. No idea how to make a docker image. To get CI for my docker site onto k8s, I needed to know how to:

1. Install the dependencies, which requires compiling a plugin for pandoc, which requires installing haskell and cabal. This is expensive, so I'd prefer to set up the pre-reqs once... but that doesn't seem to be how docker works? Do I need an image repository? Can I use DockerHub? I've seen HN talk about how docker is trying to monetize; should I run my own repository? Can I do that on my cluster? I'll need an ingress controller to route to it... I don't even know what that is.

2. I need some way to pass the built website files to the container I actually want to host them on. I think that means I need an NFS share of some kind to store the files, so one container can write them and another can read them. Do I host that on my NAS? I could put an NFS share in the cluster, maybe. No idea how to get Docker to mount one, or k8s to host one. All the examples I seem to find deal more with connecting to remote services on a host than mounting local storage. Is it even local storage?

3. Everyone says infrastructure as code is good, so I guess I'll follow this flux tutorial--only to find out the one I followed is out of date, and I should follow their NEW one. But they still assume I know way more about k8s than I actually do. Still, I'll spend the few hours to get this operational, so then in theory everything else I deploy can be IaC'd, which is just good practice.

At this point I'm so many layers of abstraction deep, I have no idea what I'm actually doing or how concepts relate to each other, and I'm no closer to actually having my goal.

So last night I spent 2 hours spinning up VMs on my cluster, installing dependencies, and configuring an nginx proxy, and now I actually have my personal blog self-hosted and updatable. Way more "progress" than the 10ish hours I've sunk into building a k8s cluster already.

There's something to be said for limiting the number of abstractions you're dealing with.


> I didn't know what an ingress controller was

I've seen this comment a lot when discussing introductions to Kubernetes, and it is probably one of the biggest "first step" problems I see, and a perfectly legitimate complaint. If there was one thing Kubernetes could address to help "onboarding", this might be it.

Kubernetes is essentially a collection of controllers, that each control a different aspect of the system. These are all "internal", and you don't need to know/understand them in order to stand up Kubernetes in the happy-path. If you're using a managed solution, these are all managed by your provider anyways.

The ingress controllers are the exception. It brings the concept of "controller" out of the "Kubernetes administration" space, and into the "Kubernetes user" space. It opens up a whole can of worms around "What exactly is a controller?" that you shouldn't have to understand in order to get started using Kubernetes.


Part of the issue, I suspect, is I was blending k8s use and k8s administration by running my own k8s cluster on my own equipment--that's likely an extra layer of complexity.

(of course, the whole point of the homelab endeavor I set out on was to be fully self-hosted to learn all these underlying concepts--but it definitely doesn't help that I layered some extra abstractions on top of the soup that k8s already is.)


I don't really know anything about Kubernetes either. I've been using Ansible myself for my projects, though they're all single servers. I get the idea that Kubernetes uses Docker heavily, and that you don't understand Docker that well. I'd suggest, if you care to stick to getting into that, learning Docker first.

I have been using Docker for work for a while, though I still don't feel much desire to use it for any personal projects. It seems straightforward enough - the basic idea is that each Docker image runs one and only one process, and has its own directory tree. So if you want to run a Python or Ruby webapp, all of the app files plus the interpreter and all required debs/packages/gems live in that Docker image, and you don't need to have anything else special on your host. A Dockerfile is then a sort of script that holds all of the instructions for setting up an environment your application can run in from scratch.

Of course that means if you want to run a database or a cache server or something, then you'll need to run several Docker images and coordinate communications between them and launching and scaling. I gather that's where Kubernetes and Docker Swarm and other such things come in.


> you don't understand Docker that well

I wholeheartedly agree :)

The path I'm currently taking is to spin up Concourse CI which uses containers, and start using that to CI/CD my personal site. I'm likely gonna overkill and end up hosting my own image repository, but overall I think this'll teach me the necessary concepts of containers in a directly applicable way to my end goals.

From there I can start to play with k8s itself, if I so desire.

Where I tend to get hung up is things like entrypoints. My personal site is static, so the container to build it isn't a hosted service; it's just invoking Pandoc (with a bunch of dependencies installed as well). I think that makes Pandoc the 'entry point', which just seems strange since it isn't a persistent service, and I think that makes the lifetime of the container rather ephemeral.

Which... likely is a valid usecase for docker containers, I've just only interacted with them e.g. on my Unraid box as persistent services.


I gather the actual site is a static files site, and Pandoc is the tool that translates whatever format you're writing them in to the final HTML. So then your site docker image would just contain a server, Nginx or something, and a directory of all of the files for your site, and the correct config for Nginx or whatever to serve them.

The Dockerfile image build process would take care of building that. It's pretty standard for your single project Dockerfile to first assemble an image with a bunch of build tools, like this Pandoc and whatever other dependencies are needed, copy the source files into that image, run the commands to generate the final site, then assemble a second image with just Nginx and copy the output files from the first image onto that to become the final output image. That way, the whole build process is scripted and automated and doesn't require any of the tools on the local system (handy if you were doing stuff like onboarding new contributors), and the final image is small and secure because it only contains a web server and the final HTML pages and no build tools or source files.
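A hedged sketch of that two-stage pattern (the base images, paths, and build command are all assumptions):

    # stage 1: build tools + source, produces the static site
    FROM pandoc/core:latest AS build
    COPY . /src
    WORKDIR /src
    RUN ./build.sh              # invokes pandoc etc., emits /src/public

    # stage 2: only a web server plus the generated files
    FROM nginx:alpine
    COPY --from=build /src/public /usr/share/nginx/html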


You've listed a dozen or so other technologies you've had to learn in the past, which in aggregate required much more effort and experience to master. Much of which provides a good foundation for understanding the concepts that kubernetes builds upon.

I feel like this was more of a hyperbolic rant.


> I feel like this was more of a hyperbolic rant.

Reasonable--it definitely is a bit of a rant. I'm not sure it's hyperbolic, personally, but clearly I have a bias.

Overall, my acute frustration with k8s, coming from a native-desktop-development background, is that I simply don't have the scaffolding to be effective relatively quickly. The learning curve is steep enough that I get discouraged before I actually begin making progress on my own goals, and I just feel like I'm trying to do things The Right Way, without understanding what I'm doing, or why, or actually accomplishing my original goals.

As you said, this is true of concepts I've had to learn in the past, but I could learn all those concepts in isolation, then apply them together. K8s I feel like I have to understand a much larger chunk of before I hit critical mass and can start being effective.

i.e. with python, I can start with, like, sqlite, before I move to an externally hosted DB. With k8s I feel like I have to understand waaaay more components--and they don't directly build on each other. Like docker-compose _seems like_ a stepping stone to k8s, but I've been told it's a false path.


From an overview vantage point, YAML files seem to be a stepping stone to even better abstractions. Operators maybe? The orchestration is complex work in itself, and building up to more advanced setups requires incremental work, or simplifications. In orgs, K8s tends to require a separate team.


Can someone who is familiar with Kubernetes please explain in plain words why we use Kubernetes? I’ve read a lot about it but I always read some bullshit non-answer such as “It’s an orchestration platform for containers”. What does that even mean?

Even this blogpost doesn’t explain what and why’s of Kubernetes.

I have a docker container. I deploy it on a VM. I use a load balancer to split traffic. Could you please walk me through what problem Kubernetes would solve here?


The problem Kubernetes is trying to solve is to eliminate the burden and potential mistakes in the workflow you mentioned by automating your container deployments and actively monitoring their state. This allows the cluster to balance workloads, heal failing components, (re-)distribute work to nodes with the appropriate resources, and migrate between different versions of container images without downtime or manual intervention.

It does all of this by allowing you to specify a service architecture in configuration files and then actively ensures that this configuration is maintained even as the underlying state of containers change. You can specify things such as the minimum number of backend containers needed to provide a service, scaling parameters to add more backend containers as load increases, and you can tag nodes with different attributes so that containers are distributed and maintained with the appropriate amount of resources.

Kubernetes also automates various aspects of networking, such as provisioning and configuring load balancers for service ingress as needed. It provides an internal DNS service which automatically registers names for deployments, so that linked services can just refer to each other by name without any additional configuration. It can also manage things like SSL certificates, which can be shared across multiple services.

Lastly, it provides you with a single place where you can store secrets and configuration values that these services require, and again, you wire all of this up with configuration files which can be stored in git (or another VCS).


Read up on the philosophy of "pets vs. cattle". Kubernetes is meant to totally abstract the hardware and operating system. For a use case of one app on one machine it's not really doing anything.

But what if I told you that you needed to run that docker container on 5,000 machines with 2,500 load balancers in regions across the world? Are you going to SSH into thousands of boxes manually and run docker commands? Are you going to try setting up some monster ansible inventory to do the same? In practice the best minds in distributed computing have found those kind of practices break down at large scale--you just cannot reason or deal with individual machines when there are thousands of them.

This is where kubernetes comes in--it's an abstraction that lets you declare "here's the state I want, X machines running Y containers, all linked through Z services" and kubernetes will make it happen, period. It will take care of contacting thousands of machines, controlling the running containers, ensuring they stay running, handling failures, monitoring, load balancing, etc. You no longer think about problems in terms of low-level machines and instances, you think about the higher level objective like deploying code.

The beautiful thing is that it scales down nicely. A simple 50-line YAML file that declares running your docker container and load-balancing it with a service can easily deploy just to your local machine, or be scaled up to run on 5,000 machines by just changing a variable in the deployment scale. The same simple one-liner kubectl command kicks off either deployment and helps you monitor its progress. If you've ever worked in distributed systems, it is really incredible to see this in action at scale.
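Roughly what those ~50 lines look like, as a sketch (the image and ports are placeholders):

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: myapp
    spec:
      replicas: 3                    # the one variable you change to scale
      selector:
        matchLabels:
          app: myapp
      template:
        metadata:
          labels:
            app: myapp
        spec:
          containers:
          - name: myapp
            image: registry.example.com/myapp:1.0
            ports:
            - containerPort: 8080
    ---
    apiVersion: v1
    kind: Service
    metadata:
      name: myapp
    spec:
      type: LoadBalancer             # front the pods with a load balancer
      selector:
        app: myapp
      ports:
      - port: 80
        targetPort: 8080

A single `kubectl apply -f myapp.yaml` deploys that same spec to a laptop-sized cluster or a 5,000-machine one.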


> I have a docker container. I deploy them on vm. I use load balancer to split traffic. Could you please walk me through what problem Kubernetes would solve here?

What happens when your VM dies? Kubernetes would automatically bring it back up. It has health checks, and knows when containers die/crash.

What happens when you need another docker container due to traffic? Again, Kubernetes fixes situations like this. Kubernetes has a lot of built in support around scaling etc.

Also, what if your docker container doesn't need a whole VM? Say you've got 5 different docker containers (which all scale independently), and let's say 3 VMs. Kubernetes will distribute them across those VMs based on their resource needs.

There's a lot more, but that's kind of what I think of when you say 'Orchestration of containers'.
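The per-container knobs that drive this are small YAML additions; a sketch (the path, port, and numbers are placeholders):

    resources:
      requests:
        cpu: 100m                # the scheduler uses this to place the pod
        memory: 128Mi
    livenessProbe:               # lets k8s detect and restart dead containers
      httpGet:
        path: /healthz
        port: 8080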


A funny phenomenon I noticed on my home setup:

If you start a deployment ("your container", roughly speaking), the age of that deployment will keep counting up even if the container exits-- k8s restarts it of course-- and even if the k8s scheduler goes down-- since we want to be able to restart the scheduler without unnecessary service restarts.

At home, it's all on one computer in my basement, so when I reboot that box, the k8s reports come back and keep telling me the deployment has been there for xx days (just with some availability hiccups).


That's because the Deployment has been up that long. Now, the Pods created from the pod template inside the deployment will not have nearly the same age as the Deployment, because they're much more likely to change as the Deployment spec changes or as Nodes come and go.

Even with a reboot, most container runtimes (the docker daemon, e.g.) will keep a pod around for 10 minutes until they reschedule it, so you'd see a restart count for the pod but probably not even a different age. It's tunable, but IIRC 10 minutes is the default.


Thank you and everyone who has responded.


After using k8s for about a year at work now, the way I understand it is that there’s a really large gap in abstraction between “a bare metal machine or VM with a shell prompt” and “a service that handles requests with some Python code”, and kubernetes helps fill that gap with some sane defaults and useful abstractions for people who run services.

If you run your docker app on kubernetes, you get a lot of things for free with the platform (rolling no-downtime deployments, service discovery, auto scaling) that you’d have to set up manually if you were running your service on (say) EC2 instances instead.

It can be a headache to learn sometimes, but ultimately saves a lot of effort if your use case fits!


Not a huge fan, but for me the exact point where Docker ends and Kubernetes starts is binding a container to a network port. If you write what port to use inside a Dockerfile, it does nothing, it's only advisory.

For better or for worse, you need something outside the Dockerfile that can run it. That can be you, if you want to type out 'docker run -p8080:80' etc. You could probably script it, but does your script do restarts, failover, etc?


Downsides (see: tremendous complexity) being generally understood, I'll list some of the upsides I personally feel compared to your described setup:

- If you wanted to scale up or down the number of container processes running on your VM, you'd need to write some code that looks at system utilization. Autoscaling k8s clusters do that for you. They can even provision additional VMs ("nodes") for you during times of heavy traffic, or scale down to save money. (A one-line sketch of this follows the list.)

- Updating your app requires either logging into each VM, or writing an Ansible playbook to do that for you. By the time you've written a zero-downtime, health-check-honoring, contextually-aware Ansible playbook, you've made your own container orchestration solution.

- If you run multiple containers that need to talk to each other, you'd need to handle their networking. K8s gives you tools for handling networking between containers in the same namespace that allows them to communicate without exposing them to the wider internet.

- The ecosystem of utilities is as good as (and sometimes better than) what you'd experience in your VMs setup. cert-manager makes certificate management almost as easy as LetsEncrypt does on a single machine. Prometheus and Grafana are excellent logging and monitoring solutions (and, IMO, much easier to set up on K8s than ELK is within a distributed VM setup). Cilium provides extremely powerful and useful networking and security policies that leverage eBPF.

- Changes you make to the configuration of your server won't carry over if you ever need to switch hosting providers, or (more often the case for me) just want to start fresh.
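On the autoscaling point above, the entry-level version is a single command (the deployment name is hypothetical):

    kubectl autoscale deployment myapp --min=2 --max=10 --cpu-percent=80

This creates a HorizontalPodAutoscaler that grows and shrinks the replica count against observed CPU.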

It's absolutely a huge learning curve, but eventually the complexity (mostly) goes away, and you're left with a reproducible method for deploying apps. So in the same way a rails/django developer might use an overpowered solution for their blog API, or a React developer may build a custom frontend when wordpress would also do... someone who's taken the time to familiarize themselves with K8s might find the familiarity and consistency of the interface enjoyable, even if it is clearly killing a fly with a sledgehammer.


I've used Kubernetes for a few years now, and there was a description that really resonated with me: You can think of k8s as an operating system where we can deploy applications, especially those that run more than a handful of services.

Said another way, if Linux (or whatever) is the OS for your server / VM / host level / network device, k8s is the OS for your cloud application.

And, when k8s is implemented properly, it takes a lot of headaches that can come from dealing with the myriad problems that arise when your architecture goes beyond a basic handful of "tiers."


Kubernetes does a lot of different things but one of the more important is resource management. You spin up a fleet of Kubernetes nodes which can have totally heterogeneous (differing) amounts of compute/memory/disk, and have the scheduler make intelligent decisions about where to stick containers based off your (optionally specified) list of resource requirements.

So it solves the bin-packing problem automatically rather than having to manually map out an efficient way to use your infra. It doesn't reduce the complexity per se - the interactions can get complicated if you're using stuff like node taints/tolerations instead of the more computationally simple "App X needs this much RAM but beyond that I don't care where it lives".
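
That "App X needs this much RAM" part is literally a small stanza on each container spec; a sketch with invented numbers:

    resources:
      requests:
        memory: "256Mi"   # what the scheduler bin-packs against
        cpu: "250m"
      limits:
        memory: "512Mi"   # the container gets OOM-killed above this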

If I find a bunch of Raspberry Pis in my basement and want to have them join my fleet, I just do that, and even if my fleet varies from 256-core CPU boxes with huge RAID arrays all the way down to tiny Raspberry Pis, the scheduling just works. Note here the broader pattern of abstracting away the physical hardware; it's a really important concept to grok.


> “It’s an orchestration platform for containers”

But that is what it means. K8s, ECS, even docker swarm are ways to orchestrate containers to do something useful.

Take your lb example. What happens when one of the containers you deployed or VMs you deployed to dies? How does it get restarted? Where does the lb send traffic?


I tell k8s: take this container, set up one or more pods with it, add some env variables and put a router on top, and restart the pods if they crash

If I add more physical hosts or scale the pods, k8s does everything for me like moving them around across the available resources


For me, using Kubernetes is all about the patterns that it provides and how it removes a certain class of problems for you.

Deployments, scaling, logging, etc are some of the patterns it provides and the consistency matters. How many of us have worked at companies where deploying two services has been completely different? One team runs the Jenkins pipeline while another team FTPs the files over. Now multiply that by several services and several tasks (logging, scaling, etc).

The benefit is in the patterns and standards it provides.


From the other replies here, it sounds like what kubernetes does for most people is what people used to do with bash scripts in the past: automatic configuration and deployment of sets of Phoenix servers (except in containers instead of vms these days) with infrastructure as code. The main difference sounds like you write declaratively and you get to use yaml. It seems to throw in monitoring and control too, which seems like a timesaver. Is that about right?


None. K8s is the best way to add an insane amount of complexity and yaml-based programming where you need neither. At my day job we spend hours on debugging k8s, and much of the time it is impossible to find out why something times out or fails. People who like k8s tell me they use it because of monitoring and deployments. I am not sure what they mean by it. It is very hard to monitor an application running on k8s, and deployment is usually solved with tools made for deployments.


My head kind of spins here too. Obviously all these folks have gone through the tech tree implementing the various parts and then ceded control to kubernetes when they found it could handle it.

Seems to me like it's helpful to do an oil change yourself before you take it to the dealer, then understand what they do before you take it to jiffy lube the next time (or vice versa). You keep abstracting it away until you just give your credit card. :)


It's a single control plane for the things you're currently managing separately. You mentioned a load balancer, several VMs, and maybe you have some scripts you use to deploy or to add VMs when needed. And, as you suspect, K8S is overkill for many situations.


> Can someone who is familiar with Kubernetes please explain in plain words why we use Kubernetes?

I use Kubernetes because it makes my application, its configuration, and its operations portable.

That's it.


Most of the replies are tackling Kubernetes from a technical perspective so for a people perspective, I can try to explain what might be a logical progression that I've seen.

A startup that grows to have hundreds of developers might transition from running managed VMs to "the cloud". One team sets up the network (virtual networks and some subnets).

As new employees join, they have no reason to interact with those teams who are effectively "hidden" so they deploy their stuff and perhaps wrangle with subnets and what not. Someone tells you that you need to attach subnet-a20w88vhuh4fuih to your resource and it will magically be accessible in the office.

Nothing is in charge of VM sizing so you've got people blowing hundreds or thousands on massive VMs when they're only using 10% of it and vice versa, teams whose application is choking but they don't really have a good mental model of say general purpose VMs vs memory optimised so they just bump up the SKU instead of being more efficient. This is happening everywhere as the company accelerates more and more.

It gets worse when you have a shared cluster, say for an entire team that is globally distributed, and the new intern application is doing some weird O(n^6) computations and absolutely blowing the side out of every other resource you've got.

Now at this point, it's effectively a communication/culture problem but Kubernetes can "fix" some of these issues in a sense.

Network for the most part becomes abstracted away and what you're left with is defining security (what ports and protocols should I expect) on an application level, rather than on a security group level. It's kinda neat because these rules are localised to your application whereas they might have been configured manually in a cloud portal or via some terraform config owned by some team in the shadows.

Each of your deployed applications become their own isolated units called pods. A pod could be one or more containers but it's effectively a standalone slice of an application (ie the web frontend while a redis instance might be another pod). There are bigger abstractions to group application pods together but that's beside the point.

These pods get deployed to a cluster (a bunch of VMs) and cough "orchestrated" but the value here is that your containers might be running right next to some containers for the business team or the machine learning team and you would never know. You don't need to know either. The value, as foreshadowed above, is that if you're being a noisy neighbour, your container will either get rebalanced somewhere else or just shut down for exceeding memory usage.

I'm a bit flaky on this point but since each node in a cluster is a massive VM, there's no need to worry about over- or underspending based on your computational use as well. You define the amount of memory you want to allow and you get matched to a relevant node based on how much capacity is available. As you gain more users, you just add more nodes. Before that, you might have been "reserving" say X thousand compute hours of certain VM SKUs or whatever. You might still do that, but you could feasibly just have whatever your node sizes pre-purchased, making capacity planning pretty straightforward.

Generally, there'll be some team whose purpose is to manage said cluster so in a funny way, it somewhat revives the whole dev/ops split in that your compute team generally know the nitty gritty of networking and what not while your developers just deploy an application and it "lives on Kubes".

I may have missed a bunch of stuff but hopefully this outlines some of the more "people" issues a bit? It's half and half useful, but it can also be used as a technical fix to a social issue.


now imagine you have 10,000 containers you want to run across 5000 VMs. Kubernetes solves that for you so you aren’t deploying each of those individually.


Is k8s enough for it, or do you want something like Helm as well?


I mean... I'm happy he's happy, but for instance, I sign up with a budget webhost, run rsync and my blog is deployed. Super low mental burden. There's no arguing that K8s for something like a personal blog is a little overkill.

Again, totally not hating on this, we all have our hobbies and I love that the author is super into this.


It is mind bending to see how much more complicated than "rsync" deploying a simple static site often gets.


I have ansible so I can install nginx, acme.sh and copy files. Then if my host explodes, it's the same single command to deploy to a fresh host as my existing host.


It's just as trivial to spin up a new cluster, and do a

    kubectl apply -f .


With a managed service sure. If you have an unmanaged host? I'm not sure stuff like microk8s or k3s claim to be production ready.


If you have an unmanaged host you just need to install docker and k3s (with a curl command). Two lines in a userdata/init script and you're ready to go.
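
Assuming cloud-init, those two lines are roughly this (both URLs are the projects' official convenience installers; pipe-to-sh at your own risk):

    #cloud-config
    runcmd:
      - curl -fsSL https://get.docker.com | sh   # install docker
      - curl -sfL https://get.k3s.io | sh -      # install k3s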

K3s claims to be production ready.


k3s is absolutely production ready. It's intended for IoT and edge devices, but if you want to use it in a data center, it'll work. What you lose is the failover of Kubernetes itself that you get from a proper HA etcd setup where the control plane for your cluster is itself also clustered, but you really don't need that if your entire application is a single node anyway.


k3s supports that. In fact, RKE2 is basically driven by k3s.


For a static site, why not just use s3?


I have the server anyway for a few non static items (acme DNS for getting LE certs for my internal hosts for example)


I just want to write these days so I have been using HEY world.

Simple as sending an email!


Can you tell me a bit more about this? How does it work? Do you have to subscribe to hey.com? What did you like about this approach? Is it a free service?

I’m asking because I too made an email->blog post service for myself and I want to see if this is productizable.


Yeah if you have a HEY.com account (paid), you can send an email to world@hey.com and it's published.


Thanks for the answer. Do you also use this feature to blog?


Yes, here's mine at https://world.hey.com/daedalus

Very simple. Although I still haven't found out how to embed links into text.


And I thought my script wrapping s3cmd was overcomplicating things.


I'm happy you're happy, but do you really need all that fancy file syncing/diff/resolution stuff? Wouldn't FTP have been enough? Vim over SSH?

Seems like overkill. :P


True. My personal blog is also just super easy to run. I use Jekyll for the blog and Neocities as a host, so all I need to do is:

- `jekyll build` to build `_site`

- `neocities push _site` to recursively upload modified files in _site


And on kubernetes you can

`docker build [...]` the image and then

`kubectl set image [...]` to update the image that kubernetes is using

Or, even better, you can just set up CI to do everything on commit/push.

For the last static site I deployed, I just tossed Caddy on k8s and set up the git module. I commit, push, hit f5, and my site is already there. There's a ton of ways you can use kubernetes, which is probably part of the adoption problem.


Along with this, a lot of people fail to take into consideration the flows that are NOT the happy path.

With k8s and a proper build system, you can roll back easily if you introduce a bug or something doesn't work. More importantly, if your site doesn't come up healthy when you deploy it, k8s won't even put the new pods into service.
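
That behavior falls out of health checks; a hedged sketch of the container-spec stanza (path and port assumed):

    readinessProbe:
      httpGet:
        path: /              # hypothetical health endpoint
        port: 80
      initialDelaySeconds: 5
      periodSeconds: 10

If the new pods never go Ready, the rollout stalls, the old pods keep serving, and `kubectl rollout undo` takes you back a revision.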

Everything sounds all fine and dandy if all you think about is the happy path.


I am unironically using Kubernetes for my personal home server. Thanks to https://k3s.io/ this is really easy to do, great fun and extremely useful.

I have a git repo containing all my helm charts & docker files, testing & deploying changes is absolutely trivial now. And it's great to have everything version controlled.

Previously I used to use Ansible, but you quickly run into issues which make you want containerization: Conflicting library/tool versions, packages that pollute too much of the system, port conflicts, hassle of keeping the playbook idempotent, etc.

So while docker-compose would also do fine, having kubernetes to manage the ingress' routing system is rather practical. And the same goes for the other bits and bobs of infrastructure it offers you if you're already using it. It's just very convenient.
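
To give a flavor, the routing side is one Ingress object; a sketch with hypothetical hostnames (the ports shown are those apps' usual defaults):

    apiVersion: networking.k8s.io/v1
    kind: Ingress
    metadata:
      name: home
    spec:
      rules:
        - host: plex.example.com          # hypothetical hostname
          http:
            paths:
              - path: /
                pathType: Prefix
                backend:
                  service:
                    name: plex
                    port:
                      number: 32400
        - host: hass.example.com          # hypothetical hostname
          http:
            paths:
              - path: /
                pathType: Prefix
                backend:
                  service:
                    name: home-assistant
                    port:
                      number: 8123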

I've been doing this for a few years now, and am now up to 14 different apps running on my single home machine in Kubernetes, ranging from Home Assistant to PostgreSQL to Plex.

Also it's just good experience. I also use Kubernetes for work, and this has made me noticeably more proficient.


This is something I see a lot of people ignoring about Kubernetes. There are two completely different sides to it: the developer experience, and the sysadmin/devops/sre/what-have-you experience. This post completely ignores that second part, which is arguably much harder. You can ignore large swaths of those topics by using managed K8s platforms, but the pricing for those means that running a blog like this might run you $50-$80/mo.


Yeah, I came here to call this out. Speaking from the perspective of having also unironically sought to use Kubernetes for personal projects, the options I found available were these:

- Expensive in money: Pay Amazon or Google on the order of $100 a month for reasonably reliable managed k8s, with the option always available of a goofed deployment increasing your bill by an order of magnitude

- Expensive in time and money: Set up and run your own k8s cluster, at a monthly VM hosting cost not too far from what you'd pay a managed k8s provider, and spend a ton of time dealing with 100% of the ops burden

It's a shame. I use Kubernetes at work and I like it a lot, because it does a great deal to make complex tasks simple. It would be really nice to have that same kind of fungible resource pool available for personal stuff too, and be able to just knock out a few lines of YAML and deploy a new project and have the orchestrator take it from there to running without any further effort on my part. But the economics just don't seem to be there.

(I did try DO's managed k8s service, which is considerably cheaper than the big players. It was also very new, and very flaky - not a knock on DO, it's reasonable that a new kind of service would have some teething problems. This was also a year ago, so I wouldn't be surprised if they've gotten it considerably more stable since then.)


I spend $6/mo on my single-node kubernetes vps and I'd say my ops burden is about zero. it takes about 20 minutes to set up k3s and boom, it's running. since all my projects & manifests are in git, it takes another 40 minutes tops to reapply them on a fresh node, & I don't feel particularly worried about maintaining this one.

fear is the mind killer.


That's really interesting - I was aware of k3s, but hadn't looked closely at it; perhaps I'll give it a try now. Thanks for the info!


Cloud Run is probably the closest thing to a managed k8s service. It's using https://knative.dev/ under the hood, and while you don't get such features as secrets management, for personal stuff it's not like you need to disseminate secrets to your coworkers every time they change, so it shouldn't be a big deal. Takes a lot of the overhead out of managing k8s, and it's not too expensive either.


A single node Kubernetes "cluster" on DigitalOcean is $10/month. Adding a loadbalancer is another $15, but you can avoid that by using a hostNetwork ingress (https://stackoverflow.com/a/60726977/91365)

Single node Kubernetes is pretty silly on the surface, but it is useful in some cases.
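
(For the curious, the hostNetwork trick in that link boils down to roughly two fields on the ingress controller's pod spec:)

    spec:
      hostNetwork: true                    # bind ports 80/443 directly on the node
      dnsPolicy: ClusterFirstWithHostNet   # keep cluster DNS working with host networking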


Nothing silly about it. Yes, with one node you don't have High Availability, but you get other benefits, like a standardized way to manage configuration.


What's the point of a load balancer if you only have one node? Am I dense or is the idea that you'd only use this if you started to have N numbers of nodes sitting behind the load balancer (e.g. $15 + N * $10)?


A Load Balancer is the standard way to set up an Ingress on Digital Ocean's Kubernetes. (Google's and Amazon's too). This makes sense for the vast majority of clusters, which will have more than one node. As usual, to do things the non-default way takes a little bit more work.


It can do things like handle SSL and traffic balancing.

So if you ever have more than one application, e.g. a search engine for that blog, then it's much easier to add. SSL in particular is often a real pain to set up and varies wildly in capability between applications.


For Digital Ocean, the ingress is handling SSL termination, not the load balancer. Ingresses can also do traffic balancing.


You can easily run something like K3s on a single node. Works fine and installing is very simple. I have felt no need for a managed platform for simple deployment.


That's exactly what I was thinking. Defining an ingress is simple, setting up an ingress controller less so. Defining a PVC is simple, setting up a storage class and its backing storage less so. Running your app on multiple nodes is simple, setting up these nodes and keeping them up to date and secure is a full time job.

Managed k8s or a simple installation in a home lab behind a NAT make it much more tolerable.


Setting up an Ingress controller is a one liner, no? The problem is setting up the network. MetalLB is nice, but you need a /24 basically.

Using klipper-lb (from k3s) would be great, but then you are basically tied to k3s.


Exactly what I was thinking. So you have a yaml file that makes it easy to deploy your blog with k8s and you don't understand why people would tell you that's silly. Meanwhile you're spending $50+/month for something you could run on a $2.50/month VM from Digital Ocean or Vultr. And oh by the way, there aren't a dozen moving parts that you don't fully understand that could create security vulnerabilities if misconfigured.


I'm on a free plan on Google cloud. I'm currently using docker-compose, but I'll probably switch to minikube when I get some free time.


The author seems to think the only real alternatives are docker or whatever.

So here's a way to host your personal blog if you don't want to over engineer it:

1. Have a git repo with your nginx/whatever config files.

2. Have a VPS running debian.

3.

    apt-get install nginx git ...
    git clone ...
    ln ... # create symbolic links to your nginx/whatever config in your git repo
    systemctl restart nginx ...
4. You're done. Create a cron job to automatically pull the latest changes from your git repo if you want.

The above steps should take most people around 10 minutes.

If you need to actually pivot into something that scales easier from there, I recommend following these steps/levels as your scale increases:

1. Create an automatic install script.

2. Use that script to create a .deb package instead of installing directly (optionally create a repository for this).

3. If you want to move to docker or what have you, it's trivial to install a debian package in a container.

But let's be real, you'll never even have to do any of the above because it's a personal blog, and it'll probably scale to the world population on a $5 VPS, especially if you slap Cloudflare in front.


These days I see no point in installing services on a VPS directly, other than docker + docker-compose. You could do it in one image with the reverse proxy + static files, or break it out into two images (this is helpful if you run more services on the VPS).

As for updates, caddy v1 used to support pulling in from git, but I don't think that got ported to v2. So what I do is build+push a docker image, and have a cron job on my vps to pull+restart my website's container.

My preferences go Traefik > Caddy > Nginx, but traefik definitely has a bit of a learning curve.

https://github.com/andrewzah/andrewzah-com-source/blob/maste...

https://github.com/andrewzah/andrewzah.com-docker/tree/maste...


You might be interested in:

https://github.com/containrrr/watchtower

It would automatically pull the new image and restart your containers.


Let's see the description for traefik:

The simplest, most comprehensive cloud-native stack to help enterprises manage their entire network across data centers, on-premises servers and public clouds all the way out to the edge.

Sounds ideal for a personal blog! /s


If you already know Traefik, it is much nicer to work with than a static proxy, as it automatically updates routes as you launch containers. Is it overkill for only one service (a small blog)? Yes.

In my particular case, I run ~5 services on that VPS (used to be ~13), and I run about ~30 containers on my home server for https://zah.rocks. Having to manually update caddy or nginx and restart every time I added a dns entry would be a huge pain.

In addition, Traefik's middlewares made it relatively simple to add in SSO with External Auth Server/Authelia + Keycloak/OpenLDAP/LDAP Account Manager (LAM).


How do you update nginx, or git, or python/php/ruby (if you're using one of those for your little app)?

What happens if you get a spike in usage and it kills your little server? What if you want to run on a bigger host/smaller host/your image is killed by ec2?

An automatic install script with a deb is _way_ more work and way less portable than writing:

    FROM nginx
    COPY . .

and

    docker build -t <appname>.azurecr.io/myapp .
    az webapp restart


> How do you update nginx, or git, or python/php/ruby (if you're using one of those for your little app)?

I suppose you mean automatically update? If so, this has been solved decades ago. Unattended upgrades is just another debian package you can install.

If you meant just update in general, that's a pretty silly question.

> An automatic install script with a deb is _way_ more work

You misunderstand. You can use the automatic install script to create the .deb - or a script based on it. Basically instead of on a live system, you place the same files in a tree you create the deb package from.

Even if you use docker, I still recommend learning about how debian packages work. Installing your app in a docker container using a .deb is strictly nicer than having all those steps in a Dockerfile. Plus this way you can specify your secondary dependencies declaratively.

> more work and way less portable than writing: FROM nginx COPY . .

Now that is just flat-out wrong. The steps you'd execute after "FROM nginx" are the same steps you'd execute after apt-get install nginx. It's not more work, it's the same or less - because you don't need to deal with docker on top of everything.

You have the added benefit that afterwards you can install the .deb in a debian-based docker container, but you don't have to. You're not reliant on docker.

Oh great example btw. Because the debian-based nginx docker container is essentially created with "apt-get install nginx" in its Dockerfile. Funny how you're trying to argue against the setup I proposed by citing a docker package that does things precisely that way.

> What happens if you get a spike in usage and it kills your little server?

Yeah that's not gonna happen to a static blog behind CF. Also at least that way you're always only paying $5, regardless of what happens.

The sort of traffic that kills a $5 VPS serving only a basic static blog can bankrupt you on an automatically scaling infra.


> If you meant just update in general, that's a pretty silly question.

I disagree. How do you update manually? If your answer is to ssh in and manually update, that's how you end up with a pile of different versions and no record of what they are. You've now got random packages at random versions. If you update automatically, that's not infallible either; e.g. here's [0] a post from 12 months ago about someone having an issue with nginx not restarting.

> You misunderstand. You can use the automatic install script to create the .deb

So now you have a deb package and an install script; that's more complex and more work than a Dockerfile (which is standard at this point). How do you update your deb package? With docker, you use the same commands above.

> You have the added benefit that afterwards you can install the .deb in a debian-based docker container, but you don't have to. You're not reliant on docker

You're reliant on debian, and the configuration of the OS underneath it, including the version of nginx, python, ruby, etc. With a docker image, that stuff is all pinned unless you engage with updating it.

> Yeah that's not gonna happen to a static blog behind CF. Also at least that way you're always only paying $5, regardless of what happens.

> The sort of traffic that kills a $5 VPS serving only a basic static blog can bankrupt you on an automatically scaling infra.

Nobody is talking about auto scaling infra here, except you. You can run docker on a $5 VPS, or on a $5 DO app platform. If your blog dies, e.g. due to aws performing maintenance[1], you need to bring it back online. If you migrate to a smaller host or larger host, you need to rebuild it.

[0] https://www.digitalocean.com/community/questions/problems-in...

[1] https://www.quora.com/How-often-do-EC2-instances-fail-Why


> I disagree. How do you update manually? If your answer is ssh in and manually update, that's how you end up with a pile of different versions, and no reference what they are?

On one server? What?

> With a docker image, that stuff is all pinned unless you engage with updating it.

Yeah now we're entering super silly territory. You can pin versions[1] in your debian package control file and/or on debian directly.

But in general automatic updates are preferable.

Unattended upgrades is often configured to only install security updates. Better to have stuff potentially going offline due to an update than running insecure versions of software.

What, do you subscribe to every single mailing list of every bit of software you're running in your docker containers, so you can manually update them when there's security issues with one?

You'd better have some automation here or your setup is strictly worse than what Linux distributions could do decades ago. Like not just worse, it's broken.

> e.g. here's [0] a post from 12 months ago about someone having an issue with nginx not restarting.

That's ubuntu, not debian. Very different approach to non-security updates.

[1]: Here's how the depends line looks in one of my debian packages' control file:

    Depends: python (<< 3.0.0), ffmpeg (>= 10), imagemagick (>=8), webp, clamav-daemon(>=0.98.0), cron, haproxy (>= 1.7.0), psmisc, ntp, file, certbot, firejail, exiftool


You still haven't answered my question; how do you update? Yes on one server. If you ssh into that server or poke around on it, how do you rebuild it if aws kills your instance, or if your data center catches fire?

Re: version pinning, that's cool - both methods work.

For updates, of course it's automated that's the whole point.

> That's ubuntu, not debian. Very different approach to non-security updates.

Actually this is a really interesting point; I don't want to be a sysadmin, I want to run my applications. I don't want to be aware of the platform differences. I actually didn't know that ubuntu and debian handled updates differently. They both expose the same interface to package management. Features like that are exactly why developers like me should be deploying docker containers to DO's app platform for $5 a month, and not running VPSes.


> If you ssh into that server or poke around on it

Just don't do any manual poking. You can't (really) do that with docker containers, and you shouldn't do it on a random server.

On a virgin server I generally do this to install my application, and I do it with a simple script (you could also use orchestration tools, but that's overkill for <10 servers) because I am even too lazy to enter five commands:

1. Add my repository

2. apt-get install my-app unattended-upgrades sshguard ...

3. config like setting the db server

4. reboot

That server will now automatically install security updates, but I need to manually tell it to install/update anything else (because I like it that way). My application, which sometimes is just a package that contains some haproxy/nginx config (if I run the server as a reverse proxy) is installed from my own repository, so I can update the way I'd update anything else on that server. You really shouldn't do anything on that server besides telling it to update (if you haven't set that to be fully automatic).

There's time-tested orchestration tools if you want to automate any of the above. For some applications I have my build server poke a server that then tells every other server to update the application.

> I don't want to be aware of the platform differences.

I can understand that position, to a degree. But even with docker you'll be using some OS, or package manager repos, (generally) in your container.

If you use the docker repositories however, you're pretty much at the mercy of the update practices of individual packages, whereas with systems like debian you know what you're getting - which for debian is stable, but sometimes not up-to-date packages. And if you build debian-based containers, you should still be aware of this.

In any case it's really worth it to understand how dpkg/apt works and how debian packages work. It's not complicated and can be done in a day or two. You're already using them anyways.

It may also help you better understand exactly what niche docker is filling and you won't have smartasses like myself tilting their heads at you when you list something a typical linux system does (and sometimes does better) as a value proposition for docker.

Basically docker replaces and improves upon what people used to do with VM images. That's what it's competing with in the sysadmin world. It's not directly competing with debian/apt/dpkg/whatever, even though it makes some of their features redundant. It may make working with the latter more forgiving, because you can fiddle with a Dockerfile until it creates something that works, but it doesn't replace them.


Automated updates are great until one is installed that is incompatible with what you actually want to run. Now it's 3 in the morning and you have to find an old version of some package to fix your service.

Obviously protection from this problem isn't something Kube alone provides but it's still a problem with the container-less setup you're describing. Flatpack could be an alternative.

While we're on security - thinking that simply installing your program via apt and running it in your VM is in any way comparable to containers in terms of the security benefit is simply wrong. In a container you get filesystem isolation and capability restrictions OOTB. With some program from apt you're going to have to deal with something like AppArmor and a variety of other tools to get something even comparable to a Kube pod with default configs.

> Basically docker replaces and improves upon what people used to do with VM images.

This trivializes the feature set docker actually has.


Kubernetes for running a blog (or multiple) is fantastic... if you are already using Kubernetes for other things.

If it's for just the blog, and the only goal is to run the blog, then it is (almost definitely) overkill.

If you already have a cluster up, or have a bunch of other projects already running on the cluster, then Kubernetes is likely the easiest way to run a blog.


The reverse is true as well though. If you learn Kubernetes and set up a cluster to host a blog, you'll find that it is now trivial to deploy a hundred other things to it.


For sure.

> and the only goal is to run the blog

If part of the goal of "run my blog on Kubernetes" is "I want to use this as a learning experience to run other things on Kubernetes afterwards", it's definitely a worthwhile exercise.

> you'll find that it is now trivial to deploy a hundred other things to it.

I mean, sort of. Starting with a blog is a good introduction and gives an on-ramp to running other services, but I wouldn't say it is sufficient experience to call running hundreds of other services "trivial".


I mean it's one Kubernetes cluster. What could it cost? 300 dollars?


It depends on the size of the cluster. If it's just a little blog and you just want it to be as cheap as possible, but still using K8s just for fun, it will cost you around $4/month on GCP (1 f1-micro node).


> 1 f1-micro node

That only has 600MB of memory and the minimum memory requirements for a master node is 2GB[1].

[1] https://docs.kublr.com/installation/hardware-recommendation/


On GKE, the master nodes are provisioned behind-the-scenes. You get one free zonal master.

The f1-micro would be the worker node. It's still a bit of a squeeze, because GKE has system workloads that need to run on user nodes.


Master nodes are not billed by GKE, only worker ones. But a micro node is not going to work - all its memory will be eaten by k8s services.


A minimal single host cluster runs fine for me on a $10/mo DO droplet with just 4GB RAM. It would probably work on the $5 droplet too if you weren't running much more than a blog.


Or 30 bananas


I also unironically use kubernetes (in particular, k3s) to run my personal blog. I run it on a colocated server I built myself, with the hope of having a couple colocated servers in the future I could network together. Right now, it has Proxmox running a few VMs, 3 of which are for k3s, one for HDD backups, and one for a postgres server (which, since I'm lazy, I run with Dokku - makes backups easy). The drives all run ZFS mirrors (2x 2TB NVMe, 2x 8TB HDD).

Honestly, it's a super comfy setup, and very little maintenance. The one thing I have is a master update script that does the docker image upgrading and rollouts. I make updates all the time and pretty much don't think about it. More activation energy than a PaaS like Dokku, but worth it in the long run, I think.


Do you have a blog post showing how you set things up?


YAML is the biggest flaw of Kubernetes, so I'm quite excited that cdk8s is progressing very nicely https://github.com/cdk8s-team/cdk8s

There are other solutions that are potentially better (e.g. Dhall) but cdk8s seems to have momentum and sense for tackling the practical stuff (integrates easily with cdk, library with simplified constructs cdk8s-plus, import and convert existing stuff easily etc)


I do all of my k8s work through Pulumi using Typescript, and it is an absolute joy. And it's not limited to k8s, which is great.



k8s also has a first-party typescript API.


Well I wish he expanded a bit more on the “Just do X, that’s so much simpler” section.

He's on AWS, he could have gotten all those benefits with Elastic Beanstalk or even ECS. Plus no yaml files but actual IaC (Terraform etc.) and much better integrations with the other AWS services.

I'd personally still use EC2 for a blog, but if you're looking for convenience/battery-included type of thing...yes I'd argue you're better off just doing X instead.


Yeah... It's like software hipsterism: whatever is weird/hard is "cool," but the moment it becomes mainstream suddenly a new technology/language needs to be made instead. 2 years from now everybody and their grandma will understand kubernetes or have a simple GUI for it, and the hipsters will move on to something new and exclusive.


Another thing worth calling out, there's nothing wrong with over-engineering solutions in pursuit of knowledge.

Especially when it's just a personal project.


Okay, but when Kubernetes breaks it's usually non-trivial to understand why. I've spent more time than I'd like diagnosing Kubernetes issues with networking and orchestration which could have been spent on building features.


50 lines of configuration for a blog is not what I'd personally call "simple". Granted, GUIs (e.g. the AWS console as mentioned in the blog) are not as easy as a config file, but the comparison should be against other blog deployment programs, like hugo/jekyll/ghost, not... the AWS console.


That includes 2 load-balanced instances of his blog + domain + automatic HTTPS.


What clicks for me about Kubernetes is not the orchestration or zero downtime or scaling... I can just use Kubernetes on 1 node for all of my services.

What matters is the declarative approach of Kubernetes. It's like the first time I learnt about React: declarative rendering/deployment based on state!


Here's the thing that gets me: Docker Compose and Docker Swarm can offer declarative infrastructure and only take up under 50MB of resident memory.

I need 2GB minimum for a K8s master node, 2GB minimum for monitoring, and then 700MB minimum for each worker node[1].

I just can't justify running K8s for a blog when there are less expensive and complex offerings out there.

[1] https://docs.kublr.com/installation/hardware-recommendation/


As a big K8s fan, I have no idea why it's so fat. I'm sure there's a reason, but I really don't know where the bloat comes from.

I just run K3s when I'm not using a managed cluster, and it works great and is lightweight. I've never noticed whatever is missing from it.


Go is the new Java it seems :)


Try K3s. I run it on 3x of OVH's cheapest VPS ($3.50/mo) and works great. I have Postgres, Hasura, Keycloak, Riemann, Riemann dashboard and Digdag running- the cluster has been up over a year with no issues.


Can I use multiple machines as nodes for K3s? I think the last time I looked at it, clusters were limited to one machine, but that was a long time ago.


Yes, you can easily join multi master nodes or worker nodes with k3s.


Awesome, thanks.


Didn't Ansible/Puppet/Salt/Chef already do this?

I don't even know what Kubernetes is (because I refuse to look it up) but somehow all these people have heard of it and apparently not any of the previous simpler solutions.


Not quite. With Kubernetes, you have only one choice: declarative YAML.

With Ansible, you can easily shoot yourself in the foot with imperative shell command.

The point is, Kubernetes is very opinionated (for good reason) in ways that make it hard for you to break your system.


Puppet does 80% of what Kubernetes does, but you have to manually distribute your services across your hosts, and if you want failover or autoscaling then you have to implement them yourself.

Imagine Puppet but instead of saying "run service X and Y on host A, run service Z and W and the hot spare for U on host B, ...", you just say "deploy at least 3 instances of service X with this much CPU/memory, deploy at least 2 instances of service Y with at least this memory, these are the hosts you have available, you figure it out".
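
That last sentence translates almost literally into a Deployment; a hedged sketch (names, image, and numbers invented):

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: service-x                # hypothetical name
    spec:
      replicas: 3                    # "at least 3 instances of service X"
      selector:
        matchLabels:
          app: service-x
      template:
        metadata:
          labels:
            app: service-x
        spec:
          containers:
            - name: service-x
              image: example/service-x:1.0   # hypothetical image
              resources:
                requests:
                  cpu: 500m          # "this much CPU/memory"
                  memory: 1Gi

The scheduler then does the "you figure it out" part.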


My 3 droplets run my Rust, Quake, gitea, registry, wireguard & pihole combo, traefik, grafana + prometheus with hashicorp nomad & consul. So yeah running an orchestrator for trivial tasks is actually fun, this is a good exercise and just in general a fun thing to do.


I run GKE for some small apps. I also use AWS S3 hosting for my personal blog. The cost differences are... non trivial to the point of a bad joke, if we were comparing ability to reliably ship plaintext over the wire. But I'm not. I host a database and webapps on the k8s cluster, without adding extra EC2 nodes, RDS costs, or wrestling with AWS Lambda limitations.

I can also confidently say that having something approximating a stable web app demands doing a lot of serious thinking, and "a single server running Apache on Digital Ocean" does not cover that case sufficiently. You need to tolerate failure, failover, load balancing, bin-packing, etc. I used to run a small autoscaling group on EC2 for my own systems; the dang thing would fail to come up on one node very frequently and so a number of the queries would fail. I eventually burnt it to the ground and redid it. I've never had that hassle in k8s. It's designed to succeed, in a way the "box of parts" approach doesn't.

Boxes of parts are useful. For a complexity-sensitive & thoughtful infrastructure engineer, having something like the old Synapse/Nerve[1] system with your apps distributed across some 5-20 machines with a monitor lease to spawn new ones on failure would probably approximate Kubernetes for a few years, until you have to do something fancypants. You've still reimplemented part of Kubernetes, though... The other angle is, boxes of parts can go in wildly weird directions.... if you need it.

Looking at some infrastructure these days professionally, the question is - when do we move to Kubernetes. It's not interesting or useful to the company to be maintaining our own thing or own strange path. The only questions are around the path - how much rework needs to happen and how much building in k8s needs to happen to get there.

GKE is a very good starting point for k8s. Strong recommend.

n.b. With respect to the cost: I consider this a professional investment / professional development expense. Spending $100-$200/month out of a software engineer's salary is a reasonable price for being able to readily say I have experience in a current topic. Also I can run my own apps. :)

[1] https://github.com/airbnb/nerve


> For example, the entire deployment configuration for this blog is contained in this yml file

Isn't that not quite correct?

If I'm reading this correctly (which I may very well not be), isn't this a reference to a Docker image:

    image: marcusbuffett/blog:latest
Which then might have lots of other complexity contained within. I wouldn't call that self-contained, I think that's overselling it.


I wholeheartedly agree with everything said in this blog post. The real downside to using k8s for a personal blog or hobby project is that it's so damn expensive. I did try rolling my own k8s using k3s and some Raspberry Pis, but that quickly became annoying to maintain and to get up and running in the first place.


I had a bad experience with k8s. The learning curve is steep. A few days ago k8s decided to run a cron job just after the deploy (it was scheduled for Sunday). Why? It could be a bug (https://github.com/kubernetes/kubernetes/issues/63371). I'm not sure how to even debug this.

> A cron job creates a job object about once per execution time of its schedule. We say "about" because there are certain circumstances where two jobs might be created, or no job might be created. We attempt to make these rare, but do not completely prevent them. Therefore, jobs should be idempotent
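
There are at least knobs to make the double-run case rarer; a hedged sketch (name, schedule, and image invented; older clusters need batch/v1beta1):

    apiVersion: batch/v1             # batch/v1beta1 on older clusters
    kind: CronJob
    metadata:
      name: weekly-job               # hypothetical name
    spec:
      schedule: "0 3 * * 0"          # Sundays at 03:00
      concurrencyPolicy: Forbid      # never run two jobs at once
      startingDeadlineSeconds: 300   # if a run is missed by >5 min, skip it entirely
      jobTemplate:
        spec:
          template:
            spec:
              restartPolicy: Never
              containers:
                - name: job
                  image: example/job:1.0   # hypothetical image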

Is there a way to make a cron job idempotent?


I don't really blog anymore but I do run a cluster in my basement for personal projects, so I totally get the appeal. I use Docker Swarm, not k8s, which I personally felt was way simpler to set up, but there's something kind of cool that the system I built is truly scalable to thousands of nodes if I really felt like it.

Out of curiosity, what do most people here who have done such a thing (with k8s or docker swarm or otherwise) use for storage? I tried briefly to use GlusterFS but had really bad problems with performance (most likely because I set it up wrong). Right now, I just have a ZFS RAID on one of the nodes, and an NFS share on that.


I've had a great experience with https://rook.io/ managing Ceph for Kubernetes in production.


I either use Longhorn (https://longhorn.io/) or TopoLVM (https://github.com/topolvm/topolvm), depending on what I need.

Rook (https://rook.io/) is also cool.


> I think Kubernetes has fallen into the Vim/Haskell trap

There was a period where Git was at this point as well, but it seems as though everyone's got over it and decided the learning curve is always worth it.


> I’m not exposed to that complexity

Some of us think editing YAML files is complex enough.


The problem I have with Kubernetes is the Docker part. I get why people who are running Python need to use Docker for all their deployments (because anything is better than Python package management), but I've got well-behaved programs in a language that has decent package management, so I just want to run a bunch of them on different hosts.

Is there an alternative that lets me orchestrate deploying a bunch of processes on a bunch of servers without having to interpose Docker in between? Anyone using Nomad?


The real competition in terms of price and convenience is a VPS.

So to approximate this with Kubernetes you have to avoid the big cloud providers and things like kubeadm, as that would mean at least two cores and the resulting price tag (even if you use things like burstable instances). Using k3s is a nice option. You must also avoid a load balancer, and often static IPs.

I wonder why there aren't vendors selling k8s namespaces directly with a quota (shared hosting, reinvented in k8s).


I just connect Netlify to my repo, and boom, blog is deployed.


Let's see what happens in a few years.

There are a bunch of blogs/content out there that have been abandoned for a long while, but you can still read the information that was put there in the past. The simpler the system serving the pages is (down to static HTML files), the more likely it is that the website stays up and functional.

A blog that is built on Kubernetes sounds like it will be dropped pretty quickly once interest in writing for it wanes.


Static HTML pages might be the only safe way to keep a blog running; my experience running Wordpress is if you leave it up long enough someone will drive by, hack it, and Dreamhost will block your site.


On the topic of "takes too long to learn" can someone recommend a good tutorial on basic Kubernetes concepts?


The trick is that to have a real chance of administering a k8s cluster you need Linux kernel knowledge (namespaces to understand docker, a bit of iptables, VXLAN or something similar to understand CNI/overlay network, how the container network namespace is connected via veth pipe, chroot/bind mounts, a very little bit of SElinux/apparmor/seccomp), systemd to see how it all starts, etcd to know about the consistency/consensus layer.. so far this is almost identical for all container orchestrator/scheduler platforms (nomad, docker swarm).

Then there are the specific k8s concepts: pod (containers that share a network and PID namespace), service (a name, a TCP and/or UDP port number, and the name/selector of the target pod), ingress (HTTP routing, basically haproxy/nginx/envoy or other layer 7 stuff), LoadBalancer (the magic API that connects your cluster to the outside world, allocates IP addresses for Services/Ingresses).
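
To make the pod concept concrete, a minimal sketch (name and images invented):

    apiVersion: v1
    kind: Pod
    metadata:
      name: demo                     # hypothetical name
    spec:
      containers:
        - name: app
          image: example/app:1.0     # hypothetical image
        - name: sidecar
          image: example/sidecar:1.0 # shares the network namespace; can reach app on localhost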

Then there are the background parts that take the YAML and convert it into actual running stuff: apiserver (this is the central info hub, kubectl and kubelets and controllers all talk to this basically), kubelet is the actual agent on each node, controllers have control loops and those actually calculate the necessary actions/operations to achieve what the YAML declared.

Then there are the subsystems: storage (persistent volume claims and PVs), settings (secrets - see vault too - and configmaps, and various special/custom resources like a TLS certificate, these are stored in etcd, through the apiserver of course), CNI networking flavors and knobs, an endless list of kubelet and apiserver command line arguments, PCI passthrough for GPUs or HSMs or whatever. And of course there's RBAC, security, admission webhooks, see how nowadays Authz and Authn is done through a "lookaside" OpenId Connect service (some Ingress Controllers support these out of the box).


I think OP put this just as good or better than I could've in their blog post.

> I think Kubernetes has fallen into the Vim/Haskell trap. People will try it for 10 minutes to an hour then get fed up. The point where you start to grok stuff just happens too late for most people to stick with it. Those become a vocal minority proclaiming it as too complex for humans to understand, and scare off people that haven’t tried it at all.

I've had good luck with the tutorials and various help articles in the official docs[0].

[0]:https://kubernetes.io/docs/tutorials/kubernetes-basics/


Kubernetes The Hard Way if you want to start from first principles and learn all about low level tools and setup. In practice you're probably going to use a cloud provider or tool to automate all of this stuff, but it's useful to understand the core infrastructure like TLS certs, etcd, etc.

Otherwise the official docs are quite good and worth starting there. I'd be wary of buying books that are more than a year or two old--the k8s space moves fast and they regularly have major updates once or twice a year. Books and docs can get a bit stale or out of date with current best practices.


I currently have a k3d cluster running locally with an entire product stack on it. It's nice to think that I can simply move that over to a full cluster at any time. There isn't even a change in tooling, as the kubernetes context is the only change needed.


Next step: unironically creating a Kubernetes Operator for your personal blog https://github.com/dexhorthy/captains-log


I might be looking in the wrong places, but what I'm really missing is just an as-a-service offering where you deploy your pods and pay just for the CPU, memory, disk and ingress. No managing a cluster, no baseline cost, just usage.


GKE AutoPilot gives that. There's a cluster management baseline fee, but the first one is rebated.


Azure Container instances kinda do this, but you have to orchestrate scaling yourself

Azure App Services also fill this niche, but only works for web-based services. If you have a job queue and a worker reading jobs off the queue but otherwise not exposing a web endpoint, it doesn't seem to work quite as well and my (limited) attempts at faking an endpoint haven't appeared to pan out :d


You mean like AWS Lambda, but using your own pods?


Often, when people discuss how easy or hard Kubernetes is, they mean running it on a managed platform like GKE.

Running it on your own metal is supposedly what requires that team of engineers.


Setting up and running Kubernetes binaries across many VMs is not hard. The hard part is managing all of the addons, extras, and post setup tasks (roles, quotas, etc) to make (and keep) it production worthy. Operational experience is still rare


A side note, how do you use Kubernetes to set up HTTPS on wildcard DNS? Is there a service for autoconfiguring the DNS and getting the certificate without a forward proxy?


> Is there a service for autoconfiguring the DNS

There is! I use external-dns. [1]

I haven't actually set up a Let's Encrypt wildcard cert, but I'm pretty certain cert-manager [2] supports it. I don't think you need a proxy if you use the DNS01 challenges.

[1] https://github.com/kubernetes-sigs/external-dns/

[2] https://cert-manager.io/docs/


cert-manager supports the Lets Encrypt DNS challenge, which supports wildcard certs: https://cert-manager.io/docs/ It's a little bit of one time setup and then easy-peezy SSL automatically for all your services.

Do note that you need a DNS provider with a good API to programmatically control it (the DNS challenge requires setting a special txt record that Lets Encrypt verifies). Read the docs to learn more, but most of the big ones (AWS Route 53, Cloudflare, etc.) should be fine.
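
For flavor, a hedged sketch of what the issuer side can look like with Cloudflare (the email and secret names are placeholders):

    apiVersion: cert-manager.io/v1
    kind: ClusterIssuer
    metadata:
      name: letsencrypt-dns
    spec:
      acme:
        server: https://acme-v02.api.letsencrypt.org/directory
        email: you@example.com               # placeholder email
        privateKeySecretRef:
          name: letsencrypt-dns-key
        solvers:
          - dns01:
              cloudflare:
                apiTokenSecretRef:
                  name: cloudflare-api-token # placeholder Secret
                  key: api-token

A Certificate resource listing `*.example.com` in its dnsNames then gets solved through that issuer.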


> After spending a couple hours learning the key concepts through the official tutorial (...)

That, and the ongoing complexity you're buying into, is exactly the problem.


A few lines of Kubernetes configuration ... and a custom Docker image, which might tilt the scales of complexity in favor of starting from a stock installation of whatever server stack serves the blog and adding content, configuration files, SQL scripts etc. more frugally and explicitly.

In particular, how is the blog updated? I don't see how replacing and/or backing up a Docker image could be more convenient than copying files through an SFTP client and using other kinds of real clients (administrative web interfaces, SSH, etc.).


Am I the only one still using Docker Compose?


No, but I'm looking at what to do to streamline deploys and provisioning across tens of VMs, each running docker-compose.

For one or two services on a machine, it's pretty good. Past that... not so much.


I run docker-compose inside supervisord. Surprisingly it works great, they play well together and I highly recommend it for managing multiple compose projects per machine.

    [program:some-docker-project]
    command=docker-compose up
    directory=/opt/some-docker-project
    ...

    [program:some-other-docker-project]
    command=docker-compose up
    directory=/opt/some-other-docker-project
    ...
This style of setup has served many millions of visitors across dozens of projects in production at our company over the last ~3 years. It's like a poor man's kubernetes: it handles autorestarting failed projects, we handle ingress with Cloudflare Argo tunnels in docker, and deployments are easy (just git pull && supervisorctl restart all).


I'm more at the level of one docker-compose per VM, and I've got it loading with an OS tool (a systemd service or similar). But there are at least a few problems with that:

1) Want CI where a push to GitHub builds the image and deploys.

2) Bin-packing time. We're at the point where that's an issue.

3) Host machines -- there's too much on them and we need to standardize and streamline.


Been running my blog on Kubernetes since 2017[0] and it's been great. Kubernetes is a force multiplier and the natural evolution in deployment methodology. It could be tighter/cleaner but it's extremely good for what it provides, but you have to know why you might need the things it's providing for it to seem worth it.

If you'd like to try kubernetes and get to understand it and feel comfortable with it I'd recommend:

- Work through kubernetes the hard way[1] (ignore/replace the GCP-specific things; if you don't know enough about what the non-GCP corollary to a GCP thing is, that's a bit of knowledge you need to fill)

- Set up your own cluster from scratch (no kubeadm, no alternate distros, etc), use the simplest options you can find at first (ex. flannel for CNI)

- Set up ingress (NGINX ingress is a good place to start)

- Run some simple unsecured but diverse workloads (static sites on NGINX, Wordpress, etc), figure out why you might pick a StatefulSet versus DaemonSet. At this point, use the hostPath/local volumes just to avoid trying to grok volume complexity.

- Install a useful cluster tool ("addon") like cert-manager[2] from scratch so you can see Kubernetes manage something you'd normally solve with systemd timers and `certbot` (or your reverse proxy would do for you if you're running caddy or traefik)

- Start putting your YAML in source control and get familiar with either kustomize, helm, or both (you could also just use Make + envsubst like I did for a while[3]). It's at this point that it should click that all you need to get back to a certain state of your cluster is to get a machine, do basic hardening (ufw, etc), install kubernetes, and run "make" in this repo (excluding things like DNS entries, etc). Now things are probably getting fun, because you can have the distant cousin of immutable infrastructure: repeatable infrastructure.

- Tear down your cluster, set it up again with kubeadm (note that kubeadm actually has a file-driven configuration option[3]), run your yaml from source control and confirm that all the workloads you had in place are back up and secured.

- (optional) Tear down your cluster, try rebuilding it with k3s[4] or k0s[5]

- Start looking around and seeing what your options are and the ecosystem that exists -- digging deeper into the interfaces that make Kubernetes tick, for example volume management (Container Storage Interface) by deploying Rook[6] or OpenEBS[7].

Note that one of the best things that can happen to you during this process is something going wrong. Every time something goes wrong, you go back to fundamentals to fix it. Being able to reason about kubernetes ad hoc (ex. if DNS is correct but you can't reach port 80/443, what should you check first?) is the key to feeling comfortable with it, and while failures and downtime will happen, you shouldn't feel complete despair/confusion. Normally, once things are stable, kubernetes is quite worry-free.

I've made a guide like this before, I'll see if I can find it.

[EDIT] Found only one post[8]

[0]: https://vadosware.io/post/fresh-dedicated-server-to-single-n...

[1]: https://github.com/kelseyhightower/kubernetes-the-hard-way

[2]: https://github.com/jetstack/cert-manager

[3]: https://www.vadosware.io/post/using-makefiles-and-envsubst-a...

[4]: https://k3s.io/

[5]: https://docs.k0sproject.io

[6]: https://rook.io/docs

[7]: https://docs.openebs.io/

[8]: https://news.ycombinator.com/item?id=22117684


One more thing -- Probably the first thing you should do (before doing kubernetes the hard way) is read the docs[0] cover to cover! They're actually pretty reasonable, and will prep you for all the concepts and discrete pieces you'll need to deal with!

[0]: https://kubernetes.io/docs


thanks for this i've been wondering about kubernetes for a while now purely out of curiosity


Yeah it's actually a good bit of fun! One thing I should have included is that it's a great idea to take a skim through the docs first -- that's actually the first thing I did, basically read the docs nearly "cover to cover".


With some IaC tools this would be done with 10 lines


[Deleted]


Sure, pay some other people to make your own worries go away. This post is clearly not for you.



