I'm surprised to see this k8s consensus on HN. Kubernetes has a very steep learning curve. I spent a week over Xmas 2019 just bringing up and tearing down clusters until I got something functional. The documentation is not particularly friendly, so there was a lot of trial and error. But eventually it clicked. Then a couple of months later we decided to invest by slowly moving non-production services over, and then ultimately moving everything over.

We have not had any major k8s incidents in the past 12+ months of running our own HA cluster. We even run HA Postgres using local volumes. We've pretty much moved everything over and couldn't be happier.

We have red-teamed disaster scenarios: we can bring the full cluster up on another cloud provider using vanilla k8s in about an hour (including a restore from Postgres S3 backups + WALs). And this is all with a very small team.