FWIW I worked with Borg for 8 years on many applications (and at Google for over a decade), so this isn't coming from nowhere. The author of the post I quoted worked with it even more: https://news.ycombinator.com/item?id=25243159
I was never an SRE, but I have written and deployed code to every data center at Google, as well as helping dozens of people like data scientists and machine learning researchers use it, etc. It's hard to use.
I gave this post a modest title since I'm not doing anything about this right now, but I'm glad @genericlemon24 gave it some more visibility :)
This article really resonated with me. We are starting to run into container orchestration problems, but I really don’t like what I read about K8s. Apart from anything else, it seems designed for much bigger problems than mine, and requires the kind of huge mental effort to understand which, ironically, will make it harder for my business to grow.
I’m curious if you’ve taken a look at Nomad and the other HashiCorp tools? They appear focussed and compositional, as you say, and this is why we are probably going to adopt them instead of K8s - they seem to be in a strong position to replace the core of K8s with something simpler.
Thanks. We're going to start small with just nomad, then vault, and as our needs grow we will probably adopt consul (we already use terraform so hopefully not a huge stretch) and maybe boundary.
This is the thing I like about the HashiCorp tools. You don't have to eat the whole cake in a single sitting.
There are some good ansible playbooks on GitHub for nomad, consul and vault. I personally don't use vault because it's overkill for the product I'm working on at the moment.
To avoid the pain of managing a CA and passing out certificates for TLS between services, I use a wireguard mesh and bind nomad, consul and vault to these wg interfaces. This includes all the chatter of these components, as well as the services I deploy with nomad. It's configured such that any job can join the "private" wireguard network or "public" internet gateway.
It takes a few days to set up, but it's very easy to manage.
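To give a rough idea of what the binding looks like, here's a sketch of the relevant agent config snippets, assuming a wg0 mesh where this host is 10.0.0.1 (the address and file names are just placeholders):

    # consul.hcl -- keep Consul's gossip/RPC on the WireGuard address
    bind_addr   = "10.0.0.1"
    client_addr = "127.0.0.1"   # HTTP/DNS API stays local

    # nomad.hcl -- same idea for Nomad's RPC/serf traffic
    bind_addr = "10.0.0.1"

    # vault.hcl -- Vault listens only on the mesh; WireGuard already encrypts the hop
    listener "tcp" {
      address     = "10.0.0.1:8200"
      tls_disable = true
    }

With that in place the components only ever talk to each other over the mesh, which is what lets you skip running an internal CA.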
>You will need to scratch your head a little bit to setup consul + nomad + vault + a load balancer correctly.
I've been wondering, would it make sense to try to package all that into a single, hopefully simple and easily configurable, Linux image? And if it might be, why hasn't anyone done that yet?
I've only looked at the HashiCorp tools, not really used them. My understanding is they originated in a VM-based world (?), and I've worked almost exclusively with containers. I'm sure that has changed over time.
I will say that I looked at HCL and it looks very nice:
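Here's roughly the shape of it, using a made-up minimal Nomad job just to show the syntax:

    job "web" {
      datacenters = ["dc1"]

      group "app" {
        count = 2

        task "server" {
          driver = "docker"

          config {
            image = "nginx:1.25"
          }

          resources {
            cpu    = 200   # MHz
            memory = 128   # MB
          }
        }
      }
    }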
But somehow it's not as popular as a mess of YAML and Go Templates? That genuinely leaves me scratching my head. I guess it's because people pick platforms and not languages? (BTW, in 2009 I designed and implemented the template language that Go templates are based on, and I find their common application pretty bizarre, e.g. in some Helm charts I looked at from this thread)
Oil is growing a config dialect that looks a lot like HCL (although it's convergent evolution; I've never used it.) I think there is a lot of room for mixing declarative and imperative; as far as I can see HCL is mostly declarative (defining data structures).
Anyway I'd be interested in reading about HashiCorp stuff but for some reason in my neck of the woods I don't hear too much about it. Maybe that's because they're paid services and the open source Kubernetes seems attractive by comparison? Or is it more of a VM vs. container thing?
All of the Hashicorp products are primarily open source products. While there are enterprise features and cloud-hosted versions of some of them, FOSS is the foundation of the company.
10 years ago Docker didn't exist (it was released in 2013), and AWS was a tiny side player, with most established businesses operating their own data centers.
I think it's safe to say that if the next 10 years are anywhere near as disruptive as the last 10 we will surely be doing a lot of things very differently.
Things have already changed since the first release of Kubernetes. Specifically, hosted Kubernetes (GKE/EKS/AKS) is a marked step forward from running Kubernetes yourself, one that I think doesn't get enough recognition. We'll see what the future holds, but my prediction is more layers of indirection: the future of running web services is on AWS Lambda/Azure Functions/Google Cloud Functions and other fully-managed PaaS, like Heroku, with more vendor agnosticism. Running Kubernetes, in addition to the technical benefits, enables a company to treat AWS/GCP/Azure as a commodity and credibly threaten to move clouds when the contract is up for renewal.
Back in 2003 we had Solaris Zones (now called Solaris containers).
Same concept as Docker, but we didn't know exactly why it was such a good idea, and the hardware was expensive.
What made Docker take off was being able to use commodity hardware and push to production with the same exact environment and behavior.
You could have done the same with Solaris, if you developed on a Sun Ultra 5 workstation and deployed the application in a zone on the server. But 2003 was a different world and not everyone had a SPARC box nearby to develop on.
IMO the HashiCorp stack with Nomad provides a better development experience. The complexity is gradual and it doesn't try to do "everything": it stays focused on workload orchestration (whether that's a container, a VM, or even a plain process) and delegates coordination to specific services better suited for it (Consul for service discovery, Vault for secrets, etc.), as in the sketch below.
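As a sketch of that delegation (names and secret paths are hypothetical), a single job spec can register itself with Consul and pull secrets from Vault, while Nomad itself only does the scheduling:

    job "api" {
      datacenters = ["dc1"]

      group "api" {
        network {
          port "http" {}
        }

        task "api" {
          driver = "docker"

          config {
            image = "example/api:1.0"
            ports = ["http"]
          }

          # Registered in Consul, so other jobs can discover it by name
          service {
            name = "api"
            port = "http"
          }

          # Nomad fetches a Vault token scoped to this policy for the task
          vault {
            policies = ["api"]
          }

          # Rendered from Vault at startup and exposed to the task as env vars
          template {
            data        = "DB_PASSWORD={{ with secret \"secret/data/api\" }}{{ .Data.data.password }}{{ end }}"
            destination = "secrets/app.env"
            env         = true
          }
        }
      }
    }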
It's claiming that there's something better that hasn't been discovered yet, probably 10 years in the future.
I will be really surprised if anyone really thinks that Kubernetes and even AWS is going to be the state of the art in 2031.
(Good recent blog post and line of research I like about compositional cloud programming, from a totally different angle: https://medium.com/riselab/the-state-of-the-serverless-art-7...)