Unhygienic string templating of a whitespace-sensitive language is a truly new hell for those who have not experienced it before. Quite an astonishing decision.
I remember being blown away by this. Unhygienic templating and the lack of template and serialization boundaries have been such a consistent disaster in our field for so many decades that it is hard to believe we are still dealing with this kind of design error in relatively new technology.
My feeling is we will probably all look back at the over-use of text based DSLs and configuration languages as a giant mistake. Not just with respect to K8s, but IaC, CI/CD config and the rest of the DevOps YAML/config language mess. In retrospect, it has been a case of simplistic instead of simple. "Declarative" config languages are hello world optimized, and what looked great at the beginning of the S-curve is starting to look pretty damn bad.
You can with Pulumi, for example. I've only used it for provisioning a Digital Ocean droplet before, not for Kubernetes, but it was relatively painless. (Although I do find approximately one new panic in Pulumi every time I upgrade to a new version.)
Indeed, as someone who maintains a helm package it's mind-boggling. When I've been able to build k8s tooling from scratch, I've been reasonably happy with jsonnet [0], which is a constrained programming language designed for configuration. It has the property that it will always produce valid JSON.
Yeah, there used to be ksonnet [0], but it didn't take off. I think the k8s world wasn't ready for the complexity, and wanted something simpler. However, it's definitely more powerful, could be a future solution.
Seconded on jsonnet. I’ve started using it to generate json (that usually goes to an API) in gitlab ci pipelines. I use it to merge “gold standard” boilerplate configs with user changes.
JS and TS aren't configuration languages. If it wasn't for JS's early dominance and too-big-to-fail status on the web, we probably wouldn't use it for programming much. Platform and operations engineers often don't know JS well, favoring bash, python, and others for programming. Some exceptions exist, like full stack node shops.
Configuration languages like yaml, HCL, etc. are more reasonable alternatives.
yaml is a tree serialization format with some human-targeted ergonomic features like comments and multiline strings built in.
it'll work for simple configuration files just as well as format-less .ini files will. for complex configurations, even xml is better, and that's saying a lot.
in practice for object graphs of any sort of complexity, like cloudformation or k8s configs, you want a programming language which can reduce the kolmogorov complexity of your configuration, because that dominates ops in the limit. or, IOW, configuration is code is configuration is code is ...
go and jinja "templates" have always been ever so much fun. It depends on the complexity stored in said "configs"; what reads the configs and what writes to the configs.
as a human i want to be able to both read and write, easily, by knowing wtf is going on for the input(s), and to understand clearly within the config-at-hand what does what etc. (without 5x paragraphs per 1 key-value tuple).
then insert more templates to generate more templates, and then it is a fun spiral
k8s reminds me of those 20min long commercials combined with the energizer bunny, "but wait there's more" .. "and there's more" .. "and there's more" - as with orchestrating infrastructure as microservices, there's always more and more complexity to add to the monster. More layers, security, networking, etc., etc.
well, there's more, because operating software is exactly like that and k8s tries to API-fy that, in my opinion with considerable success.
yes, the actual descriptors are atrociously hideous, but it's okay, it's low-level, evolving pretty quickly, and there are nice high-level representations -- https://cdk8s.io/docs/latest/plus/
so, yes, of course, compared to FTP copying PHP files into cgi-bin k8s is more complex, but the feature set is also different.
of course, not everyone needs declarative gitops-based blue-green deployment with pristine dev/demo/staging/UAT envs on each new PR. and usually when people think they do it's mostly just FAANG envy :)
but, speaking from experience, setting up a k3s cluster is easy, cheap, and deploying things on it with a "kubectl apply" is also easy. setting up CronJobs to do backups to some S3-compatible thing is also quite doable, and so on. and you end up with a big bag of YAML. is it better than snapshotting a VM? who knows!
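for a taste of that bag of YAML, one of those backup CronJobs ends up looking roughly like this (image, bucket, and secret names here are made up):

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: postgres-backup
spec:
  schedule: "0 3 * * *"
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
            - name: backup
              image: example.org/pg-backup:latest   # placeholder image with pg_dump + an S3 client
              envFrom:
                - secretRef:
                    name: s3-credentials            # placeholder Secret holding connection + S3 keys
              command: ["sh", "-c",
                        "pg_dump \"$DATABASE_URL\" | aws s3 cp - s3://my-backups/$(date +%F).sql"]
```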
Thanks, that little twitch under my eye just started up.
I worked at an org where an admin went a little rogue and updated the config without using the macro file. Another admin didn't realize. Well, didn't realize until it was too late.
It's not really astonishing. People don't usually have knowledge beyond a quick google search. If you google XML, you'll see a lot of negative sentiment about XML specifically related to mid-00s Java frameworks and some people chunk this fact as "XML old and bad". If you google YAML, you'll see that lots of people like using it for relatively simple things (Rails config files...) and some people will chunk that fact as "YAML good".
For me, one of the true advancements of JSON/YAML over XML is how their features are orthogonal to each other
I don't have to ask "should this be a tag property, or a subtag"?
Which, to your point, isn't to say that YAML is "good", but I think there was at least an advancement through minimalism.
Right tool for the job. I think XML is too much tool for most jobs, but there's a huge gap between JSON/YAML and XML, so in 99% of cases I'd rather err on the "too much" side.
defo agree, mate. if data that is consumed by a computer is really crucial, I'd better be sure to use a format that has a clear AST, schema, or whatever ya call it. now I'd rather use xml / clojure's s-expression data / even json over yaml. it's just a gut feeling, for the most part.
YAML and other human-friendly presentation/editor formats are cool, but they need to come with a schema/validation/types. k8s APIs have a bit of validation, but there are frustrating gaps.
That has been one of the most annoying aspects of dealing with helm templates. It's like that USB plug you never can insert the right side up, only it has like a half-dozen sides and you have to cycle through them all to figure out the right one.
I don't think I'll ever understand why YAML was chosen as the language of choice for all things devops, and why so many startups have sought to augment yaml with syntactical hacks
I mean, as soon as you template it and need to do `{{ something | indent 4 }}` or some shit to make the template work you know you're on a bad track.
I don't really think YAML is the problem, it's the string-based templating. I'd like to see Emrichen or something like it become more common. And Emrichen is format-agnostic, you can write your stuff in json or YAML and it looks and works pretty similarly. (Although I stick to yaml.)
100%. Target-unaware templating is always a mistake. It is unfortunately common in our industry because it is easy and works most of the time. But this is also why SQL injections and XSS are among the most common vulnerabilities. SQL injections are getting better because people are more often using parameterized queries, which never need to actually encode the values into the "template", and XSS is getting better because most big frameworks now have target-aware templating that properly serializes values. But these are both hugely common issues. Look up a tutorial about how to use SQL or make your first website and more likely than not you will see examples that have vulnerabilities. Our whole industry is teaching the wrong thing by default, then hoping to fix it later.
String concatenation may have been a mistake. The concept of a string may have been a mistake. Every sequence of bytes has some structure, and in order to mash two "strings" together they need to be serialized in the correct way. Even when building error messages it would be nice if you could reliably distinguish the "chrome" from the "content".
Also look at terminal escape sequences. When we print text to a terminal we should probably be replacing non-printable characters with some sort of encoding so that 1. the reader can tell they came from the content, not the application, and 2. they can't do stuff like delete the output line to trick the reader.
Every time you put two strings together you should think about how you need to properly encode one into the other. Output unaware text templating almost always fails this because it relies on the user to do this for every single interpolation, and that is doomed to fail.
It's "declarative" which is supposed to be good, and you can represent complex data structures without any annoying closing braces (like json) or tags (xml).
Plus, well, there's a disturbingly high number of people who think that semantic whitespace is a positive.
This is ridiculous. Indentation is a visual concern for humans and braces are a semantic concern for a machine. Code only needs the latter to actually execute.
You can prefer them be coupled for taste reasons but they are fundamentally different. Code minification is a practical example where you can save many characters by omitting whitespace.
This is like tabs vs spaces. The spaces crowd prefers the trade-offs, and the tab crowd doesn't. But the spaces crowd can't argue away the perks of tabs - they do exist.
The tabs/spaces war was fought because of IDE choices that restricted your ability for sane tab rendering defaults, and simple overriding of tab size.
Now that terrible text editors won, I have to watch a 2 vs. 4 space war.
If only we could have a character that could represent indentation and people could set the rendering so they can visualise it in their own preferred way.
And folks who want to remove braces should be pointed to the Apple certificate snafu.
Considering that we lost tabs because some people's text editors and web browsers rendered them as 8 spaces, and drew the wrong conclusion, I don't have much hope for your plan.
However I would be completely on board if we could agree on a formatting configuration file that people could check into a repo for IDEs to pick up.
I think we've mostly agreed on editorconfig[1], though it doesn't concern itself with language-specific formatting - it's even built into a lot of editors.
For language-specific formatting, you can run formatters in pre-commit-hooks and CI. Treefmt[2] can help with that if you want to cover a lot of languages in your repo.
I see it, I understand the argument for coupling them. But the argument makes an assumption I am not comfortable with - it says they are there for the exact same reason, which is not actually true. It is often incidentally true in practice depending on the needs of the language - but it is not universally true across all needs in all languages.
It's the same thing with semi-colons. Having a statement separator provides practical benefits in several languages. In many of these languages, they are also optional.
If you were to say that all languages should parse both styles to get the best of both worlds, that wouldn't be completely unreasonable. But it makes the parsing more complex than necessary to support both, so only one is often supported - which is fair.
That's not true, the argument does not "say they are there for the same exact reason", that's your strawman. The argument is just: "they always come together thus one is redundant and we obviously can't remove indentation". The difference in reason, syntax/compiler vs. legibility, is irrelevant.
I mean, you can argue for having both for a myriad of reasons. It's redundant and no big deal, a lot of languages do just that. And some languages do fine with only indentation and no braces. But there is no language that does braces without indentation, or at least stylewise the code is always indented.
These optimizations for some perceived ergonomic win almost always make terrible tradeoffs versus using a good well established data format. And especially systems which favor human consumption but create extreme difficulties for machine handling, those are the worst!
Yaml being such a non-Context Free Grammar is a huge pain. There's so much state in the parser. It only gets worse from there. Yaml has all kinds of wild crazy capabilities. References, a variety of inline content blocks, and weird ways to invoke stuff?? GitHub yesterday did a code review of Frigate, an enormously popular surveillance video analysis tool that's heavily downloaded, and found, oh yes, a huge glaring yaml bug allowing remote execution, because executing arbitrary code is just built right in to yaml amid 3000 other crazy hacks & who would have known to go look for & disable that capability?!
https://news.ycombinator.com/item?id=38630295
https://github.blog/2023-12-13-securing-our-home-labs-frigat...
Typing is not the problem (even though I see so many people just terrible beyond words at navigating project structures or the command line... Improve! Some day!).
I do think there's a power to the readability that makes it more approachable (but which eventually burns you). We were rewriting our AST at Kurtosis last year, and the default choice we were going to go with was of course YAML. But we came across a Github issue from DroneCI (who also started with YAML) that said something like, "we started with YAML, and we learned that you always eventually want to add more complex logic on top. Go with a Turing-complete language to begin with, else you'll be in the CircleCI trap inventing a language via YAML DSL."
We decided to go with Starlark as the base language for our DSL, and we've had a consistently great experience. Users report that it's very approachable, and the starlark-go library is very pleasant to deal with.
I totally agree that string-templating into a data serialization format is a mistake. But you can make life dramatically easier on yourself by doing `{{ something | toJson }}`. In fact write a linter that every single substitution is followed by `| toJson` and you will save yourself a lot of headaches.
The main issue is that it makes it more difficult to mix hardcoded and inserted values.
Also the small technical concern that YAML isn't actually a superset of JSON. (But you are far less likely to hit these cases than other escaping bugs).
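For the curious, the difference in a Helm template looks roughly like this (`.Values.podAnnotations` is just an example value name):

```yaml
# String-splicing style: only renders correctly if the indent count matches the surrounding YAML.
spec:
  template:
    metadata:
      annotations:
{{ .Values.podAnnotations | toYaml | indent 8 }}

# Serialize-everything style: the value is emitted as JSON (which YAML parsers accept),
# so escaping and nesting are handled by the serializer instead of by counting spaces.
spec:
  template:
    metadata:
      annotations: {{ .Values.podAnnotations | toJson }}
```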
Dedicated config languages are the best usually. Jsonnet/Cue/friends.
Failing that if you actually need/want a procedural/non-pure language then I think Kotlin or Ruby take the cake. Both have extremely strong support for DSLs which IMO is key to reaching a modicum of usability.
Starlark is nice as well, it’s syntactically based on Python, and behaves a lot like regular procedural languages, but it’s meant to provide a pure and safe environment for configuration and be embeddable.
Big +1. We switched to Starlark for our DSL last year and have been very pleased. Users who've never used Starlark before come in with some 'what, another language?' trepidation, and end up pleasantly surprised.
Since using Bazel a bit I have grown an appreciation for Starlark also. The big thing is that list and dict comprehensions are a really nice fit for these types of tasks.
Lua is fantastic for the config-as-code use case. Easy to read and write, with a lightweight embeddable interpreter, and it has almost universal library support across different languages/environments.
It's a shame that helm v3 didn't move forward with the lua engine[0]. I don't imagine ~=/1-based arrays were a worse timeline... And here we are 5 years later.
YAML itself is not too bad, especially with a good IDE. But YAML and its ws-sensitivity combined with Go templating is horrible. Every component by itself looks kinda reasonable, but when they come together it makes an unholy mess.
the best part is that Helm does some sort of preprocessing to strip out comments (i think) when templating and validating, so it will report a failure at a line number that doesn't actually correspond to the problem in the original source. tracking down template failures is infuriating because of this
I think helm is at its best when you need to _publicly distribute_ a complex application to a large number of people in a way that's configurable through parameters.
For internal applications, it's in an awkward place of being both too complex and too simple, and in a lot of cases what you really want to do is just write your own operator for the complex cases and use kustomize for the simple cases.
Most of the problems with updating and installing helm charts go away if you manage it with something like argocd to automatically keep everything up to date.
This is interesting, I have the opposite opinion. I dislike helm for public distribution, because everyone wants _their_ thing templated, so you end up making every field of your chart templated and it becomes a mess to maintain.
Internal applications don't have this problem, so you can easily keep your chart interface simple and scoped to the different ways you need to deploy your own stack.
With Kustomize, you just publish the base manifests and users can override whatever they want. Not that Kustomize doesn't have its own set of problems.
Kustomize also supports helm charts as a "resource" which makes it handy to do last mile modifications of values and "non-value-exposed" items without touching or forking the upstream chart.
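If memory serves, that looks something like this in a kustomization.yaml (chart name, version, and the patch target are just examples; recent kustomize wants the --enable-helm flag):

```yaml
# kustomization.yaml
helmCharts:
  - name: ingress-nginx
    repo: https://kubernetes.github.io/ingress-nginx
    version: 4.10.0          # example version
    releaseName: ingress
    valuesInline:
      controller:
        replicaCount: 2

# Last-mile patch on top of whatever the chart renders, without forking it.
patches:
  - target:
      kind: Deployment
      name: ingress-ingress-nginx-controller   # actual name depends on the chart's templates
    patch: |-
      - op: add
        path: /metadata/labels/team
        value: platform
```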
How would you feel if you could use Starlark (if you're familiar with it) to parameterize a la Helm, and then can add more Starlark commands later to update previously-defined infra a la Kustomize?
Full disclosure: our startup is trying to build a tool where you don't have to pick, so trying to test the hypothesis
Personally much prefer kustomize for the “ship an app” business.
Probably even better is to ship a controller and a CRD for the config.
Doing it that way means you ship a schema for the parameters of the config, and that you have code that can handle complexities of upgrades/migrations that tools like kustomize and helm struggle with or fail at altogether.
We switched from kustomize to helm and I really can't understand why anyone would prefer kustomize. Having the weird syntax for replacing things, having to look at a bunch of different files to see what is going on...
I love how in Helm I can just look at the templates and figure out what values I need to change to get what I want, and I love each environment only needing a single values file to see all the customizations for it.
People complain about it being a template language, but that is exactly what you need!
JSON patches aren't the most intuitive and kustomize needs some helper tooling to generate them for you (given JSON objects A and B, generate a patch that transforms A into B), but overall the kustomize model makes more sense, and the team behind it seems to be more actively improving developer QoL stuff than Helm is
templates are only good if your templates can remain simple and do not need to expose most of the output fields. my experience developing a chart for wide distribution has been very much that your templates will not remain simple (and will turn into an incomprehensible mess, since you'll need them to handle tasks templates are fundamentally poorly suited for) and that there is always someone, somewhere, that needs some particular resource field exposed in values.yaml. the
> As a result, the number of possibilities for configuration is often unreasonably large and complicated, mimicking the actual resources they want to create, but without any schema validation!
bit from the op is incredibly true. values.yaml grows, over time, to have every field in the objects it generates, just organized differently, without validation, and with extra complicated relationships with other settings
kustomize allowing you to provide a base set of resources that users can apply their own patches to avoids that config surface bloat problem entirely
Isn't the "weird syntax" just either Yaml files or just JSON Patches, which is a pretty easy standard?
>having to look at a bunch of different files to see what is going on
I consider that a feature, not a bug. prod/larger-memory-request.yaml makes it much easier for me to see what goes into deploying the prod environment instead of for example the test environment.
By "weird syntax" I mean stuff like "patchesJson6902" or "configMapGenerator" or "patchesStrategicMerge" where you have to know what each field means and how they work.
A template is much easier to read. I had zero experience with go templating, but was able to figure out what it all meant just by looking at the templates... they still looked like kubernetes resources
As for looking at a bunch of different files, if you like having a "larger-memory-request" file, you can still do that with helm... you can use as many values files as you want, just include them in precedence order. You can have your "larger-memory-request" values file.
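For example (file names are illustrative; later -f files take precedence):

```yaml
# values.yaml - shared defaults
resources:
  requests:
    memory: 256Mi

# larger-memory-request.yaml - layered on top only for the environments that need it
resources:
  requests:
    memory: 2Gi
```

and then something like `helm upgrade --install myapp ./chart -f values.yaml -f larger-memory-request.yaml`.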
That's just using Kustomize. There’s a difference between learning curve and frustration post learning curve. Kustomize isn’t that bad, Helm comes with far more headaches, especially if you need to do any kind of inheritance.
Keep your customizations flat and compile them to yaml+grep to find out what’s getting overridden and where.
You shouldn't need to read the source to configure the values how you want.
Helm's problem is that `values.yml` is basically the API and every helm chart provides its own (often incomplete) interface with poorly documented defaults. Some of those can spread over 3k+ lines and it's utterly overwhelming to figure out what to do with that.
> Probably even better is to ship a controller and a CRD for the config.
Maybe it's just us, but our operations team puts pretty hard restrictions on how we're allowed to talk to the K8s API directly. We can turn a regular Deployment around as fast as we can write it, but if we needed a controller and CRD update it'd take us like three days minimum. (Which, I even sort of understand because I see the absolute garbage code in some of the operators the other teams are asking them to deploy...)
Generally speaking, operators and CRDs are more in the domain of your platform rather than your products. They should provide common interfaces to implement the business requirements around things like uptime, HA, healthchecking, observability, etc.
If a product team sees itself needing to deploy an operator, it's likely the platform is subpar and should be improved, or the product team is overengineering something and could do with rethinking their approach.
As in most cases, a conversation with your platform/ops/devops/sre/infra team should help clarify things.
If you run a multi-tenant Kubernetes cluster at scale, operators with poor discipline spamming the API servers and taking etcd down is a leading cause of sadness.
This is the common view among our ops team, sure, but for a vocation so prima facie obsessed with postmortems/five-whys/root-causes/etc it's depressingly shallow.
> Probably even better is to ship a controller and a CRD for the config.
But how do you package the controller + CRD? The two leading choices are `kubectl apply -f` on a url or Helm and as soon as you need any customization to the controller itself you end up needing a tool like helm.
Agree. I'd recommend to start with static YAML though. Use kustomize for the very few customisations required for, say, different environments. Keep them to a minimum - there's no reason for a controller's deployment to vary too much - they're usually deployed once per cluster.
The need to use something like Helm to distribute a complex application is a good indication you've built something which is a mess, and probably should be rethought from first principles.
Most of the problems associated with Helm go away if you stop using Kubernetes.
By that thinking "The need to use something like APT/YUM/DEB/RPM to distribute a complex application is a good indication you've built something which is a mess, and probably should be rethought from first principles."
So Linux is a mess? And we should rethink how rpm and deb work?
Or all Deb issues go away if you stop using Linux?
People forget that Helm is a package manager first and foremost (the only one for Kubernetes). It also happens to include a templating mechanism. The templating part has its issues, but until we find another package manager, I don't see Helm going anywhere.
> So Linux is a mess? And we should rethink how rpm and deb work?
Linux is indeed a mess, yes. RPM and Deb are both awful formats stuck in the 90s, with even worse package managers on the top. Even with the legacy of those, installing a package does not involve templating a whitespace-sensitive language with a mediocre template language.
> Or all Deb issues go away if you stop using Linux?
Never had any issues with debs on FreeBSD. Or NixOS if one likes the Linux kernel.
The idea that helm is even a package manager is fanciful at best, in any case.
> The idea that helm is even a package manager is fanciful at best, in any case.
The front page of helm.sh literally says "The package manager for Kubernetes". If it was advertised as "the best templating engine for K8s" or something similar I would agree with you.
People try to abuse Helm.sh as a fancy templating engine. And the testament to that is all the articles "Helm vs Kustomize vs JSonnet vs ..."
Lots of things describe themselves in fanciful terms. Helm has none of the trappings of a package manager, yet all of the trappings of a mediocre template renderer.
Vendors shipping things for customers to run in their clouds and prems have a very limited set of common denominators. When you add in requirements like workload scaling, availability, and durability, that set is very small.
So yeah we do this. Our product runs in 3 public clouds (working on 5), single VM, etc. and our customers install it themselves. We're helm plus Replicated. AMA.
Once you add in workload scaling, availability and durability, there is surely a dedicated ops team that want to control every aspect of how it’s deployed, including the security around it. They are not just going to blindly apply a chart without at least having reviewed it in great detail first.
What I found is that when doing such review, you realize 99 of the template variables are not relevant for you and the one place you need to template is missing a value. Just extracting the rendered manifests and modify them by hand from there becomes more maintainable. Like you say, there is a very limited set of common denominators.
For smaller orgs, just running a single container and increasing the Node size takes you a very long way. That doesn’t need helm.
This is absolutely the root of the problem. Most public Terraform modules suffer the same issues - configurable in so many ways it’s impossible to infer anything without a complete reading of the code.
When deploying into different clouds, do you require any cloud provider resources that require management with terraform etc. or is it relatively self contained?
Also curious what issues you've seen replicated prevent.
For public cloud k8s, no we don't provision or TF anything, we just shove in a manifest and k8s creates the workloads and it provisions persistent volumes and load balancers on your behalf. That's either Helm or Replicated (Kots) on top of Helm. Yes, it's basically self-contained and manages to abstract most of the cloud differences. We do have a custom storage class for each cloud but probably don't need it. The network load balancers need a little cloud specific annotation.
Replicated saved work by handling a configuration gui for the end user, licensing/ entitlements, support bundle collection, private image proxy, things like that we didn't want to deal with.
I barely have a horse in this race, but I think what I'd like to see is more apps that behave like 'npm config' or 'git config' where you can imperatively change one configuration value.
I take your image as the FROM for my own Dockerfile, tweak a few settings, maybe alter the CMD, and then run my image instead of trying to do some sort of ad absurdum variation on a Twelve Factor App.
> helm is at its best when you need to _publicly distribute_ a complex application
I would say, helm is at its best when you need to _publicly distribute_ a complex application, AND the majority of the users on the receiving end don't care about the complexity.
You don't need Helm if your manifests are simple. But when your manifests become complex, helm will make it even more complex by turning each field in the plain k8s manifests into a toggle in a values.yaml file.
but usually even the structure is wrong, and there's no option to disable a template in Helm, so I need to install and then manually edit/replace things, or just script Helm with bash or something, which is terrible .. so the whole thing is just a big ball of WTF.
yes, it's nice to do env-var-substitution, and --set is not that dumb.
...that's it? What about hooks being an anti pattern? What about upgrades potentially resulting in blowing away your whole deployment without warning? What about a lack of diffs or planning changes? Or the complexity/kludginess of go template logic? Or the lack of ability to order installation of resources or processing of subcharts? Or waiting until a resource is done before continuing to the next one? Or the difficulty of generating and sharing dynamic values between subcharts? Or just a dry run (template) that can reference the K8s api?
There's a ton of pitfalls and limits. It's still a genuinely useful tool and provides value. But it's mostly useful if you use it in the simplest possible ways with small charts.
I just wish the "operation engine" were decoupled from the "generation engine", and pluggable. I like how it watches the deployment, has atomic upgrades, can do rollbacks. But if you want a complex deployment, you currently have to DIY it.
Helm is a tool to use Go templates to write an object tree (or DAG), in yaml.
This is not endorsement. This is to point out that it makes hardly any sense! Use a proper programming language and serialize the object tree/network to whatever format is necessary.
I think that's where the landscape is heading--language frameworks that output YAML on one end, and operators that control YAMLs through the K8s control loop on the other end.
> See, there is no general schema for what goes and doesn't go inside a values.yaml file. Thus, your development environment cannot help you beyond basic YAML syntax highlighting.
… this is just an odd complaint. Naturally, there isn't a schema — there inherently cannot be one. Values are the options for the app at hand; they're naturally dependent on the app.
> but without any schema validation!
I have seen people supply JSON schemas for values with the chart. I appreciate that.
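For reference, Helm 3 will validate values against a values.schema.json shipped next to values.yaml, so a chart can at least pin down the shape of its interface; a tiny made-up example:

```json
{
  "$schema": "https://json-schema.org/draft-07/schema#",
  "type": "object",
  "required": ["replicaCount"],
  "properties": {
    "replicaCount": { "type": "integer", "minimum": 1 },
    "image": {
      "type": "object",
      "required": ["repository"],
      "properties": {
        "repository": { "type": "string" },
        "tag": { "type": "string" }
      }
    }
  }
}
```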
Of all the pitfalls … the clunky stringly-typed "manipulate YAML with unsafe string templating" is the biggest pitfall, to me…
I found that using "helm template" to convert every Helm chart into yaml, and then using Pulumi to track changes and update my clusters (with Python transformation functions to get per-cluster configuration) made my life so much better than using Helm. Watching Pulumi or Terraform watch Helm watch Kubernetes update a deployment felt pointlessly complicated.
The main thing is that it's still kicking, Ksonnet is sadly dead.
In addition to that it supports importing Helm charts, has a blessed convention for multiple environments and several native Jsonnet functions that make things a bit nicer.
This is both a good thing and a bad thing, but Pulumi is way more flexible than Terraform. I wanted to have a cloud-provider-specific submodule that created resources (like EKS and GKE) that then exported output values (think kubeconfig), and then I wanted the parent module to pass those in as inputs to a cloud-provider-independent submodule. Terraform couldn't do it without needing to duplicate a ton of code, or without something heinous like Terragrunt (not sure it even would have worked.) Pulumi makes it trivial and in a language I like writing.
Additionally, our applications consume our cloud configuration (eg something that launches pods on heterogenous GPUs needs to know which clusters support which GPUs, our colo cluster has H100s but our Google cluster has A100s etc.) Writing in the same language in the same monorepo makes it very easy to share that state.
For those not aware, there's https://timoni.sh. It's similar to Helm but uses CUE instead of the Go templating language. We've recently switched to it and are not looking back. Would be great to get some more contributors on board.
I have to admit that I would choose Kustomize any day of the week. I personally find it easier to debug, version, and grasp what is going on behind the scenes, and simpler to maintain and operate. Also, to be fair, I have to admit that I've had a few negative experiences with Helm that made me develop a negative bias against it. The main one being, to make the story short: I, as a DevOps guy, had the unpleasant experience of having to take care of the deployment process of a HeLLm chart(s) pile of crap that a very opinionated yet not very knowledgeable (on DevOps practices) team of developers created in a company I worked for in the past. It was composed of a myriad of obscure charts, jammed together as a house of cards. If that wasn't enough, the main chart was being called from a bash script that called Ansible, which pulled some data from a repo and rendered values coming from that repo into the values.yaml. The process of deploying that thing implied, apart from a couple of good hours of your time, having to perform some debugging sorcery with kubectl (removing this, modifying that, etc.) in a meeting while screen sharing your terminal with the proud parents of that monster. An absolute nightmare that still gives me sweaty palms to this day.
Anyways, I want to take the opportunity to ask other DevOps in the room something: What workflow do you use for performing CD for a bunch of Helm charts? I mean, I guess you version a bunch of values.yaml files in a repo, but how do you manage the installation of the repository from the CD runner and so on? I'm curious because, off the top of my head, Helm requires the runner (the actor who calls Helm to install a chart) to add a repository and then install the package from that repository, passing it the path to the values.yaml file, but in a CD pipeline this actor is usually a disposable container. Do you install the repositories in one of the steps of the CD pipeline, perhaps from an in-house script, every time the pipeline runs? Or are you hopefully using a GitOps approach instead?
That was just badly done, as anything can be. The template engine takes some getting used to, especially without Go experience. But I'd much rather have that than a million lines of manifests as with Kustomize. That said, you need either Argo or Flux, and should only deploy Helm charts with GitOps workflows.
points #3 and #4; "user-friendly helm chart creation" and "values.yaml is an antipattern"...I think we're just all stuck in this horrible middle ground between "need static declarative configurations for simplicity of change management/fewest chances to mess it up" and "need dynamic, sometimes even imperative logic for flexibility, configurability, and ease of development"
several commenters have mentioned Cue/Jsonnet/friends as great alternatives, others find them limiting / prefer pulumi with a general purpose language
our solution at kurtosis is another, and tilt.dev took the same route we did...adopt starlark as a balanced middle-ground between general-purpose languages and static configs. you do get the lovely experience of writing in something pythonic, but without the "oops this k8s deployment is not runnable/reproducible in other clusters because I had non-deterministic evaluation / relied on external, non-portable devices"
Despite its pitfalls, I've found that Helm is still the best way to distribute an app that's destined for Kubernetes. It's easy to use and understand and provides just enough for me to ship an app along with its dependencies.
I use kapp from Project Carvel and the "helm template" subcommand to work around Helm's inability to control desired state. I've found that kapp does a pretty good job of converging whatever resources Helm installed.
+1 for Carvel suite. To me it hits the sweet spot by allowing you to compose your own pipelines. It's not too alien for other k8s folks, and UNIXy enough that you can cut out the stuff you don't care about.
What I do to remediate this sadness is use Helm from Tanka. There is still sadness but now it's wrapped in a nice Jsonnet wrapper and I can easily mutate the output using Jsonnet features without having to mess with nasty Go templating.
I've said it a million times before but it's always worth saying again:
Yep. Many complain that with Lisp, you need to count parentheses (spoiler: you don't need to). And then proceed to count spaces for indent/nindent in the charts... That's somehow ok with almost everyone
I can't actually put it into production at my company, but for selfish catharsis, I ran datamodel-codegen over our cluster's jsonschema and generated Python pydantic models for all resources. I was able to rewrite all our helm using pure Python and Pydantic models, since Pydantic serializes to json and json is valid yaml. Felt pretty good
We don't have any CRDs, but the approach would extend to those, plus you get autocomplete. The k8s jsonschema isn't super easy to work with directly, though.
Not just helm. There are probably a half dozen tools for rendering manifests in our company, only some use text/template, and they all suck. Text replacements are bad. Declarative structured patches are bad. Control flow in JSON is bad. We've had a language for dealing with generating complex nested structured data for years!
Have you seen Jsonnet, Dhall, and Cue? They are configuration languages that are more limited than general purpose languages, more powerful than static formats, and designed for config files, unlike templates.
text/template is probably ok... For some version of text. ditto with jinja and most templating languages. The cardinal sin of DevOps is using text macros to produce structured data. It only exists because unfortunately there is no other lowest common denominator for every config file syntax.
Sure and that forgives its use in maybe, like, Salt and Ansible. Not in Kubernetes where everything is structured in the same way, even with API-available schemas, to begin with.
Jsonnet wouldn't be as bad as it is if there was just a modicum of debugging aid.
I'm slowly chipping away at that problem by implementing some tooling. For example I recently added "traceback" functionality in https://github.com/kubecfg/kubecfg
Another thing that I noticed is that most people who end up writing template libraries for jsonnet are using too many functions and not leveraging the strengths of jsonnet, namely object extension.
I opensourced a library I'm using internally at $work. It's far from perfect and sorely lacking docs and examples but if you want to give jsonnet another go I'd recommend you try kubecfg + https://github.com/kubecfg/k8s-libsonnet
The problem with Pulumi is that it wants to manage the state of things rather than letting the k8s server handle that. This means it's a hell of a lot slower than the equivalent `kubectl apply`. Not to mention its own state persistence really sucks if you don't use their hosted solution that supports a patch-based state update protocol (which someone really should implement an OSS version of).
I have been working on a new project and split the IaC stuff into two layers, essentially using Pulumi (w/Kotlin) to spin up the k8s cluster and dependencies for Config Connector (on GCP). From there I'm just generating and applying manifests with fabric8 (more Kotlin).
It's not quite as good as Jsonnet in some cases (because of lazy vs non-lazy mostly and always-supported deep-merge etc) but Kotlin is immensely powerful and has things like the `lazy` helper to help here.
Having the entire repo defined in Kotlin though is very nice. Build system is Gradle w/Kotlin script, frontend is htmx only generated by Kotlin DSL, IaC all Kotlin as described.
We did similar at $CURRENT_JOB with Typescript but Kotlin is miles better IMO.
I will agree that it's not optimal but there is still a big difference between text templating and Jsonnet.
Something Jsonnet-esque but with more typing help and better debugging (especially error messages instead of `blah thunk, thunk thunk`) would go a long way.
Over seven years of using a variety of deployment tooling including helm (2 and 3), kustomize and our own scripting, we concluded that helm's strength is as a package manager, akin to dpkg. Writing good packages is complex, but the tool is quite powerful if you take the time to do that. For our own deployments what we typically want to do is: build, test and push an image, plug some context specific things into yaml, send the yaml to the control plane and maybe monitor the result for pod readiness. We have some custom tooling that does this in gitlab pipelines, relying on kustomize for the yaml-spattering bits. We still do use a lot of our own and third-party helm charts but for us there's a clear distinction between installing packages (which tend to be longer-term stable infra things) and rapidly iterating on deployments of our own stuff.
Any advice/ideas/articles/references on using kustomize efficiently?
I love the idea of using a tool bundled with kubectl for zero dependencies, but their examples and tutorials are horrible. I can't figure out how to use it correctly to have 1 copy of YAML that would deploy to 5 different environments. It seems I would need multiple copies of kustomization.yaml in multiple folders, if I have multiple namespaces/pods/etc...
The model is base yaml with patches applied to it results in final yaml that get sent to the api, so the typical structure for us is to have the base yaml live with the service source, be maintained by the service owners and include all environment-agnostic properties. We then have one folder per targeted environment for that service which includes any patches and the kustomization.yaml manifest. Basically in line with what other replies have mentioned.
We use kustomize with multiple copies of kustomization.yaml and I don't know if there is a way to do it without that. Basically, there's a base kustomization.yaml and then there's test/kustomization.yaml, prod1/kustomization.yaml, prod2/kustomization.yaml, and so on.
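A sketch of that layout (paths and the patch content are just illustrative):

```yaml
# base/kustomization.yaml
resources:
  - deployment.yaml
  - service.yaml

# prod1/kustomization.yaml
resources:
  - ../base
namespace: myapp-prod1
patches:
  - path: larger-memory-request.yaml

# prod1/larger-memory-request.yaml - strategic merge patch over the base Deployment
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp
spec:
  template:
    spec:
      containers:
        - name: myapp
          resources:
            requests:
              memory: 2Gi
```

`kubectl apply -k prod1/` then builds and applies that environment.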
I got lazy and just wrote scripts that output k8s manifests.
The development story is much better (breakpoints! WHAT!?, loops and control flow!?), you can catch common issues quicker by adding tests, there's one "serialise" step so you don't have to deal with YAML's quirks and you can version/diff your generated manifests.
It's dumb, and stupid, but it works and it's far less cognitive load.
Now: handling mildly dynamic content outside of those generated manifests... that's a massive pain; releasing a new version of a container without touching the generated manifests is not working for me.
at my current place, we started off with kustomize. I rewrote everything into helm, which was good initially (at least you can force inject some common params, and others can include this in their charts).
But people (including me) were unhappy reading the yaml; I also grew to hate it with a passion because it's neither go nor yaml, and super difficult to read in general. We are a typescript company, and https://cdk8s.io/ has been great for us. We can unit test parts of charts without rendering the whole thing, distribute canonical pod/deployment/service definitions, etc.
In all of the cases, we combined this with config outputted by terraform, for env specific overrides, etc.
Because you effectively CAN'T dynamically configure subcharts with templating that's done in your main chart (see e.g. https://github.com/helm/helm/pull/6876), here comes the hack.
We run helm in helm. The top chart runs post-install and post-upgrade hook job which runs helm in a pod with a lot of permissions. The outer helm creates values override yaml for the subchart into a ConfigMap, using liberal templating, which gets mounted in the helm runner pod. Then helm runs in there with the custom values and does its own thing.
Not proud but it lets us do a lot of dynamic things straight helm can't.
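For anyone trying to picture it, the hook is roughly this shape (image, names, chart path and RBAC here are simplified placeholders):

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: install-subchart
  annotations:
    "helm.sh/hook": post-install,post-upgrade
    "helm.sh/hook-delete-policy": hook-succeeded
spec:
  template:
    spec:
      serviceAccountName: helm-runner        # needs broad RBAC, as noted above
      restartPolicy: Never
      containers:
        - name: helm
          image: alpine/helm:3.14.0          # placeholder helm image
          args: ["upgrade", "--install", "subrelease",
                 "/charts/subchart",          # chart baked into the image or fetched beforehand
                 "-f", "/overrides/values.yaml"]
          volumeMounts:
            - name: overrides
              mountPath: /overrides
      volumes:
        - name: overrides
          configMap:
            name: subchart-values            # rendered by the outer chart's templates
```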
I appreciate that TF has loops and dynamic blocks, etc etc, but sometimes it's just a lot easier to look at a Jinja2 template and run a script to generate the TF.
Another one: when you upgrade your cluster and there's an API that is a candidate for removal, helm doesn't have a way to update the Kind reference in its release metadata, which leaves you unable to delete or update the release.
I personally like cuelang's philosophy but it could become a little messy when you have to iterate and handle user inputs in large codebases.
On top of what the OP mentions, Helm still doesn't have a way to forward logs from its pre/post-install hooks to the system calling helm upgrade (such as a Github Action) - a feature first requested in 2017 and still stuck in RFC stage.
I can understand moving cautiously, but it's at a point where it almost feels like allowing users to understand what Helm is doing seems not to be a priority for Helm's developers.
It's a good list, although I think there's more to it even. I wrote a bit more about helm design a while ago [0]. Nowadays, I use Helm from kustomize quite a lot because some projects don't provide any other way of deploying. However, you still need to check what helm is actually generating, especially if there's any hooks that need to be replaced with something declarative.
This is relevant to a discussion I am having at work right now. I am not a fan of using a templating language as such to generate string templates, especially for a whitespace sensitive language.
I would rather use Terraform's Kubernetes or Kubectl module for this. Are there any pros or cons I should consider?
I think one of the key things I like about it is that Terraform will show me what it plans to change whereas Helm doesn't (last time I checked)
The kubernetes provider and kubectl work, but it's not the nicest way of making changes. It's slow, quite clunky, and not particularly intuitive. If you're just getting started and you know terraform, it's OK though. It's useful for bootstrapping gitops tools like Argo or FluxCD.
Helm diff will show you a similar diff to terraform. Running Helmfile in CD isn't a bad move; it's really simple, and it's a pattern that is easy to grok for any engineer. I think this is still a valid approach in a simple setup; it's what some people call "CD OPS". It's a push model instead of pull, and there are downsides, but it's not the end of the world.
Ultimately, at scale, I think gitops tooling like Flux and ArgoCD are some of the nicest patterns, especially Flux's support for OCI artifacts as a source of truth. However, then you will venture into the realm of kustomize and much more complex tooling and concepts, which is not always worth doing.
Cons: The provider tries to manage its own state, as terraform normally does. This makes it slow to diff, and the state often gets out of sync with what is really in k8s. When it does, it fails loading the manifest during the diff phase, so you can't apply, even if all you want is to overwrite.
The diffs are very noisy to read because they show every possible attribute, even those you didn’t write.
The ready-made resources can sometimes be a bit behind on versions, but you also have a raw manifest resource as an escape hatch if you depend on the bleeding edge.
Pros: The templating capabilities are fantastic, because it leverages terraform. Bye bye string templates. This also makes it easy to use values from other data sources.
YAML is just some data, like JSON or CSV. Ask yourself whether you'd use a third-party JSON templating or CSV templating tool, or whether you'd use your shop's language of choice and write a program to spit out the generated data.
You can also save yourself a step by just spitting out JSON, which is valid YAML.
every time I hear someone suggest such a thing, I remind them that now you have two systems who believe they own the state of the world: .tfstate and etcd and let me assure you that no matter how much our dumbass TF friend thinks it knows what's going on, etcd wins hands down every time
that's why I strongly suggest that if anyone is a "whole TF shop," they go the operator route, because trying any lower level management is the road to ruin
Terraform wants to be the only thing that owns a K8S object, but the way things work in reality is you have a dozen things that want to write back to this attribute, or that overwrite objects in other places, etc, and you're constantly fighting with TF about this or that triviality.
Needed to install pg/mysql using helm charts and needed to apply SQL schema.
And passing the schema is so complex... (like writing it into the values file, but you cannot use any logic in values, so you have an external script which templates the values files). At least for the most popular Helm charts I've seen so far.
Kinda sad because it is trivial to do in Docker. And still possible to do in k8s with some configmaps.
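For example, roughly like this with the stock postgres image, which runs anything mounted into /docker-entrypoint-initdb.d on first init (names are illustrative):

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: db-schema
data:
  schema.sql: |
    CREATE TABLE IF NOT EXISTS users (
      id   serial PRIMARY KEY,
      name text NOT NULL
    );
```

and the relevant fragment of the postgres pod spec:

```yaml
containers:
  - name: postgres
    image: postgres:16
    volumeMounts:
      - name: schema
        mountPath: /docker-entrypoint-initdb.d   # init scripts only run on first initialization
volumes:
  - name: schema
    configMap:
      name: db-schema
```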
While I really enjoy helm when playing with k8s or kickstarting projects, I never feel "safe" when using it in the long run for updates/upgrades.
"values.yaml" files and templating YAML files are too error-prone...
I'm not buying the example of using the operator to figure out things dynamically. Especially that detection of the cloud in the example is done by looking at some random labels or other attributes specific to a cloud provider.
This is what values and templates are for: no need to guess where you are deployed, I'll tell you that via values, and the template will make sensible adjustments to how the resources will look.
we use helm to deploy our apps and it works. Actually to be honest the templates are generated and then deployed from disk, without uploading a chart. We try to avoid the template language.
What Helm can't do currently is handle Ingress renames, and they do not allow loading files that are outside the chart.
The point isn't that you can never query the API, but that you can't really use helm chart as a controller (and, e.g. restart a pod under a certain condition, which is trivial for an operator).
i generally don't mind helm but im not sure i agree with every point. for the really simple stateless app situation, its trivial to create a chart with all the important or unique bits extracted to a values file.
the crd shit is borderline untenable. i learned about it during an absolutely cursed calico upgrade. oops.
since kustomize integrates tightly with kubectl these days though, i just use that for new things.
A few years ago I tried out an alternative approach to "templating".
Basically the idea starts from a world without templates, where you would distribute the k8s YAML in a form that is ready to be directly applied, with whatever sensible defaults you want directly present in the YAML.
The user would then just change the values in their copy of the file to suit their needs and apply that.
We all recoil in horror to such a thought, but let's stop a moment to think about why we do:
The user effectively "forked" the YAML by placing their values there, and what a nightmare that would be once the user got a new version of the upstream file, potentially completely overhauled.
If the changes are very small, a simple three way merge like you'd do with git would suffice to handle that. But what about larger changes?
Most of the conflicts in the simple cases stem from the fact that text based diff/merge tools are oblivious to the structure of the YAML file and can only do a so-so job with many of the changes. Unfortunately most people are familiar only with text based merge tools and so they have been primed the hard way to assume that the merges only rarely work.
Structural merges otoh do work much, much better. But still, if the upstream refactors the application in a significant way (e.g. changes a deployment into a stateful set or moves pieces of config from a configmap into a secret!) not even a structural merge can save you.
My idea was to bring the manifest author into play and make them "annotate" the pieces of the manifest forest that contain configuration that has a high level meaning to the application
and that would be moved around in the YAML forest as it gets reshaped.
Another realization was that often such configuration snippets are deeply embedded in other internal "languages" wrapped inside string fields, subject to escaping and encodings (e.g. base64). E.g. a JSON snippet inside a TOML string value inside a base64-encoded annotation value (if you haven't seen these abominations I'm so happy for you, you innocent child).
So I implemented a tool that uses nested bidirectional parsers ("lenses") that can perform in-place editing of structured files. The edits preserve formatting, comments, quoting styles, etc.
Even string fields that are normally thought of as just strings are actually better thought of as nested "formats". For example, OCI image references are composed of multiple parts. If you want to just copy images to your private registry and "rebase" all your image references to the new base, you can do it with an update that understands the format of OCI image references instead of just doing substring replacement.
Knot8 is an opinionated tool meant to help manifest authors and users manage setting/diffing/pulling annotated YAML k8s manifest packages
I didn't have the time to evangelize this approach much so it didn't get any traction (and perhaps it wouldn't have anyway, because it doesn't have enough merit). But I encourage you to give it a go. It might inspire you.
I also pulled out the "lens" mechanism in a separate binary in case it could be useful to edit general purpose files:
Helm and Kustomize are absolutely atrocious tools.
The job of these tools is ultimately to generate a bunch of data structures compatible with Kubernetes API schemas. That's it. Take some input, go brrr, and spit out some serialized data structure.
If we replace "YAML" with "JSON", this all seems a bit absurd.
Helm is a tool where you write JSON templates and JSON templating helper functions, then users can provide a values.json file which specifies inputs for the templates and maybe even can contain templated values themselves. But it's easy to get the contents of the values.json slightly wrong in ways that silently fail, so the JSON templates spit out something, but it might not be semantically valid Kubernetes JSON. So you can include JSON schema files alongside your JSON templates package to specify the schema of the values.json file, which the tool will check. Then sometimes the values.json file is too verbose to write by hand, so the user chooses another tool (or god forbid writes a small script in a general purpose programming language) to generate their values.json file.
Absurd.
Kustomize is a tool where you give it a bunch of JSON files, or maybe tell it to use one or more of those hellish Helm charts (Helm itself being a glorified JSON templater). Then you write a JSON configuration file which looks declarative but is secretly somewhat imperative. In this JSON file you specify which transformations you'd like to apply, maybe even specify some inline patches which use the JSON Patch syntax or a strategic merge where you write some more JSON. Some bits of the tool allow you to generate more JSON from files on your disk, like .env files or data files. If the feature set doesn't work you can then use KRM functions as generators or transformers, which allow you to escape from JSON hell and use a general purpose programming language to emit or transform JSON, but the configuration to this KRM function is itself provided as JSON.
Absurd.
These YAML-obsessed tools like Helm and Kustomize purport to be easier because "it's just YAML!" but what you're really doing is writing all of the arguments for a computer program which you must understand, but in YAML. Kustomize is slightly less bad because it doesn't allow for much control flow, but in Helm templates you have control flow, loops, and helper functions. So it's really just a program anyway with a YAML syntax, but you're confined to a stringly-typed templating language.
This exists with tools like Kyverno, too: "oh, the policies are just YAML!" but they introduce a special DSL for writing conditionals, so you're still writing a program anyway, but in a YAML inner platform. If you think of it as "just JSON" then the YAML smokescreen and cargo cult falls away. It then becomes much more tempting to write something specific to your needs in Python, TypeScript, Ruby, Haskell, OCaml, or another general purpose programming language, where you can do whatever it is you need to do without this leaky YAML purism. Kubewarden's scope overlaps a bit with Kyverno, but Kubewarden chooses WASM as the common language instead of YAML.
However, there's an argument to be made about the security model of Helm and Kustomize. Assume that I trust that the authors of those tools are not going to smuggle my data to a malicious third party or install spyware on my computer and that they do not have critical security bugs. Then I can confidently run those tools on my computer with inputs (e.g. charts and kustomizations) that I do not trust. To say nothing of whether the tool will generate a deployment which compromises my cluster when applied.
I think what I'd like to see from the community is tooling that targets Deno or WASM as the common ground. Then I can sufficiently sandbox "packages" like helm charts or kustomizations without worrying that they have general access to my computer or CI/CD runner as if I'd imported a random Python package. I've already started writing something in Deno that tries to emulate Grafana's Tanka in TypeScript. My next job is to write a tool which runs WASM-compiled KRM functions.
But I might need a configuration language for that tool, so maybe I'll choose YAML...