I love me an INI. By far, IMO, the best human-readable config syntax. Sure, it's got some gotchas: like other old formats (CSV, say), there's no spec. But if I have the choice I'd go INI. It's simple and leaves the choices up to the program that's reading the file, because everything is a string.
To me, the article's argument cuts both ways: with INI, people have to remember the idiosyncrasies of the Python implementation around comments, but with TOML they have to know and learn the correct syntax too. I'd say remembering when and where you can comment is the easier of the two.
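For anyone who hasn't hit the comment gotcha, here is a minimal sketch using Python's standard configparser. The section and key names are made up for illustration. By default only full-line comments are recognised, so trailing text after a value becomes part of the value unless you opt in with inline_comment_prefixes.

```python
import configparser

# Hypothetical INI content; the section and key names are illustrative only.
INI_TEXT = """\
[server]
; a full-line comment is ignored by the default parser
port = 8080 ; trailing text stays in the value by default
"""

# Default parser: only full-line comments are recognised.
default_parser = configparser.ConfigParser()
default_parser.read_string(INI_TEXT)
default_value = default_parser["server"]["port"]

# Opting in to inline comments changes how the very same file is read.
inline_parser = configparser.ConfigParser(inline_comment_prefixes=(";",))
inline_parser.read_string(INI_TEXT)
inline_value = inline_parser["server"]["port"]

print(repr(default_value))
print(repr(inline_value))
```

Same file, two different values for port, depending on how the reading program configured its parser.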
You’d probably be surprised how often people have to create config files programmatically. With properly specified formats, you can serialize the configuration, which means nesting and string escapes are taken care of automatically. With half-baked custom formats, you have to resort to string replacement and praying.
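A rough sketch of that difference, using JSON as the stand-in for a properly specified format. The settings dictionary is invented for the example.

```python
import json

# Hypothetical settings with characters that break naive templating:
# embedded quotes, a newline, and backslashes.
settings = {
    "greeting": 'He said "hi"\nand left',
    "paths": {"log": "C:\\logs\\app.log"},
}

# Naive approach: paste the value into a template and pray.
naive = '{"greeting": "%s"}' % settings["greeting"]
try:
    json.loads(naive)
    naive_ok = True
except json.JSONDecodeError:
    naive_ok = False

# Serializer approach: escaping and nesting are handled for us.
serialized = json.dumps(settings, indent=2)
round_tripped = json.loads(serialized)

print("naive output parses:", naive_ok)
print("serialized output round-trips:", round_tripped == settings)
```

The templated string fails to parse because of the unescaped quotes and newline, while the serialized one round-trips unchanged.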
Having said that, there’s an official RFC for CSV, and INI files are de facto specified by Microsoft’s implementation.
Yeah, half-baked anything is going to screw you over. INI files are certainly programmatically serialisable, otherwise they wouldn't exist. The format does move the datatypes into the program rather than encoding them in the config, though, which adds to program overhead. Horses for courses though, with greater flexibility comes greater potential to f*uk it up.
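To make the "datatypes live in the program" point concrete, here is a small sketch with Python's configparser. The cache section and its keys are hypothetical. Every value comes back as a string unless the reading code asks for a conversion, and writing the file back out is a single call.

```python
import configparser
import io

parser = configparser.ConfigParser()
# Hypothetical config; section and key names are illustrative only.
parser.read_string("""\
[cache]
enabled = yes
max_items = 500
ttl_seconds = 2.5
""")

section = parser["cache"]

# Raw access always yields a string.
raw_max_items = section["max_items"]

# The program, not the file, decides what type each value has.
enabled = section.getboolean("enabled")
max_items = section.getint("max_items")
ttl_seconds = section.getfloat("ttl_seconds")

# Serialising back to INI text needs no string templating.
buffer = io.StringIO()
parser.write(buffer)
ini_output = buffer.getvalue()

print(type(raw_max_items).__name__, enabled, max_items, ttl_seconds)
print(ini_output)
```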
More and more the last couple of years, I’ve started to realize that “configuration languages” are just a bad idea. Please, for the love of all that is holy, just let us use a real programming language. For simple declarative configuration, it’s no more difficult to use Python (or Lua or whatever) than YAML, and for complex configuration (looking at you, YAML files for GitLab CI), it’s an absolute godsend to be able to use real if-statements and for-loops. If security/runtime is a concern, you can sandbox it or use something like Starlark.
JSON is a fine over-the-wire format, but can’t we leave YAML in the dustbin and just do it “properly” from now on? Please?
Yes, Starlark is a relatively sane option, but the benefit of a language like that isn't simply that it's sandboxed. It's also not Turing complete: it has no unbounded loops and does not allow recursion, so every program is guaranteed to terminate.
If you want a sensible compromise, it's to use something like Cue or Dhall to generate JSON configuration that's actually consumed by the software. There's also Jsonnet, but I've never been a fan.
Hard pass. There is a lot of software I am forced to run without wanting to involve myself with it. If your tool is complex enough to warrant a full programming language in its config files, to me that smells of another issue, but generally I just want a set of knobs to tweak behaviour.
I’m so over having to learn Go templates for a metrics exporter, or the need to write booleans as True or False for Python apps, or some obscure Erlang syntax for RabbitMQ, or the batshit crazy APT config file syntax, or Apache's weird XML files… Just settle for a simple format and offer to load custom modules for those that need them.
And when you do need to get a bit more complex, you have the tools to manage that complexity. Like, if you have a bunch of very similar things (say, job definitions in CI/CD), you want a way to not repeat yourself so you only have to edit them once. YAML has a tool for that, "anchors" [1], but it's obscure and a bit hard to use. In a real programming language, it's trivial: just stick the repetitive parts in a variable and use that. Or define a simple function in case most of the stuff is repeated, but some is not, so you can do
foo = generateRepetitiveConfig('specificOption')
Or maybe you can just do that with a simple for-loop. And if you want to do something like a string substitution, you have a whole dang language with the tools you need to do it.
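A sketch of what that could look like in Python. The job schema (name, image, script, retry) is invented for illustration and is not any real CI system's format. The shared parts live in one function, a loop stamps out the similar jobs, and the result is dumped as JSON for whatever consumes it.

```python
import json

def make_job(name, command, image="python:3.12"):
    """Return one job definition; the shared defaults live here, in one place."""
    return {
        "name": name,
        "image": image,
        "script": ["pip install -r requirements.txt", command],
        "retry": 2,
    }

# The repetitive jobs come from a loop instead of copy-paste.
jobs = [
    make_job(f"test-{suite}", f"pytest tests/{suite}")
    for suite in ("unit", "integration", "e2e")
]

# The odd one out overrides only what differs.
jobs.append(make_job("lint", "ruff check .", image="python:3.12-slim"))

config = {"jobs": jobs}
print(json.dumps(config, indent=2))
```

Changing the retry count or the install step is now a one-line edit that applies to every job.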
Dhall and languages like that are a big improvement, but really, I just want a normal programming language. We have decades of experience now with managing complexity in computer systems, and programming languages have evolved robust systems for handling that.
I agree that sandboxing and security can be a real concern for some of this stuff, in which case Starlark is perfect, though you can sandbox languages like Lua or various Lisps as well.
I hope one day people will realize that configuration languages can just be implemented by adding a "--run-total-sandboxed" flag to their favorite language, i.e. a flag that disables while loops and recursion (making the language non-Turing-complete) and runs the program in a sandbox without direct external access.
This would be far better than all random configuration languages out there and far more extensible.
Programming language as config is bad because it makes it easy to do the wrong thing (too much power). Friction can be good.
Ideally, the less power, the better:
- .env :: simple flat untyped key/value pairs. In practice, something like Pydantic can introduce hierarchy and type validation for the config
- json :: simple, few data types. Can be read/written by humans
- toml :: easier to edit by humans
- json5 :: not in Python stdlib
- yaml :: may require discipline, to avoid the Turing tarpit
As an intermediate between pure config and programming languages, the Jsonnet language can be used to generate JSON.
Turing complete: Ruby, Python, Lua, Lisp, etc. Avoid these for simple configs; they can be used as extension languages or DSLs.
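As a rough illustration of the .env end of that list, here is a sketch that reads flat KEY=VALUE pairs and validates them into typed settings. It uses a plain stdlib dataclass as a stand-in for Pydantic, and the variable names (APP_DEBUG, APP_PORT, APP_NAME) and the parse_env/to_bool helpers are hypothetical.

```python
from dataclasses import dataclass

# Hypothetical .env content.
ENV_TEXT = """\
# comments and blank lines are skipped

APP_DEBUG=false
APP_PORT=8080
APP_NAME=demo
"""

def parse_env(text):
    """Parse simple KEY=VALUE lines into a dict of strings (no quoting rules)."""
    pairs = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        pairs[key.strip()] = value.strip()
    return pairs

def to_bool(value):
    """Convert a small set of accepted spellings to bool, else raise."""
    lowered = value.lower()
    if lowered in ("1", "true", "yes", "on"):
        return True
    if lowered in ("0", "false", "no", "off"):
        return False
    raise ValueError(f"not a boolean: {value!r}")

@dataclass(frozen=True)
class Settings:
    debug: bool
    port: int
    name: str

    @classmethod
    def from_env(cls, env):
        # Type conversion happens here, in the program, and fails loudly.
        return cls(
            debug=to_bool(env["APP_DEBUG"]),
            port=int(env["APP_PORT"]),
            name=env["APP_NAME"],
        )

settings = Settings.from_env(parse_env(ENV_TEXT))
print(settings)
```

The file itself stays flat and untyped; a bad value such as APP_PORT=abc raises a ValueError at load time instead of surfacing later.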
That to me feels too dangerous. You'd need to sandbox every piece of config file you use.
I've been sketching a shell/scripting language/system for a while and I've been thinking of it as, in effect, three sub-languages, one nested within the other. At the lowest level: pure data; middle level: pure functional expressions; top level: imperative commands. I'd think that most use-cases for a configuration file would be satisfied by the lowest level, and some by the middle, but absolutely none would need the highest.
Then a config file would be "sandboxed" by the language syntax, with very little risk of it breaking out.
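Python already ships something close to that lowest "pure data" level: ast.literal_eval accepts only literals (strings, numbers, tuples, lists, dicts, sets, booleans, None) and rejects anything that would need execution. A small sketch follows; the load_pure_data wrapper and the config keys are made up for the example.

```python
import ast

def load_pure_data(text):
    """Evaluate text made up only of Python literals; anything else raises."""
    return ast.literal_eval(text)

# Plain data is accepted.
config = load_pure_data(
    '{"workers": 4, "hosts": ["a.example", "b.example"], "debug": False}'
)

# A function call is not a literal, so it is rejected rather than executed.
try:
    load_pure_data('__import__("os").system("echo pwned")')
    rejected = False
except ValueError:
    rejected = True

print(config)
print("call rejected:", rejected)
```

This only covers the data tier of the idea, and very large or deeply nested input can still exhaust memory or the parser's limits, so it is a syntax-level restriction rather than a full sandbox.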
I disagree - I do not want to involve execution to interpret the contents of a config file. I at least want to be able to trivially determine if two config components are equal by simple visual inspection.
Either way it's personal preference. I do occasionally like to reread this though (because I'm boring): https://github.com/madmurphy/libconfini/wiki/An-INI-critique...