That's a hot take. In my experience Ansible is trivial to maintain and, because it lacks the more powerful templating that Salt has, most playbooks end up looking relatively similar; it's meant to be a simple, straightforward task runner. Onboarding folks into Ansible is a breeze because everything follows the standard layout (roles, tasks, group_vars, etc.). It's dumb, reliable, and does exactly what it says on the tin.
My biggest complaint with Ansible is the execution aspect. The fact that mitogen hasn't become the standard strategy plugin and instead Ansible still opens an SSH connection per task is nuts.
My experience is that Ansible's difficulty grows exponentially with the complexity of your workflows, whereas a programmatic approach tends to grow logarithmically. For low-complexity tasks Ansible is the simpler option; for high-complexity tasks it becomes the more complicated one.
Just as a random example, for complex group structures you have to know/remember that although Ansible allows nesting groups, groups are actually a flat structure. If you have prod.webservers and uat.webservers, the 'webservers' group will include members of both.
That's not hard to work around, but it can easily lead to situations where you end up deploying to prod even though you only intended to deploy to UAT.
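To make the flat-namespace point concrete, here is a minimal Python sketch. It is not Ansible code; the inventory dict, host names, and both helper functions are hypothetical and only model the behavior described above, where a child group name is global no matter which parent it is nested under.

```python
# Hypothetical nested inventory: environment -> child group -> hosts.
nested_inventory = {
    "prod": {"webservers": ["prod-web-1", "prod-web-2"]},
    "uat": {"webservers": ["uat-web-1"]},
}


def flatten_groups(inventory):
    """Collapse nested groups into one global name -> hosts mapping."""
    groups = {}
    for parent, children in inventory.items():
        for child, hosts in children.items():
            # The child name is global, so every parent feeds the same group.
            groups.setdefault(child, set()).update(hosts)
            # The parent group contains all hosts of its children.
            groups.setdefault(parent, set()).update(hosts)
    return groups


def hosts_for(groups, pattern):
    """Resolve a plain group name or an 'a:&b' style intersection."""
    parts = pattern.split(":&")
    result = set(groups.get(parts[0], set()))
    for name in parts[1:]:
        result &= groups.get(name, set())
    return sorted(result)


groups = flatten_groups(nested_inventory)
print(hosts_for(groups, "webservers"))       # prod and UAT hosts together
print(hosts_for(groups, "webservers:&uat"))  # UAT hosts only
```

Targeting `webservers` alone picks up prod hosts too; you have to intersect with the environment group (or give the child groups distinct names) to stay in UAT.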
Variables are kind of a mess too. Assuming the docs are authoritative, there are 22 different places that variables can come from, and which one is used depends on precedence. At some point, a 'switch' or a series of 'if-elif' statements starts to sound much simpler.
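As a rough illustration of what precedence resolution means, here is a small Python sketch. It is heavily simplified: it models only five of the sources, the `PRECEDENCE` list and `resolve` function are hypothetical names, and the real rules have more levels and special cases than this.

```python
# Simplified, lowest-to-highest precedence. Real Ansible has many more levels;
# this ordering is an illustrative subset, not the full documented list.
PRECEDENCE = ["role_defaults", "group_vars", "host_vars", "play_vars", "extra_vars"]


def resolve(name, sources):
    """Return (value, winning_source) for a variable, or (None, None)."""
    value, winner = None, None
    for source in PRECEDENCE:
        if name in sources.get(source, {}):
            # A later (higher-precedence) source silently overrides earlier ones.
            value, winner = sources[source][name], source
    return value, winner


sources = {
    "role_defaults": {"ssh_port": 22},
    "group_vars": {"ssh_port": 2222},
    "play_vars": {"ssh_port": 2200},
}
print(resolve("ssh_port", sources))
```

Even with five sources you need to know the ordering to predict the result; with 22, tracking down which definition won gets tedious quickly.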
RE: variables, I'm a heathen and set `hash_behaviour = merge`. This forces nested dict values to be merged instead of overwritten (lists cannot be merged, unlike Salt...). For whatever reason Ansible encourages users to not use this [1], which makes variable usage far more difficult than it should be IMO and leads to situations like you describe where folks have various top level variables for overrides (ssh:, ssh_local:, ssh_global:, etc to combine values).
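For anyone who hasn't seen the difference, here is a minimal Python sketch of replace versus merge semantics. It is not Ansible's implementation; `replace_vars`, `merge_vars`, and the sample variables are hypothetical and only model the behavior described above (nested dicts merged, lists still overwritten).

```python
def replace_vars(low, high):
    """Default-style behaviour: a higher-precedence key replaces the whole value."""
    result = dict(low)
    result.update(high)
    return result


def merge_vars(low, high):
    """Merge-style behaviour: nested dicts are merged key by key."""
    result = dict(low)
    for key, value in high.items():
        if isinstance(value, dict) and isinstance(result.get(key), dict):
            result[key] = merge_vars(result[key], value)
        else:
            # Non-dict values, including lists, are overwritten, not combined.
            result[key] = value
    return result


group_vars = {"ssh": {"port": 22, "permit_root_login": "no", "allow_users": ["deploy"]}}
host_vars = {"ssh": {"port": 2222, "allow_users": ["admin"]}}

print(replace_vars(group_vars, host_vars))  # permit_root_login is lost
print(merge_vars(group_vars, host_vars))    # permit_root_login survives
```

With replace semantics, overriding one key under `ssh` means restating the whole dict, which is how you end up with the `ssh_local:` / `ssh_global:` style workarounds.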
Oh wow, I've been working with ansible for a few years now, how have I not seen this before? I've even searched for it, but I didn't know what it was called, so...
Do you know if this behavior can be enabled per-role, or even better, per-task?
For context, I’ve written probably a million lines of Ansible and even maintain my own fork to add features like loopable blocks. I love it enough to curse its name.
Large Ansible code bases are painful to maintain. Roles don’t encapsulate well enough, and the setting that makes their variables private is unusable if you need “exports”/“outputs”, and I guarantee you do. Flow control isn’t powerful enough, so you supplement it either by dropping down to script/shell or by writing whole programs in Jinja. Variables are such spooky action at a distance that you need an external tool to graph where they come from. Vault variables aren’t greppable, so good luck there. You can’t write a type checker because it’s all strings, and any large codebase will dynamically generate variable names due to lack of expressiveness. You will desperately want a way to tell Ansible to eagerly evaluate some variables instead of keeping them as strings, and you will find out how ugly the workaround is without your own patches. You will eventually wish you could evaluate filters and lookups on the target, and might, like me, write a plug-in for that. Your options for complex data transformations are writing 50-line filter pipelines full of item.0 and item.1, or dropping to Python and just writing your own filter. You will also eventually wish you could define new modules as “subroutines” by chaining a few trivial module invocations together, and then find out that you must drop to an action plugin, which exposes all the underbelly of Ansible. You will pull your hair out getting Jinja to strip the whitespace from strings so that Ansible recognizes the result as an array, since it’s all strings. You will find out that Ansible has its own string type that doesn’t play nice with str and that is somehow exposed to you in bog-standard Ansible.
If Ansible was a Python library I would scream for joy at how much unnecessary complexity and necessary bullshit trivia I could drop from my brain.
> and instead Ansible still opens an SSH connection per task is nuts
Mitogen is not worth the bugs compared to using SSH ControlPersist.
Well yeah, because ControlPersist just makes session creation fast by multiplexing sessions over a single SSH connection. You still pay the internal SSH session-creation overhead, but the TCP and SSH handshake is the slow part. The speed gap between pipelining with ControlPersist and Mitogen is small enough that, to me, it's not worth the bugs and limitations of Mitogen.