I work at a small company - maybe 40 engineers. Our app is still a monolith, but we have secondary systems for processing data and communicating with external systems. It's not the most elegant thing ever engineered, but it's not bad given our security and compliance requirements.
I mention all this to illustrate that we are nowhere near NetAppAmaGooSoft scale. Nevertheless, our deployment is complex. We can't just have a few hundred identical machines. There are many heterogeneous parts, and they have to be hooked up to each other. We currently do this with ~20K lines of HCL applied with Terraform. It's mostly written by some hella-smart infra engineers, and is very well factored.
Still, it's a BEAST. The Terraform Enterprise workflow is tedious, and writing configuration is a lot of work. We would love to replace it with... something better. There are alternatives out there, but nothing that's obviously better by enough to be worth the migration effort. As far as I can tell, this is the state of the art, and it sucks.
My coworkers are sick of hearing me say "We are not Google", so I'm sympathetic to the YAGNI argument, but there really is a problem here. Flat YAML files would be a nightmare for us. I bet there are a lot of companies out there that have worse solutions than we do. The default of "each team rolls their own ad hoc deployment tools" masks it somewhat because it's not obvious that the company has 17 solutions to the same problem, with varying levels of effectiveness and reliability, all of which are expensive to write and maintain, and none of which will be reused when a new team gets organized.
Terraform is a valiant attempt to solve the problem with middling results. We can do better, and we need more attempts! I'm excited that CUE exists. It's not ready for my use case yet, but it's very promising. The best thing about it is that it scales well, up and down. If you have simple needs, you can just write flat CUE to start. It's just JSON with some syntax sugar. The fancy stuff can come later, but it's there when you need it.