Yes, when you intentionally change the output, you simply regenerate the golden file used as the reference (and yes, that should be easy). It’s brittle for sure, but it does catch unintentional changes and should be used where relevant (if sparingly). There are definitely existing frameworks that do this (e.g. Jest calls it snapshot testing and has tooling to make it easy).
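As a minimal sketch of what that looks like in Jest (the `formatReport` function is hypothetical): on the first run Jest writes the output to a `__snapshots__/` file; on later runs it diffs against that stored golden copy, and you regenerate intentionally with `jest --updateSnapshot` (or `-u`).

```ts
// Minimal Jest snapshot test; formatReport is a stand-in for whatever
// produces the output you want to pin down.
import { formatReport } from "./report";

test("formatReport output is stable", () => {
  expect(formatReport({ user: "ada", total: 3 })).toMatchSnapshot();
});
```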
I’m sorry your experiences with this kind of stuff have been bad. I’ve generally had good experiences in the machine learning space, where we used it judiciously and didn’t overdo it.
I don’t see how it can ever hinder you though - you can always choose to go “I don’t care that the output has changed dramatically - it’s the new ground truth”, as long as you communicate that that’s what’s happening in your commit. What it doesn’t let you do is have output that differs every time you run it, but that’s generally a positive (randomness should be injected intentionally and deterministically).
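For illustration, a sketch of what “injected deterministically” might look like (the `shuffle` function is hypothetical; mulberry32 is a small public-domain PRNG): the code under test takes the RNG as a parameter instead of calling Math.random() directly, so a fixed seed makes the snapshot stable.

```ts
// Tiny seedable PRNG (mulberry32); same seed -> same sequence.
function mulberry32(seed: number): () => number {
  return () => {
    seed |= 0;
    seed = (seed + 0x6d2b79f5) | 0;
    let t = Math.imul(seed ^ (seed >>> 15), 1 | seed);
    t = (t + Math.imul(t ^ (t >>> 7), 61 | t)) ^ t;
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Hypothetical code under test: randomness comes in from outside.
function shuffle<T>(items: T[], rand: () => number): T[] {
  const out = [...items];
  for (let i = out.length - 1; i > 0; i--) {
    const j = Math.floor(rand() * (i + 1));
    [out[i], out[j]] = [out[j], out[i]];
  }
  return out;
}

test("shuffle is deterministic under a fixed seed", () => {
  expect(shuffle([1, 2, 3, 4, 5], mulberry32(42))).toMatchSnapshot();
});
```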