> Then what happens when you have almost a hundred copy/pasted slightly rewritten 15-30 line variations on the same theme? How do you refactor then? (Yes, I have seen this in production systems, and yes, it was very critical code!) As you say, it comes down to cost/benefit.
The only thing that can happen - you understand the core function those 30 variations solve and introduce a full parametrised solution to the problem, then replace all places with calls to it. Generally there would be some way to tell where all these copy-pastes are using something not much more complex than a regex. But even then, maybe just leaving it duplicated is better.
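As a minimal sketch of what that could look like (Python, with made-up names and data; the code being discussed isn't shown in this thread), two copy-pasted variants that differ only in a threshold and a label collapse into one parametrised function:

```python
# Hypothetical copy-pasted variants: same loop, different constants.
def flag_large_orders(orders):
    result = []
    for order in orders:
        if order["total"] > 1000:
            result.append((order["id"], "large"))
    return result


def flag_huge_orders(orders):
    result = []
    for order in orders:
        if order["total"] > 10000:
            result.append((order["id"], "huge"))
    return result


# Parametrised replacement: the parts that varied become arguments.
def flag_orders(orders, threshold, label):
    return [(o["id"], label) for o in orders if o["total"] > threshold]


orders = [
    {"id": 1, "total": 500},
    {"id": 2, "total": 1500},
    {"id": 3, "total": 20000},
]

# Each old call site becomes a call with explicit parameters.
large = flag_orders(orders, 1000, "large")
huge = flag_orders(orders, 10000, "huge")
```

Real variants rarely differ this cleanly, which is where the cost/benefit question comes in.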
The concern I have with this approach is how easy it is to forget to alter the other variants when you alter one of them.
How do you know the code you're touching has other versions that are semantically the same and should also be altered?[1]
How do you avoid having to fix the same bug several times because you bug-fixed one place but not the others?
How do you avoid the technical debt that builds over time when instances of the pattern within the codebase are each at subtly different "versions" with similar, but not identical, semantics (even though identical would have worked fine)?
[1] Of course, DRY code has the inverse: How do you know if an existing function to do what you want to do already exists so you can avoid duplicating the extraction?
Where my opinion falls today is: a slightly more complex solution is often less risky and more maintainable than straightforward duplication. At least the complex solution looks complex, so a would-be maintainer is aware of the cost up front. Duplicated code can carry a bunch of hidden costs whenever it's touched, and those won't necessarily become apparent until later, when the technical debt throws a monkey wrench into unrelated plans.
If your variants need to be similar, that alone is a great reason to abstract. It makes the abstraction more valuable.
Meanwhile, there could easily be transformative code that just computes some stuff you often need. In those cases altering one variant need not affect the others.
No, what will happen is that one of your colleagues fixes a bug in a couple of places, someone else fixes some other bugs somewhere else, and at the end the buggy duplicated code becomes a buggy mess in which no one has any idea of what should be the right behaviour.
How, how can leaving duplicates and starting this bloody mess ever be better?
I honestly can't see it.
Seems equivalent to the case where you have one generalized function with a bunch of obscure special cases coded into it. I feel like having 30 functions means you can easily trace which parts of the program are exercising which special cases.
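To make the comparison concrete, here is a rough sketch (Python, every name invented for illustration) of a generalized function with special-case flags, plus thin named wrappers that keep the traceability of separate functions:

```python
# Hypothetical generalized function: each flag is a special case.
def normalise(record, *, strip_ws=False, legacy_ids=False):
    out = dict(record)
    if strip_ws:
        # Trim whitespace on string values only.
        out = {k: v.strip() if isinstance(v, str) else v for k, v in out.items()}
    if legacy_ids:
        # Prefix ids the way an (assumed) older system expects.
        out["id"] = f"L-{out['id']}"
    return out


# Named wrappers: searching for a wrapper's name shows which part of
# the program exercises which combination of special cases.
def normalise_for_import(record):
    return normalise(record, strip_ws=True)


def normalise_for_legacy_export(record):
    return normalise(record, strip_ws=True, legacy_ids=True)
```

Whether this beats 30 fully separate functions depends on how much the bodies really share.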
> The only thing that can happen - you understand the core function those 30 variations solve and introduce a full parametrised solution to the problem, then replace all places with calls to it. Generally there would be some way to tell where all these copy-pastes are using something not much more complex than a regex.
Knowing where these things were wasn't a problem. They were all on the class side of certain classes. (Yes, this was Smalltalk, but this entire subsystem didn't have a single instance variable in it!) I was on a team of 10, with some very smart guys. We all wanted to "understand the core function" in this subsystem but what it really was, was an object system, where objects were expressed as consecutive entries in a series of arrays. Every method resembled some kind of complex merge with multiple arrays and multiple incrementing indexes and varying side effects embedded in nested conditional logic. Only one developer understood the underlying object model, and she wasn't apt to share. Rather, it was the source of her job security. (Most days, she spent in the cafe on the 1st floor, reading a book, until she got notifications, then had to "consult.") If you pointed out the "unusual" nature of an entire Smalltalk subsystem without a single instance variable in it, she started talking to you about her PhD in Math.
No, you aren't such a genius, nor my colleagues and I such dullards, that we only needed you to show up and point out a few simple truths.
If you change something that has that many parameters, there's a very high chance you will introduce a bug. You'd better have a test for each variation. Or simply keep them separate so that they do not affect each other.
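A test per variation could be as simple as a table with one row per former variant. This is only a sketch in Python; `flag_orders`, the sample data and the expected values are all hypothetical stand-ins:

```python
# Hypothetical parametrised function standing in for the merged variants.
def flag_orders(orders, threshold, label):
    return [(o["id"], label) for o in orders if o["total"] > threshold]


ORDERS = [
    {"id": 1, "total": 500},
    {"id": 2, "total": 1500},
    {"id": 3, "total": 20000},
]

# One row per former variant: (name, threshold, label, expected result).
CASES = [
    ("large", 1000, "large", [(2, "large"), (3, "large")]),
    ("huge", 10000, "huge", [(3, "huge")]),
]


def run_cases():
    """Return the names of the variants whose behaviour changed."""
    failures = []
    for name, threshold, label, expected in CASES:
        if flag_orders(ORDERS, threshold, label) != expected:
            failures.append(name)
    return failures
```

An empty list from `run_cases()` means a change to the shared function left every listed variant's behaviour intact, at least for these inputs.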
> The only thing that can happen - you understand the core function those 30 variations solve and introduce a full parametrised solution to the problem, then replace all places with calls to it. Generally there would be some way to tell where all these copy-pastes are using something not much more complex than a regex. But even then, maybe just leaving it duplicated is better.