This is useful for example when implementing compiler optimizations, or concrete string implementations. It means that you can reduce "foo" + "bar" + foo to at least "foobar" + foo and that you can intern "foobar". Would that work the same without calling it a monoid? Of course, monoid is just a name. But that name allows you to go shopping in a wide variety of other fields and work other people have done and mine it for ideas or even concrete implementations.
A vast, overwhelming majority of programmers will never implement a concrete string type or compiler optimizations. Can you show me how this kind of theory is practical for a working programmer who does things like batch processing jobs, web applications, embedded software, event-driven distributed systems, etc?
Consider the monoid abstraction in the context of batch processing. Anywhere you have a monoid, you have a corresponding “banker’s law” that says you can find the generalized sum of a collection of items by partitioning the items into groups, computing the sum over each group, and then taking the sum of the sums as your final result. This idea has many applications in batch processing.
For example, in the MapReduce framework, this idea gives rise to “Combiners” that summarize the results of each map worker to massively lower the cost of shuffling the results of the Map stage over the network prior to the Reduce stage.
Another example: In distributed database systems, this idea allows many kinds of addition-like aggregations to be performed more efficiently by computing the local sum for each group under the active GROUP BY clause before combining the groups’ subtotals to yield the wanted gobal totals.
Basically, in any situation in which you have to compute a global sum of some kind over a collection of items, and some of those items are local to each worker, you can compute the global sum faster and with fewer resources whenever you can take advantage of a monoidal structure.
Which is exactly the concept between optimizing for string concatenation and interning them, ultimately. Sure you can make do with the "algebraic" definition of a monoid, or entirely without it, but that doesn't mean the abstraction isn't there to inform your thinking and your research.
One point that really stuck with me is how people who apply category theory (say, people at the topos institute) use the concepts as little tools, and everytime a problem crosses their way, they try out all the little tools to see if one of them works, similar to how Feynman describes carrying out problems in his head until one day a technique unlocks something).
Having more general abstractions just allows them to be applied to more problems.
my personal favourite: seeing the foldable (a monoid with two types, kind of) in event driven systems means you can model a lot of your embedded software / web applications as a set of functional state/event reducing functions, and build a clean system right from the get go. Seeing the functors in there tells you which part of the state can be split, parallelized or batched for performance or modularity).
Again, these are all very possible without knowing category theory, but category theory studies what the abstractions behind this could be. I find that the huge amount of intuition (driven by some formalism picked up along the way, but even more so by just writing a lot of code) I developed is reflected in what I see in category theory. It's why I genuinely think that web development is just like embedded development: https://the.scapegoat.dev/embedded-programming-is-like-web-d... which seems to be way more controversial than I thought it would be.
I once wrote a unifying parent class for several client-specific reports we had. Think the parent class implementing the top-level control flow/logic and the child implementations having methods that it calls into, sometimes once (e.g. for the header) and sometimes per-row.
Specific reports needed various kinds of customization. Some needed to call the client's web API to get some extra data for each row (and since this was across the internet, it needed to be async). Some needed to accumulate statistics on each row and sum them over the whole report at the end. One needed to query an ancillary database for each row, but all those queries had to be part of the same transaction so that they would be consistent.
Now in theory you can do ad-hoc things for each of those cases. You could make the per-row method always async (i.e. return Future), so that you can override it to do a web API call sometimes. You could stash the statistics in a mutable variable in the report that accumulates them, remembering to do locking. You could use a session on the ancillary database bound to a threadlocal to do the transaction management (most database libraries assume that's how you're doing things anyway), and as long as your returned Futures are never actually async then it would probably work. But realistically it would be very hard to do safely, and I'd never have spotted the underlying symmetry that let me pull out the high-level logic. More likely we'd've stuck with three distinct copy-pasted versions of the report code, with all the maintenance burden that implies.
In principle an abstraction never tells you something you didn't already know - whatever properties you're using, you could've always worked them out "by hand" for that specific case. Like, imagine programming without the concept of a "collection" or any idea of the things you can do on a collection generically (such as traverse it) - instead you just figure out that it's possible to traverse a linked list or a red-black tree or an array, so you write the code that works on all of them when you need it. That's absolutely a way that you can program. But if you don't have this vocabulary of concepts and patterns then realistically you miss so many chances to unify and simplify your code. And if you have the rigorous category-theory concepts, rather than fuzzier design patterns, then you have quick, objective rules for figuring out when your patterns apply - and, even more importantly, when they don't. You can use library code for handling those concepts with confidence, instead of e.g. wondering whether it's ok for your custom iterator to expect the library to always call hasNext() before it calls next(). It's a huge multiplier in practice.