Unless `a` is a linear value, somebody might have a reference to it, so you can't just sort it in place under the cover. The entire thing is useless if you looks like you don't have side-effects but the compiler visibly breaks this.
And you probably want to pick one of sorting in-place and out-of-place.
Yeah, as a rule of thumb I'd say "If Haskell doesn't already do this optimization, find out why."
I say "rule of thumb" and I mean it that way. Sometimes there will be Haskell-specific answers. But if your programming language has less code metadata available than Haskell but is promising more optimizations, it's worth a cross-check. I agree with you strongly in this particular case; without type system support for this specific case, it's going to be very hard to optimize away things like a sort. You start getting into the world of supercompilation and whole-program optimization, the primary problems of which for both of them is that they tend to have intractable O(...) performances... but something about them makes them seem intuitively easy. I've seen a number of people fall into that trap.
(I haven't personally, but I understand the appeal. My intuition says they shouldn't be that difficult too! But the evidence clearly says it is. Very, very clearly. We're talking about things like optimizations that are super-exponential in complexity, but seem intuitively easy.)
And you probably want to pick one of sorting in-place and out-of-place.