You may think you don’t need it, but I suspect that if you write larger-than-trivial Python applications, there comes a point where you do, even if you don’t think about how the GIL is restricting you. Take load time, for example, for something that does a bunch of work to initialise. Without the GIL it would be relatively trivial to speed up the independent parts, while with the GIL you are usually out of luck, because you can have either shared memory (e.g. previously loaded state) or concurrency, but not both at the same time. In my experience, serialising the state and using multiprocessing often eats up all the potential gains.
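A minimal sketch of that trade-off (the `big_state` dict and `lookup` function are made up for illustration): threads read the already-loaded state directly, whereas a process pool would have to pickle arguments and results and re-create the state in every worker.

```python
from concurrent.futures import ThreadPoolExecutor

# Stand-in for expensive-to-initialise shared state.
big_state = {i: i * i for i in range(1_000_000)}

def lookup(key):
    # Reads the shared dict in place; nothing is copied or pickled.
    return big_state[key]

# Threads share big_state for free. Under the GIL these lookups
# cannot run in parallel; on a free-threaded build they can.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(lookup, range(10)))

# A ProcessPoolExecutor would instead pickle every argument and
# result and rebuild big_state per worker, which for large state
# can swallow the gains from running in parallel.
```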
I would love to parallelize a plugin script in Cura, the 3D print slicer. It does a bunch of embarrassingly parallel calculations and could be made at least 16x faster for me. Because it's a plugin, though, it isn't pickle-able, so multiprocessing doesn't work. I managed to make it work in a branch, but only on OSes that can fork processes. On Windows, the plugin spawns multiple GUIs, because importing the cura package apparently has the side effect of launching it...
If there weren't a GIL, I could just create a thread pool and be done, and Cura could continue to be a delightful mess. :-)
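For the curious, the thread-pool version could look roughly like this (`process_chunk` is a hypothetical stand-in for the plugin's per-item calculation, not Cura's actual API). Since threads share the interpreter's memory, nothing needs to be pickled and no new process ever re-imports the host package, so the spawn-a-second-GUI problem simply doesn't arise.

```python
import os
from concurrent.futures import ThreadPoolExecutor

def process_chunk(chunk):
    # Stand-in for an embarrassingly parallel calculation; it can
    # freely touch objects from the host application, because
    # threads share memory and nothing crosses a process boundary.
    return sum(x * x for x in chunk)

chunks = [range(i * 100, (i + 1) * 100) for i in range(16)]

# On a free-threaded build these chunks run truly in parallel;
# under the GIL they merely interleave. Either way, there is no
# fork/spawn portability issue to work around.
with ThreadPoolExecutor(max_workers=os.cpu_count()) as pool:
    totals = list(pool.map(process_chunk, chunks))
```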
20-odd years ago, my team solved some performance issues in our desktop app by splitting a couple of tasks into their own threads, on a computer with a single CPU. Being able to write threaded code is really useful.