Because it's not a huge game changer.
What would you do differently if the GIL was gone?
when people need speed in Python they write C/C++ extensions or use multiple processes/computers
1 - This helps parallelize regular python code too, like the kind that goes into writing web services.
2 - While you'd still write native extensions to get optimum performance for single threaded code, having the thread dispatch layer in python makes parallelizing execution convenient and robust. For example, see the comment from people who maintain scikit-learn below. I'd love to see python get composable task parallelism like Julia where you can freely spawn threads within threads without worrying about their lifetimes.
Well for web dev it would be simpler because you can get rid of workers which in turn reduces the overall memory footprint of applications since the code is shared, the local caches are shared and the connections are shared.
Edit: For backend services it would probably put Python around one order of magnitude away from Go in terms of memory usage to serve the same load, enough for teams not to consider switching.
Writing C/C++ extensions is super hard, and not something > 90% of the Python users is willing to pick up. I would love for Python to support a Parallel.ForEach statement as is supported in C#, or for Pandas to support parallel operations.
Here's an example: embedding Python in other applications. Out of the box it's impossible to just start running scripts at will from multiple threads. Which happens to be what our application does, because as you say we want some speed/concurrency so have a backend running multiple threads C#/C++. But on top of that some things are customised in Python, and now we use IronPython which has no GIL so that all just works. Is this a huge game changer? As a whole, no, but for us: yes it would be awesome if we could just use Python.Net (which is just a layer over CPython) out of the box without having to deal with it's GIL.
What if, when people in python need speed, they have the option to just replace a for-loop with threapool.map?
Currently, that doesn't do much. Trying to do this with a processPool either works, or becomes a horrible exercise in fighting with pickle or whatever other serializer you pick.
Run an ordinary database connection pool in shared memory mode instead of offloading it to a separate process via PGBouncer. Do that with all of the other resource pools as well. Heck, for one off webservers with low load you can do in memory session handling without needing to pull in memcache or redis until you truly need multiple processes or the sessions to survive a restart.
What would I do differently? Actually make engineering choices based on the current project status and priorities without having those choices made for me.