As other commenters have mentioned, the GIL-removal branch also includes some unrelated optimizations, and as I understand it, the performance improvement comes from those rather than from the GIL removal itself.
From your link:
"Stripping out some of the GIL-removal related changes would result in even faster performance on the single-threaded pyperformance benchmark suite. [...] The resulting interpreter is about 9% faster than the no-GIL proof-of-concept (or ~19% faster than CPython 3.9.0a3). That 9% difference between the “nogil” interpreter and the stripped-down “nogil” interpreter can be thought of as the “cost” of the major GIL-removal changes."
So it seems removing the GIL has a negative impact on single-threaded code: the version that keeps the GIL but includes the unrelated optimizations is about 9% faster.
[1] https://docs.google.com/document/d/18CXhDb1ygxg-YXNBJNzfzZsD...
> Why need concurrency in normal Python scripts, let's say for web scraping or machine learning?
I had a few scripts where I had to process over a hundred files. Just running that in parallel can cut a job that takes minutes down to seconds.
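For illustration, here is a rough sketch of that kind of script, not the one I actually ran. `process_file` and `process_all` are made-up names, the line counting is a stand-in for the real per-file work, and the `*.txt` glob is an assumed input location. It uses threads, which is fine when the work is mostly I/O. For CPU-heavy work under the GIL you would swap in `ProcessPoolExecutor`, which has the same `map` interface.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def process_file(path):
    """Hypothetical per-file job: count the lines in one file."""
    with open(path, "rb") as f:
        return sum(1 for _ in f)


def process_all(paths, max_workers=8):
    """Run process_file over every path concurrently.

    Results come back in the same order as the input paths.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(process_file, paths))


if __name__ == "__main__":
    # Assumed input location: every .txt file in the current directory.
    files = sorted(Path(".").glob("*.txt"))
    for path, line_count in zip(files, process_all(files)):
        print(path, line_count)
```

How much this helps depends on the workload: threads overlap the waiting on disk or network, while CPU-bound work only scales across cores with separate processes (or an interpreter without the GIL).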