
Hi! I'm the author of this article. Thanks for posting it.

The GIL is an old topic, but I was surprised to learn recently that it's much more nuanced than I thought. So here is my bit of research.

This article is a part of my series that dives deep into various aspects of the Python language and the CPython interpreter. The topics include: the VM; the compiler; the implementation of built-in types; the import system; async/await. Check out the series if you liked this post: https://tenthousandmeters.com/tag/python-behind-the-scenes

I welcome your feedback and questions, thanks!




Haven't checked the full series yet but this GIL article is very nicely written and insightful. Thank you


Wow, just skimmed over a few articles in the series and they seem very nice and well researched. Congratulations.


You're doing great work, and this isn't the first article from your site that I've passed along to my coworkers. Keep it up!


You filled in a lot of gaps in my GIL knowledge. Today I learned something new. Thank you for writing it!


In the opening paragraph you state that the GIL prevents speeding up CPU-intensive code by distributing the work among multiple threads.

My understanding is that distributing work across multiple threads would not speed up CPU-intensive code anyways. In fact it would add overhead due to threading.

Perhaps you meant I/O bound code?


There are two things you should consider here: wall-clock time and CPU time. Making code faster using multiple threads will increase CPU time by some amount, but because that work is now distributed across several cores, it should reduce wall-clock time.

There are many CPU-bound tasks which can be made multithreaded and faster, but it does depend on the task and how much extra coordination you're adding to make it multithreaded.
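
To make the distinction concrete, here is a minimal sketch that measures both numbers for a pure-Python CPU-bound loop split across threads. The names `count_down`, `measure`, and `run_in_threads`, and the workload size, are made up for illustration. On a CPython build with the GIL I would expect wall-clock time to stay roughly flat as threads are added, since the threads take turns; a runtime without a GIL should show wall-clock time dropping while CPU time stays about the same or grows slightly.

```python
import threading
import time


def count_down(n):
    """Pure-Python CPU-bound loop; in CPython it holds the GIL while running."""
    while n > 0:
        n -= 1
    return n


def measure(fn, *args):
    """Return (wall_seconds, cpu_seconds) spent running fn(*args)."""
    wall_start = time.perf_counter()
    cpu_start = time.process_time()  # CPU time of this process, all threads
    fn(*args)
    return time.perf_counter() - wall_start, time.process_time() - cpu_start


def run_in_threads(n, num_threads):
    """Split n iterations evenly across num_threads threads and wait for all."""
    threads = [
        threading.Thread(target=count_down, args=(n // num_threads,))
        for _ in range(num_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()


if __name__ == "__main__":
    N = 20_000_000  # arbitrary workload size, chosen only for the demo
    for k in (1, 2, 4):
        wall, cpu = measure(run_in_threads, N, k)
        print(f"{k} thread(s): wall={wall:.2f}s cpu={cpu:.2f}s")
```

Note that `time.process_time()` only covers the current process, so this sketch would not account for work done in child processes.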


The author is speaking about the general concept of threading, outside of Python (without using C extensions to help out as discussed in the article). In general, if you don't have a GIL and you have two or more cores, then running additional threads will give you a speedup for CPU-intensive code. The actual speedup will vary. A sibling comment mentions embarrassingly parallel problems; those are things like ray tracing, where each computation is independent of all the others. In those cases, you get near-linear speedup with each additional core and thread. If there is more coordination between the threads (mutexes and semaphores, for instance, controlling access to a shared datum or resource), then you will get a less-than-linear speedup. And if there is too much contention for a common resource, you won't get any speedup and will see some slowdown due to the overhead introduced by threads.


Oh, I thought you needed multiple processes for this.


If it was being distributed amongst Python threads (which run on one hardware thread), then CPU performance can't improve since they're just taking turns using the CPU. If you're running on multiple hardware threads (what I assume the author meant), that can improve CPU performance since it will distribute work across real threads that can run in parallel.

The GIL restricts use of multiple hardware threads.


You can very well parallelise CPU-intensive problems. Look at e.g. "Embarrassingly parallel" on Wikipedia. Intuitively, if you can divide your work into chunks that are large enough, the scheduling overhead becomes negligible.


What are you talking about? Worker thread pools are the most common way to take advantage of multiple cores. You can typically see speedups (for highly parallel code) of nX for n cores.


In Python there is nuance to the term threads.

Threads in Python will run in the same Python process and use only one core.

If you want to use more cores you need to use more Python processes.
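
As a rough illustration, the sketch below runs the same chunked CPU-bound job through a thread pool and a process pool from the standard `concurrent.futures` module. The helper names (`sum_of_squares`, `split`, `run`) and the sizes are hypothetical, and actual timings will depend on the machine and the interpreter build. On a CPython build with the GIL I would expect the process pool to be the faster of the two on a multi-core machine.

```python
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor
import time


def sum_of_squares(bounds):
    """CPU-bound work on one chunk: sum i*i for i in [start, stop)."""
    start, stop = bounds
    return sum(i * i for i in range(start, stop))


def split(n, parts):
    """Split range(n) into `parts` contiguous (start, stop) chunks."""
    step = n // parts
    return [
        (i * step, n if i == parts - 1 else (i + 1) * step)
        for i in range(parts)
    ]


def run(executor_cls, n, workers):
    """Sum the squares below n, one chunk per worker, using executor_cls."""
    with executor_cls(max_workers=workers) as pool:
        return sum(pool.map(sum_of_squares, split(n, workers)))


if __name__ == "__main__":
    N, WORKERS = 5_000_000, 4  # arbitrary sizes, chosen only for the demo
    for cls in (ThreadPoolExecutor, ProcessPoolExecutor):
        t0 = time.perf_counter()
        total = run(cls, N, WORKERS)
        elapsed = time.perf_counter() - t0
        print(f"{cls.__name__}: {elapsed:.2f}s (total={total})")
```

The chunks are deliberately large so that the cost of starting workers and passing results back stays small relative to the computation.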


... but that's because of these very GIL limitations, is it not?


Sure, but I think if you are discussing this type of thing in the context of Python, you have to use the threads/processes terminology to avoid confusion.


Not in an article explaining why those limitations exist at all.


The reason why this is true in Python is the GIL. In other languages without a GIL, multiple threads will run on multiple cores and can speed up CPU bound code.



