Threading

GIL: Who, what, why?

For most posts I concentrate on using Python to solve tasks (mostly system administration based). Apart from fringe cases these are not multi-threaded, so I can safely ignore Python’s Global Interpreter Lock (normally shortened to GIL). Even when running a web server, multi-threading is usually left to the web framework, so again the GIL can safely be ignored.

What is the GIL? Firstly, it is specific to CPython, the version of Python you are most likely to be running. Other implementations do things differently: Jython and IronPython have no GIL, while PyPy has its own. It is just a mechanism to marshal access to interpreter internals (variables mostly) from different threads, so that only one thread executes Python bytecode at a time. It makes coding involving multiple threads straightforward, but it means threads in CPython are generally only good for work that spends its time waiting on blocking I/O, not for CPU-bound work.
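A rough illustration of the blocking I/O case, using time.sleep as a stand-in for a blocking call (it releases the GIL while waiting, much as a blocking read would). The names fake_io and run_concurrently and the timings are made up for this sketch:

```python
import threading
import time


def fake_io(seconds):
    # time.sleep releases the GIL while waiting, so other threads can run
    time.sleep(seconds)


def run_concurrently(count, seconds):
    """Start `count` threads that each block for `seconds`; return elapsed time."""
    threads = [
        threading.Thread(target=fake_io, args=(seconds,)) for _ in range(count)
    ]
    started = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # wait for every thread to finish
    return time.perf_counter() - started


if __name__ == "__main__":
    elapsed = run_concurrently(4, 0.5)
    # The waits overlap, so the total is far less than 4 * 0.5 seconds
    print(f"4 threads took {elapsed:.2f}s")
```

If fake_io did CPU-bound work in pure Python instead of sleeping, the threads would take turns holding the GIL and you should not expect the same speed-up.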

For a brief but more technical explanation, Vinay Sajip posted a good single-paragraph description of the GIL:

“Python’s GIL is intended to serialize access to interpreter internals from different threads. On multi-core systems, it means that multiple threads can’t effectively make use of multiple cores. (If the GIL didn’t lead to this problem, most people wouldn’t care about the GIL – it’s only being raised as an issue because of the increasing prevalence of multi-core systems.) If you want to understand it in detail, you can view this video or look at this set of slides.”

Why my interest? I have just finished reading an article by A. Jesse Jiryu Davis which goes into far more detail about the GIL. If you are planning a C extension, looking at multi-threading some code which shares data, or are just curious, try his Grok the GIL article as a starting point.

Threading

If you’ve come to Python from another language (certainly a low level language) then you know that threading is hard. You have to guard shared data with a mutex or a semaphore, perhaps through a synchronized keyword if the language supports one.

So it may come as a pleasant surprise to find that in Python you simply initialize a Thread class with the function you want to run (myfunc in the example below) and start it.

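import threading
t = threading.Thread(target=myfunc)  # myfunc is the callable to run in the new thread
t.start()  # returns immediately; myfunc runs in the background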

I could put a fully working example in with a few more lines but SaltyCrane already has a simple Threading example on his blog which I cannot beat for clarity.

The problem with this simplistic method is that there is no way to interact with the thread. That is fine if you want to split out a long-running I/O operation or a finite background task, but what if you need to stop the thread or query its status? You could work around this by passing in a mutable object, but really you want to create your own class.
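As a minimal sketch of the mutable object workaround: the thread is handed a dict to report status into and a threading.Event to be told when to stop. The names worker, state and run_briefly are invented for illustration.

```python
import threading
import time


def worker(state, stop_event):
    # Record progress in the shared dict so the caller can inspect it
    while not stop_event.is_set():
        state["loops"] += 1
        stop_event.wait(0.05)  # pause, but wake early if asked to stop
    state["finished"] = True


def run_briefly(seconds):
    """Run the worker for roughly `seconds`, then stop it and return its state."""
    state = {"loops": 0, "finished": False}
    stop_event = threading.Event()
    t = threading.Thread(target=worker, args=(state, stop_event))
    t.start()
    time.sleep(seconds)
    stop_event.set()  # ask the thread to stop
    t.join()          # wait until it actually has
    return state


if __name__ == "__main__":
    print(run_briefly(0.3))
```

It works, but the state and the stop signal live outside the thread object, which is part of why a dedicated class tends to read better.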

In the Thread class, calling start creates the new thread, which then calls the run method to actually execute your function. So your class just needs to override this run method. You will probably want to override the __init__ method as well, in which case don’t forget to call the parent initialization before you start the thread.
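A minimal sketch of that pattern, assuming a made-up SquareThread class that keeps its result on the instance so it can be queried afterwards:

```python
import threading


class SquareThread(threading.Thread):
    """Hypothetical example: squares a number in a separate thread."""

    def __init__(self, number):
        super().__init__()  # parent initialization; must happen before start()
        self.number = number
        self.result = None

    def run(self):
        # Executed in the new thread once start() is called
        self.result = self.number * self.number


if __name__ == "__main__":
    t = SquareThread(7)
    t.start()
    t.join()  # wait for run() to finish before reading the result
    print(t.result)
```

Note that you call start, never run directly; calling run yourself would execute the code in the current thread.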

To demonstrate, I have created a simple threading example. You initialize the class with a name and the number of seconds to sleep, and it just writes the name to the console and then sleeps for the specified time, over and over. The testing code creates 5 instances with different names and sleep times, starts them, lets them run for 5 minutes so you can see the different output, and then stops them.
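A sketch along the lines of that description follows. It is not the original example code: the class name NamedSleeper, the stop method, the count attribute and the demo helper are all assumptions, and the run time is cut to a couple of seconds to keep it quick.

```python
import threading
import time


class NamedSleeper(threading.Thread):
    """Prints its name, sleeps, and repeats until stop() is called."""

    def __init__(self, name, seconds):
        super().__init__(name=name)  # parent initialization
        self.seconds = seconds
        self.count = 0  # how many times the name has been printed
        self.stop_requested = threading.Event()

    def run(self):
        while not self.stop_requested.is_set():
            print(self.name)
            self.count += 1
            # Sleep for the interval, but wake early if stop() is called
            self.stop_requested.wait(self.seconds)

    def stop(self):
        self.stop_requested.set()


def demo(run_for):
    """Run 5 sleepers with different intervals for `run_for` seconds."""
    sleepers = [NamedSleeper(f"thread-{i}", 0.1 * i) for i in range(1, 6)]
    for s in sleepers:
        s.start()
    time.sleep(run_for)
    for s in sleepers:
        s.stop()
    for s in sleepers:
        s.join()
    return sleepers


if __name__ == "__main__":
    demo(2)
```

Using an Event for both the stop flag and the sleep means a stop request takes effect straight away instead of waiting out the current sleep.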