Concurrency in Python - concepts, frameworks and best practices
This talk discusses:
- concurrency concepts (e.g. atomicity, race conditions and deadlocks)
- frameworks for concurrency and their use (`threading`, `multiprocessing`, `concurrent.futures`, `asyncio`)
- higher abstractions, e.g. queues and active objects
- best practices for writing concurrent code
Tags: Parallel Programming, Programming, Python
Scheduled on Friday 10:30 in room lecture
Speaker
Stefan has been using Python professionally for more than 15 years, the last 10+ years as a freelance software developer and consultant (https://sschwarzer.com/en/). He has published articles on Python and given talks on Python at several conferences. He's also the maintainer of the ftputil library (https://pypi.org/project/ftputil/).
Description
Have you run into situations where concurrent execution could speed up your Python code? Are you using a GUI toolkit?
This talk gives you the background to use concurrency in your code without shooting yourself in the foot - which is quite easy if you don't understand how concurrent execution differs from linear execution!
The presentation starts by explaining concepts such as concurrency, parallelism, resources, atomic operations, race conditions and deadlocks.
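To make the terms "atomic operation" and "race condition" a bit more concrete, here is a minimal sketch (not taken from the talk). The names `Counter` and `run_workers` are made up for illustration. The increment is a read-modify-write sequence, so the sketch guards it with a `threading.Lock`; without the lock, concurrent increments could in principle be lost.

```python
import threading


class Counter:
    """A shared resource; the lock makes each increment effectively atomic."""

    def __init__(self):
        self._value = 0
        self._lock = threading.Lock()

    def increment(self):
        # "self._value += 1" reads, adds and writes back: several steps,
        # not one. Holding the lock keeps two threads from interleaving
        # inside that sequence.
        with self._lock:
            self._value += 1

    @property
    def value(self):
        with self._lock:
            return self._value


def run_workers(counter, n_threads=4, n_increments=10_000):
    """Increment the counter from several threads and wait for all of them."""

    def work():
        for _ in range(n_increments):
            counter.increment()

    threads = [threading.Thread(target=work) for _ in range(n_threads)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return counter.value
```

With the lock in place, `run_workers(Counter())` should always return `n_threads * n_increments`. A deadlock, by contrast, typically needs at least two locks that different threads acquire in different orders, which is one reason to keep the number of locks small.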
Then we discuss the commonly-used approaches to concurrency: multithreading with the `threading` module, multiprocessing with the `multiprocessing` module, and event loops (which include the `asyncio` framework). Each of these approaches has its typical use cases, which are explained.
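As a rough illustration (not taken from the talk), the following sketch runs the same simulated I/O wait once with `threading` and once with `asyncio`. The function names and the `DELAY` constant are hypothetical. A commonly cited rule of thumb is that threads and event loops suit I/O-bound work, while `multiprocessing` is usually chosen for CPU-bound work; it is left out here only to keep the sketch short.

```python
import asyncio
import queue
import threading
import time

DELAY = 0.05  # seconds; stands in for a network or disk wait


def fetch_blocking(name, results):
    time.sleep(DELAY)  # blocking wait; other threads keep running meanwhile
    results.put((name, f"{name} done"))  # queue.Queue is safe to share


def run_with_threads(names):
    """One thread per task; results are collected through a queue."""
    results = queue.Queue()
    threads = [
        threading.Thread(target=fetch_blocking, args=(name, results))
        for name in names
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return dict(results.get() for _ in names)


async def fetch_async(name):
    await asyncio.sleep(DELAY)  # hands control back to the event loop
    return name, f"{name} done"


async def _gather_all(names):
    pairs = await asyncio.gather(*(fetch_async(name) for name in names))
    return dict(pairs)


def run_with_asyncio(names):
    """All tasks share one thread; the event loop switches at each await."""
    return asyncio.run(_gather_all(names))
```

Both functions return the same dictionary for the same list of names. The difference is in how the waiting overlaps: the threaded version relies on the operating system to switch threads, whereas the `asyncio` version switches only at explicit `await` points.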
You can implement concurrency on a number of abstraction levels. The lowest level consists of primitives like locks, events, semaphores and so on. A higher abstraction level is using queues, typically with worker threads or processes. Even higher abstraction levels are active objects (hiding primitives or queues behind an API; this includes "actors" if you've heard of them), the thread and process pools in `concurrent.futures` and the `asyncio` framework. Finally, you can "outsource" concurrency by leaving it to a message broker, which is a distinct process that receives and distributes messages.
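The step from queues to pools can be sketched as follows. This is an illustrative example, not code from the talk; `square_all`, `square_all_with_pool`, `_square` and the `_STOP` sentinel are hypothetical names. The first function wires up worker threads and queues by hand, the second leaves the same bookkeeping to `concurrent.futures.ThreadPoolExecutor`.

```python
import queue
import threading
from concurrent.futures import ThreadPoolExecutor

_STOP = object()  # sentinel that tells a worker to shut down


def _square(number):
    return number * number


def _worker(tasks, results):
    while True:
        item = tasks.get()
        if item is _STOP:
            break
        results.put((item, _square(item)))


def square_all(numbers, n_workers=3):
    """Queue level: worker threads pull tasks from one queue, push to another."""
    numbers = list(numbers)
    tasks = queue.Queue()
    results = queue.Queue()
    workers = [
        threading.Thread(target=_worker, args=(tasks, results))
        for _ in range(n_workers)
    ]
    for worker in workers:
        worker.start()
    for number in numbers:
        tasks.put(number)
    for _ in workers:
        tasks.put(_STOP)  # one sentinel per worker
    for worker in workers:
        worker.join()
    return dict(results.get() for _ in numbers)


def square_all_with_pool(numbers, n_workers=3):
    """Pool level: the executor manages threads and queues internally."""
    numbers = list(numbers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        # Executor.map returns results in the order of its inputs.
        return dict(zip(numbers, pool.map(_square, numbers)))
```

Neither version needs an explicit lock in user code, because `queue.Queue` and the executor take care of synchronization internally.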
The talk closes with some tips and best practices, mainly:
Don't use concurrency if you don't have to.
Keep it simple. "Simple" usually doesn't mean using primitives like locks, but rather using higher abstractions if you can.
Operations that look atomic may not be atomic. For example, if descriptors are used, an "attribute access" may do arbitrarily complex things. If there's any doubt, assume an operation is not atomic.
Try to hide concurrency behind an API. In particular, serialize accesses to a resource by using a single thread or process (if you use threads or processes) for this resource.
Defects in concurrent code are often difficult to expose. If your code seems to work, it doesn't mean it will work on a different computer, on a complex network, under high load etc. Think about what you're doing and what could go wrong. (Of course, this applies to coding in general, but even more so to concurrent code.)
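The tip about hiding concurrency behind an API and serializing access through a single thread can be sketched like this. It is one possible reading of the advice, not the speaker's implementation; the `Account` class and its methods are hypothetical. Only the private owner thread ever touches `_balance`, so callers need no locks. Error handling inside the owner thread is omitted for brevity.

```python
import queue
import threading


class Account:
    """Hypothetical active object: one private thread owns the balance."""

    def __init__(self, balance=0):
        self._balance = balance  # touched only by the owner thread
        self._inbox = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def _run(self):
        # Requests are handled strictly one after another, which
        # serializes every access to the balance.
        while True:
            func, reply = self._inbox.get()
            if func is None:  # shutdown request
                reply.put(None)
                break
            reply.put(func())

    def _submit(self, func):
        reply = queue.Queue(maxsize=1)
        self._inbox.put((func, reply))
        return reply.get()  # block until the owner thread has answered

    def deposit(self, amount):
        def apply():
            self._balance += amount
            return self._balance

        return self._submit(apply)

    def balance(self):
        return self._submit(lambda: self._balance)

    def close(self):
        self._submit(None)
        self._thread.join()
```

Callers in any thread simply use `account.deposit(10)` or `account.balance()`; the queue and the owner thread stay an implementation detail. The trade-off in this sketch is that every call waits for a round trip through the owner thread.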