Asynchronous Python: The Different Forms of Concurrency

Original author: Abu Ashraf Masnun
With all the noise around "asynchrony" and "concurrency" since the advent of Python 3, one might assume that Python only recently introduced these features and concepts. But that is not so; we have been using these kinds of operations for a long time. Also, beginners may think that asyncio is the only or the best way to write asynchronous / parallel code. In this article we will look at the different ways of achieving concurrency, along with their advantages and disadvantages.

Definition of terms:

Before we delve into the technical aspects, it is important to have some basic understanding of the terms often used in this context.

Synchronous and asynchronous:

In synchronous operation, tasks are performed one after another. In asynchronous operation, tasks can be started and completed independently of each other: one asynchronous task can be started, and execution can move on to a new task while the first one keeps running. Asynchronous tasks do not block (do not force us to wait for their completion) and usually run in the background.

For example, suppose you need to call a travel agency to plan your next vacation, and you also need to email your boss before flying away. In synchronous mode, you first call the travel agency, and if you are put on hold, you wait until someone answers; only then do you start writing the email. Thus you perform the tasks one after another [synchronous execution, translator's note]. But if you are smart, then while you are on hold you start writing the email, and when someone picks up again, you pause writing, talk, and then finish the letter. You could also ask a friend to call the agency while you write the letter yourself. That is asynchrony: tasks do not block each other.

Concurrency and parallelism:

Concurrency means that two tasks make progress together. In our previous asynchronous example, we alternated between writing the letter and talking to the travel agency. That is concurrency.

When we asked a friend to make the call while we wrote the letter ourselves, the tasks were carried out in parallel.

Parallelism is essentially a form of concurrency, but it depends on hardware. For example, if a CPU has only one core, two tasks cannot run in parallel; they just share CPU time between them. That is concurrency, but not parallelism. When we have several cores [like the friend in the previous example, who acts as a second core, translator's note], we can perform several operations at the same time (depending on the number of cores).

Summing up:

  • Synchronous: blocking operations
  • Asynchronous: non-blocking operations
  • Concurrency: joint progress
  • Parallelism: progress in parallel

Parallelism implies concurrency, but concurrency does not always imply parallelism.

Threads and Processes

Python has supported threads for a very long time. Threads let you perform operations concurrently. But there is a problem: because of the Global Interpreter Lock (GIL), threads cannot provide true parallelism. However, with the advent of the multiprocessing module, you can achieve real parallelism across multiple cores in Python.


Consider a small example. In the following code, the worker function runs asynchronously and concurrently in multiple threads.

import threading
import time
import random

def worker(number):
    sleep = random.randrange(1, 10)
    time.sleep(sleep)
    print("I am Worker {}, I slept for {} seconds".format(number, sleep))

for i in range(5):
    t = threading.Thread(target=worker, args=(i,))
    t.start()

print("All Threads are queued, let's see when they finish!")

Here is an example of the output:

$ python
All Threads are queued, let's see when they finish!
I am Worker 1, I slept for 1 seconds
I am Worker 3, I slept for 4 seconds
I am Worker 4, I slept for 5 seconds
I am Worker 2, I slept for 7 seconds
I am Worker 0, I slept for 9 seconds

Thus, we launched 5 threads working together, and after starting them (that is, after starting the worker function), execution does not wait for the threads to finish before moving on to the next print statement. This is asynchronous operation.

In our example, we passed a function to the Thread constructor. If we wanted, we could instead subclass Thread and override its run method (OOP style).
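Here is a minimal sketch of that OOP style; the class name WorkerThread and the shorter sleep range are my own choices, but the pattern (subclass Thread, override run) is the standard threading API:

```python
import threading
import time
import random

class WorkerThread(threading.Thread):
    """OOP-style worker: subclass Thread and override run()."""

    def __init__(self, number):
        super().__init__()
        self.number = number

    def run(self):
        # run() executes in the new thread once start() is called
        sleep = random.randrange(1, 5)
        time.sleep(sleep)
        print("I am Worker {}, I slept for {} seconds".format(self.number, sleep))

threads = [WorkerThread(i) for i in range(5)]
for t in threads:
    t.start()
for t in threads:
    t.join()  # wait for all workers before exiting
```

Note that start() must be called, not run() directly; calling run() would execute the body synchronously in the current thread.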

Further reading:

To learn more about threads, use the link below:

Global Interpreter Lock (GIL)

The GIL was introduced to make CPython's memory handling easier and to allow better integration with C (for example, in extensions). The GIL is a locking mechanism: the Python interpreter runs only one thread at a time, i.e. only one thread can execute Python bytecode at any given moment. The GIL thus prevents multiple threads from executing in parallel.

GIL in brief:

  • Only one thread can run at a time.
  • The Python interpreter switches between threads to achieve concurrency.
  • The GIL applies to CPython (the standard implementation). Implementations such as Jython and IronPython have no GIL.
  • The GIL makes single-threaded programs fast.
  • The GIL does not usually interfere with I/O-bound operations.
  • The GIL makes it easy to integrate non-thread-safe C libraries; thanks to the GIL we have many high-performance extensions/modules written in C.
  • For CPU-bound tasks, the interpreter checks every N ticks and switches between threads, so one thread does not block the others forever.
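The points above can be seen in a rough timing sketch (my own example; exact numbers will vary by machine): a pure-Python CPU-bound function run in two threads takes about as long as running it twice sequentially, because on CPython the GIL lets only one thread execute bytecode at a time.

```python
import threading
import time

def countdown(n):
    # pure-Python CPU-bound loop; holds the GIL while it runs
    while n > 0:
        n -= 1

N = 5_000_000

# sequential: two runs back to back
start = time.perf_counter()
countdown(N)
countdown(N)
sequential = time.perf_counter() - start

# "parallel": two threads, but the GIL serialises the bytecode
t1 = threading.Thread(target=countdown, args=(N,))
t2 = threading.Thread(target=countdown, args=(N,))
start = time.perf_counter()
t1.start(); t2.start()
t1.join(); t2.join()
threaded = time.perf_counter() - start

print("sequential: {:.2f}s, threaded: {:.2f}s".format(sequential, threaded))
```

On CPython the threaded version is typically no faster than the sequential one, and often slightly slower due to the switching overhead.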

Many see the GIL as a weakness. I consider it a blessing, because it made possible libraries such as NumPy and SciPy, which occupy a special, unique position in the scientific community.

Further reading:

These resources will help you dig deeper into the GIL:

Processes

To achieve parallelism, Python has the multiprocessing module, which provides an API that will look very familiar if you have used threading before.

Let's go ahead and change the previous example. The modified version uses Process instead of Thread.

import multiprocessing
import time
import random

def worker(number):
    sleep = random.randrange(1, 10)
    time.sleep(sleep)
    print("I am Worker {}, I slept for {} seconds".format(number, sleep))

for i in range(5):
    p = multiprocessing.Process(target=worker, args=(i,))
    p.start()

print("All Processes are queued, let's see when they finish!")

What has changed? I just imported multiprocessing instead of threading, and then used a Process instead of a Thread. That's all! Now, instead of a set of threads, we use processes that run on different CPU cores (provided, of course, that your processor has several cores).

Using the Pool class, we can also distribute the execution of a single function across several processes for different input values. An example from the official documentation:

from multiprocessing import Pool

def f(x):
    return x * x

if __name__ == '__main__':
    p = Pool(5)
    print(p.map(f, [1, 2, 3]))

Here, instead of iterating over the list of values and calling f on each one in turn, we actually run the function in different processes: one process computes f(1), another f(2), another f(3). Finally, the results are combined back into a list. This lets us break heavy computations into smaller parts and run them in parallel for a faster overall calculation.

Further reading:

The concurrent.futures module

The concurrent.futures module is large, and it makes writing asynchronous code very easy. My favorites are ThreadPoolExecutor and ProcessPoolExecutor. These executors maintain a pool of threads or processes. We submit our tasks to the pool, and it runs them in an available thread/process. A Future object is returned, which can be used to query the status and retrieve the result when the task completes.

Here is a ThreadPoolExecutor example:

from concurrent.futures import ThreadPoolExecutor
from time import sleep

def return_after_5_secs(message):
    sleep(5)
    return message

pool = ThreadPoolExecutor(3)

future = pool.submit(return_after_5_secs, ("hello"))
print(future.done())
sleep(5)
print(future.result())
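ProcessPoolExecutor exposes the same interface, so switching an I/O-bound pool to a CPU-friendly one is mostly a one-line change. A minimal sketch (the function name cube is my own, illustrative choice):

```python
from concurrent.futures import ProcessPoolExecutor

def cube(x):
    return x ** 3

if __name__ == '__main__':
    # same API as ThreadPoolExecutor, but tasks run in worker processes,
    # so CPU-bound work can use multiple cores
    with ProcessPoolExecutor(max_workers=3) as pool:
        future = pool.submit(cube, 4)
        print(future.result())            # blocks until the worker returns: 64
        print(list(pool.map(cube, [1, 2, 3])))  # [1, 8, 27]
```

The `if __name__ == '__main__'` guard matters here: on platforms that spawn worker processes by re-importing the main module, omitting it can cause an infinite loop of process creation.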

I have an article about concurrent.futures . It may be useful for a deeper study of this module.

Further reading:

Asyncio - what, how and why?

You probably have the question that many people in the Python community have: what new does asyncio bring? Why do we need yet another asynchronous I/O mechanism? Don't we already have threads and processes? Let's see!

Why do we need asyncio?

Processes are very expensive to create [in terms of resource consumption, translator's note]. So for I/O operations, threads are mostly chosen. We know that I/O depends on external factors: slow disks or nasty network lags make I/O often unpredictable. Now, suppose we use threads for I/O operations, and 3 threads are performing various I/O tasks. The interpreter has to switch between the concurrent threads and give each one some time in turn. Let's call the threads T1, T2, and T3. The three threads start their I/O operations. T3 completes first; T2 and T1 are still waiting for I/O. The Python interpreter switches to T1, but it is still waiting; the interpreter then moves to T2, which is also still waiting, and then to T3, which is ready and executes its code. Do you see the problem?

T3 was ready, but the interpreter switched through T1 and T2 first, incurring switching costs we could have avoided if the interpreter had switched to T3 straight away, right?

What is asyncio?

Asyncio gives us an event loop, along with other nice things. The event loop watches for I/O events and switches between tasks that are ready and tasks that are waiting on an I/O operation [an event loop is a software construct that waits for events or messages and dispatches them within the program, translator's note].

The idea is very simple: there is an event loop, and we have functions that perform asynchronous I/O operations. We hand our functions to the event loop and ask it to run them for us. The event loop gives us back a Future object, like a promise that we will get something in the future. We hold on to that promise, check from time to time whether it has a value (we are very impatient), and finally, when the value arrives, we use it in further operations [i.e. we send a request, are immediately handed a ticket, and are told to wait for the result; we periodically check, and as soon as the result is ready we redeem the ticket to get the value, translator's note].
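The promise-and-ticket idea can be sketched directly with asyncio (the coroutine name fetch_data and its fake payload are my own, illustrative choices):

```python
import asyncio

async def fetch_data():
    # pretend this is a slow I/O operation
    await asyncio.sleep(1)
    return {"status": "ok"}

loop = asyncio.new_event_loop()
# a Task is a Future subclass -- this is our "ticket"
task = loop.create_task(fetch_data())
print(task.done())           # False: the result is not ready yet
loop.run_until_complete(task)  # let the event loop run it to completion
print(task.done())           # True
print(task.result())         # {'status': 'ok'}
loop.close()
```

In real code you would rarely poll `done()` in a loop; you `await` the future (or use `run_until_complete`) and let the event loop wake you when the value is ready.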

Asyncio uses generators and coroutines to pause and resume tasks. You can read the details here:

How to use asyncio?

Before we begin, let's take a look at an example:

import asyncio
import datetime
import random

async def my_sleep_func():
    await asyncio.sleep(random.randint(0, 5))

async def display_date(num, loop):
    end_time = loop.time() + 50.0
    while True:
        print("Loop: {} Time: {}".format(num, datetime.datetime.now()))
        if (loop.time() + 1.0) >= end_time:
            break
        await my_sleep_func()

loop = asyncio.get_event_loop()

asyncio.ensure_future(display_date(1, loop))
asyncio.ensure_future(display_date(2, loop))

loop.run_forever()

Please note that the async / await syntax is intended only for Python 3.5 and above. Let's go through the code:

  • We have an asynchronous function display_date, which takes a number (as an identifier) and the event loop as parameters.
  • The function contains an infinite loop that stops after 50 seconds, but during that period it repeatedly prints the current time and pauses. await can wait for other asynchronous functions (coroutines) to complete.
  • We pass the function to the event loop (using ensure_future).
  • We start the event loop.

Whenever an await call occurs, asyncio understands that the function will probably need some time, so it pauses its execution, starts monitoring any I/O event associated with it, and lets other tasks run. When asyncio notices that the suspended function's I/O is ready, it resumes that function.
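This suspend-and-resume behaviour can be observed with two coroutines that yield to the event loop at every await. The sketch below is my own and uses the modern asyncio.run helper (Python 3.7+); the names and delays are illustrative:

```python
import asyncio

order = []

async def task(name, delay):
    order.append(name + ":start")
    # await suspends this coroutine; the event loop runs the other one meanwhile
    await asyncio.sleep(delay)
    order.append(name + ":end")

async def main():
    # both coroutines make progress concurrently in a single thread
    await asyncio.gather(task("slow", 0.2), task("fast", 0.1))

asyncio.run(main())
print(order)
# the fast task finishes first even though it was scheduled second:
# ['slow:start', 'fast:start', 'fast:end', 'slow:end']
```

Note that no threads are involved: the interleaving comes purely from the coroutines suspending at await and the event loop resuming whichever one is ready first.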

Making the right choice

We have just walked through the most popular forms of concurrency. But the question remains: which one should you choose? It depends on the use case. From my experience, I tend to follow this pseudocode:

if io_bound:
    if io_very_slow:
        print("Use Asyncio")
    else:
        print("Use Threads")
else:
    print("Multi Processing")

  • CPU-bound => multiprocessing
  • I/O-bound, fast I/O, limited number of connections => multithreading
  • I/O-bound, slow I/O, many connections => asyncio

