An introduction to asynchronous programming in Python with Async IO
In this week’s Writer’s Room – a regular blog series of articles and tutorials written by technologists from within our Andela Community – Ezzeddin Abdullah offers an introduction to asynchronous programming in Python using Async IO.
Writing sequential (or synchronous) code is familiar to many programmers, even when they’re just getting started. It’s the kind of code that is executed one line at a time, one instruction at a time.
In the asynchronous world, events occur independently of the main program flow. This means that actions can run in the background without waiting for the previous action to complete.
In other words, the lines of code are executed concurrently.
Imagine you have several independent tasks, each of which takes a long time to finish. Their outputs don’t depend on each other, so you want to start them all at once. If these tasks are executed one after another, the program has to wait for each task to finish before starting the next one. That waiting time blocks the program.
The asynchronous programming paradigm lets you execute these tasks concurrently, so you can cut that waiting time and use resources more efficiently.
Python 3 has native support for async programming through Async IO (the asyncio module), which provides a simple way to execute concurrent tasks.
First, let’s set up our environment and get started.
Setting up the environment
In this tutorial, we will use the asyncio module in Python 3.7 and above, so we need a Python 3.7 (or newer) environment. A clean way to get one is to create a virtual environment with conda (conda create) and then activate it (conda activate).
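Once the environment is active, a quick check from Python can confirm that the interpreter is recent enough. This is only a sketch: asyncio_ready() is a hypothetical helper name, and the 3.7 threshold simply mirrors the version this tutorial assumes.

```python
import asyncio
import sys


def asyncio_ready(min_version=(3, 7)):
    """Return True if the interpreter meets the version this tutorial assumes."""
    # asyncio.run() was added in Python 3.7, so check for it as well.
    return sys.version_info >= min_version and hasattr(asyncio, "run")


if __name__ == "__main__":
    print("Python", sys.version.split()[0])
    print("Ready for this tutorial:", asyncio_ready())
```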
Asynchronous programming building blocks
There are 3 main building blocks of Python async programming:
The event loop is the central piece: it manages the asynchronous tasks and decides which one runs next.
Coroutines are functions that can suspend their execution at an await and be resumed later by the event loop.
Futures represent the eventual result of an asynchronous operation, such as a running coroutine. This result may be an exception.
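To make these three pieces concrete, here is a minimal sketch. The divide() coroutine and its arguments are made up for illustration. It relies on the fact that an asyncio Task is a subclass of Future, so each task ends up holding either a value or an exception.

```python
import asyncio


async def divide(a, b):
    # A coroutine: it pauses here and lets the event loop run something else.
    await asyncio.sleep(0)
    return a / b


async def main():
    # Wrapping coroutines in tasks hands them to the event loop.
    ok = asyncio.create_task(divide(6, 3))
    bad = asyncio.create_task(divide(1, 0))
    # Wait until both tasks have finished, successfully or not.
    await asyncio.wait([ok, bad])
    # One task holds a value, the other holds the exception it raised.
    return ok.result(), bad.exception()


if __name__ == "__main__":
    # asyncio.run() starts the event loop and runs main() on it.
    value, error = asyncio.run(main())
    print("value:", value)
    print("error:", repr(error))
```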
Introducing async in Python
Two main components are introduced in Python:
asyncio, which is a Python package that provides an API to run and manage coroutines.
async/await to help you define coroutines.
The functionality and behavior of your code differ depending on whether you design it as async or sync.
For example, to make HTTP calls, consider using aiohttp, a Python package that lets you make HTTP calls asynchronously. It can be an essential tool if your code is stuck waiting on blocking calls made with the requests library.
Similarly, if you’re working with MongoDB, instead of relying on a synchronous driver like PyMongo, you can use an async driver like Motor to access MongoDB asynchronously.
In the asynchronous world, everything runs in an event loop. This allows you to run several coroutines at once. We’ll see what a coroutine is later in this tutorial.
Everything inside async def is asynchronous code, and everything else is synchronous.
Writing async code is not as easy as writing sync code. The Python async model is based on concepts such as events, callbacks, transports, protocols, and futures.
The asyncio API evolves quickly between Python releases, so keep an eye on the latest updates.
How asyncio works
asyncio is built around two keywords, async and await.
Let’s look at this async hello-world example:
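A minimal version of such a hello-world could look like the sketch below. It assumes a coroutine named hello() that prints two messages one second apart, which matches the description in this section.

```python
import asyncio


async def hello():
    print("Hello world!")
    # Pause this coroutine for 1 second without blocking the event loop.
    await asyncio.sleep(1)
    print("Hello again!")


if __name__ == "__main__":
    # Start the event loop and run hello() until it finishes.
    asyncio.run(hello())
```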
At first glance you might think that this is synchronous code, because the second print waits 1 second before printing “Hello again!” after “Hello world!”. But this code is actually asynchronous.
Any function defined with async def is a coroutine function, like hello() above. Note that calling hello() on its own does not run it: the call only returns a coroutine object, which runs once it is passed to asyncio.run() or awaited.
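The difference between calling a coroutine function and running it can be seen in a short sketch. Here hello() is simplified to return a string, which is an assumption made only to keep the example small.

```python
import asyncio


async def hello():
    return "Hello world!"


# Calling the coroutine function does not execute its body;
# it only creates a coroutine object.
coro = hello()
print(asyncio.iscoroutine(coro))

# The body runs only when the coroutine is handed to the event loop.
result = asyncio.run(coro)
print(result)
```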
To run the coroutine, asyncio provides three main mechanisms:
asyncio.run() function which is the main entry point to the async world that starts the event loop and runs the coroutine.
await, which waits for the result of a coroutine and passes control back to the event loop in the meantime.
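As a sketch of awaiting coroutines one after the other, assume a say_something() coroutine that takes a delay in seconds and a label. The parameter names and the timing code are illustrative.

```python
import asyncio
import time


async def say_something(delay, words):
    print(f"Before {words}")
    await asyncio.sleep(delay)
    print(f"After {words}")


async def main():
    start = time.perf_counter()
    await say_something(1, "Task 1")  # runs to completion first
    await say_something(2, "Task 2")  # starts only after Task 1 is done
    elapsed = time.perf_counter() - start
    print(f"Finished in about {elapsed:.0f} seconds")
    return elapsed


if __name__ == "__main__":
    asyncio.run(main())
```

Because each await completes before the next line runs, the total time is roughly the sum of the two delays.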
The previous snippet still waits for the first say_something() coroutine to finish before starting the second: it executes task 1 in 1 second, and only then executes the second task, which takes another 2 seconds, for about 3 seconds in total.
To make the coroutine run concurrently, we should create tasks, which is the third mechanism.
asyncio.create_task() function which is used to schedule the coroutine for execution.
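A sketch of the task-based version follows, again assuming a say_something(delay, words) coroutine. The names task1 and task2 follow the walkthrough in this section, and the timing code is illustrative.

```python
import asyncio
import time


async def say_something(delay, words):
    print(f"Before {words}")
    await asyncio.sleep(delay)
    print(f"After {words}")


async def main():
    start = time.perf_counter()
    # Both coroutines are scheduled on the event loop right away.
    task1 = asyncio.create_task(say_something(1, "Task 1"))
    task2 = asyncio.create_task(say_something(2, "Task 2"))
    # Awaiting the tasks waits for them to finish; they overlap in time.
    await task1
    await task2
    elapsed = time.perf_counter() - start
    print(f"Finished in about {elapsed:.0f} seconds")
    return elapsed


if __name__ == "__main__":
    asyncio.run(main())
```

Here the total time is close to the longest delay rather than the sum of both.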
The above code now runs concurrently: the second say_something() call no longer waits for the first one to finish. Instead, the same coroutine function runs concurrently with different parameters.
What happens is the following:
The say_something() coroutine starts with the first task’s parameters (1 second and the string “Task 1”). This task is called task1.
When it encounters the await keyword, it suspends its execution for the 1-second asyncio.sleep() and returns control to the event loop.
Similarly, the second task (task2) suspends its execution for the 2-second asyncio.sleep() when it encounters the await keyword.
When task1 returns control to the event loop, the event loop starts the second task (task2), because task1’s asyncio.sleep() has not finished yet.
asyncio.create_task() wraps the say_something() coroutine and runs it concurrently as an asynchronous task. As a result, the snippet finishes in about 2 seconds, 1 second faster than before.
The coroutine is automatically scheduled to run in the event loop when asyncio.create_task() is called.
Tasks help you to run multiple coroutines concurrently, but this is not the only way to achieve concurrency.
Running concurrent tasks with asyncio.gather()
Another way to run multiple coroutines concurrently is to use the asyncio.gather() function. This function takes awaitables (such as coroutines) as arguments, runs them concurrently, and returns their results in the order in which they were passed.
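A minimal sketch with asyncio.gather() might look like this. The greetings() coroutine, its name and delay parameters, and its return value are assumptions made for illustration.

```python
import asyncio


async def greetings(name, delay):
    print(f"Hello, {name}!")
    await asyncio.sleep(delay)
    print(f"Goodbye, {name}!")
    return name


async def main():
    # Both greetings() coroutines run concurrently; gather() returns
    # a list of their results in the order the awaitables were passed.
    results = await asyncio.gather(
        greetings("first", 0.5),
        greetings("second", 0.5),
    )
    return results


if __name__ == "__main__":
    print(asyncio.run(main()))
```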
In the previous code, the greetings() coroutine is executed twice concurrently.
An object is called awaitable if it can be used with the await keyword. There are 3 main types of awaitable objects: coroutines, tasks, and futures.
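Coroutines are the simplest awaitables to show. The sketch below assumes two coroutine functions named mult() and add(); their parameters and the short sleeps are illustrative stand-ins for real asynchronous work.

```python
import asyncio


async def mult(first, second):
    print("Calculating multiplication...")
    await asyncio.sleep(0.1)  # stand-in for real asynchronous work
    return first * second


async def add(first, second):
    print("Calculating addition...")
    await asyncio.sleep(0.1)
    return first + second


async def main():
    # Each await suspends main() until the awaited coroutine returns.
    product = await mult(2, 3)
    total = await add(product, 4)
    return product, total


if __name__ == "__main__":
    print(asyncio.run(main()))
```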
In the previous example, the mult() and add() coroutine functions are awaited by the main() coroutine.
Let’s say you omit the await keyword before the mult coroutine. The coroutine then never runs, and Python emits the following warning: RuntimeWarning: coroutine 'mult' was never awaited.
To schedule a coroutine to run in the event loop, we use the asyncio.create_task() function.
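A task is itself awaitable, and awaiting it yields the value returned by the wrapped coroutine. A short sketch, assuming a mult() coroutine like the one discussed above:

```python
import asyncio


async def mult(first, second):
    await asyncio.sleep(0.1)
    return first * second


async def main():
    # The coroutine is scheduled as soon as the task is created.
    task = asyncio.create_task(mult(2, 3))
    print("Done before awaiting?", task.done())
    # Awaiting the task returns the coroutine's result.
    result = await task
    print("Done after awaiting?", task.done())
    return result


if __name__ == "__main__":
    print(asyncio.run(main()))
```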
A Future is a low-level awaitable object that represents the eventual result of an asynchronous computation. asyncio.Future is a class; in application code, a future is normally created with loop.create_future() rather than instantiated directly.
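Application code rarely needs futures directly, but a small sketch shows how one behaves. The set_after() helper and its arguments are hypothetical. It fills in the future after a short delay while main() waits on it.

```python
import asyncio


async def set_after(future, delay, value):
    # Simulate work, then publish the result through the future.
    await asyncio.sleep(delay)
    future.set_result(value)


async def main():
    loop = asyncio.get_running_loop()
    # An empty placeholder for a result that does not exist yet.
    future = loop.create_future()
    setter = asyncio.create_task(set_after(future, 0.1, "ready"))
    # main() is suspended here until set_result() is called.
    result = await future
    await setter
    return result


if __name__ == "__main__":
    print(asyncio.run(main()))
```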
Use asyncio.wait_for(aw, timeout) to set a timeout for an awaitable object to complete. Note that aw here is the awaitable object. This is useful if you want to raise an exception when the awaitable object takes too long to complete. The exception raised is asyncio.TimeoutError.
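A sketch of a timeout in practice follows. It assumes a slow_operation() coroutine that sleeps for 400 seconds; the return values are illustrative.

```python
import asyncio


async def slow_operation():
    await asyncio.sleep(400)
    return "finished"


async def main():
    try:
        # Give up if slow_operation() takes longer than 1 second.
        return await asyncio.wait_for(slow_operation(), timeout=1)
    except asyncio.TimeoutError:
        print("Timed out after 1 second")
        return "timeout"


if __name__ == "__main__":
    print(asyncio.run(main()))
```

When the timeout expires, wait_for() also cancels the awaited operation, so nothing keeps running in the background.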
The timeout passed to asyncio.wait_for() here is 1 second, although the slow_operation() coroutine would take 400 seconds to complete, so asyncio.TimeoutError is raised after 1 second.
In this tutorial, we introduced asynchronous programming in Python with the built-in Async IO module (asyncio). We defined what coroutines, tasks, and futures are.
We also covered different ways to run multiple coroutines concurrently and saw how concurrent code might be your best option when you need to optimize performance for certain tasks.