Examples of using asyncio: HTTPServer?!
Not so long ago, Python 3.4 was released, and its changelog includes many "goodies". One of them is the asyncio module, which provides infrastructure for writing asynchronous network applications. Thanks to the coroutine concept, asynchronous application code is easy to understand and maintain.
In this article, using a simple TCP echo server as an example, I will try to show what asyncio is all about, and I will venture to fix the "fatal flaw" of this module, namely the lack of an asynchronous HTTP server implementation.

Intro
A direct competitor and "brother" of asyncio is the tornado framework, which has proven itself and enjoys well-deserved popularity. However, in my opinion, asyncio looks simpler, more logical, and better thought out. That is not surprising, because we are dealing with a standard library module.
You may say that it was possible to write asynchronous services in Python before, and you would be right. But that required third-party libraries and/or a callback style of programming. The coroutine concept, polished to this degree in this version of Python, lets you write linear asynchronous code using only the standard library.
I should mention that I wrote and ran all of this under Linux, but all the components used are cross-platform and should also work under Windows. Python 3.4 is required, though.
Echo server
An example of an echo server can be found in the standard documentation, but it uses the low-level "Transports and protocols" API. For everyday use, the high-level Streams API is recommended. It contains no TCP server example, but having studied the example from the low-level API and looked at the sources of both modules, it is easy to write a simple TCP server.
```python
import asyncio
import logging
import concurrent.futures


@asyncio.coroutine
def handle_connection(reader, writer):
    peername = writer.get_extra_info('peername')
    logging.info('Accepted connection from {}'.format(peername))
    while True:
        try:
            data = yield from asyncio.wait_for(reader.readline(), timeout=10.0)
            if data:
                writer.write(data)
            else:
                logging.info('Connection from {} closed by peer'.format(peername))
                break
        except concurrent.futures.TimeoutError:
            logging.info('Connection from {} closed by timeout'.format(peername))
            break
    writer.close()


if __name__ == '__main__':
    loop = asyncio.get_event_loop()
    logging.basicConfig(level=logging.INFO)
    server_gen = asyncio.start_server(handle_connection, port=2007)
    server = loop.run_until_complete(server_gen)
    logging.info('Listening established on {0}'.format(server.sockets[0].getsockname()))
    try:
        loop.run_forever()
    except KeyboardInterrupt:
        pass  # Press Ctrl+C to stop
    finally:
        server.close()
        loop.close()
```
Everything is quite obvious, but there are a couple of nuances that are worth paying attention to.
```python
server_gen = asyncio.start_server(handle_connection, port=2007)
server = loop.run_until_complete(server_gen)
```
The first line does not create the server itself but a generator which, when first driven, creates and initializes the TCP server in the depths of asyncio according to the specified parameters. The second line is an example of such an access.

```python
try:
    data = yield from asyncio.wait_for(reader.readline(), timeout=10.0)
    if data:
        writer.write(data)
    else:
        logging.info('Connection from {} closed by peer'.format(peername))
        break
except concurrent.futures.TimeoutError:
    logging.info('Connection from {} closed by timeout'.format(peername))
    break
```
The coroutine function reader.readline() asynchronously reads data from the input stream, but the wait for data is not time-limited. If you need to abort it on a timeout, wrap the call in the coroutine function asyncio.wait_for(). Then, once the specified number of seconds has elapsed, a concurrent.futures.TimeoutError exception is raised, which can be handled as needed.

Checking that reader.readline() returned a non-empty value is mandatory in this example. Otherwise, after the client drops the connection (connection reset by peer), the loop would keep reading and getting an empty value indefinitely.

What about OOP?
With OOP, everything is fine too. It is enough to wrap methods that call coroutine functions in the @asyncio.coroutine decorator; the API documentation clearly indicates which functions run as coroutines. Below is an example implementing an EchoServer class.
```python
import asyncio
import logging
import concurrent.futures


class EchoServer(object):
    """Echo server class"""

    def __init__(self, host, port, loop=None):
        self._loop = loop or asyncio.get_event_loop()
        self._server = asyncio.start_server(self.handle_connection, host=host, port=port)

    def start(self, and_loop=True):
        self._server = self._loop.run_until_complete(self._server)
        logging.info('Listening established on {0}'.format(self._server.sockets[0].getsockname()))
        if and_loop:
            self._loop.run_forever()

    def stop(self, and_loop=True):
        self._server.close()
        if and_loop:
            self._loop.close()

    @asyncio.coroutine
    def handle_connection(self, reader, writer):
        peername = writer.get_extra_info('peername')
        logging.info('Accepted connection from {}'.format(peername))
        while not reader.at_eof():
            try:
                data = yield from asyncio.wait_for(reader.readline(), timeout=10.0)
                writer.write(data)
            except concurrent.futures.TimeoutError:
                break
        writer.close()


if __name__ == '__main__':
    logging.basicConfig(level=logging.DEBUG)
    server = EchoServer('127.0.0.1', 2007)
    try:
        server.start()
    except KeyboardInterrupt:
        pass  # Press Ctrl+C to stop
    finally:
        server.stop()
```
As you can see, in both the first and the second case the code is linear and readable, and in the second case it is, moreover, packaged into a self-contained class.
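To try such a server without leaving Python, you can exercise it in-process. The sketch below is mine, not part of the original code: it uses the async/await syntax that later Python versions introduced in place of @asyncio.coroutine and yield from, and invented names (handle_connection, main). It starts the server on an OS-assigned port, connects to it, and checks the echo:

```python
import asyncio

async def handle_connection(reader, writer):
    # Same logic as the servers above: echo lines until EOF or 10 s of silence.
    while not reader.at_eof():
        try:
            data = await asyncio.wait_for(reader.readline(), timeout=10.0)
            writer.write(data)
            await writer.drain()
        except asyncio.TimeoutError:
            break
    writer.close()

async def main():
    # Port 0 lets the OS pick a free port, so the example is self-contained.
    server = await asyncio.start_server(handle_connection, '127.0.0.1', 0)
    port = server.sockets[0].getsockname()[1]

    reader, writer = await asyncio.open_connection('127.0.0.1', port)
    writer.write(b'hello\n')
    await writer.drain()
    echoed = await reader.readline()

    writer.close()
    server.close()
    await server.wait_closed()
    return echoed

if __name__ == '__main__':
    print(asyncio.run(main()))  # b'hello\n'
```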
HTTP Server
Having dealt with all this, one involuntarily wants to do something more substantial, and asyncio gives us the opportunity. Unlike tornado, for example, it does not include an HTTP server. As they say, it would be a sin not to try to correct this omission :)

Writing an HTTP server entirely from scratch, with all its classes like HTTPRequest and so on, is hardly sporting, considering how many ready-made frameworks work on top of the WSGI protocol. Those in the know will rightly note that WSGI is a synchronous protocol. That is true, but reading the data for environ and the request body can be done asynchronously. WSGI also recommends returning the result as a generator, which fits well with the coroutine concept used in asyncio.

One of the frameworks that does content delivery right is bottle. It, for example, serves the contents of a file not all at once but in chunks through a generator. That is why I chose it for testing the WSGI server I developed, and I was pleased with the result: the demo application was quite capable of sending a large file to several client connections at the same time.
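The generator point deserves a small illustration. Below is a minimal sketch of my own (the names and sizes are invented, this is not the article's actual code): a WSGI application that returns its body as a generator of chunks, which is exactly the shape a coroutine-based server can interleave with other connections. The driver function stands in for the server.

```python
def chunked_app(environ, start_response):
    """A WSGI app that streams its response body in fixed-size chunks."""
    body = b'x' * 100            # stands in for a large file
    chunk_size = 32

    start_response('200 OK', [('Content-Type', 'application/octet-stream'),
                              ('Content-Length', str(len(body)))])

    def chunks():
        # The server pulls one chunk at a time and may serve other
        # connections between chunks.
        for i in range(0, len(body), chunk_size):
            yield body[i:i + chunk_size]
    return chunks()


def run_app(app):
    """Minimal driver standing in for a WSGI server."""
    captured = {}

    def start_response(status, headers):
        captured['status'] = status

    body_chunks = list(app({}, start_response))
    return captured['status'], body_chunks


if __name__ == '__main__':
    status, body_chunks = run_app(chunked_app)
    print(status, [len(c) for c in body_chunks])  # 200 OK [32, 32, 32, 4]
```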
You can see the full result on my github. There are no tests or documentation there yet, but there is a demo application using the bottle framework. It lists the files in a certain directory and serves the selected one asynchronously, regardless of its size. So if you put some films into that directory, you can set up a small video hosting :)
I would like to say a special thank-you to the CherryPy development team: I often glanced at their code, and took a few pieces verbatim, so as not to reinvent the wheel.
View sample application
```python
import bottle
import os.path

from os import listdir
from bottle import route, template, static_file

root = os.path.abspath(os.path.dirname(__file__))


@route('/')
def index():
    tmpl = """<!DOCTYPE html>
<html>
<head><title>Bottle of Aqua</title></head>
<body>
<h3>List of files:</h3>
<ul>
% for item in files:
    <li><a href="/files/{{item}}">{{item}}</a></li>
% end
</ul>
</body>
</html>
"""
    files = [file_name for file_name in listdir(os.path.join(root, 'files'))
             if os.path.isfile(os.path.join(root, 'files', file_name))]
    return template(tmpl, files=files)


@route('/files/<filename>')
def server_static(filename):
    return static_file(filename, root=os.path.join(root, 'files'))


class AquaServer(bottle.ServerAdapter):
    """Bottle server adapter"""

    def run(self, handler):
        import asyncio
        import logging
        from aqua.wsgiserver import WSGIServer

        logging.basicConfig(level=logging.ERROR)
        loop = asyncio.get_event_loop()
        server = WSGIServer(handler, loop=loop)
        server.bind(self.host, self.port)
        try:
            loop.run_forever()
        except KeyboardInterrupt:
            pass  # Press Ctrl+C to stop
        finally:
            server.unbindAll()
            loop.close()


if __name__ == '__main__':
    bottle.run(server=AquaServer, port=5000)
```
While writing the WSGI server code, I did not notice any nuances that could be attributed to the asyncio module. The only thing is a peculiarity of browsers (Chrome, for example): they reset a request when they see that a large file is starting to arrive. Obviously this is done in order to switch to a more optimized way of downloading large files, because the request is then repeated and the file is received normally. But the aborted first request raises a ConnectionResetError exception if data has already been sent into it with a StreamWriter.write() call. This case must be handled, and the stream closed with StreamWriter.close().
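The handling pattern can be sketched as follows. This is my own illustration with a stand-in writer object in place of asyncio's StreamWriter (FlakyWriter and send_body are invented names):

```python
class FlakyWriter:
    """Stand-in for StreamWriter: the peer resets after two chunks."""

    def __init__(self):
        self.sent = 0
        self.closed = False

    def write(self, chunk):
        if self.sent >= 2:
            raise ConnectionResetError('connection reset by peer')
        self.sent += 1

    def close(self):
        self.closed = True


def send_body(writer, chunks):
    """Send chunks, tolerating a client that aborts mid-transfer."""
    try:
        for chunk in chunks:
            writer.write(chunk)
    except ConnectionResetError:
        pass  # the browser dropped the request; just stop sending
    finally:
        writer.close()  # always release the connection


if __name__ == '__main__':
    w = FlakyWriter()
    send_body(w, [b'a'] * 5)
    print(w.sent, w.closed)  # 2 True
```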
Performance
For the comparative test I chose the siege utility. The test subjects were "our patient" (named aqua, by the way :) in conjunction with bottle, the quite popular Waitress WSGI server, also in conjunction with bottle, and of course Tornado. The application was the simplest possible hello world. The tests were run with the following parameters: 100 and 1000 simultaneous connections; three response body sizes of 13 bytes, 13 kilobytes and 13 megabytes; a test duration of 10 seconds for the byte and kilobyte cases and 60 seconds for the megabyte case. The results are below:

| 100 concurrent users | 13 B (10 sec) | | 13 KB (10 sec) | | 13 MB (60 sec) | |
|---|---|---|---|---|---|---|
| | Avail. | Trans/sec | Avail. | Trans/sec | Avail. | Trans/sec |
| aqua + bottle | 100.0% | 835.24 | 100.0% | 804.49 | 99.9% | 26.28 |
| waitress + bottle | 100.0% | 707.24 | 100.0% | 642.03 | 100.0% | 8.67 |
| tornado | 100.0% | 2282.45 | 100.0% | 2071.27 | 100.0% | 15.78 |

| 1000 concurrent users | 13 B (10 sec) | | 13 KB (10 sec) | | 13 MB (60 sec) | |
|---|---|---|---|---|---|---|
| | Avail. | Trans/sec | Avail. | Trans/sec | Avail. | Trans/sec |
| aqua + bottle | 99.9% | 800.41 | 99.9% | 777.15 | 60.2% | 26.24 |
| waitress + bottle | 94.9% | 689.23 | 99.9% | 621.03 | 37.5% | 8.89 |
| tornado | 100.0% | 1239.88 | 100.0% | 978.73 | 55.7% | 14.51 |
What can I say? Tornado certainly leads, but "our patient" seems to pull ahead on large files and improves its relative performance as the number of connections grows. It also confidently outpaced Waitress (with its four child processes, one per core), which is well regarded among developers. I won't claim my testing is 100% rigorous, but as a rough estimate it should do.
Updated: attention was drawn to the strange numbers for the 13-megabyte response body. Indeed, in 10 seconds that test probably barely had time to start :) I have replaced them with the numbers obtained with a test duration of 60 seconds.
Example of running the siege utility and full results for the last column of the second table
```
$ siege -c1000 -b -t 60S http://127.0.0.1:5000/
** SIEGE 2.70
** Preparing 1000 concurrent users for battle.
Transactions:                1570 hits
Availability:               60.18 %
Elapsed time:               59.84 secs
Data transferred:        20410.00 MB
Response time:               5.56 secs
Transaction rate:           26.24 trans/sec
Throughput:                341.08 MB/sec
Concurrency:               145.80
Successful transactions:     1570
Failed transactions:         1039
Longest transaction:        20.44
Shortest transaction:        0.00

$ siege -c1000 -b -t 60S http://127.0.0.1:5001/
** SIEGE 2.70
** Preparing 1000 concurrent users for battle.
The server is now under siege...
Lifting the server siege...      done.
Transactions:                 526 hits
Availability:               37.49 %
Elapsed time:               59.20 secs
Data transferred:         6838.00 MB
Response time:              16.05 secs
Transaction rate:            8.89 trans/sec
Throughput:                115.51 MB/sec
Concurrency:               142.58
Successful transactions:      526
Failed transactions:          877
Longest transaction:        42.43
Shortest transaction:        0.00

$ siege -c1000 -b -t 60S http://127.0.0.1:5002/
** SIEGE 2.70
** Preparing 1000 concurrent users for battle.
The server is now under siege...
Lifting the server siege...      done.
Transactions:                 857 hits
Availability:               55.65 %
Elapsed time:               59.07 secs
Data transferred:        11141.00 MB
Response time:              20.14 secs
Transaction rate:           14.51 trans/sec
Throughput:                188.61 MB/sec
Concurrency:               292.16
Successful transactions:      857
Failed transactions:          683
Longest transaction:        51.19
Shortest transaction:         3.26
```
Outro
An asynchronous web server based on asyncio has a right to exist. It is still too early to talk about using such servers in serious projects, but after some testing and breaking-in, and with the arrival of asynchronous asyncio drivers for databases and key-value stores, it may well become possible.