Django Channels - the answer to the modern web

    In the Django world, the Django Channels add-on is gaining popularity. This library is meant to bring to Django the asynchronous network programming we have all been waiting for. At Moscow Python Conf 2017, Artyom Malyshev explained how the first version of the library does this (its author has since written Channels 2), why it does it, and whether it manages to do it at all.

    First of all, the Zen of Python says that there should be one obvious way to do anything. Therefore, in Python there are at least three of everything. Asynchronous network frameworks already exist in large numbers:

    • Twisted;
    • Eventlet;
    • Gevent;
    • Tornado;
    • Asyncio.

    So why write yet another library, and is it needed at all?


    About the speaker: Artyom Malyshev is an independent Python developer. He builds distributed systems and speaks at Python conferences. Artyom can be found under the nickname @PROOFIT404 on GitHub and on social networks.

    Django is synchronous by definition. Take the ORM: it thinks nothing of hitting the database synchronously during attribute access, for example when we write post.author.username.

    In addition, Django is a WSGI framework.

    WSGI


    WSGI is a synchronous interface for working with web servers.

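    def app(environ, callback):
        # `callback` is the server-provided function usually named start_response.
        status, headers = '200 OK', []
        callback(status, headers)
        # WSGI response bodies must be bytes on Python 3.
        return [b'Hello world!\n']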
    

    Its key property is that we have a function that takes its arguments and immediately returns a value. That is all a web server can expect from us. There is not even a whiff of asynchrony here.
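
    To make the synchronous contract concrete, here is a minimal sketch that calls a WSGI callable by hand, roughly the way a server would. It uses only the standard library; the start_response stub and the captured dictionary are hypothetical stand-ins for what a real server does internally.

```python
from wsgiref.util import setup_testing_defaults

def app(environ, start_response):
    # The whole exchange happens inside one synchronous call.
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [b'Hello world!\n']

captured = {}

def start_response(status, headers):
    # Hypothetical stub: a real server would write the status line and headers.
    captured['status'] = status
    captured['headers'] = headers

environ = {}
setup_testing_defaults(environ)  # fill in a minimal, valid WSGI environ
body = b''.join(app(environ, start_response))
```

    Once app returns, the server has the complete response; there is no hook through which the application could push more data later.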

    This interface was designed a long time ago, back in 2003, when the web was simple: users read news on the Internet and posted in guest books. It was enough to accept a request, process it, return a response, and forget that the user ever existed.


    But it is no longer 2003, and users want much more from us.

    They want rich web applications and live content; they want the application to work great on the desktop, on the laptop, on every other kind of top, and on a watch. Most importantly, users do not want to press F5, because on tablets, for example, there is no such button.



    Web browsers, of course, meet us halfway: they add new protocols and new features. If we were developing only the frontend, we would simply take the browser as a platform and use its core features, since it is ready to provide them.

    But for backend programmers, a lot has changed. WebSockets, HTTP/2, and the like are a huge architectural pain, because they are long-lived connections with their own state that has to be handled.


    This is the problem that Django Channels is trying to solve. The library is designed to let you handle such connections while leaving the Django core we are used to completely unchanged.

    It was written by a wonderful man, Andrew Godwin, who has a thick English accent and speaks very quickly. You probably know him from the long-forgotten Django South and from Django Migrations, which arrived in version 1.7. Having fixed migrations for Django, he moved on to fixing WebSockets and HTTP/2.

    How did he do it? Once upon a time, a picture made the rounds on the Internet: empty squares, arrows, and the caption “Good architecture”. You write your favorite technologies into the squares and get a site that scales well.



    Andrew Godwin filled in these squares as follows. At the front stands a server that accepts any requests, be they asynchronous, synchronous, e-mail, or anything else. Behind it is a pool of synchronous workers. Between them sits the so-called Channel Layer, which stores received messages in a format the workers can read. As soon as an asynchronous connection sends us something, the server writes it to the Channel Layer; a synchronous worker then picks it up and processes it synchronously, just like any Django view. As soon as the synchronous code sends a response back to the Channel Layer, the asynchronous server delivers it, streams it, and does whatever else is needed. This gives us an abstraction.
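
    The flow described here can be sketched with nothing but the standard library. This is a toy model, not how Channels is implemented: the channels dictionary, front_server and worker are all hypothetical names, and plain queues stand in for the Channel Layer.

```python
import queue
import threading

# Hypothetical in-process stand-in for a Channel Layer: named FIFO queues.
channels = {
    'websocket.receive': queue.Queue(),
    'reply': queue.Queue(),
}

def worker():
    """Synchronous worker: take one message, process it, send a reply."""
    message = channels['websocket.receive'].get()
    channels['reply'].put({'text': message['text'].upper()})

def front_server(text):
    """Stand-in for the front server: enqueue the message, then wait for the reply."""
    channels['websocket.receive'].put({'text': text})
    return channels['reply'].get(timeout=5)

thread = threading.Thread(target=worker)
thread.start()
reply = front_server('hello')
thread.join()
```

    The front server and the worker never call each other directly; they only agree on channel names and the message format, which is what lets them live in separate processes.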

    There are several implementations of it. In production, the suggestion is to use Twisted as the asynchronous server that acts as the frontend for Django, and Redis as the communication channel between synchronous Django and asynchronous Twisted.

    The good news is that to use Django Channels you do not need to know either Twisted or Redis at all. Your DevOps will know them, or you will get to know them while fixing production at three o'clock in the morning.

    ASGI


    This abstraction is a protocol called ASGI. It is a standard interface that sits between any network interface or server, whether its protocol is synchronous or asynchronous, and your application. Its central concept is the channel.

    Channel


    A channel is an ordered, first-in-first-out queue of messages, each of which has a lifetime. A message is delivered zero or one times, and it can be received by only one consumer.
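
    As a rough illustration of these semantics, here is a toy channel written from scratch. The Channel class, its expiry parameter and the injectable clock are assumptions made for the sketch; real channel layers implement the same idea on top of their own storage.

```python
import time
from collections import deque

class Channel:
    """Toy FIFO channel whose messages expire after `expiry` seconds."""

    def __init__(self, expiry=60.0, clock=time.monotonic):
        self.expiry = expiry
        self.clock = clock
        self._queue = deque()  # (deadline, message) pairs, oldest first

    def send(self, message):
        self._queue.append((self.clock() + self.expiry, message))

    def receive(self):
        """Return the oldest live message, or None if nothing is left.

        A message is removed as soon as it is handed out or found expired,
        so it is delivered at most once and to a single caller.
        """
        while self._queue:
            deadline, message = self._queue.popleft()
            if deadline > self.clock():
                return message
        return None

# Usage with a fake clock so that expiry can be shown without sleeping.
now = [0.0]
channel = Channel(expiry=10, clock=lambda: now[0])
channel.send('first')
channel.send('second')
first = channel.receive()   # delivered in FIFO order
now[0] = 11.0               # 'second' has now outlived its lifetime
second = channel.receive()  # expired messages are dropped, not delivered
```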

    Consumers


    A consumer is simply where you write your code.

    def ws_message(message):
        # Echo the received text back to the sender.
        message.reply_channel.send({
            'text': message.content['text'],
        })
    

    A function that accepts a message may send several replies, or it may send none at all. It is very similar to a view; the only difference is that there is no return value, so the function itself decides how many replies it sends.
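
    The "zero or many replies" behavior is easy to try without Channels installed. In the sketch below, FakeMessage and FakeReplyChannel are hypothetical test doubles that only mimic the two attributes the consumer touches, and ws_split_words is an invented consumer.

```python
class FakeReplyChannel:
    """Hypothetical test double that records what a consumer sends."""

    def __init__(self):
        self.sent = []

    def send(self, content):
        self.sent.append(content)

class FakeMessage:
    """Hypothetical message with just `content` and `reply_channel`."""

    def __init__(self, content):
        self.content = content
        self.reply_channel = FakeReplyChannel()

def ws_split_words(message):
    """Send one reply per word: none for empty text, several otherwise."""
    for word in message.content['text'].split():
        message.reply_channel.send({'text': word})

empty = FakeMessage({'text': ''})
ws_split_words(empty)

chatty = FakeMessage({'text': 'hello modern web'})
ws_split_words(chatty)
```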

    We add this function to routing; for example, we attach it to the receipt of a message on a web socket.

    from channels.routing import route
    from myapp.consumers import ws_message

    channel_routing = [
        route('websocket.receive', ws_message),
    ]
    

    We register it in the Django settings, just as we would register a database.

    CHANNEL_LAYERS = {
        'default': {
            # Dotted paths to the layer class and to the routing list.
            'BACKEND': 'asgiref.inmemory.ChannelLayer',
            'ROUTING': 'myproject.routing.channel_routing',
        },
    }
    

    A project can have several Channel Layers, just as it can have several databases. This is very similar to a database router, if you have ever used one.

    Next, we define our ASGI application. It is the shared entry point for both Twisted and the synchronous workers: all of them need this application to start.

    import os
    from channels.asgi import get_channel_layer
    os.environ.setdefault(
        'DJANGO_SETTINGS_MODULE',
        'myproject.settings',
    )
    channel_layer = get_channel_layer()
    

    After that, we deploy the code. We launch gunicorn, which serves ordinary HTTP requests synchronously through views, as we are used to. We also start the asynchronous server that stands in front of our synchronous Django, and the workers that process the messages.

    $ gunicorn myproject.wsgi
    $ daphne myproject.asgi:channel_layer
    $ django-admin runworker
    

    Reply channel


    As we have seen, a message has something called a reply channel. Why is it needed?

    A channel is unidirectional. Accordingly, websocket.receive, websocket.connect, and websocket.disconnect are system-wide channels for incoming messages, while the reply channel is tied strictly to one user's connection. So a message has an input channel and an output channel, and this pair lets you identify who the message came from.


    Groups


    A group is a set of channels. If we send a message to a group, it is automatically sent to all channels of this group. This is convenient because nobody likes to write for loops. Plus, the implementation of groups is usually done using the native functions of the Channel layer, so it works faster than just sending messages one by one.

    from channels import Group

    def ws_connect(message):
        # Subscribe this connection's reply channel to the group.
        Group('chat').add(message.reply_channel)

    def ws_disconnect(message):
        Group('chat').discard(message.reply_channel)

    def ws_message(message):
        # Broadcast the text to every channel in the group.
        Group('chat').send({
            'text': message.content['text'],
        })
    

    These consumers are added to routing as well.

    from channels.routing import route
    from myapp.consumers import ws_connect, ws_disconnect, ws_message

    channel_routing = [
        route('websocket.connect', ws_connect),
        route('websocket.disconnect', ws_disconnect),
        route('websocket.receive', ws_message),
    ]
    

    As soon as a channel is added to the group, a reply goes to every user connected to our site, rather than just being echoed back to the sender.
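
    The fan-out itself can be modeled in a few lines. ToyLayer below is a hypothetical in-memory stand-in written for this sketch, and its method names are invented for illustration; a real layer would perform the broadcast inside its backend.

```python
from collections import defaultdict

class ToyLayer:
    """Hypothetical in-memory layer: channels hold messages, groups hold channel names."""

    def __init__(self):
        self.channels = defaultdict(list)
        self.groups = defaultdict(set)

    def group_add(self, group, channel):
        self.groups[group].add(channel)

    def group_discard(self, group, channel):
        self.groups[group].discard(channel)

    def send_group(self, group, message):
        # One call from the consumer fans out to every member channel.
        for channel in self.groups[group]:
            self.channels[channel].append(message)

layer = ToyLayer()
layer.group_add('chat', 'reply.alice')
layer.group_add('chat', 'reply.bob')
layer.send_group('chat', {'text': 'hi'})

layer.group_discard('chat', 'reply.bob')  # bob disconnects
layer.send_group('chat', {'text': 'bye'})
```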

    Generic consumers


    What I love Django for is its declarative style. Accordingly, there are declarative consumers too.

    BaseConsumer is the basic one: all it can do is map a channel you have defined to one of your methods and call it.

    from channels.generic import BaseConsumer

    class MyConsumer(BaseConsumer):
        # Maps a channel name to the name of the method that handles it.
        method_mapping = {
            'channel.name.here': 'method_name',
        }

        def method_name(self, message, **kwargs):
            pass
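
    The mapping idea is simple enough to reproduce in isolation. The sketch below is not the library's code: ToyConsumer, its dispatch method and EchoConsumer are hypothetical, and messages are plain dictionaries.

```python
class ToyConsumer:
    """Hypothetical base class: look up the handler name for a channel and call it."""

    method_mapping = {}

    def dispatch(self, channel, message, **kwargs):
        handler = getattr(self, self.method_mapping[channel])
        return handler(message, **kwargs)

class EchoConsumer(ToyConsumer):
    method_mapping = {
        'websocket.receive': 'on_receive',
    }

    def on_receive(self, message, **kwargs):
        return message['text']

result = EchoConsumer().dispatch('websocket.receive', {'text': 'ping'})
```

    Because the mapping is plain data on the class, a router can read it to learn which channels the consumer wants without instantiating it.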

    There are many predefined consumers with deliberately extended behavior, such as WebsocketConsumer, which knows in advance that it handles WebSocket connect, WebSocket receive, and WebSocket disconnect. You can state right away which groups the reply channel should be added to, and when you call self.send it works out whether to send to a group or to a single user.

    from channels.generic.websockets import WebsocketConsumer

    class MyConsumer(WebsocketConsumer):
        def connection_groups(self):
            return ['chat']

        def connect(self, message):
            pass

        def receive(self, text=None, bytes=None):
            self.send(text=text, bytes=bytes)
    

    There is also a JSON version of the WebSocket consumer: receive gets neither text nor bytes but already-parsed JSON, which is convenient.

    It is added to routing in the same way, via route_class. route_class takes the mapping defined on the consumer, reads all the channels from it, and routes every channel listed there. This way there is less to write.

    Routing


    Let's look in detail at routing and what it gives us.

    First of all, filters.

    // app.js
    S = new WebSocket('ws://localhost:8000/chat/')

    # routing.py
    route('websocket.connect', ws_connect,
        path=r'^/chat/$')
    

    The filter can be the path taken from the URI of the web socket connection, or the HTTP request method. It can be any message field from a channel; for e-mail, for example: text, body, carbon copy, anything. A route accepts an arbitrary number of keyword arguments.
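
    A filter of this kind boils down to matching regular expressions against message fields. The helper below is a from-scratch sketch under that assumption; matches is a hypothetical name, and messages are plain dictionaries with string values.

```python
import re

def matches(filters, message):
    """Return True if every filter's regex matches the same-named message field."""
    return all(
        field in message and re.match(pattern, message[field])
        for field, pattern in filters.items()
    )

filters = {'path': r'^/chat/$', 'method': r'^GET$'}

ok = matches(filters, {'path': '/chat/', 'method': 'GET'})
wrong_path = matches(filters, {'path': '/blog/', 'method': 'GET'})
missing_field = matches(filters, {'path': '/chat/'})
```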

    Routing also allows nested routes. If several consumers share some common characteristics, it is convenient to group them and add them all to the routing at once.

    from channels import route, include

    blog_routes = [
        route('websocket.connect', blog,
            path=r'^/stream/'),
    ]

    routing = [
        include(blog_routes, path=r'^/blog'),
    ]
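
    One way to picture nested routes is as prefix stripping: the outer pattern consumes the start of the path and the inner routes see only the remainder. The resolver below is a sketch written on that assumption; resolve and the (pattern, target) tuples are hypothetical and are not the library's data structures.

```python
import re

def resolve(routes, path):
    """Walk (pattern, target) pairs; a list target is treated as a nested include."""
    for pattern, target in routes:
        match = re.match(pattern, path)
        if not match:
            continue
        if isinstance(target, list):
            # Nested include: strip the matched prefix and keep resolving.
            found = resolve(target, path[match.end():])
            if found is not None:
                return found
        else:
            return target
    return None

blog_routes = [(r'^/stream/', 'blog_consumer')]
routing = [(r'^/blog', blog_routes)]

nested = resolve(routing, '/blog/stream/')
no_prefix = resolve(routing, '/stream/')
no_inner = resolve(routing, '/blog/other/')
```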
    

    Multiplexing


    If we open several web sockets, each has its own URI, and we can attach a different handler to each of them. But let's be honest: opening several connections just to make things look neat on the backend is not an engineering approach.

    Therefore, it is possible to call several handlers through a single web socket. We define a WebsocketDemultiplexer, which works with the notion of a stream inside a single web socket. Based on the stream, it redirects your message to another channel.

    from channels.generic.websockets import WebsocketDemultiplexer

    class Demultiplexer(WebsocketDemultiplexer):
        # Maps a stream name inside the web socket to a channel name.
        mapping = {
            'intval': 'binding.intval',
        }
    

    The demultiplexer is added to routing in the same way as any other declarative consumer, via route_class.

    from channels import route_class, route
    from .consumers import Demultiplexer, ws_message

    channel_routing = [
        route_class(Demultiplexer, path=r'^/binding/'),
        route('binding.intval', ws_message),
    ]
    

    A stream field is added to the message so that the demultiplexer can work out where to put it. The payload field contains everything that goes into the target channel after the demultiplexer has processed the message.

    It is very important to note that the message passes through the Channel Layer twice: once before the demultiplexer and once after it. So as soon as you start using a demultiplexer, you automatically add latency to your requests.

    {
        "stream" : "intval",
        "payload" : {
            …
        }
    }
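
    The forwarding step for a frame of this shape can be sketched as follows. The in-memory channel_layer, the demultiplex function and the sample payload value are hypothetical; only the stream and payload keys and the mapping come from the example above.

```python
import json
from collections import defaultdict

MAPPING = {'intval': 'binding.intval'}  # stream name -> channel name
channel_layer = defaultdict(list)       # hypothetical in-memory stand-in

def demultiplex(frame_text):
    """Parse one web socket text frame and forward its payload to the mapped channel."""
    frame = json.loads(frame_text)
    channel = MAPPING[frame['stream']]
    # This is the second trip through the layer: the frame already arrived
    # via websocket.receive before it reached the demultiplexer.
    channel_layer[channel].append(frame['payload'])
    return channel

# The payload content here is made up purely for illustration.
target = demultiplex('{"stream": "intval", "payload": {"value": 42}}')
```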
    

    Sessions


    Each channel has its own session. This is very handy, for example, for keeping state between calls to handlers. Sessions are keyed by the reply channel, since that is an identifier that belongs to the user. The session is stored in the same engine that stores the ordinary HTTP session. For obvious reasons, signed cookies are not supported: a web socket simply does not have them.

    from channels import Group
    from channels.sessions import channel_session

    @channel_session
    def ws_connect(message):
        # Remember the room in the per-connection session.
        room = message.content['path']
        message.channel_session['room'] = room
        Group('chat-%s' % room).add(
            message.reply_channel
        )
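
    The core of the idea, state looked up by reply channel, fits in a small decorator. Everything below is a hypothetical sketch: SESSIONS is a plain dictionary instead of a session engine, toy_channel_session is an invented name, and messages are dictionaries.

```python
import functools

SESSIONS = {}  # hypothetical store: reply channel name -> session dict

def toy_channel_session(consumer):
    """Attach a per-connection session dict to the message before calling the consumer."""
    @functools.wraps(consumer)
    def wrapper(message):
        message['channel_session'] = SESSIONS.setdefault(message['reply_channel'], {})
        return consumer(message)
    return wrapper

@toy_channel_session
def ws_connect(message):
    message['channel_session']['room'] = message['path']

@toy_channel_session
def ws_message(message):
    return message['channel_session'].get('room')

ws_connect({'reply_channel': 'reply.alice', 'path': '/chat/python/'})
room = ws_message({'reply_channel': 'reply.alice'})  # same connection, state survives
other = ws_message({'reply_channel': 'reply.bob'})   # different connection, empty session
```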
    

    During connection, you can obtain the HTTP session and use it in your consumer. As part of the handshake that sets up the web socket connection, the user's cookies are sent to the server. Thanks to that, you can get the user's session and the same user object you are used to in Django, just as if you were working with a view.

    from channels.auth import http_session_user

    @http_session_user
    def ws_connect(message):
        message.http_session['room'] = room
        if message.user.username:
            …
    

    Message order


    Channels lets you solve a very important problem. If we open a web socket connection and immediately send something over it, two events, WebSocket connect and WebSocket receive, occur very close together in time. It is quite likely that the consumers for these events will run in parallel. Debugging that will be a lot of fun.

    Django Channels lets you apply two kinds of lock:

    1. Light lock. Using the session mechanism, we guarantee that no other message on the web socket is processed until the consumer for the connect message has finished. Once the connection is established, the order is arbitrary and execution may be parallel.
    2. Strict lock. Only one consumer for a given user runs at a time. This adds synchronization overhead, because the slow session engine is used. Nevertheless, the option exists.

    from channels.generic.websockets import WebsocketConsumer

    class MyConsumer(WebsocketConsumer):
        http_user = True
        slight_ordering = True
        strict_ordering = False

        def connection_groups(self, **kwargs):
            return ['chat']

    For plain function consumers, the same thing is expressed with decorators, like the ones we saw earlier for the HTTP session and the channel session. In a declarative consumer you can simply set attributes, and they automatically apply to all methods of that consumer.

    Data binding


    In its day, Meteor became famous for data binding.

    Open two browsers, go to the same page, and in one of them click on the scroll bar. At the same time, in the second browser, on this page, the scroll bar changes its value. That's cool.

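    from channels.binding.websockets import WebsocketBinding
    from .models import IntegerValue  # assumed location of the model

    class IntegerValueBinding(WebsocketBinding):
        model = IntegerValue
        stream = 'intval'
        fields = ['name', 'value']

        def group_names(self, instance, action):
            return ['intval-updates']

        def has_permission(self, user, action, pk):
            return True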
    

    Django can now do the same.

    This is implemented with hooks provided by Django signals. If a binding is defined for a model, all connections in the group for that model instance are notified of every event: the model was created, changed, or deleted, and each of these ends up in a notification. Notifications cover the specified fields: when the value of such a field changes, a payload is built and sent over the web socket. It is convenient.

    It is important to understand that if, in our example, we keep clicking the scroll bar, messages will keep flowing and the model will keep being saved. This works up to a certain load, after which everything bottlenecks on the database.

    Redis layer


    Let's talk a little more about how the most popular Channel Layer for production, Redis, is built.

    It is well designed:

    • it works with synchronous connections at the worker level;
    • it is very friendly to Twisted and does not slow things down where it matters most, that is, on your front-end server;
    • MessagePack is used to serialize messages inside Redis, which reduces the footprint of each message;
    • you can distribute the load across multiple Redis instances; it is sharded automatically using a consistent hashing algorithm. Thus, the single point of failure disappears.
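
    For readers who have not met consistent hashing, here is a generic textbook-style ring. It is only an illustration of the technique and makes no claim about how the Redis layer implements it; HashRing, the replicas parameter, the use of MD5 and the host names are all assumptions of this sketch.

```python
import bisect
import hashlib

class HashRing:
    """Toy consistent-hash ring mapping keys (e.g. channel names) to hosts."""

    def __init__(self, hosts, replicas=64):
        # Each host gets several points on the ring to even out the load.
        self._ring = sorted(
            (self._hash(f'{host}#{i}'), host)
            for host in hosts
            for i in range(replicas)
        )
        self._points = [point for point, _ in self._ring]

    @staticmethod
    def _hash(value):
        return int(hashlib.md5(value.encode()).hexdigest(), 16)

    def host_for(self, key):
        # First point clockwise from the key's hash, wrapping around the ring.
        index = bisect.bisect(self._points, self._hash(key)) % len(self._ring)
        return self._ring[index][1]

keys = [f'chan-{i}' for i in range(200)]
ring = HashRing(['redis-1', 'redis-2', 'redis-3'])
before = {key: ring.host_for(key) for key in keys}

# Drop one host: only the keys that lived on it should move.
smaller = HashRing(['redis-1', 'redis-2'])
after = {key: smaller.host_for(key) for key in keys}
```

    The useful property is that removing redis-3 relocates only the keys it owned; keys on the other hosts stay where they were, which a naive hash-modulo-N scheme does not guarantee.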

    A channel is simply a Redis list of ids. Stored under each id is the content of a particular message. This is done so that the lifetime of each message and of each channel can be controlled separately. In principle, this is logical.

    >> SET "b6dc0dfce" "\x81\xa4text\xachello"
    >> RPUSH "websocket.send!sGOpfny" "b6dc0dfce"
    >> EXPIRE "b6dc0dfce" "60"
    >> EXPIRE "websocket.send!sGOpfny" "61"

    Groups are implemented as sorted sets. Broadcasting to a group is done inside a Lua script, so it is very fast.

    >> type group:chat
    zset
    >> ZRANGE group:chat 0 1 WITHSCORES
    1) "websocket.send!sGOpfny"
    2) "1476199781.8159261"

    Problems


    Let's see what problems this approach has.

    Callback hell


    The first problem is a freshly reinvented callback hell. It is important to understand that most of the problems you run into with channels will look like this: the consumer received arguments it did not expect. Where they came from and who put them into Redis is a thankless thing to investigate. Debugging distributed systems is, in general, for the strong in spirit. AsyncIO solves this problem.

    Celery


    On the Internet, they write that Django Channels is a replacement for Celery.

    I have bad news for you - no, it is not.

    In channels:

    • there is no retry, and you cannot delay the execution of a handler;
    • there is no canvas, just callbacks. Celery provides groups, chains, and my favorite, the chord, which calls another callback once a group has finished running in parallel, acting as a synchronization point. None of this exists in Channels;
    • there is no way to set the arrival time of messages, and some systems cannot be designed without that.

    I see the future as official support for using channels and celery together, at minimal cost, with minimal effort. But Django Channels is not a Celery replacement.

    Django for modern web


    Django Channels is Django for the modern web. It is the same Django we are all used to: synchronous, declarative, with lots of batteries included. Django Channels is just one more battery. You should always understand where to use it and whether it is worth using. If a project does not need Django, it does not need Channels either. They are useful only in projects where Django itself is justified.

    Moscow Python Conf ++

    The professional conference for Python developers is moving to a new level: on October 22 and 23, 2018, we will gather 600 of the best Python programmers in Russia, present the most interesting talks and, of course, create an environment for networking in the best traditions of the Moscow Python community, with the support of the “Ontiko” team.

    We invite experts to submit a talk. The program committee is already at work and is accepting applications until September 7.

    For attendees, the program is being brainstormed online. In this document, you can add missing topics or speakers whose talks would interest you. The document will be updated, so you will be able to follow the program as it takes shape.
