homm June 21, 2009 at 18:25

Deploying nginx + mod_wsgi on a server

Hello. For a long time I have been eyeing the wonderful Django framework: I read the book, studied articles, and tried writing hello worlds (easy and pleasant with Django's built-in server). Yesterday I tried to configure a production server from start to finish, and it turned out to be not so simple; it even seemed to me that if I were younger and less experienced, I would have given up on the whole thing. So I decided to share complete instructions with readers, along with some reasoning and the configs. The article is aimed at beginners, but everyone will find it interesting, I promise.

Why wsgi and nginx?

There are several reasons. First, the solution should be fast, light on memory, and stable. Nginx was created to be exactly that. In addition, according to performance tests, it is the nginx + wsgi combination that gives lower memory consumption and high fault tolerance under heavy load. Second, it is well known that the simpler a system is, the more reliable it is. Compared with wsgi, fastcgi is a separate server on which the application itself runs, and someone has to make sure that this server does not crash. wsgi, by contrast, is a call to Python scripts from a C application (which nginx is), so once the interface is configured we have half as many entities and from then on deal only with the web server.

In addition, I suspect that wsgi will generally be faster for the end user, though this rests only on a theoretical argument. Suppose 50 users connect to our server at the same time and request equally heavy pages. With fastcgi, nginx will immediately send 50 requests to the fastcgi server and wait for the responses. In theory, those requests will spawn a bunch of running application instances that compete equally for processor time and all finish at about the same moment (assuming the number of running fastcgi applications is unlimited). With wsgi, nginx processes the requests itself, and since the maximum number of simultaneously executing requests equals the number of workers, the requests queue up and the first users receive their answers almost immediately. The last ones, however, will arrive at the same time as with fastcgi; there is no way around that.
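The argument above can be checked with a toy model. This is a minimal sketch under simplifying assumptions that are mine, not measurements: every request needs the same amount of CPU time, the fastcgi case is modeled as ideal processor sharing, and the wsgi case as a FIFO queue with one request per worker and one worker per core. The function names are hypothetical.

```python
def finish_times_shared(n_requests, work_ms, cores):
    """All requests run at once and share the cores equally (fastcgi model)."""
    # With more requests than cores, each request gets cores/n of a core,
    # so all of them finish together.
    slowdown = max(n_requests / cores, 1)
    return [work_ms * slowdown] * n_requests


def finish_times_queued(n_requests, work_ms, workers):
    """At most `workers` requests run at once, the rest wait in FIFO order."""
    # Request i is served in batch number i // workers (counting from zero).
    return [work_ms * (i // workers + 1) for i in range(n_requests)]


if __name__ == "__main__":
    shared = finish_times_shared(50, 100, cores=2)
    queued = finish_times_queued(50, 100, workers=2)
    print("first response:", shared[0], "ms vs", queued[0], "ms")
    print("last response: ", shared[-1], "ms vs", queued[-1], "ms")
    print("mean response: ", sum(shared) / 50, "ms vs", sum(queued) / 50, "ms")
```

In this model the last response arrives at the same moment in both cases, while the first response and the average response arrive sooner with the queue.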

From theory to practice

The most boring part: building nginx. You have to build it from source because there is simply no other way to hook in mod_wsgi, so download the latest stable version (currently 0.7.60) from Igor Sysoev’s site and unpack it somewhere. The official page of mod_wsgi, as far as I understand, is here. And this is where the main disappointment awaits: the project has not been updated for more than a year. The sources will need a little fixing for compatibility with the current version of nginx, but never mind, the main thing is that everything runs. Download the latest version of the module and unpack it somewhere next to nginx. What exactly needs to be edited:

1) In the mod_wsgi archive, the patches folder contains the patch nginx-0.6.x.patch, which patches config and ngx_http_wsgi_module.c. Anything beyond that can be ignored; we won’t need it.
2) Since the current version of nginx did not yet exist when mod_wsgi was last updated, the official patch is not enough. Searching for the remaining compilation errors led me only to this site. I don’t know Japanese, but that wasn’t necessary: the required patches are right on the page. True, the line numbers in that patch did not match those in the source, so I had to edit by hand.

In the file src/ngx_http_wsgi_module.c, declare this structure somewhere near the beginning of the file:

static ngx_path_init_t ngx_http_wsgi_temp_path = {
    ngx_string(NGX_HTTP_WSGI_TEMP_PATH), { 1, 2, 0 }
};

In the same file, replace the first occurrence of ngx_garbage_collector_temp_handler with NULL. Then change the function call

ngx_conf_merge_path_value(conf->temp_path,
        prev->temp_path,
        NGX_HTTP_WSGI_TEMP_PATH, 1, 2, 0,
        ngx_garbage_collector_temp_handler, cf);

to

ngx_conf_merge_path_value(cf, &conf->temp_path,
        prev->temp_path,
        &ngx_http_wsgi_temp_path);

Looking ahead, I’ll say that later we will have to return to the mod_wsgi sources and fix something else in them, but in principle they are now ready to build.

Go to the directory with the nginx sources and run
./configure --add-module=/path/to/mod_wsgi/

Here I hit a small snag. Since the http_cache module started shipping with nginx, the build has required OpenSSL. I searched for the openssl-dev package for a long time and even wanted to build without this module, but then the package turned up, and it was simply called ssl-dev. By the way, the error message itself contains a mistake: the parameter that disables http_cache is not “--without-http_cache” but “--without-http-cache”.

After that I ran make && make install.

Nginx configuration

The very first thing to fix in nginx.conf is the number of workers. You have probably heard that the optimal number of workers usually equals the number of cores on the machine. With mod_wsgi, things change a bit. Because the applications run in the context of those same workers, the number of application processes running simultaneously equals the number of workers. Under ideal conditions, one worker per core would load the processor to exactly 100%. But conditions are not ideal, and application execution is sometimes interrupted, for example during disk operations or database queries when the database server is on another machine. So it seems to me that the optimal number of workers is twice the number of cores.
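The rule of thumb above is easy to compute on the target machine. A minimal sketch: the helper name suggested_workers is hypothetical, and the factor of 2 is the estimate from the paragraph above, not a measured optimum. It prints a worker_processes line for nginx.conf.

```python
import multiprocessing


def suggested_workers(cores=None, factor=2):
    """Return the worker count from the rule of thumb above: factor * cores."""
    if cores is None:
        cores = multiprocessing.cpu_count()
    return max(1, cores * factor)


if __name__ == "__main__":
    # Prints a line that can be pasted into nginx.conf.
    print("worker_processes %d;" % suggested_workers())
```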

Next, configure the location in the server section.
If we do as the mod_wsgi example suggests:

location / {
        wsgi_pass /path/to/nginx/django.py;
}

then we can immediately say goodbye to serving static files from the same server. A popular solution is to write a regular expression for common file types and try to serve them as files, passing everything else to the backend. But I don't like this option. First, a request for an image that is not on the server yields the standard nginx error page; second, what if you need to serve an automatically generated pdf file? It is better for it to have a pdf extension too. So I propose my own variant: every path is first looked up on disk, and if no such file is found, we go to the backend for the page (and if nothing is found there either, the 404 page comes from the application, not from the server). It uses 2 files that will also need to be created:

location / {
        root           /path/to/www/;
        error_page 404 = @backend;
        log_not_found  off;
}

location = / {
        # Server root
        wsgi_pass      /path/to/nginx/django.py;
        include        wsgi_params;
}

location @backend {
        wsgi_pass      /path/to/nginx/django.py;
        include        wsgi_params;
}

We take wsgi_params from here:

wsgi_var REQUEST_METHOD $request_method;
wsgi_var QUERY_STRING $query_string;

wsgi_var CONTENT_TYPE $content_type;
wsgi_var CONTENT_LENGTH $content_length;

wsgi_var SERVER_NAME $server_name;
wsgi_var SERVER_PORT $server_port;

wsgi_var SERVER_PROTOCOL $server_protocol;

wsgi_var REQUEST_URI $request_uri;
wsgi_var DOCUMENT_URI $document_uri;
wsgi_var DOCUMENT_ROOT $document_root;

wsgi_var SERVER_SOFTWARE $nginx_version;

wsgi_var REMOTE_ADDR $remote_addr;
wsgi_var REMOTE_PORT $remote_port;
wsgi_var SERVER_ADDR $server_addr;

wsgi_var REMOTE_USER $remote_user;

And django.py from the official documentation:

import os, sys

# directory that contains the project package
sys.path.append('/path/to/')

# the project's settings module
os.environ['DJANGO_SETTINGS_MODULE'] = 'project.settings'

import django.core.handlers.wsgi

application = django.core.handlers.wsgi.WSGIHandler()

Note that DJANGO_SETTINGS_MODULE holds a module name, not a file name; for the example above, the settings file must actually be located at “/path/to/project/settings.py”.
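Before pointing wsgi_pass at Django, it can help to check the wiring with a trivial WSGI application. This is a sketch, assuming that the module looks up a callable named application in the file, as the django.py above suggests; the file name hello.py is hypothetical. The standard library's wsgiref server is used only for a local check without nginx.

```python
# hello.py (hypothetical file name)
def application(environ, start_response):
    """Minimal WSGI app: reports the PATH_INFO it received."""
    path = environ.get("PATH_INFO", "")
    body = ("Hello from WSGI, PATH_INFO=%r\n" % path).encode("utf-8")
    start_response("200 OK", [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


if __name__ == "__main__":
    # Local check without nginx, using the standard library's reference server.
    from wsgiref.simple_server import make_server
    make_server("127.0.0.1", 8000, application).serve_forever()
```

Because the response shows the PATH_INFO value the application actually received, it also makes any path rewriting done by the server visible.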

Big bummer

If I haven't missed anything, everything should now be ready to start nginx. Open the browser, go to http://localhost/ (or whatever you have) and see Django's welcome page if the project is new, like mine, or the main page if something has already been written. Honestly, I thought the biggest difficulties were behind me, but no such luck. As soon as I typed http://localhost/1 (or any other url other than the root), I got a 500 error from nginx. Clearly that is worse than a 500 from Django. An entry appeared in the logs saying that a negative second argument (the string length) had been passed to the PyString_FromStringAndSize function.
After digging through the sources, I concluded that the cause is that the module tries to strip, from the beginning of the path_info string, the name of the location whose handler received the url. That is, in our case all pages land in the location @backend, and the module tries to cut exactly 8 characters from path_info, which simply doesn’t have that many. When the url did have enough characters, we got a path truncated from the beginning: instead of http://localhost/123456789 the application received http://localhost89.
The comment there says TODO “we should check that clcf->name is in r->uri”, i.e. we need to check that the name of this location actually occurs in the url string.
In principle, I understand why this was done. For example, one could define

location /django {
        wsgi_pass …
}

and a url of the form http://localhost/django/foo/bar/ would be passed to Django as http://localhost/foo/bar/. But it is not clear why the author did not take into account that location names can also be regular expressions or named locations (as in our case). I decided I didn’t need such buggy functionality, especially since, if I ever did need it, it could be done in Django by writing urls.py with include.
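The behaviour described above can be modelled in a few lines. This is a Python sketch of the logic as described in the text, not a translation of the module's C code; both function names are hypothetical. The second function shows the kind of check that the TODO comment asks for.

```python
def strip_location_naive(uri, location_name):
    """Cut len(location_name) characters from the start of uri, unchecked."""
    remaining = len(uri) - len(location_name)
    if remaining < 0:
        # In the C module this negative length is what reaches
        # PyString_FromStringAndSize and the request ends with a 500.
        raise ValueError("negative length: %d" % remaining)
    return uri[len(location_name):]


def strip_location_checked(uri, location_name):
    """Strip the prefix only when uri really starts with location_name."""
    if location_name != "/" and uri.startswith(location_name):
        return uri[len(location_name):] or "/"
    return uri


if __name__ == "__main__":
    print(strip_location_naive("/123456789", "@backend"))       # truncated path
    print(strip_location_checked("/123456789", "@backend"))     # left intact
    print(strip_location_checked("/django/foo/bar/", "/django"))
```

A real check would also have to handle regular-expression locations and prefix boundaries (so that /djangoes is not treated as being under /django); the sketch ignores both.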

So, line 584 in ngx_wsgi_run.c needs to be changed from

if (clcf->name.len == 1 && clcf->name.data[0] == '/') {

to

if (1 || (clcf->name.len == 1 && clcf->name.data[0] == '/')) {

Then run make && make install again, after stopping the running nginx.

Server restart

Python is not PHP. Once loaded, the application is detached from its source and knows nothing about changes to it. So we will have to restart nginx from time to time when updating the code. Fortunately, nginx is a very good server: we don’t need a script that kills it and starts it again, disconnecting all clients. It can restart its workers gracefully, without immediately dropping the clients they are serving. This is done by sending the HUP signal to the master process. So all we need is a simple script, called, say, restart-server:

sudo kill -HUP `cat /usr/local/nginx/logs/nginx.pid`

Every time this script runs, the application is restarted. In addition, I would strongly recommend putting this script in cron, say once an hour. It will not affect performance noticeably, but it can save memory if the scripts start to leak.
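The same reload can be triggered from Python, for example at the end of a deployment script. A minimal sketch, assuming the pid file path used in the shell script above; the function names are hypothetical.

```python
import os
import signal

PID_FILE = "/usr/local/nginx/logs/nginx.pid"  # same path as in the shell script


def read_pid(pid_file=PID_FILE):
    """Return the master process id stored in the pid file."""
    with open(pid_file) as f:
        return int(f.read().strip())


def reload_nginx(pid_file=PID_FILE):
    """Send HUP to the nginx master so that it restarts its workers."""
    os.kill(read_pid(pid_file), signal.SIGHUP)


if __name__ == "__main__":
    # Needs the same privileges as the sudo in the shell version.
    reload_nginx()
```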

A fly in the ointment

Unfortunately, the resulting server has an Achilles heel. If 10 people happen to request some slow page at the same time (say, pdf generation or avatar resizing) and you have only 10 workers, then no other request (not even the lightest one, such as a gif image) will be processed until at least one worker becomes free. Unfortunately, I did not test other ways of connecting the backend; perhaps the same effect occurs with fastcgi or http_pass, but it seems to me that since separate tcp connections are used there, they should not behave this way. Even this drawback is not so terrible, though, because, as I said, nginx is a wonderful server. It divides requests into heavy and light ones, and even if many heavy requests are waiting in the queue, as soon as at least one worker is freed it will first serve all the light requests and only then take on the heavy ones. In other words, light requests always have priority over heavy ones, regardless of the order in which they arrived.
