
Increase performance with SO_REUSEPORT in NGINX 1.9.1
NGINX version 1.9.1 has a new feature that lets you use the socket option SO_REUSEPORT, which is available in modern operating systems such as DragonFly BSD and Linux (kernel 3.9 and newer). This option allows multiple listening sockets to be opened on the same address and port at once; the kernel then distributes incoming connections among them. (In NGINX Plus, this functionality will appear in Release 7, which will be released later this year.)
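To make the mechanics concrete, here is a minimal C sketch (not NGINX source code) showing that SO_REUSEPORT lets two sockets bind the same address and port; the port number 8080 and the loopback address are arbitrary choices for the illustration:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <stdint.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Open a TCP listener on 127.0.0.1:port with SO_REUSEPORT set. */
static int make_listener(uint16_t port) {
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0) { perror("socket"); exit(1); }

    int one = 1;
    /* The key call: allow several sockets to bind the same addr:port. */
    if (setsockopt(fd, SOL_SOCKET, SO_REUSEPORT, &one, sizeof(one)) < 0) {
        perror("setsockopt(SO_REUSEPORT)");
        exit(1);
    }

    struct sockaddr_in addr;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(port);

    if (bind(fd, (struct sockaddr *) &addr, sizeof(addr)) < 0) {
        perror("bind");  /* without SO_REUSEPORT the 2nd bind fails: EADDRINUSE */
        exit(1);
    }
    if (listen(fd, SOMAXCONN) < 0) { perror("listen"); exit(1); }
    return fd;
}

int main(void) {
    int a = make_listener(8080);
    int b = make_listener(8080);  /* succeeds only because of SO_REUSEPORT */
    printf("two listeners on the same port: fds %d and %d\n", a, b);
    close(a);
    close(b);
    return 0;
}

Note that on Linux the kernel only allows sockets to share a port this way when they are bound by the same effective user ID, which prevents one user from hijacking another's port.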
The SO_REUSEPORT option has many potential applications. For example, some programs can use it to update their executable code on the fly (NGINX has had this capability since time immemorial, via a different mechanism). In NGINX, enabling the option can improve performance in certain cases by reducing lock contention. As shown in the figure, without SO_REUSEPORT a single listening socket is shared among multiple worker processes, and each of them tries to accept new connections from it:
With SO_REUSEPORT, there are multiple listening sockets, one per worker process. The operating system kernel decides which socket each new connection goes to (and thus which worker process will ultimately receive it):
On multi-core systems this reduces blocking when multiple worker processes accept connections simultaneously. However, it also means that when a worker process is stalled by a long operation, the stall affects not only the connections it is already handling but also the connections still waiting in its own socket's queue.
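As a sketch of this per-worker model (illustrative only, and not how NGINX itself is implemented), the following C program forks several workers, each of which opens its own SO_REUSEPORT listener on the same port and accepts connections independently; the worker count, port, and canned response are arbitrary, and error handling is omitted for brevity:

#include <string.h>
#include <unistd.h>
#include <netinet/in.h>
#include <sys/socket.h>

#define WORKERS 4
#define PORT    8080

/* Each worker opens its own listening socket; the kernel load-balances
 * incoming connections across all sockets bound to the same port. */
static void worker_loop(void) {
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    int one = 1;
    setsockopt(fd, SOL_SOCKET, SO_REUSEPORT, &one, sizeof(one));

    struct sockaddr_in addr;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(PORT);
    bind(fd, (struct sockaddr *) &addr, sizeof(addr));
    listen(fd, SOMAXCONN);

    for (;;) {
        /* No socket is shared between workers, so there is no accept
         * contention; but connections queued on THIS socket must wait
         * if THIS worker stalls, as described above. */
        int conn = accept(fd, NULL, NULL);
        if (conn < 0) continue;
        const char resp[] = "HTTP/1.0 200 OK\r\nContent-Length: 2\r\n\r\nOK";
        write(conn, resp, sizeof(resp) - 1);
        close(conn);
    }
}

int main(void) {
    for (int i = 0; i < WORKERS; i++) {
        if (fork() == 0) {  /* child: become a worker, never return */
            worker_loop();
        }
    }
    for (;;) pause();       /* parent just keeps the workers alive */
    return 0;
}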
Configuration
To enable SO_REUSEPORT in the http or stream module, specify the reuseport parameter of the listen directive, as in this example:

http {
    server {
        listen 80 reuseport;
        server_name example.org;
        ...
    }
}

stream {
    server {
        listen 12345 reuseport;
        ...
    }
}
Specifying reuseport also automatically disables accept_mutex for that socket, since the mutex is not needed in this mode.

Testing performance with reuseport
Measurements were made with wrk using 4 NGINX worker processes on a 36-core AWS instance. To minimize network overhead, the client and server communicated over the loopback interface, and NGINX was configured to return the string OK. Three configurations were compared: accept_mutex on (the default), accept_mutex off, and reuseport. As the diagram shows, enabling reuseport increases the number of requests per second by 2-3 times and reduces both latency and its variance.
Measurements were also taken with the client and server running on different machines, requesting an HTML file. As the table shows, with reuseport latency drops much as in the previous test, and its spread decreases even more (by almost an order of magnitude). Other tests likewise show good results from using the option. With reuseport, the load was spread evenly across the worker processes. With accept_mutex on, an imbalance was observed at the start of the test, and with accept_mutex off, all worker processes consumed more CPU time.

| Configuration | Latency (ms) | Latency stdev (ms) | CPU load |
|---|---|---|---|
| Default | 15.65 | 26.59 | 0.3 |
| accept_mutex off | 15.59 | 26.48 | 10 |
| reuseport | 12.35 | 3.15 | 0.3 |
In these tests the request rate was extremely high while the requests required no complex processing. Various observations confirm that reuseport yields the greatest benefit when the load matches this pattern. Accordingly, the reuseport option is not available for the mail module, since mail traffic definitely does not fit these conditions. We recommend taking your own measurements to confirm that reuseport helps, rather than blindly enabling the option wherever possible. Some tips on testing NGINX performance can be found in Konstantin Pavlov's talk at the nginx.conf 2014 conference.

Acknowledgments
Thanks to Sepherosa Ziehau and Yingqi Lu, each of whom proposed their own solution for SO_REUSEPORT support in NGINX. The NGINX team drew on their ideas for an implementation we consider ideal.