RoadRunner: PHP was not made to die, or Golang to the rescue
Hi, Habr! We at Badoo actively work on PHP performance: we run a fairly large system written in this language, so performance is a matter of saving money. More than a decade ago we created PHP-FPM for this purpose; it began as a set of patches for PHP and was later included in the official distribution.
In recent years PHP has made great progress: the garbage collector has improved and stability has increased, so today you can write daemons and long-lived scripts in PHP without any special problems. This allowed Spiral Scout to go further: RoadRunner, unlike PHP-FPM, does not clear memory between requests, which gives an additional performance gain (although this approach complicates development). We are currently experimenting with this tool but have no results to share yet. To make the wait more fun, we are publishing a translation of Spiral Scout's RoadRunner announcement.
The approach described in the article is close to our own: we also often solve our problems with a combination of PHP and Go, taking advantage of both languages instead of giving up one in favor of the other.
Over the past ten years we have built applications both for Fortune 500 companies and for businesses with an audience of no more than 500 users. All this time our engineers developed backends mainly in PHP. But two years ago something strongly affected not only the performance of our products but also their scalability: we introduced Golang (Go) into our technology stack.
Almost immediately we discovered that Go lets us build larger applications with a performance increase of up to 40 times. With it we were able to extend existing products written in PHP, improving them by combining the advantages of both languages.
We will explain how the combination of Go and PHP helps solve real development problems, and how it turned into a tool that removes some of the problems associated with PHP's dying model.
Your daily PHP development environment
Before explaining how Go can be used to revive PHP's dying model, let's look at a standard PHP development environment.
In most cases you run the application using a combination of the nginx web server and the PHP-FPM server. The former serves static files and forwards specific requests to PHP-FPM, while PHP-FPM itself executes the PHP code. Perhaps you use the less popular combination of Apache and mod_php. Although it works a little differently, the principles are the same.
Consider how PHP-FPM executes application code. When a request arrives, PHP-FPM initializes a child PHP process and passes the details of the request to it as part of its state (_GET, _POST, _SERVER, etc.).
The state cannot change while the PHP script is running, so there is only one way to get a new set of input data: clear the process memory and initialize it again.
This execution model has many advantages. You do not need to worry much about memory consumption, all processes are completely isolated, and if one of them "dies", it is recreated automatically without affecting the other processes. But this approach also has disadvantages, which show up when you try to scale the application.
Disadvantages and inefficiencies of a typical PHP environment
If you develop in PHP professionally, you know where a new project starts: with choosing a framework. A framework bundles libraries for dependency injection, ORM, translations and templates. And, of course, all user input can be conveniently placed into a single object (Symfony/HttpFoundation or PSR-7). Frameworks are cool!
But everything has its price. In any enterprise framework, to process a simple user request or access a database you have to load at least dozens of files, create numerous classes and parse several configurations. Worst of all, after each task you have to reset everything and start again: all the code you have just initialized becomes useless, and you cannot use it to process another request. Tell this to a programmer who writes in any other language and you will see bewilderment on their face.
For years PHP engineers have looked for ways to solve this problem, using lazy loading, microframeworks, optimized libraries, caches and so on. But in the end you still have to tear down the entire application and start over, again and again. (Translator's note: this problem will be partially solved with the arrival of preloading in PHP 7.4.)
Can PHP, with the help of Go, survive more than one request?
You can write PHP scripts that live longer than a few minutes (up to hours or days): for example, cron tasks, CSV parsers and queue consumers. All of them follow the same scenario: they fetch a task, execute it and wait for the next one. The code stays in memory, which saves precious milliseconds, because loading the framework and the application requires many additional steps.
But developing long-lived scripts is not that easy. Any error kills the process completely, diagnosing memory leaks is infuriating, and you can no longer debug by pressing F5.
The situation improved with the release of PHP 7: a reliable garbage collector appeared, error handling became easier, and core extensions are now protected against leaks. True, engineers still need to handle memory carefully and keep in mind the problems of state in the code (is there a language in which you can ignore these things?). Still, PHP 7 holds fewer surprises.
Is it possible to take the model of long-lived PHP scripts, adapt it to more trivial tasks such as processing HTTP requests, and thereby eliminate the need to load everything from scratch on each request?
To solve this problem, we first had to implement a server application that can receive HTTP requests and forward them one by one to a PHP worker without killing it every time.
We knew that we could write a web server in pure PHP (PHP-PM) or with a C extension (Swoole). Although each method has its merits, neither option suited us: we wanted something more. We needed more than a web server. We wanted a solution that would save us from the problems of the "heavy start" in PHP and that could also be easily adapted and extended for specific applications. In other words, we needed an application server.
Can Go help with this? We knew it could, because this language compiles applications into single binary files; it is cross-platform; it has its own, very elegant concurrency model and a library for working with HTTP; and, finally, thousands of open-source libraries and integrations would be available to us.
The difficulty of combining two programming languages
First of all, it was necessary to determine how two or more applications would communicate with each other.
For example, with the excellent library by Alex Palaestras we could have shared memory between PHP and Go processes (similar to mod_php in Apache). But this library has characteristics that limit its use for our problem.
We decided to use a different, more common approach: to build inter-process communication over sockets and pipes. Over the past decades this approach has proven its reliability and has been well optimized at the operating system level.
To begin with, we created a simple binary protocol for exchanging data between processes and handling transmission errors. In its simplest form, this type of protocol is similar to netstring with a fixed-size packet header (17 bytes in our case), which contains the packet type, its size and a binary mask for checking data integrity.
On the PHP side we used the pack function, and on the Go side the encoding/binary package.
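The header described above can be sketched in Go as follows. This is only an illustration, not the actual wire format: the article gives the header size (17 bytes) and its contents, but not the byte layout, so the layout here (1 byte of type, 8 bytes of size in little-endian, and the same size repeated in big-endian as a cheap integrity check) is an assumption, as are the names encodeHeader and decodeHeader.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// headerSize matches the 17-byte header mentioned in the article.
const headerSize = 17

// encodeHeader builds a header for a payload of the given type and size.
// Assumed layout (not taken from the article):
//   byte 0      packet type
//   bytes 1-8   payload size, little-endian
//   bytes 9-16  payload size, big-endian (used as an integrity check)
func encodeHeader(packetType byte, size uint64) [headerSize]byte {
	var h [headerSize]byte
	h[0] = packetType
	binary.LittleEndian.PutUint64(h[1:9], size)
	binary.BigEndian.PutUint64(h[9:17], size)
	return h
}

// decodeHeader returns the packet type and payload size, or an error
// if the two copies of the size disagree.
func decodeHeader(h [headerSize]byte) (byte, uint64, error) {
	le := binary.LittleEndian.Uint64(h[1:9])
	be := binary.BigEndian.Uint64(h[9:17])
	if le != be {
		return 0, 0, errors.New("corrupted header: size copies differ")
	}
	return h[0], le, nil
}

func main() {
	h := encodeHeader(1, 5)
	packetType, size, err := decodeHeader(h)
	fmt.Println(packetType, size, err)
}
```

A fixed-size header lets the reader know exactly how many bytes to read before it learns the payload length, which keeps the framing logic on both sides trivial.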
A protocol alone did not seem enough to us, so we added the ability to call Go net/rpc services directly from PHP. Later this helped us a lot in development, since we could easily integrate Go libraries into PHP applications. The result of this work can be seen, for example, in our other open-source product, Goridge.
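For readers unfamiliar with net/rpc, here is a minimal sketch of what such a Go service looks like. It assumes a hypothetical Greeter service and, to stay self-contained, calls it from a Go client over an in-memory connection; in the setup the article describes, the caller would be PHP talking over a socket or pipe instead.

```go
package main

import (
	"fmt"
	"net"
	"net/rpc"
)

// Greeter is a hypothetical service used only for illustration.
type Greeter struct{}

// Hello follows the net/rpc method convention: an exported method with
// two arguments, the second being a pointer, and an error return value.
func (Greeter) Hello(name string, reply *string) error {
	*reply = "Hello, " + name
	return nil
}

// callHello registers the service, serves a single in-memory connection
// and performs one call to Greeter.Hello over it.
func callHello(name string) (string, error) {
	server := rpc.NewServer()
	if err := server.Register(Greeter{}); err != nil {
		return "", err
	}

	clientConn, serverConn := net.Pipe()
	go server.ServeConn(serverConn)

	client := rpc.NewClient(clientConn)
	defer client.Close()

	var reply string
	err := client.Call("Greeter.Hello", name, &reply)
	return reply, err
}

func main() {
	reply, err := callHello("PHP")
	fmt.Println(reply, err)
}
```

Any Go library wrapped in a service like this becomes callable by method name, which is what makes it usable from another process without writing a native extension.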
Distribution of tasks across several PHP workers
Once the communication mechanism was in place, we started thinking about how best to hand tasks over to PHP processes. When a task arrives, the application server must pick a free worker to execute it. If the worker process terminates with an error or "dies", we get rid of it and create a new one in its place. If the worker process completes successfully, we return it to the pool of workers available for tasks.
To store the pool of active workers we used a buffered channel. To remove unexpectedly "dead" workers from the pool, we added a mechanism for tracking worker errors and states.
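The pool logic described above can be sketched with a buffered channel as follows. This is a simplified illustration under stated assumptions: the worker type is a hypothetical stand-in for a long-lived PHP process, and a payload of "crash" simulates a worker that died while handling a task.

```go
package main

import (
	"errors"
	"fmt"
	"sync/atomic"
)

// worker stands in for a long-lived PHP process (hypothetical type).
type worker struct {
	id int
}

// exec simulates handling a task; the payload "crash" simulates a dead worker.
func (w *worker) exec(payload string) (string, error) {
	if payload == "crash" {
		return "", errors.New("worker died")
	}
	return fmt.Sprintf("worker %d handled %q", w.id, payload), nil
}

// pool keeps free workers in a buffered channel.
type pool struct {
	lastID int64 // kept first so atomic access is aligned on 32-bit platforms
	free   chan *worker
}

func newPool(size int) *pool {
	p := &pool{free: make(chan *worker, size)}
	for i := 0; i < size; i++ {
		p.free <- p.spawn()
	}
	return p
}

// spawn creates a fresh worker with the next id.
func (p *pool) spawn() *worker {
	return &worker{id: int(atomic.AddInt64(&p.lastID, 1))}
}

// handle takes a free worker, runs the task, and then either returns the
// worker to the pool or replaces it with a new one if it failed.
func (p *pool) handle(payload string) (string, error) {
	w := <-p.free // blocks until a worker is free
	out, err := w.exec(payload)
	if err != nil {
		p.free <- p.spawn() // discard the dead worker, add a replacement
		return "", err
	}
	p.free <- w
	return out, nil
}

func main() {
	p := newPool(2)
	for _, task := range []string{"ping", "crash", "ping"} {
		out, err := p.handle(task)
		fmt.Println(out, err)
	}
}
```

Receiving from the channel doubles as back-pressure: when every worker is busy, callers simply wait, and the channel capacity caps the number of live workers.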
As a result, we got a working PHP server capable of processing any requests presented in binary form.
For our application to start working as a web server, we had to choose a reliable PHP standard for representing any incoming HTTP request. In our case we simply convert the net/http request from Go into the PSR-7 format, so that it is compatible with most PHP frameworks available today.
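A conversion of this kind might look roughly like the sketch below. The article does not describe the actual payload format, so the requestPayload struct, its JSON field names and the helpers toPayload and sampleRequest are all assumptions; the idea is only that the Go side flattens the request into plain data from which the PHP side can build a PSR-7 ServerRequest.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strings"
)

// requestPayload is a hypothetical, serializable description of a request
// holding the pieces a PSR-7 ServerRequest needs on the PHP side.
type requestPayload struct {
	Method  string              `json:"method"`
	URI     string              `json:"uri"`
	Headers map[string][]string `json:"headers"`
	Query   map[string][]string `json:"query"`
	Body    string              `json:"body"`
}

// toPayload copies the relevant parts of a net/http request into plain data.
func toPayload(r *http.Request) (requestPayload, error) {
	body, err := io.ReadAll(r.Body)
	if err != nil {
		return requestPayload{}, err
	}
	return requestPayload{
		Method:  r.Method,
		URI:     r.URL.String(),
		Headers: r.Header,
		Query:   r.URL.Query(),
		Body:    string(body),
	}, nil
}

// sampleRequest builds a request for demonstration purposes.
func sampleRequest() *http.Request {
	req := httptest.NewRequest("POST", "/users?page=2", strings.NewReader(`{"name":"x"}`))
	req.Header.Set("Content-Type", "application/json")
	return req
}

func main() {
	payload, err := toPayload(sampleRequest())
	if err != nil {
		panic(err)
	}
	data, err := json.Marshal(payload)
	if err != nil {
		panic(err)
	}
	// These bytes would be sent to a PHP worker as the body of one packet.
	fmt.Println(string(data))
}
```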
Since PSR-7 is considered immutable (some will argue that technically it is not), developers have to write applications that, as a matter of principle, do not treat the request as a global entity. This fits perfectly with the concept of long-lived PHP processes. Our final implementation, which did not yet have a name, looked like this:
Introducing RoadRunner: a high-performance PHP application server
Our first test case was an API backend that periodically received unpredictable bursts of requests (far more than usual). Although nginx could cope in most cases, we regularly ran into 502 errors because we could not rebalance the system quickly enough for the expected increase in load.
To replace this solution, we deployed our first PHP/Go application server at the beginning of 2018. The effect was immediate and incredible! Not only did we get rid of the 502 errors completely, we were also able to reduce the number of servers by two thirds, saving a lot of money as well as headache pills for engineers and product managers.
By the middle of the year we had improved our solution, published it on GitHub under the MIT license and named it RoadRunner, emphasizing its incredible speed and efficiency.
How RoadRunner can improve your development stack
RoadRunner allowed us to use net/http middleware on the Go side to verify JWTs before a request even reaches PHP, as well as to handle WebSockets and aggregate statistics globally in Prometheus.
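The middleware idea can be sketched with the standard net/http package alone. This is not RoadRunner's own API: the requireToken function, the verifier type and the "demo-token" check are hypothetical, and real JWT verification (signature, expiry, claims) is deliberately left behind the verifier function.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// verifier reports whether a token is acceptable. Real JWT verification
// would live behind this function; it is left abstract on purpose.
type verifier func(token string) bool

// requireToken rejects requests without a valid bearer token before they
// reach next, which stands in for the handler that forwards to PHP workers.
func requireToken(verify verifier, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		auth := r.Header.Get("Authorization")
		if !strings.HasPrefix(auth, "Bearer ") || !verify(strings.TrimPrefix(auth, "Bearer ")) {
			http.Error(w, "unauthorized", http.StatusUnauthorized)
			return
		}
		next.ServeHTTP(w, r)
	})
}

// statusFor runs one request through the middleware and returns the
// resulting HTTP status code.
func statusFor(authHeader string) int {
	verify := func(token string) bool { return token == "demo-token" } // placeholder check
	php := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, "handled by a PHP worker")
	})

	req := httptest.NewRequest("GET", "/api", nil)
	if authHeader != "" {
		req.Header.Set("Authorization", authHeader)
	}
	rec := httptest.NewRecorder()
	requireToken(verify, php).ServeHTTP(rec, req)
	return rec.Code
}

func main() {
	fmt.Println(statusFor("Bearer demo-token"))
	fmt.Println(statusFor(""))
}
```

Rejecting a request at this stage means an unauthenticated call never occupies a PHP worker at all.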
Thanks to the built-in RPC, you can expose the API of any Go library to PHP without writing wrapper extensions. More importantly, with RoadRunner you can deploy new kinds of servers that speak protocols other than HTTP. Examples include running AWS Lambda handlers in PHP, building robust queue consumers, and even adding gRPC to our applications.
With the help of the PHP and Go communities we increased the stability of the solution, improved application performance by up to 40 times in some tests, improved the debugging tools, implemented integration with the Symfony framework, and added support for HTTPS, HTTP/2, plugins and PSR-17.
Some people are still in the thrall of an outdated view of PHP as a slow, cumbersome language suitable only for writing WordPress plugins. They may even claim that PHP has a hard limit: when an application becomes large enough, you have to choose a more "mature" language and rewrite a code base that has accumulated over many years.
To all of this we want to answer: think again. We believe that any limits on PHP are ones you set yourself. You can spend your whole life moving from one language to another in search of a perfect match for your needs, or you can start treating languages as tools. The supposed flaws of a language like PHP may in fact be the reasons for its success. And if you combine it with another language such as Go, you will build far more powerful products than if you limited yourself to a single language.
Having worked with the combination of Go and PHP, we can say that we love them both. We do not plan to sacrifice one for the other; on the contrary, we will look for ways to get even more out of this dual stack.
UPD: Please welcome the creator of RoadRunner and co-author of the original article, Lachezis.