PHP GR8: Will JIT Improve PHP 8 Performance?
PHP is one of the main development languages at Badoo. In our data centers, thousands of processor cores execute millions of lines of PHP code. We follow the news closely and actively look for ways to improve performance, because at our scale even a small optimization yields significant resource savings. One of the biggest PHP performance stories is the arrival of JIT in version 8 of the language. Naturally, this caught our attention, so we translated an article about what JIT is, how it will be implemented in PHP, why the decision was made, and what to expect from it.
Unless you have been living in a cave or have just arrived from the past (in which case, welcome), you already know that PHP 8 will have JIT: the vote concluded quietly and peacefully the other day, and the overwhelming majority of participants voted in favor, so the matter is settled.
In a fit of joy, you might even bust out a few crazy moves like the ones in the photo (this, by the way, is called the “Detroit JIT”).
Now sit down and read this myth-busting article. I want to clear up the confusion about what JIT is and what it is good for, and explain how it works (but not in too much detail, so you don't get bored).
Since I don't know who will be reading this, I'll move from simple questions to complex ones. If you already know the answer to the question in a heading, feel free to skip that section.
What is JIT?
PHP is built on a virtual machine (we call it the Zend VM). PHP compiles source code into instructions that the virtual machine understands (this is the compilation stage). The instructions produced at this stage are called opcodes. At the runtime stage, the Zend VM executes the opcodes, thereby doing the required work.
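To make the "opcodes executed by a virtual machine" idea concrete, here is a minimal sketch of a toy stack machine. The opcode names (`PUSH`, `ADD`, `MUL`), the array format, and the function `run_toy_vm()` are all invented for illustration; the real Zend VM uses a different instruction set and design.

```php
<?php
declare(strict_types=1);

// Toy "opcodes" for the expression (2 + 3) * 4.
// Hypothetical format, not the one the Zend VM uses.
$opcodes = [
    ['PUSH', 2],
    ['PUSH', 3],
    ['ADD'],
    ['PUSH', 4],
    ['MUL'],
];

// Toy virtual machine: walks the opcode list and runs one handler per instruction.
function run_toy_vm(array $opcodes): int
{
    $stack = [];
    foreach ($opcodes as $op) {
        switch ($op[0]) {
            case 'PUSH':
                $stack[] = $op[1];
                break;
            case 'ADD':
                $b = array_pop($stack);
                $a = array_pop($stack);
                $stack[] = $a + $b;
                break;
            case 'MUL':
                $b = array_pop($stack);
                $a = array_pop($stack);
                $stack[] = $a * $b;
                break;
            default:
                throw new InvalidArgumentException("Unknown opcode: {$op[0]}");
        }
    }
    return array_pop($stack);
}

echo run_toy_vm($opcodes), PHP_EOL; // prints 20
```

The dispatch loop (fetch an instruction, pick a handler, repeat) is the part an interpreter pays for on every instruction.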
This scheme works great. In addition, tools like APC (formerly) and OpCache (today) cache the output of the compilation stage, so that stage runs only when necessary.
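If you are curious whether compiled scripts are actually being cached in your environment, a small check along these lines may help. It relies on the `opcache_get_status()` function provided by the OPcache extension; the helper name `opcache_is_caching()` is made up for this sketch. Note that OPcache is commonly disabled for the command line, so the answer can differ between CLI and web server.

```php
<?php
declare(strict_types=1);

// Hypothetical helper: reports whether OPcache is loaded and enabled
// for the current process.
function opcache_is_caching(): bool
{
    if (!function_exists('opcache_get_status')) {
        return false; // the OPcache extension is not loaded
    }
    // Passing false skips the per-script details we do not need here.
    $status = opcache_get_status(false);
    return is_array($status) && !empty($status['opcache_enabled']);
}

echo opcache_is_caching()
    ? "OPcache is caching compiled scripts\n"
    : "OPcache is not active here\n";
```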
In short, JIT stands for just-in-time compilation, a strategy in which the code is first translated into an intermediate representation, which is then turned into architecture-dependent machine code at run time.
In PHP, this means that JIT will treat the virtual machine instructions produced at the compilation stage as the intermediate representation and emit machine code that is executed directly by the processor rather than by the Zend VM.
Why does PHP need JIT?
Shortly before PHP 7.0 appeared, the PHP team's main focus was the performance of the language. Most of the major changes in PHP 7.0 came from the PHPNG patch, which greatly improved the way PHP uses memory and the processor. Ever since, each of us has had to keep one eye on performance.
After the release of PHP 7.0, performance improvements continued: the hash table (PHP's core data structure) was optimized, certain opcodes in the Zend VM and certain sequences in the compiler were specialized, the Optimizer (a component of OpCache) was improved continually, and many other changes were made.
The harsh truth is that, as a result of all these optimizations, we are rapidly approaching the limit of what can still be improved.
Please note: by “limit” I mean that the trade-offs required for further improvements no longer look attractive. Performance optimization is always about trade-offs. Often we have to sacrifice simplicity for the sake of performance. Everyone would like to think that the simplest code is also the fastest, but in the modern world of C programming that is not the case. The fastest code is most often code written to take advantage of the internals of the architecture or of intrinsics built into the platform or compiler. Simplicity alone does not guarantee the best performance.
Therefore, at this stage, the best way to squeeze even more performance out of PHP is to implement JIT.
Will JIT speed up my site?
Most likely, not significantly.
This may not be the answer you were expecting. The fact is that PHP applications are, in general, I/O bound, whereas JIT works best on code that is CPU bound.
What do “I/O bound” and “CPU bound” mean?
To describe the general performance characteristics of a piece of code or an application, we use the terms “I/O bound” and “CPU bound”.
The simplest definition:
- I/O-bound code will run much faster if we find a way to improve (reduce, optimize) the I/O operations it performs;
- CPU-bound code will run much faster if we find a way to improve (reduce, optimize) the instructions executed by the processor, or magically increase the processor's clock speed.
A piece of code or an application can be I/O bound, CPU bound, or both.
In general, PHP applications tend to be I/O bound: their main bottleneck is usually I/O operations such as connecting to, reading from, and writing to databases, caches, files, sockets, and so on.
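The difference is easy to feel with a rough timing sketch. Everything here is an assumption made for illustration: `fake_io()` stands in for a database round trip with a made-up 20 ms wait, `cpu_work()` stands in for the application logic of a typical request, and `time_it()` is a small helper around `hrtime()`.

```php
<?php
declare(strict_types=1);

// Measure the wall-clock seconds spent inside a callable.
function time_it(callable $fn): float
{
    $start = hrtime(true); // monotonic clock, in nanoseconds
    $fn();
    return (hrtime(true) - $start) / 1e9;
}

// Stand-in for an I/O wait (hypothetical 20 ms database round trip).
function fake_io(): void
{
    usleep(20000);
}

// Stand-in for the in-process work done while handling a request.
function cpu_work(int $n): int
{
    $sum = 0;
    for ($i = 0; $i < $n; $i++) {
        $sum += $i % 7;
    }
    return $sum;
}

$ioSeconds  = time_it('fake_io');
$cpuSeconds = time_it(function (): void {
    cpu_work(10000);
});

printf("I/O wait: %.4f s, CPU work: %.4f s\n", $ioSeconds, $cpuSeconds);
```

When the waiting dominates the measurement, making `cpu_work()` faster barely changes the total, and that is the situation most PHP applications are in.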
What does CPU-bound PHP code look like?
Some PHP programmers may be unfamiliar with CPU-bound code because of the nature of most PHP applications: they usually act as a front end to a database or cache, fetching data and producing small HTML / JSON / XML responses.
You can look through your code base and find plenty of code that has nothing to do with I/O, code that calls functions that have nothing to do with I/O. And you may be puzzled that this does not make your application CPU bound, even though more lines of its code avoid I/O than perform it.
The fact is that PHP is one of the fastest interpreted languages. There is no noticeable difference between calling a function that does no I/O from the Zend VM and calling it from machine code. Of course, there is some difference, but both machine code and the Zend VM use a calling convention, so it does not matter whether you call some_c_level_function() from opcodes or from machine code: who makes the call has no noticeable effect on the performance of the whole application. Note: in simple terms, a calling convention is the sequence of instructions executed before entering another function. In both cases, the calling convention places the arguments on the stack.
You may ask: “What about loops, tail calls, and so on?” PHP is quite smart here: with the Optimizer from OpCache enabled, your code will be magically transformed into a more efficient version of what you wrote.
It should be noted that JIT will not change the Zend VM calling convention. This is because PHP must be able to switch between JIT and VM modes at any time (hence the decision to keep the current convention). As a result, the calls you see everywhere will not run much faster with JIT.
If you want to see what CPU-bound PHP code looks like, take a look here: https://github.com/php/php-src/blob/master/Zend/bench.php. This is an extreme example, but it shows that JIT truly shines in mathematics.
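For a feel of this kind of code without opening the benchmark, here is a small sketch in the same spirit: a tight loop of floating-point arithmetic with no I/O and almost no function calls. It is not taken from bench.php; the function name `mandel_iterations()` and the grid parameters are chosen for illustration only.

```php
<?php
declare(strict_types=1);

// Count how many iterations of z = z^2 + c it takes for the point
// c = (cr, ci) to escape the circle of radius 2, up to $maxIter.
function mandel_iterations(float $cr, float $ci, int $maxIter = 1000): int
{
    $zr = 0.0;
    $zi = 0.0;
    for ($i = 0; $i < $maxIter; $i++) {
        $zr2 = $zr * $zr;
        $zi2 = $zi * $zi;
        if ($zr2 + $zi2 > 4.0) {
            return $i; // escaped
        }
        $zi = 2.0 * $zr * $zi + $ci;
        $zr = $zr2 - $zi2 + $cr;
    }
    return $maxIter; // never escaped within the limit
}

// Scan a coarse grid: nothing but arithmetic and comparisons.
$inside = 0;
for ($y = -1.0; $y <= 1.0; $y += 0.05) {
    for ($x = -2.0; $x <= 0.5; $x += 0.05) {
        if (mandel_iterations($x, $y, 200) === 200) {
            $inside++;
        }
    }
}
echo "Points that never escaped: $inside\n";
```

Nearly all the time here goes to executing arithmetic instructions, which is the kind of workload where machine code can be expected to help the most.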
Did we have to make such an extreme trade-off just to speed up mathematical calculations in PHP?
No. We did it to broaden the range of applications for the language (and to broaden it significantly).
We don't want to brag, but PHP dominates the web. If you do web development and are not considering PHP for your next project, you are doing something wrong (in the opinion of one very biased PHP developer).
At first glance, speeding up mathematical calculations in PHP may seem to have very narrow application. However, it opens the way to, for example, machine learning, 3D rendering, 2D rendering (GUI), and data analysis.
Why can't this be implemented in PHP 7.4?
Above I called JIT an extreme trade-off, and I really do think so: it is one of the most complex compilation strategies in existence, if not the most complex. Implementing JIT means a significant increase in complexity.
If you ask Dmitry, the author of the JIT, whether he has made PHP complex, he will answer: “No, I hate complexity” (that is a direct quote).
Essentially, “complex” means “something we do not understand.” And today, few of the language's developers truly understand the existing JIT implementation.
Work on PHP 7.4 is progressing rapidly, and introducing JIT in that version would mean that only a handful of people could debug, fix, and improve the language. That was unacceptable to those who voted against JIT in PHP 7.4.
By the time PHP 8 is released, many of us will have come to understand the JIT implementation. There are features we want to implement and tools we want to rewrite for version 8, so we need to understand the JIT first. We need that time, and we are very grateful that the majority voted to give it to us.
Complex is not a synonym for terrible. Complexity can be as beautiful as a stellar nebula, and that is exactly the case with JIT. In other words, even when 20 people on our team understand JIT no worse than Dmitry does, that will not change the inherently complex nature of JIT.
Will PHP development slow down?
There is no reason to think so. We have enough time, so it is fair to say that by the time PHP 8 is ready, enough of us will have mastered JIT to work no less efficiently than we do today when it comes to fixing bugs and developing PHP.
When you try to square this with the idea that JIT is inherently complex, remember that most of the time we spend on introducing new features goes to discussing them. Most often, when working on features and fixing bugs, writing the code takes minutes or hours, while the discussions take weeks or months. In rare cases the code takes hours or days to write, but even then the discussions always last longer.
That's all I wanted to say.
And since we are talking about performance, I invite you to the talk by my colleague Pavel Murzakov on May 17 at the PHP Russia conference. Pasha knows how to squeeze the last CPU second out of PHP code!