5 ways to deploy PHP code under highload
If highload were taught at school, the textbook on the subject would contain a problem like this: "Social network N has 2,000 servers holding 150,000 files of PHP code totaling 900 MB, and a staging cluster of 50 machines. The code is deployed to the servers twice a day, on the staging cluster it is updated every few minutes, and on top of that there are "hotfixes": small sets of files that are rolled out out of turn to all servers or to a selected subset, without waiting for the full rollout. Question: are such conditions considered highload, and how do you deploy under them? Give at least 5 deployment options." We can only dream of such a highload problem book, but we already know that Yuri Nasretdinov (youROCK) would definitely solve this problem and get an "A".
Yuri did not stop at a simple solution: he also gave a talk in which he explained the concept of "code deploy", covered classic and alternative approaches to large-scale PHP deployment, analyzed their performance, and presented the MDK deployment system.
The concept of “deploy code”
In English, "deploy" literally refers to bringing troops into combat readiness, and in Russian we sometimes say "push the code into battle", which means much the same thing. You take the code, compiled or in source form (as with PHP), upload it to the servers that serve user traffic, and then, as if by magic, switch the load from one version of the code to the other. All of this is covered by the term "code deploy".
The deployment process usually consists of several stages:
- Getting the code from the repository, in any way you like: clone, fetch, checkout.
- Assembly (build). For PHP code the build phase may be absent; in our case it usually means automatic generation of translation files, uploading static files to a CDN, and a few other operations.
- Delivery to the end servers: the deployment itself.
After everything is built, the actual deployment phase begins: the code is pushed to the production servers. It is this phase at Badoo that this article is about.
Old deployment system in Badoo
Historically, we delivered the code to the servers as file system images. If you have a file containing a file system image, how do you mount it? On Linux you need to create an intermediate loop device, attach the file to it, and only then can that block device be mounted.
A loop device is a crutch that Linux needs in order to mount a file system image; there are operating systems that manage without it.
How does deployment using such image files, which we call "loops" for short, work? There is a directory containing the source code and the automatically generated content. We take an empty file system image: nowadays it is EXT2, earlier we used ReiserFS. We mount the empty image into a temporary directory and copy all the contents there; anything that should not end up in production simply is not copied. After that we unmount the device and get a file system image containing exactly the files we need. We then compress the image and upload it to all the servers, where it is uncompressed and mounted.
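As a rough illustration, here is what such a build step could look like as a PHP script shelling out to standard Linux tools. The paths, image size and rsync excludes are made up for the example; this is a sketch, not Badoo's actual build tooling.

```php
<?php
// Hypothetical sketch of building a "loop": an EXT2 image with the code inside.
// Requires root privileges (mount/umount); all paths and sizes are examples.
function run(string $cmd): void {
    passthru($cmd, $code);
    if ($code !== 0) {
        fwrite(STDERR, "command failed: $cmd\n");
        exit(1);
    }
}

$image = '/tmp/build.img';
$mnt   = '/tmp/build-mnt';

run("dd if=/dev/zero of=$image bs=1M count=1024");    // empty 1 GB file
run("mkfs.ext2 -F -q $image");                        // create a file system inside it
run("mkdir -p $mnt && mount -o loop $image $mnt");    // mount it via a loop device
run("rsync -a --exclude=.git /local/source/ $mnt/");  // copy only what production needs
run("umount $mnt");
run("gzip -f $image");                                // build.img.gz is what gets shipped
```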
Other existing solutions
First, let's thank Richard Stallman - without his license, most of the utilities that we use would not have existed.
I have loosely divided the methods of deploying PHP code into 4 categories:
- Based on a version control system: svn up, git pull, hg up.
- Based on the rsync utility: to a new directory or "on top" of the existing one.
- Deploying a single file, whatever it may be: phar, hhbc, loop.
- The special method suggested by Rasmus Lerdorf: rsync, 2 directories and realpath_root.
Each method has its pros and cons, because of which we eventually abandoned it. Let's consider these 4 methods in more detail.
Deployment based on a version control system: svn up
I did not pick SVN by accident: in my observation, deployment in this form survives mostly in SVN-based projects. The system is quite lightweight and lets you deploy easily and quickly: just run svn up and you are done.
But this method has one big drawback: if you run svn up and requests arrive while the source code is being updated from the repository, those requests will see a state of the file system that never existed in the repository: some files are already new while others are still old. This is a non-atomic deployment method that is not suitable for high load, only for small projects. Despite this, I know projects that still deploy this way, and so far everything works for them.
Deployment based on rsync utility
There are two ways to do this: use the utility to upload the files into a new directory on the server, or upload "on top" of the existing one, that is, update it in place.
rsync to a new directory
Since you first pour the entire code into a directory that does not yet exist on the server and only then switch traffic, this method is atomic: nobody sees an intermediate state. In our case, though, creating 150,000 files and then deleting the old directory, which also contains 150,000 files, puts a large load on the disk subsystem. We use hard disks very actively, and for about a minute after such an operation the server does not feel well. And since we have 2,000 servers, 900 MB has to be uploaded 2,000 times.
This scheme can be improved by first uploading to a certain number of intermediate servers, for example 50, and then fanning out from them to the rest (see the sketch below). This solves possible network problems, but the problem of creating and deleting a huge number of files does not go away.
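A sketch of what this could look like, assuming passwordless SSH between the machines; the host list, paths and the 50-server split are illustrative assumptions, not Badoo's actual tooling.

```php
<?php
// Hypothetical sketch: rsync every release into a fresh, version-named directory,
// seeding a first tier of servers and fanning out from them to the rest.
$servers   = file('servers.txt', FILE_IGNORE_NEW_LINES); // assumed host list
$version   = date('Y-m-d_His');
$target    = "/local/www/releases/$version/";
$firstTier = array_slice($servers, 0, 50);
$rest      = array_slice($servers, 50);

foreach ($firstTier as $host) {
    exec("rsync -a /local/build/ $host:$target");
}
foreach ($rest as $i => $host) {
    // Each remaining server pulls from one of the already-seeded machines.
    $seed = $firstTier[$i % count($firstTier)];
    exec("ssh $seed rsync -a $target $host:$target");
}
// Traffic is switched to the new directory only after the copy has finished.
```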
rsync on top
If you have used rsync, you know that this utility can not only upload whole directories but also update existing ones. Sending only the changes is a plus, but since we upload the changes into the same directory from which we serve the production code, there will again be some intermediate state, and that is a minus.
Sending the changes works like this: rsync builds a list of files on the side the deployment runs from and on the receiving side, collects stat information for all of them, and the full list is sent back. On the server the deployment runs from, the difference between these lists is computed, and that determines which files need to be sent.
In our conditions this process takes about 3 MB of traffic and 1 second of CPU time per server. That does not sound like much, but with 2,000 servers it adds up to at least a minute of CPU time. This is not such a fast method, but it is definitely better than sending everything through rsync. All that remains is to somehow solve the atomicity problem and it would be nearly perfect.
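For completeness, the "on top" variant is essentially an in-place rsync of the production directory, along these lines (host list and paths are again illustrative assumptions):

```php
<?php
// Hypothetical "rsync on top": update the live directory in place.
// --delete removes files that no longer exist in the build; note that while
// this runs, requests can see a mix of old and new files (non-atomic).
$servers = file('servers.txt', FILE_IGNORE_NEW_LINES); // assumed host list
foreach ($servers as $host) {
    exec("rsync -a --delete /local/build/ $host:/local/www/current/");
}
```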
Deploy single file
Whatever single file you upload, it is relatively easy to do with BitTorrent or the UFTP utility. A single file is easier to unpack, it can be replaced atomically on Unix, and it is easy to verify the integrity of the file that was generated on the build server and delivered to the destination machines by comparing MD5 or SHA-1 sums (with rsync you do not know what actually ended up on the destination servers).
For hard drives, sequential writing is a big plus: a 900 MB file is written to an idle hard drive in about 10 seconds. But you still need to write those 900 MB and transfer them over the network.
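The integrity check and the atomic replacement mentioned above could look roughly like this on the receiving server; the file names and the way the expected checksum is delivered are assumptions for illustration.

```php
<?php
// Hypothetical receiving side: verify the downloaded image and swap it in atomically.
$tmp      = '/local/www/build.img.gz.tmp';   // written by BitTorrent/UFTP
$final    = '/local/www/build.img.gz';
$expected = trim(file_get_contents('/local/www/build.img.gz.md5')); // shipped alongside

if (md5_file($tmp) !== $expected) {
    fwrite(STDERR, "checksum mismatch, keeping the old image\n");
    exit(1);
}
// rename(2) within one file system replaces the target atomically.
rename($tmp, $final);
```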
Lyrical digression about UFTP
This open source utility was originally created for transferring files over networks with long delays, for example over a satellite link. But UFTP turned out to be well suited for uploading files to a large number of machines, because it works over UDP using multicast. A single multicast address is created, all the machines that want to receive the file subscribe to it, and the switches make sure that a copy of every packet is delivered to each machine. This way the burden of data transfer is shifted onto the network. If your network can handle it, this method works much better than BitTorrent.
You can try this open source utility on your cluster. Even though it works over UDP, it has a NACK (negative acknowledgment) mechanism that triggers retransmission of packets lost on delivery, so it is a reliable way to deploy.
Single file deployment options
tar.gz
An option that combines the disadvantages of both approaches. Not only do you have to write 900 MB to disk sequentially, you then have to write the same 900 MB again with random reads and writes while creating 150,000 files. Performance-wise this method is even worse than rsync.
phar
PHP supports archives in the phar format (PHP Archive) and knows how to read their contents and include files from them. But not every project is easy to pack into a single phar: code adaptation is needed, and without it the code from such an archive simply does not work. Besides, you cannot change a single file inside the archive (Yuri from the future: in theory you still can); you have to re-upload the whole archive. Also, even though phar archives work with OPCache, the cache has to be flushed when deploying, otherwise OPCache will contain garbage left over from the old phar file.
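For reference, building such an archive with PHP's own Phar class looks roughly like this; phar.readonly must be disabled in php.ini, and the paths are examples.

```php
<?php
// Minimal sketch of packing a project into a phar archive.
$phar = new Phar('/local/build/app.phar');
$phar->buildFromDirectory('/local/source', '/\.php$/');   // only .php files in this sketch
$phar->setStub(Phar::createDefaultStub('index.php'));     // entry point inside the archive
// On the servers the code is then included through the phar:// stream wrapper:
// require 'phar:///local/www/app.phar/index.php';
```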
hhbc
This method is native to HHVM (the HipHop Virtual Machine) and is used by Facebook. It is something like a phar archive, except it contains not the source code but the compiled bytecode of the HHVM virtual machine, Facebook's PHP interpreter. You are not allowed to change anything in this file: you cannot create new classes or functions, and some other dynamic features are disabled in this mode. Thanks to these limitations the virtual machine can apply additional optimizations; according to Facebook, this can add up to 30% to code execution speed. For them this is probably a good option. It is also impossible to change a single file here (Yuri from the future: actually it is possible, because it is an SQLite database). If you want to change one line, you have to rebuild the whole archive.
For this method you are not allowed to use eval and dynamic include. That is true, but not entirely: eval can be used as long as it does not create new classes or functions, and include cannot be done from directories outside the archive.
loop
This is our old option, and it has two big advantages. First, it looks like a regular directory: you mount the loop, and the code does not care; it works with files both in the development environment and in production. Second, the loop can be mounted in read-write mode, so you can change one file if something urgently needs to be fixed in production.
But the loop has drawbacks. First, it works strangely with Docker; I will talk about this a bit later.
Second, if you use a symlink to the latest loop as document_root, you will have problems with OPCache. It does not handle symlinks in paths very well and starts to get confused about which versions of files to use, so OPCache has to be reset on every deployment.
Another problem is that superuser privileges are required to mount file systems, and you must not forget to mount them when the machine starts or restarts, otherwise there will be an empty directory instead of the code.
Problems with Docker
If you create a Docker container and throw into it a folder in which "loops" or other block devices are mounted, you get two problems at once: new mount points do not propagate into the Docker container, and the "loops" that existed when the container was created cannot be unmounted because they are held by the container.
Naturally, this is simply incompatible with deployment: the number of loop devices is limited, and it is unclear how new code is supposed to get into the container.
We tried to do strange things, like running a local NFS server or mounting the directory with SSHFS, but for various reasons none of that took root with us. In the end we set up a cron job that once a minute ran rsync from the latest "loop" into the current directory:
rsync /var/loop/<N>/ /var/www/
Here /var/www/ is the directory that is forwarded into the container. On machines running Docker containers we do not need to execute PHP scripts often, so the fact that this rsync was not atomic was acceptable for us. Still, this method is of course very bad; I would like to have a deployment system that works well with Docker.

rsync, 2 directories and realpath_root
This method was suggested by Rasmus Lerdorf, the author of PHP, and he knows a thing or two about deployment.
How do you make a deploy atomic with any of the methods I have talked about? Take a symlink and register it as document_root. At any point in time the symlink points to one of two directories, and you rsync into the neighboring directory, that is, the one the symlink does not currently point to.
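Switching the symlink itself can be done atomically, for example like this (directory names are illustrative):

```php
<?php
// Sketch of an atomic symlink switch between the two release directories.
$newRelease = '/local/www/releases/B';    // the directory rsync has just filled
$current    = '/local/www/current';       // document_root points here

@unlink($current . '.tmp');               // clean up a possible leftover temporary link
symlink($newRelease, $current . '.tmp');  // create the new link under a temporary name
rename($current . '.tmp', $current);      // rename(2) replaces the old link atomically
```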
But then a problem arises: the PHP code does not know which of the two directories it was launched from. Therefore you need, for example, a variable that you set somewhere at the beginning, in the config; it records which directory the code was started from and which directory new files should be included from. On the slide it is called ROOT_DIR.

Use this constant when accessing all files inside the code that you run in production. This gives you the atomicity property: requests that arrived before you switched the symlink keep including files from the old directory, in which you have changed nothing, while new requests that came after the symlink switch work from the new directory and are served by the new code.
But this has to be written into the code, and not all projects are ready for that.
Rasmus-style
Instead of manually modifying the code and creating constants, Rasmus suggests slightly modifying Apache or using nginx.
As document_root, specify the symlink to the latest version. With nginx you can write

root $realpath_root;

while for Apache you will need a separate module with the settings shown on the slide. It works like this: when a request arrives, nginx or Apache computes realpath() of the path, resolving the symlinks, and passes that path as document_root. In this case document_root always points to a regular directory without symlinks, and your PHP code does not have to think about which directory it is called from.

This method has interesting advantages: real paths end up in PHP's OPCache, and they contain no symlinks. Even the very first file the request hits will already have its full real path, so there are no problems with OPCache. Since document_root is used, this works with any PHP project; you do not need to adapt anything.
It also does not require an fpm reload, and you do not need to reset OPCache during the deployment. Resetting OPCache puts a heavy load on the server's CPU, because it has to parse all the files again: in my experiment, resetting OPCache increased CPU consumption 2-3 times for about half a minute. It would be nice to reuse the cache, and this method lets you do exactly that.
Now the cons. Since you do not flush OPCache and you have 2 directories, you have to keep a copy of every file in memory for each directory: OPCache requires twice as much memory.
There is another limitation that may seem strange: you cannot deploy more often than once per max_execution_time. Otherwise the same problem appears, because while rsync is writing into one of the directories, requests may still be served from it.
If for some reason you use Apache, you will need a third-party module, which Rasmus also wrote.
Rasmus says the system is good, and I recommend it to you too. It suits 99% of projects, both new ones and existing ones. But of course we are not like that, and we decided to write our own solution.
New system - MDK
Basically, our requirements are no different from those of most web projects: we want fast deployment to staging and production, low resource consumption, OPCache reuse and fast rollback.
But there are two more requirements that may differ from the rest. First, the ability to apply patches atomically. By patches we mean changes to one or several files that fix something in production, and we want to apply them quickly. In principle, the system that Rasmus proposes copes with the patch task.
We also have CLI scripts that can run for several hours, and they must keep working with a consistent version of the code. For them the solutions described above, unfortunately, either do not suit us or force us to keep a lot of directories.
Possible solutions:
- loop xN (-staging, -docker, -opcache);
- rsync xN (-production, -opcache xN);
- SVN xN (-production, -opcache xN).
Here N is the number of rollouts that happen over those few hours. We can have dozens of them, which would mean spending a very large amount of disk space on extra copies of the code.
So we came up with a new system and called it MDK, which stands for Multiversion Deployment Kit. We built it based on the following assumptions.
We took the tree-storage architecture from Git. We need a consistent version of the code that a script runs against, in other words we need snapshots. Snapshots are supported by LVM (where they are implemented rather inefficiently), by experimental file systems like Btrfs, and by Git; we took the snapshot implementation from Git.
We renamed all files from file.php to file.php.<version>. Since all our files are simply stored on disk, if we want to keep several versions of the same file we have to add a suffix with the version.
I love Go, so for speed I wrote the system in Go.
How the Multiversion Deployment Kit Works
We took the idea of snapshots from Git. I have simplified it a little and will describe how it is implemented in MDK.
There are two types of files in MDK. The first type is maps; in the pictures below they are marked green and correspond to directories in the repository. The second type is the files themselves, which lie in the same place as usual but with a suffix containing the file's version. Both files and maps are versioned based on their contents, in our case simply by MD5.
Suppose we have a hierarchy of files in which the root map references certain versions of files and of other maps, which in turn reference other files and maps, pinning specific versions. Now we want to change some file.
Perhaps you have seen a similar picture before: we change a file at the second level of nesting, and in the corresponding map the version of that file is updated; the map's contents change, so its version changes, and then the version changes in the root map as well. Whenever we change something we always get a new root map, but all the files we did not change are reused.
The references keep pointing to the same files as before. This is the core idea behind snapshots in almost any system; in ZFS, for example, it is implemented in roughly the same way.
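To make the versioning concrete, here is a simplified sketch of how one directory's map could be built. The real MDK is written in Go and is not public, so the map format (plain JSON) and the file names here are assumptions for illustration only.

```php
<?php
// Hypothetical content-based versioning of a directory tree.
function build_map(string $dir): string
{
    $map = [];
    foreach (glob("$dir/*.php") as $file) {
        $version = md5_file($file);                          // version = hash of the contents
        copy($file, "$file.$version");                       // file.php -> file.php.<version>
        $map[basename($file)] = $version;
    }
    foreach (glob("$dir/*", GLOB_ONLYDIR) as $subdir) {
        $map[basename($subdir) . '/'] = build_map($subdir);  // nested maps are referenced by version too
    }
    $json       = json_encode($map);
    $mapVersion = md5($json);                                // the map itself is versioned by content
    file_put_contents("$dir/.map.$mapVersion", $json);
    return $mapVersion;                                      // the parent map stores this version
}

$rootVersion = build_map('/local/source');
// Changing one file changes only its own version, its directory's map and the chain
// up to the root map; everything else keeps the same names and is reused.
```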
How MDK is laid out on disk
On disk we have: a symlink to the latest root map (the code that will be served to the web), several versions of root maps, a number of files, possibly in several versions, and in the subdirectories there are maps for the corresponding directories.
I foresee the question: "How is a web request processed? Which files does the user's request land on?"
Yes, I deceived you: there are also files without versions, because if a request for index.php arrives and there is no such file in the directory, the site will not work.
For all PHP files there are files we call stubs, because they contain just two lines: a require of the file that declares the function that knows how to work with these maps, and a require of the needed version of the file.
<?php
require_once "mdk.inc";
require mdk_resolve_path("a.php");
It is done this way, rather than with a symlink to the latest version, because if you require b.php from a versionless a.php, then thanks to require_once the system remembers which root map it started from, keeps using it, and you get a consistent set of file versions. For all other files we simply keep a symlink to the latest version.
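The mdk.inc itself is not public; a hypothetical resolver could look something like this, with the map format and file layout being the same assumptions as in the earlier sketch:

```php
<?php
// Hypothetical mdk_resolve_path(): pin one root map per process and resolve
// plain names like "a.php" to their versioned counterparts like "a.php.<md5>".
function mdk_resolve_path(string $path): string
{
    static $root = null;
    if ($root === null) {
        // The symlink name is an assumption; it points to the latest root map.
        // Reading it once gives a long-running script a consistent snapshot.
        $root = json_decode(file_get_contents('/local/www/.map.latest'), true);
    }
    $map = $root;
    $dir = '/local/www';
    // Walk nested maps for paths like "lib/db.php".
    foreach (explode('/', dirname($path)) as $part) {
        if ($part === '.' || $part === '') {
            continue;
        }
        $dir .= "/$part";
        $map  = json_decode(file_get_contents("$dir/.map." . $map["$part/"]), true);
    }
    $file = basename($path);
    return "$dir/$file." . $map[$file];      // e.g. /local/www/lib/db.php.<md5>
}
```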
How to deploy using MDK
The model is very similar to git push.
- Send the contents of the root map.
- On the receiving side, look at which files are missing. Since a file's version is determined by its content, we never need to download the same file twice (Yuri from the future: except in the case of a collision of the shortened MD5, which did happen once in production).
- Request the missing files.
- Go back to the second step and continue in a loop.
Example
Suppose there is just one file on the server so far. We send it the root map.
In the root map, dashed arrows mark references to files that we do not have. We know their names and versions because they are listed in the map, so we request them from the server. The server sends them, and it turns out that one of those files is itself a map.
We look at it and see that we have none of its files at all. Again we request the missing files, the server sends them, and there are no maps left: the deployment is complete.
You can easily guess what happens if there are 150,000 files and only one has changed. We will see in the root map that one map is missing, follow that level of nesting down and get the single file. In terms of computational complexity the process is almost the same as copying files directly, but consistency and code snapshots are preserved.
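Putting the steps above together, the receiving side of a deploy could be sketched roughly like this; the transport is abstracted into a $fetch callback, and the map format is the same assumed JSON as in the earlier sketches.

```php
<?php
// Hypothetical pull loop: start from the new root map and keep requesting
// whatever referenced versions are not on disk yet.
function mdk_pull(string $rootMapName, callable $fetch): void
{
    $wanted = [$rootMapName];                       // e.g. ".map.<md5 of the new root map>"
    while ($wanted) {
        $next = [];
        foreach ($wanted as $name) {
            $path = "/local/www/$name";
            if (file_exists($path)) {
                continue;                           // content-addressed: we already have it,
            }                                       // so everything it references is here too
            file_put_contents($path, $fetch($name));
            if (strpos(basename($name), '.map.') !== 0) {
                continue;                           // plain files reference nothing further
            }
            foreach (json_decode(file_get_contents($path), true) as $entry => $version) {
                // "lib/" => version of lib/.map.<...>, "a.php" => version of a.php.<...>
                $next[] = substr($entry, -1) === '/'
                    ? dirname($name) . "/{$entry}.map.$version"
                    : dirname($name) . "/$entry.$version";
            }
        }
        $wanted = $next;
    }
}
```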
MDK has no drawbacks :) It lets us deploy small changes quickly and atomically, and scripts can run for days, because we can keep all the files that were deployed within the last week, and they take up a quite reasonable amount of space. OPCache is reused, and CPU consumption is almost nil.
Monitoring it is rather difficult, but possible. All files are versioned by content, so you can write a cron job that walks over all the files and verifies their names against their content. You can also check that the root map references all the files and that there are no broken links in it. Moreover, integrity is checked during deployment.
You can easily roll back changes, because all the old maps are still in place: we can simply switch back to an old map, and everything will be there immediately.
For me it is also a plus that MDK is written in Go, which means it works fast.
I deceived you again: there are still cons. For a project to work with this system, a significant modification of the code is required, although it is simpler than it might seem at first glance. The system is very complex, and I would not recommend implementing it unless you have requirements like Badoo's. Also, sooner or later you run out of disk space anyway, so a garbage collector is required.
We wrote special utilities for editing files (the real ones, not the stubs), for example mdk-vim: you specify a file, it finds the right version and edits it.
MDK in numbers
We have 50 servers on staging, and we deploy to them in 3-5 s. Compared with everything except rsync, this is very fast. To production we deploy in about 2 minutes; small patches take 5-10 s.
If for some reason you have lost the entire code folder on all servers (which should never happen :)), the full upload takes about 40 minutes. It did happen to us once, fortunately at night with minimal traffic, so nobody was hurt. The second such incident affected only a couple of servers for about 5 minutes, so it is hardly worth mentioning.
The system is not open source, but if you are interested, write in the comments and it may be published (Yuri from the future: the system is still not open source at the time of this writing).
Conclusion
Listen to Rasmus, he is not lying. In my opinion his rsync plus realpath_root method is the best, although loops work quite well too.
Think with your own head: look at what your project actually needs, and do not try to build a spaceship where a crop duster is enough. But if you do have similar requirements, a system similar to MDK will suit you.
We decided to return to this topic, which was discussed at HighLoad++ and perhaps did not get the attention it deserved, because it was only one of many building blocks for achieving high performance. But now we have a separate professional conference, PHP Russia, dedicated entirely to PHP, and there we will really go all out. We will talk in depth about performance, standards, and tools, and much more besides, including refactoring.
Subscribe to the Telegram channel with conference program updates and see you on May 17th.