PHP, YII2 and the formation of large excel-files
One accounting and reporting system supported by our company began to grow very quickly in the amount of stored data. The system is written in PHP using the framework Yii2. Initially, reports were built through the library PhpSpreadsheet, which came to replace, has long become deprecated, PhpExcel.
Among the different types of reporting was one very large - in fact, the complete set of all data stored in the database should be unloaded into one excel-table. At the initial stage, there were no problems, but when the volume began to exceed many hundreds of thousands of records, the unloading script began to fall off at the timeout limit.To begin with, we raised this very limit and began to look for ways to solve the problem. But the temporary solution was not long enough - the problem with the time limit turned into a problem with the memory limit. The server was thrown with “RAM” and generally removed the memory_limit for this particular operation. Very soon, users started complaining again about runtime errors. I had to remove the time limit for the full report. But to sit and watch a dozen minutes on the screen with a load indicator is not much fun. In addition, sometimes a report was needed “here and now,” and every minute spent on its formation turned out to be critical. Experiments with the environment settings have been stopped, the back of the head has been scratched and the code has been optimized.
Search for a solution
The first thing that was done was the reporting script put into the background process, and the user watches the progress through the progress bar. The background execution of tasks was implemented through a queue mechanism using Redis for storage. Work in the system does not stop, you can do other tasks and periodically return to the page with the report - see if the file is ready. Once the file is formed, the user is offered a link to download. But, as mentioned above, sometimes the file was required "immediately", and increasing usability did not solve this problem. In the meantime, the amount of data continued to grow and the time to build the file reached 79 minutes! This is completely unacceptable, especially considering that reporting is one of the fundamentals of the functionality of this system. No, all the other parts worked like a clock,
Again sat down at the code analysis. The first thing that was tested was the process of selecting data from the database. But requests have already been optimized as much as possible. Although the longest request was a terrible sample with five or six calls to the monstrous FIAS, but it worked out in 2-5 seconds. The weak point was not he, but the formation of the exelnik file. Attempts to optimize this process have begun. Starting from caching in redis, to perversions such as the formation of separate small “exelics” in parallel threads with subsequent gluing into one file. But the result was always the same: a problem with time turned into a problem with memory and vice versa. There was no golden mean, only flowing from one extreme to another. After a certain amount of data, the library's resource consumption began to grow exponentially and it was not possible to win. PhpSpreadsheet - not suitable for large files. As a result, it was decided to change the library. As an option - writing your analogue for the formation of Excel-files.
Analysis and tool selection
We didn’t hurry with writing bicycles, but first we analyzed the existing solutions. Of the possible options are only interested in box / spout. Quickly rewrote the module using this library. As a result, the full report turned out in 145 seconds. Let me remind you that the latest tests with PhpSpreadsheet - 79 minutes, and then 2.5 minutes! Conducted testing: increased data by 2 times. The report was generated in 172 seconds. The difference is awesome. Of course, the library does not have all the same functions as PhpSpreadsheet, but in this case, the minimum set of tools is also sufficient, since the speed of work is critical.
Extension for Yii2
The final decision was designed as an extension for Yii2. Maybe someone will come in handy. The extension allows you to upload any data set from GridView to excel while preserving filtering and sorting. It uses yii / queue and box / spout as dependencies. It makes sense to use the extension to form really large files, well, at least 50,000 lines =) At the moment, the module that has become the basis for the extension is able to handle the load of almost 600,000 lines.
Link to github: Yii2 ExcelReport Extension
Thank you for your attention!