Easy export to Excel XLSX

    In continuation of the topic begun in the previous article , I want to share my experience in exporting data, in particular, in XLSX format.



    So, who cares how to fill in XLSX without large and complex libraries, please, under cat.

    Recently, I had the task of exporting an unpredictable size of tabular data in XLSX format. Like any sensible programmer, the first thing he did was to look for ready-made solutions.
    Almost immediately stumbled upon the PHPExcel library. A powerful solution with a bunch of different features and capabilities. Having rummaged a little more I found reviews of programmers about her. In particular, on the forums there are complaints about the speed of work and the refusal to work with a large amount of data. He noted the library as one of the solutions and began to search further.
    I found several more libraries for working with XLSX, but they were all forgotten either. they didn’t update for 2-3 years, or they necessarily dragged third-party libraries along, or used the DOM to work with files, which I didn’t really like. Each time, bumping into another library and studying the mechanisms of its work, I caught myself thinking that all this was “from a cannon by sparrow”. I don’t need such a difficult decision!
    I admit honestly, having studied superficially each of the solutions found, I did not begin to install and test a single one. I needed a simpler and more reliable solution, like a tank.

    Task


    In general, since I did not find anything suitable, then it is necessary to formulate technical requirements for what is needed. The requirements, as one would expect, were trivial:
    • Design the export mechanism as a standalone class
    • Implement a set of functions in the class for writing cell and row values
    • Ability to work with unlimited data
    • Unpacking and packaging XLSX.

    I will dwell only on the last point. As you know, XLSX is a regular zip archive that you can unzip and see that it consists of several files and directories. In the opposite way, it can be packaged and renamed to XLSX. If all the changes are correct, then Microsoft Excel will open the file without problems.

    Implementation


    Initially, I really wanted to create all the files that make up XLSX with code, but, fortunately, I quickly realized the meaninglessness of my idea. And a different, more correct and simple solution was born. It is necessary using Microsoft Excel to create the XLSX file in the form in which it is needed in the end, but without data, in other words, a template, and then, using the code, just add the data!
    In this case, the class will have to unpack the template into a separate directory, make changes to /xl/worksheets/sheet1.xml and pack the contents of the directory back into XLSX.

    The class declaration contains public variables:
    $ templateFile - the name of the template file
    $ exportDir - the folder into which the template will be unpacked, of course with the necessary access rights.

    The constructor of the class accepts the name of the future file, the number of columns and rows. Then it checks that the file name is correct, the folder for unpacking the template exists and forms the full name of the destination folder for unpacking the template.
    After creating the class, you can unzip the template and open sheet1.xml for writing. In fact, I’m not just adding to the file, but completely overwriting it. Once having taken the initial line from it, I make a change to it in the dimension tag, which reflects the size of the exported range, and write it to a file.

    public function openWriter()
    {
    	if (is_dir($this->baseDir))
    		CFileHelper::removeDirectory($this->baseDir);
    	mkdir($this->baseDir);
    	exec("unzip $this->templateFullFilename -d \"$this->baseDir\"");
    	$this->workSheetHandler = fopen($this->baseDir.'/xl/worksheets/sheet1.xml', 'w+');
    	fwrite($this->workSheetHandler, '');
    }
    


    To ensure speed and the ability to work with large amounts of data, the resetRow and flushRow functions allow. They are responsible for clearing the current row in memory and writing the current row to disk.
    But saving the values ​​of cells with different types was not such a simple task.

    Line record

    It would seem that it is difficult to write a string value to a file. However, XLSX is not so simple. All lines inside XLSX are stored in a separate file /xl/sharedStrings.xml. It is not string values ​​that are written to cells, but their serial numbers - indices. A smart solution in terms of file size reduction.

    But such a solution is inconvenient from the point of view of programmatic filling of the template. If this requirement is met, then I would have to perform a separate pass through all string values ​​in the data array, exclude duplicate ones, save them in sharedStrings.xml, index and enter their indices instead of the values ​​in the original array. Slow and uncomfortable.

    It turns out that you can circumvent the requirement and save the string values ​​of the cells directly in the cells. But in this case, the recording format will be different:

    public function appendCellString($value)
    {
    	$this->curCel++;
    	if (!empty($value)) {
    		$value = htmlspecialchars($value, ENT_QUOTES, 'UTF-8');
    		$value = preg_replace( '/[\x00-\x13]/', '', $value );
    		$this->currentRow[] = 'isBold ? ' s="7"' : '').'>'.$value.'';
    		$this->numStrings++;
    	}
    }
    


    Number record

    There were no difficulties with writing integer or fractional numbers. Everything is simple:

    public function appendCellNum($value)
    {
    	$this->curCel++;
    	$this->currentRow[] = ''.$value.'';
    }
    


    Date and Time Record

    The date and time are stored as the number of seconds elapsed since 01/01/1970 divided by the number of seconds in days. Moreover, an error was made in the calculation with the definition of a leap year. In general, without going into details that are easy to find on the network, in order to correctly calculate the date, I had to declare two constants in the class:
    ZERO_TIMESTAMP - date offset in Excel format from UNIX_TIMESTAMP
    SEC_IN_DAY - seconds in days.
    After calculating the date and time, the integer part of the fraction is the date, the fractional part is the time:

    const ZERO_TIMESTAMP = 2209161600;
    const SEC_IN_DAY = 86400;
    public function appendCellDateTime($value)
    {
    	$this->curCel++;
    	if (empty($value))
    		$this->appendCellString('');
    	else
    	{
    		$dt = new DateTime($value);
    		$ts = $dt->getTimestamp() + self::ZERO_TIMESTAMP;
    		$this->currentRow[] = ''.$ts/self::SEC_IN_DAY.'';
    	}
    }
    

    After recording all the data, it remains to close the worksheet and workbook.

    Application


    As before, the use of the described class is based on the export of data using the CArrayDataProvider provider . Assuming that the amount of exported data can turn out to be very large, a special iterator CDataProviderIterator is used , which iterates over the returned data by 100 records (you can specify a different number of records).

    public function exportXLSX($organization, $user, &$filename)
    {
    	$this->_provider = new CArrayDataProvider(/*query*/);
    	Yii::import('ext.AlxdExportXLSX.AlxdExportXLSX');
    	$export = new AlxdExportXLSX($filename, count($this->_attributes), $this->_provider->getTotalItemCount() + 1);
    	$export->openWriter();
    	$export->resetRow();
    	$export->openRow(true);
    	foreach ($this->_attributes as $code => $format)
    		$export->appendCellString($this->_objectref->getAttributeLabel($code));
    	$export->closeRow();
    	$export->flushRow();
    	$rows = new CDataProviderIterator($this->_provider, 100);
    	foreach ($rows as $row)
    	{
    		$export->resetRow();
    		$export->openRow();
    		foreach ($this->_attributes as $code => $format)
    		{
    			switch ($format->type)
                {
                    case 'Num':
                        $export->appendCellNum($row[$code]);
                    /*other types*/
                    default:
                        $export->appendCellString('');					
                }
    		}
    		$export->closeRow();
    		$export->flushRow();
    	}
    	$export->closeWriter();
    	$export->zip();
    	$filename = $export->getZipFullFileName();
    }
    

    Anyone interested can get the source code of my AlxdExportXLSX class for free.

    Also popular now: