gegokk April 5, 2013 at 17:07

The problems of "long" PHP scripts

Sometimes you need to write scripts that run for a long time: scripts that create or deploy backups, install a demo version of an application, aggregate large amounts of data, import or export data, and so on. To keep such scripts from stopping at an unexpected moment, there are a few things to know and keep in mind.

External timeout

First of all, set an appropriate value for the max_execution_time parameter in the PHP config.
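When editing the PHP config is not an option, the same limit can usually be changed at runtime for a single script. A minimal sketch, assuming set_time_limit() has not been disabled on the host; the helper name allowLongRun is hypothetical:

```php
<?php
// Hypothetical helper: changes the execution time limit for the current script only.
function allowLongRun(int $seconds = 0): bool
{
    // 0 means "no limit". set_time_limit() also restarts the timer from zero
    // and returns false if the limit could not be changed.
    return set_time_limit($seconds);
}

if (!allowLongRun(0)) {
    error_log('Could not change max_execution_time, the script may be cut short');
}
```

This only affects PHP's own limit; the web server timeouts still have to be configured separately.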

If the script is launched by the web server (i.e., in response to an HTTP request from a user), you should also correctly configure the timeout parameters in the web server config. For Apache, these are TimeOut and FastCgiServer ... -idle-timeout ... (if PHP runs through FastCGI); for nginx, send_timeout and fastcgi_read_timeout (if PHP runs through FastCGI).

The web server may also proxy requests to another web server that runs the PHP script (a common setup: nginx as the frontend, Apache as the backend). In this case, the proxy timeout must also be configured on the proxying web server: ProxyTimeout for Apache, proxy_read_timeout for nginx.

User interruption

If the script is launched in response to an HTTP request, the user can cancel the request in their browser, in which case the PHP script will also stop running. If you want the script to keep working after the request has been cancelled, set the ignore_user_abort parameter in the PHP config to TRUE.
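The same setting can also be switched on from inside the script with the ignore_user_abort() function. A minimal sketch; the runSteps function and the idea of passing the job as a list of callables are assumptions made for illustration:

```php
<?php
// Keep running even if the client closes the connection.
ignore_user_abort(true);

// Hypothetical long job: $steps is a list of callables executed in order.
function runSteps(array $steps): int
{
    $done = 0;
    foreach ($steps as $step) {
        $step();
        $done++;
        // connection_aborted() reports whether the client has gone away.
        // Here the fact is only logged and the work continues.
        if (connection_aborted()) {
            error_log("Client disconnected after step $done, continuing");
        }
    }
    return $done;
}
```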

Loss of open connections

If a script opens a connection to some service (a database, a mail server, an FTP server, ...) and then does not use that connection for a while, the service may close it. For example, if the script executes no MySQL queries for some time, MySQL will close the connection after the period specified in the wait_timeout parameter. As a result, the next query fails with an error.

In such cases, first try to increase the connection timeout. For example, for MySQL you can execute the following query (thanks to Snowly):

SET SESSION wait_timeout = 9999

If this is not possible, or this option is unsuitable for some reason, you can check that the connection is still alive at the points in the code where idle periods are possible, and reconnect if necessary. For example, the MySQLi module has a useful function, mysqli::ping, for checking the connection, as well as the mysqli.reconnect configuration parameter for reconnecting automatically when the connection is lost. If no similar function exists for another type of connection, you can write one yourself. It should send a trivial request to the service and, if that fails (returns an error or throws, which you catch with try ... catch ...), reconnect. For example:

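class FtpConnection
{
	private $ftp;
	public function connect()
	{
		$this->ftp = ftp_connect('ftp.server');
		// ...
	}
	// Sends a cheap command to the server and reconnects if it fails.
	public function reconnect()
	{
		try
		{
			// @ hides the warning raised on a dead connection; ftp_pwd() then returns false
			if (!@ftp_pwd($this->ftp))
				$this->connect();
		}
		catch (\Throwable $e)
		{
			// e.g. a TypeError if no connection was ever opened
			$this->connect();
		}
	}
	// ...
}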

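class MssqlConnection
{
	private $db;
	public function connect()
	{
		// Note: the mssql_* extension was removed in PHP 7.0
		$this->db = mssql_connect('mssql.server');
		// ...
	}
	// Runs a trivial query and reconnects if it fails.
	public function reconnect()
	{
		try
		{
			// SQL Server has no "dual" table, a bare SELECT 1 is enough
			if (!@mssql_query('SELECT 1', $this->db))
				$this->connect();
		}
		catch (\Throwable $e)
		{
			$this->connect();
		}
	}
	// ...
}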
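
The two classes above follow the same pattern, so it can be factored out. A minimal sketch of a generic version, assuming the check and the reconnect are passed in as callables; ensureAlive is a hypothetical name, not a function of any PHP extension:

```php
<?php
// Hypothetical helper, not part of any PHP extension.
// $probe   returns true while the connection still answers;
// $connect re-opens the connection.
// Returns true if a reconnect was performed, false otherwise.
function ensureAlive(callable $probe, callable $connect): bool
{
    try {
        if ($probe()) {
            return false; // still alive, nothing to do
        }
    } catch (\Throwable $e) {
        // Any error thrown by the probe is treated as a dead connection.
    }
    $connect();
    return true;
}
```

It would be called right before each block of work that follows a possible idle period, with a probe such as a trivial query or an FTP "print working directory" command.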

Parallel launch

Long scripts are often run on a schedule (by cron), with the expectation that only one copy of the script runs at a time. But the next launch may start before the previous one has finished, and as a rule this is undesirable (the same data gets imported twice, data still in use by the first copy gets erased, ...).

In such cases, you can lock the resources being used, but that has to be solved case by case. Alternatively, you can simply check whether another copy of the script is already running, and either wait for it to finish or abort the current run. To do this, you can inspect the list of running processes or use a launch lock for the script, something like:

if (lockStart('script.php'))
{
    // main script code
    ...
    lockStop('script.php');
}
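
The lockStart and lockStop functions used above are not built into PHP. One possible way to implement them is an exclusive, non-blocking flock() on a lock file. This is a sketch under assumptions: the lock files are placed in the system temp directory, and the lockRegistry helper is hypothetical.

```php
<?php
// Hypothetical implementation of lockStart()/lockStop() on top of flock().

// Holds the open lock file handles for the lifetime of the script.
function &lockRegistry(): array
{
    static $handles = [];
    return $handles;
}

function lockStart(string $name): bool
{
    $handles = &lockRegistry();
    // Assumption: lock files live in the system temp directory.
    $path = sys_get_temp_dir() . '/' . md5($name) . '.lock';
    $fh = fopen($path, 'c'); // create if missing, never truncate
    if ($fh === false) {
        return false;
    }
    // LOCK_NB: fail immediately instead of waiting for the other copy.
    if (!flock($fh, LOCK_EX | LOCK_NB)) {
        fclose($fh);
        return false;
    }
    // Keep the handle open, otherwise the lock would be released.
    $handles[$name] = $fh;
    return true;
}

function lockStop(string $name): void
{
    $handles = &lockRegistry();
    if (isset($handles[$name])) {
        flock($handles[$name], LOCK_UN);
        fclose($handles[$name]);
        unset($handles[$name]);
    }
}
```

One reason to prefer flock() over checking for the mere existence of a file: the operating system releases the lock when the process exits, so a crashed run does not leave a stale lock behind.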

Web server load

When long scripts are run through a web server, the client's connection to that web server stays open until the script finishes. This is bad, because the web server's job is to process a request as quickly as possible and return the result. While the connection hangs, one of the web server's workers (processes) stays busy. If many such scripts are launched at the same time, they can occupy all (or almost all) free workers (for Apache, see MaxClients, renamed MaxRequestWorkers in Apache 2.4), and the web server will simply be unable to process other requests.

Therefore, when handling a user's request, launch the script in the background via php-cli so that it does not tie up the web server, and tell the user that the request is being processed. If necessary, the processing status can be polled periodically with AJAX requests.
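
A background launch of this kind could look roughly as follows. This is a sketch under assumptions: a Unix-like server where exec() is allowed, the nohup utility is available, and the CLI binary is reachable as php; the function names and the example path are hypothetical.

```php
<?php
// Builds a shell command that runs a PHP script detached from the request.
// Assumption: the CLI binary is available as "php" on the server.
function buildBackgroundCommand(string $script, array $args = [], string $phpCli = 'php'): string
{
    $parts = [escapeshellarg($phpCli), escapeshellarg($script)];
    foreach ($args as $arg) {
        $parts[] = escapeshellarg((string) $arg);
    }
    // Redirecting output and appending "&" lets exec() return immediately.
    return 'nohup ' . implode(' ', $parts) . ' > /dev/null 2>&1 &';
}

// Starts the script and returns without waiting for it to finish.
function runInBackground(string $script, array $args = []): void
{
    exec(buildBackgroundCommand($script, $args));
}

// Usage inside the request handler (hypothetical path and argument):
// runInBackground('/var/www/jobs/import.php', ['42']);
// echo 'Your request is being processed';
```

The background script can record its progress somewhere the web application can read it (a file or a database row, for example), which is what the AJAX status requests would then poll.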

That is probably all I can say on this topic. I hope it will be useful to someone.
