Parallel computing: a wrapper class for pcntl_fork()

Published on September 13, 2011

    I want to show the base class that I use for my PHP command-line scripts.
    Its advantage is that it makes it easy to "parallelize" the work.
    It uses pcntl_fork(), with all the "consequences" that implies.

    (tested only on Linux)

    The essence of the idea:



    class some_script extends CliScript
    {
      protected function processWorker($item)
      {
          $this->log("I'm doing heavy job");
          sleep(rand(1,5));
          $this->log("I'm done doing heavy job");
      }

    }

    $script = new some_script();
    $script->setWorkers(5);
    $script->run();


    As a result, we get one parent process and five children doing the “heavy job”.

    There are some nuances and limitations: database connections and files that were open before the fork are inherited by every child process, and they can cause trouble.
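
    To illustrate the nuance with inherited handles, here is a minimal sketch. The function name sharedHandleDemo() is hypothetical and not part of CliScript; it assumes Linux with the pcntl extension enabled. A file is opened before the fork, and both processes then write through the same handle.

```php
<?php
// Hypothetical demo, not part of CliScript: a handle opened before
// pcntl_fork() is shared by the parent and the child.
function sharedHandleDemo()
{
    $path = tempnam(sys_get_temp_dir(), 'fork_demo_');
    $fh = fopen($path, 'w');              // opened BEFORE the fork

    $pid = pcntl_fork();
    if ($pid === -1) {
        throw new RuntimeException('pcntl_fork() failed');
    }

    if ($pid === 0) {
        // Child: writes through the inherited handle, then exits.
        fwrite($fh, "child\n");
        exit(0);
    }

    // Parent: wait for the child, then write through the same handle.
    pcntl_waitpid($pid, $status);
    fwrite($fh, "parent\n");
    fclose($fh);

    $content = file_get_contents($path);
    unlink($path);

    return $content;
}
```

    Because both processes share one file offset, the parent's line lands after the child's line instead of overwriting it, so the function should return "child\nparent\n". A database connection is shared in the same way, which is why it is usually safer to open such resources after the fork, or to keep them in the parent only.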

    A different approach is needed: the parent process deals with the database, while the child processes do the “dirty work” and return the result. For example:

    class master_and_workers extends CliScript
    {
        protected $contracts = array(2,4,5,1,3,7,3,1,4,9,2,4,1);
        protected $results = array();

        // Runs in the parent: starts one worker per contract.
        protected function processMaster()
        {
            foreach($this->contracts as $contract)
            {
                // Wait until a worker slot is free.
                while( !$this->canStartWorker() ) { sleep(1); }

                $this->startWorker($contract);
            }

            $this->waitForChildren();

            var_export($this->results);
        }

        // Runs in a child: the return value is passed back to the parent.
        protected function processWorker($item)
        {
            $this->log("I'm busy for {$item} seconds...");
            sleep($item);
            $this->log("Job is done.");

            return "Job is done. Sleep time was {$item}";
        }

        // Runs in the parent once a child has finished.
        protected function processResult($result)
        {
            $this->results[] = $result;
        }
    }

    $script = new master_and_workers();
    $script->setWorkers(3);
    $script->run();


    A quick explanation in case anyone is confused:

    After startup, the parent process executes the processMaster() method, which starts the child processes.
    Each child process executes the processWorker() method.
    Whatever the child process returns is stored in a temporary file; once the child has finished, the processResult() method is called in the parent and the result is passed to it.
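
    The flow described above can be sketched roughly as follows. This is not the CliScript source: all function names ending in Sketch are hypothetical, and it assumes the pcntl extension on Linux. Unlike the sleep(1) polling loop in the example, the sketch blocks in pcntl_wait() until a worker slot frees up.

```php
<?php
// Sketch only: names and file layout are assumptions, not CliScript's code.

// Fork a child that runs $job($item) and stores the serialized result in $file.
function startWorkerSketch($job, $item, $file)
{
    $pid = pcntl_fork();
    if ($pid === -1) {
        throw new RuntimeException('pcntl_fork() failed');
    }
    if ($pid === 0) {
        // Child: do the work, write the result, never return to the caller.
        file_put_contents($file, serialize(call_user_func($job, $item)));
        exit(0);
    }
    return $pid; // parent receives the child's PID
}

// Block until one child exits, then pass its result to $onResult.
function reapOneSketch(array &$running, $onResult)
{
    $pid = pcntl_wait($status);
    if ($pid <= 0) {
        throw new RuntimeException('no child process to wait for');
    }
    if (isset($running[$pid])) {
        $file = $running[$pid];
        unset($running[$pid]);
        $onResult(unserialize(file_get_contents($file)));
        unlink($file);
    }
}

// Run $job for every item, with at most $maxWorkers children at a time.
function runPoolSketch(array $items, $maxWorkers, $job)
{
    $running = array();   // pid => temporary result file
    $results = array();
    $collect = function ($result) use (&$results) { $results[] = $result; };

    foreach ($items as $item) {
        while (count($running) >= $maxWorkers) {
            reapOneSketch($running, $collect);
        }
        $file = tempnam(sys_get_temp_dir(), 'worker_');
        $running[startWorkerSketch($job, $item, $file)] = $file;
    }
    while ($running) {
        reapOneSketch($running, $collect);
    }

    return $results;
}
```

    For example, runPoolSketch(array(1, 2, 3), 2, function ($n) { return $n * 10; }) collects 10, 20 and 30, in whatever order the children finish. The sketch does not handle a child that crashes before writing its result; real code would need to check the exit status.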

    There are several useful methods in the CliScript class:

    getRunningTime() returns the time elapsed since the start, in seconds
    countWorkers() returns the number of child processes (it makes sense only in the parent process)
    log() writes to the log file if the CliScript file_log file is specified, otherwise to the screen. Each message shows which process reported it.
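
    As an illustration of the log() behaviour described above, a logger of this kind might look like the sketch below. The function name, the $logFile parameter and the line format are assumptions made for the example, not the actual CliScript implementation.

```php
<?php
// Sketch only: $logFile and the line format are assumptions.
// Writes to $logFile when given, otherwise to the screen, and tags
// every line with the role and PID of the process that reports.
function logSketch($message, $role, $logFile = null)
{
    $line = sprintf(
        "%s [%s pid=%d] %s\n",
        date('Y-m-d H:i:s'),
        $role,            // e.g. 'master' or 'worker'
        getmypid(),
        $message
    );

    if ($logFile !== null) {
        // LOCK_EX keeps lines from concurrent workers from interleaving.
        file_put_contents($logFile, $line, FILE_APPEND | LOCK_EX);
    } else {
        echo $line;
    }

    return $line;
}
```

    Tagging each line with the PID is what makes the output readable when several workers write at the same time.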

    To stop the whole group “gently”, send a SIGTERM signal to the parent process; it has a primitive handler for that.
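
    A primitive handler of this kind could look like the sketch below. The function name and the $children and $stopRequested variables are hypothetical; the actual handler in CliScript may work differently. It assumes the pcntl and posix extensions and PHP 5.3 or later for pcntl_signal_dispatch().

```php
<?php
// Sketch only: names are assumptions, not CliScript's actual handler.
// On SIGTERM the parent sets a stop flag and forwards the signal
// to every child PID it knows about.
function installTermHandlerSketch(array &$children, &$stopRequested)
{
    pcntl_signal(SIGTERM, function ($signo) use (&$children, &$stopRequested) {
        $stopRequested = true;
        foreach ($children as $pid) {
            posix_kill($pid, SIGTERM);
        }
    });
}
```

    The handler only runs when PHP gets a chance to dispatch pending signals, so the parent's main loop would need to call pcntl_signal_dispatch() regularly (or use declare(ticks=1)) and stop starting new workers once $stopRequested is true.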

    In conclusion.

    I can’t say that the code is battle-tested and polished, rather the opposite, but it helps me a lot when I need to quickly load the CPU with work.

    CliScript class source code

    If anyone has similar ideas and is ready to share them, I will be very happy.