PHP Network Server Performance

    Have you ever tried ordering a suckling pig roasted on a ramrod at McDonald's, with homemade wine and, for dessert, the girl at the next table for pleasant conversation over the meal? Never even thought of it? That is exactly what this article is about: programmers' stereotypes and the laziness that drives progress. But seriously: in this article we will write, in a couple of hours, a high-performance network server in PHP that many will find very useful. I'm completely serious :-)


    In the good old days ...


    In the good old days, when people were closer to nature, fresh beer delighted with its pleasant bitterness and women smelled exquisite, programmers were closer to the hardware and wrote in C and, in moments of inspiration, in assembler. But probably most important of all, programmers understood what TCP is, how it differs from UDP, and how to interact efficiently with the operating system kernel through the system call interface.

    Laziness comes ...


    But laziness took its toll, and an approach to development gradually took shape that is close in spirit to the fictional abstract world of The Lord of the Rings.

    People began to create objects of a fictional world, philosophical concepts exchanging messages inside programs, and drifted further and further from reality and nature. While in C++ they still tried to stay in touch with nature through pointers and manual memory management, in Java and C# laziness won out and programmers found themselves in an ideal but far from efficient universe of rubber women and non-alcoholic beer. The philosophizing went as far as a universal API for every kind of file system, or compulsory exception handling (Java).

    And now it’s even scary to look around: developers have become so “spaced out” that they don’t even use compilers :-) Many systems are built in dynamically typed scripting languages like Python / PHP, which not only support OOP well, but are so powerful that a single function can load a whole file into a variable :-)

    Processors and threads


    In the 90s many devoutly believed that processors would support OOP in hardware; it did not happen. But laziness keeps working on us, and now makes us firmly believe that programming-language threads are implemented efficiently, especially as processor cores keep multiplying. That is, nobody wants to strain: just write “new Thread” and everything will run fast and efficiently.

    Meanwhile ...


    In the meantime, the world is being taken over by efficient nginx-style solutions in C, fast NoSQL systems that stay close to the hardware are being built, and when it comes to speed and performance, the brain numbed by laziness and advertising starts to stir and sense that something is not right! "They lie" about threads: they do not work all that efficiently, even on multi-core hardware. Although in theory they should!

    Forgot the origins ...


    Ask a developer today about the difference between close and shutdown, or between a process and a thread ... and more and more often you get to see what expressive nail-biting looks like :-) But to write a useful server you need to understand how the operating system works, what network protocols exist, how the nature of things is arranged, and what real beer is! :-)

    It's not a programming language


    And believe me, it does not matter which programming language you use to build a useful network server. What matters is how deeply you understand what you are about to do, and how immune you are to advertising and to your own technological laziness.

    Connection Handling


    People, when they were closer to nature, knew how to handle network sockets efficiently. The operating system kernel should do the work, of course, and notify you of the events:
    1) A new connection has arrived on the listening socket (listen) and can be accepted (accept).
    2) The socket can be read without blocking the process (read).
    3) The socket can be written to without blocking the process (write).
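
    The read and write readiness events listed above can be observed with a small sketch like the one below. It is only an illustration, not part of the server: it assumes a Unix-like system (STREAM_PF_UNIX), and the helper name isReady is made up for this example.

```php
<?php
// A connected pair of local sockets stands in for the two ends of a connection.
// Assumption: Unix-like OS, where STREAM_PF_UNIX socket pairs are available.
list($a, $b) = stream_socket_pair(STREAM_PF_UNIX, STREAM_SOCK_STREAM, STREAM_IPPROTO_IP);
stream_set_blocking($a, false);
stream_set_blocking($b, false);

// Hypothetical helper: ask the kernel whether the socket is ready
// for reading (or writing), waiting at most $timeoutSec seconds.
function isReady($socket, $forWrite = false, $timeoutSec = 0)
{
    $read = $forWrite ? null : array($socket);
    $write = $forWrite ? array($socket) : null;
    $except = null;
    $n = stream_select($read, $write, $except, $timeoutSec);
    return $n !== false && $n > 0;
}

$before = isReady($b);            // nothing has been written yet
fwrite($a, "ping");
$after = isReady($b, false, 1);   // now the kernel reports data to read
$data = fread($b, 1024);          // this read does not block
$canWrite = isReady($a, true);    // the send buffer has room, so a write will not block
```

    The same stream_select call can watch a listening socket: it is reported as readable when a connection is waiting to be accepted.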

    In a world of physical laws, the other ways of handling connections, such as spawning a bunch of threads or, forgive me, out of laziness and perfectionism sometimes even processes, run slower and “consume” far more memory. And although “excuses” such as “it’s cheaper to buy one more box than to teach a programmer to handle demultiplexed sockets asynchronously” often work, sometimes you end up in a situation where the problem has to be solved efficiently on the existing hardware, cutting costs by 1-2 orders of magnitude.
    This is the principle behind the well-known nginx, which handles tens of thousands of connections with just a few operating system processes.

    It remains a mystery to me why, even though libraries for solving server problems “in the nginx style” appeared in Java about 10 years ago, the approach has still not caught on properly and applications keep being churned out on threads, despite how wasteful and dead-end that approach is! :-)

    Why doesn't everyone do that?


    Just laziness :-) It is also commonly held that asynchronous handling of demultiplexed sockets is much harder to program than 50 lines in a separate process. But below I will show that writing such a server is quite simple even in PHP, which is hardly tailored to tasks like this.

    PHP and sockets


    PHP supports our dear old native BSD sockets. Unfortunately, that extension does not support SSL/TLS.
    So you have to dig into the streams interface, which is somewhat removed from nature and a healthy lifestyle and is stuffed with abstractions, "goblins and necromorphs". If you take a shovel and clear away the pile of husk, you can see the network sockets behind this interface and work with them quite efficiently.

    Pieces of code


    I will not give the server's entire source code, but I will walk through its key parts. Overall, without recompiling PHP, the server reliably holds up to 1024 open sockets in one process, taking about 18-20MB (a hell of a lot by C standards, but believe me, there are PHP scripts that eat gigabytes) and loading only one processor core (yes, system CPU time is noticeably higher, but there is no way around that). If you rebuild PHP with a larger FD_SETSIZE, select can work with a much larger number of sockets.

    Server core

    Tasks of the server core:
    1) Check the array of tasks
    2) For each task, create a connection
    3) Check that the socket in the task can be read or written without blocking the process
    4) Release the resources (socket, etc.) of the task
    5) Accept the connection from the control socket to add a task - without blocking the process.

    Put simply, we load the server core with jobs that work with sockets (for example, visiting sites and collecting data), and the core runs hundreds of jobs at once in a single process.

    The task

    In OOP terms, a task is an object of the FSM (finite-state machine) kind. The object holds a strategy inside, say: “go to this address, build a request, download the response, parse it, and so on; return to the start or finish, and write the result to NoSQL.” That is, a task can be anything from a simple content download to a complex load-testing chain with numerous branches, and all of it, let me remind you, lives in the task object.
    In this implementation, tasks are submitted through a separate control socket on port 8000: JSON objects are written to the TCP socket and then begin their journey through the server core.
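
    To make the FSM idea more concrete, here is a minimal sketch of such a task. It is not the real task class: the name SketchJob, its methods, and the JSON fields "host" and "uri" are assumptions made for illustration; only the status names DO_REQUEST, WRITE_REQUEST and READ_ANSWER come from the server code.

```php
<?php
// Hypothetical minimal job: a finite-state machine driven by events from the server core.
class SketchJob
{
    private $status = 'DO_REQUEST';
    private $pending;           // request bytes not yet written
    private $body = '';         // response bytes read so far
    private $finished = false;

    public function __construct($host, $uri)
    {
        $this->pending = "GET $uri HTTP/1.0\r\nHost: $host\r\n\r\n";
    }

    // Build a job from a JSON command (the field names are an assumption).
    public static function fromJson($json)
    {
        $cmd = json_decode($json, true);
        if (!is_array($cmd) || !isset($cmd['host'], $cmd['uri'])) {
            return null;
        }
        return new self($cmd['host'], $cmd['uri']);
    }

    public function getStatus()  { return $this->status; }
    public function isFinished() { return $this->finished; }
    public function getBody()    { return $this->body; }

    // The connection has been initiated: next we want to write the request.
    public function onConnected() { $this->status = 'WRITE_REQUEST'; }

    // $count bytes were written; switch to reading once nothing is left.
    public function onWritten($count)
    {
        $this->pending = (string) substr($this->pending, $count);
        if ($this->pending === '') {
            $this->status = 'READ_ANSWER';
        }
    }

    // A chunk was read; an empty chunk means the peer closed the socket.
    public function onRead($data)
    {
        if ($data === '' || $data === false) {
            $this->finished = true;
            return;
        }
        $this->body .= $data;
    }
}

$job = SketchJob::fromJson('{"host":"example.com","uri":"/"}');
$job->onConnected();
$job->onWritten(10);                 // partial write: still WRITE_REQUEST
$statusAfterPartial = $job->getStatus();
$job->onWritten(1000);               // the rest is written: READ_ANSWER
$job->onRead("HTTP/1.0 200 OK\r\n\r\nhi");
$job->onRead('');                    // peer closed: the job is finished
```

    The server core only needs to look at the status to decide which select set the job's socket belongs to; the strategy itself stays inside the object.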

    The principle of processing jobs and sockets

    The main thing is never to let the server process block: not inside a function waiting for a response while reading from or writing to a network socket, not while waiting for a new connection on the control socket, and not somewhere in heavy calculations or loops. So all job sockets are checked in the select system call, and the OS kernel notifies us only when an event occurs (or on timeout).
            while (true) {
                $ar_read = null;
                $ar_write = null;
                $ar_ex = null;
                //We will also read the control socket, together with the job sockets
                $ar_read[] = $this->controlSocket;
                //Iterate with keys: array_search() would compare job objects loosely and could match the wrong one
                foreach ($this->jobs as $key => $job) {
                    //job cleanup
                    if ( $job->isFinished() ) {
                        if (is_resource($job->getSocket())) {
                            //Close the socket "reliably"
                            stream_socket_shutdown($job->getSocket(),STREAM_SHUT_RDWR);
                            fclose($job->getSocket());
                        }
                        unset($this->jobs[$key]);
                        $this->jobsFinished++;
                        continue;
                    }
                    //Jobs can "fall asleep" for a while, for example after a remote server error
                    if ($job->isSleeping()) continue;
                    //The job needs to initiate a connection
                    if ($job->getStatus()=='DO_REQUEST') {
                        $socket = $this->createJobSocket($job);
                        if ($socket) {
                            $ar_write[] = $socket;
                        }
                    //The job wants to read the response from the socket
                    } else if ($job->getStatus()=='READ_ANSWER') {
                        $socket = $job->getSocket();
                        if ($socket) {
                            $ar_read[] = $socket;
                        }
                    //The job needs to write the request to the socket
                    } else if ( $job->getStatus()=='WRITE_REQUEST' ) {
                        $socket = $job->getSocket();
                        if ($socket) {
                            $ar_write[] = $socket;
                        }
                    }
                }
                //Wait for the OS kernel to notify us of an event, or do a routine iteration every 30 seconds
                $num = stream_select($ar_read, $ar_write, $ar_ex, 30);
    

    Then, once an event has occurred and the OS has notified us, we start processing the sockets in non-blocking mode. Yes, the traversal of the task array could be optimized further by indexing tasks by socket number, winning 10ms, but for now ..., you guessed it, I am too lazy :-)
                //Skip processing on timeout (0) or on error (false)
                if ($num > 0) {
                    if (is_array($ar_write)) {
                        foreach ($ar_write as $write_ready_socket) {
                            foreach ($this->getJobs() as $job) {
                                if  ($write_ready_socket == $job->getSocket()) {
                                    $dataToWrite = $job->readyDataWriteEvent();
                                    $count = fwrite($write_ready_socket , $dataToWrite, 1024);
                                    //Tell the job object how many bytes were written to the socket
                                    $job->dataWrittenEvent($count);
                                }
                            }
                        }
                    }
                    if (is_array($ar_read)) {
                        foreach ($ar_read as $read_ready_socket) {
                            ///// command processing
                            ///
                            //A connection arrived on the control socket; process the command
                            if ($read_ready_socket == $this->controlSocket) {
                                $csocket = stream_socket_accept($this->controlSocket);
                                 //A simplification: we trust the local client to close the connection. Otherwise, set a timeout.
                                 if ($csocket) {
                                    $req = '';
                                    //Stop on EOF ('') and also on a read error (false), so the loop cannot spin forever
                                    while ( ($data = fread($csocket,10000)) !== '' && $data !== false ) {
                                        $req .= $data;
                                    }
                                    //The command is also processed in non-blocking mode
                                    $this->processCommand(trim($req), $csocket);
                                    stream_socket_shutdown($csocket, STREAM_SHUT_RDWR);
                                    fclose($csocket);
                                }
                                continue;
                                ///
                                /////
                            } else {
                                //Read from the read-ready socket without blocking
                                $data = fread($read_ready_socket , 10000);
                                foreach ($this->getJobs() as $job) {
                                    if  ($read_ready_socket == $job->getSocket()) {
                                        //Pass the data we read to the job. If the socket was closed, we read an empty string.
                                        $job->readyDataReadEvent($data);
                                    }
                                }
                            }
                        }
                    }
                }
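
    The optimization mentioned above (indexing tasks by socket number) could look roughly like the sketch below. This is not the server's code: the helper socketKey and the array $jobsBySocket are hypothetical, and the example relies on the fact that casting a PHP stream resource to int yields its resource id. A Unix-like system is assumed for the socket pair.

```php
<?php
// Hypothetical helper: turn a stream resource into an integer array key.
function socketKey($socket)
{
    return (int) $socket;
}

// A local socket pair stands in for two job sockets (assumes a Unix-like OS).
$pair = stream_socket_pair(STREAM_PF_UNIX, STREAM_SOCK_STREAM, STREAM_IPPROTO_IP);

// Index the jobs by socket once, when the sockets are created.
// Plain strings stand in for job objects here.
$jobsBySocket = array();
$jobsBySocket[socketKey($pair[0])] = 'job-A';
$jobsBySocket[socketKey($pair[1])] = 'job-B';

// Make only the second socket readable.
fwrite($pair[0], "x");

$read = array($pair[0], $pair[1]);
$write = null;
$except = null;
stream_select($read, $write, $except, 1);

// One array lookup per ready socket instead of a loop over all jobs.
$found = array();
foreach ($read as $readySocket) {
    $key = socketKey($readySocket);
    if (isset($jobsBySocket[$key])) {
        $found[] = $jobsBySocket[$key];
    }
}
```

    With hundreds of jobs this replaces the nested foreach over all jobs with a constant-time lookup per ready socket.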
    

    The socket itself is also opened in non-blocking mode. It is important to set both flags: STREAM_CLIENT_ASYNC_CONNECT | STREAM_CLIENT_CONNECT:
        private function createJobSocket(BxRequestJob $job) {
            //Check job protocol
            if ($job->getSsl()) {
            //https
                $ctx = stream_context_create(
                            array('ssl' =>
                                array(
                                    'verify_peer' => false,
                                    'allow_self_signed' => true
                                )
                            )
                );
                $errno = 0;
                $errorString = '';
                //Every now and then there is a 30-60ms block right here, apparently while the TCP connection to the remote host is being set up; I should look at the sources, but again ... too lazy
                $socket = stream_socket_client('ssl://'.$job->getConnectServer().':443',$errno,$errorString,30,STREAM_CLIENT_ASYNC_CONNECT|STREAM_CLIENT_CONNECT,$ctx);
                if ($socket === false) {
                    $this->log(__METHOD__." connect error: ". $job->getConnectServer()." ". $job->getSsl() ." $errno $errorString");
                    $job->connectedSocketEvent(false);
                    $this->connectsFailed++;
                    return false;
                } else {
                    $job->connectedSocketEvent($socket);
                    $this->connectsCreated++;
                    return $socket;
                }
            } else {
            //http
    ...
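
    The effect of the two flags can be tried out in isolation with a sketch like this one. It is an illustration under assumptions, not the server's code: it uses plain tcp:// instead of ssl://, and a listener on 127.0.0.1 with an ephemeral port stands in for the remote host.

```php
<?php
// A local listener on an ephemeral port stands in for the remote host (assumption).
$server = stream_socket_server('tcp://127.0.0.1:0', $errno, $errstr);
$address = stream_socket_get_name($server, false);   // "127.0.0.1:<port>"

// Both flags together: start connecting, but do not wait for completion.
$client = stream_socket_client(
    'tcp://' . $address,
    $errno,
    $errstr,
    30,
    STREAM_CLIENT_ASYNC_CONNECT | STREAM_CLIENT_CONNECT
);
stream_set_blocking($client, false);

// The socket becomes writable once the connection is established,
// so it goes into the write set of stream_select.
$read = null;
$write = array($client);
$except = null;
$ready = stream_select($read, $write, $except, 5);

// A connected socket has a remote peer name.
$connected = ($ready === 1) && stream_socket_get_name($client, true) !== false;
```

    This is why the server core puts a freshly created job socket into the write array: write readiness is the signal that the connection is up and the request can be sent.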
    

    Now let's look into the code of the task itself: it must be able to work with partial responses and requests. First, we tell the server core what we want to write to the socket.
    //Build the body of the next request
        function readyDataWriteEvent() {
            if (!$this->dataToWrite) {
                if ($this->getParams()) {
                    $str = http_build_query($this->getParams());
                    $headers = $this->getRequestMethod()." ".$this->getUri()." HTTP/1.0\r\nHost: ".$this->getConnectServer()."\r\n".
                        "Content-type: application/x-www-form-urlencoded\r\n".
                        "Content-Length:".strlen($str)."\r\n\r\n";
                    $this->dataToWrite = $headers;
                    $this->dataToWrite .= $str;
                } else {
                    $headers = $this->getRequestMethod()." ".$this->getUri()." HTTP/1.0\r\nHost: ".$this->getConnectServer()."\r\n\r\n";
                    $this->dataToWrite = $headers;
                }
                return $this->dataToWrite;
            } else {
                return $this->dataToWrite;
            }
        }
    

    Now we write the request, keeping track of how much is left.
    //Keep writing the request to the socket until all of it has been written
        function dataWrittenEvent($count) {
            if ($count === false ) {
                //socket was reset
                $this->jobFinished = true;
            } else {
                $dataTotalSize = strlen($this->dataToWrite);
                if ($count<$dataTotalSize) {
                    $this->dataToWrite = substr($this->dataToWrite,$count);
                    $this->setStatus('WRITE_REQUEST');
                } else {
                    //Once the request has been fully written to the socket, switch to reading the response
                    $this->setStatus('READ_ANSWER');
                }
            }
        }
    

    After sending the request, we read the response. It is important to know when the response has been read in full. You may need to set a read timeout; I did not need one.
    //Read from the socket until the whole response has been read, then move to another status
        function readyDataReadEvent($data)
        {
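            ////////// Successful data read
            /////
            //Compare explicitly: a chunk consisting of "0" is falsy in PHP and would look like end of data
            if ($data !== '' && $data !== false)  {
                $this->body .= $data;
                $this->setStatus('READ_ANSWER');
                $this->bytesRead += strlen($data);
            /////
            //////////
            } else {
            ////////// Here the response has already been read in full, and we start parsing it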
            /////
                    ////////// redirect
                    if ( preg_match("|\r\nlocation:(.*)\r\n|i",$this->body, $ar_matches) ) {
                        $url = parse_url(trim($ar_matches[1]));
                        $this->setStatus('DO_REQUEST');
                    } else if (...) {
                        //This is how we signal the server core that the job should be finished
                        $this->jobFinished = true;
                        ...
                    } else if (...) {
                        $this->setSleepTo(time()+$this->sleepInterval);
                        $this->sleepInterval *=2;
                        $this->retryCount--;
                        $this->setStatus('DO_REQUEST');
                    }
                $this->body = '';
                ...
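
    The task code above treats an empty read (the socket being closed) as the end of the response, which fits HTTP/1.0. If you need to detect completeness earlier, one option is to look at the Content-Length header. The function below is a hedged sketch of that idea: the name responseComplete is made up, and it deliberately ignores chunked transfer encoding.

```php
<?php
// Hypothetical helper: has the whole HTTP response arrived?
// Returns true/false when it can tell, or null when there is no
// Content-Length header and we must wait for the socket to close.
function responseComplete($raw)
{
    $pos = strpos($raw, "\r\n\r\n");
    if ($pos === false) {
        return false;                       // headers are not complete yet
    }
    $headers = substr($raw, 0, $pos);
    $body = (string) substr($raw, $pos + 4);
    if (preg_match('/^Content-Length:\s*(\d+)\s*$/mi', $headers, $m)) {
        return strlen($body) >= (int) $m[1];
    }
    return null;                            // unknown: read until the peer closes
}

$partial  = responseComplete("HTTP/1.0 200 OK\r\nContent-Length: 5\r\n\r\nhel");
$complete = responseComplete("HTTP/1.0 200 OK\r\nContent-Length: 5\r\n\r\nhello");
$unknown  = responseComplete("HTTP/1.0 200 OK\r\n\r\nabc");
$noHeader = responseComplete("HTTP/1.0 200");
```

    A task could call such a check after appending each chunk to its body and move to the parsing state as soon as it returns true.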
    

    In the last fragment we can steer the FSM hierarchically according to our strategy, implementing different variants of the job.
    While writing the task class, I could not shake the feeling that I was writing a plugin for nginx ;-)

    Result


    You see how simply and concisely we managed to work with hundreds and thousands of jobs and sockets at once in a single PHP process. Now imagine we start as many PHP processes of this server as there are cores on the machine: that is thousands of clients served. And no mess of threads, no inefficient processor context switching, no inflated memory requirements. The PHP server process consumes only about 20MB of memory and works like a horse :-)

    Summary


    By understanding how the operating system kernel can handle network sockets efficiently for us, we adapted to it and built a high-performance PHP server that serves hundreds of open network sockets in one process. If necessary, you can recompile PHP and handle thousands of sockets in one process.
    Broaden your knowledge and do not laze about under the yoke of stereotypes: even with dynamically typed scripting languages you can build fast servers. The main thing is to know how and not be afraid to experiment :-) Good luck to everyone!
