PHP IPC - Interprocess Communication in PHP


    The purpose of this note is to familiarize PHP developers with the interprocess communication capabilities of the language. It does not try to describe every feature in detail, cover implementation specifics, or provide working code examples.

    Since every programmer sooner or later faces the task of parallelization, this note was conceived as a starting point from which you can begin your journey into the exciting (and painful) process of building such systems.



    The topic of threading in PHP will also be touched on, and specifically real threads (Thread), since until recently PHP offered only one convenient way to implement parallelism: fork (we will not touch on perversions like curl_multi_*, popen / fsockopen, Apache MPM, etc.). Threads will be considered only in the context of IPC; I leave the search for implementation details of any particular approach to the reader.

    The discussion is limited to software running on a single computer, and to IPC within that computer, because interprocess communication in distributed systems is a very, very extensive topic. Therefore message brokers, databases, Pub/Sub, and other “inter-computer” approaches will not be considered; besides, they are well covered by other resources on the Web.

    In view of all of the above, some preparation is required from the reader, since terminology is clarified only at key points. The text does, however, point to the relevant articles of the PHP documentation, Wikipedia, etc. The reader should also understand that many things are deliberately simplified due to the format of this material.

    What does IPC look like?


    So what exactly is interprocess communication?

    Inter-process communication (IPC) is a set of methods for exchanging data between multiple threads in one or more processes. The processes can run on one or more computers connected by a network. IPC methods are divided into message passing, synchronization, shared memory, and remote procedure call (RPC) methods. The choice of IPC method depends on the bandwidth and latency of communication between the threads and on the type of data being transmitted.

    Describe what IPC looks like!


    The outline of this overview is as follows:

    0. PCNTL
    1. Sockets
    2. Shared Memory
    3. Semaphore, Shared Memory and IPC
    4. pthreads

    0. PCNTL


    This extension implements the most basic functionality for working with processes, but we are interested in its support for UNIX signals, and more specifically in the pcntl_signal function, which installs a signal handler. This approach is the least capable of all those considered, since it cannot transfer data. With this extension you can, for example, start and stop workers, tell a worker to read tasks from a buffer (file, database, memory, etc.), or let one part of the system signal another about an event.

    It is the easiest approach to implement, there are many examples available, and its capabilities are often more than enough for tasks that are not very complex.
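
    As an illustration of the start / stop scenario above, here is a minimal sketch, assuming PHP CLI on a Unix-like system with the pcntl and posix extensions loaded. The variable names and the sleep intervals are made up for the example; the parent forks one worker and later asks it to stop with SIGTERM.

```php
<?php
// Requires the pcntl and posix extensions (PHP CLI on a Unix-like system).

$stopRequested = false; // hypothetical flag name, flipped by the signal handler

// Install the handler before forking so that the worker inherits it.
pcntl_signal(SIGTERM, function (int $signo) use (&$stopRequested): void {
    $stopRequested = true;
});

$pid = pcntl_fork();
if ($pid === -1) {
    fwrite(STDERR, "fork failed\n");
    exit(1);
}

if ($pid === 0) {
    // Worker: loop until the parent asks us to stop.
    while (!$stopRequested) {
        pcntl_signal_dispatch(); // run handlers for any pending signals
        usleep(10000);           // stand-in for real work
    }
    exit(0);
}

// Parent: let the worker run briefly, then signal it and wait for it.
usleep(100000);
posix_kill($pid, SIGTERM);
pcntl_waitpid($pid, $status);

$workerExitedCleanly = pcntl_wifexited($status) && pcntl_wexitstatus($status) === 0;
echo $workerExitedCleanly ? "worker stopped cleanly\n" : "worker did not stop cleanly\n";
```

    Note that the signal itself carries no payload: the worker only learns that "something happened" and has to get any actual data from elsewhere.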

    1. Sockets

    A socket is a software interface for exchanging data between processes. The processes taking part in such an exchange can run either on one computer or on different computers connected by a network. A socket is an abstract object that represents the endpoint of a connection.

    It is necessary to distinguish between client and server sockets. Client sockets can be roughly compared with the terminal devices of the telephone network, and server sockets with switches. The client application (for example, the browser) uses only client sockets, and the server application (for example, the web server to which the browser sends requests) uses both client and server sockets.


    This is perhaps the most obvious and best-known way to implement IPC, but also the most time-consuming. The first option is to create a broker (a socket server) and have client threads connect to it. Here you will discover the fascinating world of debugging non-blocking input / output (or would you rather write blocking code?), as well as the need to implement many trivial things such as wrappers over the extension's functions. The second option is simpler and suits simpler implementations: socket_create_pair, which creates a pair of connected sockets; an example is available in the documentation.
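
    The second option can be sketched roughly as follows, assuming a Unix-like system with the sockets and pcntl extensions loaded. The PHP function is socket_create_pair; the message text is invented for the example. A forked child writes one message and the parent reads until the child closes its end.

```php
<?php
// Requires the sockets and pcntl extensions (PHP CLI on a Unix-like system).

$pair = [];
if (!socket_create_pair(AF_UNIX, SOCK_STREAM, 0, $pair)) {
    fwrite(STDERR, "socket_create_pair failed\n");
    exit(1);
}
[$parentSocket, $childSocket] = $pair;

$pid = pcntl_fork();
if ($pid === -1) {
    fwrite(STDERR, "fork failed\n");
    exit(1);
}

if ($pid === 0) {
    // Child: keep only its own end, send one message, and exit.
    socket_close($parentSocket);
    $message = 'hello from child';
    socket_write($childSocket, $message, strlen($message));
    socket_close($childSocket);
    exit(0);
}

// Parent: close the child's end so that EOF is seen once the child is done.
socket_close($childSocket);

$received = '';
while (($chunk = socket_read($parentSocket, 1024)) !== false && $chunk !== '') {
    $received .= $chunk; // data may arrive in several pieces
}
socket_close($parentSocket);
pcntl_waitpid($pid, $status);

echo $received, "\n";
```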

    Using sockets to implement IPC requires a rather serious approach and a fair amount of manual reading, but the advantages include the ability to spread parts of the system across different servers later without significant code changes. Another advantage of this technology is its versatility: for example, writing a PHP client that connects to a server written in C is not difficult.

    The disadvantages should also be noted, namely the non-blocking IO mentioned above. Since the data will arrive in chunks, you should think carefully about a mechanism for ensuring its integrity, buffering, and processing that does not cancel out all the advantages of non-blocking input / output.
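
    One common way to keep messages intact when data arrives in chunks is length-prefixed framing. The sketch below is only an illustration of that idea, not something prescribed by the sockets extension: the function names encodeFrame and extractFrames are hypothetical, and the 3-byte chunks merely simulate partial reads from a socket.

```php
<?php
// Length-prefixed framing: every message is preceded by its length
// as a 4-byte big-endian unsigned integer.

function encodeFrame(string $payload): string
{
    return pack('N', strlen($payload)) . $payload;
}

// Pulls every complete message out of $buffer and leaves any
// incomplete tail in place for the next read.
function extractFrames(string &$buffer): array
{
    $messages = [];
    while (strlen($buffer) >= 4) {
        $length = unpack('N', substr($buffer, 0, 4))[1];
        if (strlen($buffer) < 4 + $length) {
            break; // the rest of this message has not arrived yet
        }
        $messages[] = substr($buffer, 4, $length);
        $buffer = (string) substr($buffer, 4 + $length);
    }
    return $messages;
}

// Simulate a stream that is delivered three bytes at a time.
$stream = encodeFrame('ping') . encodeFrame('status:ok');
$buffer = '';
$messages = [];
foreach (str_split($stream, 3) as $chunk) {
    $buffer .= $chunk;
    foreach (extractFrames($buffer) as $message) {
        $messages[] = $message;
    }
}

print_r($messages);
```

    In a real non-blocking loop the chunks would come from socket_read, and each connection would need its own buffer.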

    2. Shared Memory

    This extension allows you to work directly with shared memory. The advantages of the approach are that it is the fastest (if speed is the application's top priority) and the least resource-intensive. Moreover, its implementation has fewer pitfalls than the previous solution, and the technology itself is not hard to grasp.

    There are many ways to use it: one common area for everyone, or a separate block allocated for each thread / process. Data processing is also simplified because the block size is clearly defined. The disadvantages include some difficulty in organizing such interaction conveniently: you have to pass block addresses to child processes (as parameters when calling pcntl_fork, through marker files, etc.).

    This approach is perhaps the most common and preferred one, since it does not require much implementation effort and is more universal.
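
    A minimal sketch of this approach, assuming a Unix-like system with the shmop and pcntl extensions loaded. The key SHM_KEY and the payload are arbitrary values chosen for the example; here the child learns the block's "address" simply by sharing the same key constant.

```php
<?php
// Requires the shmop and pcntl extensions (PHP CLI on a Unix-like system).

const SHM_KEY  = 0x49504301; // hypothetical System V key, chosen for this example
const SHM_SIZE = 64;         // fixed block size in bytes

// Create the segment (or open it if it already exists).
$shm = shmop_open(SHM_KEY, 'c', 0600, SHM_SIZE);
if ($shm === false) {
    fwrite(STDERR, "shmop_open failed\n");
    exit(1);
}

$pid = pcntl_fork();
if ($pid === -1) {
    fwrite(STDERR, "fork failed\n");
    exit(1);
}

if ($pid === 0) {
    // Child: open the existing segment by key, as an unrelated process would,
    // and write a NUL-padded value that fills the whole block.
    $childShm = shmop_open(SHM_KEY, 'w', 0, 0);
    shmop_write($childShm, str_pad('result=42', SHM_SIZE, "\0"), 0);
    exit(0);
}

// Parent: wait for the child, then read the block and strip the padding.
pcntl_waitpid($pid, $status);
$result = rtrim(shmop_read($shm, 0, SHM_SIZE), "\0");

shmop_delete($shm); // mark the segment for removal

echo $result, "\n";
```

    Waiting for the child is the only synchronization here; with several concurrent writers some form of locking would be needed.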

    3. Semaphore, Shared Memory and IPC

    This extension includes the capabilities of the previous one, but it also adds semaphores, a basic primitive for synchronizing access to resources, and another way for threads to interact, known as message passing.

    Semaphores come in handy when threads have to work with some shared resource. Say you wrote a firewall that, on each request, looks into a file with Roskomnadzor's IP addresses and performs some street magic on the incoming request. The file, of course, is updated by some other service process, so it must not be read (or changed) by anyone else while an update is in progress. The theory behind semaphores is simple and there are many examples of their use, so I recommend that those who have not yet worked with this type of lock get acquainted with it; it will help you better understand how interaction between threads is built.

    Message passing is a more “high-level” and more convenient solution than shared memory, but the topic is poorly covered in the context of PHP. In addition, I know of cases where this technology behaved, let's say, oddly, so you should check and double-check the results your code produces.
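
    The locking idea can be sketched as follows, assuming a Unix-like system with the sysvsem and pcntl extensions loaded. The key SEM_KEY, the number of workers, and the counter file are invented for the example; the file stands in for any shared resource that must not be read and modified concurrently.

```php
<?php
// Requires the sysvsem and pcntl extensions (PHP CLI on a Unix-like system).

const SEM_KEY = 0x49504302; // hypothetical System V key, chosen for this example

$semaphore = sem_get(SEM_KEY, 1, 0600); // at most one holder at a time
if ($semaphore === false) {
    fwrite(STDERR, "sem_get failed\n");
    exit(1);
}

// The shared resource: a file holding a counter.
$counterFile = tempnam(sys_get_temp_dir(), 'ipc');
file_put_contents($counterFile, '0');

$children = [];
for ($i = 0; $i < 3; $i++) {
    $pid = pcntl_fork();
    if ($pid === -1) {
        fwrite(STDERR, "fork failed\n");
        exit(1);
    }
    if ($pid === 0) {
        // Worker: each read-modify-write cycle happens under the lock.
        for ($j = 0; $j < 20; $j++) {
            sem_acquire($semaphore);
            $value = (int) file_get_contents($counterFile);
            file_put_contents($counterFile, (string) ($value + 1));
            sem_release($semaphore);
        }
        exit(0);
    }
    $children[] = $pid;
}

foreach ($children as $childPid) {
    pcntl_waitpid($childPid, $status);
}

$total = (int) file_get_contents($counterFile);
unlink($counterFile);
sem_remove($semaphore);

echo "total: $total\n";
```

    Without the sem_acquire / sem_release pair, the workers could overwrite each other's increments and the final count would be unpredictable.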
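
    For orientation, here is a minimal sketch of the message queue functions, assuming a Unix-like system with the sysvmsg and pcntl extensions loaded. The key MSG_KEY, the message type, and the task payload are made up for the example.

```php
<?php
// Requires the sysvmsg and pcntl extensions (PHP CLI on a Unix-like system).

const MSG_KEY   = 0x49504303; // hypothetical System V key, chosen for this example
const TYPE_TASK = 1;          // hypothetical message type

$queue = msg_get_queue(MSG_KEY, 0600);
if ($queue === false) {
    fwrite(STDERR, "msg_get_queue failed\n");
    exit(1);
}

$pid = pcntl_fork();
if ($pid === -1) {
    fwrite(STDERR, "fork failed\n");
    exit(1);
}

if ($pid === 0) {
    // Child: send one task; the array is serialized automatically.
    msg_send($queue, TYPE_TASK, ['task' => 'resize', 'id' => 7]);
    exit(0);
}

// Parent: block until a message of the desired type arrives.
$received = msg_receive($queue, TYPE_TASK, $receivedType, 4096, $message);
pcntl_waitpid($pid, $status);
msg_remove_queue($queue);

var_dump($received, $receivedType, $message);
```

    Unlike shared memory, the queue delivers whole messages, so no manual framing or block-size bookkeeping is needed.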

    4. Pthreads

    And so we have reached the pinnacle (segfaults included) of the evolution of both IPC and multithreading in PHP.

    A cool guy named Joe Watkins wrote the pthreads extension, which adds support for true multithreading to PHP. Just the other day (September 8, 2013) the first stable version (0.0.45) was released; however, the author covered the topic of beta / stable releases very thoroughly in his post on Reddit, so do not read too much into this. I strongly recommend studying all of his comments in that thread; they contain a lot of useful information about pthreads.

    What are the advantages? Everything. Pthreads provides an extremely simple and convenient API for implementing any of your multithreaded fantasies. You get synchronization as in Java, events, and IPC with objects passed between threads! True, things are not so smooth with resources (see the examples on GitHub), and the author writes that this problem is not his concern; however, he managed to work a miracle with socket resources, and now the result of socket_accept in the main thread can be handed to a child thread - amazing! It is enough to go through the examples to see how simply and elegantly it is done.

    I will not describe all the features and delights of this extension; everything is on the author's GitHub and in the documentation on php.net.
    Apparently the author is working on his project quite intensively, so many more interesting features may appear in the future; stay tuned.

    To run the extension, you need to build PHP in Thread-safe mode, here is a small script that will do everything for you:

    Adjust it to your needs if necessary.
    mkdir /opt/php-ts && \
    cd /opt/php-ts && \
    wget http://www.php.net/get/php-5.5.3.tar.bz2/from/ua1.php.net/mirror -O php-5.5.3.tar.bz2 && \
    tar -xf php-5.5.3.tar.bz2 && \
    cd php-5.5.3/ext && \
    git clone https://github.com/krakjoe/pthreads.git && \
    cd ../ && \
    ./buildconf --force && \
    ./configure --disable-all --enable-pthreads --enable-maintainer-zts && \
    make && \
    TEST_PHP_EXECUTABLE=sapi/cli/php sapi/cli/php run-tests.php ext/pthreads  && \
    alias php-ts="/opt/php-ts/php-5.5.3/sapi/cli/php"

    Does it look like a pipe?


    That's probably all. Although the language is limited in its IPC capabilities, it still allows you to write efficient applications using various approaches to interprocess communication. For those of you who now face the task of implementing such interaction, I recommend carefully studying all the methods listed in this note, since they are not interchangeable but complement each other well.

    PS This is not directly related to the topic of the article, but it is very relevant to some of the points described here, namely blocking IO and the imperfection of the event model: I recommend that you take a look at the Eio and Ev extensions (both by osmanov).
