feedbee February 15, 2015 at 15:18

Poll: how do you solve the problem of synchronizing parallel requests in PHP?

For a long time I have been trying to understand how much attention parallelism and concurrency get in the everyday practice of the average PHP programmer. On the one hand, anyone developing a server application automatically writes code that will be executed in parallel. On the other hand, in PHP practice all problems in this area have traditionally been handled by the tools everyone already uses: the web server, sessions, and the DBMS.

Do your projects pay attention to synchronizing HTTP requests that are processed simultaneously? Do you solve these problems with transactions or with locks? Which locking methods do you use? Is it worth worrying about at all, or is the topic useless? Let's find out what the audience thinks. This post does not give answers; it is reconnaissance.

***

In the PHP world it has historically been customary not to pay much attention to parallel execution of code. PHP itself is not built for multithreading (there is no thread safety inside the engine). I have never met projects or programmers that used pthreads (have you? Then tell us about it in the comments). Nor is multithreading much needed in web applications, where what has to run in parallel is primarily requests, not separate parts of the code within a single request. And since parallel execution of incoming requests in separate processes is organized by the application server (php-fpm or Apache), the programmer does not need to think about it: everything works out of the box.

PHP has a session mechanism that is used in the vast majority of projects. With default settings, a session blocks parallel execution of requests within that session. This "covers" some holes, so in practice you may never run into obvious problems. That is, until a user starts doing something unusual, such as working in two browsers at the same time, nothing breaks for lack of locks and transactions.

In addition, the probability of collisions caused by parallelism is very small for sites whose peak load is under 2 requests per second (assuming that generating a response takes no more than a second).

And finally, if some concurrency problems do surface, the easiest way to solve them is a database transaction with a sufficient isolation level. Since almost all sites developed in PHP use a transactional DBMS as storage, simply starting to use transactions is enough to solve the race conditions between requests that lead to data inconsistency. The problem can be solved without even delving into the topic of process synchronization.
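As a rough illustration of the "just use a transaction" approach, here is a minimal sketch with PDO. The `accounts` table, the `transfer()` function, and the in-memory SQLite database are assumptions made only so the example is self-contained. On a server DBMS you would additionally need a suitable isolation level or a locking read such as `SELECT ... FOR UPDATE` for the check to be safe.

```php
<?php
// Sketch only: table, column, and function names are hypothetical.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER NOT NULL)');
$pdo->exec('INSERT INTO accounts (id, balance) VALUES (1, 100), (2, 0)');

/**
 * Move $amount between two accounts as a single atomic unit of work.
 * Returns false (and changes nothing) when the source balance is too low.
 */
function transfer(PDO $pdo, int $from, int $to, int $amount): bool
{
    $pdo->beginTransaction();
    try {
        $stmt = $pdo->prepare('SELECT balance FROM accounts WHERE id = ?');
        $stmt->execute([$from]);
        $balance = (int) $stmt->fetchColumn();

        if ($balance < $amount) {
            $pdo->rollBack();
            return false;
        }

        $pdo->prepare('UPDATE accounts SET balance = balance - ? WHERE id = ?')
            ->execute([$amount, $from]);
        $pdo->prepare('UPDATE accounts SET balance = balance + ? WHERE id = ?')
            ->execute([$amount, $to]);

        $pdo->commit();
        return true;
    } catch (Throwable $e) {
        // Undo partial work if anything failed between begin and commit.
        if ($pdo->inTransaction()) {
            $pdo->rollBack();
        }
        throw $e;
    }
}
```

Either both updates are committed or neither is, so a failure in the middle cannot leave the data half-changed.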

All this means that in practice the average PHP programmer almost never encounters the problems of parallel code execution. Most know very little about concurrent programming, synchronization, and locking. This is clearly visible in interviews. How much demand there is for this knowledge is what I want to find out with this poll.

In part, all this is good: a low entry threshold is one of the main advantages of PHP, and development is fast because no time is spent working through every bottleneck. But sooner or later many come face to face with the problem of synchronizing processes. And developing applications that are reliable, rather than ones that rely on favorable circumstances, requires a certain level of study of this issue.

The simplest example where a transaction does not help is cache warm-up. To keep the cached data from being generated several times in parallel, concurrent requests must be blocked so that the first request can fill the cache. You cannot do this without a lock. Moreover, if there are several servers, the lock must be centralized. Another example is file hosting. A user may upload only a limited number of files. When a file is added, you need to compare the number of files already uploaded with the limit and accept the file only if the limit is not exhausted. Although with some trickery you could do without locks here, the easiest way is to take a per-user lock before the check, check the counter, reserve a slot for the file, release the lock, and only then receive the body of the file.
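The cache warm-up case can be sketched roughly as follows for a single server. The file-based cache in the system temp directory and the `getCached()` function are assumptions made for illustration; with several servers the `flock()` call would have to be replaced by a centralized lock.

```php
<?php
// Sketch only: getCached() and the file layout are hypothetical.
function getCached(string $key, callable $generate): string
{
    $cacheFile = sys_get_temp_dir() . '/cache_' . md5($key);
    $lockFile  = $cacheFile . '.lock';

    // Fast path: the cache is already warm, no locking needed.
    if (is_file($cacheFile)) {
        return (string) file_get_contents($cacheFile);
    }

    $lock = fopen($lockFile, 'c');
    if ($lock === false) {
        throw new RuntimeException('Cannot open lock file');
    }
    if (!flock($lock, LOCK_EX)) { // blocks until the lock is free
        fclose($lock);
        throw new RuntimeException('Cannot acquire lock');
    }

    try {
        // Re-check: another request may have filled the cache while we waited.
        clearstatcache(true, $cacheFile);
        if (is_file($cacheFile)) {
            return (string) file_get_contents($cacheFile);
        }

        $value = (string) $generate();

        // Write to a temporary file and rename it, so readers on the
        // fast path never see a half-written cache entry.
        $tmp = $cacheFile . '.' . getmypid() . '.tmp';
        file_put_contents($tmp, $value);
        rename($tmp, $cacheFile);

        return $value;
    } finally {
        flock($lock, LOCK_UN);
        fclose($lock);
    }
}
```

The second check inside the lock is what stops requests that were waiting from regenerating the data once the first request has finished.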

Using transactions has its own problems, too. At a minimum, a transaction must be retried several times if there is a race condition and it is rolled back because of a collision. There are also open questions when working with resources external to the database: files, the cache, requests to a remote API.
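The retry requirement can be sketched as a small helper. `ConflictException` and `retryOnConflict()` are hypothetical names; in real code you would more likely catch `PDOException` and check whether its SQLSTATE reports a serialization failure (for example `40001`) before retrying.

```php
<?php
// Sketch only: both names below are hypothetical.
class ConflictException extends RuntimeException
{
}

/**
 * Run $attempt, which should start and commit one transaction, and repeat it
 * when it reports a conflict. Gives up after $maxAttempts tries.
 */
function retryOnConflict(callable $attempt, int $maxAttempts = 3, int $baseDelayMs = 10)
{
    for ($try = 1; ; $try++) {
        try {
            return $attempt();
        } catch (ConflictException $e) {
            if ($try >= $maxAttempts) {
                throw $e; // out of attempts: let the caller decide
            }
            // Back off a little longer after each failed attempt.
            usleep($baseDelayMs * 1000 * $try);
        }
    }
}
```

The callback has to be safe to run more than once, which is exactly where side effects outside the database (files, cache, remote API calls) become a problem.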

***

In fact, all PHP programmers write code that runs in a concurrent environment, often a highly concurrent one. And access to shared resources from parallel processes has to be synchronized. I think that many, like me, will be interested to know how our colleagues look at this problem. How do your projects solve the problem of synchronizing access to shared resources?

How do we solve this problem?

We use transactions and locks. Transactions help maintain data consistency when the task boils down to a series of database queries. When we need to synchronize code that works not only with the database, or does not work with it at all, we use locks through my library that abstracts over the locking method. If the backend runs on one server, the flock() driver is enough; if you need a distributed lock, you can use the drivers for Redis or Memcache.
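To give an idea of what a driver-based lock abstraction might look like, here is a minimal sketch. It is not the API of the library mentioned above: `LockDriver`, `FlockDriver`, and `withLock()` are hypothetical names, and only a single-server `flock()` driver is shown. A Redis or Memcache driver would implement the same interface.

```php
<?php
// Sketch only: a hypothetical interface, NOT the author's library API.
interface LockDriver
{
    public function acquire(string $name): void;
    public function release(string $name): void;
}

// Single-server driver built on flock(). Not reentrant: acquiring the same
// name twice from one process without releasing it would block.
final class FlockDriver implements LockDriver
{
    /** @var array<string, resource> open lock file handles keyed by lock name */
    private array $handles = [];
    private string $dir;

    public function __construct(string $dir)
    {
        $this->dir = $dir;
    }

    public function acquire(string $name): void
    {
        $handle = fopen($this->dir . '/' . md5($name) . '.lock', 'c');
        if ($handle === false) {
            throw new RuntimeException("Cannot open lock file for: $name");
        }
        if (!flock($handle, LOCK_EX)) { // blocks until the lock is free
            fclose($handle);
            throw new RuntimeException("Cannot acquire lock: $name");
        }
        $this->handles[$name] = $handle;
    }

    public function release(string $name): void
    {
        if (!isset($this->handles[$name])) {
            return;
        }
        flock($this->handles[$name], LOCK_UN);
        fclose($this->handles[$name]);
        unset($this->handles[$name]);
    }
}

// Run $critical while holding the named lock; always release afterwards.
function withLock(LockDriver $driver, string $name, callable $critical)
{
    $driver->acquire($name);
    try {
        return $critical();
    } finally {
        $driver->release($name);
    }
}
```

Calling code depends only on the interface, so moving from one server to several would mean swapping the driver rather than rewriting every critical section.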

If you have good material on this topic, share the links in the comments.

PS To fans of other programming languages: if you want to tell us how this problem is successfully solved in your language or framework, you are welcome. Otherwise, please note which hub this publication is posted in.


You program the server side of the application:

60% Only in PHP 497
26.9% In PHP and other languages (JS, Python, Ruby, .NET, ...) at the same time 223
13% Only in one or more other languages 108

Do you think the issues of interprocess communication and synchronization are relevant for a PHP programmer?

14.9% Yes, but not for me (no time or desire to figure it out) 102
46.7% Yes, I want to figure it out 319
15.8% Yes, I already figured it out 108
8% No, nobody needs this, and I don’t understand this 55
14.4% No, no one needs this, but I already understand this (at least in general terms) 99

Is the issue of synchronizing parallel requests (processes) relevant in your project (at work)?

49.8% This issue does not come up for us at all 344
23.9% This issue is relevant for us, but not resolved 165
26.2% This issue was relevant for us, but has already been resolved 181

What synchronization methods do you use?

9.4% Locks on files without flock() 49
19.3% Locks via flock() 100
15.6% Locks via special OS synchronization objects (mutex, semaphore, ...) 81
33.3% Locks via Memcached, Redis, ... 173
67.7% Transactions in DBMS 351
25.2% Explicit locks in the DBMS 131
17.5% Other Options 91
