How to avoid collisions when writing to Memcache from PHP
Typically, programmers use this technology for its intended purpose, but I decided to run an experiment and try using a memcached server as a scalable temporary key=value store.
Memcached is designed for simple caching of static data, and it does not provide any built-in mechanism for avoiding collisions.
Data recording
Standard situation
Suppose our PHP application runs on a single server and memcached runs on a remote machine. We can safely read and write the same cell because the application is not a web application, so there is only one process. Since that process runs linearly, it cannot write different data to the same cell at the same time.
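For reference, the single-process case needs nothing special. A minimal sketch, assuming the PECL Memcached extension; the host name and key are illustrative, not taken from the original application:

```php
<?php
// Single linear process: plain set/get on the same cell is safe,
// because nothing else can write to it concurrently.
$mc = new Memcached();
$mc->addServer('memcache.example.local', 11211); // hypothetical remote memcached host

$mc->set('task:result', ['status' => 'done'], 60); // write the cell with a 60 s TTL
$result = $mc->get('task:result');                 // read it back later in the same process
```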
Two or more processes
Today we needed to split the application across two servers, and the problems started. Collisions occurred while writing to memcache: it turned out that in 80% of cases the applications tried to write their data to the same cell at the same time. The ideal solution would be shared memory, but unlike Memcached it does not scale. Given the amount of code and the estimated time to rewrite the applications, we decided to add a crutch.
Read-Write Algorithm
Imagine that two daemons access the same data cell for writing at the same moment, which is inevitable. Under normal conditions we would get a collision. Instead, we act like this (a code sketch follows the list):
- Process1 reads uniqid from the cell; it is empty.
- Process1 writes its own uniqid, containing its PID and server number.
- Process1 re-reads the uniqid. If it matches its own key, it writes the data. Depending on the situation, it then deletes the key. You can also store the write time, which lets the cell be unlocked if the process terminates abnormally.
- Process2 reads uniqid from the cell; it is not empty.
- Process2 waits, for example usleep(rand(1, 5)); in a loop, in case it still needs to write its data despite a neighboring process holding the cell. We may not even need to write anything, just keep the cell locked against writes.
- Process2 reads uniqid from the cell again; now it is empty. From here it proceeds the same way as the first process.
- Process3 reads uniqid from the cell; it is empty.
- Process3 writes its own uniqid, containing its PID and server number.
- Process3 re-reads the uniqid. It does not match its own key.
- Process3 returns an error, or acts otherwise according to the algorithm.
- After writing, a process should remove its PID from uniqid (depending on the situation).
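A minimal PHP sketch of this scheme, assuming the PECL Memcached extension. The function name, lock-key suffix, retry count and expiration time are illustrative assumptions, not part of the original code:

```php
<?php
// Claim a cell with a per-process token, verify we won the race, then write.
function lockedWrite(Memcached $mc, string $key, $value, int $retries = 100): bool
{
    $lockKey = $key . ':uniqid';
    // Token unique to this process: server name + PID + uniqid().
    $token = gethostname() . ':' . getmypid() . ':' . uniqid('', true);

    for ($i = 0; $i < $retries; $i++) {
        if ($mc->get($lockKey) !== false) {
            // Cell is claimed by another process: wait a random moment and retry.
            usleep(rand(1, 5));
            continue;
        }

        // Cell looks free: claim it with our token (the TTL unlocks it if we crash).
        $mc->set($lockKey, $token, 10);

        // Re-read to see whether our token survived, i.e. whether we won the race.
        if ($mc->get($lockKey) === $token) {
            $mc->set($key, $value);   // write the actual data
            $mc->delete($lockKey);    // release the lock
            return true;
        }
        // Another process overwrote the token first: start over.
    }
    return false; // could not acquire the cell
}
```

Note that the read-then-set pair is still not atomic; Memcached's add() (or CAS) would close that remaining window, but the sketch stays close to the algorithm as described above.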
Below are the test results from two daemons with different PIDs overwriting the same memcache cell. Daemon algorithm (a code sketch follows the results):

- While the mutex is closed, check again.
- If the mutex is open, write your own mutex.
- If the mutex turns out to be yours, proceed to write; if not, go back to the beginning.
- Write the data.
- Delete the mutex.

First daemon:
Queue waits: 17699
Successful requests: 100000
Execution time: 89.012994 sec.

Second daemon:
Queue waits: 92999
Successful requests: 100000
Execution time: 139.522396 sec.
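For illustration, a rough sketch of what such a benchmarked daemon loop could look like, again assuming the PECL Memcached extension. The host, key names and token format are assumptions; only the 100000-write count mirrors the results above:

```php
<?php
$mc = new Memcached();
$mc->addServer('memcache.example.local', 11211); // hypothetical memcached host

$token  = gethostname() . ':' . getmypid();
$waits  = 0;
$writes = 0;
$start  = microtime(true);

for ($i = 0; $i < 100000; $i++) {
    while (true) {
        // While the mutex is closed, check again.
        if ($mc->get('cell:mutex') !== false) {
            $waits++;
            usleep(rand(1, 5));
            continue;
        }
        // The mutex is open: write our own mutex, then verify it survived the race.
        $mc->set('cell:mutex', $token, 10);
        if ($mc->get('cell:mutex') !== $token) {
            continue;                       // lost the race, go back to the beginning
        }
        // The mutex is ours: write the data and delete the mutex.
        $mc->set('cell', $token . ':' . $i);
        $mc->delete('cell:mutex');
        $writes++;
        break;
    }
}

printf("Queue waits: %d\nSuccessful requests: %d\nExecution time: %f sec.\n",
    $waits, $writes, microtime(true) - $start);
```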
The tests show that, without this locking, almost half of the data would have been lost to collisions.
Sane developers should use Redis, MemcacheDB, or the like; sooner or later we will still have to rewrite this contraption onto one of them.
With this algorithm, Memcache can even be used as a job queue in the spirit of Gearman. The shortcomings of the caching server itself remain, but in most cases they do not show up.