
Comparing MemCache and MongoDB as a Network Cache
A rather unusual idea came up: to use MongoDB instead of MemCache as the network cache and compare their performance. To put the performance of these two "caching mechanisms" into perspective, we also included other tools we use to speed up our application (APC, RamFS, TmpFS, XCache).
The article presents data and graphs comparing these mechanisms, along with a discussion of the results.
Any large Internet resource sooner or later runs into the problem of server load. Everything is fine while you have only one server, or, say, a pair of web server + DB server. In general, things are relatively good as long as you have a single front-end. But problems start as soon as your "zoo" of servers grows: not only does the hierarchy of connections between servers become more complicated, you also run into the problem of centralized data storage, and this applies to the database, static files, and, of course, the cache itself.
At work we are developing a fairly large Internet resource that outgrew a single server long ago; fortunately, so far it is limited to a web server and a DB server. However, management has now launched a new, large project whose rather wild structure is already apparent at the design stage. A small group of programmers now faces a new and interesting task: to think through the structure and hierarchy of a project of a scale none of us has personally dealt with before. But this article is not about that project; it is about the problems that emerged during its development, the main one being centralized storage of the cache.
This article does not claim to be an exhaustive solution to this problem, but I can share the experience I gained while comparing and choosing tools and caching schemes.
Let's move on to the main part of the article, the breakdown itself.
To compare caching performance, the following tools were selected: APC, MemCache, RamFS, TmpFS, XCache, as well as (an unusual choice) the rather fast MongoDB DBMS.
I think we should start with how we installed all these programs and extensions, and what difficulties we ran into.
All experiments were performed on a CentOS system with 2Gb of memory and PHP [5.3.6] installed, and for the sake of fairness we did not tune a single package; everything was taken "out of the box".
Installing APC [3.1.9]:
Everything is simple and easy: we install the package through pecl and that's it, APC works.
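As a quick sanity check of the extension (a minimal sketch, not our actual benchmark script), a variable can be written to and read back from shared memory with apc_store() / apc_fetch():

<?php
// note: running this from the CLI requires apc.enable_cli=1 in php.ini
apc_store('greeting', 'Hello from APC', 60);   // keep the value for 60 seconds

$value = apc_fetch('greeting', $success);      // $success becomes false on a miss
var_dump($success, $value);                    // bool(true), string(14) "Hello from APC"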
Installing memcache [2.2.6]:
Also quite simple. We install the PHP extension through pecl and then install the memcached daemon itself, since this is not just an extension but a separate program; that is not difficult either. After installation everything worked fine.
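A minimal check through the pecl memcache extension (assuming memcached is listening on its default port 11211; again just a sketch, not the test script):

<?php
$mc = new Memcache();
$mc->connect('127.0.0.1', 11211) or die('memcached is not reachable');

$mc->set('greeting', 'Hello from memcached', 0, 60); // set(key, value, flags, ttl)
var_dump($mc->get('greeting'));                      // string(20) "Hello from memcached"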
Installing RamFS:
This is more creation than installation. Everything is done very simply:
mount -t ramfs -o size=1024m ramfs /tmp/ramfs
Here we allocate 1Gb of memory and mount it as a file system at /tmp/ramfs.
Installing TmpFS:
Just as in the previous case, this is creation rather than installation. It is created with the command
mount -t tmpfs -o size=1024m tmpfs /tmp/tmpfs
which likewise allocates 1Gb of memory and mounts it as a file system at /tmp/tmpfs.
Note:
The main differences between RamFS and TmpFS that matter to us are these: both work almost identically, except that when the allocated limit is reached, RamFS will not tell us that the volume is full and any new data we write will simply vanish into nowhere, whereas TmpFS (also once the allocated limit is reached) will report insufficient space and move older data to swap, i.e. disk operations will be performed, which does not suit us at all. Both RamFS and TmpFS are wiped when the system is rebooted, so if you want to keep using them after a restart, the mount commands have to be added to startup.
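Caching through RamFS and TmpFS is ordinary file I/O against the mounted directory. A rough sketch of how a cache entry can be written and read (the /tmp/ramfs path is the mount point created above; the key name is arbitrary):

<?php
$cacheDir = '/tmp/ramfs';                 // or '/tmp/tmpfs'
$key      = 'page_header';                // illustrative key
$file     = $cacheDir . '/' . md5($key);

file_put_contents($file, $payload = str_repeat('x', 4096));   // write the entry

// read it back; with RamFS past its limit this can silently come back empty
$data = is_file($file) ? file_get_contents($file) : false;
var_dump($data === $payload);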
Installing XCache [1.3.1]:
And here the interesting part begins. Installing this package turned out to be quite a chore: PHP 5.2 was also installed on the system (and it is currently the main one there), XCache has a long list of dependencies, and installation was problematic because of the need to point it at the right libraries. But even then the problems with this extension did not end, because after installation (which seemed to go fine), when we ran a script to verify that the extension worked correctly, a new problem appeared: when trying to write a variable to memory, an error occurred:
Warning: xcache_set(): xcache.var_size is either 0 or too small to enable var data caching in /usr/local/php5.3/xcache.php on line 6
and right after it, the following:
Warning: xcache_get(): xcache.var_size is either 0 or too small to enable var data caching in /usr/local/php5.3/xcache.php on line 10
After two days of fighting this error with steady hands, a keyboard, and a tambourine, the result was zero. There was a suspicion that the *.so had been built incorrectly, but replacing our compiled *.so with the same version downloaded from the Internet changed nothing. Google came to the rescue and the exact same problem was found on StackOverflow, but no solution was found there either, so we had to run the remaining tests without XCache. The disappointment was not great, because earlier tests had shown that the difference between APC and XCache is almost negligible, while XCache causes far more problems.
Installing MongoDB [1.8.1]:
Installing Mongo is quite simple, so I will not describe the process; I will, however, show the launch command with the configuration we need:
/usr/local/mongodb/bin/mongod --dbpath /usr/local/mongodb/data --profile=0 --maxConns=1500 --fork --logpath /dev/null --noauth --diaglog=1 --syncdelay=0
The main parameter that interests us is the last one,
--syncdelay=0
which sets how often data in memory is synchronized to the HDD. A value of 0 means we explicitly forbid Mongo to sync to the HDD. This option is documented by the creators of the DBMS, but it is not recommended: in case of a power outage or any other system failure affecting the Mongo daemon, all your data will be lost. That risk is quite acceptable to us, because we want to try using this DBMS as a caching mechanism. It seems everything is installed and "configured", so now we move on to the tests themselves.
For greater fairness, we run the tests with 4 payload sizes: 4Kb, 72Kb, 144Kb, 2.12Mb.
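The test script itself is not shown in the article, but the timing was done along these general lines: each payload is written and then read N times in a loop, and the elapsed time is measured with microtime(). A simplified, hypothetical sketch (the benchmark() helper and key names are illustrative, not taken from the real script):

<?php
// run $callback $iterations times and return the elapsed time in seconds
function benchmark($iterations, $callback)
{
    $start = microtime(true);
    for ($i = 0; $i < $iterations; $i++) {
        $callback($i);
    }
    return microtime(true) - $start;
}

$payload = str_repeat('x', 4 * 1024);   // 4Kb payload

$writeTime = benchmark(500, function ($i) use ($payload) {
    apc_store("bench_$i", $payload);     // same idea for memcache, files, Mongo
});
$readTime = benchmark(500, function ($i) {
    apc_fetch("bench_$i");
});

printf("write: %.4fs  read: %.4fs\n", $writeTime, $readTime);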
Let's look at the data we obtained on the charts:
4Kb:

As can be seen on the 1st chart (write times for 4Kb data), MemCache showed the worst result, followed by RamFS and TmpFS, then Mongo and APC.

On the 2nd graph (read times for 4Kb data), Mongo clearly brings up the rear, while MemCache is nearly level with RamFS and TmpFS, which almost coincide; on this graph the differences between them are barely visible.
72Kb:

On the 3rd graph (write times for 72Kb data), MemCache behaves erratically, while Mongo is quite predictable. It is rather surprising that APC, RamFS, and TmpFS are almost identical.

And here is reading, shown on the 4th chart (read times for 72Kb data): Mongo loses ground and lets its competitor MemCache pull ahead, while APC, as before, leads the whole race.
144Kb:

On the 5th graph (write times for 144Kb data) we already see strange behavior from almost all systems: MemCache is very unstable, and sometimes the time for a smaller number of iterations is even higher than for a larger one; even more surprising, closer to 500 iterations Mongo begins to overtake APC! RamFS and TmpFS maintain a surprisingly high speed.

On the 6th graph (read times for 144Kb data), Mongo loses to everyone, and APC is the fastest.
2.12Mb:

On the 7th graph (write times for 2.12Mb data), Mongo turned out to be very stable; the final run of 500 iterations shook it a little, but it stayed in the saddle. The undisputed leader, of course, is APC, but something interesting happens with MemCache, TmpFS, and RamFS, which is explained below.

The last graph (read times for 2.12Mb data) also paints a remarkable picture: TmpFS turned out to be the slowest and RamFS the fastest. Another surprise is that Mongo outperformed APC!
If you look at the final iterations of the last two graphs, where a 2.12Mb file was cached 500 times, you will notice an interesting thing: 500 (iterations) * 2.12Mb (size of one file) = 1060Mb, which means that with RamFS and TmpFS we went past the maximum volume! Hence the rather curious and unpredictable numbers.
In fact, everything turns out quite predictably: when the upper limit is reached (1024Mb in our case), RamFS simply stops writing data and silently ignores write commands. When this data is read, no read physically occurs; an empty result is returned (in PHP it is interpreted as null rather than a string). TmpFS behaves in exactly the opposite way: it writes all our data, making room by moving older data to swap. This explains why, at these volumes and iteration counts, RamFS spends very little time on writes and even less on reading nonexistent data, while TmpFS, on the contrary, takes much longer because it has to perform disk operations.
As I noted above, all tools were taken "out of the box" and not configured, which is why memcache shows a time of 0: a single record of 2.12Mb simply exceeds the maximum size of one stored item in memcache. For memcache to store items larger than that, memcache itself would have to be rebuilt, and then everything else would have to be reconfigured accordingly; but it would be almost impossible to configure the different products identically, so I allowed myself to leave this nuance as it is. Perhaps that is not entirely fair, but we were more interested in how all these tools behave under equal, "out of the box" conditions.
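In practice that failure looks like this (a sketch; memcached's default item size limit is 1Mb unless it is rebuilt or reconfigured): Memcache::set() simply returns false for such a value, so the "write" costs almost nothing and there is nothing to read back:

<?php
$mc = new Memcache();
$mc->connect('127.0.0.1', 11211);

$big = str_repeat('x', (int) (2.12 * 1024 * 1024)); // ~2.12Mb, over the default 1Mb item limit
var_dump($mc->set('big_item', $big, 0, 60));        // bool(false): the item is rejected
var_dump($mc->get('big_item'));                     // bool(false): nothing was stored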
Looking at all this data, every developer can draw their own conclusions about what to use and how. We drew ours, and here they are:
Since we are primarily interested in tools that can cache over the network, we will discuss MemCache and Mongo. Of course, as the graphs show, in many cases MemCache beats Mongo on reads, which matter more than writes, where Mongo beats MemCache. However, if we take into account everything MongoDB gives us (which is understandable, since it is a full-fledged DBMS with all of its capabilities and is easy to work with), then the speed it loses can easily be made up by its ability to run complex queries and fetch multiple cache entries in a single request. Another plus of MongoDB over MemCache: with a correct and well-thought-out application architecture, you can fetch all of a page's cached elements in one request.
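For illustration, fetching several cache entries in one request with Mongo could look roughly like this (same assumptions as in the earlier Mongo sketch; the keys are hypothetical):

<?php
$mongo      = new Mongo('mongodb://127.0.0.1:27017');
$collection = $mongo->selectDB('cache')->selectCollection('items');

// one round trip returns all of the page's cached blocks at once
$keys   = array('page_header', 'sidebar', 'footer');
$cursor = $collection->find(array('_id' => array('$in' => $keys)));

$blocks = array();
foreach ($cursor as $doc) {
    $blocks[$doc['_id']] = $doc['value'];
}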
You can also design the system with a multi-level cache: a first-level (network) cache in MemCache or Mongo, and a second-level (local) cache in APC or XCache (though that comparison still needs to be done).
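A rough sketch of that idea (a hypothetical class, not production code): check the fast local level (APC) first and only then go to the network level (here Memcache), back-filling the local level on a miss:

<?php
// hypothetical two-level cache: APC in front of a network cache
class TwoLevelCache
{
    private $memcache;

    public function __construct(Memcache $memcache)
    {
        $this->memcache = $memcache;
    }

    public function get($key)
    {
        // level 1: local shared memory
        $value = apc_fetch($key, $hit);
        if ($hit) {
            return $value;
        }
        // level 2: network cache; back-fill APC so the next read is local
        $value = $this->memcache->get($key);
        if ($value !== false) {
            apc_store($key, $value, 60);
        }
        return $value;
    }

    public function set($key, $value, $ttl = 60)
    {
        apc_store($key, $value, $ttl);
        return $this->memcache->set($key, $value, 0, $ttl);
    }
}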
In the near future a deeper analysis and comparison of MemCache and Mongo is planned: not only time spent, but also memory consumption, along with measurements of CPU load and the load on the server as a whole.
Finally, I would like to note that the tests were run repeatedly, and the data shown are the averages of 10-20 repetitions.
I will not give tabular test data due to the large volume.