Fi1osof March 2, 2013 at 14:36

Developing an online store with 13,000+ products on MODX Revolution. Part 1

I have already written about my shopModx component. Few people appreciated it, since many are waiting for a turnkey solution with one big "Install and it works" button. Nevertheless, this component is being built with an eye to the weaknesses of MODX that developers often run into, and to the strengths of MODX that developers either don't know about or simply don't use.

I also want to say that this module is not being developed in a vacuum. It is being built for two fairly large stores (for starters), and the result will be a battle-tested platform for building large online stores.

Today I would like to start a series of articles on developing large online stores on MODX Revolution, covering the difficulties you run into and the options for solving them. I will also cover what shopModx will ship with to solve such problems, and which tricks give you 100% control over the development of your own unique store without touching the shopModx code.

So, a little about the store we are working on: it is an online furniture store. Yesterday I imported the database. It came to 13,000+ documents, 43,000+ TV values and almost 13,000 records in modx_shopmodx_products.

I must say right away that I expect a page to be generated in under 1 second even when it is not served from the cache and includes a search by parameters, and the average load time should not exceed 0.3-0.4 seconds.

So, briefly about the first problems and their solutions.

Problem 1. Large cache file and a lot of used memory

To start with, the baseline for a clean Revo. I specifically downloaded a clean 2.3.0 and looked at memory usage. I put the code into a plugin on the OnWebPageComplete event, which is the very last point of MODX execution: the document cache has already been saved, and only exit() follows. First call (with all cache files deleted manually):

Memory: 13.5409 Mb
TotalTime: 0.1880 s

Subsequent calls:

Memory: 10.1396 Mb
TotalTime: 0.0640 s

By the way, just in case, here is the plugin code: gist.github.com/Fi1osof/5062419
You can extend it with access control and always see the current load on the server.
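The gist is not reproduced here. As a rough standalone illustration of how figures like the ones above can be collected, here is a minimal sketch that uses only PHP built-ins. The function name formatLoadStats is made up for this example, the output format merely imitates the Memory/TotalTime lines above, and using peak memory is my assumption. In a real plugin the start time would come from MODX rather than from a local variable.

```php
<?php
// Hypothetical helper (not taken from the gist): formats peak memory
// and elapsed wall-clock time in the same style as the figures above.
function formatLoadStats(int $peakBytes, float $startedAt, float $finishedAt): string
{
    $memoryMb = $peakBytes / 1024 / 1024;   // bytes -> megabytes
    $elapsed  = $finishedAt - $startedAt;   // seconds
    return sprintf("Memory: %.4f Mb\nTotalTime: %.4f s", $memoryMb, $elapsed);
}

$startedAt = microtime(true);
// ... the work being measured would run here ...
$report = formatLoadStats(memory_get_peak_usage(true), $startedAt, microtime(true));
echo $report, "\n";
```

Passing the numbers in as arguments keeps the formatting separate from the measurement, so the same helper could be reused behind an access check.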

Now let's check the results on the store (I should clarify right away that the document is not empty: it has 8 TVs attached, one of which is an image with a custom media source). First run:

Memory: 24.1438 Mb
TotalTime: 0.4360 s

Subsequent runs:

Memory: 18.4103 Mb
TotalTime: 0.0960 s

That is, memory usage immediately grows by almost 10 MB. This is because the entire URL map of the context gets cached, and we have 13,000+ documents in it. The context cache file is almost 2 MB.

The obvious solution is to shrink the context cache file. I have already written in detail about the subtleties of MODX caching and about my cacheOptimizer patch. We install it and disable caching of the resource map for the web context. New results:

Memory: 16.1369 Mb
TotalTime: 0.2640 s

Memory: 10.4021 Mb
TotalTime: 0.0720 s

That is, in normal operation we use almost as much memory as on a bare system.

Problem 2. Page not found (404)

This problem follows directly from the previous solution :-) Since we cut the URL map out of the cache, MODX can no longer "understand" which resource a URL refers to when friendly URLs are in use. I should clarify right away that if you don't use friendly URLs, this will not be a problem for you (although who doesn't use friendly URLs these days?). And if your store is not large (up to 1,000 products), you don't need to cut out the resource map at all: an extra megabyte of RAM is not a problem.

To solve this problem, I decided to use my own router. I simply wrote a new class extending modRequest and tweaked a couple of its methods. The logic is as follows: when a page is requested, MODX tries to find the resource id for the requested URL in the cache (the URL is already cleaned at this point, that is, stripped of any parameters and so on). If it finds one, it returns the id and everything proceeds as usual. If not, it tries to find the document in the database by uri. If that succeeds, it writes the id to the cache and returns it. If not, the standard OnPageNotFound procedure runs (so you can still hook in your own plugin to modify the lookup).
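The lookup order described above can be sketched in plain PHP. This is not the shopModx request class: UriResolver, its array cache and the database callback are all hypothetical stand-ins, used only to show the cache-first, database-second, 404-last sequence.

```php
<?php
// Hypothetical sketch of the lookup order; none of these names come
// from shopModx or MODX.
final class UriResolver
{
    /** @var array<string,int> stand-in for the cache provider */
    private array $cache = [];

    /** @var callable stand-in for a "find id by uri" database query */
    private $findInDatabase;

    /** Counts database lookups so the caching effect is visible. */
    public int $databaseHits = 0;

    public function __construct(callable $findInDatabase)
    {
        $this->findInDatabase = $findInDatabase;
    }

    /** Returns the resource id for a cleaned URI, or null for "not found". */
    public function resolve(string $uri): ?int
    {
        if (isset($this->cache[$uri])) {
            return $this->cache[$uri];          // 1. cache hit
        }
        $this->databaseHits++;
        $id = ($this->findInDatabase)($uri);    // 2. fall back to the database
        if ($id !== null) {
            $this->cache[$uri] = $id;           // 3. remember it for next time
        }
        return $id;                             // null: the caller handles the 404
    }
}

// Example data: a one-row "table" mapping uri to resource id.
$table = ['catalog/sofas/' => 42];
$resolver = new UriResolver(fn(string $uri): ?int => $table[$uri] ?? null);
```

In this sketch misses are not cached, so a URL that does not exist hits the database on every request. Whether that is acceptable depends on how many 404s the site receives.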

This additional class will ship with shopModx, and anyone who needs it (for a large store) can simply enable it in the settings (the modRequest.class key).

There is also the option of caching all pages up front, for example when the cache is refreshed (using a plugin on the OnSiteRefresh event).

Problem 3. A lot of cache files

I can imagine how many people read the previous solution and thought “well, what a moron!” :-)

Yes, producing hundreds of thousands of cache files is complete insanity. But the key word here is files. It is precisely the fact that they are files that bothers us. So in this case we simply use a different cache provider instead of the file-based one. I decided to use memcached, since I had already worked with it and had it installed on the server, but you can use any other provider you like. Memcache and APC providers are also included in the standard Revo build.

I chose an in-memory cache because it makes clearing the cache much simpler. Try deleting 1,000,000 files from a hard drive: it takes a very long time. With memcached, flushing the cache is simple and quick.

$modx->cacheManager->getCacheProvider()->flush();

Another huge plus of memcached is that you can store any type of data, including objects. The only exceptions are resources (for example, a database connection) and objects whose properties include resources. Such objects should define the __sleep() and __wakeup() methods, so that all resource properties are dropped before saving and recreated when the object is restored from the cache.
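Here is a minimal sketch of that __sleep()/__wakeup() pattern. The ReportBuffer class is hypothetical, an in-memory stream stands in for any resource-typed property, and plain serialize()/unserialize() stand in for whatever the cache provider does when it stores an object.

```php
<?php
// Hypothetical class for illustration; the stream handle stands in for
// any resource-typed property (a database connection, a socket, ...).
final class ReportBuffer
{
    public string $title;

    /** @var resource|null cannot survive serialization */
    private $handle;

    public function __construct(string $title)
    {
        $this->title = $title;
        $this->open();
    }

    private function open(): void
    {
        $this->handle = fopen('php://memory', 'w+');
    }

    public function hasHandle(): bool
    {
        return is_resource($this->handle);
    }

    /** List only the properties that are safe to store. */
    public function __sleep(): array
    {
        return ['title'];
    }

    /** Recreate the resource after the object is restored. */
    public function __wakeup(): void
    {
        $this->open();
    }
}

$stored   = serialize(new ReportBuffer('prices'));  // what a cache would keep
$restored = unserialize($stored);                   // what a cache would return
```

Because __sleep() names only title, the handle never reaches the stored string, and __wakeup() reopens it as soon as the object is restored.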

So, let's look at the results. First run:

Memory: 15.0709 Mb
TotalTime: 0.1040 s

Subsequent runs:

Memory: 10.403 Mb
TotalTime: 0.0640 s

In my opinion, that is very good for an uncached context of 13,000+ documents.

Problem 4. Mass update of documents when changing system settings

I won't explain why, but I needed to change the container suffix. I changed it, and the Ajax response never came... I had to go and look at the system/settings/updatefromgrid processor. It has a method called checkForRefreshURIs(). Basically, if "friendly_urls", "use_alias_path" or "container_suffix" has changed, it signals that the URLs need to be refreshed. That is all correct. The problem is that it tries to update every document indiscriminately, not just containers. On top of that, for some reason it also adds sorting by menuindex (although what matters here is the nesting order, not the menu index).
All in all, this process brought the server to its knees. I added the condition isfolder = 1, and after that all the containers were updated in 6 seconds. I won't be changing suffixes again :-)

Summary

In practice, we got full processing of a document on a site with 13,000+ documents (in two tables) and 43,000+ TVs in under 0.3 seconds with a freshly cleared cache, and in under 0.1 seconds from the cache.
Roughly speaking, we can assume that this is where the difference between a large and a small site ends, since any further slowdowns are possible only at the page rendering level, and that depends on how we write the templates and so on.
I will cover that in the next article (most likely tomorrow). I'll say right away that I will be doing it with Smarty, since IMHO doing all of this with plain chunks and snippets causes a lot of problems.

And finally, the results of a local test of 100 clients with 1000 requests each: gist.github.com/Fi1osof/462e1af10ab7b95311df
Time per request: 44.224 [ms] (mean, across all concurrent requests)

P.S. Package on modx.com: modx.com/extras/package/shopmodx
GitHub project: github.com/Fi1osof/shopModx
I have uploaded the latest version, which includes the request class.

P.P.S. It is better to specify the memcached provider settings directly in the config file, core/config/config.inc.php (just take my word for it).

$config_options = array (
  'cache_handler' => 'cache.xPDOMemCached',
  'cache_prefix' => 'shopmodx_', // Use a different prefix for each site sharing one memcached server
);

The $config_options variable already exists there.
