Magento Performance Increase

... or how to work with collections the right way.

I want to tell you about the mistakes I have seen on almost every Magento project that had performance problems. When working with Magento, I sometimes have to audit other people's code, so I would like to share experience that will help you improve the performance of your sites and avoid these mistakes in the future.

This article is about Magento 1.*, but what is described here also applies to Magento 2.*.

In almost every project where there are performance problems, you can find something like this:

$temp = array();
$collection = Mage::getModel('catalog/product')->getCollection()->addAttributeToSelect('*');
foreach ($collection as $product) {
    $product = $product->load($product->getId());
    $temp[] = $product->getSku();
}
Wrong.

Instead, it should look like this:

$temp = array();
$collection = Mage::getModel('catalog/product')->getCollection()->addAttributeToSelect('sku');
foreach ($collection as $product) {
    $temp[] = $product->getSku();
}
Right.

The reasons why code like the first example gets written are very simple:

  1. After the collection was loaded, the required attributes were missing
  2. This is how "programmers" on the Internet do it
  3. Superfluous attributes are loaded on the principle "it won't make things worse"

To understand what is wrong and what we can do about performance, I suggest concentrating on how collections work:

  1. EAV / Flat tables
  2. Cache
  3. Proper work with collections

And, of course, the conclusions.



EAV / Flat tables


EAV (Entity-Attribute-Value) is a data storage approach in which the entity that an attribute belongs to, the attribute itself, and its value are split across different tables.

In Magento, the EAV entities are products, categories, customers, and customer addresses. The attributes themselves are stored in the eav_attribute table.

Magento has 5 attribute value types: text, varchar, int, decimal, and datetime. There is 1 more type, static, which differs from the other 5 in that its value is stored in the entity table itself.

The attribute table records which type each attribute has, so Magento knows which table to write its value to and which table to read it from.

Storing values this way makes attribute sets fairly easy to implement (each entity may or may not have a given attribute), and adding a new attribute is just one more row in the database. Adding a new value of an attribute for another store is a new row in that attribute's value table.

How it is stored in the database:
Entity:
Product - catalog_product_entity,
Category - catalog_category_entity,
Customer - customer_entity,
Customer address - customer_address_entity

Attribute:
eav_attribute
catalog_eav_attribute
customer_eav_attribute

Value:
*_text
*_varchar
*_int
*_decimal
*_datetime
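
To make this layout more concrete, here is a minimal sketch in plain PHP that needs no Magento at all. The arrays stand in for rows of the attribute and value tables; the attribute IDs, the sample values and the helper name buildFlatRow() are made up for illustration and are not Magento's API. It shows how the values of one product, scattered across the typed value tables, are assembled into a single row, which is roughly the work that a Flat table does in advance.

```php
<?php
// Hypothetical attribute registry, standing in for eav_attribute:
// attribute_id => attribute code and value type.
$attributes = array(
    71 => array('code' => 'name',   'type' => 'varchar'),
    75 => array('code' => 'price',  'type' => 'decimal'),
    96 => array('code' => 'status', 'type' => 'int'),
);

// Hypothetical value tables, one per value type (*_varchar, *_decimal, *_int).
$valueTables = array(
    'varchar' => array(
        array('entity_id' => 1, 'attribute_id' => 71, 'store_id' => 0, 'value' => 'Default name'),
        array('entity_id' => 1, 'attribute_id' => 71, 'store_id' => 1, 'value' => 'Store 1 name'),
    ),
    'decimal' => array(
        array('entity_id' => 1, 'attribute_id' => 75, 'store_id' => 0, 'value' => '9.99'),
    ),
    'int' => array(
        array('entity_id' => 1, 'attribute_id' => 96, 'store_id' => 0, 'value' => 1),
    ),
);

/**
 * Assemble one flat row for an entity in a given store.
 * A store-specific value overrides the default (store_id = 0) value.
 */
function buildFlatRow(array $entity, array $attributes, array $valueTables, $storeId)
{
    $row = $entity; // static attributes (e.g. sku) already live in the entity row
    foreach ($attributes as $attributeId => $attribute) {
        $default = null;
        $storeValue = null;
        foreach ($valueTables[$attribute['type']] as $valueRow) {
            if ($valueRow['entity_id'] !== $entity['entity_id']
                || $valueRow['attribute_id'] !== $attributeId) {
                continue; // value of another entity or another attribute
            }
            if ($valueRow['store_id'] === 0) {
                $default = $valueRow['value'];
            } elseif ($valueRow['store_id'] === $storeId) {
                $storeValue = $valueRow['value'];
            }
        }
        $row[$attribute['code']] = ($storeValue !== null) ? $storeValue : $default;
    }
    return $row;
}

$entity = array('entity_id' => 1, 'sku' => 'SKU-1'); // row of the entity table
$flat = buildFlatRow($entity, $attributes, $valueTables, 1);
print_r($flat);
```

Every attribute costs one more lookup in one more table, which is exactly what the joins do in the real EAV queries.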

Flat is the approach familiar to all of us: everything is in one place, and no additional tables are needed to get a product with all its attributes. SELECT * FROM table WHERE id = <some id>, and that's it.

Of the EAV entities, only categories and products can use a Flat representation.

How it is stored in the database:
Product:
catalog_product_flat_1 // *_N, where N is the store view ID
Category:
catalog_category_flat_store_1 // *_N, where N is the store view ID

To add an attribute to a Flat table, and to enable Flat tables in general, do the following.
In the admin panel, go to Catalog > Attributes > Manage Attributes.

Magento adds the attribute to the Flat table if at least 1 of the following properties is set.



In the admin panel, go to System > Configuration > Catalog.

Magento will use Flat tables for the entities enabled there (categories and/or products).



Pay attention to the following facts:

  1. Flat tables are used ONLY where collections are used: on category pages, in the list of associated products of a grouped product, and so on. They are not used on the product page, in the admin panel, or when the load method is called on a model.
  2. After you enable Flat tables, you need to reindex; otherwise Magento will continue to use only the EAV tables.
  3. After Flat tables are enabled, Magento still keeps using EAV, but it also starts copying changes to the Flat table on save.

Why is all this necessary, and why not use the Flat approach everywhere? Take a look at the summary of pros and cons.
EAV:
+ A more flexible system than Flat
+ When a new attribute is added, there is no need to reindex data
+ Almost unlimited number of attributes
+ All attributes are always available
+ Static attributes (sku, created_at, updated_at) are always present in the selection, even if they are not requested explicitly
- Fatal error: Call to a member function getBackend() when selecting / filtering by a non-existent attribute
- Performance

Flat:
+ Performance
+ Only attributes that exist and have been added to the Flat table can be used for selecting / filtering
- Limits on row size (up to 65,535 bytes, i.e. about 85 varchar(255) columns) and on the number of columns (up to 1000 in InnoDB, up to 4096 in some other engines)
- Used only when working with collections (EAV is always used when loading a model)
- The result differs from that of an EAV query (no static attributes)
- After enabling, a reindex is required; otherwise the EAV tables will be used
- When a new attribute is added, the Flat tables need to be reindexed
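
The figure of about 85 varchar(255) columns in the Flat limitations can be checked with simple arithmetic. The sketch below assumes the utf8 character set (up to 3 bytes per character) and the 2-byte length prefix that MySQL stores for a VARCHAR that can exceed 255 bytes. Both are assumptions about the table definition, and the other columns of a real Flat table (entity_id and so on) are ignored, so treat the result as a rough upper bound.

```php
<?php
// Assumptions: utf8 (3 bytes per character) and a 2-byte length prefix.
$maxRowSize     = 65535; // MySQL row size limit in bytes
$bytesPerChar   = 3;     // utf8
$lengthPrefix   = 2;     // bytes used to store the length of the value
$bytesPerColumn = 255 * $bytesPerChar + $lengthPrefix; // 767 bytes

// How many such columns fit into one row
$maxColumns = (int) floor($maxRowSize / $bytesPerColumn);

echo $maxColumns; // 85
```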



Cache


Of course, you might ask why we need to figure out how to speed up database queries and how collections work at all, if the cache saves us and everything is cached anyway. I will answer briefly: the cache will not save you. None of the caches in Magento caches collections automatically, and none of them works in the custom controllers and models that you use, say, when importing data or calculating something. Besides, before data can be served from the cache, it has to be put there somehow, and that first response still has to be shown to the user quickly.

Types of caches in Magento 1.*:



  • Configuration - caches configuration files
  • Layout - caches layout files
  • Block HTML output - caches the output of phtml templates. By default, on the frontend it is used only for the top menu and the footer.
  • Translations - caches csv translation files
  • Collections data - caches collections that use the ->initCache(...) method. By default, only the core_store, core_store_group and core_website collections are cached, during initialization.
  • EAV types and attributes - should cache EAV attributes, BUT does not. It is used in 1 method that has never been called since Magento CE 1.4
  • Web services cache - caches api.xml files
  • Page Cache (FPC) - caches the whole HTML page, but only for CMS, Category and Product pages. It is bypassed for https requests, for requests with the GET parameter ?no_cache=1, and when the NO_CACHE cookie is set
  • DDL Cache (hidden) - caches DESCRIBE calls to the database; used in write operations

... and none of them caches collections automatically.


Proper work with collections


To show more clearly why some things should be done differently from what many are used to, I decided to run performance tests of the different approaches. Let's start with the test bench. For testing, I used:

Testbed:
OS X 10.10
3.1 GHz Intel Core i5 (4 cores)
8GB

Magento configuration:
Magento EE 1.14.0
MySQL 5.5.38
PHP 5.6.2

Content:
3 Categories
2000 Products
2000 CMS pages

Process:
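For the tests, an extension with 1 controller and 1 action was created. Each test was run 5 times, and then the average time was calculated. All results are in seconds.

class Test_Test_IndexController extends Mage_Core_Controller_Front_Action
{
    public function indexAction()
    {
        $temp = array();
        $start = microtime(true);

        // --- code under test: replaced for each test ---
        $collection = Mage::getModel('catalog/product')->getCollection()
            ->addAttributeToSelect('sku');
        foreach ($collection as $product) {
            $temp[] = $product->getSku();
        }
        // --- end of code under test ---

        $stop = microtime(true);
        echo $stop - $start; // elapsed time in seconds
    }
}

Test harness: the code between the two microtime() calls is swapped out for each test.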
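
The "run 5 times and take the average" step can be sketched as a small helper in plain PHP. The function name averageRuntime() and the stand-in workload are hypothetical and are not part of the original test extension; in the real tests the callable would contain one of the collection snippets.

```php
<?php
/**
 * Run a callable several times and return the average wall-clock
 * time in seconds. Hypothetical helper, not part of Magento.
 */
function averageRuntime($test, $runs = 5)
{
    $total = 0.0;
    for ($i = 0; $i < $runs; $i++) {
        $start = microtime(true);
        call_user_func($test);
        $total += microtime(true) - $start;
    }
    return $total / $runs;
}

// Stand-in workload; a real test would loop over a collection here.
$calls = 0;
$average = averageRuntime(function () use (&$calls) {
    $calls++;
    usleep(1000); // simulate roughly 1 ms of work
});

printf("%.4f\n", $average);
```

Averaging several runs smooths out one-off effects such as a cold database or file system cache on the first request.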

Tests


  1. EAV / Flat with and without reloading models
  2. Collection caching
  3. Proper use of count() and getSize()
  4. Proper use of getFirstItem() and setPage(1,1)

EAV / Flat with and without reloading models


Looping over a collection, with load() called on each model (a reload) inside the loop:

$temp = array();
$collection = Mage::getModel('catalog/product')->getCollection()->addAttributeToSelect(...);
foreach ($collection as $product) {
    $product = $product->load($product->getId());
    $temp[] = $product->getSku();
}

Looping over a collection, without loading the models inside the loop:

$temp = array();
$collection = Mage::getModel('catalog/product')->getCollection()->addAttributeToSelect(...);
foreach ($collection as $product) {
    $temp[] = $product->getSku();
}

3 variants of attribute selection:

  1. addAttributeToSelect('*'); // all attributes
  2. addAttributeToSelect('sku'); // 1 static attribute
  3. addAttributeToSelect('name'); // 1 regular (non-static) attribute

Results


As you have probably noticed, the time without reloading the models is many times lower than with reloading. The time is lower still when Flat tables are enabled (i.e. there are no extra joins and unions) and only the necessary attributes are selected.

In the first case ('*'), we load the collection with a bunch of joins, and then, when reloading, do the same again for each model, 2000 times over.

In the second case ('sku'), the attribute is static (it is in the same table as the product itself), so Magento does not need any joins. Therefore the time is lower.

In the third case ('name'), Magento needs to join one more table, the one where this attribute is stored.

With Flat tables everything is similar, except that the second and third cases are identical: both attributes are in the same table, hence the identical time.

I think the numbers speak for themselves.


Collection Caching


Without cache:

$collection = Mage::getModel('catalog/product')->getCollection()
                                               ->addAttributeToSelect('*');

Using the initCache method:

$collection = Mage::getModel('catalog/product')->getCollection()
                                               ->addAttributeToSelect('*')
                                               ->initCache(Mage::app()->getCache(),'our_data',array('SOME_TAGS'));

Custom Caching Implementation:

$cache = Mage::app()->getCache();
$collection = $cache->load('our_data'); // false on a cache miss
if (!$collection) {
    // Cache miss: load the items from the database and store them serialized
    $collection = Mage::getModel('catalog/product')->getCollection()->addAttributeToSelect('*')->getItems();
    $cache->save(serialize($collection), 'our_data', array(Mage_Core_Model_Resource_Db_Collection_Abstract::CACHE_TAG));
} else {
    // Cache hit: restore the array of product objects
    $collection = unserialize($collection);
}

Let's compare a selection without a cache, one using the method that Magento offers, and one using a workaround that I have not seen anywhere else, which I hacked together on top of the cache model's methods. Please note that in all tests, after building the query, I loaded the data and converted the collection to an array of objects.

Results


Without a cache there is nothing surprising: everything is as usual.

With the Magento cache, however, I was personally surprised to see that the time got longer. Caching an EAV collection is pointless in any case, because an EAV collection first loads the entities from the product table (and this is exactly what gets cached), and then fetches the attribute values with a separate query and fills the objects. With Flat, everything is fetched from 1 table. Nevertheless, working with the cache takes more time than working with the database (I tested both the file system and Redis backends; the difference is in the fourth decimal place, i.e. there is effectively none on 2k entities).

The essence of the initCache method is this: the collection first assembles all its data (pagination, filters, events, and so on), builds a hash of the SQL query, and looks that hash up in the cache. If something is found there, it is unserialized, and then all the events and subsequent methods run. This is the slowest step of the whole process, and it is here that the cache turns out slower than a plain database query. On the other hand, no query is then sent to the database, which makes it less of a problem.

The separate example is the cache I knocked together myself: there we cache the final result of the collection, bypassing all the events and the loading of attributes. This works for both EAV and Flat collections.
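
The lookup step of initCache described above (hash the SQL query, look it up, unserialize on a hit) can be modelled in a few lines of plain PHP. This is a simplified sketch and not Magento's code: the in-memory array stands in for the cache backend, and the function names runQuery() and loadWithCache() are hypothetical. The events and attribute loading that Magento runs afterwards are left out.

```php
<?php
// In-memory stand-in for the cache backend (file system, Redis, ...).
$cacheStorage = array();
// Counts how many times the "database" was actually queried.
$dbQueries = 0;

/** Stand-in for executing the query against the database. */
function runQuery($sql, &$dbQueries)
{
    $dbQueries++;
    return array(array('entity_id' => 1, 'sku' => 'SKU-1'));
}

/** Load rows for a query, using a hash of the SQL text as the cache key. */
function loadWithCache($sql, $prefix, array &$cacheStorage, &$dbQueries)
{
    $cacheId = $prefix . '_' . md5($sql);
    if (isset($cacheStorage[$cacheId])) {
        // Cache hit: unserialize the stored rows, no database query
        return unserialize($cacheStorage[$cacheId]);
    }
    // Cache miss: query the database and store the serialized result
    $rows = runQuery($sql, $dbQueries);
    $cacheStorage[$cacheId] = serialize($rows);
    return $rows;
}

$sql = 'SELECT * FROM catalog_product_entity';
$first  = loadWithCache($sql, 'our_data', $cacheStorage, $dbQueries);
$second = loadWithCache($sql, 'our_data', $cacheStorage, $dbQueries);
// A different query (another page, another filter) gets a different key.
$third  = loadWithCache($sql . ' LIMIT 10', 'our_data', $cacheStorage, $dbQueries);

echo $dbQueries; // 2
```

Because the key depends on the full query text, every distinct combination of filters and pagination occupies its own cache entry.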

Proper use of count() and getSize()


getSize():

$size = Mage::getModel('catalog/product')->getCollection()
                                         ->addAttributeToSelect('*')
                                         ->getSize();

count():

$size = Mage::getModel('catalog/product')->getCollection()
                                         ->addAttributeToSelect('*')
                                         ->count();

Results


The difference between the methods is that count() loads all the objects of the collection, then counts them with the regular PHP count() and returns the number. getSize() does not load the collection; instead it sends 1 more query to the database, with no limits, no ordering and no list of selected attributes, only COUNT(*).

As for when to use which: if you only need to know whether there are any records in the database, or how many there are, use getSize(). If you need the loaded collection anyway, or it is already loaded, use count(): it returns the number of items loaded into the collection.
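
The difference can be illustrated with a toy collection in plain PHP. The ToyCollection class below is not Magento's collection class; it is a minimal model, written under that assumption, of the lazy-loading behaviour described above, with an array standing in for the database table.

```php
<?php
/**
 * Toy collection, NOT Magento's class: it only mimics how count()
 * and getSize() differ in what they make the "database" do.
 */
class ToyCollection
{
    private $table;        // stand-in for the database table
    private $items = null; // null until the collection is loaded
    public $log = array(); // "queries" that were executed

    public function __construct(array $table)
    {
        $this->table = $table;
    }

    public function load()
    {
        if ($this->items === null) {
            $this->log[] = 'SELECT * (' . count($this->table) . ' rows fetched)';
            $this->items = $this->table;
        }
        return $this;
    }

    // Loads every row, then counts the loaded items in PHP.
    public function count()
    {
        return count($this->load()->items);
    }

    // Asks the "database" for the number only; no rows are loaded.
    public function getSize()
    {
        $this->log[] = 'SELECT COUNT(*)';
        return count($this->table);
    }

    public function isLoaded()
    {
        return $this->items !== null;
    }
}

$table = array_fill(0, 2000, array('sku' => 'SKU'));

$a = new ToyCollection($table);
$size = $a->getSize();  // the collection stays unloaded

$b = new ToyCollection($table);
$count = $b->count();   // all 2000 rows are fetched first

print_r($a->log);
print_r($b->log);
```

Both calls return the same number here; what differs is how much data had to be fetched to get it.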

Proper use of getFirstItem() and setPage(1,1)


getFirstItem():

$product = Mage::getModel('catalog/product')->getCollection()
                                            ->getFirstItem();

setPage(1,1):

$product = Mage::getModel('catalog/product')->getCollection()
                                            ->setPage(1,1)
                                            ->getFirstItem();

load():

$product = Mage::getModel('catalog/product')->load(22);

Results


The problem with getFirstItem() is that it loads the entire collection and then simply returns the first element from a foreach; if there is none, it returns an empty object.

setPage() (a shortcut for $this->setCurPage($pageNum)->setPageSize($pageSize)) limits the selection to exactly 1 record, which, as you can see, significantly speeds up loading the result.

Even load() is faster than a bare getFirstItem(), but note that load() turned out slower than fetching 1 element through a collection. This is because load() always works with the EAV tables.
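
The effect of the page limit can be shown with another toy model in plain PHP. ToyPagedCollection is a hypothetical class, not Magento's; it only counts how many rows are fetched before getFirstItem() returns, with and without setPage(1,1), and returns an empty array where Magento would return an empty object.

```php
<?php
/**
 * Toy collection, NOT Magento's class: it only mimics how a page
 * limit changes the number of rows fetched by getFirstItem().
 */
class ToyPagedCollection
{
    private $table;
    private $curPage = 1;
    private $pageSize = null; // null means "no LIMIT"
    public $rowsFetched = 0;

    public function __construct(array $table)
    {
        $this->table = $table;
    }

    public function setPage($pageNum, $pageSize)
    {
        $this->curPage = $pageNum;
        $this->pageSize = $pageSize;
        return $this;
    }

    private function load()
    {
        if ($this->pageSize === null) {
            $rows = $this->table; // the whole table is fetched
        } else {
            $offset = ($this->curPage - 1) * $this->pageSize;
            $rows = array_slice($this->table, $offset, $this->pageSize);
        }
        $this->rowsFetched = count($rows);
        return $rows;
    }

    public function getFirstItem()
    {
        $rows = $this->load();
        // An empty placeholder is returned when nothing was found
        return count($rows) > 0 ? reset($rows) : array();
    }
}

$table = array();
for ($i = 1; $i <= 2000; $i++) {
    $table[] = array('entity_id' => $i);
}

$all = new ToyPagedCollection($table);
$firstA = $all->getFirstItem();

$limited = new ToyPagedCollection($table);
$firstB = $limited->setPage(1, 1)->getFirstItem();

echo $all->rowsFetched, ' vs ', $limited->rowsFetched; // 2000 vs 1
```

Both variants return the same first item; only the amount of data loaded to get it differs.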



Conclusions


To summarize everything written above, here is my advice to everyone working with Magento:

  • Never call the load method again on objects retrieved from a collection.
  • Select only the attributes you need.
  • If applicable to the project, use Flat tables.
  • Use count() to count the items of a loaded collection, and getSize() to get the total number of records.
  • Do not use the getFirstItem() method without setPage(1,1) or a similar limit.
