Optimizing the Nested Set Model in PHPixie

    image

    Just 2 hours ago, I added the last test to a new type of communication for PHPixie ORM - Nested Set. I thought for a long time whether to use this approach or the Closure Table to store trees in SQL databases. But as a result, the Closure Table lost due to the quasi-quadratic dimensions to which the link table grows (with 20 nodes in the worst case, 190 records can already be obtained). So the next task was to optimize the classic Nested Set approach, and I really liked the result.



    One of the main problems with using the Nested Set is the cost of inserting a node. The left the insertion position, the more records you need to shift to the right. For example, we have such a tree:

    image


    Let's say we need to insert the Fay subcategory into Pixies, the result is this:

    image


    As you can see, all the nodes in the tree will have to change. Now imagine that in this way you save the comment tree for articles. Each time someone adds a new comment, an update of the heap of entries occurs, in addition, you must definitely do this all in the transaction, and it is advisable to prohibit changes to these lines by parallel requests. If all this is not done, then even several active users commenting at the same time can easily break all indexing. And the system performance is obviously also very affected.

    I found an elegant way to solve this problem by just slightly modifying the standard Nested Set. Its idea is simple, although implementation is noticeably more difficult in some places. To all nodes, add the identifier of the subtree in which they are located, for example, the root id. In each subtree, start numbering left and right again. The above example with this approach would look like this:

    image


    When inserting, we need to change only the nodes located in the same subtree:

    image


    As you can see, the Plants subtree has not changed at all. From a practical point of view, this reduces the number of changed rows in the database by orders of magnitude. If changes take place in different subtrees, they do not interfere with each other at all and even if indexing breaks in one of them, this will not affect the others in any way.

    For these amenities you have to pay with more complex code. It is easy to make a mistake and forget to change the root field when transferring a node to another subtree or taking it to the root, the procedure for turning records into an object tree is also a bit more complicated, etc. That’s why it took so long to write tests for all cases.

    Use in PHPixie

    A full dock will appear on the site soon, but here is a small guide for those who want to use the new connection now.

    First, add 4 INTEGER fields to the table in which the elements are stored: left , right , rootId , depth (of course, the field names are fully customizable, these are only default). Then add the link to /bundes/app/assets/config/orm.php

    return array(
         'relationships' => array(
              //....
             array(
                 'model' => 'category' //имя модели,
                 'type' => 'nestedSet'
             )
         )
    );
    


    Then everything is simple:

    $fairies->children->add($sprites);
    $fairies->children->add($pixies);
    //ну или так
    $pixies->parent->set($fairies);
    //поместить категорию в корень
    $pixies->parent->remove();
    //подгрузить все дерево с БД
    //заметьте условие where('depth', 0),
    //без него в результате мы бы получили все категории
    //а не только те что сверху (то есть кучу дубликатов)
    $categories = $orm->query('category')->where('depth', 0)->find(array('children'));
    


    By the way, PHPixie itself will monitor the removal of entities and correct indexes.

    I hope everyone will like the new functionality, and those who do not use PHPixie may find the optimization approach useful.
    If you are interested in any details or just want to chat, go to our chat room .

    Also popular now: