Hierarchy Implementation: Combining Adjacency List and Materialized Path via a One-to-Many Table

Storing a hierarchy in MySQL is a well-worn topic that has been covered on Habr many times, yet I never found a structure that combines ease of maintenance with ease of querying. So the wheel got reinvented once more...

Adjacency List (AL) is convenient:
  • referential integrity maintains itself (ON DELETE CASCADE)
  • inserting or moving a branch is trivial (the update touches only the parent_id field of a single element)
  • fetching children one level deep is easy

But the main inconvenience shows up when querying:
  • fetching all descendants of a branch (e.g. for a menu limited to a certain nesting depth)
  • counting the total number of descendants (catalog statistics)
  • finding the full path to an item (breadcrumbs, URL generation).

None of these is hard to solve, but each forces you either to hardcode the number of levels or to resort to recursion. By and large you could live with it: write a couple of functions with ugly code and forget about the whole thing. But!
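To make the recursion problem concrete, here is a minimal sketch (Python with SQLite standing in for the article's PHP/MySQL; the table contents are hypothetical) of building breadcrumbs from a plain AL table — note the one-query-per-level loop:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pages (id INTEGER PRIMARY KEY, parent_id INTEGER, title TEXT)")
conn.executemany("INSERT INTO pages VALUES (?, ?, ?)",
                 [(1, None, "Catalog"), (2, 1, "Books"), (3, 2, "Sci-Fi")])

def breadcrumbs(conn, page_id):
    """Climb the parent_id chain: one extra query for every nesting level."""
    titles = []
    while page_id is not None:
        page_id, title = conn.execute(
            "SELECT parent_id, title FROM pages WHERE id = ?",
            (page_id,)).fetchone()
        titles.append(title)
    return titles[::-1]

print(breadcrumbs(conn, 3))  # ['Catalog', 'Books', 'Sci-Fi']
```

With N levels of nesting this costs N round trips to the database, which is exactly what the paths table below avoids.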
Looking at the Materialized Path idea of storing the full path, I wondered: what prevents storing it in a separate table with a one-to-many relationship? Someone will object that this has been done before, but there is one significant difference: parent_id!
So. Pages table:
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`parent_id` int(10) unsigned DEFAULT NULL,
`title` varchar(250) DEFAULT NULL,


Pages_paths table:
`item_id` int(10) unsigned DEFAULT NULL,
`parent_id` int(10) unsigned DEFAULT NULL,
`level` tinyint(3) unsigned DEFAULT '0',
`order` tinyint(3) unsigned DEFAULT '0',

We write the dependencies:
ALTER TABLE `pages`  ADD CONSTRAINT `pages_ibfk_1` FOREIGN KEY (`parent_id`) REFERENCES `pages` (`id`) ON DELETE CASCADE;
ALTER TABLE `pages_paths`
  ADD CONSTRAINT `pages_paths_ibfk_2` FOREIGN KEY (`parent_id`) REFERENCES `pages` (`id`) ON DELETE CASCADE,
  ADD CONSTRAINT `pages_paths_ibfk_1` FOREIGN KEY (`item_id`) REFERENCES `pages` (`id`) ON DELETE CASCADE;
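As a sanity check of the cascade behaviour these constraints provide, the schema can be replayed in SQLite (a stand-in for MySQL here; note that SQLite enforces foreign keys only when asked):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite needs this enabled explicitly
conn.execute("""CREATE TABLE pages (
    id INTEGER PRIMARY KEY,
    parent_id INTEGER REFERENCES pages(id) ON DELETE CASCADE,
    title TEXT)""")
conn.execute("""CREATE TABLE pages_paths (
    item_id INTEGER REFERENCES pages(id) ON DELETE CASCADE,
    parent_id INTEGER REFERENCES pages(id) ON DELETE CASCADE,
    level INTEGER DEFAULT 0,
    "order" INTEGER DEFAULT 0)""")

# A three-level branch 1 -> 2 -> 3 and its materialized paths
conn.executemany("INSERT INTO pages VALUES (?, ?, ?)",
                 [(1, None, "root"), (2, 1, "child"), (3, 2, "grandchild")])
conn.executemany("INSERT INTO pages_paths VALUES (?, ?, ?, ?)",
                 [(2, 1, 1, 0), (3, 1, 2, 0), (3, 2, 2, 1)])

# Deleting the branch root drags everything with it, paths included
conn.execute("DELETE FROM pages WHERE id = 1")
print(conn.execute("SELECT COUNT(*) FROM pages").fetchone()[0],
      conn.execute("SELECT COUNT(*) FROM pages_paths").fetchone()[0])  # 0 0
```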


This way you can "bolt" the method onto an existing AL table without touching the existing application logic.
Inserts and tree modifications work exactly as with plain AL. When the root element of a branch is deleted, the foreign keys cascade and remove the whole branch together with its rows in the paths table.
The only intervention needed is when adding new elements or moving them between branches. You could attach a trigger, but for better compatibility with typical shared hosting I decided to keep it to plain PHP and fire the updates from the script.
I tested everything on a table with 1000 records and 5 levels of nesting. To avoid filling it by hand, I wrote a simple fixture generator:
Tree::Fixture( 'pages', 1000, 5 );

To get started, you need to initialize the paths for the existing table:
Tree::GeneratePaths('pages'); 
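The article's PHP implementation is not shown here, so below is a hypothetical Python/SQLite sketch of what GeneratePaths could do: for every page, climb the parent chain and insert one pages_paths row per ancestor. The semantics of `level` (the item's depth in the tree) and `order` (the ancestor's position in the item's path) are my assumption, inferred from the queries later in the article.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pages (id INTEGER PRIMARY KEY, parent_id INTEGER, title TEXT)")
conn.execute('CREATE TABLE pages_paths (item_id INTEGER, parent_id INTEGER,'
             ' level INTEGER DEFAULT 0, "order" INTEGER DEFAULT 0)')
conn.executemany("INSERT INTO pages VALUES (?, ?, ?)",
                 [(1, None, "Catalog"), (2, 1, "Books"), (3, 2, "Sci-Fi")])

def generate_paths(conn):
    """Rebuild pages_paths: one row per (item, ancestor) pair.

    Assumed semantics: `level` is the item's depth in the tree,
    `order` is the ancestor's position in the item's path.
    """
    conn.execute("DELETE FROM pages_paths")
    parents = dict(conn.execute("SELECT id, parent_id FROM pages"))
    for item_id, parent in parents.items():
        chain = []
        while parent is not None:       # climb to the root
            chain.append(parent)
            parent = parents[parent]
        chain.reverse()                 # root first
        for order, ancestor in enumerate(chain):
            conn.execute("INSERT INTO pages_paths VALUES (?, ?, ?, ?)",
                         (item_id, ancestor, len(chain), order))

generate_paths(conn)
print(sorted(conn.execute("SELECT * FROM pages_paths")))
# [(2, 1, 1, 0), (3, 1, 2, 0), (3, 2, 2, 1)]
```

Loading the parent map into memory keeps this to a single SELECT regardless of depth, at the cost of one dict entry per page.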

Processing this table on a development machine took ~10 seconds. After that, the path to an element can be pulled from the paths table with simple queries:
SELECT *  FROM `pages_paths` pp
JOIN `pages` p ON p.`id`=pp.`parent_id` 
WHERE item_id=:id ORDER BY `order`

Collecting all descendants (or counting them without a JOIN):
SELECT * FROM `pages_paths` pp JOIN `pages` p ON p.`id`=pp.`item_id` 
WHERE pp.`parent_id`=:id ORDER BY pp.`level`, pp.`order`
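Both queries can be exercised end to end in a small SQLite sketch (hypothetical data; as above, `level` is assumed to be the item's depth and `order` the ancestor's position in the path):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pages (id INTEGER PRIMARY KEY, parent_id INTEGER, title TEXT)")
conn.execute('CREATE TABLE pages_paths (item_id INTEGER, parent_id INTEGER,'
             ' level INTEGER DEFAULT 0, "order" INTEGER DEFAULT 0)')
conn.executemany("INSERT INTO pages VALUES (?, ?, ?)",
                 [(1, None, "Catalog"), (2, 1, "Books"), (3, 2, "Sci-Fi")])
conn.executemany("INSERT INTO pages_paths VALUES (?, ?, ?, ?)",
                 [(2, 1, 1, 0), (3, 1, 2, 0), (3, 2, 2, 1)])

# Breadcrumbs for page 3: ancestors in path order, one query
path = [t for (t,) in conn.execute(
    'SELECT p.title FROM pages_paths pp '
    'JOIN pages p ON p.id = pp.parent_id '
    'WHERE pp.item_id = ? ORDER BY pp."order"', (3,))]
print(path)  # ['Catalog', 'Books']

# All descendants of page 1, shallowest first
kids = [t for (t,) in conn.execute(
    'SELECT p.title FROM pages_paths pp '
    'JOIN pages p ON p.id = pp.item_id '
    'WHERE pp.parent_id = ? ORDER BY pp.level, pp."order"', (1,))]
print(kids)  # ['Books', 'Sci-Fi']

# ...and their count, no JOIN needed
n = conn.execute("SELECT COUNT(*) FROM pages_paths WHERE parent_id = ?",
                 (1,)).fetchone()[0]
print(n)  # 2
```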

When an element is added, we only need to update its paths; when one is moved, we first delete the old path rows of the branch (item_id = :id OR parent_id = :id) and then regenerate the paths at the new location:
Tree::GeneratePaths( 'pages', $id );

Updates within a branch of 100-200 elements fit within 1 second, which is quite acceptable for my tasks: only an admin will ever see the delay.
The full PHP class and core SQL are available for review.
In conclusion, some usage examples (or the whole thing):
// Path to an element
$arr = Tree::GetPath( 'pages', $id );
// Get all children
$arr = Tree::GetChildren( 'pages', $id );
// Number of children
$num = Tree::GetChildrenCount( 'pages', $id );


I would be glad if someone could suggest how to speed up the full path regeneration, an implementation as a MySQL stored procedure, or other clever ideas.
