PostgreSQL Indexes for Zabbix

    We recently migrated Zabbix to a PostgreSQL database. Combined with the move to a server with SSDs, this gave a significant speed boost and also solved our problem with duplicate hosts in the database (a known bug report). The article could end here, but we noticed that Zabbix was using a lot of disk space, so below I will tell you what was going on and how we figured it out.

    Our Zabbix installation has a relatively large database. It monitors almost 1,500 hosts and collects about 180 thousand metrics. We use partitioning in the database, which makes cleaning up historical data easy. Each partition, of course, has its own indexes. You can see where this is going.

    Yes, we are going to talk about indexes. We found that a number of indexes grow to almost twice their necessary size; at our volumes they occupy 5-7 GB per partition. Given that we store historical data for 10 days and trends for 3 months, this adds up to roughly 70 GB of wasted space. With a total database size of about 220 GB, on an SSD that is very noticeable.
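To see where the space goes, per-index sizes can be pulled from the statistics views. A minimal sketch, assuming (as in the script below) that the partitions live in a schema named `partitions`; the snippet only prints the query, which you would feed to `psql -U postgres -d zabbix`:

```shell
# Sketch: list index sizes in the "partitions" schema, biggest first,
# to spot bloated partition indexes. Prints the SQL; run it via psql.
SQL="SELECT indexrelname, pg_size_pretty(pg_relation_size(indexrelid))
     FROM pg_stat_user_indexes
     WHERE schemaname = 'partitions'
     ORDER BY pg_relation_size(indexrelid) DESC;"
echo "$SQL"
```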

    Approach number one. We attacked the problem head-on and ran a full reindex. It worked well and freed almost 70 GB, as expected. The drawback is the time the operation takes, or rather the fact that REINDEX locks the table while it runs, and for us it takes about 3 hours.
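For reference, the head-on approach is a single statement per partition (the partition name here is a hypothetical example). The catch is that REINDEX blocks writes to the table for the entire rebuild, which is the three-hour lock mentioned above:

```shell
# The head-on approach: rebuild a partition's indexes in place.
# REINDEX blocks writes to the table until it finishes.
# Partition name below is a hypothetical example.
SQL="REINDEX TABLE partitions.history_p2019_01_01;"
echo "$SQL"   # run with: psql -U postgres -d zabbix -c "$SQL"
```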

    Approach number two. We looked at pg_repack. This utility can do the equivalent of VACUUM FULL and REINDEX without locking the tables. We installed it, configured it, launched it and got ready to rejoice. Imagine our disappointment when we saw that less than a gigabyte had been freed. We opened the documentation and read it very carefully; there is an item there, verbatim:

    “Target table must have a PRIMARY KEY, or at least a UNIQUE total index on a NOT NULL column.” We then opened the database and saw that the tables we needed have neither.
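A quick way to confirm this is to ask the system catalogs which tables lack both a PRIMARY KEY and a UNIQUE index. A sketch, again only printing the SQL to be run through psql with your own connection parameters:

```shell
# Sketch: list tables that have neither a PRIMARY KEY nor a UNIQUE
# index -- exactly the tables pg_repack refuses to process.
SQL="SELECT c.relname
     FROM pg_class c
     WHERE c.relkind = 'r'
       AND NOT EXISTS (SELECT 1 FROM pg_index i
                       WHERE i.indrelid = c.oid
                         AND (i.indisprimary OR i.indisunique));"
echo "$SQL"   # run with: psql -U postgres -d zabbix -c "$SQL"
```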

    Of course, we could have added them, but that would increase the size of the database, and the developers clearly had a reason not to add a PRIMARY KEY. Poking around, we found that, for example, in the table that stores log monitoring data, rows can be complete duplicates. In the end, we had to abandon pg_repack.

    The third approach is the winning one. If you take the partition for the previous day, nobody writes to it anymore, so it can be reindexed without downtime of the Zabbix server. Or so we thought: it turns out the lock is taken on the entire table, not just the partition. Fine then. If REINDEX won't work, why not build a new index instead of rebuilding the old one? And here we finally found the solution. The algorithm is simple:

    1. Create the new index with CREATE INDEX CONCURRENTLY, which avoids taking a long lock on the table; since nobody writes to this partition, the index is built successfully.
    2. Drop the old index.
    3. Rename the new index to the old name.
    4. Proudly pat yourself on the back.

    Now, every night, we run a script that walks through the tables and performs these operations.

    Script
    #!/bin/bash
    # Swap in a freshly built index for each partition:
    # CREATE INDEX CONCURRENTLY -> DROP old -> RENAME new to old name.
    rebuild_index() {
            table=$1   # e.g. history_p2019_01_01
            index_name="${table}_1"
            index_name_new="${index_name}_new"
            echo "$index_name"
            psql -U postgres -d zabbix -c "CREATE INDEX CONCURRENTLY $index_name_new ON partitions.$table USING btree (itemid, clock);"
            echo "Done CREATE"
            psql -U postgres -d zabbix -c "DROP INDEX partitions.$index_name"
            echo "Done DROP"
            psql -U postgres -d zabbix -c "ALTER INDEX IF EXISTS partitions.$index_name_new RENAME TO $index_name"
            echo "Done ALTER"
    }

    echo "Reindex history_p / history_uint_p start"
    for i in $(seq 1 10); do
            dd=$(date +%Y_%m_%d -d "$i day ago")
            rebuild_index "history_p$dd"
            rebuild_index "history_uint_p$dd"
    done

    echo "Reindex trends_p / trends_uint_p start"
    for i in $(seq 1 2); do
            # note: "$i month ago" here -- the original draft had a fixed
            # "1 month ago", which made the second loop iteration a no-op
            dd=$(date +%Y_%m -d "$i month ago")
            rebuild_index "trends_p$dd"
            rebuild_index "trends_uint_p$dd"
    done


    Please don't judge the script too harshly: it is a rough draft, included here for illustration.

    Thanks for your attention!
