PostgreSQL Indexes - 1

    Foreword


    This series of articles will focus on indexes in PostgreSQL.

    Any question can be considered from different points of view. We will focus on what should interest an application developer using a DBMS: what indexes exist, why PostgreSQL has so many different kinds of them, and how to use them to speed up queries. The topic could perhaps be covered in fewer words, but we secretly hope for an inquisitive developer who is also interested in the details of the internals, especially since understanding such details lets you not just take someone else's word for it, but draw your own conclusions.

    The development of new index types remains outside the scope of this discussion: it requires knowledge of the C language and is more the domain of a systems programmer than an application developer. For the same reason we will hardly touch on the programming interfaces, focusing only on what matters for working with ready-made indexes.

    In this part, we will discuss the division of responsibilities between the general indexing mechanism, which belongs to the database core, and the individual index access methods, which can be added as extensions in PostgreSQL. In the next part, we will look at the access method interface and such important concepts as operator classes and operator families. After that long but necessary introduction, we will examine in detail the internals and application of the various index types: hash, B-tree, GiST, SP-GiST, GIN and RUM, BRIN, and Bloom.

    Indexes


    Indexes in PostgreSQL are special database objects designed primarily to speed up data access. They are auxiliary structures: any index can be dropped and rebuilt from the information in the table. It is sometimes said that a DBMS can work without indexes, just slowly. However, this is not quite so, since indexes also serve to enforce some integrity constraints.

    PostgreSQL 9.6 currently has six built-in index types, and one more is available as an extension, which became possible thanks to important changes in version 9.6. So new index types can be expected to appear in the near future.

    Despite all the differences between the index types (also called access methods), each of them ultimately establishes a correspondence between a key (for example, the value of an indexed column) and the table rows in which this key occurs. Rows are identified by a TID (tuple id), which consists of the number of a block in the file and the position of the row inside the block. Then, knowing the key or some information about it, we can quickly read the rows that may contain the information we are interested in, without scanning the entire table.

    It is important to understand that, while speeding up data access, an index incurs certain maintenance costs. For any operation on indexed data (inserting, deleting, or updating table rows), the indexes created for that table must be updated too, and within the same transaction. Note that updating table columns on which no indexes are built does not require updating the indexes; this optimization is called HOT (Heap-Only Tuples).
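    The effect of HOT can be observed through the statistics collector. A rough sketch (the table and column names here are made up): if a table has an index on only one of its columns, an update touching only the other column counts as a HOT update:

    postgres=# create table doc(id integer, payload text);
    CREATE TABLE
    postgres=# create index on doc(id);
    CREATE INDEX
    postgres=# insert into doc values (1, 'a');
    INSERT 0 1
    postgres=# update doc set payload = 'b';
    UPDATE 1
    postgres=# select n_tup_upd, n_tup_hot_upd from pg_stat_user_tables
               where relname = 'doc';

    Given enough free space in the page, n_tup_hot_upd should match n_tup_upd (keep in mind that these statistics are updated with a slight delay).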

    Extensibility has certain implications. So that a new access method can be easily integrated into the system, PostgreSQL provides a general indexing mechanism. Its main task is to obtain TIDs from the access method and work with them:

    • reading data from the corresponding table row versions;
    • fetching row versions one TID at a time or a whole set of TIDs at once (by building a bitmap);
    • checking the visibility of row versions for the current transaction, taking the isolation level into account.

    The indexing mechanism takes part in query execution; it is invoked according to the plan built at the optimization stage. The optimizer, while sorting through and evaluating different ways to execute a query, must understand the capabilities of every access method that could potentially be used. Can the access method return data directly in the required order, or must a sort be planned separately? Can the access method be used to search for NULLs? The optimizer solves questions like these all the time.

    Information about an access method is needed not only by the optimizer. When an index is created, the system must decide: can the index be built over several columns? Can it enforce uniqueness?

    So each access method must provide all the necessary information about itself. Prior to version 9.6, the pg_am table was used for this; starting with 9.6, this information moved deeper into special functions. We will get acquainted with this interface a bit later.

    The tasks of the access method itself include everything else:

    • implementing the algorithm for building the index and laying the data out in pages (so that the buffer cache manager can handle any index uniformly);
    • searching the index for a condition of the form "indexed-field operator expression";
    • estimating the cost of using the index;
    • handling the locks required for correct concurrent operation;
    • generating write-ahead log (WAL) records.

    First, we look at the possibilities of a general indexing mechanism, and then move on to consider various access methods.

    Indexing mechanism


    The indexing mechanism allows PostgreSQL to work uniformly with a variety of access methods while taking their capabilities into account.

    Basic Scanning Methods


    Index scan


    The TIDs supplied by an index can be used in different ways. Consider an example: a table with three fields. The first field holds numbers from 1 to 100000, and an index (never mind which type for now) is created on it. The second field holds various printable ASCII characters. Finally, the third field holds a boolean value that is true for about 1% of the rows and false for the rest. Rows are inserted into the table in random order. Let's try to select rows by the condition "a = 1". Note that the condition has the form "indexed-field operator expression", where the operator is "equals" and the expression (the search key) is "1". In most cases, the condition must have this form for an index to be usable.

    postgres=# create table t(a integer, b text, c boolean);
    CREATE TABLE
    postgres=# insert into t(a,b,c)
      select s.id, chr((32+random()*94)::integer), random() < 0.01
      from generate_series(1,100000) as s(id)
      order by random();
    INSERT 0 100000
    postgres=# create index on t(a);
    CREATE INDEX
    postgres=# analyze t;
    ANALYZE





    postgres=# explain (costs off) select * from t where a = 1;
              QUERY PLAN          
    -------------------------------
     Index Scan using t_a_idx on t
       Index Cond: (a = 1)
    (2 rows)

    In this case, the optimizer decided to use an index scan (Index Scan). With an index scan, the access method returns TIDs one at a time until the matching rows run out. The indexing mechanism accesses the table pages these TIDs point to, obtains the row version, checks its visibility according to the multiversion concurrency rules, and returns the resulting data.

    Bitmap Scan


    Index scanning works well when only a few values are involved. However, as the selection grows, so do the chances of having to return to the same table page several times. In that case, the optimizer switches to a bitmap scan: first, the access method returns all the TIDs matching the condition (the Bitmap Index Scan node), and a bitmap of row versions is built from them. Then the row versions are read from the table (Bitmap Heap Scan), with each page being read only once.

    postgres=# explain (costs off) select * from t where a <= 100;
                 QUERY PLAN            
    ------------------------------------
     Bitmap Heap Scan on t
       Recheck Cond: (a <= 100)
       ->  Bitmap Index Scan on t_a_idx
             Index Cond: (a <= 100)
    (4 rows)



    Note that at the second step the condition may be rechecked (Recheck Cond). The selection may be too large for the bitmap of row versions to fit entirely in RAM (it is limited by the work_mem parameter). In this case, only a bitmap of pages containing at least one matching row version is built. Such a "lossy" map takes up less space, but when reading a page, the condition has to be rechecked for each row stored there. Note that even for a small selection (as in our example) the "Recheck Cond" step is still shown in the plan, although it is not actually performed.
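    The switch to a lossy, page-level map can be provoked by lowering work_mem; EXPLAIN ANALYZE then reports it explicitly. A sketch (the sequential scan is switched off purely for demonstration, and the exact figures depend on the data):

    postgres=# set work_mem = '64kB';
    SET
    postgres=# set enable_seqscan = off;
    SET
    postgres=# explain (analyze, costs off) select * from t where a <= 50000;
    ...
    postgres=# reset work_mem;
    RESET
    postgres=# reset enable_seqscan;
    RESET

    With a selection this large, lines such as "Heap Blocks: exact=... lossy=..." and a nonzero "Rows Removed by Index Recheck" counter can be expected in the plan.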

    If conditions are placed on several table fields, and those fields are indexed, a bitmap scan makes it possible (if the optimizer considers this beneficial) to use several indexes at once. A bitmap of row versions is built for each index, and the bitmaps are then ANDed together bitwise (if the expressions are joined by AND) or ORed together (if the expressions are joined by OR). For example, in the plan below, the BitmapAnd node combines two bitmaps with the bitwise "and" operation:

    postgres=# create index on t(b);
    CREATE INDEX
    postgres=# analyze t;
    ANALYZE
    postgres=# explain (costs off) select * from t where a <= 100 and b = 'a';
                        QUERY PLAN                    
    --------------------------------------------------
     Bitmap Heap Scan on t
       Recheck Cond: ((a <= 100) AND (b = 'a'::text))
       ->  BitmapAnd
             ->  Bitmap Index Scan on t_a_idx
                   Index Cond: (a <= 100)
             ->  Bitmap Index Scan on t_b_idx
                   Index Cond: (b = 'a'::text)
    (7 rows)



    Scanning with a bitmap avoids repeated accesses to the same data page. But what if the data in the table pages is physically ordered exactly like the index entries? Of course, one cannot fully rely on the physical order of data in pages: if sorted data is needed, the ORDER BY clause must be specified explicitly in the query. But situations are possible where "almost all" the data is actually ordered: for example, if rows are added in the right order and never change afterwards, or after the CLUSTER command has been run. In such cases building a bitmap is a superfluous step, and a regular index scan is no worse (setting aside the possibility of combining several indexes). Therefore, when choosing an access method, the planner looks at special statistics that reflect the correlation between the physical order of the data and the order of the column values:

    postgres=# select attname, correlation from pg_stats where tablename = 't';
     attname | correlation
    ---------+-------------
     b       |    0.533512
     c       |    0.942365
     a       | -0.00768816
    (3 rows)

    Values whose absolute value is close to one indicate high ordering (as for column c), while values close to zero indicate a random distribution (column a).
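    The link between physical order and this statistic is easy to check on a sorted copy of the table. A sketch (the table name tt is made up):

    postgres=# create table tt as select * from t order by a;
    SELECT 100000
    postgres=# analyze tt;
    ANALYZE
    postgres=# select correlation from pg_stats
               where tablename = 'tt' and attname = 'a';
    ...
    postgres=# drop table tt;
    DROP TABLE

    The correlation of column "a" in the copy should come out close to one.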

    Sequential scan


    To complete the picture, it should be said that if the condition is not selective enough, the optimizer will prefer a sequential scan of the entire table to using an index: And rightly so. Indexes work better the higher the selectivity of the condition, that is, the fewer rows satisfy it. As the selection grows, so does the overhead of reading the index pages.

    postgres=# explain (costs off) select * from t where a <= 40000;
           QUERY PLAN      
    ------------------------
     Seq Scan on t
       Filter: (a <= 40000)
    (2 rows)



    The situation is aggravated by the fact that sequential reading is faster than reading pages in "random" order. This is especially true of hard disks, where the mechanical operation of positioning the head over a track takes much longer than reading the data itself; with SSDs the effect is less pronounced. To account for this difference in access cost there are two parameters, seq_page_cost and random_page_cost, which can be set not only globally but also at the tablespace level, thereby accounting for the characteristics of different disk subsystems.
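    For instance, for a tablespace located on an SSD, random access can be declared almost as cheap as sequential access. A sketch (the tablespace name is hypothetical; 1.1 is a commonly used value for SSDs):

    postgres=# alter tablespace ssd_space set (random_page_cost = 1.1);
    ALTER TABLESPACE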

    Covering Indices


    As a rule, the main task of an access method is to return the identifiers of matching table rows so that the indexing mechanism can read the necessary data from them. But what if the index already contains all the data the query needs? Such an index is called covering, and in this case the optimizer can use an Index Only Scan:

    postgres=# vacuum t;
    VACUUM
    postgres=# explain (costs off) select a from t where a < 100;
                 QUERY PLAN            
    ------------------------------------
     Index Only Scan using t_a_idx on t
       Index Cond: (a < 100)
    (2 rows)

    The name may suggest that the indexing mechanism does not access the table at all, getting everything it needs from the access method alone. But this is not quite so, because indexes in PostgreSQL hold no information about row visibility. Therefore, the access method returns all row versions matching the search condition, whether or not they are visible to the current transaction.

    However, if the indexing mechanism had to look into the table every time to determine visibility, this scan method would be no different from a regular index scan.

    The problem is solved by the visibility map that PostgreSQL maintains for tables, in which the vacuum process marks pages whose data has not changed for long enough to be visible to all transactions regardless of their start time and isolation level. If the identifier of a row returned by the index points to such a page, visibility need not be checked.

    Therefore, regular vacuuming increases the efficiency of covering indexes. Moreover, the optimizer takes into account the number of rows not yet vacuumed away and may decline an index-only scan if it predicts high visibility-checking overhead.

    The number of forced table accesses can be seen with the EXPLAIN ANALYZE command:

    postgres=# explain (analyze, costs off) select a from t where a < 100;
                                      QUERY PLAN                                  
    -------------------------------------------------------------------------------
     Index Only Scan using t_a_idx on t (actual time=0.025..0.036 rows=99 loops=1)
       Index Cond: (a < 100)
       Heap Fetches: 0
     Planning time: 0.092 ms
     Execution time: 0.059 ms
    (5 rows)

    In this case the table did not have to be accessed at all (Heap Fetches: 0), since vacuum had just been run. In general, the closer this number is to zero, the better.
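    Conversely, as soon as rows change, their pages drop out of the visibility map, and heap accesses reappear until the next vacuum. A sketch (wrapped in a transaction that is rolled back so as not to disturb the data; the number of fetches depends on the pages touched):

    postgres=# begin;
    BEGIN
    postgres=# update t set b = b where a < 100;
    UPDATE 99
    postgres=# explain (analyze, costs off) select a from t where a < 100;
    ...
    postgres=# rollback;
    ROLLBACK

    This time Heap Fetches can be expected to be greater than zero.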

    Not all indexes store the indexed values themselves along with the row identifiers. If the access method cannot return the data, it cannot be used for index-only scans.

    Null


    NULLs play an important role in relational databases as a convenient way to represent the fact that a value does not exist or is unknown.

    But a special value demands special treatment. Ordinary two-valued Boolean logic turns into three-valued logic; it is unclear whether NULL should be considered smaller or larger than regular values (hence the special NULLS FIRST and NULLS LAST clauses for sorting); it is not obvious whether aggregate functions should take NULLs into account; the planner needs special statistics for them...

    From the standpoint of index support, NULLs are ambiguous too: should they be indexed or not? If NULLs are not indexed, the index can be more compact. But if they are, the index becomes usable for conditions of the form "indexed-field IS [NOT] NULL", and also as a covering index when the query places no conditions on the table at all (since in that case the index must return the data of all table rows, including those with NULLs).

    For each access method, its developers decide for themselves whether to index NULLs or not. But as a rule, NULLs do get indexed.
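    For example, btree (the type of our index on "a") does index NULLs, so the index can serve IS NULL conditions; since the column contains no NULLs at all, the condition is highly selective, and the plan should look something like this:

    postgres=# explain (costs off) select * from t where a is null;
              QUERY PLAN
    -------------------------------
     Index Scan using t_a_idx on t
       Index Cond: (a IS NULL)
    (2 rows)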

    Multicolumn indexes


    Conditions on several fields can be supported by multicolumn indexes. For example, we could create an index on two fields of our table: The optimizer will most likely prefer such an index to a combination of bitmaps, since here we get the needed TIDs right away, without any auxiliary steps: A multicolumn index can also speed up selections with conditions on only some of the fields, starting from the first one: As a rule, if no condition is placed on the first field, the index will not be used. But in some cases the optimizer may still consider it more profitable than a sequential scan. We will return to this topic in more detail when we look at btree indexes.

    postgres=# create index on t(a,b);
    CREATE INDEX
    postgres=# analyze t;
    ANALYZE



    postgres=# explain (costs off) select * from t where a <= 100 and b = 'a';
                       QUERY PLAN                  
    ------------------------------------------------
     Index Scan using t_a_b_idx on t
       Index Cond: ((a <= 100) AND (b = 'a'::text))
    (2 rows)



    postgres=# explain (costs off) select * from t where a <= 100;
                  QUERY PLAN              
    --------------------------------------
     Bitmap Heap Scan on t
       Recheck Cond: (a <= 100)
       ->  Bitmap Index Scan on t_a_b_idx
             Index Cond: (a <= 100)
    (4 rows)



    Not all access methods support the creation of indexes across multiple columns.

    Expression indexes


    We have said that the search condition must have the form "indexed-field operator expression". In the example below, the index is not used because the left-hand side contains an expression over the field rather than the field name itself: This particular query is easy to rewrite so that only the field name stands to the left of the operator. But when that is impossible, expression indexes (functional indexes) come to the rescue: An expression index is created not on a table field but on an arbitrary expression, and the optimizer will consider such an index for conditions of the form "indexed-expression operator expression":

    postgres=# explain (costs off) select * from t where lower(b) = 'a';
                    QUERY PLAN                
    ------------------------------------------
     Seq Scan on t
       Filter: (lower((b)::text) = 'a'::text)
    (2 rows)



    postgres=# create index on t(lower(b));
    CREATE INDEX
    postgres=# analyze t;
    ANALYZE
    postgres=# explain (costs off) select * from t where lower(b) = 'a';
                         QUERY PLAN                    
    ----------------------------------------------------
     Bitmap Heap Scan on t
       Recheck Cond: (lower((b)::text) = 'a'::text)
       ->  Bitmap Index Scan on t_lower_idx
             Index Cond: (lower((b)::text) = 'a'::text)
    (4 rows)

    ". If the calculation of the indexed expression is a costly operation, then updating the index will require significant computational resources.

    It should also be kept in mind that separate statistics are gathered for the indexed expression. They can be seen in the pg_stats view under the index name: If necessary, the number of histogram buckets can be controlled just as for regular table columns (bearing in mind that the column name may vary depending on the indexed expression):

    postgres=# \d t
           Table "public.t"
     Column |  Type   | Modifiers
    --------+---------+-----------
     a      | integer |
     b      | text    |
     c      | boolean |
    Indexes:
        "t_a_b_idx" btree (a, b)
        "t_a_idx" btree (a)
        "t_b_idx" btree (b)
        "t_lower_idx" btree (lower(b))

    postgres=# select * from pg_stats where tablename = 't_lower_idx';
    ...



    postgres=# \d t_lower_idx
     Index "public.t_lower_idx"
     Column | Type | Definition
    --------+------+------------
     lower  | text | lower(b)
    btree, for table "public.t"

    postgres=# alter index t_lower_idx alter column "lower" set statistics 69;
    ALTER INDEX

    Partial indexes


    Sometimes there is a need to index only part of the table rows. This is usually connected with a highly skewed distribution: it makes sense to look up a rare value via an index, while a frequent value is found just as easily by a full table scan.

    Of course, a regular index can be built on the "c" column, and it will work as expected: In this case the index occupies 276 pages: But since the "c" column is true for only one percent of the rows, 99% of the index is simply never used. Instead, a partial index can be built: The size of such an index shrinks to 5 pages: In some cases the difference in size and performance can be very significant.

    postgres=# create index on t(c);
    CREATE INDEX
    postgres=# analyze t;
    ANALYZE
    postgres=# explain (costs off) select * from t where c;
              QUERY PLAN          
    -------------------------------
     Index Scan using t_c_idx on t
       Index Cond: (c = true)
       Filter: c
    (3 rows)

    postgres=# explain (costs off) select * from t where not c;
        QUERY PLAN    
    -------------------
     Seq Scan on t
       Filter: (NOT c)
    (2 rows)



    postgres=# select relpages from pg_class where relname='t_c_idx';
     relpages
    ----------
          276
    (1 row)



    postgres=# create index on t(c) where c;
    CREATE INDEX
    postgres=# analyze t;
    ANALYZE



    postgres=# select relpages from pg_class where relname='t_c_idx1';
     relpages
    ----------
            5
    (1 row)
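    Besides saving space, partial indexes have another typical application: a unique index with a predicate enforces uniqueness only among the rows that satisfy it. A sketch with a made-up table:

    postgres=# create table orders(id integer, cancelled boolean);
    CREATE TABLE
    postgres=# create unique index on orders(id) where not cancelled;
    CREATE INDEX

    Here several cancelled orders may share an id, but at most one active order can have it.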



    Sorting


    If the access method returns row identifiers in sort order, this gives the optimizer additional options for executing the query.

    The table can be scanned and the data then sorted: Or the data can be read from the index, already in sorted order: Of all the access methods, only btree can return data in sorted form, so we will postpone a more detailed discussion until we examine this index type.

    postgres=# set enable_indexscan=off;
    SET
    postgres=# explain (costs off) select * from t order by a;
         QUERY PLAN      
    ---------------------
     Sort
       Sort Key: a
       ->  Seq Scan on t
    (3 rows)



    postgres=# set enable_indexscan=on;
    SET
    postgres=# explain (costs off) select * from t order by a;
              QUERY PLAN          
    -------------------------------
     Index Scan using t_a_idx on t
    (1 row)
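    A btree can also be scanned in reverse, so the same index serves descending order without a separate sort:

    postgres=# explain (costs off) select * from t order by a desc;
                   QUERY PLAN
    ----------------------------------------
     Index Scan Backward using t_a_idx on t
    (1 row)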



    Concurrent building


    Normally, building an index requires a SHARE lock on the table. This lock allows data to be read from the table but prohibits any changes while the index is being built.

    This can be verified by running the following query in another session while an index is being created on, say, table t: If the table is large enough and actively used for inserts, updates, or deletes, this may prove unacceptable: modifying sessions would wait a long time for the lock to be released. In that case, the index can be built concurrently:

    postgres=# select mode, granted from pg_locks where relation = 't'::regclass;
       mode    | granted
    -----------+---------
     ShareLock | t
    (1 row)





    postgres=# create index concurrently on t(a);
    CREATE INDEX

    This form of the command takes a SHARE UPDATE EXCLUSIVE lock, which allows both reading and modifying the data (only changing the table structure is prohibited, as well as simultaneous vacuuming, analyzing, or building another index on the same table).

    However, there is a flip side. First, the index will be built more slowly than usual, since two passes over the table are made instead of one, and the command must also wait for the completion of concurrent transactions that modify the data.

    Second, during a concurrent index build a deadlock or a unique-constraint violation can occur. The index is still created in that case, but in an unusable state; it must then be dropped and recreated. Unusable indexes are marked with the word INVALID in the output of psql's \d command, and the full list can be obtained with the query:

    postgres=# select indexrelid::regclass index_name, indrelid::regclass table_name from pg_index where not indisvalid;
     index_name | table_name
    ------------+------------
     t_a_idx    | t
    (1 row)
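    In version 9.6, REINDEX has no concurrent mode, so the only way to repair such an index without a long lock is to drop it and build it anew:

    postgres=# drop index concurrently t_a_idx;
    DROP INDEX
    postgres=# create index concurrently t_a_idx on t(a);
    CREATE INDEX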

