PostgreSQL Indexes - 2


    Interface


    In the first part, we talked about the fact that the access method should provide information about yourself. Let's see how this interface works.

    The properties


    All properties of access methods are presented in the pg_am table (am - access method). From this table you can also get the list of available methods: Although sequential scanning can rightfully be attributed to access methods, it has historically turned out to be absent from this list. In PostgreSQL 9.5 and later, each property was represented by a separate field in the pg_am table. Starting with version 9.6, properties are polled by special functions and divided into several levels:

    postgres=# select amname from pg_am;
     amname
    --------
     btree
     hash
     gist
     gin
     spgist
     brin
    (6 rows)





    • access method properties - pg_indexam_has_property,
    • properties of a specific index - pg_index_has_property,
    • The properties of individual index columns are pg_index_column_has_property.

    The separation of access method and index levels is done with an eye to the future: at present, all indexes created based on the same access method will always have the same properties.


    The properties of the access method include the following four (using btree as an example):

    postgres=# select a.amname, p.name, pg_indexam_has_property(a.oid,p.name)
    from pg_am a,
    unnest(array['can_order','can_unique','can_multi_col','can_exclude']) p(name)
    where a.amname = 'btree' order by a.amname;
     amname |     name      | pg_indexam_has_property
    --------+---------------+-------------------------
     btree  | can_order     | t
     btree  | can_unique    | t
     btree  | can_multi_col | t
     btree  | can_exclude   | t
    (4 rows)


    • can_order
      The access method allows you to specify the sort order of values ​​when creating the index (currently applicable only for btree);
    • can_unique
      Uniqueness and primary key constraint support (applicable for btree only);
    • can_multi_col An
      index can be built on multiple columns;
    • can_exclude
      Support for EXCLUDE exception restriction.

    Properties related to the index (take for example an existing one):

    postgres=# select p.name, pg_index_has_property('t_a_idx'::regclass,p.name)
    from unnest(array['clusterable','index_scan','bitmap_scan','backward_scan']) p(name);
         name      | pg_index_has_property
    ---------------+-----------------------
     clusterable   | t
     index_scan    | t
     bitmap_scan   | t
     backward_scan | t
    (4 rows)


    • clusterable The
      ability to reorder table rows according to a given index (clustering by the CLUSTER command of the same name);
    • index_scan
      Support for index scanning. This property may seem strange, but not all indexes can produce TIDs one at a time - some give all the results at once and only support bitmap scanning;
    • bitmap_scan
      Support for bitmap scanning;
    • backward_scan
      Returns the result in the reverse order of the index creation.

    Finally, the column properties:

    postgres=# select p.name, pg_index_column_has_property('t_a_idx'::regclass,1,p.name)
    from unnest(array['asc','desc','nulls_first','nulls_last','orderable','distance_orderable','returnable','search_array','search_nulls']) p(name);
            name        | pg_index_column_has_property
    --------------------+------------------------------
     asc                | t
     desc               | f
     nulls_first        | f
     nulls_last         | t
     orderable          | t
     distance_orderable | f
     returnable         | t
     search_array       | t
     search_nulls       | t
    (9 rows)


    • asc, desc, nulls_first, nulls_last, orderable
      These properties are related to the ordering of values ​​(we will talk about them when we get to btree indexes);
    • distance_orderable
      Displays the result in sort order by operation (currently applicable only for gist and rum indices);
    • returnable The
      ability to use the index without accessing the table, that is, support exclusively for index access;
    • search_array
      Support for searching multiple values ​​for the construction " index-field IN ( constant_list )" or, what is the same, " index-field = ANY ( constant_array )";
    • search_nulls The
      ability to search by the conditions is null and is not null.

    Some of the properties we have already discussed in detail earlier. Some properties are currently implemented only by one method. We will consider such possibilities when we talk about this particular method.

    Classes and Families of Operators


    In addition to the set of “skills”, one must also know what types of data and with which operators the access method works. For this, PostgreSQL has the concepts of an operator class and an operator family .

    The operator class contains a minimal set of operators (and possibly auxiliary functions) for the index to work with some data type.

    A class is always part of a family of operators. Moreover, several classes may be included in one common family if they have the same semantics. For example, the integer_ops family includes the classes int8_ops, int4_ops, and int2_ops for different types of bigint, integer, and smallint that are identical in meaning:

    postgres=# select opfname, opcname, opcintype::regtype
    from pg_opclass opc, pg_opfamily opf
    where opf.opfname = 'integer_ops'
    and opc.opcfamily = opf.oid
    and opf.opfmethod = (select oid from pg_am where amname = 'btree');
       opfname   | opcname  | opcintype
    -------------+----------+-----------
     integer_ops | int2_ops | smallint
     integer_ops | int4_ops | integer
     integer_ops | int8_ops | bigint
    (3 rows)

    Another example: the datetime_ops family includes operator classes for working with dates (both without time and with time): The family can also include additional operators for comparing values ​​of different types. By grouping in a family, the scheduler can use the index for predicates with values ​​of different types. The family may also contain other auxiliary functions. In most cases, families and classes of operators do not need to know anything. Usually we just create an index, and it uses some class of operators by default. However, you can specify the operator class explicitly. A simple example when this is necessary: ​​in a database with a sort rule other than C, a regular text field index does not support the LIKE operation:

    postgres=# select opfname, opcname, opcintype::regtype
    from pg_opclass opc, pg_opfamily opf
    where opf.opfname = 'datetime_ops'
    and opc.opcfamily = opf.oid
    and opf.opfmethod = (select oid from pg_am where amname = 'btree');
       opfname    |     opcname     |          opcintype          
    --------------+-----------------+-----------------------------
     datetime_ops | date_ops        | date
     datetime_ops | timestamptz_ops | timestamp with time zone
     datetime_ops | timestamp_ops   | timestamp without time zone
    (3 rows)







    postgres=# show lc_collate;
     lc_collate
    -------------
     en_US.UTF-8
    (1 row)
    postgres=# explain (costs off) select * from t where b like 'A%';
             QUERY PLAN          
    -----------------------------
     Seq Scan on t
       Filter: (b ~~ 'A%'::text)
    (2 rows)

    This limitation can be overcome by creating an index with the operator class text_pattern_ops (note how the condition in the plan has changed):

    postgres=# create index on t(b text_pattern_ops);
    CREATE INDEX
    postgres=# explain (costs off) select * from t where b like 'A%';
                               QUERY PLAN                          
    ----------------------------------------------------------------
     Bitmap Heap Scan on t
       Filter: (b ~~ 'A%'::text)
       ->  Bitmap Index Scan on t_b_idx1
             Index Cond: ((b ~>=~ 'A'::text) AND (b ~<~ 'B'::text))
    (4 rows)

    System catalog


    To conclude this part, we give a small diagram of the system catalog tables related directly to classes and families of operators.



    All these tables, of course, are described in detail .

    Using the system catalog, you can find the answer to a number of questions without even looking at the documentation. For example, what types of data can this or that access method work with? What operators are included in the class (and, therefore, the index can be used for access by a condition that includes such an operator)? To be continued .

    postgres=# select opcname, opcintype::regtype
    from pg_opclass
    where opcmethod = (select oid from pg_am where amname = 'btree')
    order by opcintype::regtype::text;
           opcname       |          opcintype          
    ---------------------+-----------------------------
     abstime_ops         | abstime
     array_ops           | anyarray
     enum_ops            | anyenum
    ...



    postgres=# select amop.amopopr::regoperator
    from pg_opclass opc, pg_opfamily opf, pg_am am, pg_amop amop
    where opc.opcname = 'array_ops'
    and opf.oid = opc.opcfamily
    and am.oid = opf.opfmethod
    and amop.amopfamily = opc.opcfamily
    and am.amname = 'btree'
    and amop.amoplefttype = opc.opcintype;
            amopopr        
    -----------------------
     <(anyarray,anyarray)
     <=(anyarray,anyarray)
     =(anyarray,anyarray)
     >=(anyarray,anyarray)
     >(anyarray,anyarray)
    (5 rows)


    Also popular now: