afiskon April 9, 2018 at 17:50

What's New in PostgreSQL 11: INCLUDE Indexes

The release of PostgreSQL 11 will not take place soon, only in October. But featurefreeze has already arrived, which means we know what features were included in this release, and we can test them by collecting PostgreSQL from the master branch. Of particular note is a feature called INCLUDE indexes . The patch was originally written by Anastasia Lubennikova , and then finished by Alexander Korotkov and Fedor Sigaev . Pushing it into PostgreSQL took “only” something about three years.

Let's try to figure out what kind of indices these are. To begin, create a plate for the experiments:

create table test (k serial primary key, v text, ts timestamp);
insert into test (v, ts) select 'key_' || s , now() from generate_series(1, 10000) as s;

... and build a regular btree index on it:

create index on test (v);

Take a look at the execution plan for the following query:

=# explain select v, ts from test where v > 'key_1337' and v < 'key_2337';
                                 QUERY PLAN                                  
-----------------------------------------------------------------------------
 Bitmap Heap Scan on test  (cost=31.57..112.09 rows=1101 width=16)
   Recheck Cond: ((v > 'key_1337'::text) AND (v < 'key_2337'::text))
   ->  Bitmap Index Scan on test_v_idx  (cost=0.00..31.29 rows=1101 width=0)
         Index Cond: ((v > 'key_1337'::text) AND (v < 'key_2337'::text))
(4 rows)

Watch what happens. Since the index is built on column v, and in the query we select v and ts, PostgreSQL is forced to execute the query in two steps. First, it goes through the index and finds rows that satisfy the condition. Then he has to go to the table to get ts.

The idea of INCLUDE indexes is to include all the data needed to complete the query directly in the index (but not index it). Thus, the query becomes possible to execute in one index scan.

Let's check:

drop index test_v_idx;
create index on test (v) include (ts);
explain select v, ts from test where v > 'key_1337' and v < 'key_2337';

Result:

 Index Only Scan using test_v_ts_idx on test  (cost=0.29..46.30 rows=1101 width=16)
   Index Cond: ((v > 'key_1337'::text) AND (v < 'key_2337'::text))
(2 rows)

Due to the fact that now we do not go to the table, the query should work faster. However, it is worth noting that in practice it all depends on your data. Each case is unique, so I deliberately do not bring here any synthetic benchmarks . It may turn out that on your data volumes, index only scan with include indexes works as fast as with regular indexes. And even the accumulated statistics tells PostgreSQL that the query is faster to make a heap scan. This can happen, for example, if the selectivity of your request is low.

Anyway, knowing about this feature is useful, and I am sincerely glad that it will appear in PostgreSQL 11.

Tags:

What's New in PostgreSQL 11: INCLUDE Indexes

Also popular now: