Igor_Le April 17, 2018 at 15:47

What froze on feature freeze

On April 8, the committeefest 2018-03 ended . Those patches that are not on it (and on the 3 previous commit fests) will no longer fall into the PostgreSQL 11 release: the feature freeze has occurred. Time to take stock.

The main news of the latest committeefest (and version 11, respectively):

weighty set of patches for sectioning.
Only one patch is dedicated to JIT compilation, but this is a step in a direction that is likely to develop intensively in the future.
“Covering” indexes (INCLUDE indexes). This topic is already actively discussed and continues in development.
A series of patches in a group of procedural languages. They are important, including for compatibility with SQL standards and migration with Oracle.
Interesting, but not so resonant patches.

Let's start in random order.

Jit compilation

JIT compiling expressions & tuple deforming
Without JIT, query execution is an interpretation of the execution plan. With JIT, that is, with just-in-time compilation, or on the fly, the individual parts of the request are compiled, and due to this, the request is faster. A typical example is JIT compilation of expressions in SQL queries. This patch , written by Andres Freund from EnterpriseDB, also has a psychological component: the LLVM compiler is now used for JIT- it is even called a "compiler infrastructure." It is often used specifically for JIT, it is convenient, allows you to embed functions in code (inline), optimizes the code and it is quite universal from the point of view of target platforms. When deforming records (expanding lines in memory), LLVM is also used, and this also improves performance.

INCLUDE indices

Covering B-tree indexes (aka INCLUDE)
INCLUDE indices are a big patch of Russian traversal, it began to be torn apart 2 years ago by Anastasia Lubennikova and continued by Alexander Korotkov and Fedor Sigaev(all of them are from Postgres Professional - well, yes, we are not indifferent to the domestic contribution to the community). INCLUDE indexes are sometimes called covering indexes, but, strictly speaking, a covering index for a particular query is an index that contains all the table columns needed in the query. And the meaning of INCLUDE indexes is that the index can become covering not by including additional columns in it, but by storing additional information (non-indexed values). For example, this way you can expand a unique index, which will remain unique.

CREATE UNIQUE INDEX newidx ON newt USING btree (c1, c2) INCLUDING (c3, c4);

They allow you to increase productivity, since index only scan is usually much faster, and compared to other methods, switching to index only scan is less cumbersome: you can get by with one index where 2 or more were needed, which means less time and resources are spent on updating indexes on insert and update. The Covering B-tree indexes (aka INCLUDE) patch is also the first step because support for covering indexes for their other types will follow: work has already begun, for example, on supporting GiST.

Examples are on Alexander Alekseev’s blog (Postgres Pro) and Alexey Lesovsky’s (Data Egret) blog .

New in Procedural Languages (Eisentraoute)

SQL procedures,
Transaction control in procedures,
PL / pgSQL nested CALL with transactions,
SET TRANSACTION in PL / pgSQL ,
INOUT parameters in procedures
These patches by Peter Eizentraut from 2ndQuadrant make PostgreSQL procedural languages literally procedural: now there will be full-fledged functions stored procedures. The SQL procedures patch was adopted late last year. Since then, the interpreter has understood the syntax with the CREATE / ALTER / DROP PROCEDURE commands, calling the CALL procedure, as well as ROUTINE. In January, the most valuable was added - transaction management in procedures: Transaction control in procedures. And here is this patchINOUT allows you to create procedures in this way:

CREATE PROCEDURE foo(INOUT a int)
LANGUAGE plpgsql
AS $$
begin
SELECT 1 into a;
end;$$;

Previously, SET TRANSACTION was only possible in SQL, but not in plpgsql. With the patch, this is already possible, and this is also a step towards migrating with Oracle. Nested function calls (and DOs) with transactions are also now possible .

Partitioning

The NTT team, Amit Langote, developers from 2ndQuadrant and others, worked on this large series of patches. Back in November last year, a patch was added that added hash partitioning. Now all the main types of sectioning in PostgreSQL are.

But the most important news is different: a series of patches makes it possible to create unique indexes, PRIMARY KEY globally on the entire partitioned table (you can find some information about it here, for example: unique indexes on partitioned tables . Therefore, you can also create tables that reference the partitioned table: foreign keys and partitioned tables . It will be possible to update a partitioned table without thinking about whether the record would remain in the same section (version 10 would give an error):UPDATE of partition key . Appeared ON CONFLICT DO UPDATE for partitioned tables.

That is, you can handle a partitioned table almost like with a regular one. Almost - because the features will only work if the fields that make up the partition key are included in the unique index. But this is a huge step anyway.

As for Add support for tuple routing to foreign partitions , this patch, which automatically directs inserted records to external sections, is important also for those who will create systems with sharding based on new partitioning capabilities.

The most important skill of the optimizer is to effectively exclude from the plan sections in which there is obviously no data (patch faster partition pruning in planner) In the case when there are a lot of sections (and in real projects there are thousands, or even tens of thousands), the exclusion of sections (pruning) can seriously reduce the query execution time. It will be possible to exclude unwanted ones at the execution stage ( Runtime Partition Pruning patch ), when the condition of getting into one or another section is not known in advance. This happens, for example, in queries with subqueries.

Partition-wise join for declarative partitioned tables is an implementation of the algorithms for joining two partitioned tables. The connection takes place separately in sections, and then comes together. In many cases, this is faster than joining parent tables. Similarly with Partition-wise aggregation / groupingaggregation covers first individual sections, then the results are collected.

Among the amenities, a Default Partition for Range section will appear , where all records that go beyond the boundaries of the specified sections would fall in order not to stop each time due to an error. Automatic creation of sections for data whose range is not known in advance, is not even planned yet (the pg_pathman extension can do this ).

JSON (B)

In this direction, efforts have been made for more than 2 years by the Postgres Pro development team. Since patches are heavy and affect many postgres mechanisms, they are accepted by the community leisurely. PostgreSQL 11 includes 3 patches: Anton Bykov's
Jsonb transform for pl / perl and Anton Bykov's
Jsonb transform for pl / python (effective transformations of binary JSON when passing them to Perl and Python functions) and Anastasia Lubennikova's Cast jsonb to numeric, int, float, bool ( type conversion). But key patches like SQL / JSON support in PostgreSQL, or SQL / JSON: jsonpath , or SQL / JSON: functions are still waiting. But this is support for the SQL / JSON standard.

In the context of JSON, we can also mention the Konstantin Knizhnik patch , useful for surjective functions working with JSON, for example, of the form (info - >> 'name'). but may be useful for other purposes.

Parallel Gather and sorting when creating B-tree indexes

Gather speed-up works more efficiently with queues in memory, speeds up requests, especially simple ones.

Parallel tuplesort (for parallel B-Tree index creation) . This is another January patch - parallel sorting of records for B-tree indexes.

index-only count (*) for indexes supporting bitmap scans ( A. Kuzmenkov , Postgres Professional) was adopted at the end of last year. Queries of the form SELECT (*) ... WHERE ..., where the information needed for the query in the expression is contained in the indexes, can now be significantly accelerated.

Vacuum

Not a very fundamental change, but still: now you can run VACUUM of several tables with one command: Allow users to specify multiple tables in VACUUM commands . The patch was adopted at the end of last year. At the same time, the most important patches regarding the priorities of the evacuation of various tables, while evacuation schedules are waiting.

Logical replication

She didn’t make much progress in vanilla PostgreSQL. Added support for TRUNCATE: Logical decoding of TRUNCATE

Checksums

Verify Checksums during Basebackups . Now you can check the checksums in the backup process (if checksums are included).

Contribution to version 11 of domestic developers is significant. But this is the topic of another story. In the meantime, thanks to all the developers (and reviewers) of the upcoming release!

[photo of the author. in the photo for freeze, the hero of the film "Leviathan" - Kirovsk, Kola Peninsula.]

Tags: