Hide and seek with the optimizer. Game over, this is CTE PostgreSQL 12



    This article is a continuation of the story about new in PostgreSQL 12. We already analyzed SQL / JSON (JSONPath patch) in the article “What they froze on feature freeze 2019. Part I. JSONPath” , now it's CTE's turn.

    Cte


    CTE is Common Table Expression - common table expressions, they are also called WITH constructs. In fact, this is the creation of temporary tables, but existing only for one query, and not for the session. They can be accessed inside this request. Such a request is well read, it is understandable, it is easy to modify it if necessary. This is a very popular thing, and it has been in PostgreSQL for a long time.

    But amenities can be expensive. The problems are related to the materialization of the expression after AS inside the WITH ... AS () construct. It is also called an internal expression and is calculated before you start calculating the rest; it cannot be embedded in a top-level query (no inlining). Planning for this expression does not take into account the rest of the request. This behavior is called a barrier to optimization, or fencing. In addition, materialization itself requires work_mem. And if the sample is large, then the problems begin (for example, there is a report by Ivan Frolkov at PGConf 2019).

    Hide and seek with the optimizer, which we will analyze below, is not a bug in general, but a feature. Of course, there are situations when preliminary calculation of a part of an expression eliminates, say, unnecessary repeated operations in recursive queries. On the other hand, many developers used CTE as a view without thinking about the barrier, and as a result, requests from CTE were executed not only more slowly than equivalent (but more sophisticated) requests with subqueries, but slower by orders of magnitude. After weighing the pros and cons, the community took a decisive step: it changed the default behavior.

    We will observe the work of CTE on such a plate:

    CREATE TABLE xytable AS SELECT x, x AS y FROM generate_series(1,10000000) AS x;
    CREATE INDEX ON xytable(x,y);

    Table "public.xytable"
        Column    |  Type   |     Collation    |    Nullable    | Default
    --------------+---------+------------------+----------------+---------
    x             | integer |                  |                |
    y             | integer |                  |                |
    Indexes:
    "xytable_x_y_idx" btree (x, y)

    Let's start with a simple request:

    SELECT * FROM xytable WHERE x=2 AND y>1;
                                                              QUERY PLAN                                                          
    -----------------------------------------------------------------------------
     Index Only Scan using xytable_x_y_idx on xytable  (cost=0.43..8.46 rows=1 width=8) (actual time=0.016..0.017 rows=1 loops=1)
          Index Cond: ((x = 2) AND (y > 1))
          Heap Fetches: 1
     Planning Time: 0.075 ms
     Execution Time: 0.035 ms
    (5 rows)

    Everything is considered instantly, only the index is used.

    A query with a subquery that computes the same, but with a syntax a little more complicated:

    SELECT * FROM
        (SELECT * FROM xytable WHERE y>1) AS t WHERE x=2;
                                                              QUERY PLAN                                                          
    ---------------------------------------------------------------------------------
     Index Only Scan using xytable_x_y_idx on xytable  (cost=0.43..8.46 rows=1 width=8) (actual time=0.016..0.016 rows=1 loops=1)
       Index Cond: ((x = 2) AND (y > 1))
       Heap Fetches: 1
     Planning Time: 0.062 ms
     Execution Time: 0.029 ms
    (5 rows)

    Everything is in order, very fast index calculation.

    And now one more logically equivalent request, but with CTE:

    WITH yy AS (
         SELECT * FROM xytable WHERE y>1)
    SELECT * FROM yy WHERE x=2;
    QUERY PLAN
    ------------------------------------------
    CTE Scan on yy (actual time=0.099..3672.842 rows=1 loops=1)
       Filter: (x = 2)
       Rows Removed by Filter: 9999998
       CTE yy
          -> Seq Scan on cte (actual time=0.097..1355.367 rows=9999999 loops=1)
              Filter: (y > 1)
              Rows Removed by Filter: 1
     Planning Time: 0.088 ms
     Execution Time: 3735.986 ms
    (9 rows)

    Such a delay is already visible to the naked eye. You won’t drink coffee, but there’s enough time to look into the mail (when we have the 11th version or earlier).

    And here’s what happened: in the case of subqueries, the optimizer immediately realized that the conditions x = 2 and y> 1 can be combined into one filter and searched by index. In the case of CTE, the optimizer has no choice: it must first deal with the condition inside the WITH ... AS construct, materialize the result, and only then work on.

    And here the point is not that materialization will require resources: if the condition is y <3, then not millions of records will have to be materialized, but only 2. Monstrous time for a simple query is spent on sequential search, the optimizer cannot use index search because so that the composite index is built on x, and only then on y, and he won’t know anything about the query with condition x = 2 until he fulfills the internal CTE condition. It is beyond the barrier.

    So, prior to PostgreSQL 12, the default was materialization, now its absence. We launch the same request based on the new version. Barrier, as it were, the optimizer immediately sees the entire request:

    WITH yy AS (
         SELECT * FROM xytable WHERE y>1)
    SELECT * FROM yy
    WHERE x=2;

    QUERY PLAN
    ------------------------------------------
    Index Only Scan using xytable_x_y_idx1 on xytable  (cost=0.43..8.46 rows=1 width=8) (actual time=0.015..0.016 rows=1 loops=1)
           Index Cond: ((x = 2) AND (y > 1))
           Heap Fetches: 1
     Planning Time: 0.067 ms
     Execution Time: 0.029 ms
    (5 rows)

    The optimizer instantly learned to combine the conditions in the optimal order - as was the case with the subqueries.

    But the defaults are defaults, and for complete ownership of the situation now, in version 12 there is a controlled, controlled materialization of CTE:

    WITH cte_name
    AS [NOT] MATERIALIZED

    Let's materialize:

    EXPLAIN ANALYZE  WITH yy AS MATERIALIZED (
         SELECT * FROM xytable WHERE y>1)
    SELECT * FROM yy WHERE x=2;

    QUERY PLAN
    ---------------------------
     CTE Scan on yy  (cost=356423.68..581401.19 rows=49995 width=8) (actual time=661.038..3603.292 rows=1 loops=1)
       Filter: (x = 2)
       Rows Removed by Filter: 9999998
       CTE yy
         ->  Bitmap Heap Scan on cte  (cost=187188.18..356423.68 rows=9999000 width=8) (actual time=661.032..2102.040 rows=9999999 loops=1)
               Recheck Cond: (y > 1)
               Heap Blocks: exact=44248
               ->  Bitmap Index Scan on xytable_x_y_idx1  (cost=0.00..184688.43 rows=9999000 width=0) (actual time=655.519..655.519 rows=9999999 loops=1)
                     Index Cond: (y > 1)
     Planning Time: 0.086 ms
     Execution Time: 3612.840 ms
    (11 rows)

    Everything is as in 11 and before it, you can look at mail in the standby mode of query results. We prohibit materialization, remove the barrier:

    EXPLAIN ANALYZE  WITH yy AS NOT MATERIALIZED (
         SELECT * FROM xytable WHERE y>1)
    SELECT * FROM yy WHERE x=2;
    QUERY PLAN
    ---------------------------
     Index Only Scan using xytable_x_y_idx1 on xytable  (cost=0.43..8.46 rows=1 width=8) (actual time=0.070..0.072 rows=1 loops=1)
       Index Cond: ((x = 2) AND (y > 1))
       Heap Fetches: 1
     Planning Time: 0.182 ms
     Execution Time: 0.108 ms
    (5 rows)

    Again, no respite: it counts instantly.
    Remained nuances. But important nuances.

    CTE materializes by default if it is accessed more than once.


    At first glance, materialization in such cases is a reasonable decision: why calculate the same thing twice. In practice, this often leads to what we observed above. To force refusal from materialization, it is necessary to explicitly order the optimizer: NOT MATERIALIZED.

    We execute without NOT MATERIALIZED request with double WHERE:

    WITH yy AS ( 
         SELECT * FROM xytable WHERE y > 1)
    SELECT (
         SELECT count(*) FROM yy WHERE x=2), (
         SELECT count(*) FROM yy WHERE x=2);

    QUERY PLAN
    ---------------------------------------------------------------------------
    Result (actual time=3922.274..3922.275 rows=1 loops=1)
       CTE yy
          -> Seq Scan on xytable (actual time=0.023..1295.262 rows=9999999 loops=1)
              Filter: (y > 1)
              Rows Removed by Filter: 1
       InitPlan 2 (returns $1)
          -> Aggregate (actual time=3109.687..3109.687 rows=1 loops=1)
              -> CTE Scan on yy (actual time=0.027..3109.682 rows=1 loops=1)
                  Filter: (x = 2)
                  Rows Removed by Filter: 9999998
       InitPlan 3 (returns $2)
          -> Aggregate (actual time=812.580..812.580 rows=1 loops=1)
              -> CTE Scan on yy yy_1 (actual time=0.016..812.575 rows=1 loops=1)
                   Filter: (x = 2)
                   Rows Removed by Filter: 9999998
       Planning Time: 0.136 ms
       Execution Time: 3939.848 ms
    (17 rows)

    And now we’ll explicitly write a ban on materialization:

    WITH yy AS NOT MATERIALIZED (
         SELECT * FROM xytable WHERE y > 1)
    SELECT (
         SELECT count(*) FROM yy WHERE x=2), (
         SELECT count(*) FROM yy WHERE x=2);

    QUERY PLAN
    ---------------------------------------------------------------------------
    Result (actual time=0.035..0.035 rows=1 loops=1)
       InitPlan 1 (returns $0)
         -> Aggregate (actual time=0.024..0.024 rows=1 loops=1)
             -> Index Only Scan using xytable_x_y_idx on xytable (actual time=0.019..0.020 rows=1 loops=1)
                 Index Cond: ((x = 2) AND (y > 1))
                 Heap Fetches: 1
       InitPlan 2 (returns $1)
         -> Aggregate (actual time=0.006..0.006 rows=1 loops=1)
            -> Index Only Scan using xytable_x_y_idx on xytable cte_1 (actual time=0.004..0.005 rows=1 loops=1)
                Index Cond: ((x = 2) AND (y > 1))
                Heap Fetches: 1
       Planning Time: 0.253 ms
       Execution Time: 0.075 ms
    (13 rows)

    Writing CTEs are always executed, and CTEs that are not referenced never.


    This is evident from the plan: not_executed is not in it. This is true for previous versions, but it is worth remembering that, and the (NOT) MATERIALIZED construct applies to the executable expression in version 12.

    EXPLAIN (COSTS OFF) WITH yy AS (
         SELECT * FROM xytable WHERE y > 1),
    not_executed AS (
         SELECT * FROM xytable),
    always_executed AS (
         INSERT INTO xytable VALUES(2,2) RETURNING *)
    SELECT FROM yy WHERE x=2;

    QUERY PLAN
    -----------------------------
    CTE Scan on yy
       Filter: (x = 2)
       CTE yy
          -> Seq Scan on cte
                Filter: (y > 1)
       CTE always_executed
          -> Insert on cte cte_1
                -> Result
    (5 rows)

    And one more rule:

    recursive queries with WITH always materialize.


    It is always, and not by default. If we order the optimizer: NOT MATERIALIZED, there will be no error, and materialization will still be. This is a community-conscious decision.

    We will consider the illustrative example homework. That's all for today.

    This part of the review devoted to the new in CTE uses examples and fragments from the report “Postgres 12 in Etudes”, which Oleg Bartunov read at Saint Highload ++ in St. Petersburg on April 9 this year.

    In the next series - KNN .

    Also popular now: