The problem with bound variables: how to turn the optimizer from enemy to friend

Original author: Victor Varlamov, OCP
  • Transfer
The author of the article is Victor Varlamov ( varlamovVp18 ), OCP.
The original article was published on 07/07/2017.
Special thanks to the author of the translation - brutaltag .

Our reporting system typically runs hundreds of lengthy queries that are triggered by various events. The query parameters are the list of clients and the time interval (daily, weekly, monthly). Because of the irregular data in the tables, a single query can produce both one row and one million rows, depending on the report parameters (for different clients there is a different number of rows in the fact tables). Each report is designed as a package with a main function that accepts input parameters, performs additional transformations, then opens a static cursor with associated variables, and returns this open cursor at the end. The DB parameter CURSOR_SHARING is set in FORCE.
In such a situation, one has to deal with poor performance, both in the case of re-use of the query plan by the optimizer, and with full parsing of the query with literal parameters . Associated variables may cause a non-optimal query plan.

In his book “Oracle Expert Practices,” Alex Gorbachev brings an interesting story told to him by Tom Kite. Every rainy Monday, users had to deal with a modified query plan. It is hard to believe, but it was so:
“According to the observations of end users, when it rained heavily on Monday, the database performance was terrible. On any other day of the week or on Monday there were no problems without rain. From a conversation with the database administrator Tom Kite learned that the difficulties continued until the forced restart of the database, after which the performance became normal. That was such a workaround: rainy Monday - restart. ”

This is a real case, and the problem was solved completely without any magic, only thanks to the excellent knowledge of how Oracle works. I will show the solution at the end of the article.
Here is a small example of how related variables work.
Create a table with irregular data.

SQL> CREATETABLE VVP_HARD_PARSE_TEST(C1 NUMBER, C2 NUMBER, C3 VARCHAR2(300));
TABLE created.
SQL> INSERTINTO VVP_HARD_PARSE_TEST
SELECTROWNUM C1,
       CASEWHENLEVEL < 9THEN1WHENMOD(ROWNUM, 100)=99THEN99ELSE1000000END C2,
       RPAD('A', 300, 'A') C3
  FROM DUAL
CONNECTBYLEVELCREATEINDEX IND_VVP_HARD_PARSE_TEST_C2 ON VVP_HARD_PARSE_TEST(C2);
INDEX created.
SQL> EXEC DBMS_STATS.GATHER_TABLE_STATS(OWNNAME => USER, 
                                        TABNAME => 'VVP_HARD_PARSE_TEST', 
                                        CASCADE => TRUE, 
                                        METHOD_OPT=>'FOR ALL INDEXED COLUMNS SIZE 254');
PL/SQL PROCEDURE successfully completed.
SQL> SELECT histogram FROM user_tab_columns WHERE table_name = 'VVP_HARD_PARSE_TEST'AND column_name = 'C2';
  HISTOGRAM
  ---------
  FREQUENCY
SQL> SELECT c2, COUNT(*) FROM VVP_HARD_PARSE_TEST GROUPBY c2 ORDERBY1;
C2            COUNT(*)
-----------------------
1                     8
99                10000
1000000          989992


In other words, we have the table VVP_HARD_PARSE_TEST with a million rows, where in 10,000 cases the field C2 = 99, 8 records with C2 = 1, and the rest with C2 = 1,000,000. A histogram across the field C2 tells the Oracle optimizer about this distribution of data. This situation is known as an uneven distribution of data , and a histogram can help choose the right query plan, depending on the requested data.

Observe simple queries to this table. Obviously, for a query

SELECT * FROM VVP_HARD_PARSE_TEST WHERE c2 = :p

if p = 1, then the best choice would be INDEX RANGE SCAN, for the case of p = 1000000 it is better to use FULL TABLE SCAN. Query1 and Query1000000 queries are identical, except for the text in the comments, this is done to get different identifiers of query plans.

DECLARE p NUMBER; v NUMBER;
BEGIN
   V := 0; p := 1000000;
   FOR rec IN (SELECT/*+query1000000*/ * FROM VVP_HARD_PARSE_TEST WHERE c2 = p) LOOP
      V := v + 1;
   ENDLOOP;
   dbms_output.put_line(v);
   v : =0; p := 1;
   FOR rec IN (SELECT/*+query1000000*/ * FROM VVP_HARD_PARSE_TEST WHERE c2 = p) LOOP
      V := v + 1;
   ENDLOOP;
   dbms_output.put_line(v);
   -----------------
   V := 0; p := 1;
   FOR rec IN (SELECT/*+query1*/ * FROM VVP_HARD_PARSE_TEST WHERE c2 = p) LOOP
      V := v + 1;
   ENDLOOP;
   dbms_output.put_line(v);
   v := 0; p := 1000000;
   FOR rec IN (SELECT/*+query1*/ * FROM VVP_HARD_PARSE_TEST WHERE c2 = p) LOOP
      V := v + 1;
   ENDLOOP;
   dbms_output.put_line(v);
END;

Now look at the query plans:

SQL> SELECT sql_id, child_number, executions, plan_hash_value, sql_text FROM v$sqlWHERE sql_text LIKE'SELECT % * FROM VVP_HARD_PARSE_TEST WHERE C2%';
SQL_ID        CHILD_NUMBER EXECUTIONS PLAN_HASH_VALUE 
SQL_TEXT
-------------------------------------------------
7rqnhhp6pahw2 0            2          2782757451 
SELECT/*+query1000000*/ * FROM VVP_HARD_PARSE_TEST WHERE C2 = :B1 
7xwt28hvw3u9s 022463783749SELECT/*+query1*/ * FROM VVP_HARD_PARSE_TEST WHERE C2 = :B1
SQL> SELECT * FROMTABLE(dbms_xplan.display_cursor(sql_id => '7rqnhhp6pahw2', format => 'basic +peeked_binds'));
SELECT/*+query1000000*/ * FROM VVP_HARD_PARSE_TEST WHERE C2 = :B1
 PLAN hashVALUE: 2782757451-------------------------------------------------
| Id  | Operation         | Name                |
-------------------------------------------------
|   0 | SELECTSTATEMENT  |                     |
|   1 |  TABLEACCESSFULL| VVP_HARD_PARSE_TEST |
-------------------------------------------------
 Peeked Binds (IDENTIFIEDBYposition):
--------------------------------------1 - :B1 (NUMBER): 1000000SQl> SELECT * FROMTABLE(dbms_xplan.display_cursor(sql_id => '7xwt28hvw3u9s', format => 'basic +peeked_binds'));
SELECT/*+query1*/ * FROM VVP_HARD_PARSE_TEST WHERE C2 = :B1
 PLAN hashVALUE: 2463783749------------------------------------------------------------------
| Id  | Operation                   | Name                       |
------------------------------------------------------------------
|   0 | SELECTSTATEMENT            |                            |
|   1 |  TABLEACCESSBYINDEXROWID| VVP_HARD_PARSE_TEST        |
|   2 |   INDEXRANGESCAN          | IND_VVP_HARD_PARSE_TEST_C2 |
------------------------------------------------------------------
 Peeked Binds (IDENTIFIEDBYposition):
--------------------------------------1 - :B1 (NUMBER): 1

As you can see, a plan for different queries is created only once, at the time of the first execution (only one child cursor with CHILD_NUMBER = 0 exists for each query). Each request is executed twice (EXECUTION = 2). During a hard parse, Oracle retrieves the values ​​of the associated variables and selects a plan according to these values. But he uses the same plan for the next run, despite the fact that the associated variables changed in the second run. Non-optimal plans are used - Query1000000 with variable C2 = 1 uses FULL TABLE SCAN instead of INDEX RANGE SCAN, and vice versa.

It is clear that fixing an application and using parameters as literals in a query is the most appropriate way to solve a problem, but it leads to dynamic SQL with its known flaws. Another way is to disable the query of related variables ( ALTER SESSION SET "_OPTIM_PEEK_USER_BINDS" = FALSE) or delete histograms ( link ).

One possible solution is the alternative use of data access policies, also known as Virtual Private Database (detailed access control, Fine Grained Access Control , row level control). This allows you to change requests on the fly and therefore can cause a complete analysis of the query plan each time the query uses detailed access control. This technique is described in detail in the article Randal Geist. The disadvantage of this method is the increasing number of complete parses and the inability to manipulate query plans.

See what we do now. After analyzing our data, we decide to divide customers into three categories - Large, Medium and Small (LMS or 9-5-1) - according to the number of transactions or transactions during the year. Also, the number of lines in the report strictly depends on the period: Monthly - Large, Weekly - Middle, Daily - Small or 9-5-1. Further, the solution is simple - let us make the predicate of the security policy depending on each category and on each period. So, for each request, we get 9 possible child cursors. Moreover, queries with different policies will lead us to the same query identifiers, this makes it possible to implement SQL PLAN MANAGEMENT (sql plan baseline).

SQL> CREATETABLE HARD_PARSE_TABLE ASSELECT * FROM dual;
TABLE created.
SQL> CREATETABLE CLIENTS_HP_STATISTICS (client_seqno NUMBER, client_id VARCHAR2(255), cnt_year NUMBER);
TABLE created.
SQL> INSERTINTO CLIENTS_HP_STATISTICS (client_seqno, client_id, cnt_year) 
    VALUES (1,      'SMALL CLIENT',       8);
1 ROW inserted.
SQL> INSERTINTO CLIENTS_HP_STATISTICS (client_seqno, client_id, cnt_year) 
    VALUES (99,     'MIDDLE CLIENT',  50001);
1 ROW inserted.
SQL> INSERTINTO CLIENTS_HP_STATISTICS (client_seqno, client_id, cnt_year) 
    VALUES (1000000,'LARGE CLIENT',  989992);
1 ROW inserted.
SQL> CREATEORREPLACEPACKAGE FORCE_HARD_PARSE_PKG IS
   gc_small  CONSTANTNUMBER := 1; 
   gc_middle CONSTANT NUMBER := 5; 
   gc_large  CONSTANT NUMBER := 9;
   gc_client_middle CONSTANT NUMBER :=  50000;
   gc_client_large  CONSTANT NUMBER := 500000;  
   gc_daterange_middle CONSTANT NUMBER := 10;
   gc_daterange_large  CONSTANT NUMBER := 50;
   FUNCTION FORCE_HARD_PARSE(in_schema VARCHAR2, in_object VARCHAR2)
   RETURN VARCHAR2;
   PROCEDURE SET_PREDICATE  (n NUMBER);
   PROCEDURE SET_PREDICATES (p_daterange NUMBER DEFAULT NULL, 
                             p_clientrange NUMBER DEFAULT NULL);
   PROCEDURE CALC_PREDICATE;
   PROCEDURE CALC_PREDICATES(p_date_interval NUMBER DEFAULT 1, 
                             p_client_seqno NUMBER DEFAULT NULL,
                             p_client_id VARCHAR2 DEFAULT NULL,
                             p_client_seqno_list VARCHAR2 DEFAULT NULL
   );
END FORCE_HARD_PARSE_PKG;
PACKAGE created.
SQL> CREATEORREPLACEPACKAGEBODY FORCE_HARD_PARSE_PKG IS
   g_predicate NUMBER;    -- g_daterange || 0 || g_clientrange 
   g_daterange NUMBER;    -- 1 - small, 5 - middle, 9 - large
   g_clientrange NUMBER;  -- 1 - small, 5 - middle, 9 - large--
   FUNCTION FORCE_HARD_PARSE(in_schema VARCHAR2, in_object VARCHAR2) 
   RETURN VARCHAR2
   IS
   BEGINIF NVL(g_predicate, 0) = 0THENRETURNNULL;   
      ELSE 
         RETURN TO_CHAR(g_predicate, 'TM') || ' = ' || TO_CHAR(g_predicate, 'TM');
      ENDIF;
   END FORCE_HARD_PARSE;
   --
   PROCEDURE SET_PREDICATE (n NUMBER) 
   IS
   BEGIN
      g_predicate := n;
   END;
   PROCEDURE SET_PREDICATES (p_daterange NUMBER DEFAULT NULL, 
                             p_clientrange NUMBER DEFAULT NULL) 
   IS
   BEGINIF p_daterange ISNOTNULLTHEN 
         g_daterange := p_daterange; 
         CALC_PREDICATE; 
      ENDIF;
      IF p_clientrange IS NOT NULL THEN 
         g_clientrange := p_clientrange; 
         CALC_PREDICATE; 
      ENDIF;
   END SET_PREDICATES;
   PROCEDURE CALC_PREDICATE 
   IS
   BEGIN
      g_predicate := NVL(g_daterange, 0) * 100 + NVL(g_clientrange, 0);
   END CALC_PREDICATE;
   PROCEDURE CALC_PREDICATES (p_date_interval NUMBER DEFAULT 1, 
                              p_client_seqno NUMBER DEFAULT NULL,
                              p_client_id VARCHAR2 DEFAULT NULL,
                              p_client_seqno_list VARCHAR2 DEFAULT NULL)
   IS
      v_cnt NUMBER;
   BEGINIF p_date_interval ISNOTNULLTHEN 
         g_daterange := CASEWHEN p_date_interval < gc_daterange_middle 
                           THEN gc_small
                           WHEN p_date_interval < gc_daterange_large 
                           THEN gc_middle
                           ELSE gc_large
                        END;
         CALC_PREDICATE;
      ENDIF;
      IF COALESCE(p_client_seqno, p_client_id, p_client_seqno_list) IS NOT NULL 
      THEN
         SELECT NVL(SUM(cnt_year), 0) AS cnt
           INTO v_cnt  
           FROM CLIENTS_HP_STATISTICS t
          WHERE1=1AND (p_client_seqno ISNULLOR p_client_seqno = t.client_seqno)
            AND (p_client_id    ISNULLOR p_client_id    = t.client_id)
            AND (p_client_seqno_list ISNULLOR t.client_seqno IN
                  (SELECTSUBSTR(s,
                                 CASEWHENLEVEL > 1THENINSTR(s, ',', 1, LEVEL - 1 ) + 1ELSE1END,
                                 INSTR(s, ',', 1, LEVEL) – 
                                 CASEWHENLEVEL > 1THENINSTR(s, ',', 1, LEVEL1) + 1ELSE1END)
                     FROM (SELECT p_client_seqno_list||','AS s FROM DUAL)
                  CONNECTBYINSTR(s, ',', 1, LEVEL) > 0));
         g_clientrange := CASE 
                             WHEN v_cnt > gc_client_large THEN gc_large
                             WHEN v_cnt > gc_client_middle THEN gc_middle
                             ELSE gc_small
                          END;
         CALC_PREDICATE; 
      ENDIF;
   END CALC_PREDICATES;
END FORCE_HARD_PARSE_PKG;
PACKAGE BODY created.
SQL> EXEC DBMS_RLS.ADD_POLICY (USER, 'HARD_PARSE_TABLE', 'HARD_PARSE_POLICY', USER, 'FORCE_HARD_PARSE_PKG.FORCE_HARD_PARSE', 'select');
PL/SQL PROCEDURE successfully completed.

Now, if we want to embed such a technology into a report, we need to add HARD_PARSE_TABLE to the query (this will not spoil it a bit) and call CALC_PREDICATES before the main query is executed.

Let's see how this technique can transform the previous example:

DECLARE p NUMBER; v NUMBER;
BEGIN
   V := 0; p := 1000000;
   FORCE_HARD_PARSE_PKG.SET_PREDICATE(1000000);
   FOR rec IN (SELECT/*+query_hp1000000*/ * 
                 FROM VVP_HARD_PARSE_TEST, 
                      HARD_PARSE_TABLE 
                WHERE c2 = p) LOOP
      V := v + 1;
   ENDLOOP; 
   dbms_output.put_line(v);
   v := 0; p := 1;
   FORCE_HARD_PARSE_PKG.SET_PREDICATE(1);
   FOR rec IN (SELECT/*+query_hp1000000*/ * 
                 FROM VVP_HARD_PARSE_TEST, 
                      HARD_PARSE_TABLE 
                WHERE c2 = p) LOOP
      V := v + 1;
   ENDLOOP;
   dbms_output.put_line(v);
   -----------------
   V := 0; p := 1;
   FORCE_HARD_PARSE_PKG.SET_PREDICATE(1);
   FOR rec IN (SELECT/*+query_hp1*/ * 
                 FROM VVP_HARD_PARSE_TEST, 
                      HARD_PARSE_TABLE 
                WHERE c2 = p) LOOP
      V := v + 1;
   ENDLOOP;
   dbms_output.put_line(v);
   v := 0; p := 1000000;
   FORCE_HARD_PARSE_PKG.SET_PREDICATE(1000000);
   FOR rec IN (SELECT/*+query_hp1*/ * 
                 FROM VVP_HARD_PARSE_TEST,
                      HARD_PARSE_TABLE 
                WHERE c2 = p) LOOP
      V := v + 1;
   ENDLOOP;
   dbms_output.put_line(v);
END;

Let's look at the execution plans:

SQL> SELECT sql_id, child_number, executions, plan_hash_value, sql_text, s.* FROM v$sql s WHERE sql_text LIKE'SELECT % * FROM VVP_HARD_PARSE_TEST, HARD_PARSE_TABLE WHERE c2%'ORDERBY1,2;
SQL_ID                CHILD_NUMBER     EXECUTIONS                  PLAN_HASH_VALUE
SQL_TEXT
--------------------------------------------------------------------------------
7wva3uqbgh4qf         0                1	                   1136240498	
SELECT/*+query_hp1000000*/ * FROM VVP_HARD_PARSE_TEST, HARD_PARSE_TABLE WHERE C2 = :B1 
7wva3uqbgh4qf         113246475190SELECT/*+query_hp1000000*/ * FROM VVP_HARD_PARSE_TEST, HARD_PARSE_TABLE WHERE C2 = :B1 
8cju3tfjvwm1p         013246475190SELECT/*+query_hp1*/ * FROM VVP_HARD_PARSE_TEST, HARD_PARSE_TABLE WHERE C2 = :B1 
8cju3tfjvwm1p         111136240498SELECT/*+query_hp1*/ * FROM VVP_HARD_PARSE_TEST, HARD_PARSE_TABLE WHERE C2 = :B1
--SQL> SELECT * FROMTABLE(dbms_xplan.display_cursor(sql_id => '7wva3uqbgh4qf', cursor_child_no => 0, format => 'basic +peeked_binds'));
SELECT/*+query_hp1000000*/ * FROM VVP_HARD_PARSE_TEST, 
HARD_PARSE_TABLE WHERE C2 = :B1
 PLAN hashVALUE: 1136240498----------------------------------------------------
| Id  | Operation            | Name                |
----------------------------------------------------
|   0 | SELECTSTATEMENT     |                     |
|   1 |  MERGEJOIN CARTESIAN|                     |
|   2 |   TABLEACCESSFULL  | HARD_PARSE_TABLE    |
|   3 |   BUFFER SORT        |                     |
|   4 |    TABLEACCESSFULL | VVP_HARD_PARSE_TEST |
----------------------------------------------------
 Peeked Binds (IDENTIFIEDBYposition):
--------------------------------------1 - :B1 (NUMBER): 1000000--SQL> SELECT * FROMTABLE(dbms_xplan.display_cursor(sql_id => '7wva3uqbgh4qf', cursor_child_no => 1, format => 'basic +peeked_binds'));
SELECT/*+query_hp1000000*/ * FROM VVP_HARD_PARSE_TEST, 
HARD_PARSE_TABLE WHERE C2 = :B1
 PLAN hashVALUE: 3246475190--------------------------------------------------------------------
| Id  | Operation                     | Name                       |
--------------------------------------------------------------------
|   0 | SELECTSTATEMENT              |                            |
|   1 |  MERGEJOIN CARTESIAN         |                            |
|   2 |   TABLEACCESSFULL           | HARD_PARSE_TABLE           |
|   3 |   BUFFER SORT                 |                            |
|   4 |    TABLEACCESSBYINDEXROWID| VVP_HARD_PARSE_TEST        |
|   5 |     INDEXRANGESCAN          | IND_VVP_HARD_PARSE_TEST_C2 |
--------------------------------------------------------------------
 Peeked Binds (IDENTIFIEDBYposition):
--------------------------------------1 - :B1 (NUMBER): 1--SQl> SELECT * FROMTABLE(dbms_xplan.display_cursor(sql_id => '8cju3tfjvwm1p', cursor_child_no => 0, format => 'basic +peeked_binds'));
SELECT/*+query_hp1*/ * FROM VVP_HARD_PARSE_TEST, 
HARD_PARSE_TABLE WHERE C2 = :B1
 PLAN hashVALUE: 3246475190--------------------------------------------------------------------
| Id  | Operation                     | Name                       |
--------------------------------------------------------------------
|   0 | SELECTSTATEMENT              |                            |
|   1 |  MERGEJOIN CARTESIAN         |                            |
|   2 |   TABLEACCESSFULL           | HARD_PARSE_TABLE           |
|   3 |   BUFFER SORT                 |                            |
|   4 |    TABLEACCESSBYINDEXROWID| VVP_HARD_PARSE_TEST        |
|   5 |     INDEXRANGESCAN          | IND_VVP_HARD_PARSE_TEST_C2 |
--------------------------------------------------------------------
 Peeked Binds (IDENTIFIEDBYposition):
--------------------------------------1 - :B1 (NUMBER): 1--SQL> SELECT * FROMTABLE(dbms_xplan.display_cursor(sql_id => '8cju3tfjvwm1p', cursor_child_no => 1, format => 'basic +peeked_binds'));
SELECT/*+query_hp1*/ * FROM VVP_HARD_PARSE_TEST, 
HARD_PARSE_TABLE WHERE C2 = :B1
 PLAN hashVALUE: 1136240498----------------------------------------------------
| Id  | Operation            | Name                |
----------------------------------------------------
|   0 | SELECTSTATEMENT     |                     |
|   1 |  MERGEJOIN CARTESIAN|                     |
|   2 |   TABLEACCESSFULL  | HARD_PARSE_TABLE    |
|   3 |   BUFFER SORT        |                     |
|   4 |    TABLEACCESSFULL | VVP_HARD_PARSE_TEST |
----------------------------------------------------
 Peeked Binds (IDENTIFIEDBYposition):
--------------------------------------1 - :B1 (NUMBER): 1000000

Looks good! Each query is executed twice, with different child cursors and different plans. For the parameter C2 = 1000000 we see the FULL TABLE SCAN in both queries, and for the parameter C1 = 1 we see always the INDEX RANGE SCAN.

At the end I cite a case with rainy Fridays:

“It turns out that every weekend on Sunday there was a cold backup , so that all the query plans were regenerated by the first execution on Monday morning. One of the employees usually started his work before the others, and his query plan was well implemented for other users during the week. However, if it was raining, this user was late for the start of the working day due to problems with his morning route. Then the batch calculation of reports was first launched, but the query plan was completely bad for other cases because of the inappropriate values ​​of the associated variables. ”

And a few useful system views:
dba_tab_histograms, all_tab_histograms, user_tab_histograms
v$vpd_policy
v$sql_bind_capture
dba_hist_sqlbind

Also popular now: