Performance tuning and database troubleshooting these days

    Unfortunately, the role of performance tuning and database troubleshooting specialists is nowadays often reduced to just the latter, troubleshooting: they are almost always called in only when problems have already reached a critical point and need to be solved “yesterday”. And even then it is lucky if they are called at all, rather than the problem being postponed by buying ever more expensive and powerful hardware without a detailed performance audit and stress tests. Disappointment is common here: hardware is bought at 2-5 times the cost, yet yields only a 30-40% gain in performance, all of which is eaten up within a few months either by growth in the number of users or by exponential data growth coupled with increasingly complex logic.

    And now, at a time when the number of architects, testers, and DevOps engineers is growing rapidly, and Java Core developers are optimizing even string handling, the time of database optimizers is slowly but surely coming. With each release, DBMSs become so much smarter and more complex that studying both their documented and undocumented nuances and optimizations requires an enormous amount of time. A huge number of articles are published every month, and major conferences are held devoted to Oracle alone. Forgive the banal analogy, but in this situation, when database administrators come to resemble airline pilots surrounded by countless toggle switches, buttons, lights, and screens, it is simply unreasonable to burden them with the subtleties of performance optimization as well.

    Of course, DBAs, like pilots, can in most cases easily solve obvious and simple problems: those that are easily diagnosed or that show up in the various “tops” (top events, top SQL, top segments...), and those whose solutions, even if unknown, are easy to find on MOS or Google. It is much harder when even the symptoms are hidden behind the complexity of the system and have to be fished out of the huge amount of diagnostic information that the Oracle DBMS collects about itself.

    One of the simplest and most vivid examples is the analysis of filter and access predicates: in large, heavily loaded systems it often happens that such a problem is easily overlooked, because the load is spread fairly evenly across different queries (joins over various tables, slight differences in conditions, etc.), and the top segments show nothing special: “well, yes, data from these tables is needed most often, and there is more of it”. In such cases, you can start the analysis with the statistics from SYS.COL_USAGE$: col_usage.sql

    col owner format a30
    col oname format a30 heading "Object name"
    col cname format a30 heading "Column name"
    accept owner_mask prompt "Enter owner mask: ";
    accept tab_name prompt "Enter tab_name mask: ";
    accept col_name prompt "Enter col_name mask: ";
    SELECT a.username              as owner
          ,o.name                  as oname
          ,c.name                  as cname
          ,u.equality_preds        as equality_preds
          ,u.equijoin_preds        as equijoin_preds
          ,u.nonequijoin_preds     as nonequijoin_preds
          ,u.range_preds           as range_preds
          ,u.like_preds            as like_preds
          ,u.null_preds            as null_preds
          ,to_char(u.timestamp, 'yyyy-mm-dd hh24:mi:ss') when
    FROM   
           sys.col_usage$ u
         , sys.obj$       o
         , sys.col$       c
         , all_users      a
    WHERE  a.user_id = o.owner#
    AND    u.obj#    = o.obj#
    AND    u.obj#    = c.obj#
    AND    u.intcol# = c.col#
    AND    a.username like upper('&owner_mask')
    AND    o.name     like upper('&tab_name')
    AND    c.name     like upper('&col_name')
    ORDER  BY a.username, o.name, c.name
    ;
    col owner clear;
    col oname clear;
    col cname clear;
    undef tab_name col_name owner_mask;
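
    A usage sketch (the SCOTT schema and masks below are just examples): the script prompts for three LIKE masks and reports how many times the optimizer has seen each kind of predicate (equality, join, range, LIKE, NULL) on each matching column:

    SQL> @col_usage.sql
    Enter owner mask: SCOTT
    Enter tab_name mask: EMP%
    Enter col_name mask: %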

    However, this information alone is not enough for a full analysis, since it does not show combinations of predicates. Here analysis of v$active_session_history and v$sql_plan can help us:

    with 
     ash as (
       select 
          sql_id
         ,plan_hash_value
         ,table_name
         ,alias
         ,ACCESS_PREDICATES
         ,FILTER_PREDICATES
         ,count(*) cnt
       from (
          select 
             h.sql_id
            ,h.SQL_PLAN_HASH_VALUE plan_hash_value
            ,decode(p.OPERATION
                     ,'TABLE ACCESS',p.OBJECT_OWNER||'.'||p.OBJECT_NAME
                     ,(select i.TABLE_OWNER||'.'||i.TABLE_NAME from dba_indexes i where i.OWNER=p.OBJECT_OWNER and i.index_name=p.OBJECT_NAME)
                   ) table_name
            ,OBJECT_ALIAS ALIAS
            ,p.ACCESS_PREDICATES
            ,p.FILTER_PREDICATES
          -- columns that may be useful for analysis in other dimensions:
          --  ,h.sql_plan_operation
          --  ,h.sql_plan_options
          --  ,decode(h.session_state,'ON CPU','ON CPU',h.event) event
          --  ,h.current_obj#
          from v$active_session_history h
              ,v$sql_plan p
          where h.sql_opname='SELECT'
            and h.IN_SQL_EXECUTION='Y'
            and h.sql_plan_operation in ('INDEX','TABLE ACCESS')
            and p.SQL_ID = h.sql_id
            and p.CHILD_NUMBER = h.SQL_CHILD_NUMBER
            and p.ID = h.SQL_PLAN_LINE_ID
            -- if we want only the last 3 hours:
            -- and h.sample_time >= systimestamp - interval '3' hour
       )
       -- if we want to analyze the predicates of a single table only:
       -- where table_name='&OWNER.&TABNAME'
       group by 
          sql_id
         ,plan_hash_value
         ,table_name
         ,alias
         ,ACCESS_PREDICATES
         ,FILTER_PREDICATES
    )
    ,agg_by_alias as (
       select
          table_name
         ,regexp_substr(ALIAS,'^[^@]+') ALIAS
         ,listagg(ACCESS_PREDICATES,' ') within group(order by ACCESS_PREDICATES) ACCESS_PREDICATES
         ,listagg(FILTER_PREDICATES,' ') within group(order by FILTER_PREDICATES) FILTER_PREDICATES
         ,sum(cnt) cnt
       from ash
       group by 
          sql_id
         ,plan_hash_value
         ,table_name
         ,alias
    )
    ,agg as (
       select 
           table_name
          ,'ALIAS' alias
          ,replace(access_predicates,'"'||alias||'".','"ALIAS".') access_predicates
          ,replace(filter_predicates,'"'||alias||'".','"ALIAS".') filter_predicates
          ,sum(cnt) cnt
       from agg_by_alias 
       group by 
           table_name
          ,replace(access_predicates,'"'||alias||'".','"ALIAS".') 
          ,replace(filter_predicates,'"'||alias||'".','"ALIAS".') 
    )
    ,cols as (
       select 
           table_name
          ,cols
          ,access_predicates
          ,filter_predicates
          ,sum(cnt)over(partition by table_name,cols) total_by_cols
          ,cnt
       from agg
           ,xmltable(
              'string-join(for $c in /ROWSET/ROW/COL order by $c return $c,",")'
              passing 
                 xmltype(
                    cursor(
                       (select distinct
                           nvl(
                           regexp_substr(
                              access_predicates||' '||filter_predicates
                             ,'("'||alias||'"\.|[^.]|^)"([A-Z0-9#_$]+)"([^.]|$)'
                             ,1
                             ,level
                             ,'i',2
                           ),' ')
                           col
                        from dual
                        connect by 
                           level<=regexp_count(
                                     access_predicates||' '||filter_predicates
                                    ,'("'||alias||'"\.|[^.]|^)"([A-Z0-9#_$]+)"([^.]|$)'
                                  )
                       )
                   ))
              columns cols varchar2(400) path '.'
           )(+)
       order by total_by_cols desc, table_name, cnt desc
    )
    select 
       table_name
      ,cols
      ,sum(cnt)over(partition by table_name,cols) total_by_cols
      ,access_predicates
      ,filter_predicates
      ,cnt
    from cols
    where rownum<=50
    order by total_by_cols desc, table_name, cnt desc;
    

    As you can see from the query itself, it shows the top 50 searched column sets and the predicates themselves, ranked by the number of ASH samples in which they occur (to limit it to, say, the last 3 hours, uncomment the sample_time filter). Although ASH stores only one-second snapshots, on loaded databases this sampling is quite representative. A few points worth clarifying:

    • The cols field displays the search columns themselves, and total_by_cols is the total number of samples for that set of columns.
    • Obviously, this information by itself is not a sufficient marker of a problem, since, for example, a few one-off full scans can easily skew the statistics, so you will still have to examine the queries themselves and their execution frequency (v$sqlstats, dba_hist_sqlstat); see the v$sqlstats sketch after this list.
    • Grouping by OBJECT_ALIAS within SQL_ID and plan_hash_value is important for combining the index and table predicates of the same object, because when a table is accessed through an index, the predicates are split across different lines of the plan, as in the schematic plan below.
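    A schematic example (table and index names here are hypothetical) of the effect described in the last point: the access predicate lands on the INDEX RANGE SCAN line, while the filter predicate lands on the TABLE ACCESS line, so only grouping by OBJECT_ALIAS ties them back together:

    -----------------------------------------------------------
    | Id  | Operation                    | Name               |
    -----------------------------------------------------------
    |   0 | SELECT STATEMENT             |                    |
    |*  1 |  TABLE ACCESS BY INDEX ROWID | EMP                |
    |*  2 |   INDEX RANGE SCAN           | EMP_DEPTNO_IDX     |
    -----------------------------------------------------------

    Predicate Information (identified by operation id):
       1 - filter("SAL">1000)
       2 - access("DEPTNO"=10)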
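    And for the frequency check mentioned in the second point, a minimal sketch against v$sqlstats (&sql_id here is a placeholder for a sql_id taken from the output of the previous query):

    select sql_id
          ,plan_hash_value
          ,executions
          ,round(elapsed_time/1e6,1) elapsed_sec
          ,buffer_gets
    from   v$sqlstats
    where  sql_id = '&sql_id';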

    Depending on your needs, this script can easily be modified to collect additional information in other dimensions, for example broken down by partition or by wait event. Having analyzed this information together with the table and index statistics, the overall data model, and the business logic, you can hand recommendations over to developers or architects for choosing a solution, for example: denormalization options, or a change to the partitioning scheme or the indexes.
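    For instance, a minimal sketch of the wait-event variant, based on the fields already left commented out in the query above: uncomment the event expression and carry it through the select lists and GROUP BY clauses:

    -- in the innermost query (already present there as a commented-out line):
    ,decode(h.session_state, 'ON CPU', 'ON CPU', h.event) event
    -- ...then add "event" to the select list and GROUP BY
    -- of the ash, agg_by_alias and agg subqueries as well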

    People also quite often forget to analyze SQL*Net traffic, where there are many subtleties as well, for example fetch size, SQLNET.COMPRESSION, and extended datatypes, which make it possible to reduce the number of round trips, etc., but that is a topic for a separate article.
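    As a tiny illustration of the fetch-size point, a SQL*Plus sketch (scott.emp is just a stand-in for any reasonably large table): the autotrace statistics include a “SQL*Net roundtrips to/from client” counter, which drops as arraysize grows:

    set autotrace traceonly statistics
    set arraysize 15      -- the SQL*Plus default
    select * from scott.emp;
    set arraysize 500     -- larger fetch size => fewer round trips
    select * from scott.emp;
    set autotrace off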

    In conclusion, I would like to say that now, when users are becoming less and less tolerant of delays, performance optimization is becoming a competitive advantage.
