Three aspects of optimization (DB and software)


Foreword


Quite often, users, developers, and administrators of MS SQL Server run into performance problems with a database or with the DBMS as a whole.

This article gives general recommendations on optimizing both the database and the database management system as a whole, as well as basic recommendations on the interaction between a .NET application and MS SQL Server. Implementation examples are provided for most of the recommendations below.

This article will not cover optimizing hardware resources, the OS itself, or the use of the various built-in features of the DBMS and the OS as a whole; that would take an entire book.

Solution


In total, there are only 3 blocks of optimization recommendations for working with the database:

1) optimization of the database itself and the DBMS as a whole
2) optimization of the interaction between the application and MS SQL Server (here we will consider the interaction between a .NET application and MS SQL Server)
3) optimization of the queries themselves

First, let's analyze the first block.

There are only 3 main problem areas to address when optimizing the database and the DBMS as a whole:

1) a stale procedure cache
2) non-optimal indexes
3) non-optimal statistics

This block should be handled as part of routine maintenance, anywhere from once a day to once a week, depending on the needs of the information system. Keep in mind that while this block runs, the databases and the DBMS as a whole will be under heavy resource load. Therefore, this work should be carried out during off-peak hours, on a standby server, or spread throughout the day (in the latter case, item 1 is not applicable).

It is also important to note that this block must run after all bulk data processing performed as part of other routine maintenance.

Usually step 2 is performed first (indexes are optimized), then step 1 (the procedure cache is cleared), and then step 3 (statistics are updated).

Let's analyze each item of the first block separately.

Item 1, the stale procedure cache, is solved by simply clearing this cache with one command:

DECLARE @db_id int = DB_ID(N'DB_NAME');
DBCC FLUSHPROCINDB (@db_id);

However, it is important to note that such a solution does not suit every database and every DBMS. It is suitable for databases with the following characteristics:

1) database data size up to 500 GB
2) the database data changes substantially every day, that is, not only is new data added, but significant volumes are also updated and deleted.

Examples of such databases are TFS, CRM, NAV, and 1C.

If the database is larger than 500 GB, or data is only added while updates and deletions happen in extremely small volumes (extremely small meaning the volume of changes is so many times smaller than the total volume of unchanged data that it can be neglected), then this solution must first be tested on a test environment as close as possible to production. The point is that after clearing the procedure cache for such databases and DBMSs, the subsequent statistics update will be a very long and resource-intensive operation. Therefore, for such databases and DBMSs, item 1 on clearing the procedure cache should be postponed, at least until the results of testing on the test environment are obtained.

For a database to which item 1 can be applied, and for a DBMS containing only such databases, you can implement the following stored procedure [srv].[AutoUpdateStatisticsCache] for later use:

An example implementation of the stored procedure [srv].[AutoUpdateStatisticsCache]
USE [БД для администрирования]
GO
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE PROCEDURE [srv].[AutoUpdateStatisticsCache]
	@DB_Name nvarchar(255)=null,
	@IsUpdateStatistics bit=0
AS
BEGIN
	/*
		clear the cache and then update statistics for all non-system databases
	*/
	SET NOCOUNT ON;
    declare @tbl table (Name nvarchar(255), [DB_ID] int);
	declare @db_id int;
	declare @name nvarchar(255);
	declare @str nvarchar(255);
	--get all databases that are not marked read-only
	insert into @tbl(Name, [DB_ID])
	select name, database_id
	from sys.databases
	where name not in ('master', 'tempdb', 'model', 'msdb', 'distribution')
	and is_read_only=0	--write
	and state=0			--online
	and user_access=0	--MULTI_USER
	and is_auto_close_on=0
	and (name=@DB_Name or @DB_Name is null);
	while(exists(select top(1) 1 from @tbl))
	begin
		--get the ID of the next database
		select top(1)
			@db_id=[DB_ID]
		  , @name=Name
		from @tbl;
		--clear the cache for this database ID
		DBCC FLUSHPROCINDB(@db_id);
		if(@IsUpdateStatistics=1)
		begin
			--update statistics
			set @str='USE'+' ['+@name+']; exec sp_updatestats;'
			exec(@str);
		end
		delete from @tbl
		where [DB_ID]=@db_id;
	end
END
GO

Here, at the end, if the @IsUpdateStatistics parameter is set to 1, statistics are also updated for the database given in the @DB_Name parameter. If @IsUpdateStatistics = 1, step 1 on clearing the procedure cache should be carried out after solving problem 2; in that case the problem of item 3 on non-optimal statistics is also covered.
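For example, the procedure can then be called manually or from a scheduled job. A minimal usage sketch (the database name MyDB is illustrative):

USE [БД для администрирования]
GO
--clear the procedure cache and then refresh statistics for one database (MyDB is an illustrative name)
EXEC [srv].[AutoUpdateStatisticsCache] @DB_Name = N'MyDB', @IsUpdateStatistics = 1;
--or process all eligible databases without updating statistics
EXEC [srv].[AutoUpdateStatisticsCache] @DB_Name = NULL, @IsUpdateStatistics = 0;
GO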

The current size of the entire plan cache and query plan cache can be viewed by implementing, for example, the following view in the administration database:

An example implementation of the [inf].[vSizeCache] view
USE [БД для администрирования]
GO
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE view [inf].[vSizeCache] as
--Current size of the entire plan cache and of the query plan cache (https://club.directum.ru/post/1125)
with tbl as (
	select
	  TotalCacheSize = SUM(CAST(size_in_bytes as bigint)) / 1048576,
	  QueriesCacheSize = SUM(CAST((case 
									  when objtype in ('Adhoc', 'Prepared') 
									  then size_in_bytes else 0 
									end) as bigint)) / 1048576,
	  QueriesUseMultiCountCacheSize = SUM(CAST((case 
									  when ((objtype in ('Adhoc', 'Prepared')) and (usecounts>1))
									  then size_in_bytes else 0 
									end) as bigint)) / 1048576,
	  QueriesUseOneCountCacheSize = SUM(CAST((case 
									  when ((objtype in ('Adhoc', 'Prepared')) and (usecounts=1))
									  then size_in_bytes else 0 
									end) as bigint)) / 1048576
	from sys.dm_exec_cached_plans
)
select 
  'Queries' as 'Cache', 
  (select top(1) QueriesCacheSize from tbl) as 'Cache Size (MB)', 
  CAST((select top(1) QueriesCacheSize from tbl) * 100 / (select top(1) TotalCacheSize from tbl) as int) as 'Percent of Total/Queries'
union all
select 
  'Total' as 'Cache', 
  (select top(1) TotalCacheSize from tbl) as 'Cache Size (MB)', 
  100 as 'Percent of Total/Queries'
union all
select 
  'Queries UseMultiCount' as 'Cache', 
  (select top(1) QueriesUseMultiCountCacheSize from tbl) as 'Cache Size (MB)', 
  CAST((select top(1) QueriesUseMultiCountCacheSize from tbl) * 100 / (select top(1) QueriesCacheSize from tbl) as int) as 'Percent of Queries/Queries'
union all
select 
  'Queries UseOneCount' as 'Cache', 
  (select top(1) QueriesUseOneCountCacheSize from tbl) as 'Cache Size (MB)', 
  CAST((select top(1) QueriesUseOneCountCacheSize from tbl) * 100 / (select top(1) QueriesCacheSize from tbl) as int) as 'Percent of Queries/Queries'
--option(recompile)
GO

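To check the numbers, simply select from the view. If single-use plans (Queries UseOneCount) occupy a large share of the cache, enabling the 'optimize for ad hoc workloads' server option may also help; a minimal sketch:

USE [БД для администрирования]
GO
SELECT * FROM [inf].[vSizeCache];
GO
--if single-use plans dominate the cache, this instance-level option may help
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'optimize for ad hoc workloads', 1;
RECONFIGURE;
GO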
Now let's analyze item 2, non-optimal indexes.

By non-optimal indexes we mean the following 4 factors:

1) highly fragmented indexes
2) unused indexes
3) missing indexes
4) indexes whose maintenance costs more than the performance gain they bring

By a highly fragmented index we mean one with the following fragmentation levels:

1) more than 30% for indexes with a size of at least 20 pages
2) more than 20% for indexes with a size of at least 100 pages
3) more than 10% for indexes with a size of at least 500 pages
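These thresholds can be checked ad hoc against the current database with a query like the following sketch over sys.dm_db_index_physical_stats:

SELECT OBJECT_SCHEMA_NAME(ps.[object_id]) AS [SchemaName],
       OBJECT_NAME(ps.[object_id]) AS [TableName],
       i.[name] AS [IndexName],
       ps.page_count,
       ps.avg_fragmentation_in_percent
FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, N'LIMITED') AS ps
INNER JOIN sys.indexes AS i ON i.[object_id] = ps.[object_id] AND i.index_id = ps.index_id
WHERE i.type_desc <> 'HEAP'
  AND (   (ps.page_count >= 500 AND ps.avg_fragmentation_in_percent > 10)
       OR (ps.page_count >= 100 AND ps.avg_fragmentation_in_percent > 20)
       OR (ps.page_count >= 20  AND ps.avg_fragmentation_in_percent > 30));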

Indexes from items 2 and 4 can be identified, for example, using the following [inf].[vDelIndexOptimize] view for a specific database:

An example implementation of the [inf].[vDelIndexOptimize] view
USE [ИМЯ_БД]
GO
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE view [inf].[vDelIndexOptimize] as
/*
	     returns indexes that have not been used in queries for more than one year
		 by either users or the system.
		 The master, model, msdb and tempdb databases are not considered
*/
select DB_NAME(t.database_id)		as [DBName]
	 , SCHEMA_NAME(obj.schema_id)	as [SchemaName]
	 , OBJECT_NAME(t.object_id)		as [ObjectName]
	 , obj.Type						as [ObjectType]
	 , obj.Type_Desc				as [ObjectTypeDesc]
	 , ind.name						as [IndexName]
	 , ind.Type						as IndexType
	 , ind.Type_Desc				as IndexTypeDesc
	 , ind.Is_Unique				as IndexIsUnique
	 , ind.is_primary_key			as IndexIsPK
	 , ind.is_unique_constraint		as IndexIsUniqueConstraint
	 , (t.[USER_SEEKS]+t.[USER_SCANS]+t.[USER_LOOKUPS]+t.[SYSTEM_SEEKS]+t.[SYSTEM_SCANS]+t.[SYSTEM_LOOKUPS])-(t.[USER_UPDATES]+t.[System_Updates]) as [index_advantage]
	 , t.[Database_ID]
	 , t.[Object_ID]
	 , t.[Index_ID]
	 , t.USER_SEEKS
     , t.USER_SCANS 
     , t.USER_LOOKUPS 
     , t.USER_UPDATES
	 , t.SYSTEM_SEEKS
     , t.SYSTEM_SCANS 
     , t.SYSTEM_LOOKUPS 
     , t.SYSTEM_UPDATES
	 , t.Last_User_Seek
	 , t.Last_User_Scan
	 , t.Last_User_Lookup
	 , t.Last_System_Seek
	 , t.Last_System_Scan
	 , t.Last_System_Lookup
	 , ind.Filter_Definition,
		STUFF(
				(
					SELECT N', [' + [name] +N'] '+case ic.[is_descending_key] when 0 then N'ASC' when 1 then N'DESC' end FROM sys.index_columns ic
								   INNER JOIN sys.columns c on c.[object_id] = obj.[object_id] and ic.[column_id] = c.[column_id]
					WHERE ic.[object_id] = obj.[object_id]
					  and ic.[index_id]=ind.[index_id]
					  and ic.[is_included_column]=0
					order by ic.[key_ordinal] asc
					FOR XML PATH(''),TYPE
				).value('.','NVARCHAR(MAX)'),1,2,''
			  ) as [Columns],
		STUFF(
				(
					SELECT N', [' + [name] +N']' FROM sys.index_columns ic
								   INNER JOIN sys.columns c on c.[object_id] = obj.[object_id] and ic.[column_id] = c.[column_id]
					WHERE ic.[object_id] = obj.[object_id]
					  and ic.[index_id]=ind.[index_id]
					  and ic.[is_included_column]=1
					order by ic.[key_ordinal] asc
					FOR XML PATH(''),TYPE
				).value('.','NVARCHAR(MAX)'),1,2,''
			  ) as [IncludeColumns]
from sys.dm_db_index_usage_stats as t
inner join sys.objects as obj on t.[object_id]=obj.[object_id]
inner join sys.indexes as ind on t.[object_id]=ind.[object_id] and t.index_id=ind.index_id
where ((last_user_seek is null or last_user_seek < DATEADD(year, -1, GETDATE())) and (t.[SYSTEM_SEEKS] <= (t.[USER_UPDATES]+t.[System_Updates]-(t.[USER_SEEKS]+t.[USER_SCANS]+t.[USER_LOOKUPS]+t.[SYSTEM_SCANS]+t.[SYSTEM_LOOKUPS]))))
and t.database_id>4 and t.[object_id]>0
and ind.is_primary_key=0 --is not a primary key constraint
and ind.is_unique_constraint=0 --is not a unique constraint
and t.database_id=DB_ID()
GO

It is also important to analyze overlapping indexes. To do this, create the [srv].[vDelIndexInclude] view in the database in question:

An example implementation of the [srv].[vDelIndexInclude] view
USE [ИМЯ_БД]
GO
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE view [srv].[vDelIndexInclude] as
/*
  Pogorelov A.A.
  Search for overlapping (redundant) indexes.
  If an index's columns are covered by a wider index with the same column order starting from the first column,
  this index is considered redundant, since queries can use the wider index.
  http://www.sql.ru/blogs/andraptor/1218
*/
WITH cte_index_info AS (
SELECT
tSS.[name] AS [SchemaName]
,tSO.[name] AS [ObjectName]
,tSO.[type_desc] AS [ObjectType]
,tSO.[create_date] AS [ObjectCreateDate]
,tSI.[name] AS [IndexName]
,tSI.[is_primary_key] AS [IndexIsPrimaryKey]
,d.[index_type_desc] AS [IndexType]
,d.[avg_fragmentation_in_percent] AS [IndexFragmentation]
,d.[fragment_count] AS [IndexFragmentCount]
,d.[avg_fragment_size_in_pages] AS [IndexAvgFragmentSizeInPages]
,d.[page_count] AS [IndexPages]
,c.key_columns AS [IndexKeyColumns]
,COALESCE(ic.included_columns, '') AS [IndexIncludedColumns]
,tSI.is_unique_constraint
FROM
(
SELECT
tSDDIPS.[object_id] AS [object_id]
,tSDDIPS.[index_id] AS [index_id]
,tSDDIPS.[index_type_desc] AS [index_type_desc]
,MAX(tSDDIPS.[avg_fragmentation_in_percent]) AS [avg_fragmentation_in_percent]
,MAX(tSDDIPS.[fragment_count]) AS [fragment_count]
,MAX(tSDDIPS.[avg_fragment_size_in_pages]) AS [avg_fragment_size_in_pages]
,MAX(tSDDIPS.[page_count]) AS [page_count]
FROM
[sys].[dm_db_index_physical_stats] (DB_ID(), NULL, NULL , NULL, N'LIMITED') tSDDIPS
GROUP BY
tSDDIPS.[object_id]
,tSDDIPS.[index_id]
,tSDDIPS.[index_type_desc]
) d
INNER JOIN [sys].[indexes] tSI ON
tSI.[object_id] = d.[object_id]
AND tSI.[index_id] = d.[index_id]
INNER JOIN [sys].[objects] tSO ON
tSO.[object_id] = d.[object_id]
INNER JOIN [sys].[schemas] tSS ON
tSS.[schema_id] = tSO.[schema_id]
CROSS APPLY (
SELECT
STUFF((
SELECT
', ' + c.[name] +
CASE ic.[is_descending_key]
WHEN 1 THEN
'(-)'
ELSE
''
END
FROM
[sys].[index_columns] ic
INNER JOIN [sys].[columns] c ON
c.[object_id] = ic.[object_id]
and c.[column_id] = ic.[column_id]
WHERE
ic.[index_id] = tSI.[index_id]
AND ic.[object_id] = tSI.[object_id]
AND ic.[is_included_column] = 0
ORDER BY
ic.[key_ordinal]
FOR XML
PATH('')
)
,1, 2, ''
) AS [key_columns]
) c
CROSS APPLY (
SELECT
STUFF((
SELECT
', ' + c.[name]
FROM
[sys].[index_columns] ic
INNER JOIN [sys].[columns] c ON
c.[object_id] = ic.[object_id]
AND c.[column_id] = ic.[column_id]
WHERE
ic.[index_id] = tSI.[index_id]
AND ic.[object_id] = tSI.[object_id]
AND ic.[is_included_column] = 1
FOR XML
PATH('')
)
,1, 2, ''
) AS [included_columns]
) ic
WHERE
tSO.[type_desc] IN (
N'USER_TABLE'
)
AND OBJECTPROPERTY(tSO.[object_id], N'IsMSShipped') = 0
AND d.[index_type_desc] NOT IN (
'HEAP'
)
)
SELECT
t1.[SchemaName]
,t1.[ObjectName]
,t1.[ObjectType]
,t1.[ObjectCreateDate]
,t1.[IndexName] as [DelIndexName]
,t1.[IndexIsPrimaryKey]
,t1.[IndexType]
,t1.[IndexFragmentation]
,t1.[IndexFragmentCount]
,t1.[IndexAvgFragmentSizeInPages]
,t1.[IndexPages]
,t1.[IndexKeyColumns]
,t1.[IndexIncludedColumns]
,t2.[IndexName] as [ActualIndexName]
FROM
cte_index_info t1
INNER JOIN cte_index_info t2 ON
t2.[SchemaName] = t1.[SchemaName]
AND t2.[ObjectName] = t1.[ObjectName]
AND t2.[IndexName] <> t1.[IndexName]
AND PATINDEX(REPLACE(t1.[IndexKeyColumns], '_', '[_]') + ',%', t2.[IndexKeyColumns] + ',') > 0
WHERE
t1.[IndexIncludedColumns] = '' -- don't check indexes with INCLUDE columns
AND t1.[IndexIsPrimaryKey] = 0 -- don't check primary keys
AND t1.is_unique_constraint=0  -- don't check unique constraint
AND t1.[IndexType] NOT IN (
N'CLUSTERED INDEX'
,N'UNIQUE CLUSTERED INDEX'
) -- don't check clustered indexes
GO

It is important to note here that even if an index falls under item 2 or item 4, there is no need to rush to delete it. You need to make sure the system really does not need it. To do this, run the necessary tests on a test environment as close as possible to production, first with the index in place and then with the index dropped (take measurements and compare).
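Instead of dropping anything automatically, you can, for example, generate DROP INDEX statements from the view and review them manually before testing; a minimal sketch:

USE [ИМЯ_БД]
GO
--generate the statements only; review and test them before executing anything
SELECT N'DROP INDEX ' + QUOTENAME([DelIndexName])
     + N' ON ' + QUOTENAME([SchemaName]) + N'.' + QUOTENAME([ObjectName]) + N';' AS [DropScript]
FROM [srv].[vDelIndexInclude];
GO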

Missing indexes (item 3) can be determined, for example, using the following view [inf].[vRecomendateIndex]:

An example implementation of the [inf].[vRecomendateIndex] view
USE [ИМЯ_БД]
GO
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE view [inf].[vRecomendateIndex] as
-- Missing indexes from the DMVs
SELECT  @@ServerName AS ServerName ,
        DB_Name(ddmid.[database_id]) as [DBName] ,
        t.name AS 'Affected_table' ,
		ddmigs.user_seeks * ddmigs.avg_total_user_cost * (ddmigs.avg_user_impact * 0.01) AS index_advantage,
		ddmigs.group_handle,
		ddmigs.unique_compiles,
		ddmigs.last_user_seek,
		ddmigs.last_user_scan,
		ddmigs.avg_total_user_cost,
		ddmigs.avg_user_impact,
		ddmigs.system_seeks,
		ddmigs.last_system_scan,
		ddmigs.last_system_seek,
		ddmigs.avg_total_system_cost,
		ddmigs.avg_system_impact,
		ddmig.index_group_handle,
		ddmig.index_handle,
		ddmid.database_id,
		ddmid.[object_id],
		ddmid.equality_columns,	  -- =
		ddmid.inequality_columns,
		ddmid.[statement],
        ( LEN(ISNULL(ddmid.equality_columns, N'')
              + CASE WHEN ddmid.equality_columns IS NOT NULL
                          AND ddmid.inequality_columns IS NOT NULL THEN ','
                     ELSE ''
                END) - LEN(REPLACE(ISNULL(ddmid.equality_columns, N'')
                                   + CASE WHEN ddmid.equality_columns
                                                             IS NOT NULL
                                               AND ddmid.inequality_columns
                                                             IS NOT NULL
                                          THEN ','
                                          ELSE ''
                                     END, ',', '')) ) + 1 AS K ,
        COALESCE(ddmid.equality_columns, '')
        + CASE WHEN ddmid.equality_columns IS NOT NULL
                    AND ddmid.inequality_columns IS NOT NULL THEN ','
               ELSE ''
          END + COALESCE(ddmid.inequality_columns, '') AS Keys ,
        ddmid.included_columns AS [include] ,
        'Create NonClustered Index IX_' + t.name + '_missing_'
        + CAST(ddmid.index_handle AS VARCHAR(20)) 
        + ' On ' + ddmid.[statement] COLLATE database_default
        + ' (' + ISNULL(ddmid.equality_columns, '')
        + CASE WHEN ddmid.equality_columns IS NOT NULL
                    AND ddmid.inequality_columns IS NOT NULL THEN ','
               ELSE ''
          END + ISNULL(ddmid.inequality_columns, '') + ')'
        + ISNULL(' Include (' + ddmid.included_columns + ');', ';')
                                                  AS sql_statement ,
        ddmigs.user_seeks ,
        ddmigs.user_scans ,
        CAST(( ddmigs.user_seeks + ddmigs.user_scans )
        * ddmigs.avg_user_impact AS BIGINT) AS 'est_impact' ,
        ( SELECT    DATEDIFF(Second, create_date, GETDATE()) Seconds
          FROM      sys.databases
          WHERE     name = 'tempdb'
        ) SecondsUptime 
FROM    sys.dm_db_missing_index_groups ddmig
        INNER JOIN sys.dm_db_missing_index_group_stats ddmigs
               ON ddmigs.group_handle = ddmig.index_group_handle
        INNER JOIN sys.dm_db_missing_index_details ddmid
               ON ddmig.index_handle = ddmid.index_handle
        INNER JOIN sys.tables t ON ddmid.OBJECT_ID = t.OBJECT_ID
WHERE   ddmid.database_id = DB_ID()
--ORDER BY est_impact DESC;
GO

This will return a list of missing indexes for a particular database.
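For example, you can sort the suggestions by estimated benefit and review only the ready-made CREATE INDEX statements; a minimal usage sketch:

USE [ИМЯ_БД]
GO
SELECT TOP (20) [index_advantage], [sql_statement]
FROM [inf].[vRecomendateIndex]
ORDER BY [index_advantage] DESC;
GO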

If you need a list of missing indexes for all databases of the instance, you can obtain it by defining the following view [inf].[vNewIndexOptimize]:

An example implementation of the [inf].[vNewIndexOptimize] view
USE [БД для администрирования]
GO
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE view [inf].[vNewIndexOptimize] as
/*
	usefulness of new indexes
	index_advantage: >50,000 - it is very beneficial to create the index
					 >10,000 - the index can be created, but its maintenance cost must also be analyzed
					 <=10,000 - the index is not worth creating
*/
SELECT @@ServerName AS ServerName,
       DB_Name(ddmid.[database_id]) as [DBName],
	   OBJECT_SCHEMA_NAME(ddmid.[object_id], ddmid.[database_id]) as [Schema],
	   OBJECT_NAME(ddmid.[object_id], ddmid.[database_id]) as [Name],
	   ddmigs.user_seeks * ddmigs.avg_total_user_cost * (ddmigs.avg_user_impact * 0.01) AS index_advantage,
	   ddmigs.group_handle,
	   ddmigs.unique_compiles,
	   ddmigs.last_user_seek,
	   ddmigs.last_user_scan,
	   ddmigs.avg_total_user_cost,
	   ddmigs.avg_user_impact,
	   ddmigs.system_seeks,
	   ddmigs.last_system_scan,
	   ddmigs.last_system_seek,
	   ddmigs.avg_total_system_cost,
	   ddmigs.avg_system_impact,
	   ddmig.index_group_handle,
	   ddmig.index_handle,
	   ddmid.database_id,
	   ddmid.[object_id],
	   ddmid.equality_columns,	  -- =
	   ddmid.inequality_columns,
	   ddmid.[statement],
       ( LEN(ISNULL(ddmid.equality_columns, N'')
             + CASE WHEN ddmid.equality_columns IS NOT NULL
                         AND ddmid.inequality_columns IS NOT NULL THEN ','
                    ELSE ''
               END) - LEN(REPLACE(ISNULL(ddmid.equality_columns, N'')
                                  + CASE WHEN ddmid.equality_columns
                                                            IS NOT NULL
                                              AND ddmid.inequality_columns
                                                            IS NOT NULL
                                         THEN ','
                                         ELSE ''
                                    END, ',', '')) ) + 1 AS K ,
       COALESCE(ddmid.equality_columns, '')
       + CASE WHEN ddmid.equality_columns IS NOT NULL
                   AND ddmid.inequality_columns IS NOT NULL THEN ','
              ELSE ''
         END + COALESCE(ddmid.inequality_columns, '') AS Keys ,
       ddmid.included_columns AS [include] ,
       'Create NonClustered Index [IX_' + OBJECT_NAME(ddmid.[object_id], ddmid.[database_id]) + '_missing_'
       + CAST(ddmid.index_handle AS VARCHAR(20)) 
       + '] On ' + ddmid.[statement] COLLATE database_default
       + ' (' + ISNULL(ddmid.equality_columns, '')
       + CASE WHEN ddmid.equality_columns IS NOT NULL
                   AND ddmid.inequality_columns IS NOT NULL THEN ','
              ELSE ''
         END + ISNULL(ddmid.inequality_columns, '') + ')'
       + ISNULL(' Include (' + ddmid.included_columns + ');', ';')
                                                 AS sql_statement ,
       ddmigs.user_seeks ,
       ddmigs.user_scans ,
       CAST(( ddmigs.user_seeks + ddmigs.user_scans )
       * ddmigs.avg_user_impact AS BIGINT) AS 'est_impact' ,
       ( SELECT    DATEDIFF(Second, create_date, GETDATE()) Seconds
         FROM      sys.databases
         WHERE     name = 'tempdb'
       ) SecondsUptime 
FROM
sys.dm_db_missing_index_group_stats ddmigs
INNER JOIN sys.dm_db_missing_index_groups AS ddmig
ON ddmigs.group_handle = ddmig.index_group_handle
INNER JOIN sys.dm_db_missing_index_details AS ddmid
ON ddmig.index_handle = ddmid.index_handle
--WHERE   mid.database_id = DB_ID()
--ORDER BY migs_adv.index_advantage
GO

Here (as with items 2 and 4), it is important to note that even if an index falls under item 3, there is no need to rush to create it. You need to make sure the system really needs it. To do this, run the necessary tests on a test environment as close as possible to production, first without the new index and then with it (take measurements and compare). It is not uncommon for a new index from item 3 to later become an index from item 2 or item 4.

So how do we solve the problem of item 1, getting rid of heavily fragmented indexes? The Internet is full of ready-made solutions to this question. Here is one more example, based on the recommendations from MSDN.

To do this, create the [inf].[vIndexDefrag] view in those databases where index fragmentation needs to be tracked:

An example implementation of the [inf].[vIndexDefrag] view
USE [ИМЯ_БД]
GO
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE view [inf].[vIndexDefrag]
as
with info as 
(SELECT
	ps.[object_id],
	ps.database_id,
	ps.index_id,
	ps.index_type_desc,
	ps.index_level,
	ps.fragment_count,
	ps.avg_fragmentation_in_percent,
	ps.avg_fragment_size_in_pages,
	ps.page_count,
	ps.record_count,
	ps.ghost_record_count
	FROM sys.dm_db_index_physical_stats
    (DB_ID()
	, NULL, NULL, NULL ,
	N'LIMITED') as ps
	inner join sys.indexes as i on i.[object_id]=ps.[object_id] and i.[index_id]=ps.[index_id]
	where ps.index_level = 0
	and ps.avg_fragmentation_in_percent >= 10
	and ps.index_type_desc <> 'HEAP'
	and ps.page_count>=8 --1 extent
	and i.is_disabled=0
	)
SELECT
	DB_NAME(i.database_id) as db,
	SCHEMA_NAME(t.[schema_id]) as shema,
	t.name as tb,
	i.index_id as idx,
	i.database_id,
	(select top(1) idx.[name] from [sys].[indexes] as idx where t.[object_id] = idx.[object_id] and idx.[index_id] = i.[index_id]) as index_name,
	i.index_type_desc,i.index_level as [level],
	i.[object_id],
	i.fragment_count as frag_num,
	round(i.avg_fragmentation_in_percent,2) as frag,
	round(i.avg_fragment_size_in_pages,2) as frag_page,
	i.page_count as [page],
	i.record_count as rec,
	i.ghost_record_count as ghost,
	round(i.avg_fragmentation_in_percent*i.page_count,0) as func
FROM info as i
inner join [sys].[all_objects]	as t	on i.[object_id] = t.[object_id];
GO

This returns a list of enabled indexes that are not heaps, occupy at least 1 extent (8 pages), and are fragmented by at least 10%.

We also create 2 tables in the administration database: one to store the list of processed indexes and one to store the processing results. The first table is needed so that the same indexes are not considered twice until all indexes have been processed:

Table for saving a list of indexes that have undergone reorganization in one iteration
USE [БД для администрирования]
GO
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE TABLE [srv].[ListDefragIndex](
	[db] [nvarchar](100) NOT NULL,
	[shema] [nvarchar](100) NOT NULL,
	[table] [nvarchar](100) NOT NULL,
	[IndexName] [nvarchar](100) NOT NULL,
	[object_id] [int] NOT NULL,
	[idx] [int] NOT NULL,
	[db_id] [int] NOT NULL,
	[frag] [decimal](6, 2) NOT NULL,
	[InsertUTCDate] [datetime] NOT NULL,
 CONSTRAINT [PK_ListDefragIndex] PRIMARY KEY CLUSTERED 
(
	[object_id] ASC,
	[idx] ASC,
	[db_id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
ALTER TABLE [srv].[ListDefragIndex] ADD  CONSTRAINT [DF_ListDefragIndex_InsertUTCDate]  DEFAULT (getutcdate()) FOR [InsertUTCDate]
GO

Table for saving the index reorganization history of all databases of an MS SQL Server instance
USE [БД для администрирования]
GO
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE TABLE [srv].[Defrag](
	[ID] [bigint] IDENTITY(1,1) NOT NULL,
	[db] [nvarchar](100) NOT NULL,
	[shema] [nvarchar](100) NOT NULL,
	[table] [nvarchar](100) NOT NULL,
	[IndexName] [nvarchar](100) NOT NULL,
	[frag_num] [int] NOT NULL,
	[frag] [decimal](6, 2) NOT NULL,
	[page] [int] NOT NULL,
	[rec] [int] NULL,
	[ts] [datetime] NOT NULL,
	[tf] [datetime] NOT NULL,
	[frag_after] [decimal](6, 2) NOT NULL,
	[object_id] [int] NOT NULL,
	[idx] [int] NOT NULL,
	[InsertUTCDate] [datetime] NOT NULL,
 CONSTRAINT [PK_Defrag] PRIMARY KEY CLUSTERED 
(
	[ID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
ALTER TABLE [srv].[Defrag] ADD  CONSTRAINT [DF_Defrag_InsertUTCDate]  DEFAULT (getutcdate()) FOR [InsertUTCDate]
GO

Next, we create the stored procedure [srv].[AutoDefragIndex] for optimizing indexes in each database you need (including system databases, if desired) as follows:

An example implementation of the stored procedure [srv].[AutoDefragIndex]
USE [ИМЯ_БД]
GO
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE PROCEDURE [srv].[AutoDefragIndex]
	@count int=null --number of indexes to process at a time
	,@isrebuild bit=0 --whether to allow index rebuilds (for indexes fragmented above 30%)
AS
BEGIN
	SET NOCOUNT ON;
	SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
	--determine whether indexes can be rebuilt in ONLINE mode
	declare @isRebuildOnline bit=CASE WHEN (CAST (SERVERPROPERTY ('Edition') AS nvarchar (max)) LIKE '%Enterprise%' OR CAST (SERVERPROPERTY ('Edition') AS nvarchar (max)) LIKE '%Developer%' OR CAST (SERVERPROPERTY ('Edition') AS nvarchar (max)) LIKE '%Evaluation%') THEN 1 ELSE 0 END;
	declare @IndexName		nvarchar(100)
			,@db			nvarchar(100)
			,@db_id			int
			,@Shema			nvarchar(100)
			,@Table			nvarchar(100)
			,@SQL_Str		nvarchar (max)=N''
			,@frag			decimal(6,2)
			,@frag_after	decimal(6,2)
			,@frag_num		int
			,@page			int
			,@ts			datetime
			,@tsg			datetime
			,@tf			datetime
			,@object_id		int
			,@idx			int
			,@rec			int;
	--for processing
	declare @tbl table (
						IndexName		nvarchar(100)
						,db				nvarchar(100)
						,[db_id]		int
						,Shema			nvarchar(100)
						,[Table]		nvarchar(100)
						,frag			decimal(6,2)
						,frag_num		int
						,[page]			int
						,[object_id]	int
						,idx			int
						,rec			int
					   );
	--for history
	declare @tbl_copy table (
						IndexName		nvarchar(100)
						,db				nvarchar(100)
						,[db_id]		int
						,Shema			nvarchar(100)
						,[Table]		nvarchar(100)
						,frag			decimal(6,2)
						,frag_num		int
						,[page]			int
						,[object_id]	int
						,idx			int
						,rec			int
					   );
	set @ts = getdate()
	set @tsg = @ts;
	--select indexes that are fragmented by at least 10%
	--and that have not been selected yet
	if(@count is null)
	begin
		insert into @tbl (
						IndexName	
						,db			
						,[db_id]	
						,Shema		
						,[Table]		
						,frag		
						,frag_num	
						,[page]				
						,[object_id]
						,idx		
						,rec		
					 )
		select				ind.index_name,
							ind.db,
							ind.database_id,
							ind.shema,
							ind.tb,
							ind.frag,
							ind.frag_num,
							ind.[page],
							ind.[object_id],
							ind.idx ,
							ind.rec
		from  [inf].[vIndexDefrag] as ind
		where not exists(
							select top(1) 1 from [БД для администрирования].[srv].[ListDefragIndex] as lind
							where lind.[db_id]=ind.database_id
							  and lind.[idx]=ind.idx
							  and lind.[object_id]=ind.[object_id]
						)
		--order by ind.[page] desc, ind.[frag] desc
	end
	else
	begin
		insert into @tbl (
						IndexName	
						,db			
						,[db_id]	
						,Shema		
						,[Table]		
						,frag		
						,frag_num	
						,[page]				
						,[object_id]
						,idx		
						,rec		
					 )
		select top (@count)
							ind.index_name,
							ind.db,
							ind.database_id,
							ind.shema,
							ind.tb,
							ind.frag,
							ind.frag_num,
							ind.[page],
							ind.[object_id],
							ind.idx ,
							ind.rec
		from  [inf].[vIndexDefrag] as ind
		where not exists(
							select top(1) 1 from [БД для администрирования].[srv].[ListDefragIndex] as lind
							where lind.[db_id]=ind.database_id
							  and lind.[idx]=ind.idx
							  and lind.[object_id]=ind.[object_id]
						)
		--order by ind.[page] desc, ind.[frag] desc
	end
	--if all indexes have already been selected (i.e. the selection is empty),
	--clear the table of processed indexes
	--and start over
	if(not exists(select top(1) 1 from @tbl))
	begin
		delete from [БД для администрирования].[srv].[ListDefragIndex]
		where [db_id]=DB_ID();
		if(@count is null)
		begin
			insert into @tbl (
							IndexName	
							,db			
							,[db_id]	
							,Shema		
							,[Table]		
							,frag		
							,frag_num	
							,[page]				
							,[object_id]
							,idx		
							,rec		
						 )
			select				ind.index_name,
								ind.db,
								ind.database_id,
								ind.shema,
								ind.tb,
								ind.frag,
								ind.frag_num,
								ind.[page],
								ind.[object_id],
								ind.idx ,
								ind.rec
			from  [inf].[vIndexDefrag] as ind
			where not exists(
								select top(1) 1 from [БД для администрирования].[srv].[ListDefragIndex] as lind
								where lind.[db_id]=ind.database_id
								  and lind.[idx]=ind.idx
								  and lind.[object_id]=ind.[object_id]
							)
			--order by ind.[page] desc, ind.[frag] desc
		end
		else
		begin
			insert into @tbl (
							IndexName	
							,db			
							,[db_id]	
							,Shema		
							,[Table]		
							,frag		
							,frag_num	
							,[page]				
							,[object_id]
							,idx		
							,rec		
						 )
			select top (@count)
								ind.index_name,
								ind.db,
								ind.database_id,
								ind.shema,
								ind.tb,
								ind.frag,
								ind.frag_num,
								ind.[page],
								ind.[object_id],
								ind.idx ,
								ind.rec
			from  [inf].[vIndexDefrag] as ind
			where not exists(
								select top(1) 1 from [БД для администрирования].[srv].[ListDefragIndex] as lind
								where lind.[db_id]=ind.database_id
								  and lind.[idx]=ind.idx
								  and lind.[object_id]=ind.[object_id]
							)
			--order by ind.[page] desc, ind.[frag] desc
		end
	end
	--if the selection is not empty
	if(exists(select top(1) 1 from @tbl))
	begin
		--remember the selected indexes
		INSERT INTO [БД для администрирования].[srv].[ListDefragIndex]
		       (
				 [db]
				,[shema]
				,[table]
				,[IndexName]
				,[object_id]
				,[idx]
				,[db_id]
				,[frag]
			   )
		select	 [db]
				,[shema]
				,[table]
				,[IndexName]
				,[object_id]
				,[idx]
				,[db_id]
				,[frag]
		from @tbl;
		insert into @tbl_copy (
						IndexName	
						,db			
						,[db_id]	
						,Shema		
						,[Table]	
						,frag		
						,frag_num	
						,[page]			
						,[object_id]
						,idx		
						,rec		
					 )
		select			IndexName	
						,db			
						,[db_id]	
						,Shema		
						,[Table]	
						,frag			
						,frag_num	
						,[page]				
						,[object_id]
						,idx		
						,rec	
		from @tbl;
		--build the statement to optimize the selected indexes (for reorganization, followed by a statistics update on them)
		while(exists(select top(1) 1 from @tbl))
		begin
			select top(1)
			@IndexName=[IndexName],
			@Shema=[Shema],
			@Table=[Table],
			@frag=[frag]
			from @tbl;
			if(@frag>=30 and @isrebuild=1 and @isRebuildOnline=1)
			begin
				set @SQL_Str = @SQL_Str+'ALTER INDEX ['+@IndexName+'] on ['+@Shema+'].['+@Table+'] REBUILD WITH(ONLINE=ON);'
			end
			else
			begin
				set @SQL_Str = @SQL_Str+'ALTER INDEX ['+@IndexName+'] on ['+@Shema+'].['+@Table+'] REORGANIZE;'
									   +'UPDATE STATISTICS ['+@Shema+'].['+@Table+'] ['+@IndexName+'];';
			end
			delete from @tbl
			where [IndexName]=@IndexName
			and [Shema]=@Shema
			and [Table]=@Table;
		end
		--optimize the selected indexes
		execute sp_executesql  @SQL_Str;
		--record the index optimization results
		insert into [БД для администрирования].srv.Defrag(
									[db],
									[shema],
									[table],
									[IndexName],
									[frag_num],
									[frag],
									[page],
									ts,
									tf,
									frag_after,
									[object_id],
									idx,
									rec
								  )
						select
									[db],
									[shema],
									[table],
									[IndexName],
									[frag_num],
									[frag],
									[page],
									@ts,
									getdate(),
									(SELECT top(1) avg_fragmentation_in_percent
									FROM sys.dm_db_index_physical_stats
										(DB_ID([db]), [object_id], [idx], NULL ,
										N'LIMITED')
									where index_level = 0) as frag_after,
									[object_id],
									[idx],
									[rec]
						from	@tbl_copy;
	end
END
GO

It is important to note that when an index is rebuilt, its statistics do not need to be updated afterwards. An index is rebuilt only if it is fragmented by at least 30%, the edition of MS SQL Server allows this to be done in ONLINE mode, and the @isrebuild input parameter of the stored procedure is set to 1.

The @count parameter is needed mainly to distribute the load throughout the day. If index optimization runs only at a certain time of day or less often, NULL can be passed in @count (as by default).
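For example, a call limited to a small batch of indexes might look like the following sketch (the values are illustrative):

USE [ИМЯ_БД]
GO
--process up to 20 not-yet-processed fragmented indexes,
--allowing ONLINE rebuilds where fragmentation exceeds 30%
EXEC [srv].[AutoDefragIndex] @count = 20, @isrebuild = 1;
GO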

Now we create the stored procedure [srv].[AutoDefragIndexDB] in the administration database, to be called later:

An example implementation of the stored procedure [srv].[AutoDefragIndexDB]
USE [БД для администрирования]
GO
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE PROCEDURE [srv].[AutoDefragIndexDB]
	@DB nvarchar(255)=NULL, --for a specific database or for all
	@count int=NULL, --number of indexes to consider in each database
	@IsTempdb bit=0 --whether to include the tempdb database
AS
BEGIN
	/*
		call index optimization for the given database
	*/
	SET NOCOUNT ON;
	declare @db_name nvarchar(255);
	declare @sql nvarchar(max);
	declare @ParmDefinition nvarchar(255)= N'@count int';
	if(@DB is null)
	begin
		select [name]
		into #tbls
		from sys.databases
		where [is_read_only]=0
		and [state]=0 --ONLINE
		and [user_access]=0--MULTI_USER
		and (((@IsTempdb=0 or @IsTempdb is null) and [name]<>N'tempdb') or (@IsTempdb=1));
		while(exists(select top(1) 1 from #tbls))
		begin
			select top(1)
			@db_name=[name]
			from #tbls;
			set @sql=N'USE ['+@db_name+']; '+
			N'IF(object_id('+N''''+N'[srv].[AutoDefragIndex]'+N''''+N') is not null) EXEC [srv].[AutoDefragIndex] @count=@count;';
			exec sp_executesql @sql, @ParmDefinition, @count=@count;
			delete from #tbls
			where [name]=@db_name;
		end
		drop table #tbls;
	end
	else
	begin
		set @sql=N'USE ['+@DB+']; '+
			N'IF(object_id('+N''''+N'[srv].[AutoDefragIndex]'+N''''+N') is not null) EXEC [srv].[AutoDefragIndex] @count=@count;';
		exec sp_executesql @sql, @ParmDefinition, @count=@count;
	end
END
GO

Thus, to automate the process of optimizing indexes, you can take the following steps:

1) in each database you need, define the [inf].[vIndexDefrag] view and the [srv].[AutoDefragIndex] stored procedure
2) in the administration database, define the two tables [srv].[ListDefragIndex] and [srv].[Defrag] and the stored procedure [srv].[AutoDefragIndexDB]
3) create a SQL Server Agent job that periodically calls the [srv].[AutoDefragIndexDB] stored procedure from the administration database (see the sketch below)
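For step 3, the Agent job can be created with the msdb stored procedures; a minimal sketch (the job name, schedule, and @count value are illustrative):

USE [msdb]
GO
DECLARE @jobId binary(16);
EXEC dbo.sp_add_job @job_name = N'AutoDefragIndexDB (illustrative name)', @job_id = @jobId OUTPUT;
EXEC dbo.sp_add_jobstep @job_id = @jobId,
	@step_name = N'Run AutoDefragIndexDB',
	@subsystem = N'TSQL',
	@database_name = N'БД для администрирования',
	@command = N'EXEC [srv].[AutoDefragIndexDB] @count = 20;';
EXEC dbo.sp_add_jobschedule @job_id = @jobId,
	@name = N'Daily 03:00',
	@freq_type = 4,        --daily
	@freq_interval = 1,
	@active_start_time = 30000;  --03:00:00
EXEC dbo.sp_add_jobserver @job_id = @jobId, @server_name = N'(local)';
GO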

Now let's analyze item 3, non-optimal statistics.

In most cases, when steps 1–2 (or just step 2) of the direct database and DBMS optimization are performed and the database statistics options are enabled (Auto Update Statistics, Auto Create Statistics, Auto Update Statistics Asynchronously, Auto Create Incremental Statistics), MS SQL Server itself does a good job of keeping statistics optimal.

The main thing is not to forget to update statistics after an index reorganization, since reorganization does not update them, and also after performing step 1 on clearing the procedure cache.

But sometimes MS SQL Server does not cope with this task because of the specifics of the information system, or item 1 (clearing the procedure cache) cannot be used. In that case you can simply take from item 1 the command that updates statistics across the entire database:

Updating statistics across the entire database
USE [ИМЯ_БД]
GO
exec sp_updatestats;
GO

However, if this is not enough, or updating statistics across the entire database takes a very long time, it is necessary to consider a more flexible algorithm for updating statistics.

It should also be noted right away that when building a more flexible algorithm for updating statistics, item 1 of the direct database and DBMS optimization block (clearing the procedure cache with a subsequent update of all statistics on the database) is not applicable.

Here is an example implementation of item 3 for the case when statistics cannot be updated across the entire database and the built-in mechanisms are not enough either.

To do this, create the following stored procedure [srv].[AutoUpdateStatistics] in the necessary databases:

An example implementation of the stored procedure [srv].[AutoUpdateStatistics]
USE [ИМЯ_БД]
GO
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE PROCEDURE [srv].[AutoUpdateStatistics]
	--Maximum size in MB of an object to consider
	@ObjectSizeMB numeric (16,3) = NULL,
	--Maximum number of rows in a partition
	@row_count numeric (16,3) = NULL
AS
BEGIN
	/*
		fine-grained statistics update
	*/
	SET NOCOUNT ON;
	SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
    declare @ObjectID int;
	declare @SchemaName nvarchar(255);
	declare @ObjectName nvarchar(255);
	declare @StatsID int;
	declare @StatName nvarchar(255);
	declare @SQL_Str nvarchar(max);
	;with st AS(
	select DISTINCT 
	obj.[object_id]
	, obj.[create_date]
	, OBJECT_SCHEMA_NAME(obj.[object_id]) as [SchemaName]
	, obj.[name] as [ObjectName]
	, CAST(
			(
			   --total number of pages reserved in the partition (8 KB pages, so dividing by 128 gives MB)
				SELECT SUM(ps2.[reserved_page_count])/128.
				from sys.dm_db_partition_stats as ps2
				where ps2.[object_id] = obj.[object_id]
			) as numeric (38,2)
		  ) as [ObjectSizeMB] --object size in MB
	, s.[stats_id]
	, s.[name] as [StatName]
	, sp.[last_updated]
	, i.[index_id]
	, i.[type_desc]
	, i.[name] as [IndexName]
	, ps.[row_count]
	, s.[has_filter]
	, s.[no_recompute]
	, sp.[rows]
	, sp.[rows_sampled]
	--the number of modifications is calculated as:
	--the total number of changes to the leading statistics column since the last statistics update
	--plus the difference between the approximate row count in the partition and the total row count in the table or indexed view at the last statistics update
	, sp.[modification_counter]+ABS(ps.[row_count]-sp.[rows]) as [ModificationCounter]
	--percentage of rows sampled for statistics calculations
	--relative to the total number of rows in the table or indexed view at the last statistics update
	, NULLIF(CAST( sp.[rows_sampled]*100./sp.[rows] as numeric(18,3)), 100.00) as [ProcSampled]
	--percentage of total changes to the leading statistics column since the last statistics update
	--relative to the approximate number of rows in the partition
	, CAST(sp.[modification_counter]*100./(case when (ps.[row_count]=0) then 1 else ps.[row_count] end) as numeric (18,3)) as [ProcModified]
	--object weight:
	--[ProcModified] * base-10 logarithm of the approximate number of rows in the partition
	, CAST(sp.[modification_counter]*100./(case when (ps.[row_count]=0) then 1 else ps.[row_count] end) as numeric (18,3))
								* case when (ps.[row_count]<=10) THEN 1 ELSE LOG10 (ps.[row_count]) END as [Func]
	--whether a full scan was done:
	--the total number of rows sampled for statistics calculations is not equal to
	--the total number of rows in the table or indexed view at the last statistics update
	, CASE WHEN sp.[rows_sampled]<>sp.[rows] THEN 0 ELSE 1 END as [IsScanned]
	, tbl.[name] as [ColumnType]
	, s.[auto_created]	
	from sys.objects as obj
	inner join sys.stats as s on s.[object_id] = obj.[object_id]
	left outer join sys.indexes as i on i.[object_id] = obj.[object_id] and (i.[name] = s.[name] or i.[index_id] in (0,1) 
					and not exists(select top(1) 1 from sys.indexes i2 where i2.[object_id] = obj.[object_id] and i2.[name] = s.[name]))
	left outer join sys.dm_db_partition_stats as ps on ps.[object_id] = obj.[object_id] and ps.[index_id] = i.[index_id]
	outer apply sys.dm_db_stats_properties (s.[object_id], s.[stats_id]) as sp
	left outer join sys.stats_columns as sc on s.[object_id] = sc.[object_id] and s.[stats_id] = sc.[stats_id]
	left outer join sys.columns as col on col.[object_id] = s.[object_id] and col.[column_id] = sc.[column_id]
	left outer join sys.types as tbl on col.[system_type_id] = tbl.[system_type_id] and col.[user_type_id] = tbl.[user_type_id]
	where obj.[type_desc] <> 'SYSTEM_TABLE'
	)
	SELECT
		st.[object_id]
		, st.[SchemaName]
		, st.[ObjectName]
		, st.[stats_id]
		, st.[StatName]
		INTO #tbl
	FROM st
	WHERE NOT (st.[row_count] = 0 AND st.[last_updated] IS NULL)--if there is no data and statistics have never been updated
		--if there is nothing to update
		AND NOT (st.[row_count] = st.[rows] AND st.[row_count] = st.[rows_sampled] AND st.[ModificationCounter]=0)
		--if there is something to update (and the data has changed substantially)
		AND ((st.[ProcModified]>=10.0) OR (st.[Func]>=10.0) OR (st.[ProcSampled]<=50))
		--restrictions set via the input parameters
		AND (
			 ([ObjectSizeMB]<=@ObjectSizeMB OR @ObjectSizeMB IS NULL)
			 AND
			 (st.[row_count]<=@row_count OR @row_count IS NULL)
			);
	WHILE (exists(select top(1) 1 from #tbl))
	BEGIN
		select top(1)
		@ObjectID	=[object_id]
		,@SchemaName=[SchemaName]
		,@ObjectName=[ObjectName]
		,@StatsId	=[stats_id]
		,@StatName	=[StatName]
		from #tbl;
		SET @SQL_Str = 'IF (EXISTS(SELECT TOP(1) 1 FROM sys.stats as s WHERE s.[object_id] = '+CAST(@ObjectID as nvarchar(32)) + 
						' AND s.[stats_id] = ' + CAST(@StatsId as nvarchar(32)) +')) UPDATE STATISTICS ' + QUOTENAME(@SchemaName) +'.' +
						QUOTENAME(@ObjectName) + ' ('+QUOTENAME(@StatName) + ') WITH FULLSCAN;';
		execute sp_executesql @SQL_Str;
		delete from #tbl
		where [object_id]=@ObjectID
		  and [stats_id]=@StatsId;
	END
	drop table #tbl;
END
GO

Here you can see that statistics are updated only for those objects whose data has changed significantly. However, pay attention to the value of the [IsScanned] column. If it differs from 1, it means that the total number of rows in the table or indexed view at the last statistics update does not match the number of rows sampled for statistics calculations, i.e. the statistics may already be stale. And although the algorithm considers only significant data changes, one cannot rule out that at some point it will become necessary to update all the statistics of a large object whose data changed only a little. Therefore, the above algorithm cannot be considered universal for all databases.

Next, in the administration database, we create the stored procedure [srv].[AutoUpdateStatisticsDB], which will then need to be run periodically on a schedule:

An example implementation of the stored procedure [srv].[AutoUpdateStatisticsDB]
USE [БД для администрирования]
GO
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE PROCEDURE [srv].[AutoUpdateStatisticsDB]
	@DB nvarchar(255)=NULL, --for a specific database or for all
	@ObjectSizeMB numeric (16,3) = NULL,
	--Maximum number of rows in a partition
	@row_count numeric (16,3) = NULL,
	@IsTempdb bit=0 --whether to include the tempdb database
AS
BEGIN
	/*
		call fine-grained statistics optimization for the given database
	*/
	SET NOCOUNT ON;
	declare @db_name nvarchar(255);
	declare @sql nvarchar(max);
	declare @ParmDefinition nvarchar(255)= N'@ObjectSizeMB numeric (16,3), @row_count numeric (16,3)';
	if(@DB is null)
	begin
		select [name]
		into #tbls
		from sys.databases
		where [is_read_only]=0
		and [state]=0 --ONLINE
		and [user_access]=0--MULTI_USER
		and (((@IsTempdb=0 or @IsTempdb is null) and [name]<>N'tempdb') or (@IsTempdb=1));
		while(exists(select top(1) 1 from #tbls))
		begin
			select top(1)
			@db_name=[name]
			from #tbls;
			set @sql=N'USE ['+@db_name+']; '+
			N'IF(object_id('+N''''+N'[srv].[AutoUpdateStatistics]'+N''''+N') is not null) EXEC [srv].[AutoUpdateStatistics] @ObjectSizeMB=@ObjectSizeMB, @row_count=@row_count;';
			exec sp_executesql @sql, @ParmDefinition, @ObjectSizeMB=@ObjectSizeMB, @row_count=@row_count;
			delete from #tbls
			where [name]=@db_name;
		end
		drop table #tbls;
	end
	else
	begin
		set @sql=N'USE ['+@DB+']; '+
		N'IF(object_id('+N''''+N'[srv].[AutoUpdateStatistics]'+N''''+N') is not null) EXEC [srv].[AutoUpdateStatistics] @ObjectSizeMB=@ObjectSizeMB, @row_count=@row_count;';
		exec sp_executesql @sql, @ParmDefinition, @ObjectSizeMB=@ObjectSizeMB, @row_count=@row_count;
	end
END
GO
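For example, a scheduled call that limits the work to relatively small objects might look like the following sketch (the limits are illustrative):

USE [БД для администрирования]
GO
--fine-grained statistics update for all databases,
--only for objects up to 2048 MB and up to 10 million rows per partition
EXEC [srv].[AutoUpdateStatisticsDB]
	@DB = NULL,
	@ObjectSizeMB = 2048,
	@row_count = 10000000,
	@IsTempdb = 0;
GO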

Usually a more flexible statistics update algorithm is needed for databases where the hardware cannot update statistics for the entire database in a reasonable time, i.e. databases with very large data volumes (significantly more than 1 TB).

Thus, all 3 items of the first block on direct optimization of the database itself and the DBMS as a whole have been considered.

In addition to this block, the following general recommendations should be added:

1) make sure that the database data files keep between 5% and 20% free space (if it is less than 5%, increase the file by setting a larger initial size in the database file properties; if it is more than 20%, shrink the file with the DBCC SHRINKFILE command; see the sketch after this list)

2) system databases (especially msdb) must also be maintained in terms of indexes and statistics

3) the msdb database logs must be cleaned up, for example as follows:

Example of cleaning the logs of the msdb database
declare @dt datetime=DateAdd(day,-28,GetDate());
exec msdb.dbo.sp_delete_backuphistory @dt;
exec msdb.dbo.sp_purge_jobhistory @oldest_date=@dt;
exec msdb.dbo.sp_maintplan_delete_log null, null, @dt;
exec msdb.dbo.sp_cycle_errorlog;
exec msdb.dbo.sp_Cycle_Agent_ErrorLog;
...
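For item 1, the free space in the data files can be checked and, when it exceeds the threshold, released with DBCC SHRINKFILE; a minimal sketch (the logical file name MyDB_Data and the target size are illustrative):

USE [ИМЯ_БД]
GO
--free space per data file, in MB
SELECT [name] AS [LogicalFileName],
       size / 128.0 AS [SizeMB],
       size / 128.0 - CAST(FILEPROPERTY([name], 'SpaceUsed') AS int) / 128.0 AS [FreeMB]
FROM sys.database_files
WHERE [type_desc] = N'ROWS';
--shrink a data file that has too much free space down to 10240 MB
DBCC SHRINKFILE (N'MyDB_Data', 10240);
GO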

Now let's analyze the second block, optimizing the interaction between the application and MS SQL Server (here, a .NET application and MS SQL Server).

Here we give only the main recommendations without implementation examples, so that the article does not turn out to be too lengthy.

So, here are the general recommendations for optimizing the interaction between the application and MS SQL Server:

1) When sending commands to the DBMS, try to work with sets of rows rather than with individual rows (see the table-valued parameter sketch after this list)
2) Try to send requests to the DBMS asynchronously and do not force the user to wait for the application to respond
3) Try to send requests to the DBMS in batches rather than one at a time (especially relevant when changing data); to do this, implement a delayed-start mechanism and a request accumulation system
4) Use caching in all software components to reduce the number of calls to the DBMS
5) In the database connection string, identify the application component via the Application Name parameter:

Application Name Example
using System;
using System.Collections.Generic;
using System.Data.SqlClient;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
namespace ConsoleApp2
{
    class Program
    {
        static void Main(string[] args)
        {
            OpenSqlConnection();
            Console.ReadKey();
        }
        private static void OpenSqlConnection()
        {
            string connectionString = GetConnectionString();
            using (SqlConnection connection = new SqlConnection())
            {
                connection.ConnectionString = connectionString;
                connection.Open();
                Console.WriteLine("State: {0}", connection.State);
                Console.WriteLine("ConnectionString: {0}",
                    connection.ConnectionString);
            }
        }
        static private string GetConnectionString()
        {
            SqlConnectionStringBuilder conn = new SqlConnectionStringBuilder();
            conn.ApplicationName = "MyProgram";
            conn.DataSource = "SQLServerName";
            conn.InitialCatalog = "DBName";
            conn.IntegratedSecurity = true;
            return conn.ToString();
        }
    }
}

6) Set the isolation level correctly when working with the database

7) On the server side of the software, if necessary, implement a queue of calls to the DBMS, which will take into account not only the specifics of the information system, but also the hardware capabilities of the DBMS server or group of DBMS servers, if the latter work as AlwaysOn

8) Try to filter out as much data as possible on the DBMS side up front, rather than requesting everything and then filtering (for example, use paged retrieval, if necessary with dynamic asynchronous data loading)

9) Try not to filter large volumes of data outside the DBMS (see the previous item)

10) Separate application component logs from database logs, and do not write everything into a single table
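As an illustration of item 1, one way to send a whole set of rows to the DBMS in a single call is a table-valued parameter; a minimal sketch (the type, table, and procedure names are illustrative):

--a user-defined table type that the client fills and passes as one parameter
CREATE TYPE dbo.PersonIDList AS TABLE ([ID] int NOT NULL PRIMARY KEY);
GO
CREATE PROCEDURE dbo.Personal_DeleteSet
	@IDs dbo.PersonIDList READONLY
AS
BEGIN
	SET NOCOUNT ON;
	--one set-based statement instead of many single-row round trips
	DELETE p
	FROM [Personal] AS p
	INNER JOIN @IDs AS t ON t.[ID] = p.[ID];
END
GO

On the .NET side such a parameter is passed as a SqlParameter with SqlDbType.Structured, so the whole set travels to the server in one round trip.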

Now let's analyze the third block, optimizing the queries themselves.

Here again we will not go into details much, since this would take a whole book, and we will describe only some key points.

So, general recommendations for optimizing the queries themselves:

1) try to filter the data as much as possible before joining it with other tables
2) try to sort the result set as little as possible, especially over large amounts of data
3) if possible, avoid DISTINCT, LIKE '%...' and OUTER JOIN constructs, especially on large data
4) if the query needs only one field from a joined table, do not join that table; use a subquery in the select list instead (see the sketch after this list)
5) when filtering, aggregating, and selecting, try to take the existing indexes into account so that the optimizer can use them
6) try to return only the fields that are really needed, not all fields from all joined tables (do not aim for unified, one-size-fits-all T-SQL code; it is a very bad approach, especially with large data)
7) when updating or deleting data, if the filter or aggregation is not built solely on the clustered index (or there are other conditions or aggregation in addition to the clustered index), do not run the operation directly against the table; first select the rows to be deleted/modified into a temporary table (consisting of the clustered index columns plus any other fields required for the update) and then perform the delete/update by joining to it
8) do not overload join conditions; move part of the conditions into the filter instead
9) use query hints judiciously
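As an illustration of item 4, if only one column is needed from another table, a correlated subquery in the select list can replace the join; a sketch (the Department table and its columns are illustrative, and [ID] is assumed to be its key):

--instead of joining dbo.Department only to return its [Name] column:
SELECT p.[ID],
       p.[FirstName],
       (SELECT d.[Name]
        FROM dbo.Department AS d
        WHERE d.[ID] = p.[DepartmentID]) AS [DepartmentName]
FROM [Personal] AS p;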

For a better understanding, here is an example of item 7.

For example, suppose we have a Personal table (with a clustered index on the ID field), and we need to delete from it all people whose first name contains the substring 'на', and to set the comment to the first name for those people whose last name ends in 'ва'.

Here's how to implement this task:

Examples for item 7
select [ID]
into #tbl_del
from [Personal]
where [FirstName] like '%на%';
delete from p
from #tbl_del as t
inner join [Personal] as p on t.[ID]=p.[ID];
drop table #tbl_del;
select [ID]
	  ,[FirstName]
into #tbl_upd
from [Personal]
where [LastName] like '%ва';
update p
set p.[Comment]=t.[FirstName]
from #tbl_upd as t
inner join [Personal] as p on t.[ID]=p.[ID];
drop table #tbl_upd;

Now that item 7 has been shown on simple queries, here is an example of working with so-called duplicates. A fairly common task is removing duplicate records. It can be implemented as follows:

Example of deleting duplicate records according to item 7
--Suppose a duplicate in the Personal table is a match on both first and last name.
--Then we need to delete the most recently created duplicate records, keeping only one:
;with DelDups as (
  select [ID],
  --use a window function partitioned by first name and last name
  row_number() over(partition by [FirstName], [LastName] order by [InsertDate] desc) as [rn]
  from [Personal]
)
select [ID]
into #tbl_del
from DelDups
where rn>1;
delete from p
from #tbl_del as t
inner join [Personal] as p on t.[ID]=p.[ID];
drop table #tbl_del;

Updating duplicate rows can be handled in a similar way.

Thus, all 3 blocks have been considered: optimizing the database itself and the DBMS as a whole, the application's access to the DBMS, and the queries themselves.

In addition, it is worth noting that as the information system grows, in particular the number of simultaneous users and the size of the database itself, it is worth considering splitting the system into OLTP and OLAP parts, where background tasks (integration, data movement, ETL, and so on) and reports run against the OLAP system, while real-time tasks coming from users run against the OLTP system. This separation allows each of the systems to be tuned even more finely and significantly reduces the load on the OLTP system. Here you can follow a golden rule: the number of user requests in the OLTP system should be several times greater than the number of all other requests, and the same should hold for the total volume of data processed over a sufficiently long period (a week, a month, and so on).

Note. In fact, to distinguish an OLAP system from an OLTP one, it is enough to check one fact: in the former, the number of reads is many times greater than the number of data changes (that is, changes are either rare or occur in volumes so small compared to the volumes read that they can be neglected). In an OLTP system the opposite is true. OLAP and OLTP systems are configured differently, and a hybrid solution will show its non-optimality more and more as data volumes and the number of simultaneous accesses grow (this becomes especially acute in databases significantly larger than 1 TB). If the database is significantly smaller than 1 TB, separating it into OLTP and OLAP may not be necessary.

Also try to make wise use of all the available features of both the DBMS itself and the OS as a whole, and configure the hardware resources correctly for overall optimization.

Also, do not forget about resource limits (CPU, RAM, and the bandwidth and speed of the I/O subsystem).

After all the optimizations, it is also important to periodically reassess the future growth of the requirements on the DBMS (in particular, growth in the number of simultaneous calls and in the volume of processed data) and to anticipate problems and solutions in advance, that is, to determine where further tuning is still possible and where hardware upgrades or the purchase of additional capacity will be needed.

Result


This article considered optimization of the database and the DBMS as a whole, as well as general recommendations on optimizing the application's interaction with the DBMS and the database queries themselves.

The above recommendations allow you to significantly optimize the entire DBMS for a long period, taking into account the further development of the information system and an increase in the number of simultaneous calls to the DBMS.

Sources:


» System dynamic management views
» Overlapping indexes
» Reorganizing and rebuilding indexes
» SHRINKFILE
» msdb
» Connection string
» Isolation level
» Automatic update statistics
» Automatic defragmentation of indexes
» Current size of the entire plan cache and query plan cache
