Database Development Solutions

Good afternoon! In the process of project development, one has to deal with the problem of database performance, because The amount of data is growing, and problem areas float involuntarily.

This post describes not very successful solutions that are invisible in the early stages of the project. And solutions that can further increase productivity significantly.

The post is more aimed at more experienced developers who are already possibly looking for solutions, so I will be brief.

Select in result fields

Many developers use the following construct when writing queries.
  (SELECT MAX(Column) FROM Table2) as Column3,
FROM Table1 T

The fact is that for each row from the table Table1, a SELECT MAX (Column) FROM Table2 is selected, which requires additional resources for a new selection. In this case, performance will decrease as the amount of data in Table1 and Table2 increases.

Acceptable option.
FROM Table1 T
  JOIN (SELECT Max(Column) as Column3 FROM Table2) T2 ON 1 = 1

In this case, the selection from the table Table2 will be performed once, and the result will be substituted for each row from the table Table1.

Replacing Left Join with Union

There are situations when it is necessary to collect data from several tables of the same type in meaning and often resort to the following solution

  M.Id, COALESCE(P1.Sum, 0) + COALESCE(P2.Sum, 0) + COALESCE(P3.Sum, 0) + ... + COALESCE(PN.Sum, 0) as Sum
FROM MainTable M
  LEFT JOIN PayTable1 P1 ON P1.Id = M.Id
  LEFT JOIN PayTable2 P2 ON P2.Id = M.Id
  LEFT JOIN PayTable3 P3 ON P3.Id = M.Id
  LEFT JOIN PayTableN PN ON PN.Id = M.Id

This approach has several problems:
1. Produced, multiplying PayTableN tables and with increasing tables, performance will drop, even if there is little data.
2. The need to use LEFT JOIN, which in many DBMS indexes do not work.

You can improve performance by abandoning the operation of multiplication in favor of combining and using JOIN instead of LEFT JOIN.
SELECT R.Id, SUM(R.Sum) as Sum
  SELECT M.Id, P1.Sum
  FROM MainTable M
    JOIN PayTable1 P1 ON P1.Id = M.Id
  SELECT M.Id, P2.Sum
  FROM MainTable M
    JOIN PayTable2 P2 ON P2.Id = M.Id
  SELECT M.Id, P3.Sum
  FROM MainTable M
    JOIN PayTable3 P3 ON P3.Id = M.Id
  FROM MainTable M
    JOIN PayTableN PN ON PN.Id = M.Id
  ) R

Moreover, such a construction is easier to maintain, i.e. when adding new PayTableN tables, you need to add the corresponding block in only one revenge.

Using scalar operations in a join condition

There are situations when in the join condition it is necessary to use additional operations on the field, in which case the index is not used

  T1.*, T2.*
FROM Table1 T1
  JOIN Table2 T2 ON T2.Id = T1.Id AND COALESCE(T2.Column, 0) = COALESCE(T1.Column, 0)

Possible solutions:
1. To place in the WHERE clause, but this option will not always give a positive result, because the optimizer could do this for you.
  T1.*, T2.*
FROM Table1 T1
  JOIN Table2 T2 ON T2.Id = T1.Id 
WHERE COALESCE(T2.Column, 0) = COALESCE(T1.Column, 0)

2. Divide the request into two.
  T1.*, T2.*
FROM Table1 T1
  JOIN Table2 T2 ON T2.Id = T1.Id AND T2.Column = T1.Column
  T1.*, T2.*
FROM Table1 T1
  JOIN Table2 T2 ON T2.Id = T1.Id AND T2.Column IS NULL AND T1.Column IS NULL

Queries Using With

The WITH statement certainly helps to make the request more understandable and structured, but many do not quite understand how it works and make the following mistakes.

  MainSubQuery AS (
    SELECT * FROM Table1
  SubQuery AS (
    --Здесь на самом деле, какой либо сложный запрос, требующий больших ресурсов
    SELECT * FROM Table2
  FROM MainSubQuery M
    LEFT JOIN SubQuery Q1 ON = AND Q1.Param = 1
    LEFT JOIN SubQuery Q2 ON = AND Q2.Param = 2
    LEFT JOIN SubQuery Q3 ON = AND Q3.Param = 3
    LEFT JOIN SubQuery Q4 ON = AND Q4.Param = 4

The problem is that many people think that the SubQuery subquery will be executed once, and then only the result will be taken and substituted where necessary.
But in fact, the following is obtained:

  LEFT JOIN (SELECT * FROM Table2) Q1 ON = AND Q1.Param = 1
  LEFT JOIN (SELECT * FROM Table2) Q2 ON = AND Q2.Param = 2
  LEFT JOIN (SELECT * FROM Table2) Q3 ON = AND Q3.Param = 3
  LEFT JOIN (SELECT * FROM Table2) Q4 ON = AND Q4.Param = 4

The problem can be solved by resorting to temporary tables, i.e. The result of a heavy query should be preliminarily put in a temporary table and then used. This method is also effective if you use NICKNAME from the IBM DB2 world as the source dataset.

INSERT Session.MainSubQuery
  SELECT * FROM Table1;
INSERT Session.SubQuery
  SELECT * FROM Table2;
FROM Session.MainSubQuery M
  LEFT JOIN Session.SubQuery Q1 ON = AND Q1.Param = 1
  LEFT JOIN Session.SubQuery Q2 ON = AND Q2.Param = 2
  LEFT JOIN Session.SubQuery Q3 ON = AND Q3.Param = 3
  LEFT JOIN Session.SubQuery Q4 ON = AND Q4.Param = 4;


Next, we move from machine productivity to human labor productivity. We will talk about how to facilitate the routine work of maintaining and maintaining databases.

In the process of developing a project, it is necessary to recreate procedures, functions, tables, and other database objects, but
constantly updates DROPs in updates , ALTERs, etc. in each DBMS differently.

The solution to this issue may be the introduction of their own service procedures, for example:
CALL[EXECUTE] DropProcedure(SchemaName, ProcedureName)
CALL[EXECUTE] DropTable(TableName, TableName)
CALL[EXECUTE] DropView(SchemaView, ProcedureView)


And then we already create the corresponding objects, regardless of whether they were created earlier or not.

But with this approach, there also remains a problem related to access rights to objects, i.e. when deleting and creating an object, group rights to objects fly off, because This is a new object for DBMS.

Decisions can be different, for example:
Maintain the rights issuance script up to date and download after each update (inconvenient).
Do not use Group Policy at all (not good).

A rather convenient and effective solution is to enter a pair of service procedures, where one will be launched at the beginning of the update, and the second at the end.

--Сохранение всех прав
--Удаление и создание объектов
--Восстановление прав
CALL[EXECUTE] RestoreRights

The next question relates to the maintenance of the structure of the database DBMS MS SQL, namely in connection with the extremely inconvenient method of commenting on tables and columns. Which in turn is rarely done or not done at all.

  EXEC sys.sp_addextendedproperty @name=N'MS_Description', @value=N'Идентификатор' , @level0type=N'SCHEMA',@level0name=N'main', @level1type=N'TABLE',@level1name=N'TextTable', @level2type=N'COLUMN',@level2name=N'id';

You can also create a set of service wrapper routines.
  EXEC Service.CommentOnTable N'SchemaName', N'TableName', N'TableComment';
  EXEC Service.CommentOnColumn N'SchemaName', N'TableName', N'ColumnName', N'ColumnComment';
  EXEC Service.CommentOnProcedure N'SchemaName', N'TableName', N'ProcedureComment';

It’s already shorter, more informative and more pleasant to work with.

This, of course, is not limited to a set of service procedures; you can think of a lot of things.

All the best, pleasant work!

Also popular now: