From versioned database migration to database change management

    Thanks to everyone who does not hesitate to share their thoughts and experiences, even negative ones, on the many important issues of organizing work with database systems. I came across the article "Versioned migration of the database structure: why it's better not to do this" and thought of commenting on it, but, given its publication date, decided to write my own article instead. The author clearly had his own idea of the meaning of the words he put in the title, and that imprecise framing led to the wrong problem being solved. Articles on organizing versioned database migration appeared on Habré quite a long time ago; they are easy to find with a keyword search. The article "Versioned migration of the database structure: basic approaches" gives an excellent introduction to the terminology, the tasks, and the basic methods for solving them.
    Using my own example, I would like to describe the unexpected problems that, uninvited, confronted our group at one of my old jobs, and what we lacked at the time for resolving the situation quickly and effectively. In short, this is also a negative experience, but perhaps it will be useful to someone now or in the future. Although in our company we now take a broader approach to such problems, grouping them under the term "database change management", I will try to stay within the terminology of the article mentioned above.

    EXPERIENCE OF PAST YEARS - ABOUT UNSOLVED PROBLEMS

    In 1997, the development team I had just joined was tasked with creating, within three months, a software package implementing an automated technology that was to form the basis of the entire company's business activities. It is a long-ago affair and, with your permission, I will not go into the details of the technology and the business processes. What matters is that we had to process and interconnect daily data arriving in significant volumes from two independent external sources, accumulate it in a repository, answer arbitrary requests from our customers (managers within the company) with minimal delay, and forecast many indicators through retrospective analysis of the accumulated information. That task was completed.

    The first signs of a problem appeared when the management of the IT department ordered a copy of this system transferred to the data center of one of our customers, with the deadline, as usual, being "yesterday". Since the customer's information came from the same provider as ours, the data flows did not have to change much: the ETL remained almost the same, the list of queries was narrowed, and the set of reports was slightly modified and limited. Nevertheless, the technical platform on which all this was to run was different: MS SQL Server instead of Oracle. The database structure itself, however, did not change, and even the data types did not require complex conversion.

    Now our support covered two versions of the system, close in design and functionality but different in implementation. And the tale does not end there: after a while we received for support about 30 more identical systems working in branches throughout the country. New versions were deployed by copying them from the center. From the point of view of versioned migration this was one standard version; even the configuration parameters of all the servers had to be identical. Local changes by administrators were not merely discouraged but prohibited, and violations of this ban had to be monitored daily. We also needed control over the state and contents of the reference directories, the basis for later consolidating information from the branch databases into the central one, whose content had to be guaranteed identical across all local databases.

    The last stage, as many have already guessed, was the appearance of another 4-5 separate variants supporting the work of the company's branches abroad. Moreover, each country had its own source data provider: some data elements were missing, some of the necessary information was scattered across the supplied sets, and some indicators were calculated by different rules. This means that each variant was fundamentally different in its ETL part, and this applied not only to applications and procedures but also to the structures of the so-called Stage Database (a working database for performing ETL transformations without interfering with the main one).

    Thus, having developed a system purely for internal use, we shortly found ourselves, without any well-thought-out plans for the future, acting as keepers of a large zoo of exotic versions. And, I suspect, not only us.

    However general and brief this story is, some characteristic features of database migration can be extracted from it:

    1. Simultaneous support of a large number of different versions of an application is an objective reality that does not depend on the wishes of developers or users.
    2. Migration is required not only when transferring a database to another server or DBMS platform, but also when synchronizing the database with the current version (release, generation) of the application software.
    3. Differences between versions stem not only from differences in processing logic and database structures, but also from different DBMS platforms and from settings specific to the available hardware.
    4. What is required is not only management of the migration process, but also control over the immutability of schemas, data, and settings, plus an audit (reporting) of which elements were changed, by whom, and to what values.
    5. Changes to the database structure during migration often entail additional processing or conversion of the stored data.
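    The last point can be illustrated with a minimal sketch: a structural change (splitting one name column into two) that also forces a conversion of the stored data. The example uses an in-memory SQLite database, and all table and column names are illustrative, not taken from the system described here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, full_name TEXT)")
conn.execute("INSERT INTO customer (full_name) VALUES ('Ivan Petrov')")

# Structural part of the migration: add the new columns.
conn.execute("ALTER TABLE customer ADD COLUMN first_name TEXT")
conn.execute("ALTER TABLE customer ADD COLUMN last_name TEXT")

# Data-conversion part: backfill the new columns from the old one.
rows = conn.execute("SELECT id, full_name FROM customer").fetchall()
for row_id, full_name in rows:
    first, _, last = full_name.partition(" ")
    conn.execute(
        "UPDATE customer SET first_name = ?, last_name = ? WHERE id = ?",
        (first, last, row_id),
    )
conn.commit()

print(conn.execute("SELECT first_name, last_name FROM customer").fetchone())
# ('Ivan', 'Petrov')
```

    A migration script that performs only the first half leaves the schema formally correct but the data unusable, which is exactly why structural and data changes have to travel together.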

    It was probably easier for us because the tasks piled on gradually, and the team was stable, with clearly distributed responsibilities, both in vertical areas of activity and in particular tasks. Nevertheless, every time changes were made to the application software, whether planned or spontaneous, we had to move everyone to the new version of the application. And since the working database must correspond to that updated version, versioned migration of schemas, data, and settings had to be performed.

    To one degree or another, suitable solutions exist and are quite common. The important thing is to arrive at the most convenient and least labor-intensive method. I call this a negative experience because today I would choose a different technology, and I truly regret its absence back then.

    It would seem that nothing could be simpler: keep uniform modification scripts for all servers and execute them on each server automatically!

    The first problem was that the application versions already installed differed from place to place! A single modification script could not exist; we had to spend a lot of time studying the current state and then writing and testing scripts for each server separately. Testing was a separate issue, since a test stand had to be created first.

    Maintaining a log of the changes made on a particular server during the life of the old application version brought no relief: if there were several such changes, the synchronization scripts had to be executed strictly in the "correct" sequence, and in some cases re-executing a script was absolutely unacceptable, since its result was irreversible.
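    The bookkeeping that makes such a changelog workable can be sketched in a few lines: every script carries a version number, pending scripts run strictly in ascending order, and each runs exactly once, so an irreversible script is never re-executed. This is a toy model of what tools in the Flyway/Liquibase family do, not the mechanism we used; the table name and migration texts are illustrative.

```python
import sqlite3

MIGRATIONS = {
    1: "CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER)",
    2: "ALTER TABLE account ADD COLUMN currency TEXT",
}

def migrate(conn):
    # The version table is the database's own record of what has been applied.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS schema_version (version INTEGER PRIMARY KEY)"
    )
    applied = {v for (v,) in conn.execute("SELECT version FROM schema_version")}
    for version in sorted(MIGRATIONS):          # strict "correct" sequence
        if version in applied:                  # never re-execute a script
            continue
        conn.execute(MIGRATIONS[version])
        conn.execute("INSERT INTO schema_version (version) VALUES (?)", (version,))
    conn.commit()

conn = sqlite3.connect(":memory:")
migrate(conn)
migrate(conn)  # the second run is a no-op: nothing is applied twice
```

    With this record in place, bringing any server up to date is the same operation everywhere: apply whatever is pending, in order, once.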

    Now multiply that time by the number of different servers!

    The inverse problem: a local administrator decided to "improve" the database on his own or, for example, "demolished" a large index due to lack of disk space and "forgot" to recreate it after moving the database files to another volume. We could detect such changes to settings and structure fairly easily, but only if we had that particular version of the database in its initial state to compare against. And while changes to the schema could be found and fixed quickly enough, changes to server configuration parameters or to the data itself took much more effort and far longer to detect. Then the whole technology of creating and executing correction scripts had to be carried out, along with an investigation into who made these changes and when.
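    Detecting such unauthorized changes becomes much cheaper when a reference snapshot of the schema is kept and compared against the live database. The sketch below uses SQLite's `sqlite_master` catalog as a stand-in for a real DBMS dictionary and deliberately reproduces the "demolished index" scenario; all object names are illustrative.

```python
import sqlite3

def schema_snapshot(conn):
    # Every table, index, trigger, and view with its defining SQL text.
    rows = conn.execute(
        "SELECT type, name, sql FROM sqlite_master ORDER BY type, name"
    ).fetchall()
    return set(rows)

reference = sqlite3.connect(":memory:")
reference.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount INTEGER)")
reference.execute("CREATE INDEX idx_orders_amount ON orders (amount)")

live = sqlite3.connect(":memory:")
live.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount INTEGER)")
# The local administrator dropped the index and "forgot" to recreate it.

missing = schema_snapshot(reference) - schema_snapshot(live)
extra = schema_snapshot(live) - schema_snapshot(reference)
print("missing:", sorted(name for _, name, _ in missing))
```

    Configuration parameters and data checksums can be compared the same way; the hard part in practice is keeping the reference itself trustworthy and versioned.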

    If tools had been available that automatically treated the database structure as source text, in the manner of the source control systems familiar from ordinary programming languages, the following tasks would have been much easier:

    • Updating the database from a specific version to any other in one step, both forward to a later one and back to a previous one;
    • Obtaining migration scripts automatically, with the ability to make "manual" corrections as a last resort;
    • Creating from scratch a new database instance corresponding to the current version of the application;
    • Simple creation of test/development database instances based on actual working databases, so that development happens on data closest to production;
    • Monitoring and auditing of unwanted changes in database copies and, if necessary, automatic return to the reference state in a short time.
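    A toy sketch of this wish list: treat the schema as declarative source text, diff two versions of it, and generate the migration script automatically. Real products also handle renames, type changes, and data preservation, none of which this sketch attempts; all names here are illustrative.

```python
# Two versions of the schema kept as declarative "source text".
v1 = {"customer": {"id": "INTEGER", "name": "TEXT"}}
v2 = {
    "customer": {"id": "INTEGER", "name": "TEXT", "email": "TEXT"},
    "invoice": {"id": "INTEGER", "total": "INTEGER"},
}

def diff_schemas(old, new):
    # Emit the DDL that turns `old` into `new`. Columns are only added;
    # column drops are left out to keep the sketch short.
    script = []
    for table, cols in new.items():
        if table not in old:
            body = ", ".join(f"{c} {t}" for c, t in cols.items())
            script.append(f"CREATE TABLE {table} ({body});")
        else:
            for col, ctype in cols.items():
                if col not in old[table]:
                    script.append(f"ALTER TABLE {table} ADD COLUMN {col} {ctype};")
    for table in old:
        if table not in new:
            script.append(f"DROP TABLE {table};")
    return script

forward = diff_schemas(v1, v2)   # upgrade script
backward = diff_schemas(v2, v1)  # rollback script, produced the same way
```

    Swapping the arguments yields the reverse migration, which is exactly the "one step in either direction" property from the first bullet above.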


    This is what was so lacking in the story described above. The approach is not very effective without instrumental support, and at the time we had neither the time nor the resources to develop a full-fledged utility of our own; the policies of the IT department's management also got in the way. Today, however, there are finished products that implement this approach, each with its own set of features and functions. I plan to discuss one of the possible solutions in the next publication.

    There are many different ways to store and manage database changes. The important thing is to find the most suitable approach and apply a tool that helps you increase the degree of automation of versioned database migration, raise the quality and reliability of your work, and save your employees' resources and time. In this article I have tried to show, through a real-life example, where the problems of database change management come from, what difficulties they entail, and what conclusions I drew from this, by and large, negative experience.
