Git and Microsoft SQL Server

  • Tutorial
Hello everybody!

In a previous post, we talked about the difficulties that developers experience when writing SQL code (and these problems are relevant not only for MS SQL Server). Here's a story on how to use Git to version control SQL Server code using SQLFuse . The principle is the same as with regular files, but there are some features.




Major changes to SQLFuse


Since the first publication on the hub, the SQLFuse has been rewritten for use as an deploy-system, which predetermined the following new qualities:
  • Changes made to files and directories are accumulated in the cache, then the SQL commands are reset by the timer to the database, wrapping them before this in a transaction. If one operation fails, all the changes made are rolled back and the cache is cleared, the hierarchy of files and directories is brought into line with their actual state relative to the database. Instead of resetting the timer for changes, it was possible to do, for example, interaction with the D-Bus, but this is already a further stage of the project.
  • The text of the contents of the files of the table modules (columns, restrictions, etc.) does not display the name of the module, so as not to mislead the version control system: the name of the file or directory will always be the name of the module.
  • Optimization of SQL-commands before dumping them into the database. As it turned out, when applying patches to files, they are deleted, then created together with new data — this is unacceptable for the database because of the possibility of data loss. Therefore, such operations are converted, if possible, to a single ALTER command. Not only that, Git completely removes the directories for which the patch is applied, and then re-creates them along with the patch changes - this problem was also solved.

Still not working :
  • Support for creating / editing views. Please note that the presentation format of the views may change.
  • XML indexes
  • Extended Attributes
  • The order of the columns when creating tables. That is, when creating the table, the columns will be created in the database in the order in which they were created on the file system; in future versions of SQLFuse, this problem, I hope, will be solved.

I wanted to express my gratitude for the recall and testing of SQLFuse along with SQL Server 2000. After testing it turned out that SQLFuse could not be made to work with SQL Server 2000, as the latter lacks some meta tables and there is no proper support for XML at the language level that are used to generate code for table modules.

Why git?


Git does not create additional directories with its utility files in each directory of observed structures. So, for example, acts Subversion. Creating directories SQLFuse perceives it as creating a table or schema, and the files inside them as modules with a text definition, so creating Subversion service directories is not possible.

Most likely, SQLFuse will be able to make friends with Mercurial too - I will be glad to bug reports or read a story about the integration experience.

Development scheme and necessary infrastructure


Let there be two SQL Server servers - the first working (work), the second test (test); one Deploy server. The public repository is located on the github.com resource .

Sandra - committer of the main branch, can make changes to the working and test servers. Bob is a developer who can only make changes to the test server. A test database is mounted on the Deploy server in the home directories of Sandra and Bob users. Only for Sandra the working database is additionally mounted.

Deploy Server Setup


A previous publication also talked about how to build SQLFuse from source and what dependencies are required. For more information about this and how to install it on the system, see the SQLFuse project page . As before, the base for the demonstration will be - AdventureWorks2008R2 . It is assumed that a machine is already configured running one of the Linux distributions, using systemd as the initialization daemon .

Creating users in the system:
useradd -m -N sandra
useradd -m -N bob

Do not forget to change the user password, for example, using the passwd command , and set global variables that are necessary for Git to work.

The systemd user service should be launched when the user logs in and mount the database in the appropriate directories. To do this, place the service description files in the ~ / .config / systemd / user / directory, and the SQLFuse configuration files in the ~ / .config / sqlfuse / directory .

Creating the necessary directories:
mkdir -p ~/workspace/work/sqlserv_1/advworks
# директория для тестовой БД будет создана при клонировании репозитория
mkdir -p ~/workspace/test/sqlserv_1
mkdir -p ~/.config/sqlfuse
mkdir -p ~/.config/systemd/user

Service file test-sqlserv_1-advworks@test_advworks.service
[Unit]
Description=SQLFuse mount profile %i for %u to %h/workspace/%P
[Service]
Type=forking
ExecStart=/usr/bin/sqlfuse -o profilename=%i %h/workspace/%P
ExecStop=/usr/sbin/fusermount -u %h/workspace/%P
TimeoutSec=5min
[Install]
WantedBy=default.target



With this setting, after the user logs out of the system, the services will terminate automatically - this feature can be used as an emergency cancellation of the cache reset in the database.

By using the systemd instance mechanism, adding a new service comes down to copying the file:
cd ~/.config/systemd/user
cp test-sqlserv_1-advworks@test_advworks.service work-sqlserv_1-advworks@work_advworks.service

The SQLFuse configuration file that describes the connection profiles test_advworks and work_advworks is located in the ~ / .config / sqlfuse directory, and in our case it will look like this:
Sqlfuse.conf configuration file
[global]
# Максимальное количество подключений
maxconn=2
# Наименование приложения, под которым производиться подключение к серверу
appname=SQLFuse
# Использовать ANSI_NPW
ansi_npw=true
# Горячий старт. Пользователь должен иметь права на создание временных таблиц
hot_start=true
# Фильтр по имени - данные объекты не будут искаться в БД
filter=(?i)(\.dav$|\.html$|\.exe$|\.cmd$|\.ini$|\.bat$|\.vbs$|\.vbe$|\.gitignore$|\.git$|\.gitattributes$)
# Не выводить схемы
exclude_schemas=db_accessadmin;db_backupoperator;db_datareader;db_datawriter;db_ddladmin;db_denydatareader;db_denydatawriter;db_owner;db_securityadmin;guest;INFORMATION_SCHEMA;sys
# Время, с момента последней операции записи, по истечению которого сбрасывается кэш
deploy_time=10
##################
# Профиль подключения
[test_advworks]
# Экземпляр или IP-адрес сервера
servername=192.168.6.50
# Наименование БД
dbname=AdvTest
# Профиль авторизации в sqlfuse.auth.conf
auth=advauth
###################
[work_advworks]
# Экземпляр или IP-адрес сервера
servername=192.168.6.50
# Наименование БД
dbname=AdvWork
# Профиль авторизации в sqlfuse.auth.conf
auth=advauth



To speed up the work, specify the maximum number of connections greater than 1 to execute queries to the database in parallel, when forming a list and code of table objects. However, if you set a value greater than 2, then this will not bring much effect, since Git does not know how or does not want to parallelize its requests to the file system.

You do not need to follow the same sequence of steps for all users to prepare SQLFuse by placing the necessary files in the / etc / skel directory .

Initialization of the work environment


In order to start tracking changes, Sandra has to make the first commit for the working database to the master branch and send the changes to the bare repository: I am too lazy to configure GitLab, so we will use github.com for clarity :
# Старт и монтирование рабочей БД
systemctl --user start work-sqlserv_1-advworks@work_advworks.service
# Инициализация и первый коммит для рабочей БД
cd ~/workspace/work/sqlserv_1
git init
git add -v advworks/
git commit -m 'Initial work commit'
# Отправка изменений в публичный репозиторий
git remote add origin https://github.com/alexandrmov/adventureworks.git
git push origin master

Next, prepare the test database and the testing branch ( initially, the test database should not be mounted! ):
# клонирование общедоступного репозитория из github:
cd ~/workspace/test/sqlserv_1
git clone https://github.com/alexandrmov/adventureworks.git
# сделаем иерархию для тестовой БД похожей на рабочую
mv adventureworks/* ./
rmdir adventureworks
# предпологается, что тестовая и рабочая БД отличаются
git checkout -b testing
rm -rf advworks/*
# монтирование тестовой БД
systemctl --user start test-sqlserv_1-advworks@test_advworks.service
# первый коммит ветки testing и отправка изменений в центральный репозиторий
git add -v advworks/
git commit -m 'Initial testing commit'
git push origin testing

All mergers, changes in the history of commits, and other operations must be carried out on local workstations, but not as on mounted databases! Therefore, in order not to accidentally break the working or test server, you can write several hooks that prohibit changing the current branches in the repositories where the databases are mounted, and execute any commands except git pull to get changes from the central repository.

For Bob, setting up the work environment is almost the same. You need to clone the public repository, go to the testing branch , delete the contents of the advworks directory recursively, and finally mount the test database by running the systemd service .

Of course, it would be very cool to recreate test structures from the working database by copying the hierarchy of directories and files, but, unfortunately, so far the views are in read-only mode and extended attributes are not supported.

Pseudoreal example


Suppose you need to set a dealer discount for each product. To do this, Bob will add the DealerDiscount field to the Production.ProductListPriceHistory table to store the discount percentage. Bob also needs to change the dbo.ufnGetProductDealerPrice function .

Bob's actions to complete the task will be as follows:
  • Synchronize the master branch with the public repository, since you need to use the version of the module code from the production server.
  • Create a column and restriction by creating the appropriate files, and change the function:
  • Commit the changes to the new dealerdiscount branch and send them to the public repository:
    # Сохранение сделанных изменений в отдельной ветке
    git checkout -b dealerdiscount
    git add advworks/dbo/ufnGetProductDealerPrice
    git add advworks/Production/ProductListPriceHistory/DealerDiscount
    git add advworks/Production/ProductListPriceHistory/DF_ProductListDealerDiscount
    git commit -m "Dealer discount"
    # Отправка локальных изменений в центральный репозиторий
    git push origin dealerdiscount
    # Слияние новых изменений в тестовую ветку
    git checkout testing
    git rebase testing dealerdiscount
    # Отправка изменений ветки тестового сервера в центральный репозиторий
    git push origin testing
    

  • Log in to your Deploy server using your username and make changes to the test database to run in and test new changes:
    cd ~/workspace/test/sqlserv_1/
    git fetch
    git pull origin testing
    

  • Create what is called a Pull Request (here you can see an example) for discussion and final verification by the committer before merging into a working database.

As attentive readers may have noticed, at the end of text modules (procedures, functions, triggers, etc.) a lot of spaces are generated - this is due to the fact that SQL Server does not correctly return the length of data from metatables. The actual size can only be determined by executing the sp_helptext procedure : it is always smaller than in the metatable. If you do not use the sys.sql_modules metatable , but the sp_helptext procedure immediately , then the speed of receiving data is significantly reduced when the git status command is executedbecause you have to call the procedure in a loop. That is why, in order to prevent incorrect utilities from working with files, such a crutch was made. Please keep in mind that when flushing a cache in the database, trailing spaces in the text modules are truncated.

After verification, Sandra should merge the new functionality with the master branch , making changes to the working database - the actions are the same as for Bob, only with the master branch . If the process of transferring changes is complex, then the merger can be performed "manually". After applying SQL commands, the changes should be committed and sent to the central repository, and the test and development branches of the new functionality will be synchronized.

Conclusion


The considered development scheme is very similar to utopia, where all changes can be made atomically and without errors, however, SQLFuse can only be used to track the code of modules, when changes are made directly to the database by SQL scripts. Or, for test servers, make changes using the mixed method, and for workers, use a script that will generate the difference between commits. Follow my account on the github , perhaps such a script will appear soon, but, I hope someone will get ahead of me and write it first, putting it in free access.

So, the considered approach allows you to implement:
  • the ability to control the design of the code during commits;
  • data safety during competitive adjustments of the same SQL Server modules;
  • maintaining the history of edits of modules introduced by users;
  • using resources similar to GitHub, and web applications such as GitLab;
  • more flexible search by code and construction of dependency graphs between modules without connecting to the database.

The next publication will describe the application of the automatic documentation generation system to SQL Server modules, as well as how to create / edit views.

Only registered users can participate in the survey. Please come in.

How do you exercise version control of SQL code?

  • 10.5% Proprietary Software 11
  • 32.6% Manually save the code in files, or we monitor the scripted database in whole / in part 34
  • 40.3% Version control of SQL code is not carried out 42
  • 16.3% Another method, the answer in the comments 17

Also popular now: