MySQL + Git structure synchronization

    To synchronize project files and keep their history, we use version control systems such as Git. But when I went looking for a way to version the structure of a MySQL database, I could not find a satisfactory solution.
    I should note that many frameworks and ORMs ship the necessary machinery out of the box - migrations, versioning, and so on. But when working with MySQL natively, you have to do everything by hand. So the idea came up to try to build an automatic system for tracking structure changes.

    Task


    I wanted to change the database structure on the development server, have it updated automatically on the production server, and also see the history of all changes in Git, which was already being used for the code. And all of it free and simple!
    To do this, we need to capture all structure-changing queries (CREATE, ALTER, DROP).

    Starting on a solution


    MySQL supports three kinds of logging: the error log, the log of all queries (general log), and the slow query log (slow log).
    I have not found a use for the first one yet, although I have some ideas (details below). Now about the other two.
    Logs can be written either to tables in the mysql database or to files. The log file format is rather inconvenient to parse, so I decided to use the tables.

    Warning: since we are talking about logging ALL MySQL queries, this solution should only be used on a dev server that is not under MySQL load!

    An important point is determining which database a query is addressed to, since this information may be absent from the SQL text of the query itself - the schema qualifier is optional:
    CREATE TABLE  /*DB_NAME.*/TABLE_NAME
    

    It turned out that the general log records only the server thread number; to determine the database, you would have to search for the earlier record where that thread selected its database. On top of that, the log also contains connection and disconnection events.

    The structure of mysql.general_log
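For reference, this is roughly how mysql.general_log is defined (as shown by SHOW CREATE TABLE on MySQL 5.6; the exact definition varies between versions). Note that there is a thread_id but no db column:

```sql
CREATE TABLE `general_log` (
  `event_time`   TIMESTAMP(6) NOT NULL DEFAULT CURRENT_TIMESTAMP(6)
                 ON UPDATE CURRENT_TIMESTAMP(6),
  `user_host`    MEDIUMTEXT          NOT NULL,
  `thread_id`    BIGINT(21) UNSIGNED NOT NULL,
  `server_id`    INT(10) UNSIGNED    NOT NULL,
  `command_type` VARCHAR(64)         NOT NULL,
  `argument`     MEDIUMTEXT          NOT NULL
) ENGINE=CSV DEFAULT CHARSET=utf8 COMMENT='General log'
```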


    The slow log, on the other hand, has everything we need: first, it contains only the queries themselves, and second, it remembers the name of the database in whose context each query was run.

    The structure of mysql.slow_log
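And mysql.slow_log (again as of MySQL 5.6; earlier versions lack thread_id) - its db column is exactly what the general log was missing:

```sql
CREATE TABLE `slow_log` (
  `start_time`     TIMESTAMP(6) NOT NULL DEFAULT CURRENT_TIMESTAMP(6)
                   ON UPDATE CURRENT_TIMESTAMP(6),
  `user_host`      MEDIUMTEXT          NOT NULL,
  `query_time`     TIME(6)             NOT NULL,
  `lock_time`      TIME(6)             NOT NULL,
  `rows_sent`      INT(11)             NOT NULL,
  `rows_examined`  INT(11)             NOT NULL,
  `db`             VARCHAR(512)        NOT NULL,
  `last_insert_id` INT(11)             NOT NULL,
  `insert_id`      INT(11)             NOT NULL,
  `server_id`      INT(10) UNSIGNED    NOT NULL,
  `sql_text`       MEDIUMTEXT          NOT NULL,
  `thread_id`      BIGINT(21) UNSIGNED NOT NULL
) ENGINE=CSV DEFAULT CHARSET=utf8 COMMENT='Slow log'
```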


    Setting up


    Setting up the slow log to record all queries is very simple; in my.cnf:
    log-output=TABLE              # write logs to mysql.* tables, not files
    slow_query_log = 1            # enable the slow query log
    long_query_time = 0           # a threshold of 0 logs every query
    log_slow_admin_statements = 1 # also log administrative statements

    log_slow_admin_statements is needed so that ALTER (and other administrative) statements are logged too.

    Log Processing


    So, we need to continuously collect all logged queries, keep the ones that change the database structure, and discard the rest.

    The mysql.slow_log table has no key column and cannot be locked (and therefore its entries cannot be partially deleted). So we will create a table that suits us better.

    The structure of change_structure_log
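The original table definition is shown as an image in the source, so this is only a sketch inferred from the INSERT in the rotation procedure below; the AUTO_INCREMENT key is the point - it is exactly what mysql.slow_log lacks:

```sql
CREATE TABLE mysql.change_structure_log (
  `id`         INT UNSIGNED NOT NULL AUTO_INCREMENT, -- the key slow_log lacks
  `start_time` TIMESTAMP    NOT NULL DEFAULT CURRENT_TIMESTAMP,
  `query_time` TIME         NOT NULL,
  `db`         VARCHAR(512) NOT NULL,
  `sql_text`   MEDIUMTEXT   NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB;
```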


    For log rotation, a small stored procedure:
    USE mysql;
    DELIMITER $$
    CREATE PROCEDURE `change_structure_log_rotate`()
    BEGIN
      -- Atomically swap in an empty slow_log so logging continues,
      -- then keep only the DDL statements and discard the rest.
      DROP TABLE IF EXISTS slow_log_copy;
      CREATE TABLE slow_log_copy LIKE slow_log;
      RENAME TABLE slow_log TO slow_log_old, slow_log_copy TO slow_log;
      INSERT INTO change_structure_log (start_time, query_time, sql_text, db)
        SELECT start_time, query_time, sql_text, db
          FROM slow_log_old
         WHERE sql_text LIKE 'ALTER%'
            OR sql_text LIKE 'CREATE%'
            OR sql_text LIKE 'DROP%';
      DROP TABLE slow_log_old;
    END
    $$
    DELIMITER ;
    


    And it can be added to the MySQL event scheduler (the scheduler itself must be enabled, e.g. event_scheduler = ON in my.cnf):
    CREATE EVENT `event_change_structure_log_rotate`
      ON SCHEDULE EVERY 5 MINUTE STARTS CURRENT_TIMESTAMP
      ON COMPLETION NOT PRESERVE
      ENABLE
      COMMENT ''
      DO
        CALL change_structure_log_rotate();
    


    So, we have a table with all the structure-changing queries. Now let's write a small script to process it. I will not tie it to a specific language (I personally write in PHP, but my code has too many dependencies to be worth publishing).

    So:
    1. Loop through all the entries in the change_structure_log table.
    2. Extract the database name from sql_text with a regular expression, if it is present, for example:
    ^ALTER\s+TABLE\s+(?:(?:ONLINE|OFFLINE)\s+)?(?:(?:IGNORE)\s+)?(?:([^\s\.]+)\.\s*)?([^\s\.]+)
    3. If the database name is not specified in the query, take it from the db field.
    4. Write all records for the corresponding database into the project folder under Git, for example into a file like 20140508150500.sql.log. For queries that do not name a database explicitly, write USE $db; at the beginning.
    5. Delete the processed records.

    So, new files with the database-changing queries now appear in our project folder, and we can commit them as usual from our Git client.

    Next, on the production server, we write a script that watches for new files and executes them in MySQL. Then, whenever the Git repository on the production server is updated along with the code, the database is brought to the same state as on the dev server.

    Upd. Also (as suggested by DsideSPb), you can use Git's post-checkout hook (.git/hooks/post-checkout), which makes each update run automatically, without an external watcher.

    I should say right away that this solution is quite primitive and does not support many Git features. However, it can serve as a base for even cooler things: by tracking changes to specific tables we could, for example, automatically regenerate our ORM files, or automatically generate YAML schemas - using any MySQL client, with no extra plugins for it.
    It is also possible, for example, to track changes to the data in specific tables rather than to the database structure itself (via triggers, etc.), which can be useful for various CMSs.

    P.S. If we also want to learn about slow queries, we can integrate that into this system as well: remove the filter from the procedure, and have our script additionally select the slow queries and save them.
