Reinitializing a PostgreSQL Database Cluster

    The subject of the described problem



    When working with the database to PostgreSQL you must not forget, in a locale ( locale ) has been initialized the database cluster - so called postgre directory (usually /var/lib/pgsql/data), which stores the data of all the bases of this setup PostgreSQL.



    Problem



    Today I encountered such a problem. In the query of the sample when using the function, the lower()reduction of the Cyrillic text to lower case did not occur, while the English values ​​willingly "decreased".

    The first attempt to solve the problem



    A google search on half the Internet gave information that it would be nice if the database you searched for was encoded UTF-8(in my case, by mistake, it was in default SQL_ASCII).

    OK. No sooner said than done! Relatively quickly found instructions on how to recreate the database in a new encoding without losing data.

    [bash]
    # su - postgres
    ~ vacuumdb --full --analyze --username postgres --dbname mydatabase
    ~ pg_dump mydatabase -Ft -v -U postgres -f /tmp/mydatabase.tar
    ~ dropdb mydatabase --username postgres
    ~ createdb --encoding UNICODE mydatabase --username postgres
    ~ pg_restore /tmp/mydatabase.tar | psql --dbname mydatabase --username postgres
    ~ vacuumdb --full --analyze --username postgres --dbname mydatabase
    


    The check showed that the data of this database did not deteriorate, but the required actions (reduction to lower case by the function lower()still did not occur.

    The solution to the problem becomes interesting.



    Along with the continuation of the Google-reading has come to understand that this database cluster PostgreSQL server was initialized locale «C», and for the function lower(), and upper()it means a lot!

    I had to figure out how to reinitialize the database cluster without destroying the existing databases and the data in them. At the same time, this server is production - crontabsome data dumps are merged onto it ( ) once per hour. The benefit is that it is not so much production that there would be no free “window” for reinitialization.

    "Window" in 60 minutes and half an hour to prepare



    Until the end of the working day there was 1.5 hours and one free “window” lasting 60 minutes.

    I decided to start by warming up on a local laptop. Here it is worth mentioning the difference in operating systems: laptop - Ubuntu 8.10, server - CentOS 5. Having prepared three terminal windows and another text editor window, I started the preparatory work.

    Firstly, the first method needed to be divided in two - a dump of existing data and their restoration after cluster reinitialization.

    The database dump took place without strong complaints, only pg_dropcursed a couple of times at connected users to the same databases ( decided by closing pgAdmin ).

    Then it was found ( in the case of Ubuntu ) hidden in a rather unusual place (/usr/lib/postgresql/8.3/bin) command ( initdb) and executed in the necessary parameters.

    [bash]
    # su - postgres
    ~ vacuumdb --full --analyze --username postgres --dbname mydatabase
    ~ pg_dump mydatabase -Ft -v -U postgres -f /tmp/mydatabase.tar
    ~ dropdb mydatabase --username postgres
    ~ initdb --locale = ru_RU.utf8 data /
    


    Damn! Error ...

    Hmm, it turned out that you need to manually delete the contents of the database cluster directory.

    Warning! Do not do it right away rm -rf data/*. I understood this after I did it on the laptop, and after recovery, I lost access rights for users to the server (which are stored in pg_hba.conf).


    You need to make a copy of the file pg_hba.confsomewhere for the duration of the change.

    [bash]
    ~ cp data / pg_hba.conf /home/cr0t/pg_hba.2009.03.24_1654.conf
    


    After deleting the contents of the directory and stopping the PostgreSQL daemon, the cluster was reinitialized without error.

    [bash]
    ~ exit
    # /etc/init.d/postgresql stop
    # su - postgres
    ~ rm -rf data / *
    ~ initdb --locale = ru_RU.utf8 data /
    


    It only remained to restart the server and restore the old databases from the dumps to the already new cluster, initialized in the "correct" locale.

    [bash]
    ~ exit
    # /etc/init.d/postgresql start
    # su - postgres
    ~ createdb --encoding UNICODE mydatabase --username postgres
    ~ pg_restore /tmp/mydatabase.tar | psql --dbname mydatabase --username postgres
    ~ vacuumdb --full --analyze --username postgres --dbname mydatabase
    


    Success. Summary



    After these successful actions, the function lower()began to correctly “press” the Cyrillic characters. Everyone is happy. But I didn’t even think about custom PostgreSQL roles (as users started to be called in version 8.x). They are gone. Good thing I needed to create them all a couple. But who has a lot of them, be careful, do not repeat my mistake!

    PS Steps to re-initialize if there are multiple databases



    In my case, it was necessary to downgrade and subsequently restore 3 bases.
    To solve this problem, you need to add only additional repetitions of some actions to remove the dump and then restore it ( if you wish, you can even write an automation script;) ).

    [bash]
    # su - postgres
    ~ vacuumdb --full --analyze --username postgres --dbname mydb1
    ~ pg_dump mydb1 -Ft -v -U postgres -f /tmp/mydb1.tar
    ~ dropdb mydb1 --username postgres
    ~ vacuumdb --full --analyze --username postgres --dbname mydb2
    ~ pg_dump mydb2 -Ft -v -U postgres -f /tmp/mydb2.tar
    ~ dropdb mydb2 --username postgres
    ~ vacuumdb --full --analyze --username postgres --dbname mydb3
    ~ pg_dump mydb3 -Ft -v -U postgres -f /tmp/mydb3.tar
    ~ dropdb mydb3 --username postgres
    ~ cp data / pg_hba.conf ./
    ~ exit
    # /etc/init.d/postgresql stop
    # su - postgres
    ~ rm -rf data / *
    ~ initdb --locale = ru_RU.utf8 data /
    ~ cp pg_hba.conf data /
    ~ exit
    # /etc/init.d/postgresql start
    # su - postgres
    ~ createdb --encoding UNICODE mydb1 --username postgres
    ~ pg_restore /tmp/mydb1.tar | psql --dbname mydb1 --username postgres
    ~ vacuumdb --full --analyze --username postgres --dbname mydb1
    ~ createdb --encoding UNICODE mydb2 --username postgres
    ~ pg_restore /tmp/mydb2.tar | psql --dbname mydb2 --username postgres
    ~ vacuumdb --full --analyze --username postgres --dbname mydb2
    ~ createdb --encoding UNICODE mydb3 --username postgres
    ~ pg_restore /tmp/mydb3.tar | psql --dbname mydb3 --username postgres
    ~ vacuumdb --full --analyze --username postgres --dbname mydb3
    


    Cross-post from my blog Summer code The

    post is similar to the recently published Patchim UTF-8 Collation under FreeBSD , but it seems to me that the solution to the problem specific to FreeBSD is described there, but I give it for Ubuntu. When I solved my problem, I didn’t even use this information - only Google-reading.

    Thanks for the karma! , moved to the PostgreSQL blog.

    Also popular now: