Master crash in a PostgreSQL cluster: what to do?

    Greetings. Today I would like to talk about an unpleasant situation: the failure of the master when using native (streaming) replication in PostgreSQL 9.x. So, suppose you have a cluster of two or more PostgreSQL servers and a meteorite suddenly falls on the master. It is logical to assume that one of the replicas will have to be promoted to master. There are two ways to do this.

    1. Using a trigger file.


    The replication setup manual says that in recovery.conf, among other things, you can (and should) specify the trigger_file parameter. Everything is simple here: as soon as you create the file named in this parameter on the replica, PostgreSQL interrupts the recovery process (in our case, replication) and opens a new timeline.
    This means that after the trigger file is created, the position of the binary log counter will change, and not sequentially (say, from 000000010000000000000043 to 000000010000000000000044), but onto a new timeline (to 000000020000000000000043).
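    In recovery.conf terms it looks something like this (a minimal sketch; the master address and the trigger file path are illustrative assumptions, not taken from a real setup):

    # recovery.conf on the replica
    standby_mode = 'on'
    primary_conninfo = 'host=192.168.100.2 port=5432 user=postgres'
    trigger_file = '/tmp/pgsql.trigger'

    Promotion then amounts to a single command on the replica: touch /tmp/pgsql.trigger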

    The good news is that this method does not require a restart: everything happens on the fly. There will be no downtime (not counting the time to change configs on the clients), all connections will be preserved; PostgreSQL will simply kill the walreceiver process and allow writes.

    The bad news is that this method is only good if you have just two servers, a master and one replica, because if the cluster consists of 3 or more machines, you cannot make this node the new master without rebuilding the other replicas: when you try to attach another replica to the new master, PostgreSQL invariably says the following:

    FATAL: timeline 2 of the primary does not match recovery target timeline 1

    All attempts to slip it the history file (which stores the point of transition to the new timeline; this file is created every time the recovery process completes) failed as well. The participants of the official mailing list hold the same view: with this approach, the other replicas have to be rebuilt (in the case of 9.0, using pg_start_backup / pg_stop_backup and rsync, and in the case of 9.1, using the pg_basebackup utility).
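    On 9.1, rebuilding a replica boils down to something like this (a minimal sketch; the data directory, the host name and the replication role are assumptions and will differ in your setup):

    # stop the replica and clear its old data directory
    pg_ctl -D /var/lib/postgresql/9.1/main stop -m fast
    rm -rf /var/lib/postgresql/9.1/main/*

    # take a fresh base backup from the new master, together with the WAL it needs
    pg_basebackup -h new-master -U replication -D /var/lib/postgresql/9.1/main -P -x

    After that, recreate recovery.conf pointing at the new master and start the replica.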

    2. Removing recovery.conf


    For some reason I could not find a description of the second method in the manuals (maybe it is there and I just was not attentive enough; I will not argue). I think you will agree that it is simple, logical and, apparently, reliable (at least, I have not managed to break anything outright during repeated experiments):

    1. Find the most up-to-date replica in the cluster. You can do this from the console by running something like the following on each host (if the meteorite has not yet hit your master, but you expect it any minute, it is better to shut the master node down beforehand so that the binary log positions do not change during your manipulations):

    # ps aux|grep postg|grep rec
    postgres 143 0.0 13.2 8692004 6533448 ? Ss Feb06 3:58 postgres: startup process recovering 00000001000001E500000054
    postgres 2683 0.0 0.0 8699452 4044 ? Ss Feb09 0:33 postgres: wal receiver process streaming 1E5/542F9970
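    Alternatively, you can ask each replica for its WAL receive position directly and pick the highest one (a minimal sketch; the host names and password-less superuser access are assumptions):

    for host in replica1 replica2 replica3; do
        echo -n "$host: "
        psql -h "$host" -U postgres -At -c "SELECT pg_last_xlog_receive_location();"
    done
    # the replica reporting the highest position is the freshest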




    2. On the selected replica, edit postgresql.conf so that the node can act as a master (the following parameters are taken from the replication setup manual; in your case the values may, of course, differ):
    wal_level = hot_standby
    max_wal_senders = 5
    wal_keep_segments = 32
    archive_mode = on
    archive_command = 'cp %p /path_to/archive/%f'



    3. Edit pg_hba.conf so that the other replicas are allowed to connect to the new master for replication:

    host replication postgres 192.168.100.3/32 trust

    4. Delete recovery.conf on the new master.
    5. Restart PostgreSQL on the new master.
    6. Edit recovery.conf on the remaining replicas (point them at the new master) and restart them; a sketch of steps 4-6 follows below.

    In such a simple way you can turn a replica into a master without losing the binary log position. An obvious drawback is that you have to restart the entire cluster (although, if you can move the IP address from the old master to the new one, the replicas will not need a restart).
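    For concreteness, steps 4-6 look something like this from the console (a minimal sketch; the data directory, paths and the new master's address are assumptions):

    # on the new master: drop recovery.conf and restart
    mv /var/lib/postgresql/9.1/main/recovery.conf /tmp/recovery.conf.bak
    pg_ctl -D /var/lib/postgresql/9.1/main restart -m fast

    # on each remaining replica: edit recovery.conf to point at the new master, e.g.
    # primary_conninfo = 'host=192.168.100.4 port=5432 user=postgres'
    # then restart the replica as well
    pg_ctl -D /var/lib/postgresql/9.1/main restart -m fast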
    If for some reason you want to make a replica whose binary log position is not the newest the new master, you will first have to promote the newest replica to master (it is advisable to shield it from stray write connections), let the replica you ultimately want as master catch up with it, and only then perform all the operations described above on that replica (that is, in effect, you promote a master twice: first for one replica only, and then for everyone else).

    In general, during the experiments the only problematic spot we managed to find was the situation where one of the replicas lagged so far behind that the new master no longer had the XLOG files it needed. How to deal with this I described in my previous post; I will only add that if you ship binary logs to a backup server via archiving, this problem can hardly be called significant.
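    In that case the lagged replica can pull the missing segments from the archive via restore_command in its recovery.conf (a sketch; the archive path is an assumption and must mirror the archive_command shown above):

    restore_command = 'cp /path_to/archive/%f %p'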
