Replication slots in PostgreSQL

Prior to version 9.0, PostgreSQL used WAL archiving to maintain a “warm” standby server. Version 9.0 introduced streaming replication and the ability to run a “hot” read-only standby. The upcoming PostgreSQL 9.4 adds new streaming-replication functionality called replication slots. Let's look at what they are and how they differ from the previous methods.
The first release candidate is currently available. Two Ubuntu 14.04 virtual machines were used as a test bench. The build and installation steps are identical on the primary and standby servers. We will build from source, after first installing the required packages:

sudo apt-get update && sudo apt-get -y install make g++ checkinstall libreadline6-dev zlib1g-dev 

Download and unpack the archive from the repository:

wget https://ftp.postgresql.org/pub/source/v9.4rc1/postgresql-9.4rc1.tar.gz 
tar xzf postgresql-9.4rc1.tar.gz 

Build and install the package:

cd postgresql-9.4rc1/ 
./configure 
make 
sudo checkinstall 

By default, the DBMS binaries are installed into /usr/local/pgsql/.
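For convenience you can put this directory on your PATH for the current shell session (optional; the rest of this article uses full paths):

export PATH=/usr/local/pgsql/bin:$PATH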
Add the postgres user to the system:

sudo useradd -M postgres 

Create a directory for the cluster:

sudo mkdir -p /data/db 
sudo chown postgres:postgres /data/db 
sudo chmod 0700 /data/db 

Next, the following steps are performed on the primary server. Initialize the cluster:

sudo -u postgres /usr/local/pgsql/bin/initdb -D /data/db

Besides the cluster structure, initdb also creates the default configs. Create a pg_log directory inside the cluster, where the logs will be stored:

sudo -u postgres mkdir /data/db/pg_log 
sudo -u postgres chmod 0700 /data/db/pg_log 

Add entries to pg_hba.conf so that users can connect and so that the standby server can fetch WAL from the primary:

host    all            all        192.168.1.0/24       md5 
host    replication    replica    192.168.1.108/32     md5 

In postgresql.conf, adjust the following parameters:

listen_addresses = '*'      # listen for incoming connections on all interfaces
wal_level = hot_standby     # WAL level required for streaming replication
max_wal_senders = 2         # number of simultaneous replication connections
logging_collector = on      # collect the logs into pg_log

Start the cluster:

sudo -u postgres /usr/local/pgsql/bin/pg_ctl -D /data/db start

Check the running processes:

ps aux | grep postgres 
postgres 21295  0.0  0.0  23700   604 ?        Ss   13:39   0:00 postgres: logger process                 
postgres 21297  0.0 13.6 170880 138408 ?  Ss   13:39   0:01 postgres: checkpointer process           
postgres 21298  0.0  5.0 170784 51076 ?     Ss   13:39   0:00 postgres: writer process                 
postgres 21299  0.0  0.5 170648  5148 ?      Ss   13:39   0:00 postgres: wal writer process             
postgres 21300  0.0  0.1 171052  1836 ?      Ss   13:39   0:00 postgres: autovacuum launcher process    
postgres 21301  0.2  0.1  25924  1060 ?        Ss   13:39   0:17 postgres: stats collector process

Create a replica user with replication rights:

/usr/local/pgsql/bin/psql -U postgres -c "create user replica with replication encrypted password '123'" 

Let's create a test database with data:

/usr/local/pgsql/bin/createdb -U postgres testdb 
/usr/local/pgsql/bin/psql -U postgres -d testdb -c "create table testtable (id serial, data text)" 
/usr/local/pgsql/bin/psql -U postgres -d testdb -c "insert into testtable select nextval('testtable_id_seq'::regclass), md5(generate_series(1,1000000)::text)"

Now let's set up the standby server.
Create a directory for the cluster:

sudo mkdir -p /data/db 
sudo chmod 0700 /data/db 
sudo chown postgres:postgres /data/db

Using the pg_basebackup utility, make a backup copy of the main server:

sudo -u postgres /usr/local/pgsql/bin/pg_basebackup -h 192.168.1.103 -U replica -D /data/db -X s 

pg_basebackup copies the entire contents of the cluster, including the configs, so on the standby we change the hot_standby parameter to on in postgresql.conf.
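That is, the standby's postgresql.conf should contain:

hot_standby = on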
Create a recovery.conf file in the cluster directory, in which we specify the parameters for connecting to the main server:

standby_mode='on' 
primary_conninfo='host=192.168.1.103 port=5432 user=replica password=123' 

We start the cluster on the backup server:

sudo -u postgres /usr/local/pgsql/bin/pg_ctl -D /data/db start

After that, a wal sender process should appear on the primary server, and a wal receiver process on the standby:
postgres 21295  0.0  0.0  23700   604 ?        Ss   13:39   0:00 postgres: logger process                 
postgres 21297  0.0  0.2 170756  2312 ?      Ss   13:39   0:00 postgres: checkpointer process           
postgres 21298  0.0  0.2 170784  2252 ?      Ss   13:39   0:00 postgres: writer process                 
postgres 21299  0.0  0.5 170648  5148 ?      Ss   13:39   0:00 postgres: wal writer process             
postgres 21300  0.0  0.1 171052  1804 ?      Ss   13:39   0:00 postgres: autovacuum launcher process    
postgres 21301  0.0  0.1  25924  1060 ?       Ss   13:39   0:00 postgres: stats collector process          
postgres 21323  0.0  0.2 171048  2108 ?      Ss   13:46   0:00 postgres: wal sender process replica 192.168.1.108(56673) streaming 0/4E000210 
postgres 15150  0.0  0.0  23700   612 ?       Ss   13:46   0:00 postgres: logger process                 
postgres 15151  0.0  0.1 170788  1496 ?     Ss   13:46   0:00 postgres: startup process   recovering 00000001000000000000004E 
postgres 15152  0.0  0.0 170680   944 ?       Ss   13:46   0:00 postgres: checkpointer process           
postgres 15153  0.0  0.1 170680  1204 ?      Ss   13:46   0:00 postgres: writer process                 
postgres 15154  0.0  0.0  25792   648 ?        Ss   13:46   0:00 postgres: stats collector process        
postgres 15155  0.6  0.1 174956  1660 ?      Ss   13:46   0:00 postgres: wal receiver process   streaming 0/4E000138


You can view the replication status through the pg_stat_replication view on the primary server:

testdb=# select * from pg_stat_replication; 
-[ RECORD 1 ]----+------------------------------ 
pid              | 21987 
usesysid         | 16384 
usename          | replica 
application_name | walreceiver 
client_addr      | 192.168.1.108 
client_hostname  | 
client_port      | 56674 
backend_start    | 2014-11-25 18:30:09.206434+03 
backend_xmin     | 
state            | streaming 
sent_location    | 0/5A2D8E60 
write_location   | 0/5A2D8E60 
flush_location   | 0/5A2D8E60 
replay_location  | 0/5A2D8E60 
sync_priority    | 0 
sync_state       | async 

You can see that the primary and standby servers are in sync. Now let's generate some more test data and check the replication status again right afterwards.

testdb=# insert into testtable select nextval('testtable_id_seq'::regclass), md5(generate_series(1,1000000)::text); 
INSERT 0 1000000 
testdb=# select * from pg_stat_replication; 
-[ RECORD 1 ]----+------------------------------ 
pid              | 21987 
usesysid         | 16384 
usename          | replica 
application_name | walreceiver 
client_addr      | 192.168.1.108 
client_hostname  | 
client_port      | 56674 
backend_start    | 2014-11-25 18:30:09.206434+03 
backend_xmin     | 
state            | streaming 
sent_location    | 0/63800000 
write_location   | 0/63680000 
flush_location   | 0/63680000 
replay_location  | 0/6136E160 
sync_priority    | 0 
sync_state       | async

Here we see that the standby has received all the WAL from the primary but has not yet replayed all of it, so it lags slightly behind. By default, PostgreSQL replicates asynchronously using WAL segments: binary files with a fixed size of 16 MB located in the pg_xlog directory. Their number is controlled by the checkpoint_segments and wal_keep_segments parameters. When the amount of changed data in the cluster exceeds the total size of the retained WAL, a checkpoint is triggered; the checkpointer process flushes the changes to the data files, after which old WAL segments are recycled. In the current stable version of PostgreSQL, the primary server does not take the state of the standby into account, so if the standby falls too far behind, the primary will recycle WAL segments before the standby has fetched them.
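To put a number on that lag, you can query the primary. A minimal sketch, assuming the 9.4 column names and the built-in pg_xlog_location_diff() function:

-- byte lag between the WAL the primary has sent and what the standby has replayed
select client_addr,
       pg_xlog_location_diff(sent_location, replay_location) as replay_lag_bytes
from pg_stat_replication;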
Temporarily block the standby server from connecting to port 5432:

sudo iptables -A OUTPUT -m tcp -p tcp --dport 5432 -j DROP
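The rule can later be removed with the matching delete command (standard iptables usage; this is the "reset the rule" step referred to below):

sudo iptables -D OUTPUT -m tcp -p tcp --dport 5432 -j DROP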

We will generate more data on the main server:

testdb=# insert into testtable select nextval('testtable_id_seq'::regclass), md5(generate_series(1,1000000)::text); 
INSERT 0 1000000

Now let's remove the iptables rule and look at the standby server's logs, where we see an unpleasant picture:

LOG:  started streaming WAL from primary at 0/78000000 on timeline 1 
FATAL:  could not receive data from WAL stream: ERROR:  requested WAL segment 000000010000000000000078 has already been removed

The primary server deleted WAL segments before the standby managed to fetch them, so the standby now has to be rebuilt from a fresh base backup. The problem is that the primary does not take the state of the standby into account: if there are network problems or just a slow link between the servers, then under heavy writes and/or data changes on the primary there is a real risk of breaking replication. A partial solution is to increase the wal_keep_segments parameter and to enable WAL archiving. But version 9.4 introduces replication slots. Let's see how they work.
On the main server, create a replication slot:

testdb=# SELECT pg_create_physical_replication_slot('standby_slot'); 
-[ RECORD 1 ]-----------------------+---------------- 
pg_create_physical_replication_slot | (standby_slot,) 

testdb=# select * from pg_replication_slots; 
-[ RECORD 1 ]+------------- 
slot_name    | standby_slot 
plugin       | 
slot_type    | physical 
datoid       | 
database     | 
active       | f 
xmin         | 
catalog_xmin | 
restart_lsn  | 

On the standby, add the following line to the existing recovery.conf file:

primary_slot_name = 'standby_slot'
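Restart the standby so the new setting takes effect, for example with the same pg_ctl path used earlier:

sudo -u postgres /usr/local/pgsql/bin/pg_ctl -D /data/db restart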

Once the standby is back up, disconnect it from the primary again and generate test data on the primary that exceeds the volume of retained WAL:

testdb=# insert into testtable select nextval('testtable_id_seq'::regclass), md5(generate_series(1,10000000)::text); 

Let's look at the WAL-related parameters, and then at the actual number of segment files in the pg_xlog directory:

testdb=# show checkpoint_segments; 
-[ RECORD 1 ]-------+-- 
checkpoint_segments | 3 

testdb=# show wal_keep_segments; 
-[ RECORD 1 ]-----+-- 
wal_keep_segments | 0 

testdb=#\! ls /data/db/pg_xlog | wc -l 
50

The maximum number of WAL files in the system can be estimated with the formula (2 + checkpoint_completion_target) * checkpoint_segments + 1.
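With the values shown above and the default checkpoint_completion_target of 0.5 (assumed here, since we did not change it), that works out to:

(2 + 0.5) * 3 + 1 = 8.5, i.e. roughly 8 or 9 segment files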
However, the actual number of WAL segments in pg_xlog is much higher. A replication slot stores, for each standby, how much WAL that standby has already fetched. WAL segments are now retained until the last standby picks them up, or until the replication slot is dropped; as the standby catches up, the pg_xlog directory on the primary shrinks again (a monitoring and cleanup sketch follows after the log excerpt below). After removing the iptables rule on the standby, we can see in its logs that replication has resumed.

testdb=#\! tail -f /data/db/pg_log/postgresql-2014-11-27_191036.log 
Is the server running on host "192.168.1.103" and accepting TCP/IP connections on port 5432? 	 
LOG:  started streaming WAL from primary at 0/A0000000 on timeline 1
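Since WAL is retained for as long as a slot needs it, an abandoned slot will keep pg_xlog growing indefinitely. A sketch of how to monitor and, if necessary, drop a slot on the primary (using the functions available in 9.4):

-- how much WAL the primary still retains for each slot
select slot_name, active,
       pg_xlog_location_diff(pg_current_xlog_location(), restart_lsn) as retained_bytes
from pg_replication_slots;

-- drop the slot once the standby is decommissioned
select pg_drop_replication_slot('standby_slot');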

Replication slots are a great tool for improving the reliability and convenience of replication in PostgreSQL.

Description of replication slots on the official PostgreSQL website: www.postgresql.org/docs/9.4/static/warm-standby.html#STREAMING-REPLICATION-SLOTS
