PostgreSQL data recovery after pg_control loss
To ensure fault tolerance, PostgreSQL, like many databases, uses a special logbook that keeps a history of data changes. Before writing data to database files, the PostgreSQL server accumulates the changes in the RAM and writes it to the serial log file so as not to lose them due to an unexpected power outage.
Data is written to the log before the database user receives a message about the successful application of the changes. This log is called the Write-Ahead Log or simply WAL, and the log files are stored in the pg_xlog directory. PostgreSQL also periodically flushes modified accumulated data from RAM to disk. This data matching process is called a checkpoint. A checkpoint is also performed every time PostgreSQL is shut down.
Information about which internal values the control point ended with is stored in the global / pg_control file, and therefore this file must be accessible to the DBMS before the data recovery. If PostgreSQL is disconnected abnormally, then the changes from the log files (pg_xlog) are applied to the database files, starting from the position of the last checkpoint. This process is called data recovery.
The pg_control file contains information:
- version of the control file format,
- the checksum of the data written to this file,
- version of the database file format,
- unique identifier of the database instance,
- current state: running / stopped,
- log position corresponding to the running and previous control points,
- current timeline
- maximum visible transaction number (xid),
- maximum number of internal counter of objects (oid),
- time of creation,
- and much more.
You can view the contents of pg_control using the pg_controldata utility:
$ pg_controldata /var/lib/pgsql/9.5/data
pg_control version number: 942
Catalog version number: 201510051
Database system identifier: 6242923005164171508
Database cluster state: in production
pg_control last modified: Fri Apr 29 01:00:00 2016
Latest checkpoint location: EEAF/BAA5520
Prior checkpoint location: EEAF/BAA5440
...
Latest checkpoint's NextXID: 7/876524573
Latest checkpoint's NextOID: 264355612
Latest checkpoint's NextMultiXactId: 134512401
Latest checkpoint's NextMultiOffset: 547842659
...
If the contents of pg_control were lost, then PostgreSQL will not be able to start the recovery procedure. If it happened that the database files unexpectedly disappeared, which can happen during an emergency shutdown with the parameter fsync=off
, then the correct way to restore is to switch to the backup copy. This article can be useful in cases when you need to restore the database to a minimum, but it is impossible to switch to a backup copy and you can donate part of the data.
The pg_control file is not protected from failures, and you can only restore it using the pg_resetxlog utility or the hex editor. Using pg_resetxlog, you may lose some of the data. You refuse all current transaction logs and believe that PostgreSQL has completed its work normally: all data is written to files, as if the checkpoint had just completed. You will also have to select the maximum visible transaction counter number. If you select the transaction number too large, the data files will not contain information that the DBMS has not yet flushed to disk from RAM, as the checkpoint creation process would have done. If you choose the transaction number too small, then the data recorded later will be invisible.
It is logical to choose the moment of the control point, but where can I get this value? The standard utility pg_xlogdump comes to the aid, with which you can see the contents of WAL files. You need to select the latest file where the checkpoint record is located, with the XLOG record type:
$ pg_xlogdump -r XLOG pg_xlog/$FILE
...
rmgr: XLOG len (rec/tot): 80/ 106, tx: 0, lsn: EEAF/0BAA5B40, prev EEAF/0BAA5B08, desc: CHECKPOINT_ONLINE redo EEAF/BAA5B08; tli 2; prev tli 2; fpw true; xid 7/876524573; oid 264355612; multi 134512401; offset 547842659; oldest xid 686019718 in DB 16400; oldest multi 128391103 in DB 16400; oldest/newest commit timestamp xid: 0/0; oldest running xid 876524573; online
In this case, you can select the following options for pg_resetxlog:
$ pg_resetxlog -x 876524573 -o 264355612 -m 134512401,128391103 -n /var/lib/pgsql/9.5/data
For the specified command to apply the values, you need to run it without a key -n
and with an additional key -f
. The command will clear the contents of the pg_xlog directory and write the new values to the pg_control file. After that, you can start PostgreSQL without data recovery.
If you selected a checkpoint for recovery, in order not to get into a situation where the displaced data from the buffer cache turned out to be written to the database files, it is recommended to set the parameter value autovacuum=off
and remove the logical backup using the pg_dump utility before starting the instance . If errors occur during the removal of the backup copy, use the option zero_damaged_pages=on
. After you make a logical backup, you must restore it on a new instance of PostgreSQL.
All the successful operation of PostgreSQL and backups at your fingertips!