Another new feature of pg_filedump: restoring the PostgreSQL directory



    In a previous article, we learned how to use pg_filedump utility to recover data, or at least some of it, from a completely killed PostgreSQL database. It was assumed that from somewhere we know the numbers of segments corresponding to the table. If we know some of the contents of the table, its segments are really not difficult to find, for example, with a simple grep. However, in a more general case, this is not so easy to do. In addition, it was assumed that we know the exact layout of the tables, which is also far from a fact. So, recently, my colleagues and I made a new patch for pg_filedump, which allows us to solve these problems.


    So, let's say we want to restore a table called test. If we don’t remember the name of the table, it’s not scary, so using the method described below you can get the names of all tables in the database. Information about the tables is stored in the pg_class catalog table, the segment of which is always numbered 1259.


    Using the latest version of pg_filedump, we can read pg_class as follows:


    ./pg_filedump -D name,oid,oid,oid,oid,oid,oid,~ /path/to/base/16384/1259 | grep COPY | grep test

    Pay attention to the list of types for decoding, which we pass to pg_filedump:


    name,oid,oid,oid,oid,oid,oid,~

    Here at the beginning we pass the type names of the first seven columns of the table (the pg_class scheme is known and described in the documentation ), and the tilde says to ignore the remaining columns. In this case, they are still not interesting to us; there is no need to list them all.


    Output Example:


    COPY: test  220016387010016385COPY: test  220016387010016385COPY: test_pkey 2200001040316391

    The last column is relfilenode, i.e. the segment number. We need him! Remember, 16385.


    But wait, we don’t know the table layout. The pg_attribute catalog table will help us find it, the relfilenode of which is hardcoded and equal to 1249. By the way, you can look at the relfilenode of all catalog tables in the pg_class.h file .


    Open the dock by pg_attribute , decode:


    ./pg_filedump -D oid,name,oid,int,smallint,~ /path/to/base/16384/1249 | grep COPY | grep 16385

    Output Example:


    COPY: 16385 k   23-14COPY: 16385 v   25-1-1COPY: 16385 ctid    2706COPY: 16385 xmin    2804COPY: 16385 cmin    2904COPY: 16385 xmax    2804COPY: 16385 cmax    2904COPY: 16385 tableoid    2604

    As you can see, the table has two columns with the names k and v (the rest are system columns, they are needed for MVCC to work, and that's all). Here 23 and 25 are atttypid, that is, column types. But how to understand what these types are?


    The answer is contained in the pg_type catalog table (relfilenode = 1247, dock ):


    ./pg_filedump -i -D name,~ /path/to/base/16384/1247 | grep -A5 -E 'OID: (23|25)'

    Output Example:


      XMIN: 1  XMAX: 0  CID|XVAC: 0OID: 23
      Block Id: 0  linp Index: 8   Attributes: 30   Size: 32
      infomask: 0x0909 (HASNULL|HASOID|XMIN_COMMITTED|XMAX_INVALID) 
      t_bits: [0]: 0xff [1]: 0xff [2]: 0xff [3]: 0x07COPY: int4--
      XMIN: 1  XMAX: 0  CID|XVAC: 0OID: 25
      Block Id: 0  linp Index: 10   Attributes: 30   Size: 32
      infomask: 0x0909 (HASNULL|HASOID|XMIN_COMMITTED|XMAX_INVALID) 
      t_bits: [0]: 0xff [1]: 0xff [2]: 0xff [3]: 0x07COPY: text

    So, now we have all the necessary information on hand. The table is called test, has relfilenode 16385 and contains two columns - k with type int4 and v with type text. Now we can dump its contents, as described in the previous article .


    I hope that in practice you will never need this knowledge :) If you have questions or additions, I will be glad to see them in the comments!


    Also popular now: