Restoring data from CockroachDB

    Restoring data from cockroachdb is easy - just roll everything from backup. How did backups not do it? For a base with version 1.0 released just six months ago? Well, do not despair, most likely the data can be restored. I will talk about how I restored the database for my project of a fun social network in bamboo and streamed this process on YouTube.

    How will we restore

    First you need to figure out what happened, why did CockroachDB fall? The reasons are different, but in any case, the server no longer starts or does not respond to requests. In my case, after a short google, the rocksdb base was beaten:

    E171219 15:50:36.54151725 util/log/crash_reporting.go:82  a panic has occurred!
    E171219 15:50:36.73448574 util/log/crash_reporting.go:82  a panic has occurred!
    E171219 15:50:37.24129825 util/log/crash_reporting.go:174  Reported as error 20a3dd770da3404fa573411e2b2ffe09
    panic: Corruption: block checksum mismatch [recovered]
    	panic: Corruption: block checksum mismatch
    goroutine 25 [running]:*Stopper).Recover(0xc4206c8500, 0x7fb299f4b180, 0xc4209de120)
    	/go/src/ +0xb1panic(0x1957a00, 0xc4240398a0)
    	/usr/local/go/src/runtime/panic.go:489 +0x2cf*Store).processReady(0xc420223000, 0x103)
    	/go/src/ +0x427

    Restore RocksDB storage

    If you have a rocksdb database broken, then the necessary command is already built in to cockroach version 1.1 to restore it:

    $ cockroach debug rocksdb repair

    You may also need to specify the path to the data directory, if the path is custom.

    After recovery, you can try starting cockroachdb again. In my case, this did not help, but the error was different:

    E171219 13:12:47.618517 1 cli/error.go:68  cockroach server exited witherror: cannot verifyemptyenginefor bootstrap: unable toreadstore ident: store has not been bootstrapped
    Error: cockroach server exited witherror: cannot verifyemptyenginefor bootstrap: unable toreadstore ident: store has not been bootstrapped

    Obviously, something broke somewhere in the settings, and I do not understand the storage format of cockroachdb well enough to understand that it is still missing. Therefore, we will go the other way: we know that inside this is Key-Value storage and even roughly know what we need to look for, as the developers told ( here and here ) about this in their blog.

    “Tearing” data directly from RocksDB

    Since the format of the keys and records is approximately known to us, you can take the directory with the data from the “broken” instance and try to go through all the keys and get the data directly from there. I will talk about the option with one host, but if there are many hosts and the entire cluster has died, then you will still need to learn how to delete duplicates and determine the latest versions of keys.

    We will write everything on go, of course. At first I decided to try to take the library , and it even started up, but it gave an error that it did not know the comparator cockroach_comparator. I took the correct comparator from the source code of cockroach itself, but nothing has changed.

    Since I was too lazy to figure out what’s the matter, I decided to go the other way and simply took the ready-made package right from the source code of cockroachdb itself: the package has everything you need for in order to work correctly with the KV-base.

    Therefore, we will open the database and begin to iterate and try to look for the names of the keys, in the value of which there are some lines that we know exactly what is in the database:

    package main
    import""funcmain() {
    	db, err := engine.NewRocksDB(engine.RocksDBConfig{
    		Dir:       "/Users/yuriy/tmp/vbambuke",
    		MustExist: true,
    	}, engine.NewRocksDBCache(1000000))
    	if err != nil {
    		log.Fatalf("Could not open cockroach rocksdb: %v", err.Error())
    		engine.MVCCKey{Timestamp: hlc.MinTimestamp},
    		func(kv engine.MVCCKeyValue)(bool, error) {
    			if bytes.Contains([]byte(kv.Value), []byte("")) {
    				log.Printf("Email key: %s", kv.Key)
    			returnfalse, nil

    I got something like this:

    Email key: /Table/54/1/158473728194052097/0/1503250869.243064075,0

    This key has quite a few components, but here's what I managed to find out:

    0. Table means “table” :)
    1. Table number (tables should go in the order of creation)
    2. Type of key. 1 means ordinary record, 2 means index
    3. The value of the primary key (1,2,3, ...)
    4. I do not know, apparently the version?
    5. timestamp

    That is, you can display all the keys and their values ​​for all tables in a separate file and break them by table number. After that, it should become more or less clear how the tables are called (and what is their structure, because you still have it :)?).

    Parsing the record format

    I recovered data from cockroachdb version 1.0.4, so for later versions the details may be different. But here is what I managed to understand:

    1. The first 6 bytes in the value can be ignored. Apparently, this is a checksum of data and some other meta-information, for example, bits about nullable fields
    2. Next comes the data itself, and before each column, except the first, there is a separate byte with its type

    Example from the messages table (I used od in order to get a readable view of binary data):

    The structure of the messages table was like this:

    CREATETABLE messages (
      user_id BIGINT,
      user_id_to BIGINT,
      is_out BOOL,
      message TEXT,
      ts BIGINT

    $ head -n 2 messages | od -c00000001   /   1   /   0   /   15032508680000020    .   727554828   ,   00000040    =                     241   E 270276  \n   # 202 200 230 3160000060316   ˁ  ** 2630040232022002042313742352222640040320000100026031   N   o   w       I       u   s   e       r   e   a   l
    0000120        P   o   s   t   g   r   e   S   Q   L 0232302772562170000140240320375342   (  \n                                        

    Let's analyze this data in order:

    1. first in the file I wrote the key name - in the fragment
    0000000    1   /   1   /   0   /   1   5   0   3   2   5   0   8   6   8
    0000020    .   7   2   7   5   5   4   8   2   8   ,   0                
    0000040    =                     

    this is all a piece of the key from which we need to take the value of the primary key (the format of the keys is described above)

    2. Heading. On the line 0000040 after the key is a 6-byte header:
    241   E 270276  \n   #
    it is always different, but for all my tables the first 6 bytes needed to be simply skipped.

    3. The first field, user_id. The numbers I met in cockroachdb have always been varint encoded from the standard library. The first column can be read using binary.Varint . We will have to read the following piece:
    0000040    =                     241E270276  \n   #        отсюда   --->    202 200 230 3160000060316   ˁ  ** 263004  <----     досюда      023 202 200 204 231 374 235 222 264 004 032

    4. The second field, user_id_to. It turned out that at the beginning of the field is its type and 023 means a number and can be read exactly like varint. You can write the appropriate functions for reading such columns from a byte array:
    funcreadVarIntFirst(v []byte)([]byte, int64) {
    	res, ln := binary.Varint(v)
    	if ln <= 0 {
    		panic("could not read varint")
    	return v[ln:], res
    funcreadVarInt(v []byte)([]byte, int64) {
    	if v[0] != '\023' {
    		panic("invalid varint prefix")
    	return readVarIntFirst(v[1:])

    5. Next is the Boolean field. I had to tinker a bit, but I was able to find out that you can use a ready-made function from a package encoding.DecodeBoolValue This function works approximately the same as the ones announced above, only returns an error instead of panic. We use panic for convenience - we do not need to handle errors very intelligently in a one-time utility.
    6. Next is the message text. Before the text fields comes byte 026, then the length and then the contents. It looks something like this:

    0000100026031   N   o   w       I       u   s   e       r   e   a   l
    0000120        P   o   s   t   g   r   e   S   Q   L 0232302772562170000140240320375342   (  \n                                        

    One would think that the first byte is the length, and then the text itself goes further. If the values ​​are small (conditionally up to 100 bytes), then this even works. But in fact, the length is encoded in another way, and the length can also be read using the functions from the encoding package:

    funcreadStringFirst(v []byte)([]byte, string) {
    	v, _, ln, err := encoding.DecodeNonsortingUvarint(v)
    	if err != nil {
    		panic("could not decode string length")
    	return v[ln:], string(v[0:ln])
    funcreadString(v []byte)([]byte, string) {
    	if v[0] != '\026' {
    		panic("invalid string prefix")
    	return readStringFirst(v[1:])

    7. Well, the final regular number, we read using our readVarInt function.

    Reading a column of type DATE

    I was tormented with a column of type DATE, because in the encoding package the right function was not found right away :). I had to improvise. I won’t bother you for a long time, the DATE format is an ordinary number (column type 023 hints), and it says ... The number of seconds in UNIX TIME format divided by 86400 (the number of seconds in days). That is, to read the date, you need to multiply the read number by 86400 and treat this as unix time:

    v, birthdate := readVarInt(v)
    ts := time.Unix(birthdate*86400, 0)
    formatted := fmt.Sprintf("%04d-%02d-%02d", ts.Year(), ts.Month(), ts.Day())

    Insert back to base

    To insert the data back into the database, I personally wrote a simple function for escaping strings:

    funcescape(q string)string {
    	var b bytes.Buffer
    	for _, c := range q {
    		if c == '\'' {
    	return b.String()

    And used it to make SQL queries manually:

    	"INSERT INTO messages2(id, user_id, user_id_to, is_out, message, ts) VALUES(%s, %d, %d, %v, '%s', %d);\n",
    	pk, userID, userIDTo, isOut, escape(message), ts,

    But you can compose a CSV, use your model for the base, use prepared expressions, etc. - as you please. It is not difficult after you have parsed the binary data storage format in CockroachDB :).

    Links, conclusions

    Thank you for scrolling to the end :). Better do backups, and don’t act like me. But if suddenly you really need to pull data from CockroachDB, then this article will have to help you a little. Do not lose data!

    My funny social network
    Sources of my data recovery utility
    Process on youtube (2 of 3 videos)

    Also popular now: