Getting the difference between binary files using VCDIFF

    [Images: tortoise.jpg and tortoise_bad.jpg]


    I needed to understand where and how a JPEG file had been corrupted during transfer.


    VCDIFF is a format and algorithm for delta encoding, described in RFC 3284.

    Delta encoding is a way to represent data as the difference (delta) between successive versions of the data instead of the data itself.

    For clarity, the examples below use text files encoded in Windows-1251, but the same approach works just as well with binary files.


    Input files:


    "копия    текст    копия"       ( source.txt )
    "копия    изменения    копия"   ( target.txt )

    We need to get the difference between the files:


    "         изменения         "   ( source.txt -> target.txt )
    "         текст         "       ( target.txt -> source.txt )

    I use the xdelta3 program, but I think any tool that works with the VCDIFF format will do.


    How to get the difference


    We will need another file filled with spaces:


    "                           " ( spaces.txt )

    It must be at least as large as the source file (source.txt).
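One way to create such a filler file is sketched below in Python. The helper name `make_filler` and its parameters are my own, not part of xdelta3; it simply writes as many fill bytes as the source file has, which satisfies the "greater than or equal" condition.

```python
import os
import tempfile


def make_filler(source_path: str, filler_path: str, fill: bytes = b" ") -> int:
    """Write a file of `fill` bytes that is exactly as long as the source file."""
    if len(fill) != 1:
        raise ValueError("fill must be a single byte")
    size = os.path.getsize(source_path)
    with open(filler_path, "wb") as out:
        out.write(fill * size)
    return size


if __name__ == "__main__":
    # Demo on a throwaway directory; in practice pass source.txt and spaces.txt.
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "source.txt")
        with open(src, "wb") as f:
            f.write("копия    текст    копия".encode("cp1251"))
        print(make_filler(src, os.path.join(tmp, "spaces.txt")))  # prints: 23
```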


    Command:


    xdelta3 -e -A -n -s source.txt target.txt | xdelta3 -d -s spaces.txt

    Result:


             изменения         

    Flags used:
    -e - creates a delta (encode)
    -A - removes unnecessary headers (the application header)
    -n - disables the checksum (with the checksum enabled, the delta cannot be applied to a different source)
    -s [file] - the source against which the target file is compared and from which it is restored
    -d - reconstructs the target file from the delta and the source (decode)


    How it works


    If you run the command:


    xdelta3 -e -A -n -s source.txt target.txt | xdelta3 printdelta

    then, after all the headers, you will see the VCDIFF instructions:


      Offset Code Type1 Size1  @Addr1 + Type2 Size2 @Addr2
      000000 025  CPY_0      9 S@0
      000009 010  ADD        9
      000018 025  CPY_0      9 S@14

    VCDIFF is inherently very simple. It consists of three instructions:


    COPY (copy) - copies data from the source or from the target
    ADD (add) - writes data stored in the delta (unique data that is not in the source) to the target file
    RUN (repeat) - repeats one byte from the delta a specified number of times
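To make the three instructions concrete, here is a minimal Python sketch of a toy interpreter. It is my own simplification, not xdelta3's decoder: it does not parse the real VCDIFF byte format, the tuple layout and the name `apply_instructions` are assumptions made for illustration, and COPY only reads from the source (never from the target built so far). The instruction list mirrors the printdelta output above.

```python
def apply_instructions(source: bytes, instructions) -> bytes:
    """Rebuild a target from `source` using simplified COPY/ADD/RUN tuples.

    Toy model only: COPY reads from the source, never from the target
    produced so far, and nothing here parses real VCDIFF bytes.
    """
    target = bytearray()
    for op, *args in instructions:
        if op == "COPY":
            size, addr = args
            if addr < 0 or addr + size > len(source):
                raise ValueError("COPY range lies outside the source")
            target += source[addr:addr + size]
        elif op == "ADD":
            (data,) = args          # bytes carried inside the delta
            target += data
        elif op == "RUN":
            byte, count = args      # one byte repeated `count` times
            target += byte * count
        else:
            raise ValueError(f"unknown instruction: {op}")
    return bytes(target)


source = "копия    текст    копия".encode("cp1251")
delta = [
    ("COPY", 9, 0),                         # CPY_0 9 S@0
    ("ADD", "изменения".encode("cp1251")),  # ADD 9
    ("COPY", 9, 14),                        # CPY_0 9 S@14
]
print(apply_instructions(source, delta).decode("cp1251"))
# prints: копия    изменения    копия
```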


    The delta stores only the unique data; everything else is copied from the source. If you run the command:


    xdelta3 -e -A -n -s source.txt target.txt > target.vcdiff

    the delta will contain only the word "изменения" ("changes"), which appears only in the target file.


    D0A6D093D094200102011720131B20090302изменения190D0A19200E

    (JSON does not like special characters, so I converted them to hex.)


    If the delta is applied to the source (source.txt), we get the target file (target.txt):


    xdelta3 -d -s source.txt target.vcdiff
    копия    изменения    копия

    By replacing the source (source.txt) with a file filled with spaces (spaces.txt), we replace the data common to the source and the target file with spaces:


    xdelta3 -d -s spaces.txt target.vcdiff
             изменения         

    Any other character can be used to fill spaces.txt. The main condition is that spaces.txt is at least as large as the source file.
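A rough Python sketch of why the size condition matters, again in a toy model rather than real VCDIFF parsing. The function `masked_output` and its tuple format are hypothetical. Every COPY reads from the filler, so it needs its whole address range to exist there; I assume a too-short filler should be treated as an error, since how xdelta3 itself reacts is not covered here.

```python
def masked_output(instructions, filler_len: int, fill: bytes = b" ") -> bytes:
    """Decode simplified instructions against a filler of `filler_len` bytes.

    Toy model: every COPY reads from the filler, so copied regions turn into
    runs of `fill`; only ADD and RUN data from the delta stays visible.
    """
    out = bytearray()
    for op, *args in instructions:
        if op == "COPY":
            size, addr = args
            if addr + size > filler_len:
                raise ValueError(
                    f"COPY {size} @ {addr} reads past the {filler_len}-byte filler")
            out += fill * size
        elif op == "ADD":
            out += args[0]
        elif op == "RUN":
            byte, count = args
            out += byte * count
        else:
            raise ValueError(f"unknown instruction: {op}")
    return bytes(out)


# The same three instructions that printdelta reported for source.txt -> target.txt.
delta = [("COPY", 9, 0), ("ADD", "изменения".encode("cp1251")), ("COPY", 9, 14)]

# A filler exactly as long as the 23-byte source is enough, with any fill byte.
print(repr(masked_output(delta, filler_len=23).decode("cp1251")))
print(repr(masked_output(delta, filler_len=23, fill=b".").decode("cp1251")))

# A shorter filler cannot satisfy the COPY at address 14.
try:
    masked_output(delta, filler_len=20)
except ValueError as err:
    print(err)
```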


    I actually compared the JPEG files like this:


    xdelta3 -e -A -n -s bad_image.jpg good_image.jpg | xdelta3 -d -s spaces.txt

    The result of comparing these files:


    F488A2                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      F2AB                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                
                                                                                                                      

    The output consists of many spaces plus the bytes that were "broken". The broken bytes are shown in hex.
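Scanning such output by eye is tedious, so here is a small Python sketch that lists each run of non-filler bytes with its offset and hex value. The function name `changed_runs` is my own, and the sample input is made up for the demo (only the byte values are borrowed from the result above; the offsets are not real). One caveat: a changed byte that happens to equal the fill byte is invisible, so pick a fill value with that in mind.

```python
def changed_runs(masked: bytes, fill: int = 0x20):
    """Yield (offset, hex_string) for each run of bytes that differ from `fill`."""
    start = None
    for i, b in enumerate(masked):
        if b != fill and start is None:
            start = i                       # a run of changed bytes begins
        elif b == fill and start is not None:
            yield start, masked[start:i].hex().upper()
            start = None                    # back in the filler
    if start is not None:                   # run that reaches the end of the data
        yield start, masked[start:].hex().upper()


# Made-up sample: filler spaces with two "broken" runs in between.
sample = b"   \xf4\x88\xa2     \xf2\xab  "
print(list(changed_runs(sample)))
# prints: [(3, 'F488A2'), (11, 'F2AB')]
```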


    Test JPEG files you can use to try out comparison methods:


    tortoise.jpg (18,821 bytes, magnet link)
    tortoise_bad.jpg (18,829 bytes, magnet link)

    xdelta3 -e -A -n -s tortoise_bad.jpg tortoise.jpg | xdelta3 -d -s spaces.txt

    The result of comparing these files:


    F1BF                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             F0B786                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             F39BAF                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        F3BD94                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         

    The broken bytes are shown in hex.

