Rich signature, or what the MS compiler hides

    Introduction
    Many programmers, and simply curious people, have probably noticed that some exe, dll, sys and similar files contain incomprehensible data between the MZ and PE headers, ending with the word Rich.

    Many paid no attention to it, while some could tell from it that the file was built with a Microsoft compiler. Others believed it hid some data the program needs in order to work. And this is only one side of the matter.

    The other side is the widespread belief that Microsoft deliberately marks executables built with its tools (C / C++ / MASM), supposedly to help track down the authors of malicious programs.
    Many believe that information about the computer or the user is written into the file at link time.

    Putting the two together, it is tempting to conclude that the Rich data may well be that identifier, the one by which the person or computer that built a malicious program can be determined. This article checks whether that is true.

    Rich in detail
    Visual inspection reveals the following main features of this data:
    • It always comes after the MZ header but before the PE header.
    • Most often it starts at offset 80h (128) from the beginning of the file.
    • Programs compiled on the same computer have the same Rich data.
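
    The first observation can be turned into a small search routine. The sketch below is only an illustration: it works on a byte buffer rather than a real file, and the names ReadDWord and FindRichOffset are made up for this example. It relies on the PE header offset (e_lfanew) being stored at 3Ch in the MZ header, and it assumes the Rich signature is aligned on a 4-byte boundary.

```pascal
program FindRich;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

const
  RICH_SIG = $68636952; // the bytes 'R','i','c','h' read as a little-endian DWORD

// Reads a little-endian 32-bit value from Buf at byte offset Ofs.
function ReadDWord(const Buf: array of Byte; Ofs: Integer): Cardinal;
begin
  Result := Cardinal(Buf[Ofs]) or (Cardinal(Buf[Ofs + 1]) shl 8) or
            (Cardinal(Buf[Ofs + 2]) shl 16) or (Cardinal(Buf[Ofs + 3]) shl 24);
end;

// Returns the byte offset of the Rich signature, or -1 if none is found
// between the end of the MZ header (40h) and the PE header.
function FindRichOffset(const Buf: array of Byte): Integer;
var
  PEOfs, Ofs: Integer;
begin
  Result := -1;
  if Length(Buf) < $40 then Exit;
  PEOfs := Integer(ReadDWord(Buf, $3C)); // e_lfanew: offset of the PE header
  if (PEOfs < $40) or (PEOfs > Length(Buf)) then Exit;
  Ofs := $40;
  while Ofs + 4 <= PEOfs do // assumption: the signature is DWORD-aligned
  begin
    if ReadDWord(Buf, Ofs) = RICH_SIG then
    begin
      Result := Ofs;
      Exit;
    end;
    Inc(Ofs, 4);
  end;
end;

var
  Image: array[0..255] of Byte; // made-up stand-in for the start of a file
begin
  FillChar(Image, SizeOf(Image), 0);
  Image[0] := Ord('M'); Image[1] := Ord('Z');
  Image[$3C] := $C0;                       // pretend the PE header starts at C0h
  Image[$B0] := Ord('R'); Image[$B1] := Ord('i');
  Image[$B2] := Ord('c'); Image[$B3] := Ord('h');
  WriteLn(FindRichOffset(Image));          // prints 176 (B0h)
end.
```

    Searching for the signature instead of hard-coding 80h covers the files where the data does not start at the usual position.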

    Example Rich data (marked in pink):
    [image]

    That is, its structure is approximately the following:

    XXXXXXXX - YYYYYYYY - YYYYYYYY - YYYYYYYY
    XXXXXXXX - XXXXXXXX - XXXXXXXX - XXXXXXXX
    XXXXXXXX - XXXXXXXX - XXXXXXXX - XXXXXXXX
    [Rich signature] - YYYYYYYY

    where
    YYYYYYYY - a repeating 32-bit value
    XXXXXXXX - 32-bit values that differ every time
    [Rich signature] - 4 bytes forming the word Rich

    What does Rich hide?
    It took a lot of time to find out anything at all about what kind of data is stored there. Eventually an article turned up on a foreign site (http://ntcore.com/files/richsign.htm) in which Daniel Pistelli examines this signature, how it is formed, and a lot of other useful data.

    As that article shows, the number we marked as YYYYYYYY is the encryption / decryption key. More precisely, it is just a mask for an XOR operation, which is all the encryption there is.

    Decryption:
    • Take the double word that follows the Rich signature: this is the key.
    • XOR each double word of the Rich data with the key, continuing until the Rich signature is reached (the signature itself does not need decrypting).
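
    The two steps can be sketched as follows. This is a minimal illustration, not the author's routine: it assumes the block has already been read into an array of double words, and the name DecryptRich, the sample key and the sample values are all made up.

```pascal
program RichXor;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  SysUtils;

const
  RICH_SIG  = $68636952; // 'Rich' as a little-endian DWORD
  SampleKey = $12345678; // made-up key, for the demo only
  // Made-up plain text: DanS, three zeros, then one pair of values.
  Plain: array[0..5] of Cardinal = ($536E6144, 0, 0, 0, $00010002, $00000003);

// Decrypts Words in place. Returns the index of the Rich signature,
// or -1 if the signature or the key after it is missing.
function DecryptRich(var Words: array of Cardinal): Integer;
var
  i, RichIdx: Integer;
  Key: Cardinal;
begin
  Result := -1;
  RichIdx := -1;
  for i := 0 to High(Words) do
    if Words[i] = RICH_SIG then
    begin
      RichIdx := i;
      Break;
    end;
  if (RichIdx < 0) or (RichIdx = High(Words)) then Exit;
  Key := Words[RichIdx + 1];       // step 1: the key follows the signature
  for i := 0 to RichIdx - 1 do     // step 2: XOR everything before the signature
    Words[i] := Words[i] xor Key;
  Result := RichIdx;
end;

var
  Block: array[0..7] of Cardinal;
  i, RichIdx: Integer;
begin
  for i := 0 to 5 do
    Block[i] := Plain[i] xor SampleKey; // build an encrypted sample block
  Block[6] := RICH_SIG;
  Block[7] := SampleKey;

  RichIdx := DecryptRich(Block);
  for i := 0 to RichIdx - 1 do
    WriteLn(IntToHex(Int64(Block[i]), 8)); // the first line printed is 536E6144
end.
```

    Because XOR is its own inverse, the same loop that decrypts the block would also encrypt it again.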

    As a result, we get something like this:

    DanS-00000000 - 00000000 - 00000000
    XXXXXXXX - XXXXXXXX - XXXXXXXX - XXXXXXXX
    XXXXXXXX - XXXXXXXX - XXXXXXXX - XXXXXXXX
    Rich-YYYYYYYY

    where
    XXXXXXXX - decrypted data
    00000000 - a number XORed with itself = 0

    The DanS signature was most likely added so that one can check that this really is Rich data and that it was not faked with a simple pseudo-random number generator.

    The author of that article then ran the compiler through IDA to study in more detail what is written in the Rich data. As it turned out, the format is this:

    XXXX00YY ZZZZZZZZ XXXX00YY ZZZZZZZZ
    XXXX00YY ZZZZZZZZ XXXX00YY ZZZZZZZZ

    where
    XXXX - the major version
    YYYY - the minor version
    ZZZZZZZZ - something like a build version

    From this we can safely say that the Rich data is just the versions of the libraries and the compiler that were used to create the program.

    Practical implementation
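
    To make the record layout concrete, here is a minimal sketch that splits one decrypted pair of double words into its fields. The type TRichEntry and the function DecodeEntry are hypothetical names, the sample pair is invented, and taking the full high and low words of the first double word is an assumption of this sketch.

```pascal
program RichEntry;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

type
  TRichEntry = record
    MajorVer: Word;   // high word of the first double word
    MinorVer: Word;   // low word of the first double word
    Build: Cardinal;  // the second double word (ZZZZZZZZ)
  end;

// Splits one decrypted pair of double words into its fields.
function DecodeEntry(First, Second: Cardinal): TRichEntry;
begin
  Result.MajorVer := Word(First shr 16);
  Result.MinorVer := Word(First and $FFFF);
  Result.Build := Second;
end;

var
  E: TRichEntry;
begin
  E := DecodeEntry($000923FA, 5); // made-up pair: 0009h / 23FAh, then 5
  WriteLn(E.MajorVer, '.', E.MinorVer, ' ', E.Build); // prints 9.9210 5
end.
```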
    It is always better, though, to see with your own eyes which versions are there, and it would be nice to automate the process.
    So let's write a small Delphi function that displays the library versions stored in the Rich data.

    // input: path and name of the file
    // (requires Windows and SysUtils in the uses clause)
    procedure PrintRichData(FileName : string);
    var
      Lib : HMODULE; // base address of the file loaded into memory
      Data : DWORD; // current value
      Key : DWORD; // encryption key
      x : integer;
      cnt : integer; // number of double words before the Rich signature
      MinVer, MajVer : WORD; // version
      Times : DWORD; // additional info
      Msg : string;
    begin
      Msg := '';
      // load the file under test into memory
      Lib := LoadLibrary(PChar(FileName));
      if Lib <> 0 then // loaded successfully
      begin
        cnt := 0;
        while true do // walk the Rich data to find its end and its length
        begin
          // read the current value
          Data := DWORD(pointer(Lib + $80 + (cnt shl 2))^);
          if Data = 0 then // an empty value means there is no Rich data
          begin
            cnt := 0;
            break;
          end;
          if Data = $68636952 then break; // end of data: the Rich signature
          inc(cnt); // move on to the next value
        end;

        if cnt <> 0 then // Rich data is present
        begin
          // read the encryption mask
          Key := DWORD(pointer(Lib + $80 + ((cnt+1) shl 2))^);
          x := 4; // the first element is DanS, followed by 3 copies of the key, so start at the 4th element
          while x < cnt-1 do // walk all the elements
          begin
            Data := DWORD(pointer(Lib + $80 + (x shl 2))^) xor Key; // decrypt
            MinVer := Data and $FFFF; // minor version
            MajVer := (Data shr 16) and $0F; // major version (low 4 bits of the high word)
            inc(x);
            Times := DWORD(pointer(Lib + $80 + (x shl 2))^) xor Key; // additional info
            Msg := Msg + 'Ver: ' + inttostr(MajVer) + '.0.' + inttostr(MinVer) + ' Times:' + inttostr(Times) + #13#10;
            inc(x); // next element
          end;

          MessageBox(0, PChar(Msg), 'INFO', 0);
        end;

        FreeLibrary(Lib);
      end;
    end;

    Running this function gives the version information. In my case, the test files had the following library versions:

    Ver: 1.0.0 Times:44
    Ver: 9.0.9210 Times:5
    Ver: 0.0.9210 Times:1
    Ver: 12.0.9178 Times:8
    Ver: 13.0.9210 Times:1

    What threatens us
    From the above we can say that Rich does not store any important, let alone confidential, information, so there is no need to worry about its presence in your programs. Still, the following key points can be identified:
    • You can find out which version of the compiler was used.
    • The same compiler version on different computers produces the same Rich data for the same program. So this data cannot directly prove that a program was built on a particular computer, let alone by you.
    • The encryption mask is calculated from the PE header data, so antiviruses can use it as a signature when analyzing a malicious program.
    • Removing this signature may make antiviruses more interested in your program.
    • Random values in the version data can also attract the attention of antiviruses.
    • When writing file protection systems, if completely random Rich data is used, antiviruses can tell at once that the file has been modified, because the second signature check (DanS) fails. So it is better to generate this data yourself, preferably from a set of the most common library versions.
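
    The DanS check mentioned in the last point could look roughly like this. It is a sketch of one possible check, not a description of what any particular antivirus does: the name HasValidDanS, the key and the sample blocks are invented, and the test simply follows the decrypted layout shown earlier (DanS followed by three zero double words).

```pascal
program RichCheck;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

const
  DANS_SIG = $536E6144; // 'DanS' as a little-endian DWORD

// Words holds the still-encrypted block, Key the DWORD found after 'Rich'.
// True when the block decrypts to DanS followed by three zero double words.
function HasValidDanS(const Words: array of Cardinal; Key: Cardinal): Boolean;
begin
  Result := (Length(Words) >= 4) and
            ((Words[0] xor Key) = DANS_SIG) and
            ((Words[1] xor Key) = 0) and
            ((Words[2] xor Key) = 0) and
            ((Words[3] xor Key) = 0);
end;

const
  SampleKey = $12345678; // made-up key
var
  Genuine, Forged: array[0..3] of Cardinal;
begin
  // A well-formed start of the block: DanS and three zeros, XORed with the key.
  Genuine[0] := DANS_SIG xor SampleKey;
  Genuine[1] := SampleKey;
  Genuine[2] := SampleKey;
  Genuine[3] := SampleKey;
  // Arbitrary values standing in for "completely random" Rich data.
  Forged[0] := $0BADF00D;
  Forged[1] := $01020304;
  Forged[2] := $0A0B0C0D;
  Forged[3] := $11223344;
  WriteLn(HasValidDanS(Genuine, SampleKey)); // TRUE
  WriteLn(HasValidDanS(Forged, SampleKey));  // FALSE
end.
```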


    References:
    1) Daniel Pistelli (http://ntcore.com/files/richsign.htm) - A detailed description of Rich data, a key generation method, and a compiler patch to disable Rich data creation.
    2) Vovane (http://www.wasm.ru/forum/viewtopic.php?id=8572) - A program for forging and verifying Rich data.
