How we (almost) defeated DirCrypt



    Translation of an article from Check Point's Malware Research Team.

    DirCrypt is one of the nastiest kinds of ransomware. It not only encrypts every user file it finds and demands a ransom for decryption, but also stays resident on the system, encrypting files on the fly as the user saves them. Once it appears, working on the computer becomes torture.

    Victims of such malware are usually advised to restore their files from an earlier backup. If no backup exists, a hard choice remains: accept the loss of the data or pay the attacker. In the case of DirCrypt, however, we (Check Point's Malware Research Team) found a way to recover almost all of the data by exploiting weaknesses in its encryption implementation.

    A typical DirCrypt victim learns about the attack only after it has happened. The malware scours the hard drives for documents, images, and archives, then appends the string ".enc.rtf" to the name of each file it finds. If you view the raw contents of such a file, you will see no trace of the old data. Open it as an RTF document and you will see instructions for paying the extortionist.


    File system before file encryption


    File system after file encryption

    This is where our investigation begins. The disassembled code of the program is convoluted and obfuscated, and we need to find the part that performs the encryption. Since the suffix ".enc.rtf" is added to every processed file, it is logical to expect a reference to this string somewhere near the encryption functions. But a quick look in Hex-Rays' IDA Pro shows that most of the strings in the binary are obfuscated. So our first task is to decrypt these strings.

    One way to locate encrypted strings is simply to scan all sections for a block of seemingly random data, and then find a function with a cross-reference (DATA XREF) to that block. If that function is passed an index that stays within a bounded range, we have found what we were looking for.
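
    The "seemingly random data" check can be roughly automated by measuring Shannon entropy over fixed-size windows of a section. The following is a minimal sketch, not the procedure we actually used; the function names, the window size, and the threshold are illustrative assumptions, and the input is a synthetic stand-in rather than a real section dump.

```python
import math
import os
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Return the Shannon entropy of data in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    return -sum((n / total) * math.log2(n / total) for n in Counter(data).values())


def high_entropy_windows(blob: bytes, window: int = 256, threshold: float = 6.5):
    """Yield (offset, entropy) for each non-overlapping window that looks random.

    window and threshold are illustrative values, not tuned for DirCrypt.
    """
    for offset in range(0, len(blob) - window + 1, window):
        entropy = shannon_entropy(blob[offset:offset + window])
        if entropy >= threshold:
            yield offset, entropy


if __name__ == "__main__":
    # Synthetic stand-in for a section: repetitive bytes followed by random bytes.
    section = b"A" * 512 + os.urandom(1024)
    for offset, entropy in high_entropy_windows(section):
        print(f"0x{offset:04x}: {entropy:.2f} bits/byte")
```

    Windows that score high are only candidates: compressed resources look just as random, so the cross-reference check described above is still needed to confirm that a block holds encrypted strings.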

    The region with the encrypted data turned out to be easy to find: it is a large chunk in the .data section.


    Encrypted strings

    Examining the cross-references to this data block, we learn that several functions access it, one of which is called as many as 170 times in the code. Looking at its call sites, we see that an index is passed as the first parameter. Bingo, got you!

    The string decryption code itself is long and complicated. Since our goal was to decrypt the strings as quickly as possible, we did not reverse the algorithm, but instead wrote a small WinDbg script that did the hard work for us:

    .while (@$t0 < 0xe1) { .printf "\n%02x", @$t0; r eip = 0x40456a; p; p; p; eb esp @$t0; p; du @eax; r $t0 = @$t0 + 0x01 }
    


    This script calls the decryption function in a loop, passing it each index below 0xe1 in turn, and prints the decrypted strings in the terminal window.


    Example of decrypted strings in the WinDbg output

    Now we need to bring the recovered strings into IDA Pro to make static analysis easier. We decided to insert them as comments at the places where they are used. To do this, I wrote a small IDAPython script that takes a file ("strings.txt") containing a dump of the WinDbg output and inserts each string at the call site of the decryption function. Now we can quickly jump to the strings of interest from the cross-references dialog for DecryptString.



    In the resulting list of references we look up ".enc.rtf" and find that it is used in only one place. Since the suffix is added to files after encryption, we can safely assume we are close to the code that performs it.

    The code of the function that references ".enc.rtf" is quite readable, and IDA Pro did part of the work for us by correctly identifying its argument as a file name. At the start of the function another function is called, after which the suffix ".enc.rtf" is fetched and decrypted, and then the file is renamed. Clearly, then, the encryption itself happens before the rename, in that very first function.

    Stepping into it, we find a sizeable body of code with a repeating pattern: data is read from the file in chunks, which are then modified and written back to the file. This is classic behavior for encryption routines, so we can be sure we have found what we were looking for. Now the fun begins.


    External wrapper of the function that performs encryption

    In a loop, the malware reads chunks of the file's contents, encrypts them in memory, and writes them back at the same offsets, overwriting the previous data.


    Encryption loop

    Digging a little deeper, we see that there are in fact two encryption-related functions. The first is called for each chunk of data read and thus encrypts the entire file. It takes as an argument a pointer to an object whose data is set up by the second function. A trained eye will recognize that second function as the initialization of the RC4 S-box.

    Encryption is performed on each file separately, and the S-box is initialized with the same key for every file.
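
    For readers who have not seen RC4 before, here is a minimal sketch of the textbook algorithm: the key-scheduling step that fills the S-box and the generation step that turns it into a keystream. This is the standard cipher written from its public description, not code lifted from DirCrypt; the function names and the demo key are our own.

```python
def rc4_init(key: bytes) -> list:
    """Key-scheduling algorithm (KSA): build the 256-entry S-box from the key."""
    if not key:
        raise ValueError("key must not be empty")
    s = list(range(256))
    j = 0
    for i in range(256):
        j = (j + s[i] + key[i % len(key)]) % 256
        s[i], s[j] = s[j], s[i]
    return s


def rc4_keystream(s: list, length: int) -> bytes:
    """Generation algorithm (PRGA): produce `length` keystream bytes."""
    s = s.copy()  # leave the caller's S-box untouched
    i = j = 0
    out = bytearray()
    for _ in range(length):
        i = (i + 1) % 256
        j = (j + s[i]) % 256
        s[i], s[j] = s[j], s[i]
        out.append(s[(s[i] + s[j]) % 256])
    return bytes(out)


def rc4_crypt(key: bytes, data: bytes) -> bytes:
    """Encrypt or decrypt (the operation is symmetric) with a fresh S-box."""
    stream = rc4_keystream(rc4_init(key), len(data))
    return bytes(a ^ b for a, b in zip(data, stream))


if __name__ == "__main__":
    key = b"demo key"  # hypothetical key, for illustration only
    # Two separate initializations with the same key give the same keystream.
    first = rc4_keystream(rc4_init(key), 16)
    second = rc4_keystream(rc4_init(key), 16)
    print(first.hex(), first == second)
```

    Because the keystream depends only on the key, initializing the S-box with the same key for every file means every file is XORed with the same byte sequence.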


    Initialization and operation of the RC4 algorithm

    If you have seen cryptographic blunders before, this one will still make you doubt your eyes. Because the S-box is reinitialized with the same key every time, the same keystream is used for every file. Only one step remains: if we know the original contents of one encrypted file on the victim's computer, we can obtain the keystream by XORing the original and encrypted data byte by byte.
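
    The known-plaintext step can be sketched in a few lines. The helper names are ours, and the demo uses an arbitrary fixed byte sequence as a stand-in keystream instead of real DirCrypt output; the point is only that one known original/encrypted pair is enough to decrypt any other file up to the length of the recovered keystream.

```python
def recover_keystream(original: bytes, encrypted: bytes) -> bytes:
    """XOR a known plaintext with its ciphertext to get the shared keystream."""
    n = min(len(original), len(encrypted))
    return bytes(o ^ e for o, e in zip(original[:n], encrypted[:n]))


def apply_keystream(data: bytes, keystream: bytes) -> bytes:
    """XOR data with the keystream; bytes beyond len(keystream) are not recovered."""
    n = min(len(data), len(keystream))
    return bytes(d ^ k for d, k in zip(data[:n], keystream[:n]))


if __name__ == "__main__":
    # Stand-in keystream (not produced by DirCrypt), reused for both "files".
    keystream = bytes((i * 7 + 3) % 256 for i in range(64))

    known_plain = b"contents of a file we have an original copy of"
    known_cipher = apply_keystream(known_plain, keystream)

    secret_plain = b"contents of the victim's document"
    secret_cipher = apply_keystream(secret_plain, keystream)

    recovered = recover_keystream(known_plain, known_cipher)
    print(apply_keystream(secret_cipher, recovered))
```

    The recovered keystream is only as long as the known file, which is why a reasonably large file with well-known contents makes the best reference.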

    So we need a file that is guaranteed to be present on a Windows file system, for example the standard desktop background images. They are about 100 KB in size, so with little effort we can obtain a keystream of the same length.

    That thought had barely crossed our minds when the following code caught our eye:


    Appending the RC4 key to the end of the encrypted file

    Having picked our jaws up off the floor, we start to feel genuinely sorry for the author of this half-baked malware. Apparently unsure where to store the key, he decided to append it to the end of the file, where anyone can find it. For some reason this seemed like a good idea to him.

    This works in our favor too: now we can use the same RC4 to decrypt the file completely.

    But wait a moment. There is another problem: RC4 is not the only algorithm used to encrypt the files.


    Encryption of the first 1024 bytes with RSA

    The first chunk, 1024 bytes in size, is encrypted with RSA. The private key is not stored in the file, so one way to get it is to pay the attacker for it. Assuming we are not going to pay (and are not going to attack the server that issues keys), the only option left is to recover everything except the first 1024 bytes.
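
    Putting the two findings together, recovery of everything after the RSA-protected header could look roughly like the sketch below. Several details are assumptions that would have to be checked against the sample: the key length (the hypothetical `key_len` parameter), that the key occupies exactly the last `key_len` bytes of the file, that the RSA-encrypted chunk still occupies exactly the first 1024 bytes, and that the RC4 keystream starts fresh right after that chunk. If the sample instead advances the keystream over the first chunk, the hypothetical `skip` parameter can discard that many keystream bytes first.

```python
RSA_HEADER_LEN = 1024  # per the article: the first 1024 bytes are RSA-encrypted


def rc4(key: bytes, data: bytes, skip: int = 0) -> bytes:
    """Textbook RC4; `skip` discards that many keystream bytes before use."""
    s = list(range(256))
    j = 0
    for i in range(256):
        j = (j + s[i] + key[i % len(key)]) % 256
        s[i], s[j] = s[j], s[i]
    i = j = 0
    out = bytearray()
    for n in range(skip + len(data)):
        i = (i + 1) % 256
        j = (j + s[i]) % 256
        s[i], s[j] = s[j], s[i]
        if n >= skip:
            out.append(data[n - skip] ^ s[(s[i] + s[j]) % 256])
    return bytes(out)


def recover_tail(encrypted: bytes, key_len: int, skip: int = 0) -> bytes:
    """Return the decrypted file contents that follow the RSA-protected header.

    Assumed layout (not confirmed by the article):
    [1024-byte RSA chunk][RC4-encrypted remainder][key_len-byte RC4 key]
    """
    if key_len <= 0 or len(encrypted) < RSA_HEADER_LEN + key_len:
        raise ValueError("file too short for the assumed layout")
    key = encrypted[-key_len:]
    body = encrypted[RSA_HEADER_LEN:-key_len]
    return rc4(key, body, skip)
```

    The first 1024 bytes stay unreadable, so the result is only useful for formats whose valuable content lives past that point.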

    In some cases (depending on the format of the file being recovered) this problem can be worked around. Take a standard .doc file as an example. In it, the text of the document is stored in Unicode starting at offset 0x1A00. Let's have DirCrypt encrypt a test file and compare its contents "before" and "after":


    Experimental document created in Microsoft Word


    Document in a hex editor before encryption


    A document in a hex editor after encryption

    For .doc files, we wrote a simple Python script that extracts the RC4 key, uses it to decrypt the file (except for the first 1024 bytes), and saves the text as ASCII. Running it on our test file, we were able to fully recover the text of the document.
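
    The text-extraction part of such a script might look like the sketch below. It is not our original script: the function name is hypothetical, the offset 0x1A00 is the one observed for the test document, and stopping at the first NUL code unit is an assumption about where the text ends. The input buffer is expected to keep the original file offsets, for example 1024 placeholder bytes followed by the decrypted remainder.

```python
TEXT_OFFSET = 0x1A00  # text offset observed in the test .doc file


def extract_doc_text(data: bytes, offset: int = TEXT_OFFSET) -> str:
    """Decode UTF-16LE text starting at `offset` and return it as ASCII.

    Decoding stops at the first NUL code unit, which is an assumption about
    where the text run ends rather than a rule of the .doc format.
    """
    chars = []
    for pos in range(offset, len(data) - 1, 2):
        unit = int.from_bytes(data[pos:pos + 2], "little")
        if unit == 0:
            break
        chars.append(chr(unit))
    # Word marks paragraph ends with carriage returns; use newlines instead.
    text = "".join(chars).replace("\r", "\n")
    # Anything outside ASCII becomes "?" so the result can be saved as ASCII.
    return text.encode("ascii", errors="replace").decode("ascii")


if __name__ == "__main__":
    # Synthetic buffer standing in for a decrypted document.
    sample = b"\x00" * TEXT_OFFSET + "Hello from a test file\r".encode("utf-16-le")
    print(extract_doc_text(sample))
```

    For documents whose text starts elsewhere, the `offset` argument has to be adjusted.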


    Extracted text

    Afterword


    In this article we described techniques that help find vulnerabilities in crypto-malware. Our original goal was to show that attacks by this category of malware can, to a greater or lesser degree, be analyzed, and that the defending side can almost always spot and exploit the attacker's mistakes.
