Analysis of suspicious PDF files

    A few months ago, I was given an interesting task: analyzing a suspicious PDF file. I usually assess the security of web applications (and not only web), and I am no great expert in malware analysis, but this case turned out to be quite interesting.

    Almost all the tools presented in this article are included in the REMnux distribution, created specifically for reverse engineering malware. You can download a virtual machine image for VirtualBox or VMware.

    First of all, I analyzed the sample I had received using the pdfid script:
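
    As a rough illustration of the kind of triage pdfid performs, here is a minimal Python sketch that counts a few PDF names that usually deserve attention. The keyword list and the function name `count_keywords` are assumptions made for this example; the real tool does more, and this sketch does not handle obfuscated names.

```python
import re

# Names that commonly deserve attention in a suspicious PDF.
# This list is an assumption for the sketch, not pdfid's full list.
KEYWORDS = [b"/JS", b"/JavaScript", b"/OpenAction", b"/AA", b"/AcroForm", b"/Annot"]


def count_keywords(data: bytes) -> dict:
    """Count occurrences of each PDF name in the raw file bytes."""
    counts = {}
    for name in KEYWORDS:
        # Require a non-alphanumeric character (or end of data) after the
        # name, so that /Annot does not also match /Annots.
        pattern = re.escape(name) + rb"(?![A-Za-z0-9])"
        counts[name.decode("ascii")] = len(re.findall(pattern, data))
    return counts


if __name__ == "__main__":
    sample = b"<< /Type /Catalog /OpenAction 2 0 R >> << /S /JavaScript /JS (x) >>"
    for name, count in count_keywords(sample).items():
        print(f"{name:12} {count}")
```

    A non-zero count for /JS, /JavaScript or /OpenAction is only a hint that the file deserves a closer look, not proof that it is malicious.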



    I then decoded the JavaScript code using pdf-parser:



    I brought it into a readable form; js-beautify can be used for this:



    Not bad. I also analyzed the file using the excellent jsunpack utility:



    At first glance, it reported CVE-2009-1492, a vulnerability that allows arbitrary code execution or denial of service in Adobe Reader and Adobe Acrobat 9.1, 8.1.4, 7.1.1 and earlier through a PDF file that contains an annotation and uses the getAnnots method. But when I compared my results above with the corresponding exploit, it turned out that this vulnerability is not related to the current case. In our sample, the annotation is used to store most of the script, partly for obfuscation purposes.

    The annotation data is retrieved by the getAnnots method and is located in object 9 of our file (as pdf-parser showed). We save the extracted JavaScript code and add the stream from object 9 to it. Usually, the first step toward executing the code safely is to replace the eval function with a harmless alert or console.log and open the file in a browser. You can also use SpiderMonkey for this. The main functions and variables we need are already defined in the pre.js file, which you can also find in the REMnux distribution.
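
    One way to neutralize eval can be sketched in Python as follows. Note that this is a different technique from editing the eval calls in place: it prepends a small prelude that redefines eval, so the body of the extracted script stays untouched. The prelude text and the name `with_eval_hook` are assumptions for this example, and `print` is assumed to be available, as it is in the SpiderMonkey shell.

```python
# Hypothetical prelude: redefine eval so that it prints its argument
# instead of executing it. `print` is assumed to exist in the JavaScript
# environment (it does in the SpiderMonkey shell).
EVAL_HOOK = "eval = function (code) { print(code); };\n"


def with_eval_hook(script: str) -> str:
    """Return the script with the eval hook prepended and the body unchanged."""
    return EVAL_HOOK + script


if __name__ == "__main__":
    extracted = "var s = 'al' + 'ert(1)'; eval(s);"
    print(with_eval_hook(extracted))
```

    Leaving the original text unmodified matters when a script inspects its own source, because any in-place edit changes what the script sees.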



    Not bad. After running SpiderMonkey, we got a new script that uses eval and the data stream from object 7:



    The most interesting thing in this script is hidden in the variable v12: it holds arguments.callee, which refers to the currently executing function. In other words, this code uses its own source text for obfuscation. If you change anything in the current code (as I did earlier when refactoring it or replacing eval with alert), you break the whole next stage of the decryption. But do not despair. Articles describing similar situations: isc.sans.org/diary/Browser+*does*+matter%2C+not+only+for+vulnerabilities+-+a+story+on+JavaScript+deobfuscation/1519 , isc.sans.edu/diary/Static+analysis+of+malicous+PDFs+%28Part+%232%29/7906 and www.nobunkum.ru/ru/flash (from Alice Shevchenko).

    In this case, we can replace the call to arguments.callee.toString().length with the length of the function itself and move on, replacing the call to arguments.callee.toString().charCodeAt(0) with the character code of the first character of our function's source text.
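
    The substitution described above can be sketched in Python. The sketch assumes that `function_source` is the exact, unmodified text of the function as the JavaScript engine would return it from toString() (engines may render it differently from the file text), and that it contains only ASCII characters so that Python's length matches JavaScript's. The name `pin_callee_values` is hypothetical.

```python
import re

# Match the two self-referencing expressions, tolerating stray whitespace.
CALLEE = r"arguments\s*\.\s*callee\s*\.\s*toString\s*\(\s*\)\s*\.\s*"
LENGTH_CALL = re.compile(CALLEE + r"length")
CHAR_CALL = re.compile(CALLEE + r"charCodeAt\s*\(\s*0\s*\)")


def pin_callee_values(function_source: str) -> str:
    """Replace self-referencing expressions with constants.

    Both constants are computed from the untouched source *before* any
    substitution, because the substitution itself changes the text.
    """
    length = len(function_source)          # value of ...toString().length
    first_code = ord(function_source[0])   # value of ...charCodeAt(0)
    pinned = CHAR_CALL.sub(str(first_code), function_source)
    return LENGTH_CALL.sub(str(length), pinned)


if __name__ == "__main__":
    src = ("function f(){var k=arguments.callee.toString().length;"
           "var c=arguments.callee.toString().charCodeAt(0);return k+c;}")
    print(pin_callee_values(src))
```

    Once the values are pinned, the function can be refactored freely, since the decryption no longer depends on its own text.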

    There is no need to decode all the code by hand: just execute the resulting script together with the data using the same SpiderMonkey, or use jsunpack.



    The final script looked like this:



    After refactoring I got:



    1. The vulnerability details are described here: cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2007-5659 . A couple of public exploits can be found here: downloads.securityfocus.com/vulnerabilities/exploits/27641-collectEmailInfo-PoC.txt , www.exploit-db.com/exploits/16674/ .

    As you can see, the second exploit uses the same unescape("%u0c0c%u0c0c") and this.collabStore = Collab.collectEmailInfo({subj: "", msg: #{rand12}}); as our sample.

    2. In the func_04 function, the variable v0 is also used with the value 0x0c0c0c0c, which hints at a heap spray. Why this value is so popular is explained here: www.corelan.be/index.php/2011/12/31/exploit-writing-tutorial-part-11-heap-spraying-demystified .
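
    To see how the escaped string relates to that value, here is a small Python sketch that decodes %uXXXX escapes into bytes. It assumes each escape is a 16-bit code unit stored little-endian, as on x86; the helper name `unescape_u` is hypothetical.

```python
import re
import struct


def unescape_u(text: str) -> bytes:
    """Decode %uXXXX escapes into bytes, each code unit little-endian."""
    out = bytearray()
    for hex_word in re.findall(r"%u([0-9a-fA-F]{4})", text):
        out += struct.pack("<H", int(hex_word, 16))
    return bytes(out)


# The string from the exploit decodes to four 0x0c bytes, which read as a
# 32-bit little-endian value give the address 0x0c0c0c0c.
spray_unit = unescape_u("%u0c0c%u0c0c")
address = struct.unpack("<I", spray_unit)[0]
print(spray_unit.hex(), hex(address))
```

    Because every byte is the same, the value reads identically whatever the alignment, which is one of the reasons it is convenient for spraying.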

    In the variables v3 and v4, shellcode can be recognized by the series of NOP instructions at the beginning of the variable values.
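
    This heuristic can be sketched in Python for a value that has already been decoded to bytes. The function names and the threshold of 8 bytes are assumptions for this example, and only the single-byte x86 NOP (0x90) is considered.

```python
def leading_nop_count(payload: bytes, nop: int = 0x90) -> int:
    """Return how many consecutive NOP bytes the payload starts with."""
    count = 0
    for byte in payload:
        if byte != nop:
            break
        count += 1
    return count


def looks_like_shellcode(payload: bytes, min_sled: int = 8) -> bool:
    """Heuristic: a long run of NOPs at the start suggests a NOP sled."""
    return leading_nop_count(payload) >= min_sled


if __name__ == "__main__":
    candidate = b"\x90" * 16 + b"\xeb\x10\x5b"
    print(leading_nop_count(candidate), looks_like_shellcode(candidate))
```

    A sled is only an indicator; confirming that the bytes really are shellcode still requires emulation or disassembly.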

    To confirm my assumptions, I ran the value taken from the v4 variable through the libemu emulator that ships with the free PDFStreamDumper tool. You can also find libemu in REMnux:



    Bingo. A URL was detected: xxxxxx.info/cgi-bin/io/n002101801r0019Rf54cb7b8Xc0b46fb2Y8b008c85Z02f01010 , which was used to download and then execute the malware:
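
    For comparison, a much cruder static check can be sketched in Python: scanning the decoded shellcode bytes for a plaintext URL. This is not what libemu does (it emulates the code), and it only works when the shellcode stores the URL unencoded. The pattern and the name `find_urls` are assumptions for this example.

```python
import re

# A run of printable, non-space ASCII characters following a URL scheme.
URL_PATTERN = re.compile(rb"https?://[\x21-\x7e]+")


def find_urls(shellcode: bytes) -> list:
    """Return plaintext URLs embedded in the raw shellcode bytes."""
    return [match.decode("ascii") for match in URL_PATTERN.findall(shellcode)]


if __name__ == "__main__":
    # example.invalid is a placeholder host, not taken from the sample.
    blob = b"\x90\x90\xeb\x10http://example.invalid/cgi-bin/io/x\x00\xcc"
    print(find_urls(blob))
```

    If the scan finds nothing, the URL may simply be encoded inside the shellcode, and emulation remains the more reliable approach.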



    3. Also, the comparison parameters found in the script:



    look the same as in CVE-2009-2990: an array index error in Adobe Reader and Acrobat 9.x before 9.2, 8.x before 8.1.7, and 7.x through 7.1.4 that allows arbitrary code execution.
    In the FlateDecode-encoded stream of object 11, we also find the U3D code:
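
    A FlateDecode stream is zlib-compressed data, so it can also be unpacked by hand. Below is a minimal Python sketch, assuming `raw` holds exactly the bytes between the `stream` and `endstream` keywords of the object and that no other filter is chained; the function name `inflate_stream` is hypothetical.

```python
import zlib


def inflate_stream(raw: bytes) -> bytes:
    """Decompress the body of a /FlateDecode stream.

    A decompression object is used instead of zlib.decompress() so that
    trailing bytes after the compressed data (for example a line break
    before `endstream`) do not raise an error.
    """
    decompressor = zlib.decompressobj()
    data = decompressor.decompress(raw)
    return data + decompressor.flush()


if __name__ == "__main__":
    original = b"var payload = unescape('%u9090%u9090');"
    packed = zlib.compress(original) + b"\r\n"
    print(inflate_stream(packed))
```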



    Now we have a URL, shellcode, and several CVEs, which is quite enough for this article.

    Author: Andrey Efimyuk, an information security expert (OSCP, eCPPT) and a good friend of PentestIT.
