Viruses. Viruses? Viruses! Part 2

As promised in the previous part, we continue our review of virus engines. This time we will talk about polymorphism of executable code. For computer viruses, polymorphism means that each newly infected file contains a new variant of the virus code. In theory this would be a real nightmare for an antivirus: if a virus changed 100% of its code in each new generation, and did so in a truly random way, no signature analysis could detect it.

Maybe somewhere there is a super programmer who actually wrote such code, and that is why we know nothing about it. I do not really believe that. It even seems likely that the mathematicians who work on the formal foundations of computing could prove that there is no specific polymorphism algorithm whose output could not be detected 100% of the time by some other specific algorithm. But we are simple people: we are just interested in the idea of code that changes itself, and, seen as “algorithm vs. algorithm”, the contest between methods of hiding executable code and methods of detecting it should be very interesting to a programmer.

Recalling and slightly extending the first article

Let us recall our heroes from the first article, the virus writer and the antivirus company's programmer (the AV developer), and give each of them a karmic twin: the developer of wrapper protection (protection applied to a finished executable) and the cracker. The virus writer and the protection developer try to hide executable code and information about it; the AV developer and the cracker try to get at the characteristic code and its internal algorithms. In the virus domain automatic methods prevail (the virus engine and the antivirus detector). In the protection domain manual methods prevail: the parameters of the wrapper protection are set by hand, and cracking software is also manual work, despite the abundance of auxiliary tools.

At the end of the first article we had a virus that can correctly infect an executable file (that is, when the file is launched it can run itself and then correctly run the file's own code), and an antivirus detector that knows the virus sits in strictly defined places in the file, and that at some distance from a characteristic point (the entry point, the beginning of a section, the end of the header) there is a fixed set of bytes that characterizes the virus. Also, so that the article does not drift into a discussion of “what is wrong with viruses”, let's agree that the virus payload does nothing. This lets us set aside all debate about what the code in question does and focus on the methods of detection and concealment.

We continue to recall and extend the first article. The confrontation between virus and antivirus can be viewed by analogy with cracking a commercial program. Instead of the infector, it is the protection program that works, “hanging” protection on the executable file. It “spoils” the program's own code and hides the information needed to restore it in its own body. Then, in the same cunning way as a virus, hiding the algorithm of its work, it first executes the protection code that checks the serial number or the trial period, and then, after “repairing” the main program, transfers control to it. The cracker, in turn, plays the role of the antivirus, trying to get to the internal protection code and learn its algorithm. It turns out that he does the same thing as the AV developer, trying to find (and save) the characteristic part of the code. The only difference is

The most popular way to remove such protection is to dump the program's in-memory image to disk. The cracker looks for the moment when the protection has finished its work, decrypted and “fixed” the main program, and a healthy image of the main program's code is in memory. In effect, he is looking for a way to stop the program at the so-called OEP (Original Entry Point), the “old” entry point of the protected program. At this point the in-memory image can be saved to disk. Of course, it will not work as is, but it can be repaired by “redirecting” the Entry Point of the executable file so that it points to the OEP. If the program was fully functional at that moment, such an image will run and simply skip the protection. (There is still a lot of fiddling with restoring calls to external functions, taking multiple dumps when the program is never fully decrypted at once, and so on. This is a topic for a dozen articles, but the main principle is this.) Another popular way is to find the piece of code that generates the serial number, rip it out if possible, and build a small executable that generates valid serial numbers (a keygen). As we will see below, a similar course of action is not alien to the antivirus detector either.

I also like to draw analogies with biological systems, but I will try not to burden you with them too much. I really want to see artificial intelligence and life as soon as possible.

Disassembler and debugger

Understanding the basic principles of how they work is important for any discussion of executable code protection. You probably know something about this already, since you are reading this article; either way, be patient or just skip this section.

The disassembler accepts either an executable file or an abstract buffer with code and, quite importantly, the first address from which to start disassembling. For an executable file this is, for example, the entry point. Having placed a pointer on the first instruction, the disassembler determines what kind of instruction it is, its length in bytes, whether it is a branch instruction, which registers it uses, which memory addresses it refers to, and so on. If the instruction is not a branch, the disassembler proceeds to the next instruction, moving the pointer forward by the length of the instruction. If it is an unconditional JMP or a CALL, the disassembler moves the pointer to the branch target. If it is a conditional jump (JZ, JNA, etc.), the disassembler marks two addresses for examination at once: the address of the next instruction and the address the jump may lead to. If a byte combination is not recognized, disassembly of that branch stops. It is also worth mentioning that for each instruction the disassembler records which other instructions refer to it (!), which makes it possible to identify function calls and, most importantly, who makes them.
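The traversal described above can be sketched on a toy model. This is a minimal illustration, not a real x86 decoder: the `PROGRAM` table, its instruction lengths and the `traverse` function are hypothetical names invented here, and CALL is followed like JMP only because the text describes it that way.

```python
from collections import defaultdict

# Hypothetical toy program: address -> (mnemonic, length, branch target or None).
# Real x86 decoding is far more involved; this only models control flow.
PROGRAM = {
    0: ("mov", 2, None),
    2: ("jz", 2, 8),      # conditional: fall through to 4 or jump to 8
    4: ("call", 3, 10),   # followed like JMP, as in the description above
    8: ("jmp", 2, 10),
    10: ("ret", 1, None),
}

def traverse(program, entry):
    """Walk the toy program from `entry`, following branches."""
    visited = []                # addresses in the order they were decoded
    xrefs = defaultdict(list)   # target address -> addresses that refer to it
    seen = set()
    worklist = [entry]
    while worklist:
        addr = worklist.pop()
        while addr not in seen:
            insn = program.get(addr)
            if insn is None:    # unrecognized bytes: stop this branch
                break
            seen.add(addr)
            visited.append(addr)
            mnemonic, length, target = insn
            if mnemonic in ("jmp", "call"):
                xrefs[target].append(addr)
                addr = target
            elif mnemonic == "jz":
                xrefs[target].append(addr)
                worklist.append(target)  # examine the jump target later
                addr += length           # continue with the fall-through
            elif mnemonic == "ret":
                break
            else:
                addr += length
    return visited, dict(xrefs)

visited, xrefs = traverse(PROGRAM, 0)
print(visited)  # order in which addresses were decoded
print(xrefs)    # who refers to each branch target
```

The `xrefs` map is the "who refers to this" information the text emphasizes: address 10 is reached from both 4 and 8.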

A disassembler turns a sequence of bytes into a set of interlinked structures that store information about each byte of an instruction: whether a particular byte is part of an opcode (operation code), data, an address that is the target of a jump, and so on. Each structure can contain references to one or two following structures and at the same time be referenced by an arbitrary number of other structures (for example, the first instruction of a function that is called many times). Smart disassemblers can also track the stack pointer, or recognize and correctly mark for disassembly constructions such as: mov eax, 0x20056789; call eax; In addition, they can recognize characteristic functions by their instruction sequences, let you set starting points for manual disassembly, annotate individual instructions, and save the disassembly result to disk, because building the call graph and marking up the structures is very expensive, and you can spend days on a single file. But, as we discussed earlier, a jump in the file on disk may lead into an encrypted buffer, and in that case the disassembler either produces a mess of instructions or stops. Then the encrypted buffer has to be captured at runtime, when it sits decrypted in memory, and this requires a debugger.

The main task of the debugger is to stop the program at the most interesting place. There are several ways to do this. You can open the process memory and write int 3 in place of one of the instructions. When the processor reaches this instruction it raises an exception; the debugger handles it, opens its window, restores the original instruction and shows what is in that memory area. You can set the trace flag in the processor, and then the processor will raise this exception on every instruction. Finally, the processor has debug registers: you can put an address in them, and the processor will stop when that address is accessed. For example, by setting a breakpoint on access to the start address of the encrypted buffer, we will stop for the first time when the decryptor starts decrypting and reads the first byte of the buffer, and for the second time when control is transferred there. At that point the contents of the buffer can be written to disk and fed to a disassembler to learn all its secrets. In advanced protections there is no moment at which the complete working program code is in memory: parts of the code are decrypted in chunks as needed. In those cases the reverser has to assemble the dump piece by piece.

Protection from code exploration

The topic of protecting executable code from analysis deserves a dozen articles, so here we will dwell on only a few points. Static protection covers the various methods of obfuscating executable code and of encrypting buffers that hold important parts of the code, with subsequent decryption at runtime. Obfuscation can be implemented with special code-obfuscating compilers, and encryption with the polymorphic engines considered below (which, as far as the code is concerned, include commercial protections).

Dynamic protection means that the program can determine at runtime whether it is being debugged and take some action in response. For example, after reading the buffer with its own code, the program can compare its checksum with a reference value and, if the debugger has inserted int 3 into the code (see above) or modified the code in some other way, understand that it is being debugged. But perhaps the most reliable and portable way to tell that you are being debugged is to measure the execution time of characteristic code segments. The idea is simple: time is measured (in seconds, arbitrary units or processor cycles) between instructions in some buffer, and if it exceeds a certain threshold, the program was stopped in the middle. Once the protection has understood that it is being debugged, it may, for example, skip the branches inside which the reverser might stop and simply refuse to work, or, in the case of a virus, remove itself from the system. To cope with such situations, reversers work in controlled environments that can be reproduced easily, for example in virtual machines, where everything can be replayed, down to the BIOS settings. So when examining the code of a virus or a protection, remember that the program may well detect that it is being studied and behave differently.
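The checksum comparison mentioned above can be illustrated with a few lines of Python. This is only a sketch on an inert byte string: CRC32 is used as a convenient stand-in (the text does not say which checksum real protections use), and the `original`, `patched` and `was_modified` names are made up for this example.

```python
import zlib

INT3 = 0xCC  # opcode of the x86 "int 3" breakpoint instruction

def checksum(code: bytes) -> int:
    """CRC32 of a code buffer (a stand-in for whatever checksum is used)."""
    return zlib.crc32(code)

def was_modified(code: bytes, reference_crc: int) -> bool:
    """True if the buffer no longer matches the reference checksum."""
    return checksum(bytes(code)) != reference_crc

# Inert sample bytes standing in for a code buffer.
original = bytes([0x55, 0x8B, 0xEC, 0x33, 0xC0, 0x5D, 0xC3])
reference = checksum(original)

# A software breakpoint overwrites one byte of the code with 0xCC.
patched = bytearray(original)
patched[3] = INT3

print(was_modified(original, reference))  # False
print(was_modified(patched, reference))   # True
```

A hardware breakpoint set through the debug registers does not change the code bytes, so a check like this would not notice it; that is one reason the text calls timing measurements the more reliable signal.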

Polymorphic engines or “code has become smaller, engine is stronger”

Let's go back to virus engines. At a certain point in the development of DOS, after a heap of archivers that were hugely popular at the time had appeared, programmers began to pack not only ordinary files but everything that could be packed. ".exe" files take up a lot of space, and a fairly large part of such a file is executable code with a stable frequency distribution of byte groups, which presumably compresses well with the right algorithm. Packers thus became the first step toward polymorphic engines.

The principle of the packer is quite simple:

take a buffer with executable code (the code section, for example);
pack it;
take the position-independent unpacker code and fill in the correct start and end addresses of the buffer with the packed code;
append a jump to the OEP (the first instruction of the unpacked code) to the end of the unpacker;
place the unpacker and the buffer with the packed code in the executable file (adjusting the section sizes and/or the EP).
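The first two steps of the list above (take a buffer, pack it) and the unpacker's job of restoring it can be shown as a round trip. This is a minimal sketch that uses zlib as an arbitrary stand-in for the packing algorithm and a repetitive, inert byte string in place of a real code section; it does not touch executable files at all.

```python
import zlib

def pack(buffer: bytes) -> bytes:
    """Compress a buffer (zlib is an assumption; any packer would do)."""
    return zlib.compress(buffer, 9)

def unpack(packed: bytes) -> bytes:
    """What the unpacker stub has to do before jumping to the OEP."""
    return zlib.decompress(packed)

# Stand-in for a code section: compiled code repeats many byte patterns.
section = bytes([0x55, 0x8B, 0xEC, 0x83, 0xEC, 0x10, 0x5D, 0xC3]) * 512
packed = pack(section)

print(len(section), len(packed))
print(unpack(packed) == section)
```

The sample is far more repetitive than real code, so the ratio it prints says nothing about real executables; the point is only that the unpacked bytes are identical to the original.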

The resulting file is much smaller than the original. After the arrival of the new, ultra-cool 100 MB hard drives this stopped being so important, but packing opened up many new possibilities for virus writers and protection developers:

the size of the virus still matters (despite our ultra-cool 100 MB hard drive). If the payload code is fat and multifunctional, the whole virus is harder to cram into a file, especially if you use something smarter than adding a new section at the end of the file. Packing allows almost all of a large and complex virus to be squeezed into a buffer several times smaller than its original size.
the buffer with the packed code does not have to be placed in a section with the execute flag. For an advanced infector this is very important, because the main body of the virus can safely be put anywhere. After unpacking, the unpacker must make sure that the memory the code was unpacked into is allowed to execute. That is why the Windows API functions that deal with memory access attributes (VirtualProtect, VirtualProtectEx, VirtualQuery, VirtualQueryEx and the like) invariably attract the attention of heuristics.
and for dessert, the most important thing: instead of packing, or after it, the buffer with the code can be encrypted and the key placed in the unpacker. Now it is no longer an unpacker but a decryptor. With each new infection (or each time protection is hung on an executable file) the buffer with the code can be encrypted with a new key, and then the buffer will have completely new contents (given a good encryption algorithm, of course). From here on I will not write “packed”, but I assume that packing may be part of the encryption process.

And here it is, in effect, the first polymorphic engine. Let us write out an approximate infection algorithm in more detail:

Generate a new encryption key.
Take the decryptor code (where and how is a topic for later; in the simplest case just copy the ready-made code from our own body).
Insert the new encryption key into the decryptor code.
Set up the transfer of control from the victim file to the virus code and back (while the code is not yet encrypted).
Encrypt our large buffer of code with the new key.
Simply place the encrypted buffer in the victim file (it differs significantly from the previous one, so there is no particular need to hide it).
Append a jump to the beginning of the encrypted buffer to the end of the decryptor.
Place the decryptor in the victim file as cunningly as possible.

Let's see what we got: most of the virus (the encrypted buffer) changes completely from file to file, and only a small decryptor remains unchanged. The decryptor contains several addresses (which vary from file to file), a decryption key (which also changes) and the decryption code itself. The antivirus now has to work harder: the patterns typical of this virus are hidden inside the encrypted buffer, so a piece of code for the signature has to be found in the decryptor, which is small and contains far fewer characteristic stretches of code and data.
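Why a fixed-offset signature stops working on the encrypted buffer can be seen on a toy example. The sketch below uses a repeating-key XOR purely as a stand-in for "a good encryption algorithm" (it is not one), an inert placeholder string as the buffer, and two hand-picked keys that differ in every byte; all names are invented for the illustration.

```python
def xor_bytes(data: bytes, key: bytes) -> bytes:
    """Repeating-key XOR; applying it twice with the same key restores data."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

payload = b"INERT-PLACEHOLDER-PAYLOAD"

# Two hypothetical per-generation keys that differ in every byte position.
key_a = bytes([0x21, 0x5A, 0xC3, 0x7E])
key_b = bytes([0x9D, 0x04, 0x68, 0xB1])

gen_a = xor_bytes(payload, key_a)
gen_b = xor_bytes(payload, key_b)

# Count positions where the two "generations" still hold the same byte.
same_positions = sum(1 for x, y in zip(gen_a, gen_b) if x == y)

print(gen_a.hex())
print(gen_b.hex())
print(same_positions)                        # 0
print(xor_bytes(gen_a, key_a) == payload)    # True
```

With no byte in common at any offset, a detector has nothing stable to match in the buffer itself, which is why attention shifts to the decryptor.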

This simplification of the task led to more advanced polymorphic engines which, on infection, change only the decryptor code: after all, dealing with a small piece of code is much easier than with the whole payload. Joyful virus writers and protection developers rub their hands and learn ways to hide the little decryptor more cunningly, while AV developers and crackers repair their disassemblers, which go haywire when they try to disassemble the randomized byte strings that a JMP in the code leads into.

Evolution of viral engines

The AV developer now spends a little more time creating signatures, since he has to work with a small amount of code that has fewer characteristic sections. The virus writer, for his part, only has to worry about mutating a fairly small decryptor with a fairly simple internal algorithm, and the task of hiding it from the detector now looks more realistic. Given that the antivirus compares the signature at a fixed offset, the virus writer first tries to shift the decryptor code in various ways and thereby defeat the characteristic signature inside it.

NOP zone or "maybe blow over"

The first simple technique, which came to viruses from vulnerability exploitation, is the NOP zone. When an attacker has managed to exploit a vulnerability and force the processor to jump to a chosen address, but the exact location of the shellcode in memory is unknown, the attacker can do the following: fill a large area in front of the actual exploit code with NOPs:

addr1:	nop
		nop
		;...				many more NOPs
		nop
addr2:	jmp addr3			; shellcode
		pop esi				; shellcode
		xor edx, edx		; shellcode
		;...

Now the jump can be made to "somewhere over there", into the NOP zone. If only the approximate memory location is known, this technique still allows the shellcode to execute successfully.

You can do the same with the decryptor: on infection, simply put it at a different place within a long run of NOPs. And in some places (where it does not break the jumps) you can cram NOPs directly into the code. Everything will still work correctly, but the offset of the characteristic signature will be different every time. Of course, the offsets in the jump instructions will have to be recalculated.
Such a cheap solution did not trouble the AV developer much: he simply added a flag, “skip all NOPs when calculating the signature”. Still, this small step is quite remarkable, because for the first time the detector began to look at instructions rather than bytes. But more about that later.
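The “skip all NOPs” flag can be sketched from the detector's side. This is a deliberately simplified illustration: it filters the byte 0x90 wherever it occurs, which is only correct if 0x90 never appears as an operand byte (an assumption that holds for the sample below but not in general; a real detector has to work instruction by instruction). The signature and sample bytes are arbitrary.

```python
NOP = 0x90  # opcode of the single-byte x86 NOP

def strip_nops(code: bytes) -> bytes:
    """Drop every 0x90 byte (assumes 0x90 is never an operand byte)."""
    return bytes(b for b in code if b != NOP)

def matches(code: bytes, signature: bytes, offset: int = 0) -> bool:
    """Compare the signature at `offset` after NOPs have been removed."""
    cleaned = strip_nops(code)
    return cleaned[offset:offset + len(signature)] == signature

signature = bytes([0x31, 0xD2, 0x5E])  # arbitrary sample bytes

# Two "generations": the same bytes shifted and interleaved with NOPs.
gen1 = bytes([NOP]) * 3 + signature
gen2 = bytes([NOP]) * 7 + bytes([0x31, 0xD2, NOP, 0x5E])

print(gen1[:3] == signature)     # False: naive fixed-offset compare fails
print(matches(gen1, signature))  # True
print(matches(gen2, signature))  # True
```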

Permutation or “add something”

Thinking about how to defeat signature comparison without breaking the decryptor code, the virus writer comes up with the idea of permutation. Permutation is the rearrangement of code blocks in each new generation: the code consists of a certain number of blocks, which are shuffled in each new generation of the virus and connected by JMPs. As always, everything is simple on paper and the problems start in the implementation. Inside the blocks there are conditional and unconditional jumps and function calls, so such logical blocks must remain intact. At the same time, the larger the blocks, the less variable the resulting decryptor; the smaller the blocks, the more jumps have to be added, inflating the decryptor code, and the harder it is to maintain integrity. To make the blocks equal in length you can, for example, use NOP zones.

Here is an example of the algorithm: the body of the virus stores a ready-made set of blocks with markup (the block number and its length). We take a random block, write it to the buffer, and fix up the JMP at the end of the previous block. We prepend a JMP to the first block, and the buffer with randomly rearranged blocks is ready. Unlike the previous games, this is already a fairly serious bid: each new generation, albeit at the cost of extra unconditional jumps, produces code that is completely different in terms of offsets. The virus writer falls asleep with a contented smile.

	[block 1]	[block 2]	[block 3]	[...]	[block N]
[jmp block 1]	[block 2] [jmp block 3]	[block 1] [jmp block 2]	[block 3] [jmp block 4]	[...]	[block 4] [jmp block 5]

The AV developer wakes up. Tracing the code of several generations of the virus, he realizes that the decryptor rearranges its blocks and that the detector must be refined, if possible without sacrificing its performance. He decides to write a fast automatic disassembler that can run through the instructions, stop only at branch instructions, calculate the branch target and continue analyzing instructions at that address.

The antivirus database now contains the following instruction: starting from the entry point, step through the instructions, follow every JMP encountered and, after N instructions, compare the signature. If the signature is in the tenth block, ten jumps have to be followed. If conditional jumps (JZ) are possible inside, each can be treated as two jumps, one to the next instruction and one to the jump target, and the walk through the instructions branches accordingly. Of course, simpler detection has not been cancelled either: for example, if the virus consists of N blocks of fixed length L, you can simply make N signature comparisons at offsets [0, (1 * L), (2 * L), ..., ((N-1) * L)].
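The simpler variant mentioned at the end of the paragraph above, N comparisons at offsets that are multiples of L, fits in a few lines. The sketch assumes equal-length blocks laid out back to back and uses readable ASCII strings instead of code; `scan_blocks` and the sample data are hypothetical.

```python
def scan_blocks(data: bytes, signature: bytes, block_len: int,
                block_count: int, sig_offset: int = 0) -> int:
    """Return the index of the block holding the signature, or -1.

    Checks offsets 0, L, 2L, ..., (N-1)*L (plus an optional offset
    of the signature inside its block).
    """
    for k in range(block_count):
        start = k * block_len + sig_offset
        if data[start:start + len(signature)] == signature:
            return k
    return -1

L, N = 8, 3
signature = b"SIG"
blocks = [b"AAAAAAAA", b"SIGBLOCK", b"CCCCCCCC"]  # each exactly L bytes

# Three "generations" with the same blocks in a different order.
gen1 = blocks[0] + blocks[1] + blocks[2]
gen2 = blocks[1] + blocks[2] + blocks[0]
gen3 = blocks[2] + blocks[0] + blocks[1]

print(scan_blocks(gen1, signature, L, N))  # 1
print(scan_blocks(gen2, signature, L, N))  # 0
print(scan_blocks(gen3, signature, L, N))  # 2
```

This costs N short comparisons per file and needs no disassembler at all, which is why it remains attractive whenever the block layout is that regular.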

Let us estimate the cost of searching with a disassembler. At a minimum, the disassembler has to determine instruction lengths and convert a VA (Virtual Address) to an RVA (Relative Virtual Address), that is, map the address specified in a JMP to an offset in the file. Determining the length of an instruction is basically a fast algorithm (look up an array element and compute the next step from the flags stored in it), and address conversion is a couple of elementary address additions that depend on which section the address belongs to. On top of that comes a little extra work to recognize cheap tricks that replace the plain JMP next_block_address, such as:

        XOR eax, eax
        JZ next_block_address
        ; or
        PUSH next_block_address
        RET
        ; or
        MOV eax, next_block_address
        CALL eax

None of these algorithms is particularly scary in terms of performance, but this is still nothing like calculating a CRC32 of a short string at a given offset, and an angry tester complains that the detector has been chewing through the test database for half the night and has eaten the entire processor.
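The address conversion mentioned above really is a couple of additions plus a section lookup. Below is a minimal sketch that assumes a simplified, PE-like section table; the `Section` fields, the image base and the sample layout are invented for the example, and details of real formats (such as a raw size smaller than the virtual size) are ignored.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Section:
    name: str
    virtual_address: int  # RVA where the section starts in memory
    virtual_size: int     # size of the section in memory
    raw_offset: int       # where the section starts in the file

def va_to_file_offset(va: int, image_base: int,
                      sections: List[Section]) -> Optional[int]:
    """Map a virtual address to a file offset, or None if unmapped."""
    rva = va - image_base
    for s in sections:
        if s.virtual_address <= rva < s.virtual_address + s.virtual_size:
            return rva - s.virtual_address + s.raw_offset
    return None

# Hypothetical layout.
IMAGE_BASE = 0x400000
SECTIONS = [
    Section(".text", 0x1000, 0x2000, 0x400),
    Section(".data", 0x3000, 0x1000, 0x2400),
]

print(hex(va_to_file_offset(0x401010, IMAGE_BASE, SECTIONS)))  # 0x410
print(hex(va_to_file_offset(0x403004, IMAGE_BASE, SECTIONS)))  # 0x2404
print(va_to_file_offset(0x405000, IMAGE_BASE, SECTIONS))       # None
```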

As usual, if something is enabled but slow, you must either optimize it or try not to run it unnecessarily. The first option, alas, does not work: there is not much to optimize in a simple disassembler. So the AV developer turns to every antivirus's favorite place, the heuristic analyzer.

Heuristic analyzer or “showdown”

In the first article we already touched on heuristic analysis: there are indeed signs that indicate, with varying degrees of reliability, that code has been injected into a file. Back then the AV developer had singled out some signs that were suspicious but not strong enough to declare with 100% certainty that the file was infected. He simply commented them out, because he had spent a lot of time on them and was sorry to throw them away. Now they can be used to decide whether or not to run the more expensive, disassembly-based analysis of a file.

There is one more problem. Because heuristics react to anything suspicious, commercial protections arouse their genuine interest, so the AV developer had to add a couple of hundred “white” signatures for popular wrapper protections to the signature database; files that match them must not be touched. Thanks to them, we can still run various commercial software normally. And if you write your own software that manipulates executable code, it is a good idea to run all of your program's files through the popular antiviruses, for example on VirusTotal, before release. You need not worry much about the unpopular ones: a heuristic analyzer cannot be lifted as easily as a signature database, and an unpopular antivirus is unlikely to have an analyzer as good as one that has been developed for many years.
It is worth mentioning, of course, the virus writer's attempts to disguise his virus as a popular protection. For this he needs the signature itself, so he starts picking apart the antivirus database to understand where to put the necessary bytes so that the antivirus takes his virus for a protection. In any case, when making the next version of a virus it is useful to study the code that detects the current one. So antivirus databases are also objects of reverse engineering, and the detector code is also analyzed by virus writers.
But back to our heuristic analyzer. Here are several heuristic signs:

Entry point in a section that is open for writing (rwx). A writable, executable section to which control is transferred immediately most likely indicates self-modifying code; in the overwhelming majority of cases such sections are used by viruses and software protections.
Jump instruction at the entry point. There is no particular reason to place a jump at the entry point, and this sign points to the presence of self-modifying code in the file.
Entry point in the second half of a section. Viruses that extend a section are in most cases located at its end. This is not typical of normal files, so the situation is suspicious.
Damage in the header. Some header modifications made during infection leave the file operable, but the header itself contains errors that a linker would never produce. This is also suspicious.
Non-standard format of some service sections. Executable files contain service sections such as .ctors, .dtors, .fini, etc. Features of these sections can be used by viruses to infect a file, so a violation of the format of such a section is also suspicious.
... and a hundred more such signs

There may be many such signs. They carry different degrees of danger, and some are dangerous only in combination with others, but together they are a powerful tool for deciding whether a more thorough analysis is needed and whether the file is infected. Bypassing the heuristics (meaning that they do not even issue a warning) is not easy. It takes either platform-specific solutions that exploit features of particular compilers or frameworks (such as rewriting standard constructors or destructors), which quickly end up in the heuristic database, or really large and complex infectors that can embed code in the file with real quality.
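How signs with different weights combine into a decision can be sketched as a simple scoring function. Everything here is an assumption made for illustration: the `FileFacts` fields, the four signs chosen from the list above and especially the numeric weights, which in a real analyzer would come from testing against a large file collection.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class FileFacts:
    """Facts a parser would extract from the file (hypothetical fields)."""
    entry_section_flags: str       # e.g. "rx" or "rwx"
    entry_offset_in_section: int   # entry point offset from section start
    entry_section_size: int
    first_insn_is_jump: bool
    header_error_count: int

# Hypothetical weights; the text gives no concrete numbers.
WEIGHTS = {
    "rwx_entry_section": 40,
    "jump_at_entry": 20,
    "entry_in_second_half": 25,
    "header_errors": 15,
}

def fired_signs(f: FileFacts) -> List[str]:
    """Return the names of the heuristic signs that fire for this file."""
    signs = []
    if "w" in f.entry_section_flags and "x" in f.entry_section_flags:
        signs.append("rwx_entry_section")
    if f.first_insn_is_jump:
        signs.append("jump_at_entry")
    if f.entry_offset_in_section * 2 > f.entry_section_size:
        signs.append("entry_in_second_half")
    if f.header_error_count > 0:
        signs.append("header_errors")
    return signs

def score(f: FileFacts) -> int:
    """Sum the weights of all fired signs."""
    return sum(WEIGHTS[s] for s in fired_signs(f))

clean = FileFacts("rx", 0x100, 0x4000, False, 0)
suspicious = FileFacts("rwx", 0x3F00, 0x4000, True, 1)

print(score(clean), fired_signs(clean))            # 0 []
print(score(suspicious), fired_signs(suspicious))  # 100, all four signs
```

A plain sum cannot express "dangerous only in combination with others"; a real analyzer would need rules over sets of signs, which this sketch leaves out.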

When the heuristic signs say that “the file is 100% infected” but the heavy analysis has found nothing, the antivirus reports that the file is infected with a virus named something like “Generic Win32.Virus” or, in some products, “Some Win32 Virus”. Such messages often appear on all sorts of keygens, loaders, etc. In the previous article I already said that this is why installation instructions for pirated software say “disable the antivirus before installation”. I also want to draw attention once again to one of the most important information assets of antivirus companies: a collection of executable files large enough for the analyzer to be tested on it without fear of releasing a version that triggers on the legitimate files in that collection. Offended keygens and loaders are surely outraged that they are not added there promptly,

So, after working on the heuristics, the AV developer arrives at the following general detection algorithm:

Check the file with a regular signature search.
If a virus signature is found, treat the file as infected.
If a “white” protection is found, exit silently.
Check the file with the heuristic analyzer.
If no signs are found, exit.
If signs of sufficient weight are found, run the disassembly-based analysis.

In addition, if the heuristic signs are serious enough to indicate infection, consider the file infected regardless of whether the analyzer has found anything.
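The general algorithm above can be written down as a small decision function. This is a sketch, not the detector's actual code: the individual checks are passed in as callables so that the example stays self-contained, and both thresholds are hypothetical numbers chosen only to make the branches visible.

```python
from enum import Enum

class Verdict(Enum):
    CLEAN = "clean"
    INFECTED = "infected"
    WHITELISTED = "whitelisted"  # known protection: exit silently

def scan(data, signature_scan, is_whitelisted, heuristic_score, deep_scan,
         deep_threshold=30, infected_threshold=80):
    """Signature search, then white list, then heuristics, then disassembly."""
    if signature_scan(data):
        return Verdict.INFECTED
    if is_whitelisted(data):
        return Verdict.WHITELISTED
    score = heuristic_score(data)
    if score == 0:
        return Verdict.CLEAN
    # The expensive analysis runs only when the signs weigh enough.
    found = score >= deep_threshold and deep_scan(data)
    # Serious enough signs mean "infected" whatever the analysis found.
    if found or score >= infected_threshold:
        return Verdict.INFECTED
    return Verdict.CLEAN

def no(_data):
    return False

def yes(_data):
    return True

# Moderate score, and the deep scan confirms the infection.
print(scan(b"", no, no, lambda d: 50, yes))  # Verdict.INFECTED
# Moderate score, nothing confirmed.
print(scan(b"", no, no, lambda d: 50, no))   # Verdict.CLEAN
# Very high score: infected even though the deep scan found nothing.
print(scan(b"", no, no, lambda d: 90, no))   # Verdict.INFECTED
```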

A lot of work has been done, and the antivirus now recognizes infection with a very high degree of confidence even when it cannot identify the specific threat. Maintaining the test database of executable files makes it possible to add new heuristic signs safely as soon as new infection algorithms appear, so the antivirus can finally respond to threats before a new infection has had time to spread. It should be noted that while testing an antivirus on all the executable files in the world once seemed completely unrealistic, today a database of all executable files available on the WWW no longer looks like fiction. An executable file takes serious time to produce, and the world does not produce that many of them. Besides, testing on this huge database is easily parallelized, so training heuristics on vast arrays of data is quite realistic. The happy AV developer drinks his cocoa and goes to bed ...

"Mutants are coming." Metamorphism

This time the virus writer decides not to manipulate existing code but to generate new decryptor code in each new generation. This is metamorphism: the generation of new code in each new generation. Unlike permutation, here the code does not merely rearrange its blocks but actually changes its content. In theory this should mean the virus writer's unconditional victory over exact detection of his virus (heuristics, of course, have not gone anywhere). A signature made for one generation of the virus becomes useless for another, and even if it happens to keep detecting the virus, there is no guarantee it will still work on the next generation.

What is a metamorphic generator? The basis for generating each new decryptor is a kind of “base code”, and the language it is written in does not matter. It is stored inside the encrypted body of the virus, so it can stay constant. There, in the body of the virus, also lies the engine, which takes each instruction of this “base” code and generates new executable code from it every time. This is very similar to a compiler: semantic constructions go in, and code ready for the processor to execute comes out. A similar generation of executable code from base code happens in virtual machines, at the moment when a virtual machine on some platform executes prepared byte code. It is at that moment that the “base” byte code turns into a specific executable form that the processor understands. AND,

If we recall that what we generate is decryptor code, which is as independent as possible of where and when it is executed (it contains no system calls, does not access saved state, contains no complex objects) and works with data already prepared in memory at known offsets, then the task looks quite solvable. The generator takes three main parameters: the address of the encrypted buffer, its length and the key. Let there also be a seed for the pseudo-random generation of constants, future keys, and so on. The decryptor does contain conditional jumps, but only within its own body, which also simplifies the task a little.

Garbage generation

The virus writer decides to approach the problem by generating a set of unnecessary instructions and “stirring” the real decryptor code into them. Even if the original instructions stay unchanged, it will be very hard to pick out of a heap of other instructions the ones needed for signature comparison. Despite its unassuming name, the garbage generator is the most difficult and interesting part of a metamorphic engine, because, garbage or not, it has to generate executable code that does not break itself and does not spoil the main decryptor code. During the "mixing" it is necessary:

- track the offsets of characteristic points (jump targets, loop exits, and so on);
- make sure the garbage code does not clobber the registers the decryptor needs or the flags register.

MMX, SSE and floating-point instructions are very attractive candidates for garbage: you can easily generate as many as you need, as long as they do not touch the stack, do not write to general-purpose registers, and do not break the flags the decryptor needs. The first metamorphic code looks like this:

	mov ecx, 100h			; decryptor
lbl0:	mov eax, [esi + ecx]		; decryptor
	xor eax, edi			; decryptor
	mov [ebx], eax			; decryptor
	add ebx, 4h			; decryptor
	movd mm0, edx			; garbage
	movd mm1, eax			; garbage
	psubw mm1, mm0			; garbage
lbl1:	jcxz lbl2			; decryptor, loop exit
	psubw mm1, mm0			; garbage
	movd mm3, ecx			; garbage
	jmp lbl0			; decryptor, loop continues
lbl2:	sub ebx, 100h			; decryptor

The antivirus developer is not too worried: heuristics still flag the infected files (busy with the generator, the virus maker has no wish to bother with a serious infector), but they can no longer identify the specific virus. So on dark nights the antivirus developer is haunted by the thought of an infector that heuristics cannot catch, and detecting the vermin with 100% accuracy becomes an obsession. To identify a virus precisely, the detector has to be reworked: starting from the entry point, it must now step through the instructions, skip all the garbage, and add only the meaningful ones to the buffer being analyzed, which means the disassembler inside the detector starts to grow. Recall the NOP zones from the paragraph on permutation: skipping NOPs while filling the buffer for signature comparison was, in effect, the first attempt at this approach, because the detector skipped NOPs as garbage instructions. Now, instead of comparing against 0x90 (the NOP opcode), the antivirus developer uses a disassembler (the faster the better) that:

- advances the pointer to the start of the next instruction (a length disassembler);
- tells whether the instruction is garbage (NOP, MMX, SSE, and so on);
- adds meaningful instructions to the buffer being analyzed;
- on an unconditional jump, marks the jump target as the next address to analyze;
- on a conditional jump, marks both possible branches of the code for further analysis.
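The steps in this list can be sketched in a few lines of Python. This is only a minimal model, not a real detector: it assumes the instructions have already been decoded into tuples of the form (mnemonic, operands...), that branch targets are indices into that list, and that the garbage set is known in advance. The names `GARBAGE`, `CONDITIONAL`, `collect_meaningful` and `contains_signature` are invented for the illustration.

```python
# Toy model of the garbage-skipping detector (all names are hypothetical).
# An instruction is a tuple: (mnemonic, operand, ...); branch targets are list indices.
GARBAGE = {"nop", "movd", "movq", "psubw", "paddw", "pxor"}   # assumed garbage classes
CONDITIONAL = {"jcxz", "je", "jne", "jz", "jnz"}


def is_garbage(instr):
    return instr[0] in GARBAGE


def collect_meaningful(code, entry=0):
    """Walk the code from `entry`, following both sides of conditional jumps,
    and return the reachable non-garbage instructions in address order."""
    pending = [entry]
    seen = set()
    while pending:
        pc = pending.pop()
        while 0 <= pc < len(code) and pc not in seen:
            seen.add(pc)
            mnemonic = code[pc][0]
            if mnemonic == "jmp":
                pc = code[pc][1]                 # continue at the jump target
            elif mnemonic in CONDITIONAL:
                pending.append(code[pc][1])      # analyze the taken branch later
                pc += 1
            else:
                pc += 1
    buffer = []
    for index in sorted(seen):
        instr = code[index]
        if is_garbage(instr):
            continue
        is_branch = instr[0] == "jmp" or instr[0] in CONDITIONAL
        # Branch targets move when garbage is inserted, so keep only the mnemonic.
        buffer.append((instr[0],) if is_branch else instr)
    return buffer


def contains_signature(code, signature):
    """True if `signature` occurs as a contiguous run in the filtered buffer."""
    buf = collect_meaningful(code)
    n = len(signature)
    return any(buf[i:i + n] == signature for i in range(len(buf) - n + 1))


# The first metamorphic listing, hand-decoded for the example.
SAMPLE = [
    ("mov", "ecx", "100h"),
    ("mov", "eax", "[esi+ecx]"),     # lbl0 (index 1)
    ("xor", "eax", "edi"),
    ("mov", "[ebx]", "eax"),
    ("add", "ebx", "4h"),
    ("movd", "mm0", "edx"),
    ("movd", "mm1", "eax"),
    ("psubw", "mm1", "mm0"),
    ("jcxz", 12),                    # lbl1
    ("psubw", "mm1", "mm0"),
    ("movd", "mm3", "ecx"),
    ("jmp", 1),
    ("sub", "ebx", "100h"),          # lbl2 (index 12)
]
SIGNATURE = [("xor", "eax", "edi"), ("mov", "[ebx]", "eax"), ("add", "ebx", "4h"), ("jcxz",)]
```

Calling `contains_signature(SAMPLE, SIGNATURE)` finds the decryptor core even though garbage instructions sit between its pieces. A real detector would need an actual length disassembler and a far richer notion of garbage.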

Thus the antivirus developer assembles a buffer of the instructions that make up the main decryptor code and can run a signature comparison against it. This is still a fairly fast procedure, but while programming it the antivirus developer grows more and more uneasy: “Will I always be able to tell garbage instructions from important ones?” The virus maker, sensing this, refines the garbage generator. He now enlists the context-saving instructions: pushad / popad (push all general-purpose registers onto the stack and pop them back) and pushfd / popfd (the same for the flags register).

<pre>
	mov ecx, 100h			; decryptor
lbl0:	mov eax, [esi + ecx]		; decryptor
	xor eax, edi			; decryptor
	mov [ebx], eax			; decryptor
	add ebx, 4h			; decryptor
	pushad				; save registers
	pushfd				; save flags
	mov eax, 12321h			; garbage
	xor edx, edx			; do whatever we want
	sub eax, esi			; keep littering
	popfd				; restore flags
	popad				; restore registers
lbl1:	jcxz lbl2			; decryptor, loop exit
	pushad				; save registers
	pushfd				; save flags
	shr ebx, 4			; garbage
	popfd				; restore flags
	popad				; restore registers
	jmp lbl0			; decryptor, loop continues
lbl2:	sub ebx, 100h			; decryptor
</pre>

Now the analyzer's disassembler has to track not only which instructions it is looking at, but also whether they lie inside a “do whatever we want” region. That means the disassembler acquires global variables that record where we are in the program. Things are getting more interesting. In general, context-saving instructions are a red rag to a bull for any reverse engineer: when analyzing executables, every encounter with such an instruction means “put a breakpoint here!”.
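The extra state the disassembler needs can be as small as a nesting counter. The sketch below is a simplified assumption, not a description of any real product: it treats everything between a pushad and its matching popad as garbage, and the function name `strip_context_saved_blocks` is invented for the example.

```python
# Toy filter: drop everything between pushad and the matching popad.
# Instructions are pre-decoded tuples (mnemonic, operand, ...); names are hypothetical.
def strip_context_saved_blocks(code):
    kept = []
    depth = 0                        # how many pushad blocks we are currently inside
    for instr in code:
        mnemonic = instr[0]
        if mnemonic == "pushad":
            depth += 1               # entering a "do whatever we want" region
        elif mnemonic == "popad" and depth > 0:
            depth -= 1               # leaving the region
        elif depth == 0:
            kept.append(instr)       # only code outside such regions is kept
    return kept


# The pushad/popad listing above, hand-decoded for the example.
LISTING = [
    ("mov", "ecx", "100h"),
    ("mov", "eax", "[esi+ecx]"),
    ("xor", "eax", "edi"),
    ("mov", "[ebx]", "eax"),
    ("add", "ebx", "4h"),
    ("pushad",), ("pushfd",),
    ("mov", "eax", "12321h"), ("xor", "edx", "edx"), ("sub", "eax", "esi"),
    ("popfd",), ("popad",),
    ("jcxz", "lbl2"),
    ("pushad",), ("pushfd",),
    ("shr", "ebx", "4"),
    ("popfd",), ("popad",),
    ("jmp", "lbl0"),
    ("sub", "ebx", "100h"),
]
CLEAN = strip_context_saved_blocks(LISTING)
```

`CLEAN` holds only the eight decryptor instructions. The weakness is plain from the code: the filter trusts that nothing meaningful ever happens inside a saved region.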

The next iteration in the evolution of metamorphic code is to generate the required action in different ways, using various arithmetic operations and all sorts of assembly tricks. Something like this:

"Basic instruction"	generated code 1	generated code 2
virt_mov eax, 10h	mov eax, 20h; sub eax, 10h	mov edx, 10h; mov eax, edx
virt_mov ecx, 08h	xor ecx, ecx; add ecx, 08h	mov ecx, 04h; add ecx, 04h
virt_sub eax, ecx	neg ecx; add eax, ecx	mov edx, ecx; sub eax, edx
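One way to convince yourself that the variants in the table really do the same job is to run them on a toy register model. The sketch below is purely illustrative: the interpreter covers only the five mnemonics used in the table, ignores flags, and the names `run`, `same_result` and `TABLE` are made up. It compares only the destination register, because some variants clobber a scratch register (edx or ecx) along the way.

```python
import random

MASK = 0xFFFFFFFF   # registers are modeled as 32-bit values


def run(program, regs):
    """Execute a list of (op, dst[, src]) tuples on a copy of `regs`."""
    regs = dict(regs)
    for op, dst, *src in program:
        val = None
        if src:
            val = regs[src[0]] if isinstance(src[0], str) else src[0]
        if op == "mov":
            regs[dst] = val & MASK
        elif op == "add":
            regs[dst] = (regs[dst] + val) & MASK
        elif op == "sub":
            regs[dst] = (regs[dst] - val) & MASK
        elif op == "xor":
            regs[dst] ^= val
        elif op == "neg":
            regs[dst] = (-regs[dst]) & MASK
        else:
            raise ValueError("unsupported mnemonic: " + op)
    return regs


def same_result(dst, base, variant, trials=200, seed=0):
    """Compare the value left in `dst` for random starting register states."""
    rng = random.Random(seed)
    for _ in range(trials):
        start = {r: rng.getrandbits(32) for r in ("eax", "ecx", "edx")}
        if run(base, start)[dst] != run(variant, start)[dst]:
            return False
    return True


# (destination register, base instruction, generated variants) for each table row.
TABLE = [
    ("eax", [("mov", "eax", 0x10)],
     [[("mov", "eax", 0x20), ("sub", "eax", 0x10)],
      [("mov", "edx", 0x10), ("mov", "eax", "edx")]]),
    ("ecx", [("mov", "ecx", 0x08)],
     [[("xor", "ecx", "ecx"), ("add", "ecx", 0x08)],
      [("mov", "ecx", 0x04), ("add", "ecx", 0x04)]]),
    ("eax", [("sub", "eax", "ecx")],
     [[("neg", "ecx"), ("add", "eax", "ecx")],
      [("mov", "edx", "ecx"), ("sub", "eax", "edx")]]),
]
ALL_EQUIVALENT = all(same_result(dst, base, v) for dst, base, variants in TABLE for v in variants)
```

Random testing of this kind is evidence, not proof; it also says nothing about flags or about which other registers a variant destroys.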

Constants, for example, can all be handled like this. Suppose the “base code” contains two instructions, “virt_mov edx, 10h” and “virt_mov ecx, 100h”. When generating new code, the engine picks a random constant, say 50h, and uses it for all absolute values: “virt_mov edx, 10h” mutates into “mov edx, 50h; sub edx, 40h”, and “virt_mov ecx, 100h” into “mov ecx, 50h; add ecx, 0B0h”. Different constants produce different byte patterns, which forces the antivirus developer to add more logic to the disassembler and to implement wildcards in instruction-level signatures, something like “mov eax, <wildcard-constant>; <skip trash>; mov ecx, <wildcard-constant>”. That is neither very simple nor very fast, and on the whole trouble is in the air...
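A signature of the “mov eax, <wildcard-constant>; <skip trash>; mov ecx, <wildcard-constant>” kind might be matched roughly as follows. This is a sketch under simplifying assumptions: instructions are pre-decoded tuples, the trash predicate is supplied by the caller, and `WILDCARD`, `instr_matches` and `match_signature` are names invented for the example.

```python
WILDCARD = "?"   # hypothetical marker: matches any operand in that position


def instr_matches(instr, pattern):
    """An instruction matches if every field equals the pattern or the pattern is a wildcard."""
    return len(instr) == len(pattern) and all(
        p == WILDCARD or p == field for field, p in zip(instr, pattern)
    )


def match_signature(code, signature, is_trash):
    """True if the signature items occur in order with only trash between them."""
    for start in range(len(code)):
        pos = start
        matched = True
        for pattern in signature:
            while pos < len(code) and is_trash(code[pos]):
                pos += 1                          # <skip trash>
            if pos >= len(code) or not instr_matches(code[pos], pattern):
                matched = False
                break
            pos += 1
        if matched:
            return True
    return False


def is_trash(instr):
    # Assumed trash classes for the example only.
    return instr[0] in {"nop", "movd", "psubw"}


SIGNATURE = [("mov", "eax", WILDCARD), ("mov", "ecx", WILDCARD)]
GENERATION_A = [("mov", "eax", "1234h"), ("movd", "mm0", "edx"), ("nop",), ("mov", "ecx", "77h")]
GENERATION_B = [("nop",), ("mov", "eax", "0BEEFh"), ("mov", "ecx", "5h")]
OTHER_CODE = [("mov", "eax", "1234h"), ("mov", "ebx", "77h")]
```

Both generations match despite different constants and different trash, while `OTHER_CODE` does not. The cost is visible too: the matcher restarts at every position, which is part of why this is “not very fast”.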

Having studied the detector code, the virus maker now wants to vary, besides the constants and the data in instructions, the whole set of registers used in the decryptor. To afford this he has to partition the registers: some are working registers for the decryptor, the rest belong to the garbage generator. The garbage generator then never touches the working registers and does not clobber the flags register either. For example, the entire decryptor may work only with eax, edx and esi. Then every instruction produced by the garbage generator must work only with ebx, ecx and edi and must not change the flags. The register assignment should change in every new generation of the virus.

. . . 
mov eax, 10h	; decryptor
mov ebx, 20h	; decryptor; ebx is a garbage register, so xchg may clobber it
xchg edx, ebx	; used to load 20h into edx
xor ecx, ecx	; garbage
inc ebx		; garbage
add ecx, ebx	; garbage
add eax, edx	; decryptor
mov edx, [esi]	; decryptor
xchg edi, ebx	; garbage
cmp edx, 0	; decryptor
. . .			;

In general, the garbage generator may change “forbidden” registers and flags as long as it restores their state afterwards. In that case the “real” decryptor instructions cannot be inserted at arbitrary places in the garbage buffer, only at points where the register and flag values are “clean”.
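From the detector's side, a changing register assignment means a signature can no longer name concrete registers. One possible countermeasure, sketched here as an assumption rather than a description of any real engine, is to rename registers in order of first appearance before comparing. The names `REGISTER_RE` and `normalize_registers` are invented; esp is deliberately left out because the stack pointer cannot be reassigned freely.

```python
import re

# General-purpose registers that a generator could permute (esp is excluded on purpose).
REGISTER_RE = re.compile(r"\b(eax|ebx|ecx|edx|esi|edi|ebp)\b")


def normalize_registers(code):
    """Rename registers to r0, r1, ... in order of first appearance."""
    names = {}

    def rename(match):
        return names.setdefault(match.group(1), "r%d" % len(names))

    normalized = []
    for instr in code:
        operands = tuple(REGISTER_RE.sub(rename, operand) for operand in instr[1:])
        normalized.append((instr[0],) + operands)
    return normalized


# Two hypothetical generations of the same three instructions.
GENERATION_A = [("mov", "eax", "10h"), ("add", "eax", "edx"), ("mov", "edx", "[esi]")]
GENERATION_B = [("mov", "ebx", "10h"), ("add", "ebx", "ecx"), ("mov", "ecx", "[edi]")]
DIFFERENT = [("mov", "ebx", "10h"), ("add", "ecx", "ebx"), ("mov", "ecx", "[edi]")]
```

After normalization `GENERATION_A` and `GENERATION_B` become identical, while `DIFFERENT`, which uses the registers in another pattern, does not. The approach only works once garbage has been filtered out, since garbage instructions would shift the numbering.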

Building an instruction-level signature becomes a truly serious challenge when various assembly tricks are used that implement the same action with completely different patterns, for example:

"Basic instruction"	generated code 1	generated code 2
virt_push eax	sub esp, 04h; mov [esp], eax	mov edx, esp; sub edx, 04h; mov [edx], eax; mov esp, edx
virt_mov eax, ebx	lea eax, [ebx]	push ebx; xchg eax, ebx; pop ebx

There is a huge number of such patterns, and they are easy to find if you look. The generator now becomes far more complicated, because handling offsets, the stack, flags and so on gets much harder. But this is another serious step toward the ideal generator.
So, to produce code in which the detector finds it as hard as possible to tell whether the current instruction is garbage, the generator must emit code with the following properties:

- it consists of ordinary integer instructions that work with general-purpose registers;
- it does not use context save/restore instructions to separate garbage code from real code;
- the set of registers, in both the garbage blocks and the decryptor instructions, differs in every generation;
- base decryptor instructions are turned into instruction blocks of varying length;
- the byte structure of the mutated instructions is as varied as possible.

"Our response to Chamberlain." Emulation

Suppose that after 42 months of work the virus maker has written a nearly perfect metamorphic generator, and the detector can no longer sort instructions into garbage and non-garbage and therefore cannot collect enough data for a signature comparison. Even so, the antivirus developer has an answer in reserve, one that is just as hard to implement but can detect a specific virus even when it uses such advanced metamorphism. While confronting ever newer generators and building ever more complex disassembler signatures, the disassembling engine has reached a state where, besides the current instruction, it also keeps that instruction's entire environment, following changes to registers, flags, the stack pointer and so on. Looking critically at the resulting code, the antivirus developer suddenly realizes that what has actually been programmed is a software model of the processor. As it walks through the instructions, the code updates variables that correspond to registers, watches the flags to predict conditional jumps, tracks the top of the stack, and so on; in other words, it virtually executes the code it reads. Detection that relies on such an emulation engine is called emulation.

Recall how crackers remove wrapper protection from a file: they stop it on the first instruction at the moment when the working, unpacked code is in memory (at the Original Entry Point, OEP). To explain how an emulator can help the detector reach the tasty encrypted payload code, I will briefly describe one simple but effective way to stop a program at the OEP. It relies on the fact that when the main program starts, the stack pointer must be set to its original value, because the stack holds data about the program's environment: arguments, environment variables, and so on. So one can be confident that once the protection has finished, esp returns to the value it had at the start. The cracker stops the program right at the entry point, notes the value of the stack pointer esp, and sets a conditional breakpoint that halts the program when esp again equals the value recorded at launch. Very likely the program will be at the OEP at that moment (or at least at the top level of the decryptor in terms of function nesting). A virus decryptor (if it uses the stack, of course) must also put the stack pointer back, and the detector code stepping through the instructions can follow the pointer by keeping a cur_esp variable and updating it each time it meets an instruction that changes esp.

. . .			; base_esp = cur_esp;	
push eax		; cur_esp -= 4;
mov eax, 1h		; -
push edx		; cur_esp -= 4;
. . . 			; -
pop edx			; cur_esp += 4;
pop eax			; cur_esp += 4; (cur_esp == base_esp) !!!
. . .			; the decryptor, or the whole virus, may have finished by this point

At exactly this moment, when the stack has been restored, the decrypted virus body is in memory, and it contains tasty data for a constant signature. With a virus it is not even necessary to wait for the whole unpacking: the decryptor itself most likely holds the same constant data, such as the key length, the block length, and an offset (which depends on the section alignment in the file). In other words, if you follow the decryptor instruction by instruction, then whatever code the metamorphic generator has produced, a moment is bound to come when data characteristic of this particular virus lies somewhere in memory or in the registers. Catch that moment and you can tell which virus it is. Stepping through the instructions also makes it possible to watch for dangerous actions: calls to suspicious APIs, writes to suspicious files, and so on.
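The cur_esp bookkeeping described above can be modeled in a few lines. This is a sketch under stated assumptions: instructions arrive already decoded as tuples, only a handful of stack-affecting mnemonics are covered, and the starting value of esp as well as the names `STACK_DELTA` and `find_stack_restore_points` are arbitrary choices for the illustration.

```python
# Bytes added to esp by each instruction in 32-bit mode (the stack grows downward).
STACK_DELTA = {"push": -4, "pop": 4, "pushfd": -4, "popfd": 4, "pushad": -32, "popad": 32}


def find_stack_restore_points(code, base_esp=0x0012FFC4):
    """Return indices of instructions after which esp is back at its starting value.

    `base_esp` is an arbitrary illustrative value; only the differences matter.
    """
    cur_esp = base_esp
    moved = False
    points = []
    for index, instr in enumerate(code):
        mnemonic = instr[0]
        if mnemonic in STACK_DELTA:
            cur_esp += STACK_DELTA[mnemonic]
        elif mnemonic in ("add", "sub") and instr[1] == "esp":
            cur_esp += instr[2] if mnemonic == "add" else -instr[2]
        if cur_esp != base_esp:
            moved = True
        elif moved:
            points.append(index)     # the stack has just been restored
            moved = False
    return points


# The push/pop listing above, hand-decoded for the example.
LISTING = [("push", "eax"), ("mov", "eax", 1), ("push", "edx"), ("pop", "edx"), ("pop", "eax")]
FRAME = [("sub", "esp", 8), ("mov", "eax", 1), ("add", "esp", 8), ("mov", "ebx", 2)]
```

`find_stack_restore_points(LISTING)` returns `[4]`, the final `pop eax`, which is the point where a detector would inspect memory and registers. Code that moves esp through computed values would need real emulation rather than this fixed table.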

Only a small matter remains: automating the process. As I have already mentioned, an emulator is a software model of a processor that executes the code of our file. It is a slacker of a processor at that, because it does not need to write to real memory or do input/output; it is interested only in whatever allows the program to be stopped at the right place. It does not know MMX or SSE, and in general the less it can do (while still performing its function), the better (because it is a slacker only in name and is in fact very heavyweight). Suppose that at some point the decryptor puts the string “BANANAS” on the stack; knowing this, the antivirus developer can execute the code on the virtual processor while constantly checking the top of the stack for that string.
The emulator engine contains variables that correspond to registers and flags, memory for emulating the stack, and so on. I deliberately kept the pushad / popad blocks to show that the emulator can also skip whole blocks of code rather than individual instructions, since emulation is not a cheap procedure. Here is how it works (let this “BANANAS\0” lie at the address in ESI).

	mov ecx, 0h			; ecx_var = 0;
lbl0:	mov eax, [esi + ecx]		; esi and ecx are known (from earlier emulation),
					; so load from the right place in memory
					; (the pointer to "BANANAS")
	xor eax, edi			; eax_var = eax_var XOR edi_var;
	push eax			; esp_var -= 4; *esp_var = eax_var;
	pushad				; enter idle mode
	pushfd				; skip
	mov eax, 12321h			; skip
	xor edx, edx			; skip
	sub eax, esi			; skip
	popfd				; skip
	popad				; leave idle mode
					; "high-quality" garbage instructions:
					; the emulator does not know they are garbage
					; and has to execute them virtually
	mov edx, 23h			; edx_var = 23h;
	or edx, eax			; edx_var = edx_var OR eax_var;
lbl1:	inc ecx				; ecx_var++;
	cmp ecx, 8h			; zero_flag = (ecx_var == 8);
	je lbl2				; if (zero_flag) { goto lbl2; }
	pushad				; enter idle mode
	pushfd				; skip
	shr ebx, 4			; skip
	popfd				; skip
	popad				; leave idle mode
	jmp lbl0			; goto lbl0
lbl2:	sub ebx, 100h			; "BANANAS" is on the stack: gotcha!
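The walk-through above can be turned into a tiny runnable model. It is a deliberately reduced sketch, not a real emulator: the instruction set is a toy one, `loadb` is a made-up mnemonic that loads a single byte (the listing loads dwords), flags are modeled only through `cmp`, and the emulated stack is simply a list of the low bytes pushed so far, oldest first. All names are hypothetical.

```python
REGS = ("eax", "ebx", "ecx", "edx", "esi", "edi")


def emulate(code, memory, initial, marker, max_steps=10_000):
    """Run toy code and report whether `marker` ever appears in the pushed bytes."""
    regs = {name: 0 for name in REGS}
    regs.update(initial)
    stack = []          # low byte of every pushed value, oldest first
    zero = False        # the only flag this toy model knows about
    idle = 0            # nesting depth of pushad/popad "idle mode"
    pc = 0

    def value(x):
        return regs[x] if isinstance(x, str) else x

    for _ in range(max_steps):
        if pc >= len(code):
            break
        op = code[pc]
        name = op[0]
        pc += 1
        if name == "pushad":
            idle += 1
        elif name == "popad":
            idle = max(0, idle - 1)
        elif idle:
            continue                      # skip everything inside a saved block
        elif name == "mov":
            regs[op[1]] = value(op[2])
        elif name == "loadb":             # reg = memory[base + index]
            regs[op[1]] = memory[regs[op[2]] + regs[op[3]]]
        elif name == "xor":
            regs[op[1]] ^= value(op[2])
        elif name == "or":
            regs[op[1]] |= value(op[2])
        elif name == "sub":
            regs[op[1]] = (regs[op[1]] - value(op[2])) & 0xFFFFFFFF
        elif name == "inc":
            regs[op[1]] += 1
        elif name == "cmp":
            zero = regs[op[1]] == value(op[2])
        elif name == "je":
            if zero:
                pc = op[1]
        elif name == "jmp":
            pc = op[1]
        elif name == "push":
            stack.append(regs[op[1]] & 0xFF)
            if bytes(stack[-len(marker):]) == marker:
                return True               # the marker is on the stack: caught
        else:
            raise ValueError("unsupported mnemonic: " + name)
    return False


KEY = 0x5A
ENCRYPTED = bytes(b ^ KEY for b in b"BANANAS\0")
PROGRAM = [
    ("mov", "ecx", 0),
    ("loadb", "eax", "esi", "ecx"),       # lbl0 (index 1)
    ("xor", "eax", "edi"),
    ("push", "eax"),
    ("pushad",), ("mov", "eax", 0x12321), ("popad",),
    ("mov", "edx", 0x23), ("or", "edx", "eax"),   # garbage that must still be executed
    ("inc", "ecx"),
    ("cmp", "ecx", 8),
    ("je", 13),
    ("jmp", 1),
    ("sub", "ebx", 0x100),                # lbl2 (index 13)
]
```

With the right key in edi, `emulate(PROGRAM, ENCRYPTED, {"esi": 0, "edi": KEY}, b"BANANAS")` returns True after the seventh push; with a wrong key the loop runs to completion and the result is False.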

Of course, emulation has to handle a great many special situations, and there are never enough resources to emulate every instruction honestly while also remembering to perform the necessary environment checks. That is why emulators detect loops, hand parts of the code over to the real processor, and so on; as with metamorphic generators, there is more than enough room for creativity.
So, somewhere in an ideal world there is an ideal metamorphic generator that produces absolutely undetectable code. And in the same place, as its counterpart, there is an ideal emulator that can execute this metamorphic code and detect it. Is there anywhere further for the topic of self-modifying code to go?

Philosophical questions

Let us try to broaden the topic of modifying executable code in each new generation. After all, it is precisely the variability of each new generation that underlies evolution, so in the context of information technology the topic of self-modifying code smacks of creationism. Who knows, perhaps we are gradually creating a new universe and a new kind of life?

When we looked at encrypting the buffer that holds the main virus code, we were talking about changing the virus at the level of bytes, that is, changing the set of bytes that make up the virus. This is the most primitive level: a single byte carries little information about the properties of the virus, and changes at this level do not produce real variation between generations. By analogy with the development of life, it resembles the variety of simple chemical compounds. Many simple compounds in various combinations (water, ammonia, carbon dioxide, acid residues and hydroxyl groups): for millions of years this cocktail could not produce anything complex. But in the end, successful combinations led to the appearance of complex organic molecules, the basis of life.

When we looked at mutating the decryptor with a metamorphic generator, we were treating the virus as a set of instructions rather than bytes. This is an important point: it means we are now working with information elements of the next order. We now deal with “zeroing eax” without going into how exactly it is done (xor eax, eax or sub eax, eax), and all our disassembler-based detectors replace byte-level detection with detection by instruction chain. Replacing one instruction with another also changes the set of bytes, so this level includes the previous one, and in terms of the evolution of species it is far more advanced. By changing the set of instructions you can achieve more targeted variation than by blunt pseudo-random “shuffling” of bytes. The biological analogue would probably be amino acids.

To continue the analogy, the next level at which a virus can change from generation to generation is functions, that is, entities made up of a set of instructions. Picture a huge collection of functions with several copies for each piece of functionality, for example many functions that search for a file to infect, encrypt and decrypt the virus body, infect the victim file, and so on. In each generation of the virus the set of functions in use is recombined, changing all of the internal functionality. Each function is written in its own way and contains a different set of instructions and bytes, so this level also includes all the previous ones. If such a scheme can be implemented without architectural vulnerabilities, the virus could in theory be detected only by a detector that also works with code at the level of functions, in effect evaluating the behavioral scenario of the virus rather than individual instructions. I am not sure that anything like an engine able to mutate at the function level exists today, although in several articles I have read about ideas such as a virus downloading its own parts from the virus maker's host and replacing its own functionality, or genetic algorithms (where two viruses exchange functional blocks with each other). Nevertheless, the huge number of high-level languages and frameworks should in theory encourage the appearance of such programs, though, it seems to me, no longer in the field of viruses (more on that below). Perhaps the biological analogue of the functions of such a virus is proteins. Intracellular functions can be provided by different kinds of proteins, and proteins can be replaced by others while the cell remains whole.

Above that there is only mutation at the level of the program's overall algorithm. For example, a virus that the antivirus detects today becomes, in the next generation, an entirely different program, kind and fluffy, and the antivirus simply has no need to detect it, hehe. I would like to elaborate on the idea as in the previous paragraph, but... that is science fiction.

Salvage triumphs over evil

So where is all this, you ask? Where are the thousands of terrible metamorphic generators that have flooded users' computers and sent hundreds of antivirus company programmers to psychiatric clinics, where are the terrible virus outbreaks that bring networks down for weeks, where is all that? In my opinion there are several reasons.

The first, technical, reason is environmental constraints. Before the NT kernel, NTFS and Linux reached home PCs, virus code lived very freely: write wherever you want, execute wherever you want. Today exploiting an infected system is much harder and not nearly as interesting. I do not want to say all is lost, but only dreams of former power remain, and every year the situation gets worse: file and process permissions, file signatures, online validation of software being launched, all of this has nearly killed off “purebred” computer viruses. But who knows: according to reports the mobile market is growing very actively; does that mean mobile developers will have to step on the same rakes as the developers of the big operating systems? I very much hope not, and that the protection technologies have fully carried over to mobile devices.

The main reason, as it seems to me, is the programmer's qualifications. If you are able to write a good metamorphic generator that keeps specialists busy for at least a few days, or an emulator that detects a signature inside a high-quality virus engine, or a high-quality crackme that earns public respect, then... just write to me. I am not a recruiter, but if there are many of you, I will change my profession and simply introduce you to companies that work in security. Rest assured, your income and stability in life will outweigh many times over what you could get by spreading viruses or cracking software. This is the main reason: advanced engines are written by a very small number of enthusiasts with very good training, who in most cases find far more attractive uses for their talent in life.

Epilogue

It is a pity that code able to generate modified copies of itself has so far found no use other than hiding from antivirus software and resisting cracking. I sincerely wish these methods further development, preferably without any leaning toward malware. And may the programs created by the human mind come a little closer to what we call life.
