MS-DOS virus world

https://blog.benjojo.co.uk/post/dive-into-the-world-of-dos-viruses
image

This post is a text version of a talk that I gave at the 35th Chaos Communication Congress in late 2018.

I have to admit that I never really used MS-DOS in anger, although MS-DOS malware has always fascinated me. But first we must ask: “What is DOS?”

  • DOS is modelled on CP/M, another very old operating system.
  • The DOS family covers a wide range of vendors; just because something is a DOS does not mean that it will run on an 8086 CPU or better.
  • Some of these DOS vendors are API compatible with each other, which means that some of them share malware!


image

image

But really, most of our memories of the DOS era are of the aesthetics of the computers of that time:

image

This is the era of “beige computing” and the Model M keyboard, which may be famous or infamous depending on whether you like noisy keyboards or not.

image

Some of us may have memories of using DOS, and some may still use DOS!

image

For example, George R. R. Martin, who wrote Game of Thrones, is rumored to use WordStar on DOS to write his books!

image

We also cannot forget QBASIC; for many, it was their first introduction to programming!

image

But life with DOS was not always so good. Sometimes you would be using DOS and suddenly things like this happened. In this example, a small tune is played during printing, which could be very awkward in an office environment.

image

Some of them are more “cute”. In this case, for example, an ambulance drawn in ASCII characters drives across the screen, and then the program that you wanted to open is launched; a mild inconvenience at worst.

Thanks to a group of malware archivists operating under the name VX Heavens, we have a good historical archive of DOS malware, or at least we did until the Ukrainian police raided the site:
On Friday, March 23, the server was seized by the police in connection with a criminal investigation (Article 361-1 of the Criminal Code of Ukraine - the creation of malicious programs for the purpose of their sale or distribution) based on someone's tip-off about “free access of malicious software designed for unauthorized hacking of computers, automated systems, computer networks”.

Fortunately, popular torrent sites still have copies of the site database, which can provide us with a wonderful set of data:

$ tar -tvf viruses-20070914.tar | wc -l
66714
$ ls -alh viruses-20070914.tar
6.6G viruses-20070914.tar

However, before we start exploring these samples, we first need to understand how they typically spread, given that these programs worked in the pre-Internet era:

image

After you get an infected file onto your system and run it, the malware will either actively search for files to infect, or install system call hooks to catch the programs that you run. It often does this in a subtle, invisible way to avoid detection. Subtlety matters because, to spread, the malware needs you to carry it to another system on media (floppy disks) or upload it to another distribution point, such as a BBS.

image

At runtime, the malware has two options: it can either stay hidden and infect new files, or show its payload.

Some payloads are pretty beautiful! The example below uses unusual features, such as 256 colors:

image

Or this one, which plays with your screen buffer:

image

image

However, for the most part the malware will stay silent and try to find files to infect. Infecting most files is very simple. For example, if you picture a COM file as one long tape of machine code:

image

Then “all you have to do” is insert a JMP at the beginning of the program and append your own code to the end of the program. It will look something like this:

image
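Seen from the analysis side, that layout is easy to spot. The following is a minimal sketch of my own (not code from the talk) that checks whether a COM file starts with a near JMP (opcode E9) landing in the tail of the file. The function names and the 25% tail threshold are assumptions chosen for illustration, and plenty of clean programs also start with a JMP, so treat it as a hint rather than a detector.

```python
COM_LOAD_OFFSET = 0x100  # DOS loads a COM file at offset 0x100 of its segment


def entry_jump_target(com: bytes):
    """Return the file offset a leading near JMP (opcode 0xE9) lands on, or None."""
    if len(com) < 3 or com[0] != 0xE9:
        return None
    rel16 = int.from_bytes(com[1:3], "little")
    # The displacement is relative to the instruction after the 3-byte JMP
    # and wraps at 64 KB, because IP is a 16-bit register.
    target_ip = (COM_LOAD_OFFSET + 3 + rel16) & 0xFFFF
    return target_ip - COM_LOAD_OFFSET


def looks_appended(com: bytes, tail_fraction: float = 0.25) -> bool:
    """Heuristic (assumed threshold): does execution start by jumping into the tail?"""
    target = entry_jump_target(com)
    if target is None or not (0 <= target < len(com)):
        return False
    return target >= len(com) * (1 - tail_fraction)


# Synthetic example: JMP over 20 bytes of "host" code into 4 appended bytes.
sample = bytes([0xE9, 0x14, 0x00]) + b"\x90" * 20 + b"\xCC" * 4
print(entry_jump_target(sample), looks_appended(sample))
```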

Some samples were smarter: they found “empty space” inside the binary and wrote themselves there. This avoided increasing the size of the binary, which an antivirus could otherwise treat as a red flag.

image

However, earlier I also mentioned hooking system calls. Although the MS-DOS runtime is very simple and practically unprotected (you can trivially boot Linux from a COM file), it still provides a full API so that applications do not need their own file system implementation. Here are some of the system calls:

image

They work by triggering a software interrupt, in which the program asks the processor to jump to another section of system memory to handle something:

image

MS-DOS also offers the ability to add or change these handlers (using yet another call), which lets you extend the system so that new drivers can be loaded at runtime. However, this is also an ideal place to install malware hooks:

image

This was a well-used trick, since you could hook the “Open File” call and then use it to discover new executable files on the system ... and infect them.
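To make the hooking idea concrete, here is a toy model in Python. It is only an illustration under simplifying assumptions: the "vector table" is a dictionary, registers are a dictionary, and the file name is passed directly instead of through DS:DX as real DOS would do. All names here are hypothetical. The point it shows is that a hook saves the old handler, looks at the call, and then chains to the original so nothing appears to change.

```python
def dos_int21(regs):
    """Stand-in for the real DOS handler; it only knows 'open file' (AH=3Dh)."""
    if regs["ah"] == 0x3D:
        return f"opened {regs['filename']}"
    return "unsupported"


# Toy vector table: interrupt number -> handler function.
vectors = {0x21: dos_int21}
seen_files = []


def install_hook(table, number, make_hook):
    """Replace a handler, keeping the old one so the hook can chain to it."""
    original = table[number]
    table[number] = make_hook(original)


def make_open_logger(original):
    def hooked(regs):
        if regs["ah"] == 0x3D:  # intercept "open file"
            seen_files.append(regs["filename"])
        return original(regs)  # then behave exactly like the old handler
    return hooked


install_hook(vectors, 0x21, make_open_logger)
result = vectors[0x21]({"ah": 0x3D, "filename": "GAME.COM"})
print(result, seen_files)
```

The same mechanism is what lets a tracer observe a program: a hook that only records calls is an analysis tool, while one that modifies the files it sees is the malware case described above.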

As a quick example of how they are used, let's take a look at the simple “Hello World” program:

image

As we can see, there are two int calls here. We use 21h (h = hex) as the main system call number, and we specify which action we want MS-DOS to perform through the value in the AH register:

image

In this case, the program makes a call to print the string, and then quits with a return code of 0 (unspecified).
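For reference while reading traces, a small lookup table of AH values is handy. The sketch below covers only a handful of well-known int 21h functions (print string, file access, date and time, interrupt vectors, exit); the table is far from complete and the helper name is my own.

```python
# A few well-known int 21h functions, keyed by the value in AH.
INT21_FUNCTIONS = {
    0x09: "print $-terminated string (DS:DX)",
    0x25: "set interrupt vector",
    0x2A: "get system date",
    0x2C: "get system time",
    0x30: "get DOS version",
    0x35: "get interrupt vector",
    0x3D: "open file",
    0x3E: "close file",
    0x3F: "read from file or device",
    0x40: "write to file or device",
    0x4C: "terminate with return code",
}


def describe_int21(ah: int) -> str:
    """Translate an AH value into a human-readable function name."""
    return INT21_FUNCTIONS.get(ah, f"unknown function {ah:02X}h")


# The two calls made by a typical "Hello World" program:
for ah in (0x09, 0x4C):
    print(f"AH={ah:02X}h: {describe_int21(ah)}")
```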

As mentioned earlier, when you call int 21h, the CPU looks up where to go in the Interrupt Vector Table (IVT). Inside the handler there is often a router-like section that dispatches the individual calls; in the case of int 21h, it routes to different functions based on the value of AH. Once we get there, the actual call handler does the work and then executes iret to return to the main program, often leaving the results of the call in registers:

image
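The table lookup itself is simple enough to sketch. In real mode the IVT starts at physical address 0, and each of the 256 entries is 4 bytes: a 16-bit offset followed by a 16-bit segment. The Python below reads an entry out of a memory image; the image and the handler address 0019:40F8 are made up for the example, not taken from a real DOS install.

```python
def ivt_entry(memory, number: int):
    """Return (segment, offset) of the handler for interrupt `number`.

    Each entry is 4 bytes at physical address number * 4:
    a little-endian 16-bit offset, then a little-endian 16-bit segment.
    """
    base = number * 4
    offset = int.from_bytes(memory[base:base + 2], "little")
    segment = int.from_bytes(memory[base + 2:base + 4], "little")
    return segment, offset


# Made-up memory image: point int 21h at 0019:40F8 (an arbitrary example).
memory = bytearray(1024)  # 256 entries * 4 bytes
memory[0x21 * 4:0x21 * 4 + 4] = (
    (0x40F8).to_bytes(2, "little") + (0x0019).to_bytes(2, "little")
)

segment, offset = ivt_entry(memory, 0x21)
print(f"int 21h handler at {segment:04X}:{offset:04X}")
```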

So, if we want to see all the system calls that a program makes, we can set a breakpoint at the beginning of the interrupt handler and check the value of AH:

image

We do this because the interrupt handler is always at a fixed location in MS-DOS (this long predates ASLR and kernel ASLR), but the location of the program is not.

image

As soon as we run it, we can see the calls made by this sample. On the screen, it only printed a goat file notification (a goat file is a file intended to be infected, like a sacrificial goat). But we can also see that this program does more than just print a string. It checks the DOS version (probably to check compatibility), and then opens, reads and writes data!

image

This is interesting! But we would like to know more about what the system calls marked in red are doing, since they must take inputs such as file names and the data to write to a file or to the screen.

To do this, we need to look at other registers during syscall:

image

Using the “Print String” as a simple example, we can see what the usage looks like:

image

What is DS:DX? Why are there two registers, and how do we get data from them?

To do this, we need to understand a little more about the 8086 processor.

image

The 8086 processor is a 16-bit CPU, but with 20-bit memory addressing. This means that a single register can only address 64 KB, which is a problem when the machine can have up to 1 MB of memory.

To get around this, we need to understand the segment registers:

image

The 8086 processor has 4 segment registers that we need to care about:

  • CS - code segment
  • DS - data segment
  • SS - stack segment
  • ES - extra segment (in case you need another one for special situations)

There are also a number of general-purpose registers, which let you avoid touching memory unnecessarily and pass parameters to other functions.
The segment registers work by shifting the window into RAM:

image

This allows the 16-bit CPU to reach the full 20-bit address space, since each increment of DS shifts the 64 KB window by 16 bytes.

image
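The arithmetic behind this is short enough to write down. As a small sketch (the function name is mine), the physical address is the segment shifted left by 4 bits plus the offset, and on the 8086 the result wraps at 1 MB because there are only 20 address lines:

```python
def physical_address(segment: int, offset: int) -> int:
    """Real-mode address translation: segment * 16 + offset, wrapped to 20 bits."""
    return ((segment << 4) + offset) & 0xFFFFF


# Many segment:offset pairs name the same byte of RAM.
print(hex(physical_address(0x1234, 0x0010)))
print(hex(physical_address(0x1235, 0x0000)))
```

Both calls give the same physical address, which is why a tracer usually normalises addresses before comparing them.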

In this case, DX is used as a pointer inside the 64 KB window selected by DS, pointing at where the string begins. The string printer then scans until it finds the $ character and stops. This is similar to other systems that use a null byte instead of $.

image
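Putting the two registers together, a tracer can recover the string that a program asked DOS to print. This is a rough sketch rather than the code from my toolkit: `memory` stands for a dump of guest RAM, the helper name is an assumption, and it ignores the corner case where the offset wraps around inside the segment.

```python
def read_dollar_string(memory, ds: int, dx: int, limit: int = 0x10000) -> str:
    """Read the $-terminated string at DS:DX from a memory dump."""
    start = ((ds << 4) + dx) & 0xFFFFF
    end = memory.find(b"$", start, start + limit)
    if end == -1:
        raise ValueError("no '$' terminator found")
    # DOS-era text is most commonly code page 437.
    return bytes(memory[start:end]).decode("cp437")


# Fake 128 KB of RAM with a string placed at 1000:0105.
memory = bytearray(0x20000)
text = b"Hello World!$"
memory[0x10105:0x10105 + len(text)] = text

print(read_dollar_string(memory, ds=0x1000, dx=0x0105))
```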

As the x86 ISA has aged, little of this has changed; as the bit width of the processor grew, the same registers simply became wider.

So, with this knowledge, we can put together a “to-do” list for tracing these programs:

image

With this setup, we can throw several large computers at the problem for a few hours and collect the results!

image

And we get ...

image

Not much.

That is disappointing.

We only triggered a handful of payload activations!

image

If we look at some of the samples, we can see a smoking gun. A decent chunk of the samples check the date or time.

If we look at the documentation for these calls, we see that the system call returns its values to the program in registers:

image

So we can brute-force them! All we need to do is something like this:

image
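As a sketch of what that brute force could look like, the Python below walks every day of a year and builds the register values that int 21h, AH=2Ah (get system date) returns: CX = year, DH = month, DL = day, and AL = day of the week with 0 meaning Sunday. The `run_sample` callback is hypothetical; it stands for "run the sample with these values injected and report whether a payload fired", which in my setup is a full emulator run.

```python
from datetime import date, timedelta


def date_registers(d: date) -> dict:
    """Registers as returned by int 21h, AH=2Ah (get system date)."""
    return {
        "al": (d.weekday() + 1) % 7,  # DOS: 0 = Sunday; Python: 0 = Monday
        "cx": d.year,
        "dh": d.month,
        "dl": d.day,
    }


def every_day_of(year: int):
    """Yield each calendar day of the given year."""
    d = date(year, 1, 1)
    while d.year == year:
        yield d
        d += timedelta(days=1)


def brute_force(run_sample, year: int):
    """Try every date of `year`; keep the ones where the sample triggered."""
    return [d for d in every_day_of(year) if run_sample(date_registers(d))]


# Stand-in for a real sample: "activates" on the 1st of January.
def fake_sample(regs):
    return regs["dh"] == 1 and regs["dl"] == 1


print(brute_force(fake_sample, 1995))
```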

But there is one problem with this method.

image

Each test run takes about 15 seconds, because it uses full QEMU emulation, and it can take up to 15 seconds for the program to run to completion in the virtual machine. Since DOS has no power-saving features, an idle DOS just sits in a busy loop.
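A quick back-of-the-envelope calculation shows why this hurts. The sketch below assumes 366 candidate dates per sample (one year including February 29, and ignoring the time of day entirely), 15 seconds per run as stated above, and the roughly 4,700 time-checking samples counted later in this post.

```python
SECONDS_PER_RUN = 15   # one full emulator run, from the text above
DATES_PER_YEAR = 366   # assumption: try each calendar day once
SAMPLES = 4700         # approximate number of samples that check the time


def brute_force_seconds(samples, dates=DATES_PER_YEAR, seconds_per_run=SECONDS_PER_RUN):
    """Total emulator time needed to try every date for every sample."""
    return samples * dates * seconds_per_run


one_sample_hours = brute_force_seconds(1) / 3600
all_samples_days = brute_force_seconds(SAMPLES) / 86400

print(f"one sample: {one_sample_hours:.2f} hours")
print(f"all samples: {all_samples_days:.0f} days on a single machine")
```

That is about an hour and a half per sample and most of a year of machine time for the whole set, before trying different years or times of day.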

So instead we can approach the problem differently, by looking at what code is executed after the date / time request.

Since our tracer sits in the interrupt handler, we do not know out of the box where in the program the call came from:

image

For this we need to look at the stack, where the saved CS and IP values are waiting for us!

image
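To make this concrete: when an int instruction executes, the 8086 pushes FLAGS, then CS, then IP. So at the first instruction of the handler, SS:SP points at the saved IP, with CS two bytes above it and FLAGS two bytes above that. The sketch below reads them back from a memory dump; the dump contents and the helper name are invented for the example.

```python
def return_address(memory, ss: int, sp: int):
    """Read the interrupt frame at SS:SP and return (cs, ip, flags)."""
    base = ((ss << 4) + sp) & 0xFFFFF
    ip = int.from_bytes(memory[base:base + 2], "little")
    cs = int.from_bytes(memory[base + 2:base + 4], "little")
    flags = int.from_bytes(memory[base + 4:base + 6], "little")
    return cs, ip, flags


# Fake frame: the program called int 21h and should resume at 1234:0107.
memory = bytearray(0x20000)
frame = (
    (0x0107).to_bytes(2, "little")    # saved IP
    + (0x1234).to_bytes(2, "little")  # saved CS
    + (0x0202).to_bytes(2, "little")  # saved FLAGS (example value)
)
memory[0x1FFFA:0x1FFFA + 6] = frame   # SS:SP = 1000:FFFA

cs, ip, flags = return_address(memory, ss=0x1000, sp=0xFFFA)
print(f"returns to {cs:04X}:{ip:04X}")
```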

Once we have read these two values off the stack, we can use them to fetch the code at the return address, so our checklist now looks like this:

image

After we have done this and re-run the test over the data set, we can see what the code at the return address looks like!

image

Here is one such sample. We can see that DL is compared with 0x1e.

image

If we look at our documentation, we see that DL is the day of the month, so we can read the first three opcodes as follows:

image
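The comparison in that sample is easy to recognise mechanically. `cmp dl, imm8` encodes as the bytes 80 FA followed by the immediate, so a tiny matcher can pull the day of the month straight out of the code at the return address. This is only a sketch of that one encoding (the helper name is mine); real samples also compare other registers or use 16-bit compares, which is part of why a proper emulator ends up being necessary.

```python
def day_checked_at(code: bytes):
    """If `code` starts with `cmp dl, imm8` (80 FA ib), return imm8, else None."""
    if len(code) >= 3 and code[0] == 0x80 and code[1] == 0xFA:
        return code[2]
    return None


# Bytes as they might appear right after a "get date" call returns:
# cmp dl, 1Eh followed by a conditional jump.
code_after_call = bytes.fromhex("80FA1E7505")

day = day_checked_at(code_after_call)
print(f"sample checks for day {day} of the month")
```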

We could go and review all of these manually, but there are a lot of samples that check the time, around 4,700:

image

So instead, we need to do something else. We need to write something ... We need to write ...

image

The world's worst x86 emulator! It is called BenX86, and it is an emulator designed specifically for our needs and nothing more:

image

But it does have one advantage: speed.

image

image

Based on the paths we found by brute-forcing with BenX86, we added 10 thousand different test runs. So, I'll finish with some of my favorite time-activated discoveries:

image

This sample activates on New Year's Day and hangs your system after displaying a greeting. That could be good if you are stuck in the office over the new year, or bad if you really need to get something done on New Year's Day.

image

This sample surprised me a lot. It activates in early 1995, tells the user about all the files that it has infected, then removes the virus from them (by removing the jump at the beginning), and does nothing more. Oddly, it also tells you to buy McAfee; evidently that message is not out of date.

image

This one, to be honest, really confuses me: on November 8 of any year, it turns every 0 on the system into a tiny “hate” glyph. If you know why anyone would want this, let me know ...

image

This is probably my nightmare: after you run any program, a message appears saying that it failed to eat your main disk. That would be incredibly alarming to see out of the blue.

image

To finish, we have the DOS malware version of the Navy Seal copypasta. I am not sure why this author dislikes Aladdin, but whatever, you do you.

If you are interested in the code behind this article, I have released my toolkit on GitHub, without any guarantees. If you want to run this code yourself, you will need to do some work to make it match your MS-DOS installation (fix up the interrupt handler breakpoint).

However, if you just want to see what I saw while working on this project, I have archived the web interface here: dosv.benjojo.co.uk

See you soon!
