Steganography beyond files: hiding data directly in disk sectors
Steganography, in case anyone has forgotten, is the concealment of information inside some kind of container. For example, in pictures (discussed here and here). You can also hide data in the service tables of a file system (described here), and even in TCP service packets. Unfortunately, all these methods share one drawback: to slip information into a container unnoticed, you need clever algorithms that take the container's internal structure into account. The container's resistance to manipulation is a problem too: for example, if you edit the picture slightly, the hidden information is lost.
Is it possible to do without clever algorithms and delicate data manipulation, while still keeping the container functional and the hidden data acceptably secure? Looking ahead, I'll say: yes, you can! I will even offer a utility.
The Method in Gory Detail
The basic idea is as simple as a club to the forehead: there are areas on the disk to which the operating system never writes (or writes only rarely). To avoid hunting for these areas with clever algorithms, we will use redundancy: we will duplicate our hidden information across all sectors of the disk many, many times. Then, right on top of all this splendor, you can create the partitions you need, format file systems, write files and install operating systems. Some of the secret data will survive anyway and can be extracted, and the repeated duplication will help us reassemble the original whole from the pieces.
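The duplication scheme can be pictured with a small sketch. It is only an illustration, not the utility's actual layout: the round-robin mapping and the names `fragment_for_cluster` and `copies_per_fragment` are assumptions made for this example.

```python
def fragment_for_cluster(cluster_index: int, fragment_count: int) -> int:
    """Hypothetical round-robin layout: cluster i carries fragment i mod N."""
    if fragment_count <= 0:
        raise ValueError("fragment_count must be positive")
    return cluster_index % fragment_count


def copies_per_fragment(cluster_count: int, fragment_count: int) -> list:
    """Number of copies of each fragment on a disk of cluster_count clusters."""
    full, extra = divmod(cluster_count, fragment_count)
    # The first `extra` fragments receive one additional copy.
    return [full + 1 if i < extra else full for i in range(fragment_count)]
```

With a layout like this, every fragment is repeated roughly `cluster_count / fragment_count` times, so the fewer fragments the secret file needs, the better the chance that at least one copy of each survives.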
The advantage of this method is obvious: we do not depend on the file format, or even on the type of file system used.
The disadvantages are also, I think, obvious:
- Secret data can only be changed by rewriting the entire disk and then recreating the contents visible to the user. You cannot use software that restores a disk from an image for this: it would restore the previous secret data as well.
- The larger the amount of secret data, the greater the likelihood of losing part of it.
- Retrieving the data from the disk can take a long time: from several minutes to several days (modern disks are large).
Now let's move on to the details.
Clearly, if you simply smear secret data across the whole disk, it will be hidden only from the naked eye. Arm your eye with, say, a disk editor, and the data will appear in all its glory. So it would be nice to encrypt the data so that it doesn't give itself away. We will encrypt simply but tastefully: with the aes256-cbc algorithm. We ask the user for the encryption key; let them think up a good password.
The next question is how to tell “correct” data from corrupted data. A checksum will help here, and not just any checksum, but SHA1. Why not? It is good enough for git, so it will do for us. So it's decided: we attach a checksum to each stored fragment, and if the checksum matches after decryption, the decryption was successful.
We also need the fragment number and the total length of the secret data. The fragment number lets us keep track of which pieces have already been decrypted and which are still missing. The total length is needed when processing the last fragment, so that we don't write extra data (that is, padding). And since we have a header anyway, we add the name of the secret file to it. It will come in handy after decryption, so we don't have to guess how to open the file.
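The checksum test can be sketched in a few lines. This is a minimal illustration that leaves the encryption step out; placing the digest in front of the data, and the names `seal` and `unseal`, are assumptions rather than the utility's real format.

```python
import hashlib
from typing import Optional

DIGEST_SIZE = hashlib.sha1().digest_size  # 20 bytes for SHA-1


def seal(payload: bytes) -> bytes:
    """Prefix the payload with its SHA-1 digest (assumed to happen before encryption)."""
    return hashlib.sha1(payload).digest() + payload


def unseal(block: bytes) -> Optional[bytes]:
    """Return the payload if the stored digest matches it, otherwise None."""
    digest, payload = block[:DIGEST_SIZE], block[DIGEST_SIZE:]
    if hashlib.sha1(payload).digest() != digest:
        return None  # overwritten cluster or wrong key: the data is garbage
    return payload
```

A cluster that was overwritten by the file system, or decrypted with the wrong password, will almost certainly fail this comparison and can simply be skipped.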
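A possible header layout, purely as an illustration. The article only says (further down) that header and checksum together take a fixed 68 bytes; the field sizes below are a guess chosen so that 48 bytes of header plus a 20-byte SHA-1 digest add up to that figure. The names `pack_header`, `unpack_header` and `useful_bytes` are hypothetical.

```python
import struct

# Hypothetical layout: fragment number, total length, NUL-padded file name.
HEADER = struct.Struct("<II40s")  # 4 + 4 + 40 = 48 bytes


def pack_header(fragment: int, total_length: int, name: str) -> bytes:
    # struct pads a short name with NUL bytes and truncates a long one.
    return HEADER.pack(fragment, total_length, name.encode("utf-8"))


def unpack_header(raw: bytes):
    fragment, total_length, name = HEADER.unpack(raw[:HEADER.size])
    return fragment, total_length, name.rstrip(b"\0").decode("utf-8", errors="replace")


def useful_bytes(fragment: int, total_length: int, payload_size: int) -> int:
    """Bytes of real data in a fragment; only the last fragment is shorter."""
    remaining = total_length - fragment * payload_size
    return max(0, min(payload_size, remaining))
```

`useful_bytes` is where the total length earns its place: for the last fragment it tells the reader how much of the cluster is file data and how much is padding.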
Testing the method in practice
For the test we take the most common kind of media: a flash drive. I found an old 1 GB one, which is quite suitable for experiments. If you, like me, had the idea of not bothering with physical media and testing on a file (a disk image) instead, I'll say right away: it won't work. When formatting such a “disk”, Linux creates the file anew, and all unused sectors are filled with zeros.
As the Linux machine I unfortunately had to use the Raspberry Pi 3 weather station lying on the balcony. It doesn't have much memory, so we won't hide large files: we limit ourselves to a maximum size of 10 megabytes. There is no point in hiding very small files either: the utility writes data to the disk in 4 KB clusters. So the lower bound will be a 3 KB file, which fits into one such cluster.
We will torment the flash drive in stages, checking after each stage whether the hidden information can still be read:
- Quick format as FAT16 with a cluster size of 16 KB. This is what Windows 7 offers to do with a flash drive that has no file system.
- Filling the flash drive to 50% with all kinds of garbage.
- Filling the flash drive to 100% with all kinds of garbage.
- "Long" format as FAT16 (overwriting everything).
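To get a feel for these sizes, here is a short calculation of how many clusters a secret file occupies. It assumes a 4096-byte cluster and the 68 bytes of header and checksum mentioned later in the article; the function name `fragments_needed` is made up for the example.

```python
CLUSTER_SIZE = 4096   # bytes per cluster, as stated in the article
OVERHEAD = 68         # header + checksum, as stated later in the article
PAYLOAD_SIZE = CLUSTER_SIZE - OVERHEAD  # bytes of file data per cluster


def fragments_needed(file_size: int) -> int:
    """Number of clusters one full copy of the file occupies (ceiling division)."""
    return -(-file_size // PAYLOAD_SIZE)


print(fragments_needed(3 * 1024))          # 1
print(fragments_needed(10 * 1024 * 1024))  # 2604
```

So the 3 KB file needs exactly one surviving cluster, while the 10 MB file needs a surviving copy of each of 2604 different fragments.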
The first two tests quite expectedly ended in complete victory: the utility successfully extracted 10 megabytes of secret data from the flash drive. But after the flash drive was packed to the brim with files, a failure occurred:
Total clusters read: 250752, decrypted: 158
ERROR: cannot write incomplete secretFile
As you can see, only 158 clusters were successfully decrypted (632 kilobytes of raw data, which gives 636424 bytes of payload). Clearly there is no way to get 10 megabytes out of this, especially since there are obviously duplicates among these clusters. Even 1 megabyte cannot be restored this way. What we can guarantee is recovering 3 kilobytes of secret data from a flash drive even after it has been formatted and filled to the brim. In fact, experiments show that it is quite possible to extract a 120-kilobyte file from such a flash drive. The last test, unfortunately, showed that the entire flash drive had been overwritten:
$ sudo ./steganodisk -p password /dev/sda
Device size: 250752 clusters
Total clusters read: 250752, decrypted: 0
ERROR: cannot write incomplete secretFile
Not a single cluster survived... Sad, but not tragic! Let's try creating a partition on the flash drive before formatting, and then the file system inside that partition. By the way, the drive came from the factory with exactly this layout, so we are not doing anything suspicious.
As expected, the available space on the flash drive shrank slightly.
Also as expected, 10 megabytes still could not be hidden on a completely full drive. But now the number of successfully decrypted clusters has more than doubled!
Total clusters read: 250752, decrypted: 405
Unfortunately, a megabyte still cannot be assembled from the pieces, but two hundred kilobytes is easy.
And the news from the last, fourth test is joyful this time: a full format of such a flash drive did not destroy all the information! 120 kilobytes of secret data fit perfectly into the unused space.
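Why a few hundred surviving clusters are enough for a small file but not for a large one becomes clearer from a sketch of the reassembly step. This is an illustration under assumptions, not the utility's code: `reassemble` is a hypothetical function that receives the (fragment number, payload) pairs from all clusters that passed the checksum test.

```python
def reassemble(found, total_length: int, payload_size: int) -> bytes:
    """Build the file from (fragment_number, payload) pairs; duplicates are ignored."""
    count = -(-total_length // payload_size)  # fragments in one full copy
    pieces = {}
    for number, payload in found:
        if 0 <= number < count:
            pieces.setdefault(number, payload)  # keep the first copy seen
    missing = [n for n in range(count) if n not in pieces]
    if missing:
        raise ValueError(f"incomplete file: {len(missing)} of {count} fragments missing")
    # Cut the padding off the last fragment using the total length.
    return b"".join(pieces[n] for n in range(count))[:total_length]
```

The file is recoverable only if at least one copy of every fragment survives, so what matters is the number of distinct fragment numbers among the decrypted clusters, not the raw count of decrypted clusters.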
Summary table for testing:
A bit of theorizing: about free space and unused sectors
If you have ever partitioned a hard disk, you may have noticed that it is not always possible to allocate all the free space on the disk. The first partition always starts at some offset (usually 1 megabyte, or 2048 sectors). After the last partition there is also sometimes a small “tail” of unused sectors. And occasionally, though rarely, there are gaps between partitions.
In other words, there are sectors on the disk that cannot be reached during normal work with the disk, yet data can be written to them! And therefore read back, too. With the caveat that the partition table and the bootloader code also live in that empty area at the beginning of the disk.
Let us step back from partitions for a moment and look at the disk from a bird's-eye view, so to speak. Here we have an empty partition on the disk. We create a file system in it. Can we say that some sectors of the disk remain untouched?
Drum roll... The answer is almost always yes! Indeed, in most cases creating a file system amounts to writing just a few blocks of service information to the disk; the rest of the partition's contents are left unchanged.
Also, purely speculatively, we can assume that a file system cannot always occupy all the space allotted to it down to the last sector. For example, a FAT16 file system with a 64-kilobyte cluster size obviously cannot fully occupy a partition whose size is not a multiple of 64 kilobytes. At the end of such a partition there should be a tail of several sectors that is unavailable for storing user data. However, this assumption could not be confirmed experimentally.
So, to maximize the space available for the steganogram, use a file system with a larger cluster size. You can also create a partition, even where one is optional (on a flash drive, for example). There is no need to create empty partitions or leave unallocated areas: that would attract the attention of interested citizens.
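The sectors that lie outside every partition can be listed with a short helper. This is a sketch with made-up numbers: `unused_gaps` is a hypothetical function, and the partition boundaries in the usage example are invented for illustration, not taken from the test flash drive.

```python
def unused_gaps(disk_sectors: int, partitions):
    """Return (first, last) sector ranges not covered by any partition.

    partitions is a list of (first_sector, last_sector) pairs, both inclusive.
    """
    gaps = []
    cursor = 0  # first sector not yet accounted for
    for first, last in sorted(partitions):
        if first > cursor:
            gaps.append((cursor, first - 1))
        cursor = max(cursor, last + 1)
    if cursor < disk_sectors:
        gaps.append((cursor, disk_sectors - 1))
    return gaps


# Invented example: one partition starting at the usual 2048-sector offset.
print(unused_gaps(2006016, [(2048, 2004000)]))
# [(0, 2047), (2004001, 2006015)]
```

Keep in mind that the first gap is not entirely free: as noted above, the partition table and bootloader code live at the very beginning of the disk.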
Utility for experiments
The source code of the utility can be found here.
To build it you will need Qt version 5.0 or higher and OpenSSL. If the build fails, you may have to edit the steganodisk.pro file.
You can change the cluster size from 4 KB to, say, 512 bytes (in secretfile.h). The overhead for service information will grow accordingly: the header and checksum occupy a fixed 68 bytes.
The utility must, of course, be run with root privileges, and with caution. It asks no questions before overwriting the specified file or device!
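The cost of a smaller cluster is easy to estimate. The sketch below uses only the 68-byte figure from the article; the function names are invented for the example.

```python
OVERHEAD = 68  # bytes of header + checksum per cluster, as stated in the article


def payload_per_cluster(cluster_size: int) -> int:
    """Bytes of secret file data that fit into one cluster."""
    return cluster_size - OVERHEAD


def overhead_share(cluster_size: int) -> float:
    """Fraction of each cluster spent on service information."""
    return OVERHEAD / cluster_size


for size in (4096, 512):
    print(size, payload_per_cluster(size), f"{overhead_share(size):.1%}")
# 4096 4028 1.7%
# 512 444 13.3%
```

The 4 KB figure agrees with the test output above: 158 decrypted clusters times 4028 bytes gives the 636424 bytes of payload reported there.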