yndi July 22, 2013 at 14:15

DarkJPEG: steganography for everyone

From the sandbox

As part of the DarkJPEG project, a new-generation steganographic web service has been developed. It lets you hide confidential information as imperceptible noise in JPEG images, and this information can be extracted only by someone who knows the secret password set during encoding.

The project was developed to help people exercise freedom of information in countries that violate human rights by censoring the media or legally prohibiting the use of cryptography.

The service combines strong steganography, which conceals the very fact that information is hidden, with strong cryptography, which protects data sent over open channels from compromise (that is, from access by unauthorized persons). The source code of the project is distributed under the MIT license.

Key Features:

Using SHA-3 to generate keys;
Symmetric encryption AES-256;
JPEG (DCT LSB) steganography;
Support for RarJPEG and double hiding;
Selection of a random container;
Calculations without server involvement;
Full confidentiality guarantee.

Intro

Steganography is the science of the hidden transmission of information by keeping secret the very fact of transmission. Typically, the message will look like something else, such as an image, article or letter. Steganography is usually used in conjunction with cryptography methods, thus complementing it. The advantage of steganography over pure cryptography is that messages do not attract attention. Messages whose encryption is not hidden are suspicious and may themselves be incriminating in those countries where cryptography is prohibited. Thus, cryptography protects the content of the message, and steganography protects the very fact of the presence of any hidden messages.

How it all started

The idea of creating an accessible, fast, private and, most importantly, completely free steganographic web service came to me a little less than a month ago. In light of some funny laws hastily adopted in this country (and not only this one), it struck me that something was terribly wrong: every person should be able to freely exchange information (no matter what kind) with other people over open and/or insecure communication channels, without having to study the fine points of gpg encryption or blow the dust off utilities such as steghide, and it should all be cross-platform, with really good usability, “here and now”. So I decided to build such a service, just like that, just for fun. Although this is my first experience creating such a service, and I am not much of an interface designer, after three weeks of exciting development in the evenings after work it seems to me that everything worked out. But first things first.

Method

To begin with, I studied about a dozen scientific articles describing which steganography methods exist, how they are implemented, and how they are analyzed and detected. I decided to use JPEG images as containers, since they are the most common type of content on the Internet. As it turned out, some methods give themselves away quite easily by using non-standard quantization matrices, others do not pass the histogram test, and still others yield a useful capacity of only about 9-13% of the container size. That is, to hide 500 KB of useful information, the container would have to be about 5 MB in size, which is pretty sad.

As a result, after examining how fairly new steganographic methods such as F5 and those based on quantization error work, I decided to use plain and simple LSB (least significant bits) embedding, supplemented by encrypting the data with AES-256 beforehand. Besides allowing a 256-bit key, the encryption produces a pseudo-random sequence of bits at the output, which is exactly what F5 achieves by randomly permuting data blocks. Here is a schematic description of how the method works:

we take the 256-bit SHA-3 hash of the password entered by the user plus a randomly generated salt;
we encrypt the data plus a header (signature, name and size of the attached file) using AES-256, then add the salt at the beginning and 0xFFD9, the JPEG End of Image marker, at the end (the reason is explained below);
we select a container of a size suitable for our data and convert its colors from RGB to YCbCr;
we apply a discrete cosine transform to each block of 8x8 pixels;
we write our data, piece by piece, into the last two bits of each nonzero coefficient;
the coefficients are compressed using run-length coding and Huffman codes.
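
The embedding step from the list above can be illustrated with a small sketch. It is not the code from dark.js: embedBits and extractBits are hypothetical names, and the sketch assumes that only coefficients with a magnitude of at least 4 are used, so that rewriting the two low bits can never turn a coefficient into zero and confuse the decoder. The article says every nonzero coefficient is used and does not state how small coefficients are handled.

```javascript
// Hypothetical sketch of 2-bit LSB embedding in DCT coefficients.
// Assumption: coefficients with |c| < 4 are skipped, so the set of
// "usable" coefficients is the same before and after embedding.
function embedBits(coeffs, bits) {
  const out = Int16Array.from(coeffs);
  let pos = 0;
  for (let i = 0; i < out.length && pos < bits.length; i++) {
    const mag = Math.abs(out[i]);
    if (mag < 4) continue; // too small: leave untouched
    // Take the next two payload bits (a missing second bit counts as 0).
    const pair = ((bits[pos] || 0) << 1) | (bits[pos + 1] || 0);
    const sign = out[i] < 0 ? -1 : 1;
    out[i] = sign * ((mag & ~3) | pair); // replace the two low bits
    pos += 2;
  }
  if (pos < bits.length) throw new Error("container too small");
  return out;
}

// Reads `count` bits back, visiting coefficients in the same order.
function extractBits(coeffs, count) {
  const bits = [];
  for (let i = 0; i < coeffs.length && bits.length < count; i++) {
    const mag = Math.abs(coeffs[i]);
    if (mag < 4) continue;
    bits.push((mag >> 1) & 1, mag & 1);
  }
  return bits.slice(0, count);
}
```

Because the data is encrypted before embedding, the bits written this way look like random noise in the low bits of the coefficients.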

As you can see, the JPEG encoding skips the step of quantizing the coefficients with the usual matrices. Instead, matrices consisting only of ones are written to the file, which simulates image compression at 100% quality. On the one hand this, of course, significantly inflates the file size (adding useful capacity for data); on the other hand it reduces suspicion, since such unit quantization matrices are not uncommon. Hurrah! Everything worked out! The resulting file is a fully valid JPEG with about 20% of its volume occupied by our data, and we can safely hand it to the user. Decoding follows a similar scheme in reverse.
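
To see why a matrix of ones corresponds to 100% quality, here is a sketch of the quality scaling used by common libjpeg-style encoders, applied to the example luminance table from the JPEG standard. Treat it as an illustration under that assumption: scaleTable is a hypothetical name, and the article does not say that dark.js computes its tables this way, only that it writes ones.

```javascript
// Example luminance quantization table from the JPEG standard (Annex K).
const BASE_LUMA = [
  16, 11, 10, 16, 24, 40, 51, 61,
  12, 12, 14, 19, 26, 58, 60, 55,
  14, 13, 16, 24, 40, 57, 69, 56,
  14, 17, 22, 29, 51, 87, 80, 62,
  18, 22, 37, 56, 68, 109, 103, 77,
  24, 35, 55, 64, 81, 104, 113, 92,
  49, 64, 78, 87, 103, 121, 120, 101,
  72, 92, 95, 98, 112, 100, 103, 99,
];

// libjpeg-style scaling (assumed convention): quality 1..100 is mapped
// to a percentage, and every entry is clamped to the range 1..255.
function scaleTable(quality, base = BASE_LUMA) {
  const q = Math.min(100, Math.max(1, quality));
  const scale = q < 50 ? Math.floor(5000 / q) : 200 - 2 * q;
  return base.map((v) => {
    const scaled = Math.floor((v * scale + 50) / 100);
    return Math.min(255, Math.max(1, scaled));
  });
}
```

With this convention, scaleTable(100) has a scale factor of zero, so every entry is clamped up to 1: dividing by such a table changes nothing, which is why skipping quantization looks like an ordinary quality-100 file.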

Gluing and RarJPEG

In fact, it turned out that such a method is often overkill: in practice (unless we are hiding something really serious!) it is usually enough simply to append our encrypted data to the end of the JPEG file. This is where the false end-of-image marker that we add manually comes in handy: it at least slightly reduces suspicion of our glued-on tail. Concatenating files also gives us a fun extra: creating and processing RarJPEGs. You only need to attach a ZIP or RAR file and skip the encryption step by using an empty password; the hidden content can then be reached easily by opening the resulting image with almost any archiver (and the picture is still a valid JPEG!).
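
The gluing itself is just byte concatenation. A minimal sketch, assuming the container and the tail are available as Uint8Array values (joinFiles is a hypothetical name, not a function of dark.js):

```javascript
// Hypothetical sketch of the "join" idea: JPEG bytes, then the tail,
// then a false End of Image marker (0xFF 0xD9) so that the file still
// ends the way a JPEG is expected to end.
function joinFiles(jpegBytes, tailBytes) {
  const out = new Uint8Array(jpegBytes.length + tailBytes.length + 2);
  out.set(jpegBytes, 0);
  out.set(tailBytes, jpegBytes.length);
  out[out.length - 2] = 0xff;
  out[out.length - 1] = 0xd9;
  return out;
}
```

Image viewers stop at the first real end-of-image marker and ignore everything after it, which is why the glued file still displays normally.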

What is the result

Thus, three steganography methods are available: auto, join, and steg. By default auto is used, which (by my decision) encodes with join, gluing files together (not necessarily with an archive, it can be anything). The only difference is that join allows an empty password, for creating RarJPEGs, while auto and steg do not, for security reasons. There is another tricky option: you can encode a file into the container with the steg method and then attach something else to it with the join method. This lets you give up the password for the join part if you are “pushed against the wall” without compromising the steg part, so you get a container with a “double bottom”. By the way, if the picture is changed in any way (cropped, resized, etc.), no steganography will, alas, survive, almost by definition: JPEG is a lossy compression format.

Containers

As for containers, there are also three options: rand, grad and image. Here everything is much simpler: the default, rand, downloads a random picture from Wikimedia; grad is used when the first option is not possible (for example, when there is no Internet connection or the data volume is too large); image lets the user choose their own image as the container. There is also the unsafe option, which is recommended only for owners of weak computers on which, for lack of memory, the service otherwise does not work; in all other cases it should be left off, for the same security reasons.

Confidentiality

Let's move on to the most interesting and most controversial part: confidentiality and anonymity. Generally speaking, I agree that the words “web service” and “confidentiality” in the same sentence sound, to put it mildly, rather strange. In fact, things are not nearly that bad. All of the service's code is executed exclusively on the client, no calculation leaves the open browser tab, and no information about user actions is tracked, cached, saved, logged or transmitted anywhere. Moreover, the service does not really need a web server at all: just save the project (or clone the repository) to disk and open index.html directly, without installing any web server. The only thing,

Of course, if we consider extreme cases, such as GitHub being hacked, there is a danger of the scripts being replaced. We should not forget, however, that absolutely anything connected to the outside world is exposed to such a scenario: for instance, someone can steal the private key of a software maintainer and use it to sign modified packages, which, by the way, has already happened. The only difference with a web service is that it runs in a much more restricted sandbox, so the worst that can happen is user tracking and data compromise. Therefore, my advice for solving any paranoia problems is: please use my service through Tor (and, of course, also use it when transmitting encoded content over any communication channel).

Security

I do not know how quickly, in case of urgent need, some people with gloomy faces could determine a user's IP address through the provider. But here, as with I2P, only the fact of visiting the service can be proven; tracking the user's actions from the outside is almost impossible (unless you spy on the person directly).

As for detecting darkjpegs on various sites, it is somewhat difficult if you use the join method and, generally speaking, quite difficult if you use steg. To detect the latter, for example, only a resource-intensive chi-square test is suitable, so there is not much to worry about.

If someone wants to decode the data without knowing the password, remember that the cipher uses a 256-bit key. As long as your password is not qwerty, 123 or some other combination easily generated from a dictionary, the only option left is to visit the sender with a red-hot soldering iron (and the sender still has to be found, which is far from trivial if you use Tor; please use Tor), since brute-forcing for a couple of trillion years seems a dubious occupation.
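
The 256-bit key in question comes from hashing the password together with a random salt, as described in the Method section. A rough sketch for Node.js follows, with loud caveats: deriveKey is a hypothetical name, the way password and salt are combined (plain concatenation here) is a guess, and the standardized SHA3-256 from node:crypto may not produce the same digest as the Keccak implementation bundled with dark.js.

```javascript
const crypto = require("node:crypto");

// Hypothetical sketch: 256-bit key = SHA3-256(password bytes || salt).
// Assumption: concatenation order and hash variant are not confirmed
// by the article; they are chosen here only for illustration.
function deriveKey(password, salt) {
  return crypto
    .createHash("sha3-256")
    .update(Buffer.from(password, "utf8"))
    .update(salt)
    .digest(); // 32 bytes, usable as an AES-256 key
}
```

The salt makes the same password yield a different key for every encoded file, but it does not make a weak password strong: an attacker who sees the salt in the file can still try dictionary words one by one.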

App Engine Support

However, perhaps the most controversial part is the use of Google App Engine to retrieve pictures when a link is entered instead of a file. Since we have no server of our own (not counting GitHub Pages, which can only serve static pages), we need some way to download pictures (for decoding) from other sites. Browsers prohibit cross-domain downloads unless the server we are downloading from explicitly allows them, and this restriction, alas, cannot be overcome. If we still really want cross-domain downloads, there are four workarounds:

saving a file to disk manually and selecting it from the local file system: inconvenient;
using a local copy of the project opened from the local file system (file:///home/...): not worth it;
automatic use by the service, if the direct download fails, of two proxy services on the App Engine platform: convenient, but outgoing traffic is limited to 2 GB per day;
using the ready-made browser extensions available from the project's main page, which decode images directly on any site through the context menu, just by clicking on them: ideal.

Nevertheless, despite the paranoia triggered by the word “Google”, the proxy service is just as self-written as everything else: it saves nothing and caches nothing, it simply accepts a base64-encoded link and returns the content with an additional CORS header. It is used only if cross-domain downloads are either disabled on the target site or simply not supported. The source code of the proxy services, hugs-01 and hugs-02, is also on GitHub.
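
For illustration only, a proxy of this kind could look roughly like the sketch below. It is not the code of hugs-01 or hugs-02: the URL layout (the encoded link as the request path), the names encodeLink, decodeLink and createProxy, and the choice of Node.js 18+ with its global fetch are all assumptions.

```javascript
const http = require("node:http");

// Hypothetical link encoding: base64 of the URL, made safe for a path.
function encodeLink(link) {
  return encodeURIComponent(Buffer.from(link, "utf8").toString("base64"));
}

function decodeLink(encoded) {
  return Buffer.from(decodeURIComponent(encoded), "base64").toString("utf8");
}

// Returns an unstarted server that fetches the decoded link and relays
// the body with a CORS header added. Nothing is stored or cached.
function createProxy() {
  return http.createServer(async (req, res) => {
    try {
      const target = decodeLink(req.url.slice(1));
      if (!/^https?:\/\//.test(target)) throw new Error("unsupported link");
      const upstream = await fetch(target); // global fetch, Node.js 18+
      const body = Buffer.from(await upstream.arrayBuffer());
      res.writeHead(upstream.status, {
        "Content-Type":
          upstream.headers.get("content-type") || "application/octet-stream",
        "Access-Control-Allow-Origin": "*", // the "additional CORS header"
      });
      res.end(body);
    } catch (err) {
      res.writeHead(400, { "Access-Control-Allow-Origin": "*" });
      res.end(String(err.message));
    }
  });
}
```

A caller would start it with createProxy().listen(port) and request the path produced by encodeLink. The Access-Control-Allow-Origin header is what allows a script on another origin to read the downloaded bytes.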

To developers

The core of the project, dark.js, can be used in any third-party development. It is designed as an asynchronous web worker and accepts the following requests:

- {action: "encrypt", name: "file.ext", pass: "password", buffer: ArrayBuffer}
- {action: "encode", method: "join", width: 0, height: 0, buffer: ArrayBuffer}
- {action: "encode", method: "auto | join | steg", width: image.width, height: image.height, buffer: ImageData}
- {action: "decode", method: "auto | join | steg", buffer: ArrayBuffer}
- {action: "decrypt", pass: "password"}

Responses should be processed by the worker.onmessage function and look like this:

- {type: "encrypt", size: encrypted}
- {type: "encode", time: duration, isize: res.length, csize: enc.length, rate: 100*isize/csize, buffer: ArrayBuffer}
- {type: "decode", time: duration, isize: res.length, csize: dec.length, rate: 100*isize/csize}
- {type: "decrypt", name: "file.ext", buffer: ArrayBuffer}
- {type: "progress", name: "encrypt | decrypt | encode | decode", progress: percent}
- {type: "error", name: "encrypt | decrypt | encode | decode", msg: message}
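
One possible way to drive such a worker is to wrap each request in a Promise. The helper below is a sketch built only on the message shapes listed above; request is a hypothetical name, and the order in which the actions have to be called is not spelled out in the article, so the usage comment is an assumption as well.

```javascript
// Hypothetical helper: sends one request to the worker and resolves with
// the first reply that is neither a progress report nor an error.
function request(worker, message, onProgress) {
  return new Promise((resolve, reject) => {
    worker.onmessage = (event) => {
      const reply = event.data;
      if (reply.type === "progress") {
        if (onProgress) onProgress(reply.progress, reply.name);
      } else if (reply.type === "error") {
        reject(new Error(reply.name + ": " + reply.msg));
      } else {
        resolve(reply);
      }
    };
    worker.postMessage(message);
  });
}

// Assumed usage in a browser (not verified against dark.js):
//   const worker = new Worker("dark.js");
//   await request(worker, {action: "encrypt", name: "file.ext",
//                          pass: "password", buffer: fileBuffer});
//   const result = await request(worker, {action: "encode", method: "join",
//                                width: 0, height: 0, buffer: jpegBuffer});
```

Since the helper overwrites worker.onmessage on every call, requests should be issued one at a time.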

The exact format of the encoded file is as follows:

- container: [ JPEG <+> encoded data ] or [ JPEG ][ encoded data ]
- encoded:   [ 16-bit encryption salt ][ AES256 encrypted data ][ 0xFFD9 ]
- encrypted: [ 0x3141593 ][ 32-bit file size ][ 16-bit file name length ][ UTF-16 file name ][ DATA ][ zero padding ]
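
The "encrypted" layout (before AES is applied) can be sketched with a DataView. Several details are assumptions, because the format line does not state them: the signature is taken to be a 32-bit field, all integers are written big-endian, the name length is counted in UTF-16 code units, and the zero padding rounds the total up to a multiple of 16 bytes (the AES block size). packPayload and unpackPayload are hypothetical names.

```javascript
const SIGNATURE = 0x3141593;

// Builds [signature][file size][name length][UTF-16 name][data][zeros].
function packPayload(name, data) {
  const headerSize = 4 + 4 + 2 + name.length * 2;
  const total = headerSize + data.length;
  const out = new Uint8Array(Math.ceil(total / 16) * 16); // zero padded
  const view = new DataView(out.buffer);
  view.setUint32(0, SIGNATURE);
  view.setUint32(4, data.length);
  view.setUint16(8, name.length);
  for (let i = 0; i < name.length; i++) {
    view.setUint16(10 + 2 * i, name.charCodeAt(i));
  }
  out.set(data, headerSize);
  return out;
}

// Reverses packPayload; a wrong signature suggests a wrong password.
function unpackPayload(bytes) {
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  if (view.getUint32(0) !== SIGNATURE) throw new Error("bad signature");
  const size = view.getUint32(4);
  const nameLength = view.getUint16(8);
  let name = "";
  for (let i = 0; i < nameLength; i++) {
    name += String.fromCharCode(view.getUint16(10 + 2 * i));
  }
  const start = 10 + 2 * nameLength;
  return { name, data: bytes.slice(start, start + size) };
}
```

The signature doubles as a cheap correctness check: after decryption with the wrong key the first bytes are effectively random, so the comparison fails and the decoder can report a wrong password instead of returning garbage.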

Read at your leisure

Summary

works only with modern browsers (Safari 6, Chrome 25, Firefox 21, Opera 15 and newer);
yes, Opera 12.xx, alas, is not supported due to the lack of support in Presto for many html5 features (such as Blob URLs, download attribute, transferable objects etc.);
the working scripts are minified with UglifyJS; to check that there are no backdoors, you can run git clone, then make, and then diff the result against what was in the build directory before the build - everything is served to the client directly from GitHub;
all calculations are really performed on the client, without the slightest involvement of the server, you can save the project to your disk and simply open index.html without any Apache - everything will work;
yes, everything is really confidential, nothing is saved anywhere, nothing is tracked, logs are not kept - the github pages server can only render static pages;
There are three encoding methods: auto, join and steg, by default auto is used, which basically uses join;
join glues files - the size of the inserted data is limited only by common sense, security is average;
steg uses the real DCT LSB steganography - the permissible size of the inserted data is about 20% of the container size, high security;
the difference between auto and join is that in join you can encode with an empty password, this makes it possible to create and process RarJPEG;
There are three types of containers: rand, grad, and image; the first picks a random picture from Wikimedia, the second is used when the first is not possible, the third lets you use any image of your own, loaded from the local file system;
files can be attached from the local FS by pressing enter, clicking on the plus sign, drag-and-drop, or by URL;
all multi-colored arrows (and not only them) are clickable;
the inscription “darkjpeg” is also clickable, it acts as an analogue of page refresh without reloading;
if you specify a URL for decoding, two proxy services on the Google App Engine are used due to CORS restrictions;
for development, you can use dark.js, which contains JavaScript implementations of a JPEG encoder and decoder, AES-256 and SHA-3, and is implemented as an asynchronous web worker;
how my service will be used: whether to post a pony on image boards, or to coordinate some activists - it's up to you, I allow everything.

License

This software is provided “as is”, without warranty of any kind, express or implied, including but not limited to the warranties of merchantability, fitness for a particular purpose and noninfringement. In no event shall the authors or copyright holders be liable for any claim, damages or other liability, whether in an action of contract, tort or otherwise, arising from, out of or in connection with the software or the use or other dealings in the software.

Acknowledgments

Emily Stark, Mike Hamburg, Dan Boneh, Stanford University for their implementation of AES-256 in JavaScript;
Chris Drost for his implementation of SHA-3 Keccak;
Yury Delendik, Brendan Dahl, notmasteryet for parts of their JavaScript JPEG decoder;
Andreas Ritter for his amazing JavaScript port of a JPEG encoder;
Dan Gries for his examples of very beautiful fractal gradients;
Brsev for the gear icon from his Token Dark set;
Fabrizio Panattoni for his Premade Background 019;
To my girlfriend for inspiration;
To you, for reading; do not judge too harshly :3
