The story of reverse engineering one furry creature
On a quiet morning on the third of January, while Moscow was still dozing after the New Year holidays, the doorbell rang in our apartment. The post had finally delivered a parcel of New Year gifts ordered from Amazon. Among other things, it contained a gift for my son: an electronic pet, a Furby. The purchase was, on the whole, impulsive. The toy was listed among the bestsellers of the New Year season and was relatively inexpensive. I knew nothing about the different Furby models, but I had once heard something positive about the toy.
Being only one year old, my little son was not very impressed with the gift, and I did not want to see a complex electronic device dropped on the floor and pulled apart. Everything pointed to the gift going on the shelf until better times, but then my eye fell on an inscription on the colorful packaging ...
The inscription said that an application for this toy could be downloaded from the App Store. With it you can feed the cyber pet, give him all sorts of commands, and translate the phrases he says in his own language, Furbish, into English. I downloaded the application and fed the pet all sorts of edible and inedible objects, which he either swallowed with appetite or spat out. The translator from Furbish to English worked surprisingly accurately.
Does audio recognition really work so reliably nowadays, even in a rather noisy environment? Something is wrong here. And how does the application transmit commands to Furby? IR is out (early versions of Furby, as it turned out, had an infrared port for communicating with each other), and so is Bluetooth. Only audio remains. This is interesting ... Now, if I could crack the protocol for communicating with this creature and control it from a computer ... Find some "Easter eggs", hidden or service commands! Or ...
In short, as you have guessed, the father of the family had given himself a New Year present.
First, I synchronized the iPhone with the computer and looked inside the application file (.ipa). Among other innards there were several dozen short WAV files, numbered in a particular way. They looked like ready-made audio commands. The first file started with the number 350. When I played this file in Audacity, Furby chewed something in a businesslike manner and produced a joyful "Mmm, yum!". "Aha!" thought Stirlitz, "Now I've got you!"
The commands in the application started at 350 and ended at 900, with large gaps in the numbering. So Furby can potentially understand far more commands than the ready-made WAV files at hand cover. We must look further.
The shape of the signal in Audacity suggested that some kind of frequency modulation was used: there was one signal, then a short pause, then what looked like the same signal again. The total duration was one and a half seconds. Since the modulation is in frequency, it made sense to look at the spectrum. The plot clearly showed five evenly spaced peaks in the 16-19 kHz region:
The tower of Mordor is, of course, beautiful, but how do you decode it? I rummaged around in Audacity a little more and switched the track display to a spectrogram. This picture was much more informative than the first:
Two packets are clearly visible here with a pause in the middle, differing from each other in the sequence of "notes" (base frequencies). The middle frequency is the carrier, and it constantly alternates with the other four "notes".
To make the sequence easier to decode, I made a mask in a graphics editor, laid it over a screenshot of the spectrogram, assigned the numbers 0 to 3 to the notes in order, and began to analyze consecutive commands (as we recall, the iOS application developers helpfully numbered all the WAV files for us). At first it turned out that between neighboring commands the numbers sometimes "jump", i.e. they do not increase by one as we would like. After some analysis, it became clear that the "notes" should be numbered as in the figure below:
Here the packet decodes as 3233 3012 1032 (for convenience, I split the sequence into blocks of four digits; in base 4, each such block is one byte).
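The block notation can be checked with a minimal Python sketch. It only assumes what the text states: each group of four base-4 digits is one byte. The function name `quads_to_bytes` is made up for this illustration.

```python
def quads_to_bytes(digits: str) -> bytes:
    """Convert base-4 digits (whitespace ignored) to bytes, four digits per byte."""
    clean = "".join(digits.split())
    if len(clean) % 4 != 0:
        raise ValueError("expected a multiple of four base-4 digits")
    out = bytearray()
    for i in range(0, len(clean), 4):
        # int(..., 4) rejects any digit outside 0..3
        out.append(int(clean[i:i + 4], 4))
    return bytes(out)


packet = quads_to_bytes("3233 3012 1032")
print([f"{b:08b}" for b in packet])
# prints ['11101111', '11000110', '01001110']
```

The first byte, 11101111, is the value that the bit-level breakdown of 3233 refers to.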
Further analysis of the commands, converting them to binary and comparing them bit by bit, revealed the following structure of a packet and of the command as a whole:
- The first byte (3233 in the example), written in binary, has the structure 11 1 01111, where the top two bits are always 11, the next bit is 0 for the first packet of the command and 1 for the second, and 01111 is the data itself (part of the command identifier);
- The second byte (3012) is a checksum that depends on 6 bits of the command (1 bit is the packet identifier and 5 bits are the data itself);
- The third and final byte of the packet is always the same (1032 in the example).
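As a small illustration of the first-byte layout in the list above, here is a Python sketch that splits a byte into the three fields. The names `Header` and `parse_first_byte` are invented for this example; only the bit layout comes from the text.

```python
from typing import NamedTuple


class Header(NamedTuple):
    marker: int        # top two bits, always 0b11 according to the text
    packet_index: int  # 0 = first packet of the command, 1 = second
    data: int          # 5 bits, one half of the command identifier


def parse_first_byte(value: int) -> Header:
    """Split the first byte of a packet into marker, packet index and data."""
    if not 0 <= value <= 0xFF:
        raise ValueError("not a byte")
    marker = (value >> 6) & 0b11
    if marker != 0b11:
        raise ValueError("top two bits are expected to be 11")
    return Header(marker, (value >> 5) & 1, value & 0b11111)


# 3233 in base 4 is 0b11101111
print(parse_first_byte(0b11101111))
# prints Header(marker=3, packet_index=1, data=15)
```

For the example packet this gives packet index 1, i.e. the second packet of a command, carrying the data value 01111.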
What does this mean? First, a command is split into two packets carrying 5 bits of data each. Together they form a 10-bit number, so Furby can potentially send or receive 1024 different commands. However, I could not work out how the checksum is calculated. After analyzing the command numbers, it turned out that the existing WAV files gave me 7 of the 32 checksums for the first packet and 31 of the 32 checksums for the second packet. That made 7 × 31 = 217 potential commands instead of the 76 available as ready-made WAV files, which is not bad either.
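The arithmetic can be sketched in a few lines of Python. One point is an assumption: the text does not say which packet carries the high 5 bits, so the sketch arbitrarily puts them in the first packet. The function names are made up for the illustration.

```python
def split_command(command: int) -> tuple[int, int]:
    """Split a 10-bit command number into two 5-bit halves (high, low).

    Assumption: the first packet carries the high half. The article does not
    state the order, so swap the halves if the real protocol differs.
    """
    if not 0 <= command < 1024:
        raise ValueError("command must fit in 10 bits")
    return command >> 5, command & 0b11111


def join_command(high: int, low: int) -> int:
    """Inverse of split_command."""
    return (high << 5) | low


def reachable_commands(known_first: int, known_second: int) -> int:
    """Number of commands that can be built from the known checksums."""
    return known_first * known_second


print(split_command(350))         # prints (10, 30)
print(reachable_commands(7, 31))  # prints 217
```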
I wrote a script that, given a command number, generated a .wav file similar to the ready-made ones, and started going through the ranges of commands available to me. As it turned out, there really were undocumented commands: Furby reacted to them in different ways, sang songs, burped, sneezed, pretended to sleep and did other simple things.
This whetted my research appetite. However, the checksum algorithm stubbornly refused to yield to reverse engineering, which meant that most commands remained out of my reach.
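A generator of this kind might look roughly like the Python sketch below. It is not the author's script. The tone frequencies, the tone length and the exact "carrier, note, carrier, note" ordering are hypothetical placeholders: the article only says that there are five evenly spaced tones in the 16-19 kHz region and that the middle one alternates with the other four.

```python
import io
import math
import struct
import wave

SAMPLE_RATE = 44100
TONE_SAMPLES = 441          # hypothetical tone length (10 ms)
CARRIER_HZ = 17500.0        # hypothetical middle frequency
NOTE_HZ = {0: 16000.0, 1: 16750.0, 2: 18250.0, 3: 19000.0}  # hypothetical


def tone(freq: float, amplitude: float = 0.5) -> list[float]:
    """One fixed-length sine burst at the given frequency."""
    step = 2.0 * math.pi * freq / SAMPLE_RATE
    return [amplitude * math.sin(step * i) for i in range(TONE_SAMPLES)]


def packet_samples(digits: str) -> list[float]:
    """Emit the carrier before every base-4 digit (assumed ordering)."""
    samples: list[float] = []
    for d in digits:
        samples += tone(CARRIER_HZ)
        samples += tone(NOTE_HZ[int(d)])
    return samples


def write_wav(target, samples: list[float]) -> None:
    """Write mono 16-bit PCM to a path or a binary file object."""
    with wave.open(target, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(SAMPLE_RATE)
        w.writeframes(b"".join(struct.pack("<h", int(s * 32767)) for s in samples))


buf = io.BytesIO()
write_wav(buf, packet_samples("323330121032"))
print(len(buf.getvalue()), "bytes of WAV data")
```

Each burst restarts at phase zero, so neighboring tones may click at the boundaries; a real generator would probably keep the phase continuous or shape each burst with a window.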
Combing the Internet once more for any clues, I suddenly found a link to the official Furby application for Android (about which there was not a word on the toy's box). "Android → Java → bytecode → sources → ... → PROFIT!" Stirlitz had never been so close to a solution ...
Having finally found the .apk file I needed on some file dump, I climbed inside and did not see a single WAV file with commands, although on the whole the set of resources was similar to the one in the iOS application. Since there are no .wav files, does the application generate commands on the fly? That is exactly what I need! Decompiling and browsing the Java code gave some interesting clues, but as it turned out, all the interesting parts, namely the generation and analysis of audio, live inside a native .so library. It contains the one method I needed, namely private static native byte GenerateComAirCommand(int paramInt);.
How do you get at a native method? After racking his brains, Stirlitz decided to download the Android SDK. The result was a small project that included the native library itself and a minimal wrapper giving access to the single function I needed. At startup the application simply generated WAV files for the minimum set of commands I needed, chosen so that they contained the high and low 5-bit values whose checksums I was missing. After some digging through Stack Overflow (I had no experience writing Android applications at the time), the application started and generated the set of WAV files on the emulator's virtual SD card, from which I copied them with adb pull to the normal file system. Analysis of these files gave me complete coverage: all 64 checksums, from which any of the 1024 commands can be recreated.
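With a complete table, building a packet becomes a lookup. The Python sketch below shows the idea under stated assumptions: the table holds only the one entry that can be read off the example packet 3233 3012 1032 (packet bit 1, data 01111, checksum 3012), the constant third byte is taken to be the 1032 from that example, and all names are invented for the illustration. The real table would have 64 entries recovered from the generated WAV files.

```python
MARKER = 0b11000000   # top two bits of the first byte are always 11
TRAILER = 0b01001110  # 1032 in base 4, taken from the example packet

# (packet_index, data) -> checksum byte; only the entry from the example
# packet is filled in here, the other 63 would come from analysis.
CHECKSUMS = {(1, 0b01111): 0b11000110}  # 3012 in base 4


def build_packet(packet_index: int, data: int, table: dict) -> bytes:
    """Assemble the three bytes of one packet from a checksum lookup table."""
    if packet_index not in (0, 1) or not 0 <= data < 32:
        raise ValueError("packet_index must be 0 or 1, data must fit in 5 bits")
    first = MARKER | (packet_index << 5) | data
    return bytes([first, table[(packet_index, data)], TRAILER])


def to_quads(packet: bytes) -> str:
    """Render bytes as space-separated blocks of four base-4 digits."""
    blocks = []
    for b in packet:
        blocks.append("".join(str((b >> shift) & 0b11) for shift in (6, 4, 2, 0)))
    return " ".join(blocks)


print(to_quads(build_packet(1, 0b01111, CHECKSUMS)))
# prints 3233 3012 1032
```

A missing table entry raises `KeyError`, which is a convenient way to see which checksums are still unknown.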
While analyzing Furby's reactions to commands, I found another range of commands that Furby reacted to in one way or another. However, I did not find any atomic commands such as "open your eyes", "close your eyes" or "move your ears". Nor did I find a self-destruct command or a command to read out the EULA in Furbish. (This does not mean that no specialized commands exist; they may simply be activated, for example, by a special sequence or by a completely different set of codes, but that is hardly possible to find out.)
However, I decided to go further and write an analyzer of Furby's responses, since some commands, although they give no visible result, may make Furby reply with commands of his own, which is also interesting. The result was a Perl script that analyzes a PCM data stream from a microphone on the fly and decodes these packets. All this was written under Windows, where Perl unfortunately has no decent way to record from a microphone, so I had to write a console program in Delphi that reads data from the microphone and continuously writes it to STDOUT. The data stream is piped into the script, where the analysis takes place. Such is the Unix way on Windows.
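The core step of such an analyzer is deciding which tone is present in a short block of samples. The author's script uses an FFT with window functions; the Python sketch below swaps in the Goertzel algorithm instead, because it measures the power at a handful of known frequencies without any libraries. The candidate frequencies and the block length are hypothetical placeholders, not values from the article.

```python
import math

SAMPLE_RATE = 44100
# Hypothetical tone set: five evenly spaced frequencies in the 16-19 kHz region
CANDIDATES_HZ = [16000.0, 16750.0, 17500.0, 18250.0, 19000.0]


def goertzel_power(samples, sample_rate: int, freq: float) -> float:
    """Signal power at one frequency, computed with the Goertzel recurrence."""
    coeff = 2.0 * math.cos(2.0 * math.pi * freq / sample_rate)
    s1 = s2 = 0.0
    for x in samples:
        s0 = x + coeff * s1 - s2
        s2, s1 = s1, s0
    return s1 * s1 + s2 * s2 - coeff * s1 * s2


def strongest_tone(samples, sample_rate: int = SAMPLE_RATE,
                   candidates=CANDIDATES_HZ) -> float:
    """Return the candidate frequency with the most power in this block."""
    return max(candidates, key=lambda f: goertzel_power(samples, sample_rate, f))


# Synthetic 10 ms block containing only the 17.5 kHz tone
block = [math.sin(2.0 * math.pi * 17500.0 * i / SAMPLE_RATE) for i in range(441)]
print(strongest_tone(block))
# prints 17500.0
```

In a streaming setup, successive blocks read from STDIN would be classified this way and the resulting tone sequence mapped back to base-4 digits.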
“Stop, stop, stop,” the weary reader will say, “What is all this for?”
I wanted to see what is inside without physically breaking the toy (after all, I did not buy it for myself). Along the way, I learned about generating and analyzing sound in Perl, about FFT and window functions, and about working with Android, which is exciting in itself.
Perhaps this article will be useful to someone implementing their own protocol, since there are all sorts of interesting gadgets for the iPhone that transmit data through the audio jack.
And finally, the ability to control Furby from a computer potentially opens up an emotional way of notifying you about events. For example, when mail arrives from a specific sender, you can ask Furby to dance; after a Git commit from a certain person, to purr; and after one from someone else, to make a less decent sound (which Furby has in stock). True, a couple of hardware problems still need to be solved for this: first, to stop Furby from falling asleep after 10 minutes of inactivity (physical shaking counts as activity; he has a position sensor for that), and second, to power him from a power supply or USB instead of batteries. Maybe there are hardware experts on Habr who want to finally tame the beast?
The code, after some tidying up, is posted on GitHub. Suggestions and findings are welcome in every possible way. Of course, all the information and program code presented here is provided solely for educational purposes.
Oo-tah-toh-toh. Kah way-loh.