XakepRU August 6, 2014 at 12:32

Reverse engineering firmware for a Chinese Android tablet

Chinese manufacturers have a peculiar attitude to copyright: for them it simply does not apply. At the same time, they protect their own work by various technical means, somehow “forgetting” to share it with their customers. The situation looked hopeless: a batch of Chinese tablets had arrived, and the task was to reflash them so that the customer’s content would survive a factory reset. We had the stock firmware in an unknown binary format, but no SDK. What to do, how do you build custom firmware? There is only one way out: reverse engineering.

Reconnaissance

The device was built around the GeneralPlus GP330xx SoC, and its system software was developed with the OpenPlatform SDK. Although the Chinese vendor declares its readiness to provide the source code, it does not actually do so. Despite the difficulty of the task, the root access enabled on the device by default gave grounds for optimism. So the investigation began by launching ADB Shell.

The tablet's entire storage was one large NAND flash block device (/dev/block/nanda), divided into partitions:

Disk /dev/block/nanda: 7457 MB, 7457472512 bytes
4 heads, 16 sectors/track, 227584 cylinders
Units = cylinders of 64 * 512 = 32768 bytes
Device Boot                Start         End      Blocks  Id System
/dev/block/nanda1             257      174335     5570528   b Win95 FAT32
/dev/block/nanda2          174336      207103     1048576  83 Linux
/dev/block/nanda3          207104      223487      524288  83 Linux
/dev/block/nanda4          223488      227583      131072  83 Linux
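
As a sanity check on the listing above, the Blocks column can be recomputed from the cylinder ranges and the geometry fdisk reports (4 heads, 16 sectors per track, 512-byte sectors). This is only an illustrative sketch; the function name is made up for the example.

```python
SECTOR_BYTES = 512
SECTORS_PER_CYLINDER = 4 * 16  # heads * sectors per track, from the fdisk header


def partition_kib(start_cyl: int, end_cyl: int) -> int:
    """Size, in 1 KiB blocks, of a partition spanning an inclusive cylinder range."""
    cylinders = end_cyl - start_cyl + 1
    return cylinders * SECTORS_PER_CYLINDER * SECTOR_BYTES // 1024


# Cylinder ranges copied from the fdisk output
LAYOUT = [
    ("nanda1", 257, 174335),
    ("nanda2", 174336, 207103),
    ("nanda3", 207104, 223487),
    ("nanda4", 223488, 227583),
]

for name, start, end in LAYOUT:
    kib = partition_kib(start, end)
    print(f"{name}: {kib} KiB ({kib / 1024 / 1024:.2f} GiB)")
```

The recomputed values match the Blocks column, which confirms that fdisk counts blocks of 1 KiB here.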

Part of the storage was allocated to the so-called internal SD card. This term deserves a closer look. In Android, each application runs in its own sandbox and uses the system API to access files. This API gives access to internal storage (Internal Storage) and external storage (External Storage). External storage, in turn, is divided into removable media (an SD card inserted into the slot on the edge of the device) and internal, non-removable storage (a partition of the internal memory that mimics an SD card). In this tablet, the largest partition, /dev/block/nanda1, was allocated to the internal SD card. It was therefore decided to split it into two partitions: one for the customer’s content and the other for the internal SD card.

The device /dev/block/nanda is partitioned with MBR rather than GPT, so the maximum number of primary partitions is four. Using fdisk, the /dev/block/nanda1 partition was deleted and an extended partition was created in its place, containing two logical partitions: /dev/block/nanda5 and /dev/block/nanda6.
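
The four-partition limit follows directly from the MBR layout: the first sector has room for exactly four 16-byte entries at offset 446, followed by the 0x55AA signature. Below is a minimal sketch of reading those entries from a dump of the first sector. The function and field names are illustrative, and the sketch does not follow the chain of logical partitions inside an extended partition.

```python
import struct

SECTOR_SIZE = 512
EXTENDED_TYPES = {0x05, 0x0F}  # CHS- and LBA-addressed extended partitions


def parse_mbr(sector: bytes):
    """Return the non-empty primary partition entries of an MBR sector."""
    if len(sector) < SECTOR_SIZE or sector[510:512] != b"\x55\xaa":
        raise ValueError("not a valid MBR: missing 0x55AA signature")
    entries = []
    for slot in range(4):  # the MBR holds exactly four primary entries
        raw = sector[446 + 16 * slot: 446 + 16 * (slot + 1)]
        # status, CHS start (skipped), type, CHS end (skipped), LBA start, sector count
        _status, ptype, lba_start, sectors = struct.unpack("<B3xB3xII", raw)
        if ptype == 0:
            continue  # unused slot
        entries.append({
            "slot": slot + 1,
            "type": ptype,
            "extended": ptype in EXTENDED_TYPES,
            "lba_start": lba_start,
            "size_bytes": sectors * SECTOR_SIZE,
        })
    return entries
```

On the device, the input could be obtained by reading the first 512 bytes of /dev/block/nanda. After the repartitioning described above, the first slot would be expected to show an extended partition type.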

Conjuring with partitions

Looking through the list of mounted devices, we see that /dev/block/vold/253:97 is mounted on /mnt/sdcard.

root@android:/etc # mount
...
/dev/block/vold/253:97 /mnt/sdcard vfat rw,dirsync,nosuid,nodev,noexec,relatime,uid=1000,gid=1015,fmask=0602,dmask=0602,allow_utime=0020,codepage=cp437,iocharset=iso8859-1,shortname=mixed,utf8,errors=remount-ro 0 0
/dev/block/vold/253:97 /mnt/secure/asec vfat rw,dirsync,nosuid,nodev,noexec,relatime,uid=1000,gid=1015,fmask=0602,dmask=0602,allow_utime=0020,codepage=cp437,iocharset=iso8859-1,shortname=mixed,utf8,errors=remount-ro 0 0
...

What is the relationship between /dev/block/vold/253:97 and /dev/block/nanda1? Vold is the Volume Management daemon, which mounts external media. It has a configuration file called vold.fstab, similar in syntax to the standard *nix fstab:

## Vold 2.0 Generic fstab
...
dev_mount sdcard /mnt/sdcard auto /devices/virtual/block/nanda /devices/virtual/block/nanda/nanda1 /devices/virtual/block/nanda/nanda2 /devices/virtual/block/nanda/nanda3 /devices/virtual/block/nanda/nanda4
...

At first glance, everything is clear: /mnt/sdcard is the mount point, and auto selects the first suitable partition from the paths listed after it (/devices/virtual/...). However, the vold.fstab file on this device turned out to be a mere “stub”. When the dev_mount sdcard ... line was modified (for example, to mount a freshly created partition other than /devices/virtual/block/nanda/nanda1), the daemon refused to work. It is hard to say whether this is due to a customized kernel or a customized daemon, but either way, the developers' motives for such a solution are unclear.
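
For reference, the stock Vold 2.0 format of such a line is dev_mount <label> <mount_point> <part> <sysfs_path1...>. A small sketch that splits the line from this tablet into those fields is shown below. The function name is hypothetical, and the vendor's daemon may well interpret the line differently, as the experiments described above suggest.

```python
def parse_dev_mount(line: str):
    """Split a vold.fstab 'dev_mount' line into its documented fields."""
    fields = line.split()
    if len(fields) < 5 or fields[0] != "dev_mount":
        raise ValueError("not a dev_mount line")
    label, mount_point, part = fields[1:4]
    return {
        "label": label,
        "mount_point": mount_point,
        # 'auto' lets vold pick the partition; otherwise it is a partition number
        "partition": None if part == "auto" else int(part),
        "sysfs_paths": fields[4:],
    }


entry = parse_dev_mount(
    "dev_mount sdcard /mnt/sdcard auto /devices/virtual/block/nanda "
    "/devices/virtual/block/nanda/nanda1 /devices/virtual/block/nanda/nanda2 "
    "/devices/virtual/block/nanda/nanda3 /devices/virtual/block/nanda/nanda4"
)
print(entry["mount_point"], entry["partition"], len(entry["sysfs_paths"]))
```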

Thus, it turned out that neither /dev/block/nanda5 nor /dev/block/nanda6 could be mounted using vold. From here, there were two ways to go:

Mount the SD card manually from the init scripts. This path, however, could not guarantee 100% compatibility with all Android internals; in other words, it would be impossible to vouch for the stability of a system stripped of vold, the key component for “communicating” with external drives.
Take the open source vold and try to build it for this device. This came with no guarantees either, and it could take a fair amount of time, which, as always, was in short supply.

Either way, we would have had to write shell scripts invoked over ADB, and there would have been no way to produce a single firmware binary as a result. That, in turn, would have made the work of the customer's technicians more expensive, so this path was kept in reserve and the research continued in a new direction.

The decision came suddenly

Faced with this problem, I decided to take another careful look at what we had in our hands. Of particular interest was the flasher, which, in addition to the firmware file itself (firmware.bin), shipped with a number of auxiliary resources: bootheader.bin, bootpack.bin, bootresource.bin, scanram.bin, updater.bin. They are necessary, but irrelevant to our task. Of greater interest are the files that the flasher uses to load its own code onto the device: small_isp.bin, cmdline, initrd, and kernel.

This device is flashed in the so-called ISP mode (the name of one of the flash programming modes). The flasher's workflow can be divided into four stages:

The technician restarts the device in firmware mode by holding down the buttons while powering it on.
The flasher recognizes the device over USB and reboots it into ISP mode.
The flasher boots Linux on the device by transferring the cmdline, initrd and kernel files. Here kernel is the OS kernel, initrd is the image holding the device-side flasher software, and cmdline holds the kernel parameters, including the size of the initrd file.
The Linux system running on the device starts receiving the main firmware files from the flasher, unpacks them and writes them according to its internal algorithms.

What were these internal algorithms? The answer came suddenly. It turned out that initrd contained the Lua source code of the flasher, as well as the binaries of additional Lua modules. To unpack initrd, run the following commands:

# mkdir initrd-unpacked
# cd initrd-unpacked
# gunzip < ../initrd | cpio -i --make-directories

To repack it (if needed, for example, to test modified versions of the scripts):

# find ./ | cpio -H newc -o > initrd.cpio
# gzip initrd.cpio
# mv initrd.cpio.gz initrd
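
Before unpacking an unknown file this way, it can help to confirm that it really is a gzip-compressed cpio archive. The sketch below checks the gzip magic bytes (1f 8b) and the "newc" cpio magic string (070701), which is the format selected by the -H newc option above. The function name is illustrative.

```python
import gzip


def describe_initrd(blob: bytes) -> str:
    """Guess the container format of an initrd image from its magic numbers."""
    if blob[:2] != b"\x1f\x8b":
        return "not gzip-compressed"
    payload = gzip.decompress(blob)
    magic = payload[:6]
    if magic == b"070701":
        return "gzip + cpio (newc)"
    if magic == b"070702":
        return "gzip + cpio (newc with CRC)"
    return "gzip, unknown payload"
```

A call such as describe_initrd(open("initrd", "rb").read()) would then report whether the gunzip and cpio pipeline is the right tool for the file.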

Strange as it may seem, the developers invented their own firmware format, yet left the scripts that handle this format in initrd in the clear.

Fig. 1. Tablet firmware format

Pluto

The tablet firmware headers were packed with the Pluto module, which serializes Lua tables into a binary format. Lua in general makes heavy use of plug-in modules: .so libraries that add particular APIs. On top of that, according to its documentation, Pluto's output is platform- and architecture-dependent. Intel and ARM (on which the tablet is built) differ significantly: x86 is always little-endian, while ARM can run in either byte order (on Android devices it normally runs little-endian).

And here a serious problem arose: the standard Pluto module could not unpack the data. Different versions of Lua and even different CPU architectures (x86, x86_64, ARM) were tried. It turned out that the firmware developers had simply used their own version of Pluto, compatible with nothing else.

To unpack the data, I had to run the QEMU emulator for the ARM architecture and install Debian Linux in it, then install Lua and place the pluto.so module extracted from initrd into the Lua module directory.

Fig. 2. The headers of the firmware binary in the console of the ARM emulator

Connecting Lua Modules

The Lua programming language is extended by external modules, which can be written in either Lua or C. In the latter case, they are ordinary .so libraries that export a number of API functions.
Modules are loaded with the require function, and the package.cpath variable holds the search path for binary modules. The proprietary LZO module has a quirk: its file is named lua_lzo.so, while the module itself is called lzo. So instead of the usual:
package.cpath = package.cpath .. ";/home/mikhail/lua_so/?.so"
require "lzo"
it has to be loaded like this:
package.cpath = package.cpath .. ";/home/mikhail/lua_so/lua_?.so"
require "lzo"
The LuaRocks package manager is also worth a look: it installs modules from a single repository and makes them easy to load. In this study, for example, the nixio and MD5 modules were installed through LuaRocks.
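
Why the lua_?.so template works becomes clearer if the search is emulated: Lua splits package.cpath on semicolons and replaces every ? in each template with the module name. The Python sketch below mimics only that expansion step (the function name is made up for the example). As far as standard Lua goes, the C loader then looks for an entry point named after the module (luaopen_lzo here), which is why the argument to require stays "lzo".

```python
def candidate_paths(cpath: str, module: str):
    """Expand Lua-style search templates for a module name.

    Each ';'-separated template has every '?' replaced by the module name,
    with dots in the name mapped to directory separators.
    """
    name = module.replace(".", "/")
    return [template.replace("?", name) for template in cpath.split(";") if template]


for path in candidate_paths("./?.so;/home/mikhail/lua_so/lua_?.so", "lzo"):
    print(path)
```

With the plain ?.so template the loader would look for lzo.so, which does not exist in the extracted directory; only the lua_?.so template produces lua_lzo.so.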

LZO

The LZO compression algorithm posed a difficulty of its own. The data format for this algorithm is not standardized, so it is hard to write an unpacker without knowing how the file was packed. However, the Lua modules in initrd included lua_lzo.so. The method described in the previous section came to the rescue, complicated only by the fact that lua_lzo.so depended on the system library liblzo.so (also taken from the same initrd) and had to be loaded in a non-standard way through package.cpath.

LZO Compression Algorithm

LZO is a family of block compression algorithms with characteristics that matter for portable devices:
very high unpacking speed;
low memory consumption;
block unpacking of data, in small portions.

From the point of view of reverse engineering, it has two drawbacks:
LZO includes nine compression algorithms, and each of them has its own unpacker.
The file structure of LZO archives is not standardized; different libraries generate different structures.

In our case, the archived data had the following format:
Magic sequence ("PMOC").
The data block size used for packing (131072, i.e. 0x00020000). Recall that this ARM system is little-endian, so in the file the number appears as the byte sequence 00 00 02 00 (see Fig. 3).
Data blocks containing:
Block size (e.g. 1816).
Packed data of the size indicated above.

This means that a block of packed data of 1816 bytes is unpacked into 128 kilobytes of information.
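
The container layout described above can be walked with a few lines of code. The sketch below only splits an archive into its still-compressed blocks; it does not decompress them. It assumes that every size field is a 4-byte little-endian integer and that nothing follows the last block. Both are assumptions drawn from the description, not from vendor documentation, and the function name is illustrative.

```python
import struct

MAGIC = b"PMOC"


def split_blocks(archive: bytes):
    """Return (unpacked_block_size, [packed_block, ...]) for a PMOC-framed archive."""
    if archive[:4] != MAGIC:
        raise ValueError("missing PMOC magic")
    (block_size,) = struct.unpack_from("<I", archive, 4)
    blocks, pos = [], 8
    while pos < len(archive):
        # Assumed: each block is prefixed by its packed length (4 bytes, little-endian)
        (packed_len,) = struct.unpack_from("<I", archive, pos)
        pos += 4
        chunk = archive[pos:pos + packed_len]
        if len(chunk) != packed_len:
            raise ValueError("truncated block")
        blocks.append(chunk)
        pos += packed_len
    return block_size, blocks
```

Each returned chunk would then have to go through the vendor's own LZO module, since the compression variant in use is not known from the container alone.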

Unpacking is performed in a loop, by data blocks. For unpacking the following functions are used:
handle = lzo.decompressInit(header), where header is the magic number plus the archive block size, and handle is the handle passed to the other two functions
... = lzo.decompressPorcess(handle)
lzo.decompressFinish(handle)

Note that you need to know the exact size of the archive for decompression to succeed. Otherwise, unpacking hangs in the DECOMPRESS_NEED_MORE_DATA state. The archive size is given in header 2 (see Fig. 1).
Compression is harder, since the compression functions are not documented and their behavior had to be worked out by trial and error. The functions are analogous:
handle, header= lzo.compressInit(blockSize)
...= lzo.compressProcess(handle, data)
lzo.compressFinish(handle)

Compression differs from decompression in one respect: before writing each data block returned by lzo.compressProcess, you must first write the size of that packed block. This follows from the general documentation on the LZO compression algorithm and from analysis of the archive obtained when parsing the original firmware.
As a result, by examining the source code of the scripts, trying to understand their logic of work, data formats, and also after conducting many experiments, the firmware was unpacked.

Fig. 3. The size of the data block of the LZO archive

Resize System Partition

The unpacked system partition file (system.bin) was an ext4 file system image. To fit the customer's data, it had to be enlarged by 1 GB. This takes three steps:

Grow the file system itself.
In header 2, in the partition table, shrink the nanda1 partition by 1 GB and enlarge the nanda2 partition by the same amount.
Re-archive system.bin, recalculate the checksums and write them into the headers.

The system partition itself is resized with the following commands:

# mkdir system_new
# losetup /dev/loop0 system.bin
# e2fsck -f /dev/loop0
# resize2fs /dev/loop0 2G
# mount /dev/loop0 system_new
...
# umount system_new
# losetup -d /dev/loop0
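
One caveat about the commands above: resize2fs cannot grow a file system beyond the size of the underlying device, and a loop device is only as large as its backing file. If system.bin is smaller than the 2G target (an assumption, since the size of the unpacked image is not stated here), the file has to be extended before it is attached with losetup. A minimal sketch of that step, with an illustrative function name:

```python
import os


def grow_image(path: str, new_size: int) -> int:
    """Extend an image file to new_size bytes without ever shrinking it.

    Returns the number of bytes added. The new tail reads as zeros and is
    typically stored sparsely on file systems that support it.
    """
    current = os.path.getsize(path)
    if new_size < current:
        raise ValueError("refusing to shrink the image")
    os.truncate(path, new_size)
    return new_size - current
```

For the case above this would be called as grow_image("system.bin", 2 * 1024**3) before running losetup.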

Work with data section

As part of this task, some changes were made not only in /system but also in /data. This required unpacking dataImage.tar.gz, making the necessary changes and packing it back. The same should be done with userImage.tar.gz if you want to change the contents of the SD card.
For packaging while maintaining access rights, we use the following commands:

# tar cvf - . | gzip -9 - > ../user.tar.gz
# tar cvfp - . | gzip -9 - > ../data.tar.gz

Replacing default applications

The customer needed not only to write his content to the device’s permanent memory, but also to replace the standard launcher with his own application, providing the required User Experience.
The launcher (and other default applications) was replaced by editing the /data/system/packages.list and /data/system/packages.xml files. First, the default applications were configured on a live device, then the relevant parts of these files were transferred to the firmware.
The packages.list file is a list of packages installed on the system. The desired launcher package is called com.soaw.launcher and is added with the line:

com.soaw.launcher 10068 1 /data/data/com.soaw.launcher

The matching entry in packages.xml:

<package name="com.soaw.launcher" codePath="/system/app/SOAWLauncher.apk" nativeLibraryPath="/data/data/com.soaw.launcher/lib" flags="1" ft="141c2c2bbe0" it="141c2c2bbe0" ut="141c2c2bbe0" version="1" userId="10068">
<sigs count="1">
<cert index="20" key="..." />
</sigs>
</package>
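
When entries like these are transferred by hand, a quick check that the packages.list line is well-formed, and that its UID matches the userId in packages.xml, can save a boot loop. The sketch below assumes the packages.list field order used by Android 4.x (package name, Linux UID, debuggable flag, data directory). That order is an assumption about this firmware, and the function name is illustrative.

```python
def parse_packages_list_line(line: str):
    """Split one packages.list line into its (assumed) four leading fields."""
    name, uid, debuggable, data_dir = line.split()[:4]
    return {
        "name": name,
        "uid": int(uid),
        "debuggable": debuggable == "1",
        "data_dir": data_dir,
    }


entry = parse_packages_list_line(
    "com.soaw.launcher 10068 1 /data/data/com.soaw.launcher"
)
print(entry)
```

Here the UID 10068 should agree with the userId attribute of the corresponding packages.xml entry.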

The next entry is the default program settings. Here you can select the launcher and the media player program.

System settings

As you know, Android has a SQLite database of system settings, which can be modified at the stage of preparing the firmware image. The database file is located in /data/data/com.android.providers.settings/databases/settings.db.
The bottom (navigation) panel is hidden by the following entries in the system table:

navigation_bar_mode = 4
navigation_bar_buttons_show = 0
navigation_bar_buttons_need_show = 0

Disabling the lock screen is performed in the secure table:

lockscreen.disabled = 1
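
Entries like these can be written into a copy of settings.db on the build machine before the /data image is repacked. The sketch below assumes the usual AOSP layout, where the system and secure tables have name and value columns. The schema of this vendor's database was not verified here, and the function name is illustrative.

```python
import sqlite3


def set_setting(db: sqlite3.Connection, table: str, name: str, value: str) -> None:
    """Insert or update one name/value pair in a settings table."""
    if table not in ("system", "secure"):
        raise ValueError("unexpected settings table")
    # The table name is checked against a whitelist above, so formatting it in is safe
    cur = db.execute(f"UPDATE {table} SET value = ? WHERE name = ?", (value, name))
    if cur.rowcount == 0:
        db.execute(f"INSERT INTO {table} (name, value) VALUES (?, ?)", (name, value))
    db.commit()
```

A typical call would be set_setting(sqlite3.connect("settings.db"), "system", "navigation_bar_mode", "4"), and entries in the secure table are written the same way.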

Init scripts

Android init scripts are written to / when the device boots, so although they can be edited directly on the device, they will be overwritten with the original files on the next reboot. Most likely, they reside in initrd, but this was not investigated.

Conclusion

Our life is a process. Closed software systems are darkness. Reverse engineering is the process of exploring that darkness. This approach not only solved the main business problem, releasing custom firmware, but also taught us more about the internals of Android as a whole, which is undoubtedly of great interest to a real hacker. It is important to remember that reverse engineering is a legal and universal tool. Without it, the world would never have learned about the most dangerous backdoors in the firmware of leading network equipment manufacturers, about hardware implants in microprocessors, or about data leaks in popular Internet applications. If someone has invented a “black box”, there will always be someone who can figure out how it works.

Author: Mikhail Emelchenkov

First published in “Hacker” magazine, issue 02/2014.
