How I found Easter eggs in Android's security and did not get a job at Google

Google loves Easter eggs. It loves them so much that you can find them in almost every one of the company's products. The tradition of Easter eggs in Android goes back to the very first versions of the operating system (I think everyone knows what happens if you tap the Android version line in the settings several times).

But sometimes Easter eggs turn up in the most unexpected places. There is even a legend: a programmer once googled “mutex lock”, and instead of search results landed on the foo.bar page, solved all the problems there and got a job at Google.

Event reconstruction

The very same amazing story (only without the happy ending) happened to me. Hidden messages where there definitely should not be any, reverse engineering of Java code and native libraries, a secret virtual machine, an interview at Google - all of that is below.

DroidGuard

One boring evening I did a factory reset and started setting up my smartphone again. First of all, the fresh Android asked me to sign in to my Google account. “I wonder how registration and login actually work in Android?” I thought. The evening stopped being dull.

To capture and analyze traffic I use PortSwigger's Burp Suite. The free Community edition is enough. To see HTTPS requests, we first need to install the PortSwigger certificate on the device. For the tests I dug up an eight-year-old Samsung Galaxy S running Android 4.4. If you have something newer, you may run into problems with HTTPS: certificate pinning and all that.

In fact, there is nothing particularly interesting in the requests to the Google API. The device sends data about itself and receives tokens in response... The only puzzling part is a POST request to the anti-abuse service.

After this request, among the unremarkable parameters, one mysterious parameter appears, named droidguard_result. It is a very long Base64 string.

DroidGuard is Google's mechanism for telling bots and emulators apart from real devices. SafetyNet also uses data from DroidGuard. Google has a similar thing for browsers: BotGuard.

But still, what is the data being sent to them? Let's find out.

Protocol buffers

Where does the link www.googleapis.com/androidantiabuse/v1/x/create?alt=PROTO&key=AIzaSyBofcZsgLSS7BOnBjZPEkk4rYwzOIz-lTI come from, and what exactly makes this request on Android? It is easy to find out that the link is stored in one of the obfuscated Google Play Services classes, exactly in this form:

public bdd(Context var1, bdh var2) {
    this(var1, "https://www.googleapis.com/androidantiabuse/v1/x/create?alt=PROTO&key=AIzaSyBofcZsgLSS7BOnBjZPEkk4rYwzOIz-lTI", var2);
}

As we have already seen in Burp, the Content-Type of POST requests to this link is application/x-protobuf (Google Protocol Buffers, Google's binary serialization protocol). It is not JSON, so you cannot tell right away what is being sent.

Protocol Buffers works like this:

First, we describe the message structure in a special format and save it in a .proto file.
Then we compile the .proto files: the protoc compiler generates source code in the chosen programming language (Java, in the case of Android).
Finally, we use the generated classes in the project.

There are two ways to decode protobuf messages. The first is to use any protobuf analysis tool and try to recreate the original .proto description. The second is to pull the classes generated by protoc out of Google Play Services. I went the second way.
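The first way is feasible because the protobuf wire format is self-describing at the level of field numbers and wire types, even though field names are lost. Below is a minimal sketch of such a dump, not the tool I used: the class and method names are made up for illustration, it assumes a well-formed message, and it does not descend into nested messages.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: lists the top-level fields of a protobuf message without a .proto file.
final class ProtoWireSketch {

    // Reads a base-128 varint at pos[0] and advances pos[0] past it.
    static long readVarint(byte[] buf, int[] pos) {
        long result = 0;
        for (int shift = 0; shift < 64; shift += 7) {
            byte b = buf[pos[0]++];
            result |= (long) (b & 0x7F) << shift;
            if ((b & 0x80) == 0) {
                return result;
            }
        }
        throw new IllegalArgumentException("varint is longer than 10 bytes");
    }

    // Walks the message and describes each field by number and wire type.
    static List<String> dump(byte[] buf) {
        List<String> out = new ArrayList<>();
        int[] pos = {0};
        while (pos[0] < buf.length) {
            long tag = readVarint(buf, pos);
            int field = (int) (tag >>> 3);   // upper bits: field number
            int wireType = (int) (tag & 7);  // lower 3 bits: wire type
            switch (wireType) {
                case 0: // varint
                    out.add("field " + field + ": varint " + readVarint(buf, pos));
                    break;
                case 1: // fixed 64-bit value
                    out.add("field " + field + ": fixed64");
                    pos[0] += 8;
                    break;
                case 2: { // length-delimited: strings, bytes, nested messages
                    int len = (int) readVarint(buf, pos);
                    out.add("field " + field + ": " + len + " bytes");
                    pos[0] += len;
                    break;
                }
                case 5: // fixed 32-bit value
                    out.add("field " + field + ": fixed32");
                    pos[0] += 4;
                    break;
                default:
                    throw new IllegalArgumentException("unsupported wire type " + wireType);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Sample message: field 1 = 150 (varint), field 2 = "testing" (length-delimited).
        byte[] msg = {0x08, (byte) 0x96, 0x01, 0x12, 0x07, 't', 'e', 's', 't', 'i', 'n', 'g'};
        dump(msg).forEach(System.out::println);
    }
}
```

A dump like this gives field numbers and sizes only; guessing what each field means is still manual work, which is why borrowing the generated classes is less painful.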

We take the Google Play Services apk of the same version that is installed on the device (if the device is rooted, the apk can be copied straight from it). Using dex2jar, we convert the .dex file back to a .jar and open it in our favorite decompiler. Lately I have really liked JetBrains' Fernflower. It works as a plugin for IntelliJ IDEA (or Android Studio), so we simply open the class containing the coveted link in Android Studio. If ProGuard did not try too hard, the decompiled Java code that builds the protobuf messages can simply be copied wholesale into your own project.

The decompiled code shows that the Build.* constants are sent to the server in the protobuf message (well, that was obvious from the start):

...
var3.a("4.0.33 (910055-30)");
a(var3, "BOARD", Build.BOARD);
a(var3, "BOOTLOADER", Build.BOOTLOADER);
a(var3, "BRAND", Build.BRAND);
a(var3, "CPU_ABI", Build.CPU_ABI);
a(var3, "CPU_ABI2", Build.CPU_ABI2);
a(var3, "DEVICE", Build.DEVICE);
...

Unfortunately, in the server's response all the protobuf field names turned into meaningless Latin letters after obfuscation. Still, what is stored in these fields can be figured out from the error handling. This is how the data from the server is checked:

if (!var7.d()) {
    throw new bdf("byteCode");
}
if (!var7.f()) {
    throw new bdf("vmUrl");
}
if (!var7.h()) {
    throw new bdf("vmChecksum");
}
if (!var7.j()) {
    throw new bdf("expiryTimeSecs");
}

Apparently, this is exactly what the fields were called before obfuscation: byteCode, vmUrl, vmChecksum and expiryTimeSecs. The names already suggest certain guesses.

We collect all the decompiled classes from Google Play Services into a test project, rename them, fill in test Build.* constants and run it (if you wish, you can imitate the parameters of any device). If anyone wants to repeat this, here is the link to my GitHub.

If the request is correct, the following result is returned from the server:

00:06:26.761 [main] INFO daresponse.AntiabuseResponse - byteCode size: 34446
00:06:26.761 [main] INFO daresponse.AntiabuseResponse - vmChecksum: C15E93CCFD9EF178293A2334A1C9F9B08F115993
00:06:26.761 [main] INFO daresponse.AntiabuseResponse - vmUrl: www.gstatic.com/droidguard/C15E93CCFD9EF178293A2334A1C9F9B08F115993
00:06:26.761 [main] INFO daresponse.AntiabuseResponse - expiryTimeSecs: 10

The first stage is complete. Now let's see what interesting things hide behind the vmUrl link.

Secret apk

The link leads straight to an .apk file whose name matches its SHA-1 hash. The apk is modest in size and content: it weighs 150 kilobytes. The savings are not superfluous: if each of the two billion Android devices downloads it, that adds up to roughly 270 terabytes of traffic.
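The match between the file name and the hash is easy to verify after downloading. Here is a minimal sketch of such a check; the class and method names are my own for illustration, and the assumption that vmChecksum is simply the uppercase hex SHA-1 of the whole file comes from comparing it with the file name in the log above.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper: checks that downloaded bytes match the checksum from the response.
final class ChecksumSketch {

    // Returns the SHA-1 digest of the data as an uppercase hex string.
    static String sha1Hex(byte[] data) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-1");
            StringBuilder sb = new StringBuilder();
            for (byte b : md.digest(data)) {
                sb.append(String.format("%02X", b & 0xFF));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-1 is a standard algorithm that every Java platform must provide.
            throw new IllegalStateException(e);
        }
    }

    // True if the file content hashes to the expected checksum (case-insensitive).
    static boolean matches(byte[] apkBytes, String vmChecksum) {
        return sha1Hex(apkBytes).equalsIgnoreCase(vmChecksum);
    }

    public static void main(String[] args) {
        // Stand-in data; in practice these would be the bytes downloaded from vmUrl.
        byte[] data = "abc".getBytes(StandardCharsets.US_ASCII);
        System.out.println(sha1Hex(data));
    }
}
```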

DroidGuardService, which is part of Google Play Services, dutifully downloads this file to the device, unpacks it, extracts the .dex and .so files and unceremoniously, through reflection, uses the class com.google.ccc.abuse.droidguard.DroidGuard. If any error occurs, DroidGuardService switches from DroidGuard to Droidguasso. But how Droidguasso works is a separate story.

In essence, the DroidGuard class is just a JNI wrapper around a native .so library. The ABI of the native library matches what we sent in the CPU_ABI field of the protobuf request: we can ask for armeabi, for x86, or even for mips.

DroidGuardService itself contains no interesting logic for working with the loaded DroidGuard class. It simply creates a new instance of DroidGuard, passing the byteCode from the protobuf message to the constructor, and calls a public method that returns a byte array. This byte array is sent to the server in the droidguard_result parameter.
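The "create an instance by name, call one method, take the bytes" pattern can be sketched in plain Java. This is only an illustration of the reflection calls, not the code of DroidGuardService: FakeGuard is a made-up stand-in with a simplified constructor, and the Android-specific step of loading classes from the downloaded .dex is left out.

```java
import java.lang.reflect.Constructor;
import java.lang.reflect.Method;

// Illustration only: instantiating a class by name and calling a method on it.
final class ReflectionSketch {

    // Hypothetical stand-in for the downloaded class; the real one wraps a native library.
    public static final class FakeGuard {
        private final byte[] byteCode;

        public FakeGuard(byte[] byteCode) {
            this.byteCode = byteCode;
        }

        // Returns the input reversed, just to produce some observable result.
        public byte[] run() {
            byte[] out = new byte[byteCode.length];
            for (int i = 0; i < out.length; i++) {
                out[i] = byteCode[byteCode.length - 1 - i];
            }
            return out;
        }
    }

    // Looks the class up by name, passes byteCode to its constructor and calls run().
    static byte[] runByName(String className, byte[] byteCode) {
        try {
            Class<?> cls = Class.forName(className);
            Constructor<?> ctor = cls.getConstructor(byte[].class);
            // The cast stops the array from being spread as varargs.
            Object guard = ctor.newInstance((Object) byteCode);
            Method run = cls.getMethod("run");
            return (byte[]) run.invoke(guard);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("could not run " + className, e);
        }
    }

    public static void main(String[] args) {
        byte[] result = runByName(FakeGuard.class.getName(), new byte[] {1, 2, 3});
        System.out.println(java.util.Arrays.toString(result));
    }
}
```

The caller never references the class at compile time, which is what lets the service swap in a freshly downloaded implementation.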

To get a rough idea of what happens inside DroidGuard, we can repeat the logic of DroidGuardService (only without downloading the apk, since we already have the native library). We can take the .dex file from the secret apk, convert it to a .jar and then use it in the project. The only problem is how the DroidGuard class loads the native library. Its static initialization block calls the loadDroidGuardLibrary() method:

static
  {
    try
    {
      loadDroidGuardLibrary();
    }
    catch (Exception ex)
    {
      throw new RuntimeException(ex);
    }
  }

The loadDroidGuardLibrary() method, in turn, reads the library.txt file (which lies in the root of the .apk file) and loads the library with that name via a call to System.load(String filename). This is not the most convenient way for us: we would have to come up with some trick when building the apk to place library.txt and the .so file in its root. It would be more convenient to keep the .so file in the lib folder and load it via System.loadLibrary(String libname).
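The lookup step of that loader can be sketched as follows. This is a guess at the shape of the logic, not the decompiled code: the class name, the in-memory zip handling and the assumption that library.txt holds a single file name are all mine, and the actual System.load call is omitted.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Sketch: find library.txt in the root of an apk (a zip archive) and read the library name.
final class LibraryNameSketch {

    // Returns the trimmed content of library.txt; assumes it holds one file name.
    static String readLibraryName(byte[] apkBytes) {
        try (ZipInputStream zin = new ZipInputStream(new ByteArrayInputStream(apkBytes))) {
            ZipEntry entry;
            while ((entry = zin.getNextEntry()) != null) {
                if (entry.getName().equals("library.txt")) {
                    ByteArrayOutputStream content = new ByteArrayOutputStream();
                    byte[] chunk = new byte[4096];
                    int n;
                    while ((n = zin.read(chunk)) != -1) {
                        content.write(chunk, 0, n);
                    }
                    return new String(content.toByteArray(), StandardCharsets.UTF_8).trim();
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        throw new IllegalArgumentException("library.txt not found in archive root");
    }

    // Builds a one-entry zip in memory so the sketch can run without a real apk.
    static byte[] zipOf(String entryName, String content) {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ZipOutputStream zout = new ZipOutputStream(bos)) {
            zout.putNextEntry(new ZipEntry(entryName));
            zout.write(content.getBytes(StandardCharsets.UTF_8));
            zout.closeEntry();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bos.toByteArray();
    }

    public static void main(String[] args) {
        // The file name below is a placeholder, not the real content of library.txt.
        byte[] fakeApk = zipOf("library.txt", "libdroidguard.so\n");
        System.out.println(readLibraryName(fakeApk));
    }
}
```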

This is easy to fix. For that we will use smali/baksmali, an assembler/disassembler for the dex format. With baksmali, classes.dex turns into a set of .smali files. The class com.google.ccc.abuse.droidguard.DroidGuard needs to be patched so that the static initialization block calls System.loadLibrary("droidguard") instead of loadDroidGuardLibrary(). The smali syntax is pretty simple; the initialization block will look like this:

.method static constructor <clinit>()V
    .locals 1
    const-string v0, "droidguard"
    invoke-static {v0}, Ljava/lang/System;->loadLibrary(Ljava/lang/String;)V
    return-void
.end method

Using the smali assembler, everything is assembled back into a .dex file, which in turn is converted to a .jar. After these manipulations we get a jar file that we can use in the test project. By the way, here it is.

All the work with DroidGuard takes a couple of lines. The most important thing is to load the byte array that we received from the anti-abuse service in the previous step and pass it to the DroidGuard constructor.

private fun runDroidguard() {
    // Bytecode saved from the anti-abuse response in the previous step
    val byteCode: ByteArray? = loadBytecode("bytecode.base64")
    byteCode?.let {
        val droidguard = DroidGuard(applicationContext, "addAccount", it)
        val params = mapOf(
            "dg_email" to "test@gmail.com",
            "dg_gmsCoreVersion" to "910055-30",
            "dg_package" to "com.google.android.gms",
            "dg_androidId" to UUID.randomUUID().toString()
        )
        droidguard.init()
        val result = droidguard.ss(params)
        droidguard.close()
    }
}

Now, using the Android Studio profiler, we can see what happens while DroidGuard runs.

The native method initNative() collects information about the device and calls Java methods: hasSystemFeature(), getMemoryInfo(), getPackageInfo()... That is already something, but the specific logic is still not visible. Well, there is nothing left but to disassemble the .so file.

libdroidguard.so

In fact, analyzing a native library is not much harder than analyzing .dex and .jar files. You will need a program like Hex-Rays IDA and, occasionally, a little knowledge of ARM or x86 assembly, whichever you choose. I chose ARM because I have a device set up for debugging. If you have nothing like that at hand, you can take the x86 library and debug it in the emulator.

A program like Hex-Rays IDA decompiles the binary into something resembling C code. If we open the code of the method Java_com_google_ccc_abuse_droidguard_DroidGuard_initNative, we will see roughly the following picture:

__int64 __fastcall Java_com_google_ccc_abuse_droidguard_DroidGuard_initNative(int a1, int a2, int a3, int a4, int a5, int a6, int a7, int a8, int a9)  
...
  v14 = (*(int (**)(int, int))(*(_DWORD *)v9 + 684))(v9, a5);
  v15 = (*(int (**)(int, int, int))(*(_DWORD *)v9 + 736))(v9, a5, 0);
...

It looks so-so. First, a couple of preliminary steps are needed to bring it into decent shape. The decompiler knows nothing about JNI, so we install the Android NDK and import the jni.h file. As we well know, the first two parameters of a JNI method are JNIEnv* and jobject (this). The types and purpose of the remaining parameters can be learned from the Java code of DroidGuard. After the proper types are assigned, the meaningless offsets turn into JNI method calls:

__int64 __fastcall Java_com_google_ccc_abuse_droidguard_DroidGuard_initNative(_JNIEnv *env, jobject thiz, jobject context, jstring flow, jbyteArray byteCode, jobject runtimeApi, jobject extras, jint loggingFd, int runningInAppSide){
...
  programLength = (_env->functions->GetArrayLength)(_env, byteCode);
  programBytes = (jbyte *)(_env->functions->GetByteArrayElements)(_env, byteCode, 0);
...

If you have the patience to follow the path of the byte array that we received from the anti-abuse server, you may be disappointed. Unfortunately, there is no simple answer to the question “what is going on here at all?”. It really is genuine bytecode, and the native library is a virtual machine. A bit of AES encryption, and then the virtual machine reads the bytecode byte by byte and executes commands. Each command is a byte followed by its operands. There are not that many commands, only 70: read an int, read a byte, read a string, call a Java method, multiply two numbers, if-goto, and so on.

Wake up, Neo

I decided to go a little further and figure out the bytecode format of this virtual machine. There is one problem with the commands: periodically (once every few weeks) a new version of the native library appears in which each command corresponds to a different byte. That did not stop me, and I decided to recreate the virtual machine in Java.
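To give a feel for what such a recreation involves, here is a toy interpreter, nowhere near the real thing. Everything in it is an assumption made for illustration: the five commands, the stack-based design and the byte values are invented. The one idea it borrows from DroidGuard is that the byte-to-command mapping is a table passed in from outside, so that a new library version only requires a new table.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Toy bytecode interpreter with a replaceable opcode table (all opcodes are invented).
final class ToyVmSketch {

    enum Op { HALT, PUSH, ADD, MUL, JUMP_IF_ZERO }

    // Executes the program; opcodeMap[rawByte] says which command a byte means
    // in this "library version". Returns the top of the stack, or 0 if it is empty.
    static int run(byte[] program, Op[] opcodeMap) {
        Deque<Integer> stack = new ArrayDeque<>();
        int pc = 0;
        while (pc < program.length) {
            Op op = opcodeMap[program[pc++] & 0xFF];
            if (op == null) {
                throw new IllegalStateException("unknown opcode at offset " + (pc - 1));
            }
            switch (op) {
                case HALT:
                    return stack.isEmpty() ? 0 : stack.peek();
                case PUSH: // operand: one signed byte
                    stack.push((int) program[pc++]);
                    break;
                case ADD:
                    stack.push(stack.pop() + stack.pop());
                    break;
                case MUL:
                    stack.push(stack.pop() * stack.pop());
                    break;
                case JUMP_IF_ZERO: { // operand: absolute target offset
                    int target = program[pc++] & 0xFF;
                    if (stack.pop() == 0) {
                        pc = target;
                    }
                    break;
                }
            }
        }
        return stack.isEmpty() ? 0 : stack.peek();
    }

    // Builds an opcode table from the byte values used by one "version".
    static Op[] mapping(int halt, int push, int add, int mul, int jumpIfZero) {
        Op[] map = new Op[256];
        map[halt] = Op.HALT;
        map[push] = Op.PUSH;
        map[add] = Op.ADD;
        map[mul] = Op.MUL;
        map[jumpIfZero] = Op.JUMP_IF_ZERO;
        return map;
    }

    public static void main(String[] args) {
        // The same program, (2 + 3) * 4, encoded for two different opcode tables.
        byte[] v1 = {1, 2, 1, 3, 2, 1, 4, 3, 0};
        byte[] v2 = {0x77, 2, 0x77, 3, 0x21, 0x77, 4, 0x22, 0x10};
        System.out.println(run(v1, mapping(0, 1, 2, 3, 4)));
        System.out.println(run(v2, mapping(0x10, 0x77, 0x21, 0x22, 0x30)));
    }
}
```

With this layout the interpreter loop stays the same between versions, and only the arguments to mapping have to be recovered from each new binary.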

The bytecode does all the routine work of collecting information about the device. For example, it loads a string with a method name, gets the method's address via dlsym and executes it. In my Java version of the virtual machine I implemented five methods at most and learned to interpret literally the first 25 commands of the anti-abuse service's bytecode. At the 26th command, the virtual machine read yet another encrypted string from the bytecode. Suddenly it turned out to be far from the name of another method.

Virtual Machine command # 26
Method invocation vm->vm_method_table[2 * 0x77]
Method vmMethod_readString
index is 0x9d
string length is 0x0066
(new key is generated)
34 3 35 4A DD 55 B3 91 33 05 61 04 C0 54 FD 95 2F 18 72 04 C1 55 E1 92 28 11 66 04 DD 4F B3 94 33 04 35 0A C1 4E B2 DB 12 17 79 4F 92 55 FC DB 33 05 35 45 C6 01 F7 89 29 1F 71 43 C7 40 E1 9F 6B 1E 70 48 DE 4E B8 CD 75 44 23 14 85 14 A7 C2 7F 40 26 42 84 17 A2 BB 21 19 7A 43 DE 44 BD 98 29 1B
decoded string bytes are 59 6F 75 27 72 65 20 6E 6F 74 20 6A 75 73 74 20 72 75 6E 6E 69 6E 67 20 73 74 72 69 6E 67 73 20 6F 6E 20 6F 75 72 20 2E 73 6F 21 20 54 61 6C 6B 20 74 6F 20 75 73 20 61 74 20 64 72 6F 69 64 67 75 61 72 64 2D 68 65 6C 6C 6F 2B 36 33 32 30 30 37 35 34 39 39 36 33 66 36 36 31 40 67 6F 6F 67 6C 2E 65 63 6D 6F
decoded string value is (You're not just running strings on our .so! Talk to us at droidguard@google.com)
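The "decoded string bytes" line is plain ASCII, which is easy to confirm by hand or with a few lines of code. The helper below is only a convenience sketch with a made-up name; it converts a space-separated hex dump into text and says nothing about how the string was encrypted.

```java
import java.nio.charset.StandardCharsets;

// Convenience sketch: turns a space-separated hex dump into an ASCII string.
final class HexDumpSketch {

    static String hexToAscii(String hexDump) {
        String[] parts = hexDump.trim().split("\\s+");
        byte[] bytes = new byte[parts.length];
        for (int i = 0; i < parts.length; i++) {
            bytes[i] = (byte) Integer.parseInt(parts[i], 16);
        }
        return new String(bytes, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        // The first ten bytes of the decoded string from the log above.
        System.out.println(hexToAscii("59 6F 75 27 72 65 20 6E 6F 74"));
    }
}
```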

Very strange: until this moment, virtual machines had never talked to me. It seemed to me that seeing secret messages addressed to you is a warning sign. To make sure I had not lost my mind, I decided to run a couple of hundred different bytecode responses from the anti-abuse service through my virtual machine. Every time, literally within 25-30 commands, a message was hidden in the bytecode. They often repeated, but I picked out the unique ones. I did change the email address, though: in every such message the address had the form “droidguard+tag@google.com”, with a tag unique to each request to the anti-abuse service.

droidguard@google.com: Don't be a stranger!
You got in! Talk to us at droidguard@google.com
Greetings from droidguard@google.com intrepid traveler! Say hi!
Was it easy to find this? droidguard@google.com would like to know
The folks at droidguard@google.com
What's all this gobbledygook? Ask droidguard@google.com… they'd know!
Hey! Fancy seeing you here. Have you spoken to droidguard@google.com yet?
You're not just running strings on our .so! Talk to us at droidguard@google.com

Could I be the chosen one? I decided it was time to stop digging into DroidGuard and get in touch with Google, since they were asking me to.

Your call is very important to us.

I decided to report the results of my research to the address given. To make the results look more impressive, I automated the analysis of the virtual machine a bit. The thing is that strings and byte arrays are stored in the bytecode in encrypted form. The virtual machine decodes them using constants inlined by the compiler. With a program like Hex-Rays IDA, extracting them is not difficult. But these constants change with every new version of the native library, and extracting them by hand is inconvenient.

Parsing the native library in Java turned out to be surprisingly easy. With the help of jelf (a library for parsing ELF files), the offset of the method Java_com_google_ccc_abuse_droidguard_DroidGuard_initNative in the binary is found, and then with the help of Capstone (a disassembly framework with bindings for various programming languages, including Java) you can get the assembly code and search it for instructions that load constants into registers.

The result was a program that reproduces everything DroidGuard does: it makes a request to the anti-abuse service, downloads the apk, unpacks it, parses the native library, pulls out the necessary constants, works out the mapping of virtual machine commands and interprets the bytecode. Having put all this together, I sent an email to Google. In parallel, I started preparing for the move and studied Glassdoor for the average salary at the company. I decided not to settle for less than a six-figure sum.

The answer was not long in coming. The email from a member of the DroidGuard team was rather terse: “Why are you doing this at all?”

“Just because,” I replied. The Google employee explained to me what DroidGuard is for: to protect Android from attackers (who would have thought!). And that it would be wise not to publish my source code for the DroidGuard virtual machine anywhere. That was the end of our conversation.

Interview

A month later, another email arrived unexpectedly. The DroidGuard team in Zurich needed a new employee. Would I like to join them? You bet!

There are no back doors into Google. The most my contact could do for me was forward my resume to the HR department. After that, the standard bureaucratic procedure of a series of interviews kicks in.

There is plenty of information on the Internet about interviewing at Google. Algorithms, olympiad-style puzzles and programming in Google Docs were not my strong suit, so I started preparing hard. I wore the “Algorithms” course on Coursera down to holes, solved hundreds of problems on HackerRank, and could write breadth-first and depth-first graph traversal with my eyes closed...

Two months went by in preparation. To say that I was ready is to say nothing. Google Docs became my favorite IDE. It seemed to me that I knew almost everything about algorithms. Of course, I assessed my abilities soberly and understood that I would hardly pass the five on-site interviews in Zurich. But a free trip to a Disneyland for programmers in Switzerland is not bad either. The first stage is a phone interview, meant to weed out very weak candidates and not waste developers' time on on-site interviews. The day was set, I waited for the call...

...and immediately failed the very first phone interview. I got lucky: I was asked a question that I had seen on the Internet before and had already solved before the interview. The task was to serialize an array of strings. I proposed encoding the strings in Base64 and storing them separated by a delimiter. In response, the interviewer asked me to implement the Base64 algorithm. After that, the interview turned into a monologue in which the interviewer explained to me how Base64 works while I tried to recall bit operations in Java.
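For the record, here is roughly what I should have been able to write on the spot. It is a sketch reconstructed after the fact, not what was said at the interview: a hand-rolled encoder for the standard Base64 alphabet with padding, plus the serialization idea, where a comma is a safe delimiter because it never appears in Base64 output. The class name and the choice of comma are mine.

```java
import java.nio.charset.StandardCharsets;

// Sketch: Base64 encoding by hand, and serializing strings with a delimiter.
final class Base64Sketch {

    private static final char[] ALPHABET =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/".toCharArray();

    // Encodes every 3 input bytes as 4 characters of 6 bits each, padding with '='.
    static String encode(byte[] data) {
        StringBuilder out = new StringBuilder((data.length + 2) / 3 * 4);
        for (int i = 0; i < data.length; i += 3) {
            int b0 = data[i] & 0xFF;
            int b1 = i + 1 < data.length ? data[i + 1] & 0xFF : 0;
            int b2 = i + 2 < data.length ? data[i + 2] & 0xFF : 0;
            int chunk = (b0 << 16) | (b1 << 8) | b2; // 24 bits
            out.append(ALPHABET[(chunk >> 18) & 0x3F]);
            out.append(ALPHABET[(chunk >> 12) & 0x3F]);
            out.append(i + 1 < data.length ? ALPHABET[(chunk >> 6) & 0x3F] : '=');
            out.append(i + 2 < data.length ? ALPHABET[chunk & 0x3F] : '=');
        }
        return out.toString();
    }

    // Joins the Base64 form of each string with commas.
    static String serialize(String[] items) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < items.length; i++) {
            if (i > 0) {
                out.append(',');
            }
            out.append(encode(items[i].getBytes(StandardCharsets.UTF_8)));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(encode("Man".getBytes(StandardCharsets.US_ASCII)));
        System.out.println(serialize(new String[] {"a,b", "c"}));
    }
}
```

One weakness of this scheme is that an empty array and an array holding a single empty string both serialize to an empty string, so a real solution would need a length prefix or similar.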

If the article is read by Google employees

Guys, you are geniuses if you managed to get in! Seriously. I have no idea how anyone gets through this obstacle course.

Three days after the call, I received an email informing me that they did not want to interview me further. With that, my communication with Google ended for good.

Why DroidGuard contains hidden messages inviting you to talk, I never found out. Perhaps just for statistics. As I was told, people write to them with varying frequency: sometimes three people a week, sometimes one a year.

I think there are much easier ways to get an interview at Google. After all, you can just as well ask any of the company's hundred thousand employees (although there are fewer developers, of course). But the experience was interesting.
