Intercepting native library function calls in Android applications

Why this is needed


I have often needed to debug Android applications that use native code. Sometimes I had to intercept calls into bionic (libc), sometimes calls into .so files for which I had no source code. Sometimes I had to ship third-party .so files inside my own applications, again without sources, and needed to correct their behavior.

So how do you do LD_PRELOAD on Android?

As is widely known, on a regular desktop Linux this problem is easily solved with the LD_PRELOAD environment variable. The trick works as follows: the dynamic linker puts the library named in this variable at the very top of the list of libraries it searches. As a result, when the code makes a library call for the first time (lazy binding), the linker binds the function to the one we defined in our library.
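For reference, the desktop trick can be sketched in a few lines. This is a minimal sketch, assuming glibc on Linux; rand() is chosen only because it is easy to observe, and the counter fake_rand_calls is a hypothetical name added for illustration.

```c
#include <stdlib.h>

/* Hypothetical counter so the interception can be observed. */
int fake_rand_calls = 0;

/* Same name and signature as libc's rand(). Because this definition is
 * found before the one in libc, callers get bound to it instead. */
int rand(void)
{
    fake_rand_calls++;
    return 42;   /* every "random" number is now 42 */
}
```

Built as a shared library and named in LD_PRELOAD, this definition replaces rand() for the whole process. An interposer that still needs the original function would typically fetch it with dlsym(RTLD_NEXT, ...).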

This is all great, but on Android the trick does not work. Applications launched from the UI are already linked by the time any code written by the application's author runs. In theory, an application can be started from the command line with LD_PRELOAD set. But that is hard to do, and it only works for debugging.

A bit about dynamic linking


To use dynamic libraries, you need to be able to call their code from other libraries, and vice versa. How can already compiled code call code in another library? Ordinary branch instructions such as jmp / bx need an address, but we cannot know it in advance (when the .so is built), because different .so files can end up at different (or even random) places in memory. You could simply patch the addresses of the required functions into the code once every .so has been laid out in memory. But that is inelegant and slow, it requires writing to the code area, and each process would need its own private copy of the code, so the memory savings of shared libraries would be lost.

The solution is very simple: the jump goes through an address stored somewhere outside the executable code section. And if that location is referenced not absolutely but relatively (for example, as an offset from the instruction itself), the code can be placed anywhere in memory. Right behind the code sits the PLT, the procedure linkage table. It is usually mapped r or rw, not executable, and it holds the "real" addresses. (Strictly speaking, the PLT proper consists of small executable stubs, and the addresses they jump through live in the GOT, the global offset table; this article calls the whole mechanism the PLT.) The table can be filled either at load time or at run time, lazily, on the first call.
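The idea of jumping through a writable slot that is filled lazily can be modeled in plain C. This is only a toy model, not what the linker literally does: the names toy_slots, lazy_resolver, call_square and toy_resolve_count are all hypothetical, and a function pointer array stands in for the table.

```c
typedef int (*func_t)(int);

/* Stand-in for a function that lives in another library. */
static int real_square(int x) { return x * x; }

static int lazy_resolver(int x);

/* Toy table: one writable slot per imported function. The slot starts
 * out pointing at the resolver, like an entry that is not bound yet. */
static func_t toy_slots[1] = { lazy_resolver };
int toy_resolve_count = 0;

/* The first call lands here: find the real address, store it in the
 * slot, then forward the call. Later calls never come back here. */
static int lazy_resolver(int x)
{
    toy_resolve_count++;
    toy_slots[0] = real_square;
    return toy_slots[0](x);
}

/* What the compiled caller does: it never embeds the target address,
 * it always jumps through the slot. */
int call_square(int x)
{
    return toy_slots[0](x);
}
```

Calling call_square() twice runs the resolver only once; the second call goes straight to real_square() through the already filled slot.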

Putting it all together, to make the module xxx.so jump to our interceptor when it calls yyy(), we need to:
  • find the PLT section in xxx;
  • compute or look up the offset of the yyy() entry in the PLT;
  • write the address of our function there.
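The three steps can be rehearsed on a toy import table before touching real ELF structures. This is a sketch under stated assumptions: struct toy_import, xxx_imports, toy_hook and my_yyy are hypothetical names, and the table is an ordinary C array rather than a real PLT.

```c
#include <stddef.h>
#include <string.h>

typedef int (*func_t)(int);

/* Hypothetical import table of module "xxx": a name plus a writable slot. */
struct toy_import {
    const char *name;
    func_t      target;
};

static int original_yyy(int x) { return x + 1; }
static int original_zzz(int x) { return x - 1; }
int my_yyy(int x) { return x * 100; }   /* our interceptor */

static struct toy_import xxx_imports[] = {
    { "zzz", original_zzz },
    { "yyy", original_yyy },
};

/* Walk the table, find the slot for `name`, overwrite it.
 * Returns the previous target, or NULL if the name is not imported. */
func_t toy_hook(const char *name, func_t replacement)
{
    size_t count = sizeof(xxx_imports) / sizeof(xxx_imports[0]);
    for (size_t i = 0; i < count; i++) {
        if (strcmp(xxx_imports[i].name, name) == 0) {
            func_t old = xxx_imports[i].target;
            xxx_imports[i].target = replacement;
            return old;
        }
    }
    return NULL;
}

/* Code inside "xxx" calling yyy() through its slot. */
int xxx_calls_yyy(int x)
{
    return xxx_imports[1].target(x);
}
```

After toy_hook("yyy", my_yyy), the unchanged caller xxx_calls_yyy() ends up in my_yyy(). The returned old pointer is what an interceptor would keep if it wanted to call the original.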


Actually, interception


Android uses bionic, which differs slightly from glibc, but not fundamentally. The linker keeps its internal data in the soinfo structure; these structures form a linked list describing every .so loaded at the moment.

In glibc, dlopen() returns a perfectly opaque void*:

void *dlopen(const char *filename, int flag)


But looking at the bionic source, we see that what it actually hands back is the coveted soinfo:
soinfo* do_dlopen(const char* name, int flags)

If the library is already loaded, we get back the soinfo for it. Hooray, now we have in our hands all the information about the .so that interests us.

In ELF, symbol names are stored as strings in one place (strtab), and the structures describing the symbols are stored in another (symtab). A hash is computed over each symbol name (a string constant), which makes it possible to quickly find the entry for the symbol we are interested in.

Computing the ELF symbol hash
 static unsigned elfhash(const char *_name)  
 {  
   const unsigned char *name = (const unsigned char *) _name;  
   unsigned h = 0, g;  
   while(*name) {  
     h = (h << 4) + *name++;  
     g = h & 0xf0000000;  
     h ^= g;  
     h ^= g >> 24;  
   }  
   return h;  
 } 


Once the hash is computed, we need to find the symbol.
Looking up a symbol by hash
static Elf32_Sym *soinfo_elf_lookup(soinfo *si, unsigned hash, const char *name)
 {
   Elf32_Sym *symtab = si->symtab;
   const char *strtab = si->strtab;
   unsigned n;
   /* walk the chain of the bucket this hash falls into */
   for (n = si->bucket[hash % si->nbucket]; n != 0; n = si->chain[n]) {
     Elf32_Sym *s = symtab + n;
     /* st_name is an offset into strtab; compare the actual names */
     if (strcmp(strtab + s->st_name, name) == 0)
       return s;
   }
   return NULL;
 }
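To see how strtab, symtab, bucket and chain fit together, here is a self-contained toy version. It is a sketch, not bionic code: struct toy_sym is a cut-down stand-in for Elf32_Sym, every toy_ name is hypothetical, and the way the table is built is just one simple scheme that produces chains the lookup can walk.

```c
#include <string.h>

#define TOY_NSYMS   4   /* index 0 is the reserved "undefined" symbol */
#define TOY_NBUCKET 3

/* Cut-down stand-in for Elf32_Sym: only the name offset matters here. */
struct toy_sym { unsigned st_name; };

/* String table: NUL-terminated names back to back; offset 0 is "". */
static const char toy_strtab[] = "\0connect\0read\0write";
static const struct toy_sym toy_symtab[TOY_NSYMS] = { {0}, {1}, {9}, {14} };

static unsigned toy_bucket[TOY_NBUCKET];
static unsigned toy_chain[TOY_NSYMS];

/* Same algorithm as elfhash() in the article. */
static unsigned toy_hash(const char *s)
{
    const unsigned char *name = (const unsigned char *) s;
    unsigned h = 0, g;
    while (*name) {
        h = (h << 4) + *name++;
        g = h & 0xf0000000;
        h ^= g;
        h ^= g >> 24;
    }
    return h;
}

/* Fill bucket[] and chain[]: each bucket holds the index of one symbol,
 * and chain[] links the other symbols that hash to the same bucket. */
void toy_build_hash(void)
{
    memset(toy_bucket, 0, sizeof toy_bucket);
    memset(toy_chain, 0, sizeof toy_chain);
    for (unsigned i = 1; i < TOY_NSYMS; i++) {
        unsigned b = toy_hash(toy_strtab + toy_symtab[i].st_name) % TOY_NBUCKET;
        toy_chain[i] = toy_bucket[b];   /* push onto the bucket's list */
        toy_bucket[b] = i;
    }
}

/* Same walk as soinfo_elf_lookup(): returns the symbol index, 0 if absent. */
unsigned toy_lookup(const char *name)
{
    unsigned n = toy_bucket[toy_hash(name) % TOY_NBUCKET];
    for (; n != 0; n = toy_chain[n]) {
        if (strcmp(toy_strtab + toy_symtab[n].st_name, name) == 0)
            return n;
    }
    return 0;
}
```

Index 0 doubles as the end-of-chain marker, which is why the real symbol table also reserves its first entry.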


And here is the procedure for replacing the desired value:
 /* Needs <dlfcn.h>, <stdint.h>, <sys/mman.h>, <unistd.h> and the soinfo definition from the bionic linker sources */
 int hook_call(char *soname, char *symbol, unsigned newval) {
  soinfo *si = NULL;
  Elf32_Rel *rel = NULL;
  Elf32_Sym *s = NULL;
  uint32_t sym_offset = 0;
  uint32_t page_size = 0;
  if (!soname || !symbol || !newval)
     return 0;
  /* the handle returned by dlopen() is really the soinfo of the library */
  si = (soinfo*) dlopen(soname, 0);
  if (!si)
   return 0;
  s = soinfo_elf_lookup(si, elfhash(symbol), symbol);
  if (!s)
    return 0;
  page_size = getpagesize();
  sym_offset = s - si->symtab;  // index of the symbol we found
  rel = si->plt_rel;
  /* walk the relocation table until we hit the index we need */
  for (int i = 0; i < si->plt_rel_count; i++, rel++) {
   unsigned type = ELF32_R_TYPE(rel->r_info);
   unsigned sym = ELF32_R_SYM(rel->r_info);
   unsigned reloc = (unsigned)(rel->r_offset + si->base);
   unsigned oldval = 0;
   if (sym_offset == sym) {
    switch(type) {
      case R_ARM_JUMP_SLOT:
         // the page must be made RW, and the address passed to mprotect must be page-aligned
         if (mprotect((void *)(reloc & ~(page_size - 1)), page_size, PROT_READ | PROT_WRITE) != 0)
            return 0;
         oldval = *(unsigned*) reloc;  // previous target, kept in case the original is needed
         (void) oldval;
         *((unsigned*)reloc) = newval;
         return 1;
      default:
         return 0;
    }
   }
  }
  return 0;
 }
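The page-alignment step is easy to get wrong, so here it is in isolation. This is a minimal sketch, assuming Linux or Android with a power-of-two page size; page_start and patch_slot are hypothetical helper names, not part of bionic.

```c
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Round an address down to the start of its page. Assumes the page size
 * is a power of two. */
uintptr_t page_start(uintptr_t addr, uintptr_t page_size)
{
    return addr & ~(page_size - 1);
}

/* Overwrite one pointer-sized slot that may live in a read-only page.
 * Stores the previous value in *oldval if oldval is not NULL.
 * Returns 0 on success, -1 if mprotect() fails. */
int patch_slot(uintptr_t *slot, uintptr_t newval, uintptr_t *oldval)
{
    uintptr_t page_size = (uintptr_t) sysconf(_SC_PAGESIZE);
    void *page = (void *) page_start((uintptr_t) slot, page_size);

    /* mprotect() rejects addresses that are not page-aligned */
    if (mprotect(page, page_size, PROT_READ | PROT_WRITE) != 0)
        return -1;
    if (oldval)
        *oldval = *slot;
    *slot = newval;
    return 0;
}
```

Masking the address before the cast to a pointer, and checking the result of mprotect(), are the two details that matter here.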


Now, to intercept connect() calls made from libandroid_runtime.so, we call:

hook_call("libandroid_runtime.so", "connect", (unsigned) &my_connect);
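The article does not show my_connect itself, so here is one way it might look. This is a hypothetical sketch: the counter my_connect_calls is invented for illustration, and forwarding with a plain connect() call assumes the interceptor lives in a different module from the one that was patched, so its own calls still reach the real function.

```c
#include <sys/types.h>
#include <sys/socket.h>

/* Hypothetical counter so we can see the hook firing. */
int my_connect_calls = 0;

/* Must have exactly the signature of connect(): the patched module will
 * call it with connect()'s arguments. */
int my_connect(int sockfd, const struct sockaddr *addr, socklen_t addrlen)
{
    my_connect_calls++;
    /* Inspect or log addr here, then forward to the real function.
     * This call goes through our own module's table, which was not patched. */
    return connect(sockfd, addr, addrlen);
}
```

Because the hook only rewrites the table of one module, connect() calls made by other libraries are not affected.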
