
Making Golang binaries easier to reverse, or why write scripts for IDA
Golang is a great language. Strong typing, garbage collection, calling C functions via cgo, reflect, chan: a fairy tale! Evidently I am not the only one who thinks so, because Go is popular, which means many programmers use it, which means there is a good chance that sooner or later someone will need to reverse its binaries. That is what we'll do now.
Preparation
To get started, take a fresh Go 1.8 on windows/amd64. As the binary, I'll take one of my own projects. Compile it with
go build
The file size is 7256576 bytes, with DWARF in place. That won't do: release builds usually have everything superfluous cut out of the binary. Standard tools like strip do not work well with Go binaries, so we google it and find one of the most popular ways to trim a binary:
go build -ldflags "-w -s"
We look at what these flags do and find that -w removes DWARF, and -s removes the symbol table and debugging information. We compile and check the file size: 4894720 bytes. Well, nobody promised it would be small.
Looking for functions
We open our binary in IDA and are sad: of all the functions, only the entry point was detected. We dig a little deeper and see that IDA got stuck on:
lea rax, qword_452018+128h
jmp rax
We go to that address, declare a function there, run the analysis, and 215 functions are found.

If we continue in this spirit, the analysis will take a long time. It is time to think. We recall that reflect from the standard library lets you call functions by name, which means the function names, or at least their hashes, must be stored somewhere, and we can extract them and feed them to IDA.
We scroll to the beginning of the file hoping to run into pointers to the structures we need, but instead we meet the string
Go build ID: "e07bfb8669c13efb74574c0ad220c5f2cfae5cd4"
We dig into the Go sources, since they are available to us, and look for mentions of this string. In the linker code we notice the line
ctxt.Syms.Lookup("go.buildid", 0)
which looks up go.buildid in some symbol table. We guess that this symbol name can be found in the binary, and indeed we find it.

Right above the string we see the number 401000h, which is a pointer to our build ID string. Below we see function definitions, and above, an array of pairs in which one value of each pair looks like a pointer. We check where these pointers lead and we are not mistaken: they point to functions.

To find out what we just saw, we compile the code with the debug information left in. We open it in IDA and find that this array is called pclntab and that the linker fills it in the function func (ctxt *Link) pclntab().
So we know about the pclntab table, which stores function names, and we have a way to reach this table through buildid. We write the code:
Code that names functions
import idaapi
from idc import FindBinary, SEARCH_UP, Dword, Qword, GetString

def go_find_pclntab():
    pos = idaapi.get_segm_by_name(".text").endEA
    textstart = idaapi.get_segm_by_name(".text").startEA
    while True:
        # the hex data contains the string "go.buildid" and some other
        # values of the structure that should not change
        gobuilddefpos = FindBinary(
            pos, SEARCH_UP,
            "67 45 23 01 " + "00 "*20 +
            "67 6f 2e 62 75 69 6c 64 69 64")
        if gobuilddefpos < 100 or gobuilddefpos > pos:
            # invalid pointer, all candidates have been tried,
            # the buildid pclntab entry was not found :(
            break
        # check that the buildid entry is valid
        if Dword(gobuilddefpos-0x10) == textstart:
            # subtract the symbol name's offset within pclntab from the
            # symbol name's address to get the address of pclntab
            return gobuilddefpos + 24 - Dword(gobuilddefpos-0x8)
        pos = gobuilddefpos
    return None

def go_pclntab_travel(pclntab):
    nfunc = Dword(pclntab+8)
    # walk over all functions in the pclntab array
    for i in xrange(nfunc):
        entry = pclntab + 0x10 + i * 0x10
        sym = Qword(entry)
        info = Qword(entry + 8) + pclntab
        symnameoff = Dword(info + 8) + pclntab
        symname = GetString(symnameoff)
        # declare the function (go_pclntab_handle_function is not shown here)
        go_pclntab_handle_function(sym, symname, info)
We run the script and look at the results:

Almost all of the 5728 functions now have a name.
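For readers who want to try the same idea without IDA, here is a minimal sketch that applies the layout the script above relies on to a raw byte buffer. It is not the author's code: the helper names (u32, u64, cstring, find_pclntab, walk_pclntab) and the text_start parameter are hypothetical, the offsets are simply the ones the IDA script reads (function count at +8, 16-byte entries from +0x10, name offset at +8 of each function record), and it targets Python 3 rather than IDAPython.

```python
import struct

# Pattern from the script above: four fixed bytes, 20 zero bytes,
# then the symbol name "go.buildid".
BUILDID_PATTERN = bytes.fromhex("67452301") + b"\x00" * 20 + b"go.buildid"


def u32(buf, off):
    """Read a little-endian 32-bit value at byte offset off."""
    return struct.unpack_from("<I", buf, off)[0]


def u64(buf, off):
    """Read a little-endian 64-bit value at byte offset off."""
    return struct.unpack_from("<Q", buf, off)[0]


def cstring(buf, off):
    """Read a NUL-terminated string starting at byte offset off."""
    end = buf.index(b"\x00", off)
    return buf[off:end].decode("utf-8", errors="replace")


def find_pclntab(buf, text_start):
    """Return the offset of pclntab inside buf, or None if it is not found.

    text_start (hypothetical parameter) is the address expected in the entry
    field of the go.buildid record, 0x10 bytes before the pattern.
    """
    pos = len(buf)
    while True:
        # search backwards, like SEARCH_UP in the IDA script
        hit = buf.rfind(BUILDID_PATTERN, 0, pos)
        if hit < 0x10:
            return None  # no (more) candidates
        if u64(buf, hit - 0x10) == text_start:
            # address of the name minus the name's offset from pclntab
            return hit + 24 - u32(buf, hit - 0x8)
        # exclude this match and keep looking further up
        pos = hit + len(BUILDID_PATTERN) - 1


def walk_pclntab(buf, pclntab):
    """Yield (entry_address, name) for every function listed in pclntab."""
    nfunc = u32(buf, pclntab + 8)
    for i in range(nfunc):
        entry = pclntab + 0x10 + i * 0x10
        addr = u64(buf, entry)
        info = u64(buf, entry + 8) + pclntab
        name_off = u32(buf, info + 8)
        yield addr, cstring(buf, pclntab + name_off)
```

With buf holding the bytes that contain pclntab and text_start set to the address of the first function, find_pclntab gives the table offset and walk_pclntab lists the address and name pairs that the IDA script turns into named functions. Mapping file offsets to virtual addresses is left out of this sketch.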

Looking for types
The second thing that interested me after function names was type identification. First of all, we want to know how dynamic memory is allocated for structures, which immediately leads us to the newobject function, which takes a pointer to runtime._type.
runtime._type structure
type _type struct {
    size       uintptr
    ptrdata    uintptr // size of memory prefix holding all pointers
    hash       uint32
    tflag      tflag
    align      uint8
    fieldalign uint8
    kind       uint8
    alg        *typeAlg
    // gcdata stores the GC type data for the garbage collector.
    // If the KindGCProg bit is set in kind, gcdata is a GC program.
    // Otherwise it is a ptrmask bitmap. See mbitmap.go for details.
    gcdata    *byte
    str       nameOff
    ptrToThis typeOff
}
Have you noticed? Hint: the field
str nameOff
Hint: str is a string. Hint: nameOff -> name -> the name of the type structure -> the type name. The type structure contains the type name! Only it is a relative one. We find out that str is an offset from the start of the module's types section, a pointer to which can be obtained from the module's moduledata structure. So to learn the type name, we need to find moduledata. For simplicity, we assume there is always exactly one module.
In the function runtime.resolveNameOff we find that the first module lives in the variable firstmoduledata, which we need to locate in our binary. We don't have to go far for this: we can look in that same resolveNameOff function, because the script above already collected the addresses and names of all functions. It only remains to find the types to recognize. For this I simply took all calls to the newobject function and took the type pointers from their arguments.
lea rbx, _type_p_elliptic_p256Point_6dad60 ; *elliptic.p256Point
mov [rsp+0A8h+var_A8], rbx
call runtime_newobject ; now we know which type is allocated here
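The lookup described above can be sketched outside IDA as follows. This is an illustration under stated assumptions, not the author's script: parse_type and resolve_name_off are hypothetical helper names, the struct format is the runtime._type listing above laid out for amd64 (8-byte pointers, 4-byte nameOff and typeOff), and the name encoding (one flag byte, a two-byte big-endian length, then the name bytes) is my reading of the Go 1.8 runtime and does not come from the article.

```python
import struct

# runtime._type on amd64, in the field order of the listing above:
# size, ptrdata, hash, tflag, align, fieldalign, kind, alg, gcdata,
# str (nameOff), ptrToThis (typeOff)
TYPE_FORMAT = "<QQIBBBBQQii"
TYPE_SIZE = struct.calcsize(TYPE_FORMAT)
TYPE_FIELDS = ("size", "ptrdata", "hash", "tflag", "align", "fieldalign",
               "kind", "alg", "gcdata", "str", "ptrToThis")


def parse_type(buf, off):
    """Decode one runtime._type record located at byte offset off."""
    values = struct.unpack_from(TYPE_FORMAT, buf, off)
    return dict(zip(TYPE_FIELDS, values))


def resolve_name_off(buf, types_off, name_off):
    """Return the name stored name_off bytes after the types section start.

    Assumed encoding (Go 1.8): byte 0 holds flags, bytes 1 and 2 hold the
    big-endian length, and the name itself follows.
    """
    p = types_off + name_off
    length = (buf[p + 1] << 8) | buf[p + 2]
    return buf[p + 3:p + 3 + length].decode("utf-8", errors="replace")
```

Given the offset of a _type record taken from an argument of newobject and the start of the types section taken from moduledata, resolve_name_off(buf, types_off, parse_type(buf, off)["str"]) would return the type name that ends up in the disassembly comment.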
Summary
We could keep digging into this binary, but we have already achieved our goals and shown that Go binaries are fairly easy to explore and that scripting in IDA makes the work much easier.
Source code: github.com/mogaika/golang_ida_scripts
The code is written as a proof of concept and may contain flaws; any complaints can be sent in private messages here or on GitHub.
The text is provided as is; complaints about its quality are welcome in private messages.