xoo May 23, 2019 at 19:58

We solve the Best Reverser problem with PHDays 9

From the sandbox

Hello!

My name is Marat Gayanov, I want to share with you my solution to the problem from the Best Reverser contest , to show how to make keygen for this case.

Description

In this competition, participants are provided with ROM games for Sega Mega Drive ( best_reverser_phd9_rom_v4.bin ).

Task: to pick up such a key which together with the participant’s email address will be recognized as valid.

So the solution ...

Instruments

IDA Pro 6.8
Smd_ida_tools plugin

Checking Key Length

The program does not accept every key: you need to fill in the entire field, these are 16 characters. If the key is shorter, you will see a message: “Wrong length! Try again ... ".

Let's try to find this line in the program, for which we will use the binary search (Alt-B). What will we find?

We will find not only this, but also the other service lines nearby: “Wrong key! Try again ... ”and“ YOU ARE THE BEST REVERSER! ”.

Shortcuts WRONG_LENGTH_MSG, YOU_ARE_THE_BEST_MSGand WRONG_KEY_MSGI set it to be convenient.

Set a break for reading the address 0x0000FDFA- find out who works with the message “Wrong length! Try again ... ". And run the debugger (it will stop several times before the key can be entered, just press F9 at each stop). Enter your email, key ABCD.

The debugger results in 0x00006FF0 tst.b (a1)+:

There is nothing interesting in the block itself. It is much more interesting who transfers control here. We look at the call stack:

Click and get here - on the instructions 0x00001D2A jsr (sub_6FC0).l:

We see that all possible messages were found in one place. But let's find out where control is transferred to from the block WRONG_KEY_LEN_CASE_1D1C. We will not set breaks, just move the cursor over the arrow going to the block. The caller is located at the address 0x000017DE loc_17DE(which I will rename to CHECK_KEY_LEN):

Put a break on the address 0x000017EC cmpi.b 0x20 (a0, d0.l)(the instruction in this context looks to see if there is an empty character at the end of the key character array), restart, enter the mail and key again ABCD. The debugger stops and shows that at the address 0x00FF01C7(stored at that moment in the register a0) is the key entered:

This is a good find, through it we will grab everything at all. But first, mark up the bytes of the key for convenience:

Scrolling up from this place, we see that the mail is stored next to the key:

We dive deeper and deeper, and it's time to find a criterion for the correctness of the key. Rather, the first half of the key.

Criterion for the correctness of the first half of the key

Preliminary calculations

It is logical to assume that immediately after checking the length, other operations with the key will follow. Consider the block immediately after the check:

This block is undergoing preliminary work. The function get_hash_2b(in the original was sub_1526) is called twice. First, the address of the first byte of the key (the register a0contains the address KEY_BYTE_0) is transmitted to it , the second time - the fifth ( KEY_BYTE_4).

I named the function like this because it considers something like a 2-byte hash. This is the most understandable name that I picked up.

I will not consider the function itself, but I will immediately write it in python. She does simple things, but her description with screenshots will take up a lot of space.

The most important thing to say about it: the input address is supplied to the input, and work is being done on 4 bytes from this address. That is, they sent the first byte of the key to the input, and the function will work with the 1,2,3,4th. Filed fifth, the function works with the 5,6,7,8th. In other words, in this block there are calculations over the first half of the key. The result is written to the register d0.

So the function get_hash_2b:

# key_4s - четыре символа ключа
def get_hash_2b(key_4s):
	# Правило преобразования байта
	def transform(b):
		# numbers -.
		if b <= 0x39:
			r = b - 0x30
		# Letter case and @
		else:
			# @ABCDEF
			if b <= 0x46:
				r = b - 0x37
			else:
				# WXYZ
				if b >= 0x57:
					r = b - 0x57
				# GHIJKLMNOPQRSTUV
				else:
					r = 0xff - (0x57 - b) + 1  # a9+b
		return r
	# Перевод в байты
	key_4b = bytearray(key_4s, encoding="ascii")
	# Каждый байт аргумента трансформируется
	codes = [transform(b) for b in key_4b]
	# А здесь они просто склеиваются
	part0 = (codes[0] & 0xff) << 0xc
	part1 = (codes[1] << 0x8) & 0xf00
	part2 = (codes[2] << 0x4) & 0xf0
	hash_2b = (part0 | part1) & 0xffff
	hash_2b = (hash_2b | part2) & 0xffff
	hash_2b = (hash_2b | (codes[3] & 0xf))
	return hash_2b

Immediately write a hash decoding function:

# Возвращает строку из 4-х символов
def decode_hash_4s(hash_2b):
	# Правило преобразования байта
	def transform(b):
		if b <= 0x9:
			return b + 0x30
		if b <= 0xF:
			return b + 0x37
		if b >= 0x0:
			return b + 0x57
		return b - 0xa9
	# Нарезаем отдельные байты из переданного хэша и переводи
	b0 = transform(hash_2b >> 12)
	b1 = transform((hash_2b & 0xfff) >> 8)
	b2 = transform((hash_2b & 0xff) >> 4)
	b3 = transform(hash_2b & 0xf)
	# Склеиваем
	key_4s = [chr(b0), chr(b1), chr(b2), chr(b3)]
	key_4s = "".join(key_4s)
	return key_4s

I didn’t come up with a better decoding function, and it’s not entirely correct. Therefore, I will check it like this (not right now, but much later):

key_4s == decode_hash_4s(get_hash_2b(key_4s))

Check the work get_hash_2b. We are interested in the state of the register d0after the function is executed. Put the breaks on 0x000017FE, 0x00001808the key enter ABCDEFGHIJKLMNOP.

The register d0entered values 0xABCD, 0xEF01. And what will give get_hash_2b?

>>> first_hash = get_hash_2b("ABCD")
>>> hex(first_hash)
0xabcd
>>> second_hash = get_hash_2b("EFGH")
>>> hex(second_hash)
0xef01

Verification passed.

Then it is done xor eor.w d0, d5, the result is recorded in d5:

>>> hex(0xabcd ^ 0xef01)
0x44cc

The receipt of such a hash 0x44CCconsists of preliminary calculations. Further, everything only becomes more complicated.

Where does the hash go

We can’t go further if we don’t know how the program works with the hash. Surely it moves from d5to memory, because The register comes in handy somewhere else. We can find such an event through the trace (watching d5), but not manual, but automatic. The following script will help:

#include 
static main()
{
	auto d5_val;
	auto i;
	for(;;)
	{
		StepOver();
		GetDebuggerEvent(WFNE_SUSP, -1);
		d5_val = GetRegValue("d5");
		// Ловим факт изменения d5
		if (d5_val != 0xFFFF44CC){
			break;
		}
	}
}

Let me remind you that we are now on the last break 0x00001808 eor.w d0, d5. Insert the script ( Shift-F2), click. Run

The script will stop at the instructions 0x00001C94 move.b (a0, a1.l), d5, but by this moment it d5has already been cleared. However, we see that the value from is d5moved by the instruction 0x00001C56 move.w d5,a6: it is written into memory at the address 0x00FF0D46(2 bytes).

Remember: the hash is stored at 0x00FF0D46.

We catch the instructions that are read from 0x00FF0D46-0x00FF0D47(put a break on reading). There were 4 blocks:

How to choose the right / right from them?

Go back to the beginning:

This block determines whether the program will go to LOSER_CASEor to WINNER_CASE:

We see that there d1must be zero in the register for victory.

Where is zero set? Just scroll up:

If the loc_1EECcondition is met in the block :

*(a6 + 0x24) == *(a6 + 0x22)

then we get zero in d5.

If we put a break on the instruction 0x00001F16 beq.w loc_20EA, we will see that the a6 + 0x24 = 0x00FF0D6Avalue is stored there too 0x4840. And in a6 + 0x22 = 0x00FF0D68stored 0xCB4C.

If we enter different keys, mail, we will see that 0xCB4C - константа. The first half of the key will be accepted only if it 0x00FF0D6Awill be too 0xCB4C. This is the criterion for the correctness of the first half of the key.

We find out which blocks are written in 0x00FF0D6A- put a break on the record, enter the mail and key again.

And here loc_EACwe find this block (in fact there are 3 of them, but the first two are simply reset 0x00FF0D6A):

This block belongs to the function sub_E3E.

Through the call stack we find out that the function sub_E3Eis invoked in the blocks loc_1F94, loc_203E:

Remember, we found 4 blocks earlier? loc_1F94we saw there - this is the beginning of the main key processing algorithm.

First important loop loc_1F94

What loc_1F94is a loop can be seen from the code: it runs d4once (see instructions 0x00001FBA d4,loc_1F94):

What to look for:

There is a function sub_5EC.
The instruction 0x00001FB4 jsr (a0) calls the sub_E3E function (this can be seen with a simple trace).

What's going on here:

The function sub_5ECwrites the result of its execution to the register d0(a separate section is devoted to this).
A d1byte is stored in the register at the address sp+0x33(the 0x00FFFF79debugger tells us), it is equal to the second byte from the key hash address ( 0x00FF0D47). This is easy to prove if you put a break on a record at 0x00FFFF79: it will work on instructions 0x00001F94 move.b 1(a2), 0x2F(sp). At a2that moment, the address 0x00FF0D46is stored in the register — the hash address, that is 0x1(a2) = 0x00FF0D46 + 1, the address of the second byte of the hash.
It d0is written in the register d0^d1.
The resulting xor'a result is given to a function sub_E3Ewhose behavior depends on its previous calculations (shown below).
Repeat.

How many times does this cycle run?

Find out this. Run the following script:

#include 
static main()
{
	auto pc_val, d4_val, counter=0;
	while(pc_val != 0x00001F16)
	{
		StepOver();
		GetDebuggerEvent(WFNE_SUSP, -1);
		pc_val = GetRegValue("pc");
		if (pc_val == 0x00001F92){
			counter++;
			d4_val = GetRegValue("d4");
			print(d4_val);
		}
	}
	print(counter);
}

0x00001F92 subq.l 0x1,d4- here it is determined what will be in d4immediately before the cycle:

We deal with the sub_5EC function.

sub_5EC

Significant piece of code:

where 0x2c(a2)always 0x00FF1D74.
This piece can be rewritten like this in pseudo-code:

d0 = a2 + 0x2C
*(a2+0x2C) = *(a2+0x2C) + 1 #*(0x00FF1D74) = *(0x00FF1D74) + 1
result = *(d0) & 0xFF

That is, 4 bytes of 0x00FF1D74are the address, because they are treated like a pointer.

How to rewrite function sub_5ECin python?

Or make a memory dump and work with it.
Or just write down all the returned values.

The second method I like more, but what if, with different authorization data, the values returned are different? Check this out.

The script will help in this:

#include 
static main()
{
    auto pc_val=0, d0_val;
    while(pc_val != 0x00001F16){
        pc_val = GetRegValue("pc");
        if (pc_val == 0x00001F9C)
            StepInto();
        else
            StepOver();
        GetDebuggerEvent(WFNE_SUSP, -1);
        if (pc_val == 0x00000674){
            d0_val = GetRegValue("d0") & 0xFF;
            print(d0_val);
        }
    }
}

I just compared the outputs to the console with different keys, mails.

Running the script several times with different keys, we will see that the function sub_5ECalways returns the next value from the array:

def sub_5EC_gen():
	dump = [0x92, 0x8A, 0xDC, 0xDC, 0x94, 0x3B, 0xE4, 0xE4,
	        0xFC, 0xB3, 0xDC, 0xEE, 0xF4, 0xB4, 0xDC, 0xDE,
	        0xFE, 0x68, 0x4A, 0xBD, 0x91, 0xD5, 0x0A, 0x27,
	        0xED, 0xFF, 0xC2, 0xA5, 0xD6, 0xBF, 0xDE, 0xFA,
	        0xA6, 0x72, 0xBF, 0x1A, 0xF6, 0xFA, 0xE4, 0xE7,
	        0xFA, 0xF7, 0xF6, 0xD6, 0x91, 0xB4, 0xB4, 0xB5,
	        0xB4, 0xF4, 0xA4, 0xF4, 0xF4, 0xB7, 0xF6, 0x09,
	        0x20, 0xB7, 0x86, 0xF6, 0xE6, 0xF4, 0xE4, 0xC6,
	        0xFE, 0xF6, 0x9D, 0x11, 0xD4, 0xFF, 0xB5, 0x68,
	        0x4A, 0xB8, 0xD4, 0xF7, 0xAE, 0xFF, 0x1C, 0xB7,
	        0x4C, 0xBF, 0xAD, 0x72, 0x4B, 0xBF, 0xAA, 0x3D,
	        0xB5, 0x7D, 0xB5, 0x3D, 0xB9, 0x7D, 0xD9, 0x7D,
	        0xB1, 0x13, 0xE1, 0xE1, 0x02, 0x15, 0xB3, 0xA3,
	        0xB3, 0x88, 0x9E, 0x2C, 0xB0, 0x8F]
	l = len(dump)
	offset = 0
	while offset < l:
		yield dump[offset]
		offset += 1

So the function sub_5ECis ready.

Next in line sub_E3E.

sub_E3E

Significant piece of code:

Decrypt:

Эта конструкция выдает адрес, куда сохранять d2, который сейчас содержит входной аргумент.
Регистр a2 хранит значение 0xFF0D46, a2 + 0x34 = 0xFF0D7A
d0 = *(a2 + 0x34)
*(a2 + 0x34) = *(a2 + 0x34) + 1
Входной аргумент сохраняется, в регистре a0 адрес этого сохранения
a0 = d0
*(a0) = d2
Здесь вычисляется некий offset, который сохраняется в регистре d2.
Регистр a2 хранит значение 0xFF0D46, a2 + 0x24 = 0xFF0D6A - это место, где хранится
предыдущий результат функции(см. конец) либо 0x00000000, если функция вызывается впервые
d0 = *(a2 + 0x24)
d2 = d0 ^ d2
d2 = d2 & 0xFF
d2 = d2 + d2
Вытаскиваются какие-то 2 байта по адресу 0x00011FC0 + d2, это область ROM, поэтому
содержимое 0x00011FC0 + d2 постоянно
a0 = 0x00011FC0
d2 = *(a0 + d2)
Предыдущий результат этой функции сдвигается на 8 бит
d0 = d0 >> 8
Результат
d2 = d0 ^ d2
Записывается текущий результат функции
*(a2 + 0x24) = d2

The function sub_E3Ecomes down to the following steps:

Save the input argument to an array.
Calculate the offset offset.
Pull 2 bytes to the address 0x00011FC0 + offset(ROM).
Result = (предыдущий результат >> 8) ^ (2 байта 0x00011FC0 + offset).

Imagine the function sub_E3Ein this form:

def sub_E3E(prev_sub_E3E_result, d2, d2_storage):
	def calc_offset():
		return 2 * ((prev_sub_E3E_result ^ d2) & 0xff)
	d2_storage.append(d2)
	offset = calc_offset()
	with open("dump_00011FC0", 'rb') as f:
		dump_00011FC0_4096b = f.read()
	some = dump_00011FC0_4096b[offset:offset + 2]
	some = int.from_bytes(some, byteorder="big")
	prev_sub_E3E_result = prev_sub_E3E_result >> 8
	return prev_sub_E3E_result ^ some

dump_00011FC0Is just a file where I saved 4096 bytes from [0x00011FC0:00011FC0+4096].

Activity around 1FC4

0x00001FC4We have not yet seen the address , but it is easy to find, because the block goes almost immediately after the first cycle.

This block changes the contents by address 0x00FF0D46(register a2), and it is there that the hash of the key is stored, so we are now studying this block. Let's see what happens here.

A condition that determines selected left or right branch is: (хэш первой половины ключа) & 0b1 != 0. That is, the first bit of the hash is checked.
If you look at both branches, you will see:
- In both cases, a shift to the right by 1 bit occurs.
- An operation is performed on the hash in the left branch ИЛИ 0x8000.
- In both cases, the 0x00FF0D46processed hash value is written to the address , that is, the hash is replaced with a new value.
- Further calculations are not critical, because, roughly speaking, there are no write operations to (a2)(there is no instruction where the second operand would be (a2)).

Imagine a block like this:

def transform(hash_2b):
	new = hash_2b >> 1
	if hash_2b & 0b1 != 0:
		new = new | 0x8000
	return new

The second important loop is loc_203E

loc_203E- cycle, because 0x0000206C bne.s loc_203E.

This cycle calculates the hash and here is its main feature: jsr (a0)- it is a call to a function sub_E3Ethat we have already examined - it relies on the previous result of its own work and on some input argument (it was passed through the register above d2, and here through d0).

Let's find out what is passed to her through the register d0.

We have already met with the construction 0x34(a2) - there the function sub_E3Esaves the passed argument. This means that previously passed arguments are used in this loop. But not all.

Decrypt the code part:

Здесь берется 2 байта из адреса a2+0x1C
move.w 0x1C(a2), d0
Инвертируется
neg.l d0
В регистр a0 пишется адрес последнего сохраненного аргумента функции sub_E3E
movea.l 0x34(a2), a0
Наконец, в d0 пишутся 2 байта по адресу a0-d0(вычитание потому что d0 инвертирован)
move.b (a0, d0.l), d0

The bottom line is a simple action: at each iteration, take the d0stored argument from the end of the array. That is, if d04 is stored in , then we take the fourth element from the end.

If so, what are the specific meanings d0? Here I did without scripts, but simply wrote them out, putting a break at the beginning of this block. Here they are: 0x04, 0x04, 0x04, 0x1C, 0x1A, 0x1A, 0x06, 0x42, 0x02.

Now we have everything to write a complete key hash calculation function.

Full hash calculation function

def finish_hash(hash_2b):
	# Правило преобразования хэша
	def transform(hash_2b):
		new = hash_2b >> 1
		if hash_2b & 0b1 != 0:
			new = new | 0x8000
		return new
	main_cycle_counter = [17, 2, 2, 3, 4, 38, 10, 30, 4]
	second_cycle_counter = [2, 2, 2, 2, 2, 4, 2, 4, 28]
	counters = list(zip(main_cycle_counter, second_cycle_counter))
	d2_storage = []
	storage_offsets = [0x04, 0x04, 0x04, 0x1C, 0x1A, 0x1A, 0x06, 0x42, 0x02]
	prev_sub_E3E_result = 0x0000
	sub_5EC = sub_5EC_gen()
	for i in range(9):
		c = counters[i]
		for _ in range(c[0]):
			d0 = next(sub_5EC)
			d1 = hash_2b & 0xff
			d2 = d0 ^ d1
			curr_sub_E3E_result = sub_E3E(prev_sub_E3E_result, d2, d2_storage)
			prev_sub_E3E_result = curr_sub_E3E_result
		storage_offset = storage_offsets.pop(0)
		for _ in range(c[1]):
			d2 = d2_storage[-storage_offset]
			curr_sub_E3E_result = sub_E3E(prev_sub_E3E_result, d2, d2_storage)
			prev_sub_E3E_result = curr_sub_E3E_result
		hash_2b = transform(hash_2b)
	return curr_sub_E3E_result

Health Check

In the debugger, we put a break on the address 0x0000180A move.l 0x1000,(sp)(immediately after calculating the hash).
Break to the address 0x00001F16 beq.w loc_20EA(comparison of the final hash with a constant 0xCB4C).
In the program, enter the key ABCDEFGHIJKLMNOP, click Enter.
The debugger stops at 0x0000180A, and we see that a d5value has been written to the register 0xFFFF44CC, the 0x44CCfirst hash.
We start the debugger further.
We stop at 0x00001F16and see what 0x00FF0D6Alies at the address 0x4840- the final hash

Now check out our finish_hash (hash_2b) function:

>>> r = finish_hash(0x44CC)
>>> print(hex(r))
0x4840

We are looking for the right key 1

The correct key is this key whose final hash is equal 0xCB4C(found out above). Hence the question: what should be the first hash for the final to become 0xCB4C?

Now it’s easy to find out:

def find_CB4C():
	result = []
	for hash_2b in range(0xFFFF+1):
		final_hash = finish_hash(hash_2b)
		if final_hash == 0xCB4C:
			result.append(hash_2b)
	return result
>>> r = find_CB4C()
>>> print(r)

The output of the program suggests that there is only one option: the first hash should be 0xFEDC.

What characters do we need to have their first hash 0xFEDC?

Since 0xFEDC = хэш_первых_4_символов ^ хэш_вторых_4_символов, then you need to find only хэш_первых_4_символовbecause хэш_вторых_4_символов = хэш_первых_4_символов ^ 0xFEDC. And then decode both hashes.

The algorithm is as follows:

def get_first_half():
	from collections import deque
	from random import randint
	def get_pairs():
		pairs = []
		for i in range(0xFFFF + 1):
			pair = (i, i ^ 0xFEDC)
			pairs.append(pair)
		pairs = deque(pairs)
		pairs.rotate(randint(0, 0xFFFF))
		return list(pairs)
	pairs = get_pairs()
	for pair in pairs:
		key_4s_0 = decode_hash_4s(pair[0])
		key_4s_1 = decode_hash_4s(pair[1])
		hash_2b_0 = get_hash_2b(key_4s_0)
		hash_2b_1 = get_hash_2b(key_4s_1)
		if hash_2b_0 == pair[0] and hash_2b_1 == pair[1]:
			return key_4s_0, key_4s_1

A bunch of options, choose any.

We are looking for the right key 2

The first half of the key is ready, what about the second?

This is the easiest part.

The responsible piece of code is located on 0x00FF2012, I got to it by manual tracing, starting with the address 0x00001F16 beg.w loc_20EA(validation of the first half of the key). In the register a0is the mail address, it loc_FF2012is a cycle, because bne.s loc_FF2012. It is executed as long as there is *(a0+d0)(the next byte of mail).

And the instruction jsr (a3)calls an already familiar function get_hash_2b, which now works with the second half of the key.

Let's make the code clearer:

while(d1 != 0x20){
	Подсчитывается длина почты
	d2++
	d1 = d1 & 0xFF
	Подсчитывается сумма байтов почты
	d3 = d3 + d1
	d0 = 0
	d0 = d2
	Очередной байт почты
	d1 = *(a0+d0)
}
d0 = get_hash_2b(key_byte_8)
d3 = d0^d3
d0 = get_hash_2b(key_byte_12)
d2 = d2 - 1
d2 = d2 << 8
d2 = d0^d2
if (d2 == d3)
	success_branch

In the register d2- (длина почты-1) << 8. B d3is the sum of bytes of mail characters.

The criterion of correctness is obtained as follows: третья_четверть_ключа ^ d2 == последняя_четверть_ключа_2 ^ d3.

We write the selection function of the second half of the key:

def get_second_half(email):
	from collections import deque
	from random import randint
	def get_koeff():
		k1 = sum([ord(c) for c in email])
		k2 = (len(email) - 1) << 8
		return k1, k2
	def get_pairs(k1, k2):
		pairs = []
		for a in range(0xFFFF + 1):
			pair = (a, (a ^ k1) ^ k2)
			pairs.append(pair)
		pairs = deque(pairs)
		pairs.rotate(randint(0, 0xFFFF))
		return list(pairs)
	k1, k2 = get_koeff()
	pairs = get_pairs(k1, k2)
	for pair in pairs:
		key_4s_0 = decode_hash_4s(pair[0])
		key_4s_1 = decode_hash_4s(pair[1])
		hash_2b_0 = get_hash_2b(key_4s_0)
		hash_2b_1 = get_hash_2b(key_4s_1)
		if hash_2b_0 == pair[0] and hash_2b_1 == pair[1]:
			return key_4s_0, key_4s_1

Keygen

Mail must be a capsule.

def keygen(email):
	first_half = get_first_half()
	second_half = get_second_half(email)
	return "".join(first_half) + "".join(second_half)
>>> email = "M.GAYANOV@GMAIL.COM"
>>> print(keygen(email))
2A4FD493BA32AD75

Thanks for attention! All code is available here .

Tags: