DamonV79 October 6, 2015 at 14:08

Adding a new processor family to IDA Pro

Not long ago I had to dig into firmware for the M16C (Mitsubishi / Renesas). I was surprised to find that IDA v6.1.xxx does not support this controller family. However, the SDK is available, so there is nothing to be afraid of: we will fix the situation. As practice has shown, writing your own module is not especially difficult (it is not rocket science, after all).

Disclaimer

I am not an expert in IDA Pro or in writing modules for it. Since the real task was to analyze firmware, the module was written in a hurry. Part of the code was lifted from the SDK without understanding how it works (or whether it is needed at all). The rest of the code I worked through and "creatively" rethought.

I had no tests and no time to write them. Correctness was checked against the disassembly listing of the firmware produced by the Renesas IDE. Errors are therefore possible (and surely exist, although I did not run into any along the way). If anyone wants to write tests or improve the module, I will be glad.

Be that as it may, the module turned out to work well enough to complete the task. In this article I outline my thoughts on all of this. Use this work at your own risk.

Sources and building the plugin

The sources are here.
Since I am not particularly fond of Visual Studio (MSVC++), I used MinGW for the build and put together my own toolchain. You can get it here. The toolchain is self-contained: it includes the compiler and the build tools.

Preparation for the build is as follows: unpack IDA_Plugins.7z somewhere, clone the repository from GitHub, copy the m16c_xx directory into the root of the IDA_Plugins directory, and run build.cmd.

Introduction

Each processor module is really an ordinary DLL (with one small difference: a slightly modified DOS header) and must export a structure named LPH of type processor_t. This structure stores pointers to the key functions and data structures of the module.
Of all the fields of this structure, we are primarily interested in the following function pointers:

processor_t LPH = {
            …
    ana,        // analyze an instruction and fill the 'cmd' structure
    emu,        // emulate an instruction
    out,        // generate a text representation of an instruction
    outop,      // generate a text representation of an operand
            …
};

Practically all the work is done through these functions. The ana() function is called each time a new instruction has to be analyzed (if there is anything to analyze). Its signature:

int idaapi ana(void); //analyze one instruction and return the
                      // instruction length

The task of this function is to fetch a byte at the current instruction address and try to decode the instruction. If one byte is not enough, it keeps fetching subsequent bytes until the whole instruction and its operands are decoded. It then fills in the fields of the global variable cmd and returns the length of the instruction in bytes.
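
To make that contract concrete, here is a minimal standalone sketch that does not use the IDA SDK. The names DecodedInsn and decode_one are hypothetical. The two encodings (BRK = 0x00 and MOV.B:S R0L, dest = 0x01..0x03) are the ones shown later in this article, and the operand sizes rest on the assumption that an 8-bit displacement occupies one byte and a 16-bit absolute address occupies two.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for the part of 'cmd' that ana() fills in.
struct DecodedInsn {
  int itype = -1;        // instruction enum value, not the raw opcode
  std::size_t size = 0;  // instruction length in bytes
};

enum { INSN_BRK = 0, INSN_MOV_B_S_R0L_DEST = 1 };

// Decode one instruction starting at buf[0]. Returns its length in bytes,
// or 0 if the opcode is not recognized or the buffer is too short.
// 'out' is only modified on success.
std::size_t decode_one(const std::uint8_t *buf, std::size_t len,
                       DecodedInsn &out) {
  if (len == 0) return 0;
  const std::uint8_t op = buf[0];
  int itype;
  std::size_t need;
  if (op == 0x00) {                      // BRK: opcode only
    itype = INSN_BRK;
    need = 1;
  } else if (op >= 0x01 && op <= 0x03) { // MOV.B:S R0L, dest
    itype = INSN_MOV_B_S_R0L_DEST;
    // 0x01 and 0x02 carry an 8-bit displacement, 0x03 a 16-bit address.
    need = (op == 0x03) ? 3 : 2;
  } else {
    return 0;                            // unknown to this sketch
  }
  if (len < need) return 0;              // operand bytes are missing
  out.itype = itype;
  out.size = need;
  return need;
}
```

In a real module the bytes would come from the database through the SDK rather than from a buffer, and the decoder would also fill in the operand fields.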

The emu() function emulates an instruction. Its signature:

int idaapi emu(void); //emulate one instruction

This function is responsible for:

creating cross-references from (and to) this instruction, for both data and code;
creating stack variables (unfortunately, I have not yet figured out how this works);
and some other things.

The out() function creates and outputs the text representation of the assembler instruction. Its signature:

void idaapi out(void); //output a single disassembled instruction

The outop() function creates and outputs the text representation of an operand of the assembler instruction. Its signature:

bool idaapi outop(op_t &x); //output an operand of disassembled
                            // instruction

Instruction analysis

The source bytes are analyzed by the ana() function, which we have to implement. Its task is to read the firmware bytes sequentially and determine the instructions, their operands, and the instruction lengths.

After recognizing an instruction and its parameters, it fills in the fields of the global variable cmd, which is of type insn_t:

class insn_t { 
public: 
  ea_t cs; // Current segment base paragraph. Set by kernel 
  ea_t ip; // Virtual address of instruction (within segment).
           // Set by kernel 
  ea_t ea; // Linear address of the instruction. Set by kernel 
  uint16 itype; // instruction enum value (not opcode!).
                // Proc sets this in ana 
  uint16 size; // Size of instruction in bytes. Proc sets this in ana 
  union { 
    // processor dependent field. Proc may set this 
    uint16 auxpref; 
    struct { 
      uchar low; 
      uchar high; 
    } auxpref_chars;
  };
  char segpref;      // processor dependent field. Proc may set this 
  char insnpref;     // processor dependent field. Proc may set this 
  op_t Operands[6];  // instruction operand info. Proc sets this in ana 
  char flags;        // instruction flags. Proc may set this 
};

As the declaration shows, an instruction can have up to 6 operands, which is more than enough for us: in our case instructions have at most 3 operands.

Until now I had only written a module for a RISC controller, where everything is quite simple. You apply a mask to the opcode field of the instruction, check the additional conditions in the branches of a switch statement, and that is essentially all (the code for the Microchip PIC controller is analyzed in roughly this way; see the SDK for an example). The advantages of RISC are obvious here: a reduced instruction set and instructions of equal length. Alas, for the M16C I counted more than 300 unique instructions (it depends on the point of view: I counted unique opcodes), and they are of variable length (CISC). A switch statement is therefore not a good fit, because it is cumbersome and prone to hard-to-spot errors.
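
The mask-and-switch approach for a fixed-width instruction set can be sketched as follows. The 16-bit encoding here is invented purely for illustration (it is not the PIC encoding and not taken from the SDK): the top four bits are assumed to hold the opcode, and ToyInsn and decode_fixed are hypothetical names.

```cpp
#include <cstdint>

// Instruction kinds of an invented 16-bit fixed-width instruction set.
enum ToyInsn { TOY_NOP, TOY_LOAD, TOY_JUMP, TOY_UNKNOWN };

// Mask out the opcode field, then check extra conditions per branch.
ToyInsn decode_fixed(std::uint16_t word) {
  switch (word & 0xF000) {
    case 0x0000:
      // Additional condition: only the all-zero word is a NOP.
      return (word == 0x0000) ? TOY_NOP : TOY_UNKNOWN;
    case 0x1000:
      return TOY_LOAD;   // low 12 bits would be the operand
    case 0x2000:
      return TOY_JUMP;   // low 12 bits would be the target
    default:
      return TOY_UNKNOWN;
  }
}
```

With a handful of opcodes this stays readable. With several hundred opcodes of different lengths, every branch grows its own nested conditions, which is the problem described above.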

A small digression is needed here. Since machine code (like assembly language itself) is essentially a regular language (grammar), it should be recognizable without trouble by a finite state machine (FSM), which is essentially what the processor is.

I had no desire to write the state machine by hand, but I did have positive experience with Ragel, a finite state machine compiler that translates a special FSM description language into C, C++, Objective-C, D, Java, OCaml, Go or Ruby code. It creates no external dependencies, only self-contained source code in the chosen programming language.

Among other things, Ragel is interesting in that it allows you to parse input data "on the fly". That is, there is no need to build a large buffer that is guaranteed to contain a whole instruction and pass it to the parser. You can feed it small portions of data, down to a single byte, and the state is preserved between calls. That suits us perfectly!
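
The idea of byte-at-a-time parsing with state kept between calls can be illustrated with a small hand-written state machine. This is not Ragel output and not the code of the module; StreamDecoder and its methods are hypothetical names, and the sketch knows only the four opcodes 0x00..0x03 that appear in the examples of this article, assuming one operand byte for an 8-bit displacement and two for a 16-bit address.

```cpp
#include <cstdint>

// Push-style decoder: the caller feeds one byte per call and the object
// remembers how many operand bytes are still outstanding.
class StreamDecoder {
 public:
  // Returns the total length of the instruction completed by this byte,
  // 0 if more bytes are needed, or -1 if the opcode is unknown.
  int feed(std::uint8_t byte) {
    if (remaining_ == 0) {            // this byte is an opcode
      const int total = length_for(byte);
      if (total < 0) return -1;       // state is left untouched
      total_ = total;
      remaining_ = total - 1;
    } else {                          // this byte is an operand byte
      --remaining_;
    }
    return remaining_ == 0 ? total_ : 0;
  }

 private:
  // Instruction length by opcode, for the few opcodes this sketch knows.
  static int length_for(std::uint8_t op) {
    if (op == 0x00) return 1;                // BRK
    if (op == 0x01 || op == 0x02) return 2;  // MOV.B:S R0L, dsp:8[SB/FB]
    if (op == 0x03) return 3;                // MOV.B:S R0L, abs16
    return -1;
  }

  int total_ = 0;      // length of the instruction being decoded
  int remaining_ = 0;  // operand bytes still expected
};
```

Ragel generates this kind of state bookkeeping from the grammar description, so it does not have to be maintained by hand for hundreds of opcodes.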

The result is a kind of DSL for parsing processor instructions.

DSL for parsing commands

The advantage of this DSL over switch is primarily its linearity. There is no need to jump between the branches of the statement to understand how it works or to modify its behavior. All handling of the opcode and operands of a particular instruction is concentrated in one place. Example:

#//   0x00             0000 0000                            BRK
  M16C_BRK = 0x00 @ {
    cmd.itype = M16C_xx_BRK;
    cmd.Op1.type = o_void;
    cmd.Op1.dtyp = dt_void;
  };

Or, alternatively, an instruction with operands:

#//   0x01..0x03       0000 00DS                            MOV.B:S        R0L, DEST
  M16C_MOV_B_S_R0L_DEST = (0x01..0x03) @ {
      cmd.itype = M16C_xx_MOV_B_S_R0L_DEST;
      MakeSrcDest8(SRC_DEST_R0L, cmd.Op1);
      switch(*p & 0x03) {
        case 0x01:
          MakeSrcDest8(SRC_DEST_DSP_8_SB_, cmd.Op2);
          break;
        case 0x02:
          MakeSrcDest8(SRC_DEST_DSP_8_FB_, cmd.Op2);
          break;
        default:
          MakeSrcDest8(SRC_DEST_ABS16, cmd.Op2);
          break;
      }
    };

It seems to me that this is quite clear and convenient. cmd.itype receives one of the values of the opcodes enumeration (see ins.hpp), which later determines the text representation of the instruction, the number of operands, and how the instruction interacts with its operands. The operand fields are filled in as well.

Instruction execution emulation

The instructions themselves, their operand counts, and their effect on the operands are described in the instruc_t instructions[] array (ins.cpp). The record format is simple and intuitive:

instruc_t instructions[ ] = {
    ...
{ "ADC.B",        CF_USE1|CF_CHG2                 },
{ "ADC.W",        CF_USE1|CF_CHG2                 },
{ "ADC.B",        CF_USE1|CF_CHG2                 },
{ "ADC.W",        CF_USE1|CF_CHG2                 },
{ "ADCF.B",       CF_CHG1                         },
{ "ADCF.W",       CF_CHG1                         },
{ "ADD.B:G",      CF_USE1|CF_CHG2                 },
{ "ADD.W:G",      CF_USE1|CF_CHG2                 },
{ "ADD.B:Q",      CF_USE1|CF_CHG2                 },
    ... 
};

You can see that the "ADC.B" instruction has 2 operands: the first is only read, while the second is modified when the instruction executes. This is logical: ADC is ADdition with Carry, and the operation looks like this:

[ Syntax ]
  ADC.size src,dest 
        ^--- B, W
[ Operation ]
  dest <- src + dest + C
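
How such a feature word encodes operand access can be shown with a small standalone sketch. The X_* constants are stand-in values chosen for illustration; the real CF_* constants come from the IDA SDK headers and are not reproduced here. The function name access_of is also hypothetical.

```cpp
#include <cstdint>
#include <string>

// Stand-in flag bits for illustration only (not the SDK's CF_* values).
constexpr std::uint32_t X_USE1 = 1u << 0;  // operand 1 is read
constexpr std::uint32_t X_USE2 = 1u << 1;  // operand 2 is read
constexpr std::uint32_t X_CHG1 = 1u << 2;  // operand 1 is modified
constexpr std::uint32_t X_CHG2 = 1u << 3;  // operand 2 is modified

// Describe how operand n (1 or 2) is accessed: "r", "w", "rw" or "-".
std::string access_of(std::uint32_t feature, int n) {
  const bool used = (feature & (n == 1 ? X_USE1 : X_USE2)) != 0;
  const bool changed = (feature & (n == 1 ? X_CHG1 : X_CHG2)) != 0;
  if (used && changed) return "rw";
  if (used) return "r";
  if (changed) return "w";
  return "-";
}
```

For a table entry flagged like ADC.B above (first operand used, second changed), access_of reports "r" for operand 1 and "w" for operand 2.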

Next, the emu() function emulates the execution of the instruction itself.

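// Assumed to be the file-level variable that TouchArg() clears when a
// called function does not return.
static bool flow;

int emu( ) {
  unsigned long feature = cmd.get_canon_feature( );
  flow = !( feature & CF_STOP );  // does execution fall through?
  if( feature & CF_USE1 ) TouchArg( cmd.Op1, 1 );  // operand 1 is read
  if( feature & CF_USE2 ) TouchArg( cmd.Op2, 1 );  // operand 2 is read
  if( feature & CF_CHG1 ) TouchArg( cmd.Op1, 0 );  // operand 1 is written
  if( feature & CF_CHG2 ) TouchArg( cmd.Op2, 0 );  // operand 2 is written
  if( flow )
    ua_add_cref( 0, cmd.ea + cmd.size, fl_F);  // flow to the next instruction
  return 1;
}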

As you can see, each operand that the instruction uses or changes is passed to the TouchArg() function. This function looks as follows:

void TouchArg( op_t &x, int isload ) {
  switch ( x.type ) {
    case o_near: {
        cref_t ftype = fl_JN;
        ea_t ea = toEA(cmd.cs, x.addr);
        if ( InstrIsSet(cmd.itype, CF_CALL) )
        {
          if ( !func_does_return(ea) )
            flow = false;
          ftype = fl_CN;
        }
        ua_add_cref(x.offb, ea, ftype);
      }
      break;
    case o_imm:
      if ( !isload ) break;
      op_num(cmd.ea, x.n);
      if ( isOff(uFlag, x.n) )
        ua_add_off_drefs2(x, dr_O, OOF_SIGNED);
      break;
    case o_displ:
      if(x.dtyp == dt_byte)
    	  op_dec(cmd.ea, x.n);
      break;
    case o_mem: {
        ea_t ea = toEA( dataSeg( ),x.addr );
        ua_dodata2( x.offb, ea, x.dtyp );
        if ( !isload )
          doVar( ea );
        ua_add_dref( x.offb, ea, isload ? dr_R : dr_W );
      }
      break;
    default:
      break;
  }
}

Depending on the operand type, the function creates the corresponding code or data cross-references and sets how the operand is displayed.

Outputting the text representation of an instruction

The out() function is responsible for this. It looks like this:

void out() {
	char str[MAXSTR];  //MAXSTR is an IDA define from pro.h
	init_output_buffer(str, sizeof(str));
	OutMnem(12);       //first we output the mnemonic
	if( cmd.Op1.type != o_void )  //then there is an argument to print
		out_one_operand( 0 );
	if( cmd.Op2.type != o_void ) {  //then there is an argument to print
		out_symbol(',');
		out_symbol(' ');
		out_one_operand( 1 );
	}
	if( cmd.Op3.type != o_void ) {  //then there is an argument to print
		out_symbol(',');
		out_symbol(' ');
		out_one_operand( 2 );
	}
	term_output_buffer();
	gl_comm = 1;      //we want comments!
	MakeLine(str);    //output the line with default indentation
}

We print the text representation of the instruction and of its operands, if any, with minimal formatting.
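
The same layout logic (a padded mnemonic followed by comma-separated operands) can be sketched without the SDK. The function name format_line is hypothetical, and the padding rule is an assumption made for the sketch rather than a statement about how OutMnem() behaves.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Build one listing line: the mnemonic padded to 'width' columns, then
// the operands joined with ", ". A mnemonic without operands is returned
// unpadded.
std::string format_line(const std::string &mnem,
                        const std::vector<std::string> &ops,
                        std::size_t width = 12) {
  std::string line = mnem;
  if (ops.empty()) return line;
  if (line.size() < width)
    line.append(width - line.size(), ' ');  // pad to the operand column
  else
    line.push_back(' ');                    // keep at least one separator
  for (std::size_t i = 0; i < ops.size(); ++i) {
    if (i != 0) line += ", ";
    line += ops[i];
  }
  return line;
}
```

Looping over a list of operands avoids repeating the comma-and-space code for each operand slot, at the cost of first collecting the operand strings.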

Outputting the text representation of operands

Here the code is more interesting, but it is still quite simple:

bool idaapi outop(op_t &x) {
	ea_t ea;
	switch (x.type) {
		case o_void:
			return false;
		case o_imm:
			OutValue(x, OOF_NUMBER | OOF_SIGNED | OOFW_IMM);
			break;
		case o_displ:
			{
				OutValue(x, OOF_NUMBER | OOF_SIGNED | OOFW_IMM);
				switch (x.dtyp) {
					case dt_byte:
						break;
					case dt_word:
						out_symbol(':');
						out_symbol('8');
						break;
					case dt_dword:
						out_symbol(':');
						out_symbol('1');
						out_symbol('6');
						break;
					default:
						ea = toEA(cmd.cs, x.addr);
						if (!out_name_expr(x, ea, x.addr))
							out_bad_address(x.addr);
						break;
				}
				out_symbol('[');
				out_register(M16C_xx_RegNames[x.reg]);
				out_symbol(']');
			}
			break;
		case o_phrase:
			out_symbol('[');
			out_register(M16C_xx_RegNames[x.reg]);
			out_symbol(']');
			break;
		case o_near:
			ea = toEA(cmd.cs, x.addr);
			if (!out_name_expr(x, ea, x.addr))
				out_bad_address(x.addr);
			break;
		case o_mem:
			ea = toEA(dataSeg(), x.addr);
			if (!out_name_expr(x, ea, x.addr))
				out_bad_address(x.addr);
			break;
		case o_reg:
			out_register(M16C_xx_RegNames[x.reg]);
			break;
		default:
			warning("out: %a: bad optype %d", cmd.ea, x.type);
		break;
	}
	return true;
}

Depending on the operand type, we format its output accordingly.
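
The per-type dispatch can be reduced to a standalone sketch that returns a string instead of calling the SDK output functions. OpKind, Operand and format_operand are hypothetical names, and only four operand kinds are covered; name lookup for addresses is left out.

```cpp
#include <string>

// Simplified operand kinds, loosely mirroring o_reg, o_imm, o_phrase
// and o_displ from the listing above.
enum class OpKind { Reg, Imm, Phrase, Displ };

struct Operand {
  OpKind kind;
  std::string reg;  // register name (unused for Imm)
  int value;        // immediate or displacement (unused for Reg, Phrase)
};

// Render one operand as text.
std::string format_operand(const Operand &op) {
  switch (op.kind) {
    case OpKind::Reg:     // plain register
      return op.reg;
    case OpKind::Imm:     // signed number
      return std::to_string(op.value);
    case OpKind::Phrase:  // register indirect: [reg]
      return "[" + op.reg + "]";
    case OpKind::Displ:   // displacement plus base register: disp[reg]
      return std::to_string(op.value) + "[" + op.reg + "]";
  }
  return "?";             // not reached for valid kinds
}
```

For example, a displacement of 5 relative to SB is rendered as 5[SB], which matches the bracket placement in the o_displ branch of outop().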

Known shortcomings and problems

At the moment there is one significant oddity. When you try to create a string (ASCII-Z String) in undefined data, the string is created only up to an address that is a multiple of four bytes, even if no zero byte has been encountered yet. If a zero byte occurs earlier, the string ends at it. Arrays have the same problem.

The most unpleasant part is that I do not even know where to start digging. Can anyone give me a hint?

Conclusion

Thus, writing plugins for IDA Pro is not very difficult. Everything is simple, although with a large number of instructions in the target assembler it becomes rather tedious. Introducing a DSL for parsing opcodes and operands greatly simplifies and speeds up development.
