IDA C++ SDK 9.2
We can imagine a virtual micro machine that executes microcode. This virtual micro machine has many registers. Each register is 8 bits wide. During translation of processor instructions into microcode, multibyte processor registers are mapped to adjacent microregisters. Processor condition codes are also represented by microregisters. The microregisters are grouped into the following groups:
Each microinstruction (minsn_t) has zero to three operands. Some of the possible operand types are:
The operands (mop_t) are l (left), r (right), d (destination). An example of a microinstruction:
add r0.4, #8.4, r2.4
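which means 'add constant 8 to r0 and place the result into r2'. Here r0.4 is the left operand, #8.4 is the right operand, and r2.4 is the destination operand; the .4 suffix gives each operand's size in bytes.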
Each operand has a size specifier. The following sizes can be used in practically all contexts: 1, 2, 4, 8, 16 bytes. Floating-point types may have other sizes. Functions may return objects of arbitrary size, and operations on UDTs (user-defined types, i.e. structs and unions) may likewise involve arbitrary sizes.
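The operand layout described above can be sketched with a small standalone model. This is only an illustration: ToyOperand, ToyInsn and make_example are hypothetical names and are not the SDK's mop_t or minsn_t, which have many more fields. The sketch shows how three optional operands (l, r, d), each with a size in bytes, yield the textual form used in the example.

```cpp
#include <cassert>
#include <cstdint>
#include <initializer_list>
#include <string>

// Toy stand-ins for illustration only; NOT the SDK's mop_t / minsn_t.
enum class ToyOpKind { Empty, Register, Constant };

struct ToyOperand {
  ToyOpKind kind;
  std::uint64_t value;  // register number or constant value
  int size;             // operand size in bytes

  // Render as "r<N>.<size>" or "#<value>.<size>"; empty operands render as "".
  std::string text() const {
    if (kind == ToyOpKind::Empty) return "";
    assert(size > 0);
    const char *prefix = (kind == ToyOpKind::Register) ? "r" : "#";
    return prefix + std::to_string(value) + "." + std::to_string(size);
  }
};

struct ToyInsn {
  std::string opcode;
  ToyOperand l, r, d;  // left, right, destination

  // Join the opcode and the non-empty operands, in l, r, d order.
  std::string text() const {
    std::string out = opcode;
    const char *sep = " ";
    for (const ToyOperand *op : {&l, &r, &d}) {
      if (op->kind == ToyOpKind::Empty) continue;
      out += sep + op->text();
      sep = ", ";
    }
    return out;
  }
};

// Build the instruction from the example: add r0.4, #8.4, r2.4
ToyInsn make_example() {
  return ToyInsn{"add",
                 {ToyOpKind::Register, 0, 4},
                 {ToyOpKind::Constant, 8, 4},
                 {ToyOpKind::Register, 2, 4}};
}
```

Calling make_example().text() rebuilds the example string from its three operands. An instruction with fewer operands simply leaves the unused slots as ToyOpKind::Empty.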
Memory is considered to consist of several segments. A memory reference is made using a (selector, offset) pair. A selector is always 2 bytes long. An offset can be 4 or 8 bytes long, depending on the bitness of the target processor. Currently the selectors are not used very much. The decompiler tries to resolve (selector, offset) pairs into direct memory references at each opportunity and then operates on mop_v operands. In other words, while the decompiler can handle segmented memory models, internally it still uses simple linear addresses.
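As a rough illustration of the idea of turning a (selector, offset) pair into a linear address, consider the standalone sketch below. It does not reflect how the decompiler performs this resolution internally. ToySegRef, ToySelectorTable and resolve are hypothetical names, and the table mapping selectors to base addresses is an assumption made for the example.

```cpp
#include <cassert>
#include <cstdint>
#include <map>

// Toy memory reference: a 2-byte selector plus an offset.
// Hypothetical type for illustration, not an SDK structure.
struct ToySegRef {
  std::uint16_t selector;
  std::uint64_t offset;
};

// Assumed lookup table: selector -> linear base address of the segment.
using ToySelectorTable = std::map<std::uint16_t, std::uint64_t>;

// Try to resolve the pair into a linear address.
// Returns false (leaving *linear untouched) if the selector is unknown.
bool resolve(const ToySelectorTable &bases, const ToySegRef &ref,
             std::uint64_t *linear) {
  assert(linear != nullptr);
  ToySelectorTable::const_iterator it = bases.find(ref.selector);
  if (it == bases.end()) return false;
  *linear = it->second + ref.offset;
  return true;
}
```

When the selector is known, the pair collapses into a single linear address. When it is not, the reference has to stay in its (selector, offset) form.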
The following memory regions are recognized:
If the operand size is greater than 1, the register operand refers to a block of adjacent registers. For example:
ldc #1.4, r8.4
loads the constant 1 to registers 8, 9, 10, 11:
#1 -> r8
#0 -> r9
#0 -> r10
#0 -> r11
This example uses little-endian byte ordering. Big-endian byte ordering is supported too. Registers are always little-endian, regardless of the memory endianness.
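The mapping of a multibyte value onto a block of one-byte registers can be sketched as follows. This is a standalone model, not SDK code: ToyRegFile, store_le and load_le are hypothetical names, and the register count of 64 is arbitrary. It shows the little-endian layout described above, with the least significant byte in the lowest-numbered register.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Toy register file of one-byte microregisters (size chosen arbitrarily).
using ToyRegFile = std::array<std::uint8_t, 64>;

// Store the low `size` bytes of `value` into registers reg .. reg+size-1,
// least significant byte first (little-endian).
void store_le(ToyRegFile &regs, std::size_t reg, std::uint64_t value,
              std::size_t size) {
  assert(size <= 8 && reg + size <= regs.size());
  for (std::size_t i = 0; i < size; ++i)
    regs[reg + i] = static_cast<std::uint8_t>(value >> (8 * i));
}

// Read `size` bytes starting at register `reg` back into one integer.
std::uint64_t load_le(const ToyRegFile &regs, std::size_t reg,
                      std::size_t size) {
  assert(size <= 8 && reg + size <= regs.size());
  std::uint64_t value = 0;
  for (std::size_t i = 0; i < size; ++i)
    value |= static_cast<std::uint64_t>(regs[reg + i]) << (8 * i);
  return value;
}
```

With a zero-initialized register file, store_le(regs, 8, 1, 4) mirrors the ldc #1.4, r8.4 example: register 8 receives 1 and registers 9 to 11 receive 0.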
Each instruction has 'next' and 'prev' fields that are used to form a doubly linked list. Such lists are present for each basic block (mblock_t). Basic blocks have other attributes, including:
These lists are represented by the mlist_t class. It consists of 2 parts:
All basic blocks of the decompiled function constitute an array called mba_t (array of microblocks). This is a huge class that has too many fields to describe here (some of the fields are not visible in the SDK). The most important ones are:
Facilities for debugging decompiler plugins: Many decompiler objects have a member function named dstr(). These functions create a text representation of the object and return a pointer to it. They are very convenient to use in a debugger instead of inspecting class fields manually. The mba_t object does not have a dstr() function because its text representation is very long. Instead, we provide the mba_t::dump_mba() and mba_t::dump() functions.
To ensure that your plugin manipulates the microcode in a correct way, please call mba_t::verify() before returning control to the decompiler.