How to decode IA-32 instructions?

blueshogun96

A lowdown dirty shame
This is something I've been trying to figure out for weeks, almost a month. How do you decode x86 (IA-32) instructions in C? I've been searching Google for ages and haven't been able to make sense of the Intel docs. Does anyone have any links or ideas? Thanks.

http://www.intel.com
 

dwx

New member
I already tried to disassemble (decode) x86 opcodes and it is a real pain!
Get some source code from an x86 disassembler; a lot of them are open source.

bye dwx
 

Exophase

Emulator Developer
I'm kinda rusty, but I'll see what I can glean from some old x86 emitters I wrote in NASM macros a few years ago.

I'm just going to talk about the basic instruction set for the 80386 (i.e., no extended instructions). Note that this assumes the instructions run in 32-bit protected mode.

First, you have prefix bytes. These can change the operand size or the memory address size from the current mode's default to the other one (e.g., in 32-bit protected mode the default size is 32 bits, so these prefixes switch to 16 bits). Other prefixes repeat string instructions (REP/REPNE), lock the bus for an atomic read-modify-write (LOCK), or override the default segment register with another, but in flat protected-mode code the size-changing ones are the ones that matter most, and these are:

0x66: Treat arithmetic operands as in the opposite mode, e.g. add eax, 12345 really uses ax, and the immediate is 16 bits long instead of 32.

0x67: Treat memory operands as in the opposite mode, e.g. add eax, [ebx + 12345] uses bx instead of ebx and a 16-bit displacement instead of a 32-bit one. (16-bit addressing also uses a different Mod R/M table and has no SIB byte, so the rest of this post only covers the 32-bit forms.)

There can be zero or more prefix bytes. I don't know the exact limit on the prefixes alone, but the overall instruction cannot be more than 15 bytes.
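
A minimal sketch of the prefix scan described above, in C. It assumes 32-bit protected-mode defaults; the names `PrefixInfo`, `is_prefix` and `scan_prefixes` are made up for this example and do not come from any library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_INSN_LEN 15  /* limit on the total length of one instruction */

/* Hypothetical result type: effective sizes after the prefixes are applied. */
typedef struct {
    int    operand_size;  /* 4 by default, 2 if 0x66 was seen */
    int    address_size;  /* 4 by default, 2 if 0x67 was seen */
    size_t count;         /* number of prefix bytes consumed  */
} PrefixInfo;

static int is_prefix(uint8_t b)
{
    switch (b) {
    case 0xF0: case 0xF2: case 0xF3:             /* LOCK, REPNE, REP    */
    case 0x26: case 0x2E: case 0x36: case 0x3E:  /* ES, CS, SS, DS      */
    case 0x64: case 0x65:                        /* FS, GS              */
    case 0x66: case 0x67:                        /* size overrides      */
        return 1;
    default:
        return 0;
    }
}

/* Scans the prefix bytes at the start of code[0..len). */
static PrefixInfo scan_prefixes(const uint8_t *code, size_t len)
{
    PrefixInfo info = { 4, 4, 0 };
    assert(code != NULL);
    while (info.count < len && info.count < MAX_INSN_LEN &&
           is_prefix(code[info.count])) {
        if (code[info.count] == 0x66) info.operand_size = 2;
        if (code[info.count] == 0x67) info.address_size = 2;
        info.count++;
    }
    return info;
}
```

After the call, `code[info.count]` is the first opcode byte, and the two size fields say how wide the immediates and displacements that follow will be.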

Then you have the opcode. This is one or two bytes (two-byte opcodes start with 0x0F) and says what instruction it is. Some opcodes also have a register operand built into their low 3 bits; these are special short-form instructions.

If there are any operands not implicit in the instruction itself you then have a Mod R/M byte which specifies the operands. If there are two such operands then at least one of them is always in a register (for single-operand and immediate forms, the register field is used as extra opcode bits instead). The top two bits specify where the other operand (the non-register one) is.

0 means it's at a memory location addressed by a register with no displacement.
1 means it's at a memory location addressed by a register + 8-bit sign-extended displacement.
2 means it's at a memory location addressed by a register + 32-bit displacement (or 16-bit if prefix 0x67 is used).
3 means it's in a register.

The remaining 6 bits specify the registers to use. The upper 3 bits (the reg field) specify the always-register operand and use the standard register encoding, which goes eax, ecx, edx, ebx, esp, ebp, esi, edi. The lower 3 bits (the r/m field) specify the memory base register, or the second register in mode 3; however, ebp and esp have special usage. ebp in the no-displacement mode (mode 0) means a direct address: a 16/32-bit displacement follows and is used as the memory address directly, with no base register.
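
To make the bit layout concrete, here is a small sketch that splits a Mod R/M byte into its three fields and flags the two special cases mentioned above. The `ModRM` struct and the function names are hypothetical, chosen only for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical holder for the three Mod R/M fields. */
typedef struct {
    uint8_t mod;  /* bits 7..6: addressing mode (0..3)              */
    uint8_t reg;  /* bits 5..3: register, or extra opcode bits      */
    uint8_t rm;   /* bits 2..0: base register, or register if mod=3 */
} ModRM;

static ModRM split_modrm(uint8_t byte)
{
    ModRM m;
    m.mod = (uint8_t)((byte >> 6) & 0x3);
    m.reg = (uint8_t)((byte >> 3) & 0x7);
    m.rm  = (uint8_t)(byte & 0x7);
    return m;
}

/* Nonzero for mod=0, rm=5 (the ebp slot): a bare 32-bit address follows. */
static int modrm_is_disp32_only(ModRM m)
{
    assert(m.mod <= 3 && m.rm <= 7);
    return m.mod == 0 && m.rm == 5;
}

/* Nonzero when a SIB byte follows: memory operand with rm=4 (the esp slot). */
static int modrm_needs_sib(ModRM m)
{
    assert(m.mod <= 3 && m.rm <= 7);
    return m.mod != 3 && m.rm == 4;
}
```

For example, 0x83 is binary 10 000 011, so it splits into mod = 2, reg = 0 (eax) and rm = 3 (ebx), which is the `[ebx + disp32]` form.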

If it's esp then a special memory address mode is used, and the next byte is called the SIB (Scale, Index, Base) byte. The SIB byte allows you to create a complex memory address that adds a base register, an index register multiplied by a scale value, and a displacement together to form the address. The Mod R/M byte still determines whether there's a displacement or not.

The top two bits of the SIB byte determine the scale; it can be a value from 0 to 3, which means scale by 1, 2, 4, or 8. Of the next 6 bits, the top 3 specify the index register and the bottom 3 the base register. Using esp as the index register means no index register is used. Similarly, ebp as the base register in mode 0 means there is no base register and a 32-bit displacement follows instead. The only reason you'd want to have no scaled index is so you can address relative to esp.
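
The SIB layout can be sketched the same way. This is only an illustration: `Sib`, `decode_sib` and the two sentinel macros are names invented here. The "base 5 with mode 0 means no base" rule is taken from the Intel manual's 32-bit addressing table.

```c
#include <assert.h>
#include <stdint.h>

#define SIB_NO_INDEX (-1)  /* hypothetical sentinel: no index register */
#define SIB_NO_BASE  (-1)  /* hypothetical sentinel: no base register  */

typedef struct {
    int scale;  /* 1, 2, 4 or 8                     */
    int index;  /* register number, or SIB_NO_INDEX */
    int base;   /* register number, or SIB_NO_BASE  */
} Sib;

/* mod is the 2-bit mode field of the preceding Mod R/M byte (0..2). */
static Sib decode_sib(uint8_t sib, int mod)
{
    Sib s;
    assert(mod >= 0 && mod <= 2);
    s.scale = 1 << ((sib >> 6) & 0x3);
    s.index = (sib >> 3) & 0x7;
    s.base  = sib & 0x7;
    if (s.index == 4)
        s.index = SIB_NO_INDEX;   /* the esp slot means "no index"      */
    if (s.base == 5 && mod == 0)
        s.base = SIB_NO_BASE;     /* a disp32 follows in place of ebp   */
    return s;
}
```

So 0x24 (binary 00 100 100) decodes to scale 1, no index, base esp, which is how a plain `[esp]` operand is encoded.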

Next comes the 8, 16, or 32bit memory displacement as specified in the Mod R/M byte if it has one.

Finally, there's an 8, 16, or 32bit arithmetic immediate if the instruction has one (this is implicit to the opcode).

So if you want to decode an x86 instruction, first you should grab all the prefix bytes. Then you get the first byte of the opcode, and if it's a 2-byte one get the second one. Then if the instruction has non-implicit, non-immediate operands get the Mod R/M byte. If the address mode is memory and the r/m field is 4 (esp), get the SIB. If there's a memory displacement get it, and if there's an immediate operand get that.
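
The Mod R/M, SIB and displacement steps of that walk-through can be sketched as a single length calculation. This assumes 32-bit addressing (no 0x67 prefix) and that the caller has checked the buffer is long enough; `modrm_operand_length` is a hypothetical helper name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns the number of bytes taken by the Mod R/M byte, the optional SIB
   byte and the optional displacement. p points at the Mod R/M byte. */
static size_t modrm_operand_length(const uint8_t *p)
{
    uint8_t mod, rm;
    size_t len = 1;                 /* the Mod R/M byte itself */

    assert(p != NULL);
    mod = (uint8_t)(p[0] >> 6);
    rm  = (uint8_t)(p[0] & 0x7);

    if (mod == 3)
        return len;                 /* register operand: nothing follows */

    if (rm == 4) {
        len += 1;                   /* SIB byte */
        if (mod == 0 && (p[1] & 0x7) == 5)
            len += 4;               /* SIB with no base: disp32 */
    } else if (mod == 0 && rm == 5) {
        len += 4;                   /* direct address: disp32 */
    }

    if (mod == 1) len += 1;         /* disp8  */
    if (mod == 2) len += 4;         /* disp32 */
    return len;
}
```

Adding the prefix count, the opcode length and the immediate size (which depends on the opcode) to this value gives the total instruction length.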

If you're disassembling, determine the names of each of the prefix bytes, then the instruction; if the instruction has an implicit operand grab it, then determine the other operands (display them based on the address mode). I'd recommend looking up the names of things using a lookup table.
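
As a rough illustration of the lookup-table idea, the register names can be indexed directly by their 3-bit encoding. The table and function names here are invented for the example; the 8-bit ordering (al, cl, dl, bl, ah, ch, dh, bh) is the standard one from the Intel manual rather than something stated in this post.

```c
#include <assert.h>

/* Register names indexed by the 3-bit register code, one table per size. */
static const char *const reg32_names[8] =
    { "eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi" };
static const char *const reg16_names[8] =
    { "ax", "cx", "dx", "bx", "sp", "bp", "si", "di" };
static const char *const reg8_names[8] =
    { "al", "cl", "dl", "bl", "ah", "ch", "dh", "bh" };

/* size_bytes is the operand size: 1, 2, or 4 (anything else is treated as 4). */
static const char *reg_name(int code, int size_bytes)
{
    assert(code >= 0 && code < 8);
    switch (size_bytes) {
    case 1:  return reg8_names[code];
    case 2:  return reg16_names[code];
    default: return reg32_names[code];
    }
}
```

A disassembler would call this with the reg or r/m field and the operand size left over after the 0x66 prefix check. Opcode mnemonics can be handled with a similar 256-entry table.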

That should cover most of the basics, hope I didn't get anything wrong. Let me know if you have any questions. People complain about x86 a lot but if you ask me some aspects of the instruction encoding are more straightforward than say, ARM (and others are a lot worse..)
 
blueshogun96

A lowdown dirty shame
Open source x86 disassemblers? Why didn't I think of that? I'll try that. Also, Exophase, that was a really good explanation. I'm gonna have to read it again a few times so it all sinks in, though. Thanks.
 

Lightning

Emulator Developer
All the details are in the following manuals

http://www.intel.com/design/pentium4/manuals/index_new.htm

IA-32 Intel® Architecture Software Developer's Manual, Volume 2A and Volume 2B.

Looking at chapter 2.1 of Volume 2A, the byte decode is defined.

Example: MOV EBX, 05040302h is B8 +rd. Looking up +rd in Chapter 3:

+rb, +rw, +rd, +ro — A register code, from 0 through 7, added to the hexadecimal byte given at the left of the plus sign to form a single opcode byte. See Table 3-1 for the codes. The +ro columns in the table are applicable only in 64-bit mode.

So, EBX = 3, B8 + 3 = BB. Put in the value in little-endian order (lowest byte first): 02 03 04 05. The result is BB 02 03 04 05.
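
The same arithmetic written out as a small C sketch. `encode_mov_r32_imm32` is a hypothetical helper, not part of any library, and it only covers this one instruction form.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Writes the 5-byte encoding of MOV r32, imm32 into out and returns its
   length. reg is the register code (EAX = 0 ... EDI = 7). */
static size_t encode_mov_r32_imm32(uint8_t out[5], unsigned reg, uint32_t imm)
{
    assert(out != NULL && reg < 8);
    out[0] = (uint8_t)(0xB8 + reg);           /* B8 +rd              */
    out[1] = (uint8_t)(imm & 0xFF);           /* lowest byte first   */
    out[2] = (uint8_t)((imm >> 8) & 0xFF);
    out[3] = (uint8_t)((imm >> 16) & 0xFF);
    out[4] = (uint8_t)((imm >> 24) & 0xFF);
    return 5;
}
```

Calling it with reg = 3 and imm = 0x05040302 fills the buffer with BB 02 03 04 05. Decoding is the reverse: an opcode byte in the range B8 to BF means MOV r32, imm32, with the register code in the low 3 bits.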
 
blueshogun96

A lowdown dirty shame
First of all, sorry it's been so long since a reply, my online time has been cut short lately. Thanks for your explanation. I appreciate it.
 
blueshogun96

A lowdown dirty shame
OK, earlier today I was working on this PIII core a lot. Based on the information found in the Intel docs, the examples by Exophase and Lightning, and an example x86 decoder written by Nick Capens from DevMaster.net, I was able to come up with this.

Here's my header file (Yes = 1, No = 0)
Code:
/**********************************************************************\
	File:	xbox_cpu.h
	Desc:	Contains Pentium III related code.
	Author:	blueshogun96
\**********************************************************************/

#ifndef XboxCPU_h
#define XboxCPU_h

#define LITTLE_ENDIAN	Yes

typedef enum
{
	EAX = 0,
	ECX = 1,
	EDX = 2,
	EBX = 3,
	ESP = 4,
	EBP = 5,
	ESI = 6,
	EDI = 7
}X86_REG32;

typedef enum
{
	ES = 0,
	CS = 1, 
	SS = 2, 
	DS = 3, 
	FS = 4, 
	GS = 5
}X86_SREG;

union PIIIReg
{
	uint32 d;
	uint16 w;

	struct _b
	{
#if LITTLE_ENDIAN
		uint08 l;
		uint08 h;
#else
		uint08 h;
		uint08 l;
#endif
	}b;
};

struct PIIISReg
{
	uint32 base;
	uint16 selector;
	uint32 limit;
	uint32 d;
};

struct PIIITReg
{
	uint32 base;
	uint16 limit;
};

struct PIIISRegDesc
{
	uint32 base;
	uint16 selector;
	uint32 limit;
};

union X87Reg
{
	uint64 i;
	double f;
};

struct cpuc
{
	// IA-32 registers (Pentium III)

	union PIIIReg regs[8];	// General purpose registers
	struct PIIISReg sregs[6];	// Segment registers
	struct PIIITReg gdtr, idtr;
	struct PIIISRegDesc ldtr, task_reg;
	uint32 eip;			// Instruction pointer
	uint32 last_eip;	
	uint64 mm[8];	// MMX registers
	uint32 cr[5];	// Control registers
	uint32 dr[8];	// Debug registers
	uint32 tr[8];	// Test registers
	union X87Reg st[8];	// FPU registers
	uint16 fpu_status_word;
	uint16 fpu_control_word;
	uint16 fpu_tag_word;
	uint64 fpu_data_operand_ptr;
	uint64 fpu_inst_ptr;

	struct uint128 xmm[8];	// SSE registers
	uint32 mxcsr;	// SSE (control and status register)

	uint32 pc;	// Alternate program counter

	// Flags
	uint32 eflags;
	uint08 CF, PF, AF, ZF, SF, TF;
	uint08 IF, DF, OF, IOPL, NT;
	uint08 RF, VM, AC, VF, VP, ID;

	uint32 halted;

	sint32 cycles, base_cycles;
	uint08 opcode;
};

struct cpuc PIIIContext;

#define p PIIIContext

// 32-bit register defines
#define REG_EAX p.regs[EAX].d
#define REG_ECX p.regs[ECX].d
#define REG_EDX p.regs[EDX].d
#define REG_EBX p.regs[EBX].d
#define REG_ESP p.regs[ESP].d
#define REG_EBP p.regs[EBP].d
#define REG_ESI p.regs[ESI].d
#define REG_EDI p.regs[EDI].d
#define REG_EIP p.eip

// 16-bit register defines
#define REG_AX p.regs[EAX].w
#define REG_CX p.regs[ECX].w
#define REG_DX p.regs[EDX].w
#define REG_BX p.regs[EBX].w
#define REG_SP p.regs[ESP].w
#define REG_BP p.regs[EBP].w
#define REG_SI p.regs[ESI].w
#define REG_DI p.regs[EDI].w

// 8-bit register defines
#define REG_AH p.regs[EAX].b.h
#define REG_AL p.regs[EAX].b.l
#define REG_CH p.regs[ECX].b.h
#define REG_CL p.regs[ECX].b.l
#define REG_DH p.regs[EDX].b.h
#define REG_DL p.regs[EDX].b.l
#define REG_BH p.regs[EBX].b.h
#define REG_BL p.regs[EBX].b.l

// Segment registers
#define SREG_ES p.sregs[ES]
#define SREG_CS p.sregs[CS]
#define SREG_SS p.sregs[SS]
#define SREG_DS p.sregs[DS]
#define SREG_FS p.sregs[FS]
#define SREG_GS p.sregs[GS]

// Flag macros (x is an EFLAGS-style value; each macro copies one bit of it)
// Arguments are parenthesized so expressions like Set_CF( a | b ) expand safely
#define Set_CF( x ) ( p.CF = ( (x) & 0x1 ) ? 1 : 0 )
#define Set_PF( x ) ( p.PF = ( (x) & 0x4 ) ? 1 : 0 )
#define Set_AF( x ) ( p.AF = ( (x) & 0x10 ) ? 1 : 0 )
#define Set_ZF( x ) ( p.ZF = ( (x) & 0x40 ) ? 1 : 0 )
#define Set_SF( x ) ( p.SF = ( (x) & 0x80 ) ? 1 : 0 )
#define Set_TF( x ) ( p.TF = ( (x) & 0x100 ) ? 1 : 0 )
#define Set_IF( x ) ( p.IF = ( (x) & 0x200 ) ? 1 : 0 )
#define Set_DF( x ) ( p.DF = ( (x) & 0x400 ) ? 1 : 0 )
#define Set_OF( x ) ( p.OF = ( (x) & 0x800 ) ? 1 : 0 )

// Mod R/M field extraction
#define MOD( d ) ( ( (d) >> 6 ) & 0x3 )
#define REG( d ) ( ( (d) >> 3 ) & 0x7 )
#define RM( d ) ( (d) & 0x7 )

#define IA32OP(n) ia32op_##n


void CPU_Reset( uint entry_point );
void CPU_ExecuteInstruction( uint08* ptr );
void CPU_SetEIPReg( uint32 eip );
// void CPU_RaiseException <- TODO

#endif

And here's my source file

Code:
/**********************************************************************\
	File:	xbox_cpu.c
	Desc:	Contains Pentium III related code.
	Author:	blueshogun96
\**********************************************************************/

#include "xenoborg.h"
#include "xbox_cpu.h"



//***********************************************
// Name: CPU_SetEIPReg
// Desc: Sets the EIP (Instruction Pointer) register
//		 to whatever you want/need it to be.
//***********************************************
INLINE void CPU_SetEIPReg( uint32 eip )
{
	REG_EIP = eip;
}

//***********************************************
// Name: CPU_Reset
// Desc: Resets the CPU regs to their default values
// Notes: See Intel's docs (IA-32 programmers guide,
//		  Vol 3A: System programming guide Part 1,
//        page 377) for more details.  I'm assuming
//		  that the Xbox does the same as a regular 
//		  Pentium III.  The only real difference here
//		  is that I set the EIP register to the xbe's
//		  program entry point.  Make any corrections 
//		  you see fit.  kthnx.
//***********************************************
INLINE void CPU_Reset( uint entry_point )
{
	// Clear context
	memset( &p, 0x0, sizeof( p ) );

	// Reset flags and regs
	p.eflags = 0x00000002;
	p.eip = 0xFFF0;		// Hardware reset value (overridden with entry_point below)
	p.cr[0] = 0x60000010;

	SREG_CS.selector = 0xF000;
	SREG_CS.base = 0xFFFF0000;
	SREG_CS.limit = 0xFFFF;
	SREG_ES.limit = 0xFFFF;
	SREG_SS.limit = 0xFFFF;
	SREG_DS.limit = 0xFFFF;
	SREG_FS.limit = 0xFFFF;
	SREG_GS.limit = 0xFFFF;

	//REG_EDX = 0x000n06xx;		<- TODO

	p.fpu_control_word = 0x0040;
	p.fpu_tag_word = 0x5555;

	p.mxcsr = 0x1F80;

	p.gdtr.limit = 0xFFFF;
	p.ldtr.limit = 0xFFFF;
	p.idtr.limit = 0xFFFF;
	p.task_reg.limit = 0xFFFF;

	p.dr[6] = 0xFFFF0FF0;
	p.dr[7] = 0x00000400;

#if Yes
	CPU_SetEIPReg( entry_point );
#endif

	// TODO: perform any other cpu reset things here
}

//***********************************************
// Name: CPU_ExecuteInstruction
// Desc: Emulates a single IA-32 instruction
// Notes: This core is "partially" based off of
//		  Nick Capens's x86 decoder source code
//		  (from DevMaster.net).
//***********************************************
INLINE void CPU_ExecuteInstruction( uint08* ptr )	// Biiiiiiiiig TODO! 
{
	int OperandSize = 4;	// Operand Size (in bytes)
	int AddressSize = 4;	// Address Size (in bytes)
	int FPU = 0;
	int twoByte = No;		// Two byte instruction?
	uint08* func = ptr;		// Binary buffer pointer
	uint08 opcode;			// Opcode byte
	uint08 modRM;			// Mod R/M byte
	uint08 sib;				// SIB (Scale, Index, Base) byte
	uint32 displacement;	// Displacement byte(s)
	uint32 immediate;		// Immediate byte(s)

	// Copy the current value of EIP into the alternate program counter
	// The EIP reg will be updated later on
	p.pc = p.last_eip = REG_EIP;

	// TODO: handle INT3 (0xCC) breakpoints here; for now it is decoded
	// like any other one-byte opcode

	//--------------------
	// *PREFIXES*
	//--------------------

	// Get prefix bytes.  The x87 escape bytes (0xD8-0xDF) are not real
	// prefixes, but they must be matched here or the FPU branch inside
	// the loop can never be reached.
	while( func[p.pc] == 0xF0 ||	// LOCK
		   func[p.pc] == 0xF2 ||	// REPNE/REPNZ
		   func[p.pc] == 0xF3 ||	// REP or REPE/REPZ
		   func[p.pc] == 0x2E ||	// CS segment override
		   func[p.pc] == 0x36 ||	// SS segment override prefix
		   func[p.pc] == 0x3E ||	// DS segment override prefix
		   func[p.pc] == 0x26 ||	// ES segment override prefix
		   func[p.pc] == 0x64 ||	// FS segment override prefix
		   func[p.pc] == 0x65 ||	// GS segment override prefix
		   func[p.pc] == 0x66 ||	// Operand-size override prefix
		   func[p.pc] == 0x67 ||	// Address-size override prefix
		   ( func[p.pc] & 0xF8 ) == 0xD8 )	// x87 FPU escape (D8h-DFh)
	{
		if( func[p.pc] == 0x66 )
		{
			OperandSize = 2;
		}
		else if( func[p.pc] == 0x67 )
		{
			AddressSize = 2;
		}
		else if( ( func[p.pc] & 0xF8 ) == 0xD8 )
		{
			// Remember the escape byte; the next byte is read as the opcode
			FPU = func[p.pc++];
			break;
		}

		p.pc++;
	}

	//--------------------
	// *OPCODE*
	//--------------------

	// Check for 2 byte opcode
	if( func[p.pc] == 0x0F )
	{
		twoByte = Yes;
		p.pc++;
	}

	// Get opcode byte
	opcode = func[p.pc++];

	//--------------------
	// *MOD R/M*
	//--------------------

	modRM = 0xFF;

	if( FPU )
	{
		if( ( opcode & 0xC0 ) != 0xC0 )
		{
			modRM = opcode;
		}
	}
	else if( !twoByte )
	{
		if( ( opcode & 0xC4 ) == 0x00 || 
        ( opcode & 0xF4 ) == 0x60 && ( ( opcode & 0x0A ) == 0x02 || ( opcode & 0x09 ) == 0x9 ) || 
        ( opcode & 0xF0 ) == 0x80 || 
        ( opcode & 0xF8 ) == 0xC0 && ( opcode & 0x0E ) != 0x02 || 
        ( opcode & 0xFC ) == 0xD0 || 
        ( opcode & 0xF6 ) == 0xF6 ) 
        {
			modRM = func[p.pc++]; 
        }
	}
	else
	{
		if( ( opcode & 0xF0 ) == 0x00 && ( opcode & 0x0F ) >= 0x04 && ( opcode & 0x0D ) != 0x0D || 
            ( opcode & 0xF0 ) == 0x30 || 
              opcode == 0x77 || 
            ( opcode & 0xF0 ) == 0x80 || 
            ( opcode & 0xF0 ) == 0xA0 && ( opcode & 0x07 ) <= 0x02 || 
            ( opcode & 0xF8 ) == 0xC8 ) 
        { 
            // No mod R/M byte 
        } 
        else 
        { 
            modRM = func[p.pc++]; 
        } 
    } 

	//--------------------
	// *SIB*
	//--------------------

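	// Read SIB byte (present when mod != 3 and r/m == 4)
	sib = 0;
	if( ( modRM & 0x7 ) == 0x4 && ( modRM & 0xC0 ) != 0xC0 )
	{
		sib = func[p.pc++];

		// mod == 0 with SIB base == 5 means "disp32, no base register"
		if( ( modRM & 0xC0 ) == 0x00 && ( sib & 0x7 ) == 0x5 )
			p.pc += 4;
	}

	//--------------------
	// *DISPLACEMENT*
	//--------------------

	// Displacement (TODO: store the value, and handle the 16-bit
	// addressing forms selected when AddressSize == 2)
	// Mask is 0xC7 so that only r/m == 5 matches; 0xC5 also matched r/m == 7
	if( ( modRM & 0xC7 ) == 0x05 ) p.pc += 4;	// Dword, no base
	if( ( modRM & 0xC0 ) == 0x40 ) p.pc++;		// Byte
	if( ( modRM & 0xC0 ) == 0x80 ) p.pc += 4;	// Dword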

	//--------------------
	// *IMMEDIATE*
	//--------------------

	// Skip FPU stuff for now
	if( FPU )
	{
		// TODO
	}
	else if( !twoByte )
	{
		// Handle ops
		switch( opcode )
		{
		default:
			break;
		}

		if( ( opcode & 0xC7 ) == 0x04 || 
            ( opcode & 0xFE ) == 0x6A ||   // PUSH/POP/IMUL 
            ( opcode & 0xF0 ) == 0x70 ||   // Jcc 
              opcode == 0x80 || 
              opcode == 0x83 || 
            // 0xA0/0xA2 (MOV AL, moffs8) carry a full-size memory offset,
            // not a 1-byte immediate, so they are left to the 0xA0-0xA3 case below
              opcode == 0xA8 ||            // TEST 
            ( opcode & 0xF8 ) == 0xB0 ||   // MOV
            ( opcode & 0xFE ) == 0xC0 ||   // RCL 
              opcode == 0xC6 ||            // MOV 
              opcode == 0xCD ||            // INT 
            ( opcode & 0xFE ) == 0xD4 ||   // AAD/AAM 
            ( opcode & 0xF8 ) == 0xE0 ||   // LOOP/JCXZ 
              opcode == 0xEB || 
              opcode == 0xF6 && ( modRM & 0x30 ) == 0x00 )   // TEST 
        { 
            p.pc += 1; 
        } 
        else if( ( opcode & 0xF7 ) == 0xC2 ) 
        { 
			p.pc += 2;   // RET 
        } 
        else if( ( opcode & 0xFC ) == 0x80 || 
                 ( opcode & 0xC7 ) == 0x05 || 
                 ( opcode & 0xF8 ) == 0xB8 ||
                 ( opcode & 0xFE ) == 0xE8 ||      // CALL/Jcc 
                 ( opcode & 0xFE ) == 0x68 || 
                 ( opcode & 0xFC ) == 0xA0 || 
                 ( opcode & 0xEE ) == 0xA8 || 
                   opcode == 0xC7 || 
                   opcode == 0xF7 && ( modRM & 0x30 ) == 0x00 ) 
        { 
            p.pc += OperandSize; 
        } 
    } 
    else 
    { 
		// Handle ops
		switch( opcode )
		{
		default:
			break;
		}

        if( opcode == 0xBA ||            // BT 
            opcode == 0x0F ||            // 3DNow! 
          ( opcode & 0xFC ) == 0x70 ||   // PSLLW 
          ( opcode & 0xF7 ) == 0xA4 ||   // SHLD 
            opcode == 0xC2 || 
            opcode == 0xC4 || 
            opcode == 0xC5 || 
            opcode == 0xC6 ) 
        { 
            p.pc += 1; 
		} 
        else if( ( opcode & 0xF0 ) == 0x80 ) 
        {
            p.pc += OperandSize;   // Jcc -i
        }
    }

	// Update EIP register
	REG_EIP = p.pc;
}

Well, I did the best I could with what I have so far. If you see anything to be added or any corrections to be made, please let me know. Thanks a lot, guys.
 