Intel 80x86 Architecture

Von Neumann architecture

This basic model of a computer as a processing unit that receives input, communicates with a memory, and produces output is known as the von Neumann architecture, named after the mathematician and computer science pioneer John von Neumann. In this architecture, the processor itself consists of several specialized parts:

The arithmetic logic unit (ALU): the part of the processor that performs arithmetic and logical operations.
The control unit that directs the movement of instructions in and out of the processor and sends control signals to the ALU so that it performs the correct operation at a given time.
The processor registers: small special-purpose storage areas that hold the information the ALU is working with.
The memory unit, which is not part of the processor, contains both data and instructions (the program). To be used, this information must first be transferred to registers.
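
To make the division of labor concrete, here is a minimal sketch of a toy von Neumann machine. The four-instruction set (LOAD, ADD, STORE, HALT) and the names `run`, `acc` and `pc` are invented for this illustration and are not x86 instructions; the point is only that a single memory holds both the program and its data.

```python
# Toy von Neumann machine: one memory holds both the program and its data.
# The instruction set (LOAD/ADD/STORE/HALT) is hypothetical, not x86.

def run(memory):
    acc = 0   # a single register holding the value being worked on
    pc = 0    # program counter: address of the next instruction
    while True:
        opcode, operand = memory[pc]   # control unit: fetch the instruction
        pc += 1
        if opcode == "LOAD":           # memory -> register
            acc = memory[operand]
        elif opcode == "ADD":          # ALU: add a memory cell to the register
            acc = acc + memory[operand]
        elif opcode == "STORE":        # register -> memory
            memory[operand] = acc
        elif opcode == "HALT":
            return memory

memory = [
    ("LOAD", 4),    # address 0
    ("ADD", 5),     # address 1
    ("STORE", 6),   # address 2
    ("HALT", 0),    # address 3
    2,              # address 4: data
    3,              # address 5: data
    0,              # address 6: the result is stored here
]
result = run(memory)
print(result[6])  # 5
```

Note that the data at addresses 4 to 6 is only touched after being named by an instruction, and that the addition happens on the register, not directly in memory.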

Arithmetic Logic Unit

ALU stands for arithmetic logic unit; it is the component that allows a computer to perform arithmetic and logical operations on operands. It is located inside the CPU along with the control unit (which fetches and decodes machine language instructions and directs their execution) and the registers, which are discussed below.

An ALU can be designed in many different ways, depending on which operations it has to support. Conceptually, the ALU is broken down into smaller units, each capable of processing only one bit for a given operation; connected in cascade, these units produce the final result. Multi-bit ALUs usually differ in small ways in how they handle operations whose result does not fit in the available bits, so that an overflow of the result can be detected. Nowadays, almost all processors have multiple ALUs running in parallel, often specialized for specific jobs.
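
The cascade of one-bit units can be sketched as a ripple-carry adder. This is only one possible design (real ALUs typically use faster carry schemes), and the function names `full_adder` and `ripple_add` are made up for the example.

```python
def full_adder(a, b, carry_in):
    """Add three single bits; return (sum_bit, carry_out)."""
    sum_bit = a ^ b ^ carry_in
    carry_out = (a & b) | (carry_in & (a ^ b))
    return sum_bit, carry_out

def ripple_add(x, y, width=8):
    """Add two integers bit by bit, as a chain of 1-bit slices.

    Returns (result, carry_out, signed_overflow).
    """
    carry = 0
    carry_into_msb = 0
    result = 0
    for i in range(width):
        if i == width - 1:
            carry_into_msb = carry          # carry entering the top slice
        bit, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        result |= bit << i
    # Signed overflow happens when the carry into the top bit
    # differs from the carry out of it.
    signed_overflow = carry_into_msb ^ carry
    return result, carry, signed_overflow

print(ripple_add(200, 100))  # (44, 1, 0): unsigned result does not fit in 8 bits
print(ripple_add(100, 100))  # (200, 0, 1): signed result does not fit in 8 bits
```

The two calls show the two ways a result can fail to fit: a carry out of the last slice for unsigned numbers, and a sign change for signed (2's complement) numbers.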

Registers

Processors in the Intel x86 family have at least the following registers: AX, BX, CX, DX, CS, DS, ES, SS, SP, BP, SI, DI, IP, FLAGS. Before the 80386 processor, the AX, BX, CX, DX, SP, BP, SI, DI, FLAGS and IP registers were 16 bits wide. Starting with the 80386, they were extended to 32 bits and the letter E (for "extended") was prefixed to their names. For example, the AX register became EAX; the old name AX still refers to the lower 16 bits of EAX.
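
The relationship between a 16-bit register and its extended version can be modeled with simple bit masks. The sketch below treats EAX as a plain Python integer; the helper name `write_ax` is hypothetical. It relies on the fact that AX is the lower 16 bits of EAX, so writing AX leaves the upper half of EAX untouched.

```python
eax = 0x12345678

# Reading AX means reading the low 16 bits of EAX.
ax = eax & 0xFFFF
print(hex(ax))  # 0x5678

def write_ax(eax, value):
    """Return the new EAX after storing a 16-bit value in AX."""
    return (eax & 0xFFFF0000) | (value & 0xFFFF)

print(hex(write_ax(eax, 0xABCD)))  # 0x1234abcd
```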

EAX, EBX, ECX, and EDX are general-purpose registers, so any value can be assigned to them. However, some instructions use them implicitly to hold specific values:

EAX (accumulator register) is used as an accumulator for arithmetic operations and holds the result of many of them.
EBX (base register) is used for memory addressing operations.
ECX (counter register) is used for counting, for example in loops.
EDX (data register) is used in input/output operations, multiplication, and division.
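
As an example of these implicit roles, the unsigned MUL and DIV instructions with a 32-bit operand use EAX and EDX together as a 64-bit pair, usually written EDX:EAX. The following is a rough model of that behavior, not an emulator; `mul32` and `div32` are names chosen for this sketch.

```python
MASK32 = 0xFFFFFFFF

def mul32(eax, operand):
    """Model of unsigned MUL with a 32-bit operand: EDX:EAX = EAX * operand.

    Returns (new_eax, new_edx): the low and high halves of the product.
    """
    product = (eax & MASK32) * (operand & MASK32)
    return product & MASK32, (product >> 32) & MASK32

def div32(eax, edx, operand):
    """Model of unsigned DIV with a 32-bit operand: divide EDX:EAX by operand.

    Returns (new_eax, new_edx): the quotient and the remainder.
    A zero operand raises ZeroDivisionError here; the real CPU raises
    a divide error exception in that case and when the quotient is too big.
    """
    dividend = ((edx & MASK32) << 32) | (eax & MASK32)
    quotient, remainder = divmod(dividend, operand)
    if quotient > MASK32:
        raise OverflowError("quotient does not fit in EAX")
    return quotient, remainder

eax, edx = mul32(0xFFFFFFFF, 2)
print(hex(eax), hex(edx))        # 0xfffffffe 0x1
print(div32(eax, edx, 2))        # (4294967295, 0)
```

This is why EDX often has to be cleared (or sign-extended from EAX) before a division: whatever it contains is treated as the upper half of the dividend.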

CS, DS, ES, and SS are the segment registers and should be used with caution:

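CS (code segment) points to the memory area that contains the code. During program execution it is used together with IP (the instruction pointer) to locate the next instruction to be executed. Caution: CS cannot be loaded with an ordinary MOV; it changes only through control transfers such as far jumps, far calls, far returns, and interrupts.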
DS (data segment) points to the memory area that contains the data.
ES (extra segment) can be used as an auxiliary segment register.
SS (stack segment) points to the memory area where the stack resides.
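
How a segment register "points to" a memory area depends on the processor mode. The sketch below assumes 16-bit real mode only, where the physical address is the segment value shifted left by 4 bits plus a 16-bit offset; in protected mode the segment registers hold selectors instead and this formula does not apply. The function name `physical_address` is invented for the example.

```python
def physical_address(segment, offset):
    """Real-mode address translation: segment * 16 + offset.

    Assumes 16-bit real mode. The result is not wrapped to 20 bits,
    which the original 8086 would do.
    """
    return ((segment & 0xFFFF) << 4) + (offset & 0xFFFF)

# CS:IP = 1234:0010 and 1235:0000 name the same byte in memory.
print(hex(physical_address(0x1234, 0x0010)))  # 0x12350
print(hex(physical_address(0x1235, 0x0000)))  # 0x12350
```

Because segments overlap every 16 bytes, many different segment:offset pairs refer to the same physical location.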

ESP, EBP, and EIP are the pointer registers:

ESP (stack pointer) points to the top of the stack. It is modified by the PUSH instruction (which inserts a data item onto the stack) and the POP instruction (which extracts a data item from the stack). Remember that the stack is a LIFO (Last In, First Out) structure: the last item pushed is the first one popped. ESP can also be modified manually, at your own risk!
EBP (base pointer) conventionally points to the base of the current stack frame.
EIP (instruction pointer) points to the next instruction to be executed. It cannot be written directly; it changes only as a side effect of instructions such as jumps, calls, and returns.
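
The effect of PUSH and POP on ESP can be modeled in a few lines. This sketch assumes 32-bit operands, so each push moves ESP down by 4 bytes (on x86 the stack grows toward lower addresses). The class name `Stack32`, the starting address and the dictionary used as memory are all illustrative choices.

```python
class Stack32:
    """Minimal model of a 32-bit x86 stack that grows downward."""

    def __init__(self, top=0x1000):
        self.memory = {}      # sparse memory: address -> 32-bit value
        self.esp = top        # ESP starts at an arbitrary address

    def push(self, value):
        self.esp -= 4                            # first make room...
        self.memory[self.esp] = value & 0xFFFFFFFF   # ...then store

    def pop(self):
        value = self.memory[self.esp]            # first read...
        self.esp += 4                            # ...then release the slot
        return value

stack = Stack32()
stack.push(1)
stack.push(2)
print(hex(stack.esp))   # 0xff8
print(stack.pop())      # 2 (last in, first out)
print(stack.pop())      # 1
print(hex(stack.esp))   # 0x1000
```

Note that popping does not erase anything: the old values stay in memory below ESP until something overwrites them.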

ESI and EDI are the index registers and are used for string and vector operations:

ESI (source index) points to the source string/vector.
EDI (destination index) points to the destination string/vector.
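
A typical use of the two index registers is a block copy with REP MOVSB, which copies ECX bytes from the address in ESI to the address in EDI. The model below assumes the direction flag is clear, so both addresses increase, and uses a flat `bytearray` as memory, ignoring segments; `rep_movsb` is a name picked for this sketch.

```python
def rep_movsb(memory, esi, edi, ecx):
    """Model of REP MOVSB with the direction flag clear.

    Copies one byte per iteration, advancing ESI and EDI and
    counting ECX down to zero. Returns the final (esi, edi, ecx).
    """
    while ecx != 0:
        memory[edi] = memory[esi]
        esi += 1
        edi += 1
        ecx -= 1
    return esi, edi, ecx

memory = bytearray(b"hello\x00\x00\x00\x00\x00")
esi, edi, ecx = rep_movsb(memory, esi=0, edi=5, ecx=5)
print(memory)           # bytearray(b'hellohello')
print(esi, edi, ecx)    # 5 10 0
```
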
EFLAGS is used to store the current state of the processor. Each flag (bit) in the register provides a particular piece of information. For example, the flag at position 0 (carry flag) is set to 1 when an arithmetic operation produces a carry or a borrow; the flag at position 2 (parity flag) is used as a parity bit and is set to 1 when the low byte of the result of the last operation contains an even number of 1s.

EFLAGS register composition

CF (carry flag): set to 1 if an arithmetic operation generates a carry out of, or a borrow into, the most significant bit of the result;
PF (parity flag): set to 1 if the number of 1s in the low byte of the result is even, to 0 if it is odd;
AF (auxiliary carry flag): set to 1 if there is a carry or a borrow between bit 3 and bit 4 of the result; used in Binary Coded Decimal (BCD) arithmetic;
ZF (zero flag): set to 1 if the result of the operation is 0;
SF (sign flag): a copy of the most significant bit of the result, so 1 if the result is negative and 0 if it is not (2's complement representation);
OF (overflow flag): set to 1 if a signed (2's complement) operation produces a result that does not fit in the destination.
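
To see how these flags react to a concrete operation, the following sketch computes them for an 8-bit addition. It is a simplified model: the function name `add8_flags` is made up, only ADD is covered, and the flags are returned as a dictionary instead of being packed into their actual bit positions in EFLAGS.

```python
def add8_flags(a, b):
    """Add two 8-bit values and compute the resulting status flags."""
    a &= 0xFF
    b &= 0xFF
    total = a + b
    result = total & 0xFF
    flags = {
        "CF": int(total > 0xFF),                        # carry out of bit 7
        "PF": int(bin(result).count("1") % 2 == 0),     # even number of 1s
        "AF": int((a & 0xF) + (b & 0xF) > 0xF),         # carry out of bit 3
        "ZF": int(result == 0),
        "SF": result >> 7,                              # copy of bit 7
        # Signed overflow: both inputs have the same sign,
        # and the result has the opposite one.
        "OF": int(((a ^ result) & (b ^ result) & 0x80) != 0),
    }
    return result, flags

print(add8_flags(0x7F, 0x01))
# (128, {'CF': 0, 'PF': 0, 'AF': 1, 'ZF': 0, 'SF': 1, 'OF': 1})
print(add8_flags(0xFF, 0x01))
# (0, {'CF': 1, 'PF': 1, 'AF': 1, 'ZF': 1, 'SF': 0, 'OF': 0})
```

The two examples show the difference between CF and OF: 0x7F + 1 overflows as a signed number (127 + 1) but not as an unsigned one, while 0xFF + 1 overflows as an unsigned number (255 + 1) but is a perfectly valid signed sum (-1 + 1 = 0).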

Addressing mode

The term addressing mode refers to the way the operand of an instruction is specified. There are 7 main addressing modes:

Register addressing: the operand is contained in a register. The name of the register is specified in the instruction.
Absolute addressing: the operand is contained in a memory location. The address of the location is specified in the instruction.
Immediate addressing: the operand is a constant value and is explicitly defined in the instruction.
Indirect addressing: the address of the operand is contained in a register or memory location. That register or memory location is specified in the instruction.
Indexed addressing: the actual address of the operand is calculated by adding a constant value to the contents of a register.
Addressing with autoincrement: the actual address of the operand is the contents of a register specified in the instruction. After the operand is accessed, the contents of the register are incremented to point to the next element.
Addressing with autodecrement: the contents of a register specified in the instruction are decremented. The new contents are used as the effective address of the operand.
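
The seven modes above can be compared side by side with a small model in which memory is a list and registers are a dictionary. The register names R1 and R2 and all function names are hypothetical and do not correspond to x86 syntax; each function returns the operand that the mode would deliver.

```python
registers = {"R1": 3, "R2": 0}
memory = [10, 20, 30, 40, 50]

def register_operand(reg):
    return registers[reg]                    # operand is in the register

def absolute(address):
    return memory[address]                   # address given in the instruction

def immediate(value):
    return value                             # operand is the constant itself

def indirect(reg):
    return memory[registers[reg]]            # register holds the address

def indexed(reg, constant):
    return memory[registers[reg] + constant] # address = register + constant

def autoincrement(reg):
    value = memory[registers[reg]]           # access first...
    registers[reg] += 1                      # ...then move to the next element
    return value

def autodecrement(reg):
    registers[reg] -= 1                      # move first...
    return memory[registers[reg]]            # ...then access

print(indirect("R1"), indexed("R1", 1))      # 40 50
first = autoincrement("R1")    # reads memory[3], then R1 becomes 4
second = autodecrement("R1")   # R1 becomes 3 again, then reads memory[3]
print(first, second, registers["R1"])        # 40 40 3
```

The last two calls show why autoincrement and autodecrement are natural inverses: one accesses and then advances, the other steps back and then accesses, which is the same pattern a stack uses for pop and push.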

About SerHack

I am a security researcher, a writer, and a contributor to the Monero project, a cryptocurrency focused on preserving the privacy of transaction data. My publication Mastering Monero has become one of the best-rated resources for learning about Monero.
