

# "MACHINE LANGUAGE"

## — TRANSLATION —

- To encode instruction means to convert assembly lang to machine code.
- decode is viceversa.

### INSTRUCTION FORMAT :

|                    |          |         |        |                  |          |
|--------------------|----------|---------|--------|------------------|----------|
| Instruction Prefix | OPcode   | Mod R/M | SIB    | Address/Displace | Imm/Opnd |
| 1-byte             | 1-3 byte | 1 byte  | 1 byte | 1-4 byte         | 1-4 byte |

- Instructions are stored in Little Endian order, so the first byte is located at instruction's starting address.
- Every instruction has opcode (other field are optional).
- Mostly insts. are of 2-3 bytes.

- INSTRUCTION PREFIX :  
overrides default operand sizes

- OPCODE :  
(Operation Code) identifies a specific variant of an instruction  
The ADD instr. for ex has nine diff opcodes, depending on parameter types used.

02 00 000 000  
Mod AL R/M

|            | 6-7 bits | 3-5 bits      | bit 0-2 |
|------------|----------|---------------|---------|
| • MOD R/M: | Mod      | Reg / op code | R/M     |

This field identifies the addressing mode & operands. The notation "R/M" stands for register & mode.

bit 6-7    bits 3-5    bits 0-2

| • Scale Index Byte (SIB): | Scale | Index | Base |
|---------------------------|-------|-------|------|
|---------------------------|-------|-------|------|

SIB is used to calc. offsets of array indexes.

#### • Address Displacement:

holder an operand's offset, or it can be added to base & index registers in addressing modes such base-displacement or base-index-displacement.

#### • Immediate Data:

This field holds constant operands.

### SINGLE BYTE INSTRUCTION:

- Simplest type of instruction with either no operand or an implied operand.
- Such inst. req. only the op code field.

e.g. AAA → 37 , CBN → 98 , XLAT → D7  
 AAS → 3F , LODSB → AC , INC DX → 42

Some commonly used instructions have been given unique op codes.

- Move Immediate to Registers

→ Immediate operands are appended to insts. in little Endian order (lowest byte first).

The ~~instru~~ encoding format for MOV

→ (no. 0-7 indicating which reg is used)

$B8 + rw \ dw$

[ ]   
 opcode byte      [ ]   
 value              immediate  
 word operand

AX / AL → 0

BP / CH → 5

CX / CL → 1

SI / DH → 6      TABLE #03

DX / DL → 2

DI / BH → 76

BX / BL → 3

SP / AH → 4

① egs PUSH CX → ~~51~~ 51

opcode for PUSH = ~~0050~~ 50 (with 16-bit register operand is 50)

opcode for CX = ~~1~~ 1 (so adding 1 to 50).

② egs: MOV AX, 1 → B8 01 00

MOV → B8 (for moving 16-bit to a reg.)

AX → 0 (so adding 0 to B8 = B8)

Immediate operand → 0001 (appending to B8 in little Endian).

B8 00 10. here

changing the 10 in little

Endian to majorise

lowest byte 1st major

10 B8 01 00 major

e.g.:  $MOV\ BY, 1234h \rightarrow BB\ 34\ 12$

$MOV \rightarrow BB$

$BX \rightarrow Z \quad (BB+3 = BB)$

$1234 \rightarrow 34\ 12$

### • REGISTER - MODE INSTRUCTIONS

→ In instructions using register operands, the MOD R/M byte contains a 3-bit identifier for each register operand.

| R/M | Register       | R/M | Register  |
|-----|----------------|-----|-----------|
| 000 | AX or AL, ES   | 100 | SP or 14H |
| 001 | CX or CL, CS   | 101 | BP or CH  |
| 010 | DX or DL, SS   | 110 | SI or DH  |
| 011 | BX or BL or DS | 111 | DI or BH  |

→ The choice of 8-bit or 16-bit reg depends on the bit #0 of the opcode field:

↳ 1 → 16-bit reg ↳ 0 → 8-bit reg

~~MOV AX, BX → 09 D8~~

Opcode for ~~moving~~ a 16-bit value MOV from a reg to any other oper. is 89/8.

/8 indicates MOD R/M byte. 89 is code MOD R/M with Byte exchange.



- MOD field identifies addressing mode. See table 12-18.  
11 → means R/M field contain 0 register number
- Reg field identifies SRC operand. In our ex BX is register 011.
- R/M field, which identifies destination operand.  
7 AX → 000 so R/M → 000.

mod reg R/M  
 $\boxed{1110111000}$

$\boxed{11011000}$   
 0111 0011  
 D 8

89 D8

e.g.: MOV AX, DX      89  $\boxed{1111000}(010)$



- The 6-bit opcode identifies operation. The same opcode is used for 16-bit and 8-bit.
- The size of operand is given by "W" bit: ~~W=0~~  
 $W=0 \rightarrow 8\text{-bit data}$       mov ax, bx \$0 W=1  
 $W=1 \rightarrow 16\text{-bit data (or 32 bits)}$       mov ax, bx  
 $W=0$
- Bit marked "D", specifies direction of data transfer.  
 $D=0 \rightarrow$  then destination operand is a memory loc  
 e.g.: add [bx], al      (not in reg mod)

'D' or 'REG' field opas in related  
(now)

$D=1 \rightarrow$  then destination operand is reg (not in reg)  
in "REG" field

- The "D" bit specifies whether the register in "REG" field is a source or destination. If 0 then src of & then dest.

• MOD R/M : 

|     |     |     |     |   |   |   |   |
|-----|-----|-----|-----|---|---|---|---|
| 7   | 6   | 5   | 4   | 3 | 2 | 1 | 0 |
| Mod | R/M | R/M | R/M |   |   |   |   |

- "R/M" field combined with MOD specifies, either
  - ↳ the second oper in a two-oper instr, or
  - ↳ the only & a one-oper & like NEG or NOT.
- ↳ There two are used to let the CPU know if there is some memory operand. If yes, then how to calc offset address. *R/M "mod" field correlated here*
- The "MOD" field specifies x86 addressing mode.
  - ↳ MOD = 11 (means Register Mode)
  - ↳ MOD = 00 (means memory mode with no displacement)  
except when R/M = 110 then a 16-bit displacement follows
  - ↳ MOD = 01 (means memory, with 8-bit displacement)  
following (D8)
  - ↳ MOD = 10 (means a 16-bit & (D16))

| BYTE 1           | BYTE 2    |           |                | BYTE 3        | BYTE 4        | BYTE 5            | BYTE 6                                      |
|------------------|-----------|-----------|----------------|---------------|---------------|-------------------|---------------------------------------------|
| 6 bits<br>opcode | 1bit<br>D | 1bit<br>W | 2-bits<br>MOD. | 3-bits<br>REG | 3-bits<br>R/M | Low DISPL<br>DATA | High DISPL<br>DATA<br>low data<br>high data |

### General Format

- Instruction size can range from 1-6 bytes.
- the last 4 bytes are dedicated with the displacement or immediateval (data)
  - if a displacement is given then high part of disp. in Byte 4 & low in 3. If only data is given then also same.
  - if both displ. & data are given then displ. in 3 & 4 and data in 5 & 6.

o mov bl, al

→ opcode for mov → 100010 (will be given)

→ REG mein humne "al" rakhna hai ab chunkie AL source hai to D = 0 sakhege.

→ REG = AL = 000 ⇒ value of 41

→ W = 0 as 8-bit oper.

→ both are reg so MOD = 11

→ R/M = BL = 011 → as BL is destination

$$\underbrace{100010}_{\text{opcode}} \underbrace{00}_{\substack{\downarrow \\ D}} \# \underbrace{11}_{\text{MOD}} \underbrace{000}_{\text{REG}} \underbrace{011}_{\text{R/M}} = 88C3H$$

• MOV CX, DX

Mov  $\rightarrow$  100010

REG  $\rightarrow$  CX  $\rightarrow$  001 ( $\Rightarrow$  we choose CX and it is dest). So  
 $D = 1$

$W = 1$  (16-bit oper)

MOD  $= 11$ , R/M  $= DX = 010$

100101111001010 = 8BCAH

• ADD [BX][DI] + 4567H, DX

$\rightarrow$  Opcode  $\rightarrow$  Add  $\rightarrow$  000000

$\rightarrow$  REG  $= DX \rightarrow 010$  (yeha humne REG DX choose kia q k left  
oprand memory hai).

$\rightarrow D = 0, W = 1$

$\rightarrow$  MOD  $= 10$  ( $\Rightarrow$  it is memory mode with 16-bit disp.)

$\rightarrow$  R/M  $= 001$  (ye humne table se dekhata hai  $+4567H$ )

MOD 10 mein (BX)[DI] + 16 bit ki R/M val  
find koi wo hai 001.

000000|01110|010101001|67H|45H  $\rightarrow$   
Opcol  $\rightarrow$  000000  
 $\rightarrow$  D  $\rightarrow$  000000  
 $\rightarrow$  MOD  $\rightarrow$  10  
 $\rightarrow$  REG  $\rightarrow$  001  
 $\rightarrow$  R/M  $\rightarrow$  001  
 $\rightarrow$  high disp  
 $\rightarrow$  low disp

01916745H.

- ADD AX, [BP][SI] + 45H

Opcode → 000000

REG → AX → 000

D=1 (i.e. REG is dest), W=1 (16-bit oper.)

MOD = 10 (same reason)

R/M = 00

(000000|11101100|010) 45H → 034245H

### Immediate Addressing Mode:

~~MOVING.~~

- MOV AX, 1234H

→ Immediate to Reg.

|      |   |     |          |           |
|------|---|-----|----------|-----------|
| 1011 | W | Reg | Low data | High data |
|------|---|-----|----------|-----------|

MOV AX, 1234H

opcode → 1011

W = 1 → (16-bit oper.)

Reg → AX → 000

DATA = LB = 34, HB = 12

1011 | 000 34H 12H → B08412H

## ENCODING ONE OPERAND INSTR.

### ① SINGLE OPERAND IN A REGISTER

There are two encoding patterns

- 1 Short form → which consist of just one byte and used when reg. operand is a 16-bit reg

- 2 General form → which consist of opcode byte followed by "mod r/m" byte and is used in any other situation than short form.

Short form with a 16-bit reg operand (single):

Two formats used for short form.

↳ One for Segment reg

↳ One for 16-bit reg

- For 16-bit Reg other than Segment Reg:

| opcode  | seg code |
|---------|----------|
| 7-3 bit | 2-0 bit  |

PUSH AX      50h+00      → 50h

PUSH BX      50h+03      → 53h

INC AX      40h+00      → 40h

INC SI      40h+06      → 46h

opcode seg  
code → for this refer to TABLE #03

▲ Seg code are added in opcodes

- Short form for Segment Registers:

|               |                 |               |
|---------------|-----------------|---------------|
| 6-7 bit       | 5-3 bit         | 2-0 bit       |
| <u>opcode</u> | <u>reg code</u> | <u>opcode</u> |

|         | opcode | Reg code | opcode |       |
|---------|--------|----------|--------|-------|
| PUSH ES | 00     | 000      | 110    | = 06h |
| PUSH CS | 00     | 001      | 110    | = 0Eh |
| POP DS  | 00     | 011      | 111    | = 1Fh |

→ The opcodes for 6-7 bit & 2-0 bit are given in APPENDIX 2.

General form of An Instruction :

|                                                           |
|-----------------------------------------------------------|
| 7   6   5   4   3   2   1   0                             |
| <u>opcode byte</u>   11   <u>opcode</u>   <u>seg code</u> |

ismen bhii cooki sabi hou,

|        |      |     |     |
|--------|------|-----|-----|
| opcode | mod. | reg | y/m |
|--------|------|-----|-----|

opcode extension for INC  
for ALL the code is 100.

INCAH 4FE 11 000 100 → FEC4

~~mod~~  
 $\downarrow$   
 ∵ it is reg  
 so mod = 11  
 $\uparrow$

IMUL BX F7 11 101 011 → F7EB

▲ Ags Single Instruction with Reg No to "REG"  
 k andar "opcode extension" ayeega. Ye  
 Appendix 2 mein given hai.

## ② SINGLE OPERAND IN A MEMORY

There are diff encoding pattern for diff encoding mode.

- Direct Addressing Mode : (No displacement)

- Here mod will always be 00 & it's memory and no displacement.
- R/M will be 110

| opcode                 | mod | reg | r/m | offset    |
|------------------------|-----|-----|-----|-----------|
| INC WORD PTR [DS:200h] | FF  | 00  | 000 | 110 00 20 |
| <u>FF 06 00 02</u>     |     |     |     |           |

- Indirect Addressing Mode : (No disp.)

- Here also mod = 00

| opcode                | mod | reg | r/m     |
|-----------------------|-----|-----|---------|
| INC BYTE PTR [BX][SI] | FE  | 00  | 000 000 |
| → FE 00               |     |     |         |

opcode  
extension  
for INC

→  
r/m is value (BX)(SI)  
mean jst mod=00 hs  
to 000 hot han.

- Indirect Addressing Mode : (With Displ.)

- Mod will be either 01 (for 8-bit) / 10 (for 16bit disp)
- REG mein opcode extension <sup>displ</sup>
- R/M mein reg ki value
- OR last mein displacement (little Endian)

|                            | opcode           | mod      | seg | s/m | displ.                              |
|----------------------------|------------------|----------|-----|-----|-------------------------------------|
| IDIV WORD PTR [DI] + 1A2Bh | F7               | 10       | 111 | 101 | 2B 1A                               |
|                            | ↓<br>opcode      | ↑<br>reg |     |     | ↳ [DI] value at mod=10<br>extension |
|                            | op               | mod      | seg | s/m | dis                                 |
| IDIV WORD PTR [SI + 2Bh]   | F7               | 101      | 111 | 100 | 2B                                  |
|                            | ↓<br>8bit<br>reg |          |     |     | ↳ [SI] value at<br>mod = 01         |

### (3) SINGLE IMMEDIATE DATA :

When the only operand of the instruc. is an immediate data the machine language instruction is  
opcode - immediate data

RET 8 → C2 00 00

# "ENCODING TWO OPERAND INSTR."

- ① Register, 9mm Oper. (With no 'mod r/m' byte)

|           | opcode byte   9mm. data |
|-----------|-------------------------|
| ADD AX, 7 | 05   07   00            |

lagr AX, kisi imm 0507 0000

K saath add hogta to opcode & imm data likhenge.

- ② Register, 9mm. Oper. (With 'mod r/m' byte)

| considering<br>5 00 byte | opcode   11   opcode   reg   imm |
|--------------------------|----------------------------------|
| CMP CX, 5                | 83   11   11   001   0500        |

↓  
mod      extension      r/m  
              of CX

“only reg is involved”

- ③ Reg, Reg Operands

mod will be 11 as both are reg.

|            | opcode   11   sec reg   1st reg |
|------------|---------------------------------|
| SUB BX, DX | 29   11   010   011             |

mod      reg      r/m

yeha shayid opcode men D=0

hoga jabhi “reg” field mein src wala reg hua.

29 → 1000011011

opcode      D → 16 bit  
                +      reg

④ Reg, Mem in Direct Mode (With 'mod & m' byte)

MOV [DS:200h], AL

|        |      |         |
|--------|------|---------|
| opcode | byte | Offset  |
| A2     |      | 02   00 |

⑤ Reg, Mem in Direct Mode (With "mod & m byte")

MOV [DS:200h], BL

| opcode | mod | reg | s/m | Offset  |
|--------|-----|-----|-----|---------|
| 88     | 00  | 011 | 110 | 00   02 |

mod value of BL      symbol at direct address when mod = 00

⑥ Reg, Mem in Indirect Mode (With no disp)

mod = CC (memory & no disp)

MOV [BX][DI], DS

| opcode | mod | reg | s/m | disp |
|--------|-----|-----|-----|------|
| 8C     | CC  | 011 | 001 | -    |

mod value for DS       $\rightarrow$  value of [BX][DI] at mod = 00

⑦ Reg, Mem in Indirect Mode (With disp)

mod = 01 / 10 (depending on disp.)

MOV [BX][DI]+5, DS

ADD AX,[SI]+1A ABh

| opcode | mod | reg | s/m | disp | -  |
|--------|-----|-----|-----|------|----|
| 8C     | 01  | 011 | 001 | 05   | -  |
| 03     | 10  | 000 | 100 | 2B   | 1A |

val of AX      val of [SI] in mod = 10

⑧ Mem Oper. in Direct Mode, 9mm Data  
 $\text{mod} = 00$  ( $\Rightarrow$  mem and no disp)

|                             | opcode | mod | opc-ext | s/m | offset | imm   |
|-----------------------------|--------|-----|---------|-----|--------|-------|
| SUB WORD PTR [DS:20h],1A2Bh | 81     | 00  | 101     | 110 | 00/20  | 2B/1A |

$\downarrow$   
value  
of DS

⑨ Mem Oper. in Indirec. Mod (Moddisp), 9mm data

$\text{mod} = 00$  ( $\Rightarrow$  mem & no disp)

|                     | opcode | mod | opc-ext | s/m | offset | imm  |
|---------------------|--------|-----|---------|-----|--------|------|
| SUB WORD PTR [BX],5 | 83     | 00  | 101     | 111 | -      | 05/- |

level of BX at  
 $\text{mod} = 00$

⑩ Mem Oper in Quodirect (Disp), 9mm data

$\text{mod} = 01/10$  ( $\times$  dep. on disp)

| word                         | opcode | mod | (reg) | s/m | disp  | imm   |
|------------------------------|--------|-----|-------|-----|-------|-------|
| SUB DIR [BX]+0EOFh,<br>1A2Bh | 81     | 10  | 101   | 111 | 0F/0E | 2B/1A |

$\hookrightarrow$  val of BX at mod = 10

(v)

j) ADD AL, [BX+SI]

| opcode | mod | reg | s/m |        |
|--------|-----|-----|-----|--------|
| 02     | 00  | 000 | 000 | → 0200 |

↓      ↓      ↓  
mem    AC    BX+SI  
→ no disp.

ii) INC DX

| opcode | mod | reg | s/m |        |
|--------|-----|-----|-----|--------|
| FF     | 11  | 000 | 010 | → FFC2 |

↓      ↓  
op: eat    DX val

iii) MOV AX, Var+6

| opc | mod | reg | s/m | disp |  |
|-----|-----|-----|-----|------|--|
| B8  | 01  | 000 |     | 06   |  |

↓  
val of AX

B8 → mov reg16, imm16 ⇒ B8 00 08 h

iv) SUB CX, VAR ; offset 0008h

| opc | mod | reg | s/m |        |
|-----|-----|-----|-----|--------|
| AB  | 00  | 001 | 110 | → AB E |

↓  
no disp

val  
of  
CX

v) MOV [SI+490], BX

| opcode | mod | reg | s/m | displacement | ↑       |
|--------|-----|-----|-----|--------------|---------|
| 89     | 10  | 011 | 100 | 490          | ↑ 0008h |

mod      val of      val of SI+016 16      FE  
BX      BX      at mod=10

111011100

11111100100100

(i) B9 00 12

~~MOV~~ MOV ~~seg~~ 16, imm 16  $\rightarrow$  B8 + reg cd

① opcode = B8

reg cd = 01  $\rightarrow$  CX

imm. = 1200h

[MOV CX, 1200h]

C (ii) 8C 85 DC 01

8C  $\rightarrow$  MOV reg 16 / mem 16, Sreg

8C = 1000 1100

$\therefore d = 0 \rightarrow$  so "seg" field contains src reg  
 $w = 0$

Reg16dest  $\rightarrow$

85  $\rightarrow$  10 000 101

mod = 10

reg = 000  $\rightarrow$  ES

s/m = 101  $\rightarrow$  [DI + DI 16]

DC 01  $\rightarrow$  1111111000100100  $\rightarrow$  2's comp  $\rightarrow$  FE24

[MOV [DI + ~~476~~ <sup>FE24</sup>], ES]

1000011110101010

8B 87 56 78

8B  $\rightarrow$  mov reg16, mem16

8B  $\rightarrow$  10001011

d = 1  $\rightarrow$  sc "reg" field contain dest. reg

w = 1  $\rightarrow$  16-bit operation

87  $\rightarrow$  10 000 111

mod = 10  $\rightarrow$  memory mode 16 bit disp.

reg = 000  $\rightarrow$  AX

8lm = 111  $\rightarrow$  BX + DI6

7856h  $\rightarrow$  2's compl  $\rightarrow$  87AAh

[MOV AX, [BX + ~~DI6h~~]]

87AAh

iv) 28 1D

28  $\rightarrow$  SUB mem8/reg3, reg3

28  $\rightarrow$  00101000

src

d = 0  $\rightarrow$  reg field contain ~~dest~~ reg

w = 0  $\rightarrow$  8-bit oper.

1D  $\rightarrow$  00 011 101

mod = 00  $\rightarrow$  memory mode no disp

reg = 011  $\rightarrow$  BX

8lm = 101  $\rightarrow$  DI

SUB [DI], BX