

COAL

## ASSEMBLY LANGUAGE (LAB)

→ Integer Constant

$26 \rightarrow$  decimal ( $26d$ )

$26h \rightarrow$  hexa-decimal

$101010_b \rightarrow$  binary

→ Character Constant

'A' | "A"

## //DATA TYPES//

EAX = 32 bit register

(1) BYTE, SBYTE

↳ unsigned int  
0-256

↳ signed int ; -128-127

(2) WORD

↳ 2 Byte signed : 4 Unsigned

(3) D WORD

↳ 32 Bit

(4) Q WORD

↳ 64 Bit

(5) T BYTE

↳ 80 Bit

\* For string  
greeting BYTE ("GOOD MORNING")  
[ each byte for each char ]  
Terminator

→ For Array  
list BYTE 20,10

→ For UnInitialized Variable

Var SWORD ?

Value will be assigned @ Run time

Registers

EAX = 32 bits ; each hexa = 4 bits ;



$$AL/AH = 8 \text{ bits}$$

$$AX = 16 \text{ bits}$$

## Array

- var1 BYTE 20 DUP(0)
- ↳ 20 bytes reserved → initialized to 0
- DUP(?) → All uninitialised
- DUP("STACK")

## // COAL - Sir Danish //

### • Basic Micro-Processor design

1- I/O devices

2- Memory Storage Unit

3- CPU / ALU / CU / Clock / Registers

4- buses

- Control Bus, )      • ALU → arithmetic & bitwise op.

- data Bus      • clock → synchronises CPU operations

- Address bus      • CU coordinates

## INSTRUCTION EXECUTION CYCLE

- Any program that needs to be executed needs to be fetched into memory
- Instructions loaded into Instruction Queue
- The operands are then fetched (operands = data values  $a+b$ )
- ALU executes the instruction
- Output stored      // learn diff b/w Var & regis

10  
→ variables (Aspects of variables)  
size, name, address, data  
mask over data

\* .CODE vs .DATA

Instructions

↳ raw data

→ when an instruction is loaded into memory,  
segments are formed. Segments contain multiple  
addresses



Segments:

→ data  
→ code  
→ slack

↳ segments are not contiguous  
in memory

↳ these segments, bee are stored in memory →  
base addresses

↳ segments' base is assigned a base address  
within the segment the individual mem.  
locations are assigned offset

(Theory)

Status Registers → calculations etc (Used in)

↳ e.g. Comparisons

$x = \text{Var1} - \text{Var2}$

$(\text{Var1} = \text{Var2})$  if  $x == 0 \rightarrow \text{Zero flag}$ .

$(\text{Var1} < \text{Var2})$  if  $x < 0 \rightarrow \text{Sign flag}$ .

else  $\text{Var1} >$ .

## Modes of Operation

- Protected mode → CPU, if working on segment P1, it cannot reach any other memory segment.
- System mode → CPU can access any segment of memory

Note: registers are global entity

## Assembly language of

↳ X86

Irvine32

32 bit

- Asm files.

MASM

Visual Studio

# ASSEMBLY LANGUAGE FUNDAMENTALS (chap 3)

main PROC

mov eax, 5 → copy 5 to eax register  
operands

add eax, 6

INVOKE Exit Process, 0

main ENDP

move destination, source.

· data // data Segment → var global to prog.

Sum Dword 0

· code // instructions

MOV AH, AH → reg to reg

MOV AH, 0AH → hexa value 0A,

AH into

MOV AH, 'A'

↳ ASCII value (65) will be saved in AH

→ string literal → series of characters

### Arithmatic Precedence

|     |         |                 |
|-----|---------|-----------------|
| 1st | ( )     | Parenthesis     |
| 2nd | - +     | Unary (as stgn) |
| 3rd | x / :   |                 |
| 4th | MOD     |                 |
| 5th | Add/Sub | Binary          |

### Reserve Words

e.g. PROC cannot use as variable.

We can pick names for (identifiers) for  
↳ Procedure , variable , labels

## COAL ( Lab )

Types of Registers

→ General Purpose

→ Control

→ Segment registers

→ data reg  
→ pointer reg.  
→ index reg.

## EAX (Accumulator register)

→ For multiplication

↳ The first operand is already present in the accumulator register ; the result is also stored in the same register

2nd operand

$$\boxed{AL} \times \boxed{8\text{-bit}} = \boxed{\begin{matrix} AH \\ AL \end{matrix}}_{\text{Ax}}$$

→ If operand is larger than 8 bits ; Data register is used.

$$\boxed{AX} \times \boxed{16\text{ Bits}} = \boxed{\begin{matrix} DX \\ AX \end{matrix}}_{\text{MSB } \text{LSB.}}$$

→ Pointee Registers

(1) Extended Instruction pointer

↳ add of next (executable) instruction

(2) Extended stack pointer

add of offset value w/in stack

(3) Extended base pointer

helps in referencing the parameter

variables passed to a sub routine



→ INDEX REGISTERS

• Extended Source Index (ESI)

source index for string operations

• Extended Destination Index

dest. index for string operation

e.g.



ESI S9

EDI 100

C // NOTE  
→ CS, EIP w IP cannot be  
destination operands

- For Increment → INC, register
- For decrement → DEC, register

E

→ MOV 2x Instrucción  
(MOV w zero extend)

MOV 3x (MOV w sign extend)

## FLAG REGISTER

↳ updates itself after each instruction



## Assembly lang (theory)

Instruction → work to do / operation

mnemonics → short codes for operations  
e.g. MOV.

label; a label give to some instruction so we can use it (goto) at any time in the programs.

e.g. LI: MOV Ax, Var1

↳ label.

target :

MOV Ax, bx

looping

JMP target

Directives → to make use of any resource outside the program.

↳ directly understandable to the microprocessor  
@ time of linking (included)

directives in AT

• .data, .code, .stack, .Dword

myWord DW 26 ; Dword tells the assembler to reserve space in the program for a double word variable.

// Diff b/w directives w/ instruction

(queue of addresses)

• 886 → following program will be for a ... ?

• model → decides how many memory Segs to be made for your program

• stack → 4096. (spare memory)

↳ creates a stack segment with 4096 bytes → 1KB

Read only memory

END P → end of procedure

END Main → End of program

↳ entry point name of procedure.

END PI → In the whole main only PI

will wait → rest procedures would be called

Defining data:

// always under .data

X BYTE 17

↳ scalar OFC001

17

X  
mask over address

X BYTE 17, 18, 19

↳ array;

→ Any data > a byte is stored in chunks;  
lower bytes will come first

e.g. X BYTE 17, 19, 20

1st element

|    |              |
|----|--------------|
| 17 | → lower byte |
| 00 | → upper byte |
| 19 |              |
| 00 |              |
| 20 |              |
| 00 |              |

// each element takes 2 bytes,  
where lower byte comes first.

// 17 = 0017 as a word

→ Since we only have one variable  
X → ~~point & get elem~~ and it has  
3 values 17, 18, 19 → here the  
other 2 values are unnamed → to  
reach them we use their offset

e.g. 

|     |     |     |     |     |     |
|-----|-----|-----|-----|-----|-----|
| 17  | 00  | 19  | 00  | 20  | 00  |
| 100 | 101 | 102 | 103 | 104 | 105 |
| X   |     |     |     |     |     |

To access 19 we use X+2

using Multiple Initializers e.g.

list 1 Byte 1, 2, 3, 4  
Byte 5, 6, 7, 8

To access 8 → we use list1+7.

list 2 Byte 10, 20, 30, 40  
Word 50, 60, 70, 80

OF01 10

OF02 20

OF03 30

OF04 40

OF05 50  
OF06 00

OF07 60  
OF08 00

OF09 70  
OF0A 00

OF0B 80  
OF0C 00

→ OF05 50  
Element.



8421

19 01

6919

To move 70 in Ax we write

MOV Ax, [list2 + 8]

// list2 + 8 is basically 70

But since it is an array of word,  
it would return the whole word 0070

// It doesn't matter if we use

list2 + 8 or list2 + 9 we would get

0070 → either address would  
return the whole element

Any word

135 1901h 2000h, OF0Lh  
STR byte "C0AL"

135 1901 // EDS

136 0A19 ↳ Data register

137 2000 // if there are is only

138 0B20 one array and it

139 0F01 is un-named → we

13A 0A0F will take the base

13B 'C' address of the data

13C 'O' register

13D A

13E L

70 C

# Defining Strings

→ greeting1 BYTE "GOOD MORNING", 0

NULL Terminator

↳ else the string doesn't have an exit point

not a part of the array

\* DUP operator (Duplicate)

Var1 BYTE 10, 3 DUP(0), 20

↳ in memory

[10|0|0|0|0|20]

BYTE 4 DUP("STACK")

= "STACK STACK STACK STACK"

BYTE 20 DUP(?)

↳ uninitialized

Var1 Sword 17, 3 POP(-17, 19)



[17|00|-17|00|-19|00|1|00|...]

Var Dword 3 DUP(4 DUP("ABC"))  
↳ size of data 36.

### LITTLE ENDIAN ORDER.

↳ all data types larger than a byte store their individual bytes in reverse order

Var1 Dword 12345678

|    |
|----|
| 78 |
| 56 |
| 34 |
| 12 |

→ data vs .data?

↳ data variables reserve bytes in the memory @ compilation time even if it is not used in the program

↳ any data variable under this is not initialized & doesn't take memory space

UNTIL used in code.

.data 2

bigArray Dword 5000 DUP(?) ↳ has to be initialized

↳ doesn't allocate memory until variable is used in code

## SYMBOLIC CONSTANTS;

$$\underline{\text{PI}} = 3.14267 \dots$$

- ↳ not a variable; whenever you write PI, you get its value.
- ↳ not in memory → only in program.
- ↳ Value cannot be changed @ runtime

## CURRENT LOCATION COUNTER; (\$)

VI WORD 16h, 160h, 1600h · OC10901 16  
DI DWORD \$ OC10902 00  
OC10903 60

since this is the current location counter, it will assign OC10907 to DI → OC10906 07  
OC10908 09  
OC10909 C1  
OC1090A 00  
OC1090B

WORD 1600, 160h, 1600h.  
BYTE 10, 20, 30, 40

list size ~~byte~~  $\equiv (\$ - \text{list})$

$$\$ = \text{C10907}$$

↳ This would give the offset (size of array) current location - start of array.

$$\text{listSize} = \text{C10907} - \text{C10901}$$
$$= 6$$

// Note, listSize should be IMMEDIATELY after the list otherwise current location would change accordingly

## DATA TRANSFERS INSTR(4.1)

// cannot use 2 memory operands

eg      MOV Var1, Var2      X  
                                not possible

### → Addressing Modes

↳ diff ways of accessing  
memory

① var name (Automatic dereferencing)

② using offset in array.

[Var1] → manual dereferencing

Dereferencing operator

→ eq 4 word 1907h, 190, 1019h.

Mov, EAX, [4+2] Bytes

↳ This would give an error bec  
dest > source

↳ to solve this we have Movzx & Movsx

it is called data extension

Movzx → it would add zeros  
on the LHS.

Movsx → for signed numbers, it  
adds 1 to the LHS.

MSB

either(1/0)

$\rightarrow \text{XCHG}$ ; exchanges the values of  
 two operands. At least one register  
 should be (immediate operands aren't  
 allowed)  $\text{xchq mem, mem}$   
 Not allowed!

'data

X BYTE 17h, 18h, 19h, 20h

Y word 1907h, 0190h, 1019h.

| X  | 17 | 18 | 19 | 20 | Y  | 19 | 07 | 19 | 01 | 10 | 01 | 19 | 01 | 90 |
|----|----|----|----|----|----|----|----|----|----|----|----|----|----|----|
| +1 | 18 |    | 19 | 07 | +1 | 07 | 19 |    | 90 |    |    |    |    |    |
| +2 | 19 |    | 19 | 01 | +2 | 19 | 01 | 19 |    |    |    |    |    |    |
| +3 | 20 | 18 |    | 00 | +3 | 00 | 19 | 01 |    |    |    |    |    |    |
|    |    |    |    | 10 | +4 | 10 |    | 19 |    |    |    |    |    |    |
|    |    |    |    | 19 | +5 | 19 |    | 10 |    |    |    |    |    |    |

MOV BL, [x+1]  $BL = 18$

MOV BH, [x+2]  $BH = 19$

MOV AX, [Y+2]  $AX = 1900$  ~~0190~~

$\text{xchq BL, } x+3$   $BL = 20$   $x+3 = 18$

$\text{xchq AL, AH}$   $AL = 01$   $AH = 90$

$\text{xchq AX } [Y+4]$   $AX = 1019$  ;  $Y+4 =$

$$x \text{ chg } Bh, [x] \quad -Bh = 17 \\ x = 19$$

|   |    | Bx | 17 | 20 |
|---|----|----|----|----|
| x | 19 |    |    |    |
|   | 18 |    |    |    |
| y | 19 |    |    |    |
|   | 18 |    |    |    |
|   | 67 |    |    |    |
|   | 19 |    |    |    |
|   | 90 |    |    |    |
|   | 01 |    |    |    |
|   | 09 |    |    |    |
|   | 90 |    |    |    |
|   |    | Ax | 10 | 19 |
|   |    |    |    |    |

Q

X Byte 19, 20, 21  
Y Word 1701h.

→ Mov Ax, [x+3] → Mov Ax, [y] Same  
// ↳ direct offset addressing ↳ direct addressing

\* INC → operands mem/reg (For Increment)  
\* DEC → operand mem/reg (for decrement)

eg.

Mov AL, 255 } This would result in AL=0000  
INC AL } But Carry Flag Won't be  
T in this case

DEC AL → 255.

eg:

X word 250, 1701, 1801

INC word PTR [x+2]

(need to tell compiler that  
it is byte/word etc. b/c  
we are using addition)

$$\begin{array}{r}
 11111111 \\
 +10000000 \\
 \hline
 10000001 \\
 01111111
 \end{array}$$

255.

**ADD**, destination, source.

↳ operands → reg/mem

- mem, mem // not allowed
- size = size.

→ Mov AL, 255 // This would set the  
 ADD AL, 1 carry flag to 1,  
 $2F=1, PF=1, AF=1$   
 ↳  $AL = 0000\ 0000$ .

→ OF = 0 & SF = 0 bec these flags  
 are for signed operations

\* **SUB**, destination, source.

→ Mov AL, -128 | would result in  
 SUB AL, 1 |  $AL = 1111\ 1111$  (FF)

$CF=0, 2F=0, PF=1$

make sure  $AF=0$ ,  $SF=1, OF=1$

→ Carry Flag is only for Unsigned operations

$$-5 + (5 - 4) \\ -5 + 1 \\ -4.$$

## \* Neg (Negate operation)

↳ reg / mem.

Change from +ve  $\rightarrow$  -ve & vice versa

Mov AL, -128 ] would result in  
 Neg AL . ] overflow.

$\rightarrow Q_s$

$$A = -B + (B - C)$$

Mov Eax, B.

Sub Eax, C

X Sub Eax, B | Neg B  
 Add eax, B.

Mov A, eax

## IMPORTANT

Direct

[1009FC1], VARI, [VARI]

Base OFFSET

[Var 6+12]

INDIRECT ADDRESSING

[EBX], [ESI].

INDEXED ADDRESSING ARR[ESI], APR[7]

## (e) DATA RELATED OPERATORS & DIRECTIVES

- OFFSET - Operator
- PTR - Operator
- TYPE - Operator
- LENGTHOF - Operator
- SIZEOF - Operator
- LABEL - Directive

### 1) OFFSET

|         |    |            |       |
|---------|----|------------|-------|
| → IOPCI | 19 | Base Add   | IOPCI |
|         | 20 | offset     | 2     |
| →       | 21 | offset Add | IOPC3 |

→ For address use 32 Bit destination

e.g. mov esi, OFFSET dwl

↳ would move offset address of dwl to esi (register)

### 2) PTR

↳ used to override the size of an operand.

e.g. myDouble Dword 12345678h || means we are accessing  
mov, ax, myDouble  
mov, ax, WORD PTR mydouble a subset  
↳ ax = 5678 (equal to word)  
of myDouble

mov

eg.  $x \text{ byte } 19, 20, 21, 22, 24, 25$

• code

mov eax, dword  $[x+1]$  OR

// eax = 24 22 21 20

Qs Why little endian? Importance

// search

### 3) TYPE

↳ Tells us the type of memory.

↳ no. of bytes occupied.

eg.  $x \text{ byte } 10$ . | size of single  
Type  $x = 1$  | elements in  
an array

### 4) LENGTHOF

↳ returns no. of elements in an array

### 5) sizeof

↳ size of the whole array.

(no of elements \* size of element)

Sizeof = lengthof \* Type.

•data

x Dword 1717h, 1901FD01h

y Dword ?

-1001 17

+1002 17

+21003 0D

+31004 0D

1005 01

1006 C9 FD

1007 ~~10~~ 01

1008 19

1009 ~~00~~ 04

100A 10

100B 00

100C 00

mov ESI, offset [x+3]

mov EBX, [ESI]

{ ESI = 1004

↳ would move the value @ address

1004 to EBX.

// hence proving registers can act like  
pointers, but variable cannot

// mov EBX, ESI

↳ would move VALUE  
of ESI [] → this would dereference

### 6) LABEL

- assigns an alternative label name
- Type to an existing storage location → does not allocate storage of its own

eg.

.data

val16 LABEL WORD

val32 DWORD

12U39h

3)

val16 word

|    |
|----|
| 39 |
| 24 |
| 01 |
| 00 |

word  
dword  
val32

## 4.4 INDIRECT ADDRESSING

4)

• data

val1 Byte 10h, 20h, 30h

• code

mov esi, OFFSET val1

mov al, [esi]

inc esi

mov al, [esi]

inc esi

mov al, [esi].

5)

$\text{INC } [n] \rightarrow (n^2)$

$\text{INC ESI}$  → Address would inc  
 $\text{INC [ESI]}$  → value would inc

$\text{INC [Var1]}, \text{INC } \& \text{Var1}$

↳ same thing ; would inc value.

## INDEXED OPERANDS

• data

array W word 1000h, 2000h, 3000h.

• code

`mov esi, 0`

Index register offset displacement base pointer

- JMP AND LOOP OPERATIONS
  - ↳ Conditional Jumps based on status
  - ↳ Un-Conditional Jumps
- \* JMP Instruction causes an unconditional transfer to a destination, identified by a code label

top:

INC AX

MOV BX, AX

JMP lop

↳ offset of destination is given to pointer

## \* LOOP Instruction

- ↳ loop according to ECX counter

Step 1 CX (decrements)

Step 2 IF ( $X > 0$ )

] happen when reaches the instruction loop

|   |   |
|---|---|
| 4 | 1 |
| 3 | 2 |
| 2 | 3 |
| 1 | 4 |
| 0 |   |

eg.      mov ax, 0  
           mov ecx, 5

exe. ↓  
       cute ↓  
L1 : inc ax  
loop L1.  
       mov bx, ax

reads Loop

then dec ecx;

|     |             |
|-----|-------------|
| ecx | ax          |
| 5   | 0           |
| 5   | 1           |
| 4   | 2           |
| 3   | 3           |
| 2   | 4           |
| 1   | 5           |
| 0   | — terminate |

loop L1

— terminate

## STACK OPERATIONS

↳ Last in First out.

### \* PUSH (INSERT)

|                         |       |           |                       |
|-------------------------|-------|-----------|-----------------------|
| 002F → ESS              | 16    | ↓         | Every Push decrements |
| 002B                    | 18    | ↓         | the ESP by 4          |
| 0017                    | 19    | ↓         |                       |
| 0013                    | 20    | → Last in |                       |
| higher<br>↓<br>to lower | → ESP |           |                       |

- Stack grows downwards
- ESS — base address
- Offset of most recent insert is given by ESP.

↳ Initially  $ESP = ESS$

- ↳ Then w/ every push ESP is decremented by 4.
- ↳ above example, the first element pushed was 16 so ESS points to that (base)
- ↳ 20 was the last element to be inserted so ESP points to that

- // Memory segment → grows ↑
- // Stack segment → grows ↓

## STACK

# POP (to GET)

↳ Every Pop will Increment ESP by 4

→ When POP → The last element (20) .

will be removed

↳ ESP will now point to address 19 .

## INSTRUCTIONS

X PUSH reg / mem16 → allowed but don't use for now.

PUSH reg / mem32 ✓

PUSH imm32 ✓

→ Push cannot be < than 16 bits

POP reg / mem16 } imm not allowed

POP reg / mem16 }

- If operand is 16 bits, ESP is incremented by 2 ; if 32 bits by 4 .

OF8  
OF7  
OF6  
OF5

OPB  
OPC  
OPD  
OEE  
OFF  
100

OIB  
OIC  
OID

- PUSHAD

Pushed all 32-bit general purpose registers on the stack in the following

order: EAX = CF

ECX = 12

EDX = 77

EBX = 99

ESP → value before executing Pushad (00h)

EBP = 70

ESI = 00

EDI = F9.

|      |     |                                    |
|------|-----|------------------------------------|
| 100  | CF  | // You copied the H-A values wrong |
| 101  | OFC | 12                                 |
| OF8. |     | 77                                 |
| OF4  |     | 99                                 |
| OPF  |     | 100                                |
| OEJ  |     | 70                                 |
|      |     | 00                                 |
|      |     | F9                                 |

- POPAD

→ Pops the same registers in the reverse order.

EDI — ESI — EBP ... etc

→ POPA ← PUSHA → same; but for 16bits

## 5.2 DEFINING & USING PROCEDURES

↳ To define a subroutine  
main PROC

CALL → calling any procedure  
RET → where to return(?)

end PROC(?)

P1 PROC

eg: Mov Ax, 17h 0017

Mov Bx, 35h 0018

CALL P2 0019

Mov Cx, Ax 001A

RET 001B

P1 ENDP

EIP = 0017

EIP = 070C

slack = 001A

P2 PROC

Mov Cx, 11h 070C

Add Ax, 2 070D

CALL P3 070E EIP = 0986

Add Cx, Dx 070F slack = 070F

P2 ENDP

P3 PROC

Mov Dx, 21h 0986

Add Cx, Dx 0987

Add Cx, Bx 0988

RET 0989 ; POP into EIP

P3 ENDP

END P3

EIP = 070F

// procedure called @ end is completed first.

EIP → next instruction to be executed

// CALL will push the return address into stack and changes EIP to the next instruction to be executed (obj of the called proc)

// We can Pass Arguments through Registers since they are global entity

NOTE

END P1 → tells this is the starting point of program

it

END P2 → whole program starts from P2 ; P1 is useless

## // Revision

LI: Mov Cx .. .  
;  
}; ] local to the  
procedure,  
to make global  
use LI::

→ Call → JMP to another procedure  
and returns back

JMP → doesn't return

$XOR \rightarrow$  same Input  
↳ 0 output

## CONDITIONAL PROCESSING

### AND INSTRUCTION

Performs a boolean AND operation  
bit by bit pair of matching bits in  
two operands

Result stored at AL

AND destination, Source  
↳ result stored in destination

e.g.

$$\begin{array}{ll} \text{MOV AL, } & 1010 \ 1110 \\ \text{MOV CL, } & 1101 \ 1000 \\ \text{AND AL,CL; } & \text{AL} = 1000 \ 1000 \\ \text{CF} = 0; & \end{array}$$

↳ every time AND is performed  
 $CF = 0$ ;  $OF = 0 \rightarrow$  8 bits ka AND can never be a bit

### OR INSTRUCTION

↳ same as And  $\rightarrow$  Just diff  
operator

↳ in above e.g.

$$\begin{array}{r} 1111 \ 1110 \\ F \quad E \end{array}$$

### XOR INSTRUCTION

$AL = 0 \ 11 \ 0 \ 0 \ 110$  // carry w/ overflow  
= 0 for all above

$\begin{array}{r} \text{X} \\ \rightarrow 11111111 \\ \rightarrow 0000010 \\ \rightarrow 0000010 \end{array}$ 
 $\begin{array}{r} \text{X} \\ \rightarrow 0010 \\ \rightarrow 111 \\ \rightarrow 0010 \end{array}$

eg  $X$  byte  $\rightarrow$  value unknown

$\hookrightarrow$  to check value

ADD  $X, 11111111$

OR  $X, 00000000$

But these would change the value of  $X$

$\hookrightarrow$  to avoid this we use TEST

$\hookrightarrow$  similar to AND, but this sets the flags and doesn't store in destination ( $X$ )

$\hookrightarrow$  eg to check the MSB of  $X$

Test  $X, 1000 000$

$\hookrightarrow$  In this case if  $ZF=0 \rightarrow$  this means MSB was 1 if  $ZF=1 \rightarrow$  MSB was 0.

## NOT INSTRUCTION

$\hookrightarrow$  only on bits

MOV AL, 10h

NOT AL ;  $AL = 1110 1111$

$\hookrightarrow$  doesn't have any change on flag



// 'NOT' can be used to encrypt data

for eg  $A = 41h$ ; if we NOT A  
it turns to  $A = BE$ ;

$$\hookrightarrow A = 0100\ 0001$$

$$A' = 1011\ 1110$$

$\hookrightarrow$  change this @ destination

## CMP INSTRUCTION (dest - Source)

We use this to compare integers.

This performs an implied subtraction of a source operand from dest

// ONLY FLAGS ARE AFFECTED

If

$$\begin{array}{l} CF = 1 \\ ZF = 0 \end{array} \left. \begin{array}{l} \text{source} \\ \gg \end{array} \right\} \quad \begin{array}{l} CF = 0 \\ ZF = 1 \end{array} \left. \begin{array}{l} \text{destination} = \\ \text{source} \end{array} \right\}$$

$$SF = CF \quad \left. \begin{array}{l} \text{dest} \\ \gg \end{array} \right\}$$

when signed operands

## 6.3 CONDITIONAL TRAPS

based on  
↳ ECX  
↳ signed  
↳ unsigned  
↳ flag

eg.

CMP EAX, 0

JZ L1

:

L1:

; Jump if zeroFlag = 1

↳ In this case 2F = 1

when EAX is also 0.

3)

eg2

ADD DL, 10110000B

JNZ L2

; Jump when 2F != 0

↳ if DL = 4Fh then

4)

this would mean result is 0 hence this loop would not work.

5)

eg Jcond

destination

JZ

} Jmps based

JNZ

on flags

JC

JNC

## JUMPS BASED ON EQUALITY

- ↳ JE → Jump if equal (left op=right op)
- ↳ JNE → " " not equal
- ↳ JCx2      Jump if CX = 0
- ↳ JECXZ     " " ECX = 0
- ↳ JR CXZ    Jump if RCX = 0.

CMP BL, AL } For JMP inst ; it responds  
JNE LI      } to the instruction immediately  
                        above it

Ex4

xor ecx, ecx      } since xor same  
JECXZ L2        } bits then ans  
                        is defo zero

// JE → Indirectly checks zero flag //

Ex2

mov bx, 1234

sub bx, 1234

JNE L5 ; Check if ZF=0

JE L1 ; check if ZF=1

↳ JE == Jz

## ↳ UNSIGNED COMPARISON

JA

Jump if above (left op > right op)

JNBE

Jump if not below or equal

JAE

Jump if above or equal (dest > source)

JNB

Jump if not below

JB

Jump if below (dest < source)

JNAE

Jump if not above or equal

JBE

Jump if below or equal (dest <= source)

JNA

Jump if not above

### Example

if ( A > B ) { B > C } B > D )

A ++;

else

B ++;

C ++;

D ++;

↳ write code in arm.

mov eax, A  
mov ebx, B  
mov ecx, C  
mov edx, D

Cmp B, D X  
JA L1      Cmp A, B  
              JA L2

L1:  
Cmp B, C  
JA L2

L2:  
inc A

Cmp B, D  
JA L1  
L1:  
Cmp B, C  
JA L2

L2:  
inc A <sup>TMP</sup> ~~condElse~~  
~~jmp L2~~

Cmp A, B  
JA L2  
jmp condElse  
condElse:  
inc B  
inc C  
inc D

## SIGNED COMPARISON

↳ Jumps based on signed

JG → Jump if greater

JNLE → Jump if not less or equal

## 6.4 CONDITIONAL LOOP INSTRUCTION

Step 1: decrement ecx

Step 2: check if ecx > 0

### ↳ LOOPE | LOOPZ

i) decrements ecx

(ii) If ecx > 0 AND if zero flag = 1  
Then JMP dest.

e.g.

MOV CX, 10

L1: SUB CX, 1

LOOPZ L1

} loop will not work;  
since zero flag will  
not be set.

### ↳ LOOPNZ | LOOPNE

↳ checks if ecx > 0 AND ZF = 0

Then it will loop | JMP to destination

↳ Take user input of character

↳ output if the entered is a

vowel, consonant OR non char

## Chapter 7: INTEGER ARITHMETIC

### 7.1 Encrypt-Decrypt

↳ In order to Encrypt-Decrypt  
data → must make sure you don't  
lose any data

### 7.2 SHIFT / ROTATE INSTRUCTIONS :

Qs. ↳ SHIFT / ROTATE INSTRUCTIONS :  
↳ to shift any number of bits / data  
\* Count → No. of shifts to perform  
\* Direction → Right or left?

3)

eg. Shift 1 place to right      ↳ shift w/ rotate

1010 1111

1

empty bit.

Carry Flag

changes OF/CF

4)

|| to overcome this empty MSB, we have:

→ Logical shifting

5)      ↳ stores zero @ the empty bit place

→ Arithmetic shifting

↳ stores the same bit that was present before the shift

↳ In this ex. MSB remains 1.

SH

↓

performs  
left shift

→ MSB

→ LSB

eg

## SHL INSTRUCTION

↓  
performs a logical left shift.  
→ MSB → carry flag  
→ LSB → stored zero

Operand types

|              |                                      |
|--------------|--------------------------------------|
| SHL reg, imm | ↑ max 8 bits                         |
| SHL mem, imm | no concept of same size of operands. |
| SHL reg, CL  |                                      |
| SHL mem, CL  |                                      |

eg.

MOV AL, 0F0h

SHL AL, 5

0000 1111 0000

1111 0000 0000

↳ will eventually ;  
after 5 shifts

turns to zero

NOTE

SHL  $\frac{\text{reg, mem}}{\downarrow}$

destination, no.of iterations

## SHR Instruction

↳ same as SHL → just direction changed

highest (MSB) → replaced w zero

LSB → copied @ CF

eg. SHR 1111 0000, 2

#1 0111 0000 ; CF=0

#2 0011 1100 ; CF=0

carry flag holds

Most recently  
kicked out bit

HELL w  
previous shift

00 10  
0110

↳ SAL (shift Arithmetic Left)

eg ↳ works the same as the SHL  
↳ so r/r SAR is the only

Arithmetic shift

↳ SAR (shift Arithmetic Right)

Qs

MOV AL, 1001 1010h  
SAR AL 3.

3)

#1 1100 1100 ; CF=0

#2 1110 0110 ; CF=1

#3 1111 0011 ; CF=0

↳ This is an Arithmetic shift to Right  
where :

4)

Arithmetic | CF = LSB  
MSB = last, MSB ( retains the sign  
↓  
shifts Bit )

5)

## ROTATE INSTRUCTIONS

↳ ROL Rotate left

↳ Bitwise rotation towards left

↳ not losing any bit of data since  
after one rotation MSB=LSB.

↳ copy ↴ LSB is present in carry flag



↳ e.g. MOV AX, 1234, 4321

↳ do the work w/ ROL inst

ROL AL, 4

ROL AH, 4

Rd AX, 8

|    |      |      |
|----|------|------|
|    | 3    | 4    |
| R1 | 0011 | 0100 |
| R2 | 0110 | 1000 |
| R3 | 1101 | 0000 |
| R4 | 1010 | 0001 |
|    | 0100 | 0011 |

After 4 ROL of AL

AL = 0100 0011 (4,3)

## ROR INSTRUCTION

↳ works in the same way as ROL but does it in opp direction

## ↳ RCL (rotate carry left)

- eg ↳ RCL (rotate carry left)  
↳ shifts each bit to the left, copies  
carry flag to the LSB and THEN copies  
MSB to the CF

NO! @ the  
same time

\* SHL



Qs

Prev value of CF → LSB

MSB → new value of carry flag

All atomic  
operation

3)

- ↳ Simultaneously copies CF to LSB "4 puts"  
MSB into CF

eg



4)

## ↳ RCR

- ↳ Does the same shit as rcl  
only in opposite direction (towards right)

## \* SHLD | SHRD

→ source always a register

↳ reg16, reg16, CL / imm8

mem16, reg16, CL / imm8

reg32, reg32, CL / imm8

mem32, reg32, CL / imm8

↳ SHLD (shift left double)

↳ shifts towards the left by MSB

keeps on Amending the carry flag.

The empty bits @ MSB of destination  
are filled by MSB of source.

eg.

MOV AX, 1234

MOV CX, 9797

SHLD CX, AX, 4.

1001 0111 1001 0111

0111 1001 0111 0001 → CF = 1

7 9 7 1

D , S , Count

[10101]

[11110]

after shift

0111 011<sup>1</sup><sub>2</sub> 1<sup>3</sup><sub>4</sub>

↳ SHRD (right double)

↳ works the same as shld only  
destinations MSB are filled by the  
LSB of source

Qs

## ENCRYPTION & DECRYPTION

Encrypt

ROL AX,5      NOT AX

Add AX,3      dec AX

ROL AX,7      ror AX,7

Inc AX      Subs AX,3

NOT AX      ROR AX,5

4

5

8 u 21

$\text{CX} = \begin{smallmatrix} 1 & 0 & 1 & 0 & 0 & 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{smallmatrix}$

Q2

Mov Ax, 7907h  
Mov Cx, AOA0h  
SHRD Ax, Cx, 7

$\begin{array}{r} 0111 \\ \text{1st } \swarrow \quad \square 011 \\ 1001 \quad 0000 \quad 0111 \\ \text{CF} \swarrow \quad \square 011 \quad 1100 \quad 1000 \quad 10011 \quad \searrow \text{CF=1} \\ 2\text{nd } \swarrow \end{array}$

$\begin{array}{r} 0001 \\ \text{3rd } \swarrow \quad \underline{0} \\ 1110 \quad 0100 \quad 0001 \quad \text{CF=1} \end{array}$

$\begin{array}{r} 0000 \\ \text{4th } \swarrow \quad \underline{0} \\ 1111 \quad 0010 \quad 0000 \quad \text{CF=1} \end{array}$

$\begin{array}{r} 0000 \\ \text{5th } \swarrow \quad \underline{0} \\ 0111 \quad 1001 \quad 0000 \quad \text{CF=0} \end{array}$

$\begin{array}{r} 0000 \\ \text{6th } \swarrow \quad \underline{0} \\ 0011 \quad 1100 \quad 1000 \quad \text{CF=0} \end{array}$

$\begin{array}{r} 000 \\ \text{7th } \swarrow \quad \underline{1} \\ 0001 \quad 1110 \quad 0100 \quad \text{CF=0} \end{array}$

$\begin{array}{r} 100 \\ \text{8th } \swarrow \quad \underline{0} \\ 0000 \quad 1111 \quad 0010 \quad \text{CF=0} \end{array}$

$\Rightarrow 4 \ 0 \ F \ 2$

## MULTIPLICATION & DIVISION INSTRUCTIONS

↳ unsigned → MUL / DIV  
 ↳ signed → IMUL / IDIV

Qs ↳ IDIV & IMUL keeps in mind the value of the MSB (signed bit)

3 // MUL takes one operand

| Multiplicand | Multiplier  | Product                |
|--------------|-------------|------------------------|
| AL           | reg / imm8  | AX                     |
| AX           | reg / imm16 | <del>EA</del> * DX, AX |
| EAX          | reg / imm32 | EDX, EAX               |

upper 32 = EDX  
 upper 16 = EAX  
 lower 32

5 // After 16# Multiplication, CF/OF=0  
 if upper half of the result is zero;  
 otherwise

↳ meaning result is exceeding size of operand

## SIGNED MULTIPLICATION w/ IMUL

↳ 3-types of IMUL.

- IMUL A → 1 operand
- IMUL A, B → 2 operand
- IMUL A, B, C → 3 operand

\* 1 operand → same as MUL, A.

### 2 Operand

eg IMUL DX, CX (DX Multiplied by CX)

↳ used when we know my result is not supposed to exceed 16 bits

↳ Any result > than size of (operand) will lose data.

↳ any bits > than 16 will be truncated

eg  $\begin{bmatrix} DX, -8 \\ CX, -8 \end{bmatrix}$  ] IMUL DX, CX  
↳ DX = DX & CX

D 7  
ME 5

\* 3 Operands IMUL

↳ IMUL BX, CX, 8

then  $BX = CX * 8$ .

Q ↳ In 3 operand IMUL, we provide the destination ourselves.

↳ TASK → When does Carry Flag change w IMUL



## DIV

### DIV INSTRUCTION

Div reg/mem → DIVISOR

↳ DIVIDENT is implicitly taken

↳ if divisor is  $2^x$  bits; dividend will be  $2^x$  bits

2 Results; quotient & remainder

| DIVIDEND | DIVISOR   | RESULTS |     |
|----------|-----------|---------|-----|
| AX       | mem/reg8  | AL      | AH  |
| DX, AX   | mem/reg16 | AX      | DX  |
| EDX, EAX | mem/reg32 | EAX     | EDX |

eg mov ax, 0083h ; dividend

mov bl, 2 ; divisor

div bl ; AL=41h

AH=01h

## SIGNED DIVISION

↳ Signed operands

↳ Sign extension (MOVsx) necessary

eg MOV DX, 0

MOV AX, -1

MOV BX, 1

IDIV BX

DX AX

BX

↳ to preserve the

Sign

MOVsx DX, AX

|| Note: MOVsx only converts MSB's into signed bit

## INSTRUCTIONS FOR SIGNED INSTRUCTIONS

1. CBW (convert byte to word)

↳ CBW; MOVSX AX, AL

Extends MSB of AL into AX

2. CWD (convert word to double)

↳ CWD; MOVSX DX, AX

Extends MSB of AX to DX

3. CDQ (convert double to Quad)

↳ CDQ; MOVSX EDX, EAX

Extends MSB of EAX into EDX

## 7.4 EXTENDED ADDITION & SUBTRACTION

↳ ADC (Add with carry)

This Instrc adds both a source operand  
and the contents of the carry flag  
to the destination

ADC dest, source

; dest = dest + source + CF

@  
↓  
LSB

CF → Purani wali Inst ki carry

WJ11010101

SBB

Subtract \_\_\_\_\_

borrow

SBB dest, source

$$; \text{dest} = (\text{dest} - \text{source}) - \text{CF}$$

TASK: MID  $\rightarrow$  Qs 2(a) even odd Number.

↳ array of 1000 elements (random Pos)

↳ calc move odd in to 2nd array

## → MEMORY SEGMENTS

- Stack → local Data → every function gets its own stack frame
- Ret Addresses
- Arguments
- Register values

- Q
- Data
  - Code

3 Every called procedure gets its own stack frame

$$\text{No of stack frames} = \text{No of called procedures} + 1$$

↓  
of the main func

- L
- Arguments → values passed to func @ runtime
  - → EBP used as zero point

How STACK FRAME Is CREATED?

1. Passed arguments are pushed on stack
2. The subroutine is called, causing the subroutine return addresses to be pushed on the stack
3. As the subrou begins to execute ; EBP  
is pushed on the stack
4. local variables →  $\text{EBP} = \text{Zero Point} = \text{ESP}$
5. used Registers

## ADVANCED PROCEDURES

### STACK



To use Register stack for Passing Parameters

• data

Var1

• code

Push offset Var1

Pass by ref

Call P2.

• data

→ Pass by val

Var1

Var2

• code

Push Var1

Push Var2

Call P2



→ ESP

P2 :

push EBP  
mov ebp, esp  
mov eax, [ebp+  
add eax,



EBP+12

EBP+8

EBP+4

E9

P2 :

push ebp

\* mov ebp, esp ; base of stack frame // zero point?

mov eax, [ebp+12] ; second Parameter

mov eax, [ebp+8] ; first Parameter

pop ebp

// To clean slack after calling P2

bring esp back @ start of stack frame P2

call P2

add esp, 8

↳ esp points @ start of stack frame;

next time overwrite data

\* we moved esp into ebp to make esp  
the reference / zero point so we can  
easily access Arguments by going up the  
stack (add 4) or local variables by going  
down the stack (sub 4)

EJ main PROC  
Call Example

Exit  
Main ENP

Example 1 PROC

Push 6  
Push 5  
Call Add Two  
ret

Example1 ENPP

FF12 ret

FF0D EBP

FF09 ret

FF05 EBP

FF01 6

FEOB 5

FEO9 ret

EE05

EBP

main

Example

Add Two

// Every Time a Function is called ; EBP → Push  
w EBP == ESP

→ CALLING CONVENTIONS [C, STD]

The C calling convention solves the problem of cleaning up the runtime stack in a simple way

↳ calling conventions reclaims the space occupied by the arguments before POP EBP w extra work → manually

↳ hence the add esp, & → size of Arguments

is done @ back end

## 2. STD calling Convention

↳ returns an integer value

to the calling procedure (PI)

↳ an integer parameter to the RET

instruction, which in turn adds

integer to ESP after returning to the

calling procedure

\* The integer = size of Arguments



//

### IMP.

// Popping out EBP makes sure  
we do not lose the zero reference  
point of the prev call

eg

|          |              |            |
|----------|--------------|------------|
|          | FF09         | ret        |
| *        | EBP=ESP=FF05 | FF05       |
|          | FF01         | B          |
|          | FE0D         | S          |
|          | FE09         | ret        |
| EBP=FE05 | FE05         | EBP → FF05 |

↳ when popping EBP  
again FF05

## SAVING & RESTORING REGISTERS

↳ If registers are pushed into slack

REGISTERS SHOULD BE POPED OUT FIRST &  
IS A MUST

Then POP out EBP Then release  
addres in the clean the Arguments

## // LOCAL VARIABLES

MySub PROC

push ebp

mov ebp, esp.

sub esp, 8

// saving space for local variables

mov Dword Ptr [ebp-4], 10 ; x

mov Dword Ptr [ebp -8], \$0 ; y

mov esp, ebp

; remove locals

pop ebp

from slack

ret

ret address

EBP → EBP

loc(x) → EBP-4

20C4) EBP-8

## ENTER " LEAVE INSTRUCTIONS

- ↳ pushes EBP on the stack
- ↳ sets EBP to base of stack frame
- ↳ reserves space for local variables.

ENTER nbytes, nesting level

Enter does all what  
we do manually.

P1 NL ①

call P2 ②

call P3

call P4 ②

Proves that two/more  
procedures can have same  
nesting level.

## Leave (no arguments)

- ↳ reverts whatever is done by Enter

## LOCAL DIRECTIVE

### PI PROC

Local x : Byte , Y[10] : word .

↳ way to make local variables

↳ can only be used w/in PI

↳ if used, Local be used Just after  
the PI Proc line.

## TASK // lea instr wali slide using Local directive

Lea → load effective Address

// used to get address of anything @ runtime

lea esi, [ebp-30]

## RECURSION

- ↳ for base case cmp w any data variable ; eg if 4 recursions
- need to cmp ecx with 0
- more memory consumption

## Calculating a Factorial

## INVOKE DIRECTIVE

↳ only available in 32-bit  
using INVOKE is reduced to a single line  
in which the arguments are listed in  
reverse order (assuming STDGALL is in effect)

eg      push TYPE array

Push length of array

Push OFFSET

ADDR → Same As ~~Adder~~ offset

// By reference

→ INVOKE Sub2, ADDR byteVal

sends offset of byteVal to sub2

// By value

P. → INVOKE Sub1, byteVal, wordVal

sends value of val in wordVal  
to sub1

\* // can also send Constants

## PROC DIRECTIVE

↳ to define function Parameters

// PI PROC , x:BYTE , W:PTR word

↳ This is a signature of this procedure

// when invoking this ; first passing  
will be a value of byte-size

// the second passing is an addres.

## PROTOTYPE

\* To define a subroutine / procedure prototype

PI Proto x:Byte, w PTR byte

## TASK II Prev task using USES

### USES OPERATOR

ArraySum Proc USES esi ecx

    mov eax, 0

    L1:

    !

    loopl

    ret

Arraysum EndP

→ same as

Array Sum Procedure

Push esi

Push ecx

    mov eax, 0

    L1:

    !

    loopl

    pop ecx

    pop esi

Array Sum.

Uses is an operator which automatically pushes & pops registers in a procedure.

// Argument size mismatch

→ making sure that the argument sent is

the same size as the parameter that receives it

→ Never send constant value to a pointer parameter

In 32-bit mode, if we use parameters of

a diff size ; it prevents ESP from being

aligned on a double word boundary

↳ hence a page fault may occur & runtime

Performance may be degraded. // Memory is not

properly aligned.

↳ Same size Arg = Aligned Val

Note:  
If a 64-bit Argument is sent → it  
is stored in 2 halves (parts). High level  
in low level.

### NON DOUBLE LOCAL VARIABLES

P. if for example we use a  $\times$ :byte as  
a local variable, it is still stored  
on 4 bytes on the stack

- // Every local variable  
data is given the  
size of your architecture
- // if byte / word ... etc  
Any. Each are reserved  
a memory of 4 bytes



Note: These Null bytes are not a part of  
 $\times$ :byte → Just present here for  
memory alignment. // Dont confuse w/o/  
casting

Quiz Next Week → Chap 6, 7, 8.

## CHAPTER 9

### STRING & ARRAYS

#### • string primitive Instructions

↳ MOVSB, MOVSW, MOVSQ

↳ These instructions copy data from the memory location pointed to by ESI to the memory location pointed to by EDI

.data

source Dword

target Dword ?

.code

move esi, offset source

mov edi, offset target

movsd

↳ will copy contents of [esi]

to [edi] → of size 4 bytes because [sd]

↳ will also inc 4 int esi 4 edi

since command is movsd

DF // Direction Flag → 0 → forward  
→ 1 → backward

M

ea  
A word F2FO LCFI 11CFW  
B word 3 DOP(?)

mov esi, offset A  
mov edi, offset B

Pc



// To copy 1000 word elements you  
might use movsw a 1000 times  
in a loop

## REPEAT INSTRUCTION

X word ... 1000 → esi  
Y word ... 1000 → edi

CX ← 1000

L1 :

movsw

loop L1

OR

CX ← 1000

rep movsw

Both work the same way only rep instruction only works on string primitives and rep decrements CX as well.

// keep in mind that when processing word array use movsw only if you use movsb → then MSB will be saved as LSB by vice versa due to little Endian.

## COMPARE INSTRUCTION

→ CMPSB, CMPSW, CMPSD

↳ Compares a memory operand pointed to by ESI with memory operand pointed to by EDI

| Pc | esi | edi |
|----|-----|-----|
| *  | 12  | 76  |
|    | 99  | 12  |
|    | 00  | 00  |
|    | 10  | 02  |

Cmpsb → means 12, 76.

Cmpsw → 9912, 1276.

Cmpsd → 10009912, 02001276.

↳ compares a whole doubleword from ESI to a double word from EDI

**REPE** → repeat if cx != 0 & zero flag is ON

## Example Comparing 2 strings

source M A R T I N \_ \_  
↑  
esi

dest M A R T I N E Z  
↑  
edi

will compare till reaches the first  
space, bec of repe.

repe will not repeat when zero

flag is off flag is when  
'E' in space is not equal

↳ SCASB, SCASW, SCASD.

↳ scasb compares a value in al  
w address by edi

↳ scasw → AX by EDI

↳ scasd → EAX by EDI

## // Program to Search substring in String

date  
alpha "ABCDEGFGH", 0

mov edi, offset alpha ; EDI points to string

mov al, 'F' ; search F

mov ecx, length of alpha ;

clD

repne scasb

jnz quit

dec edi

; direction forward

; repeat while ! equal

; quit if letter not found

; found: edi since

will be pointing to G

decdi will ~~make~~ point

it to F again

CISC  $\rightarrow$  Complex Instruction Set  
RISC  $\rightarrow$  reduced

STOSB, STOSW, STOSD // Storing.

→ Stores the contents of Accumulator to the memory by EDI

↳ source = AL/AX/EAX

dest = EDI // Increments

.data

Count = 100

// check from slides

LODSB, LODSW, LODSD.

// Loading

loads a byte / word / DW from memory at ESI to AL/AX or EAX

source = ESI // INC ESI

Dest = AL/AX/EAX

// Good code of Array Mul check from slides

# TWO DIMENSIONAL ARRAY

- Two methods of arranging rows & columns are: Row-Major or Column Major

P. Array = 

|    |    |    |    |     |
|----|----|----|----|-----|
| 10 | 20 | 30 | 40 | 50  |
| 60 | 70 | 80 | 90 | A00 |
| B0 | C0 | D0 | E0 | F0  |

\*

Row Major

10 20 30 40 50 60 70 80 ....

Ex

Column Major

10 60 80 20 70 C0 30 80 D0

// No. of type of elements Must be same  
in each row

Base-Index Operands

[base + index]

// pointer needed for each row [pointing]  
esi used to access elements in kolumn  
in each row

Arr [ebx] [esi]

deck  
table  
R

Declaration

Table B Byte 10, 20, 30, 40, 50

RowSize = (5 - Table B)

B. Sum of column 4 in eax. array[5][5]

mov ebx, offset Array.

l1: lea ebx mov ecx, 3

L1:

add al, [ebx + column index]

add ebx, row size

↓  
Accessing  
element  
in  
array

Loop1

# MACHINE LANGUAGE

## INSTRUCTION FORMATS

| Instruction Prefix | Opcode | Mod/M | SIB | Add Displm | Imm data |
|--------------------|--------|-------|-----|------------|----------|
| 1                  | 1-3    | 1     | 1   | 1-4        | 1-4      |

- Instructions are stored in little endian order, so the prefix byte is located @ instructions starting address.
- Every Instruction has an opcode, remaining fields are optional  
mnemonic  $\rightarrow$  mask over opcode  
e.g. Opcode of ADD = 00
- Most Instruction are of 2-3 Bytes.

## OPCODE

eg op code for mov = 88. <sup>not necessarily</sup> hence

|   |   |   |   |   |   |   |   |
|---|---|---|---|---|---|---|---|
| 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 |
| 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 |

D W

D bit → Direction

// coming from reg D=0

going to reg = D=1  
from 1 reg ]

W bit → Width

if operands are == 8 bits ; W=0

if operands are > 8 bits ; W=1

// op code 88 when D=0 & size>8.  
MOV AX, var1

## INSTRUCCIÓN FORMA1

↳ Instrucción prefix byte

for extended operands > 16 bits

② OP code byte

necessary numeric # for every

Instrucción

OPCODE BYTE

Instrucción Code

Dirección

Size

bit # 2-7

bit #1

bit 0

Dirección (Dbit)

- if data coming FROM Reg = 0

- if data coming To Reg = 1

SIZE (1 bit)

- if operands < 8 bits 0

- if operands > 8 bits 1

eg.

MOV CX, Var1

MOV=88

↳ 10001 0 I I  
            D W

↳ ③ MOD R/M Byte

MOD    Reg#    Mem#

6-7 bits

3-5

0-2

MOD - Addressing mode #?

- |   |                   |                        |              |
|---|-------------------|------------------------|--------------|
| ① | MOV Ax, Bx        | → register mode        | Binary<br>11 |
| ② | MOV Var1, Bx      | → Mem w/o displacement | 00           |
| ③ | MOV [ESI + 16h]   | → mem w/ 8bit "        | 01           |
| ④ | MOV [ESI + 1670h] | mem w/ 16bit           | 10           |

Register # (source ka if register mode)

each reg has its own; e.g. al — 000

cl — 001

dl — 010

bl — 011

Memory # (calculated from Table) → but if req mod  
then of dest

e.g. MOV Ax, Bx.

|           |                |              |
|-----------|----------------|--------------|
| <u>11</u> | <u>011</u>     | <u>000</u>   |
| mode      | Reg#<br>source | Reg#<br>dest |

g. ADD Ax, Cx

00 00 00 01  
11 001 000

opcode byte  
RM byte

## INSTRUCTION ENCODING. ( ADDRESS DISPLACEMENT)

eg. Mov [ESI + 170Fh] x DL

1000 1000 10 010 100 →  
opcode Mod R/M appended offset  
DW displacement

88 94 DF17

Displacement in form  
of little endian

if same instruc. but offset 17h. (8bit)

Mov [ESI + 17h] x DL

1 000 1000 01 010 100, offset

1000 1000 01 010 100 17  
8 8 9 4 17

MOV [EDI+1709FOFO], EDX

10001001 10010101 70FO0917,  
opcode Mod R/M offset 32.  
(Add displacement)  
89 95 70 FO 09 17

→ SINGLE BYTE INSTRUCTION

eg

CBW

LODSB

INC DX

98

AC

42.

already  
Pre-defined/abit

\* register increments are optimized for  
code size up

↳ SINGLE IMMEDIATE DATA

→ when the only operand eg Push 5FF0.

of an instruction is an  
immediate data then use a

machine code ⇒ OP code followed by data

eg Ret 8.

C2. 0800

for  
Ret

Always Appended.

## ↳ SINGLE OPERAND (Register) Instruction

e.g. PUSH DX

↳ In this case ADD reg# to <sup>OP code by</sup> ~~memory~~

$$\begin{aligned} \text{Push } &DX \\ = &52 + 2 \\ = &54 \end{aligned}$$

If EDX,  
Instruction Prefix  
scr.

## ↳ Register Immediate Instructions

Immediate operands are appended to instructions in little endian order.

→ The encoding format of a MOV imm that moves an immediate value to a register  
B8 + rw dw

↳ where Op code by value is B8+rw,  
indicating that a register is ADDED to  
B8

↳ dw is the immediate word operand

ADD CX, 190FH.

byte

0000 0000 + 001, OF 19,  
reg added operand  
                          APPENDED

↳ SINGLE OPERAND (MEMORY)

Assembly  
INC BYTE PTR [BX][SI]

IDIV WORD PTR [DI] + 1A2Bh. → ASSEMBLY LANGUAGE

| opcode                                           | mod-reg-r/m  | Machine language |
|--------------------------------------------------|--------------|------------------|
| F7                                               | 10, 111, 101 | F7 BD 2B 1A      |
| ↓<br>opcode extended<br>bits for<br>IDIV (norep) |              |                  |

// REG bits of R/M byte holds OP code  
Extension m these instructions

CISC: Reg-Mem

RISC: at max and min 4 byte encoding  
↳ aligns data

→ MIPS?

X → X → X

## chapter 13 : High Level Language Interface

↳ where convenience is developed is more imp than time or size of code → HLL is used.

### • model Directive

↳ • model Memory Model, Calling Convention, Stack Distance (CISC, C)

### Memory Models:

- Explains code w/ date segment(s) in code
- Tiny → Code, data, stack @ same place
  - Small → 1 code & 1 data segment
    - if TCS = EDS → same segment
  - Medium → mul code, 1 data segment
  - Compact → 1 code, mul data segment
  - Large → mul code, mul data segment
  - Huge → " of greater size
  - Flat → default (protected mode)

CISC : Reg - Mem

RISC : at max and min 4 byte encoding ] ?  
↳ aligns data

↳ MIPS ?



## chapter 13 : High Level Language Interface

↳ where convenience of development is more imp than time or size of code → HLL is used.

### • model Directive

↳ • model Memory, Calling, Stack  
      Model, Conventions, Distance  
      (SIC, C)

### Memory Models:

Explains  
code w/  
data  
segment(s)  
in  
code

- Tiny → Code, data, stack @ same place
- Small → 1 code w/ 1 data segment  
if TCS = EDS → same segment
- Medium → mul code, 1 data segment
- Compact → 1 code, mul data segment
- Large → mul code, mul data segment
- Huge → " of greater size
- Flat → default (protected mode)

### • STACK DISTANCE

near stack: Physically stack is present  
in data segment ( $EDS = ESS$ )

far stack:

A diff /separate segment

far stack

\* // Near Call → calling w/ called proc are on  
Same segment

// Far Call → calling w/ called proc are  
on different segments

↳ default depends on our model.  
↓  
memory

// EES, EFS → used to check extra segments

THE \_\_asm Directive

To write assembly in C++

— asm ~~num~~

// Inline assembly

{  
=====  
=====  
y

code block

// data to be made in  
HLL Part.

OR

— asm statement

This has its own features & limitations

### Limitations

- no data definition directives
- cannot use anything other than PTR
- cannot reference segments w/ name (e.g. date)
- Cannot make assumptions about registers b/c they are global to the system.

: In slides.  
:

Operating System Locks resources till one is using it.

Homework :

- File Encryption Example

- SIB

# MIPS ARCHITECTURE "Assembly of MIPS."

RISC :

- MIPS
- Load Store

ISC

↳ Instruction set architecture  
everything a machine  
language programmer needs  
to know in order to program  
a computer

RISC → faster → ; Simple instructions;  
addressing & easy ; Pipelining

## PIPELINING

eg ✓ MOV Ax, 12  
✓ MOV Bx, 13  
MOV Cx, Ax  
✓ MOV Dx, 12  
ADD Dx, Cx

Pipelining allows to  
execute INDEPENDENT  
Instruction w/o being  
dependent on anything else.  
(Increase Processor Speed / time)

This is called Parallel processing that

In Software (wise) This is called Parallelism  
↳ INSTRUCTION LEVEL PARALLELISM.

## ↳ DATA LEVEL PARALLELISM

Mov Var1, 17

Mov Var1[2], 19

Mov Ax, Var[4]

Mov Dx, Ax

Mov Var1[4], Ax

✓ If you execute these

Instructions parallelly

then we can access

the memory @ the same

time → data level Parallelism

We can also execute

2 Instructions Simultaneously

hence → Instruction Parallelism

RISC → every Instr 4 bytes → easier.

## // Homework

mobile type CISC / RISC → state of practices  
devices w/ CISC? (practical systems in market)

## ↳ TASK LEVEL Parallelism

different types of tasks running simultaneously

Processor switches easily & fast

e.g. VLC, Google together

// NOTE : CISC → mushkil bec of complexity

// has a lil pipelining

// Better for memory (shayad?) ✓ yes

//ILP doesn't only mean Instructions executing  
@ the same time ; also means to not  
be dependent on any instruction for  
another (!waiting)

## Load-Store

↳ ALU not allowed @ on memory

↳ check Memory data types

eg Word in CISC  $\rightarrow$  Halfwords in RISC

↳ 32 4-byte Registers

- all accessible  $\rightarrow$  except for zero register
- zero-register always has 0

↳ To Access Registers

R1 | \$1  $\rightarrow$  Register 1

R0  $\rightarrow$  R31  $\rightarrow$  32 registers present

in the register file (log of registers)

- check register usage table

# MEMORY LAYOUT



## Homework

### // Addressing Modes in MIPS (CISC)

#### Instruction Set

- 21 arithmetic
- 8 logical
- 8 bit manipulation
- 12 comparison
- 25 branch Jump
- 15 load Instructions
- 10 store Instructions  
[check Table]

#### LOAD & STORES

LB → load byte

LH → load halfword

SB → store byte

SH → store halfword

LB \$ , memory

SB \$ , memory

Equivalent Instruction → from Slides

R-TYPE Instructions :

Performs Arithmetic w/

logical on registers (shifts)

F

\* I-TYPE Instruction : load-store w/ Immediate

J-TYPE Instruction : used for jumps w/ branch

// Register Operands take less time/cycle to execute

II Declaration

<sup>date</sup>  
x:

• byte

0x12

[x: • byte 1, 2, 3  
↳ Array]

str: • ascii "Hello world" \n

noll  
character

SH, SLO → Results of mul w/ div ;

\* depends on size of results

Registers

\$S0 = R[16]

name of  
register ↓

(name reg file main  
16<sup>th</sup> register

## MISCELLANEOUS REGISTERS

\$PC

\$status or \$PSW

\$cause

\$hi \$lo

Program Counter [every 4 byte step]  
= I offset

Status Register (flags)

Exception cause register

Imul/ldiv results storage

## HOMEWORK

#3 Flags in MIPS

IN DIVIDE : Lo = Result (Quotient)

Hi = Remainder

## ARITHMETIC & LOGICAL

add \$1, \$2, \$3

\$1 = \$2 + \$3

OR \$1, \$2, \$3

\$1 = \$2 OR \$3

## DATA TRANSFER

Lw \$1, 100(\$2)  $\Rightarrow$  \$1 = Memory[\$2 + 100]

Sw \$1, 100(\$2)  $\Rightarrow$  Memory[\$2 + 100] = \$1

## Branch - altere Program flow

beq \$1, \$2, 25

if (\$1 == \$2)

should use  
label here

PC = PC + 4 + 4 \* 25

## MIPS INSTRUCTIONS

similarities in formats ease

Different formats for different purposes.

### ① R-Format

|    |        |    |    |    |       |      |
|----|--------|----|----|----|-------|------|
| 31 | OP     | rs | rt | rd | shamt | func |
|    | 6 bits | 5  | 5  | 5  | 5     | 6    |

### ② I FORMAT

|    |    |    |    |        |
|----|----|----|----|--------|
| 31 | OP | rs | rt | offset |
|    | 6  | 5  | 5  | 16     |

### ③ J-Format

|    |    |         |
|----|----|---------|
| 31 | OP | address |
|    | 6  | 11 16   |

rs - Source Reg #1

rt - Source Reg #2

rd - Destination Register

shamt - Shift amount - ?

func - Function # ? provided

eg add \$S0, \$S1, \$S2 ( registers 16, 17, 18)

| OP | rs | rt | rd | shift | funct |
|----|----|----|----|-------|-------|
| 0  | 17 | 18 | 16 | 0     | 32    |

000000 10001 10010 00000 0000 100000

↳ machine code generated

// command ka Pata funct # se chalta hai

## STAGES OF EXECUTING INSTRUCTIONS

(MIPS)

- 1) Fetch
- 2) Decode
- 3) ALU / effective Address
- 4) loading Memory (Reading)
- 5) Inserting result back

// ALU's take only 4 stages (No read memory)

// load / store is expensive in respect of steps (5 steps)

## PIPELINING

→ Traditional example of washing, Doing 4 Fold up clothes

|   |        |                    |          |          |
|---|--------|--------------------|----------|----------|
| W | Inst 1 | Dry I <sub>1</sub> | Fold (1) |          |
|   |        | Inwash (2)         | Dry (2)  |          |
|   |        |                    |          | Wash (3) |

In Example of Mul Instruction in Computer

| I <sub>1</sub> | Fetch | Decode               | ALU              | Mem            | WriteBack          |
|----------------|-------|----------------------|------------------|----------------|--------------------|
| I <sub>2</sub> |       | Fetch <sub>(2)</sub> | Decode           | ALU            | Mem                |
|                |       | Fetch <sub>(3)</sub> | D <sub>(3)</sub> |                | Alu <sub>(3)</sub> |
|                |       |                      | Fetch 4          | D <sub>4</sub> | F <sub>5</sub>     |

NOTE

If Instruc<sup>n</sup> 1 = load = 5 steps & Instruc<sup>n</sup> 2 = ALU = 4 steps

then WB would be WB too ; And 2 instructions will be using WB @ the same time = **STRUCTURAL HAZARD**

eq:

1.  $LW \$1, 100(\$7)$
2.  $Add \$11, \$7, \$10$
3.  $SUB \$2, \$1, \$3$

DATA

HAZARD

⇒ Instruction 3 is trying to read smth that  
isn't already written by 1.

3 is reading RAW - read After write

\* ⇒ B/W 2 w/ 1 There is no hazard since  
both are ONLY reading

4.  $LW \$7, 50(\$30)$  WAR w(2)

5.  $Add \$2, \$4, \$7$  WAW w(3)

// Never keep dependent in pipeline together

## HAZARDS

→ Structural

→ DATA

↳ WAR

↳ RAW

↳ WAW

at good  
exec

// one of the solutions → changing order

↳ for structural hazards

use

a stall



// CISC is hard to handle w/  
Pipeline

stall





## CACHE

↳ Principle of locality; most programs don't use every part of program (?)

↳ block minimum bytes(?) of data that can be present / not present on the cache

↳ Mapping:

data from memory → cache is called mapping

↳ Replacement Algorithm:

Algorithms to replace a block of memory in cache

- least recently used (LRU)

- First in First out

- least frequently used (LFU)

- Random

Cache' organization

↳ physical address → actual address @ memory

↳ logical address → OS assigned address also known as virtual address  
(due to multiple RAMs)

Memory location that we talk about are logical addresses

In a cache!

1 m blocks, called  
lines



12 bytes

## MAPPING Functions

### 1 Direct Mapping:

$$P = j \bmod 8m$$

maps each block of main memory into only one possible cache line.

e.g.

if main memory byte:

|   |     |  |
|---|-----|--|
| 0 |     |  |
| 1 |     |  |
| 2 |     |  |
| 3 | ABC |  |
| 4 |     |  |

|   |     |  |  |  |
|---|-----|--|--|--|
| 0 |     |  |  |  |
| 1 |     |  |  |  |
| 2 |     |  |  |  |
| 3 | ABC |  |  |  |
| 4 |     |  |  |  |

If data @ 3rd line in main memory

then in cache  $\Rightarrow 3 \bmod 4$  (4 lines)

hence @ 3

If memory = 8-word. Then each block =  $8^2$



$$\text{Then } j = 011/2$$

$i = j \bmod m$

$i$  = cache line #

$j$  = main memory BLOCK #

$m$  = number of lines in the cache

\* Advantage  $\rightarrow$  Easier to find!

\* Disadvantage  $\rightarrow$  Same cache line is reused again and again even if all cache is empty

$\rightarrow$  Replacement Algo also on that same line

### Word Addressable Memory:

every 4 byte = 1 block in main memory  $\rightarrow$  logical addresses are given to every 4 bytes and b a single byte.

**Note // Rem :** J is Memory Block number  
not the byte memory

## 2. Associative Mapping

↳ Maps each main Memory block into ANY line of the cache

Advantage : No fragments of cache (empty places hone k bawajood replacing)

Disadvantage :

\* Finding data is very hard;  
can also mean sequential searching

## 3. Set Associative Mapping $\rightarrow$ Most efficient

In this case cache' consists of a number of sets, each of which consists of a number of lines, The relatives are "set"

$$m = v \times k$$

$$i = j \text{ mod } v$$

$i$  = cache set #

$j$  = main memory block #

$v$  = # of sets

$m$  = Number of lines in cache.

$k$  = # of lines in each set

|| Returns a set #  $\rightarrow$  and data is Randomly mapped onto any line of that set

→ lesser Fragmentation than Direct  
→ lesser Distortion of data

Replacement:

if whole set is filled → it uses replacement Algos to get rid of any block. for replacement

## "INTERRUPTS & RELATED CONCEPTS"

→ Interrupts → exception

Bit user exception then

HARDWARE Interrupts

Hardware Interrupts → highest priority ①

→ Software ⇒ system Exit

→ system (exceptions)

eg. program into system memory

→ Control Bus = 0

↳ No Interrupt

Control Bus = 1

↳ Interrupt

→ if no response → MP busy



→ reserved  
PCB → process control Block