

Assembly → statement →  $(X+S) + Z = Y$   
 ↳ complex statement

Instruction → ADD → only one action perform.

MOV AL, 00H → hexa

If H not present then decimal

MOV AL, 00H  
 destination source

HIGH LL Assembly Machine code

$X = (Y+4) \times 3$  ADD Y, 4 Instruction or bytes

statement MUL Y, 3 1010, 111 0101

MOV X, Y  
 instruction

HIGH LL TO ASSEMBLY :- 1 to many relation.

ASSEMBLY TO MACHINE CODE :- one to one relation

\* In Assembly, no datatype. (Has memory spaces)

VIRTUAL MACHINE:- translate + interpret



Codes:-  
 mov ax, bx  
 add ax, 9h  
 mov x, ax

Include In-line 32 line  
 data  
 x byte 0h  
 code main PROC

|                     |     |              |
|---------------------|-----|--------------|
| PC / IP             | 001 | → mov ax, bx |
| Program Counter     | 002 | → add ax, 9h |
| Instruction pointer | 003 | → mov x, ax  |



FETCH - DECODE - EXECUTE :-

For execution, data is loaded into main memory.

Fetch :- processor fetches the instruction from memory.

Instruction is transferred from memory to instruction register.

Decode :- instruction is decoded by processor.

Execute :- processor executes instruction.





Memory stores both code (instructions) and data.

Bus → pathway for transferring data b/w diff. parts of CPU.

Instruction pointer → points to location in memory of <sup>next</sup> instruction to be executed.

Code cache - fast memory that stores frequently used instructions.

It speeds up execution time by reducing time it takes to retrieve info. from main memory.

Instruction decoder, - translates <sup>speed</sup> opn. signals into signals that <sup>CPU</sup> control unit can understand.

Control unit - coordinates with Add & floating point unit.

Registers - temporarily hold data that's being processed.

Add - performs arithmetic & logical operations on data.

Data cache - stores frequently used data for faster access.

After data is executed, goes to output device for display / data stored depends on instruction.

loop

fetch next instruction

advance the instruction pointer (IP)

decode the instruction

if memory operand needed, read value from memory.

execute the instruction

if result is memory operand, write result to memory.

continue loop

Bus - group of parallel wires that transfer data from one part of comp. to another.

clock is used to trigger events.  
↳ measures time of single operation.  
↳ synchronization

Date: \_\_\_\_\_  
M T W T F S S

Data bus → transfer data & instructions between CPU and memory.

I/O bus → transfer data between CPU and system input/output devices.

Control bus → uses binary signals to synchronize actions of all devices attached to system bus.

Address bus → holds address of data in instruction when currently executing instruction transfers data between CPU and memory.

Clock:- The basic unit of time for machine instructions is a machine cycle (or clock).

Clock synchronizes the internal operations of CPU with other system components.



Reading from Memory:- The CPU must wait one or more cycle until operands have been fetched from memory before current instruction can complete its execution.



✓ All are based on "von Neumann Architecture".

- Code and data both stored in memory.

- General purpose

- Fetch, decode, execute.

Reading a single value from memory involves 4 steps:-

① Place the address of the value you want to read on the address bus.

② Assert (change the value of) processor's RD (read) pin.

③ wait one clock cycle for memory chips to respond.

④ copy data from data bus into destination operand.

Each of these steps require a single Clik cycle.

CACHES... WHY ARE CACHES NEEDED?

Coz RAM is very slow as compared to CPU's processing speed.

Cache is high speed memory, reduces amount of time spent in reading & writing.

When CPU begins to execute a program, it loads the next thousand instructions into cache.

If it happens to be a loop in that block of code, the same instruction will be in cache.

When processor is able to find its data in cache memory, it's called cache hit.

If CPU tries to find something in cache but it's not there, it's called cache miss.



### LOADING & EXECUTING A PROGRAM

- ✓ OS searches for program's filename in current disk directory. If it cannot find the name there, it searches a predetermined list of directories (called paths) for filename.
- ✓ If OS fails to find program filename, it issues an error message.
- ✓ If program file is found, the OS retrieves info about program's file from disk directory including file size, its physical location on disk drive.
- ✓ Memory block is allocated based on program's size & location (descriptor).
- ✓ OS also adjusts values of pointers within program so they contain addresses of program data.
- ✓ OS begins execution (entry point).

- ✓ As soon as program begins running, it is called a process.
- ✓ OS assigns process an identification number (process ID), which is used to keep track of it while running.
- ✓ OS's job is to track the execution of process & to respond to requests for system resources → memory, disk files and input/output devices.
- ✓ When process ends, it is removed from memory.

Physical Address, Segment, Offset in

logical address      segments



$$\text{physical Address} = \text{Seg} \times 10h + \text{Offset}$$

Physical memory

|    |  |
|----|--|
| CS |  |
| SS |  |
| DS |  |
| ES |  |

Q. OA51 : CD90h → seg : offset

$$\begin{aligned}\text{Physical Address P.A.} &= \text{Seg} \times 10h + \text{Offset} \\ &= OA51 \times 10h + CD90h \\ &= OA510h + 0D90h \\ &= \boxed{172A0h}\end{aligned}$$

$$\begin{array}{r} 5+0 = 1 \text{ sum, } 2 \text{ carry} \\ \begin{array}{r} 0 \\ + C \\ \hline 1 \end{array} \quad \begin{array}{r} 5 \\ + 0 \\ \hline 5 \end{array} \quad \begin{array}{r} 1 \\ + 9 \\ \hline 0 \end{array} \\ \hline 172A0 \end{array}$$

Q. Address = 4A37Bh, offset = ?, Segment = 40FFh, see hexadecimal addition rules.

$$P.A. = \text{Seg} \times 10h + \text{Offset}$$

$$\begin{aligned}\text{Offset} &= P.A. - \text{Seg} \times 10h \\ &= 4A37B - 40FF \times 10h \\ &= 4A37B - 40FF0h \\ &= \boxed{938B \text{ D.h}}\end{aligned}$$

$$\begin{array}{r} \cancel{16} \\ 4 \ 9 \ A \ 2 \ B \ 15 \ 7 \ B \\ - 4 \ 0 \ F \ F \ 0 \\ \hline 0 \ 9 \ 3 \ 8 \ B \\ 2+16=18 \\ 18-15=3 \end{array}$$

Code Segment :- holds the base address for all executable instructions in program.

Data Segment :- holds the base addresses for variables. This segment stores data for program.

Extra Segment :- extra data segment used for shared data.

Stack Segment :- holds base address for stack. Also stores interrupt & subroutine return addresses.

Lecture:-

1111 1111 FF ← Byte

1111 1111, 1111 1111, FFFF ← Word

1111 1111, 1111 1111, 1111 1111, 1111 1111, FFFF FFFF ← Double word  
 byte byte byte byte

CF = 1 → (+ve)

AC = 1 Auxiliary carry flag

PF = 0 Parity flag

ZF = 1 Zero flag

SF = 0 Sign flag

OF = 0 → (-ve) overflow flag.

FF

+ 01

1 00

carry flag (CF)

1

CF

ZF = 1

0 000 000

↓

if there is any  
carry b/w  
4th and 5th

bit then  
Auxiliary carry

flag is set to 1.

If CF = 1 then OF = 0, If OF = 1 then CF = 0.

6 status flags :- 1. CF

2. AF

3. PF

4. ZF

5. OF

6. SF

Control flags :- 1. Trap flag

2. Interrupt enable flag

3. Direction flag

CF → Carry Flag: set by carry out of MSB.

PF → Parity Flag: set if result has even parity. number of 1's

ZF → Zero Flag: set if result = 0.

SF → Sign Flag: MSB of result

Status flags reflect the outcomes of arithmetic and logical operations performed by CPU.

Registers - high speed storage locations directly inside CPU, can be accessed much more quickly than conventional memory.

NOP/STC/CLC

① Mnemonics/opcode; instruction with no operand ← NOP

② \_\_\_\_\_; One operand; Inc Reg / Memory 8/16/32

opcode/mnemonic R (M → M) X Dec Reg / Memory 8/16/32

③ \_\_\_\_\_; destination source ; 2 operands ← mov / ADD / SUB / MUL / DIV

④ \_\_\_\_\_; 3 operands

opcode/Mnemonic Destination Source Direction

~~16 bit 8 bit~~

Mov ax, bx X ] NAT compatible  
 Add bx, ah X ] Assembler will give error

③ Instruction → 2 operands.



④ Movsb → to move string byte  
 Movsd  
Movsw, Dest2, Source, 0/1; 3 operands  
 Opcode Destination Source Direction

{  
 Source 1 byte " " ;  
 Dest2 byte " " ;

LABEL :-

CODE LABEL DATA LABEL

L1: array DWOD 1024, 2048

↑ this is data label

code label / loop / jmp

code  
data segment

L3:

data

→ arr1 byte  null, 0 ↴

→ arr2 byte  null, 0 ↴

code

mov ecx, length(arr1)

mov si, offset arr1

mov di, offset arr2

L1: mov al, [si]

mov dl, al

inc si

inc di

loop L1

} similar to  
goto statement  
in C.

General Purpose Registers:- AX (Accumulator), BX (Base), Counter (CX), DX (Data)

AX [Accumulator Register]. → arithmetic, logic and movement operations.



BX [Base Register] → arithmetic, logic and movement operations.

→ contains address of other memory location.



CX [Counter Register]. → arithmetic, logic & data movement operations.

→ used for loop or counter purpose.



DX [Data Register] → arithmetic, logic & data movement operations.

→ used in multiplication & division operations.



Segment Registers:- CS (Code segment), DS (Data segment), SS (Stack segment), ES (Extra segment)

Index Registers:-

Source Index (SI) :- used to store an offset address for source operand.

SI register work with DS (Data segment) DS : SI.

Destination Index (DI) :- used to store an offset address for destination operand.

DI register work with ES (Extra segment) ES : DI

Both SI & DI registers are used in string movement instructions.

Pointer Registers:-

Instruction Pointer (IP).- IP always work with CS register. CS : IP.

Always point to next instruction to be executed.

SS → Stack Segment  
CS → Code Segment

Date: M T W T F S S

Stack Pointer (SP) :- SP register work with SS register  $SS = SP$ .  
Always points to top item on the stack.

Base Pointer (BP) :- BP register work with SS register.  $SS = BP$ .

Used to access parameters passed to procedure.

Base address → starting address of segment.



Offset address → distance address of instruction to be executed from the base location.



Address of next instruction = CS + IP

## Modes

Real Address mode :- only 1 MB of memory can be addressed, from hexadecimal 00000 to FFFFF. Processor can run only one program at a time, but can momentarily interrupt the program to process requests (called interrupts) from peripherals. MS DOS OS (<sup>operating system</sup>) runs in real address mode & Windows 95 and 98 can be booted into this mode.

Protected mode :- processor can run multiple programs at the same time.

It assigns each process (program) a total of 4MB of memory. Each program can be assigned its own reserved memory area, so programs are prevented from using each other's code and data. Ex:- MS - Windows and Linux.

Virtual mode :- computer runs in protected mode and creates a virtual 8086 machine with its own 1-MB address space that simulates an 80x86 computer running in real address mode. for Ex :- Windows NT and 2000.

## MOV INSTRUCTION. Lecture 1

Date: M T W T F S S

MOV Reg 8/16, Reg 8/16

MOV al, bl / ax, bx

MOV mem 8/16/32, Reg 8/16/32

MOV reg 8/16, mem 8/16

.data

var1 WORD [?] when value is undefined/uninitialized.

Registers stores value and memory.  
that value can be address

BYTE 00

WORD 00 00

DWORD 0000 0000h



(s)WORD

EAX

signed → can save both + -

either negative or positive.



MOVZX

lahf and sahf → Instruction with no operand.

load save

→ NOP, CLC, STC, Lahf, Sahf



• data

mov ah, saveflags

saveflags BYTE ?

sahf

• code

lahf

mov saveflags, ah

XCHG → exchange instruction

mov ax, 9h

mov cx, 15h

Instead of 3 instructions  
just use xchg instruction

xchq ax, cx

{  
mov bx, ax  
mov ax, cx  
mov cx, bx  
}

direct access through access directive.

Date:   M T W T F S S

Direct offset-

arrB byte 5 DUP(0) // arrayB BYTE 10h, 20h, 30h, 40h, 50h

Q. Access through direct offset, indirect offset and index.

mov al, [arrayB] ; AL = 10h      mov al, [arrayB] = mov al, [arrayB+0]

any  
bl/bh ← any Byte register

mov al, [arrayB+1] ; AL = 20h

mov al, [arrayB+2] ; AL = 30h

mov al, [arrayB+20] ; AL = ?? Garbage value.

2 bytes

arrayW WORD 100h, 200h, 300h

mov ax, [arrayW] ; AX = 100h

mov ax, [arrayW+2] ; AX = 200h

2 bytes jump register.



100 → it's pass address ajaygaur

mov esi, offset arrB

INDIRECT

mov al, [esi + 0] AL = 10h

OFFSET

mov al, [esi + 1] AL = 20h

mov esi, offset arrW

mov edi, esi

mov ax, [edi] / [edi + 0]

inc edi; ~ unity increment only adds one byte so we write edi, 2 for 2 bytes  
add edi, 2

NEG :-

comp. does  
2's complement

mov ax, 10h

NEG reg, 8/16/32 at the backend neg ax AX = -10

NEG mem 8/16/32 neg ax AX = 10

OFFSET :-

any variable

mov esi, offset x

|          |     |   |      |     |             |
|----------|-----|---|------|-----|-------------|
| 04000000 | 120 | X | ds   | 20  | x 0400 0000 |
|          |     |   | cs   | 30  | y 0400 0001 |
|          |     |   | ss   | 200 | 0400 0002   |
|          |     |   | arrB | 500 | 0400 0003   |
|          |     |   |      | 400 |             |

mov esi, offset arrB

offset → returns starting address of any variable

↳ or distance, in bytes, of the label from beginning of data segment.



POINTER PTR :-

EAX



mov ax, WORD PTR myDouble

to access 16 bits / 2 bytes.

any 16 bits — starting ki thi nosakte  
middle end

mov ax, WORD PTR myDouble

Type Casting

↳ Output: 5678

|    |           |
|----|-----------|
| 78 | 0040 0000 |
| 56 | 0040 0001 |
| 34 | 0040 0002 |

mov ax, WORD PTR [myDouble+2]

12

↳ Output: 1234

little endian

big endian

Example of  
little endian

depends on  
architecture

## WEEK 03 + WEEK 04

Date: M T W T F S S

Integer constants :-

h → hexadecimal

If no radix is given, it is decimal.

8, 10 → octal

Ex:- 26 → 26d

d → decimal

b → binary

REAL NUMBER CONSTANTS :- [sign] integer. [integer] [exponent]

Ex:- 2., +3.0, -44.2E+05, 26.E5 E [+/-] integer

Atleast one digit in decimal point are required.

CHARACTER :- single quotes      STRING :- single or double quotes

RESERVED WORDS :- Special meaning in MASM.

Ex:- Mnemonics (ADD, SUB, MUL), Register names (ax, bx), Directive, Attributes (BYTE, word)

Operators, predefined symbols

Identifier :- programmer-chosen name. It might identify a variable, constant, procedure or code label.

Identifier → variable name.

An identifier cannot be same as assembler reserved word.

first character can be A-Z, a-z, -, @, ?, \$.

DIRECTIVES :- commands to the assembler that direct the assembly process.

Ex:- .data, .WORD, .code, .stack

Directives don't execute at runtime, are not case sensitive.

.data → Identifier the area of the program containing variables.

.code → " " " " " " containing executable instructions.  
.stack → " " " " " " containing runtime stack, setting its size.

→ different assemblers have different directives. NASM, MASM not same.

INSTRUCTION :- statement that becomes executable when a program is assembled.

Syntax : [label:] mnemonic [operands] [;comment].

LABEL :- act as place markers, marks the address (offset) of code and data.

→ follow identifier rules.

Data label, Code label

Data label :- identifier :- (count) DWOD 100

Identifier / name of variable / data label

Code label:- target of jump and loop instructions.

loop ↵ L1 : (followed by colon  
≡ code

## INSTRUCTION FORMAT EXAMPLE:-

- ① NOP [No Operation Instruction] :- STC (set carry flag)
  - ② One operand :- INC/DEC inc eax , inc myvar  
register 8/16/32      memory 8/16/32
  - ③ Two Operand :- Ex: ADD ebx, ecx  
reg, mem / mem, reg  
reg, reg  
mem, constant  
reg, constant expression  
mem, memo X

Ex:- SUB myvar, 25

ADD eax, 25 \* 36,  
Source operand  
Destination operand

## PROGRAM :-

## TITLE, Add and Subtract

Title directive marks entire line as a comment.

- ⑨ This is my program All text to right of ; is ignored by assembler / used for comments

INCLUDE Irvine32.inc    INCLUDE directive copies necessary definition

.code

main proc proc directive identifies the beginning of procedure.

name of procedure

~~MOV / ADD } instructions~~

calls a procedure that displays current value of CPU registers

call DumpRegs ;displays registers

**exit** - macro command defined in Invfne32.mc include file that provides a simple way to end program.

main ENDP → ENDP directive marks the end of main procedure.

END main END directive marks the last line of program to be assembled.

**INTRINSIC DATA TYPES** - 8/16/32 . Assembler is not case sensitive.

**BYTE**, **SBYTE**, **WORD**, **SWORD**, **DWORD**, **SDWORD**

dWord , Dword , DWORD → same  
no issue !!!

## OVERLAPPING VALUES:-

### .data

oneByte BYTE 78h  
 oneWord WORD 1234h  
 oneDword DWORD 12345678h

### .code

```
mov eax, 0           EAX = 00000000h
mov al, oneByte     EAX = 00000078h
mov ax, oneWord     EAX = 00001234h
mov eax, oneDword   EAX = 12345678h
mov ax, 0           EAX = 12340000h
```



EAX → 16 bit

## COPYING SMALLER VALUES To LARGER Ones:- mov ax, al × invalid

### .data

count WORD 1

### .code

```
mov ecx, 0
mov cx, count
```



What happens if we try this approach with signed integer -16?

### .data

signedval SWORD -16 → FFF0h (-16)

### .code

```
mov ecx, 0
mov cx, signedval
```



but in decimal this would be +65,520.

**MOVZX** :- move with zero extend. only used with unsigned integers.

### .data

byteval BYTE 10001111b

### .code

```
movzx ax, byteval
```

Source



0

↓

Destination



**MOVSX** :- move with sign extend. only used with signed integers.

Ex:-

mov bx, 0A69h

BX = 0A69

movsx eax, bx

EAX = FFFF0A69,

movsx edx, bl

EDX = FFFFFFFC9

movsx cx, bl

CX = FFC9

Source



↓

Destination



**LAHF** and **SAHF** :- load status flag into AH (LAHF), store AH into status flag (SAHF).

The following flags are copied: Sign, Zero, Auxiliary carry, parity & Carry.

```

.data
saveflags BYTE ?
.code
laf
mov saveflags, ah
mov ah, saveflags
salhf

```

XCHG instruction - exchanges the content of two operands

xchg  
reg, reg  
reg, mem  
mem, reg

Ex:-  
xchg ax, bx  
xchg ah, al  
xchg var1, bx  
xchg eax, ebx

Q. To exchange two memory operands. val1, val2

L combine mov & xchg  
instruction.

- ① mov ax, val1
- ② xchg ax, val2
- ③ mov val1, ax

xchg mem, mem x invalid

val1 = 5 ] Assume  
val2 = 10

ax = 5 ] mov  
val2 = 5 ] xchg  
ax = 10 ]  
val1 = 10 ] mov  
val1 = 10  
val2 = 5

DIRECT OFFSET OPERAND :-

arrayB BYTE 10h, 2ch, 30h, 40h, 50h

mov al, [arrayB+1] ; AL = 20h

mov al, [arrayB+2] ; AL = 30h

mov al, [arrayB+20] ; AL = ?? Range checking → logic bugs.

Ex:-  
.data arrayW WORD 100h, 200h, 300h

.code mov ax, [arrayW] ; AX = 100h  
mov ax, [arrayW+2] ; AX = 200h

Ex:-  
.data arrayD DWORD 1000h, 2000h

.code mov eax, [arrayD] ; EAX = 1000h  
mov eax, [arrayD+4] ; EAX, 2000h

INC | DEC :-

reg, mem

Ex:- .data myword WORD 100h

.code inc myword ; myword = 1001h  
 mov bx, myword  
 dec bx ; bx = 1000h

Overflow, Sign, Zero, Auxiliary carry, parity flags are changed according to the value of destination operand.

\* INC/DEC instruction donot affect Carry flag.

NEG:- reverses the sign of number by converting the number to 2's complement.

Implement Neg reg/mem

Implementing Arithmetic Expressions :-

$$Rval = -Xval + (Yval - Zval)$$

|                |                           |
|----------------|---------------------------|
| Rval SDWORD ?  | mov eax, Xval             |
| Xval SDWORD 26 | neg eax ; EAX = -26       |
| Yval SDWORD 30 | mov ebx, Yval             |
| Zval SDWORD 40 | sub ebx, Zval ; EBX = -10 |
|                | add eax, ebx              |
|                | mov Rval, eax ; -36       |

OFFSET OPERATOR :- distance in bytes of the label from beginning of data segment.

|                   |                                         |
|-------------------|-----------------------------------------|
| .data bval BYTE ? | .code mov esi, OFFSET bval ; ESI = 4000 |
| wval WORD ?       | mov esi, OFFSET wval ESI = 4001         |
| dval DWORD ?      | mov esi, OFFSET dval ESI = 4003         |
| dval2 DWORD ?     | mov esi, OFFSET dval2 ESI = 4007        |

Offset returns the starting address.

.data myarray WORD 1,2,3,4,5

.code mov esi, OFFSET myarray+4. ESI = 3 → address of 3rd integer

.data bigarray DWORD 500 DUP(?)

    parray DWORD bigarray

This means parray points to the bigarray.

mov esi, parray

esi = holds address of beginning of big array.

PTR OPERATOR:-

.data mydouble WORD 12345678h

.code mov ax<sup>2 bytes</sup>, mydouble<sup>4 bytes</sup>; error

word PTR makes it possible.

mov ax, WORD PTR mydouble ; 5678h

mov ax, WORD PTR [mydouble+2] ; 1234h

Moving smaller values into larger destinations

.data wordlist WORD 5678h, 1234h

.code mov eax, DWORD PTR wordlist ; EAX = 12345678h

TYPE OPERATOR:- returns size, in bytes of single element of variable.

.data var1 BYTE ? → <sup>TYPE</sup> 1

var2 WORD ? → 2

var3 DWORD ? → 4

LENGTH OF OPERATOR:- counts the number of elements in an array.

.data byte1 BYTE 10, 20, 30 → <sup>Value (length of)</sup> 3

array1 WORD 30 DUP(?) , 0, 0 ; 30 + 2,

array2 WORD 5 DUP(3 DUP(?)) ; 5 \* 3

array3 DWORD 1, 2, 3, 4 ; 4

str BYTE "12345678", 0 ; 9

SIZEOF OPERATOR :- = length of \* type

.data array WORD 32 DUP(0) ; 32 \* 2

.code mov eax, SIZEOF array ; EAX = 64

ARRAYS:-

.data arrays BYTE 10h, 20h, 30h

.code mov esi, OFFSET arraysB

    mov al, [esi] ; ESI = 10h

mov al, [esi] ; AL = 20h  
 inc esi  
 mov al, [esi] ; AL = 30h

WORD  
 ↓

Instead of inc esi do add esi, 2

### INDEXED OPERANDS:-

.data arrayW WORD 1000h, 2000h, 3000h

.code mov esi, offset arrayW

mov ax, [esi]

mov ax, [esi+2]  
 mov ax, [esi+4]

SCALE FACTOR:- ↳ mov eax, [esi \* 4]

### Lecture 1:-

\* Instead of esi you can use edi or ebx.

### PTR OPERATOR:-

.data myDouble DWORD 12345678h

varW WORD 670Ah

|    |           |
|----|-----------|
| 78 | 0040 0000 |
| 56 | 0040 0001 |
| 34 | 0040 0002 |
| 12 | 0040 0003 |
| 0A |           |
| 67 |           |

mov ax, WORD PTR [myDouble + 3]

will read  
2 bytes

0040 0000  
 + 3  
0040 0003

Memory Mapping

↳ output : 0A12h

### .data

arrayW WORD 1000h, 2000h, 3000h

.code mov esi, offset arrayW

mov ax, [esi] AX = 1000h

add esi, 1

mov ax, [esi] → output ← 0010

|    |
|----|
| 00 |
| 10 |
| 00 |
| 20 |
| 00 |
| 30 |