

# Computer Architecture of Organisation

## Chapter - 1 → Harvard Architecture



## Von Neumann Architecture



→ It uses the same memory for instructions & data by encoding them into symbols normally used in memory. The CPU decodes every instruction. This is the stored-program concept.

Modern machines also use registers (in CPU storage locations), stacks (no operands reqd. for instructions), accumulators (intel instructions sometimes use registers like eax as accumulators)

## Chapter - 2 → Floating point numbers



$$\text{The number } A = (-1)^S \times P \times 2^{E-\text{bias}} \quad 1 \leq E \leq 254$$

↳  $E = X + \text{bias}$  where bias = 127 & X is actual exponent of number. So  $X=0$  is represented by  $E=127$

$$P = 1+M, \quad 0 \leq M < 1$$

(2)

Mantissa is written from after point. Number is assumed to be of the form  $1 \dots \times 2^{\epsilon}$ .  $\epsilon$  varies from ~~0 to 255~~ 1 to 254.  $\epsilon = 0$  &  $\epsilon = 255$  are reserved.

| $\epsilon$ | M        | Value              |
|------------|----------|--------------------|
| 255        | 0        | $\infty$ if $s=0$  |
| 255        | 0        | $-\infty$ if $s=1$ |
| 255        | $\neq 0$ | NAN                |
| 0          | 0        | 0                  |
| 0          | $\neq 0$ | Denormal Number    |

There are 2 representations of 0<sub>2</sub> for  $s=1$  & -1.

Denormal Numbers  $\rightarrow$  introduced to give more range for small numbers.

$$A = (-1)^s \times P \times 2^{-126} \quad (P = 0 + M, 0 \leq M \leq 1)$$

Exponent is -126 by default, so smallest denormal number is

$$0.\underbrace{00 \dots 01}_{2^{-126}} \times 2^{-126} = \cancel{0} \cancel{1} 2^{-149}$$

largest is  $0.\underbrace{111 \dots 11}_{2^{-126}} \times 2^{-126} = 2^{-126} - 2^{-149}$

largest denormal number is smaller than smallest positive normal number.

\* Floating point operations are commutative but not associative. Order can play a role when dealing with high precision.

Chapter - 3  $\rightarrow$  Assembly language refers to a family of low-level programming languages that are specific to an ISA.

Each statement has an instruction code that translates to a basic machine instruction & a list of operands.

Basics of Assembly → Machine Model + Von Neumann machine augmented with registers

CPU reads out programs instruction-wise & executes them. The program counter (PC) keeps track of the instruction being executed. Instructions get operands from main memory & registers. CPU co-ordinates everything.

Registers → ~~languages~~ languages give registers for use.

→ 10--215 in ARM, eax, ebx, ecx, edx, esi, edi, esp, esp in x86 (32-bit). Most ISA use a return address register except x86. ARM gives access to PC in 215. x86 keeps it implicit & hidden.

Memory → Address is a unsigned 32-bit integer in 32-bit machines & 64 in 64-bits. Memory can ~~at~~ only be accessed as ~~a~~ a byte. For multi-byte data like short(2 bytes), int(4), long(8), there are 2 representations

Big Endian → Store MSByte in lowest address.

Little Endian → Store MSByte in highest address.



x86 is little-endian, ARM was as well, now supports both. IBM Power & Sun SPARC processors are big-endian.

Representing arrays → stored in contiguous locations.

A multi-dimensional array is stored by flattening out other dimensions. Row-major representation is when an array is stored row-wise in memory i.e. Each row is stored as a contiguous block. Same for column-major.

# Assembly - language Syntax → for GNU assembler.

File structure → There are many sections, they start with heading `.section name`. Common sections are `.text(actual code)`, `.data(Initialized data)`, `.bss(Data initialized to 0)`, `.file(at start of file to specify name of file)`.

Basic statement → Instruction operand<sub>1</sub> operand<sub>2</sub> ...

The list of operands has either the value or location of the operand. The value is a numeric constant & is called an immediate value.

An assembly statement can have a label, instruction & a comment.

Types of instructions → On basis of functionality—

- (1) Data processing
- (2) Data transfer
- (3) Branching
- (4) Exception generating

On basis of no. of operands:-

→ If an instruction requires n operands, it is a n-address format instruction.

In ARM, data processing instructions → 3 address format

data transfer instructions → 2 address format

X86, most are 2 address format.

Types of operands → Method of specifying & accessing an operand in assembly is known as addressing mode.

immediate-addressing → specify integer constants as operand in the instruction

Register-transfer notation →  $r_1 \leftarrow r_2 + r_3$

Add contents of  $r_2$  &  $r_3$ , store it in  $r_1$

$r_1 \leftarrow [r_2]$

~~Fetched memory value~~ Fetch content of the memory address stored in  $r_2$  & store it in  $r_1$ .

Generic addressing modes for operands :- Represent value of operand by V (5)

immediate  $\rightarrow V \leftarrow \text{imm}$  (Use constant  $\text{imm}$  as value of operand)

register  $\rightarrow V \leftarrow r_1$ , (Processor uses value in register  $r_1$ )

register-indirect  $\rightarrow V \leftarrow [r_1]$

base-offset  $\rightarrow V \leftarrow [r_1 + \text{offset}]$  (offset is a constant called displacement. Also)

base-index  $\rightarrow V \leftarrow [r_1 + r_2]$  ( $r_1$  is base register,  $r_2$  is index)

base-index-offset  $\rightarrow V \leftarrow [r_1 + r_2 + \text{offset}]$

memory direct  $\rightarrow V \leftarrow \text{addr}$  (Value is contained at address  $\text{addr}$  which is a numerical constant. Memory address is directly embedded in the instruction)

memory-indirect  $\rightarrow V \leftarrow [[r_1]]$

PC-relative  $\rightarrow V \leftarrow [\text{PC} + \text{offset}]$  (useful for branch instructions)

↳ Memory address specified by an operand is called effective memory address.

SimpleRISC ISA  $\rightarrow$  21 instructions.

Machine model  $\rightarrow$  16 registers  $r_0, r_1, \dots, r_{15}$ . 14 general purpose  $r_{14} \rightarrow$  stack pointer (SP),  $r_{15} \rightarrow$  return address (RA) 32-bit registers. Internal register called flags. flags. E(equal), flags. GT(greater than). E is set if comparison gives equal. GT is set if first operand  $>$  second operand.

6  
↳ Each instruction is encoded into 4 bytes/32-bit in memory.

Register Transfer Instruction :-  $\text{mov reg, (reg/imm)}$

Arithmetic Instructions :-  $\text{add, sub, mul, div, mod}$   
for all of them →  $\text{op reg reg (reg/imm)}$  (ex. add  $r_1 r_2 r_3$ )  
 $\text{CMP reg, (reg/imm)}$  (sets flags)  
by comparing

Logical Instructions :- and compute bitwise AND, OR, NOT.

and  $\text{reg reg (reg/imm)}$

or  $\text{reg reg (reg/imm)}$

not  $\text{reg (reg/imm)}$

(not  $r_1 r_2 \Rightarrow r_1 \leftarrow \neg r_2$ )

Shift Instructions → ~~lsl~~ logical shift left (lsl), logical ~~shift right~~ ~~lsh~~ (lsh)  
arithmetic shift right (asr).

lsl  $\text{reg, reg (reg/imm)}$

lsh  $\text{reg, reg (reg/imm)}$

asr  $\text{reg, reg (reg/imm)}$

EX  
lsl  $r_3 r_1 r_2 \Rightarrow r_3 \leftarrow r_1 \ll r_2$  (shift left)  
lsh  $r_3 r_1 r_2 \Rightarrow r_3 \leftarrow r_1 \gg r_2$  (shift right)  
asr  $r_3 r_1 r_2 \Rightarrow r_3 \leftarrow r_1 \gg r_2$  (shift right & fill with zeros)

Data-transfer Instructions :- load  $\$$  (ld) to load instructions from memory into registers, store (st) for the opposite site.

ld  $\text{reg, imm[reg]}$  | ld  $r_1, 10[r_2]$  |  $r_1 \leftarrow [r_2 + 10]$   
st  $\text{reg, imm[reg]}$  | st  $r_1, 10[r_2]$  |  $[r_2 + 10] \leftarrow r_1$

Unconditional Branch instruction:

b label | Example  
b .foo | b .foo  $\Rightarrow$  branch to .foo

Conditional Branch instructions :- branch based on the flag register.



Calling & Returning from Functions → If only a small number of values have to be passed b/w caller & callee, then use registers. Otherwise use memory. To prevent the callee overwriting registers the caller may require, either the caller saves them in memory before calling or the callee saves & restores them before returning. Saving registers in memory is called register spilling.

To solve all problems of function calls, we use a stack. A stack starts at some high location in memory & grows downward. It ~~contains~~ maintains the activation blocks of the functions in a stack. To return a large thing, callee writes it to activation ~~block~~ block \$ of caller.



The callee creates a new activation block ~~before calling the function by doing these steps:-~~

1. Decrement stack pointer by size of activation block.
2. Copy values of arguments.
3. Initialize local variables to memory if reqd.
4. Spell registers (including RA) if reqd.

The activation block is destroyed when function returns by adding its size to SP

This solves all our problems. Caller & callee don't overwrite each other's activation local variables, it is possible to stop register overwriting by specifying instructions to save them in activation blocks. No need to have an explicit agreement about memory to pass arguments, if caller just pushes them on stack & callee uses them. Similarly to return values, callee first destroys its stack by resetting sp. Then pushes values on stack which caller can use. If the callee calls another function, it can spill ~~too~~ <sup>extra</sup> ra in its activation block & restore it later.

call & ret → call label / <sup>example</sup> explanation  
 call .foo / ra ← PC + 4; PC ← address of .foo  
 ret / ret / PC ← ra

ret is a 0-address instruction. Call & ret can be thought of as unconditional branch instructions coz they change PC.

nop → does absolutely nothing. (All 21 ~~the~~ instructions done)

Modifiers → You can't load a 32-bit constant when the whole instruction is 32-bit. So you have to load the upper & then lower half separately.

Add suffix 'u' or 'h' to instructions except shift instructions. By default, the processor performs sign extension when converting 16-bit constants to 32-bit. 'u' specifies it to treat it as an unsigned constant & append only zeros. 'h' tells it to load it to the upper half of the register directly. So, a constant 0xFB12CDEF can be loaded into r0 in 2 instructions.

movb r0, 0xFB12

addu r0, r0, 0xCDEF

Q11

Encoding the SimpleRISC ISA :- We have to encode each instruction to 32-bits. Instructions are in 0, 1, 2, 3 address formats. There are 21 instructions, so we need 5-bits to represent them. Code for an instruction is called its opcode.

0-address → set & nop → only need 5-bits for opcode.

1-address → call, b, beg, bgt → take label as argument which is an address in memory. We only have 27-bits but address is 32-bits. Here we use PC-relative offset representation. So, we can represent addresses  $PC \pm 2^{28}$ . Assume instructions are 4-byte aligned. Then every offset has 00 as last 2-bits. So we can make them implicit. So, we can represent addresses  $PC \pm 2^{26}$ .



Encoding of 1-address instruction.

3-address → Need 1 bit to specify if third operand is register or immediate value, 16 registers so need 4-bits for each register

$I=0 \Rightarrow$  register operands, then



$I=1 \Rightarrow$  immediate operand, then



2 modifier bits  
 00 → default  
 01 → u  
 10 → h

Processor internally expands the immediate to a 32-bit value according to the modifier bits.

(10)

2-address instructions → We will use the 3-address here as and so that the processor decode logic is simpler.

comp → has 2 source-operands. ~~keep D<sub>16</sub>, D<sub>16</sub>~~ dat-reg empty.



mov & not → 1 dst reg & 1 src operand that can be either immediate or register. keep src1 empty.



Load & Store Instructions : ld & st are 2-address instructions. The second operand is a base register & a constant offset using base offset addressing. For load, we use the 3-address format in immediate mode.

ld rd, imm[rs1] → 

|       |         |          |     |     |
|-------|---------|----------|-----|-----|
| 01110 | I       | rd       | rs1 | imm |
| ← 5   | → 1 ← 4 | → 4 → 18 | →   |     |

Store instruction. doesn't have a destination register but a memory location. We still use the same format but the destination register argument ~~would be the dest~~ would be the source actually. We don't want a new format for just 1 instruction. We choose efficiency over elegance.

st rd, imm[rs1] → 

|       |         |          |     |     |
|-------|---------|----------|-----|-----|
| 01111 | I       | rd       | rs1 | imm |
| ← 5   | → 1 ← 4 | → 4 → 18 | →   |     |

\* So, there are total 3 instruction formats, branch, immediate & register (ret & nop are special case of branch)