

# Computer Architecture Basic

Computer architecture is a set of rules and methods that describe the functionality, organization, and implementation of computer systems, providing a framework

Types of Computer Architecture (Categories Sub)

• Instruction Set Architecture or ISA: whenever an instruction is given to processor, its role is to read and act accordingly.

• Micro Architecture: It describes how a particular processor will handle and implement instructions for ISA. (ISA instruction handle res.)

System design: It includes the other entire hardware component within the system such as,

• multiprocessor (Other component may not be hardware, can be software)

etc to understand it.

two types of instructions

• miss

not remapped or A is not written with new information to give a correct output, above situation

## Importance of Computer Architecture

- The main role of CA is to balance the performance, efficiency, cost and reliability of a computer system. For example, ISA acts as a bridge between computer software and hardware.
- It works as a programmer's view of a machine. It's communicate user and computer.
- Difference Between Computer Architecture and Computer Organization.

### Computer Architecture

- It describes what the computer does.
- Deals with the functional behavior of computer system.
- Deals with high lvl design issues.
- As a programmer you can view architecture as a series of instructions addressing modes, registers.

### Computer Organization

- It describes how it does it.
- Deals with hardware relationship.
- Deals with low level design.
- The implementation of the architecture is called organization.

Q5

## Computer Performance

Computer performance is the amount of work accomplished by a computer system. It basically depends on response time.

- throughput
- Execution time of a Computer system.

Response time: Time between start and completion of a task. also called wall clock time.

Response time = Cpu time + waiting time (I/O, scheduling)

Cpu Execution time: Time spent executing the program instructions.

Cpu time = User Cpu time + Kernel Cpu time

Can be related to the number of CPU clock cycles.

ThroughPut: Total work done per unit of time.

Being use of faster version of a processor

and Decreasing the execution time it improves throughput.

with interrupt, no interrupt also throughput.

Relative Performance: It is the ratio between

two computer (Suppose A and B) performance

Relative performance =  $\frac{\text{Performance A}}{\text{Performance B}}$

work needs less time  $\downarrow$   
 $\frac{\text{execution time B}}{\text{execution time A}}$

Clock cycle: CPU speed is determined

the clock cycle which is the amount of time

between two pulses on an oscillator.

Clock rate:  $\text{Clock rate} = \frac{1}{\text{Clock Cycle}}$

measurement of Clock cycles per second

## ENIAC: Electronic Numerical Integrator and Computer. M

is one of the earliest electronic general purpose computers. It was Turing-Complete and able to solve a large class of numerical problems. It was built during World War II by the US.

Abacus: An abacus is a mechanical device that is used to calculate the arithmetic calculations. It is also referred to as a counting frame. Today, it is widely used in brain development programs. The standard abacus can be used to perform addition, subtraction, multiplication and division. and also be used to extract square roots and cubic roots.

reduced Instruction set computer  
complex instruction set Computer

## Differences Between RISC and CISC.

### RISC

### CISC

- Emphasis on Software
  - Small number of fixed length instructions.
  - Single clock cycles for instructions.
  - Heavy use of Ram.
  - Execution time is very short.
  - Thus it can be said RISC is better as it also can be designed quickly than CISC.
- Emphasis on hardware.
  - Large number of instructions.
  - Instructions can take several clock cycles.
  - More efficient use of Ram.
  - Execution time is very long.

Scale of ICs in about total transistors.

SST - Small Scale Integration  $\rightarrow$  less than 100

MSI - Medium Scale "  $\rightarrow$  Between 100 and 1000

LST - Large " "  $\rightarrow$  " 1000 and 10,000

VLSI - Very " "  $\rightarrow$  " 10,000 and 100,000

ULSI - Ultra Large " "  $\rightarrow$  " 100,000 and 10,000,000

Performance Improved  $\Rightarrow$  Reducing CPI

## Clock cycles per instructions (CPI) : Elaborated

as clock cycles per instruction

Average CPI for a given program =  $\frac{\text{total CPU clock cycles}}{\text{Total Instruction count}}$

MIPS : Millions Instructions Per Second (Execution rate)

\* Faster machine = Larger MIPS.

$$\text{MIPS} = \frac{\text{Instruction Count}}{\text{Execution time} \times 10^6} = \frac{\text{Clock rate}}{\text{CPI} \times 10^6}$$

Drawbacks of MIPS:

- Does not take into account the capability of instructions. (Instruction count different)
- MIPS varies between programs on the same computer. (single CPU  $\Rightarrow$  different)
- A higher MIPS rating doesn't mean better performance.

# CPU Clock Cycles = Instructions for a program  $\times$  Average CPU time

$$\text{# CPU Execution Time} = \frac{\text{CPU clock cycles}}{\text{Clock rate}}$$

$$\text{# CPU Execution Time} = \frac{\text{CPU clock cycles} \times \text{Clock cycle time}}{\text{Clock rate}}$$

$$\text{Clock rate} = \frac{CPUs \times b/s}{MIPS}$$

$$MIPS = \frac{Instructions}{Clock rate}$$

$$\text{Clock cycle} = \frac{CPU \times \text{Count}}{\text{Clock rate}}$$

$$\text{# CPU clock cycles} = \frac{\text{Instructions for program} \times \text{CPI} \times \text{Clock cycle time}}{\text{Clock rate}}$$

$$= \frac{1}{\text{Clock rate}}$$

$$\text{# CPU clock cycles} = \frac{\text{Instructions for program} \times CPI}{\text{Clock rate}}$$

$$\text{# Clock rate} = \frac{\text{Instructions for program} \times CPI}{\text{CPU Execution Time}}$$

$$\text{# Instruction per Second (IPS)} = \frac{\text{Clock Rate}}{\text{CPI}}$$

$$\text{# Clock cycle} = \text{Execution time} \times \text{Clock rate}$$

$$\text{# CPI} = \frac{\text{Clock cycles}}{\text{Instruction Count}}$$

$$\text{# Overall CPI} = \frac{\sum CPI_i \times L_i}{\text{Instruction Count}}$$

Ques: Computer A has a Clock cycle time 250ps and 2.0 for some program  
 Computer B has a " " " 500ps and 1.2 for some program.  
 which is faster and why?

Ans: we know that

$$\text{CPU clock cycles} = \text{Instruction program} \times \text{CPI}$$

$$\rightarrow \text{So A have} = 1000 \times 2.0$$

$$\rightarrow \text{B} = 1000 \times 1.2$$

$$\text{CPU time} = \text{CPU clock cycles} \times \text{Clock times}$$

$$\text{for A} = 1000 \times 250\text{ps} = 500\text{TPS}$$

$$\text{for B} = 1000 \times 500\text{ps} = 600\text{TPS}$$

So A is faster bcz 500TPS less time needed.

amount of faster

$$\frac{\text{CPU performance A}}{\text{CPU performance B}} = \frac{\text{Execution time B}}{\text{Execution time A}} = \frac{600\text{TPS}}{500\text{TPS}} = 1.2$$

### Example 2

Computer

pules  
A have a clock cycle time 2 ns CPI 8.0 for program  
B n n n " instruction 3 ns " 2.0 " "

Am

here  
CPU Clock Cycles = Introduction for Program XCP7

$$A = 1 \times 3.0$$

$$B_1 = T \times 2.0$$

Execution time = Cpu clock cycles  $\times$  Cpu clock time

$$A = 1.3 \times 2\pi = 6.18$$

$$B = 1.20 \times 0.3 = 0.36$$

So both showing same performance level.

### Example 3

Consider three different processors P<sub>1</sub>, P<sub>2</sub>, P<sub>3</sub> executing the same subtraction set.

P<sub>1</sub> has a 3 GHz clock rate and CPI 1.5

P<sub>2</sub> " " 2.5 GHz clock rate and CPI 1.0

P<sub>3</sub> " " 4.0 GHz " " CPI of 2.2

1. which one has the highest performance expressed in subtractions per second.

2. If the processor each execute a program in 10s, find the number of cycles and the number of subtractions.

3. trying to reduce the execution time by 30% but this leads to an increase of 20% in the CPI. what

clock rate should we have to get this time reduction.

with stages =  $\frac{CPI}{CPI - 1}$  x with stages =  $\frac{CPI}{CPI - 1}$  x with stages =  $\frac{CPI}{CPI - 1}$

Ans ①

$$IPS = \frac{\text{Clock Rate}}{CPI}$$

for P<sub>1</sub>:

$$IPS = \frac{\text{Clock Rate}}{CPI} = \frac{3 \text{ GPF}}{1.5} = \frac{3 \times 10^9}{1.5}$$

=  $2 \times 10^9$  instruction per second.  
(ips).

for P<sub>2</sub> =  $\frac{2.5 \times 10^9}{1.0} = 2.5 \times 10^9$  ips

$$P_3 = \frac{9 \times 10^9}{2.2} = 1.82 \times 10^9$$
 ips

here P<sub>2</sub> =  $2.5 \times 10^9$  ips is highest performance.

## ② Number of cycles:

Cpu execution times = Cpu Clock cycles × clock time

$$\text{Cpu clock cycles} = \frac{\text{Cpu execution times}}{\text{Clock cycle time}}$$

∴ Cpu execution time  $\times \frac{1}{\text{Clock cycle time}}$   
= Cpu execution time  $\times \frac{\text{Clock cycle time}}{\text{Clock rate}}$ ,

So

$$\text{clock cycles} = \text{execution time} \times \text{clock rate}$$

$$\text{for } A \quad u = 10 \times 3 \times 10^9 = 3 \times 10^{10} \text{ cycles}$$

$$\text{if } B \quad " = 10 \times 2.5 \times 10^9 = 2.5 \times 10^{10} \text{ cycles}$$

$$\text{if } C \quad " = 10 \times 4 \times 10^9 = 4 \times 10^{10} \text{ cycles}$$

$$\frac{\text{Number of Instructions}}{\text{Instruction count}} : CPI = \frac{\text{CPU clock cycles}}{\text{Instruction count}}$$

$$\text{Instruction count} = \frac{\text{Clock cycles}}{CPI}$$

$$\text{for } P_1 \quad P_1 = \frac{3 \times 10^{10}}{1.5} = 2 \times 10^{10}$$

$$P_2 \quad " = \frac{2.5 \times 10^{10}}{7.0} = 2.5 \times 10^{11}$$

$$P_3 \quad " = \frac{4 \times 10^{10}}{2.2} = 1.82 \times 10^{10}$$

Ans ③

$$\text{Execution time} = \frac{\text{num of Instruction} \times \text{CPI}}{\text{clock rate}}$$

we can reduce execution time by and CPI increase 20% then

$$\frac{(100+20)}{100-30} = 70\% = 0.7$$

$$\text{Execution time} * 0.7 = \frac{\text{Instruction} * \text{CPI} * 1.2}{\text{new clock rate}}$$

So new clock rate  $\Rightarrow$

$$= \frac{\text{clock rate} \times \frac{1.2}{0.7}}{\text{Execution time} * 0.7}$$

$$P_1 = \downarrow \\ 3 * 1.71 = 5.13 \text{ GHz}$$

$$P_2 = 2.5 * 1.71 = 4.27 \text{ GHz}$$

$$P_3 = 4 \text{ GHz} * 1.71 = 6.84 \text{ GHz}$$

### Example 4

| Class     | ALU | FPU | Load | Store | Branch |
|-----------|-----|-----|------|-------|--------|
| frequency | 20% | 30% | 20%  | 10%   | 20%    |
| CPI       | 1   | 4   | 5    | 3     | 2      |

a) what is average CPI

Average CPI :  $0.2 \times 1 + 0.3 \times 4 + 0.2 \times 5 + 0.1 \times 3$

$+ 0.2 \times 2 = 3.1$

Processor time, 2000 ns per clock

$$ALU = \frac{0.2 \times 1}{3.1} = 6.4\%$$

$$FPU = \frac{0.3 \times 4}{3.1} = 38.7\%$$

$$Load = \frac{(0.2 \times 5)}{3.1} = 32.3\%$$

$$Store = \frac{(0.1 \times 3)}{3.1} = 9.7\%$$

$$Branch = \frac{(0.2 \times 4)}{3.1} = 12.9\%$$

## MIPS Math

Measurement

Instruction count  
for program

A → 10 B → 8 B

→ 4.0 GHz

→ 4.2 GHz

→ 1.0

CPI → 1.1

→ 1.0

Q: which is higher rating? faster?

$$\text{we know MIPS} = \frac{\text{Clock rate}}{\text{CPI} \times 10^6}$$

$$\text{for A} = \frac{4.2 \times 10^9}{1.0 \times 10^6} = 4200 \text{ MIPS}$$

$$\text{for B} = \frac{4.0 \times 10^9}{1.1 \times 10^6} = 3636 \text{ MIPS}$$

$$\text{CPU time (t)} = \frac{\text{Instruction Count} \times \text{CPI}}{\text{Clock rate}}$$

$$\text{for A} = \frac{10 \times 10^9 \times 1.0}{4.2 \times 10^9} = 2.38 \text{ sec}$$

$$\text{for B} = \frac{8 \times 10^9 \times 1.1}{4.0 \times 10^9} = 2.20 \text{ sec}$$

Computer A is higher MIPS, B is less execution time

## Instruction

The word of machine language are Instructions.

29n programmer  $A = B + C$  in Computer binary form

operation of computer hardware.

$$f = (g+h) - (i+j)$$

add to  $g, h$  { to  $\rightarrow g+h$  result}

add to  $i, j$  of  $f$  "  $i, j$ "

sub ~~f~~ to,  $i, j$  of sub too  $-H$ )

$$(g+h) - (i+j) = f$$

format  
components.

there are 3 parts from



## # MIPs design Principle

A Simplicity favors regularity -

- keeping all instructions single size

- requiring 3 register operands for most instructions

Example

HLL

$$\begin{aligned} a &= b + c \\ d &= a \cdot e \end{aligned}$$

On top of MIPs machine  
performed by computer

add a, (b, c)

operand

$$(b+c) - (d \cdot e) = a$$

destination & statement both are

## # Smaller programs

$$j = (g+h) - (i+j)$$

so j, g, h, i, j assigned registers

\$64 respectively.

\$64 temporary variable.

(Size of register MIPS = 32 bits)

add \$t<sub>0</sub> \$<sub>1</sub>, \$<sub>2</sub> [g+h in to]

add \$t<sub>1</sub> \$<sub>3</sub>, \$<sub>4</sub> [c+j " to]

sub \$t<sub>0</sub> to, \$<sub>1</sub> [f = (g+h) - (c+j)]

# good design demands good compromise

- compromise between provng. for lenge address.
- Keeping all instructions same length

# Make the common case fast  
quick add instruction with one constant

operand is called add immediate or add  
to add 4 to register \$s<sub>3</sub>, we write  
add \$s<sub>3</sub> \$s<sub>3</sub>, 4 (if \$s<sub>3</sub> = s<sub>3</sub> + 4)

## MIPS Field



Example : Add \$t0 \$s1, \$s2.

→ operand in MIPS field in binary and decimal

$$\text{Let } s_1 = 17, s_2 = 18, t_0 = 8.$$

|           | OP             | TGS           | TGT           | RCD           | Shamt         | Funct         |
|-----------|----------------|---------------|---------------|---------------|---------------|---------------|
| decimal   | 000            | 17            | 18            | 8             | 0 unused      | 32            |
| binary    | 000000<br>6bit | 01001<br>6bit | 10010<br>5bit | 01000<br>5bit | 00000<br>5bit | 10000<br>6bit |
| operation | Add            | \$s1          | \$s2          | \$t0          | unused        | 32            |

MIPS  
function  
82 bit  
long

- # The numeric version of the instructions are Machine language or machine code.
- # The layout of the instruction is instruction format.

S<sub>0</sub> 16  
S<sub>1</sub> 17  
S<sub>2</sub> 18

S<sub>3</sub> 19

S<sub>4</sub> 20

S<sub>5</sub> 21

S<sub>6</sub> 22

S<sub>7</sub> 23 32 registers >> \$zero always equals 0

\$ reserved for constants

MIPS uses byte addressed

\$Zero => constant value 0

register number

0

\$v<sub>0</sub> - \$v<sub>1</sub> => expression evaluation

=> number 2-3

\$a<sub>0</sub> - \$a<sub>3</sub> => arguments

=> reg number 4-7

\$t<sub>0</sub> - \$t<sub>7</sub> => temporaries

=> reg num 8-15

\$s<sub>0</sub> - \$s<sub>7</sub> => Saved

=> reg num 16-23

\$t<sub>8</sub> - \$t<sub>9</sub> => temporary

" => 24-25

\$gp => global pointer

=> 28

\$sp => stack pointer

=> 29

register number

\$fp => frame pointer

=> 30

\$ra => return address

=> 31

\$k - k + => reserved 05

=> 26-27

=> 01

=> 02

t<sub>0</sub> 8  
t<sub>1</sub> 9  
t<sub>2</sub> 10  
t<sub>3</sub> 11  
t<sub>4</sub> 12  
t<sub>5</sub> 13  
t<sub>6</sub> 14  
t<sub>7</sub> 15

# MIPS assembly language

Arithmetic

add  $\rightarrow \text{add } \$1 \$2 \$3, \Rightarrow \$1 = \$1 + \$2$   
Subtract  $\rightarrow \text{sub } \$1 \$2 \$3 \Rightarrow \$1 = \$1 - \$2$

Data in register

Data transfer

Load word  $\rightarrow \text{lw } \$1, 100(\$2) \Rightarrow \$1 = \text{Memory}(\$2 + 100)$

Store word

$\text{sw } \$1, 100(\$2) \Rightarrow \text{Memory}[\$2 + 100] \Rightarrow \$1$

Segment  $\Rightarrow \$1$

Data from register to memory.

conditional  
Branch

branch on equal  $\rightarrow \text{beq } \$1, \$2, L \Rightarrow \text{if } (\$1 == \$2) \text{ goto } L$

branch on not equal

$\rightarrow \text{bne } \$1, \$2, L \Rightarrow \text{if } (\$1 \neq \$2) \text{ goto } L$

Set on less than

$\rightarrow \text{slt } \$1, \$2, \$3 \Rightarrow \text{if } (\$2 < \$3), \underline{\underline{\$1 = 1}}$

else  $\underline{\underline{\$1 = 0}}$



jump and link  $\rightarrow$   $J \pi + ra \rightarrow ra = pc + 4$  go to  $100000$

### MIPS machine language

| Name    | Op  | rs | rd | rd | Example | fmt. | Comments          |
|---------|-----|----|----|----|---------|------|-------------------|
| add (R) | 0   | 18 | 18 | 18 | 0       | 32   | add \$1 \$2 \$3   |
| sub (R) | 0   | 18 | 19 | 17 | 0       | 34   | sub \$1 \$2 \$3   |
| lw      | I   | 25 | 18 | 17 | 100     | 0    | lw \$1 100(\$2)   |
| sw      | I   | 43 | 18 | 17 | 100     | 0    | sw \$2 100(\$2)   |
| beq     | I   | 5  | 17 | 18 | 25      | 2    | beq \$1, \$2, 100 |
| bne     | I   | 5  | 17 | 18 | 25      | 2    | bne \$1, \$2, 100 |
| slt     | R   | 0  | 18 | 19 | 17      | 0    | slt \$1, \$2, \$3 |
| jal     | (j) | 3  |    |    | 2500    | 4    | jal 10000         |
| jr      | R   | 0  | 31 | 0  | 0       | 0    | 8                 |

short note

OP অন্তর্ভুক্ত  $LW=35, SW=43$

$beq = 4$   $bne = 5$

$add = 8$

fixed অন্তর্ভুক্ত  $add = 32$

$sub = 34$   $slt = 42$

$Jr = 8$

Example

$$A = b + c \quad \leftarrow \text{inc}$$

add \$S\_0 \$S\_1 \$S\_2 \leftarrow \text{assembly}

add \$rd, \$rs, \$rt \rightarrow \text{in reg field}

| OP | rs | rd | rd | Shamt | funct | decimal | binary                                |
|----|----|----|----|-------|-------|---------|---------------------------------------|
| 0  | 17 | 18 | 16 | 0     | 82    | 82      | 100000 10001 10010 10000 00000 100000 |

R format Instructions

- OP have 0

| OP    | rd | rd | rd | Shamt | funct |
|-------|----|----|----|-------|-------|
| bit 6 | 5  | 5  | 5  | 5     | 6.    |

• add, addu, Sub, Subu, and, or, more

SHU, SHU, all are R format

SLL \$1 \$2

rd rd

10

Shamt

| OP   | rs   | rd   | Shift Am        | funct |
|------|------|------|-----------------|-------|
| 0000 | 0010 | 0001 | 00000 1010 0000 | 00    |

w

18

17

ST - Ultra

## I format Introduction

- Have 2 registers and a constant value (immediate) present

| OP<br>6 bit | TCS<br>5 bit | Td<br>5 bit | Immediate<br>16 bit |
|-------------|--------------|-------------|---------------------|
|-------------|--------------|-------------|---------------------|

addi, addiu, sw, lw, beq, bne are I format

Example: add \$t<sub>2</sub>, \$t<sub>3</sub>, y (2 register \$t<sub>2</sub> \$t<sub>3</sub>)  

$$\begin{array}{r} \text{add} \\ \hline t_2 & t_3 & y \end{array}$$
 10 19

| OP | TCS | Td | Immediate |
|----|-----|----|-----------|
| 8  | 10. | 10 | 4         |

## J format Introduction

Have one address (part of one, acceptably)

J, JR, J are J format

| OP<br>2 | Address<br>25000 | #J 10000 |
|---------|------------------|----------|
|         |                  |          |

In J format  
address  
find by

$$10000 / 4$$

$$= 25000$$

Q4 Q8

Practise

MIPS Instruction  $\rightarrow$  machine code

①

1. bne \$5 \$3 \$4  $\rightarrow$  1000

So go to I format

| OP        | rs         | rd         | Immideate                      |
|-----------|------------|------------|--------------------------------|
| bne<br>=5 | \$3<br>=10 | \$4<br>=20 | (000 1000 0000 0000 0000 1000) |
| 000010    | 10011      | 10100      | 0000 0000 0000 1000            |

② add \$50, \$1, \$2

So go to R format

| OP        | rs         | rd         | rd         | Shamt | funct  |
|-----------|------------|------------|------------|-------|--------|
| add<br>=0 | \$1<br>=17 | \$2<br>=18 | \$0<br>=16 | 0     | 32     |
| 000000    | 100010     | 10010      | 10000      | 00000 | 100000 |

③

J 1101000

go to J format

| OP     | address                            |
|--------|------------------------------------|
| J=2    | 0000,0000 0000 0000<br>0001 001000 |
| 000010 |                                    |

4. Sub \$s<sub>0</sub> \$s<sub>1</sub> \$s<sub>2</sub>

q-to tc format

| OP  | res | rd | tc | Shamt | fnc1               |
|-----|-----|----|----|-------|--------------------|
| sub | 17  | 18 | 16 | 0     | sub = 34<br>100000 |

SP 23 S<sub>2,b</sub>

(i) add \$s<sub>5</sub> \$t<sub>5</sub>, 20 (q-to R format)

| OP         | res        | r1 | rd         | Shamt | fnc1 | Immmediate |
|------------|------------|----|------------|-------|------|------------|
| add<br>= 0 | 15<br>= 13 | 0  | 55<br>= 01 | 0     | 32   | 20         |

(ii) bne \$s<sub>3</sub> \$s<sub>1</sub> @100

q-to i format

| OP        | res        | rd         | Immmediate |
|-----------|------------|------------|------------|
| bne<br>-5 | 83<br>= 13 | 81<br>= 17 | 100        |

H8 ⑩

(iii)  $Sw \$s_1 \cdot 80 (\$s_2)$  ( $I+D$  i format)

here ~~out~~ Memory  $(s_2 + 80) = s_1$ .

| OP        | rs         | rd         | immediate           |
|-----------|------------|------------|---------------------|
| $SW = 43$ | $s_2 = 18$ | $s_1 = 23$ | 80                  |
| 101011    | 10010      | 10111      | 0000 0000 0010 0000 |

(iv)  $beq \$s_1, \$s_6, 24$  ( $s_1 = s_2 + 80$  to 24).

$I+D$  i format

| OP        | rs         | rd         | immediate           |
|-----------|------------|------------|---------------------|
| $beq = 4$ | $s_1 = 17$ | $s_6 = 22$ | 24.                 |
| 000000    | 10001      | 10100      | 0000 0000 0001 1000 |

(v)  $s11 \$s_1 \$s_2 \cdot 10$  (R format).

| OP   | rs    | rd    | rd    | Shamt | gnd |
|------|-------|-------|-------|-------|-----|
| 0000 | 00010 | 10001 | 01010 | 0000  |     |

18  
s2  
s1  
17  
10

## MIPS Addressing Mode

multiple forms of addressing are generically called addressing modes.

1. Register addressing  $\Rightarrow$  where operand is in register



2. Based on displacement addressing  $\Rightarrow$  where operand is at the memory location where address is the sum of a register and a constant in instruction.



3. immediate addressing  $\Rightarrow$  where the operand is a constant within the instruction itself.



④ PC-relative addressing Mode  $\Rightarrow$  where the address is the sum of the PC and constant in the instruction.



⑤ Pseudo direct addressing  $\Rightarrow$  where the jump address is the 26 bits of the instruction concatenated with the upper 18 bits of the PC.

of the PC



## Floating Point



# Base 10

$$n = 8 = 2^3 = 6 = 2^7 = 128^{-1} \quad (32 \text{ bits})$$

$(0.1234567)_{10} \times 10^7$

choose exponent 01

(1023 base 2)  $\leftarrow$  127

$$n = 11 = 2^{10} = 2^{10} = 1024 - 1 = 1023 \quad (64 \text{ bits})$$

$$(0.00011)_2$$

## Example

- IEEE 754 binary representation of number
- $0.75_{\text{ten}}$  Single or double precision

# - 0.75

$$\begin{array}{r}
 \cdot 75 \\
 + \cdot 75 \\
 \hline
 1.50
 \end{array}
 \quad
 \begin{array}{r}
 \cdot 50 \\
 + \cdot 50 \\
 \hline
 1.0
 \end{array}$$

For binary 1.11  
 1.11 → 1.11  
 So - 0.11

- 0.11 Three ways

→ 1000 1000 1000

$$\begin{array}{ccc}
 & 0.11 & \\
 \xrightarrow{\quad} & & \xrightarrow{\quad} 2^{-(2^0+2^1+2^2)} \\
 2^{-(2^0+2^1+2^2)} & &
 \end{array}$$

$$\text{So here } 0.\cancel{1}\cancel{1}_2 + 0.11$$

$$= -1.1 \times 2^{-1} \left[ \text{so } \rightarrow 5/2 \text{ so } 2^{-1} \right]$$

$$32 \text{ bit } \text{b}cub = 127$$

$$G_4 \text{ " } = \underline{1023}$$

Single precision requ:

$(-1)^5 \times (1 + \text{Fraction}) \times 2^{\text{Exponent}} = \text{basis}$

$$x_2(126-127)$$



On this as exponent 126 binary point

128 64 32 16 8 4 2 1  
0 1 1 1 1 1 0

64 bit (11) 2<sup>10</sup>, 1024

-2<sup>10</sup>, -1024

-2<sup>11</sup>, -1023

so double precision representation

$$(-1)^5 \times (1 + \text{Fraction}(s_2 \text{ bits})) \times 2$$

(Exponent-bit)

$$= (-1)^1 \times \left(1 + .1 \dots s_1 \text{ error}\right) \times 2 \quad (1023 \text{ or } -1023)$$

| s | 1 | 0 | 1 | 1 | 1 | 1 | + | 1 | 1 | 1 | 0 | 2 | <u>s<sub>1</sub> OR 0</u> |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---------------------------|
|   |   |   |   |   |   |   |   |   |   |   |   |   |                           |

1023 in binary

Converting binary to decimal.

\* binary tree

$$\frac{1}{8} \overline{10000001}$$

Exponent

$$\text{precision (23)} \rightarrow 2.02410 \quad 3.02^{-2} = \frac{1}{4} = 0.25$$

So sign = + Exponent  $\frac{10000001}{10} = 120$ , fraction  $= 0.25$

$$Q = (-1)^S + \left(1 + \text{fraction}\right) \times 2^{\left(\text{Exponent} - \frac{\text{bits}}{2}\right)}$$

$$= (-1)^1 + (1 + 0.25) \times 2^{(129 - 127)} \quad \begin{matrix} 32 \text{ bars} \\ 127 \end{matrix}$$

$$= (-5, 0) \stackrel{10}{\text{A m/s}}$$

$$3 \times 4 = 12$$

product

1st multiplication Algorithm

multiplicand  
/      multiplier  
multiplicand



repeat  
32 < 0  
next  
back



Multiplexer = MUL  
Multiplicand = MD

hardware of 1st version



## 1st division algorithm

division  
( $\frac{\text{dividend}}{\text{divisor}}$ )

$$8 \overline{) 32} \quad \begin{matrix} \text{divisor} \\ \text{quotient} \\ 32 \\ \hline 0 - \text{remainder} \end{matrix}$$

flow chart



If not division hardware



Final multiplication Hardware



## Final version of multiplication



# Final version of Division



## Final version : hardcore



# division

## Formal version Algorithm

$$(2 \div 7)$$

| Step                            | Divisor                | remainder / dividend                |
|---------------------------------|------------------------|-------------------------------------|
| Initial<br>shift left (rounded) | 0010                   | 0000 0110.                          |
|                                 | 0020                   | 0000 1010.                          |
| 0000<br>0010<br>1110            | 0.010<br>0040<br>0040  | 110.1110<br>0000 1110<br>0001 1100  |
| 1110<br>0010<br>10000<br>②      | 0010<br>0010<br>0020   | 111.1000.<br>0001 1100<br>0011 1000 |
| 0001<br>0040<br>1011<br>③       | 0010                   | 0001 1000<br>0014.0010              |
| 111<br>0010<br>1001<br>④        |                        |                                     |
| 0033<br>0010<br>0003<br>⑤       | 0001 1000<br>0011 0000 |                                     |

(i) Sub \$S\_2 , \$T\_5 , 20 (R formed.)

Sub rd, rd, rs, shift

|     | OP     | rs     | rd     | rd     | Shift  | fnet   |
|-----|--------|--------|--------|--------|--------|--------|
| d → | 0      | 0      | 13     | 18     | 20     | 34     |
| b → | 000000 | 000000 | 000000 | 100010 | 000000 | 100010 |

(ii) lde \$S\_3 , \$T\_1 400 (I format)

OP rs rd Immediate

|   | OP     | rs    | rd    | Immediate               |
|---|--------|-------|-------|-------------------------|
| d | 5      | 63/19 | 51/17 | 400                     |
| b | 000011 | 10011 | 10001 | 00000000000000000000100 |

(iii) SW \$87 80 (\$S\_2) (I format)  
 $\text{Set} = (80 + \$2) = 87$   
 $\text{memory}$

|   | OP     | rs    | rd    | Immediate           |
|---|--------|-------|-------|---------------------|
| d | 43     | 62/18 | 57/23 | 80                  |
| b | 101011 | 10010 | 10111 | 0000 0000 0101 0000 |