

# **NOTRE DAME UNIVERSITY**

## **BANGLADESH**



### Computer Architecture

### Assignment Mid

---

Submitted to: Mondira Chakraborty (LecturerCSE)

Submitted by: Shazidul Alam

Student ID: 0692220005101009

Batch: CSE-19

# Lecture → 1 (Introduction)

(25-08-24)

1) Definition of computer Architecture?

⇒ In computer engineering, computer architecture is a set of rules and methods that describe the functional organization, and implementation of computer systems. Some definitions of architecture define it as describing the capabilities and programming model of a computer but not a particular implementation. In other descriptions computer architecture involves instruction set architecture design, microarchitecture design, logic design and implementation.

\* Computer architecture = Instruction set architecture + Machine organization + Hardware

2) Instruction Set Architecture (ISA)?

⇒ ISA is the part of the processor that is visible to the programmer or compiler writer. The ISA serves as the boundary between software and hardware.



Example: MIPS

Desired properties:

- Convenience (from software side)
- Efficiency (from hardware side)

### 3) Machine Organization?

⇒ High-level aspects of a computer's design

- Principal components: memory, CPU, I/O, ...

- How components are interconnected

- How information flows between components.

- e.g. AMD Opteron 64 and Intel Pentium 4:  
Same ISA but different organizations.

### 4) Hardware?

⇒ Detailed logic design and packaging technology of a computer

- E.g. Pentium 4 and mobile pentium 4: nearly identical organizations but different hardware details.

### 5) Types of computers and their applications?

⇒ a) Desktop → Run third-party software

→ Office to home applications

b) Servers

→ Modern version of what used to be called mainframes, minicomputers and supercomputers

→ Large workloads

→ Built using the same technology in desktops but higher capacity:

- i) Expandable

- ii) Scalable

- iii) Reliable

c) Large spectrum: from low-end (file storage, small businesses) to supercomputers (high end scientific and engineering applications)

Examples: file servers, web servers, database servers.

d) Embedded?

- ⇒ i) Microprocessor everywhere! (washing machines, cell phones, automobiles, video games).
- ii) Run one or a few applications.
- iii) Specialized hardware integrated with the application.
- iv) Usually stringent limitations (battery power)
- v) High tolerance for failure.
- vi) Becoming ubiquitous.
- vii) Engineered using processor cores:
  - The core allows the engineer to integrate other functions into the process for fabrication on the same chip.
  - Using hardware description languages:  
Verilog, VHDL

---

### Computer organization / Anatomy:

Input → Memory → Datapath → Processor → Memory → Output

1) Input: Keyboard, mouse

2) Memory: Where programs, data live when running.

3) Control: (Brain) Guides the operation of other components based on the user instructions.

4) Datapath: (Brawn) performs arithmetic operation

5) Output: Display, printer.

(28-08-2029)

## ⑥ High-level language in C



Binary machine language program (for MIPS)

## ⑦ Benefits of High level programming language?

- ⇒ 1) Allow the programmers to think in a more natural language using English words and algebraic notation, which looks much more like text.
- 2) Allow languages to be designed ~~to~~ according to their intended use. Fortran used for scientific computation, Cobol for business data processing, Lisp for symbol manipulation and so on.
- 3) Another advantage of programming language is improved programmer productivity.
- 4) Widespread agreement is that it takes less time to develop programs when they are written in languages that require fewer lines to express an idea.
- 5) Compilers and assemblers can translate high-level language programs to the binary instructions of any computer.

These advantages are so strong that today very little programming done in assembly language.

(28-08-2024)

(Chapter-2)  
(Performance)

①



② Throughput: Total amount of work done in a give time. Example:

1 task — 5 sec

15 sec — 3 task (throughput)

③ Performance: To maximize performance, response time or execution time should be minimized.

Example: For computer X,  $\text{Performance}_X = \frac{1}{\text{Execution time}_X}$

$\text{Performance}_X > \text{Performance}_Y$

$$\frac{1}{\text{Execution time}_X} > \frac{1}{\text{Execution time}_Y}$$

$$\text{Execution time}_Y > \text{Execution time}_X$$

$$\frac{\text{Performance}_X}{\text{Performance}_Y} = \frac{\text{Execution time}_Y}{\text{Execution time}_X} = n$$

\* Q) If computer A runs a program in 10 seconds and computer B runs the same program in 15 seconds, how much faster is A than B?

⇒ Hence, Execution time of A = 10 sec  
Execution time of B = 15 sec

$$\therefore \text{Performance of } A = \frac{1}{10} = 0.1$$

$$\therefore \text{ " } \text{ " } B = \frac{1}{15} = 0.067$$

$$\therefore \frac{\text{Performance } A}{\text{Performance } B} = \frac{0.1}{0.067} = 1.49$$

∴ A is 1.49 times faster than B

---

Substitute:  $\frac{\text{Performance } A}{\text{Performance } B} = \frac{15}{10}$

$$\Rightarrow \frac{P_A}{P_B} = 1.5$$

$$\therefore P_A = 1.5 P_B$$

(01-09-2024)

## ⑤ Measuring Performance?

- Ans: a) The computer that performs the same amount of work in the least time is the fastest.
- b) Program execution time is measured in seconds per program.
- c) Time is also called, wall clock time, response time, execution time or elapsed time.
- d) These means total time to complete a task including disk accesses, memory accesses, input/output (I/O) activities, operating system overhead everything.
- e) Processors may work on several programs simultaneously.
- f) CPU execution time or CPU time which recognizes the distinction, is the time the CPU spends computing for this task and does not include time spent waiting for I/O or running other programs.

## ⑥ CPU time -

- a) User CPU time
- b) System CPU time

⑦ User CPU time: CPU time spent in the program is called User CPU time.

⑧ System CPU time: CPU time spent in the operating system performing tasks requested by the program is called System CPU time.

The term system performance is used to refer to elapsed time on an unloaded system while CPU performance refers to user CPU time on an unloaded system.

- 1) Atto second  $\rightarrow 10^{-18}$
- 2) Femto "  $\rightarrow 10^{-15}$
- 3) Pico "  $\rightarrow 10^{-12}$
- 4) Nano "  $\rightarrow 10^{-9}$
- 5) Micro "  $\rightarrow 10^{-6}$
- 6) Milli "  $\rightarrow 10^{-3}$
- 7) Centi "  $\rightarrow 10^{-2}$
- 8) Deci "  $\rightarrow 10^{-1}$
- 9) Second  $\rightarrow 1$
  
- 10) Deca Hz  $\rightarrow 10^1$
- 11) Hecto Hz  $\rightarrow 10^2$
- 12) Kilo Hz  $\rightarrow 10^3$
- 13) Mega Hz  $\rightarrow 10^6$
- 14) Giga Hz  $\rightarrow 10^9$
- 15) Terra Hz  $\rightarrow 10^{12}$

⑨ Clock Period: Time for a complete clock cycle. ( $T$ )

⑩ Clock rate: Inverse of clock period.

⑪ Math: Clock period =  $250 \text{ ps}$

$$\therefore \text{Clock rate} = \frac{1}{\text{clock period}}$$

$$= \frac{1}{250 \text{ ps}}$$

$$= \frac{1}{250 \times 10^{-12} \text{ s}}$$

$$= 4 \times 10^{-3} \times 10^{12} \text{ Hz}$$

$$= 4 \times 10^9 \text{ Hz}$$

$$= 4 \text{ GHz}$$

(04-09-2021)

## 12) Features of a clock?

Ans: **Period** is the time it takes for a clock signal to repeat. Its unit of measurement is seconds. The symbol for the period is  $T$ .

**Frequency** is  $1/T$ . Its unit is  $s^{-1}$ , which is inverse seconds. This is also called Hz. Sometimes "cycles per second". They all mean the same. "Speed" of a CPU also means the clock frequency.

**Positive edge**: The signal transitions from 0 to 1.

**Negative edge**: The signal transitions from 1 to 0.

## 13) Why clock used here?

Ans: We use a clock to synchronize the events of a CPU. Devices in a CPU can be categorized as sequential circuits or combinational circuits. Sequential logic devices use a clock.

Sequential logic devices can only change output ~~once~~ during a positive or a negative ~~edge~~.

## 14) Can't we make CPU faster?

Ans: It takes time for signal to move around in a circuit. It can be reduced in several ways. The main way is to make the circuit smaller. When the circuit is smaller, signals have less distance to travel and therefore everything can happen more quickly.

15) CPU Performance and It's Factors?

\* CPU Execution time =  $\frac{\text{CPU clock cycles for a program}}{\text{clock cycle time}}$

\* CPU Execution time =  $\frac{\text{CPU clock cycles for a program}}{\text{clock rate}}$

16) Our favorite program runs in 10 seconds on computer A, which has 2 GHz clock. We need to build computer B which will run this program in 6 seconds. So clock rate must be increased. This will cause computer B to require 1.2 times as many clock cycles as computer A for this program. What clock rate should we tell the designer to target?

Ans: Hence, CPU Execution time, A = 10 s

CPU clock rate, A = 2 GHz

CPU Execution time, B = 6 s

Let, computer A clock cycle =  $x$

computer B clock cycle =  $1.2 \times x$

Computer A CPU Execution time =  $\frac{\text{Computer A clock cycle}}{\text{clock rate computer A}}$

$$\Rightarrow 10s = \frac{x}{2\text{GHz}} \Rightarrow 10s = \frac{x}{2 \times 10^9 \text{Hz}} \Rightarrow x = 20 \times 10^9 \text{sec}$$

∴ Computer A clock cycle  $x = 20 \times 10^9$  cycles

CPU Execution time, B =  $\frac{\text{clock cycle B}}{\text{clock rate B}}$

$\Rightarrow$  Clock rate, B =  $\frac{\text{clock cycle B}}{\text{CPU Execution time B}}$

$$\Rightarrow \text{Clock rate, B} = \frac{1.2 \times 20 \times 10^9}{6s} = 4 \times 10^9 \text{s}^{-1} = 4 \text{GHz}$$

(08-09-2024)

17) Suppose we have two implementations of the same instruction set architecture. Computer A has a clock cycle time of 250 ps and a CPI of 2.0 for some program. and computer AB has a clock cycle time of 500 ps and a CPI of 1.2 for the same program.

Which computer is faster for this program and by how much?

Ans: Here, clock cycle time, A = 250 ps  
clock cycle time, B = 500 ps

$$CPI_A = 2$$

$$CPI_B = 1.2 \quad \text{clock cycle time}$$

$$CPV \text{ clock cycle, } A = I \times \cancel{CPI} = I \times \cancel{2.0} 250 \text{ ps}$$

$$\text{II} \quad \text{II}, B = I \times \cancel{CPI} II = I \times 500 \text{ ps}$$

$$\text{CPV Execution time, } A = \frac{\text{CPU clock} \times \cancel{\text{clock cycle}}}{\text{cycle}} \frac{\text{CPI}}{\text{time}}$$
$$= \cancel{2.0} 250 I \times 2$$

$$\text{II} \quad \text{II}, B = \frac{\text{CPU clock} \times \cancel{\text{clock cycle}}}{\text{cycle}} \frac{\text{CPI}}{\text{time}}$$
$$= 500 I \times 1.2$$

$$\frac{\text{Performance A}}{\text{Performance B}} = \frac{\cancel{2.0} I \times 272}{\cancel{4.8} I \times 225} \rightarrow \frac{\text{CPV Execution time B}}{\text{CPV Execution time A}}$$

$$= \frac{500 I \times 1.2}{250 I \times 2}$$

$$= 1.2$$

18) A compiler designer is trying to decide between two code sequences to a particular computer. The hardware designers have supplied the following facts:

| CPI for each instruction class |   |   |   |
|--------------------------------|---|---|---|
|                                | A | B | C |
| CPI                            | 1 | 2 | 3 |

For a particular high-level language statement, the compiler writer is considering two code sequences that require the following instruction counts:

|   | Instruction counts for each instruction class |   |   |
|---|-----------------------------------------------|---|---|
|   | A                                             | B | C |
| 1 | 2                                             | 1 | 2 |
| 2 | 4                                             | 1 | 1 |

Which code sequence executes the most instructions? Which will be faster? What is the CPI for each sequence?

Ans: Sequence 1 executes =  $2 + 1 + 2 = 5$  instructions

$$\text{II} \quad 2 \quad \text{II} = 4 + 1 + 1 = 6 \quad \text{II}$$

$$\therefore \text{CPU clock cycles}_1 = (2 \times 1) + (1 \times 2) + (2 \times 3) = 10 \text{ cycles}$$

$$\therefore \text{II} \quad \text{II} \quad \text{II} \quad 2 = (4 \times 1) + (1 \times 2) + (1 \times 3) = 9 \text{ cycles}$$

$$\therefore \text{CPI}_1 = \frac{\text{CPU clock cycles}_1}{\text{Instruction counts}_1} = \frac{10}{5} = 2.0$$

$$\therefore \text{CPI}_2 = \frac{\text{CPU clock cycles}_2}{\text{Instruction counts}_2} = \frac{9}{6} = 1.5$$

(11-09-2024)

## Lecture-3 (Instruction Set)

Instruction set: The words of a computer's language are called instructions and its vocabulary is called instruction set. The chosen instruction set comes from MIPS technologies. MIPS is a reduced instruction set computer (RISC) instruction set architecture (ISA) developed by MIPS Technologies.

MIPS → (Microprocessor without Interlocked Pipeline Stages).

1)  $a = b + cd + e$  (Translate C code into assembly code)

$$\Rightarrow \begin{array}{ll} \text{add } a, b, c & \text{add } a, b, c \\ \text{add } a, a, d & \text{on, add } f, d, e \\ \text{add } a, a, e & \text{add } a, a, f \end{array}$$

2)  $f = (g+h) - (i+j)$

$$\Rightarrow \begin{array}{ll} \text{add } t0, g, h & \text{add } f, g, h \\ \cancel{\text{add }} & \text{on, sub } f, t0, i \\ \text{add } t1, i, j & \text{sub } f, t0, i \\ \text{sub } f, t0, t1 & \end{array}$$

3) Register: It is a small, fast storage location within a CPU used to hold data temporarily for processing.

(15-09-2024)

Example-3]  $f = (g+h) - (i+j)$

|                                     |                 |
|-------------------------------------|-----------------|
| add \$t_0, \\$S_1, \\$S_2           | $t_0 = g+h$     |
| add \$t_1, \\$S_3, \\$S_4           | $t_1 = i+j$     |
| <del>sub \$t_0, \$t_1, \\$S_5</del> | $f = t_0 - t_1$ |

or,

$$f = (g+h) - i - j$$

|                                     |                 |
|-------------------------------------|-----------------|
| add \$t_0, \\$S_1, \\$S_2           | $t_0 = g+h$     |
| <del>sub \$t_1, \$t_0, \\$S_3</del> | $t_1 = t_0 - i$ |
| sub \$S_0, \$t_1, \\$S_4            | $f = t_1 - j$   |

---

$$\text{Q} a = (b+d) - (-j+k) - c$$

$$\Rightarrow a = (b+d) + j - k - c$$

|                           |                 |
|---------------------------|-----------------|
| add \$t_0, \\$S_1, \\$S_3 | $t_0 = b+d$     |
| add \$t_1, \$t_0, \\$S_4  | $t_1 = t_0 + j$ |
| sub \$t_2, \$t_1, \\$S_5  | $t_2 = t_1 - k$ |
| sub \$S_0, \$t_2, \\$S_2  | $a = t_2 - c$   |

(22-09-2024)

$$\begin{aligned} \text{(i)} \quad P &= (-q-n) - t - (m-n) \\ &= -(q+n) - t + m + n \\ &= m + n - (q+n) - t \end{aligned} \quad \left| \begin{array}{l} \text{Let,} \\ \$S_0 = P, \$S_1 = m \\ \$S_2 = n, \$S_3 = q \\ \$S_4 = n, \$S_5 = t \end{array} \right.$$

$$\begin{aligned} \Rightarrow \text{add } &\$t_0, \$S_1, \$S_2 \quad | t_0 = m + n \\ \text{add } &\$t_1, \$S_3, \$S_4 \quad | t_1 = q + n \\ \text{sub } &\$t_2, \$t_0, \$t_1 \quad | t_2 = t_0 - t_1 \\ \text{sub } &\$S_0, \$t_2, \$S_5 \quad | P = t_2 - S_5 \end{aligned}$$

---

$$\text{(ii)} \quad a = (b+c) - (c-d) - (e+f)$$

$$\text{where, } a = \$S_0, b = \$S_1$$

$$c = \$S_2, d = \$S_3$$

$$e = \$S_4, f = \$S_5$$

$$\begin{aligned} \Rightarrow \text{add } &\$t_0, \$S_1, \$S_2 \quad | t_0 = b+c \\ \text{sub } &\$t_1, \$S_2, \$S_3 \quad | t_1 = c-d \\ \text{sub } &\$t_2, \$t_0, \$t_1 \quad | t_2 = t_0 - t_1 \\ \text{add } &\$t_3, \$S_4, \$S_5 \quad | t_3 = e+f \\ \text{sub } &\$S_0, \$t_2, \$t_3 \quad | a = t_2 + t_3 \end{aligned}$$

---

$$\text{(iii)} \quad g = h + A [8]$$

Assembly:

lw \$t\_0, 8(\$s3)  
add \$S\_1, \$S\_2, \$t\_0

(25-09-2024)

Given: Assume variable  $h$  is associated with register  $\$S2$  and the base address of the array  $A$  is in  $\$S3$ . What is the MIPS assembly code for the C statement below?

$$A[12] = h + A[8];$$

↓              Variable      ↓  
Array name                Array name

Ans:

```
lw $t0, 32($S3)
add $t0, $S2, $t0
sw $t0, 48($S3)
```

Given:  $A_{xx}[124] = A_{xx}[22] - A_{xx}[25] + h + (p+q)$

Ans:

```
lw $t0, 88($S1)
lw $t1, 100($S1)
sub $t1, $t0, $t1
add $t2, $t1, $S2
add $t2, $t2, $S3
add $t2, $t2, $S4
sw $t2, 998($S1)
```

|                    |
|--------------------|
| $t_0 = A_{xx}[22]$ |
| $t_1 = A_{xx}[25]$ |
| $t_2 = t_0 + t_1$  |
| $t_2 = t_2 + S_2$  |
| $t_2 = t_3 + S_3$  |
| $t_2 = t_2 + S_4$  |

(29 - 09 - 2024)

$$1/ \boxed{\square} A[22] = A[20] + B[12] + C[5] \rightarrow \begin{cases} A \text{ base} = \$S_1 \\ B \text{ base} = \$S_2 \\ C \text{ base} = \$S_3 \end{cases}$$

|                              |                        |
|------------------------------|------------------------|
| $\nwarrow \$t_0, 80 (\$S_1)$ | $t_0 = \text{ANN}[20]$ |
| $\nwarrow \$t_1, 48 (\$S_2)$ | $t_1 = \text{ANN}[12]$ |
| $\nwarrow \$t_2, 20 (\$S_3)$ | $t_2 = \text{ANN}[5]$  |
| add $\$t_3, \$t_0, \$t_1$    | $t_3 = t_0 + t_1$      |
| add $\$t_3, \$t_3, \$t_2$    | $t_3 = t_3 + t_2$      |
| $\nwarrow \$t_3, 88 (\$S_1)$ | $t_3 = \text{ANN}[22]$ |

---

$$2/ \boxed{\square} g = h + \text{ANN}[255]$$

|                 |                                                                                                         |
|-----------------|---------------------------------------------------------------------------------------------------------|
| $p = g + h + s$ | $\begin{cases} \$S_1 = \text{ANN}[253] \\ \$S_2 = g \\ \$S_3 = h \\ \$S_4 = s \\ \$S_5 = p \end{cases}$ |
|-----------------|---------------------------------------------------------------------------------------------------------|

$$\text{ANN}[253] = \text{ANN}[23] + p + g$$

$\Rightarrow$

~~add  $\$t_0, \$t_0, \$t_2$~~

|                                |                         |
|--------------------------------|-------------------------|
| $\nwarrow \$t_0, 1020 (\$S_1)$ | $t_0 = \text{ANN}[255]$ |
| add $\$S_2, \$t_0, \$S_3$      | $g = t_0 + h$           |
| add $\$t_1, \$S_2, \$S_3$      | $t_1 = g + h$           |
| add $\$S_5, \$t_1, \$S_4$      | $p = t_1 + s$           |
| add $\$t_2, \$S_5, \$S_2$      | $t_2 = p + \cancel{s}g$ |
| $\nwarrow \$t_3, 292 (\$S_1)$  | $t_3 = \text{ANN}[23]$  |
| add $\$t_4, \$t_3, \$t_2$      | $t_4 = t_3 + t_2$       |
| $\nwarrow \$t_4, 1012 (\$S_1)$ | $t_4 = t_3 + t_2$       |

# **NOTRE DAME UNIVERSITY**

## **BANGLADESH**



### Computer Architecture

### Assignment Final

---

Submitted to: Mondira Chakraborty (LecturerCSE)

Submitted by: Shazidul Alam

Student ID: 0692220005101009

Batch: CSE-19

(20-10-2024)

## Lecture-4] Representing Instructions in the Computer

1) Instruction: Instructions are kept in the computer as a series of high and low electronic signals and may be represented as numbers.

|                        |                        |
|------------------------|------------------------|
| \$ t <sub>0</sub> = 8  | \$ s <sub>0</sub> = 16 |
| \$ t <sub>1</sub> = 9  | \$ s <sub>1</sub> = 17 |
| \$ t <sub>2</sub> = 10 | \$ s <sub>2</sub> = 18 |
| \$ t <sub>3</sub> = 11 | \$ s <sub>3</sub> = 19 |
| \$ t <sub>4</sub> = 12 | \$ s <sub>4</sub> = 20 |
| \$ t <sub>5</sub> = 13 | \$ s <sub>5</sub> = 21 |
| \$ t <sub>6</sub> = 14 | \$ s <sub>6</sub> = 22 |
| \$ t <sub>7</sub> = 15 | \$ s <sub>7</sub> = 23 |

~~add~~ \$ s<sub>1</sub> + \$ s<sub>2</sub> = \$ t<sub>0</sub>  
first operand second operand

[0] [ ] 32

↓  
Means add

|       |                   |    |                   |   |                     |   |
|-------|-------------------|----|-------------------|---|---------------------|---|
| # add | \$ s <sub>1</sub> | ,  | \$ s <sub>1</sub> | , | \$ s <sub>1</sub> 2 | ↓ |
| 0     | 17                | 18 | 17                | 0 | 32                  |   |

|        |        |        |        |        |        |
|--------|--------|--------|--------|--------|--------|
| OP     | RA     | RT     | RD     | shamt  | funct  |
| 6 bits | 5 bits | 5 bits | 5 bits | 5 bits | 6 bits |

rn: register source 1st operand

rt: register source 2nd operand

rd: register destination

shamt: shift amount.

function: function code.

# lw \$t<sub>0</sub>, 32(\$s<sub>3</sub>)

Explain  
↓

| OP | 19 | 8 | address |
|----|----|---|---------|
|----|----|---|---------|

| OP     | rn     | rt     | constant or address |
|--------|--------|--------|---------------------|
| 6 bits | 5 bits | 5 bits | 16 bits             |

(23-10-2024)

add, sub, addi, lw, sw

R type

I type

$$a = b + c$$

$$d = e + f$$

$$p = a + d$$

$$a = \$s_0 \rightarrow 16$$

$$b = \$s_1 \rightarrow 17$$

$$c = \$s_2 \rightarrow 18$$

$$d = \$s_3 \rightarrow 19$$

$$e = \$s_4 \rightarrow \cancel{20}$$

$$f = \$s_5 \rightarrow 21$$

$$p = \$s_6 \rightarrow 22$$

add \$s<sub>0</sub>, \$s<sub>1</sub>, \$s<sub>2</sub>

$$S_0 = a + b$$

add \$s<sub>3</sub>, \$s<sub>4</sub>, \$s<sub>5</sub>

$$S_3 = e + f$$

add \$s<sub>6</sub>, \$s<sub>0</sub>, \$s<sub>3</sub>

$$S_6 = a + d$$

2nd operand

1st operand

destination

| OP | rn | rt | rd | shift | func |
|----|----|----|----|-------|------|
| 0  | 17 | 18 | 16 | 0     | 32   |

| OP | rn | rt | rd | shift | func |
|----|----|----|----|-------|------|
| 0  | 20 | 21 | 19 | 0     | 32   |

| OP | rn | rt | rd | shift | func |
|----|----|----|----|-------|------|
| 0  | 16 | 19 | 22 | 0     | 32   |

# lw \$t<sub>0</sub>, 32(\$s<sub>3</sub>)

| OP | rn | rt | address |
|----|----|----|---------|
| 35 | 19 | 8  | 128     |

# sw \$t<sub>0</sub>, 32(\$s<sub>3</sub>)

| OP | rn | rt | address |
|----|----|----|---------|
| 43 | 19 | 8  | 128     |

$$\boxed{A[300]} = h + A[300]$$

$$\begin{bmatrix} t_0 = 8 \\ t_1 = 9 \\ s_2 = 18 \end{bmatrix}$$

lw \$t\_0, 1200(\$t\_1) 

|    |   |   |      |
|----|---|---|------|
| 35 | 9 | 8 | 1200 |
|----|---|---|------|

add \$t\_0, \$s\_2, \$t\_0 

|   |    |   |   |   |    |
|---|----|---|---|---|----|
| 0 | 18 | 8 | 8 | 0 | 32 |
|---|----|---|---|---|----|

sw \$t\_0, 1200(\$t\_1) 

|    |   |   |      |
|----|---|---|------|
| 43 | 9 | 8 | 1200 |
|----|---|---|------|

(03-11-2024)

$$\boxed{A[500]} = A[412] + a + b + c$$

$$B[413] = A[500] + d$$

$$\begin{bmatrix} A_{base} = \$s_0 \\ B_{base} = \$s_1 \end{bmatrix}$$

$$\begin{bmatrix} \text{Let, } a = \$s_2, b = \$s_3 \\ c = \$s_4, d = \$s_5 \end{bmatrix}$$

$\Rightarrow$  lw \$t\_0, 1698(\$s\_0)  $t_0 = AB[412] -$ 

|    |    |   |      |
|----|----|---|------|
| 35 | 16 | 8 | 1698 |
|----|----|---|------|

add \$t\_1, \$t\_0, \$s\_2  $t_1 = t_0 + a$ 

|   |   |    |   |   |    |
|---|---|----|---|---|----|
| 0 | 8 | 18 | 9 | 0 | 32 |
|---|---|----|---|---|----|

add \$t\_1, \$t\_1, \$s\_3  $t_1 = t_1 + b$ 

|   |   |    |   |   |    |
|---|---|----|---|---|----|
| 0 | 9 | 19 | 9 | 0 | 32 |
|---|---|----|---|---|----|

add \$t\_1, \$t\_1, \$s\_4  $t_1 = t_1 + c$ 

|   |   |    |   |   |    |
|---|---|----|---|---|----|
| 0 | 9 | 20 | 9 | 0 | 32 |
|---|---|----|---|---|----|

sw \$t\_1, 2000(\$s\_0)  $A[500] = t_1$ 

|    |    |   |      |
|----|----|---|------|
| 43 | 16 | 9 | 2000 |
|----|----|---|------|

add \$t\_2, \$t\_1, \$s\_5  $t_2 = t_1 + d$ 

|   |   |    |    |   |    |
|---|---|----|----|---|----|
| 0 | 9 | 21 | 10 | 0 | 32 |
|---|---|----|----|---|----|

sw \$t\_2, 1652(\$s\_1)  $B[413] = t_2$ 

|    |    |    |      |
|----|----|----|------|
| 43 | 17 | 10 | 1652 |
|----|----|----|------|

(03-11-2024) [Lecture-5 (Logical Operations)]

| Logical Operations | C Operator | Java Operator | MIPS Instruction |
|--------------------|------------|---------------|------------------|
| Shift Left         | <<         | <<            | sll              |
| Shift Right        | >>         | >>>           | srl              |
| Bit-by-bit AND     | &          | &             | and, andi        |
| Bit-by-bit OR      |            |               | or, ori          |
| Bit-by-bit NOT     | ~          | ~             | nor              |

| i) << (Left shift)                                         | ii) >> (right shift)                                       | iii) NOT $\rightarrow$ NOR                                 |
|------------------------------------------------------------|------------------------------------------------------------|------------------------------------------------------------|
| $00100 \rightarrow 4$<br>$\Rightarrow 01000 \rightarrow 8$ | $00100 \rightarrow 9$<br>$\Rightarrow 00010 \rightarrow 2$ | $a = 9 \rightarrow 00100$<br>$\bar{a} \rightarrow 1\ 1011$ |

iv) AND

$$a = 8 \rightarrow 01000$$

$$b = 4 \rightarrow 00100$$

$$A \text{AND } B \rightarrow \underline{\overline{00000}}$$

$1001_{\text{two}}$   $\rightarrow 9_{\text{ten}}$   
 ↓  
 base 2  
 binary

$y$   
 base 10  
 decimal

(shift left logic)

sll \$t2, \$s0, 9



(06-11-2024)

$$\textcircled{1} \quad A [300] + A [400] + h = B [212]$$

$$\begin{cases} \text{Here} \\ A = \$S_1, M = \$S_3 \\ B = \$S_2 \end{cases}$$

$$\textcircled{2} \quad \text{Let, } a = 12$$

$$b = 13$$

\textcircled{1} a and b ; \textcircled{II} b or a ; \textcircled{III} shift left by 4 on the variable a

~~$$\textcircled{3} \quad P = h + a + b + c$$~~

~~$$M [53] = M [32] + p + a$$~~

~~$$\text{Let, } S_3 = h$$~~

Solution:  $\textcircled{1} \quad A [300] + A [400] + h = B [212]$

$$\text{I} \Rightarrow \text{lw } \$t_0, 1200 (\$S_1) \quad t_0 = A [300]$$

$$\text{I} \Rightarrow \text{lw } \$t_1, 1600 (\$S_1) \quad t_1 = A [400]$$

$$\text{R} \Rightarrow \text{add } \$t_0, \$t_0, \$t_1 \quad t_0 = t_0 + t_1$$

$$\text{R} \Rightarrow \text{add } \$t_0, \$t_0, \$S_3 \quad t_0 = t_0 + h$$

$$\text{I} \Rightarrow \text{sw } \$t_0, 898 (\$S_2) \quad t_0 = B [212]$$

|    |    |   |      |
|----|----|---|------|
| 35 | 17 | 8 | 1200 |
| 35 | 17 | 9 | 16   |

|   |   |    |    |   |    |
|---|---|----|----|---|----|
| 0 | 8 | 9  | 80 | 0 | 32 |
| 0 | 8 | 19 | 80 | 0 | 32 |

|    |    |   |     |
|----|----|---|-----|
| 43 | 18 | 8 | 848 |
|----|----|---|-----|

\textcircled{2} Here,

$$a = 12 \rightarrow 0000 \ 0000 \ 0000 \ 0000 \ 0000 \ 0000 \ 1100 \text{ two}$$

$$b = 13 \rightarrow 0000 \ 0000 \ 0000 \ 0000 \ 0000 \ 0000 \ 1101 \text{ two}$$

$$a \text{ AND } b \rightarrow 0000 \ 0000 \ 0000 \ 0000 \ 0000 \ 0000 \ 1100 \text{ two}$$

\textcircled{II} b OR a  $\rightarrow 0000 \ 0000 \ 0000 \ 0000 \ 0000 \ 0000 \ 1101 \text{ two}$

\textcircled{III} All } \\$t\_0, \\$t\_1, 4 \& [shift left by 4 on variable a]

$$0000 \ 0000 \ 0000 \ 0000 \ 0000 \ 0000 \ 1101 \ 0000 \text{ two}$$

\textcircled{3} Let, } a = \\$S\_0, b = \\$S\_1, c = \\$S\_2, h = \\$S\_4, p = \\$S\_5, M [53] = \\$S\_3

~~$$P = h + a + b + c$$~~

$$\text{R} \Rightarrow \text{add } \$t_0, \$S_4, \$S_0 \quad t_0 = h + a$$

$$\text{R} \Rightarrow \text{add } \$t_0, \$t_0, \$S_1 \quad t_0 = t_0 + b$$

$$\text{R} \Rightarrow \text{add } \$t_0, \$t_0, \$S_2 \quad t_0 = t_0 + c$$

~~$\Rightarrow \text{add } \$$~~

~~$\Rightarrow \text{add } \$$~~

~~$\Rightarrow \text{add } \$S_5$~~

op MA RT RD Shift Count

|   |    |    |   |   |    |
|---|----|----|---|---|----|
| 0 | 20 | 16 | 8 | 0 | 32 |
| 0 | 8  | 17 | 8 | 0 | 32 |

|   |   |    |    |   |    |
|---|---|----|----|---|----|
| 0 | 6 | 18 | 21 | 0 | 32 |
|---|---|----|----|---|----|

lw \$t\_0, 128(\$s\_3)

$$t_0 = M[32]$$

add \$t\_0, \$t\_0, \$s\_5

$$t_0 = t_0 + P$$

add \$t\_0, \$t\_0, \$s\_0

$$t_0 = t_0 + a$$

sw \$t\_0, 212(\$s\_3)

$$M[53] = t_0$$

|    |    |     |     |   |    |
|----|----|-----|-----|---|----|
| 35 | 19 | 8   | 128 |   |    |
| 0  | 8  | 216 | 8   | 0 | 32 |
| 0  | 8  | 16  | 8   | 0 | 32 |
| 93 | 19 | 8   | 212 |   |    |

(10-11-2029)

(Instructions for making decisions)

if ( $P \neq Q$ )

{  $A = P + Q$  ;

$B = P - Q$  }

else {  $A = P + Q$

$B = A + Q$  }

$\Rightarrow$

beg  $\$S_2, \$S_3$ , Else

add  $\$S_0, \$S_2, \$S_3$

sub  $\$S_1, \$S_2, \$S_3$

j Exit

Else : add  $\$S_0, \$S_3$

add  $\$S_1, \$S_0, \$S_3$

Exit:

bne  $\$S_2, S_3, L1$

add  $\$S_0, \$S_2, \$S_3$

add  $\$S_1, \$S_0, \$S_3$

j Exit

L1 : add  $\$S_0, \$S_2, \$S_3$

sub  $\$S_1, \$S_2, \$S_3$

Exit:

Let,

$A = \$S_0, P = \$S_2$

$B = \$S_1, Q = \$S_3$

go to if  $P \neq Q$

$A = P + Q$

$B = P - Q$

go to exit

#  $A = P + Q$

$B = A + Q$

Label: Exit

goto else # -

$A = P + Q$

$B = A + Q$

jmp to exit

$A = P + Q$

$B = P - Q$

Label: L1

(17-11-2024)

Lecture-6 | What is the decimal value of this  
32 bit two's complement number?

1101 1101 0011 1111 1011 0111 0000 1001

Ans:

$$\Rightarrow -2^{31} + (2^{30} + 2^{28} + 2^{27} + 2^{26} + 2^{24} + 2^{21} + 2^{20} + 2^{19} + 2^{18} + 2^{27} \\ + 2^{16} + 2^{15} + 2^{13} + 2^{12} + 2^{10} + 2^9 + 2^8 + 2^3 + 2^0)$$
$$= -7172,99663$$

OR,

$$\Rightarrow -2^{31} + (\text{If all value is } 1)$$

$$= -2,147,483,698 + (2,147,483,697 - \cancel{1})$$

$$- 2^1 - 2^2 - 2^4 - 2^5 - 2^6 - 2^7 - 2^{11} - 2^{14} - 2^{22} \\ - 2^{23} - 2^{25} - 2^{29})$$

$$= -1 - 2^1 - 2^2 - 2^4 - 2^5 - 2^6 - 2^7 - 2^{11} - 2^{14} - 2^{22} - 2^{23} - 2^{25} - 2^{29}$$

$$= -7172,99663$$

Example: If first value is 0,

0 111 1111 1111 ... ..

$$= 2,147,483,697 - 2^0 - 2^1 - 2^2 - 2^3 - 2^4 - 2^5 - 2^6 - 2^7 - 2^8 - 2^9 - 2^{10} - 2^{11} - 2^{12} - 2^{13} - 2^{14} - 2^{15} - 2^{16} - 2^{17} - 2^{18} - 2^{19} - 2^{20} - 2^{21} - 2^{22} - 2^{23} - 2^{24} - 2^{25} - 2^{26} - 2^{27} - 2^{28} - 2^{29} - 2^{30} - 2^{31}$$

## Lecture-9 Memory Hierarchy

A memory hierarchy consists of multiple levels of memory with different speeds and sizes.

The faster memories are more expensive per bit than the slower memories and thus smaller.

Three techs in memory hierarchy:-

(1) DRAM (Main memory) (2) SRAM (Cache) (3) Magnetic disk <sup>(Hard disk)</sup>

### Memory Hierarchy Design:



3) Data Hierarchy: The data hierarchy is a layered structure where data is organized from the fastest and smallest storage level (Registers) to the slowest and largest (hard disks). Data is copied between adjacent level in blocks to optimize performance. Higher levels are faster but smaller and more expensive. While lower levels are slower but larger and cheaper.



(20-11-2024)

- 1) Hit: When the requested data is found in the upper memory level. Example: Cache.
- 2) Miss: When the requested data is not found, requiring access to a lower memory level.
- 3) Hit Rate: The percentage of memory accesses found in the upper level.
- 4) Miss Rate: The percentage of memory accesses not found, calculated as  $(1 - \text{Hit Rate})$ .
- 5) Hit Time: Time taken to access the upper memory level, including checking for a hit or miss.
- 6) Miss Penalty: Time required to fetch data from the lower memory level, replace a block in the upper level and deliver it to the processor.
- 7) Cache: Is fast, small memory near the CPU that stores frequently used data to reduce access time and improve performance. It has levels ( $L_1, L_2, L_3$ ) for efficiency.
- 8) Direct Mapped Cache: It maps each memory block to a specific cache line, offering simplicity but risking conflicts if multiple blocks map to the same line.



- 9) Tag: Is a part of a memory address in caching that identifies whether a specific block of data is stored in the cache. It helps match cache entries with requested memory addresses during a lookup.

(27-11-2024)

## Lecture-10 | Pipelining And Hazards

Pipelining: Is an implementation technique in which multiple instructions are overlapped in execution. It is used to make the processor fast.

The laundry analogy for pipelining:

Sequential 6 hours , while pipelined laundry takes 3 hours

Time - 6pm 7 8 9 10 11 12 1AM



Task Order

Time - 6PM 7 8 9 10 11 12 1AM



Task Order

MIPS Pipeline: Classically take five steps:

- 1) Fetch instruction from memory.
- 2) Read register while decoding the instruction. The regular format of MIPS instructions allows reading and decoding to occur simultaneously.
- 3) Execute the operation or calculate address.
- 4) Access an operand in data memory.
- 5) Write the result into a register.

## Graphical representation of the instruction pipeline:



IF: Instruction Fetch stage, with the block representing Instruction Memory.

ID: Instruction decode/register file read stage, with the drawing showing the register file being read.

EX: Execution stage; with the drawing representing ALU.

MEM: Memory access stage, with the box representing data memory.

WB: Write-Back stage, with the drawing the register file being written.

Hazards: There are situations in pipelining when the next instruction can't execute in the following clock cycle. These events are called hazards.

Three types of hazards:

(1) Structural Hazards (2) Data Hazards (3) Control Hazards

1) Structural Hazards: A structural hazard occurs when a part of the processor's hardware is needed by two or more instructions at the same time. This can be resolved by separating the component into orthogonal units (such as separate cache) or bubbling the pipeline.

Resolving structural Hazard: To resolve this, we stall the pipeline for one clock cycle when a data-memory access occurs. The effect of the stall is actually to occupy the resources for that instruction slot.



Data Hazards: Data hazards occur when the pipeline must be stalled because one step must wait for another to complete.

|     |              | 1  | 2  | 3                    | 4                   | 5                    | 6   | 7   | 8  | 9 |
|-----|--------------|----|----|----------------------|---------------------|----------------------|-----|-----|----|---|
| ADD | R1, R2, R3   | IF | ID | EX                   | MEM                 | WB                   |     |     |    |   |
| SUB | R4, R5, R1   |    | IF | ID <sub>sub</sub>    | EX                  | MEM                  | WB  |     |    |   |
| AND | R6, R1, R7   |    |    | IF ID <sub>and</sub> | EX                  | MEM                  | WB  |     |    |   |
| OR  | R8, R1, R9   |    |    |                      | IF ID <sub>or</sub> | EX                   | MEM | WB  |    |   |
| XOR | R10, R1, R11 |    |    |                      |                     | IF ID <sub>xor</sub> | EX  | MEM | WB |   |