

CS6303 - Computer ArchitectureUnit 1:- (X) Refer Problems in Class Works)

Eight ideas - Components of a computer system - Technology - Performance - powerwall - Uniprocessor to Multiprocessor  
 Instructions - operations and operands - representing instructions - logical operations - Control operations - Addressing and addressing modes.

Key features:- ARM - Advanced RISC Machine.

ARM - is an Instruction Set Architecture used for the embedded devices

Classes of Computing Applications:-

① Desktop Computer:

→ best known form of computing , characterized by personal Computer.

→ Delivers good performance to single user at low cost and usually execute third party software.

② Servers:

A computer used for running larger programs for multiple user often simultaneously and accessed only through network.

→ Many large Workload

→ consists of either single complex application

(a) Many small jobs (Web servers).

- provide greater expandability of both computing and i/o capacity.
- Some servers can be used for file storage, small business application they no need to have screen or keyboard.

### Supercomputers:

- to thousands
- Hundreds of Processors usually Terabytes of Memory and petabytes of Storage.
  - Cost Millions to hundreds of millions of dollars.
  - Used: Weather forecasting,  
Oil Exploration,  
Protein structure determination  
& other large scale problems.
- $1\text{KB} = 1024 \text{ bits}$   
 $1\text{MB} = 1024^2 \text{ KB}$   
 $1\text{GB} = 1024^3 \text{ MB}$   
 $1\text{TB} = 1024^4 \text{ GB}$

### Datacenters:

A room or building designed to handle power, cooling, and networking needs of a large number of servers.

Datacenters are used by companies like Google, ebay contain 1000's of processors TB of memory, PB of storage

### Embedded Computers:

A computer inside another device for running predefined applications or collection of softwares.

Embedded computers include microprocessors found

in computers in cell phones, computers in V-d-a, a-

2

television, airplane, cargo ship.

→ Embedded Application has unique Application Requirements.

### Multicore Microprocessor:-

A chip with multiple processors integrated in a single IC.

### Flash Memory:-

- Nonvolatile Semiconductor memory.
- cheaper & slower than DRAM
- faster and expensive than Disk.

Used in Mobile phones they replace the use of disk.

### Unit 1

#### ① Components of Computer:-

There are 5 components of computer they are

- ① Input
- ② Output
- ③ Memory
- ④ Data path  $\rightarrow$  Processor
- ⑤ Control
- ⑥ ALU

## Uniprocessors

A Uniprocessor system is defined as a computer system that has a single central processing Unit that is used to execute computer Task. As more and More Modern Software is able to to make use of Multiprocessing Architectures such as <sup>SMP</sup> <sup>Symmetric multi processing</sup> and <sup>MPP</sup> <sup>Massively parallel processing</sup> the term Uniprocessor is therefore used to distinguish the class of computers where all processing tasks share a single CPU.



## Instruction Register [IR]

The Instruction register (IR) holds the instruction that is currently being executed. Its output is available to the control circuits which generate the timing signal that control the various processing elements involved in executing the instruction.

Program Counter (PC) Program counter incremented by  $PC \leftarrow PC + 4$

Program Counter (PC) keeps track of the execution of a program. It contains the memory address of the next instruction to be executed.

A

- \* Common Case is simpler than the generic case.
- \* Easy to enhance the common case.
- \* This idea is possible only with careful Equipmentation and Measurement.

## ① Performance Via parallelism :-

Parallelism - (Simultaneous Execution of instruction) The parallelism technique used in the instruction is called Instruction Level Parallelism (ILP). Performance can be increased by performing operation in parallel.

## ② Performance Via pipelining :-

Overlapping the execution of the instruction to improve the performance of the CPU is called Pipelining.

## ③ Performance Via prediction:-

Starting with average guesses and start working rather than wait until you know for sure. Mechanism to recover from a misprediction is not much expensive.



Programmers want memory to be fast, large and cheap. Cost of memory today is often the majority of computer cost.

These problem can be addressed by Hierarchies of Memories where,

- Fastest, smallest and most expensive on top
- Largest, cheapest slowest at the bottom.

Cache Memory gives illusion that main memory is nearly as fast as the top of the hierarchy.

→ Layered triangle is used to represent Memory Hierarchy.

## ⑧ Dependability via redundancy.

- Computer need to be fast as well as dependable.
- Physical devices fail in real-life systems. Redundancy by including redundant components help to detect faults.

## Logical Operations :-

logical operations are used to control the program flow. The operations were added to programming language and instruction set architecture to simply among other things packing and unpacking of bits into words.

| Logical operation | Operators | Java operators | ARM instruction |
|-------------------|-----------|----------------|-----------------|
| Bit-by-bit AND    | &         | &              | AND             |
| Bit-by-bit OR     |           |                | ORR             |
| Bit-by-bit NOT    | ~         | ~              | BN              |
| Shift left        | <<        | <<             | LSL             |
| Shift Right       | >>        | >>             | LSR             |

### Bit-by-bit AND Operation.

AND is a bit by-bit operation that leaves a 1 in the result if both bits of the operands are 1. for eg if register  $R_2$  contains

0000 0000 0000 0000 0000 1101 1100 0000

and register  $R_1$  contains

0000 0000 0000 0000 0000 0011 0011 0000

then after execution ARM instruction

AND r15, r1, r2 ; reg r15 = reg r1 & reg r2

the value of register r15 is

0000 0000 0000 0000 0000 0001 0000 0000

2

AND can apply a bit pattern to set of bits to force 0's where there is a 1's in the bit pattern  
Such a bit pattern with AND is traditionally called a "mask" conceals some bits.

### ② OR operation

It is a bit-by-bit operation that places a 1 in the result if either operand's bit is a 1. It is used to include bits in a word.

The ARM instruction for OR operations is

ORR r15, r1, r2 ; reg r15 = reg r1 | reg r2

the value in register r15 is

r1, 0000 0000 0000 0000 0000 1100 0011 0000

r2, 0000 0000 0000 0000 0000 1101 1100 0000

---

r15, 0000 0000 0000 0000 0000 1101 1111 0000.

---

### ③ Not operation

If the third logical operation is NOT takes one operand and places either 1 in the

Result if one operand bit is a' and vice versa.

ARM instruction for NOT operation is MNW (Move Not)

MNW  $R_{15}, R_1$ ;  $R_{15} = \sim R_1$ .

Value of register  $R_1$  is

$R_1$  0000 0000 0000 0000 0000 1100 0011 0000<sub>2</sub>  
 $R_5$  1111 1111 1111 1111 0011 1100 0111



#### (b) Shift Operations :-

There are three shift operations they are

- ① Shift Left
- ② Shift Right

Shift operation moves all the bits in a word to the left or right, filling the emptied bits with 0's.

#### ① Shift Left Operation :-

Shifts the bits by 'i' bits and fills the emptied by 0's.

The ARM instruction used to perform Shift Left operation is LSL (Logical shift left).

e.g.: If register  $R_0$  contains,

0000 0000 0000 0000 0000 0000 0001<sub>2</sub> = 1<sub>10</sub>

When instruction the shift left by 4 bits was executed then the new value is

0000 0000 0000 0000 0000 1000 0000<sub>2</sub> = 144<sub>10</sub>

Shift provides a logical benefit, shifting left by 'i' bits gives the same result as multiplying by  $2^i$ .

In the above eg shift left by 4 bits of  $9_{10}$  which results in value when  $9_{10}$  is multiplied by  $2^4$  (4 bits shifts).

$$9 \times 2^4 = 9 \times 16 = 144.$$

### Shift right operation:-

The dual of Shift left is Shift right operation. Shifts the bits right and fill with 0's.

The ARM instruction is LSR (Logical Shift Right)  
eg:- if R0 contains

$0000\ 0000\ 0000\ 0000\ 11001000\ 0000\ 0000_2$   
shift Right by 2 bits.

$0000\ 0000\ 0000\ 0000\ 0011\ 0010\ 0000\ 0000_2$

ARM Shift Operations are Not separate Instructions  
Opcode.  
It offers the ability to Shift the second operand as a part of any data processing instruction.

ADD R5, R1, R2, LSL #2 ; R5 = R1 + (R2 << 2)

7

This instruction add register  $r_1$  to register  $r_2$  shifted left by 2 bits and puts the result in register  $r_5$ .

e.g.: 2

MOV  $r_6, r_5, LSR r_3 ; r_6 = r_5 \gg r_3$

(Moves the value of  $r_5$  shifted by amount in e.g.  $r_3$  into  $r_6$ ).

- The full ARM instruction set also include Exclusive OR (EOR) which set 1 when two bits diff, set 0 if they are same.
- C compiler insert and extract fields using Logical operations in ARM (AND, ORR, LSL, LSR).
- Other Instructions like ROR (Rotate Right) instead of discarding the bits on right shift brings them back into vacated location.

## Performance

Definition:-

Performance is a measure which means how quickly <sup>a computer</sup> it can execute <sup>the</sup> programs.

Performance is measured in different ways.

Performance can be increased if the execution time decreases and throughput increases.

Execution time or response time :-

The total time required for the computer to complete a task including disk access, memory access, I/O activities, operating system overhead, CPU execution time and so on. It is also called as elapsed time. (a) Time between the start and end of a task.

Throughput or Bandwidth :-

Throughput is defined as the no of tasks completed per unit time.

To Max performance we want to minimize the execution time or response time of some task.

We can relate performance and execution time for a computer X.

$$\boxed{\text{Performance}_X = \frac{1}{\text{Execution time}_X}}$$

For two computers X and Y if the performance of X is greater than performance of Y then,

$$\boxed{\text{Performance}_X > \text{Performance}_Y}$$

$$\frac{1}{\text{Execution time}_X} > \frac{1}{\text{Execution time}_Y}$$

$$\boxed{\text{Execution time}_Y > \text{Execution time}_X}$$

(ii) Execution time of Y is longer than on X if X is faster than Y.

When X is n times faster than Y we can write

$$\frac{\text{Performance}_X}{\text{Performance}_Y} = n \iff \frac{\text{Execution time}_Y}{\text{Execution time}_X}$$

Then execution time of Y is n times longer than X.

## Measuring Performance:-

9

Time is the measure for computer performance. The computer that performs same amount of work in least time is the fastest.

A processor may work on several programs simultaneously. Hence there is difference between the execution time and the time that the processor is working on behalf of.

### CPU Execution time or CPU time:-

CPU time is the time the CPU spends computing for a task and doesn't include time spent on waiting for I/O or running other programs.



System performance refers to elapsed time.

CPU performance refers to CPU time.

Clock cycles or ticks or clock period or cycles:

All the computers are constructed using a clock that determine when events takes place in hardware. These are called time intervals are called clock cycles.

clock period = length of each clock cycle.

CPU performance & its factors:

A formula that relates clock cycles and clock cycle time to CPU time is

$$\text{CPU execution time for a program} = \text{CPU clock cycles for a program} \times \text{Clock cycle time}$$

Clock rate is inverse of clock cycle time,

$$\text{Clock cycle time} = \frac{1}{\text{Clock rate}}$$

The above equation gives,

$$\text{CPU execution time of a program} = \frac{\text{CPU clock cycles for a program}}{\text{Clock rate}}$$

The above formula says that the designer can improve the performance by reducing the no of clock cycles required for a program.

### Instruction performance :-

The computer takes to execute the instructions to run the program. The execution time must depend on the number of instructions in a program.

∴ No of clock cycles required for a program can be written as

$$\text{CPU clock cycles} = \frac{\text{Instructions for a program}}{\text{Average clock cycles per instruction}}$$

### CPI ("cycles per Instruction") :-

It is the average number of clock cycles each instruction takes to execute.

Since different instruction takes different amount of time depending on what they do CPI is the average of all the instructions executed in the program.

### The classic CPU Performance Equation

We can write this performance equation in terms of Instruction Count (No of Instructions Executed by the program).

$$\text{CPU Time} = \text{Instruction Count} \times \text{CPI} \times \text{Clock cycle time}$$

Since Clock rate and Clock cycle time are inversely proportional to each other we can write

$$\text{CPU Time} = \frac{\text{Instruction Count} \times \text{CPI}}{\text{Clock rate}}$$

Components of performance

Unit of Measure

CPU execution time for a program

Seconds for the program

Instruction Count

Instructions executed for the program

Clock cycles per Instruction (PI)

Average no. of clock cycles per instruction

Clock cycle time

Seconds per clock cycle

①

②

## Instruction Mix

A measure of the dynamic frequency of instructions across one or many programs.

Since CPI varies by 'instruction mix' both the instruction count and CPI must be compared even if clock rates are identical.

## Five classic Components of a computer.

(12)

The five classic components of a computer

are ① Input

② Output

③ Memory

④ Data path

⑤ Control

→ Processor.



### Input Unit

Computer accepts the coded information through input units which reads the data.

When a key is pressed the corresponding digit or letter is automatically converted into binary code and translated into binary code and transferred to memory by processor.

Some input devices are,  
Keyboard, joystick, track ball, mouse, scanner  
etc.

### Memory Unit:

Function of Memory unit is to store programs and data. There are two types of memory



Primary Memory - fast memory operates at electronic speeds. The memory contains a large no of semiconductor storage cells each capable of storing one bit information.

→ To provide easy access a distinct address is associated with each word location. Address refers to location.

→ No of bits in each word is "Word length" which range from 16-64 bits.

RAM is called as Random Access Memory, which is Volatile Memory and data will be lost if system is turned off or shut down.

ROM - Read only Memory, is non Volatile Memory used for running Booting programs.

Coches - Smallest, fast RAM units are called Coches. Cache Memory provides an "illusion of" Main Memory to the processor.

### Secondary Storage:

Primary Storage is expensive. Secondary Storage is used when large amount of data and programs have to be stored.  
eg: Magnetic disk, tapes, optical disk (CD-R, DVD-R)

### Arithmetic and Logic Unit :- (ALU)

Most Computer's operations are executed in the ALU of the processor. When two numbers located in memory ~~are~~ need to be added they are brought to processor and addition is carried by ALU.

When the data's are brought in they are stored in high speed storage elements called registers. Each Register stores 1 word of data. ALU performs both Arithmetic & Logical operations like (Addition, Subtraction, Multiplication, division, AND, OR, Not etc).

### Output Unit :-

It sends the processed result to the outside world. Some output devices are monitor, printer, speaker etc. Some units such as graph display provide both output and Input function.

### Control Unit:-

The control unit co-ordinates the memory, ALU, ~~ALU~~ input and output units to store & process information.

Data transfer between the processor and Memory are controlled by control Unit by timing signal.

(14)

### Data path:

It manipulates the data coming through the processor. It also provides a small amount of temporary storage.

## Technologies

(6)

Technology shapes what a computer will be able to do and how quickly they will evolve. There are some technologies that have been used for years.

| Year | Technologies used in computers | Relative performance / unit cost |
|------|--------------------------------|----------------------------------|
| 1951 | Vacuum Tubes                   | 1                                |
| 1965 | Transistors                    | 35                               |
| 1975 | Integrated circuits            | 9000                             |
| 1995 | VLSI                           | 24,00,000                        |
| 2013 | ULSI                           | 6,200,000,000                    |

Vacuum Tubes - It is a device that controls electrical signals through a vacuum sealed container. It was used in 1<sup>st</sup> generation of computers.

- Also used for calculating, storing & controlling
- Produced large heat

Transistor:

- An on/off switch controlled by electricity,
- It's combines dozens to 100's of transistors into a single chip.

→ The increase in no of transistors used in the IC's result in VLSI (Very Large Scale Integrated Circuits) → Contains 1000's to Millions of Transistors.

The growth of DRAM (Dynamic RAM) technology for 35 years shows that capacity quadrupled every 3 years.



### Manufacturing of chip:

Manufacture starts with Silicon Ingots. Silicon is a substance made up of sand. It doesn't conduct electricity well hence is called "Semiconductor".

→ Special chemicals are added to make silicon into conductor ~~and~~ insulator, also as a switch.

Silicon transforms into the following:

- ① Excellent conductor of Electricity (Copper or Aluminum)
- ② " Insulated (Plastic wrap)
- ③ Areas that can conduct & insulate under special conditions (as a switch)

### Silicon Crystal Ingots

The manufacturing process starts with Silicon Crystal Ingots. It is a large, rectangular block of pure silicon with a diameter of 8-12 inches.

→ 8-12 inch in diameter  
→ 12-20 inches long

### Wafers:

→ The Silicon Ingots are finely sliced into wafers no more than 0.1 inch thick.

→ These wafers are gone through a series of processing steps, where a pattern of chemicals are placed on each wafer creating a transistor.

Conductor and Insulator



Patterned wafers are tested with wafer tester and good parts of it are identified.

The yield is  $\frac{\text{percentage of good dies}}{\text{total dies on the wafer}}$  for eg.  $\frac{17}{20} = 85\%$  is yield.

The good dies are bonded and packages and tested one more time before shipping to customers.

After wafer testing the wafers are chopped up or sliced into component called 'dies' known as 'chips'.

defect - A flaw in wafer or in patterning steps  
that can result in failure of the die containing  
that defect.

(P7)

The cost of an IC rises quickly as die size increases, due to both lower yield and smaller no of dies that fit on wafer. To reduce the cost, we should shrink the large die.

In 2012, 32 nm smallest die was processed.

### Bonding :

After finding good die, they are connected to input / output pins of a package called bonding. These are tested and then shipped to the customer.

$$\text{Cost per die} = \frac{\text{Cost per Wafer}}{\text{Dies per Wafer} \times \text{yield}}$$

$$\text{yield} = \frac{1}{1 + (\text{Defects per area})}$$

$$\text{Dies per Wafer} = \frac{\text{Wafer Area}}{\text{Die Area}}$$

Hence depending on the defect rate, usage of die and wafer cost are not linear in the die area.

## ULSI (Ultra Large Scale Integrated Circuits)

It is made up of embedding millions of Transistors on a single silicon semiconductor Microchip. ULSI is a successor of LSI & VLSI technology, same category of VLSI.

## The Power Wall

(18)

- Performance of the hottest chip grew by 52% per year 1986 - 2002, & grew only 20% in next 3 years → problem called power wall.
- More transistors were used in



→ It shows that there is increase in the power & clock rate for eight generation of Intel microprocessors over 30 years.

→ Clock rate & power increased rapidly for decades then flattened. The reason is that they are co-related.

→ practical power limit is needed for cooling Commodity Microprocessor.

→ Architects try to reduce cost of powering cooling 100,000 servers as their cost are high.

21) Just as measuring time in seconds is  
safer measure than MIPS, the energy metric  
Joules is better measure than a power rate  
like watts. [Joules/second]

The Dominant technology for IC is CMOS  
(Complementary Metal oxide semiconductor)

CMOS - primary source of energy consumption  
(dynamic energy) - (i.e.) energy consumed when  
transistors switch states 0  $\rightarrow$  1, vice versa.

dynamic loading depends on capacitance  
Capacitive loading of each transistor and the  
voltage applied.

$$\boxed{\text{Energy} \propto \text{Capacitive load} \times \text{Voltage}^2}$$

This ~~eqn~~ eqn is the energy of pulse during  
logic  $0 \rightarrow 1 \rightarrow 0$  or  $1 \rightarrow 0 \rightarrow 1$

then the energy of single transition is then

$$\boxed{\text{Energy} \approx \frac{1}{2} \text{capacitive load} \times \text{Voltage}^2}$$

Power required is product of energy of 19  
of transitions  $\times$  the frequency of transitions.

$$\boxed{\text{Power} \propto \frac{1}{2} \text{Capacitive Load} \times \text{Voltage}^2 \times \text{Frequency}} \\ \text{switched}$$

Frequency switched = function of clock rate.

Capacitive load per ~~gate~~ transistor is a function of both: (fanout) the number of transistors connected to an output & technology.

$\Rightarrow$  Power can be reduced by lowering the voltage. Since power is a function of voltage squared.

Suppose we want to develop a simple processor that has 80% of capacitive load of the more complete older processor. Further assume that it has adjustable so that assume so that it can reduce voltage 15% compared to processor B.

$$\frac{\text{Power}_{\text{new}}}{\text{Power}_{\text{old}}} = \left\{ \begin{array}{l} \text{Capacitive load} \times 0.85 \\ \times (\text{Voltage} \times 0.85)^2 \times \\ \text{Frequency switched} \\ \times (0.85) \end{array} \right\} \\ \text{Capacitive load} \times \text{Voltage}^2 \times \text{Frequency} \\ \text{switched}$$

Q) If power of a laptop is halved from  
0.85 W to 0.52 W, what will be the  
power of new processor?

The new Processor uses about half the  
power of the old processor.

### Problem

- Further lowering the voltage appears to make the transistors too leaky. Even today about 40% of power consumption in a server chip is due to leakage.
- To address problem, designers have already attached large devices to increase cooling, and they turn off parts of the chip that are not used in a given clock cycle.
- Although there are many more expensive ways to cool chip & thereby raise their power to say 300 watts these techniques are generally too expensive for personal computers & even servers.

(20)

dynamic energy is the primary source of energy consumption in CMOS, static energy consumption occurs because of leakage current that flows even if transistor is off.

In servers, leakage is responsible for 40% of energy consumption.

- \* ↑ in no of transitions ↑ power dissipation even if transistors are always off.
- \* A variety of design techniques & technology innovations are being deployed to control leakage but difficult to ↓ voltage. Power is a challenge for IC for 2 reasons

① power must brought in and distributed around the chip.

② power is dissipated as heat and must be removed

(Transistor  $\rightarrow$  even when much off regulates energy.)

Leakage =

Leakage = loss of energy.

## Uniprocessor to Multiprocessor

(21)



The chart measures the performance increase during years. The increase in growth was 52%. Performance of floating point oriented Calculation has also increased faster.

Since performance increasing the response time is decreased. Instead of decreasing the response time with single program running on a sing processor It is better to go with 'Microprocessor' with multiple processor per chip. Where benefit is more on 'throughput (ie) the no of tasks completed per unit time is more.'

Processors are often referred as 'cores'.

A microprocessor, with multiple processor or core, is called as Multi core Microprocessor.

Quad core - contains 4 processor or core in a single microprocessor chip.

Dual core - Microprocessor chip that contain 2 cores or processor.

### Dual Core processor



So programmers need rewrite the programs to take advantage of multiple processors.

Programmers should improve the code performance as no of cores doubles.

Hardware can execute multiple instructions at same time.

It is so ~~hard~~ for programmers to write parallel programs due to difficulty in programming. In parallel Hardware the programmer must divide an application so that each processor has same amount of work to do at same time. (e) Over head of Scheduling.



### Advantages:-

- \* Using Multiple core increases the throughput
- \* Executes multiple instruction
- \* High performance .

### Drawbacks

- \* Programming for parallel Hardware is difficult.
- \* Load balancing
- \* Optimizing communication and synchronization is required .

This table shows no of processor or cores , clock rates , power of recent Microprocessors .

| Product    | Intel Nehalem | IBM Power6 | Sun Ultra Sparc |
|------------|---------------|------------|-----------------|
| Cores      | 4             | 2          | 8               |
| clock Rate | ~2.5 GHz      | 4.7 GHz    | 1.4 GHz         |
| Power      | ~100 W        | ~100 W     | 94 W            |

## INSTRUCTIONS

(23) ①

Instructions :-

Instructions are the commands that used to command a Computer Hardware. It is the language that Computer Hardware can understand.

## Instruction Set Architecture

It is an interface between the Hardware and the Software. (It is a set of commands that can be understood by a given Architecture) → called Instruction

Set • ~~Architecture~~ (or) Set of Instructions designed for a particular Hardware. ~~is defined as~~

e.g.: ISA's are MIPS, ARM etc.

## Stored Program Concept:-

Instruction & data of many types can be stored in memory as numbers.

MIPS - Millions of Instructions per second

## MIPS Registers

There are 32 registers (0 - 31)

| Register No | Alternative Name                    | Description                                                 |
|-------------|-------------------------------------|-------------------------------------------------------------|
| 0           | Zero                                | The value 0                                                 |
| 1           | \$at                                | (Assembly Temporary)                                        |
| 2-3         | \$v <sub>0</sub> - \$v <sub>1</sub> | values from expression evaluation & function arguments.     |
| 4-7         | \$a <sub>0</sub> - \$a <sub>3</sub> |                                                             |
| 8-15        | \$t <sub>0</sub> - \$t <sub>7</sub> | temporaries                                                 |
| 16-23       | \$s <sub>0</sub> - \$s <sub>7</sub> | (Saved Values)                                              |
| 24-25       | \$t <sub>8</sub> - \$t <sub>9</sub> | temporaries addition to \$t <sub>0</sub> - \$t <sub>7</sub> |
| 26-27       | \$k <sub>0</sub> - \$k <sub>1</sub> | Interrupt Handler.                                          |
| 28          | \$gp                                | global pointers                                             |
| 29          | \$sp                                | stack pointer.                                              |
| 30          | \$s <sub>8</sub> / \$fp             | (points to location of stack)                               |
| 31          | \$ra                                | Saved values / frame pointers<br>Return Address             |

## Types of Instructions :-

- ① Arithmetic Instructions
- ② Data Transfer Instructions
- ③ Logical Instructions
- ④ Conditional Branch Instructions
- ⑤ Unconditional Branch Instructions

## Operations of - The Computer Hardware :-

24

25

The operators supported by MIPS instruction set Architecture can be categorized as follow.

| Category             | Instruction                                                                               | eg                                                                                                                                                        | Meaning                                                                                                                                                                                                                    |
|----------------------|-------------------------------------------------------------------------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| Arithmetic           | add<br>subtract<br>add immediate                                                          | ADD $r_1, r_2, r_3$<br>SUB $r_1, r_2, r_3$<br>ADDI $r_1, r_2, 20$                                                                                         | $r_1 = r_2 + r_3$<br>$r_1 = r_2 - r_3$<br>$r_1 = r_2 + 20$                                                                                                                                                                 |
| Data Transfer        | load Word<br>Store Word<br>load half<br>load unsigned<br>Store half<br>Store byte         | LW $r_1, \underline{20(r_2)}$<br>SW $r_1, 20(r_2)$<br>LH $r_1, 20(r_2)$<br>LHU $r_1, 20(r_2)$<br>SH $r_1, 20(r_2)$<br>SB $r_1, 20(r_2)$                   | $r_1 = \text{memory } [r_2 + 20]$<br>$\text{memory } [r_2 + 20] = r_1$<br>$r_1 = \text{memory } [r_2 + 20]$<br>$r_1 = \text{memory } [r_2 + 20]$<br>$\text{memory } [r_2 + 20] = r_1$<br>$r_1 = \text{memory } [r_2 + 20]$ |
| Logical              | AND<br>OR<br>NOR<br>AND Immediate<br>OR Immediate<br>Shift left logical<br>Shift Right // | AND $r_1, r_2, r_3$<br>OR $r_1, r_2, r_3$<br>NOR $r_1, r_2, r_3$<br>ANDI $r_1, r_2, 20$<br>ORI $r_1, r_2, 20$<br>SLL $r_1, r_2, 10$<br>SRL $r_1, r_2, 10$ | $r_1 = r_2 \& r_3$<br>$r_1 = r_2   r_3$<br>$r_1 = \neg(r_2   r_3)$<br>$r_1 = r_2 \& 20$<br>$r_1 = r_2   20$<br>$r_1 = r_2 \ll 10$<br>$r_1 = r_2 \gg 10$                                                                    |
| Conditional Branch   | Branch on equal<br>Branch on not equal.                                                   | BEQ $r_1, r_2, 25$<br>BNE $r_1, r_2, 25$                                                                                                                  | if ( $r_1 == r_2$ ) goto<br>$PC = PC + 4$<br><br>if ( $r_1 != r_2$ ) goto<br>$PC = PC + 4$                                                                                                                                 |
| Unconditional Branch | Jump<br>Jump register<br>Jump & link                                                      | J 2500<br>JR \$r1a<br>JAL 2500                                                                                                                            | go to <del>2500</del> 10000 (25004)<br>goto \$r1a<br>$$r1a = PC + 4$ , goto <del>2500</del> 10000                                                                                                                          |

## Arithmetic Instructions

These instructions are used to perform Arithmetic Instructions like Addition, Subtraction, Addition Immediate, Subtract Immediate. This type of instructions uses 3 operands, 2 source operand and 1 destination operand. It is also called register-register instruction.

e.g.: ADD  $r_1, r_2, r_3$  (Add of  $r_2$  &  $r_3$  and store to  $r_1$ )  
ADDI  $r_1, r_2, 20$

$$\hookrightarrow r_1 \leftarrow r_2 + \frac{r_3}{\downarrow}$$

SUB  $r_1, r_2, r_3$

Immediate  
Operand

## Data Transfer Instructions:-

This instruction moves the data between memory and registers. If the data is moved from Register to Memory is called 'Store' Instruction. If the data is moved from memory to Register

is called ~~Load~~ 'Load' Instruction

e.g.:  $a = b[8]$   
LW \$s\_1 32(\$s\_3) (We assume a as  $s_1$  and b as  $s_3$ ).

8x4  
where  
4 is 4 bytes

here  $s_3$  is called Base Register.  
8 is called as the offset

## Store Word Instruction

(25) (b)

ii)  $Sw \$S_1, 20, (\$S_2)$       Memory( $\$S_2 + 20$ ) =  $\$S_1$

Here the value in the  $\$S_1$  register is stored in the memory location calculated by adding the base register  $\$S_2$  with offset 20.

## Logical Instruction:-

There are different logical instructions they are AND, OR, NOR, ANDImmediate, ORImmediate, Shift left, Shift right.

$$AND \$S_1, \$S_2, \$S_3 \quad \$S_1 = \$S_2 \& \$S_3$$

$$OR \$S_1, \$S_2, \$S_3 \quad \$S_1 = \$S_2 \mid \$S_3$$

$$ANDI \$S_1, \$S_2, 20 \quad \$S_1 = \$S_2 \& 20$$

## Conditional Branch Instruction:-

A instruction that requires the comparison of two values and that allows for a subsequent transfer of control to a new address in the program based on the outcome of the comparison.

Some Conditional Branch instructions are BNE, BEQ, etc.

eg:-

if source program is

if ( $i == j$ )

$f = g + h$

else

$f = g - h$

MIPS Pictreution (Assume 'i' is in  $\$s_1$  &  
 $j$  is in  $\$s_2$ )

CMP  $\$s_1, \$s_2$

BNE else ; goto else if  $\$s_1 \neq \$s_2$   
(ie)  $i \neq j$

Unconditional Branches Instruction

These instruction doesn't have  
any condition to be evaluated it  
just transfer the control to a  
new location. These instructions are  
called Unconditional Branch instruction.

eg:  $j 2500$  [Jump to location  $(2500 \times 4)$   
 $= 10000$ ]

JR  $\$ra$  (Jump register)

go to  $ra$  (return address)

## Operations of the Computer Hardware

(26)

Every computer must be able to perform Arithmetic operation.

The RIPS notation for addition is

add \$S\_1, \$S\_2, \$S\_3

or

add a, b, c # The sum of b and c is placed in a.

The natural no of operands for an operation like addition is 3 where two no's has added and placed in one register.

Hardware for a variable no of operands is more complicated than hardware for fixed number.

This situation illustrates 3 design principles of HW design

Design Principle 1:

Simplicity favors regularity.

1) Regularity Makes implementation simple

eg: Consider source program

2) Simplicity enables higher performance at low cost

$a = b + c;$       |      RIPS Instruction generated by compiler  
 $d = a - e;$       |      is

add a, b, c # Sum of b & c in a  
sub d, a, e Subtraction of a, e  
of

Consider the complex statement

$$f = (g + h) - (i + j);$$

The MIPS instruction is ...

add \$t0, \$g, \$h # temporary variable to contain  $g + h$   
add \$t1, \$i, \$j # temporary variable to contain  $i + j$   
sub \$t2, \$t0, \$t1 # get  $t0 - t1$ , which  
is  $(g + h) - (i + j)$

## Operands of the Computer Hardware:-

Operands can either be in registers or memory. Most of Arithmetic & logical Instructions use register operands. MIPS architecture has  $32 \times 32$  bit register file. A group of 32 bits called a word. A character requires 1 byte of storage. Integer requires 1 word (4 bytes) of storage.

32 bit refers to the size of a register in MIPS architecture.

The reason for the limit of registers may be found in 2nd underlying principle of H/w design

**Design principle 2: Smaller is faster**

- \* A very large no of registers may increase the clock cycle
- \* Use few registers to conserve energy.

(27) 10

eg:-

Compiling  $a = b + c$  we get

Add  $\$s_1, \$s_2, \$s_3$  //  $\$s_1$  contains  $b + c$ ;

### Memory Operands :-

If the program contains lots of variables and more complex data structures like array & structures. These complex data contain many more data elements than the registers in computer. So processor can keep only small amount of data in registers. Hence Data Structure (array & structures) are kept in memory.

- Generally registers are faster than memory & have high throughput than memory.
- To achieve high performance & conserve energy compiler must use register efficiently.
- Most frequently used data is stored in register and rest of the data are in memory is called Spilling Register (Stack).

Instruction set for MIPS includes some instruction that transfer data between memory & register. Such instructions are called data transfer instruction.

Memory  $\rightarrow$  is a n no of registers in sequence to form an array.

To access any one of the memory location, the instruction must supply memory (Index) address. Index (displacement or offset).

Starting address is at 0 (referred as base address).

### Memory Organization

All memory chips are byte oriented and it is viewed as a large single-dimensional array with address.

An address is an index into the array starting at 0. Called Byte addressing.

To get the physical address the offset address must be added to the base register. Which is equivalent to  $4 \times 8 = 32$ .

Word Address



Address Data

Processor

Memory

Byte Addressing



Address Data

Processor

Memory

eg: Index of third element is 2 then the value of memory [2] is 10.

The Data Transfer instruction that copies data between memory & Registers.

Load Instruction

Copies or Transfer data from memory to Register.

eg:  $lw \$s_1, 20(\$s_2)$  (load Word) ( $\$s_1 = \text{memory } [\$s_2 + 20]$ )

Store instruction transfer the data from register to memory

eg:  $sw \$s_1, 20(\$s_2)$  (memory  $[\$s_1 + 20] = \$s_1$ )

Consider source code :- ( h is associated with  $s_2$ , base Addr of Array A is in  $s_3$  )

$g = h + A[8] \rightarrow \{ lw \$t_0, 8(\$s_3) \# \text{Temporary Register get Add } \$s_1, \$2, \$t_0 \# \text{Base Address offset } A[8]$

In MIPS word must start at addresses that are multiples of 4. This requirement is called Alignment restriction.

In Byte Addressing. Consider the example

$$A[12] = h + A[8]$$

$h$  is associated with register  $\$S_2$ ,

base address of the array  $A$  is in  $\$S_3$

The compiler generates:

$$\$t_0 = \text{Memory}[\$S_3 + 32]$$

Lw  $\$t_0, 32(\$S_3)$  # Temporary reg  $\$t_0$  gets  $h + A[8]$

add  $\$t_0, \$S_2, \$t_0$  # Temporary reg  $\$t_0$  gets

$$h + A[8]$$

The final instruction store sum into  $A[12]$  (4x12) = 48 as the offset & register  $\$S_3$  as the base register

Sw  $\$t_0, 48(\$S_3)$  # stores  $h + A[8]$  back into  $A[12]$ .

Constant or Immediate Operands

Incrementing an index to point to next element of an array uses Immediate operand. More than 50% of MIPS arithmetic & logical instructions have a constant as operand.

Design Principle : 3:- Make the Common Case fast. (29)

→ Immediate operand avoids a load instruction  
and uses less energy.

$a = b + 25$  | assume  $b$  is associated with  $\$S_2$  and  
 $a$  is associated with  $\$S_3$

Addi  $\$S_3, \$S_2, 25$ . } This can be written using  
load instruction.

→ {  
  lw  $\$t_0$ , Addr Constant 25( $\$S_2$ )  
  add  $\$S_3, \$S_2, \$t_0$ .

## Representing Instructions in the Computer

(20)

Instructions are kept in the Computer as a series of high and low electronic signals and may be represented as numbers. Each piece of an instruction can be considered as an individual numbers, and placing side by side these no's forms the Instruction.

\$S<sub>0</sub> to \$S<sub>7</sub> Maps to Register 16 to 23

\$G<sub>0</sub> to \$G<sub>7</sub> Maps to Register 8-15

Then \$S<sub>0</sub> means the no of Register is 16, \$S<sub>1</sub> means

17 etc.

| Opcode | Add | 0 | 32 |
|--------|-----|---|----|
| Sub    |     | 0 | 34 |
| IW     | 325 | - | -  |
| SW     | 43  | - | -  |

| Opcode |   |
|--------|---|
| J      | 2 |
| Jal    | 3 |

Opcode →

Instruction format

A form of representation of an instruction composed of fields of binary numbers. Or the layout of instruction in memory is called instruction format.

## Machine Language

The numeric version of instruction is Machine language. Sequence of Machine language forms Machine Code.

It would appear tedious for reading and writing strings of binary Nos! we avoid them by using higher base like 'Hexadecimal' other than binary. Since almost all computer data bytes are multiples of 4, Hexadecimal is popular.

| Hexa            | binary            | Hexa            | binary            | Hexa            | Binary            |
|-----------------|-------------------|-----------------|-------------------|-----------------|-------------------|
| 0 <sub>16</sub> | 0000 <sub>2</sub> | 5 <sub>16</sub> | 0101 <sub>2</sub> | A <sub>16</sub> | 1010 <sub>2</sub> |
| 1 <sub>16</sub> | 0001 <sub>2</sub> | 6 <sub>16</sub> | 0110 <sub>2</sub> | B <sub>16</sub> | 1011 <sub>2</sub> |
| 2 <sub>16</sub> | 0010 <sub>2</sub> | 7 <sub>16</sub> | 0111 <sub>2</sub> | C <sub>16</sub> | 1100 <sub>2</sub> |
| 3 <sub>16</sub> | 0011 <sub>2</sub> | 8 <sub>16</sub> | 1000 <sub>2</sub> | D <sub>16</sub> | 1101 <sub>2</sub> |
| 4 <sub>16</sub> | 0100 <sub>2</sub> | 9 <sub>16</sub> | 1001 <sub>2</sub> | E <sub>16</sub> | 1110 <sub>2</sub> |
|                 |                   |                 |                   | F <sub>16</sub> | 1111 <sub>2</sub> |

eg: convert ecas 6420<sub>hex</sub> to Binary



(31)

- Convert  $0001\ 0011\ 0101\ 0110\ 1001\ 1011\ 1101\ 1111_2$  into Hexadecimal.



### R-type or R-format :-

| OP   | rs   | rt    | rd    | Shamt | funct |
|------|------|-------|-------|-------|-------|
| 6bit | 5bit | 5bits | 5bits | 5bits | 6bits |

This format is used for Register Instructions.

MIPS fields :- Each of segments of instruction is called field.

There are 6 fields in the R-type Instruction.

- OP - Basic operation of the instruction, called opcode
- rs - The First register source operand
- rt - The second register source operand
- rd - The register destination operand, hold the result
- shamt - shift amount
- funct - function, this field is called function code

Opcode - field that denotes operation & format of an Instruction.

Q1. Consider NIPS language for instruction add \$t0, \$s1, \$s2.

The decimal representation 1111 1103 1029

|   |    |    |   |    |    |
|---|----|----|---|----|----|
| 0 | 17 | 18 | 8 | 10 | 32 |
|---|----|----|---|----|----|

First & last field is 0 & 32 which says that operation is Addition

2<sup>nd</sup> field gives no of the register of first source operand \$s2 which is 17

3<sup>rd</sup> field gives no of the register of the 2<sup>nd</sup> source operand \$s1 which is 18

4<sup>th</sup> field gives the destination operand's register no \$t0 is 8.

5<sup>th</sup> field for 'shift' which is unused. So set to 0.

|        |       |       |       |       |        |
|--------|-------|-------|-------|-------|--------|
| 000000 | 10001 | 10010 | 01000 | 00000 | 100000 |
| 6      | 5     | 5     | 5     | 5     | 6      |

The Binary Representation is .

A problem occurs when an Instruction needs longer fields than these above. e.g. Load word instruction must specify 2 register and

(32) (Q)

a constant. If the address were to use one of the 5-bit field in the format, if the load word instruction would limit to only  $2^{5 \times 32}$ . The constant used often needs to be larger than 32. So 5-bit too small to use.

Hence we have conflict to keep ~~one~~ single instruction format.

Design Principles: Good Design Demands good Compromises.

### I-format or I-type

The second format of instruction used for immediate and Data transfer instructions.

The last 3 fields of the R-format instruction is joined as a single field of 16 bits.

The 16 bit address means a load word instruction can load any word within a region of  $\pm 2^{15}$  or 32768 bytes of address in base Register RS.

Adds immediate is limited to constant no longer than  $\pm 2^{15}$ .

~~e.g. Lw \$t0, 32(\$s3) # temporary reg \$t0 to get A[8]~~



Tw \$10 + 32 (\$13) # temporary Reg \$10 gets  
AP SJ

## decimal format

|    |    |   |    |
|----|----|---|----|
| 35 | 19 | 8 | 32 |
|----|----|---|----|

6 bits / 15 bits = 10%

16  $\delta_2$

## Binary format

|          |          |          |                   |
|----------|----------|----------|-------------------|
| 1000111  | 100111   | 01000    | 00000000   000000 |
| 00000000 | 00000000 | 00000000 | 00000000   000000 |

| Instruction | Format | OP | src | rt  | rd  | ctrl | funct | Adv   |
|-------------|--------|----|-----|-----|-----|------|-------|-------|
| addi        | R      | 0  | reg | reg | reg | 0    | 32    | NA    |
| sub         | R      | 0  | reg | reg | reg | 0    | 34    | NA    |
| addi        | I      | 8  | reg | reg | NA  | NA   | NA    | const |
| lw          | I      | 35 | reg | reg | NA  | NA   | NA    | addr  |
| sw          | I      | 43 | reg | reg | NA  | NA   | NA    | Addr  |

$$A[300] \leftarrow h + A[300]$$

LW \$to, 1200(\$H) # temp reg \$to get A[300]

add \$t0, \$s2, \$t0 # term seg get \$t0 get h+A[300]

SW: \$60,1200 (\$1) Stores h + AP[300] back into A[300]

| Word | refs | shifts | shifts | shifts  | shifts | bits  |
|------|------|--------|--------|---------|--------|-------|
| OP   | rs   | rt     | rd     | address | Shamt  | funct |
| 35   | 9    | 8      |        |         | 1200   |       |
| 0    | 18   | 8      | 8      | 0       |        | 32    |
| 43   | 9    | 8      |        |         | 1200   |       |

11 Write in Binary

|        |        |      |                  |
|--------|--------|------|------------------|
| 100011 | 01001  | 0000 | 0000100010110000 |
| 000000 | 010010 | 0100 | 0100000000000000 |
| 10011  | 01001  | 0100 | 0000000000000000 |

## J-type or J-format :-

(33)

This format is used for Representing Jump Instruction. This representation has only two fields.



where 6 bits are opcode , and remaining for holding the target or Address.

| Instruction | opcode     |
|-------------|------------|
| J           | 2 [000010] |
| Jal         | 3 [000011] |

e.g: J 32

Decimal  
Representation

|        |         |
|--------|---------|
| 2      | 128     |
| 6 bits | 26 bits |

2(128 - 0)  
2(64 - 0)  
2(32 - 0)  
2(16 - 0)  
2(8 - 0)  
2(4 - 0)  
2(2 - 0)  
1(1 - 0)

Binary Representation

|        |                                  |
|--------|----------------------------------|
| 000010 | 00000000000000000000000000000000 |
| 6 bits | 26 bits                          |

# MIPS Instruction format

| Format | 6 bits | 5 bits | 5 bits         | 5 bits           | 5 bits | 6 bits | Comments                           |
|--------|--------|--------|----------------|------------------|--------|--------|------------------------------------|
| R      | OP     | rs     | rt             | rd               | shamt  | funct  | Arithmetic                         |
| I      | OP     | rs     | rt             | addr / Immediate |        |        | Data Transfer,<br>Branch, immediat |
| J      | OP     |        | Target Address |                  |        |        | Jump.                              |

## Control Instructions or Control Operations

34

Control Instruction includes Conditional branch, Unconditional branch, procedures, switch case statements. Decision Making is commonly represented in programming languages using the if statement, combined with goto and labels.

### Conditional Branch

An instruction that requires the comparison of two values and that allows for a subsequent transfer of control to a new address in the program based on the outcome of the comparison.

e.g: if ( $i == j$ )  $f = g + h$   
else  $f = g - h$ .

### MIPS Instruction

bne \$s3, \$s4, ELSE # goto else if ( $\neq$ )  
add \$s0, \$s1, \$s3 #  $f = g + h$  (skipped if  $\neq$ )  
J EXIT # goto exit

ELSE: Sub \$s0, \$s1, \$s2 #  $f = g - h$

Exit :



Loops :- (1) Loops are used about 1 to 5%.

Decision are important both for choosing between two alternatives - found in if statements.

Loops - Iterating a computation.

while (`Save[i] == 1c`)

$i = i + 1;$

$\uparrow S_F$

$\uparrow \$S_6$

MIPS code :-  $k$  is associated with  $\$S_3$ ,  $\$S_3$  associated with  $i$ .  $\{ \text{to contains } k \}$ .

→ Add  $i$  to base array `Save`.

Loop:  $shl \$t_1, \$S_3, 2 \# \$t_1 = i \times 4$  due to bytes, Addressing by 4 problems

add  $\$t_1, \$t_1 + \$S_6 \# \text{to get address of } \text{Save}[i], \text{ we need to add } \$t_1 \text{ base offset}$

lw  $\$t_0, 0(\$t_1) \# \text{temp reg } \$t_0 = \text{Save}[i]$

~~bne~~  $\$t_0, \$S_5, \text{EXIT} \# \text{got to exit } \text{Save}[i] \neq 1c$ .

addi  $\$S_3, \$S_3, 1 \# i = i + 1$ .

jLoop # goto loop

exit.

basic blocks :-

A sequence of instruction without branches and without branch targets or branch labels called basic block.

## Case / Switch Statements:-

A Switch Case Statement that allows the programmer to select one of many alternatives depending on a single value.

The simplest way to implement Switch is through a sequence of conditional tests, turning the Switch statement into chain of if-then-else statements.

The Alternatives may be more efficiently encoded as a table of addresses of alternative instruction sequences called Jump Address table or jump table.

Jump Register (JR) instruction is used for this condition.

Jump table - A table of Address of alternative instruction sequences.

## Supporting Procedures in Computer Hardware

A procedure, or function is used to structure programs which supports code reusability. Procedures are one way to implement abstraction in software.

### Procedure:-

A stored subroutine that performs a specific task based on the parameter with which it is provided.

Execution of procedure must follow the following.

### Six Steps:-

- ① Put parameters in a place where the procedure can access them
- ② Transfer control to the procedure
- ③ Acquire the storage resources needed for procedure
- ④ Perform desired task.
- ⑤ Put the result value in a place where the calling program can access.
- ⑥ Return control to the point of origin.

## MIPS Instruction for procedures

36

Jump and link (Jal) instruction

Jal procedure Address

Jumps to an address

and simultaneously saves the address of the  
following instruction in Register \$ra.

$$\text{② } \$ra = PC + 4$$

MIPS software follows the following convention  
for procedures calling in allocating its 32 registers

$\$a_0 - \$a_3$  - Arguments register in which to pass

$\$v_0 - v_3$  - Value register in which to return values.

$\$ra$  - to return to point to origin.

Jr \$ra

→ It transfers control back to the caller,  
the function just has to jump to the  
address that was stored in \$ra.

Caller - A program that calls a procedure & provides  
necessary parameters

Called - A program that needs a source of information  
from a procedure provided by caller  
and then performs some task.

## Coprocessor Instructions

Coprocessors are additional processors which performs floating point operations.

All MIPS processor has two standard coprocessors CP0, CP1. Future inclusion of two additional coprocessors CP2 and CP3.

After Co-processor use op code 0100\*x where the last two bits specify coprocessor number.

Op code for  
Co-processor C1 = 010001

eg: add.s fd, fs, ft ] [ floating point add  
single ].  
→ 32 bit  
Single precision

V. P.

## Addressing Modes

QF  
Q

### Addressing Mode :-

The term addressing mode refers to the way in which the operands of an instruction are specified. It is also defined as the different ways of determining the address of the operands.

The information contained in the instruction code is the value of the operand or the address of the result or operand.

There are 5 types of MIPS addressing modes :-

- ① Immediate Addressing
- ② Register Addressing
- ③ Based or displacement addressing
- ④ PC-relative addressing
- ⑤ Pseudodirect Addressing

## ① Immediate Addressing

In this addressing mode, the operand is a constant and it is specified as the part of instruction.

Immediate addressing has the advantage of not requiring an extra memory access to fetch the operand. The operand is limited to 16 bit in size.

The branch instruction format can also be considered an example of Immediate addressing since destination is held in instruction.

e.g.: (i) Add \$30, \$51, 20  
(ii) LDR R0, [R1], 20

| OP     | rs     | rt     | Immediate |
|--------|--------|--------|-----------|
| 6 bits | 5 bits | 5 bits | 16 bits   |

Immediate addressing.

## 2) Register Addressing:

(38)

In this addressing mode, Name of the register is used to specify operand. This type of addressing mode can be represented using R-format or R-type instruction representation. Here each register fields are 5 bits long and opcode and funct are 6 bits long.

### Register Addressing



If it is a shift operation like SLL, SRL then shift bit is set to 1 otherwise it is set to '0'.

e.g: Add \$S<sub>0</sub>, \$S<sub>1</sub>, \$S<sub>2</sub>



\$S<sub>0</sub> - Register 16  
\$S<sub>1</sub> - Register 17  
\$S<sub>2</sub> - Register 18

### ③ Base or displacement Addressing:

In this mode, the operand is at the memory whose address is the sum of a Register and a constant in the instruction. The address is called as the "base address" and the constant to be added is called as "offset".

This type of addressing uses I-type or I-format instruction Representation.

e.g.:  $\text{LW } \$t_0, 32(\$s_3) \# \text{ to} = \text{Memory}[\$s_3 + 32]$



#### Base Addressing



#### 4) Pc- Relative Addressing

B9

In this addressing mode, the branch address is sum of the PC and constant in the instruction. It is used in conditional branch. Branch instruction can only move, above or below the program counter because the offset is 16 bit, no program counter could be bigger than  $2^{16}$  which is too small. It uses I-format instruction representation.

$$\text{Program Counter} = \text{Register} + \text{Branch Address}$$

e.g.: beq \$s3, \$s4, L1

#if  $s_3 = s_4$  move to Label L1.

#### Pc-relative addressing



## Pseudodirect Addressing:

In this addressing mode, memory address is mostly embedded in the instructions. The effective address is calculated by taking the upper 4 bit of the program counter (PC) concatenated to the 26 bit immediate value, and the lower two bits are 00 as shown below. The MIPS Jump instruction format uses -



eg:  $J\ 2600$

