


An instruction is a binary sequence that encodes some operation.

Lecture 2

## VON NEUMANN ARCHITECTURE

→ The von Neumann architecture states that the data and its program are both stored in the same main memory, but can be stored at different address locations.



Explanation =

i) Main Memory =

→ Stores both the data and its program.

→ Data is the unprocessed raw input, such as variables, values and datatypes.

→ Program = the implementation, i.e., the instructions that operate on the values allocated to variables.

```
// Eg: the values 10 and 20 are data; the three statements together form the program
int a = 10;     // data
int b = 20;     // data
int c = a + b;  // operates on the data
```

ii) CPU = (Central Processing Unit)




- iii) Input / Output Unit  
→ The entry and display of data through peripheral devices is handled by the Input / Output Unit.

Note = The Harvard architecture is the opposite of the von Neumann architecture.

It states that the data is stored in one memory and its program is stored in another memory.

→ The von Neumann architecture was introduced in 1945.

# DIFFERENT TYPES OF REGISTERS

Explanation of Memory =



a = Number of words

b = Number of bits per word

Note = Here, memory is treated as word addressed, because a word can store as many bits as the system demands, whereas a byte always has a fixed number of bits, i.e., 8 bits.

Here

$$a = 2^k$$

k = Number of address bits (there are $2^k$ address locations)

b = Number of bits per word.

Eg =

$$4096 \times 16$$


$$2^{12} \times 16$$

12 = Number of bits for address in memory  
16 = Number of bits per word

$$\therefore k = 12$$
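The relation $a = 2^k$ can be checked with a short sketch. The function name `address_bits` is made up for illustration; it only assumes that the number of words is a power of two, as in the $4096 \times 16$ example.

```python
def address_bits(num_words: int) -> int:
    """Return k such that 2**k == num_words (the number of address bits)."""
    # A power of two has exactly one bit set, so n & (n - 1) is zero.
    if num_words <= 0 or num_words & (num_words - 1):
        raise ValueError("number of words must be a power of two")
    return num_words.bit_length() - 1


# The 4096 x 16 memory from the notes.
words, bits_per_word = 4096, 16
k = address_bits(words)
total_bits = words * bits_per_word
print(k, total_bits)
```

For the memory above this gives k = 12 address bits, matching the hand calculation.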

Registers :

i) Address Registers =

→ deals with the address locations or the total number of words.

ii) Data Register =

→ used to store the data on which operations are to be performed.

[Figure: Instruction (Address) → decode → Accumulator → operation performed]

iii) Accumulator

→ It is used to store intermediate data.

When we fetch data, it gets stored in the accumulator before the operations are performed.


iv) Program Counter =

→ Used to store address of next instruction

v) Instruction Register = Used to store Instruction to be performed on data.



The instruction register holds three fields: the MSB (most significant bit), the opcode (the operation which is to be performed) and the operand.

The MSB selects the addressing:

→ 0 → Direct addressing

→ 1 → Indirect addressing

Note = Direct addressing means that the operation is performed on the data at the given address.

Indirect addressing means that the given address location contains another address, and the operation is performed on the data at that second location.
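The difference between the two settings of the MSB can be sketched as follows. The memory contents and the name `fetch_operand` are hypothetical and only serve the illustration.

```python
def fetch_operand(memory: dict, address: int, indirect_bit: int) -> int:
    """Return the operand selected by the address field and the MSB."""
    if indirect_bit == 0:
        # Direct addressing: the operand is at the given address.
        return memory[address]
    # Indirect addressing: the given address holds the address of the operand.
    return memory[memory[address]]


# Hypothetical memory contents: location 300 holds the address 1350.
memory = {457: 35, 300: 1350, 1350: 72}
direct = fetch_operand(memory, 457, 0)
indirect = fetch_operand(memory, 300, 1)
print(direct, indirect)
```

Indirect addressing costs one extra memory access, which is why it behaves like following a pointer.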

vi) Temporary Register =

→ Used to store temporary data.

vii) Input Register =

→ Used to take input of data from peripheral devices.

→ Not having any link with memory.


viii) Output Register =  
→ used to display the information processed  
by the arithmetic and logic unit (ALU).

e.g -



Number of words or number of  
Address or Number of memory locations = 1  
Number of bits per word or number of  
bits per address or Number of bits per  
memory locations = 16

i) Address Register = deals with Address



ii) Data Register = deals with data



iii) Accumulator = deals with Intermediate  
storage of data



iv) Program Counter = deals with the address  
of the next instruction or operation

- v) Instruction Register = deals with the address of the operation to be performed and the MSB (most significant bit), which selects direct addressing or indirect addressing.
- Works in combination with the data.



- vi) Temporary Register = deals with temporary storage of data.



- vii) Input Register = used for taking input from peripheral devices.
  - Not dependent on memory, hence not dependent on words.

- viii) Output Register = used for displaying processed data on the output devices, by fetching it from the arithmetic and logic unit through control signals and buses (i.e., the network topology).
  - Not dependent on words, memory or address.

Lecture 4

## TYPES OF BUSES



16 bits → $2^{16}$ address locations (number of states in memory, or number of words)

$$2^8 = 256$$

[Figure labels: direct addressing, input from memory location, output from memory location; indirect addressing of a memory location; indirect addressing for taking input; direct addressing for taking output.]

It works similarly for each memory location.

- i) Memory Unit = Stores the addresses containing instructions.
- ii) Data Signals = All data, processed or unprocessed, is stored in words and moved through data signals, passing through the processor already installed.
- iii) Control Signals = Contain the timing signals and control signals, which carry the instructions already set to be performed; the buses act as the network (network topology).

Lecture 5

# Bus architecture =



| Select lines | Selected input |
|--------------|----------------|
| 00           | 0              |
| 01           | 1              |
| 10           | 2              |
| 11           | 3              |

The select lines are $S_0$ and $S_1$; for example, 0 0 selects input 0.


Note:

\*\*\* very important

Number of Multiplexers = Number of bits in Registers
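The rule in the note can be turned into a small sizing sketch. The function name `bus_design` and the dictionary keys are made up for illustration; the sketch assumes one multiplexer per bit position, with one input per register, which is the usual multiplexer-based bus construction.

```python
def bus_design(num_registers: int, bits_per_register: int) -> dict:
    """Size a multiplexer-based common bus (illustrative sketch)."""
    return {
        # One multiplexer for every bit position of the registers.
        "multiplexers": bits_per_register,
        # Each multiplexer needs one data input per register.
        "inputs_per_multiplexer": num_registers,
        # Select lines needed to choose one of the registers.
        "select_lines": (num_registers - 1).bit_length(),
    }


# Four registers selected by two select lines, as in the table above the note.
design = bus_design(num_registers=4, bits_per_register=4)
print(design)
```

With four registers the sketch gives two select lines, matching the $S_0$, $S_1$ pair in the notes.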



Lecture 6

# COMMON BUS SYSTEM



16 Bit Common Bus

- First, input is taken from the Input Register.
- The process starts from the Program Counter, as it stores the address of the next location.
- The Program Counter gives its value to the multiplexer, and the address travels through the bus to the Address Register.
- From the Address Register, again by giving the multiplexer value, it goes onto the bus and travels to the Memory Unit.
- The Memory Unit has a decoder, which fetches the data at that particular address.
- The data fetched by the Memory Unit again travels through the bus to the Data Register, with the multiplexer value given through the Mux unit.
- From the Data Register it passes through the ALU (arithmetic and logic unit) and gets stored in the Accumulator for the operations to be performed.
- After the operations are performed, the result again travels through the bus to the Temporary Register, with the multiplexer value given by the Accumulator.
- With the Temporary Register's multiplexer value, it again comes onto the bus, either for new operations or for giving out the values.

Flow Chart = ( For my better understanding )




LOAD A  
 ADD B  
 STORE T  
 LOAD C  
 ADD D  
 MUL T  
 STORE X



Hence, final Answer

$$x = (A+B) * (C+D)$$
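The program can be traced with a tiny accumulator-machine sketch. It assumes that the MUL operand is the temporary T saved by STORE T, which is what makes the result equal to $(A+B) * (C+D)$; the input values 2, 3, 4, 5 are hypothetical.

```python
def run(program, memory):
    """Execute one-address instructions on a single accumulator (AC)."""
    ac = 0
    for opcode, operand in program:
        if opcode == "LOAD":
            ac = memory[operand]        # AC <- M[operand]
        elif opcode == "ADD":
            ac = ac + memory[operand]   # AC <- AC + M[operand]
        elif opcode == "MUL":
            ac = ac * memory[operand]   # AC <- AC * M[operand]
        elif opcode == "STORE":
            memory[operand] = ac        # M[operand] <- AC
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    return memory


memory = {"A": 2, "B": 3, "C": 4, "D": 5}  # hypothetical input values
program = [
    ("LOAD", "A"), ("ADD", "B"), ("STORE", "T"),
    ("LOAD", "C"), ("ADD", "D"), ("MUL", "T"), ("STORE", "X"),
]
run(program, memory)
print(memory["T"], memory["X"])
```

Every intermediate value passes through the one accumulator, which is why the temporary T is needed.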

Lecture 7

## TYPES OF INSTRUCTIONS

### Instructions

#### Data Transfer Instruction

- Deals with the transfer of data over the buses (the network topology).

#### Data Manipulation Instruction

- Deals with manipulation of data such as logical, arithmetic calculations (simple or complex).

#### Program Control Instruction

- Deals with various program control statements such as if, else, for, while etc.

# DATA TRANSFER INSTRUCTIONS



- i) MOV →  MOV $R_1, R_2$   
Means giving the value of  $R_2$  to  $R_1$ .  
Eg =  MOV $R_1, 100$   
Giving the value 100 to  $R_1$ .
- ii) LOAD → Loading a value from the memory unit into a register (mostly used for the accumulator).
- iii) STORE → Opposite of load: storing a value from a register into the memory unit.
- iv) EXCHANGE → Exchanging the values of two registers.



- v) INPUT → Transferring information from peripheral devices (input devices) to the memory unit.


vi) OUTPUT = Taking the processed Information and transferring it to output devices.

vii) PUSH = Storing the Information into stack or transferring the Information into stack.

viii) POP = Taking out or deleting the Information from stack.

\* Conclusion = Data Transfer Instructions

Move Load Store Exchange Input Output Push Pop


Lecture 9

## # ARITHMETIC INSTRUCTIONS

- i) Add → For adding values
- ii) Subtract → For subtracting values
- iii) Multiplication → For multiplying values
- iv) Division → For dividing values
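- v) Addition with carry → Adds the two values together with the carry from the previous operation.
- vi) Subtraction with carry → Subtracts the two values together with the carry (borrow) from the previous operation.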
- vii) Negate → Changing positive to negative and negative to positive.



Eg =  $O = (a+b)$ , where  $O$  is the address of the destination and  $+$  is the operation to be performed.



Lecture 10

## # LOGICAL INSTRUCTIONS

- i) Complement (com or NOT) =  
 → The value will become opposite of that value

$$\begin{array}{l} T \rightarrow F \\ \boxed{1^c = 0} \\ \boxed{0^c = 1} \end{array}$$

- ii) Clear =  
 → Clears all the information saved.

- iii) logical AND =

$$\begin{array}{cc|c} A & B & A \cdot B \\ \hline 0 & 0 & 0 \\ 0 & 1 & 0 \\ 1 & 0 & 0 \\ 1 & 1 & 1 \end{array} \quad \text{(acts as a multiplication type)}$$

- iv) logical OR =

$$\begin{array}{cc|c} A & B & A + B \\ \hline 0 & 0 & 0 \\ 0 & 1 & 1 \\ 1 & 0 & 1 \\ 1 & 1 & 1 \end{array} \quad \text{(acts as an addition type)}$$

- v) Ex-OR =

→ Gives 0 for similar inputs and 1 for dissimilar inputs.

- vi) Clear Carry =

→ Clears the value of the carry (sets it to 0) before calculating one's complement or two's complement.

vii) Set Carry =

→ Sets the value of the carry (to 1) before taking one's complement or two's complement.

viii) Complement Carry =

→ Inverts the value of the carry (0 becomes 1 and 1 becomes 0).

\*\*\*\* ix) Enable Interrupt =

→ Used to enable the interrupt function.

Hardware Interrupt

→ Using the mouse, pressing the keyboard and many more such events interrupt the working of the system.

Software Interrupt

→ Events raised by software, such as using different memory locations and many more, interrupt the working of the system.

\*\*\*\* x) Disable Interrupt =

→ Used to disable or stop the interrupt function, for both hardware interrupts and software interrupts.

Lecture 11

## \* SHIFT-INSTRUCTIONS

→ The operand held in a register is shifted bit by bit, either left or right.

i) Logical Left Shift =



ii) Logical Right Shift =



iii) Arithmetical Left Shift =



Note = Exact same as logical left shift.

iv) Arithmetic Right Shift =



Note = Similar idea as logical right shift, except for the bit placed in the vacant position.

v) Rotate Left Shift =



vi) Rotate Right Shift =



vii) Rotate Left Shift with Carry =




viii) Rotate Right Shift with Carry
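Since the figures for the shifts are missing, the following sketch shows the standard textbook behaviour of each shift on an 8-bit register. The 8-bit width and all function names are assumptions chosen for the illustration.

```python
MASK = 0xFF  # assumed 8-bit register


def logical_left(x):
    # Bits move left, a 0 enters on the right (arithmetic left is the same).
    return (x << 1) & MASK


def logical_right(x):
    # Bits move right, a 0 enters on the left.
    return (x & MASK) >> 1


def arithmetic_right(x):
    # Bits move right, the sign bit (MSB) is copied into the vacant position.
    return ((x & MASK) >> 1) | (x & 0x80)


def rotate_left(x):
    # The bit leaving on the left re-enters on the right.
    return ((x << 1) & MASK) | ((x >> 7) & 1)


def rotate_right(x):
    # The bit leaving on the right re-enters on the left.
    return ((x & MASK) >> 1) | ((x & 1) << 7)


def rotate_left_carry(x, carry):
    # The old carry enters on the right; the bit leaving becomes the new carry.
    return ((x << 1) & MASK) | carry, (x >> 7) & 1


def rotate_right_carry(x, carry):
    # The old carry enters on the left; the bit leaving becomes the new carry.
    return ((x & MASK) >> 1) | (carry << 7), x & 1


x = 0b10010110
print(format(logical_left(x), "08b"), format(arithmetic_right(x), "08b"))
```

The two rotate-with-carry versions return a pair: the new register value and the new carry bit.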



Note =

Instruction Register

| MSB | OPCODE | OPERAND |
|-----|--------|---------|
| 0,1 | *      | S D D   |

But now, for shift instructions (a more accurate structure):

| MSB | OPCODE | Type of Shift | Register | Shifting | OPERAND |
|-----|--------|---------------|----------|----------|---------|
|     |        |               |          |          |         |

Note = Immediate Operand = Direct Values

# PROGRAM CONTROL INSTRUCTIONS



→ It works according to the given conditions.

Eg = BE R<sub>1</sub> R<sub>2</sub> 2000; means branch if equal: if R<sub>1</sub> = R<sub>2</sub>, then the program counter will hold the address 2000, and the instruction at that address will get executed.

## Unconditional

→ Jump = used to jump from one position to another, not necessarily in sequential order.

→ The program counter will store the address accordingly.

## Skip

→ Skips a particular position to go to another position.

- Normally, the program counter gives the address of the next location to the memory unit, which gets decoded and transmitted over the bus (i.e., the network topology).
- But if we want to skip or jump, the program counter is loaded with a different address instead.

# INSTRUCTION FORMAT



MODE   OPCODE   OPERAND

- OPERAND (for arithmetic instructions) holds:
  - the address of the operand on which the operation is to be performed, or
  - the address of the register where the operand is present
  - plus the shifting, if required






## QUESTION

$$34 - (4 \times 5 + 5) = 9$$

~~25~~ 255

### Signed Integer

• Range

### Conclusion :-

## Types Of Instructions

## Data Transfer Instructions

MOV  
LOAD  
STORE  
EXCHANGE

INPUT  
OUTPUT  
PUSH  
POP

## Data Manipulation Instructions

ADD  
SUBTRACT  
MULTIPLICATION  
ADDITION USING CARRY  
SUBTRACTION USING CARRY

Program Control Instructions =  
Branched

```

graph TD
    Conditional[Conditional] --> Branched[Branched condition]
    Branched --> Equal[Equal]
    Branched --> Unequal[Unequal]
    Equal --> Jump[Jump]
    Unequal --> Jump
  
```

## Logical Data Manipulation Instructions

COMPLEMENT  
CLEAR  
LOGICAL AND  
LOGICAL OR  
EX-OR  
CLEAR CARRY  
SET CARRY  
COMPLEMENT-CARRY  
\*\*\* ENABLE INTERRUPT  
\*\*\* DISABLE INTERRUPT

Lecture 15

# TYPES OF CPU ORGANIZATION

i) Single Accumulator Organization

→ The cost is kept as low as possible, hence there are fewer registers.  
Operations = Load → value → from the memory unit

Store → value → from the register

↓  
to the memory unit

Now, single accumulator means that the instructions work on a single register, and hence the values travel through the buses.



$$X = A + B$$

Format: [ MODE | OPCODE | OPERAND ]

1. LOAD A → $AC \leftarrow M[A]$

2. ADD B → $AC \leftarrow AC + M[B]$

3. STORE X → $M[X] \leftarrow AC$

$$X = (A + B)$$

and half Arithmetical and Logical data Manipulation.

Note = The whole load is carried by a single register, hence the speed will decrease.

(Proverb: the cheap option makes you pay again and again, the expensive one only once.)

→ Single Address

Lecture 16

## ② General Register Organization

→ Two or more registers for enhancing the speed and working of system.

| MODE | OPCODE | OPERAND (D) | OPERAND (R1) | OPERAND (R2) |
|------|--------|-------------|--------------|--------------|
|      | +      | x = 24      | 8            | 16           |

$$R_1 \leftarrow 8$$

$$R_2 \leftarrow 16$$

$$D \leftarrow x$$

$$x = R_1 + R_2$$

$$x = 24$$

→ The registers do not play the same role: values are taken from the source registers, and another register is made the destination.

→ The size of the bus will increase, because more registers are being used.

→ Along with that, the cost will also increase:  
registers, cost and size of bus all go up together.

Lecture 17

## # Register Stack Organization = 63 bit

Push =

$SP \leftarrow SP + 1$  (increasing the value of the stack pointer)  
$M[SP] \leftarrow DR$  (storing the value)  
IF ( $SP = 0$ ) then ( $FULL \leftarrow 1$ )  
$EMPTY \leftarrow 0$

Pop =

$DR \leftarrow M[SP]$  (reading the value)  
$SP \leftarrow SP - 1$  (the position decreases by one)  
IF ( $SP = 0$ ) then ( $EMPTY \leftarrow 1$ )  
$FULL \leftarrow 0$

Note = The cost will be less, as many registers have been used,  
in comparison with the single register architecture  
and the general register architecture.
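The push and pop micro-operations can be sketched as a small class. The class name `RegisterStack` is made up, and the default size of 64 words (so that the stack pointer wraps back to 0 after position 63) is an assumption; the demonstration uses 4 words to keep it short.

```python
class RegisterStack:
    """Register stack with FULL and EMPTY flags (illustrative sketch)."""

    def __init__(self, size=64):  # assumed size; SP counts modulo size
        self.size = size
        self.mem = [0] * size
        self.sp = 0
        self.full = False
        self.empty = True

    def push(self, dr):
        if self.full:
            raise OverflowError("stack is full")
        self.sp = (self.sp + 1) % self.size  # SP <- SP + 1
        self.mem[self.sp] = dr               # M[SP] <- DR
        if self.sp == 0:                     # wrapped around: stack is full
            self.full = True
        self.empty = False

    def pop(self):
        if self.empty:
            raise IndexError("stack is empty")
        dr = self.mem[self.sp]               # DR <- M[SP]
        self.sp = (self.sp - 1) % self.size  # SP <- SP - 1
        if self.sp == 0:                     # back at the start: stack is empty
            self.empty = True
        self.full = False
        return dr


stack = RegisterStack(size=4)
for value in (10, 20, 30, 40):
    stack.push(value)
was_full = stack.full
popped = [stack.pop() for _ in range(4)]
print(was_full, popped, stack.empty)
```

The last item pushed is stored at location 0, which is why SP = 0 after a push signals FULL, while SP = 0 after a pop signals EMPTY.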


Lecture 18

## # Memory Stack Organization =

→ Stacks In RAM.



→ Here, program counter stores address of next instruction and gives it to memory unit to decode.

→ After decoding, it is transferred to the data register, and then transferred again over the bus to the instruction register.



- Cost-effective implementation of a stack.
- Values are stored between the upper limit and the lower limit of the stack implemented in RAM (memory).

## # Addressing Modes =

- In Instruction Register

| MODE | OPCODE | OPERAND |
|------|--------|---------|

Contains

Address of Operand

→ Registers  
→ Values  
→ Address

- Generally, OPERAND also stores the address of the values present in a register, because the values to be operated on are saved in registers temporarily.



- Jumping to a given address is possible.
- Switching the values is also possible.

- There are various kinds of addressing modes, and we will be studying all of them in later lectures.

Lecture 20

## #. Types of Addressing Mode =

### i) Implied Mode =

- The operand is specified implicitly by the opcode.
- Deals with zero value (stacks) or one value (accumulator or registers).

### i) One Value (Accumulator) =

ADD R<sub>1</sub>, R<sub>2</sub>

- Here, ADD is the opcode and R<sub>1</sub>, R<sub>2</sub> are the operands; the destination is implied by the opcode.

$$\boxed{R_1 \leftarrow R_1 + R_2}$$

R<sub>1</sub> acts like an accumulator.

### ii) Zero Value (Stacks) =

- The top two values on the stack become the operands on which the operation is performed, and the result gets stored in the register.

\* Addressing Mode  
Immediate Mode =

- It gives a direct value as the operand.
- The value cannot be changed further, and no addressing needs to be done.

`const int a = 10;` → we cannot change the value of `a` afterwards; thus, immediate addressing mode.

X `int a = 10; a = a * 10;` → not immediate, because we could change the value of `a` accordingly.

Drawback =

- The range of the value depends on the size of the OPERAND field, or, we can say, on the number of bits per word.

## Lecture 22

\* Register Mode =

→ The operands are present in registers.

| MODE | OPCODE | OPERAND |
|------|--------|---------|

→ OPERAND stores the number of the register.

→ The values are stored in that register.

Eg =

LOAD R<sub>1</sub>, R<sub>2</sub>

Means R<sub>1</sub> and R<sub>2</sub> are registers, and the result is stored in register R<sub>1</sub>.

$$\boxed{R_1 \leftarrow R_2}$$

→ LD R<sub>1</sub> (the value gets stored in the accumulator)

Advantages =

- The size of Operand will get reduced.
- The process will become faster as registers are fast and temporary memory.

## \* Register Indirect Mode

→ OPERAND holds the number of a register → that register holds the address of the value in memory → the value is fetched to perform the operation given in OPCODE.



Q. LD  $R_1, [R_2]$

Indirect addressing through  $R_2$

⇒  $R_1 \leftarrow M[R_2]$

The value is present at the memory location whose address is the value of  $R_2$ .

Q. ADD  $(R_1), (R_2)$

Register Indirect Addressing

$$\rightarrow M[R_1] = M[R_1] + M[R_2]$$

- The values at memory address  $R_1$  and memory address  $R_2$  must be added first.
- The result must then be stored at the memory address held in  $R_1$ .

Advantages =

- The size of OPERAND gets reduced.
- The process will be very fast.  
(Not as fast as direct, because here it is indirect.)
- With a smaller OPERAND, more memory can be addressed, for better functionality.

## Lecture 24

### #. Auto-Increment and Auto-Decrement Addressing Modes

- An extension of the register indirect addressing mode.
- At a particular time stamp, the register provides one memory location, and that memory location gets incremented or decremented after the instruction has been performed.

| MODE | OPCODE | OPERAND |
|------|--------|---------|

LD $(R_1)+$;  // Means the address will get incremented after the instruction is performed  
$AC \leftarrow M[R_1]$;  // Loading the value into the register  
$R_1 \leftarrow R_1 + 1$;

[Figure: consecutive memory locations 200, 201, 202, 203, 204]

Note = The addressing is done, and the address gets incremented after the instruction is performed, without the help of the CPU or extra micro-operations; the decrement is done in the same way.

## #. Direct Addressing Mode =

- OPERAND contains the address of the memory location in which the values are present.

MODE | OPCODE | OPERAND

[Figure: OPERAND holds the address where the value is present → the value is fetched from memory → the operation is performed]

Eg = OPERAND →  $x$ , and addition is to be done

$$\Rightarrow AC \leftarrow AC + M[x]$$

Here, the left-hand $AC$ is where the result is stored (the accumulator), the right-hand $AC$ is the value already present in the accumulator, and $M[x]$ is the value at memory location $x$.

Disadvantages =

- Size constraint

Lecture 25



## \*\*\* Most Important Topic

- #. Direct Addressing Mode = (Absolute Addressing)  
→ The values are present at the absolute address  
given in OPERAND.  
→ Use of variables.



### Disadvantages

- Size is limited according to the number of address bits available in OPERAND.

Lecture 19

## #. Indirect Addressing Mode =

- OPERAND contains an address, which in turn contains the address of the value.
- Basically the concept of pointers.



$$\begin{array}{l} \text{OPCODE} \rightarrow \text{ADD} \\ \text{OPERAND} \rightarrow z \end{array}$$

∴ Full instruction becomes ADD z.

$$AC \leftarrow AC + M[M[z]]$$

All the work is done in the accumulator.

## #. Relative Addressing Mode

- The program counter works according to the offset value.

- Offset value is at 501 hence, less usage of space

OPCODE | Address



- Jumping of the values.

$$EA = \text{Base Register Value} + \text{displacement}$$

(EA = effective address, where the actual instruction is present.)

## #. Base Register Addressing Mode

(In short: remove the old thing and put in the new thing.)

The idea is that when a program has performed all its instructions, it is replaced by a new program.

The two programs cannot occupy the same locations, as that would be a violation of security.

Hence, between memory locations 400 to 499, we should give the offset value as

BR 60.

Note = To get the absolute address, only the displacement has to be given.

$$EA = \text{Base Register Value} + \text{Displacement}$$

$$EA = (400 + 60) = 460$$

## #. Indexed Addressing Mode

- Basically used for the implementation of array elements.
  - Multiple registers can be used.



The base address is stored in it; here, the base address is 100.

$$\boxed{\text{Effective Address} = \text{Base Address} + \text{Index}}$$

For example, we have to fetch the value at the fourth index location:

$$\text{Effective Address} = \underbrace{\text{Base Address}}_{100} + \underbrace{\text{Index}}_4$$

$$\Rightarrow \text{Effective Address} = 100 + 4$$

$$\Rightarrow \boxed{\text{Effective Address} = 104}$$

This is the address from where the value can be fetched.
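The effective address calculations of the last few modes can be compared side by side. All function names are made up for illustration, and the `size` check in the base register case is an assumption based on the 400 to 499 partition mentioned in the notes. The relative mode line uses the standard definition (program counter plus offset), with hypothetical numbers.

```python
def base_register_ea(base: int, displacement: int, size: int) -> int:
    """Effective address = base register value + displacement."""
    # Assumed protection check: the program may only touch its own partition.
    if not 0 <= displacement < size:
        raise ValueError("displacement falls outside the program's partition")
    return base + displacement


def indexed_ea(base_address: int, index: int) -> int:
    """Effective address = base address + index (array element access)."""
    return base_address + index


def relative_ea(program_counter: int, offset: int) -> int:
    """Effective address = program counter + offset (standard definition)."""
    return program_counter + offset


print(base_register_ea(400, 60, size=100))  # partition 400..499, BR 60
print(indexed_ea(100, 4))                   # fourth index, base address 100
print(relative_ea(500, 24))                 # hypothetical PC and offset
```

All three modes add a small number to a register value; they differ only in which register supplies the starting point.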

## # Conclusion on Addressing Mode



## # MEMORY HIERARCHY IN COMPUTER ORGANIZATION

- According to the theory, there are various kinds of memory storage, arranged by size.
- The smallest in size takes the least access time and hence is the costliest.



Note = The frequency of usage also increases as the size and the access time decrease.  
Hence, it becomes costlier.

i) Registers = Used for temporary storage.  
→ Connected to the CPU.

- ii) Cache Memory = Storage (SRAM)
- iii) Main Memory = Primary memory (DRAM), used for processing.
- iv) USB / Flash = Secondary memory
- v) Magnetic Disk / Hard Disk = Secondary memory
- vi) Magnetic Tapes / Tape drives = Tertiary memory, used to store backups.

Since, backups are not used so often, hence the frequency of usage is very low.

# TWO - LEVEL MEMORY ORGANIZATION

- Combination of Main Memory and Secondary Memory.

```

graph TD
    A[Combination of Main Memory and Secondary Memory] --> B[Independent Organization]
    A --> C[Hierarchical Organization]
  
```

**Independent Organization**

  - It runs in parallel: when it starts checking the first memory unit, it starts checking the other memory unit at the same time.
  - If the information is not found in one, the check in the other is already in progress.

**Hierarchical Organization**

  - It runs in series. If the information is not present in one memory level, only then do we move to the next memory level.



Independent organization:

$$H_1 + H_2 = 1$$

$$H_2 = (1 - H_1)$$

$$T = H_1 \times T_1 + (1 - H_1) \times T_2$$

If the search is successful in the first level, there is no need to proceed.

Hierarchical organization:

$$H_1 + H_2 = 1$$

$$H_2 = (1 - H_1)$$

$$T = H_1 \times T_1 + (1 - H_1) \times (T_1 + T_2)$$

But if the information is not found in the first level, the second level is searched after it, and the time adds up accordingly (worst case).

# THREE - LEVEL MEMORY ORGANIZATION

Independent Organization =

→ Program running in parallel.

$$T = H_1 \times T_1 + (1 - H_1) \times H_2 \times T_2 + (1 - H_1)(1 - H_2) \times T_3$$

Hierarchical Organization =

→ Program running one after another.

$$T = H_1 \times T_1 + (1 - H_1) \times H_2 \times (T_1 + T_2) + (1 - H_1) \times (1 - H_2) \times (T_1 + T_2 + T_3)$$

## Lecture 34

Q.



$$H_1 = 0.8, \quad H_2 = 0.9, \quad T_1 = 1 \text{ ns}, \quad T_2 = 10 \text{ ns}, \quad T_3 = 500 \text{ ns}$$

$$\Rightarrow T_{avg} = H_1 \times T_1 + (1-H_1) \times H_2 \times T_2 + (1-H_1)(1-H_2) \times T_3$$

$$\Rightarrow T_{avg} = 0.8 \times 1 + (1-0.8) \times 0.9 \times 10 + (1-0.8)(1-0.9) \times 500$$

$$\Rightarrow T_{avg} = 0.8 + (0.2) \times 9 + (0.2) \times (0.1) \times 500$$

$$\Rightarrow T_{avg} = 0.8 + 1.8 + 10 = 12.6 \text{ ns}$$

Note = If no information is given, then priority is given to the independent method.
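The average access time for both organizations can be computed with one short function. This is a sketch: the name `average_access_time` is made up, the last level is assumed to always hit, and the numbers are the ones used in the worked calculation (hit ratios 0.8 and 0.9, access times 1, 10 and 500 ns).

```python
def average_access_time(hit_ratios, times, hierarchical=False):
    """Average access time of a multi-level memory (illustrative sketch).

    hit_ratios: hit ratio of every level except the last one.
    times:      access time of every level, fastest first.
    """
    ratios = list(hit_ratios) + [1.0]  # assumed: the last level always hits
    total = 0.0
    reach = 1.0    # probability that the search reaches the current level
    elapsed = 0.0  # time already spent in earlier levels (hierarchical only)
    for hit, time in zip(ratios, times):
        cost = elapsed + time if hierarchical else time
        total += reach * hit * cost
        reach *= 1.0 - hit
        if hierarchical:
            elapsed += time
    return total


independent = average_access_time([0.8, 0.9], [1, 10, 500])
hierarchical = average_access_time([0.8, 0.9], [1, 10, 500], hierarchical=True)
print(round(independent, 2), round(hierarchical, 2))
```

For the same numbers the hierarchical organization comes out slower, because every miss also pays for the levels that were searched before it.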

# CACHE MAPPING



- The full program must be present on the magnetic disk or hard disk, according to its size.
  - Then some of the required instructions are transferred to the words of primary memory (RAM = Random Access Memory).
  - Then the suitable instructions or data from RAM are transferred to the cache, at the suitable location.
  - They then get stored in registers, and hence the instruction with its data gets processed by the CPU.
- Conclusion =
  - Magnetic Disk / Hard Disk (Secondary Memory) → RAM (Random Access Memory, Primary Memory) → Cache (Internal Memory) → Registers (Internal Memory)

Cache Mapping =

→ Direct Mapping

→ Fully Associative Mapping

→ Set Associative Mapping

Lecture 36

## \* DIRECT MAPPING



Let us assume 4 words per block, then  
 the number of blocks in main memory =  $\frac{128}{4} = 32$

In cache memory =



To index 4 values, say 0, 1, 2, 3 → number of bits required = 2, since  $4 = 2^2$



Similarly for cache memory



Note = There is one major drawback: we are storing blocks at fixed locations (lines) in cache memory.

Eg =  $B_0 \rightarrow L_0$  and the nomenclature goes accordingly.
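How a direct-mapped cache splits an address can be sketched as below. The main memory of 128 words (7-bit address) and 4 words per block follow the notes; the cache size of 4 lines is an assumption, since the figure is missing, and the name `split_direct` is made up.

```python
def split_direct(address: int, words_per_block: int = 4, cache_lines: int = 4):
    """Split an address into (tag, line, offset) for direct mapping."""
    offset_bits = words_per_block.bit_length() - 1  # 4 words -> 2 bits
    line_bits = cache_lines.bit_length() - 1        # 4 lines -> 2 bits (assumed)
    offset = address & (words_per_block - 1)        # word inside the block
    block = address >> offset_bits                  # block number in main memory
    line = block & (cache_lines - 1)                # block number mod number of lines
    tag = block >> line_bits                        # remaining high bits
    return tag, line, offset


# Address 1000100 (binary) lies in block 17, which always maps to line 1.
print(split_direct(0b1000100))
# Block 0 and block 4 both compete for line 0: the drawback noted above.
print(split_direct(0), split_direct(16))
```

Two blocks with the same line number can never be in the cache together, even when other lines are empty.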

## Q. QUESTION ON DIRECT MAPPING



Number of lines = 0



Number of lines = 10

Number of Tags = 17

## # FULLY ASSOCIATIVE MAPPING



4 words per block  
Number of blocks =  $\frac{128}{4} = 32$

Now, here the blocks can be placed in any line.

→ If blocks can be placed anywhere, then the line number field can be removed.

| Tag | Offset |
|-----|--------|

Here, [ Block Number = Tag ]

[ Block Capacity = Number of lines ]

• Drawbacks =

→ The number of comparisons becomes larger. Hence, it is time taking.

Advantages =

- Number of hit and trials becomes more.
- Size Increases

Q.

Cache Memory

L<sub>1</sub>L<sub>2</sub>L<sub>3</sub>

Main Memory

B<sub>0</sub>B<sub>1</sub>B<sub>2</sub>B<sub>3</sub>B<sub>4</sub>B<sub>5</sub>B<sub>6</sub>

7



|   |   |   |   |   |   |   |
|---|---|---|---|---|---|---|
| 0 | 0 | 0 | 1 | 0 | 0 | 0 |
| 0 | 0 | 0 | 1 | 0 | 0 | 1 |
| 0 | 0 | 0 | 1 | 0 | 0 | 1 |
| 0 | 0 | 0 | 1 | 0 | 0 | 1 |

Tag

offset (Number of Comparisons)

00 -

01 -

10 -

11 -

1000100 → 5<sup>th</sup> Block → B<sub>4</sub> and having  
4 number of comparisons.

## # ADVANTAGES AND DISADVANTAGES

Advantages =

- Searching the blocks becomes easier.

Disadvantages =

- The hit and miss process occurs very often.



- If we search for B1, it will be a miss.

- Once the positioning is defined, we cannot place a block anywhere other than its own position, even if another position is empty.

## # SET ASSOCIATIVE MAPPING

- The Cache Memory is divided into sets
- Combination of direct mapping (to choose the set) and fully associative mapping (inside the set).



4 words per block  
 $\Rightarrow$  Number of Blocks =  $\frac{128}{4} = 32$

Block numbers  $= \boxed{0 - 31}$

Number of Sets =  $\frac{\text{Number of lines}}{\text{Number of ways}}$

Let's say we have the 2-way method

Number of Sets =  $\frac{4}{2} = 2$  Sets

Mapping (direct method), where $k$ is the block number and $n$ is the number of sets:

$k \bmod n$

$0 \bmod 2 = 0 \rightarrow$  which means anywhere in set 0.

(Hence, here comes fully associative mapping.)

Similarly,

$k \bmod n$

$1 \bmod 2 = 1 \rightarrow$  anywhere in set 1.

→ Hence, by the direct mapping method we will decide whether it is a set hit or a set miss.



Let's say we have an address of 7 bits and 4 words in each block.

| Field  | Bits | Meaning |
|--------|------|---------|
| Tag    | 4    | Which memory address (block) it is |
| Set    | 1    | Which set it is present in (either zero or one, hence one bit) |
| Offset | 2    | Which word of the line it is |

Total = 7 bits

| Tag | Set | Offset |
|-----|-----|--------|

# # LOCALITY OF REFERENCE

Spatial Locality (close proximity):

- If a word is accessed now, then there is a high probability that the words adjacent to it will be accessed next.

Temporal Locality:

- If any word is referenced now, then the same word is likely to be referenced again.



Let's say we are accessing  $w_2$ ; then there is a very high chance of accessing the other words or instructions of the same line, i.e.,  $w_0$ ,  $w_1$  and  $w_3$ .



If block one is accessed first, then block two, and then block three, the next word or instruction is expected around block 3, because there is a very high chance that the words of block 3 will be accessed again. → This is the idea behind LRU (Least Recently Used).

## #. CACHE REPLACEMENT POLICIES

Cache Memory



Main Memory



- Let's say the cache memory becomes full, as it can only hold a limited number of blocks.
- If we want to transfer some more blocks from main memory to cache memory, we need to free space in the cache memory. This can be done on the basis of various policies, such as =



→ FIFO (First In First Out) → LRU (Least Recently Used) → MRU (Most Recently Used)

Lecture 43

|   |                  |
|---|------------------|
| 0 | 45               |
| 1 | 322              |
| 2 | 25               |
| 3 | 8                |
| 4 | 193 <sup>m</sup> |
| 5 | 87               |
| 6 | 16               |
| 7 | 35               |

16, 25, 7, 16, 3, 25, 8, 19, 8, 25, 8, 16, 35, 45, 22, 8, 3, 16, 25, 7

The highest distance is 4, so that element will be replaced.

→ Steps =

  - i) First, we fill the lines according to direct cache access.
  - ii) Then we check whether it is a hit or not.
  - iii) After that, we find the element at maximum distance from the element which is to be placed, remove that element, and place the new one.

→ Hence, it will occur at the sixth block,  
i.e., line 5.

$$\text{Note: } 1 \text{ GB} = 2^{30} \text{ B}, 1 \text{ MB} = 2^{20} \text{ B}, 1 \text{ KB} = 2^{10} \text{ B}, 1 \text{ TB} = 2^{40} \text{ B}$$

$$32 = 2^5$$

Q Size of main memory =  $2^2 \times 2^{30} \text{ B} = 2^{32} \text{ B}$ , so the address has 32 bits.

### Instruction Register

Number of lines =

$$\frac{16 \text{ KB}}{32 \text{ B}} = \frac{2^4 \times 2^{10}}{2^5} = 2^9$$

Block offset = 5 bits (since  $32 = 2^5$ )

Number of sets (4 lines per set) =

$$\frac{2^9}{4} = \frac{2^9}{2^2} = 2^7 \rightarrow \text{Number of Sets, so 7 bits for the set number}$$

Tag bits =

$$32 - (7+5) = (32-12) = 20$$
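The same field widths can be derived programmatically. The function name `field_widths` and the dictionary keys are made up; the 4-way associativity is read off from the division by 4 in the working above, and all sizes are assumed to be powers of two.

```python
def field_widths(memory_bytes: int, cache_bytes: int, block_bytes: int, ways: int) -> dict:
    """Tag / set / offset widths for a set associative cache (sketch)."""
    address_bits = memory_bytes.bit_length() - 1  # 2**32 bytes -> 32 bits
    offset_bits = block_bytes.bit_length() - 1    # 32 bytes    -> 5 bits
    lines = cache_bytes // block_bytes            # number of cache lines
    sets = lines // ways                          # lines are grouped into sets
    set_bits = sets.bit_length() - 1
    tag_bits = address_bits - set_bits - offset_bits
    return {
        "lines": lines,
        "sets": sets,
        "tag": tag_bits,
        "set": set_bits,
        "offset": offset_bits,
    }


widths = field_widths(
    memory_bytes=4 * 2**30,  # 4 GB main memory
    cache_bytes=16 * 2**10,  # 16 KB cache
    block_bytes=32,
    ways=4,
)
print(widths)
```

Passing `ways=1` gives the direct-mapped split, and `ways` equal to the number of lines gives the fully associative split with no set field.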

Lecture 45

# # FIFO CACHE REPLACEMENT POLICY

Q 16 cache blocks, 4-way set associative

Reference sequence = 0, 255, 1, 4, 3, 8, 133, 159, 216, 129, 63, 8, 48, 32, 73, 92, 155

$\therefore$  Number of sets =  $\frac{16}{4} = 4$  sets

Cache Memory (blocks placed in each set, in order; set = block number mod number of sets)

| Set            | Blocks placed (in order)  |
|----------------|---------------------------|
| S<sub>0</sub> | 0, 4, 8, 216, 48, 32, 92  |
| S<sub>1</sub> | 1, 133, 129, 73           |
| S<sub>2</sub> |                           |
| S<sub>3</sub> | 255, 3, 159, 63, 155      |

Steps =

- Find block number mod  $n$  (where  $n$  is the number of sets) and then place the block in that set according to capacity.
- If the capacity is full, then replace an element according to FIFO.
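The two steps can be simulated directly. This is a sketch with a made-up function name, `simulate_fifo`; the reference sequence is the one from the question, with the trailing 155 taken from the last row of the table.

```python
from collections import deque


def simulate_fifo(references, num_sets, ways):
    """Set associative cache with FIFO replacement (illustrative sketch)."""
    sets = [deque() for _ in range(num_sets)]
    hits = 0
    for block in references:
        lines = sets[block % num_sets]  # step 1: block number mod number of sets
        if block in lines:
            hits += 1                   # a hit does not change the FIFO order
            continue
        if len(lines) == ways:
            lines.popleft()             # step 2: evict the oldest block in the set
        lines.append(block)
    return [list(s) for s in sets], hits


references = [0, 255, 1, 4, 3, 8, 133, 159, 216, 129, 63, 8, 48, 32, 73, 92, 155]
final_sets, hits = simulate_fifo(references, num_sets=4, ways=4)
print(final_sets, hits)
```

In set 0 the blocks 0, 4 and 8 are evicted in arrival order as 48, 32 and 92 come in; the second reference to 8 is the only hit.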

Lecture 46

## #. LEAST RECENTLY USED POLICY

- Q. 0, 255, 1, 4, 3, 8, 133, 159, 216, 129,  
 63, 8, 48, 32, 73, 92, 155, 16

|                |         |
|----------------|---------|
|                | 0 48    |
| S <sub>0</sub> | 1 32    |
|                | 8 92    |
|                | 2 16    |
|                | - 1     |
| S <sub>1</sub> | 133     |
|                | 129     |
|                | - 73    |
| S <sub>2</sub> |         |
| S <sub>3</sub> | 255 155 |
|                | 159     |
|                | 63      |

Lecture 47

## # NEED OF PIPELINING

- # Advantages =
- Reduces the processing time.

There are two methods to make our system faster and more optimized =

i) Changing the circuit of the system.

→ Drawback = This will be costlier and more time taking.

ii) Making changes in the hardware and software of the system, which is cost effective and takes less time.  
(Here, the concept of pipelining is used.)



If the performance is in pipelining then

| Time  | O | P | R |
|-------|---|---|---|
| $C_1$ | 0 |   |   |
| $C_2$ | - | 0 | P |
| $C_3$ | - | - | R |


## When execution is not pipelined

|                | O | P | R |   |   |   |
|----------------|---|---|---|---|---|---|
| C<sub>1</sub>  | O | P | R |   |   |   |
| C<sub>2</sub>  | - | - | - | O | P | R |
| C<sub>3</sub>  | - | - | - | - | - | - |

Time consumption is high.

## DEFINITION OF PIPELINING

- An arrangement of the hardware and operating system such that the system works faster.
- Different parts of the system work simultaneously.
- In pipelining, the execution of multiple instructions is overlapped.



- The output of Stage 1 is the input of Stage 2.
- For better processing, the input of each stage is first stored in intermediate registers.
- The clock provides the signal that tells each stage when its instruction has to be executed.



# PIPELINING VS. NON-PIPELINING

$$\underline{Q.} \quad P = 8I, \quad I_1 - I_8$$



Without pipelining:  
 Clock pulses =  $(8 \times 5) \text{ cc}$  
 $\Rightarrow$ Clock pulses =  $40 \text{ cc}$

With pipelining:  
 Let us assume clock pulses per instruction $\approx 1$ once the pipeline is full.

Time taken in pipelining:

$$k + (n-1) = 5 + (8-1) = 12 \text{ cc}$$

**Note:** Speed Up =  $\frac{NP}{P}$

$$\text{Speed Up} = \frac{n \times k}{k + (n-1)} = \frac{40}{12} = \frac{10}{3}$$

Where,  $k$  is the number of stages  
and  $n$  is the number of instructions.

\* Efficiency =  $\frac{\text{Used boxes (stages} \times \text{number of instructions)}}{\text{Total number of boxes}}$

$$\Rightarrow \text{Efficiency} = \left( \frac{8 \times 5}{60} \right) = \frac{2}{3}$$
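The cycle counts, speed up and efficiency for this example can be reproduced with a few lines of code. The sketch below assumes an ideal pipeline with no stalls and one clock cycle per stage; the function names are illustrative only.

```python
from fractions import Fraction

def pipeline_cycles(k, n):
    """Cycles for n instructions on a k-stage pipeline: k + (n - 1)."""
    return k + (n - 1)

def non_pipeline_cycles(k, n):
    """Cycles without pipelining: every instruction takes all k stages in turn."""
    return n * k

def speedup(k, n):
    return Fraction(non_pipeline_cycles(k, n), pipeline_cycles(k, n))

def efficiency(k, n):
    """Used stage slots (n * k) divided by all stage slots (k * total cycles)."""
    return Fraction(n * k, k * pipeline_cycles(k, n))

# The example above: k = 5 stages, n = 8 instructions.
print(pipeline_cycles(5, 8), speedup(5, 8), efficiency(5, 8))  # prints: 12 10/3 2/3
```

Using `Fraction` keeps the results exact, so they can be compared directly with the hand calculation.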

As CPI will Increase, Instructions will also Increase.


lecture 5!

# NUMERICAL QUESTION

## ON PIPE LINING

$$\Rightarrow \text{Speed Up} = \frac{T_{NP}}{T_P}$$

$$\text{Speed Up} = \frac{4 \times \frac{1}{2.5 \times 10^9}}{1 \times \frac{1}{2.0 \times 10^9}}$$

$$\Rightarrow \text{Speed Up} = \frac{4 \times 2.0}{2.5} = \frac{16}{5} = 3.2$$


Lecture 10

## STAGE DELAY IN PIPELINE

Q.



$$\text{Maximum stage time} = 165 \text{ ns}$$

$\therefore$ The maximum clock frequency =  $\frac{1}{t} = \frac{1}{165 \text{ ns}}$

$$\text{Total time} = (1 \times 4 \times 165 + 999 \times 1 \times 165) \text{ ns} = 165495 \text{ ns} \approx 165.50\ \mu\text{s}$$

$$(k+n-1)t_p = (8+99)20 = 214 \quad \checkmark$$



## STRUCTURAL HAZARDS

→ When instructions require the same hardware resources (registers and other units) for processing at the same time, a structural hazard occurs.



Here, the instructions conflict because they perform the same function in the same cycle. To resolve this, we either have to change the circuitry or provide additional registers so that the information can be stored.

Otherwise, conflicts will keep occurring, causing continuous misses; a hit will occur only rarely.


Lecture 56  
Lecture 55

# READ AFTER WRITE HAZARD

→ Let's say

Four Phase

|                   |       |    |    |    |                            |
|-------------------|-------|----|----|----|----------------------------|
| $I_1 \rightarrow$ | IF/ID | OF | EX | WB | $R_3 \leftarrow R_1 + R_2$ |
| $I_2 \rightarrow$ | IF/ID | OF | EX | WB | $R_5 \leftarrow R_3 + R_4$ |

But,

$I_2$ starts before $I_1$ has finished executing and has written $R_3$.

- Instruction $I_2$ uses $R_3$ as a source operand.
- Thus, $I_2$ may read a stale value of $R_3$, which produces a contradiction.

Note = To resolve this contradiction, a stall is introduced.

→ Instruction 2 ( $I_2$ ) is flushed and issued again after $I_1$ has written $R_3$.

|       |       |    |    |    |                |
|-------|-------|----|----|----|----------------|
| $I_1$ | IF/ID | OF | EX | WB | $R_3$          |
| $I_2$ | IF/ID | OF | EX | WB | IF/ID OF EX WB |

$R_3$ is used by $I_2$, hence the stall.

Branching in the instruction has been

→ Can be rectified by the operand forwarding method.

# CONTROL HAZARD

|     |    |    |    |   |    |  |
|-----|----|----|----|---|----|--|
| 500 | IF | ID | EX | M | WB |  |
| 501 | IF | ID | EX | M | WB |  |

After the instruction at memory address 500 is fetched, the program counter (PC) automatically moves to memory location 501. If the instruction at 500 is a branch and we actually want to execute the instruction at memory location 700, the instruction at 501 has already entered the pipeline.

This produces a contradiction; hence, pipelining causes problems for the execution of branch instructions.

Note = As soon as we detect branching, we can redirect the instruction fetch accordingly, so that the degradation in performance is stopped.

The wrongly fetched instructions are handled by flush and stall.

|     |    |    |    |   |    |  |
|-----|----|----|----|---|----|--|
| 500 | IF | ID | EX | M | WB |  |
|     | IF | ID | EX | M | WB |  |
|     | IF | ID | EX | M | WB |  |

If the branch target is, let's say, the 2000th memory location, then the instructions fetched after 500 are flushed.

# WRITE AFTER READ HAZARD

Q.

$$\begin{array}{lll}
 I_1: & R_1 \leftarrow R_2 * R_3 & (R_1 = 700,\ R_2 = 10,\ R_3 = 70) \\
 I_2: & R_2 \leftarrow R_4 + R_5 & (R_2 = 70,\ R_4 = 30,\ R_5 = 40)
 \end{array}$$

|    |    |    |    |    |   |
|----|----|----|----|----|---|
| I1 | IF | ID | EX | WB | } |
| I2 | IF | ID | EX | WB |   |

Here, $I_1$ must read $R_2$ (for $R_2 * R_3$) before $I_2$ writes the value of $R_4 + R_5$ into $R_2$.  
 If $I_2$ writes $R_2$ before $I_1$ has read it, $I_1$ uses the wrong value.  
 Hence, a write-after-read hazard occurs here.

$$\text{Range } (I_1) = R_1$$

$$\text{Domain } (I_1) = R_2, R_3$$

$$\text{Range } (I_2) = R_2$$

$$\text{Domain } (I_2) = R_4, R_5$$

$$\therefore \boxed{\text{Domain of } (I_1) \cap \text{Range of } (I_2) = R_2 \neq \emptyset}$$

Hence, a write-after-read hazard exists here.

# WRITE AFTER WRITE HAZARD

$$\begin{array}{rcl} I_1 & : & R_3 \leftarrow R_1 * R_2 \\ I_2 & : & R_3 \leftarrow R_4 + R_5 \quad (100 \leftarrow 50 + 50) \end{array}$$



Both instructions write to $R_3$, so the value left in $R_3$ depends on the order in which the two writes complete.

This is a perfect example of the lost update problem.

## LOST UPDATE PROBLEM

- When two or more writes (transactions) to the same location override one another, an update is lost and the output becomes wrong.

Check:

$$\text{Range } (I_1) = R_3$$

$$\text{Domain } (I_1) = R_1, R_2$$

$$\text{Range } (I_2) = R_3$$

$$\text{Domain } (I_2) = R_4, R_5$$

In a Write After Write hazard:

$$\text{Range}(I_1) \cap \text{Range}(I_2) = R_3 \neq \emptyset$$

Hence, a write-after-write hazard exists here.

Note =

## Data Hazards

| Read After Write | Write After Write | Write After Read |
|------------------|-------------------|------------------|
| Can be rectified by operand forwarding. | Can be rectified by register renaming. | Can be rectified by register renaming. |
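The Domain/Range checks used for the three data hazards can be written as one small function. This is a sketch under simple assumptions: each instruction is described only by its destination register (Range) and its set of source registers (Domain), the second instruction follows the first in program order, and the name `data_hazards` is hypothetical.

```python
def data_hazards(first, second):
    """Return the data hazards between two instructions given as (destination, sources)."""
    dest1, src1 = first
    dest2, src2 = second
    found = []
    if dest1 in src2:
        found.append("RAW")   # Range(I1) ∩ Domain(I2) is not empty
    if dest2 in src1:
        found.append("WAR")   # Domain(I1) ∩ Range(I2) is not empty
    if dest1 == dest2:
        found.append("WAW")   # Range(I1) ∩ Range(I2) is not empty
    return found

# R3 <- R1 + R2 ; R5 <- R3 + R4
print(data_hazards(("R3", {"R1", "R2"}), ("R5", {"R3", "R4"})))  # prints: ['RAW']
# R1 <- R2 * R3 ; R2 <- R4 + R5
print(data_hazards(("R1", {"R2", "R3"}), ("R2", {"R4", "R5"})))  # prints: ['WAR']
# R3 <- R1 * R2 ; R3 <- R4 + R5
print(data_hazards(("R3", {"R1", "R2"}), ("R3", {"R4", "R5"})))  # prints: ['WAW']
```

The three calls correspond to the three instruction pairs used in the hazard examples of these notes.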


Q. Consider a 4 - Segment Pipeline for 4 tasks, compare the number of cycle required in pipeline model and non-pipeline model.

$$\Rightarrow k + (n-1) = 4 + (4-1) = 7 \text{ Cycle (Pipelining)}$$

$$4 \times 4 = 16 \text{ Cycles (Non-Pipelining)}$$



$$(16-7) = 9 \text{ Cycles Saved}$$

$$\Rightarrow \boxed{\{k+(n-1)\} \times t_p} \leftarrow \text{Pipelining}$$

$$\Rightarrow \boxed{n \cdot t_n} \leftarrow \text{Non-Pipelining}$$

$$\eta = \frac{S}{k}$$

## FORMULA:

EXECUTION TIME , SPEED UP RATIO, CPI FOR PROGRAM / MACHINE

$$\rightarrow \left[ \text{Clock cycle} = \frac{1}{\text{clockrate}} \quad \text{or,} \quad \frac{1}{\text{clock frequency}} \right]$$

$$\rightarrow \text{Speed Up} = \frac{\text{Performance } x}{\text{Performance } Y}$$

or,

$$\text{Speed Up} = \frac{\text{Old Execution Time}}{\text{New Exec. Time}}$$

$$\rightarrow \text{Execution Time} = \text{clock cycle} \times \sum_{i=1}^n \text{CPI}_i \times I_i$$

{ where,

CPI = Average number of clock cycles per instruction

I = number of times the instruction is executed. }

or,

$$E_T = I C \times CPI \times \frac{1}{\text{clock Rate}} \longrightarrow \text{or clock frequency.}$$

$$\rightarrow \left[ \text{CPI (overall)} = (\text{CPI}_1 \times f_1) + (\text{CPI}_2 \times f_2) + \dots + (\text{CPI}_n \times f_n) \right] \quad \left\{ \begin{array}{l} \text{where } f \text{ is frequency} \\ \text{in fraction not} \\ \text{in percentage} \end{array} \right.$$

**RELATED QUESTION BASED ON FORMULA :**

1. A processor having a clock cycle time of 0.25 nsec will have a clock rate of \_\_\_\_ MHz.

Solution: Clock cycle time  $C$  is the reciprocal of the clock rate  $f$ :

$$C = 1 / f$$

$$f = 1/C = 1/0.25\text{ns} = 4 \text{ GHz or } 4000 \text{ MHz}$$

**FORMULA USED:**

Clock time = 1/clock rate

Or, Clock time = 1/frequency

2. The performance of machine A is 10 times the performance of machine B, when running the same program. We say that machine A is \_\_\_\_ times faster than machine B when running the same program.

Solution:  $\text{Speedup} = \frac{\text{Performance}_X}{\text{Performance}_Y} = \frac{10}{1} = 10$

- Consider a program whose instruction count is 50,000, average CPI is 2.2, and clock rate is 1.9 GHz. Suppose we use a new compiler on the same program for which the new instruction count is 40,000, and new CPI is 3.1, which is running on a faster machine with clock rate 2.5 GHz. The speedup achieved will be \_\_\_\_.

Solution:  $IC_1=50,000$

$CPI_1=2.2$

$\text{Clock rate}_1 = 1.9 \text{ GHz}$

$$\begin{aligned} OldExecutionTime_1 &= IC_1 \times CPI_1 \times \frac{1}{clockrate_1} \\ &= 50000 \times 2.2 \times \frac{1}{1.9 \times 10^9} \approx 5.789 \times 10^{-5} \text{ sec} \end{aligned}$$

$$IC_2 = 40,000$$

$$CPI_2 = 3.1$$

$$\text{Clock rate}_2 = 2.5 \text{ GHz}$$

$$\begin{aligned} NewExecutionTime_2 &= IC_2 \times CPI_2 \times \frac{1}{\text{clockrate}_2} \\ &= 40000 \times 3.1 \times \frac{1}{2.5 \times 10^9} \approx 4.96 \times 10^{-5} \text{ sec} \end{aligned}$$

$$\text{Speedup} = \frac{\text{OldExecutionTime}_1}{\text{NewExecutionTime}_2} = \frac{5.789 \times 10^{-5} \text{ sec}}{4.96 \times 10^{-5} \text{ sec}} \approx 1.167$$

**NOTE: PERFORMANCE = 1/ EXECUTION TIME** -----EQ(1)

SO SPEED UP = PERFORMANCE X / PERFORMANCE Y

OR, SPEED UP = EXECUTION TIME OF Y / EXECUTION TIME OF X

[ FROM EQ(1) ]
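The execution-time and speedup formulas can be checked numerically. The helper below is a minimal sketch; the name `execution_time` and the use of hertz for the clock rate are choices made for this illustration.

```python
def execution_time(instruction_count, cpi, clock_rate_hz):
    """Execution time in seconds: IC x CPI x (1 / clock rate)."""
    return instruction_count * cpi / clock_rate_hz

# Old compiler/machine versus new compiler/machine from the worked example.
old = execution_time(50_000, 2.2, 1.9e9)
new = execution_time(40_000, 3.1, 2.5e9)
print(f"old = {old:.3e} s, new = {new:.3e} s, speedup = {old / new:.3f}")
```

Because performance is the reciprocal of execution time, dividing the old time by the new time gives the speedup directly.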

- A program is running on a machine which has a total of 500 instructions, average cycles per instruction for the program is 2.5, and CPU clock rate is 1.78 GHz. The execution time of the program will be \_\_\_\_\_ second.

Solution: Instruction Count = 500

CPI = 2.5

Clock Rate = 1.78 GHz

Clock Cycle Time = 1/Clock rate

$$\text{ExecutionTime} = IC \times CPI \times \frac{1}{\text{clockrate}} = 500 \times 2.5 \times \frac{1}{1.78 \times 10^9} \approx 702.25 \times 10^{-9} \text{ sec}$$

Suppose for a RISC ISA implementation, there are four instruction types LOAD, STORE, ALU and BRANCH with relative frequencies of 25%, 3%, 50% and 22% respectively, and CPI values of 5, 2.5, 1 and 2.2 respectively. The overall CPI will be 2.309.

Solutions:

$$CPI_{\text{overall}} = (0.25 \times 5) + (0.03 \times 2.5) + (0.5 \times 1) + (0.22 \times 2.2) = 2.309$$

$$CPI_{\text{OVERALL}} = (CPI_1 \times F_1) + (CPI_2 \times F_2) + \dots + (CPI_N \times F_N)$$

FORMULA:

$$\rightarrow \text{MIPS Rating} = \frac{\text{Instruction Count (IC)}}{\text{Execution Time} \times 10^6}$$

$$\text{or, } \frac{\text{Clock Rate (in MHz)}}{\text{CPI}}$$

$$\Rightarrow \text{Weighted Arithmetic Mean (WAM)} = \sum_{i=1}^n w_i t_i$$

{ where,  $w$  = weights of Programs  
 $t$  = time to run ;  $n$  = no. of Program }

$$\Rightarrow \text{Geometric Mean} \left( \frac{A}{B} \right) = \frac{\text{Geometric Mean A}}{\text{Geometric Mean B}}$$

$$\Rightarrow \text{Max Speedup} = \frac{1}{1 - F} \quad \left\{ \text{where } F = \text{fraction of the computation that is speeded up} \right\}$$

$$\Rightarrow \text{Data Transfer Rate} = \frac{\text{Amount of Data}}{\text{Transfer Time}}$$

{ Data transfer rate can be in bits per second (b/s), bytes per second (B/s), kilobytes per second (KB/s), megabytes per second (MB/s) and so on. }

Consider a processor having four types of instruction classes, A, B, C and D, with the corresponding CPI values 1.1, 1.7, 2.8 and 3.5 respectively. The processor runs at a clock rate of 2 GHz. For a given program, the instruction counts for the four types of instructions are 20, 15, 12 and 5 million respectively. The MIPS rating of the processor for this program will be

Solution:

|     | A   | B   | C   | D   |
|-----|-----|-----|-----|-----|
| CPI | 1.1 | 1.7 | 2.8 | 3.5 |
| IC  | 20  | 15  | 12  | 5   |

$$CPI = \frac{(20 \times 1.1) + (15 \times 1.7) + (12 \times 2.8) + (5 \times 3.5)}{20 + 15 + 12 + 5} = \frac{98.6}{52} \approx 1.896$$

$$\text{MIPS} = \text{Clock Rate (in MHz)} / (\text{CPI}) = \frac{2000}{1.896} \approx 1054.8 \text{ MIPS}$$

ALSO, MIPS RATING = INSTRUCTION COUNT / (EXECUTION TIME  $\times 10^6$)
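The overall CPI and the MIPS rating can be computed as follows. This is an illustrative sketch with hypothetical function names; the weights may be either relative frequencies or instruction counts, since the function divides by their sum.

```python
def overall_cpi(cpis, weights):
    """Weighted average CPI; weights are frequencies or instruction counts."""
    return sum(c * w for c, w in zip(cpis, weights)) / sum(weights)

def mips_rating(clock_rate_mhz, cpi):
    """MIPS rating = clock rate (in MHz) / CPI."""
    return clock_rate_mhz / cpi

# RISC example: LOAD, STORE, ALU, BRANCH with relative frequencies.
risc_cpi = overall_cpi([5, 2.5, 1, 2.2], [0.25, 0.03, 0.5, 0.22])

# Instruction classes A-D with counts in millions, on a 2 GHz processor.
program_cpi = overall_cpi([1.1, 1.7, 2.8, 3.5], [20, 15, 12, 5])
print(f"{risc_cpi:.3f} {program_cpi:.3f} {mips_rating(2000, program_cpi):.1f}")
```

Keeping the CPI unrounded until the final division gives a MIPS rating of about 1054.8, in line with the hand calculation.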

There are three computers A, B and C. A program P1 takes time 5, 20 and 100 respectively to run on the three computers. Similarly, another program P2 takes times 750, 75 and 15 respectively to run on the same three computers. Which computer is the fastest based on weighted average mean assuming the weights of programs P1 and P2 to be 40% and 60% respectively?

- a. Computer A
- b. Computer B
- c. Computer C
- d. Cannot say.

Solution: (c)

|    | Computer A | Computer B | Computer C |
|----|------------|------------|------------|
| P1 | 5          | 20         | 100        |
| P2 | 750        | 75         | 15         |

$$\text{Weighted Arithmetic Mean (WAM)} = \sum_{i=1}^n w_i t_i$$

$$\text{WAM for Computer A} = (0.4 \times 5) + (0.6 \times 750) = 452$$

$$\text{WAM for Computer B} = (0.4 \times 20) + (0.6 \times 75) = 53$$

$$\text{WAM for Computer C} = (0.4 \times 100) + (0.6 \times 15) = 49$$

Computer C has the smallest weighted mean execution time, so it is the fastest.

Which of the following statements are true?

- a. Geometric means of normalized execution times are consistent.
- b. Arithmetic means of normalized execution times are consistent.
- c. Geometric means of normalized execution times are not consistent.
- d. Arithmetic means of normalized execution times are not consistent.

Solution: ((a) and (d)) The geometric mean is independent of which data series is used for normalization, due to the property:

$$\frac{\text{Geometric mean}(A)}{\text{Geometric mean}(B)} = \text{Geometric mean}\left(\frac{A}{B}\right)$$

Suppose we are enhancing the speed of a fraction F of a given computation. If F = 0.55, the maximum speed up that can be attained is

Solution:

$$\text{Maximum Speedup} = \frac{1}{1-F} = \frac{1}{1-0.55} = \frac{1}{0.45} = 2.22$$

Consider two alternatives for speeding up computation. In the first alternative, we make 20% of a program 90 times faster. In the second alternative, we make 95% of the program 15 times faster. The ratio of the speedups for the two cases will be

Solution:

$$90X \text{ faster for } 20\% \text{ of the program} = \frac{1}{(1-0.2)+\frac{0.2}{90}} = \frac{1}{0.8022} \approx 1.246$$

$$15X \text{ faster for } 95\% \text{ of the program} = \frac{1}{(1-0.95)+\frac{0.95}{15}} = \frac{1}{0.11333} \approx 8.8235$$

The ratio of the speedups is  $1.246 : 8.8235 \approx 1 : 7.08 \approx 0.141$
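Both Amdahl's law calculations can be verified with a short script. The function names are illustrative; `fraction` is the part of the computation that is enhanced and `factor` is how many times faster that part becomes.

```python
def amdahl_speedup(fraction, factor):
    """Overall speedup when `fraction` of the work is made `factor` times faster."""
    return 1.0 / ((1.0 - fraction) + fraction / factor)

def max_speedup(fraction):
    """Limit of the speedup as the enhancement factor grows without bound."""
    return 1.0 / (1.0 - fraction)

first = amdahl_speedup(0.20, 90)    # 20% of the program, 90 times faster
second = amdahl_speedup(0.95, 15)   # 95% of the program, 15 times faster
print(f"{max_speedup(0.55):.2f} {first:.3f} {second:.4f} {first / second:.3f}")
```

The second alternative wins by a wide margin because it touches almost the whole program, even though its enhancement factor is much smaller.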

**DATA TRANSFER RATE or SPEED = AMOUNT OF DATA / TRANSFER TIME**

EXAMPLE:

For instance, say you transferred 100 GB at a rate of 7 MB/s. First, convert gigabytes to megabytes so that every part of the equation uses the same units:  $100 \times 1,024 = 102400$ . So, you transferred 102400 MB at a rate of 7 MB/s. To solve for T, divide 102400 by 7, which gives 14628.57 sec. To convert this to hours, divide by 3,600, which gives 4.06. In other words, it took about 4.06 hrs to transfer 100 GB at a rate of 7 MB/s.

CONVERSION TABLE:

|                    |                  |
|--------------------|------------------|
| $8\ b = 1\ B$      |                  |
| $1024\ B = 1\ KB$  | $60\ s = 1\ min$ |
| $1024\ KB = 1\ MB$ | $60\ min = 1\ h$ |
| $1024\ MB = 1\ GB$ |                  |
| $1024\ GB = 1\ TB$ |                  |

**Magnetic Disk in Computer Architecture-**

**QUESTION:**

Consider a disk pack with the following specifications- 16 surfaces, 128 tracks per surface, 256 sectors per track and 512 bytes per sector.

Answer the following questions-

1. What is the capacity of disk pack?
2. What is the number of bits required to address the sector?
3. If the format overhead is 32 bytes per sector, what is the formatted disk space?
4. If the format overhead is 64 bytes per sector, how much amount of memory is lost due to formatting?
5. If the diameter of innermost track is 21 cm, what is the maximum recording density?
6. If the diameter of innermost track is 21 cm with 2 KB/cm, what is the capacity of one track?
7. If the disk is rotating at 3600 RPM, what is the data transfer rate?
8. If the disk system has rotational speed of 3000 RPM, what is the average access time with a seek time of 11.5 msec?

SOL:

**Part-01: Capacity of Disk Pack-**

Capacity of disk pack

= Total number of surfaces x Number of tracks per surface x Number of sectors per track x Number of bytes per sector

=  $16 \times 128 \times 256 \times 512$  bytes

=  $2^{28}$  bytes

= 256 MB

**Part-02: Number of Bits Required To Address Sector-**

Total number of sectors

= Total number of surfaces x Number of tracks per surface x Number of sectors per track

=  $16 \times 128 \times 256$  sectors

=  $2^{19}$  sectors

Thus, Number of bits required to address the sector = 19 bits

### **Part-03: Formatted Disk Space-**

Formatting overhead

= Total number of sectors x overhead per sector

=  $2^{19} \times 32$  bytes

=  $2^{19} \times 2^5$  bytes

=  $2^{24}$  bytes

= 16 MB

Now, Formatted disk space

= Total disk space – Formatting overhead

= 256 MB – 16 MB

= 240 MB

### **Part-04: Formatting Overhead-**

Amount of memory lost due to formatting

= Formatting overhead

= Total number of sectors x Overhead per sector

=  $2^{19} \times 64$  bytes

=  $2^{19} \times 2^6$  bytes

=  $2^{25}$  bytes

= 32 MB

## **Part-05: Maximum Recording Density-**

Storage capacity of a track

$$= \text{Number of sectors per track} \times \text{Number of bytes per sector}$$

$$= 256 \times 512 \text{ bytes}$$

$$= 2^8 \times 2^9 \text{ bytes}$$

$$= 2^{17} \text{ bytes}$$

$$= 128 \text{ KB}$$

Circumference of innermost track

$$= 2 \times \pi \times \text{radius}$$

$$= \pi \times \text{diameter}$$

$$= 3.14 \times 21 \text{ cm}$$

$$= 65.94 \text{ cm}$$

Now, Maximum recording density

$$= \text{Recording density of innermost track}$$

$$= \text{Capacity of a track} / \text{Circumference of innermost track}$$

$$= 128 \text{ KB} / 65.94 \text{ cm}$$

$$= 1.94 \text{ KB/cm}$$

## **Part-06: Capacity Of Track-**

Circumference of innermost track

$$= 2 \times \pi \times \text{radius}$$

$$= \pi \times \text{diameter}$$

$$= 3.14 \times 21 \text{ cm}$$

$$= 65.94 \text{ cm}$$

Capacity of a track

= Storage density of the innermost track x Circumference of the innermost track

= 2 KB/cm x 65.94 cm

= 131.88 KB

$\cong$  132 KB

### **Part-07: Data Transfer Rate-**

Number of rotations in one second

= (3600 / 60) rotations/sec

= 60 rotations/sec

Now, Data transfer rate

= Number of heads x Capacity of one track x Number of rotations in one second

= 16 x (256 x 512 bytes) x 60

=  $2^4 \times 2^8 \times 2^9 \times 60$  bytes/sec

=  $60 \times 2^{21}$  bytes/sec

= 120 MBps

### **Part-08: Average Access Time-**

Time taken for one full rotation

= (60 / 3000) sec

= (1 / 50) sec

= 0.02 sec

= 20 msec

Average rotational delay

$$= 1/2 \times \text{Time taken for one full rotation}$$

$$= 1/2 \times 20 \text{ msec}$$

$$= 10 \text{ msec}$$

Now, average access time

$$= \text{Average seek time} + \text{Average rotational delay} + \text{Other factors}$$

$$= 11.5 \text{ msec} + 10 \text{ msec} + 0$$

$$= 21.5 \text{ msec}$$
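Several parts of this disk problem can be re-derived in code. The sketch below uses hypothetical helper names and the same conventions as the solution (1 KB = 1024 bytes, 1 MB = 1024 KB, average rotational delay = half a rotation).

```python
def disk_summary(surfaces=16, tracks=128, sectors=256, sector_bytes=512, overhead=32):
    """Capacity figures for the disk pack in the question above."""
    total_sectors = surfaces * tracks * sectors
    capacity = total_sectors * sector_bytes
    return {
        "capacity_mb": capacity / 2**20,
        "sector_address_bits": (total_sectors - 1).bit_length(),
        "formatted_mb": (capacity - total_sectors * overhead) / 2**20,
        "track_kb": sectors * sector_bytes / 2**10,
    }

def transfer_rate_mb_per_s(heads, track_bytes, rpm):
    """Heads x capacity of one track x rotations per second, in MB/s."""
    return heads * track_bytes * (rpm / 60) / 2**20

def average_access_ms(seek_ms, rpm):
    """Seek time plus average rotational delay (half of one rotation)."""
    return seek_ms + 0.5 * (60_000 / rpm)

print(disk_summary())
print(transfer_rate_mb_per_s(16, 256 * 512, 3600), average_access_ms(11.5, 3000))
```

Calling `disk_summary(overhead=64)` gives the formatted space for Part-04, from which the 32 MB lost to formatting follows.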

### **Problem-02:**

What is the average access time for transferring 512 bytes of data with the following specifications-

- Average seek time = 5 msec
- Disk rotation = 6000 RPM
- Data rate = 40 KB/sec
- Controller overhead = 0.1 msec

### **Time Taken For One Full Rotation-**

Time taken for one full rotation

$$= (60 / 6000) \text{ sec}$$

$$= (1 / 100) \text{ sec}$$

$$= 0.01 \text{ sec}$$

$$= 10 \text{ msec}$$

### **Average Rotational Delay-**

Average rotational delay

$$= \frac{1}{2} \times \text{Time taken for one full rotation}$$

$$= \frac{1}{2} \times 10 \text{ msec}$$

$$= 5 \text{ msec}$$

### **Transfer Time-**

Transfer time

$$= (512 \text{ bytes} / 40 \text{ KB}) \text{ sec}$$

$$= 0.0125 \text{ sec}$$

$$= 12.5 \text{ msec}$$

### **Average Access Time-**

Average access time

$$= \text{Average seek time} + \text{Average rotational delay} + \text{Transfer time} + \text{Controller overhead} + \text{Queuing delay}$$

$$= 5 \text{ msec} + 5 \text{ msec} + 12.5 \text{ msec} + 0.1 \text{ msec} + 0$$

$$= 22.6 \text{ msec}$$
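The same access-time breakdown can be expressed as a function. This is a sketch with a hypothetical name, `access_time_ms`, assuming no queuing delay and 1 KB = 1024 bytes as in the solution above.

```python
def access_time_ms(seek_ms, rpm, transfer_bytes, rate_bytes_per_s, overhead_ms=0.0):
    """Average access time = seek + rotational delay + transfer time + controller overhead."""
    rotation_ms = 60_000 / rpm                      # time for one full rotation
    rotational_delay_ms = rotation_ms / 2           # on average, half a rotation
    transfer_ms = transfer_bytes / rate_bytes_per_s * 1000
    return seek_ms + rotational_delay_ms + transfer_ms + overhead_ms

# Problem-02: 5 msec seek, 6000 RPM, 512 bytes at 40 KB/sec, 0.1 msec overhead.
print(round(access_time_ms(5, 6000, 512, 40 * 1024, 0.1), 2))
```

At this low data rate the transfer time is the largest single term, larger than the seek time and the rotational delay combined.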

## Pipelining in Computer Architecture-

Question

- Given: → Four-stage pipeline used ( $K=4$ )  
→ Delay of stages =  $\{60, 50, 90, 80\} \text{ ns}$   
→ Latch Delay = 10 ns  
→ Instructions (n) = 1

(i) 
$$\text{Pipeline Cycle Time} = \boxed{\text{Maximum Stage Delay} + \text{Latch/Register Delay}}$$

$$\text{Pipeline Cycle Time} = \max \{60, 50, 90, 80\} + 10 \text{ ns}$$

$$\Rightarrow (90 + 10) = \underline{\underline{100 \text{ ns}}} \quad \left( \text{choose the maximum from the delay set} \right)$$

(ii) 
$$\text{Non-pipeline Execution Time} = (\text{Sum of all stage delays for 1 instruction}) \times n$$

$$\text{Non-pipeline Execution Time} = (60 + 50 + 90 + 80) \times 1 = \underline{\underline{280 \text{ ns}}}$$

(iii) 
$$\text{Speedup Ratio (pipeline vs. non-pipeline)} = \frac{\text{Non-Pipeline Exec. Time}}{\text{Pipeline Exec. Time}}$$

Also; 
$$\text{Speedup Ratio} = \frac{\text{Performance of pipelined processor}}{\text{Performance of non-pipelined processor}}$$

A/Q 
$$\text{Speedup Ratio} = \frac{ET(\text{non-pipeline})}{ET(\text{pipeline})} = \frac{280 \text{ ns}}{100 \text{ ns}} = 2.8$$

(iv)  $\boxed{\text{Total Execution Time for } n \text{ tasks in a pipeline} = (K - 1 + n) \times \text{cycle time}}$

A/Q

$K = 4 \text{ stages}$        $n = 1000$  (Given a task of 1000 instructions)

$$\text{Cycle Time} = \max\{60, 50, 90, 80\} + 10 \text{ ns} = \underline{100 \text{ ns}}$$

$$\begin{aligned} \text{Total E.T.} &= \{ (4-1) + 1000 \} \times 100 \\ &= \underline{100300 \text{ ns}} \end{aligned}$$

(v)  $\boxed{\text{For non-pipeline of } n \text{ tasks, Total E.T.} = n \times \text{time for one instruction}}$

A/Q Total E.T. (non-pipeline) =  $1000 \times 280 = \underline{280000 \text{ ns}}$

(vi)  $\boxed{\text{Efficiency} = \frac{\text{Speedup}}{\text{Number of stages}} = \frac{S}{K}}$

Note: Cycle per Instruction (CPI) for Ideal Pipeline is 1

Note: If clock rate is Given in Question in GHz i.e.  $\times 10^9$  then find clock time by formula  $\Rightarrow$

$$\text{Clock Time} = \frac{1}{\text{clock rate}}$$

Exa: clock rate = 4 GHz then

$$\begin{aligned} \text{Clock Time} &= \frac{1}{4 \times 10^9} = 0.25 \times 10^{-9} \\ &= \underline{0.25 \text{ nanosec}} \end{aligned}$$
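The pipeline timing formulas (i) to (v) can be collected in a few helper functions. This sketch assumes stage delays of 60, 50, 90 and 80 ns, which is the set consistent with the 100 ns cycle time and the 280 ns non-pipelined time worked out above; the function names are illustrative.

```python
def cycle_time(stage_delays, latch_delay=0):
    """Pipeline cycle time = slowest stage + latch/register delay."""
    return max(stage_delays) + latch_delay

def pipelined_time(stage_delays, n, latch_delay=0):
    """Total time for n instructions: (K - 1 + n) x cycle time."""
    k = len(stage_delays)
    return (k - 1 + n) * cycle_time(stage_delays, latch_delay)

def non_pipelined_time(stage_delays, n):
    """Total time without pipelining: n x (sum of all stage delays)."""
    return n * sum(stage_delays)

delays = [60, 50, 90, 80]   # ns, assumed as explained above
print(cycle_time(delays, 10))                 # prints: 100
print(pipelined_time(delays, 1000, 10))       # prints: 100300
print(non_pipelined_time(delays, 1000))       # prints: 280000
```

For large n the ratio of the two totals approaches 280 / 100 = 2.8, the speedup ratio obtained above.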

### **Problem-03:**

Consider a non-pipelined processor with a clock rate of 2.5 gigahertz and average cycles per instruction of 4. The same processor is upgraded to a pipelined processor with five stages but due to the internal pipeline delay, the clock speed is reduced to 2 gigahertz. Assume there are no stalls in the pipeline. The speed up achieved in this pipelined processor is-

SOL:

#### **Cycle Time in Non-Pipelined Processor-**

Frequency of the clock = 2.5 gigahertz

Cycle time

$$= 1 / \text{frequency}$$

$$= 1 / (2.5 \text{ gigahertz})$$

$$= 1 / (2.5 \times 10^9 \text{ hertz})$$

$$= 0.4 \text{ ns}$$

#### **Non-Pipeline Execution Time-**

Non-pipeline execution time to process 1 instruction

= Number of clock cycles taken to execute one instruction

= 4 clock cycles

$$= 4 \times 0.4 \text{ ns}$$

$$= 1.6 \text{ ns}$$

## **Cycle Time in Pipelined Processor-**

Frequency of the clock = 2 gigahertz

Cycle time

$$= 1 / \text{frequency}$$

$$= 1 / (2 \text{ gigahertz})$$

$$= 1 / (2 \times 10^9 \text{ hertz})$$

$$= 0.5 \text{ ns}$$

## **Pipeline Execution Time-**

Since there are no stalls in the pipeline, so ideally one instruction is executed per clock cycle. So,

Pipeline execution time

$$= 1 \text{ clock cycle}$$

$$= 0.5 \text{ ns}$$

## **Speed Up-**

Speed up

$$= \text{Non-pipeline execution time} / \text{Pipeline execution time}$$

$$= 1.6 \text{ ns} / 0.5 \text{ ns}$$

$$= 3.2$$

Thus, Option (A) is correct.

### **Problem-04:**

The stage delays in a 4 stage pipeline are 800, 500, 400 and 300 picoseconds. The first stage is replaced with a functionally equivalent design involving two stages with respective delays 600 and 350 picoseconds.

The throughput increase of the pipeline is \_\_\_\_\_ %.

SOL:

### **Execution Time in 4 Stage Pipeline-**

Cycle time

= Maximum delay due to any stage + Delay due to its register

=  $\text{Max} \{ 800, 500, 400, 300 \} + 0$

= 800 picoseconds

Thus, Execution time in 4 stage pipeline = 1 clock cycle = 800 picoseconds.

### **Throughput in 4 Stage Pipeline-**

Throughput

= Number of instructions executed per unit time

= 1 instruction / 800 picoseconds

## Execution Time in 5 Stage Pipeline-

Cycle time

= Maximum delay due to any stage + Delay due to its register

=  $\text{Max} \{ 600, 350, 500, 400, 300 \} + 0$

= 600 picoseconds

Thus, Execution time in 5 stage pipeline = 1 clock cycle = 600 picoseconds.

## Throughput in 5 Stage Pipeline-

Throughput

= Number of instructions executed per unit time

= 1 instruction / 600 picoseconds

## Throughput Increase-

Throughput increase

=  $\{ (\text{Final throughput} - \text{Initial throughput}) / \text{Initial throughput} \} \times 100$

=  $\{ (1 / 600 - 1 / 800) / (1 / 800) \} \times 100$

=  $\{ (800 / 600) - 1 \} \times 100$

=  $(1.33 - 1) \times 100$

=  $0.3333 \times 100$

= 33.33 %
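The throughput comparison can be reproduced directly from the stage delays. This is a sketch assuming zero register delay and one instruction completed per cycle; after the first stage is split, the new pipeline has the five stage delays 600, 350, 500, 400 and 300 picoseconds. The function name is hypothetical.

```python
def throughput_increase_percent(old_delays, new_delays):
    """Throughput is 1 / cycle time, and cycle time is the slowest stage delay."""
    old_cycle, new_cycle = max(old_delays), max(new_delays)
    return (old_cycle / new_cycle - 1) * 100

old_pipeline = [800, 500, 400, 300]        # picoseconds
new_pipeline = [600, 350, 500, 400, 300]   # first stage replaced by two stages
print(f"{throughput_increase_percent(old_pipeline, new_pipeline):.2f} %")  # prints: 33.33 %
```

Only the slowest stage matters here, so splitting any stage other than the 800 ps one would not change the throughput.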

1. Question 1

Which of the following is/are false?

- a. Processor can directly access data from secondary memory.
- b. Primary memory is used as backup memory.
- c. Primary memory stores the active instructions and data for the program being executed on the processor.
- d. Primary memory can store only instructions.

1. **a, b, c**
2. **a and b**
3. **a**
4. **a and d**

---

2. Question 2

Program counter:

1. **Counts the total number of instructions present in a program**
2. **Points to the next instruction that is to be executed.**
3. **Points to the current instruction that is being executed**
4. **Stores the data of the current instruction that is being executed.**

---

3. Question 3

Which of the following contains circuitry to carry out operations such as addition, multiplication etc?

1. **Control Unit**
2. **Memory Unit**
3. **ALU**
4. **Input/Output Unit**

---

4. Question 4

Which of the following statements are true?

- a. The ENIAC computer was built using mechanical relays.
- b. Harvard Mark1 computer was built using mechanical relays.
- c. PASCALINE computer could multiply and divide numbers by repeated addition and subtraction.
- d. Charles Babbage built his automatic computing engine in the 19th century.

1. **a and b**
2. **a, b, d**
3. **b and c**
4. **d and c**

---

5. Question 5

Which of the following generates the necessary signals required to execute an instruction in a computer?

1. **Arithmetic and Logic Unit**
2. **Memory Unit**
3. **Input/Output Unit**
4. **Control Unit**

---

6. Question 6

An instruction ADD R1, A is stored at memory location 4004H. R1 is a processor register and A is a memory location with address 400CH. Each instruction is 32-bit long. What will be the values of PC, IR and MAR during execution of the instruction?

1. **PC=4004H; IR=ADD R1, A; MAR=400CH**
2. **PC=4008H; IR=ADD; MAR=400CH**
3. **PC=4008H; IR=ADD R1, A; MAR=400CH**
4. **None of the above**

---

7. Question 7

Consider a 1 MB (Megabyte) byte-addressable memory system, with a word size of 32 bits. The number of bits in MAR and MDR will be:

1. **23, 32**
2. **20, 8**
3. **23, 32**
4. **20, 32**

---

8. Question 8

A computer has 2 GB (Gigabytes) of byte-addressable memory. The number of address lines will be 32.

1. **True**
2. **False**

---

9. Question 9

Which of the following statements are false for Harvard architecture?

1. **The program and data are stored in separate memory units.**
2. **The program and data are stored in the same memory.**
3. **The processor fetches instructions from program memory and accesses data from data memory.**
4. **The program memory and data memory can be built using different memory technologies.**

---

10. Question 10

For a 512 X 32 bit memory that contains 512 locations each with 32-bit data, what will be the address (in binary) of the 389th location? (Assume the first location as 0).

1. **110000100**
2. **110000101**
3. **011000100**
4. **011000101**

---

11. Question 11

Consider a 32-bit machine where an instruction (SUB R1, LOCA) is stored at location 2004H. LOCA is a memory location whose value is 1024H. The number of memory accesses required to execute this instruction will be?

1. **4**
2. **2**
3. **8**
4. **16**

---

12. Question 12

Which of the following statements are true for Moore's law?

1. **Moore's law predicts that power dissipation will double every 18 months.**
2. **Moore's law predicts that the number of transistors per chip will double every 18 months**
3. **Moore's law predicts that the speed of VLSI circuits will double every 18 months**
4. **None of the above**

1. Question 1

Consider the following statements for representing signed numbers using sign-magnitude, 1's complement, and 2's complement format:

(i) Sign of the number can be identified using MSB.

(ii) By flipping the sign bit we can obtain the number of its opposite sign.

Which of the following is correct?

1. **Only (i) is true**
2. **Only (ii) is true**
3. **Both (i) and (ii) are true**
4. **Both (i) and (ii) are false**

---

2. Question 2

The decimal number (35.5) base 10 in binary notation is?

1. **100011.101**
2. **100011.1**
3. **00110101.0101**
4. **110101.101**

---

3. Question 3

The number (110.101001) base 2 in hexadecimal notation is?

1. **6.A4**
2. **C.A4**
3. **6.29**
4. **C.29**

---

4. Question 4

The addressing mode that adds the displacement and the index register to get the effective address of the operand is called

1. **Indexed addressing**
2. **Base-Indexed addressing**
3. **Register Indirect Addressing**
4. **Relative addressing**

---

5. Question 5

Registers R1 and R2 contain data values 600 and 800 respectively in decimal, and the word length of the processor is 4 bytes. The effective address of the memory operand for the instruction “LOAD R5,10(R1,R2)” will be 1410.

1. **True**
2. **False**

---

6. Question 6

Which of the following statements are false for MIPS 32 register set?

- a. Register R0 always contains the value 0.
- b. Any register other than R0 can be used for register indirect addressing.
- c. Any register other than R0 can be used to hold the return address for function calls.
- d. R31 is a special register used as a stack pointer.

1. **a, c and d**
2. **c and d**
3. **d**
4. **a and b**

---

7. Question 7

The MIPS code for  $A = B+C$  where B is loaded in register \$S2 and C is loaded in register \$S3 is?

1. **ADD \$S1, \$S2, \$S3**
2. **ADDI \$S3, \$S2, \$S1**
3. **ADD \$S3, \$S2, \$S1**
4. **None of the above**

---

8. Question 8

/ 1

In MIPS32, to fetch a 32-bit word from memory in a single cycle, the word has to be stored starting at which memory location?

Hide answer choices

---

1.

**0019H**

---

2.

**001BH**

---

3.

**001AH**

---

4.

**0018H**

---

9. Question 9

/ 1

For the instruction STORE R1,35(R2) what will be the effective address of the memory operand if R2 is 200 (in decimal)?

Hide answer choices

---

1.

**35**

---

2.

**165**

---

3.

**235**

---

4.

---

**200**
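For displacement-style addressing, the effective address is the displacement plus the contents of the register(s) named in parentheses. A minimal sketch, assuming that interpretation of `35(R2)` and `10(R1,R2)`; the function name is hypothetical:

```python
def effective_address(displacement: int, *register_values: int) -> int:
    """Effective address = displacement + sum of the base/index register contents."""
    return displacement + sum(register_values)

# STORE R1,35(R2) with R2 = 200
print(effective_address(35, 200))        # prints 235

# LOAD R5,10(R1,R2) with R1 = 600 and R2 = 800 (base-indexed form from Question 5)
print(effective_address(10, 600, 800))   # prints 1410
```

The processor word length does not enter the calculation; it only affects how many bytes are transferred from that address.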

---

10. Question 10

/ 1

How do you represent -10 using 16-bit, 2's complement representation?

Hide answer choices

---

1.

---

**1000 0000 0001 0110**

---

2.

---

**0000 0000 0000 1010**

---

3.

---

**1111 1111 1111 0110**

---

4.

---

**0000 0000 0001 0110**
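A quick way to check a two's complement encoding is to mask the integer to the word width, which is equivalent to inverting the bits of the magnitude and adding 1. The sketch below assumes a helper name, `twos_complement`, that is not part of the quiz.

```python
def twos_complement(value: int, bits: int = 16) -> str:
    """Return the two's complement bit pattern of `value`, grouped in nibbles."""
    low, high = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    if not low <= value <= high:
        raise ValueError(f"{value} does not fit in {bits} bits")
    raw = format(value & ((1 << bits) - 1), f"0{bits}b")   # mask to the word width
    return " ".join(raw[i:i + 4] for i in range(0, bits, 4))

print(twos_complement(-10))  # prints 1111 1111 1111 0110
print(twos_complement(10))   # prints 0000 0000 0000 1010
```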

1. Question 1

/ 1

What kind of processor is x86?

Hide answer choices

---

1.

---

**RISC**

---

2.

---

**CISC**

---

3.

---

**DISC**

4.

**None of the above**

---

2. Question 2

/ 1

A processor having a clock cycle time of 5ns will have a clock rate of \_\_\_\_\_ MHz.

Hide answer choices

1.

**200**

2.

**245**

3.

**257**

4.

**303**

---

3. Question 3

/ 1

Which of the following statements is/are true with respect to Amdahl's law?

- a. It expresses the law of diminishing returns.
- b. It expresses the maximum speedup that can be achieved.
- c. It provides a measure to compare the execution times of two machines.
- d. All of these

Hide answer choices

1.

**a and c**

2.

**b and c**

---

3.

**d only**

---

4.

**a and b**

---

4. Question 4

/ 1

The total execution time of a typical program is made up of 60% of CPU time and 40% of I/O time. Which of the following alternatives is better? Assume that there is no overlap between CPU and I/O operations.

Hide answer choices

---

1.

**Increase the CPU speed by 50%.**

---

2.

**Reduce the I/O time by half**

---

3.

**Both alternatives give the same speedup**

---

4.

**None of these**
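The two alternatives can be compared with Amdahl's law, speedup = 1 / ((1 - f) + f / s), where f is the fraction of time affected and s is the factor by which that part gets faster. The sketch below assumes that "increase the CPU speed by 50%" means s = 1.5 and that "reduce the I/O time by half" means s = 2.

```python
def overall_speedup(fraction_enhanced: float, enhancement_factor: float) -> float:
    """Amdahl's law: overall speedup when only part of the run time is improved."""
    return 1.0 / ((1.0 - fraction_enhanced) + fraction_enhanced / enhancement_factor)

cpu_faster = overall_speedup(0.60, 1.5)   # 60% CPU time, CPU made 1.5x faster
io_halved = overall_speedup(0.40, 2.0)    # 40% I/O time, I/O made 2x faster

print(round(cpu_faster, 3), round(io_halved, 3))  # prints 1.25 1.25
```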

---

5. Question 5

/ 1

A machine requires 25 milliseconds to execute a program with 50 million instructions. What will be the MIPS rating of the machine?

Hide answer choices

---

1.

**20**

---

2.

**200**

---

3.

---

**2000**

---

4.

---

**20000**

---

6. Question 6

/ 1

A program with a total of 1000 instructions is running on a machine. The average cycles per instruction for the program is 1.5, and the execution time of the program is 1  $\mu$sec. The clock rate of the machine is \_\_\_\_ GHz.

Hide answer choices

---

1.

**1.5**

---

2.

---

**2**

---

3.

---

**1**

---

4.

---

**1.75**
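Questions 2, 5 and 6 all follow from the basic performance equation, execution time = instruction count × CPI × cycle time. A small sketch under that assumption (the function names are illustrative):

```python
def clock_rate_hz(instruction_count: float, cpi: float, exec_time_s: float) -> float:
    """Clock rate implied by: time = instruction_count * cpi / clock_rate."""
    return instruction_count * cpi / exec_time_s

def mips_rating(instruction_count: float, exec_time_s: float) -> float:
    """Millions of instructions executed per second."""
    return instruction_count / (exec_time_s * 1e6)

rate_from_cycle_mhz = 1 / 5e-9 / 1e6                 # 5 ns cycle time -> MHz
q6_rate_ghz = clock_rate_hz(1000, 1.5, 1e-6) / 1e9   # 1000 instr, CPI 1.5, 1 microsecond
q5_mips = mips_rating(50e6, 25e-3)                   # 50 million instr in 25 ms

print(round(rate_from_cycle_mhz), round(q6_rate_ghz, 2), round(q5_mips))
# prints 200 1.5 2000
```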

---

7. Question 7

/ 1

Suppose that a machine A executes a program with an average CPI of 3. Consider another machine B (with the same instruction set and a better compiler) that executes the same program with 10% fewer instructions and with a CPI of 1.5 at 1.5 GHz. The clock rate of A so that the two machines have the same performance is between 3.30 and 3.40 GHz.

**1. T**

---

True

---

2. F

---

False

---

8. Question 8

/ 1

Suppose that a machine X executes a program with an average CPI of 2.5. Consider another machine Y (with the same instruction set and a little better compiler) that executes the same program with 10% fewer instructions and with the CPI of 1.5 at 4GHz. What should be the clock frequency of X so that both the machines have the same performance?

Hide answer choices

---

1.

**7.40GHz**

---

2.

**7.40MHz**

---

3.

**7.40KHz**

---

4.

**7.40Hz**
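Equal performance means equal execution time, so IC × CPI_X / f_X = 0.9 × IC × CPI_Y / f_Y, and the instruction count IC cancels. The sketch below solves for f_X; the function and parameter names are assumptions made for illustration.

```python
def matching_clock(cpi_ref: float, cpi_other: float,
                   clock_other: float, instruction_ratio: float) -> float:
    """Clock rate the reference machine needs to match the other machine.

    instruction_ratio is (instructions on other machine) / (instructions on reference).
    """
    return cpi_ref * clock_other / (instruction_ratio * cpi_other)

# Machine X (CPI 2.5) versus machine Y (CPI 1.5, 4 GHz, 10% fewer instructions)
f_x = matching_clock(2.5, 1.5, 4.0, 0.9)
# Machine A (CPI 3) versus machine B (CPI 1.5, 1.5 GHz, 10% fewer instructions)
f_a = matching_clock(3.0, 1.5, 1.5, 0.9)

print(round(f_x, 2), round(f_a, 2))  # prints 7.41 3.33 (both in GHz)
```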

---

9. Question 9

/ 1

On which of the following does CPI depend?

Hide answer choices

---

1.

**Instruction Set Architecture**

---

2.

---

**CPU organization**

---

3.

**Compiler**

4.

**All of these**

---

10. Question 10

/ 1

Consider a program whose instruction count is 1,500, average CPI is 2, and clock cycle time is 1 nanosecond. Suppose we use a new compiler on the same program for which the new instruction count is 2,500, and the new CPI is 1.5, which is running on a faster machine with a clock cycle time of 0.5 nanoseconds. The speedup achieved will be?

Hide answer choices

1.

**1.66**

2.

**0.63**

3.

**1.63**

4.

**None of these**

---

1. Question 1

/ 1

Which of the following statements is/are correct?

Hide answer choices

1.

**MIPS instructions are 32 bits long.**

2.

**All instructions are stored in memory with address having the last two bits 00.**

3.

**Program Counter is decremented to 4 to point to the next instruction**

4.

**All of these**

---

2. Question 2

/ 1

The correct sequence in Fetch-Execute cycle is:

Hide answer choices

1.

**Decode, Fetch, Execute**

2.

**Fetch, Execute, Decode**

3.

**Fetch, Decode, Execute**

4.

**None of the above**

---

3. Question 3

/ 1

Which instruction does the following set of micro-operations refer to:

Steps Action

1 PCout, MARin, Read, Select4, Add, Zin

2 Zout, PCin, Yin, WMFC

3 MDRout, IRin

4 R1out, Yin

5 R2out, SelectY, Add, Zin  
6 Zout, R1in, End

Hide answer choices

1.

**ADD R2, R1**

2.

**ADD R1, R2**

3.

**MOVE R1, R2**

4.

**MOVE R2, R1**

4. Question 4



The Instruction Register (IR)

Hide answer choices

1.

**Holds the memory address of the next instruction**

2.

**Holds the memory address of the current instruction**

3.

**Holds the executed or decoded instruction.**

4.

**Holds the encoded instruction that is currently being executed or decoded.**

5. Question 5

Which of the following statements are true for vertical micro-instruction encoding?

Hide answer choices

1.

**If there are k control signals, every control word stored in control memory (CM) consists of k bits, one bit for every control signal.**

2.

**Sequential activation of at most one control signal in a single time step.**

3.

**Low cost of implementation.**

4.

**None of these**

6. Question 6

Which of the following statements is/are false?

Hide answer choices

1.

**Diagonal micro-instructions encoding requires multiple decoders.**

2.

**In vertical micro-instruction encoding, more than one control signal cannot be activated at a time.**

3.

**Horizontal micro-instructions encoding has a lower cost of implementation.**

4.

**None of these.**

---

7. Question 7

/ 1

Which of the following is true for MIPS32 register bank?

Hide answer choices

1.

**There is one read port and one write port.**

2.

**There is one read port and two write ports.**

3.

**There are two read ports and one write port.**

4.

**None of these**

---

8. Question 8

/ 1

In a single-bus system, how many steps will be required to complete the instruction MOVE R1,R2?

Hide answer choices

1.

**2**

2.

**3**

3.

**4**

4.

9. Question 9

/ 1

Which of the following is not true for a branch instruction in MIPS32?

Hide answer choices

1.

**The branch condition is computed in EX stage.**

2.

**The new PC value is loaded in the MEM stage.**

3.

**The target address is computed in ID stage.**

4.

**The WB stage is not required.**

10. Question 10

/ 1

Consider a hardwired control unit where each instruction of the machine requires a maximum of 50 steps to complete its execution. If the total number of such instructions is 129, what should be the sizes of the step decoder and the instruction decoder, respectively?

Hide answer choices

1.

**6x64; 7x128**

2.

**50x1; 129x1**

3.

---

**64x1, 256x1**

---

4.

**6x64, 8x256**
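A decoder with n input bits has 2^n outputs, so the size needed is set by the smallest n with 2^n at least as large as the number of steps (or instructions) to distinguish. A minimal sketch, with a hypothetical helper name:

```python
def decoder_size(required_outputs: int) -> tuple[int, int]:
    """Smallest n-to-2^n decoder that can distinguish `required_outputs` values."""
    inputs = (required_outputs - 1).bit_length()   # ceil(log2(required_outputs))
    return inputs, 2 ** inputs

step_decoder = decoder_size(50)          # at most 50 steps per instruction
instruction_decoder = decoder_size(129)  # 129 distinct instructions

print(step_decoder, instruction_decoder)  # prints (6, 64) (8, 256)
```

Note that 129 is just above 128, so a 7-to-128 decoder would be one output short.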

---

1. Question 1

/ 1

How many address and data lines will be there for a 16M x 32 memory system?

Hide answer choices

1.

**24 and 5**

2.

**20 and 32**

3.

**24 and 32**

4.

**None of the above**

---

2. Question 2

/ 1

Assume that a 1G x 1 DRAM memory cell array is organized as 1M rows and 1K columns. The number of address bits required to select a row and a column will be?

Hide answer choices

1.

**20 and 10**

2.

**30 and 1**

3.

---

**$2^{20}$ and $2^{10}$**

---

4. **None of the above**
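Both memory-organization questions above reduce to counting address bits: a memory with 2^k locations needs k address lines, and an "N x W" organization has W data lines. The sketch below assumes M = 2^20 and K = 2^10; the helper name is illustrative.

```python
def address_bits(locations: int) -> int:
    """Number of address bits needed to select one of `locations` locations."""
    return (locations - 1).bit_length()

K, M = 2 ** 10, 2 ** 20

# 16M x 32 memory: 16M addressable words of 32 bits each.
address_lines = address_bits(16 * M)
data_lines = 32

# 1G x 1 DRAM array organized as 1M rows by 1K columns.
row_bits = address_bits(1 * M)
column_bits = address_bits(1 * K)

print(address_lines, data_lines, row_bits, column_bits)  # prints 24 32 20 10
```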

---

3. Question 3

---

What is the function of the chip select line (CS') in a memory chip?

Hide answer choices

1. 

**Power supply is applied to the chip when CS' is activated.**

2. 

**The data bus is put in the high impedance state when CS' is deactivated.**

3. 

**It prevents two or more subsystems from using the memory simultaneously.**

4. **None of the above**

---

4. Question 4

---

Which of the following statement(s) is/are true?

Hide answer choices

1. 

**In memory, data are stored in the form of 0's and 1's.**

2. 

**Memory system stores data and instruction.**

3.

---

**Every bit of memory has a unique address.**

---

4.

**All of these**

---

5. Question 5

/ 1

Which of the following are true for Static RAM (SRAM)?

Hide answer choices

---

1.

**Power consumption is higher than Dynamic RAM (DRAM).**

---

2.

**Packing density is higher than Dynamic RAM (DRAM).**

---

3.

**It is faster than Dynamic RAM (DRAM).**

---

4.

**It is non-volatile.**

---

6. Question 6

/ 1

Which of the following statement(s) is/are false?

Hide answer choices

---

1.

**In volatile memory data is lost when power is switched off.**

---

2.

**Dynamic memory requires periodic refreshing.**

---

3.

---

**Magnetic tape does not allow random access of data.**

---

4.

**None of these.**

---

7. Question 7

/ 1

Which of the following is/are true for virtual memory systems?

Hide answer choices

---

1.

**It increases the size of the program that can be run**

---

2.

**It increases the size of the physical memory**

---

3.

**It increases the size of the secondary memory**

---

4.

**It improves the processor-memory bandwidth**

---

8. Question 8

/ 1

What is/are false for cache memory?

Hide answer choices

---

1.

**It consumes low power as compared to main memory.**

---

2.

**It is a type of non-volatile memory.**

---

3.

---

**It decreases the effective speed of the memory system.**

---

4.

**All of these.**

---

9. Question 9

/ 1

The total number of external connections required by an 8 x 8 memory will be 15

**1. T**

True

2. F

False

10. Question 10

/ 1

Which of the following statements is true for writing 1 in an SRAM chip?

Hide answer choices

1.

**The bit line b is set with 1, and bit line b' is set with 0.**

2.

**The bit line b is set with 0, and bit line b' is set with 1.**

3.

**The bit line b is set with 1, and bit line b' is set with 1.**

4.

**The bit line b is set with 0, and bit line b' is set with 0.**

Question 1

/ 1

For a hard disk rotating at a speed of 3600 rpm, find the value of the maximum rotational delay.

Hide answer choices

**6.31 millisec**

**8.30 millisec**

**16.60 millisec**

---

Question 2

---

/ 1

---

For a hard disk rotating at a speed of 9500 rpm, find the value of the average rotational delay.

---

Hide answer choices

---

**6.31 millisec**

**4.15 millisec**

**3.152 millisec**

**2.8 millisec**
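Rotational delay follows directly from the spindle speed. In the sketch below the maximum delay is taken to be one full revolution and the average delay half a revolution; these are the usual textbook assumptions, and the listed answer choices appear to be rounded.

```python
def rotational_delay_ms(rpm: float) -> tuple[float, float]:
    """Return (maximum, average) rotational delay in milliseconds."""
    full_revolution_ms = 60_000.0 / rpm    # 60,000 ms per minute / revolutions per minute
    return full_revolution_ms, full_revolution_ms / 2

max_3600, avg_3600 = rotational_delay_ms(3600)
max_9500, avg_9500 = rotational_delay_ms(9500)

print(round(max_3600, 2), round(avg_9500, 3))  # prints 16.67 3.158
```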

---

Question 3

---

/ 1

---

---

The smallest unit of data transfer is \_\_\_\_\_

---

Hide answer choices

---

**sector**

---

**track**

---

**platters**

---

Question 4

---

/ 1

---

The following factors are listed:

1. The time required to move the head to the desired track.
2. Average seek times are in the range 8 – 20 msec.
3. The total time to transfer a block of data (typically, a sector).

Which of these factors are related to seek time?

---

Hide answer choices

---

**1 only**

---

**1 and 2**

**2 and 3**

**all 1,2,3**

---

**Question 5**

---

**/ 1**

---

The data from a CPU need to be stored in a backup memory. Which is the best possible solution for the same

---

**Hide answer choices**

---

**SRAM**

**DRAM**

**Hard Disk**

**Volatile memory**

---

**Question 6**

---

**/ 1**

---

To read a bit from a FG transistor the necessary action to be performed is

---

**Hide answer choices**

---

**Apply a voltage on the control gate**

**Apply a voltage on the floating gate**

**Apply a voltage on the drain**

**Apply a voltage on the source**

Question 7

/ 1

The material in between control gate and floating gate is a

Hide answer choices

**Conductor**

**semi-conductor**

**Insulator**

**metal**

Question 8

/ 1

The start and stop bits are characterised for

[Hide answer choices](#)

---

**synchronous transmission**

**asynchronous transmission**

**clock driven transmission**

---

Question 9

---

**/ 1**

---

Separate decoders for memory and IO are required in

---

[Hide answer choices](#)

---

**IO mapped interfacing**

---

**Memory mapped Interfacing**

---

**direct interfacing**

---

Question 1

---

**/ 1**

---

Several program instructions have to be executed for each data word transferred between the I/O device and memory.

The above statement corresponds to which mode of data transfer?

Hide answer choices

DMA

**Programmed I/O**

**Fixed I/O interfacing**

Question 2

/ 1

Consider a programmed I/O system where 20 instructions are required to be executed for the transfer of each word of data. The cycles-per-instruction (CPI) of the machine is 1.5 and the processor clock frequency is 2 GHz. The maximum data transfer rate will be \_\_\_\_\_ million words per second. (Assume 1 million =  $10^6$)

Hide answer choices

**66.67**

**23.46**

**22.67**

**87.54**
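In programmed I/O every word costs a fixed number of instructions, so the upper bound on the transfer rate is the clock rate divided by the cycles spent per word. A minimal sketch of that arithmetic, with illustrative names:

```python
def max_transfer_rate_mwords(clock_hz: float, cpi: float,
                             instructions_per_word: int) -> float:
    """Upper bound on programmed-I/O transfer rate, in million words per second."""
    cycles_per_word = cpi * instructions_per_word
    return clock_hz / cycles_per_word / 1e6

rate = max_transfer_rate_mwords(clock_hz=2e9, cpi=1.5, instructions_per_word=20)
print(round(rate, 2))  # prints 66.67
```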

---

Question 3

---

**/ 1**

---

Which of the following statement(s) is/are true for DMA data transfer?

---

Hide answer choices

---

**Data transfer requires very less CPU intervention.**

**Not Suitable for transferring large blocks of data**

**No direct data transfer between I/O and memory.**

---

Question 4

---

**/ 1**

---

Which of the following registers needs to be initialized before transfer of any data in DMA mode?

---

Hide answer choices

---

**Memory Address**

**Word Count**

**Address of data on disk**

**All of the above**

---

Question 5

---

/ 1

---

How many keys in total are supported by a 4×4 matrix keyboard?

---

Hide answer choices

---

4

---

8

---

16

---

24

---

Question 6

---

/ 1

---

If a key is pressed in a keyboard, the corresponding row and column are

---

Hide answer choices

---

**grounded**

**raised to high voltage**

**raised to high impedance**

---

Question 7

---

/ 1

---

Suppose that it is required to transfer 20K bytes in interrupt-driven mode of data transfer. Every time an interrupt occurs, it involves the transfer of 64 bytes of data that takes 20 microseconds for the processor to service. The time required to transfer 40K bytes of data will be ..... milliseconds? (Assume 1K = 1024)

---

Hide answer choices

---

**5.25**

**6.25**

**12.5**

**32.56**

---

Question 8

---

/ 1

Which of the following buses is cheaper to implement?

Hide answer choices

**Parallel Bus**

**Serial Bus**

**High bus**

Question 9

/ 1

The maximum data transfer rates supported by USB 2.0 and USB 3.0 standards are respectively:

Hide answer choices

**480Mbps and 5Gbps**

**2 Gbps and 10 Gbps**

**12 Mbps and 5 Gbps**

**5 Gbps and 10 Gbps**

Question 10

---

/ 1

---

With Bridge-Based Bus Architectures

Hide answer choices

---

**Intel follows this kind of architecture.**

**Different buses can operate in parallel.**

**System includes a lot of buses that are segregated by bridges.**

**All of these**

---

Question 1

/ 10

In the non-pipelined version, the execution time of an instruction is equal to

Hide answer choices

**the combined delay of all stages**

**individual delay**

**multiplied delay**

---

**Question 2**

---

**/ 1**

---

Consider a 5-stage instruction pipeline with stage delays of 20 nsec, 25 nsec, 35 nsec, 30 nsec, and 22 nsec respectively. The delay of an inter-stage register stage of the pipeline is 2 nsec. The total time required for the execution of 1000 instructions will be ..... microseconds.

---

[Hide answer choices](#)

---

**59.14 microsec**

**37.14 microsec**

**220.19 microsec**

**322.14 microsec**

---

**Question 3**

---

**/ 1**

---

Consider a 3-stage instruction pipeline with stage delays of 25 nsec, 30 nsec and 15 nsec respectively, and the delay of an inter-stage register stage of 5 nsec. Suppose the pipeline is modified by splitting the 1st stage into two simpler stages with delays 10 nsec and 15 nsec, and the 2nd stage into two simpler stages with delays 15 nsec and 15 nsec. For the execution of 1000 instructions, the speedup of the new 5-stage pipeline over the previous 3-stage pipeline will be:

---

Hide answer choices

---

**1.746**

---

**20.08**

---

**35.07**

---

**25.67**
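The two pipeline timing questions above can be checked with the standard model: the clock period is the slowest stage delay plus the inter-stage register delay, and n instructions on a k-stage pipeline take (k + n - 1) clock periods. The sketch below assumes that model and no stalls; the function name is hypothetical.

```python
def pipeline_time_ns(stage_delays: list[float], register_delay: float,
                     n_instructions: int) -> float:
    """Total time for n instructions on an ideal (stall-free) pipeline."""
    cycle = max(stage_delays) + register_delay        # clock period in ns
    cycles = len(stage_delays) + n_instructions - 1   # fill time + one per instruction
    return cycles * cycle

# Question 2: five stages, 2 ns register delay, 1000 instructions
q2_time_us = pipeline_time_ns([20, 25, 35, 30, 22], 2, 1000) / 1000

# Question 3: original 3-stage pipeline versus the 5-stage split version
three_stage = pipeline_time_ns([25, 30, 15], 5, 1000)
five_stage = pipeline_time_ns([10, 15, 15, 15, 15], 5, 1000)
speedup = three_stage / five_stage

print(q2_time_us, round(speedup, 3))  # prints 37.148 1.747
```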

---

Question 4

---

**/ 1**

---

Since a new instruction is fetched every clock cycle, it is required to \_\_\_\_\_ the \_\_\_\_\_ on each clock.

---

Hide answer choices

---

**increment, PC**

**decrement,PC**

**increment,IP**

**DECREMENT,IP**

---

Question 5

---

**/ 1**

---

Identify the statement related to pipeline hazard

---

Hide answer choices

---

**operation occurs at full speed**

**prevent a pipeline from operating at its maximum possible clock speed.**

**operation suspends**

---

Question 6

---

**/ 1**

---

For the following MIPS32 program segment, how many stall cycles will be required .....?

1: LW R5, 200(R2)

2: ADD R1, R6, R8

3: SUB R3, R5, R8

---

Hide answer choices

---

2

---

1

---

0

---

4

---

Question 7

---

/ 1

---

Which of the following data hazards can cause performance degradation in the MIPS32 integer pipeline?

- a. WAR data hazard.
- b. WAW data hazard.
- c. RAW data hazard.

---

[Hide answer choices](#)

---

a

---

b

---

c

---

Question 8

---

/ 1

---

Consider the MIPS32 pipeline with ideal CPI of 1.5. Assume that 30% of all instructions executed are branches, out of which 90% are taken branches. The pipeline speedup for (i) predict taken and (ii) predict not taken approaches to reduce branch penalties will be approximately:

- a. 3.94, 3.85
- b. 4.34, 4.29
- c. 3.85, 4.34
- d. 3.85, 3.94

---

[Hide answer choices](#)

---

a

b

c

d

---

Question 9

---

/ 1

---

Which of the following require additional hardware?

---

Hide answer choices

---

**Bypassing**

---

**Concurrent access**

---

**splitting**

---

---

Question 10

---

/ 1

---

In bypassing, which of the following is not required?

- a. Requires multiplexers to select these additional paths.
- b. The control unit identifies data dependencies and selects the multiplexers in a suitable way.
- c. Additional data transfer paths are to be added in the data path.
- d. Splitting a clock cycle into two halves.

---

Hide answer choices

---

a

---

b

---

c

---

d

---

Question 1

/ 1

Loop unrolling requires a significantly greater number of registers.

**T**

True

**F**

False

Question 2

**/ 1**

Floating-point operations will require more than one cycle in the \_\_\_\_ stage.

Hide answer choices

**EX**

**FETCH**

**DEC**

Question 3

**/ 1**

The number of cycles between an instruction producing a result and another instruction using it is known as

[Hide answer choices](#)

---

Hazard

Latency

delay

Interval

---

Question 4

---

/ 1

---

S.D F4, 40(R5) is used for

---

[Hide answer choices](#)

---

**Storing from a floating-point register pair**

**Loading into a floating-point register pair**

---

Question 5

---

/ 1

---

In the floating-point extension of MIPS32, there are \_\_ floating-point registers.

---

Hide answer choices

---

4

8

16

32

---

Question 6

---

/ 1

---

There is no limit to the number of instructions that can be fetched and issued in every clock cycle.

---

T

True

F

False

Question 7

---

/ 1

---

Suppose the start-up time of vector multiply operation is 12 clock cycles. After start-up, the initiation rate is one per clock cycle. What will be the number of clock cycles required per result for a 64-element vector?

---

Hide answer choices

---

**2.13**

**1.19**

**3.21**

**7.56**
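With a start-up time of s cycles and one result initiated per cycle afterwards, an n-element vector operation takes s + n cycles in total, so the cost per result is (s + n) / n. A short sketch under that assumption:

```python
def cycles_per_result(startup_cycles: int, vector_length: int,
                      initiation_interval: int = 1) -> float:
    """Average clock cycles per result for one vector operation."""
    total_cycles = startup_cycles + vector_length * initiation_interval
    return total_cycles / vector_length

per_result = cycles_per_result(startup_cycles=12, vector_length=64)
print(per_result)  # prints 1.1875
```

The start-up overhead is amortized over the vector length, so longer vectors push the figure toward the initiation interval of 1.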

---

Question 8

---

/ 1

---

The start-up time (in clock cycles) will be equal to the

---

Hide answer choices

---

**pipeline stalls once per vector**

deeply parallel access.

**depth of the functional unit pipeline.**

**processor-memory speed gap**

---

Question 9

---

/ 1

---

Hyperthreading is

---

Hide answer choices

---

**Single processor appears to be two logical processors.**

**processor can be considered to be using micro-programming.**

**RISC architecture supposedly execute instructions faster than CISC.**

---

Question 10

---

/ 1

---

What is not true about Rapid Execution Engine?

---

- a. ALUs run at twice the processor frequency.
- b. Basic integer operations execute in 1/2 processor clock tick.
- c. Provides higher throughput and reduced latency of execution.
- d. Each logical processor has its own set of registers.

Hide answer choices

---

a

---

b

c

d

Question 1

/ 1

Decimal equivalent of 11101000 represented in 2's complement format will be \_\_\_\_\_?

Hide answer choices

-18

18

21

-21

Question 2

/ 1

---

The largest number that can be represented using 8-bit 1's complement representation will be \_\_\_\_\_

---

Hide answer choices

---

**127**

**256**

**128**

**212**

---

Question 3

---

/ 1

---

The minimum number of NAND gates required to design half adder is

---

Hide answer choices

---

3

4

5

6

---

Question 4

---

/ 1

---

For a Full adder Delay for Carry = \_\_\_\_\_ and Delay for Sum = \_\_\_\_\_

---

Hide answer choices

---

**2  $\delta$  and 2  $\delta$**

**2  $\delta$  and 3  $\delta$**

**3  $\delta$  and 2  $\delta$**

**3  $\delta$  and 3  $\delta$**

---

Question 5

---

/ 1

---

Carry Look Ahead Adder is

---

Hide answer choices

---

**Serial Adder**

---

**Parallel Adder**

---

**Spontaneous adder**

---

Question 6

---

/ 1

---

In a 5-bit carry look-ahead adder, suppose we are adding two numbers  $A = (1, 0, 1, 1, 0)$  and  $B = (1, 0, 1, 1, 1)$. The carry generate and carry propagate signals will be:

---

Hide answer choices

---

**G = (1, 0, 1, 1, 0) and P = (1, 0, 1, 1, 1)**

**G = (1, 0, 1, 1, 0) and P = (0, 0, 1, 0, 1)**

**G = (1, 0, 1, 1, 0) and P = (0, 0, 0, 0, 1)**

---

**G = (1, 0, 1, 1, 0) and P = (1, 0, 0, 0, 1)**
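The generate and propagate signals are computed bit by bit: G_i = A_i AND B_i, and P_i = A_i XOR B_i. Some texts define propagate with OR instead of XOR, so the sketch below computes both and labels the OR form as the alternative; the variable names are illustrative.

```python
A = (1, 0, 1, 1, 0)
B = (1, 0, 1, 1, 1)

# Generate: this bit position produces a carry regardless of the carry-in.
G = tuple(a & b for a, b in zip(A, B))

# Propagate (XOR definition): this position passes an incoming carry along.
P = tuple(a ^ b for a, b in zip(A, B))

# Alternative OR definition used in some textbooks.
P_or = tuple(a | b for a, b in zip(A, B))

print(G, P)  # prints (1, 0, 1, 1, 0) (0, 0, 0, 0, 1)
```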

---

Question 7

---

/ 1

---

Carry Select adder consists of \_\_

---

Hide answer choices

---

**two parallel adders**

**two serial adders**

**two selection adders**

---

Question 8

---

/ 1

---

Suppose we are multiplying (-9) x (12) using Booth's multiplier, where each number is represented in 5 bits. What will be the values of A (temporary register), and Q (multiplier) after the third step?

---

Hide answer choices

---

**00100, 00011**

**01001, 00011**

**00100, 10001**

---

Question 9

---

/ 1

---

For n-bit multiplication, number of additions = \_\_ and shift operations = \_\_

---

Hide answer choices

---

**n and n**

**2n and n**

**2n and 2n**

---

Question 10

---

**/ 1**

---

Booth's multiplier inspects three bits of the multiplier at every step

---

**T**

True

**F**

False

Question 1

**/ 1**

Shift left by 1 bit means \_\_\_\_\_ by 2

Hide answer choices

**Multiply**

**Divide**

**Add**

**Subtract**

---

Question 2

---

/ 1

---

The number of significant digits depends on the number of bits in\_\_

---

Hide answer choices

---

**Mantissa**

**Exponent**

---

Question 3

---

/ 1

---

IEEE-754 format supports \_\_ rounding modes

---

Hide answer choices

---

**3**

**4**

**5**

**6**

---

Question 4

---

**/ 1**

---

For multiplying F1 = 270.75 and F2 = -2.375, result will include  $2^{\text{_____}}$

---

Hide answer choices

---

**8**

**9**

**7**

---

Question 5

---

**/ 1**

---

The range of the number depends on the number of bits in \_\_\_\_\_

---

Hide answer choices

---

**Exponent**

**Mantissa**

---

Question 6

---

/ 1

---

$F = (-1)^s M \times 2^E$  represents a \_\_

---

Hide answer choices

---

**Floating point number**

**Fixed point number**

---

Question 7

---

/ 1

---

For addition of 2 numbers, we add the \_\_\_\_ values after shifting one of them right for exponent alignment.

---

Hide answer choices

---

**Mantissa**

**Exponent**

**Sign bit**

---

Question 8

---

/ 1

---

If the process of rounding generates a result that is not in normalized form, then we need to \_\_\_\_ the result.

---

Hide answer choices

---

**discard**

**retain**

**re-normalize**

---

Question 9

---

/ 1

---

For subtracting  $F_2 = 234$  from  $F_1 = 210.75$, shift the mantissa of  $F_2$  \_\_\_\_ position.

---

Hide answer choices

---

**right by 1**

**left by 1**

**right by 2**

**left by 2**

**Question 10**

**/ 1**

The number of columns in a reservation table represents\_\_

Hide answer choices

**evaluation time**

**generation time**

**delay time**

**Question 1**

**/ 1**

Which of the following statements is true with regard to memory?

Hide answer choices

**The faster, smaller memory are always closer to processor**

**The memory, that is farthest away is the costliest.**

**As we move away from the processor, speed increases.**

**None of the above**

---

**Question 2**

---

**/ 1**

---

Which of the following statements are false?

---

**Hide answer choices**

---

**Temporal locality arises because of loops in a program.**

**Spatial locality arises because of loops in a program.**

**Temporal locality arises because of sequential instruction execution.**

**Spatial locality arises because of sequential instruction execution.**

---

**Question 3**

---

**/ 1**

---

In a two-level cache system, the access times of L1 and L2 caches are 1 and 8 clock cycles respectively. The miss penalty from L2 cache to main memory is 18 clock cycles. The miss rate of L1 cache is twice that of L2. The average memory access time of the cache system is 2 cycles. The miss rates of L1 and L2 caches respectively are?

---

[Hide answer choices](#)

---

**0.130 and 0.065**

**0.056 and 0.111**

**0.0892 and 0.1784**

**0.1784 and 0.0892**
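One way to approach this question is to write the average memory access time as AMAT = t1 + m1 × (t2 + m2 × penalty) and substitute m1 = 2 × m2, which leaves a quadratic in m2. The sketch below solves that quadratic; it assumes local miss rates and this particular AMAT model, which the question does not state explicitly.

```python
from math import sqrt

def solve_miss_rates(t1: float, t2: float, penalty: float,
                     amat: float, ratio: float) -> tuple[float, float]:
    """Solve amat = t1 + m1*(t2 + m2*penalty) with m1 = ratio*m2.

    Returns (m1, m2), taking the positive root of the quadratic in m2.
    """
    a = ratio * penalty          # coefficient of m2^2
    b = ratio * t2               # coefficient of m2
    c = -(amat - t1)             # constant term
    m2 = (-b + sqrt(b * b - 4 * a * c)) / (2 * a)
    return ratio * m2, m2

m1, m2 = solve_miss_rates(t1=1, t2=8, penalty=18, amat=2, ratio=2)
print(round(m1, 3), round(m2, 3))  # prints 0.111 0.056
```

Substituting the result back gives 1 + 0.111 × (8 + 0.056 × 18) ≈ 2 cycles, which matches the stated average access time.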

---

Question 4

---

**/ 1**

---

The memory access time is 1 nsec for a read operation with a hit in cache, 5 nsec for a read operation with a miss in cache, 2 nsec for a write operation with a hit in cache, and 10 nsec for a write operation with a miss in cache. The execution of a sequence of instructions involves 100 instruction fetch operations, 60 memory operand read operations, and 40 memory operand write operations. The cache hit ratio is 0.9. The average memory access time (in nanoseconds) in executing the sequence of instructions is?

---

[Hide answer choices](#)

---

**1.26**

**1.68**

**2.46**

**4.52**

---

Question 5

---

/ 1

---

Assume that a read request takes 50 nsec on a cache miss and 5 nsec on a cache hit. While running a program, it is observed that 80% of the processor's read requests result in a cache hit. The average read access time is **14nsec**
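Questions 4 and 5 are both weighted averages of hit and miss times. The sketch below assumes that instruction fetches count as reads and that the same hit ratio applies to reads and writes; the helper name is hypothetical.

```python
def average_time(hit_ratio: float, hit_time: float, miss_time: float) -> float:
    """Average access time given separate hit and miss times."""
    return hit_ratio * hit_time + (1 - hit_ratio) * miss_time

# Question 5: 80% hits at 5 ns, misses at 50 ns
q5_avg = average_time(0.8, 5, 50)

# Question 4: 100 instruction fetches + 60 operand reads, 40 operand writes
reads, writes = 100 + 60, 40
read_avg = average_time(0.9, 1, 5)     # ns per read
write_avg = average_time(0.9, 2, 10)   # ns per write
q4_avg = (reads * read_avg + writes * write_avg) / (reads + writes)

print(round(q5_avg, 2), round(q4_avg, 2))  # prints 14.0 1.68
```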

---

Question 6

---

/ 1

---

A cache memory system with capacity of N words and block size of B words is to be designed. If it is designed as a direct mapped cache, the length of the TAG field is 10 bits. If it is designed as a 16-way set associative cache, the length of the TAG field will be how many bits?

---

Hide answer choices

---

**19**

---

**17**

---

**16**

---

**13**

---

Question 7

---

/ 1

---

Which of the following statements is true?

---

[Hide answer choices](#)

---

**The implementation of direct mapping technique for cache requires expensive hardware to carry out division.**

**The set associative mapping requires associative memory for implementation.**

**A main memory block can be placed in any of the sets in set associative mapping.**

**None of the above.**

---

**Question 8**

---

**/ 1**

---

Consider a direct-mapped cache with 64 blocks and a block size of 16 bytes. Byte address 1200 will map to block number 11 of the cache.

---

**T**

---

**True**

**F**

---

**False**
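In a direct-mapped cache, the cache block index is (byte address ÷ block size) modulo the number of cache blocks. A minimal sketch of that mapping, with an illustrative function name:

```python
def direct_mapped_block(byte_address: int, block_size: int, num_blocks: int) -> int:
    """Cache block index that a byte address maps to in a direct-mapped cache."""
    memory_block = byte_address // block_size   # which memory block holds the byte
    return memory_block % num_blocks            # where that block lands in the cache

index = direct_mapped_block(byte_address=1200, block_size=16, num_blocks=64)
print(index)  # prints 11
```

Here 1200 // 16 = 75 and 75 mod 64 = 11.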

---

**Question 9**

---

**/ 1**

---

How can the cache miss rate be reduced?

---

[Hide answer choices](#)

---

**By using larger block size**

**By using larger cache size**

**By reducing the cache associativity**

**None of the above**

---

Question 10

---

/ 1

---

Consider a two-level memory hierarchy with separate instruction and data caches in level 1, and main memory in level 2. The clock cycle time is 1 ns. The miss penalty is 20 clock cycles for both read and write. 2% of the instructions are not found in I-cache, and 10% of data references not found in D-cache. 25% of the total memory accesses are for data, and cache access time (including hit detection) is 1 clock cycle. The average access time of the memory hierarchy will be which of the following nanoseconds?

---

Hide answer choices

---

**1.76 ns**

**1.68 ns**

**1.88 ns**

**1.86 ns**



---

**Course Name: Computer Architecture and Organization**

**Questions Solutions**

**TYPE OF QUESTION: MCQ/MSQ/SA**

---


## Credits

**LINKED HANDLES :**

- <https://www.linkedin.com/in/aarish-ejaz-22b158200>
- <https://www.linkedin.com/in/piyush-rawat-a38335200>
- <https://www.linkedin.com/in/akash-bulla-211843204/>

**Instagram Handles :**

- <https://www.instagram.com/piyush_rawat.09/>
- <https://www.instagram.com/program_1623/>
- <https://www.instagram.com/_sky.fire_/>



---

**Course Name: Computer Architecture and Organization**

**Assignment- Week 1**

**TYPE OF QUESTION: MCQ/MSQ/SA**

**Number of questions: 12**

**Total mark: 12 x 1 = 12**

---

**QUESTION 1:**

Which of the following is/are false?

- a. Processor can directly access data from secondary memory.
- b. Primary memory is used as backup memory.
- c. Primary memory stores the active instructions and data for the program being executed on the processor.
- d. Primary memory can store only instructions.

**Correct Answer: a, b and d**

**Detailed Solution:**

The processor has access to primary memory, which stores the data and instructions for the program being executed. Secondary memory is used as backup memory. Primary memory can store both instructions and data.

Thus options (a), (b) and (d) are false.

---

**QUESTION 2:**

Program counter:

- a. Counts the total number of instructions present in a program.
- b. Points to the current instruction that is being executed.
- c. Points to the next instruction that is to be executed.
- d. Stores the data of the current instruction that is being executed.

**Correct Answer: c**

**Detailed Solution:**




Program counter is used to point to the next instruction that is to be executed by the processor.

Thus option (c) is correct.

---

**QUESTION 3:**

Which of the following is/are false?

- a. Central Processing Unit (CPU) consists of Control Unit, Arithmetic Logic Unit (ALU) and Primary Memory.
- b. There are broadly two types of memory, primary memory and secondary memory.
- c. The arithmetic and logic operations are performed in the control unit.
- d. Control Unit is a part of main memory.

**Correct Answer: a, c, d**

**Detailed Solution:**

CPU consists of ALU and Control unit. Control unit is responsible for generating the required control signals for execution of instructions and ALU performs arithmetic and logic operations.

Thus options (a), (c) and (d) are false.

---

**QUESTION 4:**

Which of the following contains circuitry to carry out operations such as addition, multiplication etc?

- a. Control Unit
- b. Memory Unit
- c. Input/output Unit
- d. None of these.



**Correct Answer: d**

**Detailed Solution:**

In a computer system, computations are performed in the Arithmetic and Logic Unit (ALU), so none of the listed options is correct.

---

**QUESTION 5:**

Two registers are initialized as R1 = 30 and R2 = 25. The instruction ADD R1, R2 is in memory location 2018H. If the size of an instruction is 4 bytes, then after the execution of the instruction the values of PC, R1 and R2 will be:

- a. PC = 2018H, R1 = 55, R2 =25
- b. PC = 2018H, R1 = 55, R2 =00
- c. PC = 201CH, R1 = 55, R2 =00
- d. PC = 201CH, R1 = 55, R2 =25

**Correct Answer: d**

**Detailed Solution:**

ADD R1, R2 is equivalent to  $R1 = R1 + R2$. After the instruction executes, R1 will contain 55 and the value of R2 will remain unchanged. Since the instruction is 4 bytes long, the PC will point to the next instruction at 2018H + 4 = 201CH.

Thus option (d) is correct.
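The register and PC update can be checked with a minimal sketch. The helper `execute_add` is hypothetical and models only the two effects discussed in the solution (PC advance and the ADD semantics), not a real processor.

```python
# Hypothetical helper: models only the PC update and the ADD semantics.
def execute_add(pc, r1, r2, instr_size=4):
    """Return (next_pc, r1, r2) after ADD R1, R2 is fetched from address pc."""
    next_pc = pc + instr_size   # PC advances past the fetched instruction
    r1 = r1 + r2                # destination R1 receives the sum
    return next_pc, r1, r2      # source R2 is left untouched

pc, r1, r2 = execute_add(0x2018, 30, 25)
print(f"PC = {pc:X}H, R1 = {r1}, R2 = {r2}")  # PC = 201CH, R1 = 55, R2 = 25
```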

---

**QUESTION 6:**

Consider a 32-bit machine where an instruction (SUB R1, LOCA) is stored at location 2004H. LOCA is a memory location whose value is 1024H. The number of memory accesses required to execute this instruction will be .....?

**Correct Answer: 2**

**Detailed Solution:**

Initially, the instruction is stored at location 2004H and hence the PC contains 2004H. After the instruction is fetched, the PC points to the next instruction. As this is a 32-bit machine, each instruction occupies 4 bytes, so the PC is incremented by 4. The steps carried out to execute this instruction are given below:

MAR  $\leftarrow$  PC // (2004H)

PC  $\leftarrow$  PC + 4 // PC = 2004H + 4 = 2008H

MDR  $\leftarrow$  Mem[MAR] // (SUB R1, LOCA) #Memory Access 1

IR  $\leftarrow$  MDR // (SUB R1, LOCA)

MAR  $\leftarrow$  IR[Operand] // (LOCA)

MDR  $\leftarrow$  Mem[MAR] // (Content of LOCA) #Memory Access 2

R1  $\leftarrow$  R1 - MDR

Therefore, we have two memory accesses.

The correct answer will be 2.
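The register-transfer steps above can be traced with a toy model that counts memory reads. This is only a sketch: the class `CountingMemory`, the address chosen for LOCA, and the initial value of R1 are assumptions made for illustration, since the question does not specify them.

```python
class CountingMemory:
    """Toy word store that counts every read (not a model of a real machine)."""

    def __init__(self, contents):
        self.contents = dict(contents)
        self.reads = 0

    def read(self, address):
        self.reads += 1
        return self.contents[address]


LOCA = 0x3000  # assumed address of LOCA; the question gives only its content
mem = CountingMemory({0x2004: ("SUB", "R1", LOCA), LOCA: 0x1024})

pc, r1 = 0x2004, 0x2000   # the initial value of R1 is an arbitrary assumption
mar = pc                  # MAR <- PC
pc += 4                   # PC <- PC + 4
mdr = mem.read(mar)       # memory access 1: instruction fetch
ir = mdr                  # IR <- MDR
mar = ir[2]               # MAR <- IR[Operand]
mdr = mem.read(mar)       # memory access 2: operand fetch
r1 = r1 - mdr             # R1 <- R1 - MDR

print(mem.reads, f"{pc:X}H")  # 2 2008H
```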

---

**QUESTION 7:**

Consider a 32-bit machine where an instruction (ADD R1, R2) is stored at memory location 2004H (in hexadecimal). What will be the values of IR and PC while the instruction is fetched and executed? Assume each instruction is 32 bits long.

- a. IR = ADD R1, R2, PC = 2004H
- b. IR = 2004H, PC = ADD R1, R2
- c. IR = ADD R1, R2, PC = 2008H
- d. IR = 2008H, PC = ADD R1, R2

**Correct Answer: c**

**Detailed Solution:** Initially, the instruction is stored at location 2004H and hence the PC contains 2004H. The steps carried out to execute this instruction are given below:

MAR  $\leftarrow$  PC // MAR = 2004

PC  $\leftarrow$  PC + 4 // PC = 2004 + 4 = 2008



MDR  $\leftarrow$  Mem[MAR] // MDR = ADD R1, R2 which is the content of location 2004.

IR  $\leftarrow$  MDR // IR = ADD R1, R2

R1  $\leftarrow$  R1 + R2

Therefore IR = ADD R1, R2, and PC = 2008H

Thus correct option will be (c).

---

**QUESTION 8:**

For a 512 X 32 bit memory that contains 512 locations each with 32-bit data, what will be the address (in binary) of the 389<sup>th</sup> location? (Assume first location as 0)

- a. 110000100
- b. 110000101
- c. 011000100
- d. 011000101

**Correct Answer: a**

**Detailed Solution:** The first location in memory has address 0, the second location has address 1, and so on. Thus, the 389<sup>th</sup> location has address 388, which is 110000100 in binary.

The correct option is (a).
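As a quick check of the conversion, the following sketch formats the zero-based address to the width needed for 512 locations. The function name `location_address` is made up for this illustration.

```python
def location_address(position, num_locations):
    """Binary address of the position-th location (1-based); first address is 0."""
    width = (num_locations - 1).bit_length()   # 512 locations need 9 address bits
    return format(position - 1, f"0{width}b")

print(location_address(389, 512))  # 110000100
```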

---

**QUESTION 9:**

Consider the instruction XOR R3, R2. If registers R3 and R2 contain the values 09H and 47H respectively, what will be the value of R3 after executing the instruction?

- a. 4E
- b. 2E
- c. 3E
- d. 5E

**Correct Answer: a**

**Detailed Solution:**



XOR R3, R2

R3: 0000 1001 (09)

R2: 0100 0111 (47)

R3: 0100 1110 (4E)

The correct option is (a).
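The bitwise result can be reproduced with a short sketch that, like the worked solution, takes R3 = 09H and R2 = 47H as the operands.

```python
r3, r2 = 0x09, 0x47   # operand values used in the worked solution
r3 ^= r2              # XOR R3, R2: the result is written back to R3
print(f"{r3:08b} = {r3:02X}H")  # 01001110 = 4EH
```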

---

**QUESTION 10:**

Consider a byte-addressable computer with 4 Gigabytes of memory. If each word in the computer is 64 bits, how many bits are needed to address a single word?

- a. 29
- b. 30
- c. 31
- d. 32

**Correct Answer: a**

**Detailed Solution:**

Address Space = 4GB =  $4 \times 2^{30}$  B =  $2^{32}$  B

1 word = 64 bits = 8 B

We have  $2^{32}/8 = 2^{29}$  words

Thus, we require 29 bits to address each word.

The correct option is (a).
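The same count can be sketched in a few lines. The helper `word_address_bits` is hypothetical and assumes the word size is a whole number of bytes.

```python
def word_address_bits(memory_bytes, word_bits):
    """Bits needed to give every word in memory its own address."""
    words = memory_bytes // (word_bits // 8)   # 2**32 B / 8 B = 2**29 words
    return (words - 1).bit_length()

print(word_address_bits(4 * 2**30, 64))  # 29
```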

---

**QUESTION 11:**

Consider the following statement and answer:

- (i) In Von-Neumann architecture, instruction and data are stored in same memory module.
- (ii) In Von-Neumann architecture, instruction and data access can be performed parallelly.

- a. Only (i) is true.
- b. Only (ii) is true.
- c. Both (i) and (ii) are true.
- d. Both (i) and (ii) are false.



**Correct Answer: a**

**Detailed Solution:**

In Von-Neumann architecture, instructions and data are stored in the same memory, so it is not possible to access both at the same time. Hence only statement (i) is true.

The correct option is (a).

---

**QUESTION 12:**

Which of the following statement(s) is/are true?

- a. Nibble: A collection of 4 bits
- b. Word: Byte/Multiple of Bytes
- c. Byte: A collection of 8-bit
- d. All of these

**Correct Answer: d**

**Detailed Solution:**

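A bit is a single binary digit. A nibble is a collection of 4 bits, a byte is a collection of 8 bits, and a word is a byte or a multiple of bytes, so statements (a), (b) and (c) are all true.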

Thus correct option is (d).

---

\*\*\*\*\*END\*\*\*\*\*



---

**Course Name: Computer Architecture and Organization**

**Assignment- Week 2**

**TYPE OF QUESTION: MCQ/MSQ/SA**

**Number of questions: 10**

**Total mark: 10 x 1 = 10**

---

**QUESTION 1:**

Which of the following statement(s) is/are true for the **radix** of a positional number system?

- a. It represents number of unique digits, used to represent numbers.
- b. It represents number of binary digits, used to represent single digit of any number system.
- c. Radix of Hexadecimal number system is 4.
- d. Radix of Hexadecimal number system is 16.

**Correct Answer: a, d**

**Detailed Solution:**

Radix or base represents the number of unique digits available in the number system.

The hexadecimal number system has 16 unique digits (0-9 and A-F).

The correct options are (a) and (d).

---

**QUESTION 2:**

What will be binary representation of  $(3.6E)_{16}$ ?

- a. 0011 . 0110 1111
- b. 0011 . 0110 1110
- c. 11 . 0110 111
- d. None of these

**Correct Answer: b, c**

**Detailed Solution:**

The conversion can be done digit by digit, replacing each hexadecimal digit with its 4-bit binary equivalent:

3 = 0011

.6E = .0110 1110

Hence,  $(3.6E)_{16} = (0011.0110\ 1110)_2$. Leading and trailing 0's can be removed without changing the value, so another valid representation is  $(11.0110\ 111)_2$.

The correct options are (b) and (c).
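The digit-by-digit expansion can be sketched as follows. The function `hex_to_binary` is a hypothetical helper written for this example; it keeps the radix point and groups the bits in fours.

```python
def hex_to_binary(hex_number):
    """Expand each hexadecimal digit into its 4-bit group, keeping the point."""
    groups = []
    for ch in hex_number:
        groups.append("." if ch == "." else format(int(ch, 16), "04b"))
    return " ".join(groups).replace(" . ", ".")

full = hex_to_binary("3.6E")
trimmed = full.lstrip("0").rstrip("0")   # drop leading and trailing zeros
print(full)     # 0011.0110 1110
print(trimmed)  # 11.0110 111
```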

---

### **QUESTION 3:**

What is the largest number that can be represented using 10-bit 2's complement representation -----?

**Correct Answer: 511**

**Detailed Solution:**

The range of n-bit 2's complement representation is given by  $-(2^{n-1})$  to  $+(2^{n-1} - 1)$ . So, for n = 10 the largest number that can be represented will be  $+(2^{10-1} - 1) = 511$ .
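The range formula can be evaluated directly; the function name below is made up for this sketch.

```python
def twos_complement_range(n):
    """Smallest and largest values representable in n-bit 2's complement."""
    return -(2 ** (n - 1)), 2 ** (n - 1) - 1

print(twos_complement_range(10))  # (-512, 511)
```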

---

### **QUESTION 4:**

Consider the following statements for representing signed numbers using sign magnitude, 1's complement and 2's complement format:

- (i) Sign of the number can be identified using MSB.
- (ii) By flipping the sign bit we can obtain the number of its opposite sign.

Which of the following is correct?

- a. Only (i) is true
- b. Only (ii) is true
- c. Both (i) and (ii) are true
- d. Both (i) and (ii) are false

**Correct Answer: a**

**Detailed Solution:**



In all three representations the sign of the number can be identified from the MSB. However, flipping the sign bit yields the number of opposite sign only in sign-magnitude representation, not in 1's or 2's complement representation.

Thus option (a) is correct.

---

#### **QUESTION 5:**

Which of the following addressing modes does not require any memory access for fetching the operands?

- a. Direct Addressing
- b. Immediate Addressing
- c. Register Indirect
- d. Register Addressing
- e. None of these

**Correct Answer: b and d**

#### **Detailed Solution:**

Immediate addressing does not require any memory access, as the data is available as an operand in the instruction itself. Register addressing also does not require any memory access, as the operands are held in registers.

Thus option (b) and (d) are correct.

---

#### **QUESTION 6:**

How do you represent -10 using 16-bit, 2's complement representation?

- a. 1000 0000 0001 0110
- b. 0000 0000 0000 1010
- c. 1111 1111 1111 0110
- d. 0000 0000 0001 0110

**Correct Answer: c**

#### **Detailed Solution:**



First, the representation of +10 in binary is 01010.

In 16 bits, +10 will be 0000 0000 0000 1010.

$$\begin{array}{rcl} -10 & = & \text{1's complement of } 0000\ 0000\ 0000\ 1010 + 1 \\ & = & 1111\ 1111\ 1111\ 0101 + 1 \\ & = & 1111\ 1111\ 1111\ 0110 \end{array}$$

The correct option is (c).
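The invert-and-add-one procedure can be sketched as below. Both helper names are hypothetical and exist only for this illustration.

```python
def negate_twos_complement(value, bits=16):
    """Bit pattern of -value: flip every bit, then add 1, within the given width."""
    mask = (1 << bits) - 1
    ones_complement = ~value & mask        # 1's complement of the pattern
    return (ones_complement + 1) & mask    # adding 1 gives the 2's complement

def grouped_bits(pattern, bits=16):
    """Format a bit pattern in groups of four, as in the solution."""
    text = format(pattern, f"0{bits}b")
    return " ".join(text[i:i + 4] for i in range(0, bits, 4))

print(grouped_bits(negate_twos_complement(10)))  # 1111 1111 1111 0110
```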

---

### **QUESTION 7:**

For the instruction **STORE R1 , 35 (R2)** what will be effective address of the memory operand if R2 is 200 (in decimal)?

- a. 35
- b. 165
- c. 200
- d. None of these

**Correct Answer: d**

#### **Detailed Solution:**

This instruction uses indexed addressing mode. The operand 35 specifies an offset (displacement), which is added to the content of the index register (R2) to get the effective address. The effective address will be  $200 + 35 = 235$, which is not among options (a) to (c).

Thus correct option is (d).

---

### **QUESTION 8:**

Which of the following statement(s) is/are true for CISC architecture?

- a. Supports large number of addressing modes.
- b. It does not support variable-length instruction.
- c. Pipeline implementation of CISC architecture is complex.
- d. Only load and store instruction can access memory

**Correct Answer: a, c**



**Detailed Solution:**

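Refer to lecture 8, slides 2-5, for the properties of CISC and RISC. CISC supports variable-length instructions and allows instructions other than load and store to access memory, so (b) and (d) are false.

Both options (a) and (c) are true for CISC architecture.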

---

**QUESTION 9:**

Which of the following instruction(s) is/are invalid for a MIPS32 processor if \$s0 and \$s1 contain the addresses of some variables (say A and B)?

- a. add \$t0, \$t1, 20(\$s0)
- b. lw \$t0, 40(\$s0)
- c. add \$t0, \$t0, \$t1
- d. None of these

**Correct Answer: a**

**Detailed Answer:**

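The instruction add \$t0, \$t1, 20(\$s0) uses a memory operand, 20(\$s0), as a source. This is not valid in MIPS32, where arithmetic instructions such as add operate only on registers and memory is accessed only through load and store instructions such as lw and sw.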

The correct option is (a).

---

**QUESTION 10:**

Consider the following MIPS instruction:

**slt \$t0, \$s0, \$s1**

What does the instruction do if \$s0 and \$s1 are loaded with some data?

- a. Set \$t0 = 1 if \$s0 < \$s1
- b. Set \$t0 = 0 if \$s0 > \$s1
- c. Set \$t0 = 1 if \$s0 > \$s1
- d. Set \$t0 = 0 if \$s0 < \$s1

**Correct Answer: a, b**

**Detailed Solution:**

The instruction will set \$t0=1 if \$s0 < \$s1 else it will set \$t0=0.



NPTEL Online Certification Courses  
Indian Institute of Technology Kharagpur



---

Hence, the correct options are (a) and (b).

---

\*\*\*\*\*END\*\*\*\*\*



**Course Name: Computer Architecture and Organization**

**Assignment- Week 3**

**TYPE OF QUESTION: MCQ/MSQ/SA**

**Number of questions: 10**

**Total mark:  $10 \times 1 = 10$**

---

**QUESTION 1:**

A processor having clock cycle time of 5ns, will have a clock rate of \_\_\_\_\_ MHz.

**Correct Answer: 200**

**Detailed Answer:**

Frequency is given by the formula  $f = 1 / (\text{Cycle time})$ .

Hence, frequency =  $1 / 5\text{ns} = 1 / (5 \times 10^{-9}) = 2 \times 10^8 \text{ Hz} = 200 \text{ MHz}$

---

**QUESTION 2:**

Suppose a program requires 1000 instructions to execute. The average number of cycles per instruction (CPI) is 1.5. The clock frequency of the machine is 2.0 GHz. Time required to execute the program will be \_\_\_\_\_ nanoseconds.

**Correct Answer: 750**

**Detailed Answer:**

Execution time = IC x CPI x T

IC = Instruction Count = 1000

CPI = Cycles Per Instructions = 1.5

T = Clock cycle time =  $1/f = 1/(2 \times 10^9) = 0.5 \times 10^{-9} \text{ s} = 0.5 \text{ ns}$

Execution Time = IC x CPI x T =  $1000 \times 1.5 \times 0.5 \text{ ns} = 750 \text{ nanoseconds.}$
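The execution-time formula can be written as a small sketch; the function name and the choice to return nanoseconds are assumptions made for illustration.

```python
def execution_time_ns(instruction_count, cpi, clock_hz):
    """Execution time = IC x CPI x T, with T = 1 / f expressed in nanoseconds."""
    cycle_time_ns = 1e9 / clock_hz
    return instruction_count * cpi * cycle_time_ns

print(execution_time_ns(1000, 1.5, 2.0e9))  # 750.0
```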

---

**QUESTION 3:**

Consider a program whose instruction count is 1,500, average CPI is 2, and clock cycle time is 1 nanosecond. Suppose we use a new compiler on the same program for which the new instruction count is 2,500, and new CPI is 1.5, which is running on a faster machine with clock cycle time of 0.5 nanosecond. The speedup achieved will be:



- a. 1.66
- b. 0.63
- c. 1.60
- d. 1.33

**Correct Answer: c**

**Detailed Answer:**

$$XT \text{ for compiler\_1} = 1500 \times 2 \times 1 = 3000 \text{ ns}$$

$$XT \text{ for compiler\_2} = 2500 \times 1.5 \times 0.5 = 1875 \text{ ns}$$

$$\text{Speedup} = XT \text{ for compiler\_1} / XT \text{ for compiler\_2} = 3000 / 1875 = 1.60$$

The correct option is (c).

---

#### **QUESTION 4:**

On which of the following does CPI depend?

- a. Instruction Set Architecture
- b. Compiler
- c. CPU organization
- d. All of these

**Correct Answer: d**

**Detailed Answer:**

CPI depends on ISA, compiler as well as the CPU organization.

The correct option is (d).

---

#### **QUESTION 5:**

Consider the following statements:

- (i) RISC architecture increases number of instructions per program.
- (ii) RISC architecture increases CPI and clock cycle time.

Which of the following is correct?

- a. Only (i) is true
- b. Only (ii) is true
- c. Both (i) and (ii) are true
- d. Both (i) and (ii) are false



**Correct Answer: a**

**Detailed Answer:**

RISC architecture uses simple instructions, which can lead to a larger number of instructions per program. However, most RISC instructions execute in a single cycle, which reduces CPI rather than increasing it, so statement (ii) is false.

The correct option is (a).

---

**QUESTION 6:**

Suppose that a machine X executes a program with an average CPI of 2.5. Consider another machine Y (with the same instruction set and a slightly better compiler) that executes the same program with 10% fewer instructions and with a CPI of 1.5 at 4GHz. What should be the clock frequency of X so that both machines have the same performance?

- a. 7.40GHz
- b. 7.40MHz
- c. 7.40KHz
- d. 7.40Hz

**Correct Answer: a**

**Detailed Solution:**

Given,  $CPI_X = 2.5$ ,

$$CPI_Y = 1.5, f_Y = 4\text{GHz}, T_Y = 1 / f_Y = 0.25\text{ns}$$

$$IC_Y = IC_X - IC_X \times 0.1 = 0.9 IC_X$$

If both machines have the same performance, then their execution times (XT) must be equal:

$$XT_X = XT_Y$$

$$IC_X \times CPI_X \times T_X = IC_Y \times CPI_Y \times T_Y$$

$$IC_X \times 2.5 \times T_X = 0.9 IC_X \times 1.5 \times 0.25\text{ns}$$

$$T_X = (0.9 \times 1.5 \times 0.25\text{ns}) / 2.5 = 0.3375\text{ns} / 2.5 = 0.135\text{ns}$$

$$f_X = 1 / T_X = 1 / 0.135\text{ns} = 7.40 \text{ GHz}$$

The correct option is (a).
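The derivation can be cross-checked numerically. The sketch below solves $IC_X \times CPI_X / f_X = IC_Y \times CPI_Y / f_Y$ for $f_X$; the function and parameter names are hypothetical.

```python
def matching_frequency_ghz(cpi_x, cpi_y, f_y_ghz, ic_y_over_ic_x):
    """Clock frequency of X (in GHz) at which X and Y have equal execution time."""
    # IC_X * CPI_X / f_X = IC_Y * CPI_Y / f_Y
    #   =>  f_X = f_Y * CPI_X / ((IC_Y / IC_X) * CPI_Y)
    return f_y_ghz * cpi_x / (ic_y_over_ic_x * cpi_y)

f_x = matching_frequency_ghz(cpi_x=2.5, cpi_y=1.5, f_y_ghz=4.0, ic_y_over_ic_x=0.9)
print(f"{f_x:.3f} GHz")  # 7.407 GHz
```

The value is about 7.407 GHz, which the answer options list truncated as 7.40 GHz.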

---

**QUESTION 7:**



Suppose for a CISC ISA implementation, there are four instruction types LOAD, STORE, ALU and BRANCH with relative frequencies of 25%, 25%, 40% and 10% respectively, and CPI values of 3, 2.5, 1 and 6 respectively. The overall CPI will be \_\_\_\_\_. (Provide answer up-to 2 decimal places)

**Correct Answer: 2.37 to 2.38**

**Detailed Solution:**

$$\text{CPI} = \sum(F_i \times \text{CPI}_i) = 0.25 \times 3 + 0.25 \times 2.5 + 0.40 \times 1 + 0.10 \times 6 = 0.75 + 0.625 + .40 + .60 = 2.375.$$
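The weighted sum can be reproduced with a short sketch; the dictionary layout is simply one convenient way to hold the instruction mix.

```python
# Instruction mix from the question: type -> (relative frequency, CPI)
mix = {
    "LOAD": (0.25, 3),
    "STORE": (0.25, 2.5),
    "ALU": (0.40, 1),
    "BRANCH": (0.10, 6),
}

def overall_cpi(instruction_mix):
    """Overall CPI as the frequency-weighted sum of per-type CPI values."""
    return sum(freq * cpi for freq, cpi in instruction_mix.values())

print(round(overall_cpi(mix), 3))  # 2.375
```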

---

**QUESTION 8:**

Consider the following statements:

- (i) MIPS rating is used to compare performance of two processors.
- (ii) Higher MIPS rating indicates better performance.

Which of the following is correct?

- a. Only (i) is true
- b. Only (ii) is true
- c. Both (i) and (ii) are true
- d. Both (i) and (ii) are false

**Correct Answer: a**

**Detailed Solution:**

MIPS rating can be used to compare the performance of two processors; however, a higher MIPS rating does not always mean higher performance. Some computers may call software routines for certain computations, and these may not be accounted for when evaluating the MIPS rating.

The correct option is (a).

---

**QUESTION 9:**

Consider a program with 50 million instructions, a machine requires 25 milliseconds to execute this program. What will be the MIPS rating of the machine?

- a. 20
- b. 200
- c. 2000
- d. 20000



**Correct Answer: c**

**Detailed Solution:**

**MIPS rating = Instruction Count / (Execution time x 10<sup>6</sup>)**

$$= (50 \times 10^6) / (25 \times 10^{-3} \times 10^6) = 50000/25 = 2000$$

**The correct option is (c).**
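A minimal sketch of the MIPS rating calculation, with a hypothetical function name and the execution time given in seconds:

```python
def mips_rating(instruction_count, execution_time_s):
    """MIPS rating = instruction count / (execution time in seconds x 10^6)."""
    return instruction_count / (execution_time_s * 1e6)

print(round(mips_rating(50e6, 25e-3)))  # 2000
```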

---

**QUESTION 10:**

Which of the following statements is/are true with respect to Amdahl's law?

- a. It expresses the law of diminishing returns.
- b. It provides a measure to compare execution time of two machines.
- c. It expresses the maximum speedup that can be achieved.
- d. All of these

**Correct Answer: a, c**

**Detailed Solution:**

**Amdahl's law expresses the law of diminishing returns and identifies the maximum speedup that can be achieved when only some part of the program is improved.**

**The correct options are (a) and (c).**

---

\*\*\*\*\*END\*\*\*\*\*



---

**Course Name: Computer Architecture and Organization**

**Assignment- Week 4**

**TYPE OF QUESTION: MCQ/MSQ/SA**

**Number of questions:** 10

**Total mark:**  $10 \times 1 = 10$

---

**QUESTION 1:**

If the instructions of a machine are 64 bits long and the machine has a byte-addressable memory, then to point to the next instruction the PC value should be incremented by \_\_\_\_\_?

**Correct Answer: 8**

**Detailed Solution:**

In byte-addressable memory each byte has a unique address, so a 64-bit instruction occupies 64 / 8 = 8 consecutive addresses. The PC should therefore be incremented by 8 to point to the next instruction to be executed.

The correct answer will be 8.

---

**QUESTION 2:**

During the fetch operation of an instruction, does the CPU know what kind of instruction is being fetched?

- a. Yes
- b. No

**Correct Answer: b**

**Detailed Solution:**

The type of instruction becomes known only at the decoding stage, after the fetch is complete.

Thus option (b) is correct.

---

**QUESTION 3:**

Consider the following statements:

- (i) Program Counter holds address of the memory location containing the next instruction to be executed.
- (ii) Instruction Register contains the next instruction to be executed.

Which of the following is correct?

- a. Only (i) is true
- b. Only (ii) is true
- c. Both (i) and (ii) are true
- d. Both (i) and (ii) are false

**Correct Answer: a**

**Detailed Solution:**

Program counter contains the address of the next instruction to be executed and Instruction register contains the current instruction being executed.

Thus option (a) is correct.

---

**QUESTION 4:**

For single bus architecture, register Y and Z can be used only to store intermediate value?

- a. True
- b. False

**Correct Answer: a**

**Detailed Solution:**

In single bus architecture, registers Y and Z are temporary registers which cannot be used explicitly by instructions. They can only be used to store intermediate values during computation.

Thus option (a) is correct.

---

**QUESTION 5:**

Consider the following set of micro-operations for a single bus architecture machine:

**Step    Action**

T1: PCout, MARin, Read, Select4, Add, Zin



T2: Zout, PCin, Yin, WMFC

T3: MDRout, IRin

T4: R3out, Yin, SelectY

T5: R4out, ADD, Zin

T6: Zout, R3in, End

The micro-operation corresponds to which instruction?

- a. ADD Y, R4
- b. ADD R3, Y
- c. ADD R3, R4
- d. MOVE R3, R4

**Correct Answer: c**

**Detailed Solution:**

The first three steps are the basic fetch steps: loading the instruction address into MAR, incrementing the PC, and loading the instruction into IR. In the fourth step Y is loaded with the content of R3, and in the next step the ALU adds Y (the content of R3) and the content of R4. In step 6 the content of Z (the result) is stored back into register R3.

Thus option (c) is correct.

---

#### **QUESTION 6:**

Consider the following scenario where register  $R_i$  is connected to a bus. Assume that  $R_i$  is initially loaded with data X, and the bus carries data Y.



What will happen if  $R_{i\text{in}} = 1$  is applied?

- a. The content of  $R_i$  will become Y.
- b. Data X will be placed on the bus.
- c.  $R_i$  will be loaded with Y, and the bus will be loaded with X.
- d. The content of  $R_i$  and the bus will be unchanged.

**Correct Answer: a**

**Detailed Solution:**

The control signal  $R_{i\text{in}}$  indicates that the input control of register  $R_i$  is activated; thus  $R_i$  will be loaded with the new value Y, which is available on the bus.

The correct option is (a).

---

**QUESTION 7:**

Which of the following statement(s) is/are true for fetching a word from memory?

- a. The word can be directly loaded to general purpose registers from memory.
- b. The information to be fetched, must be an instruction/operand.
- c. The word can be directly transferred to ALU for further operation.
- d. All of these.

**Correct Answer: b**

**Detailed Solution:**

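During fetching of a word, the word is first loaded into register MDR; it cannot be loaded directly into a general purpose register or transferred directly to the ALU. The word fetched is either an instruction or an operand.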

The correct option is (b).

---

**QUESTION 8:**

Consider a single bus system. How many steps will be required to complete the instruction MOVE R1, R2?

- a. 2
- b. 3
- c. 4
- d. 5




**Correct Answer: c**

**Detailed Solution:**

The steps to complete the operation will be as follows:

- T1: PC<sub>out</sub>, MAR<sub>in</sub>, Read, Select4, Add, Z<sub>in</sub>
- T2: Z<sub>out</sub>, PC<sub>in</sub>, Y<sub>in</sub>, WMFC
- T3: MDR<sub>out</sub>, IR<sub>in</sub>
- T4: R2<sub>out</sub>, R1<sub>in</sub>, END

Four steps are required, so the correct option is (c).

---

**QUESTION 9:**

Consider a hardwired control unit where each instruction of the machine requires a maximum of 50 steps to complete its execution. If the total number of such instructions is 129, what should be the sizes of the step decoder and the instruction decoder respectively?

- a. 6x64; 7x128
- b. 50x1; 129x1
- c. 64x1, 256x1
- d. 6x64, 8x256

**Correct Answer: d**

**Detailed Solution:**

A decoder with n select lines has  $2^n$  outputs, so options (b) and (c) are incorrect. In option (a), the 6x64 step decoder is sufficient for the maximum of 50 time steps, but the 7x128 instruction decoder can decode only 128 instructions, which is fewer than the 129 required. Thus option (d) is correct.
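The decoder sizing can be sketched as finding the smallest n with $2^n$ at least the number of items to distinguish. The helper name `decoder_size` is an assumption for this example.

```python
def decoder_size(count):
    """Smallest n x 2**n decoder whose outputs cover `count` distinct items."""
    n = max(1, (count - 1).bit_length())
    return n, 2 ** n

print(decoder_size(50))   # (6, 64)   step decoder
print(decoder_size(129))  # (8, 256)  instruction decoder
```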

---

**QUESTION 10:**

Which of the following statement(s) is/are true for micro-programmed control unit?

- a. The control signals are generated by circuits such as encoder, decoder etc.
- b. It is flexible for modification.
- c. It is used in CISC architecture.
- d. All of these




**Correct Answer: b, c**

**Detailed Solution:**

In micro-programmed control unit control signals are stored in memory and loaded sequentially based on instruction. It is slow as compared to hardwired control unit but provides flexibility as control signals are stored in memory that can be modified as per requirement. It is used in CISC architecture.

The correct options are (b) and (c).

---

\*\*\*\*\*END\*\*\*\*\*



---

**Course Name: Computer Architecture and Organization**

**Assignment- Week 5**

**TYPE OF QUESTION: MCQ/MSQ/SA**

**Number of questions: 10**

**Total mark: 10 x 1 = 10**

---

**QUESTION 1:**

Which of the following statement(s) is/are true?

- a. In memory, data are stored in the form of 0's and 1's.
- b. Memory system stores data and instruction.
- c. Every bit of memory has a unique address.
- d. All of these

**Correct Answer: a, b**

**Detailed Solution:**

**Memory is used to store data and instructions. Every memory location has a unique address; however, each location may hold multiple bits (a byte or a word), so individual bits do not have unique addresses.**

**The correct options are (a) and (b).**

---

**QUESTION 2:**

Consider a memory with "n" address lines and "m" data lines. What will be the total number of bits in the memory?

- a.  $2^m \times n$
- b.  $2^n \times m$
- c.  $2^{n+m}$
- d.  $m^2 \times n$

**Correct Answer: b**

**Detailed Solution:**

**The total number of bits in the memory with n address lines and m data lines will be given by  $2^n \times m$ , since there are  $2^n$  addressable locations of m bits each.**

The correct option is (b).

---

### **QUESTION 3:**

Which of the following statement(s) is/are false?

- a. In volatile memory data is lost when power is switched off.
- b. Dynamic memory requires periodic refreshing.
- c. Magnetic tape does not allow random access of data.
- d. None of these.

**Correct Answer: d**

**Detailed Solution:**

All statements are true; refer to lecture 23, slides 5-6.

The correct option is (d).

---

### **QUESTION 4:**

What is/are false for cache memory?

- a. It consumes low power as compared to main memory.
- b. It is a type of non-volatile memory.
- c. It decreases the effective speed of the memory system.
- d. All of these.

**Correct Answer: d**

**Detailed Solution:**

Cache memory is a fast, volatile memory which is used to increase the effective speed of the memory system. Cache memory consumes more power than main memory, so the use of a very large cache is not recommended. Statements (a), (b) and (c) are therefore all false.

The correct option is (d).

---




### **QUESTION 5:**

The total number of external connections required by an  $8 \times 8$  memory will be \_\_\_\_\_?

**Correct Answer: 15**

**Detailed Solution:**

**Address decoder of size  $3 \times 8 \rightarrow 3$  external connections**

**Data outputs, 8 bits  $\rightarrow 8$  external connections**

**2 external connections for R/W and CS.**

**2 external connections for power supply and ground.**

**Total external connections:  $3+8+2+2 = 15$ .**

---

### **QUESTION 6:**

Which of the following is/are true for virtual memory system?

- a. It increases the size of the program that can be run
- b. It increases the size of the physical memory
- c. It increases the size of the secondary memory
- d. It improves the processor-memory bandwidth

**Correct Answer: a**

**Detailed Solution:**

**Virtual memory only increases the size of the program that can be run. But it does not allow by any means to increase the size of the memory (physical or secondary). It does not have any impact on processor memory bandwidth.**

**The correct option is (a).**

---

### **QUESTION 7:**

Which of the following statement is true for writing 1 in SRAM chip?

- a. The bit line b is set with 1, and bit line b' is set with 0.
- b. The bit line b is set with 0, and bit line b' is set with 1.
- c. The bit line b is set with 1, and bit line b' is set with 1.
- d. The bit line b is set with 0, and bit line b' is set with 0.

**Correct Answer: a**

**Detailed Solution:**

To write 1 in an SRAM cell, bit line b is set to 1 and bit line b' is set to 0. To write 0, bit line b is set to 0 and bit line b' is set to 1.

The correct option is (a).

---

**QUESTION 8:**

Consider a 1Mbit memory organized as 1024 (rows) and 1024 (columns). If the data bus is 16-bit wide, total number of address lines required will be:

- a. 16
- b. 14
- c. 15
- d. 20

**Correct Answer: a**

**Detailed Solution:**

The 1Mbit memory is organized as  $2^{10} = 1024$  rows and  $2^{10} = 1024$  columns, with a 16-bit wide data bus. Each row therefore holds  $2^{10} / 2^4 = 2^6$  words of 16 bits, so the memory can be organized as  $(2^{10}) \times (2^6 \times 2^4)$. Therefore, the total number of address lines will be  $10 + 6 = 16$. The correct option is (a).
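The row and column split can be sketched as follows, assuming (as in the solution) that each row is divided into words as wide as the data bus. The function name is hypothetical.

```python
def row_column_address_bits(rows, columns, data_width):
    """Address bits needed to select a row and a data-bus-wide word within it."""
    row_bits = (rows - 1).bit_length()                      # 1024 rows -> 10 bits
    words_per_row = columns // data_width                   # 1024 / 16 = 64 words
    column_bits = (words_per_row - 1).bit_length()          # 64 words  -> 6 bits
    return row_bits, column_bits

row_bits, column_bits = row_column_address_bits(1024, 1024, 16)
print(row_bits, column_bits, row_bits + column_bits)  # 10 6 16
```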

---

**QUESTION 9:**

For a DDR2 SDRAM if the internal clock is 140MHz and bus clock is 350MHz, what will be the maximum data transfer rate?

- a. 4.48 KB/s
- b. 4.48 MB/s
- c. 4.48 GB/s
- d. 4.48 TB/s

**Correct Answer: c**

**Detailed Solution:**

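DDR2 runs the bus clock at twice the internal clock ($2 \times 140 = 280$ MHz) and transfers data on both clock edges over a 64-bit data bus, so the maximum data transfer rate will be  
$$(2 \times 280 \times 10^6 \times 64) / 8 = 4.48 \text{ GB/s}$$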

The correct option is (c).
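The calculation in the solution can be sketched as below. It assumes, as the worked solution does, a bus clock of 280 MHz (twice the 140 MHz internal clock), two transfers per bus clock cycle, and a 64-bit data bus; the function name is made up for illustration.

```python
def ddr2_peak_rate_bytes(internal_clock_hz, bus_width_bits=64):
    """Peak transfer rate in bytes/s under the assumptions stated above."""
    bus_clock_hz = 2 * internal_clock_hz       # bus clock is twice the internal clock
    transfers_per_second = 2 * bus_clock_hz    # data moves on both clock edges
    return transfers_per_second * bus_width_bits // 8

rate = ddr2_peak_rate_bytes(140_000_000)
print(rate / 1e9, "GB/s")  # 4.48 GB/s
```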

---



**QUESTION 10:**

Consider a memory chip with size 1 Gbyte. Four such chips are connected together to build a larger byte-oriented memory system using memory interleaving, where the processor data lines are 32-bits wide. If we name the four chips as M0, M1, M2 and M3, to which memory modules will the memory addresses **0542364AH** and **1A54200CH** map to?

- a. M2 and M3
- b. M2 and M0
- c. M0 and M0
- d. M0 and M1

**Correct Answer: b**

**Detailed Solution:**

The 1Gbyte memory chip will have 30 address lines, as  $2^{30} = 1G$ .

The memory system will contain 4 such chips, and hence will have 32 address lines.

The high-order address lines  $A_{31}, A_{30} \dots A_2$  will be connected to the address lines of the four chips, while the low-order two address lines will be selecting one of the four chips:

M0:  $A_1 = 0, A_0 = 0$

M1:  $A_1 = 0, A_0 = 1$

M2:  $A_1 = 1, A_0 = 0$

M3:  $A_1 = 1, A_0 = 1$

For the memory address 0542364AH,  $A_1 = 1, A_0 = 0 \rightarrow M2$

For the memory address 1A54200CH,  $A_1 = 0, A_0 = 0 \rightarrow M0$

The correct option is (b).
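With low-order interleaving across four chips, the module number is the address modulo 4, that is, the two low-order address bits. A minimal sketch with a hypothetical helper name:

```python
def module_for(address, num_modules=4):
    """Low-order interleaving: the lowest address bits (A1 A0) select the chip."""
    return address % num_modules

for address in (0x0542364A, 0x1A54200C):
    print(f"{address:08X}H -> M{module_for(address)}")
# 0542364AH -> M2
# 1A54200CH -> M0
```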

---

\*\*\*\*\*END\*\*\*\*\*



**Course Name: Computer Architecture and Organization**

**Assignment- Week 6**

**TYPE OF QUESTION: MCQ/MSQ/SA**

**Number of questions: 10**

**Total mark: 10 x 1 = 10**

---

**QUESTION 1:**

Which of the following statement(s) is/are true?

- a. Organizing memory in multiple levels may result in faster data access.
- b. As we move away from the processor the speed of memory increases.
- c. By keeping commonly used data in the memory that is near to processor, memory access time may increase.
- d. None of these.

**Correct Answer: a**

**Detailed Solution:**

**A memory hierarchy can result in faster access time on average. As we move away from the processor the speed of memory decreases. By keeping commonly used data near the processor we can access that data quickly, and thus the overall access time decreases.**

**The correct option is (a).**

---

**QUESTION 2:**

Consider a three-level memory hierarchy (cache - main memory - magnetic disk). Which of the following interfaces of the memory hierarchy is/are managed by operating system?

- a. Cache – magnetic disk
- b. Cache - Main memory
- c. Main memory - magnetic disk
- d. None of these.

**Correct Answer: c**

**Detailed Solution:**

**In a multi-level hierarchy, the interface between the cache and main memory is managed by hardware, whereas the interface between main memory and magnetic disk/HDD/SSD is managed by software (the operating system).**

The correct option is (c).

---

**QUESTION 3:**

Consider a machine which takes 2 nanoseconds to fulfill a read request on a cache hit and 45 nanoseconds to fulfill a read request on a cache miss. If a program has a cache hit ratio of 90%, then the average read access time will be \_\_\_\_\_ nanoseconds.

**Correct Answer: 6.3**

**Detailed Solution:**

$$\begin{aligned}\text{Average Access Time} &= H \cdot T_{\text{hit}} + (1-H) \cdot T_{\text{miss}} \\ &= 0.90 \times 2 + 0.10 \times 45 = 1.80 + 4.50 = 6.30 \text{ nanoseconds.}\end{aligned}$$

---

**QUESTION 4:**

Consider a 2-level memory hierarchy consisting of a single-level cache memory and the main memory. The access times for the cache memory and main memory are 10 nanoseconds and 100 nanoseconds respectively. If a program uses the cache for 85% of the time, the speedup gained by using the cache will be \_\_\_\_\_.

**Correct Answer: 4.25 to 4.26**

**Detailed Solution:**

This can be solved using Amdahl's law.

The cache is 10 times faster than main memory ( $r=10$ , since 100 ns / 10 ns = 10) and can be used for 85% ( $H=.85$ ) of the time.

$$\text{Speedup} = 1 / [(H/r) + (1-H)] = 1 / [(.85/10 + .15)] = 1 / (0.085 + .15) = 4.255$$
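Amdahl's law as used here can be sketched in a few lines; the function and parameter names are chosen for this illustration only.

```python
def amdahl_speedup(fraction_enhanced, enhancement_factor):
    """Overall speedup when a fraction of the time is sped up by a given factor."""
    return 1 / ((1 - fraction_enhanced) + fraction_enhanced / enhancement_factor)

print(round(amdahl_speedup(0.85, 10), 3))  # 4.255
```

Letting the factor grow without bound caps the speedup at 1 / (1 - 0.85), about 6.67, which is the diminishing-returns effect the law describes.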

---

**QUESTION 5:**

Consider a main memory with 1024 blocks with block size of 32-bit each and a cache memory which consists of 128 blocks. If we use direct mapping then block 128 and 256 will be mapped to which blocks of cache memory?

- a. 128, 256
- b. 0, 0
- c. 1, 1
- d. 0, 1

**Correct Answer: b**

**Detailed Solution:**

Mapping will be done as: main memory block % total cache block;

For 128<sup>th</sup> block: 128 % 128 = 0<sup>th</sup> block

For 256<sup>th</sup> block: 256 % 128 = 0<sup>th</sup> block

The correct option is (b).
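The modulo rule for direct mapping can be checked with a one-line sketch; the helper name is hypothetical.

```python
def direct_mapped_block(main_memory_block, cache_blocks=128):
    """Direct mapping: cache block = main memory block mod number of cache blocks."""
    return main_memory_block % cache_blocks

print(direct_mapped_block(128), direct_mapped_block(256))  # 0 0
```

Because blocks 128 and 256 both map to cache block 0, they would evict each other if accessed alternately.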

---

**QUESTION 6:**

Consider a set-associative cache that consists of 256 blocks divided into 16-block sets and a byte addressable main memory of size 512Kbytes, with block size of 32 bytes each. How many bits will be there in the TAG, SET and WORD fields respectively?

- a. 10, 4, 5
- b. 10, 5, 4
- c. 10, 5, 5
- d. 10, 4, 4

**Correct Answer: a**

**Detailed Solution:**

Number of blocks in main memory: total size / block size = 512K/32 = 16K blocks.

Number of sets in cache = 256 / 16 = 16.

Since each block has 32 bytes, number of bits in WORD field = 5 bits (as  $2^5 = 32$ )

Since there are 16 sets, number of bits in SET field = 4 (as  $2^4 = 16$ )

Total number of bits in the address = 19 (as  $2^{19} = 512K$ )

Hence, number of bits in TAG field = 19 - (5 + 4) = 10

The correct option is (a).
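The field widths can be derived programmatically. The sketch below assumes all sizes are powers of two and uses a hypothetical function name.

```python
def set_associative_fields(memory_bytes, block_bytes, cache_blocks, blocks_per_set):
    """Return (TAG, SET, WORD) field widths for a byte-addressable memory."""
    address_bits = (memory_bytes - 1).bit_length()                   # 512K -> 19
    word_bits = (block_bytes - 1).bit_length()                       # 32 B -> 5
    set_bits = (cache_blocks // blocks_per_set - 1).bit_length()     # 16 sets -> 4
    return address_bits - set_bits - word_bits, set_bits, word_bits

print(set_associative_fields(512 * 1024, 32, 256, 16))  # (10, 4, 5)
```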

---

**QUESTION 7:**




A computer has 4 Gbyte memory with 32-bit words, where the computer uses word-level addressing. Each block of memory stores 64 words. The computer has a direct-mapped cache of 128 blocks. How many bits will be there in the TAG field?

- a. 16
- b. 17
- c. 18
- d. 19

**Correct Answer: b**

**Detailed Solution:**

Main memory size = 4GB = 4G / 4 = 1G words (as 1 word = 4 bytes).

So, total number of address lines = 30 (as  $2^{30} = 1G$ )

Since each block is 64 words, number of bits in WORD field = 6 (as  $2^6 = 64$ )

As there are 128 blocks in cache, number of bits in BLOCK field = 7 (as  $2^7 = 128$ )

Hence, number of bits in TAG field =  $30 - (7 + 6) = 17$

The correct option is (b).

---

#### **QUESTION 8:**

Consider an N-way set associative cache memory. What will happen if we increase the value of N?

- a. Search time in cache will increase.
- b. Freedom of mapping main memory block into cache will increase.
- c. Size of set will increase.

**Correct Answer: a, b, c**

**Detailed Solution:**

Increasing the value of N increases the associativity (moving towards a fully associative cache), which increases the search time but allows more freedom in mapping main memory blocks into the cache. Also, the size of each set will increase and the number of sets will decrease.

Thus all the options are correct.

---

#### **QUESTION 9:**



Consider a processor with an average CPI of 1.5, which runs a program with the following instruction mix: ALU instructions – 50%, LOAD – 25%, STORE – 10%, BRANCH – 15%. Assume that the cache miss rate is 5%, and the miss penalty is 33 cycles. What will be the effective CPI for a unified L1-cache, using write back and write allocate, assuming that the probability that the cache is dirty is 10%?

- a. 1.35
- b. 1.85
- c. 2.45
- d. 3.95

**Correct Answer: d**

**Detailed Solution:**

Average CPI = 1.5

ALU = 0.5, LOAD = 0.25, STORE = 0.10, BRANCH = 0.15, Cache Miss Rate = 0.05, Miss Penalty ( $t_{MM}$ ) = 33 cycles, Cache Dirty Rate = 0.10, Cache Clean Rate = 1 – 0.10 = 0.90

Number of memory accesses per instruction =  $1 + 0.25 + 0.10 = 1.35$

(1 for instruction fetch, 0.25 for load, 0.10 for store)

$$\begin{aligned}\text{Memory stalls / access} &= (1 - H_{L1}) (t_{MM} \times \% \text{ clean} + 2 t_{MM} \times \% \text{ dirty}) \\ &= 0.05 (33 \times 0.90 + 66 \times 0.10) \\ &= 1.815\end{aligned}$$

Memory stalls / instruction =  $1.35 \times 1.815 = 2.450$  cycles

Thus, effective CPI =  $1.5 + 2.45 = 3.95$

The correct option is (d).
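The steps of this solution can be reproduced with a short function. This is a sketch that follows the formula used above (a dirty miss costs twice the miss penalty); the names are illustrative.

```python
def effective_cpi(base_cpi, accesses_per_instruction, miss_rate,
                  miss_penalty, dirty_fraction):
    """Base CPI plus memory stall cycles for a write-back, write-allocate cache."""
    # A clean miss costs one penalty; a dirty miss costs two (write back + fetch).
    stall_per_access = miss_rate * (
        miss_penalty * (1 - dirty_fraction) + 2 * miss_penalty * dirty_fraction
    )
    return base_cpi + accesses_per_instruction * stall_per_access


# 1 instruction fetch + 0.25 loads + 0.10 stores per instruction.
print(round(effective_cpi(1.5, 1 + 0.25 + 0.10, 0.05, 33, 0.10), 2))  # 3.95
```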

---

### **QUESTION 10:**

Which of the following approaches can be used for reducing cache miss rate?

- a. Use larger block size
- b. Use larger cache
- c. Use higher associativity

**Correct Answer: a, b, c**

**Detailed Solution:**



NPTEL Online Certification Courses  
Indian Institute of Technology Kharagpur



---

Refer lecture 32, slide-8.

---

\*\*\*\*\*END\*\*\*\*\*



**Course Name: Computer Architecture and Organization**

**Assignment- Week 7**

**TYPE OF QUESTION: MCQ/MSQ/SA**

**Number of questions: 10**

**Total mark:  $10 \times 1 = 10$**

---

**QUESTION 1:**

Decimal equivalent of 11101011 represented in 2's complement format will be \_\_\_\_\_?

**Correct Answer: -21**

**Detailed Solution:**

Decimal equivalent of number 11101011 will be:

$$-1 \times 2^7 + 1 \times 2^6 + 1 \times 2^5 + 1 \times 2^3 + 1 \times 2^1 + 1 \times 2^0 = -128 + 64 + 32 + 8 + 2 + 1 = -21$$
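The weighted-sum rule above is equivalent to subtracting $2^n$ from the unsigned value whenever the sign bit is 1. A minimal sketch, with a hypothetical function name:

```python
def twos_complement_value(bits):
    """Decode a bit string such as '11101011' as a 2's complement integer."""
    unsigned = int(bits, 2)
    if bits[0] == "1":
        # The most significant bit carries weight -2^(n-1), not +2^(n-1).
        return unsigned - (1 << len(bits))
    return unsigned


print(twos_complement_value("11101011"))  # -21
```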

---

**QUESTION 2:**

Largest number that can be represented using 8-bit 1's complement representation will be \_\_\_\_\_

**Correct Answer: 127**

**Detailed Solution:**

The range of the 1's complement representation is $-(2^{n-1}-1)$ to $+(2^{n-1}-1)$. For n = 8, the largest number which can be represented is $+(2^{8-1}-1) = 127$.

---

**QUESTION 3:**

If we implement a half adder only with basic gates (AND, OR, NOT), then how many basic gates will be required?

- a. 6
- b. 5
- c. 4
- d. 3

**Correct Answer: a**

**Detailed Solution:**



The SUM and CARRY outputs of a half-adder are given by:

$$S = A \text{ XOR } B = A' \cdot B + A \cdot B'$$

$$C = A \cdot B$$

For SUM, 5 gates (2 AND gates, 2 NOT gates, 1 OR gate) are needed, and for CARRY, 1 AND gate is needed, giving 6 gates in total.

The correct option is (a).
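The gate count can be seen by modelling each gate as one line of code. This is an illustrative sketch only; bits are represented as the integers 0 and 1.

```python
def half_adder(a, b):
    """Half adder built only from NOT, AND and OR gates (6 gates in total)."""
    not_a = 1 - a          # NOT gate 1
    not_b = 1 - b          # NOT gate 2
    term1 = not_a & b      # AND gate 1: A'.B
    term2 = a & not_b      # AND gate 2: A.B'
    s = term1 | term2      # OR gate: S = A'.B + A.B'
    c = a & b              # AND gate 3: C = A.B
    return s, c


for a in (0, 1):
    for b in (0, 1):
        print(a, b, half_adder(a, b))
```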

---

#### **QUESTION 4:**

If the delay of each basic gate is $\delta$, and the inputs are available in both complemented and uncomplemented forms, the total delays of the SUM and CARRY outputs of a full adder are:

- a.  $2\delta$  and  $2\delta$
- b.  $2\delta$  and  $3\delta$
- c.  $3\delta$  and  $2\delta$
- d.  $3\delta$  and  $3\delta$

**Correct Answer: a**

**Detailed Solution:** The CARRY generator circuit will take  $2\delta$  time to produce output.

The SUM circuit will also take  $2\delta$  time (time required for AND-OR circuit as the inputs are already available in complemented form).

Hence, the correct option is (a).

---

#### **QUESTION 5:**

Which of the following statement(s) is/are true for carry lookahead adder?

- a. Addition can be carried out in constant time.
- b. All carries can be generated in parallel.
- c. The cost of the carry lookahead circuit increases rapidly with increase in the number of bits.
- d. None of these

**Correct Answer: a, b, c**

**Detailed Solution:**



In a carry look-ahead adder, all the carries are generated in parallel, and addition can be carried out in constant time. However, carry look-ahead requires a lot of extra hardware in the carry generation logic, which increases rapidly with n.

The correct options are (a), (b), and (c).

---

### **QUESTION 6:**

A carry save adder consists of:

- a. Cascaded full adders.
- b. Independent full adders.
- c. Parallel adder in the last stage.
- d. Carry select adders.
- e. None of these.

**Correct Answer: b, c**

### **Detailed Solution:**

A carry save adder consists of independent full adders which perform addition, and a parallel adder in the last stage which adds the results computed by the independent full adders.

The correct options are (b) and (c).

---

### **QUESTION 7:**

In a 5-bit carry look-ahead adder, suppose we are adding two numbers A = (1, 0, 1, 1, 0) and B = (1, 0, 1, 1, 1). The carry generate and carry propagate signals will be:

- a. G = (1, 0, 1, 1, 0) and P = (1, 0, 1, 1, 1)
- b. G = (1, 0, 1, 1, 0) and P = (0, 0, 1, 0, 1)
- c. G = (1, 0, 1, 1, 0) and P = (0, 0, 0, 0, 1)
- d. G = (1, 0, 1, 1, 0) and P = (1, 0, 0, 0, 1)

**Correct Answer: c**

### **Detailed Solution:**

The carry generate and carry propagate signals are defined as  $G_i = A_i B_i$ , and  $P_i = A_i \text{ xor } B_i$ . Hence, G = (1, 0, 1, 1, 0) and P = (0, 0, 0, 0, 1).

The correct option is (c).
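The two definitions translate directly into bitwise operations. A small sketch, assuming the operands are given as tuples of bits in the same order as in the question:

```python
def generate_propagate(a_bits, b_bits):
    """Return (G, P) with G_i = A_i AND B_i and P_i = A_i XOR B_i."""
    g = tuple(a & b for a, b in zip(a_bits, b_bits))
    p = tuple(a ^ b for a, b in zip(a_bits, b_bits))
    return g, p


g, p = generate_propagate((1, 0, 1, 1, 0), (1, 0, 1, 1, 1))
print("G =", g)  # G = (1, 0, 1, 1, 0)
print("P =", p)  # P = (0, 0, 0, 0, 1)
```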

---



### **QUESTION 8:**

Suppose we are multiplying  $(-9) \times (12)$  using Booth's multiplier, where each number is represented in 5 bits. What will be the values of A (temporary register), and Q (multiplier) after third step?

- a. 00100, 00011
- b. 01001, 00011
- c. 00100, 10001
- d. None of these.

**Correct Answer: c**

**Detailed Solution:**

| Step | A     | Q     | $Q_{-1}$ | M     | Operation                       |
|------|-------|-------|----------|-------|---------------------------------|
| 0    | 00000 | 01100 | 0        | 10111 | Initialization                  |
| 1    | 00000 | 00110 | 0        | 10111 | Shift A-Q-Q <sub>-1</sub> right |
| 2    | 00000 | 00011 | 0        | 10111 | Shift A-Q-Q <sub>-1</sub> right |
| 3    | 01001 | 00011 | 0        | 10111 | $A = A - M$                     |
|      | 00100 | 10001 | 1        | 10111 | Shift A-Q-Q <sub>-1</sub> right |
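The table can be reproduced with a short simulation of Booth's algorithm. This is a sketch under the usual textbook formulation (A, Q and Q<sub>-1</sub> shifted together, one step per multiplier bit); the function name and the returned format are illustrative.

```python
def booth_steps(multiplicand, multiplier, n_bits):
    """Return (A, Q, Q_-1) after each step of Booth's multiplication."""
    mask = (1 << n_bits) - 1
    m = multiplicand & mask            # M in n-bit 2's complement
    a, q, q_1 = 0, multiplier & mask, 0
    history = []
    for _ in range(n_bits):
        pair = (q & 1, q_1)
        if pair == (1, 0):
            a = (a - m) & mask         # A = A - M
        elif pair == (0, 1):
            a = (a + m) & mask         # A = A + M
        # Arithmetic right shift of the combined A-Q-Q_-1 register.
        q_1 = q & 1
        q = ((q >> 1) | ((a & 1) << (n_bits - 1))) & mask
        sign = a >> (n_bits - 1)
        a = ((a >> 1) | (sign << (n_bits - 1))) & mask
        history.append((format(a, f"0{n_bits}b"), format(q, f"0{n_bits}b"), q_1))
    return history


steps = booth_steps(-9, 12, 5)
print(steps[2])  # ('00100', '10001', 1)
```

After all five steps, A and Q together hold `11100 10100`, which is -108 in 10-bit 2's complement, as expected for (-9) × 12.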

### **QUESTION 9:**

Which of the following statement(s) is/are true?

- a. Booth's multiplier is faster as compared to shift and add multiplication approach
- b. Booth's multiplier inspects two bits of the multiplier at every step
- c. Arithmetic right shift operation is used in Booth's multiplier.
- d. None of these

**Correct Answer: a, b, c**

**Detailed Solution:**



By inspecting two bits at a time, Booth's multiplier makes the multiplication process faster by reducing the number of add/subtract operations. It uses an arithmetic right shift operation for shifting the temporary register and the multiplier.

The correct options are (a), (b), and (c).

**QUESTION 10:**

Suppose we are dividing 37/6 using restoring division method, where A and M are represented in 4 bits. What will be the value of A after the third step?

- a. 0000
- b. 0100
- c. 1011
- d. 1110

**Correct Answer: b**

**Detailed Solution:**

| Step | A    | Q      | M    | Operation            |
|------|------|--------|------|----------------------|
| 0    | 0000 | 100101 | 0110 | Initialization       |
| 1    | 0001 | 001010 | 0110 | Shift A-Q left       |
|      | 1011 | 001010 | 0110 | $A = A - M$          |
|      | 0001 | 001010 | 0110 | $A = A + M, Q_0 = 0$ |
| 2    | 0010 | 010100 | 0110 | Shift A-Q left       |
|      | 1100 | 010100 | 0110 | $A = A - M$          |
|      | 0010 | 010100 | 0110 | $A = A + M, Q_0 = 0$ |
| 3    | 0100 | 101000 | 0110 | Shift A-Q left       |
|      | 1110 | 101000 | 0110 | $A = A - M$          |
|      | 0100 | 101000 | 0110 | $A = A + M, Q_0 = 0$ |
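The end-of-step values in the table can be checked with a small simulation. The sketch below is an assumption-laden shortcut: instead of subtracting and then restoring, it compares A with M first, which gives the same A and Q at the end of every step. The names are illustrative.

```python
def restoring_division(dividend, divisor, a_bits, q_bits):
    """Return (A, Q) as bit strings at the end of each restoring-division step."""
    a, q = 0, dividend
    a_mask, q_mask = (1 << a_bits) - 1, (1 << q_bits) - 1
    trace = []
    for _ in range(q_bits):
        # Shift the combined A-Q register left by one position.
        a = ((a << 1) | (q >> (q_bits - 1))) & a_mask
        q = (q << 1) & q_mask
        if a >= divisor:
            a -= divisor   # subtraction succeeds: keep A - M and set Q0 = 1
            q |= 1
        # Otherwise A - M would be negative, so A is restored and Q0 stays 0.
        trace.append((format(a, f"0{a_bits}b"), format(q, f"0{q_bits}b")))
    return trace


trace = restoring_division(37, 6, 4, 6)
print(trace[2])   # ('0100', '101000')
print(trace[-1])  # ('0001', '000110')
```

The last entry gives remainder 1 in A and quotient 6 in Q, consistent with 37 = 6 × 6 + 1.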

---

\*\*\*\*\*END\*\*\*\*\*

**Course Name: Computer Architecture and Organization**

**Assignment- Week 8**

**TYPE OF QUESTION: MCQ/MSQ/SA**

**Number of questions: 10**

**Total mark: 10 x 1 = 10**

---

**QUESTION 1:**

Which of the following fractions can be represented exactly in binary?

- a. 0.75
- b. 0.50
- c. 0.33
- d. 0.66
- e. 0.99
- f. 0.25

**Correct Answer: a, b, f**

**Detailed Solution:**

The necessary condition for a fractional number to be represented exactly in binary is that we should be able to express the number in the form  $x/2^k$ , where  $x$  is some integer. Options (c), (d) and (e) cannot be represented in the form  $x/2^k$ .

The correct options are (a), (b), and (f).
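The $x/2^k$ condition can be tested by reducing the decimal fraction to lowest terms and checking whether the denominator is a power of two. A minimal sketch using Python's standard `fractions` module; the function name is hypothetical.

```python
from fractions import Fraction


def exactly_representable_in_binary(decimal_text):
    """True if the decimal fraction equals x / 2^k for some integers x and k."""
    denominator = Fraction(decimal_text).denominator
    # A positive integer is a power of two iff it has a single 1 bit.
    return denominator & (denominator - 1) == 0


for text in ("0.75", "0.50", "0.33", "0.66", "0.99", "0.25"):
    print(text, exactly_representable_in_binary(text))
```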

**QUESTION 2:**

For any binary number, shifting the fraction point left by 2 positions is equivalent to:

- a. Dividing value by 4
- b. Multiplying value by 4
- c. Adding 2
- d. Subtracting 2
- e. None of these.

**Correct Answer: a**

**Detailed Solution:**

Shifting the fraction point left by two positions is equivalent to division by $2^2 = 4$.



---

### **QUESTION 3:**

For a single precision floating point number representation in IEEE-754 format, how many bits are used to represent mantissa?

- a. 23
- b. 24
- c. 8
- d. 127
- e. None of these

**Correct Answer: a**

**Detailed Solution:**

In a single-precision floating-point number representation, 23 bits are used to represent mantissa.

The correct option is (a).

---

### **QUESTION 4:**

Consider a floating point representation with 39-bit mantissa (including the implied bit), the number of significant digit in decimal will be \_\_\_\_\_

**Correct Answer: 11 or 12**

**Detailed Solution:**

For 39-bit mantissa, let  $y$  denotes the number of equivalent digits in decimal.

We can write,  $2^{39} = 10^y$

Or,  $39 \log_{10} 2 = y \log_{10} 10$

Or,  $y = 11.74 \rightarrow 11$  to 12 significant digits in decimal.
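The relation $y = n \log_{10} 2$ is easy to evaluate for any mantissa width. A short sketch with an illustrative function name:

```python
import math


def decimal_digits(mantissa_bits):
    """Number of decimal digits equivalent to the given number of mantissa bits."""
    return mantissa_bits * math.log10(2)


print(round(decimal_digits(39), 2))  # 11.74
```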

---

### **QUESTION 5:**

The interpretation of the single precision floating point number 1000 0000 0001 1111 0000 0000 0000 1111 represented in IEEE-754 format will be:

- a. 0
- b. Very close to 0
- c. Not a number
- d. Infinity

**Correct Answer: b**

**Detailed Solution:**

**As the exponent E is all 0s and the mantissa M is not equal to 0, the number is denormalized and its value is very close to 0.**

**The correct option is (b)**
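The classification can be verified by splitting the bit pattern into its sign, exponent and fraction fields. The sketch below uses Python's standard `struct` module to obtain the actual value; the function name and the category labels are illustrative.

```python
import struct


def classify_single(bits):
    """Classify a 32-bit IEEE-754 pattern given as a string of 0s and 1s."""
    word = int(bits.replace(" ", ""), 2)
    exponent = (word >> 23) & 0xFF      # 8-bit exponent field E
    fraction = word & 0x7FFFFF          # 23-bit mantissa field M
    value = struct.unpack(">f", word.to_bytes(4, "big"))[0]
    if exponent == 0:
        kind = "zero" if fraction == 0 else "denormalized"
    elif exponent == 0xFF:
        kind = "infinity" if fraction == 0 else "not a number"
    else:
        kind = "normalized"
    return kind, value


kind, value = classify_single("1000 0000 0001 1111 0000 0000 0000 1111")
print(kind, value)
```

For the pattern in the question, the result is a denormalized negative value with a magnitude on the order of $10^{-39}$.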

---

**QUESTION 6:**

Consider the following statements:

- (i) IEEE-754 floating point representation requires shifting of mantissa for multiplication and division operation.
- (ii) Shifting of mantissa affects exponent value of the number.

Which of the following statement is correct?

- a. Only (i) is true
- b. Only (ii) is true
- c. Both (i) and (ii) are true
- d. Both (i) and (ii) are false.

**Correct Answer: b**

**Detailed Solution:**

**Alignment of mantissa is required only for addition and subtraction operation.**

**When we shift the mantissa, the exponent has to be modified at the same time to keep the value of the number the same.**

**The correct option is (b).**

---

**QUESTION 7:**

What is required if we shift the mantissa left by 3 positions to normalize a number represented in IEEE754 format?

- a. We have to decrement Exponent by 3
- b. We have to increment Exponent by 3
- c. We have to decrement Exponent by  $2^3$
- d. We have to increment Exponent by  $2^3$

**Correct Answer:** a

**Detailed Solution:**

Normalization is done by shifting the mantissa, and each shift of the mantissa must be compensated in the exponent. For a left shift by 3 positions, the value of E must be decreased by 3.

The correct option is (a).

---

#### **QUESTION 8:**

Consider a 6-stage pipeline with stage delays of 45, 20, 15, 42, 23 and 20 nanoseconds respectively. Ignoring the delay of the latches between stages, the total time required to process 1500 sets of data in the pipeline will be ..... microseconds. (Provide your answer up-to two decimal places)

**Correct Answer: 67.72 to 67.73**

**Detailed Solution:**

The minimum clock period  $T = (\text{delay of slowest stage}) = 45 \text{ nsec}$ .

Total time required to process 1500 data sets is given by

$$((6 - 1) + 1500) \times 45 \text{ nsec} = 67725 \text{ nsec} = 67.725 \text{ microseconds.}$$
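The formula $((k - 1) + n) \times T$ can be wrapped in a small helper. This sketch ignores latch delay, as the question does; the names are illustrative.

```python
def pipeline_time_ns(stage_delays_ns, num_items):
    """Total time for num_items through a linear pipeline, ignoring latch delay."""
    clock_ns = max(stage_delays_ns)                  # slowest stage sets the clock
    cycles = (len(stage_delays_ns) - 1) + num_items  # fill time + one item per cycle
    return cycles * clock_ns


total_ns = pipeline_time_ns([45, 20, 15, 42, 23, 20], 1500)
print(total_ns / 1000)  # 67.725
```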

---

#### **QUESTION 9:**

Consider the following reservation table:

|       | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 |
|-------|---|---|---|---|---|---|---|---|
| $S_1$ | X |   |   |   | X |   |   | X |
| $S_2$ |   | X |   |   |   | X |   | X |
| $S_3$ |   |   | X |   | X |   |   |   |

In the reservation table multiple X in a row represents:

- a. Repeated use of the same stage in different cycles
- b. Extended use of a stage for more than one cycle
- c. Parallel use of stage in same cycle
- d. None of these.

**Correct Answer: a**

**Detailed Solution:**

In a reservation table, the rows represent the stages of the pipeline and the columns represent the time steps.

Thus, multiple X marks in a row show the use of the same stage in different cycles.

The correct option is (a).

---

**QUESTION 10:**

For the following reservation table of an 8-stage pipeline, what will be the minimum average latency?

|       | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 |
|-------|---|---|---|---|---|---|---|---|
| $S_1$ | X |   |   |   | X |   |   | X |
| $S_2$ |   | X |   |   | X |   |   | X |
| $S_3$ |   |   | X |   | X |   |   |   |

- a. 5.0 clock cycles
- b. 4.5 clock cycles
- c. 8.0 clock cycles
- d. 1.0 clock cycles

**Correct Answer: b**

**Detailed Solution:**

The forbidden latencies are 2, 3, 4, 6, and 7.

The possible latency cycles are:

$$(1, 8) = (1, 8, 1, 8, \dots) \rightarrow \text{average latency} = (1+8)/2 = 4.5$$
$$(5) = (5, 5, 5, \dots) \rightarrow \text{average latency} = (5+5)/2 = 5.0$$

The minimum average latency = 4.5

The same result can be obtained by constructing the state diagram of permissible latencies, and determining the cycle with minimum average latency.

The correct option is (b).
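The state-diagram approach mentioned in the solution can be sketched in code. The version below is illustrative only: each row of the reservation table is given as the list of clock cycles in which that stage is used (a format chosen here for convenience), a state is a collision vector stored as an integer, and the latency one larger than the largest forbidden latency stands for any latency that returns to the initial state.

```python
def forbidden_latencies(reservation_rows):
    """Distances between any two X marks in the same row."""
    forbidden = set()
    for row in reservation_rows:
        for i in row:
            for j in row:
                if j > i:
                    forbidden.add(j - i)
    return forbidden


def minimum_average_latency(reservation_rows):
    """Search the state diagram for the simple cycle with the lowest average latency."""
    forbidden = forbidden_latencies(reservation_rows)
    n = max(forbidden)
    initial = sum(1 << (lat - 1) for lat in forbidden)  # collision vector
    best = float("inf")

    def explore(state, path_states, path_latencies):
        nonlocal best
        for latency in range(1, n + 2):
            if latency <= n and (state >> (latency - 1)) & 1:
                continue  # this latency would cause a collision
            next_state = (state >> latency) | initial
            if next_state in path_states:
                start = path_states.index(next_state)
                cycle = path_latencies[start:] + [latency]
                best = min(best, sum(cycle) / len(cycle))
            else:
                explore(next_state, path_states + [next_state],
                        path_latencies + [latency])

    explore(initial, [initial], [])
    return best


rows = [[1, 5, 8], [2, 5, 8], [3, 5]]
print(sorted(forbidden_latencies(rows)))  # [2, 3, 4, 6, 7]
print(minimum_average_latency(rows))      # 4.5
```

The exhaustive search is adequate for small tables like this one; it is not meant as an efficient general method.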

---




\*\*\*\*\*END\*\*\*\*\*



**Course Name: Computer Architecture and Organization**

**Assignment- Week 9**

**TYPE OF QUESTION: MCQ/MSQ/SA**

**Number of questions: 10**

**Total mark: 10 x 1 = 10**

---

**QUESTION 1:**

Consider a hard disk that is rotating with a speed of 9500 rpm. The maximum rotational delay or latency will be ..... milliseconds? (Give answer correct up-to two decimal places).

**Correct Answer: 6.31 to 6.32**

**Detailed Solution:**

The disk makes 9500 revolutions in 1 minute = 60 sec

So, it will make 1 revolution in  $60 / 9500 \text{ sec} = 60 \times 1000 / 9500 \text{ msec} = 6.315 \text{ msec}$
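The conversion from rotational speed to latency can be written as a one-line helper. In this sketch the maximum latency is taken as one full revolution, as in the solution above; the average latency of half a revolution is a standard result added here for comparison.

```python
def rotational_latency_ms(rpm):
    """Return (maximum, average) rotational latency in milliseconds."""
    revolution_ms = 60 * 1000 / rpm   # time for one full revolution
    return revolution_ms, revolution_ms / 2


max_ms, avg_ms = rotational_latency_ms(9500)
print(round(max_ms, 3), round(avg_ms, 3))
```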

---

**QUESTION 2:**

Which of the following memory devices cannot be used for backup (or as a secondary storage device)?

- a. DRAM
- b. SRAM
- c. Floppy Disk
- d. Hard Disk
- e. Flash Memory

**Correct Answer: a, b**

**Detailed Solution:**

DRAM and SRAM are volatile memories, which retain data only while power is supplied; thus they cannot be used as secondary storage.

The correct options are (a) and (b).

---




### **QUESTION 3:**

Which of the following statement(s) is/are true for hard disk?

- a. It is faster than Solid-state drives.
- b. Sector is the smallest unit of data transfer.
- c. It does not have any moving parts.
- d. It is volatile in nature.
- e. None of these.

**Correct Answer: b**

**Detailed Solution:**

Hard disks are slower than solid-state drives, and they have moving parts (viz. the head, which is used for reading and writing data). The hard disk surface is divided into tracks, and tracks are further divided into sectors; a sector is the smallest unit of data transfer. Also, as it is a secondary storage device, it is non-volatile.

The correct option is (b).

---

### **QUESTION 4:**

Consider a hard disk with 2 double sided platters, 2500 tracks per surface, 200 sectors per track, and sector size of 1024 bytes. The total capacity of the disk will be ..... Giga bytes.

(Assume 1024 = 1K)

**Correct Answer: 1.90 to 1.91**

**Detailed Solution:**

$$\text{Bytes/Track} = 200 * 1024 = 200\text{K}$$

$$\text{Bytes/Surface} = 200\text{K} * 2500 = 500000\text{K}$$

$$\text{Total Capacity} = 4 * 500000\text{K} = 2000000\text{K} \rightarrow 1.907 \text{ GB}$$
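The capacity is just the product of the geometry parameters. A small sketch with illustrative names, using 1 GB = $2^{30}$ bytes as the question assumes:

```python
def disk_capacity_bytes(platters, surfaces_per_platter, tracks_per_surface,
                        sectors_per_track, bytes_per_sector):
    """Total capacity as the product of all geometry parameters."""
    return (platters * surfaces_per_platter * tracks_per_surface
            * sectors_per_track * bytes_per_sector)


capacity = disk_capacity_bytes(2, 2, 2500, 200, 1024)
print(round(capacity / 2**30, 3))  # 1.907
```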

---

### **QUESTION 5:**

Which of the following operation is used to read a bit from floating gate transistor?

- a. Apply read voltage at control gate and measure drain current.
- b. Apply read voltage at control gate and measure source current.
- c. Apply read voltage at floating gate and measure drain current.
- d. Apply read voltage at floating gate and measure source current.

**Correct Answer: a**



**Detailed Solution:**

To read the value stored in a floating gate transistor, a voltage  $V_{read}$  is first applied at the control gate, and the current at the drain is then measured.

The correct option is (a).

---

**QUESTION 6:**

Which of the following statement(s) is/are true for I/O device interfacing?

- a. An input port is implemented using a tri-state bus driver.
- b. An input port is implemented using a parallel-in parallel-out register.
- c. An output port is implemented using a tri-state bus driver.
- d. An output port is implemented using a parallel-in parallel-out register.

**Correct Answer: a, d**

**Detailed Solution:**

For I/O device interfacing, an input port is implemented using a tri-state bus driver, whereas an output port is implemented using a parallel-in parallel-out register. An output port is used to interface output devices.

The correct options are (a) and (d).

---

**QUESTION 7:**

Which of the following is/are true for I/O mapped device interfacing?

- a. Separate address decoder is used to select memory and I/O ports.
- b. Some of the memory address space is occupied by I/O devices.
- c. Same instructions for memory and I/O operations.
- d. All of these.

**Correct Answer: a**

**Detailed Solution:**

In I/O mapped device interfacing, we make a distinction between memory locations and I/O ports, and separate processor instructions are used for read/write operations on memory and I/O devices. Also, a separate address decoder is used to select memory modules and I/O ports.

The correct option is (a).

---

### **QUESTION 8:**

Assume that we are executing some program P1, and during the execution the interrupts listed below are generated. Mark the interrupt types which allow the instruction being executed to be completed before the interrupt is handled.

- a. Timer interrupt
- b. Page fault interrupt
- c. I/O interrupt
- d. All of these.

**Correct Answer: a, c**

**Detailed Solution:**

I/O interrupt and Timer interrupt can be handled after finishing the execution of the current instruction. However, page fault occurs when some requested data is not found in memory, which makes it necessary to re-execute the instruction after handling the interrupt.

The correct options are (a) and (c).

---

### **QUESTION 9:**

Which of the following statement(s) is/are true for data transfer techniques?

- a. Interrupt-driven technique transfers data faster than DMA mode of data transfer.
- b. Asynchronous data transfer can be used for high-speed devices.
- c. Interrupt-driven data transfer wastes more CPU time than asynchronous data transfer.
- d. None of these

**Correct Answer: d**

**Detailed Solution:**

DMA provides the fastest data transfer rate among the data transfer techniques. Asynchronous data transfer cannot be used for high-speed devices, as it wastes a lot of CPU time checking device status after sending some blocks of data. Also, interrupt-driven transfer is faster than asynchronous data transfer. None of the statements is true.

The correct option is (d).

---

**QUESTION 10:**

Synchronous data transfer mode can be used for keyboard?

- a. Yes
- b. No

**Correct Answer: b**

**Detailed Solution:**

Synchronous data transfer mode cannot be used to interface a keyboard, as it allows data transfer at a fixed speed only, and we cannot ensure that the end user typing on the keyboard will do so at a constant speed.

The correct option is (b).

---

\*\*\*\*\*END\*\*\*\*\*

**Course Name: Computer Architecture and Organization**

**Assignment- Week 10**

**TYPE OF QUESTION: MCQ/MSQ/SA**

**Number of questions: 10**

**Total mark: 10 x 1 = 10**

---

**QUESTION 1:**

Consider the following statements:

- (i) In programmed I/O several instructions are executed for transfer of each word of data.
- (ii) Programmed I/O is not suitable for high-speed data transfer.

Which of the following is correct?

- a. Only (i) is true.
- b. Only (ii) is true.
- c. Both (i) and (ii) are true.
- d. Both (i) and (ii) are false.

**Correct Answer: c**

**Detailed Solution:**

**Programmed I/O is not suitable for high-speed data transfer due to the following reasons:**

- i) Several program instructions have to be executed for each data word transferred between the I/O device and memory.
- ii) Many high-speed peripheral devices like disks have a synchronous mode of operation, where data are transferred at a fixed rate. This sustained data transfer rate is comparable to the memory bandwidth and cannot be handled by programmed I/O.

**The correct option is (c).**

**QUESTION 2:**

Consider a programmed I/O system where 20 instructions are required to be executed for the transfer of each word of data. The cycles-per-instruction (CPI) of the machine is 1.5 and the processor clock frequency is 2 GHz. The maximum data transfer rate will be \_\_\_\_\_ million words per second. (Assume 1 million =  $10^6$ )

**Correct Answer: Range 66.00 to 67.50**



**Detailed Solution:**

Time required for each word transfer =  $20 * 1.5 * 1 / (2 \times 10^9) = 15 \text{ nsec}$

So, maximum data transfer rate =  $1 / (15 \times 10^{-9}) = 66666666.666 = 66.67 \text{ million words per second.}$
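The same arithmetic in code form, as a sketch with illustrative names:

```python
def max_words_per_second(instructions_per_word, cpi, clock_hz):
    """Upper bound on programmed I/O throughput in words per second."""
    cycles_per_word = instructions_per_word * cpi
    return clock_hz / cycles_per_word


rate = max_words_per_second(20, 1.5, 2e9)
print(round(rate / 1e6, 2))  # 66.67
```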

---

**QUESTION 3:**

Which of the following statement(s) is/are true for DMA data transfer?

- a. Data transfer requires very less CPU intervention.
- b. Suitable for transferring large blocks of data
- c. Allow direct data transfer between I/O and memory.

**Correct Answer: a, b, c**

**Detailed Solution:**

DMA mode of data transfer allows direct data transfer between I/O device and memory. It only requires CPU intervention at start and at the end, thus suitable for transferring large blocks. All the statements are true.

The correct options are (a), (b) and (c).

---

**QUESTION 4:**

Suppose a disk rotates at 36000 rpm with an average rotational delay of 10 msec, and 512 Kbytes of data are recorded in every track. Once the disk head reaches the desired track, the sustained data transfer rate will be ..... Mbyte/sec.

**Correct Answer: Range 51.10 to 51.30**

**Detailed Solution:**

Data transfer rate will be  $512\text{KB} / 10\text{msec} = 51.2 \text{ MBps}$

---

**QUESTION 5:**

Which of the following registers needs to be initialized before transfer of any data in DMA mode?



- a. Memory address
- b. Word count
- c. Address of data on disk
- d. None of these.

**Correct Answer: a, b, c**

**Detailed Solution:**

For every DMA channel, the DMA controller will have three registers: Memory address, Word count, Address of data on disk which are initialized by CPU before each DMA transfer operation.

---

**QUESTION 6:**

Consider the following statements for Bus implementation:

- (i) Bus width defines number of wires available in the bus for transferring data.
- (ii) Bus bandwidth defines the total amount of data that can be transferred over the bus per unit of time.

- a. Only (i) is true
- b. Only (ii) is true
- c. Both (i) and (ii) are true
- d. Both (i) and (ii) are false

**Correct Answer: c**

**Detailed Solution:**

Both the statements are true, as bus width = number of wires available and bus bandwidth indicates total amount of data that can be transferred over the bus per unit of time.

The correct option is (c).

---

**QUESTION 7:**

Consider a matrix keyboard consisting of 256 keys, organized as 16 rows and 16 columns. How many port lines will be required to interface the keyboard?



- a. 256
- b. 128
- c. 64
- d. 32

**Correct Answer: d**

**Detailed Solution:**

If the keyboard is organized in a matrix format the number of port lines required will be the sum of the number of rows and number of columns.

Here,  $16 + 16 = 32$

The correct option is (d).

---

**QUESTION 8:**

Suppose that it is required to transfer 20K bytes in interrupt-driven mode of data transfer. Every time an interrupt occurs, it involves the transfer of 64 bytes of data that takes 20 microseconds for the processor to service. The time required to transfer 20K bytes of data will be ..... milliseconds? (Assume 1K = 1024)

**Correct Answer: 6.35 to 6.45**

**Detailed Solution:**

To transfer 64 Bytes of data, we require 20 microseconds.

So, to transfer 1 byte of data, we require  $20 / 64 = 0.3125$  microseconds.

So, to transfer 20K bytes of data, we require  $20K * 0.3125$  microseconds = 6400 microseconds = 6.4 milliseconds.
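An equivalent way to view the calculation is to count the interrupts and multiply by the service time. A minimal sketch, assuming each interrupt moves a full 64-byte chunk and nothing else contributes to the time; the names are illustrative.

```python
def interrupt_transfer_time_us(total_bytes, bytes_per_interrupt, service_time_us):
    """Time to move total_bytes when each interrupt transfers a fixed chunk."""
    interrupts = -(-total_bytes // bytes_per_interrupt)  # ceiling division
    return interrupts * service_time_us


time_us = interrupt_transfer_time_us(20 * 1024, 64, 20)
print(time_us / 1000)  # 6.4
```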

---

**QUESTION 9:**

Which of the following is/are advantage of serial bus over parallel bus?

- a. Low implementation cost
- b. High speed
- c. No interference
- d. All of these



**Correct Answer: a, c**

**Detailed Solution:**

The advantages of a serial bus are that its implementation cost is very low and that it does not produce any interference; however, its transfer speed is slow.

The correct options are (a) and (c).

---

**QUESTION 10:**

The maximum data transfer rates supported by USB 1.1 and USB 3.0 standards are respectively:

- a. 12 Gbps and 50 Gbps
- b. 12 Mbps and 5 Gbps
- c. 5 Gbps and 10 Gbps
- d. 2 Gbps and 10 Gbps

**Correct Answer: b**

**Detailed Solution:**

The maximum data transfer rates supported by various USB versions are:

| USB version | Maximum data transfer rate |
|-------------|----------------------------|
| USB 1.1     | Up to 12 Mbps              |
| USB 2.0     | Up to 480 Mbps             |
| USB 3.0     | Up to 5 Gbps               |
| USB 3.1     | Up to 10 Gbps              |

The correct option is (b).

---

\*\*\*\*\*END\*\*\*\*\*



## Course Name: Computer Architecture and Organization

### Assignment- Week 11

TYPE OF QUESTION: MCQ/MSQ/SA

Number of questions: 10

Total mark:  $10 \times 1 = 10$

---

#### **QUESTION 1:**

Consider a 5-stage instruction pipeline with stage delays of 20 nsec, 25 nsec, 35 nsec, 30 nsec, and 22 nsec respectively. The delay of an inter-stage register stage of the pipeline is 2 nsec. The total time required for the execution of 1000 instructions will be ..... microseconds.

**Correct Answer: 37.10 to 37.20**

**Detailed Solution:**

$$\text{Pipeline clock period } T = \max. \{20, 25, 35, 30, 22\} + 2 \text{ nsec} = 35 + 2 = 37 \text{ nsec}$$

Number of stages  $k = 5$

$$\begin{aligned}\text{Total time required} &= ((k - 1) + 1000) * 37 \text{ nsec} \\ &= (4 + 1000) * 37 \text{ nsec} = 37148 \text{ nsec} \\ &= 37.148 \text{ microseconds}\end{aligned}$$

---

#### **QUESTION 2:**

Consider a 3-stage instruction pipeline with stage delays of 25 nsec, 30 nsec and 15 nsec respectively, and the delay of an inter-stage register stage of 5 nsec. Suppose the pipeline is modified by splitting the 1<sup>st</sup> stage into two simpler stages with delays 10 nsec and 15 nsec, and 2<sup>nd</sup> stage into two simpler stages with delays 15 nsec and 15 nsec. For the execution of 1000 instructions, the speedup of the new 5-stage pipeline over the previous 3-stage pipeline will be \_\_\_\_\_.

**Correct Answer: 1.70 to 1.80**

**Detailed Solution:**

**For the 3-stage pipeline:**

$$\text{Pipeline clock period } T_1 = 30 + 5 = 35 \text{ nsec}$$

Number of stages  $k_1 = 3$

$$\text{Time required} = ((3 - 1) + 1000) * 35 \text{ nsec} = 35.07 \text{ microseconds}$$



For the 5-stage pipeline:

$$\text{Pipeline clock period } T_2 = 15 + 5 = 20 \text{ nsec}$$

$$\text{Number of stages } k_2 = 5$$

$$\text{Time required} = ((5 - 1) + 1000) * 20 \text{ nsec} = 20.08 \text{ microseconds}$$

$$\text{Hence, the speedup} = 35.07 / 20.08 = 1.746$$
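The comparison can be reproduced by computing the execution time of each pipeline with the latch delay included in the clock period. A sketch with illustrative names:

```python
def pipelined_time_ns(stage_delays_ns, latch_delay_ns, num_instructions):
    """Execution time when the clock period is the slowest stage plus latch delay."""
    clock_ns = max(stage_delays_ns) + latch_delay_ns
    return (len(stage_delays_ns) - 1 + num_instructions) * clock_ns


three_stage_ns = pipelined_time_ns([25, 30, 15], 5, 1000)
# Stage 1 split into 10 + 15 nsec, stage 2 split into 15 + 15 nsec.
five_stage_ns = pipelined_time_ns([10, 15, 15, 15, 15], 5, 1000)
print(three_stage_ns, five_stage_ns)              # 35070 20080
print(round(three_stage_ns / five_stage_ns, 2))   # 1.75
```

The speedup comes entirely from the shorter clock period; the two extra fill cycles of the deeper pipeline are negligible over 1000 instructions.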

---

### **QUESTION 3:**

Consider a non-pipelined CPU working at 1GHz clock. The frequency of ALU operations, branches and memory operations are 50%, 25% and 25% respectively. If ALU and memory operations take 4 cycles and branch operation takes 7 cycles, the average instruction execution time will be ..... nsec.

**Correct Answer: 4.70 to 4.80**

**Detailed Solution:**

$$\text{Clock period} = 1 \text{ nsec}$$

$$\text{Average execution time} = 1\text{ns} \times (.5*4 + .25*7 + .25*4) = 4.75 \text{ nsec.}$$

---

### **QUESTION 4:**

Which of the following statement(s) is/ are true?

- a. Pipeline hazards prevent pipeline from operating at its maximum possible speed.
- b. Data hazard arise due to resource conflicts.
- c. Control hazard arise due to branch and other instructions that change the PC.
- d. All of these.

**Correct Answer: a, c**

**Detailed Solution:**

Ideally, a pipelined implementation can complete the execution of one instruction in every clock cycle, but hazards such as data dependencies between instructions, branch instructions, etc. prevent the pipeline from achieving its top speed. Data hazards arise due to data dependencies between instructions, and control hazards arise due to branches and other instructions that change the PC.

Thus options (a) and (c) are true.



---

### **QUESTION 5:**

Consider a non-pipelined processor with a clock rate of 4 GHz and average cycles per instruction of 10. The same processor is upgraded to a 6-stage pipelined processor but due to the internal pipeline delay, the clock rate is reduced to 2 GHz. Assume there are no stalls in the pipeline. The speed up achieved in this pipelined processor will be .....

**Correct Answer: 4.9 to 5.1**

**Detailed Solution:**

**For non-pipelined processor:**

Cycle time = 1 / (4 GHz) = 0.25 nsec

Thus a single instruction will execute in $10 \times 0.25 = 2.5$ nsec

**For pipelined processor:**

Cycle time = 1 / (2 GHz) = 0.5 nsec

Thus a single instruction will complete in $1 \times 0.5 = 0.5$ nsec

(As there are no stalls, ideally one instruction completes in every clock cycle.)

Speedup = $2.5 / 0.5 = 5$
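The same calculation as a small Python sketch, assuming (as the solution does) an effective CPI of 1 for the stall-free pipeline. The helper name is illustrative.

```python
def instruction_time_ns(clock_ghz, cpi):
    """Average time per instruction: CPI multiplied by the cycle time."""
    cycle_ns = 1.0 / clock_ghz
    return cpi * cycle_ns


non_pipelined_ns = instruction_time_ns(4.0, 10)  # 10 cycles at 0.25 nsec each
pipelined_ns = instruction_time_ns(2.0, 1)       # 1 cycle at 0.5 nsec, no stalls
print(non_pipelined_ns / pipelined_ns)           # 5.0
```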

---

### **QUESTION 6:**

Consider the execution of the following instructions in a 5-stage MIPS pipeline (IF, ID, EX, MEM, WB):

**1: ADD R2, R5, R8**

**2: MUL R1, R4, R5**

**3: SUB R9, R2, R6**

**4: ADD R1, R5, R6**

The **Read After Write (RAW)** data dependency that can lead to a data hazard exists between which pair of instructions?

- a. 1 and 2
- b. 1 and 3
- c. 2 and 3
- d. 2 and 4

**Correct Answer: b**

**Detailed Solution:**

The dependency that exists between instructions 1 and 3 is a RAW data dependency (instruction 3 should not read the content of R2 before the correct value is written by instruction 1).

Thus the correct option is (b).
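The check can be sketched in Python by listing each instruction's destination and source registers and searching for a later instruction that reads a register written earlier. This is only an illustrative sketch: the tuple encoding and the `raw_pairs` helper are assumptions, not part of the course material.

```python
from typing import List, Tuple

# Each instruction is (destination register, source registers).
Instr = Tuple[str, Tuple[str, ...]]


def raw_pairs(program: List[Instr]) -> List[Tuple[int, int]]:
    """Return 1-based (producer, consumer) pairs with a RAW dependency."""
    pairs = []
    for i, (dest, _) in enumerate(program):
        for j in range(i + 1, len(program)):
            later_dest, later_srcs = program[j]
            if dest in later_srcs:
                pairs.append((i + 1, j + 1))
            if later_dest == dest:
                break  # register overwritten; later readers depend on the new writer
    return pairs


program = [
    ("R2", ("R5", "R8")),  # 1: ADD R2, R5, R8
    ("R1", ("R4", "R5")),  # 2: MUL R1, R4, R5
    ("R9", ("R2", "R6")),  # 3: SUB R9, R2, R6
    ("R1", ("R5", "R6")),  # 4: ADD R1, R5, R6
]
print(raw_pairs(program))  # [(1, 3)]
```

Instructions 2 and 4 both write R1, which is a WAW relationship rather than a RAW one, so the sketch does not report it.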

---

**QUESTION 7:**

For the following MIPS32 program segment, how many stall cycles will be required .....?

1: LW R5, 200(R2)

2: ADD R1, R6, R8

3: SUB R3, R5, R8

**Correct Answer: 0**

**Detailed Solution:**

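Between instructions (1, 2) and (2, 3), there is no data dependency. Between instructions (1, 3) there is a data dependency on R5, but the two instructions are separated by one independent instruction (a distance of 2), so with data forwarding the loaded value is available before instruction 3 needs it. Hence there will be no hazard and no stall cycles.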
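As a rough sketch of this reasoning in Python: the function below assumes a classic 5-stage pipeline with data forwarding, where a loaded value becomes available at the end of MEM and is needed at the start of EX. The function name and the notion of "distance" (1 for adjacent instructions) are illustrative assumptions.

```python
def load_use_stalls(distance: int) -> int:
    """Stall cycles when a consumer follows a load by `distance` instructions.

    Assumes a 5-stage pipeline (IF, ID, EX, MEM, WB) with forwarding:
    an adjacent consumer (distance 1) waits one cycle, while a consumer
    two or more instructions later needs no stall.
    """
    return max(0, 2 - distance)


print(load_use_stalls(1))  # SUB immediately after LW: 1 stall cycle
print(load_use_stalls(2))  # one instruction in between, as in this question: 0
```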

---

**QUESTION 8:**

Which of the following data hazards can cause performance degradation in the MIPS32 integer pipeline?

- a. WAR data hazard.
- b. WAW data hazard.
- c. RAW data hazard.
- d. Memory load followed by use of the loaded data.

**Correct Answer: c, d**



**Detailed Solution:**

For the MIPS32 integer pipeline, only RAW hazards can affect pipeline performance; a memory load followed by use of the loaded data is a special case of a RAW hazard. WAR and WAW hazards are not possible in the integer pipeline.

Thus, options (c) and (d) are correct.

---

**QUESTION 9:**

Consider the MIPS32 pipeline with ideal CPI of 1.5. Assume that 30% of all instructions executed are branch, out of which 90% are taken branches. The pipeline speedup for (i) predict taken and (ii) predict not taken approaches to reduce branch penalties will be approximately:

- a. 3.94, 3.85
- b. 4.34, 4.29
- c. 3.85, 4.34
- d. 3.85, 3.94

**Correct Answer: d**

**Detailed Solution:**

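The calculation uses Speedup = Pipeline depth / (1 + Branch frequency × Branch penalty), with a pipeline depth of 5.

For predict-taken:

$$\text{Branch penalty} = 1 \text{ (incurred by every branch)}$$

$$\text{Speedup} = 5 / (1 + 0.30 \times 1) = 3.846$$

For predict-not-taken:

$$\text{Branch penalty} = 1 \text{ (incurred only by taken branches)}$$

$$\text{Speedup} = 5 / (1 + 0.30 \times 0.90 \times 1) = 3.937$$

The correct option is (d).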
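The two speedups can be reproduced with a short Python sketch. It mirrors the numbers used in the solution above (pipeline depth 5, a one-cycle penalty); the function and parameter names are assumptions made for illustration.

```python
def branch_speedup(depth, branch_freq, penalized_fraction, penalty_cycles):
    """Pipeline speedup when only branch stalls add to a base of one cycle."""
    stall_cycles_per_instruction = branch_freq * penalized_fraction * penalty_cycles
    return depth / (1 + stall_cycles_per_instruction)


# Predict-taken: every branch pays the one-cycle penalty.
predict_taken = branch_speedup(5, 0.30, 1.0, 1)
# Predict-not-taken: only the 90% of branches that are taken pay it.
predict_not_taken = branch_speedup(5, 0.30, 0.90, 1)
print(f"{predict_taken:.3f} {predict_not_taken:.3f}")  # 3.846 3.937
```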

---

**QUESTION 10:**

In a MIPS pipeline with Branch Target Buffer (BTB), assume that 85% of the branches are found in BTB, 15% of the predictions are incorrect, and 75% of the branches are taken. The branch penalty will be ..... clock cycles.

**Correct Answer: 0.49 to 0.53**

**Detailed Solution:**

$$\begin{aligned}\text{Branch penalty} &= (\% \text{ Branches found in BTB} \times \% \text{ Mispredictions} \times 2) \\ &\quad + (\% \text{ Branches not found in BTB} \times \% \text{ Taken branches} \times 2) \\ &\quad + (\% \text{ Branches not found in BTB} \times \% \text{ Not-taken branches} \times 1) \\ &= (0.85 \times 0.15 \times 2) + (0.15 \times 0.75 \times 2) + (0.15 \times 0.25 \times 1) \\ &= 0.255 + 0.225 + 0.0375 = 0.5175 \text{ clock cycles}\end{aligned}$$
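A minimal Python sketch of the same three-term sum, using the cycle costs from the solution above (2, 2 and 1). The function and parameter names are illustrative assumptions.

```python
def btb_branch_penalty(hit_rate, mispredict_rate, taken_rate):
    """Average branch penalty in clock cycles for a pipeline with a BTB."""
    miss_rate = 1 - hit_rate
    return (
        hit_rate * mispredict_rate * 2        # found in BTB but mispredicted
        + miss_rate * taken_rate * 2          # not in BTB, branch taken
        + miss_rate * (1 - taken_rate) * 1    # not in BTB, branch not taken
    )


print(f"{btb_branch_penalty(0.85, 0.15, 0.75):.4f}")  # 0.5175
```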

---

\*\*\*\*\*END\*\*\*\*\*



**Course Name: Computer Architecture and Organization**

**Assignment- Week 12**

**TYPE OF QUESTION: MCQ/MSQ/SA**

**Number of questions: 8**

**Total mark:  $8 \times 1.25 = 10$**

---

**QUESTION 1:**

In the MIPS32 floating-point extension, for double-precision operations the register pair <F14, F15> is referred to as:

- a. F15
- b. F14
- c. F13
- d. F12
- e. None of these

**Correct Answer: b**

**Detailed Solution:**

For double-precision operations, 64-bit register pairs are used to store the operands and also the result. For this purpose we need to pair two 32-bit registers to hold a 64-bit value. In this case the register pair <F14, F15> is referred to as F14.

Thus, option (b) is correct.

---

**QUESTION 2:**

Consider the given floating-point instruction:

**L.D F2, 400(R6)**

The data from location [R6+400] and [R6+404] will be loaded to which of the following registers?

- a. F1, F2
- b. F2, F3
- c. F3, F2
- d. None of these

**Correct Answer: b**




**Detailed Solution:**

For double-precision operations, the data are loaded into register pairs; F2 actually refers to the register pair <F2, F3>. The data from locations R6+400 and R6+404 will be loaded into registers F2 and F3 respectively.

Thus, option (b) is correct.

---

**QUESTION 3:**

Which of the following techniques can be used to improve the CPI?

- a. Sequencing unrelated instructions
- b. Separating related instructions
- c. Loop unrolling
- d. None of these

**Correct Answer: a, b, c**

**Detailed Solution:**

By identifying related and unrelated instructions, we can form a sequence of unrelated instructions that can be overlapped without causing hazards. Similarly, we can separate related instructions by an appropriate number of clock cycles to avoid hazards. In this way we can exploit parallelism, which lowers the CPI. Replicating the body of the loop multiple times using loop unrolling reduces the loop overhead per iteration.

Thus, options (a), (b) and (c) are correct.

---

**QUESTION 4:**

Which of the following statement(s) is/are false for a superscalar MIPS32 machine?

- a. It can issue multiple independent instructions every clock cycle.
- b. It can result in a CPI of less than 1.
- c. It can dynamically check dependency between instructions.
- d. It consists of more than one functional unit that can run in parallel.
- e. None of these.




**Correct Answer:** e

**Detailed Solution:**

A superscalar machine identifies the dependencies between instructions and issues multiple independent instructions to multiple functional units in a single clock cycle, which results in a CPI of less than 1. The dependencies among instructions are checked dynamically by the hardware. Statements (a) to (d) are all true, so none of them is false.

Thus option (e) is correct.

---

**QUESTION 5:**

Loop unrolling requires a significantly greater number of registers?

- a. True
- b. False

**Correct Answer:** a

**Detailed Solution:**

Loop unrolling consists of replicating the body of a loop such that more instruction level parallelism can be exposed. However, the number of registers required increases significantly.

Thus, option (a) is correct.
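The structure of an unrolled loop can be illustrated with a small Python sketch. Python itself gains nothing from this transformation; the example only shows the shape a compiler would produce, with the four accumulators standing in for the extra registers. The function names and the unroll factor of 4 are assumptions chosen for illustration.

```python
def sum_rolled(values):
    total = 0
    for v in values:  # one loop test and index update per element
        total += v
    return total


def sum_unrolled_by_4(values):
    # Four independent accumulators: more "registers", fewer loop iterations.
    s0 = s1 = s2 = s3 = 0
    n = len(values) - len(values) % 4
    for i in range(0, n, 4):  # one loop test and index update per four elements
        s0 += values[i]
        s1 += values[i + 1]
        s2 += values[i + 2]
        s3 += values[i + 3]
    # Handle leftover elements when the length is not a multiple of 4.
    return s0 + s1 + s2 + s3 + sum(values[n:])


data = list(range(10))
print(sum_rolled(data), sum_unrolled_by_4(data))  # 45 45
```

The four additions in the unrolled body do not depend on one another, which is what exposes the extra instruction-level parallelism at the cost of more live values.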

---

**QUESTION 6:**

Which of the following is/are advantage(s) of a vector processor?

- a. It gives good speedup when we carry out similar operations on vectors.
- b. No loop overhead.
- c. The number of instructions gets reduced.
- d. None of these.

**Correct Answer:** a, b, c

**Detailed Solution:**



A vector processor reduces the number of instructions that are executed, as a single high-level instruction can represent an entire loop; also it reduces loop overheads. A similar operation for all numbers can be performed in fewer cycles.

Thus, options (a), (b) and (c) are correct.

---

### **QUESTION 7:**

In a vector processor, suppose that the start-up time of vector multiply operation is 20 clock cycles. After start-up, the initiation rate is 5 clock cycles. The number of clock cycles required per result for a 128-element vector will be \_\_\_\_\_

**Correct Answer: 5.13 to 5.17**

**Detailed Solution:**

**Clock cycles required per result for a vector operation of length n**

$$\begin{aligned} &= \text{Total time} / \text{Vector length} = (\text{Start-up time} + (n \times \text{Initiation rate})) / n \\ &= (20 + (128 \times 5)) / 128 = 5.156 \end{aligned}$$
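The formula above as a short Python sketch; the function and parameter names are illustrative.

```python
def cycles_per_result(startup_cycles, initiation_rate, vector_length):
    """Average clock cycles per result for one vector operation."""
    total_cycles = startup_cycles + vector_length * initiation_rate
    return total_cycles / vector_length


print(cycles_per_result(20, 5, 128))  # 5.15625
```

As the vector length grows, the start-up cost is amortized over more elements and the value approaches the initiation rate of 5 cycles.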

---

### **QUESTION 8:**

Which of the following statement(s) is/are false for various types of multi-core processors?

- a. In an asymmetric multi-core system, all the cores are identical.
- b. In a symmetric multi-core system, different cores may have different functionalities.
- c. In a tightly coupled multiprocessor, all the processors have access to a common shared memory.
- d. None of these.

**Correct Answer: a, b**

**Detailed Solution:**

In an asymmetric multi-core system, different cores may have different functionalities, whereas in a symmetric multi-core system all the cores are identical. In a tightly coupled multiprocessor, there are multiple processors that have access to a common shared memory. The processors communicate among themselves through the shared memory.

Thus, statements (a) and (b) are false, and options (a) and (b) are correct.

---



NPTEL Online Certification Courses  
Indian Institute of Technology Kharagpur



---

\*\*\*\*\*END\*\*\*\*\*