

# COMPUTER SCIENCE

## Computer Organization and Architecture

### Instruction Pipelining

Lecture\_03



Vijay Agarwal sir





**TOPICS  
TO BE  
COVERED**

o1

**Pipelining Concepts**

o1

**Pipelining Hazards**

## PIPELINE CONCEPT

Execution time in Pipeline & Non pipeline.

## SPEED UP FACTOR

UNIFORM & NON Uniform Delay Pipeline.

# Pipelining Strategy

Similar to the use of  
An assembly line in a  
Manufacturing plant

To apply this concept  
To instruction  
Execution we must  
Recognize that an  
Instruction has a  
Number of stages



New inputs are  
Accepted at one end  
Before previously  
Accepted inputs  
Appear as output at  
The other end

- Pipelining is a mechanism which is used to improve the performance of the system in which task (Instruction) are executed in overlapping manner.
- Pipelining is a decomposition technique that means the problem is divided into sub problem & Assign the sub problem to the pipes then operate the pipe under the same clock.

Let us consider 4 segment pipeline used to execute 4 instructions the execution sequence as:

Segment/stages =  $[S_1 \ S_2 \ S_3 \ S_4]$

Instruction:  $[I_1 \ I_2 \ I_3 \ I_4]$



$n = 4, t_n = 4$ , Non pipeline

**Non-PIPELINE**



PIPELINE

 $k = 4$  $n = 4$ 

PIPELINE

Let us consider 4 segment pipeline used to execute 4 instruction the execution sequence as: PIPELINING

Segment/stages =  $[S_1 \ S_2 \ S_3 \ S_4]$

Instruction:  $[I_1 \ I_2 \ I_3 \ I_4]$



$$n = 4, t_n = 4, \text{ Non pipeline}$$



PIPELINE

$k = 4$

$n = 4$

In the Pipeline.

$$\text{CPT} = \perp$$

- I. **(Q.1)** How Design (Construct) the Pipeline. ?
- II. **(Q.2)** What is the Meaning of  $CPI = 1$  ?
- III. **(Q.3)** How to Set this CPI in Uniform Delay. ?
- IV. **(Q.4)** How to Set this CPI in Non Uniform Delay. ?

# PIPELINE Design



RISC : 5 Stage



'5'

- ① If we want to construct 'N' stage pipeline, then entire CPU [entire Control Unit] is divided into 'N' functional unit, [Independent functional Unit] which is independent from each other.



Independent functional Unit means, when one functional Unit perform the Task, in the same time other functional Unit performing the other Task [operation].

### ③ Instruction Pipeline

$I_1$  is in Decode Stage

&  $I_2$  is in Fetch Stage.

in the Clock Cycle No 2.



Note

Characteristic of the PIPELINE is, in Every New Cycle  
New Input must be inserted into the Pipeline.

$$CPI = 1$$

Cycle Per Instn = 1  
 $\Downarrow$   
One Clock Pulse Transition.



# In PIPELINE.

~~RISC~~  
~~Successor~~  
~~Pipeline~~  
CPI = 1

$$CPI = 1$$

Clock cycle per  
Instruction = 1.

① How to Set this CPI ?  
② Why clock is Required .

## Q) Why Clock is Required ?

Soln

- (i) Because whenever, we enable the clock, the operation are performed (Data is moved from one register to other @ etc ~~one task~~ register)
- & (ii) Provide the synchronization between the stage.

# PIPELINE Design

UNIFORM Delay

$$\text{Time} = 22 \text{nsec}$$

$$CPI = 1$$



$$F = \frac{1}{22 \text{nsec}} \text{ Hz}$$

Cycle Per Instr<sup>n</sup> = 1 Cycle

$$\text{Cycletime} = 22 \text{nsec}$$

every CPI (22nsec) will insert a pipeline



# In Uniform Delay PIPELINE

first Instn takes

$$k \times t_p$$

$$4 \times 22 + (n-1) \underline{22 \text{ nsec}}$$

$$(n-1) \text{ Instn}$$

Remaining. After every 2 cycle getting OP.

|    | 1              | 2              | 3              | 4              | 5 | 6 | 7 |
|----|----------------|----------------|----------------|----------------|---|---|---|
| WB | I <sub>1</sub> | I <sub>2</sub> | I <sub>3</sub> |                |   |   |   |
| EX |                | I <sub>1</sub> | I <sub>2</sub> | I <sub>3</sub> |   |   |   |
| DI | I <sub>1</sub> | I <sub>2</sub> | I <sub>3</sub> |                |   |   |   |
| IF | I <sub>1</sub> | I <sub>2</sub> | I <sub>3</sub> |                |   |   |   |

at 5: I<sub>2</sub>

at 6: I<sub>3</sub>

# PIPELINE Design

Non Uniform Delay.



$$t_p = \max \left( \frac{\text{Stage Delay} + \text{Bubble Delay}}{\text{Clock Period}} \right)$$

$$t_p = 30\text{ns}$$



Note

If we take Minimum means  $(20+2) = 22 \text{ nsec}$ .

$$\text{Clock} = \frac{1}{22 \text{ nsec}} \text{ Hz}$$

In 22 nsec Stage 1 work finish & Data Movement from Stage 1 to Stage 2.

But in 22 ns [this clock] Stage 2, Task Not Complete  
Stage 3, Stage 4 (in this example)  
[Still Processing]

So Proper Working functioning of the Pipeline max Stage Delay is Taken.  
So always take maximum of the (Stage + Buffer Delay)

maximum.

So  $t_P = 30\text{ nsec}$ .



$\xleftarrow[30\text{ nsec}]{}$

## Synchronization

If we take Maximum = 30 nsec.

then in S1 Task finish in 22 nsec.

but clock is set to be 30 nsec.

Every Phase output will be available at

one instruction coming out every 30 nsec: [When 1 cycle(30n) then stage 1 O/P given to S2 & so on] So synchronization.

# PIPELINE Design



# PIPELINE Design

Non Uniform Delay.

& Register Delay is Different

$$t_p = \max \left( \frac{(\text{Stage Delay} + \text{Bubble Delay})}{\text{Number of Stages}} \right)$$

$t_p = 3L$

$$\max(20+4, 25+6, 28+1, 26+4)$$

$$\max(24, 31, 29, 30)$$



## PIPELINE Performance Evaluation

Now consider the case where a  $k$ -segment pipeline with a clock cycle time  $t_p$ , is used to execute  $n$  tasks.

The first task  $T_1$  requires a time equal to  $kt_p$ , to complete its operation since there are  $k$  segments in the pipe.

The remaining  $n - 1$  tasks emerge from the pipe at the rate of one task per clock cycle and they will be completed after a time equal to  $(n - 1)t_p$ .

Therefore, to complete  $n$  tasks using a  $k$ -segment pipeline requires  $k + (n - 1)$  clock cycles.

## Space-time diagram for pipeline.

Segment:

|   | 1     | 2     | 3     | 4     | 5     | 6     | 7     | 8     | 9     | → Clock Cycles |
|---|-------|-------|-------|-------|-------|-------|-------|-------|-------|----------------|
| 1 | $T_1$ | $T_2$ | $T_3$ | $T_4$ | $T_5$ | $T_6$ |       |       |       |                |
| 2 |       | $T_1$ | $T_2$ | $T_3$ | $T_4$ | $T_5$ | $T_6$ |       |       |                |
| 3 |       |       | $T_1$ | $T_2$ | $T_3$ | $T_4$ | $T_5$ | $T_6$ |       |                |
| 4 |       |       |       | $T_1$ | $T_2$ | $T_3$ | $T_4$ | $T_5$ | $T_6$ |                |

For example, the diagram of Fig. 9-4 shows four segments and six tasks. The time required to complete all the operations is  $4 + (6 - 1) = 9$  clock cycles, as indicated in the diagram.

So, to complete  $n$  tasks using a  $k$ -segment pipeline is:

$$\text{ET pipe} = k + (n - 1) \text{ clock cycles}$$

So, to complete n tasks in Non-pipeline is:

$$\text{ET Non-pipeline} = n * tn$$

$$S = \frac{nt_n}{(k+n-1)t_p}$$

As the number of tasks increases,  $n$  becomes much larger than  $k - 1$ , and  $k + n - 1$  approaches the value of  $n$ . Under this condition, the speedup becomes

$$S = \frac{t_n}{t_p}$$

## Additional Stages

- Fetch Instruction (FI)
  - ❖ Read the next expected Instruction into a buffer.
- Decode Instruction (DI)
  - ❖ Determine the opcode and the operand specifiers.
- Calculate operands(CO)
  - ❖ Calculate the effective address of each source operand.
  - ❖ This may involve displacement, register indirect or other forms of address calculations.
- Fetch Operands(FO)
  - ❖ Fetch each operand from memory.
  - ❖ Operands in register need not be fetched.
- Executed Instruction(EI)
  - ❖ Perform the indicated operation and store the result, if any, in the specified destination operand location
- Write Operand(WO)
  - ❖ Store the result in memory

|               | Time → |    |    |    |    |    |    |    |    |    |    |    |    |    |
|---------------|--------|----|----|----|----|----|----|----|----|----|----|----|----|----|
|               | 1      | 2  | 3  | 4  | 5  | 6  | 7  | 8  | 9  | 10 | 11 | 12 | 13 | 14 |
| Instruction 1 | FI     | DI | CO | FO | EI | WO |    |    |    |    |    |    |    |    |
| Instruction 2 |        | FI | DI | CO | FO | EI | WO |    |    |    |    |    |    |    |
| Instruction 3 |        |    | FI | DI | CO | FO | EI | WO |    |    |    |    |    |    |
| Instruction 4 |        |    |    | FI | DI | CO | FO | EI | WO |    |    |    |    |    |
| Instruction 5 |        |    |    |    | FI | DI | CO | FO | EI | WO |    |    |    |    |
| Instruction 6 |        |    |    |    |    | FI | DI | CO | FO | EI | WO |    |    |    |
| Instruction 7 |        |    |    |    |    |    | FI | DI | CO | FO | EI | WO |    |    |
| Instruction 8 |        |    |    |    |    |    |    | FI | DI | CO | FO | EI | WO |    |
| Instruction 9 |        |    |    |    |    |    |    |    | FI | DI | CO | FO | EI | WO |

Timing Diagram for Instruction pipeline operation

Q. 4

Consider an instruction pipeline with four stages (S1, S2, S3 and S4) each with combinational circuit only. The pipeline registers are required between each stage and at the end of the last stage. Delays for the stages and for the pipeline registers are as given in the figure

When Merged  
Uniform Delay &  
ideal Gage  
then  $S=4$



What is the approximate speed up of the pipeline in steady state under ideal conditions when compared to the corresponding non-pipeline implementation?

[GATE-2011: 2 Marks]

In Ideal Case:  $S = \frac{t_1}{t_p} = \frac{20}{12} = 2.5$

- A 4.0
- B 2.5
- C 1.1
- D 3.0

But we Discussed In Ideal Case

Speed Up factor =  $k (\# \text{Stages}) = 4$

But Answer is 2.5 Why?

*Sol* Bcz its Non Uniform Delay Pipeline.

When Speed Up factor = 4 [if Uniform Delay & Register Delay is Not Given  
 @ No Time is given for any stage Only then (Ideal Given)]

## Uniform Delay Pipeline

$$t_p = 2 \text{ ns}$$

$$t_n = \underbrace{2+2+2+2}_{\downarrow} \Rightarrow k * t_p \Rightarrow 4 \times 2 = 8 \text{ ns}$$



$$S = \frac{t_n}{t_p}$$

$$= \frac{k * t_p}{t_p} \Rightarrow S = k$$

Uniform Delay Pipeline.

The stage delays in a 4-stage pipeline are 800, 500, 400 and 300 picoseconds. The first stage (with delay 800 picoseconds) is replaced with a functionally equivalent design involving two stages with respective delays 600 and 350 picoseconds. The throughput increase of the pipeline is \_\_\_ percent.

[GATE-2016(Set-1)-CS: 2M]

Consider a non-pipelined processor with a clock rate of 2.5 gigahertz and average cycles per instruction of four. The same processor is upgraded to a pipelined processor with five stages; but due to the internal pipeline delay, the clock speed is reduced to 2 gigahertz. Assume that there are no stalls in the pipeline. The speed up achieved in this pipelined processor is 3.2 Ave

[GATE-2015(Set-1)-CS: 2M]

Ave (3.2)

$$2.5 \text{ Hz}^2$$

Cycle time =  $\frac{1}{2.5 \times 10^9} \text{ sec} = 0.4 \text{ nsec}$

### Non Pipeline

Each Instruction takes : 4 cycle

$$\text{Cycle time} = \frac{1}{2.5 \times 10^9} \text{ sec} = 0.4 \text{ nsec}$$

$$\begin{aligned}\text{ET in Non Pipeline (t}_n\text{)} &= 4 \times 0.4 \\ &= 1.6 \text{ nsec.}\end{aligned}$$

$$2.5 \text{ Hz} \quad \text{time} = \frac{1}{2.5 \times 10^9} = 0.4 \text{ nsec}$$

### In Pipeline

1 Instn takes = 1 Cycle

CPI = 1 Cycle

$$\begin{aligned}\text{Cycle time} &= \frac{1}{2 \times 10^9} \text{ sec.} \\ &= 0.5 \text{ nsec}\end{aligned}$$

ET in Pipeline = 0.5 nsec.

$$S = \frac{ET_{NP}}{ET_p} \Rightarrow \frac{1.6}{0.5} = \textcircled{3.2} \text{ Avg}$$

Consider a 4-stage pipeline processor. We want to execute a loop:  
For( $i=1; i \leq 1000; i++$ ) { I1, I2, I3, I4} where the time taken (in ns) by instruction I1 to I4 for stages S1, S2, S3, S4 is shown below:

|    | S1 | S2 | S3 | S4 |
|----|----|----|----|----|
| I1 | 1  | 2  | 1  | 2  |
| I2 | 2  | 1  | 2  | 1  |
| I3 | 1  | 1  | 2  | 1  |
| I4 | 2  | 1  | 2  | 1  |

The Output of I1 for i=2 will be available after ?

[GATE-2004-CS: 2M]

A 11ns

Ans (C).

B 12ns

C 13ns

D 28ns

$$\begin{array}{c|cccc}
 & s_1 & s_2 & s_3 & s_4 \\
 \hline
 I_1 & 1 & 2 & 1 & 2 \\
 I_2 & 2 & 1 & 2 & 1 \\
 I_3 & 1 & 1 & 2 & 1 \\
 I_4 & 2 & 1 & 2 & 1
 \end{array} \quad i=1 \quad [I_1 \ I_2 \ I_3 \ I_4] \ .$$

|       |       |       |       |       |       |       |       |       |       |       |       |       |
|-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|
| $S_4$ |       |       |       | $I_1$ | $I_1$ | $I_2$ |       | $I_3$ |       | $I_4$ | $I_1$ | $I_1$ |
| $S_3$ |       |       |       | $I_1$ | $I_2$ | $I_2$ | $I_3$ | $I_3$ | $I_4$ | $I_4$ | $I_1$ |       |
| $S_2$ |       | $I_1$ | $I_1$ | $I_2$ | $I_3$ |       | $I_4$ | $I_4$ | $I_4$ |       |       |       |
| $S_1$ | $I_1$ | $I_2$ | $I_2$ | $I_3$ | $I_4$ | $I_4$ | $I_4$ |       |       |       |       |       |
|       | 1     | 2     | 3     | 4     | 5     | 6     | 7     | 8     | 9     | 10    | 11    | 12    |

**MCQ** Q. 12

Consider a 4-stage pipeline processor. The number of cycles needed by the four instructions I1, I2, I3, I4 in stages S1, S2, S3, S4 is shown below:

|    | S1 | S2 | S3 | S4 |
|----|----|----|----|----|
| I1 | 2  | 1  | 1  | 1  |
| I2 | 1  | 3  | 2  | 2  |
| I3 | 2  | 1  | 1  | 3  |
| I4 | 1  | 2  | 2  | 2  |

What is the number of cycles needed to execute the following loop?

for (i = 1 to 2) {I1; I2; I3; I4;}

[GATE-2009-CS: 2M]

A 16

Ans (B)

B 23

C 28

D 30

$$\begin{array}{c|cccc} I_1 & 2 & 1 & 1 & 1 \\ I_2 & 1 & 3 & 2 & 2 \\ \hline I_3 & 2 & 1 & 1 & 3 \\ I_4 & 1 & 2 & 2 & 2 \\ \hline S_1 & S_2 & S_3 & S_4 \end{array}$$

|       |       |       |       |       |       |       |       |       |       |       |       |       |       |       |       |       |       |       |       |    |    |    |    |    |    |
|-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|----|----|----|----|----|----|
| $S_1$ | $I_1$ | $I_2$ | $I_2$ | $I_3$ | $I_3$ | $I_3$ | $I_3$ | $I_4$ | $I_4$ | $I_1$ | $I_1$ | $I_2$ | $I_2$ | $I_2$ | $I_3$ | $I_3$ | $I_3$ | $I_4$ | $I_4$ |    |    |    |    |    |    |
| $S_2$ | $I_1$ | $I_2$ | $I_2$ | $I_3$ | $I_4$ | $I_4$ | $I_1$ | $I_1$ | $I_2$ | $I_2$ | $I_3$ | $I_4$ | $I_4$ | $I_1$ | $I_2$ | $I_3$ | $I_3$ | $I_4$ | $I_4$ |    |    |    |    |    |    |
| $S_3$ | $I_1$ | $I_2$ | $I_2$ | $I_3$ | $I_3$ | $I_4$ | $I_4$ | $I_1$ | $I_2$ | $I_2$ | $I_2$ | $I_3$ | $I_4$ | $I_4$ | $I_1$ | $I_2$ | $I_3$ | $I_3$ | $I_4$ |    |    |    |    |    |    |
| $S_4$ | $I_1$ | $I_2$ | $I_3$ | $I_3$ | $I_4$ | $I_1$ | $I_1$ | $I_2$ | $I_3$ | $I_2$ | $I_3$ | $I_4$ | $I_4$ | $I_1$ | $I_2$ | $I_3$ | $I_3$ | $I_4$ | $I_4$ |    |    |    |    |    |    |
|       | 1     | 2     | 3     | 4     | 5     | 7     | 8     | 9     | 10    | 11    | 12    | 13    | 14    | 15    | 16    | 17    | 18    | 19    | 20    | 21 | 22 | 23 | 24 | 25 | 26 |

Consider a 3 GHz (gigahertz) processor with a three-stage pipeline and stage latencies  $\tau_1$ ,  $\tau_2$ , and  $\tau_3$  such that  $\tau_1 = 3\tau_2/4 = 2\tau_3$ . If the longest pipeline stage is split into two pipeline stages of equal latency, the new frequency is \_\_\_\_\_ GHz, ignoring delays in the pipeline registers.

Ans(4GHz)

[GATE-2016(Set-2)-CS: 2M]

$$\tau_1 = \frac{3\tau_2}{4} = 2\tau_3$$

$$\tau_1 : \tau_2 : \tau_3 = 6 : 8 : 3$$

$$\left(\tau_1 = \frac{3\tau_2}{4}\right) \text{ & } \left(\frac{3\tau_2}{4} = 2\tau_3\right)$$

$$\frac{\tau_1}{\tau_2} = \frac{3}{4} \Rightarrow \frac{6}{8}$$

$$\frac{\tau_2}{\tau_3} = \frac{8}{3}$$

$$T_1 : T_2 : T_3 = 6 : 8 : 3$$

Assume time  $\pi$

$$T_1: 6\pi, T_2: 8\pi, T_3 = 3\pi.$$

$$tp: \max(6\pi, 8\pi, 3\pi)$$

$$tp = 8\pi$$

Frequency =  $\frac{1}{\text{Time}}$

$$3GHz = \frac{1}{8\pi}$$

$$k = 24GHz \quad - eq①$$

### New Design

Largest Stage Split into 2 equal Stage Delay



$$tp_{\text{new}} = \max(6\pi, 4\pi, 4\pi, 3\pi)$$

$$tp_{\text{new}} = 6\pi$$

$$\text{Frequency}_{\text{new}} = \frac{1}{tp_{\text{new}}}$$

$$f_{\text{new}} = \frac{1}{6\pi} = \frac{1}{6} \times \frac{1}{\pi}$$

$$f_{\text{new}} = 4GHz \quad \underline{\text{Ans}} \quad \frac{1}{6} \times 24GHz$$

Consider two processors  $P_1$  and  $P_2$  executing the same instructions set. Assume that under identical conditions, for the same input, a program running on  $P_2$  takes 25% less time but incurs 20% more CPI (clock cycles per instruction) as compared to the program running on  $P_1$ . If the clock frequency of  $P_1$  is 1GHz, then the clock frequency of  $P_2$  (in GHz) is \_\_\_\_\_.

[GATE-2014(Set-1)-CS: 2M]

$$ET_{P_1} = \#Inst^n \times \text{Cycle time} \times CPI$$

$\#Inst^n$  Not given in  $P_1$  &  $P_2$  ( $\&$  Same Inst $n$ )

$$ET_{P_1} = CPI \times \text{Cycle time}$$

$$ET_{P_2} = 1.2 CPI \times \text{Cycle time}_2$$

$$ET_{P_2} = 0.75 ET_{P_1}$$

Consider the following processor design characteristics:

- I. Register-to-register arithmetic operations only.
- II. Fixed-length instruction format.
- III. Hardwired control unit.

Which of the characteristics above are used in the design of a RISC processor?

[GATE-2018-CS: 1M]

A I and II only

B II and III only

C I and III only

D I, II and III

Ans(D)

Consider a machine with 40 MHz processor which has run a benchmark program. The executed program consists of 100,000(latch) instruction executions, with the following instruction mix and clock cycle count. What will be the effective CPI, MIPS rate, and execution time.

| Instruction Type   | Instruction Count | Cycles/ Instructions |   |
|--------------------|-------------------|----------------------|---|
| Integer arithmetic | 45000             | 45.1.                | 1 |
| Data Transfer      | 32000             | 32.1.                | 2 |
| Floating point     | 15000             | 15.1.                | 2 |
| Control transfer   | 8000              | 8.1.                 | 2 |

$$ET = \sum [IC \times CPI] \times \text{Cycle time}$$

$$\cdot45 \times 2 + .32 \times 2 + .15 \times 2 + 0.08 \times 2 \quad T \propto \frac{1}{F}$$

$$= 0.45 + 0.64 + 0.30 + 0.16$$

$\text{Avg CPI} = 1.55 \text{ cycle}$

$\underline{\text{Avg}}$

$\text{Cycle time} = \frac{1}{40 \times 10^6}$

$$\text{Avg Inst ET} = 1.55 \times \frac{1}{40} \times 10^{-6}$$

$\text{Avg Inst ET} = 0.03875 \times 10^{-6}$

(ii)  $\text{Instn ET} = 0.03875 \times 10^{-6} \text{ sec}$

$$\text{In 1 sec} = \frac{1}{0.03875} \times 10^6$$

$$\Rightarrow 25.8 \text{ MIPS.} \quad \underline{\text{Ans}}$$

$10^5$  Instn<sup>n</sup> 1000000

Total Prog ET =  $1.55 \times \frac{1}{40 \times 10^6} \times 10^8$

$$\frac{1.55}{40} \times 10^{-1}$$

$$\Rightarrow \frac{1.55}{4} \times 10^{-2}$$

$$= 0.387 \times 10^{-2}$$

$$= 3.87 \times 10^{-3}$$

Avg

3.87 msec

Prog ET

- A CPI:3.55; MIPS: 30; Execution time:1.87 ms
- B CPI:1.55; MIPS: 25.8; Execution time:3.87 ms
- C CPI:5.60; MIPS: 45.8; Execution time:2.87 ms
- D CPI:2.55; MIPS: 35.8; Execution time:4.87 ms

Which of the following is not a form of main memory ?

- A Instruction cache
- C Instruction opcode
- B Instruction register
- D Translation look-aside buffer

In a 10-bit computer instruction format, the size of address field is 3-bits. The computer uses expanding OP code technique and has 4 two-address instructions and 16 one-address instructions. The number of zero address instructions it can support is

- A 256
- C 640



- B 356
- D 756

Diagram of a 10-bit instruction format:

|    |    |
|----|----|
| OP | AF |
|----|----|

Annotations: 10bit → above the OP field, 4bit circled below the AF field.

$$\text{Total operation} = 2^4 = 16$$

Given = 4  
Free opCode = 16 - 4 = 12

$$80 \times 2^3 \Rightarrow 80 \times 8 = 640$$

Diagram of a 10-bit instruction format:

|    |                 |                 |
|----|-----------------|-----------------|
| OP | AF <sub>1</sub> | AF <sub>2</sub> |
|----|-----------------|-----------------|

Annotations: 10bit → above the OP field, 4bit circled below the AF fields.

$$\text{Total operation} = 2^4 = 16$$

Given = 4

## A micro programmed control unit

- A is faster than a hardwired unit
- B Facilitates easy implementation of a new instruction
- C is useful when small programs are to be run
- D All of the above

**THANK  
YOU!**

