

Department of Electrical Engineering, IIT Delhi  
 ELL782 Computer Architecture: Minor I Examination  
 (Closed book/Closed Notes) Time: 1 hour Maximum Marks: 28

"Thou shalt not covet thy neighbour's answers"

1. Messy Mesh, Meshy Mess! Consider the problem of a single-node broadcast on a wrap-around mesh with  $p$  processors, with processors numbered from  $[0,0]$  to  $[\sqrt{p}-1, \sqrt{p}-1]$ . The following piece of pseudo-code is written for processor  $[i,j]$ . On your answer sheet, write the contents of the blanks in the pseudo-code. below  
 $(16 \times 1 \text{ marks})$

```

if ( $i == 0$ )
    /* — row phase — */
    if ( $j == 0$ ) { send(); send(); }
    else if ( $j == \sqrt{p}/2$ ) { wait(); }  $\xrightarrow{(P/2-1)}$ 
    else if ( $j == \sqrt{p}/2 + 1$ ) { wait(); }  $\xrightarrow{P/2+1}$ 
    else if ( $j > 0$  and  $j < \sqrt{p}/2$ ) { wait(); send(); }
    else { wait(); send(); } /*  $j > \sqrt{p}/2 + 1$  and  $j \leq \sqrt{p} - 1$  */
} /* — row phase — */
/* — column phase — */
if ( $i == 0$ ) { send(); send(); }
else if ( $i == \sqrt{p}/2$ ) { wait(); }
else if ( $i == \sqrt{p}/2 + 1$ ) { wait(); }
else if ( $i > 0$  and  $i < \sqrt{p}/2$ ) { wait(); send(); }
else { wait(); send(); } /*  $i > \sqrt{p}/2 + 1$  and  $i \leq \sqrt{p} - 1$  */
    
```

2. ring, ring!  $(8 \text{ marks})$   
 The optimal time to add a set of numbers on a  $p$ -processor linear ring is  $(t_s + mt_w)p/2 + t_a \log p$ , where  $t_s$  is the setup time,  $m$  is the number of words in a packet and  $t_w$  the per-word transfer time, and  $t_a$  is assumed to be the constant time to perform an addition at a processor. Let the answer give you a hint as to how it is done: please show the 8 steps needed on a 16-node linear ring, with the communication operations and the additions clearly identified in each of the 8 steps, as required.

3. Short Takes: Brevity is the Sole Soul of Wit!

- (a) Given an 8-processor hypercube, show how you can embed a linear ring onto this architecture  
 (b) Compare the number of links on a wrap-around mesh, and a hypercube, for the same number of processors  $p$ . Assume  $p$  to be a perfect square, and a power of 2.  $(2 + 2 \text{ marks})$



Sumantra Dutta Roy, EE, IITD

#link(HYPERCUBE OF "n" Dimension)  
 $= n \cdot 2^{n-1} \quad \forall n \geq 1$

|                   |                                                     |
|-------------------|-----------------------------------------------------|
| $n$               | $P$                                                 |
| 1 $\rightarrow$ 1 | $\xrightarrow{2 \rightarrow 4}$                     |
| 2 $\rightarrow$ 4 | $\therefore P = n \cdot 2^n \Rightarrow n = \log p$ |
| 3 $\rightarrow$ 8 |                                                     |

sumantra@ee.iitd.ac.in

"Thou shalt not covet thy neighbour's answers"

1. Messy Mesh, Meshy Mess! Consider the problem of a single-node broadcast on a wrap-around mesh with  $p$  processors, with processors numbered from  $[0,0]$  to  $[\sqrt{p}-1, \sqrt{p}-1]$ . The following piece of pseudo-code is written for processor  $[i,j]$ . On your answer sheet, write the contents of the blanks in the pseudo-code, below (16 × 1 marks)

```

if ( $i == 0$ )
    /* — row phase — */
    if ( $j == 0$ ) { send(—); }
    else if ( $j == \sqrt{p}/2$ ) { wait(—); }
    else if ( $j == \sqrt{p}/2 + 1$ ) { wait(—); }
    else if ( $j > 0$  and  $j < \sqrt{p}/2$ ) { wait(—); send(—); }
    else { wait(—); send(—); /*  $j > \sqrt{p}/2 + 1$  and  $j \leq \sqrt{p} - 1$  */ }

    /* — column phase — */
    if ( $i == 0$ ) { send(—), send(—); }
    else if ( $i == \sqrt{p}/2$ ) { wait(—); }
    else if ( $i == \sqrt{p}/2 + 1$ ) { wait(—); }
    else if ( $i > 0$  and  $i < \sqrt{p}/2$ ) { wait(—); send(—); }
    else { wait(—); send(—); /*  $i > \sqrt{p}/2 + 1$  and  $i \leq \sqrt{p} - 1$  */ }

     $\downarrow (j+1) \text{ mod } (\sqrt{p}+1)$   $\downarrow (i+1) \text{ mod } (\sqrt{p}+1)$   $\downarrow (i-1) \text{ mod } (\sqrt{p}+1)$ 

```

2. ring, ring! (8 marks)

The optimal time to add a set of numbers on a  $p$ -processor linear ring is  $(t_s + mt_w)p/2 + t_a \log p$ , where  $t_s$  is the setup time,  $m$  is the number of words in a packet and  $t_w$  the per-word transfer time, and  $t_a$  is assumed to be the constant time to perform an addition at a processor. Let the answer give you a hint as to how it is done: please show the 8 steps needed on a 16-node linear ring, with the communication operations and the additions

correspondingly identified in each of the 8 steps, as required.

you need to write the second update explicitly

(3)(a) (Linear Ring Node No)

corresponding Hypercube node no.

3. Short Takes: Brevity is the Sole Soul of Wit!

good

(processor # in Hypercube)

(a) Given an 8-processor hypercube, show how you can embed a linear ring onto this architecture.

(b) Compare the number of links on a wrap-around mesh, and a hypercube, for the same number of processors  $p$ . Assume  $p$  to be a perfect square, and a power of 2. (2 + 2 marks)

|   | (Linear Ring Node No) |
|---|-----------------------|
| 0 | 000 → 0               |
| 1 | 001 → 1               |
| 2 | 011 → 3               |
| 3 | 010 → 2               |
| 4 | 110 → 6               |
| 5 | 111 → 7               |
| 6 | 101 → 5               |
| 7 | 100 → 4               |

processors# in Linear ring.



"Above table is the mapping"

Sumantra Dutta Roy, EE, IITD

sumantra@ee.iitd.ac.in

comparison :-

| $P =$             | 4 | 8  | 16 | 32 | ... |
|-------------------|---|----|----|----|-----|
| # links(Mesh) =   | 4 | 12 | 32 | 80 | ... |
| # links(H-cube) = | 8 | 16 | 32 | 64 | ... |

$\therefore \# \text{links in H-cube} > \# \text{links in Mesh}$   $\forall P \geq 32$

(Exponential increase)  $\rightarrow$  (Linear increase)

The optimal time to add a set of no's on a "p" processor Linear ring

$$T(P) = \underbrace{(ts + mtw)}_{\text{one-2-All sharing time in Linear Ring.}} \frac{P}{2} + \underbrace{ta \log P}_{\substack{\text{addition time.} \\ \text{E} \\ (\log P \text{ times addition})}}$$



For 16 node Linear Ring.

$$T(16) = (ts + mtw) \cdot 16/2 + ta \frac{(\log 16)}{2}$$

→ 4. (so some outer loop that does the task of addition runs  $\log_2 16$  times).

1. It takes  $16/2$  steps for sharing, they add their part and forward as per one-2-all sharing, which takes 8 steps.
2. Additions happening in parallel for different bit positions.



Excellent Handwriting!

22

28

FOR

Computer Architecture Minor II-1

Department of Electrical Engineering, IIT Delhi  
ELL782 Computer Architecture: Minor I Examination  
(Closed book/Closed Notes) Time: 1 hour Maximum Marks: 28

"Thou shalt not covet thy neighbour's answers"

Name: Srijeeet chatterjee

Roll No. 2017-EET2295

1. RISCy Pipeline Question

100 LOAD A, M1  
101 ADD A, 1  
102 JUMP 104  
103 ADD B, A  
104 LOAD A, M2  
105 ADD A, 1  
106 STORE M3, A  
107 ADD A, 1  
108 JUMP 110  
109 ADD B, A  
110 STORE M4, A

Suppose you were to generate the best possible optimized code corresponding to the given segment of input code, for a pipelined RISC machine. Explain each point in the code, where you had to make some modification, along with a space-time diagram of the operations in this new piece of code. Clearly mention all line numbers, and write the complete new program, including the statements that would not be executed if execution were to start at line number 100. Assume that LOAD and STORE instructions consist of three machine cycles - (i) I: Instruction fetch and decode, (ii) E: Execution, and (iii) D: Memory Operation. All other instructions consist of the first two phases, alone. Here, Mi denote memory locations, and A and B are registers. The first statement loads A with the contents of memory location M1; the statement ADD A, 1 performs  $A \leftarrow A + 1$ ; etc.

(6 marks)

\* Optimization:- Will use delayed branching here. That reduces flushing and saves us from additional circuitry.

$$Q1 \rightarrow 2 + 2 + 2 = 4/6$$

Here by the time JUMP will make the PC to update address to 104, ADD A, 1 execution will take effect.

100 LOAD A, M1  
101 JUMP 104  
102 ADD A, 1  
103 ADD B, A  
104 LOAD A, M2  
105 ADD A, 1  
106 STORE M3, A  
107 JUMP 110  
108 ADD A, 1  
109 ADD B, A  
110 STORE M4, A

never gets executed

Never gets executed

NOP for data dependence avoidance

can be exch.

|      | t0 | t1 | t2 | t3 |
|------|----|----|----|----|
| I100 | I  | D  | E  |    |
| I101 |    | I  | E  |    |
| I102 |    | I  | E  |    |
| I104 |    |    | I  |    |

effectively the execution of  $A \leftarrow A + 1$  is done by the time PC is changed and loaded with new I.

So, Final code with "2" delayed branch change.

100 LOAD A, M1  
101 JUMP 104  
102 ADD A, 1  
103 ADD B, A  
104 LOAD A, M2  
105 ADD A, 1  
106 STORE M3, A  
107 ADD A, 1  
108 JUMP 110  
109 ADD B, A  
110 STORE M4, A.

Sumantra Dutta Roy, EE, IITD

sumantra@ee.iitd.ac.in

NOTE!- We can remove the never executed pieces of instructions from the (instruction) program set to optimize it spatially; but keeping them shouldn't be a much space complex code.

$$Q2(a) \frac{2}{2} + (b) \frac{2}{2} + (c) \frac{2}{2} + (d) \frac{1}{3} + (e) \frac{6}{8} = \frac{11}{15}$$

2. Brevity is the Sole Soul of Wit: Short Answers Only!

- (a) What are the advantages and disadvantages of having a small number of stages in a pipeline, vis-a-vis a large number?

Adv: (1) Loss due to flushing reduces when hazards occurs.  
(2) buffer delays b/w diff stages of pipeline will be less.

Disad: (i) degree of parallelism is less for small no. of stages.  
as. speedup = # stages (when efficiency=1).

- (b) What will be the average latency for a dynamic pipeline which has a latency value of 0, as a function of the number of stages of the pipeline and the number of functions evaluated in it?

$$\text{Avg. Latency} = \frac{\sum \text{latencies}}{\text{# stages}}$$

$$\text{Dynamic pipeline} = \frac{\sum \text{latencies across all the diff functions}}{\text{# functions}}$$

- (c) What is strength reduction?  $\Rightarrow T_{\min} = (\sum \text{Avg. (Latency)}, \text{# stages})_{\text{min}}$   
Replace costly operations with the less costly ones. e.g.  
Multiplication can be reduced with leftshift.

i.e.  $B \leftarrow A * 2$   
 $D \leftarrow A \ll 1$

- (d) Consider loop unrolling for the following C code

for(i = 10; i>=0; i--) a[i] += 5.0;

Suppose that the compiler has to re-write loop unrolled C code for this, with loops unrolled in units of 3 iterations. Write this.

not assembly!

3x3 = 9 times + 9 more time.  
(as unrolling = 3 is not a multiple of 2).  
 $\left\{ \begin{array}{l} \text{LD F0, O(RI)} \\ \text{ADD D F4, F0, F2} \\ \text{DADDUI R1, R1, #-8} \\ \text{SD F4, O(RI)} \\ \text{LD F6, O(RI)} \\ \text{ADD D F8, F6, F2} \\ \text{DADDUI R1, R1, #-8} \\ \text{SD F8, O(#8RI)} \\ \text{LD F10, O(RI)} \\ \text{ADD D F12, F10, F2} \\ \text{DADDUI R1, R1, #-16} \\ \text{SD F12, O(#16RI)} \\ \text{BNE R1, R2, LOOP} \end{array} \right\} \times 3 + \left\{ \begin{array}{l} \text{LD F0, O(RI)} \\ \text{ADD D F4, F0, F2} \\ \text{DADDUI R1, R1, #-8} \\ \text{SD F4, O(RI)} \end{array} \right\} \times 2$

- (e) Differentiate between a Data Dependence, an Anti-dependence and and Output Dependence. (2 + 2 + 2 + 3 + 6 marks)

ADD.D (R4), R2, R1  
\$MULT R5, R4, RO

The diagram illustrates three types of data hazards:

- True Data dependence** (Read after write)
- Anti-dependence** (Write after read)
- Output dependence** (Write after write)

Annotations above the diagram indicate that these hazards fall under the category of [Data Hazards].

### 3. Crass Pro-Procrastination

Consider the following Reservation Table:

## Resolve Register renaming

- \* No need to re-arrange codes here as it will lead to data dependence then.
- \* No actual data dependence.

|         | 0 | 1 | 2 | 3 | 4 | 5 |
|---------|---|---|---|---|---|---|
| Stage 1 | x | . | x | . |   | x |
| Stage 2 |   | x | x |   | x |   |
| Stage 3 | x |   | x | x |   |   |

Find the MAL lower bound (from the MAL Lemma (Lemma 3-1)), the MAL upper bound for a greedy simple cycle (from Lemma 3-3), the actual MAL (from the Modified State Diagram), and the/an optimal cycle. Now, consider the following situation. Suppose it is possible to change the reservation table through the insertion of delays, in such a manner that the overall functionality of the Reservation Table is not affected (*i.e.*, it still computes the same function):

~~(b) optimal cycle = 4  
from the  
MSD Drawn~~

|         | 00 | 01 | 02 | 03 | 04 | 05 | 06 | 07 | 08 | 09 | 10 |
|---------|----|----|----|----|----|----|----|----|----|----|----|
| Stage 1 | x  | .  | x  | *  |    | .  |    | *  | *  | x  | x  |
| Stage 2 |    | x  | .  | x  | *  | x  | *  |    | x  | .  | *  |
| Stage 3 |    |    | x  | *  | x  | *  |    | *  | *  | *  | *  |

Find the MAL bound (from the MAL Lemma (Lemma 3-1)), the MAL upper bound for a greedy simple cycle (from Lemma 3-3), the actual MAL (from the Modified State Diagram), and the/an optimal cycle. Do you find anything surprising? ((1 + 1) + (1 + 1 + 3)) marks

- I)  $(MAL)_{lower} = 3$
- II)  $MAL_{upper} = 5$
- III) Optimal  
 $MAL = 3$   
~~(surprisingly even after introd. delays)~~



Rough

$$\begin{array}{r} 10101000101 \\ 01000101000 \\ \hline 11101101101 \end{array}$$

$$\begin{array}{r} 10101000101 \\ 01010001010 \\ \hline 11110001111 \end{array}$$

$$\begin{array}{r} 000101 \\ 10101000101 \\ \hline 10111100101 \end{array}$$

$$\begin{array}{r} 00101 \\ 10101000101 \\ \hline 10101000101 \end{array}$$

$$\begin{array}{r} 01010000000 \\ 10101000101 \\ \hline 11111000101 \end{array}$$

$$\begin{array}{r} 10100000000 \\ 10101000101 \\ \hline 10101000101 \end{array}$$

01

Observation

Even in the new (delayed) MSD

$$3 < \text{Max Latency} < 5$$

and, optimal cycle remains same.



\* No need to compute further:

As we could find a cycle with  
ML = 3. (which is the minimum)  
possible

Observation: we could get Min. possible  
latency even after introducing  
delays in the pipeline.

$Q_3 \rightarrow 7/7$  good

- 1) Write the floating point number representation (8 bit) of -19.9. [2]
- 2) The output of the combinational circuit will be ? [2]
  - $A+B+C$ , b)  $C(A+B)$ , c)  $A(B+C)$ , d)  $B(C+A)$



- 3) In the figure shown below, the output  $Y$  is required to be  $Y=AB+\overline{C}D$ . The gates  $G_1$  and  $G_2$  must be, respectively.... [2]
  - NOR, OR
  - OR, NAND
  - NAND, OR
  - AND, NAND



- 4) Write the instructions for exchanging the contents of memory locations 2005H and 2008H using **indirect addressing mode**. [3]

- 5) Write instructions for adding two 8 bit numbers, stored in memory location 2000H and 2001H and place the result in memory location 2002H. [3]

- 6) Consider three different processors P1, P2 and P3 executing the same instruction set. P1 has a 6Ghz clock rate and CPI of 3.5, P2 has a 2.5 Ghz clock rate and a CPI of 1.0, P3 has a 4.0Ghz clock rate and has a CPI of 2.2.

- Which processor has the highest performance expressed in instructions per second? [2]
- If the processor each execute a program in 20 seconds, find the number of cycles and the number of instructions. [3]

7) Signal A and B are represented in the figures (1,2,3) given below; What would be the XOR and XNOR outcome in each case? [3]



Figure:1



Figure:2



Figure:3

Department of Electrical Engineering, HT Delhi

**ELL782 Computer Architecture: Minor I Examination**

(Closed book/Closed Notes) Time: 1 hour Maximum Marks: 28

"Thou shalt not covet thy neighbour's answers"

1. **Accumulating Knowledge!** Consider the problem of single-node accumulation on a hypercube. There are  $p$  processors, and the following piece of pseudo-code is written for processor  $k$ . Please note the following restriction: in each of the two blanks where you have to fill a bitwise logical operation, you are allowed to use at most two of { AND, OR, NOT }, and not any other Boolean operator. On your answer sheet, write the contents of the blanks in the pseudo-code, below.
- $(7 \times 2 \text{ marks})$

```
mask = ____;  
for (phase = ____; phase >= 1; phase--)  
{  
    phase_max = 2^phase;  
    if (0 <= k <= ____) /* insert a function of phase_max */  
    {  
        sender = k ____ mask; /* insert bitwise logical operation(s) */  
        wait(&data, sender);  
        accumulate(data);  
    }  
    else if (____ <= k <= ____ ) /* insert functions of phase_max */  
    {  
        recipient = k ____ mask; /* insert bitwise logical operation(s) */  
        send(data, recipient);  
    }  
    mask = mask >> 1;  
}
```

2. Personalised treatment: No getting ‘hyper’ on this! Consider the problem of one-to-all-personalised broadcast (single node scatter) on a hypercube. Assume that node 0 has data for each node  $k$ ,  $1 \leq k \leq (p - 1)$ . Assume store-and-forward routing. Assume that in one time interval, a pair of nodes can only have an information transfer in one direction, and that all operations are blocking in nature (they wait for completion of the operation).
- (a) Assume all processors equally powerful, and all links to have the same bandwidth. Show the set of all operations that will take place at every time interval, for a hypercube with 16 nodes. An arrow represents flow of information between a pair of nodes. For instance, use the following notation to send a packet consisting of personalised messages for nodes 3, 6 and 7 from node  $\textcircled{a}$  to node  $\textcircled{c}$ , in a suitable *phase*:  $\textcircled{a} \xrightarrow[\textit{phase}]{[3,6,7]} \textcircled{c}$ . Consider the following restriction: the communication phases start with the LSB, and go on in steps, all the way to the MSB. Assume that initially, node 0000 starts with a packet  $[0, 1, 2 \dots 15]$ , and that node  $k$  does not need to send  $k$  to itself. Each node will pick and choose from what it receives/has, and send a suitably modified packet onward to a neighbouring node, in a *phase*. (4 marks)
- (b) Now, write suitable pseudo-code for a non-ideal case, using the notation in the previous question. At the start of each communication operation, a sender has a packet called `remaining`. From this, it creates a packet `data` suitably, to send it to a recipient, and updates its copy of `remaining`, in case it has to be a sender, in the next phase, as well. A variable `packet_size` holds the value of the size of the packet, for each transmission. Have suitable initialisations for `remaining`, `mask` and `packet_size`. (10 marks)

Department of Electrical Engineering, IIT Delhi  
**ELL782 Computer Architecture: Minor II Examination**  
(Closed book/Closed Notes) Time: 1 hour Maximum Marks: 28

"Thou shalt not covet thy neighbour's answers"

**1. RISCy Pipeline Question** Consider the given C code, and its assembly language equivalent (as done in class).  $F_i$  and  $R_i$  are registers, and  $\text{offset}(R_i)$  indicates an address which is located at an *offset* from the value in register  $R_i$ . Assume that in case of a data dependence, the following three rules hold good: a floating point ALU operation following a load incurs 1 clock cycle stall; a store following a floating point ALU operation incurs a stall of 2; and a branch following a DADDUI incurs a stall of 1. No other group of consecutive instructions causes any stall. Assume that all other cases of instructions take one cycle, including a NOP instruction.

(1+2+4+6+(2+1)+2+2+2 marks)  
 $\begin{matrix} a & b & c & d & e & f & g & h \end{matrix}$

- (a) Identify the anti-dependence in the assembly code
- (b) For the original assembly code, use NOP statements to eliminate stalls between two successive statements in the loop body.
- (c) For this and the rest of the question, now assume that you are not allowed to use NOP statements. Use an *optimised delayed branch* to reduce stalls. You are allowed to use scheduling.
- (d) For this and the rest of the question, now assume that you are not allowed to use an *optimised delayed branch*. Use *both* loop unrolling (with a factor of 4) and scheduling to get optimised assembly code.
- (e) What is the specific role of using only loop unrolling (without scheduling), and what is the trade-off?
- (f) What is the specific role of using only scheduling (without loop unrolling)?

loop: L.D F0, 0(R1) ] 1  
ADD.D F4, F0, F2 ] 2  
S.D F4, 0(R1)  
DADDUI R1, R1, #-8 ] 1  
BNE R1, R2, loop

- (g) Suppose the C loop had `for(i=N;i>=0;i--)`. Can we have loop unrolling with the date type of N not specified? Explain.
- (h) When can we have loop unrolling (and when can we not), if now we have a statement like `int N` in the program? Explain.

## 2. Brevity is the Sole Soul of Wit: Short-Answer Questions

- (a) If you see the following statement in a RISC pipelined code, why should it shock you: `ADD.D F1, 0(R2), 16(R1)`?
- (b) Consider the following reservation table in a static pipeline.

|         | 0 | 1 | 2 | 3 | 4 |
|---------|---|---|---|---|---|
| Stage 1 |   | x |   | x |   |
| Stage 2 | x |   | x |   | x |
| Stage 3 |   | x |   | x |   |

What is the MAL - the Minimum Attainable (average) Latency?  
What result(s) did you use to arrive at the above answer?

(2+4 marks)

Department of Electrical Engineering, IIT Delhi  
**ELL305 Computer Architecture: Minor II Examination**  
(Closed book/Closed Notes) Time: 1 hour Maximum Marks: 25

All questions to be answered on the question paper itself, which  
is to be returned at the end of the examination

"Thou shalt not covet thy neighbour's answers"

Name:

Roll No.

1. General Memory Hierarchy-related Questions, as covered in the first class, 23 Aug'17: *Morning shows the day*

- (a) Can a computer have only RAM and not ROM? Explain briefly.

- (b) How does any self-respecting computer manage to start with address 0 in its program counter? The program counter is a register, and any flip flop when powered on, can be in any state, 0 or 1.

- (c) Why do mathematical operations have to be done in registers: why can they not be done in the memory, or in the disk?

- (d) A computer's control unit sees a byte/word in memory. How does it know if it is an instruction, or if it is data? (2+1+2+2 marks)

**2. Organised Crime, Nothing On-line, only Cache Transactions**

- (a) In the cache organisations you have studied about, which one do you expect to have the largest number of misses, and why?
- (b) Consider a fully associative cache which numerous cache blocks. Why may you still see cache misses?
- (c) Why may a computer system with a bad cache be worse than one without any cache? Explain.
- (d) In the split cache-versus-unified cache numerical done in class, what was the significance of the computer being a RISC machine?
- (e) How many different types of bits can one have, associated with a cache block? List them. (*2+1+2+1+3 marks*)

3. The Write Invalidate Protocol: as mentioned in the beginning, multicore write coherence is not right. (via any snooping protocol). It will result in a 'marks invalidate' operation.

(a) Why can a processor's individual cache never receive a Read Hit/Write Hit signal from the bus?

(b) What results in an invalidate signal being put on the bus?

(c) Enumerate the case(s) in the Write Invalidate protocol when a processor's cache receives a block from another processor's cache, and not the common cache. (1+1+2 marks)

4. The Association between Caches and Virtual Memory Both fully associative caches and Virtual memory allow a block to be placed anywhere. Consider a 32-bit address bus, 16MB block size. Calculate the size of the mapping required, if the organisation were cache-like, and if it implemented paging. Why do you think virtual memory is not implemented like a fully associative cache? (2+2+1 marks)

consider Byte add.  
 $\# \text{lines} = \frac{16 \text{ MB}}{2^4} = 2^{24}$   
 Main Memo: 512 MB,  
 $= 2^9 \text{ bytes}$   
 $AB = 32 \text{ bit}$



$$\begin{aligned} \text{Size of mapping} &= 2^{24} \times 8 \text{ bits} \\ &= 2^{24} \times 2^3 = 2^{27} \\ &= 128 \text{ MB} \end{aligned}$$

Department of Electrical Engineering, IIT Delhi  
**EEL601 Computer Architecture: Minor II Examination**  
(Closed book/Closed Notes) Time: 1 hour Maximum Marks: 28

"Thou shalt not covet thy neighbour's answers"

- ✓ **Hazardous Question:** Consider a pipelined computer  $P_s$  with structural hazards and another  $P_n$ , without structural hazards.  $P_s$ 's clock is  $\alpha$  times faster than  $P_n$ 's ( $\alpha > 1$ ). Assume data references cause these structural hazards. Data references constitute a fraction  $\beta$  ( $0 \leq \beta \leq 1$ ) of the total program. If it were not for the hazard, both  $P_s$  and  $P_n$  would take 1 clock cycle per instruction, but with the hazard,  $P_s$  takes an extra  $w_{clock}$  cycles per instruction. What should be the relation between  $\alpha$ ,  $\beta$  and  $w$  for  $P_n$  to be still faster overall, as compared to  $P_s$ ? (2 marks)

**2. RISCy Pipeline Question**

- |                                                                                |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   |
|--------------------------------------------------------------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| 101 LOAD A, M1<br>102 JUMP 105<br>103 ADD A, B<br>104 SUB B, A<br>105 ADD A, 1 | (a) Generate optimised code with an optimised delayed branch, and show the timing diagram<br><br>(b) Now, generate optimised code without an optimised delayed branch, and show that this had no added complexity<br><br>(c) In both the above cases, show points where a structural hazard could be present, if a multi-port memory were not available.<br><br>Assume that LOAD and STORE instructions consist of three machine cycles -<br>(i) I: Instruction fetch and decode, (ii) E: Execution, and (iii) D: Memory Operation. All other instructions consist of the first two phases, alone. Here, $M_i$ denote memory locations, and A and B are registers. The first statement loads A with the contents of memory location $M_1$ ; the statement ADD A, 1 performs $A \leftarrow A + 1$ ; etc. (6 marks) |
|--------------------------------------------------------------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|

- ✓ **3. Reserving a Table...for a Function...** Consider the following reservation table, and consider a static pipeline.

|         | 0 | 1 | 2 | 3 | 4 | 5 | 6 |
|---------|---|---|---|---|---|---|---|
| Stage 1 |   |   | ✗ |   | ✗ |   |   |
| Stage 2 |   | ✗ |   | ✗ |   | ✗ |   |
| Stage 3 | ✗ |   | ✗ |   |   |   | ✗ |

- ✓ (a) What is the MAL lower bound given by the MAL Lemma (Lemma 3-1)?
- ✓ (b) What is the upper bound on the MAL given by Lemma 3-3?
- ✓ (c) Draw the modified state diagram for this reservation table.
- ✓ (d) What is the physical significance of Lemma 3-2 here? (1+1+2+2 marks)

P. T. O.

Department of Electrical Engineering, IIT Delhi  
**EEL601 Computer Architecture: Major Examination**  
(Closed book/Closed Notes) Time: 2 hours Maximum Marks: 28

"Thou shalt not covet thy neighbour's answers"

**1. One for sorrow, two for joy? United we stand, divided we fall?**

Consider a comparison of a combined/unified cache of capacity  $M$ , and two-cache/split-cache organisation with an instruction cache and a data cache, each of size  $M/2$ . Suppose the average number of misses per instruction for the instruction cache, data cache and the unified cache are  $m_I$ ,  $m_D$  and  $m_U$ , respectively. Assume that of all instructions in a typical program, on an average, a fraction  $f$  of them are data transfer instructions (LOAD/STORE). Assume the miss penalty to be  $p$  clock cycles. Assume that the cache time is a total of  $h$  clock cycles on a unified cache, and 1 clock cycle, otherwise. Assume write-through caches with write buffers, and ignore stalls due to the write buffer. Assume that  $m_I + m_D > m_U$ . Show that the unified cache is better than the split cache if the following condition holds:  $h < 1 + (m_I + m_D - m_U)p/f$ . (7 marks)

**2. Multi-core Write-Back Caches with the Write Invalidate Protocol, assuming Inclusion Concise, but precise answers, please!**

- (a) What is the inclusion property, for a multi-core processor?
- (b) The processor's local cache organisation can be direct-mapped/fully associative/set associative. Which one will give the worst-case performance, with regard to two parameters: the number of tag bits, and second, the rate of collisions?
- (c) When is a cache block said to be in an 'Invalid' state, even when it has some data pertaining to a valid memory block?
- (d) Can a block in a 'Shared' state be the only one that has a required block? If such a case can occur, what is the difference between a block in a 'Shared' state and one in a 'Modified/Exclusive' state?
- (e) The signals that can come from the bus to a particular cache block, can be an 'Invalidate', a 'Read Miss', or a 'Write Miss'. Why can there be nothing else, such as a 'Read Hit', or a 'Write Hit'?
- (f) A cache block may be in any of the three states 'Invalid', 'Shared' and 'Modified/Exclusive'. Based on this information, if one were to look at all possible combinations how many can there possibly be? We know that all of these combinations may not be practical.
- (g) Ok, I will press you further on this. How many of the above conditions are physically impossible, for signals coming from the processor, to the local cache (and not from the bus), and why?
- (h) Describe the case when a 'Read Miss' results in a *memory write*.
- (i) Under what situation does a processor's local cache send out an 'Invalidate' signal onto the bus?
- (j) Comment on the following separate situations: two local caches have dirty blocks, either with the same data, or different
- (k) Comment on the following situation: a cache receives an invalidate signal on the bus, with the corresponding block inside it, in a 'Modified/Exclusive' state. (1+2+2+2+2+2+2+2+2+2) marks

**MINOR 1**  
**(ELL305) Computer Architecture**

**Time: 1 Hour**

2016

Max. Marks: 25

**NAME:**

**Entry No.: -**

**Group: -**

**Serial Number as in Attendance Sheet: -**

**N. B.:** Do the calculations on last page attached. This is open book / notes examination, but exchange to each other is strictly prohibited.

**Q1:** - Select **only one** correct option in the following questions by ( ✓ ) mark. (2)

(a) 8086 (Intel family) can communicate with peripheral devices in modes:

(a) Synchronous [ ] (b) Semi-synchronous [ ] (c) Asynchronous[ ] (d) Watch dog mode [ ]

(b) In case of CMOS (ordinary) IC to be able to drive TTL IC we need:

(a) Buffer IC [ ] (b) Pull up resistor [ ] (c) Both of them [ ] (d) None of them [ ]

**Take the value of  $m$   $n$  =                  for this question paper**

(b): - Write the contents of 'AX' in *hexadecimal after* every numbered operation in the following:

(This is a **continuous** program which starts from **AX = 0000 H** and is executed sequentially). (3)

|     |      |                  |                     |
|-----|------|------------------|---------------------|
|     | MOV  | AL, <u>m_n</u> H |                     |
|     | MOV  | AH, <u>m_n</u> H |                     |
|     | AND  | AX, 0FFFF H      |                     |
|     | OR   | AX, 55AA H       |                     |
| (1) | TEST | AX, 0234 H       | New value of AX = H |
|     | INC  | AH,              |                     |
|     | MOV  | CL, 04 H         |                     |
| (2) | SHL  | AL, CL           | New value of AX = H |
|     | ADD  | AX, 0001 H       |                     |
|     | AAA  |                  |                     |
|     | AND  | AX, 0F0F H       |                     |
| (3) | AAD  |                  | New value of AX = H |
|     | XOR  | AX, 55BB H       |                     |
|     | DCR  | AL               |                     |
| (4) | OR   | AX, 3333 H       | New value of AX = H |

#### **CALCULATION SPACE:**

**Q2: - (8086 based)** What will be the value in AX after executing the following instructions? Give the answer in both hexadecimal and binary. (Numbers are decimal if they are not indicated with 'H') (2)

|     |          |
|-----|----------|
| MOV | AL, 15   |
| MOV | AH, 14   |
| XOR | AL, AL   |
| MOV | CL, 03   |
| SHR | AX, CL   |
| ADD | AL, 90 H |
| ADC | AH, 0    |

**ANSWER:** Final AX in decimal =

**Final AX in Hexadecimal =**

#### CALCULATION SPACE:

**O5:** - Draw neat schematic diagram of 8086 based controller board with following feature.

(4)

It should be able to drive a stepper motor in clock wise and anti-clock wise rotation. Show the scheme (By port pins only) and indicate the programming steps needed. If a delay routine of 10 milisecond is given what maximum RPM you may achieve using a stepper motor of 7.2 degree per step rotation.

### DIAGRAM & ANSWER:

(b) :- (8086 based) Show how to use XLAT instruction to access 9th element in the table called 'MARKS' that is located in the stack segment.

**Answer:**

~~XLAT B~~ ~~BX~~ \*  
MOV BX, ~~WORD PTR~~  
XLAT SS: Marks

**Q1 (a):** - Write the value of 'AL' register after the last operation in I, II, and III.

(All numbers are in 2's complement representation).

|     |     |                |
|-----|-----|----------------|
| (I) | MOV | BL, <u>m n</u> |
|     | MOV | AL, 04 H       |
|     | MOV | CL, 04 H       |
|     | SAR | BL, CL         |
|     | ADD | AL, BL         |

The value in 'AL' = H

|      |     |                |
|------|-----|----------------|
| (II) | MOV | AL, <u>m n</u> |
|      | MOV | BL, 0F0 H      |
|      | MOV | CL, 04 H       |
|      | SHL | AL, CL         |
|      | AND | AL, BL         |

The value in 'AL' = H

(III)    MOV                BL, m n  
           OR                BL, 80 H

|     |          |
|-----|----------|
| MOV | CL, 04 H |
| MOV | AL, 00 H |
| SAR | BL, CL   |
| XOR | AL, BL   |

The value in 'AL' = H

(b): - Write the flag status below respective flags after the last operation in a, b, and c.  
Put [X] for the undefined values.  
(All numbers are in 2's complement representation).

(a) MOV BL, m n  
MOV AL, B1 H  
SAR AL, 1  
SUB AL, BL

(b) MOV AL, m n  
MOV BL, 80 H  
ADD AL, BL  
DEC BL

(c) MOV BL, m n  
MOV AL, 04 H  
ADD AL, BL

Answer:

|   | OF | AF |
|---|----|----|
| a |    |    |
| b |    |    |
| c |    |    |

$$\begin{array}{l}
 \text{110101010} \\
 \text{---} \\
 \text{-2}^7 + 2^8 + 32 \\
 \text{42-128} \\
 \underline{-86} \quad (-87) \xrightarrow{25} \quad 88 \\
 \text{---} \quad \underline{\underline{0}} \quad \underline{\underline{0}}
 \end{array}$$

Q2: - A task in x86 program is to multiply two **16-bit** numbers together as if they were **fixed-point** numbers, where the implicit decimal point is in the **middle** of each number (i.e. eight bits of whole number to the left and eight bits of fraction to the right). One operand is in **AX** and other is in **BX**. Write a code fragment (not a subroutine or complete program) in correct x86 assembly language, which returns the **fixed point product** in **AX**. Ignore problems with overflow, and don't worry about register transparency; simply return in **AX** the middle **16** bits of the **32-bit** product.

Answer: MUL BX  
MOV CX, 8  
AGAIN: SHR DX, 1  
ROR AX, 1  
Loop AGAIN



Q1: In a new concept of an incubator there is main power switch 'P' which energizes the system. In this incubator following event occurs (which are controlled by 8086 based system): (2 + 4 + 2)

- (a) The heater (with fan) will be on when the on-switch ('P') is activated, the cover is closed and the temperature is below the set level.
- (b) One Peltier-effect based cooling (with fan) will be on if temperature is above the limit and the cover is closed.
- (c) A small light inside will be turned on if the light-switch ('S') is turned on or whenever cover is closed.
- (d) One cover of a humidifier is opened whenever **Humidity** level is below the limit and the doors are closed.
- (e) A gap of 5 minutes is needed between heater and cooler **switchovers** and vice versa.

**Notations:** -For cooler E (1 = on), light L (1 = on), power switch P (1 = on), heater H (1 = on), temperature  $T_A$  (1 = over the limit), temperature  $T_B$  (1 = below the limit), light switch S (1 = on), Cover C (1 = closed), Humidity slit HY (1 = open). Humidity level in % (1 = below limit).

Draw the schematic diagram (Neatly) and make a flow chart with appropriate remarks for above mentioned 8086-based incubator.

Write an assembly (8086) program to implement (e) as above (using your own port numbers and notations given). Assume all needed ports, A/D converters, external timing pulses of 1 minute and 5 minutes are available. Use wherever S/H or I. S. R. are needed. Neatness and clarity may yield better marks. No need to draw ladder diagram but it is desirable that you mention the Boolean equations for implementation of flow chart.

**Q1:-** In a green-house controller when the on-switch ('P') is activated following event occurs:

- (a) The heater will be on if the door is closed and the temperature is below the **limit 2**.
- (b) The exhaust-cooler will be on if the temperature is above the **limit 1** and the door is closed.
- (c) Lights inside will be turned on if the light-switch ('S') is turned on or whenever the door is opened.
- (d) Carbon dioxide cylinder is opened whenever **CO<sub>2</sub>** -level is below the limit and the door is closed.
- (e) A gap of 2 minutes is needed between heater and exhaust **switchovers**, more over both of them are never on at a time.

**Notations:**

For exhaust-cooler E (1 = on), light L (1 = on), power switch P (1 = on), heater H (1 = on), temperature  $T_A$  (1 = over the limit 1), temperature  $T_B$  (1 = below the limit 2), light switch S (1 = on), door switch D (1 = closed), CO<sub>2</sub> cylinder CY (1 = open), CO<sub>2</sub> level PPM (1 = below limit).

Draw the schematic diagram (Neatly) and make a flow chart with appropriate remarks for above mentioned 8086-based green-house controller.

Write an assembly (8086) program to implement (e) as above (use your own port numbers and notations given).

Assume all needed ports, A/D converters, external timing pulses of 1 minute and 5 minutes are available. Use wherever S/H or I. S. R. are needed. Neatness and clarity may yield better marks.

(1+3+3)

**ANSWER:**

### SCHEMATIC DIAGRAM

### FLOW CHART:

Write an 8086 program to reverse the order of bits in register AL. For example, if initially the pattern in AL is 111...01, the result left in AL should be 10...111. (Write program very neatly by assuming some more registers but final answer should be in AL). Program should have minimum number of instructions.

(4)

- (a) Write program fragment to reverse the contents of AL.

We shift left the bits of AL, one at a time, and put them in BL such that we end up with:

|    |      |      |      |      |      |      |      |        |
|----|------|------|------|------|------|------|------|--------|
| AL | 0(1) | 0(0) | 0(0) | 0(0) | 0(1) | 0(0) | 0(1) | 0(1) 0 |
|----|------|------|------|------|------|------|------|--------|

|    |    |    |    |    |    |    |    |    |
|----|----|----|----|----|----|----|----|----|
| BL | a0 | a1 | a2 | a3 | a4 | a5 | a6 | a7 |
|----|----|----|----|----|----|----|----|----|

We then copy BL to AL

PUSH BX  
PUSH CX  
MOV CX, 8

CF (1)

AL

0

SHL AL, 1

(3)

RCR BL, 1

(2)

BL

LOOP L1

MOV AL, BL

POP CX

POP BX

CH CCL  
[0 0] [0 8]

EX BX

Q3:- Write the contents of 'AL/AX' after the last operation in the following:

(2)

- (a) MOV AL, mn H
- MOV AH, mn H
- MOV CL, 04 H
- SAR AL, CL
- XOR AX, FF0F H
- AAD



Final AX after AAD = H

(b)

- MOV AL, mn H
- TEST AL, AA H
- OR AX, FF00 H
- STC
- ADC AL, AH
- CLC
- ROL AL, 1



New AL after ROL = H

Q2:- Use ideal operational amplifiers, thermocouple with amplifier and AD590 with amplifier circuit along with given guidelines using minimum number of Op-Amps.

- (1) Thermocouple with amplifier gives 5 m. volts per degree Centigrade (Use it in circuit as block).
- (2) AD590 with amplifier circuit gives 0.32 volts for freezing point of water and 2.12 volts for boiling point of water.
- (3) Resistances available in laboratory are all standard values between 10 K. Ohm and 1 M. Ohm (Variable resistances of different standard values are also available).
- (4) The final circuit should give 1.800 volts for 1800 ° C (Hot junction temperature) correctly even at the cold junction temperature of 100° C.

(5) Mention gain/attenuation of each stage and highlight voltages as well.

(6)

SEPT 11

Circuit diagram with calculations:

Q3:- For the circuit given in figure (3) below answer the questions.

(3)



Here 3M3 implies 3.3 M. Ohm and similarly others.

(a) - What is the D.C. gain of the input stages (excluding the rightmost op-amp stage)?

Answer:

(b) - If  $C_1 = 2$  microfarad what is the lower cutoff frequency of this circuit?

Answer:

(c) - If  $C_2 = 0.0047$  microfarad what is the upper cutoff frequency of this circuit?

$$f_u = \frac{1}{2\pi R_F C_F}$$

Answer:

(d) - What is the overall gain of the circuit at the mid-band frequency?

Answer:

**Q1:-** A piezo-electric sensor is attached to steam turbine, which has constant pressure as well as a cyclic (fluctuating) pressure. The circuit diagram for processing the output of the pressure sensor prior to digitization is given in Figure below. The above sensor which is attached to this circuit has a sensitivity of approximately **1 mV per mili-bar** and can tolerate up to maximum 30 mili-bars. The outputs  $V_b$  and  $V_f$  are the constant pressure signal and the fluctuating signal respectively.

(5)



→ Given:  $R_1 = 10$  K. Ohm,  $R_3 = 50$  K. Ohm,  $R_5 = 10$  K. Ohm,  $C_6 = 47$  Nano-Farad and noise is 50 Hz and higher.

(a) Both the second stage outputs are digitized and to minimize the cost of manufacture they will both be digitized by the same device (A/D converter) which has a 0–6 V input range. Given that the observed fluctuations are up to 3 milli-bars (at the input). What overall gains should the two branches ( $V_b$  and  $V_f$ ) have?

(b) The first stage should have a gain of 25. What value should  $R_2$  be set to?

(c) What is the purpose of  $C_4$ ? What value should it and  $R_4$  therefore be set to given the decisions made so far?

(d) The second stage for  $V_f$  is configured as a pass-band with an upper cut off of 'X' Hz. What purpose does the capacitor  $C_5$  serve? What value should  $R_6$  be set to (If  $C_5 = 100$  micro farad)? Give 2<sup>nd</sup> stage gain for branch  $V_f$ . What are the lower and upper cut off frequencies of this branch?

**Q2:** - In a green-house controller there is a relay coil coupled power switch 'P' which energizes the master relay coil CR<sub>1</sub>. In this green house for the condition when the on-switch ('P') is activated following event occurs:

- (a) The heater will be on when the doors are closed and the temperature is below the limit 2.
- (b) The exhaust-cooler will be on when the temperature is above the limit 1 and the doors are closed.
- (c) Lights inside will be turned on if the light-switch ('S') is turned on or whenever the doors are opened.
- (d) Carbon dioxide cylinder is opened whenever CO<sub>2</sub> -level is below the limit and the doors are closed.
- (e) A gap of 5 minutes is needed between heater and exhaust switchovers, more over both of them are never on at a time.

#### Notes:

For exhaust扇风机 (1 = open), light L (1 = on), power switch P (1 = on), burner II (1 = on), water tank T<sub>1</sub> (1 = over the limit 1), temperature T<sub>2</sub> (1 = below the limit 2), light switch S (1 = on), door switch D (1 = closed), CO<sub>2</sub> cylinder CY (1 = open), CO<sub>2</sub> level PFM (1 = before limit).

Assume also that + NC switch L<sub>1</sub> is open when temperature is above the set level, similarly another NC switch L<sub>2</sub>. Whatever arrangements you make about their switches, clearly highlight them and use them. You can also assume relay coils associated with burner and exhaust fan for your convenience in implementation.

Draw the electrical diagram (Neatly) and make a flow chart with appropriate remarks for above mentioned RISC-based green-house controller.

Note on assembly: (10%) program to implement (b) as above (use your own port numbers and connection points).

Assume all needed parts: A/D converter, external timing pulses of 1 minute and 5 minutes are available. Use whenever S.M or I, S, R, are required. Neatness and clarity may yield better marks. No need to draw ladder diagram but it is desirable that you mention the Boolean equations for flow chart.  
(2 x 2 = 2)

Q1. Explain the working principle of a microcontroller.

Q2. What is meant by a microcontroller?

Q3. Define the term microprocessor.

Q4. What is the difference between microprocessor and microcontroller?

Q5. What is the difference between 8085 and 8086?

Q6. What is the difference between 8085 and 8088?

Q7. What is the difference between 8086 and 8088?

Q8. What is the difference between 8086 and 8080?

Q9. What is the difference between 8080 and 8085?

Q10. What is the difference between 8085 and 808086?

Q11. What is the difference between 8085 and 808088?

Q12. What is the difference between 808086 and 808088?

Q13. What is the difference between 808086 and 808080?

Q14. What is the difference between 808080 and 808088?

Q15. What is the difference between 808080 and 808086?

Q16. What is the difference between 808086 and 808080?

Q17. What is the difference between 808088 and 808086?

Q18. What is the difference between 808088 and 808080?

Q19. What is the difference between 808086 and 808088?

Q20. What is the difference between 808080 and 808086?

Q21. What is the difference between 808088 and 808080?

Q22. What is the difference between 808086 and 808088?

Q23. What is the difference between 808080 and 808086?

Q24. What is the difference between 808086 and 808080?

Q25. What is the difference between 808088 and 808086?

Q26. What is the difference between 808086 and 808088?

Q27. What is the difference between 808080 and 808086?

Q28. What is the difference between 808086 and 808080?

Q29. What is the difference between 808088 and 808086?

Q30. What is the difference between 808086 and 808088?

Q31. What is the difference between 808080 and 808086?

Q32. What is the difference between 808086 and 808080?

Q33. What is the difference between 808088 and 808086?

Q34. What is the difference between 808086 and 808088?

Q35. What is the difference between 808080 and 808086?

Q36. What is the difference between 808086 and 808080?

Q37. What is the difference between 808088 and 808086?

Q38. What is the difference between 808086 and 808088?

Q39. What is the difference between 808080 and 808086?

Q40. What is the difference between 808086 and 808080?

Q41. What is the difference between 808088 and 808086?

Q42. What is the difference between 808086 and 808088?

Q43. What is the difference between 808080 and 808086?

Q44. What is the difference between 808086 and 808080?

Q45. What is the difference between 808088 and 808086?

Q46. What is the difference between 808086 and 808088?

Q47. What is the difference between 808080 and 808086?

Q48. What is the difference between 808086 and 808080?

Q49. What is the difference between 808088 and 808086?

Q50. What is the difference between 808086 and 808088?

Q51. What is the difference between 808080 and 808086?

Q52. What is the difference between 808086 and 808080?

Q53. What is the difference between 808088 and 808086?

Q54. What is the difference between 808086 and 808088?

Q55. What is the difference between 808080 and 808086?

Q56. What is the difference between 808086 and 808080?

Q57. What is the difference between 808088 and 808086?

Q58. What is the difference between 808086 and 808088?

Q59. What is the difference between 808080 and 808086?

Q60. What is the difference between 808086 and 808080?

Q61. What is the difference between 808088 and 808086?

Q62. What is the difference between 808086 and 808088?

Q63. What is the difference between 808080 and 808086?

Q64. What is the difference between 808086 and 808080?

Q65. What is the difference between 808088 and 808086?

Q66. What is the difference between 808086 and 808088?

Q67. What is the difference between 808080 and 808086?

Q68. What is the difference between 808086 and 808080?

Q69. What is the difference between 808088 and 808086?

Q70. What is the difference between 808086 and 808088?

Q71. What is the difference between 808080 and 808086?

Q72. What is the difference between 808086 and 808080?

Q73. What is the difference between 808088 and 808086?

Q74. What is the difference between 808086 and 808088?

Q75. What is the difference between 808080 and 808086?

Q76. What is the difference between 808086 and 808080?

Q77. What is the difference between 808088 and 808086?

Q78. What is the difference between 808086 and 808088?

Q79. What is the difference between 808080 and 808086?

Q80. What is the difference between 808086 and 808080?

Q81. What is the difference between 808088 and 808086?

Q82. What is the difference between 808086 and 808088?

Q83. What is the difference between 808080 and 808086?

Q84. What is the difference between 808086 and 808080?

Q85. What is the difference between 808088 and 808086?

Q86. What is the difference between 808086 and 808088?

Q87. What is the difference between 808080 and 808086?

Q88. What is the difference between 808086 and 808080?

Q89. What is the difference between 808088 and 808086?

Q90. What is the difference between 808086 and 808088?

Q91. What is the difference between 808080 and 808086?

Q92. What is the difference between 808086 and 808080?

Q93. What is the difference between 808088 and 808086?

Q94. What is the difference between 808086 and 808088?

Q95. What is the difference between 808080 and 808086?

Q96. What is the difference between 808086 and 808080?

Q97. What is the difference between 808088 and 808086?

Q98. What is the difference between 808086 and 808088?

Q99. What is the difference between 808080 and 808086?

Q100. What is the difference between 808086 and 808080?

Q101. What is the difference between 808088 and 808086?

Q102. What is the difference between 808086 and 808088?

Q103. What is the difference between 808080 and 808086?

Q104. What is the difference between 808086 and 808080?

Q105. What is the difference between 808088 and 808086?

Q106. What is the difference between 808086 and 808088?

Q107. What is the difference between 808080 and 808086?

Q108. What is the difference between 808086 and 808080?

Q109. What is the difference between 808088 and 808086?

Q110. What is the difference between 808086 and 808088?

Q111. What is the difference between 808080 and 808086?

Q112. What is the difference between 808086 and 808080?

Q113. What is the difference between 808088 and 808086?

Q114. What is the difference between 808086 and 808088?

Q115. What is the difference between 808080 and 808086?

Q116. What is the difference between 808086 and 808080?

Q117. What is the difference between 808088 and 808086?

Q118. What is the difference between 808086 and 808088?

Q119. What is the difference between 808080 and 808086?

Q120. What is the difference between 808086 and 808080?

Q121. What is the difference between 808088 and 808086?

Q122. What is the difference between 808086 and 808088?

Q123. What is the difference between 808080 and 808086?

Q124. What is the difference between 808086 and 808080?

Q125. What is the difference between 808088 and 808086?

Q126. What is the difference between 808086 and 808088?

Q127. What is the difference between 808080 and 808086?

Q128. What is the difference between 808086 and 808080?

Q129. What is the difference between 808088 and 808086?

Q130. What is the difference between 808086 and 808088?

Q131. What is the difference between 808080 and 808086?

Q132. What is the difference between 808086 and 808080?

Q133. What is the difference between 808088 and 808086?

Q134. What is the difference between 808086 and 808088?

Q135. What is the difference between 808080 and 808086?

Q136. What is the difference between 808086 and 808080?

Q137. What is the difference between 808088 and 808086?

Q138. What is the difference between 808086 and 808088?

Q139. What is the difference between 808080 and 808086?

Q140. What is the difference between 808086 and 808080?

Q141. What is the difference between 808088 and 808086?

Q142. What is the difference between 808086 and 808088?

Q143. What is the difference between 808080 and 808086?

Q144. What is the difference between 808086 and 808080?

Q145. What is the difference between 808088 and 808086?

Q146. What is the difference between 808086 and 808088?

Q147. What is the difference between 808080 and 808086?

Q148. What is the difference between 808086 and 808080?

Q149. What is the difference between 808088 and 808086?

Q150. What is the difference between 808086 and 808088?

Q151. What is the difference between 808080 and 808086?

Q152. What is the difference between 808086 and 808080?

Q153. What is the difference between 808088 and 808086?

Q154. What is the difference between 808086 and 808088?

Q155. What is the difference between 808080 and 808086?

Q156. What is the difference between 808086 and 808080?

Q157. What is the difference between 808088 and 808086?

Q158. What is the difference between 808086 and 808088?

Q159. What is the difference between 808080 and 808086?

Q160. What is the difference between 808086 and 808080?

Q161. What is the difference between 808088 and 808086?

Q162. What is the difference between 808086 and 808088?

Q163. What is the difference between 808080 and 808086?

Q164. What is the difference between 808086 and 808080?

Q165. What is the difference between 808088 and 808086?

Q166. What is the difference between 808086 and 808088?

Q167. What is the difference between 808080 and 808086?

Q168. What is the difference between 808086 and 808080?

Q169. What is the difference between 808088 and 808086?

Q170. What is the difference between 808086 and 808088?

Q171. What is the difference between 808080 and 808086?

Q172. What is the difference between 808086 and 808080?

Q173. What is the difference between 808088 and 808086?

Q174. What is the difference between 808086 and 808088?

Q175. What is the difference between 808080 and 808086?

Q176. What is the difference between 808086 and 808080?

Q177. What is the difference between 808088 and 808086?

Q178. What is the difference between 808086 and 808088?

Q179. What is the difference between 808080 and 808086?

Q180. What is the difference between 808086 and 808080?

Q181. What is the difference between 808088 and 808086?



Figure 1.37: (a) Inverting comparator with positive feedback. (b) Plot of  $v_O$  versus  $v_I$ .

switching speed of a comparator by increasing the amount of positive feedback at high frequencies. It has no effect on the input voltage at which the op amp switches states.

The output voltage from the circuit of Fig. 1.37(a) can be written

$$v_O = V_{SAT} \operatorname{sgn}(v_+ - v_I) \quad (1.83)$$

Because  $v_O$  has the two stable states  $v_O = +V_{SAT}$  and  $v_O = -V_{SAT}$ , it follows that  $v_+$  can have two stable states given by

$$V_A = V_{REF} \frac{R_F}{R_F + R_1} - V_{SAT} \frac{R_1}{R_F + R_1} \quad (1.84)$$

$$V_B = V_{REF} \frac{R_F}{R_F + R_1} + V_{SAT} \frac{R_1}{R_F + R_1} \quad (1.85)$$

where superposition and voltage division have been used for each equation. For  $v_I < V_A$ , it follows that  $v_O = +V_{SAT}$ . For  $v_I > V_B$ , it follows that  $v_O = -V_{SAT}$ . For  $V_A < v_I < V_B$ ,  $v_O$  can have two stable states, i.e.  $v_O = \pm V_{SAT}$ . The graph of  $v_O$  versus  $v_I$  is given in Fig. 1.37(b).

The value of  $v_O$  for  $V_A < v_I < V_B$  depends on whether  $v_I$  increases from a value less than  $V_A$  or  $v_I$  decreases from a value greater than  $V_B$ . That is, the circuit has memory. If  $v_I < V_A$  initially and  $v_I$  begins to increase,  $v_O$  remains at the  $+V_{SAT}$  state until  $v_I$  becomes greater than  $V_B$ . At this point  $v_O$  switches to the  $-V_{SAT}$  state. If  $v_I > V_B$  initially and  $v_I$  begins to decrease,  $v_O$  remains at the  $-V_{SAT}$  state until  $v_I$  becomes less than  $V_A$ . Then  $v_O$  switches to the  $+V_{SAT}$  state. The path for  $v_O$  on the graph in Fig. 1.37(b) is indicated with arrows. The loop in the graph is commonly called a *hysteresis loop*.

**Example 25** The Schmidt trigger circuit of Fig. 1.37(a) has the element values  $R_F = 1 \text{ M}\Omega$  and  $R_1 = 33 \text{k}\Omega$ . If  $V_{REF} = 3 \text{ V}$  and the op amp saturation voltage is  $V_{SAT} = 12 \text{ V}$ , calculate the two threshold voltages  $V_A$  and  $V_B$ .

**Solution.** By Eqs. (1.84) and (1.85), we have

$$V_A = 3 \frac{1}{1 + 0.033} - 12 \frac{0.033}{1 + 0.033} = 2.52 \text{ V}$$

$$V_B = 3 \frac{1}{1 + 0.033} + 12 \frac{0.033}{1 + 0.033} = 3.29 \text{ V}$$