

# Computation Structures

## CMOS Technology Worksheet

### Concept Inventory:

- PFET, NFET: voltage controlled switches
- CMOS composition rules: complementary pullup and pulldown
- CMOS gates are naturally inverting
- $t_{PD}$  and  $t_{CD}$  timing specifications
- Lenient gates

### Notes:



We want *complementary* pullup and pulldown logic, i.e., the pulldown should be “on” when the pullup is “off” and vice versa.

| pullup | pulldown | F(inputs)     |
|--------|----------|---------------|
| on     | off      | driven “1”    |
| off    | on       | driven “0”    |
| on     | on       | driven “X”    |
| off    | off      | no connection |



### CMOS gates are naturally inverting:

- Rising input (0 to 1): NFETs turn on, PFETs turn off; if output changes, it falls (1 to 0)
- Falling input (1 to 0): NFETs turn off, PFETs turn on; if output changes, it rises (0 to 1)

### Timing:

- $t_{PD}$  (propagation delay): how long after inputs are stable and valid until outputs are stable and valid = max over all paths from input to output (sum of component  $t_{PD}$  along path)
  - $t_{PD}$  specification is an upper bound on all measured propagation delays
- $t_{CD}$  (contamination delay): how long output stays valid after inputs go invalid = min over all paths from input to output (sum of component  $t_{CD}$  along path)
  - $t_{CD}$  specification is a lower bound on all measured contamination delays

### Lenient gate:

- If a subset of a lenient gate’s inputs is suffice to guarantee an specific output value (i.e., the values of the other inputs don’t matter in this case), then the output will remain valid and stable by transitions on the irrelevant inputs.
- CMOS gates are naturally lenient

| NOR: |   | A | B | Z |
|------|---|---|---|---|
| 0    | 0 | 1 |   |   |
| 0    | 1 | 0 |   |   |
| 1    | 0 | 0 |   |   |
| 1    | 1 | 0 |   |   |

| Lenient NOR: |   | A | B | Z |
|--------------|---|---|---|---|
| 0            | 0 | 1 |   |   |
| X            | 1 | 0 |   |   |
| 1            | X | 0 |   |   |



**Problem 1.**

- (A) Which of the above CMOS pulldown circuits would implement F if the corresponding complementary pullup circuit was also provided? For each pulldown, select Yes if it is a valid pulldown for F, and No if it is not a valid pulldown for F.

Yes PD1 (Yes/No): No  
 PD2 (Yes/No): No  
 PD3 (Yes/No): No  
 PD4 (Yes/No): No  
 PD5 (Yes/No): Yes

| A | B | C | D | F |
|---|---|---|---|---|
| 0 | 0 | 0 | 0 | 1 |
| 0 | 0 | 0 | 1 | 1 |
| 0 | 0 | 1 | 0 | 1 |
| 0 | 0 | 1 | 1 | 1 |
| 0 | 1 | 0 | 0 | 1 |
| 0 | 1 | 0 | 1 | 1 |
| 0 | 1 | 1 | 0 | 1 |
| 0 | 1 | 1 | 1 | 1 |
| 1 | 0 | 0 | 0 | 1 |
| 1 | 0 | 0 | 1 | 1 |
| 1 | 0 | 1 | 0 | 1 |
| 1 | 0 | 1 | 1 | 0 |
| 1 | 1 | 0 | 0 | 1 |
| 1 | 1 | 0 | 1 | 0 |
| 1 | 1 | 1 | 0 | 1 |
| 1 | 1 | 1 | 1 | 0 |



- (B) Are all the implementations you selected for part (A) lenient?

All lenient (Yes/No): Yes.

**Problem 2.**

- (A) A single CMOS gate, consisting of an output node connected to a single PFET-based pullup circuit and a single NFET-based pulldown circuit (as described in lecture) computes  $F(A, B, C, D)$ . It is observed that  $F(1, 0, 1, 0) = 1$ . What can you say about the following values?

$\rightarrow 0$  (circle one)  $F(0, 0, 1, 0) = : 0 \dots 1 \dots$  (can't say)

$0 \rightarrow$  (circle one)  $F(1, 1, 1, 0) = : 0 \dots 1 \dots$  (can't say)

(circle one)  $F(1, 1, 1, 1) = : 0 \dots 1 \dots$  (can't say)

all (1)  $\Rightarrow 0$ .

- (B) The Boolean function  $F(A,B,C)$  can be implemented using a *single* CMOS gate operating as a combinational device that obeys the static discipline. It's known that  $F(1,1,0) = 1$  and  $F(0,1,1) = 0$ . What can be determined about the value of  $F$  in the following cases? Please circle one of "0", "1" or "Can't tell".



(circle one)  $F(1,0,0) = \underline{0} \dots \underline{1} \dots$  Can't tell

(circle one)  $F(1,0,1) = \underline{0} \dots \underline{1} \dots$  Can't tell

(circle one)  $F(1,1,1) = \underline{0} \dots \underline{1} \dots$  Can't tell

- (C) A single CMOS gate, consisting of an output node connected to a *single* pullup circuit containing one or more PFETs and a single pulldown circuit containing one or more NFETs (as described in lecture), computes  $F(A,B)$ .  $F$  has the property that for all  $A$ ,  $F(A,0) = \overline{F(A,1)}$ . What can you say about the value of  $F(1,0)$ ?

$$F(1,0) = \overline{F(1,1)}$$

(circle one):  $F(1,0) = \underline{0} \dots 0 \dots$  can't tell

Problem 3.

$$\Rightarrow 1. \quad \Rightarrow 0.$$

For each of the functions  $F$  and  $G$ , if the function can be implemented using a **single CMOS gate**, please draw the corresponding single CMOS gate. If it cannot be implemented using a single CMOS gate, then write NONE. **For full credit use a minimum number of FETs.**

| Draw CMOS implementation of $F(A,B,C)$ below or write NONE if $F$ cannot be implemented as single CMOS gate. | Draw CMOS implementation of $G(A,B,C)$ below or write NONE if $G$ cannot be implemented as single CMOS gate.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    |
|--------------------------------------------------------------------------------------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| <p>No No ?</p> <p><math>F(1,1,1) \Rightarrow 1</math></p> <p>Not a single CMOS.</p>                          | <p><math>\begin{array}{ c c c c c } \hline A &amp; B &amp; C &amp; F &amp; G \\ \hline 0 &amp; 0 &amp; 0 &amp; 1 &amp; 1 \\ 0 &amp; 0 &amp; 1 &amp; 1 &amp; 1 \\ 0 &amp; 1 &amp; 0 &amp; 0 &amp; 1 \\ 0 &amp; 1 &amp; 1 &amp; 1 &amp; 0 \\ 1 &amp; 0 &amp; 0 &amp; 1 &amp; 1 \\ 0 &amp; 0 &amp; 0 &amp; 0 &amp; 0 \\ 1 &amp; 1 &amp; 0 &amp; 0 &amp; 1 \\ 1 &amp; 1 &amp; 1 &amp; 1 &amp; 0 \\ \hline \end{array}</math></p> <p><math>B \ 0 \rightarrow 1</math> (pull down?).</p> <p><math>F(1 \rightarrow 0)</math>.</p> <p><math>C \ \underline{0} \rightarrow 1</math> pulldown.</p> <p><math>F(t \rightarrow 0)</math></p> |

### Problem 4.

Consider the Boolean function that has the truth table shown to the right; a possible implementation as a combinational circuit is shown in the schematic below. You may assume that the NOR2 and NAND2 components are combinational.



|       | $t_{CD}$ | $t_{PD}$ |
|-------|----------|----------|
| NOR2  | 0.01ns   | 0.05ns   |
| NAND2 | 0.01ns   | 0.03ns   |

| A | B | C | H |
|---|---|---|---|
| 0 | 0 | 0 | 1 |
| 0 | 0 | 1 | 1 |
| 0 | 1 | 0 | 1 |
| 0 | 1 | 1 | 1 |
| 1 | 0 | 0 | 1 |
| 1 | 0 | 1 | 0 |
| 1 | 1 | 0 | 1 |
| 1 | 1 | 1 | 0 |

- (A) Using the timing specifications shown above for NOR2 and NAND2, compute the contamination and propagation delay for the implementation of H shown above.

为 H4 = 0.01ns,  
只有 A 变化，输出 H 也会变化。  
timing for H (ns):  $t_{CD} = \underline{0.01}$   $t_{PD} = \underline{0.16}$ .

- (B) Can H be implemented as a single CMOS gate (only PFETs in the pullup circuit, only NFETs in the pulldown circuit)? If so draw the MOSFET schematic for H to the right, otherwise write "NO".

Draw schematic or write "NO"



### Problem 5.

A gate-level schematic is shown below. Using the  $t_{CD}$  and  $t_{PD}$  information for the gate components shown in the table below, compute  $t_{CD}$  and  $t_{PD}$  for the circuit.



Compute timing specs:

$$t_{CD} = \underline{0.5} \text{ ns}$$

$$t_{PD} = \underline{5} \text{ ns}$$

| Gate  | $t_{CD}$ | $t_{PD}$ |
|-------|----------|----------|
| INV   | 0.1ns    | 1.0ns    |
| NAND2 | 0.2ns    | 1.5ns    |
| NAND3 | 0.3ns    | 1.8ns    |
| XOR2  | 0.6ns    | 2.5ns    |

**Problem 6.**

A minority gate has three inputs (call them A, B, C) and one output (call it Y). The output will be 0 if two or more of the inputs are 1, and 1 if two or more of the inputs are 0.

In the space below, draw the *pulldown* circuit for a single CMOS gate that implements the minority function, using the minimum number of NFETs. You needn't draw the *pullup* circuit.

If you're convinced that the function cannot be implemented as a single CMOS gate, give a brief, convincing explanation.

Can it be implemented as single CMOS gate? Circle one:  YES    can't tell    NO



### Problem 7.

In his bid for the Lemelson Prize, Ben Bitdiddle has invented the “flexible gate,” a single CMOS gate that implements different functions depending how its inputs are wired up. The FlexGate® (see figure at right) uses 6 PFETs in its pullup circuit and 6 NFETs in its pulldown circuit..

Each of the FlexGate’s twelve inputs can connected to an input signal (X, Y, ...), GND (logical “0”) or VDD (logical “1”). To show off its versatility, Ben has asked you to show how to hook up the inputs so the FlexGate computes several different functions whose Boolean equations are given below. Associated with each equation is a table with 12 entries; in each cell of the table please write an input name, GND or VDD as appropriate. Note that there may be several possible implementations for each of the three functions – any correct answer will be acceptable. Hint: there should be an entry in each cell, i.e., a connection should be specified each input!

If the desired function cannot be implemented, please draw a big “X” through the table.

$$\begin{array}{cc} X & Y \\ \begin{matrix} 0 & 0 \\ 0 & 1 \\ 1 & 0 \\ 1 & 1 \end{matrix} & \begin{matrix} 1 \\ | \\ 1 \\ 0 \end{matrix} \\ OUT = \overline{X \cdot Y} \end{array}$$

$$\begin{array}{cc} & OUT \\ \begin{matrix} 1 & 0 & 0 \\ 0 & 0 & 0 \end{matrix} & \begin{matrix} 0 \\ | \\ 1 \end{matrix} \\ OUT = \overline{X + Y + Z} \end{array}$$

Fill in tables below or mark with “X”

$$0 + \cancel{0}/2 = 1.$$

$$OUT = \overline{X + Y \cdot Z}$$

| input | LEFT | RIGHT |
|-------|------|-------|
| A     | G    | G     |
| B     | V    | V     |
| C     | X    | Y     |
| D     | X    | G     |
| E     | G    | G     |
| F     | G    | Y     |

| input | LEFT | RIGHT          |
|-------|------|----------------|
| A     | X    | V              |
| B     | V    | Y              |
| C     | Z    | V              |
| D     | X    | X <del>Z</del> |
| E     | G    | Y              |
| F     | Z    | X <del>Z</del> |

| input | LEFT | RIGHT |
|-------|------|-------|
| A     | X    | V     |
| B     | V    | G     |
| C     | Y    | Z     |
| D     | X    | V     |
| E     | G    | Y     |
| F     | Z    | V     |



**Problem 8.**

The response of a combinational gate to a test input waveform is shown below. Each horizontal division of the plot represents 10 ps.



- (A) Based on the figure below, what is an appropriate choice for the contamination delay of the gate?

$$\underbrace{30 \text{ ps}}_{\text{invalid}} \text{ or } 20 \text{ ps.} \Rightarrow 30 \text{ ps.}$$

$$\text{invalid} \Rightarrow \text{invalid.} \quad \min(40 \text{ ps}), 30 \text{ ps.}$$

- (B) Based on the figure below, what is an appropriate choice for the propagation delay of the gate?

$$\text{valid} \Rightarrow \text{valid.}$$

$$90 \text{ ps. or } 50 \text{ ps} \Rightarrow 90 \text{ ps}$$

$$\max(90 \text{ ps}, 50 \text{ ps}) \Rightarrow 90 \text{ ps.}$$

# Computation Structures

## Combinational Logic Worksheet

### Concept Inventory:

- Truth tables  $\leftrightarrow$  sum-of-products equations
- implementation using NOT/AND/OR
- Demorgan's Law, implementation using NAND/NOR

- Simplification, truth tables w/ don't cares
- Karnaugh maps
- Implementation using MUXes and ROMs

### Here's a Design Approach

**Truth Table**

| C | B | A | Y |
|---|---|---|---|
| 0 | 0 | 0 | 0 |
| 0 | 0 | 1 | 1 |
| 0 | 1 | 0 | 0 |
| 0 | 1 | 1 | 0 |
| 1 | 0 | 0 | 0 |
| 1 | 0 | 1 | 0 |
| 1 | 1 | 0 | 1 |
| 1 | 1 | 1 | 1 |

1. Write out our functional spec as a truth table  
2. Write down a Boolean expression with terms covering each '1' in the output:  

$$Y = \bar{C}BA + \bar{C}BA + C\bar{B}A + CBA$$
  
3. We'll show how to build a circuit using this equation in the next two slides.

This approach will always give us Boolean expressions in a particular form: SUM-OF-PRODUCTS

6.004 Computation Structures L04: Logic Synthesis, Slide #7

### Straightforward Synthesis

We can implement SUM-OF-PRODUCTS with just three levels of logic:

1. Inverters
2. ANDs
3. OR



-it's systematic!  
-it's work!  
-it's easy!  
-are we done yet???

Propagation delay -- No more than 3 gate delays?\*

\*assuming gates with an arbitrary number of inputs, which, as we'll see, isn't a good assumption!

6.004 Computation Structures L04: Logic Synthesis, Slide #8

### CMOS Sum-of-products Implementation

**NAND-NAND**

$\bar{AB} = \bar{A} + \bar{B}$  "Pushing Bubbles"



**NOR-NOR**

$\bar{AB} = \bar{A} + \bar{B}$



You might think all these extra inverters would make this structure less attractive. However, quite the opposite is true.

6.004 Computation Structures L04: Logic Synthesis, Slide #11

### Finding Implicants

An implicant

- is a rectangular region of the K-map where the function has the value 1 (i.e., a region that will need to be described by one or more product terms in the sum-of-products)
- has a width and length that must be a power of 2: 1, 2, 4
- can overlap other implicants
- is a prime implicant if it is not completely contained in any other implicant.



• can be uniquely identified by a single product term. The larger the implicant, the smaller the product term.

6.004 Computation Structures L04: Logic Synthesis, Slide #18

### Write Down Equations

Picking just enough prime implicants to cover all the 1's in the KMap, combine equations to form minimal sum-of-products.

**K-Map:**

| C\AB | 00 | 01 | 11 | 10 |
|------|----|----|----|----|
| 0    | 0  | 0  | 1  | 1  |
| 1    | 0  | 1  | 1  | 0  |

$Y = \bar{A}\bar{C} + BC$

**K-Map:**

| C\AB | 00 | 01 | 11 | 10 |
|------|----|----|----|----|
| 00   | 0  | 1  | 1  | 1  |
| 01   | 1  | 1  | 1  | 1  |
| 11   | 1  | 1  | 1  | 1  |
| 10   | 1  | 0  | 0  | 1  |

$Y = D + \bar{B}\bar{C} + \bar{A}\bar{C} + \bar{B}C$

Minimal SOP is not necessarily unique!

6.004 Computation Structures L04: Logic Synthesis, Slide #20

### Prime Implicants, Glitches & Leniency

This circuit produces a glitch on Y when A=1, B=1, C: 1 $\rightarrow$ 0



**K-Map:**

| C\AB | 00 | 01 | 11 | 10 |
|------|----|----|----|----|
| 0    | 0  | 0  | 1  | 1  |
| 1    | 0  | 1  | 1  | 0  |

$Y = \bar{C}A + CB$

To make the circuit lenient, include product terms for ALL prime implicants.



6.004 Computation Structures L04: Logic Synthesis, Slide #21

**Problem 1.**

Given a function F defined by the truth table to the right, provide a minimal sum-of-products expression for F. Hint: Use a Karnaugh Map.



Minimal Sum-of-products Expression for F: \_\_\_\_\_

$$F = \bar{A} + \bar{D} + \bar{C} \cdot \bar{A} \cdot \bar{B}$$

**Problem 2.**

(A) A correctly-formed CMOS gate implementing G(A,B,C) uses the pulldown circuit shown on the right. Please give a minimal sum-of-products expression for G(A,B,C). Hint: use a Karnaugh map!

$$G(A, B, C) = A \cdot (B + C) + B \cdot C$$



Minimal sum-of-products expression for G(A,B,C):  $\bar{C} \cdot \bar{A} + \bar{A} \cdot \bar{B} + \bar{C} \cdot \bar{B}$



(B) Is the function G(A,B,C) from part (B) universal? In other words, can we implement any Boolean function using combinational circuits built only from G gates and the Boolean constants 0 and 1?

可以，C接低电平，就变成了1000,  
 NAND3.  $\Rightarrow$  universal [这是AND!!]  
 $G(A, B, C) \Rightarrow \text{NAND}_3$ .

### Problem 3.

Consider the following truth table which defines two functions F and G of three input variables (A, B, and C).

| A | B | C | F | G |
|---|---|---|---|---|
| 0 | 0 | 0 | 1 | 1 |
| 0 | 0 | 1 | 1 | 1 |
| 0 | 1 | 0 | 0 | 1 |
| 0 | 1 | 1 | 1 | 0 |
| 1 | 0 | 0 | 1 | 1 |
| 1 | 0 | 1 | 0 | 0 |
| 1 | 1 | 0 | 0 | 1 |
| 1 | 1 | 1 | 1 | 0 |



Give the **minimal sum of products** (minimal SOP) logic equation for each of the two functions. Then determine if the minimum sum of products expression would result in a **lenient** implementation of the function. If it does, then enter “**SAME**” for the lenient SOP expression. If **not**, specify what sum of products expression would result in a lenient implementation. Hint: Use Karnaugh maps above to determine the minimal sum of products.

$\rightarrow 0, 1 \Rightarrow$

Minimal sum of products  $F(A,B,C) = \underline{\bar{C} \cdot \bar{B} + \bar{A} \cdot \bar{B} + C \cdot B}$ .

怎么画这个呢？

当  $\bar{A} = 0, C = 1$  时， $F(A,B,C) = 1$ .  
Does minimal SOP for F result in a lenient circuit (circle one)? Yes No.

If “No”, give lenient SOP expression for  $F(A,B,C) = \underline{\bar{C} \cdot \bar{B} + \bar{A} \cdot \bar{B} + C \cdot B + \bar{A} \cdot C}$

感觉这样. F 的值在某个度量从 1  $\rightarrow 0$  或  $0 \rightarrow 1$  的时候  
值不变, 但又依赖这个度量,

Minimal sum of products  $G(A,B,C) = \underline{\bar{C} + \bar{A} \cdot \bar{B}}$

这时候由于

inverter 是延迟

就会有时间

unlentient.

Does minimal SOP for G result in a lenient circuit (circle one)? Yes No

If “No”, give lenient SOP expression for  $G(A,B,C) = \underline{\bar{C} + \bar{A} \cdot \bar{B} + A \cdot B}$

#### Problem 4.

You are trying to select pulldowns for several 3- and 4-input CMOS gate designs. The Pulldowns-R-Us website offers seven different pulldowns, given names PD1 through PD7, diagrammed below:



The web site explains that the customer can choose which inputs or constants (GND, VDD) are connected to each NFET, allowing their pulldowns to be used in various ways to build gates with various numbers of inputs. Since Pulldowns-R-Us charges by transistor, you are interested in selecting pulldowns using the minimum number of transistors for each of the 3-input gates you are designing.

For each of the following 3- and 4-input Boolean functions, choose the appropriate pulldown design, i.e., the one which, properly connected, implements that gate's pulldown using the *minimum number* of transistors. This may require applying Demorgan's Laws or minimizing the logic equation first. If none of the above pulldowns meets this goal, write "NONE".

$$(A) \quad F(A, B, C) = \overline{A + (B \cdot C)} \quad \begin{array}{l} \text{pull down} \\ | \Rightarrow 0. \\ 0 \Rightarrow 1. \end{array} \quad \text{Choice or NONE: } \underline{\hspace{2cm}} \quad \underline{\hspace{2cm}} \quad \underline{\hspace{2cm}}$$

$\therefore F(1) \neq 1.$

$$(B) \quad F(A, B, C) = A + \overline{B \cdot C} \quad \begin{array}{l} \text{Choice or NONE: } \underline{\hspace{2cm}} \end{array}$$

$$= \bar{A} \cdot (\overbrace{\bar{B} \cdot C}^{\text{choice}}) = \bar{A} \cdot (B \cdot \bar{C}).$$

$$(C) \quad F(A, B, C) = (\overline{A \cdot \bar{B}}) + \overline{C} \quad \begin{array}{l} \text{Choice or NONE: } \underline{\hspace{2cm}} \end{array}$$

$$= C \cdot (\hat{A} \cdot \hat{B}) = C \cdot (A + B).$$

$$(D) \quad F(A, B, C, D) = \overline{A + C \cdot (B + D)} \quad \begin{array}{l} \text{Choice or NONE: } \underline{\hspace{2cm}} \end{array}$$

$$= A + C \cdot (B + D).$$

### Problem 5.

You are trying to select pullups for several 3-input CMOS gate designs. The Pullups Galore web site offers seven different pullups, given names PU1 through PU7, diagrammed below:



The web site explains that the customer can choose which inputs are connected to each PFET, allowing their pullups to be used in various ways to build gates with various numbers of inputs. Since Pullups Galore charges by transistor, you are interested in selecting pullups using the minimum number of transistors for each of the 3-input gates you are designing.

For each of the following 3-input Boolean functions, choose the appropriate pullup design, i.e., the one which, properly connected, implements that gate's pullup using the *minimum number* of transistors. This may require minimizing the logic equation first. If none of the above pullups meets this goal, write "NONE".

(A)  $F(A,B,C) = \overline{A} + \overline{B} + \overline{C}$        $A=0 \parallel B=0 \parallel C=0$ .      Choice or NONE: 5.

(B)  $F(A,B,C) = \overline{A} + \overline{B \cdot C}$        $A=0 \parallel B \cdot C=0$ .      Choice or NONE: 5.

(C)  $F(A,B,C) = \overline{\overline{A} + B \cdot C}$        $\overline{\overline{A}} \cdot (\overline{B \cdot C})$       Choice or NONE: 2.

(D)  $F(A,B,C) = A + \overline{B \cdot C}$        $A=1 \parallel B \cdot C=0$ .      Choice or NONE: \_\_\_\_\_

(E)  $F(A,B,C) = \overline{(A+B)} + \overline{(B+C)} + \overline{(A+C)}$        $\overline{A}=0 \Rightarrow 0=0$        $\Rightarrow 7 \text{ uses } 6 \text{ FETs, PFT vs. NFT?}$       Choice or NONE: 6. ?

(F)  $F(A,B,C) = \overline{(A+C) \cdot B}$        $= \overline{\overline{A} \cdot \overline{B} + \overline{B} \cdot \overline{C} + \overline{A} \cdot \overline{C}}$       Choice or NONE: 3.

$$\begin{aligned}
 &= \overline{\overline{B}} + \overline{(\overline{A} + \overline{C})} \\
 &= \overline{\overline{B}} + \overline{(\overline{A} \cdot \overline{C})}.
 \end{aligned}$$

**Problem 6.**

Consider the Boolean function  $H(A, B, C) = \overline{A} + \overline{B} \cdot \overline{C} + A \cdot B \cdot \overline{C}$ . Its truth table is shown to the right and a possible implementation is shown in the schematic below.



| A | B | C | H |
|---|---|---|---|
| 0 | 0 | 0 | 1 |
| 0 | 0 | 1 | 1 |
| 0 | 1 | 0 | 1 |
| 0 | 1 | 1 | 1 |
| 1 | 0 | 0 | 1 |
| 1 | 0 | 1 | 0 |
| 1 | 1 | 0 | 1 |
| 1 | 1 | 1 | 0 |

- (A) Give a minimal sum-of-products expression for H. A couple of scratch 3-input Karnaugh map templates are provided for your convenience.

minimal sum-of-products expression for H:  $\overline{C} + \overline{A}$

|   | 00 | 01 | 11 | 10 |
|---|----|----|----|----|
| 0 | 1  | 1  | 1  | 1  |
| 1 | 1  | 1  | 0  | 0  |

|   | 00 | 01 | 11 | 10 |
|---|----|----|----|----|
| 0 | 1  | 0  | 1  | 0  |
| 1 | 0  | 1  | 0  | 1  |

No simplify.



- (B) What is the largest number of product terms possible in a minimal sum-of-products expression for a 3-input, 1-output Boolean function?

Largest number of product terms possible: 10

max = 4.

|   | 00 | 01 | 11 | 10 |
|---|----|----|----|----|
| 0 | 1  | 1  | 0  | 1  |
| 1 | 0  | 1  | 1  | 1  |



**Problem 7.**

A minority gate has three inputs (call them A, B, C) and one output (call it Y). The output will be 0 if two or more of the inputs are 1, and 1 if two or more of the inputs are 0.

- (A) Give a *minimal sum-of-products* Boolean expression for the minority gates output Y, in terms of its three inputs A, B, and C.

Minimal SOP expression:  $Y = \underline{\bar{C} \cdot \bar{A} + \bar{C} \cdot \bar{B} + \bar{A} \cdot \bar{B}}$

|   |    |    |    |     |
|---|----|----|----|-----|
|   | 00 | 01 | 11 | 10. |
| 0 | 1  | 1  | 0  | 1   |
| 1 | 1  | 0  | 0  | 0   |

- (B) Is a minority gate *universal*, in the sense that using only minority gates (along with constants 0 and 1) its possible to implement arbitrary combinational logic functions?

Universal? Circle one: YES  can't tell  NO

$$Y(1, B, C).$$



| $S$ | $D_1$ | $D_0$ | $D_S$ |
|-----|-------|-------|-------|
| 0   | 0     | 0     | 0     |
| 0   | 0     | 1     | 1     |
| 0   | 1     | 0     | 0     |
| 0   | 1     | 1     | 1     |
| 1   | 0     | 0     | 0     |
| 1   | 0     | 1     | 0     |
| 1   | 1     | 0     | 1     |
| 1   | 1     | 1     | 1     |

### Problem 8.

A 6.004 intern at Intel has designed the combinational circuit shown below. His boss can't figure out what it does and has asked for your help.



| $A$ | $B$ | $C$ | $F(A, B, C)$ |
|-----|-----|-----|--------------|
| 0   | 0   | 0   | 0            |
| 0   | 0   | 1   | 0            |
| 0   | 1   | 0   | 1            |
| 0   | 1   | 1   | 0            |
| 1   | 0   | 0   | 0            |
| 1   | 0   | 1   | 1            |
| 1   | 1   | 0   | 1            |
| 1   | 1   | 1   | 0            |

(A) Please fill in the truth table for  $F(A, B, C)$  above.

Fill in truth table above

(B) Express  $F(A, B, C)$  in minimal sum-of-products form. Hint: use a Karnaugh map!

minimal sum-of-products expression for  $F(A, B, C) = \overline{C} \cdot \overline{B} + C \cdot \overline{A} \cdot \overline{B}$



(C) The boss isn't quite sure what it means but he knows his engineers are always impressed if he asks "is the circuit universal?" Is it? Circle YES or NO.

$F(A, B, C)$  universal? YES ... NO

$C \cdot \overline{A \cdot B}$

| $C \backslash A\ B$ | 00 | 01 | 11 | 10 |
|---------------------|----|----|----|----|
| 0                   | 0  | 0  | 1  | 0  |
| 1                   | 1  | 1  | 1  | 0  |

$F(A, B, C) = \overline{A} \cdot \overline{B} + C \cdot \overline{A}$

the selector  
not lenient.

### Problem 9.

The full subtractor (FS) implements a one-column binary subtraction of two bits ( $X$  and  $Y$ ) producing their difference ( $D$ ), accepting a borrow-in ( $B_{IN}$ ) from the previous column and producing a borrow-out ( $B_{OUT}$ ) for the next column. Numerically FS computes  $X - Y - B_{IN}$  and encodes the possible answers ( $1, 0, -1, -2$ ) using  $D$  and  $B_{OUT}$  as shown in the truth table to the right.

| X | Y | $B_{IN}$ | D | $B_{OUT}$ |
|---|---|----------|---|-----------|
| 0 | 0 | 0        | 0 | 0         |
| 0 | 0 | 1        | 1 | 1         |
| 0 | 1 | 0        | 1 | 1         |
| 0 | 1 | 1        | 0 | 1         |
| 1 | 0 | 0        | 1 | 0         |
| 1 | 0 | 1        | 0 | 0         |
| 1 | 1 | 0        | 0 | 0         |
| 1 | 1 | 1        | 1 | 1         |

- (A) (2 Points) Give a sum-of-products expression for  $B_{OUT}$ .

Sum of products expression:  $B_{OUT} = \overline{X} \cdot B_{in} + Y \cdot B_{in} + \overline{X} \cdot Y$

- (B) (1 Point) Is the FS( $X, Y, B_{IN}$ ) circuit universal in the sense that 2-input NOR and 2-input NAND are universal? In other words, using only acyclic networks of FS circuits (perhaps with one or more of their inputs tied to "0" or "1"), can one implement any combinational logic function?



- (C) (2 Points) You're trying to build an implementation for the  $B_{OUT}$  part of the FS circuit (see truth table above) but discover that the NITGFOC supply room only has 4-to-1 multiplexors in stock. In desperation, you call up your 6.004 TA who says "No problem! In fact, you can produce  $B_{OUT}$  with just a single 4-to-1 mux: connect  $X$  to S1 and  $Y$  to S0, then hook each of the data inputs to the appropriate choice of '0', '1' or  $B_{IN}$ ." Using this hint, finish off the implementation shown below.

Show connections for data inputs using only '0', '1' or  $B_{IN}$



|   |   |   |   |
|---|---|---|---|
| 0 | 0 | 0 | 1 |
| 0 | 0 | 1 | 0 |
| 0 | 1 | 0 | 1 |
| 0 | 1 | 1 | 0 |
| 1 | 1 | 0 | 0 |
| 1 | 1 | 1 | 0 |
| 1 | 0 | 0 | 1 |
| 1 | 0 | 0 | 1 |

**Problem 10.**

The 3-input Boolean function  $G(A,B,C)$  computes  $\bar{A} \cdot \bar{C} + A \cdot \bar{B} + \bar{B} \cdot \bar{C}$ .

- (A) How many 1's are there in the output column of G's 8-row truth table?

4.

- (B) Give a minimal sum-of-products expression for G.

$$\bar{C} \cdot \bar{A} + A \cdot \bar{B}$$

|   |    |    |    |    |
|---|----|----|----|----|
|   | 00 | 01 | 11 | 10 |
| 0 | 1  | 1  | 0  | 1  |
| 1 | 0  | 0  | 0  | 1  |

- (C) There's good news and bad news: the bad news is that the stockroom only has G gates. The good news is that it has as many as you need. Using only combinational circuits built from G gates, one can implement (choose the best response)

- (A) only inverting functions
  - (B) only non-inverting functions
  - (C) any function (G is universal)
  - (D) only functions with 3 inputs or less
  - (E) only functions with the same truth table as G
- (D) Can a sum-of-products expression involving 3 input variables with greater than 4 product terms always be simplified to a sum-of-products expression using fewer product terms?

Yes. Max (...) = 4.

Lec. 4 . 7.

# Computation Structures

## Sequential Logic Worksheet

### Concept Inventory:

- D-latch & the Dynamic Discipline
- D-register
- Timing constraints for sequential circuits
- Set-up and hold times for sequential circuits

復雜的 timing 計算。

不懂。

### Notes:

D latch



D register



$$t_1 = t_{CD,reg1} + t_{CD,L} \geq t_{HOLD,reg2}$$

$$t_2 = t_{PD,reg1} + t_{PD,L} + t_{SETUP,reg2} \leq t_{CLK}$$



### Problem 1.

Consider the following sequential logic circuit. It consists of one input IN, a 2-bit register that stores the current state, and some combinational logic that determines the state (next value to load into the register) based on the current state and the input IN.



- (A) Using the timing specifications shown below for the XOR and DREG components, determine the shortest clock period,  $t_{CLK}$ , that will allow the circuit to operate correctly or write NONE if no choice for  $t_{CLK}$  will allow the circuit to operate correctly and briefly explain why.

$$1.6 + 0.4 + 2 \times t_{PD} \quad 6.2$$

Minimum value for  $t_{CLK}$  (ns): ~~8.2ns~~.  
or explain why none exists

1. ~~4.1ns~~ PREG.

$t_{PD}$  &  $t_{setup}$   $\geq$   $t_{clock}$ ?

- (B) Using the same timing specifications as in (A), determine the setup and hold times for IN with respect to the rising edge of CLK.

$$2 \times 2.1 + 0.4 \quad 5.8 \quad 4.6$$

$t_{setup}$  for IN with respect to  $CLK^\uparrow$  (ns): \_\_\_\_\_

$t_{hold}$  for IN with respect to  $CLK^\uparrow$  (ns): 0.05

$$0.2 - 0.15$$

- (C) One of the engineers on the team suggests using a new, faster XOR2 gate whose  $t_{CD} = 0.05\text{ns}$  and  $t_{PD} = 0.7\text{ns}$ . Determine a new minimum value for  $t_{CLK}$  or write NONE and explain why no such value exists.

Minimum value for  $t_{CLK}$  (ns): No.  
or explain why none exists

$$0.15 < 0.2$$

$t_{sum}(t_{CD}) < t_{Hold..}$

### Problem 2.

Consider the following sequential logic circuit. It consists of three D registers, three different pieces of combinational logic (CL1, CL2, and CL3), one input IN, and one output OUT. The propagation delay, contamination delay, and setup time of the registers are all the same and are specified below each register. **The hold time for the registers is NOT the same** and is specified in bold below each register. The timing specification for each combinational logic block is shown below that logic.



- (A) (1 point) What is the smallest value for the  $t_{CD}$  of CL2 that will guarantee the dynamic discipline is obeyed for all the registers in the circuit?

$$R_2 - CL_2 \rightarrow R_3 \quad \text{Smallest value for } t_{CD} \text{ of CL2 (ns): } \underline{\hspace{2cm}} \quad \text{0ns } 2 \text{ ns.}$$

- (B) (2 points) What is the smallest value for the period of CLK (i.e.,  $t_{CLK}$ ) that will guarantee the dynamic discipline is obeyed for all the registers in the circuit?

$$R_3 - R_2 = 2 + 5 + 3 + 2. \quad \text{Smallest value for } t_{CLK} \text{ (ns): } \underline{\hspace{2cm}} \quad \text{10ns / 12.}$$

- (C) (2 points) What are the smallest values for the setup and hold times for IN relative to the rising edge of CLK that will guarantee the dynamic discipline is obeyed for all the registers in the circuit?

$$\text{Setup time for IN (ns): } \underline{\hspace{2cm}}$$

$$\text{Hold time for IN (ns): } \underline{\hspace{2cm}} \quad 0$$

- (D) (2 points) What are the propagation delay and contamination delay of the output, OUT, of this circuit relative to the rising edge of the clock?

$$t_{PD} \text{ for OUT (ns): } \underline{\hspace{2cm}} \quad \text{2}$$

$$t_{CD} \text{ for OUT (ns): } \underline{\hspace{2cm}} \quad 0.$$

完全搞不懂.....  
use  $R_3$ .  
为什么不是  $R_2$ .

$$CL_3 - CL_1 \rightarrow R_2 \rightarrow CL_2 - R_3 \text{ 呢?}$$

### Problem 3.

Here's a schematic for a 3-bit loadable down-counter, which uses a ripple decrementer as a building block:

| component | $t_{CD}$ (ns) | $t_{PD}$ (ns) | $t_s$ (ns) | $t_h$ (ns) |
|-----------|---------------|---------------|------------|------------|
| XOR2      | .03           | .14           | —          | —          |
| NOR2      | .01           | .05           | —          | —          |
| NOR3      | .02           | .08           | —          | —          |
| INV       | .005          | .02           | —          | —          |
| MUX2      | .02           | .12           | —          | —          |
| DREG      | .03           | .19           | .15        | .05        |



- (A) Using the contamination delays ( $t_{CD}$ ), propagation delays ( $t_{PD}$ ), setup times ( $t_s$ ), and hold times ( $t_h$ ) shown in the table above, please compute the minimum value for the clock period ( $t_{CLK}$ ) for which the circuit will work correctly.

根据时序图， $t_{CLK}$  约为 0.172 ns.

minimum value for  $t_{CLK}$  (ns): 0.172 ns.

~~0.57 ns.~~

- (B) What are the appropriate values for the setup ( $t_s$ ) and hold ( $t_h$ ) times for the **LD** input with respect to the rising edge of the clock?

setup time ( $t_s$ ) for LD: 0.575 ns. 0.17

hold time ( $t_h$ ) for LD: 0.27 ns. 0.03.

- (C) What is the  $t_{PD}$  for the Zero output with respect to the rising edge of CLK?

$t_{PD}$  for Zero (ns): 0.27 ns.

#### Problem 4.

Consider the following sequential logic circuit. The timing specifications are shown below each component. Note that the two registers do NOT have the same specifications.



- (A) What are the smallest values for the setup and hold times for IN relative to the rising edge of CLK that will guarantee the dynamic discipline is obeyed for all the registers in the circuit?

Setup time for IN (ns): 8 ns.

Hold time for IN (ns): 1.5.

- (B) What is the smallest value for the period of CLK (i.e., tCLK) that will guarantee the dynamic discipline is obeyed for all the registers in the circuit?

*for t<sub>pd</sub> + t<sub>cd</sub>*. {  $R_2 - R_1 = 8 + 4 = 12$  } Smallest value for tCLK (ns): 12.

*寄存器之间的距离?* {  $R_1 - R_2 = 3 + 4 + 4 = 11$  }

- (C) What is the smallest for the tCD of R1 that will guarantee the dynamic discipline is obeyed for all the registers in the circuit?

Smallest value for tCD of R1 (ns): 1.5 ?

- (D) Suppose two of these sequential circuits were connected in series, with the OUT signal of the first circuit connected to the IN signal of the second circuit. The same CLK signal is used for both circuits. Now what is the smallest value for the period of CLK (i.e., tCLK) that will guarantee the dynamic discipline is obeyed for all the registers in the circuit?

Smallest value for tCLK (ns): 16

*longest path*

$R_2 \rightarrow \bar{I}_n \rightarrow \text{NAND} \rightarrow R_2$

### Problem 5.

It is often useful to make clocked devices that count in binary, and a simple building block for such binary counters is the toggle flipflop whose symbol is shown on the right. It is a clocked device, hence the clock input indicated by the triangle on its lower-left edge. The other input, T (for *toggle*), may be set to one to cause the TFlop to flip its state (the *Q* output) from 0 to 1 or vice versa on the next active (positive) clock edge. If T is zero at an active clock edge, the state of the TFlop remains unchanged. We assume that the initial state of each TFlop at power-up is  $Q=0$ ; more sophisticated versions might feature a *Reset* input to force a  $Q=0$  state.



A TFlop may be implemented using a D flipflop like the ones developed in lecture together with an XOR2 gate, as shown to the left.

As is our convention for clocked devices, we would like to specify timing specs for the TFlop as  $t_{CD}$ ,  $t_{PD}$ ,  $t_{SETUP}$ , and  $t_{HOLD}$ , all measured relative to the active (positive) clock edge.

- (A) The timing specifications for the components are shown in the table below. Give appropriate values for the timing specifications of the TFlop implementation shown above.

| Component | $t_{CD}$ | $t_{PD}$ | $t_{SETUP}$ | $t_{HOLD}$ |
|-----------|----------|----------|-------------|------------|
| XOR2      | 40ps     | 400ps    | —           | —          |
| DREG      | 100ps    | 300ps    | 80ps        | 40ps       |

$$t_{CD}: \underline{140} \text{ ps}$$

$$t_{PD}: \underline{\cancel{200}}.300 \text{ ps}$$

$$t_{SETUP}: \underline{480.} \text{ ps}$$

$$t_{HOLD}: \underline{0} \text{ ps}$$

- (B) Suppose we connect the T input of a single TFlop to 1 (i.e.,  $V_{DD}$ ) and try to clock it at its maximum rate. What is the minimum clock period we can use and expect the TFlop to perform properly?



**Minimum clock period for correct operation:** 780 ps

We next consider the use of four TFlops to make a 4-bit ripple-carry counter as shown to the left. Assume that the TFlops share a common clock input (not shown) with an appropriate period, and that all TFlops have an initial  $Q=0$  state.

- (C) Suppose we run this circuit for a large number, N, of clock cycles. For approximately how many of the N active clock edges would you expect the T input to the topmost TFlop to be 1?

**Topmost T=1 occurrences in N cycles:** 2.4%

- (D) If the AND2 gates have  $t_{PD}=200\text{ps}$  and  $t_{CD}=40\text{ps}$ , what is the minimum clock period we can use for the 4-bit counter?

**Minimum clock period for correct operation:** 1.4. ps 783

# Computation Structures

## Finite State Machines Worksheet

### Concept Inventory:

- State transition diagrams & FSM truth tables
- Register & ROM implementation
- Equivalent FSMs; equivalent state reduction
- Metastability: causes and cures

### Notes:



FSMs are EQUIVALENT if and only if every inputs sequence yields identical output sequences.

Two states are equivalent if  
 1. both states have identical outputs, AND  
 2. every input transitions to equivalent states.



Quarantine time reduces  $p(\text{metastable})$



**Problem 1.**

- (A) For each of the following FSMs please indicate if they are or are not well formed. Note that the state names have been omitted for clarity; you may assume the state names are unique.



(A)

*Not mutually exclusive.*



(B)

FSM A (circle one): Well Formed / *Not Well Formed*

FSM B (circle one): *Well Formed* / Not Well Formed

FSM C (circle one): Well Formed / *Not Well Formed*

*Not exhaustive.*



(C)

- (B) Given the partially completed truth table and FSM diagram below. **Complete all the missing entries** in the truth table and the FSM diagram. The FSM is a Moore machine, i.e., the Out signal is determined only by the current state. In each state circle, the top entry is  $S_1 S_0$  and the bottom entry is the value of Out. Make sure that you have labeled all missing states, inputs, and outputs, and that you have added and labeled any missing transitions in the FSM.

| S1 | S0 | In | S1' | S0' | Out |
|----|----|----|-----|-----|-----|
| 0  | 0  | 0  | 0   | 0   | 0   |
| 0  | 0  | 1  | 1   | 0   | 0   |
| 0  | 1  | 0  | 0   | 1   | 1   |
| 0  | 1  | 1  | 1   | 0   | 1   |
| 1  | 0  | 0  | 1   | 1   | 1   |
| 1  | 0  | 1  | 0   | 0   | 1   |
| 1  | 1  | 0  | 0   | 1   | 0   |
| 1  | 1  | 1  | 0   | 0   | 0   |



- (C) If this FSM is implemented using a 2-bit state register and a ROM, what size ROM would be needed? Please specify the number of locations (entries) of the ROM, and the width of each entry.

Number of locations in ROM: 8

Width of each ROM entry (bits): 3

### Problem 2.

Consider the sequential logic circuit to the right, which implements an FSM with a single data input IN and single data output OUT. Assume that all signal transitions are timed so that the dynamic discipline is satisfied at each register.



Please describe the operation of the FSM by filling in both the state transition diagram and the truth table shown below. The two-digit state names in the state transition diagram are S0,S1, the logic values present at the outputs of REG0 and REG1 after the rising edge of the clock. In the truth table, S0' and S1' are the values that will be loaded into REG0 and REG1 at the next rising clock edge.

**Fill in state transition diagram and truth table**



| S0 | S1 | IN | S0' | S1' | OUT |
|----|----|----|-----|-----|-----|
| 0  | 0  | 0  | 0   | 0   | 0   |
| 0  | 0  | 1  | 1   | 0   | 0   |
| 0  | 1  | 0  | 0   | 0   | 0   |
| 0  | 1  | 1  | 1   | 0   | 0   |
| 1  | 0  | 0  | 0   | 1   | 0   |
| 1  | 0  | 1  | 1   | 1   | 0   |
| 1  | 1  | 0  | 0   | 1   | 1   |
| 1  | 1  | 1  | 1   | 1   | 1   |

### Problem 3.

A “Thingee” is a clocked device built out of 3 interconnected components, each of which is known to be a 4-state FSM. What bound, if any, can you put on the number of states of a Thingee?

**Max # of states, or “Can’t Tell”:** ≤ 64

**Problem 4.**

Consider the 1-input, 1-output finite state machine with the state transition diagram shown below. Note that the single output P only depends on the current state of the FSM.



- (A) (1 Point) The FSM has been processing inputs for a while and we would like to determine its current state. After entering three additional inputs “000”, we observe that we have reached a state where  $P=0$ . Please circle the possible values for the state *before* the additional three inputs were entered.

Possible values for state before:  $S_1 \text{ } \cancel{S_2} \text{ } S_3 \text{ } S_4 \text{ } \cancel{S_5}$

- (B) (2 Points) Assume that the states are represented by the 3-bit binary values given on the left below. Please fill in the appropriate entries for the partial truth table shown on the right where S is the current state, I is the input value, S' is the next state, P is the output value

| State | Encoding |
|-------|----------|
| S1    | 001      |
| S2    | 010      |
| S3    | 011      |
| S4    | 100      |
| S5    | 101      |

| S     | I   | S'  | P | Fill in partial truth table |
|-------|-----|-----|---|-----------------------------|
| $S_3$ | 011 | 010 | 0 |                             |
| $S_3$ | 011 | 100 | 0 |                             |
| $S_4$ | 100 | 101 | 1 |                             |
| $S_4$ | 100 | 011 | 1 |                             |

- (C) Please identify which, if any, states are equivalent. For example, if states S1, S2, and S4 are equivalent, please write “(S1,S2,S4)”. You may need multiple parenthesized lists if more than one set of states is equivalent.

Equivalent states:  $(S_1, S_3)$  /  $(S_4, S_5)$

### Problem 5.

Perfectly Perplexing Padlocks makes an entry-level electronic lock, the P3b, built from an FSM with two bits of state. The P3b has two buttons (“0” and “1”) that when pressed cause the FSM controlling the lock to advance to a new state. In addition to advancing the FSM, each button press is encoded on the B signal (B=0 for button “0”, B=1 for button “1”). The padlock unlocks when the FSM sets the UNLOCK output signal to 1, which it does whenever—and only whenever—the last 3 button presses correspond to the 3-digit combination. The combination is unique, and will open the lock independently of the starting state. Unfortunately the design notes for the P3b are incomplete.



| $S_1$ | $S_0$ | B | $S'_1$ | $S'_0$ | U |
|-------|-------|---|--------|--------|---|
| 0     | 0     | 0 | 1      | 1      | 0 |
| 0     | 0     | 1 | 0      | 0      | 0 |
| 0     | 1     | 0 | 1      | 1      | 1 |
| 0     | 1     | 1 | 1      | 0      | 1 |
| 1     | 0     | 0 | 0      | 1      | 0 |
| 1     | 0     | 1 | 0      | 0      | 0 |
| 1     | 1     | 0 | 1      | 1      | 0 |
| 1     | 1     | 1 | 1      | 0      | 0 |

(A) (1 Point) What is the 3-bit combination for the lock?

lock combination: 0 1 0

(B) (5 Points) Using the specification and clues from the partially completed diagrams above fill in the information that is missing from the state transition diagram and its accompanying truth table. When done:

- each state in the transition diagram should be assigned a 2-bit state name  $S_1S_0$  (note that in this design the state name is *not* derived from the combination that opens the lock),
- the arcs leaving each state should be mutually exclusive and collectively exhaustive,
- the value for U should be specified for each state, and
- the truth table should be completed.

(complete above transition diagram and table)

**Problem 6.**

Below is a state transition diagram for a 4-state FSM with a single binary input B. The FSM has single output – a light that is “on” when the FSM is in states “E” or “S”. The starting state, “W”, is marked by the heavy circle.



- (A) (1 Point) Does this FSM have a set or sets of equivalent states that can be merged to yield an equivalent FSM with fewer states?

List set(s) of states that can be merged or write NONE: (S, E)

- (B) (5 Points) Please fill in as many entries as possible in the following truth table for the FSM. The *light* output is a function of the current state and should be 1 when the light is “on” and 0 when it’s “off.”

| S1 | S0 | B | S1' | S0' | light |
|----|----|---|-----|-----|-------|
| 0  | 0  | 0 | 0   | 0   | 1     |
| 0  | 0  | 1 | 1   | 0   | 1     |
| 0  | 1  | 0 | 0   | 0   | 1     |
| 0  | 1  | 1 | 1   | 0   | 1     |
| 1  | 0  | 0 | 1   | 1   | 0     |
| 1  | 0  | 1 | 1   | 1   | 0     |
| 1  | 1  | 0 | 0   | 1   | 0     |
| 1  | 1  | 1 | 1   | 1   | 0     |

### Problem 7.

The following circuit has two inputs ( $A$ ,  $CLK$ ) and four outputs ( $W$ ,  $X$ ,  $Y$ ,  $Z$ ). The  $CLK$  signal is square wave with a period  $t_{CLK}=1\text{us}$ . The  $A$  signal makes a single  $0 \rightarrow 1$  transition but the timing of the transition is close to (within a few ns of) the active  $CLK$  edge, ignoring dynamic discipline. All the devices are lenient and have the same propagation delay  $t_{PD}=10\text{ns}$ .

In a test involving a large number of trials, the  $Z$  output has been examined 100ns after an active  $CLK$  edge (and when *both CLK and A have been stable for many propagation delays*); at this time,  $Z$  was found to be invalid  $P$  times. In the same test, what would you expect to observe at the other outputs 100ns after the  $CLK$  edge? For each output, circle one of

**LESS RELIABLE** if you would expect the output to be **invalid** appreciably **more** than  $P$  times;

**EQUALLY RELIABLE** if you would expect the output to be **invalid** about  $P$  times; or

**MORE RELIABLE** if you would expect the output to be **invalid** appreciably **less** than  $P$

*combination of logic never goes to metastable.*



# Computation Structures

## Pipelined Circuits Worksheet

### Concept Inventory:

- Latency & throughput
- Pipelined circuits & conventions
- Pipeline diagrams
- Pipelining methodology: contours
- Pipelined components; interleaving

### Notes:

Latency: the delay from when an input is established until the output associated with the input becomes valid.

- Combinational circuits:  $L = t_{PD}$
- K-pipeline:  $L = K * t_{CLK}$

Throughput: the rate at which inputs or outputs are processed.

- Combinational circuits:  $T = 1/L$
- K-pipeline:  $T = 1/t_{CLK}$



Unpipelined:

$$L = 45\text{ns}, T = 1/L = 1/(45\text{ns})$$

2-stage pipeline [ $t_{CLK}=25\text{ns}$ ]:

$$L = 2*25 = 50 \text{ ns}, T = 1/(25\text{ns})$$



Pipelining methodology:

- Form 1-pipeline by adding registers to all outputs
- To add a pipeline stage, draw contour across all paths from inputs to outputs such that it doesn't cross other contours and all input-output paths cross the contour in the same direction. This ensures the pipeline is well-formed (same # of registers on all input-output paths). A K-pipeline has K registers on all input-output paths.
- Contours must take into account pipelined or interleaved components. An N-way interleaved component behaves like N-pipeline.



### Problem 1.

A simple combinational circuit is to be pipelined for **maximum throughput using a minimal number of registers**. For each of the questions below, please create a valid K-stage pipeline. **Show your pipelining contours** and place large black circles (●) on the signal arrows to **indicate the placement of ideal pipeline registers** ( $t_{PD}=0$ ,  $t_{SETUP}=0$ ). Give the latency and throughput for each design. Remember that our convention is to place a pipeline register on each output.

- (A) (1 point). Show the maximum-throughput 1-stage pipeline.

Latency (ns):  $\frac{1}{2}$   
 Throughput (ns-1):  $\frac{1}{12}$



- (B) (2 points). Show the maximum-throughput 2-stage pipeline using a minimal number of registers.

Latency (ns):  $\frac{1}{2}$   
 Throughput (ns-1):  $\frac{1}{6}$



- (C) (2 points). Show the maximum-throughput 3-stage pipeline using a minimal number of registers.

Latency (ns):  $\frac{1}{2}$   
 Throughput (ns-1):  $\frac{1}{6}$



- (D) (2 points). Show the maximum-throughput 4-stage pipeline using a minimal number of registers.

Latency (ns):  $\frac{1}{2}$   
 Throughput (ns-1):  $\frac{1}{4}$



### Problem 2.

The following 1-stage pipelined circuit computes Z from the four inputs A, B, C, and D. Each component is annotated with its propagation delay in ns.



- (A) Please pipeline the circuit above for maximum throughput with the minimum possible latency using ideal pipeline registers ( $t_{PD} = 0$ ,  $t_{SETUP} = 0$ ). Show the location of pipeline registers in the diagram above using filled-in circles, like the one shown on the Z output. Please give the latency and throughput of the resulting pipelined circuit.

*Latency (ns)?* 15

*Throughput (1/ns)?* 1/3

*15*

*5. 6 x 3 = 18*

- (B) Now suppose the “3” component is replaced by a two-way interleaved component with a minimum  $t_{CLK}$  of 1.5ns. Recall that a two-way interleaved component behaves like a 2-stage pipelined component. Again, please pipeline the circuit below for maximum throughput with the minimum possible latency using ideal pipeline registers. Show the location of pipeline registers in the diagram below using filled-in circles, like the one shown on the Z output. Please give the latency and throughput of the resulting pipelined circuit.



**Problem 3.**

An unidentified government agency has a design for a combinational device depicted below:



Although you don't know the function of each of the component modules, they are each combinational and marked with their respective propagation delays. You have been hired to analyze and improve the performance of this device.

(A) (1 Point) What are the throughput and latency for the unpipelined combinational device?

$$\text{Latency: } \underline{180} \text{ ns; Throughput: } \underline{1/180} \text{ ns}^{-1}$$

(B) (4 Points) Show how to pipeline the above circuit for maximum throughput, by marking locations in the diagram where registers are to be inserted. **Use a minimum number of registers, but be sure to include one on the output.** Assume that the registers have 0  $t_{PD}$  and  $t_{SETUP}$ .

**(mark register locations in diagram above)**

(C) (1 Point) What are the latency and throughput of your pipelined circuit?

$$\text{Latency: } \underline{180} \text{ ns; Throughput: } \underline{1/60} \text{ ns}^{-1}$$

#### Problem 4.

The following circuit uses six full adder modules (as you've seen in lecture and lab) arranged in a combinational circuit that computes a 3-bit value  $F = A + B + 5$  for 3-bit inputs  $A$  and  $B$ :



The full adders have a  $t_{PD}$  of 6ns.

- (A) Give the latency and throughput of the combinational circuit.

Latency: 24 ns; Throughput: 1/24.

- (B) Indicate, on the above diagram, appropriate locations to place ideal (zero-delay) registers to pipeline the circuit for *maximum throughput* using a *minimum number of registers*. Be sure to include a register on each output.

**(mark circuit above)**

- (C) Give the latency and throughput of your pipelined circuit.

Latency: 24 ns; Throughput: 1/6

**Problem 5.**

- (A) You are provided with the circuit shown below. Each box represents some combinational logic. The number in each box is the  $t_{PD}$  of that combinational logic. The circuit has two inputs, X and Y, and one output Out. Pay close attention to the direction of the arrows especially the arrows shown in **bold**. What is the latency and throughput of this combinational circuit?



Latency (ns): 22

Throughput (1/ns): 1/22

- (B) Draw contours through the circuit above to produce a valid pipelined circuit **whose  $t_{CLK} = 9\text{ns}$  with minimum latency**. Extra copies of the diagram are included below. Please use a large dot to indicate the location of each pipeline register. Assume that you have ideal pipeline registers ( $t_{PD}=t_{CD}=t_{Setup}=t_{Hold}=0\text{ ns}$ ). Pay close attention to the direction of each arrow to ensure that you produce a valid pipeline. What is the latency and throughput of this pipelined circuit?

Latency (ns): 27

Throughput (1/ns): 1/27



- (C) You are now asked to consider the performance of this circuit using different clock periods while achieving the minimum latency. For each suggested  $t_{CLK}$ , specify whether or not you can create a **valid** pipelined circuit using that clock period. If you can, then provide the latency and throughput of the resulting circuit and specify the number of registers at each input. If it results in an invalid pipeline, enter NA for the rest of the row.

Extra copies of the circuit diagram are provided below.

| $t_{CLK}$ | Valid/Invalid | Latency (ns) | Throughput (1/ns) | Pipeline registers at input X | Pipeline registers at input Y |
|-----------|---------------|--------------|-------------------|-------------------------------|-------------------------------|
| 6 ns      | NA.           |              |                   |                               |                               |
| 7 ns      | Valid.        | 28           | 1/7               | 0                             | 2.                            |



### Problem 6.

A complex combinational circuit is constructed entirely from 2-input NAND gates having a propagation delay of 1 ns. If this circuit is pipelined for maximal throughput by adding (non-ideal) registers whose setup time and propagation delay are each 1 ns, what is the throughput of the resulting pipeline? Enter a number, a formula, or "CAN'T TELL".

Throughput ( $\text{ns}^{-1}$ ): 1/3.

**Problem 7.**

The following combinational circuit computes  $F(X, Y)$  and  $G(X, Y)$  from inputs  $X$  and  $Y$ . The  $t_{PD}$  (in ns) of each individual component is shown inside its box.



- (A) Using ideal zero-delay registers, mark the location of the minimal number of registers necessary to achieve maximum throughput. Give the latency and throughput of your pipelined circuit.

**mark diagram above**

Latency: 21 (ns); Throughput: 1/5 (1/ns)

Rummaging through the stockroom you find a pipelined component with two pipeline stages that can replace the "7" module. The minimum  $t_{CLK}$  for the new component is 4ns. The updated circuit is shown below.



- (B) Using ideal zero-delay registers, mark the location of the minimal number of registers necessary to achieve maximum throughput. Give the latency and throughput of your pipelined circuit.

**mark diagram above**

Latency: 25 (ns); Throughput: 1/5 (1/ns)

# Computation Structures

## Instruction Set Architecture Worksheet

### Summary of $\beta$ Instruction Formats

#### Operate Class:

| 31     | 26 | 25 | 21 | 20 | 16 | 15 | 11 | 10 | 0      |
|--------|----|----|----|----|----|----|----|----|--------|
| 10xxxx | Rc | Ra | Rb |    |    |    |    |    | unused |

OP(Ra,Rb,Rc):  $Reg[Rc] \leftarrow Reg[Ra] \text{ op } Reg[Rb]$

| Register | Symbol | Usage                 |
|----------|--------|-----------------------|
| R31      | R31    | Always zero           |
| R30      | XP     | Exception pointer     |
| R29      | SP     | Stack pointer         |
| R28      | LP     | Linkage pointer       |
| R27      | BP     | Base of frame pointer |

Opcodes: **ADD** (plus), **SUB** (minus), **MUL** (multiply), **DIV** (divided by)

**AND** (bitwise and), **OR** (bitwise or), **XOR** (bitwise exclusive or), **XNOR** (bitwise exclusive nor),  
**CMPEQ** (equal), **CMPLT** (less than), **CMPLE** (less than or equal) [result = 1 if true, 0 if false]  
**SHL** (left shift), **SHR** (right shift w/o sign extension), **SRA** (right shift w/ sign extension)

| 31     | 26 | 25 | 21 | 20 | 16 | 15 | 0                          |
|--------|----|----|----|----|----|----|----------------------------|
| 11xxxx | Rc | Ra |    |    |    |    | literal (two's complement) |

OPC(Ra,literal,Rc):  $Reg[Rc] \leftarrow Reg[Ra] \text{ op SEXT(literal)}$

Opcodes: **ADDC** (plus), **SUBC** (minus), **MULC** (multiply), **DIVC** (divided by)

**ANDC** (bitwise and), **ORC** (bitwise or), **XORC** (bitwise exclusive or), **XNORC** (bitwise exclusive nor)  
**CMPEQC** (equal), **CMPLTC** (less than), **CMPLEC** (less than or equal) [result = 1 if true, 0 if false]  
**SHLC** (left shift), **SHRC** (right shift w/o sign extension), **SRAC** (right shift w/ sign extension)

#### Other:

| 31     | 26 | 25 | 21 | 20 | 16 | 15 | 0                          |
|--------|----|----|----|----|----|----|----------------------------|
| 01xxxx | Rc | Ra |    |    |    |    | literal (two's complement) |

**LD(Ra,literal,Rc):**  $Reg[Rc] \leftarrow Mem[Reg[Ra] + SEXT(literal)]$

**ST(Rc,literal,Ra):**  $Mem[Reg[Ra] + SEXT(literal)] \leftarrow Reg[Rc]$

**JMP(Ra,Rc):**  $Reg[Rc] \leftarrow PC + 4; PC \leftarrow Reg[Ra]$

**BEQ/BF(Ra,label,Rc):**  $Reg[Rc] \leftarrow PC + 4; \text{ if } Reg[Ra] = 0 \text{ then } PC \leftarrow PC + 4 + 4 * SEXT(\text{literal})$

**BNE/BT(Ra,label,Rc):**  $Reg[Rc] \leftarrow PC + 4; \text{ if } Reg[Ra] \neq 0 \text{ then } PC \leftarrow PC + 4 + 4 * SEXT(\text{literal})$

**LDR(label,Rc):**  $Reg[Rc] \leftarrow Mem[PC + 4 + 4 * SEXT(\text{literal})]$

#### Opcode Table: (\*optional opcodes)

| 2:0 | 000  | 001  | 010   | 011   | 100    | 101    | 110    | 111 |
|-----|------|------|-------|-------|--------|--------|--------|-----|
| 5:3 | 000  | 001  | 010   | 011   | 100    | 101    | 110    | 111 |
| 000 |      |      |       |       |        |        |        |     |
| 001 |      |      |       |       |        |        |        |     |
| 010 |      |      |       |       |        |        |        |     |
| 011 | LD   | ST   |       | JMP   | BEQ    | BNE    |        | LDR |
| 100 | ADD  | SUB  | MUL*  | DIV*  | CMPEQ  | CMPLT  | CMPLE  |     |
| 101 | AND  | OR   | XOR   | XNOR  | SHL    | SHR    | SRA    |     |
| 110 | ADDC | SUBC | MULC* | DIVC* | CMPEQC | CMPLTC | CMPLEC |     |
| 111 | ANDC | ORC  | XORC  | XNORC | SHLC   | SHRC   | SRAC   |     |

### Problem 1.

An unnamed associate of yours has broken into the computer (a Beta of course!) that 6.004 uses for course administration. He has managed to grab the contents of the memory locations he believes holds the Beta code responsible for checking access passwords and would like you to help discover how the password code works. The memory contents are shown in the table below:

| Addr  | Contents   | Opcode | Rc    | Ra    | Rb         | Assembly                                                                   |                      |
|-------|------------|--------|-------|-------|------------|----------------------------------------------------------------------------|----------------------|
| 0x100 | 0xC05F0008 | 110000 | 00010 | 11111 | 0000   8   | ADDC (R <sub>31</sub> , R <sub>1</sub> , R <sub>2</sub> )                  | R <sub>2</sub> ← 8   |
| 0x104 | 0xC03F0000 | 110000 | 00001 | 11111 | 00000      | ADDC (R <sub>31</sub> , R <sub>0</sub> , R <sub>1</sub> )                  | R <sub>1</sub> ← 0   |
| 0x108 | 0xE060000F | 111000 | 00011 | 00000 | 0000   111 | ANDC (R <sub>0</sub> , R <sub>1</sub> , R <sub>3</sub> )                   | R <sub>0</sub> & OF. |
| 0x10C | 0xF0210004 | 111100 | 00001 | 00001 | 0000       | SHLC (R <sub>1</sub> , R <sub>6</sub> , R <sub>1</sub> ). R <sub>1</sub> ← |                      |
| 0x110 | 0xA4230800 | 101001 | 00001 | 00011 | 00000      | OR (R <sub>0</sub> , R <sub>3</sub> , R <sub>1</sub> )                     |                      |
| 0x114 | 0xF4000004 | 111101 | 00000 | 00000 | 00000      | SHRC (R <sub>0</sub> , R <sub>0</sub> , R <sub>0</sub> )                   |                      |
| 0x118 | 0xC4420001 | 110001 | 00010 | 00010 | 00000      | SUBC (R <sub>2</sub> , R <sub>0</sub> , R <sub>2</sub> )                   |                      |
| 0x11C | 0x73E20002 | 011100 | 11111 | 00010 | 00000      | BEQ (R <sub>2</sub> , R <sub>0</sub> , R <sub>31</sub> )                   |                      |
| 0x120 | 0x73FFFFF9 | 011100 | 11111 | 11111 | 1111       | BEQ (R <sub>31</sub> , R <sub>3</sub> , R <sub>31</sub> )                  |                      |
| 0x124 | 0xA4230800 | 101001 | 00001 | 00011 | 00000      | OR (R <sub>3</sub> , R <sub>0</sub> , R <sub>1</sub> )                     |                      |
| 0x128 | 0x605F0124 | 011000 | 00010 | 11111 | 00100      | LP (R <sub>31</sub> , R <sub>4</sub> , R <sub>2</sub> )                    |                      |
| 0x12C | 0x90211000 | 100100 | 00001 | 00001 | 00000      | CMPEQ (R <sub>1</sub> , R <sub>0</sub> , R <sub>1</sub> )                  |                      |

Further investigation reveals that the password is just a 32-bit integer which is in R<sub>0</sub> when the code above is executed and that the system will grant access if R<sub>1</sub> = 1 after the code has been executed. What "passnumber" will gain entry to the system?

**Problem 2.** $SHLC(R_0, 3, R_1)$ 

- (A) What assembly instruction could a compiler use to implement  $y = x * 8$  on the Beta assuming that MUL and MULC are not available? Assume x is in R0 and y is in R1.

 $ADD(R_0, R_0, R_0)$ 

**Equivalent assembly instruction:**  $\underline{ADD(R_1, R_1, R_1)}$ ,  
 $\underline{ADD(R_1, A_1, R_1)}$ .

- (B) Assume that the registers are initialized to: R0=8, R1=10, R2=12, R3=0x1234, R4=24 before execution of each of the following assembly instructions. For each instruction, provide the value of the specified register or memory location. **If your answers are in hexadecimal, make sure to prepend them with the prefix 0x.**

1. SHL(R3, R4, R5)      **Value of R5:** 0x 3400 0000

2. ADD(R2, R1, R6)      **Value of R6:** 22

3. **ADD(R0, 2, R7)**      **Value of R7:** 10 ? 20.      0x 1238

4. ST(R1, 4, R3)      **Value stored:** 0x1734      **at address:** 0x 1200 0000

- (C) A student tries to optimize his Beta assembly program by replacing a line containing

**ADDC(R0, 3\*4+5, R1)**

by

**ADDC(R0, 17, R1)**

Is the resulting binary program smaller? Does it run faster?

(circle one) **Binary program is SMALLER?** yes ... no

(circle one) **FASTER?** yes ... no

- (D) A BR instruction at location 0x1000 branches to 0x2000. If the binary representation for that BR were moved to location 0x1400 and executed there, where will the relocated instruction branch to?

**Branch target for relocated BR (in hex): 0x 2400?**

- (E) A line in an assembly-language program containing “ADDC(R1,2,R3)” is changed to “ADDC(R1,R2,R3)”. Will the modified program behave differently when executed?

*↑  
change to 2.*      Circle best answer: YES ... NO ... CAN’T TELL

### Problem 3.

Each of the following programs is loaded into a Beta's main memory starting at location 0 and execution is started with the Beta's PC set to 0. Assume that all registers have been initialized to 0 before execution begins. Please determine the specified values after execution reaches the HALT() instruction and the Beta stops. Write "CAN'T TELL" if the value cannot be determined. Please write all values in hex.

- (A) . = 0  
 LD(R31,X+4,R1)  
 SHLC(R1,2,R1)  
 LD(R1,X,R2)  
 HALT()  
 X: LONG(4) :  
 +4  
 +8  
 +12  
 +16  
 +20  
 +24  
 +28  
 +32  
 +36  
 +40  
 +44  
 +48  
 +52  
 +56  
 +60  
 +64  
 +68  
 +72  
 +76  
 +80  
 +84  
 +88  
 +92  
 +96  
 +100  
 +104  
 +108  
 +112  
 +116  
 +120  
 +124  
 +128  
 +132  
 +136  
 +140  
 +144  
 +148  
 +152  
 +156  
 +160  
 +164  
 +168  
 +172  
 +176  
 +180  
 +184  
 +188  
 +192  
 +196  
 +200  
 +204  
 +208  
 +212  
 +216  
 +220  
 +224  
 +228  
 +232  
 +236  
 +240  
 +244  
 +248  
 +252  
 +256  
 +260  
 +264  
 +268  
 +272  
 +276  
 +280  
 +284  
 +288  
 +292  
 +296  
 +300  
 +304  
 +308  
 +312  
 +316  
 +320  
 +324  
 +328  
 +332  
 +336  
 +340  
 +344  
 +348  
 +352  
 +356  
 +360  
 +364  
 +368  
 +372  
 +376  
 +380  
 +384  
 +388  
 +392  
 +396  
 +400  
 +404  
 +408  
 +412  
 +416  
 +420  
 +424  
 +428  
 +432  
 +436  
 +440  
 +444  
 +448  
 +452  
 +456  
 +460  
 +464  
 +468  
 +472  
 +476  
 +480  
 +484  
 +488  
 +492  
 +496  
 +500  
 +504  
 +508  
 +512  
 +516  
 +520  
 +524  
 +528  
 +532  
 +536  
 +540  
 +544  
 +548  
 +552  
 +556  
 +560  
 +564  
 +568  
 +572  
 +576  
 +580  
 +584  
 +588  
 +592  
 +596  
 +600  
 +604  
 +608  
 +612  
 +616  
 +620  
 +624  
 +628  
 +632  
 +636  
 +640  
 +644  
 +648  
 +652  
 +656  
 +660  
 +664  
 +668  
 +672  
 +676  
 +680  
 +684  
 +688  
 +692  
 +696  
 +700  
 +704  
 +708  
 +712  
 +716  
 +720  
 +724  
 +728  
 +732  
 +736  
 +740  
 +744  
 +748  
 +752  
 +756  
 +760  
 +764  
 +768  
 +772  
 +776  
 +780  
 +784  
 +788  
 +792  
 +796  
 +800  
 +804  
 +808  
 +812  
 +816  
 +820  
 +824  
 +828  
 +832  
 +836  
 +840  
 +844  
 +848  
 +852  
 +856  
 +860  
 +864  
 +868  
 +872  
 +876  
 +880  
 +884  
 +888  
 +892  
 +896  
 +900  
 +904  
 +908  
 +912  
 +916  
 +920  
 +924  
 +928  
 +932  
 +936  
 +940  
 +944  
 +948  
 +952  
 +956  
 +960  
 +964  
 +968  
 +972  
 +976  
 +980  
 +984  
 +988  
 +992  
 +996  
 +1000  
 +1004  
 +1008  
 +1012  
 +1016  
 +1020  
 +1024  
 +1028  
 +1032  
 +1036  
 +1040  
 +1044  
 +1048  
 +1052  
 +1056  
 +1060  
 +1064  
 +1068  
 +1072  
 +1076  
 +1080  
 +1084  
 +1088  
 +1092  
 +1096  
 +1100  
 +1104  
 +1108  
 +1112  
 +1116  
 +1120  
 +1124  
 +1128  
 +1132  
 +1136  
 +1140  
 +1144  
 +1148  
 +1152  
 +1156  
 +1160  
 +1164  
 +1168  
 +1172  
 +1176  
 +1180  
 +1184  
 +1188  
 +1192  
 +1196  
 +1200  
 +1204  
 +1208  
 +1212  
 +1216  
 +1220  
 +1224  
 +1228  
 +1232  
 +1236  
 +1240  
 +1244  
 +1248  
 +1252  
 +1256  
 +1260  
 +1264  
 +1268  
 +1272  
 +1276  
 +1280  
 +1284  
 +1288  
 +1292  
 +1296  
 +1300  
 +1304  
 +1308  
 +1312  
 +1316  
 +1320  
 +1324  
 +1328  
 +1332  
 +1336  
 +1340  
 +1344  
 +1348  
 +1352  
 +1356  
 +1360  
 +1364  
 +1368  
 +1372  
 +1376  
 +1380  
 +1384  
 +1388  
 +1392  
 +1396  
 +1400  
 +1404  
 +1408  
 +1412  
 +1416  
 +1420  
 +1424  
 +1428  
 +1432  
 +1436  
 +1440  
 +1444  
 +1448  
 +1452  
 +1456  
 +1460  
 +1464  
 +1468  
 +1472  
 +1476  
 +1480  
 +1484  
 +1488  
 +1492  
 +1496  
 +1500  
 +1504  
 +1508  
 +1512  
 +1516  
 +1520  
 +1524  
 +1528  
 +1532  
 +1536  
 +1540  
 +1544  
 +1548  
 +1552  
 +1556  
 +1560  
 +1564  
 +1568  
 +1572  
 +1576  
 +1580  
 +1584  
 +1588  
 +1592  
 +1596  
 +1600  
 +1604  
 +1608  
 +1612  
 +1616  
 +1620  
 +1624  
 +1628  
 +1632  
 +1636  
 +1640  
 +1644  
 +1648  
 +1652  
 +1656  
 +1660  
 +1664  
 +1668  
 +1672  
 +1676  
 +1680  
 +1684  
 +1688  
 +1692  
 +1696  
 +1700  
 +1704  
 +1708  
 +1712  
 +1716  
 +1720  
 +1724  
 +1728  
 +1732  
 +1736  
 +1740  
 +1744  
 +1748  
 +1752  
 +1756  
 +1760  
 +1764  
 +1768  
 +1772  
 +1776  
 +1780  
 +1784  
 +1788  
 +1792  
 +1796  
 +1800  
 +1804  
 +1808  
 +1812  
 +1816  
 +1820  
 +1824  
 +1828  
 +1832  
 +1836  
 +1840  
 +1844  
 +1848  
 +1852  
 +1856  
 +1860  
 +1864  
 +1868  
 +1872  
 +1876  
 +1880  
 +1884  
 +1888  
 +1892  
 +1896  
 +1900  
 +1904  
 +1908  
 +1912  
 +1916  
 +1920  
 +1924  
 +1928  
 +1932  
 +1936  
 +1940  
 +1944  
 +1948  
 +1952  
 +1956  
 +1960  
 +1964  
 +1968  
 +1972  
 +1976  
 +1980  
 +1984  
 +1988  
 +1992  
 +1996  
 +2000  
 +2004  
 +2008  
 +2012  
 +2016  
 +2020  
 +2024  
 +2028  
 +2032  
 +2036  
 +2040  
 +2044  
 +2048  
 +2052  
 +2056  
 +2060  
 +2064  
 +2068  
 +2072  
 +2076  
 +2080  
 +2084  
 +2088  
 +2092  
 +2096  
 +2100  
 +2104  
 +2108  
 +2112  
 +2116  
 +2120  
 +2124  
 +2128  
 +2132  
 +2136  
 +2140  
 +2144  
 +2148  
 +2152  
 +2156  
 +2160  
 +2164  
 +2168  
 +2172  
 +2176  
 +2180  
 +2184  
 +2188  
 +2192  
 +2196  
 +2200  
 +2204  
 +2208  
 +2212  
 +2216  
 +2220  
 +2224  
 +2228  
 +2232  
 +2236  
 +2240  
 +2244  
 +2248  
 +2252  
 +2256  
 +2260  
 +2264  
 +2268  
 +2272  
 +2276  
 +2280  
 +2284  
 +2288  
 +2292  
 +2296  
 +2300  
 +2304  
 +2308  
 +2312  
 +2316  
 +2320  
 +2324  
 +2328  
 +2332  
 +2336  
 +2340  
 +2344  
 +2348  
 +2352  
 +2356  
 +2360  
 +2364  
 +2368  
 +2372  
 +2376  
 +2380  
 +2384  
 +2388  
 +2392  
 +2396  
 +2400  
 +2404  
 +2408  
 +2412  
 +2416  
 +2420  
 +2424  
 +2428  
 +2432  
 +2436  
 +2440  
 +2444  
 +2448  
 +2452  
 +2456  
 +2460  
 +2464  
 +2468  
 +2472  
 +2476  
 +2480  
 +2484  
 +2488  
 +2492  
 +2496  
 +2500  
 +2504  
 +2508  
 +2512  
 +2516  
 +2520  
 +2524  
 +2528  
 +2532  
 +2536  
 +2540  
 +2544  
 +2548  
 +2552  
 +2556  
 +2560  
 +2564  
 +2568  
 +2572  
 +2576  
 +2580  
 +2584  
 +2588  
 +2592  
 +2596  
 +2600  
 +2604  
 +2608  
 +2612  
 +2616  
 +2620  
 +2624  
 +2628  
 +2632  
 +2636  
 +2640  
 +2644  
 +2648  
 +2652  
 +2656  
 +2660  
 +2664  
 +2668  
 +2672  
 +2676  
 +2680  
 +2684  
 +2688  
 +2692  
 +2696  
 +2700  
 +2704  
 +2708  
 +2712  
 +2716  
 +2720  
 +2724  
 +2728  
 +2732  
 +2736  
 +2740  
 +2744  
 +2748  
 +2752  
 +2756  
 +2760  
 +2764  
 +2768  
 +2772  
 +2776  
 +2780  
 +2784  
 +2788  
 +2792  
 +2796  
 +2800  
 +2804  
 +2808  
 +2812  
 +2816  
 +2820  
 +2824  
 +2828  
 +2832  
 +2836  
 +2840  
 +2844  
 +2848  
 +2852  
 +2856  
 +2860  
 +2864  
 +2868  
 +2872  
 +2876  
 +2880  
 +2884  
 +2888  
 +2892  
 +2896  
 +2900  
 +2904  
 +2908  
 +2912  
 +2916  
 +2920  
 +2924  
 +2928  
 +2932  
 +2936  
 +2940  
 +2944  
 +2948  
 +2952  
 +2956  
 +2960  
 +2964  
 +2968  
 +2972  
 +2976  
 +2980  
 +2984  
 +2988  
 +2992  
 +2996  
 +3000  
 +3004  
 +3008  
 +3012  
 +3016  
 +3020  
 +3024  
 +3028  
 +3032  
 +3036  
 +3040  
 +3044  
 +3048  
 +3052  
 +3056  
 +3060  
 +3064  
 +3068  
 +3072  
 +3076  
 +3080  
 +3084  
 +3088  
 +3092  
 +3096  
 +3100  
 +3104  
 +3108  
 +3112  
 +3116  
 +3120  
 +3124  
 +3128  
 +3132  
 +3136  
 +3140  
 +3144  
 +3148  
 +3152  
 +3156  
 +3160  
 +3164  
 +3168  
 +3172  
 +3176  
 +3180  
 +3184  
 +3188  
 +3192  
 +3196  
 +3200  
 +3204  
 +3208  
 +3212  
 +3216  
 +3220  
 +3224  
 +3228  
 +3232  
 +3236  
 +3240  
 +3244  
 +3248  
 +3252  
 +3256  
 +3260  
 +3264  
 +3268  
 +3272  
 +3276  
 +3280  
 +3284  
 +3288  
 +3292  
 +3296  
 +3300  
 +3304  
 +3308  
 +3312  
 +3316  
 +3320  
 +3324  
 +3328  
 +3332  
 +3336  
 +3340  
 +3344  
 +3348  
 +3352  
 +3356  
 +3360  
 +3364  
 +3368  
 +3372  
 +3376  
 +3380  
 +3384  
 +3388  
 +3392  
 +3396  
 +3400  
 +3404  
 +3408  
 +3412  
 +3416  
 +3420  
 +3424  
 +3428  
 +3432  
 +3436  
 +3440  
 +3444  
 +3448  
 +3452  
 +3456  
 +3460  
 +3464  
 +3468  
 +3472  
 +3476  
 +3480  
 +3484  
 +3488  
 +3492  
 +3496  
 +3500  
 +3504  
 +3508  
 +3512  
 +3516  
 +3520  
 +3524  
 +3528  
 +3532  
 +3536  
 +3540  
 +3544  
 +3548  
 +3552  
 +3556  
 +3560  
 +3564  
 +3568  
 +3572  
 +3576  
 +3580  
 +3584  
 +3588  
 +3592  
 +3596  
 +3600  
 +3604  
 +3608  
 +3612  
 +3616  
 +3620  
 +3624  
 +3628  
 +3632  
 +3636  
 +3640  
 +3644  
 +3648  
 +3652  
 +3656  
 +3660  
 +3664  
 +3668  
 +3672  
 +3676  
 +3680  
 +3684  
 +3688  
 +3692  
 +3696  
 +3700  
 +3704  
 +3708  
 +3712  
 +3716  
 +3720  
 +3724  
 +3728  
 +3732  
 +3736  
 +3740  
 +3744  
 +3748  
 +3752  
 +3756  
 +3760  
 +3764  
 +3768  
 +3772  
 +3776  
 +3780  
 +3784  
 +3788  
 +3792  
 +3796  
 +3800  
 +3804  
 +3808  
 +3812  
 +3816  
 +3820  
 +3824  
 +3828  
 +3832  
 +3836  
 +3840  
 +3844  
 +3848  
 +3852  
 +3856  
 +3860  
 +3864  
 +3868  
 +3872  
 +3876  
 +3880  
 +3884  
 +3888  
 +3892  
 +3896  
 +3900  
 +3904  
 +3908  
 +3912  
 +3916  
 +3920  
 +3924  
 +3928  
 +3932  
 +3936  
 +3940  
 +3944  
 +3948  
 +3952  
 +3956  
 +3960  
 +3964  
 +3968  
 +3972  
 +3976  
 +3980  
 +3984  
 +3988  
 +3992  
 +3996  
 +4000  
 +4004  
 +4008  
 +4012  
 +4016  
 +4020  
 +4024  
 +4028  
 +4032  
 +4036  
 +4040  
 +4044  
 +4048  
 +4052  
 +4056  
 +4060  
 +4064  
 +4068  
 +4072  
 +4076  
 +4080  
 +4084  
 +4088  
 +4092  
 +4096  
 +4100  
 +4104  
 +4108  
 +4112  
 +4116  
 +4120  
 +4124  
 +4128  
 +4132  
 +4136  
 +4140  
 +4144  
 +4148  
 +4152  
 +4156  
 +4160  
 +4164  
 +4168  
 +4172  
 +4176  
 +4180  
 +4184  
 +4188  
 +4192  
 +4196  
 +4200  
 +4204  
 +4208  
 +4212  
 +4216  
 +4220  
 +4224  
 +4228  
 +4232  
 +4236  
 +4240  
 +4244  
 +4248  
 +4252  
 +4256  
 +4260  
 +4264  
 +4268  
 +4272  
 +4276  
 +4280  
 +4284  
 +4288  
 +4292  
 +4296  
 +4300  
 +4304  
 +4308  
 +4312  
 +4316  
 +4320  
 +4324  
 +4328  
 +4332  
 +4336  
 +4340  
 +4344  
 +4348  
 +4352  
 +4356  
 +4360  
 +4364  
 +4368  
 +4372  
 +4376  
 +4380  
 +4384  
 +4388  
 +4392  
 +4396  
 +4400  
 +4404  
 +4408  
 +4412  
 +4416  
 +4420  
 +4424  
 +4428  
 +4432  
 +4436  
 +4440  
 +4444  
 +4448  
 +4452  
 +4456  
 +4460  
 +4464  
 +4468  
 +4472  
 +4476  
 +4480  
 +4484  
 +4488  
 +4492  
 +4496  
 +4500  
 +4504  
 +4508  
 +4512  
 +4516  
 +4520  
 +4524  
 +4528  
 +4532  
 +4536  
 +4540  
 +4544  
 +4548  
 +4552  
 +4556  
 +4560  
 +4564  
 +4568  
 +4572  
 +4576  
 +4580  
 +4584  
 +4588  
 +4592  
 +4596  
 +4600  
 +4604  
 +4608  
 +4612  
 +4616  
 +4620  
 +4624  
 +4628  
 +4632  
 +4636  
 +4640  
 +4644  
 +4648  
 +4652  
 +4656  
 +4660  
 +4664  
 +4668  
 +4672  
 +4676  
 +4680  
 +4684  
 +4688  
 +4692  
 +4696  
 +4700  
 +4704  
 +4708  
 +4712  
 +4716  
 +4720  
 +4724  
 +4728  
 +4732  
 +4736  
 +4740  
 +4744  
 +4748  
 +4752  
 +4756  
 +4760  
 +4764  
 +4768  
 +4772  
 +4776  
 +4780  
 +4784  
 +4788  
 +4792  
 +4796  
 +4800  
 +4804  
 +4808  
 +4812  
 +4816  
 +4820  
 +4824  
 +4828  
 +4832  
 +4836  
 +4840  
 +4844  
 +4848  
 +4852  
 +4856  
 +4860  
 +4864  
 +4868  
 +4872  
 +4876  
 +4880  
 +4884  
 +4888  
 +4892  
 +4896  
 +4900  
 +4904  
 +4908  
 +4912  
 +4916  
 +4920  
 +4924  
 +4928  
 +4932  
 +4936  
 +4940  
 +4944  
 +4948  
 +4952  
 +4956  
 +4960  
 +4964  
 +4968  
 +4972  
 +4976  
 +4980  
 +4984  
 +4988  
 +4992  
 +4996  
 +5000  
 +5004  
 +5008  
 +5012  
 +5016  
 +5020  
 +5024  
 +5028  
 +5032  
 +5036  
 +5040  
 +5044  
 +5048  
 +5052  
 +5056  
 +5060  
 +5064  
 +5068  
 +5072  
 +5076  
 +5080  
 +5084  
 +5088  
 +5092  
 +5096  
 +5100  
 +5104  
 +5108  
 +5112  
 +5116  
 +5120  
 +5124  
 +5128  
 +5132  
 +5136  
 +5140  
 +5144  
 +5148  
 +5152  
 +5156  
 +5160  
 +5164  
 +5168  
 +5172  
 +5176  
 +5180  
 +5184  
 +5188  
 +5192  
 +5196  
 +5200  
 +5204  
 +5208  
 +5212  
 +5216  
 +5220  
 +5224  
 +5228  
 +5232  
 +5236  
 +5240  
 +5244  
 +5248  
 +5252  
 +5256  
 +5260  
 +5264  
 +5268  
 +5272  
 +5276  
 +5280  
 +5284  
 +5288  
 +5292  
 +5296  
 +5300  
 +5304  
 +5308  
 +5312  
 +5316  
 +5320  
 +5324  
 +5328  
 +5332  
 +5336  
 +5340  
 +5344  
 +5348  
 +5352  
 +5356  
 +5360  
 +5364  
 +5368  
 +5372  
 +5376  
 +5380  
 +5384  
 +5388  
 +5392  
 +5396  
 +5400  
 +5404  
 +5408  
 +5412  
 +5416  
 +5420  
 +5424  
 +5428  
 +5432  
 +5436  
 +5440  
 +5444  
 +5448  
 +5452  
 +5456  
 +5460  
 +5464  
 +5468  
 +5472  
 +5476  
 +5480  
 +5484  
 +5488  
 +5492  
 +5496  
 +5500  
 +5504  
 +5508  
 +5512  
 +5516  
 +5520  
 +5524  
 +5528  
 +5532  
 +5536  
 +5540  
 +5544  
 +5548  
 +5552  
 +5556  
 +5560  
 +5564  
 +5568  
 +5572  
 +5576  
 +5580  
 +5584  
 +5588  
 +5592  
 +5596  
 +5600  
 +5604  
 +5608  
 +5612  
 +5616  
 +5620  
 +5624  
 +5628  
 +5632  
 +5636  
 +5640  
 +5644  
 +5648  
 +5652  
 +5656  
 +5660  
 +5664  
 +5668  
 +5672  
 +5676  
 +5680

1000 0000 0000 0000 0000 0000 0000 0000

(E) . = 0  
 0 LD(r31, X, r0)  $R_0 \leftarrow 0x87654321$   
 4 CMPE(r0, r31, r1)  $R_1 \leftarrow r_0 < 0$   
 8 BNE(r1, L1, r1)  
 C ADDC(r31, 17, r2)  $R_2 \leftarrow 17$   
 10 BEQ(r31, L2, r31)  $\underline{1111}$   
 11 L1: SRAC(r0, 4, r2)  $R_2 \leftarrow 0$   
 12 L2: HALT()  
 14  
 18 . = 0x1CE8  
 X: LONG(0x87654321)

Value left in R0? 0x 0x F8765432

Value left in R1? 0x 1

Value left in R2? 0x 0 ✓ F8765432

Value assembler assigns to L1: 0x 18765432

(F) . = 0  
 LD(R31, i, R0)  $R_0 \leftarrow 3$   
 SHLC(R0, 2, R0)  $R_0 \leftarrow C$ .  
 LD(R0, a-4, R1)  $R_1 \leftarrow C0$   
 HALT()  
 a: LONG(0xBADBABE)  
 LONG(0xDEADBEEF)  
 $\leftarrow$  LONG(0xC0FFEE)  
 LONG(0x8BADF00D)  
 i: LONG(3)

Contents of R0 (in hex): 0x C0

Contents of R1 (in hex): 0x 00 FFFF

(G) . = 0  
 LD(R31, Z, R1)  $R_1 \leftarrow$   
 SHRC(R1, 16, R2)  $C4 02.6.$   
 Z: SUBC(R2, 0x3C, R3)  $1100|0|0001|00010|3C$   
 HALT()

Value left in R1: 0x C4 623C

Value left in R3: 0x 00 C4 26

Value assembler assigns to symbol Z: 0x 8

(H) . = 0  
 0 LD(R31, X, R0)  $R_0 \leftarrow DECAF$   
 4 CMOVE(0, R1)  $R_1 \leftarrow 0$   
 8 L: ADDC(R1, 1, R1)  $R_1 \leftarrow R_1 + 1$   
 C SHRC(R0, 1, R0)  $R_0 \leftarrow R_0 \Rightarrow 1$   
 14 HALT()

Value left in R0: 0x 0 16 bytes.

Value left in R1: 0x 20 ⇒ 14

Value left in R2: 0x 21 14.

X: LONG(0xDECAF)

1. loads instruction load 3rd Register.  
 2. BNE, BEQ 指令会把 PC 位置到第 3 个 Register.

# Computation Structures

## Compilation Worksheet

**compile\_expr(expr)  $\Rightarrow Rx$**

- Constants:  $1234 \Rightarrow Rx$

- $CMOVE(1234, Rx)$
- $LD(c1, Rx)$
- ...
- $c1: LONG(123456)$

- Variables:  $a \Rightarrow Rx$

- $LD(a, Rx)$
- ...
- $a: LONG(0)$

- Variables:  $b[expr] \Rightarrow Rx$

- $compile\_expr(expr) \Rightarrow Rx$
- $MULC(Rx, bsize, Rx)$
- $LD(Rx, b, Rx)$
- ...
- // reserve array space
- $b: . = . + bsize * blen$

- Operations:  $expr_1 + expr_2 \Rightarrow Rx$

- $compile\_expr(expr_1) \Rightarrow Rx$
- $compile\_expr(expr_2) \Rightarrow Ry$
- $ADD(Rx, Ry, Rx)$

- Procedure call:  $f(e_1, e_2, \dots) \Rightarrow Rx$
- next lecture!

- Assignment:  $a=expr \Rightarrow Rx$

- $compile\_expr(expr) \Rightarrow Rx$
- $ST(Rx, a)$

**compile\_statement(...)**

- Unconditional:  $expr;$
- $compile\_expr(expr)$

- Compound:  $\{ s1; s2; \dots \}$
- $compile\_statement(s1)$
- $compile\_statement(s2)$
- ...

- Conditional:  $if (expr) s1;$
- $compile\_expr(expr) \Rightarrow Rx$
- $BF(Rx, Lendif)$
- $compile\_statement(s1)$
- $Lendif:$

- Conditional:  $if (expr) s1; else s2;$
- $compile\_expr(expr) \Rightarrow Rx$
- $BF(Rx, Lelse)$
- $compile\_statement(s1)$
- $BR(Lendif)$
- $Lelse:$
- $compile\_statement(s1)$
- $Lendif:$

- Iteration:  $while (expr) s1;$
- $BR(Ltest)$
- $Lwhile:$
- $compile\_statement(s1)$
- $Ltest:$
- $compile\_expr(expr) \Rightarrow Rx$
- $BT(Rx, Lwhile)$

- Iteration:  $for (init; test; incr) s1;$
- $init;$
- $while (test) \{ s1; incr; \}$

### Problem 1.

Please hand-compile the following snippets of C code into equivalent Beta assembly language statements. Assume that memory locations have been allocated for all C variables with labels that corresponds to the variable names. So to load the value of the C variable `a` into register `R3`, the appropriate assembly language statement would be `LD(R31, a, R3)`. And to store the value in `R17` to the C variable `b`, the appropriate assembly language statement would be  
`ST(R17, b, R31)`. Similarly, assume that memory locations have been allocated for each C array, with a label defined whose value is the address of the 0<sup>th</sup> element of the array.  $LD(y + 4 \times 13, R_0)$  必须指明下标。

(A)  $a = 42;$   
 $\text{CMOVE}(42, R_0)$   
 $ST(R_0, a, R31)$

(B)  $c = 5*x - 13;$   
 $LD(X, R_0)$        $SubC(R_1, 13, R_1)$   
 $SHLC(R_0, 2, R_1)$   $ST(R_1, c, R_1)$   
 $ADD(R_0, R_1, R_1)$

(C)  $y = (x - 3)*(y + 123456);$   
 $LD(X, R_0)$        $ADD(R_1, R_2, R_1)$   
 $LD(Y, R_1)$        $MUL(R_0, R_1, R_0)$   
 $SubC(R_0, 3, R_0)$   $ST(R_0, Y)$   
 $LD(\text{const}, R_2)$        $\text{const:long}(123456)$

(D)  $\text{if } (a == 3) b = b + 1;$   
 $LD(a, R_0)$        $ST(R_1, b)$ .  
 $LD(b, R_1)$ .

$\text{CMPEQC}(R_0, 3, R_2)$ .  $L_1:$

$BF(R_2, L_1)$   
 $ADDc(R_1, 1, R_1)$

(E)  $a[i] = a[i-1];$   
 $LD(i, R_0)$ .

$MUL(R_1, 4, R_0)$

$LD(R_0, a-4, R_1)$

$ST(R_1, a, R_0)$

(F)  $x = y[3] + y[12];$   
 $CMOVE(3, R_0)$   
 $SHLC(R_0, 2, R_0)$  SHLC(R\_0, 2, R\_1)  
 $SHLC(R_0, 2, R_1)$   
 $LD(R_0, Y, R_0)$   
 $(G) \text{if } (b == 0 \text{ || } b < \text{min}) \{$   
 $\text{min} = b;$   
 $\} \text{ else } \{$   
 $\text{too\_big} += 1;$   
 $\}$   
 $LD(b, R_0)$   
 $BF(R_0, L_1)$   
 $LD(min, R_1)$   
 $CMPLT(R_0, R_1, R_2)$   
 $L_1: BF(R_0, L_2)$

$ST(R_0, min)$   
 $BR(L_3)$

(H)  $\text{sum} = 0;$   
 $i = 0;$   
 $\text{while } (i < 10) \{$   
 $\text{sum} = \text{sum} + i$   
 $i = i + 1;$   
 $\}$

$L_2: LD(too\_big, R_0)$   
 $ADDc(R_0, 1, R_0)$   
 $L_3: ST(R_0, too\_big)$

$CMOVE(0, R_0)$   
 $CMOVE(0, R_1)$   
 $ST(R_0, sum)$   
 $ST(R_1, sum)$   
 $BR(L_2)$

$L_2: CMPLTc(R_1, 10, R_1)$   
 $BT(R_2, L_1).$

$L_1:$   
 $ADD(R_0, R_1, R_0)$   
 $ST(R_0, sum)$   
 $ADDc(R_1, 1, R_1)$   
 $ST(R_1, sum)$

$L_3:$

## Problem 2.

In block-structured languages such as C or Java, the scope of a variable declared locally within a block extends only over that block, i.e., the value of the local variable cannot be accessed outside the block. Conceptually, storage is allocated for the variable when the block is entered and deallocated when the block is exited. In many cases, this means the compiler is free to use a register to hold the value of the local variable instead of a memory location.

Consider the following C fragment:

```
int sum = 0;
{ int i;
  for (i = 0; i < 10; i = i+1) sum += i;
}
```

- A. Hand-compile this loop into assembly language, using registers to hold the values of the local variables "i" and "sum".

|                |                             |
|----------------|-----------------------------|
| $MV(R_3, R_0)$ | $L_1: ADD(R_0, R_1, R_0)$   |
| $ST(R_0, sum)$ | $ADDL(R_1, 1, R_1)$         |
| $MV(R_3, R_1)$ | $L_2: CMPLTC(R_1, 10, R_2)$ |
| $BR(L_2)$      | $BT(R_2, L_1)$              |

- B. Define a *memory access* as any access to memory, i.e., instruction fetch, data read (LD), or data write (ST). Compare the number of total number of memory accesses generated by executing the optimized loop with the total number of memory access for the unoptimized loop (part G of the preceding problem).

H: 2 instructions

2 memory access

loop(1, 10).

10 instruction fetch.  
6 memory access,

- C. Some optimizing compilers "unroll" small loops to amortize the overhead of each loop iteration over more instructions in the body of the loop. For example, one unrolling of the loop above would be equivalent to rewriting the program as

```
int sum = 0;
{ int i;
  for (i = 0; i < 10; i = i+2) {
    sum += i; sum += i+1;
  }
}
```

$$\Rightarrow 2 + 10 \times (6 + 4) = 104$$

$$A: 4 + 1$$

$$\text{loop. } (1, 4)$$

4

$$+ 1 = 4b$$

Hand-compile this loop into Beta assembly language and compare the total number of memory accesses generated when it executes to the total number of memory accesses from part (1).

### Problem 3.

Which of the following Beta instruction sequences might have resulted from compiling the following C statement? For each sequence describe the value that does end up as the value of y.

```
int x[20], y;  
y = x[1] + 4;
```

- A. LD (R31, x + 1, R0)      exception.  
ADDC (R0, 4, R0)  
ST (R0, y, R31)
- B. CMOVE (4, R0)  
ADDC (R0, x + 4, R0)  
ST (R0, y, R31)      address of x[2]
- C. LD (R31, x + 4, R0)       $R_0 \leftarrow x[1]$ .  
ST (R0, y + 4, R31)      by next address  $\leftarrow x[1]$ .
- D. CMOVE (4, R0)  
LD (R0, x, R1)  
ST (R1, y, R0)       $R_1 \leftarrow x[1]$ .      same as C.  
 $y \leftarrow x[1]$ .
- E. LD (R31, x + 4, R0)  
ADDC (R0, 4, R0)  
ST (R0, y, R31)
- F. ADDC (R31, x + 1, R0)      error  
ADDC (R0, 4, R0)  
ST (R0, y, R31)