

# Essentials of Computer Systems - Exercises #2

## 1 Datapath Design - part 1

Exercise 1.1 Datapath design:

- (a) Design a datapath (draw a DFD) with the following functionality:  $z = 3(b - a) - ac + a$. The building blocks available are a multiplier and a subtracter, shown in the figure below. You are given an area constraint: you may use at most one multiplier and at most one subtracter.

→ With only one multiplier and one subtracter available, we know the units will have to be REUSED across clock cycles.
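To make the reuse concrete, here is a minimal Python sketch of one possible three-cycle schedule. The binding is an assumption inferred from the source table in part (c), not read from the figure: $i_1 = a$, $i_2 = c$, $i_3 = b$, $i_4 = a$, with $r_1$, $r_2$, $r_3$ as the three registers. The function name `ex11_schedule` is hypothetical.

```python
def ex11_schedule(a, b, c):
    """Simulate a 3-cycle schedule with one multiplier and one subtracter.

    Assumed binding (inferred, not taken from the original figure):
    r1 holds a, then the running subtracter result; r2 holds the
    multiplier result; r3 holds b - a.
    """
    # Cycle 0: both units work on raw inputs; a is also loaded into r1.
    r1 = a            # r1.d <- i1
    r2 = a * c        # multiplier: a * c
    r3 = b - a        # subtracter: b - a

    # Cycle 1: both units are reused, now fed from registers.
    # Tuple assignment models registers updating on the same clock edge.
    r1, r2 = r1 - r2, -3 * r3   # sub: a - ac, mult: -3 * (b - a)

    # Cycle 2: the subtracter is reused once more to form the result.
    r1 = r1 - r2      # (a - ac) - (-3(b - a)) = 3(b - a) - ac + a
    return r1


if __name__ == "__main__":
    a, b, c = 2, 7, 5
    print(ex11_schedule(a, b, c), 3 * (b - a) - a * c + a)
```

In every cycle at most one multiplication and one subtraction are performed, which is what the area constraint requires.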



- (b) Fill out the analysis table for the DFD obtained in part (a) ignoring MUXes:

| Analysis table      |                                                  |
|---------------------|--------------------------------------------------|
| latency             | 3                                                |
| throughput          | $\frac{1}{3}$                                    |
| critical path delay | $\max(t_{cp0}, t_{cp1}, t_{cp2}) = 2 \text{ ns}$ |
| clock period        | $t_{clock} = t_{cp} + t_{REG} = 2.5 \text{ ns}$  |
| # inputs            | 3                                                |
| # outputs           | 1                                                |
| # registers         | 3                                                |
| # subtracters       | 1                                                |
| # multipliers       | 1                                                |
| total area in GE    | 14 GE                                            |

$$A = 3 \times A_{REG} + A_{SUB} + A_{MULT} = 14 \text{ GE}$$



- (c) Fill out the analysis table for the DFD obtained in part (a) including MUXes:

$$A = 14 \text{ GE} + 5 \times A_{MUX} = 19 \text{ GE}$$

| Analysis table      |                                                 |
|---------------------|-------------------------------------------------|
| latency             | 3                                               |
| throughput          | $\frac{1}{3}$                                   |
| critical path delay | $t_{cp} = 2.2 \text{ ns}$                       |
| clock period        | $t_{clock} = t_{cp} + t_{REG} = 2.7 \text{ ns}$ |
| # inputs            | 3                                               |
| # outputs           | 1                                               |
| # registers         | 3                                               |
| # multiplexers      | 5                                               |
| # subtracters       | 1                                               |
| # multipliers       | 1                                               |
| total area in GE    | 19 GE                                           |

MUX on register $r_1$ (source of $r_1.d$ per cycle):  
Cycle 0: $i_1$  
Cycle 1: sub  
Cycle 2: sub → MUX needed

|         | $m_1$   |         | $sub$   |         |
|---------|---------|---------|---------|---------|
|         | $src_1$ | $src_2$ | $src_1$ | $src_2$ |
| Cycle 0 | $i_1$   | $i_2$   | $i_3$   | $i_4$   |
| Cycle 1 | -3      | $r_3$   | $r_1$   | $r_2$   |
| Cycle 2 | -       | -       | $r_1$   | $r_2$   |
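The MUX count can be cross-checked from the source table above. The sketch below assumes the usual counting rule that an input fed by $k$ distinct sources needs $k - 1$ two-input MUXes; that rule, the dictionary layout, and the name `mux_count` are assumptions for illustration. The `r1.d` entry is taken from the note on register $r_1$.

```python
# Source of each unit input / register input in cycles 0, 1, 2.
# "-" marks a cycle in which the input is unused (don't care).
sources = {
    "mult.src1": ["i1", "-3", "-"],
    "mult.src2": ["i2", "r3", "-"],
    "sub.src1": ["i3", "r1", "r1"],
    "sub.src2": ["i4", "r2", "r2"],
    "r1.d": ["i1", "sub", "sub"],
}


def mux_count(per_cycle):
    """Two-input MUXes needed for one input: distinct sources minus one."""
    distinct = {s for s in per_cycle if s != "-"}
    return max(len(distinct) - 1, 0)


counts = {port: mux_count(srcs) for port, srcs in sources.items()}
total_muxes = sum(counts.values())

if __name__ == "__main__":
    print(counts)
    print("total MUXes:", total_muxes)
```

Under these assumptions each of the five listed inputs needs exactly one MUX, which agrees with the $5 \times A_{MUX}$ term in the area formula.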

SUB has MUXes on both inputs and also on its output:

$$t_1 = t_{MUX} + t_{SUB} + t_{MUX} = 0.2 \text{ ns} + 1 \text{ ns} + 0.2 \text{ ns} = 1.4 \text{ ns}$$

MULT has MUXes on both inputs:

$$t_2 = t_{MUX} + t_{MULT} = 0.2 \text{ ns} + 2 \text{ ns} = 2.2 \text{ ns}$$

$$\Rightarrow t_{cp} = \max(t_1, t_2) = 2.2 \text{ ns}$$
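The same path comparison can be written as a few lines of Python. This is only a sketch: delays are held as integer picoseconds to avoid floating-point noise, and $t_{REG} = 0.5$ ns is inferred from the clock period row of the table ($2.7 - 2.2$) rather than stated directly. All variable names are illustrative.

```python
# Component delays in picoseconds (integers keep the arithmetic exact).
T_MUX = 200    # 0.2 ns
T_SUB = 1000   # 1 ns
T_MULT = 2000  # 2 ns
T_REG = 500    # 0.5 ns, inferred from t_clock - t_cp in the table

# Path through the subtracter: MUX on an input, SUB, MUX on the output.
t1 = T_MUX + T_SUB + T_MUX
# Path through the multiplier: MUX on an input, then MULT.
t2 = T_MUX + T_MULT

t_cp = max(t1, t2)        # critical path delay
t_clock = t_cp + T_REG    # clock period

if __name__ == "__main__":
    print(f"t1 = {t1 / 1000} ns, t2 = {t2 / 1000} ns")
    print(f"t_cp = {t_cp / 1000} ns, t_clock = {t_clock / 1000} ns")
```

The multiplier path dominates even though the subtracter path passes through two MUXes, because $t_{MULT}$ is twice $t_{SUB}$.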

- (d) Draw the circuit for the DFD obtained in part (a):



Exercise 1.2 Datapath design:

Hint: there is no subtracter, so use a negative constant: $x - y = x + (-1 \cdot y)$

- (a) Design a datapath (draw a DFD) with the following functionality:  $z = a(a+b) - (bc+a)$. The building blocks available are a multiplier and an adder, shown in the figure below. All inputs are available only in the first clock cycle. The top priority is to minimize the area.
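As a sanity check of the negative-constant trick, here is a sketch of one possible four-cycle schedule with a single adder, a single multiplier, and three registers. The register binding is hypothetical and need not match the one in the original solution; the function name `ex12_schedule` is likewise an assumption.

```python
def ex12_schedule(a, b, c):
    """One possible 4-cycle schedule using one adder and one multiplier.

    Hypothetical binding: r1 carries the a(a+b) branch, r2 carries the
    (bc + a) branch, r3 keeps input a alive after cycle 0.
    """
    # Cycle 0: inputs are only available now, so a is stored in r3.
    r1, r2, r3 = a + b, b * c, a      # add: a+b, mult: b*c

    # Cycle 1: tuple assignment models a simultaneous register update.
    r1, r2 = r3 * r1, r2 + r3         # mult: a(a+b), add: bc+a

    # Cycle 2: negate with the multiplier, since no subtracter exists.
    r2 = -1 * r2                      # mult: -(bc+a)

    # Cycle 3: the adder produces the final result.
    return r1 + r2                    # a(a+b) + (-1)(bc+a)


if __name__ == "__main__":
    a, b, c = 3, 4, 5
    print(ex12_schedule(a, b, c), a * (a + b) - (b * c + a))
```

Each cycle uses the adder at most once and the multiplier at most once, at the cost of one extra cycle for the negation.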



- (b) Fill out the analysis table for the DFD obtained in part (a):

| Analysis table      |                              |
|---------------------|------------------------------|
| latency             | 4                            |
| throughput          | $1/4$                        |
| critical path delay | $2.6 \text{ ns}$             |
| clock period        | $2.6 + 0.5 = 3.1 \text{ ns}$ |
| # inputs            | 3                            |
| # outputs           | 1                            |
| # registers         | 3                            |
| # multiplexers      | 6                            |
| # adders            | 1                            |
| # multipliers       | 1                            |
| total area in GE    | 37                           |

$1 \cdot 15 + 1 \cdot 10 + 3 \cdot 2 + 6 \cdot 1 = 37 \text{ GE}$

$t_{cp} = \max(2.4, 2.6, 1.2) \text{ ns} = 2.6 \text{ ns}$

| cycle | ADD   |       | MULT  |       | REGs  |       |               |
|-------|-------|-------|-------|-------|-------|-------|---------------|
|       | $s_1$ | $s_2$ | $p_1$ | $p_2$ | 1     | 2     | 3             |
| 0     | $i_1$ | $i_2$ |       |       | $i_2$ | $i_3$ | $i_1$         |
| 1     | $r_1$ | $r_3$ |       |       | $r_1$ | $r_2$ | $\text{MULT}$ |
| 2     |       |       | -1    | $r_2$ |       |       | $\text{MULT}$ |
| 3     | $r_1$ | $r_3$ |       |       |       |       | $\text{ADD}$  |

MUXes per column: $1 + 1 + 2 + 1 + 1 + 0 + 0 = 6 \text{ MUXes}$

- (c) Draw the circuit for the DFD obtained in part (a):



# Essentials of Computer Systems - Exercises #3

## 1 Datapath Design - part 2

Exercise 1.1 Datapath design:

- (a) Design a datapath (draw a DFD) with the following functionality:  $z = d \cdot f(2a, f(2a, b+c))$. The building blocks available are an adder, a multiplier, and a functional unit $f$, shown in the figure below. Design goals: maximize throughput while minimizing the clock period and the area. All inputs are available only in the first clock cycle.



- (b) Fill out the analysis table for the DFD obtained in part (a):

| Analysis table      |                                                                                     |
|---------------------|-------------------------------------------------------------------------------------|
| latency             | 4                                                                                   |
| throughput          | $\frac{1}{2}$                                                                       |
| critical path delay | $t_{cp} = \max(t_{cp0}, t_{cp1}, t_{cp2}, t_{cp3}) = 10.2 \text{ ns}$               |
| clock period        | $t_{clock} = t_{cp} + t_{REG} = 10.2 \text{ ns} + 0.5 \text{ ns} = 10.7 \text{ ns}$ |
| # inputs            | 4                                                                                   |
| # outputs           | 1                                                                                   |
| # registers         | 7                                                                                   |
| # multiplexers      | 2                                                                                   |
| # adders            | 2                                                                                   |
| # multipliers       | 1                                                                                   |
| # $f$ units         | 1                                                                                   |
| total area in GE    | 66 GE                                                                               |

$$A = 7 \times 2 \text{ GE} + 2 \times 1 \text{ GE} + 2 \times 5 \text{ GE} + 1 \times 10 \text{ GE} + 1 \times 30 \text{ GE} = 66 \text{ GE}$$
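The area sum above is easy to reproduce programmatically, which helps when comparing design alternatives. The per-component areas below are read off the terms of the formula (in the same order as the table rows); the dictionary names are illustrative.

```python
# Area per component in gate equivalents (GE), taken from the formula above.
AREA_GE = {"REG": 2, "MUX": 1, "ADD": 5, "MULT": 10, "F": 30}

# Component counts from the analysis table.
COUNT = {"REG": 7, "MUX": 2, "ADD": 2, "MULT": 1, "F": 1}

# Total area is the count-weighted sum over all component types.
total_area = sum(COUNT[name] * AREA_GE[name] for name in COUNT)

if __name__ == "__main__":
    for name in COUNT:
        print(f"{COUNT[name]} x {name}: {COUNT[name] * AREA_GE[name]} GE")
    print("total:", total_area, "GE")
```

The $f$ unit alone accounts for 30 of the 66 GE, so instantiating it only once is the main area saving in this design.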

Exercise 1.2 Datapath design:

- (a) Design a datapath (draw a DFD) with the following functionality:  $z = a^2 - b + (a - c)^2 + c$. The building blocks available are an adder, a subtracter, and a squarer, shown in the figure below. Design goals: maximize throughput while minimizing the area. All inputs are available only in the first clock cycle ($\rightarrow$ see REGs $r_2$, $r_4$ and $r_6$).



Input ORDER of the subtracter: $i_1 = x$, $i_2 = y$ and $o = x - y$



$$t_{CP} = \max(3.2, 2.2) = 3.2 \text{ ns}$$

- (b) Fill out the analysis table for the DFD obtained in part (a):

| Analysis table      |                              |
|---------------------|------------------------------|
| latency             | 4                            |
| throughput          | $1/2$                        |
| critical path delay | $3.2 \text{ ns}$             |
| clock period        | $3.2 + 0.5 = 3.7 \text{ ns}$ |
| # inputs            | 3                            |
| # outputs           | 1                            |
| # registers         | 6                            |
| # multiplexers      | 5                            |
| # adders            | 1                            |
| # subtracters       | 1                            |
| # squarers          | 1                            |
| total area in GE    | 32 GE                        |

Schedule table columns: SQ, SUB, ADD, REGs 1-6.

$1 \cdot 5 + 1 \cdot 3 + 1 \cdot 7 + 5 \cdot 1 + 6 \cdot 2 = 32 \text{ GE}$
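The table entries for this exercise can also be read in absolute time. A short worked calculation, assuming throughput is counted in results per clock cycle and latency in clock cycles:

$$T_{result} = \frac{t_{clock}}{\text{throughput}} = 2 \times 3.7 \text{ ns} = 7.4 \text{ ns}, \qquad T_{latency} = 4 \times t_{clock} = 4 \times 3.7 \text{ ns} = 14.8 \text{ ns}$$

Under that reading, a new result leaves the datapath every 7.4 ns, while each individual result takes 14.8 ns from input to output.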

- (c) Draw the circuit for the DFD obtained in part (a):

MUX count in the circuit: $5 \text{ MUXes}$

