

Department of Electrical and Computer Engineering  
The University of Texas at Austin

EE 460N Fall 2014  
Y. N. Patt, Instructor  
Stephen Pruett, Emily Bragg, Siavash Zangeneh TAs  
Exam 2  
November 5, 2014

Name: Solution

Problem 1 (20 points): \_\_\_\_\_

Problem 2 (10 points): \_\_\_\_\_

Problem 3 (20 points): \_\_\_\_\_

Problem 4 (25 points): \_\_\_\_\_

Problem 5 (25 points): \_\_\_\_\_

Total (100 points): \_\_\_\_\_

Note: Please be sure that your answers to all questions (and all supporting work that is required) are contained in the space provided.

Note: Please be sure your name is recorded on each sheet of the exam.

Please sign the following. I have not given nor received any unauthorized help on this exam.

Signature: \_\_\_\_\_

**GOOD LUCK!**

Name: \_\_\_\_\_

**Problem 1 (20 points)**

**Part a (5 points):** A parity bit can be used to detect any single bit error. But if two bits are transmitted in error, the parity bit scheme does not work. What characteristic of the bit errors makes this a non-problem for those situations that use parity for detecting bit errors.

Statistical independence between bits

**Part b (5 points):** Vector instruction A requires the result of vector instruction B as a source. Vector chaining allows vector instruction A to start processing before vector instruction B finishes. When can vector instruction A start its execution phase?

Once the first element of B  
~~has~~ finished being processed

**Part c (5 points):** An SRAM cell consists of two cross-coupled inverters (therefore, at least 4 transistors usually, and often more). It stores a 1 or a 0 depending on which inverter is outputting a 1 and which is outputting a 0. On the other hand, a DRAM consists of only one transistor. How, then, does it store a 1 or 0?

A capacitor, which is either charged or discharged

**Part d (5 points):** On a page fault, the operating systems often loads a page from the disk into a frame that was previously occupied by a different page. How does the operating system know whether it is necessary to write the previously occupied page back to the disk? Please be brief and explicit. Answer in fifteen words or fewer.

The modify bit will be set in the PTE for that page

Name: \_\_\_\_\_

**Problem 2 (10 points)**

The following data flow graph receives as inputs a value  $x$ , an  $n$  element vector  $V_0, V_1, \dots, V_{n-1}$ , the value  $n$ , and a value 0 on its four input ports.



What "answer" is produced by the execution of this data flow graph?

A count of the number of occurrences of  $x$  in the vector

Name: \_\_\_\_\_

**Problem 3 (20 points)**

**Part a (10 points)** Assume we have a byte addressable memory that has the following address format:



i (2 points): What is the maximum size of physical memory? **2<sup>29</sup> bit address** 2<sup>29</sup>

ii (2 points): Circle the correct answer: Multiple (**Channel** / Banks / Ranks) enable simultaneous transfer of data from different parts of memory to the processor in the same cycle.

iii (3 points): How many bits of data can be transferred between the processor

and memory simultaneously?

**16 bytes on bus x 2 channels**

$$= 32 \text{ bytes} \times 8 \text{ bits/byte} =$$

**256 bits**

iv (3 points): How many bytes of storage are there on a single DRAM chip?

**row, col, bank bits**

**10 - Row**

**5 + 5 - Col**

**2 - Bank**

**12 bits of  
chip address**

~~**2<sup>12</sup>**~~ **2<sup>22</sup>**

**Part b (10 points)** Suppose we have a byte addressable memory system made up of 1 rank and some number of banks.

For the following program fragment, assume the elements of arrays a, b, and c occupy a single byte each. Assume that the variable *i* is kept in a register for the duration of the fragment and does not cause any memory accesses. Initially none of the elements of a, b, and c have been loaded into a row buffer. What is the minimum number of banks and the minimum size of the row buffer such that exactly 4 row buffer misses occur?

```
for ( i = 0; i < 63; i++ )  
    c[i] = a[i] + b[i];
```

**1** Banks

**4 row buffer misses means  
these 3 arrays occupy 4 rows**

**64** Bytes in row buffer



These rows can  
be across 1, 2, or  
4 banks. Question  
asks for minimum

4

$$\frac{64 \times 3}{4} = 48 \text{ bytes} \Rightarrow \text{rounds to } 64 \text{ Bytes}$$

Name: \_\_\_\_\_

#### Problem 4 (25 points)

We wish to enhance the LC-3b by adding Virtual memory and a new instruction. Virtual memory will be implemented by a VAX-like scheme as we studied in class.

The new instruction will be STI SR,BR,offset and will use opcode 1010. The format is:



STI R0,R1,#0

STI stands for Store Indirect.

x A040

STI operates as follows: We compute a virtual address (call it A) by adding the sign-extended offset to the contents of BR. The memory location specified by A contains the virtual address B. We wish to store the contents of SR into the address specified by B.

(Note: For those of you who studied the STI instruction in EE306, note that the address A is calculated differently from the way it was in the LC-3.)

**Part a (5 points):** To process the STI instruction, one must go through the Fetch, Decode, etc. instruction cycle. What is the maximum number of physical addresses that can be accessed in processing an STI instruction.

9

Fetch + 2 loads,  
1 load and 1 store

**Part b (20 points):** You are given the following information:

|                        |                |                       |         |
|------------------------|----------------|-----------------------|---------|
| Virtual Address Space: | 64 KB          | Physical Memory Size: | 4 KB    |
| User Space Range:      | x0000 to x7FFF | PTE Size:             | 2 Bytes |
| System Space Range:    | x8000 to xFFFF |                       |         |



R0: x8000

R1: x401E

PC: x3048

A 4 entry TLB.

| V | Page # | PTE |      |
|---|--------|-----|------|
|   |        | V   | PFN  |
| 1 | x0C1   | 1   | x01A |
| 1 | x182   | 1   | x024 |
| 0 | -      | -   | -    |
| 0 | -      | -   | -    |

X3048  
0|011 0000 0|100 1000  
Page size is 6 bits

0|011 0000 0|100 1000  
2  
Page size is 5 bits

0100,100 { 0,1000  
14 88

TLB hit could  
be either entry...

## Figuring out SBR

$$\begin{array}{r} \text{XB000} = \\ 10110000000000000000 \\ \times 180 \\ \hline +2 \\ \hline 300 \\ \end{array} \quad \begin{array}{r} \\ \\ \times 300 \\ \hline +580 \\ \hline X360 \\ \end{array}$$

Name: \_\_\_\_\_

### **Problem 4 continued**

To process STI R0,R1,#0, seven physical memory accesses are needed. The table below shows the VA, PA, Data, and whether or not there was a TLB hit for each of these seven physical memory accesses in the order they occurred.

Your job: Complete the table and fill in the four boxes. You can assume no page faults occur.

| Virtual Address               | Physical Address         |
|-------------------------------|--------------------------|
| These are<br>on the same page | They use the<br>same PTE |

| Virtual Address | Physical Address | Data    | TLB Hit |
|-----------------|------------------|---------|---------|
| x3048           | x488             | xA040   | Yes     |
| N/A             | x360             | x8004   | No      |
| xB000           | x080             | x8040   | No      |
| x401E           | x81E             | (x40FE) | No      |
| N/A             | x360             | x8004   | No      |
| xB00E           | x08E             | x800E   | No      |
| x40FE           | x1DE             | x8009   | No      |

check TLB  
for PFN

Frame size: 25 2

Number of frames: 

|   |              |
|---|--------------|
| 7 | $12 - 5 = 7$ |
|---|--------------|

UBR: ✓ Acos

SBR: X960

$$\begin{array}{r}
 \text{X } 40FE = 0100\ 0000\ 1111\ 1110 \\
 \text{X } 1IDE = 0001\ 1101\ 1110
 \end{array}$$

6 bits 5 bits  
 ↓      ↓  
 2      2  
(2)

Cannot be  
 6 bits

Figuring out Page size:

① ~~lets~~ Lets fill in what we know first. We know the contents of R1 is the next virtual address. Then the data stored at that location is the second TLB tells us that byte on Page bits are either 5 or 6 bits. Comparing x40FE and x1DE tells us that Bip must be 5 bits.

## Figuring out user

$$\begin{array}{r} \times 40FE : 0190\ 0000\ 1111\ 1110 \\ \times 207 \\ \downarrow \quad \leftarrow z \\ \times 40E \\ + \underline{UBR} \\ \times B00E \end{array}$$

$$\therefore \text{UBR} = x A B C D O$$

Scratch  
~~1010 must be 5 bits.~~  
 00001000,1110      10000000,1110  
 0    8    E            8    1    E  
 000,1110      0100 0000 0001 1110  
 x200  
 $\frac{x2}{x400}$   
 $+ A C 0 0$   
 $\hline$   
 B000

Name: \_\_\_\_\_

### Problem 5 continued:

**Part a :** Modify the state machine.

There are several places in the state machine where one of these exceptions could occur. We have provided enough two-state sequences for you to check and initiate each exception. Complete each of the two-state sequences you need for each of the places where they are needed. Note the "from" and "to" boxes associated with each two-state sequence. These are used to identify where the two-state sequence is to be inserted. You may not need all the two-state sequences that we have provided.



Figure 1: Two-state sequences

Name: \_\_\_\_\_

**Problem 5 continued:**

**Part b :** Design the logic to generate E and EXCV. You are not allowed to add any additional control signals to the control store.



Figure 2: Modified Datapath for new exceptions

Name: \_\_\_\_\_

**Problem 5 continued:**

**Part c :** Modify the microsequencer. You are not allowed to add any additional control signals to the control store.

