

## TUTORIAL 1

Perform the following operations using the signed-magnitude addition-subtraction algorithm. The sign bit is the leftmost bit (marked in red in the original).

In each case, indicate the value of AVF (the add-overflow flip-flop).

- a. **0 101101 + 0 011111**
- b. **1 011111 + 1 101101**
- c. **0 101101 – 0 011111**
- d. **0 101101 – 0 101101**
- e. **1 011111 – 0 101101**
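The flow of the signed-magnitude addition-subtraction algorithm can be sketched in Python. This is a minimal sketch, not a reference implementation: the function name, the tuple return value, and the default 6-bit magnitude width are assumptions chosen to match the operands above. It follows the usual textbook flow: equal effective signs add the magnitudes and copy the end carry into AVF, while unequal signs add the 2's complement of B and clear AVF.

```python
def signed_magnitude_add_sub(a_sign, a_mag, b_sign, b_mag, subtract=False, bits=6):
    """Return (sign, magnitude, AVF) for A + B or A - B in signed-magnitude form.

    Hypothetical helper for illustration; `bits` is the magnitude width.
    """
    mask = (1 << bits) - 1
    # Subtraction is treated as addition with the sign of B inverted.
    effective_b_sign = b_sign ^ 1 if subtract else b_sign

    if a_sign == effective_b_sign:
        # Same signs: add magnitudes, the end carry E becomes AVF.
        total = a_mag + b_mag
        return a_sign, total & mask, total >> bits

    # Different signs: compute A + (2's complement of B); AVF is cleared.
    total = a_mag + ((~b_mag & mask) + 1)
    end_carry = total >> bits
    magnitude = total & mask
    if end_carry == 1:
        # A >= B: result keeps the sign of A, except that zero is made positive.
        sign = 0 if magnitude == 0 else a_sign
    else:
        # A < B: the result is in 2's complement form, so complement it
        # and invert the sign.
        magnitude = ((~magnitude & mask) + 1) & mask
        sign = a_sign ^ 1
    return sign, magnitude, 0


# Example: (+5) - (+9) gives sign 1, magnitude 4, AVF 0.
print(signed_magnitude_add_sub(0, 5, 0, 9, subtract=True))
```

Each exercise above can be checked by passing the sign bit and the 6-bit magnitude separately, with `subtract=True` for the subtraction cases.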

## TUTORIAL 2

Perform the following operations using the signed-magnitude multiplication algorithm and Booth's multiplication algorithm, respectively.

- a. **(-9) × (+8) (signed-magnitude multiplication algorithm)**
- b. **(-6) × (-5) (Booth's multiplication algorithm)**
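For part b, the register-level steps of Booth's algorithm can be sketched as follows. This is a minimal sketch under stated assumptions: the register names `ac`, `qr`, `br` and the extra bit `q_extra` follow common textbook notation, the function name and the 4-bit default width are hypothetical, and the multiplicand is assumed not to be the most negative value representable in `bits` bits.

```python
def booth_multiply(multiplicand, multiplier, bits=4):
    """Multiply two signed integers with Booth's algorithm on `bits`-bit registers."""
    mask = (1 << bits) - 1
    sign_bit = 1 << (bits - 1)
    br = multiplicand & mask      # multiplicand register (2's complement)
    qr = multiplier & mask        # multiplier register (2's complement)
    ac = 0                        # accumulator, starts at zero
    q_extra = 0                   # the appended bit Q(n+1)

    for _ in range(bits):
        pair = (qr & 1, q_extra)
        if pair == (1, 0):
            ac = (ac - br) & mask     # first 1 in a string of 1s: subtract
        elif pair == (0, 1):
            ac = (ac + br) & mask     # first 0 after a string of 1s: add
        # Arithmetic shift right of the combined AC, QR, Q(n+1) register.
        q_extra = qr & 1
        qr = (qr >> 1) | ((ac & 1) << (bits - 1))
        ac = (ac >> 1) | (ac & sign_bit)

    product = (ac << bits) | qr
    # Interpret the 2n-bit result as a signed 2's complement number.
    if product >> (2 * bits - 1):
        product -= 1 << (2 * bits)
    return product


print(booth_multiply(3, -4))  # -12
```

Printing `ac`, `qr`, and `q_extra` inside the loop gives the step-by-step table usually expected in a written solution.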

## TUTORIAL 3

1.
  - a. How many  $128 \times 8$  RAM chips are required to provide a memory capacity of 2048 bytes?
  - b. How many lines of the address bus must be used to access 2048 bytes of memory? How many of these lines will be common to all chips?
  - c. How many lines must be decoded for chip select? Specify the size of the decoders.
2. A set-associative cache consists of 64 lines, or slots, divided into four-line sets. Main memory consists of 4K blocks of 128 words each. Show the main memory address format.
3. Obtain the complement function of the match logic of one word in an associative memory.
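The arithmetic behind question 1 can be sketched as below. This is an illustrative helper, not a prescribed method: the function name and dictionary keys are assumptions, and it assumes that both the chip size and the total capacity are powers of two and that every chip is the same size.

```python
def memory_array_plan(chip_words, total_words):
    """Chip count and address-line split for a memory built from equal RAM chips."""
    chips = total_words // chip_words
    # (n - 1).bit_length() equals log2(n) when n is a power of two.
    address_lines = (total_words - 1).bit_length()
    common_lines = (chip_words - 1).bit_length()   # lines wired to every chip
    select_lines = address_lines - common_lines    # lines decoded for chip select
    return {
        "chips": chips,
        "address_lines": address_lines,
        "common_lines": common_lines,
        "select_lines": select_lines,
        "decoder_outputs": 2 ** select_lines,
    }


print(memory_array_plan(chip_words=128, total_words=2048))
```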

## TUTORIAL 4

- In Figure 1, the two-word instruction at addresses 200 and 201 is a “load to AC” instruction with an address field equal to 500.
- The first word of the instruction specifies the operation code and mode; the second word specifies the address part.
- PC has the value of 200 for fetching this instruction.
- The content of processor register R1 is 400.
- The content of an index register XR is 100.
- AC receives the operand after the instruction is executed.

Now, based on the information in Figure 1 and Table 1, fill in Table 2.

|            | Address | Memory               |
|------------|---------|----------------------|
| $PC = 200$ | 200     | Load to AC      Mode |
| $R1 = 400$ | 201     | Address = 500        |
| $XR = 100$ | 202     | Next instruction     |
|            | 399     | 450                  |
|            | 400     | 700                  |
|            | 500     | 800                  |
|            | 600     | 900                  |
|            | 702     | 325                  |
|            | 800     | 300                  |

Figure 1

Table 1

| Mode              | Algorithm         | Principal Advantage | Principal Disadvantage     |
|-------------------|-------------------|---------------------|----------------------------|
| Immediate         | Operand = A       | No memory reference | Limited operand magnitude  |
| Direct            | EA = A            | Simple              | Limited address space      |
| Indirect          | EA = (A)          | Large address space | Multiple memory references |
| Register          | EA = R            | No memory reference | Limited address space      |
| Register indirect | EA = (R)          | Large address space | Extra memory reference     |
| Displacement      | EA = A + (R)      | Flexibility         | Complexity                 |
| Stack             | EA = top of stack | No memory reference | Limited applicability      |

Relative addressing: EA = A + (PC)

Table 2

| Addressing Mode     | Effective Address | Content of AC |
|---------------------|-------------------|---------------|
| Immediate Operand   |                   |               |
| Direct Addressing   |                   |               |
| Indirect Addressing |                   |               |
| Relative Addressing |                   |               |
| Indexed Addressing  |                   |               |
| Register            |                   |               |
| Register Indirect   |                   |               |
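The lookups needed for Table 2 can be sketched with the memory contents of Figure 1. This is a minimal sketch: the function and constant names are hypothetical, and it assumes that PC has already advanced past the two-word instruction (to 202) by the time a relative address is formed, which is consistent with Figure 1 listing a value at address 702.

```python
# Memory contents and register values taken from Figure 1.
MEMORY = {399: 450, 400: 700, 500: 800, 600: 900, 702: 325, 800: 300}
ADDRESS_FIELD = 500
R1 = 400
XR = 100
PC_AFTER_FETCH = 202  # assumption: PC points at the next instruction


def effective_address(mode):
    """Effective address for the modes that reference memory."""
    if mode == "direct":
        return ADDRESS_FIELD
    if mode == "indirect":
        return MEMORY[ADDRESS_FIELD]
    if mode == "relative":
        return PC_AFTER_FETCH + ADDRESS_FIELD
    if mode == "indexed":
        return XR + ADDRESS_FIELD
    if mode == "register_indirect":
        return R1
    raise ValueError(f"mode {mode!r} has no memory effective address")


def accumulator_content(mode):
    """Value loaded into AC for the given addressing mode."""
    if mode == "immediate":
        return ADDRESS_FIELD   # the operand is the address field itself
    if mode == "register":
        return R1              # the operand is held in the register
    return MEMORY[effective_address(mode)]


for mode in ("direct", "indirect", "relative", "indexed", "register_indirect"):
    print(mode, effective_address(mode), accumulator_content(mode))
```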

## TUTORIAL 5

1. The memory unit of a computer has 256K words of 32 bits each. The computer has an instruction format with four fields: an operation code field, a mode field to specify one of seven addressing modes, a register address field to specify one of 60 processor registers, and a memory address field. Specify the instruction format and the number of bits in each field if the instruction occupies one memory word.
2. A computer has 32-bit instructions and 12-bit addresses. If there are 250 two-address instructions, how many one-address instructions can be formulated?
3. An address space is specified by 24 bits and the corresponding memory space by 16 bits.
  - a. How many words are there in the address space?
  - b. How many words are there in the memory space?
  - c. If a page consists of 2K words, how many pages and blocks are there in the system?
4. A digital computer has a memory unit of  $64K \times 16$  and a cache memory of 1K words. The cache uses direct mapping with a block size of four words.
  - a. How many bits are there in the tag, index, block, and word fields of the address format?
  - b. How many bits are there in each word of cache including a valid bit?
  - c. How many blocks can the cache accommodate?
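The field widths asked for in question 4 can be derived mechanically. The sketch below is illustrative only: the function name and dictionary keys are assumptions, all sizes are assumed to be powers of two, and it uses the convention in which the index field is the block field followed by the word field.

```python
def direct_mapped_fields(main_words, cache_words, block_words, data_bits):
    """Address-field widths for a direct-mapped cache (sizes are powers of two)."""
    address_bits = (main_words - 1).bit_length()
    index_bits = (cache_words - 1).bit_length()    # selects a cache word
    word_bits = (block_words - 1).bit_length()     # selects a word in a block
    tag_bits = address_bits - index_bits
    return {
        "tag": tag_bits,
        "index": index_bits,
        "block": index_bits - word_bits,
        "word": word_bits,
        # Each cache word stores the data, the tag, and one valid bit.
        "cache_word_bits": data_bits + tag_bits + 1,
        "blocks_in_cache": cache_words // block_words,
    }


print(direct_mapped_fields(64 * 1024, 1024, 4, 16))
```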

## TUTORIAL 6

1. The memory locations 1000, 1001 and 1020 have data values 18, 1 and 16 respectively before the following program is executed.

| Instruction | Operands        | Comment          |
|-------------|-----------------|------------------|
| MOVI        | Rs, 1           | Move immediate   |
| LOAD        | Rd, 1000(Rs) \* | Load from memory |
| ADDI        | Rd, 1000        | Add immediate    |
| STOREI      | 0(Rd), 20 \*    | Store immediate  |

Which of the statements below is TRUE after the program is executed?

- a. Memory location 1000 has value 20
- b. Memory location 1020 has value 20
- c. Memory location 1021 has value 20
- d. Memory location 1001 has value 20

\* Uses displacement addressing mode.
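The program in question 1 can be traced by hand or with a few lines of Python. This is a sketch under one assumption: the displacement in the STOREI instruction is read as the digit 0, so that the store targets address 0 + Rd. The dictionary names are hypothetical.

```python
# Initial memory contents given in the question.
memory = {1000: 18, 1001: 1, 1020: 16}
regs = {"Rs": 0, "Rd": 0}

regs["Rs"] = 1                            # MOVI   Rs, 1
regs["Rd"] = memory[1000 + regs["Rs"]]    # LOAD   Rd, 1000(Rs)  (displacement)
regs["Rd"] = regs["Rd"] + 1000            # ADDI   Rd, 1000
memory[0 + regs["Rd"]] = 20               # STOREI 0(Rd), 20     (displacement)

print(regs)
print(memory)
```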

2. Consider the following memory values and a one-address machine with an accumulator. What values do the following instructions load into the accumulator?

- Word 20 contains 40
- Word 30 contains 50
- Word 40 contains 60
- Word 50 contains 70

The instructions are:

- a) Load immediate 20
- b) Load direct 20
- c) Load indirect 20
- d) Load immediate 30
- e) Load direct 30
- f) Load indirect 30
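The three load variants in question 2 differ only in how many memory lookups they perform. A minimal sketch, with a hypothetical `load` helper and the memory values listed above:

```python
# Memory contents given in the question.
memory = {20: 40, 30: 50, 40: 60, 50: 70}


def load(mode, operand):
    """Value placed in the accumulator for each addressing mode."""
    if mode == "immediate":
        return operand                    # the operand itself, no lookup
    if mode == "direct":
        return memory[operand]            # one memory lookup
    if mode == "indirect":
        return memory[memory[operand]]    # two memory lookups
    raise ValueError(f"unknown mode: {mode}")


for mode in ("immediate", "direct", "indirect"):
    print(mode, load(mode, 20), load(mode, 30))
```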

## TUTORIAL 7

1. In certain scientific computations it is necessary to perform the arithmetic operation  $(A_i + B_i)(C_i + D_i)$  with a stream of numbers. Specify a pipeline configuration to carry out this task. List the contents of all registers in the pipeline for  $i = 1$  through 6.
2. Draw a space-time diagram for a six-segment pipeline showing the time it takes to process eight tasks.
3. Determine the number of clock cycles that it takes to process 200 tasks in a six-segment pipeline.
4. A nonpipelined system takes 50 ns to process a task. The same task can be processed in a six-segment pipeline with a clock cycle of 10 ns.
  - a. Determine the speedup ratio of the pipeline for 100 tasks.
  - b. What is the maximum speedup that can be achieved?
5. The pipeline in the following figure has the following propagation times: 40 ns for the operands to be read from memory into registers R1 and R2, 45 ns for the signal to propagate through the multiplier, 5 ns for the transfer into R3, and 15 ns to add the two numbers into R5. Assume the pipelined unit has 3 segments.



  - a. What is the minimum clock cycle time that can be used?
  - b. A nonpipelined system can perform the same operation by removing R3 and R4. How long will it take to multiply and add the operands without using the pipeline?
  - c. Calculate the speedup of the pipeline for 10 tasks and again for 100 tasks.
  - d. What is the maximum speedup that can be achieved?
6. The time delays of a four-segment pipelined unit are as follows:  $t_1 = 50$  ns,  $t_2 = 30$  ns,  $t_3 = 95$  ns, and  $t_4 = 45$  ns. The interface registers have a delay time of  $t_r = 5$  ns. How long would it take to complete 100 tasks in the pipeline?
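Several questions in this tutorial rest on two standard relations: a k-segment pipeline completes n tasks in k + n - 1 clock cycles, and the speedup over a nonpipelined unit is n·t_n / ((k + n - 1)·t_p). A minimal sketch, with hypothetical function names and with all times in the same unit:

```python
def pipeline_cycles(segments, tasks):
    """Clock cycles for `tasks` tasks in a `segments`-segment pipeline (no stalls)."""
    return segments + tasks - 1


def speedup(tasks, segments, t_nonpipelined, t_clock):
    """Ratio of nonpipelined time to pipelined time for the same number of tasks."""
    pipelined_time = pipeline_cycles(segments, tasks) * t_clock
    return (tasks * t_nonpipelined) / pipelined_time


print(pipeline_cycles(6, 8))                 # 13 cycles for eight tasks
print(round(speedup(100, 6, 50, 10), 2))     # 4.76
```

As the number of tasks grows, the speedup approaches t_n / t_p, which is the maximum asked for in the questions above.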

## TUTORIAL 8

1. The time delays of a pipelined unit with four phases are as follows:  $t_1 = 50$  ns,  $t_2 = 60$  ns,  $t_3 = 90$  ns, and  $t_4 = 80$  ns. The interface registers have a delay time of  $t_r = 10$  ns. Calculate the following:

- a. Pipeline time for 1000 task completion
- b. Sequential time for 1000 task completion
- c. Throughput

2. A four-stage pipeline has stage delays of 150, 120, 160, and 140 ns, respectively. Registers are used between the stages and have a delay of 5 ns each. Assuming a constant clocking rate, the total time taken to process 1000 data items on the pipeline will be (choose the correct alternative):

- a. 120.4 microseconds
- b. 160.5 microseconds
- c. 165.5 microseconds
- d. 590.0 microseconds

3. The stage delays in a 4-stage pipeline are 800, 500, 400 and 300 picoseconds. The first stage is replaced with a functionally equivalent design involving two stages with respective delays 600 and 350 picoseconds. Calculate the throughput increase (in %).

4. You have been given two designs, D1 and D2, for a synchronous pipelined processor. D1 has a 5-stage pipeline with stage execution times of 3 ns, 2 ns, 4 ns, 2 ns, and 3 ns, while D2 has 8 pipeline stages, each with an execution time of 2 ns. How much time can be saved by using design D2 over design D1 to execute 100 instructions?

5. Consider the following processors. Assume that the pipeline registers have zero latency.

P1: 4-stage pipeline with stage latencies 1 ns, 2 ns, 2 ns, 1 ns

P2: 4-stage pipeline with stage latencies 1 ns, 1.5 ns, 1.5 ns, 1.5 ns

P3: 5-stage pipeline with stage latencies 0.5 ns, 1 ns, 1 ns, 0.6 ns, 1 ns

P4: 5-stage pipeline with stage latencies 0.5 ns, 0.5 ns, 1 ns, 1 ns, 1.1 ns

Which processor has the highest peak clock frequency?
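When stage delays are unequal, the clock period is set by the slowest stage plus the register delay, and everything else follows from it. The helpers below are a sketch with hypothetical names; they assume a single constant clock and no stalls, and all delays must use the same time unit.

```python
def clock_period(stage_delays, register_delay=0):
    """Clock period: the slowest stage plus the interface register delay."""
    return max(stage_delays) + register_delay


def pipeline_time(stage_delays, tasks, register_delay=0):
    """Total time for `tasks` tasks: (k + n - 1) clock periods."""
    stages = len(stage_delays)
    return (stages + tasks - 1) * clock_period(stage_delays, register_delay)


def throughput_gain_percent(old_delays, new_delays, register_delay=0):
    """Percentage increase in steady-state throughput (one result per clock)."""
    old_period = clock_period(old_delays, register_delay)
    new_period = clock_period(new_delays, register_delay)
    return (old_period / new_period - 1) * 100


# Example with made-up delays: three stages of 10, 30, and 20 ns, 2 ns registers.
print(clock_period([10, 30, 20], 2))        # 32
print(pipeline_time([10, 30, 20], 50, 2))   # (3 + 50 - 1) * 32 = 1664
```

The peak clock frequency of a design is the reciprocal of its clock period, so comparing designs reduces to comparing their slowest stages.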

## TUTORIAL 9

1. What is a RISC pipeline? Explain delayed load for the following four instructions:

LOAD:  $R1 \leftarrow M[Address1]$

LOAD:  $R2 \leftarrow M[Address2]$

ADD:  $R3 \leftarrow R1 + R2$

STORE:  $M[Address3] \leftarrow R3$

Explain Delayed Branch for the following set of instructions:

LOAD from Memory to R1

Increment R2

Add R3 to R4

Subtract R5 from R6

Branch to address X

Next instruction in X

2. Consider the four instructions in the following program. Suppose that the first instruction starts at step 1. Specify the operations performed in the four segments during step 4, assuming a four-segment instruction pipeline.

LOAD  $R1 \leftarrow M[321]$

ADD  $R2 \leftarrow R2 + M[313]$

INC  $R3 \leftarrow R3 + 1$

STORE  $M[314] \leftarrow R3$

3. Show how the increment operation in the above instructions can create a data hazard in the three-segment RISC pipeline.

4. Consider a computer with four floating-point pipeline processors. Suppose that each processor uses a cycle time of 40 ns. How long will it take to perform 400 floating-point operations? Is there a difference if the same 400 operations are carried out using a single pipeline processor with a cycle time of 10 ns?

5. A weather forecasting computation requires 250 billion floating-point operations. The problem is processed on a supercomputer that can perform 100 megaflops. How long will it take to do these calculations?
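For question 2, it can help to lay out which instruction occupies which segment at each step. The sketch below builds that space-time table for an ideal pipeline with no stalls. The segment names FI, DA, FO, and EX (fetch instruction, decode and calculate address, fetch operand, execute) are an assumption based on a common textbook four-segment instruction pipeline, and the function name is hypothetical.

```python
def space_time(segments, instructions):
    """List of rows, one per clock step, mapping segment name to instruction."""
    steps = len(segments) + len(instructions) - 1
    table = []
    for step in range(1, steps + 1):
        row = {}
        for position, segment in enumerate(segments):
            # Instruction i enters segment `position` at step i + position + 1.
            i = step - 1 - position
            row[segment] = instructions[i] if 0 <= i < len(instructions) else None
        table.append(row)
    return table


table = space_time(["FI", "DA", "FO", "EX"], ["LOAD", "ADD", "INC", "STORE"])
for step, row in enumerate(table, start=1):
    print(step, row)
```

The same helper can produce the space-time diagram for any number of segments and tasks, provided no hazards force a stall.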

## TUTORIAL 10

1.



| CS | RS1 | RS0 | Register selected                |
|----|-----|-----|----------------------------------|
| 0  | x   | x   | None: data bus in high-impedance |
| 1  | 0   | 0   | Port A register                  |
| 1  | 0   | 1   | Port B register                  |
| 1  | 1   | 0   | Control register                 |
| 1  | 1   | 1   | Status register                  |

The addresses assigned to the four registers of the I/O interface of the above figure are equal to the binary equivalent of 12, 13, 14, and 15. Show the external circuit that must be connected between an 8-bit I/O address from the CPU and the CS, RS1, and RS0 inputs of the interface.

2. Six interfaces of the type shown in the previous figure are connected to a CPU that uses an I/O address of eight bits. Each of the six chip select (CS) inputs is connected to a different address line. Thus, the high-order address line is connected to the CS input of the first interface unit and the sixth address line is connected to the CS input of the sixth interface unit. The two low-order address lines are connected to RS1 and RS0 of all six interface units. Determine the 8-bit address of each register in each interface.
3. Information is inserted into a FIFO buffer at a rate of  $m$  bytes per second. The information is deleted at a rate of  $n$  bytes per second. The maximum capacity of the buffer is  $k$  bytes.
  - a. How long does it take for an empty buffer to fill up when  $m > n$ ?
  - b. How long does it take for a full buffer to empty when  $m < n$ ?
  - c. Is the FIFO Buffer required if  $m = n$ ?

4. How many characters per second can be transmitted over a 1200-baud line in each of the following modes? (A character code consists of 8 bits.)
  - a. Synchronous serial transmission.
  - b. Asynchronous serial transmission with two stop bits.
  - c. Asynchronous serial transmission with one stop bit.
5. A CPU with a 20 MHz clock is connected to a memory unit whose access time is 40 ns. Formulate read and write timing diagrams using a READ strobe and a WRITE strobe. Include the address in the timing diagrams.
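The rate calculations in questions 3 and 4 reduce to simple ratios. The sketch below uses hypothetical function names and two assumptions: one bit is transferred per baud, and asynchronous framing adds exactly one start bit plus the stated number of stop bits.

```python
def chars_per_second(baud, data_bits=8, start_bits=0, stop_bits=0):
    """Characters per second when each character costs data + framing bits."""
    return baud / (data_bits + start_bits + stop_bits)


def fifo_time(capacity, insert_rate, delete_rate):
    """Seconds to fill an empty buffer (or drain a full one) at the net rate."""
    net_rate = abs(insert_rate - delete_rate)
    if net_rate == 0:
        raise ValueError("equal rates: the buffer level never changes")
    return capacity / net_rate


print(chars_per_second(1200))                               # synchronous
print(chars_per_second(1200, start_bits=1, stop_bits=1))    # asynchronous, one stop bit
print(fifo_time(capacity=100, insert_rate=30, delete_rate=10))
```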