

**Final Assessment Test (FAT) – November/December 2022**

|              |                                        |             |                       |
|--------------|----------------------------------------|-------------|-----------------------|
| Programme    | B.Tech.                                | Semester    | Fall Semester 2022-23 |
| Course Title | COMPUTER ARCHITECTURE AND ORGANIZATION | Course Code | BCSE205L              |
| Faculty Name | Prof. Bhavadharini R M                 | Slot        | B1+TB1                |
| Time         | 3 Hours                                | Class Nbr   | CH2022231001498       |
|              |                                        | Max. Marks  | 100                   |

**Part A (10 X 10 Marks)**

**Answer All questions**

- Q1. Assume a RISC-based computer system. Illustrate the steps required to execute the following sequence of instructions: [10]

| Memory Address | Instruction |
|----------------|-------------|
| 1010           | Load A      |
| 1011           | Add B       |
| 1100           | Store C     |



Here, 1010, 1011, and 1100 are the memory addresses of the instructions. A is the memory address of the first operand and holds the data value 20; location B holds the data value 30. Identify the contents of registers PC, MAR, MDR, and IR during the instruction fetch and execute phases.

(10 marks)

2. Perform multiplication using Booth's Algorithm. [10]

i. Show the step-by-step process for multiplying $(-23)_{10}$ and $(7)_{10}$. (7 Marks)

ii. Write the re-coded value of the multiplier. How many passes have no arithmetic operation? (3 Marks)
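The register-level procedure behind this question can be checked with a short simulation. The sketch below is only an aid for verifying a hand-worked answer: it assumes a 6-bit word (the smallest width that holds -23 in two's complement) and the usual A, Q, Q₋₁ register arrangement, and the name `booth_multiply` and the trace format are illustrative choices rather than anything prescribed by the paper.

```python
def booth_multiply(multiplicand, multiplier, n):
    """Multiply two n-bit two's-complement integers with Booth's algorithm.

    Returns (product, trace); trace holds (operation, A, Q, Q_-1) after
    each pass. Assumes multiplicand != -2**(n-1), which would overflow A.
    """
    mask = (1 << n) - 1
    m = multiplicand & mask
    a, q, q_1 = 0, multiplier & mask, 0
    trace = []
    for _ in range(n):
        pair = (q & 1, q_1)
        if pair == (1, 0):        # start of a run of 1s: subtract M
            a = (a - m) & mask
            op = "A = A - M"
        elif pair == (0, 1):      # end of a run of 1s: add M
            a = (a + m) & mask
            op = "A = A + M"
        else:                     # 00 or 11: shift only
            op = "no arithmetic"
        # Arithmetic shift right of the combined A, Q, Q_-1 register.
        q_1 = q & 1
        q = (q >> 1) | ((a & 1) << (n - 1))
        a = (a >> 1) | (a & (1 << (n - 1)))
        trace.append((op, format(a, f"0{n}b"), format(q, f"0{n}b"), q_1))
    product = (a << n) | q
    if product >> (2 * n - 1):    # reinterpret the 2n-bit result as signed
        product -= 1 << (2 * n)
    return product, trace


product, trace = booth_multiply(-23, 7, 6)
for step in trace:
    print(step)
print(product)  # -161
```

With this 6-bit assumption, the passes labelled "no arithmetic" in the trace correspond to the zero digits of the re-coded multiplier; a wider word would add further shift-only passes.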

3. Perform the following floating-point operations on the numbers $(356.65)_{10}$ and $(222.75)_{10}$. [10]

i. Convert the above numbers into normalized binary notation. (3 marks)

ii. Perform the subtraction operation for the given numbers and write the normalized result in IEEE-754 double precision format. (7 marks)

4. i. Assume a stack-oriented processor that includes the stack operations PUSH and POP. Arithmetic operations automatically involve the top one or two stack elements. Begin with an empty stack and illustrate the contents of the stack after each instruction. (5 marks) [10]

PUSH 5

PUSH 16

PUSH 2

PUSH 6

ADD

DIV

MUL

PUSH 3

DIV
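The stack trace for the instruction sequence above can be reproduced with a small interpreter. This is a sketch under stated assumptions: a binary operation pops the top element as the right operand and the next element as the left operand, and DIV is integer division. The paper does not fix either convention, so a different operand order would change the DIV results. The names `OPS` and `run` are illustrative.

```python
import operator

# Assumed semantics: result = left OP right, where right is the top of stack.
OPS = {
    "ADD": operator.add,
    "SUB": operator.sub,
    "MUL": operator.mul,
    "DIV": operator.floordiv,  # integer division is an assumption
}


def run(program):
    """Execute the program and return (instruction, stack snapshot) pairs."""
    stack, history = [], []
    for line in program:
        parts = line.split()
        if parts[0] == "PUSH":
            stack.append(int(parts[1]))
        elif parts[0] == "POP":
            stack.pop()
        else:
            right = stack.pop()
            left = stack.pop()
            stack.append(OPS[parts[0]](left, right))
        history.append((line, list(stack)))
    return history


program = ["PUSH 5", "PUSH 16", "PUSH 2", "PUSH 6",
           "ADD", "DIV", "MUL", "PUSH 3", "DIV"]
for instruction, snapshot in run(program):
    print(f"{instruction:8s} {snapshot}")
```

Each printed row lists the stack from bottom to top after the instruction has executed.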

ii. You are on the design team for a new processor. It is decided that this processor's clock must run at 2 GHz. Assume that the programs that would be executed on this processor would typically consist of 30% of load and store instructions, 60% of arithmetic and logical instructions, and 10% branching instructions. If each of these classes of instructions requires 6, 2, and 5 clock cycles, respectively, calculate the CPI and the MIPS rating of this processor. (5 marks)
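The CPI and MIPS figures follow from the weighted-average formulas CPI = Σ(fractionᵢ × cyclesᵢ) and MIPS = clock rate / (CPI × 10⁶). A minimal sketch for checking the arithmetic, with function names chosen here for illustration:

```python
def average_cpi(mix):
    """Weighted average CPI for a list of (fraction, cycles) pairs."""
    return sum(fraction * cycles for fraction, cycles in mix)


def mips_rating(clock_hz, cpi):
    """Millions of instructions per second for a given clock rate and CPI."""
    return clock_hz / (cpi * 1e6)


# Load/store 30% at 6 cycles, ALU 60% at 2 cycles, branch 10% at 5 cycles.
mix = [(0.30, 6), (0.60, 2), (0.10, 5)]
cpi = average_cpi(mix)
mips = mips_rating(2e9, cpi)
print(f"CPI = {cpi:.2f}, MIPS = {mips:.2f}")  # CPI = 3.50, MIPS = 571.43
```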

5. A company needs to find out which of its two processors would require less time to execute the instruction, MUL R1, [R2], R3. The first processor has a single internal processor bus, while the second processor has three internal buses. Illustrate the architectural design of the two processors and provide the control steps involved in fetching and executing the above instruction with both processors. (10 marks) [10]

6. Consider a 2-way set associative cache with a total of 12 cache blocks. The main memory block requests are as follows: [10]

10, 55, 11, 4, 13, 8, 132, 129, 212, 129, 64, 8, 48, 32, 73, 92

Calculate the number of misses and the miss ratio if the replacement strategy is

- i. Least Recently Used (LRU) (5 marks)  
ii. First In First Out (FIFO) (5 marks)
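A hand count of the misses can be cross-checked with a small cache simulator. The sketch below assumes the cache starts empty and that a block maps to set `block number mod number of sets` (12 blocks with 2 ways gives 6 sets); the function name `count_misses` and its parameters are illustrative.

```python
def count_misses(requests, num_blocks, ways, policy):
    """Count misses in a set-associative cache under 'LRU' or 'FIFO'."""
    num_sets = num_blocks // ways
    # Each set is a list ordered from next-to-evict (front) to newest (back).
    sets = [[] for _ in range(num_sets)]
    misses = 0
    for block in requests:
        lines = sets[block % num_sets]
        if block in lines:
            if policy == "LRU":
                # A hit refreshes recency only under LRU; FIFO order is fixed.
                lines.remove(block)
                lines.append(block)
            continue
        misses += 1
        if len(lines) == ways:
            lines.pop(0)
        lines.append(block)
    return misses


requests = [10, 55, 11, 4, 13, 8, 132, 129, 212, 129, 64, 8, 48, 32, 73, 92]
for policy in ("LRU", "FIFO"):
    misses = count_misses(requests, num_blocks=12, ways=2, policy=policy)
    print(policy, misses, misses / len(requests))
```

Under these assumptions the only hits are the repeated requests for blocks 129 and 8, so both policies give the same miss count for this particular trace.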

7. In a DMA transfer, two devices place bus requests. Device A has ID 5 and Device B has ID 8. Identify the device that becomes the bus master and illustrate the distributed arbitration process with an appropriate diagram. [10]

8. Consider an 8-bit word  $(01111101)_2$  transmitted as  $(0111111)_2$ . [10]  
i. Draw the layout of data bits and calculate the check bits for storing and retrieving the given data. (5 marks)  
ii. Using Hamming code, show the steps involved in error detection and apply the correction, if any, for the same. (5 marks)

9. An e-commerce application like Flipkart demands zero downtime and maintains sensitive payment data. How will you ensure the reliability, availability, and redundancy of the data if the server crashes? Explain the hybrid RAID level suitable for this application with an appropriate diagram. [10]

10. Consider the two instructions : [10]  
I1 : Add R1, R2, R3  
I2 : Shift left R3  
i. Draw the timing diagram for a 4-stage pipeline. (5 marks)  
ii. Identify the number of stalls due to data dependencies and show how they are handled using the operand forwarding scheme. (5 marks)



## Final Assessment Test (FAT) – November/December 2022

|              |                                        |             |                       |
|--------------|----------------------------------------|-------------|-----------------------|
| Programme    | B.Tech.                                | Semester    | Fall Semester 2022-23 |
| Course Title | COMPUTER ARCHITECTURE AND ORGANIZATION | Course Code | BCSE205L              |
| Faculty Name | Prof. A. K Ilavarasi                   | Slot        | B2+TB2                |
| Time         | 3 Hours                                | Class Nbr   | CH2022231001515       |
|              |                                        | Max. Marks  | 100                   |

## Part A (10 X 10 Marks)

Answer All questions

1. The instruction Mul A, (R52), #23 resides at memory location 2021. Explain how the processor fetches and executes this instruction, describing the complete flow through the CPU registers in detail for a RISC architecture. Draw an architectural diagram for the given scenario as well. [10]
2. (i) In a hardware designing company, a manager assigned his team to perform Binary Floating-Point addition using IEEE 754 single precision format on the following numbers  $(198.31)_{10}$  and  $(87.346)_{10}$ . Your task is to help the team to compute the various intermediate steps to perform the addition operation to produce the result. (7 Marks)  
 (ii) Represent the final result in single precision floating point format for the problem mentioned in Question 2. section (i). (3 Marks)
3. Multiply  $(-23)_{10} \times (-7)_{10}$  using Booth's Algorithm. [10]  
 i) Determine the value of the Accumulator at the end of the second step? (3 Marks)  
 ii) Write down all the steps needed for computation in each iteration and determine the value in the Accumulator and Q register in the final step? (7 Marks)
4. Write the sequence of control steps required to execute each of the instructions given below using a single bus structure. [10]  
 (i) Add R1, #6. (3 marks)  
 (ii) Mul R1, (2090). (3 marks)  
 (iii) Move the contents of the memory location whose address is at memory location NUM to register R1. (4 marks)
5. Consider that two processors are implemented on the same instruction set architecture. The instructions are divided into four classes according to their CPI namely class A, B, C, and D. Processor P1 has a clock rate of 2.5 GHz with the CPI values of 1, 2, 3, and 3 for different classes of instructions, and Processor P2 has a clock rate of 3 GHz with the CPI values of 2, 2, 2, and 2 for different classes of instructions. Given a program with a dynamic instruction count of 1 million instructions which are divided into four classes namely 10% of class A, 20% of class B, 50% of class C, and 20% of class D. Compute the following: [10]  
 (i) Which processor implementation is faster? (2 Marks)  
 (ii) What is the global CPI for each implementation? (4 Marks)  
 (iii) Find the clock cycles required in both cases. (4 Marks)
6. Consider a computer with the following characteristics: Main memory consists of 1Mbyte; Word size is of 1 byte; Block size is of 16 bytes; The cache size is of 64 Kbytes. [10]

- (i) For the main memory addresses of F0010, 01234, and CABBE, give the corresponding tag, cache line address, and word offsets for a direct-mapped cache. (3 Marks)
- (ii) Give any two main memory addresses with different tags that map to the same cache slot for a direct-mapped cache. (2 Marks)
- (iii) For the main memory addresses of F0010 and CABBE, give the corresponding tag and offset values for a fully-associative cache. (2 Marks)
- (iv) For the main memory addresses of F0010 and CABBE, give the corresponding tag, cache set, and offset values for a two-way set-associative cache. (3 Marks)
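The field boundaries in Question 6 come from the sizes given: 1 Mbyte of byte-addressable memory means 20-bit addresses, 16-byte blocks give 4 offset bits, and 64 Kbytes / 16 bytes gives 4096 lines (12 index bits direct-mapped, 11 set bits for two-way, none for fully associative). The sketch below, which assumes the addresses are hexadecimal, splits an address accordingly; `split_address` is an illustrative helper name.

```python
def split_address(addr, offset_bits, index_bits):
    """Return (tag, index, offset) for the given field widths."""
    offset = addr & ((1 << offset_bits) - 1)
    index = (addr >> offset_bits) & ((1 << index_bits) - 1)
    tag = addr >> (offset_bits + index_bits)
    return tag, index, offset


OFFSET_BITS = 4  # 16-byte blocks
layouts = {
    "direct-mapped": 12,      # 4096 lines
    "two-way": 11,            # 2048 sets
    "fully associative": 0,   # no index field
}
for addr in (0xF0010, 0x01234, 0xCABBE):
    for name, index_bits in layouts.items():
        tag, index, offset = split_address(addr, OFFSET_BITS, index_bits)
        print(f"{addr:05X} {name:18s} tag={tag:X} index={index:X} offset={offset:X}")
```

For part (ii), any two addresses that agree in the index field but differ in the tag field (for example, addresses that differ only in the top hexadecimal digit) land in the same direct-mapped slot.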
7. (i) A Processor needs to transfer a file from a peripheral storage to the memory. During the I/O operation, the processor, main memory and I/O share a common bus. Suggest a suitable I/O technique for the above operation and explain its working in detail. (6 Marks) [10]
- (ii) With a neat sketch, identify and explain the methodology for connecting multiple devices that can receive acknowledgments in a serial manner. (4 Marks)
8. You want to store the following data  $(1735)_{10}$  in a 12-bit binary format at the address location 1345H in the memory. Ensure that data is stored and retrieved correctly. Determine the code generated for the given data and store it in the specified location. When the stored data is accessed from the same location after some time it is read as  $(1223)_{10}$ . [10]
- a) Draw the diagram of error correcting code function and discuss those results in detail. (3 Marks)
- b) Draw the layout of data bits and check bits. (3 Marks)
- c) Using SEC code, show the steps involved in error detection and correction for the above scenario. (4 Marks)
9. Infy operational data center requires a solution to recover their critical data disks from abrupt failures and disasters. [10]
- (i) Suggest at least three generalized solutions to this problem and justify that it increases the reliability of the system too. (4 marks)
- (ii) If Infy focuses on providing highest level of fault tolerance to a single disk drive, then which level of the above technique would you recommend? (3 marks)
- (iii) Which level of the above technique does Infy follow if they want to stripe data at a block level across several drives with parity stored on one drive? (3 marks)
10. (i) Consider the following Assembly language code: [10]
- I1 : ADD R5, R2, R3  
I2 : SUB R4, R5, R2
- When the above assembly language program is executed in a pipelined processor, identify which type of dependency exists, discuss the hazards encountered, and provide solutions to overcome them. (7 Marks)
- (ii) Assume a 5-stage instruction cycle. Find out the speedup achieved if a set of 25 instructions is run on a processor without pipelining. (3 Marks)





**Final Assessment Test (FAT) - July/August 2023**

|              |                                        |             |                           |
|--------------|----------------------------------------|-------------|---------------------------|
| Programme    | B.Tech.                                | Semester    | Fall Inter Semester 22-23 |
| Course Title | COMPUTER ARCHITECTURE AND ORGANIZATION | Course Code | BCSE205L                  |
| Faculty Name | Prof. SRINIVASAN R                     | Slot        | A1+TA1                    |
|              |                                        | Class Nbr   | CH2022232500333           |
| Time         | 3 Hours                                | Max. Marks  | 100                       |

**Part A (10 X 10 Marks)**

**Answer All questions**

Q1. (a) Draw the Von Neumann architecture and explain fetch-decode-execute cycle. [10]

(b) Table 1 shows a segment of memory from Von Neumann architecture. If the program counter (PC) contains the data of address 503, determine the contents of the Memory Address Register (MAR) and Memory Data Register (MDR) during a fetch cycle.

| Address | Contents |
|---------|----------|
| 500     | 11001101 |
| 501     | 11110001 |
| 502     | 10101111 |
| 503     | 10000110 |
| 504     | 00011001 |
| 506     | 10101100 |

Table 1

Q2. (a) Draw the flowchart for Booth's multiplication algorithm. [10]

(b) Multiply  $(-13)_{10}$  and  $(-20)_{10}$  using the same algorithm.

Q3. (a) Consider the IEEE 754 standard for representing floating-point numbers. Represent the decimal number  $(-0.25)_{10}$  in the single precision format. [10]

(b) Perform the following floating-point addition:  $(1.111)_2 \times 2^{-1} + (1.011)_2 \times 2^{-3}$ .
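A hand-derived single precision encoding can be verified by unpacking the bit pattern with Python's standard `struct` module. This is a checking aid, not a substitute for showing the conversion steps; `float32_fields` is an illustrative name.

```python
import struct


def float32_fields(value):
    """Return (sign, biased exponent, fraction, raw bits) of an IEEE 754 single."""
    (bits,) = struct.unpack(">I", struct.pack(">f", value))
    sign = bits >> 31
    exponent = (bits >> 23) & 0xFF
    fraction = bits & 0x7FFFFF
    return sign, exponent, fraction, bits


sign, exponent, fraction, bits = float32_fields(-0.25)
# -0.25 = -1.0 x 2^-2, so the biased exponent is -2 + 127 = 125.
print(sign, format(exponent, "08b"), format(fraction, "023b"))
print(hex(bits))  # 0xbe800000
```

For part (b), the operands are 0.9375 and 0.171875 in decimal, so the aligned sum should equal 1.109375, that is $(1.000111)_2 \times 2^{0}$.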

Q4. (a) Suppose you have the instruction "Add 900". Given the memory as shown in Table 2, the contents of AC and the base register are 200 and 100 respectively. [10]

| Memory address | value |
|----------------|-------|
| 800            | 900   |
| 900            | 1000  |
| 1000           | 500   |
| 1100           | 600   |
| 1200           | 800   |
| 1300           | 250   |

Table 2

What would be loaded into the AC, if the addressing mode for the operand is:

- i) Immediate.
- ii) Direct
- iii) Indirect
- iv) Indexed

(b) Describe the design of microprogrammed control unit of CPU, with diagram.

- Q5. (a) Consider a two-level cache with access time of 5ns and 80ns respectively. If the hit rates are 95% and 75% respectively in the two caches and the memory access time is 250ns, what is the average access time? [10]
- (b) Why do we use memory interleaving? Explain the types of memory interleaving.
- Q6. (a) Identify the mode of data transfer between main memory and I/O devices without intervention by CPU. Explain it with suitable diagram. [10]
- (b) When a device interrupt occurs, how does the processor determine which device issued the interrupt?
- Q7. (a) Illustrate read and write cycles using handshake in asynchronous type buses. [10]
- (b) With a neat sketch, explain the various arbitration mechanisms in buses.
- Q8. (a) What factors should be considered when choosing a RAID level for a specific storage requirement? [10]
- (b) Assume a byte data value is  $10011010_2$ . Find the Hamming ECC code for that byte, and then invert bit 10 and show that the ECC code finds and corrects the single-bit error.
- Q9. (a) The 5 stages of a processor have the latencies as given in Table 3. [10]

| Fetch  | Decode | Execute | Memory | Writeback |
|--------|--------|---------|--------|-----------|
| 250 ps | 350 ps | 300 ps  | 500 ps | 50 ps     |

Table 3

- Assume that when pipelining, each pipeline stage costs 20 ps extra for the registers between pipeline stages. Calculate cycle time, throughput and time to execute 2000 instructions when the processor is (i) non-pipelined and (ii) pipelined
- (b) Explain the hazards in pipelining.
- Q10. (a) In a program, 95% of the execution time is spent inside a loop that can be executed in parallel. If we parallelize this loop and execute the program on 8 CPUs, what is the maximum speedup and efficiency? [10]
- (b) With neat sketch, describe the architecture and features of superscalar processors.
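For Q10 (a), the maximum speedup follows from Amdahl's law, speedup = 1 / ((1 − p) + p / N), with efficiency defined as speedup / N. A minimal sketch for checking the numbers, assuming the 95% figure is the parallelizable fraction p of the original execution time:

```python
def amdahl_speedup(parallel_fraction, processors):
    """Speedup predicted by Amdahl's law for a given parallel fraction."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / processors)


speedup = amdahl_speedup(0.95, 8)
efficiency = speedup / 8
print(f"speedup = {speedup:.2f}, efficiency = {efficiency:.1%}")
# speedup = 5.93, efficiency = 74.1%
```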



## Final Assessment Test (FAT) - July/August 2023

|              |                                        |             |                           |
|--------------|----------------------------------------|-------------|---------------------------|
| Programme    | B.Tech.                                | Semester    | Fall Inter Semester 22-23 |
| Course Title | COMPUTER ARCHITECTURE AND ORGANIZATION | Course Code | BCSE205L                  |
| Faculty Name | Prof. Balakrishnan R                   | Slot        | A2+TA2                    |
| Time         | 3 Hours                                | Class Nbr   | CH2022232500338           |
|              |                                        | Max. Marks  | 100                       |

## Part A (10 X 10 Marks)

Answer All questions

01. (a) Explain and illustrate any 3 bus structures available. [10]  
 (b) Compare and differentiate between Von Neumann and Harvard Architecture.
02. Explain the 5-bit Booth multiplier using a flowchart and find the result of multiplying  $(-9) \times (-13)$ . [10]
03. Divide 11 (eleven) by 3 (three) using non-restoring method and explain the steps involved. [10]
04. (a) Consider a MIPS computer with a processor speed of 1 GHz in which each ALU instruction takes 3 clock cycles, each branch/jump instruction takes 2 clock cycles, each short word (SW) instruction takes 4 clock cycles, and each long word (LW) instruction takes 5 clock cycles. Consider a benchmark program that executes 200 million ALU instructions, 55 million branch instructions, 25 million SW instructions, and 20 million LW instructions. Find (i) the CPI and (ii) the CPU time.  
 (b) A two-byte relative mode branch instruction is stored in memory location 1000. The branch is made to the location 87. What is the effective address? [10]
05. (a) Consider a system in which bus cycles take 500 ns. Transfer of bus control in either direction, from processor to I/O device or vice versa, takes 250 ns. One of the I/O devices has a data transfer rate of 50 KB/s and employs DMA. Data are transferred 1 byte at a time. Suppose we employ DMA in burst mode; that is, the DMA interface gains bus mastership prior to the start of a block transfer and maintains control of the bus until the whole block is transferred. For how long would the device tie up the bus when transferring a block of 128 bytes? Repeat the calculation for cycle-stealing mode.  
 (b) Explain the interrupt priority system using polling method. [10]
06. (a) A two-way set associative cache memory uses block of four words. The cache can accommodate a total of 2048 words from main memory. The main memory size is 128K X 32. Formulate all pertained information required to construct the cache memory (Tag, Index, Data, blocks, words).  
 (b) What is the size of the cache memory? [10]
07. The access time of a cache memory is 100 ns and that of main memory is 1000 ns. It is estimated that 80% of the memory requests are for read and the remaining 20% is for write. The hit ratio for read access is 0.9. A write through procedure is used.  
 (a) What is the average access time of the system considering only memory read cycle?  
 (b) What is the average access time of the system for both read and write cycles?  
 (c) What is the hit ratio taking into consideration the write cycles? [10]

08. (a) A bit stream 10011101 is transmitted using the standard CRC method. The generator polynomial is  $x^3+1$ . What is the actual bit string transmitted? Suppose the third bit from the left is inverted during transmission. How will the receiver detect this error? [10]  
(b) Explain the organization of magnetic disk with Read / Write Mechanism.
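For Question 08 (a), the CRC long division can be sketched as modulo-2 (XOR) division over bit strings. The code below assumes the generator $x^3+1$ is written as the bit string `1001` and that the frame is sent most significant bit first; `crc_remainder` and `crc_encode` are illustrative helper names.

```python
def crc_remainder(bits, generator):
    """Remainder of the modulo-2 division of bits by generator (bit strings)."""
    work = [int(b) for b in bits]
    gen = [int(b) for b in generator]
    for i in range(len(work) - len(gen) + 1):
        if work[i]:
            # XOR the generator into the working register at position i.
            for j, g in enumerate(gen):
                work[i + j] ^= g
    return "".join(str(b) for b in work[-(len(gen) - 1):])


def crc_encode(message, generator):
    """Append the CRC check bits to the message."""
    padded = message + "0" * (len(generator) - 1)
    return message + crc_remainder(padded, generator)


frame = crc_encode("10011101", "1001")
print(frame)                                  # 10011101100
print(crc_remainder(frame, "1001"))           # 000: accepted
corrupted = frame[:2] + ("1" if frame[2] == "0" else "0") + frame[3:]
print(crc_remainder(corrupted, "1001"))       # 100: non-zero, error detected
```

The receiver repeats the same division on the received frame; any non-zero remainder signals a transmission error.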
09. (a) What is Flynn's classification of computer architecture? Explain each classification with a block diagram. [10]  
(b) Consider a pipeline having 4 phases with duration 60, 50, 90 and 80 ns. Given latch delay is 10 ns. Calculate  
i) Pipeline cycle time  
ii) Pipeline time for 1000 tasks  
iii) Sequential time for 1000 tasks
10. (a) Explain the difference between superscalar and super pipeline architecture with block diagram. [10]  
(b) Consider a non-pipelined processor with a clock speed of 2.5 MHz and an average of 4 cycles per instruction. The same processor is upgraded to a five-stage pipelined RISC processor, but due to the internal pipeline delay, the clock speed is reduced to 2 MHz. Assume there are no stalls in the pipeline. Calculate  
i) Cycle Time in Non-Pipelined Processor  
ii) Non-Pipeline Execution Time  
iii) Cycle Time in Pipelined Processor  
iv) Pipeline Execution Time  
v) Speed Up ratio



**Final Assessment Test (FAT) - November/December 2023**

|              |                                        |             |                         |
|--------------|----------------------------------------|-------------|-------------------------|
| Programme    | B.Tech.                                | Semester    | FALL SEMESTER 2023 - 24 |
| Course Title | COMPUTER ARCHITECTURE AND ORGANIZATION | Course Code | BCSE205L                |
| Faculty Name | Prof. VAIDEHI VIJAYAKUMAR              | Slot        | F1+TF1                  |
| Time         | 3 Hours                                | Class Nbr   | CH2023240100882         |
|              |                                        | Max. Marks  | 100                     |

**PART-A (10 X 10 Marks)**
**Answer all questions**

01. a) Consider the following assembly language program for a hypothetical processor. A, B and C are 8-bit registers. The meanings of the various instructions are shown as comments. [10]

```

MOV B, #0      ; B ← 0
MOV C, #8      ; C ← 8
Z: CMP C, #0   ; compare C with 0
    JZ X        ; jump to X if zero flag is set
    SUB C, #1   ; C ← C - 1
    RRC A, #1   ; right rotate A through carry by one bit. Thus:
                  ; If the initial values of A and the carry flag are a7...a0 and
                  ; c0 respectively, their values after the execution of this
                  ; instruction will be c0 a7...a1 and a0 respectively.
    JC Y        ; jump to Y if carry flag is set
    JMP Z        ; jump to Z
Y: ADD B, #1   ; B ← B + 1
    JMP Z        ; jump to Z
X:             ;

```

Find the value of register B if register A initially contains A0. (5 Marks)

- b) Compare RISC and CISC architectures using a suitable example. (5 Marks)

02. ABC technologies have planned to design a processor which can perform multiplication using 2's complement addition and subtraction operations. [10]
- a) Explain the algorithm for ABC technologies by sketching the flowchart (5 Marks)
- b) Check the working of it with the following number:  $(-15)_{10} \times (21)_{10}$  (5 Marks)
03. a) Consider a 64-bit machine where an instruction (ADD R1, R2) is stored at memory location 4088(in Hexadecimal). What will be the value of MAR, MDR, IR and PC while the instruction is fetched and executed? Individual instruction is 64-bit long. (5 Marks)
- b) Consider a 128 bit machine where an instruction (ADD R1, LOC A) is stored at location 204B (in Hexadecimal). LOC A is memory location whose value is 200B (in Hexadecimal). How many memory accesses are required to execute this instruction assuming the instruction is stored in memory? In addition, what will be the content of PC after the instruction is fetched? Individual instruction is 128-bit long. (5 Marks)

04. Benchmark B is comprised of ALU, Load, Store, and Branch/Jump instructions. The number of instructions required to execute benchmark B are broken down by class-type in the table below. The number of clock cycles required to execute each instruction is also given in the table below. [10]

| Instruction Class | Clock Cycles(CC) per Instruction | Number of Instructions                 |
|-------------------|----------------------------------|----------------------------------------|
| ALU               | 6                                | 1,000,000,000 (i.e $1.0 \times 10^9$ ) |
| Load              | 8                                | 200,000,000 (i.e. $2.0 \times 10^8$ )  |
| Store             | 7                                | 180,000,000 (i.e. $1.8 \times 10^8$ )  |
| Branch / Jump     | 5                                | 140,000,000 (i.e. $1.4 \times 10^8$ )  |

To improve the performance of this benchmark, 2 changes will be made.

- a) A new instruction will be added load++ which will (i) copy a piece of data from memory into a register, and (ii) update the register value that is currently used to calculate a memory address (such that a separate instruction is not required to calculate the address of the next data element). (7 Marks)

The new load++ instruction:

- Requires 8 CCs to execute – just like the original load instruction.
- Will lead to the elimination of 70,000,000 (or  $7 \times 10^7$ ) ALU instructions.
- b) A change will be made to the existing datapath to reduce the number of clock cycles associated with branch / jump instructions from 5 to 4. (3 Marks)

Assuming a constant clock rate of 2 GHz, what speedup is obtained if both changes are implemented?
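Because the clock rate is constant, the speedup reduces to the ratio of total clock cycles before and after the changes. The sketch below assumes that every original load is replaced one-for-one by a load++ (so the load count stays at 200,000,000) and that the only other effects are the removed ALU instructions and the cheaper branches; those interpretations, and the names used, are assumptions rather than part of the question.

```python
def total_cycles(classes):
    """Sum of (cycles per instruction x instruction count) over all classes."""
    return sum(cycles * count for cycles, count in classes.values())


baseline = {
    "ALU": (6, 1_000_000_000),
    "Load": (8, 200_000_000),
    "Store": (7, 180_000_000),
    "Branch/Jump": (5, 140_000_000),
}
improved = {
    "ALU": (6, 1_000_000_000 - 70_000_000),  # change (a): fewer ALU instructions
    "Load++": (8, 200_000_000),              # assumed one-for-one replacement
    "Store": (7, 180_000_000),
    "Branch/Jump": (4, 140_000_000),         # change (b): 5 -> 4 cycles
}

CLOCK_HZ = 2e9
old_cycles = total_cycles(baseline)
new_cycles = total_cycles(improved)
print(f"old time = {old_cycles / CLOCK_HZ:.2f} s, new time = {new_cycles / CLOCK_HZ:.2f} s")
print(f"speedup = {old_cycles / new_cycles:.4f}")
```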

05. a) A famous program runs in 10 s on computer A, which has a 2 GHz clock. We are trying to help a computer designer build a computer B that will run this program in 6 s. The designer has determined that a substantial increase in the clock rate is possible, but this increase will cause computer B to require 1.2 times as many clock cycles as computer A. What clock rate should the designer target? (5 Marks) [10]

- b) If the CPU clock rate of a computer is 1 MHz and a program takes 45 million cycles to execute, what is the CPU time? (5 Marks)

06. a) A cache is organized in direct-mapped manner with the following parameters:

Main memory size 32K words; Cache size 512 words; Block size 64 words.

(i) How many bits are there in a main memory address? (2 Marks)

(ii) How many bits are there in each of the TAG, BLOCK and WORD fields and also find the Tag Directory Size? (3 Marks)

b) Suppose that in 1000 memory references there are 50 misses in the first-level cache and 20 misses in the second-level cache. What are the various miss rates? Assume the miss penalty from the L2 cache to memory is 100 clock cycles, the hit time of the L2 cache is 10 clock cycles, the hit time of L1 is 1 clock cycle, and there are 2 memory references per instruction. What is the average memory access time? (5 Marks)

07. a) Differentiate between synchronous and asynchronous buses. How does a read operation happen in both types of buses? Explain with the help of timing diagrams. (5 Marks) [10]

- b) "RAID disks offer excellent performance and large, reliable storage." Justify this statement with reference to the various RAID levels. (5 Marks)

08. a) Consider a computer system with DMA support. The DMA module is transferring one 8-bit character in one CPU cycle from a device to memory through *cycle stealing* at regular intervals. Consider a 2 MHz processor. If 0.5% processor cycles are used for DMA, find the data transfer rate in bits per second? (5 Marks) [10]
- b) Consider a system employing interrupt-driven I/O for a particular device that transfers data at an average of 8 KB/sec on a continuous basis. Interrupt processing takes about 100  $\mu$s, i.e. the time to jump to the ISR, execute it, and return to the main program. What fraction of processor time is consumed by this I/O device if an interrupt occurs for every byte? (5 Marks)
09. a) Suppose we want to transfer a data 1101011011 and protect it from errors using the CRC polynomial  $x^4+x+1$ . Use Polynomial long division to determine the message that should be transmitted. Corrupt the leftmost second bit of the transmitted message and show that the error is detected by the receiver using CRC technique. (7 Marks) [10]
- b) During Second World War the Germans sent a coded message 10011010 (Using Hamming Code) from Berlin to their spy in Boston for blasting a submarine in a particular location in the sea. The message was caught in-between by a US spy. Help the spy to interpret the codeword generated for this data. (3 Marks)
10. a) Consider two arrays A and B that store integers. You want to sum the elements in A and B and store the result in array C, i.e. compute  $C[i] = A[i] + B[i]$  in a loop. For the given scenario, identify which class of Flynn's classification is suitable. Justify your choice and explain with a proper diagram. (7 Marks) [10]
- b) If the time taken for the 5 stages of a processor are 1ns, 1.5ns, 5ns, 3.5ns, and 0.5ns respectively, what is the best speedup you can get with pipelining compared to the original processor without pipelining? (3 Marks)
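For part (b), the best-case speedup compares the unpipelined time per instruction (the sum of the stage times) with the pipelined cycle time (the slowest stage). The sketch below assumes no latch overhead and no stalls, which the question does not mention; passing an instruction count shows how the speedup approaches that limit. `pipeline_speedup` is an illustrative name.

```python
def pipeline_speedup(stage_times_ns, instructions=None):
    """Speedup of an ideal pipeline over the unpipelined processor.

    With instructions=None, returns the limiting (best) speedup.
    """
    per_instruction = sum(stage_times_ns)   # unpipelined time per instruction
    cycle = max(stage_times_ns)             # pipelined clock period
    if instructions is None:
        return per_instruction / cycle
    k = len(stage_times_ns)
    pipelined_time = (k + instructions - 1) * cycle
    return instructions * per_instruction / pipelined_time


stages = [1, 1.5, 5, 3.5, 0.5]
print(pipeline_speedup(stages))          # 2.3
print(pipeline_speedup(stages, 1000))    # slightly below the limit
```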



**Final Assessment Test (FAT) - November/December 2023**

|              |                                        |             |                         |
|--------------|----------------------------------------|-------------|-------------------------|
| Programme    | B.Tech.                                | Semester    | FALL SEMESTER 2023 - 24 |
| Course Title | COMPUTER ARCHITECTURE AND ORGANIZATION | Course Code | BCSE205L                |
| Faculty Name | Prof. Nivedita M                       | Slot        | G1+TG1                  |
| Time         | 3 Hours                                | Class Nbr   | CH2023240100672         |
|              |                                        | Max. Marks  | 100                     |

### **PART - A (7 X 10 Marks)**

**Answer all questions**

01. Consider the below two architectures A and B. [10]



```
BEGIN
NUMBER counter, sum=0  
FOR counter=1 TO 100 STEP 1 DO  
    sum=sum+counter  
END FOR  
OUTPUT sum  
END
```

- a) How will the pseudo code given be executed in systems A and B? [5 Marks]

b) Discuss the step by step execution of the given pseudo code on the two systems in detail. [5 Marks]

02. Prove that multiplication of two n-digit signed numbers in base 2 gives a product of no more than  $2n$  digits with  $(-20)_{10} * (+15)_{10}$ . [10]

03. a) Illustrate the complete control sequence of the single-cycle data path and the multi-cycle data path to fetch and execute the instruction MUL R1, (R2), R3. (Assume R2 and R3 are the sources and R1 is the destination.) The control sequence should show the steps of the fetch and execute phases. (7 Marks) [10]

b) Show the pros and cons of the single-bus organization over the multi-bus organization. (3 Marks)

04. Compare the performance of two different computers, M1 and M2. The following measurements were taken for each computer: [10]

- a) Which computer is faster in terms of execution time for each program and by what ratio? (3 Marks)

- b) Find the instruction execution rate (instructions per second) for each computer while executing program L. (3 Marks)
- c) The clock rates for M1 and M2 are 3 GHz and 5 GHz respectively. Find the CPI for program L on both machines. (4 Marks)
05. A cache has a 64 KB capacity and 128-byte lines, and is 4-way set associative. The system containing the cache has a 32-bit address. (2.5 Marks each) [10]
- How many bits of tag are required in each entry in the tag array?
- Find the tag directory size.
- If the cache is write-through, how many bits are required for each entry in the tag array and how much storage is required for the tag array if an LRU replacement policy is used?
- If the cache is write-back, how many bits are required for each entry in the tag array?
06. a) Identify the data transfer technique used for writing a byte data from the processor to a peripheral device which is not controlled by a common clock pulse and explain its functionality using appropriate timing diagrams. (5 Marks)
- b) Consider a system in which bus cycles take 500 ns. Transfer of bus control in either direction, from processor to I/O device or vice versa, takes 250 ns. One of the I/O devices has a data transfer rate of 50 KB/s and employs DMA. Data are transferred one byte at a time.
  - Suppose we employ DMA in burst mode, for how long would the device tie up the bus when transferring a block of 128 bytes? (2.5 Marks)
  - Repeat the calculation for cycle-stealing mode. (2.5 Marks)
07. Consider four peripheral devices P1, P2, P3, P4 connected to a Processor with P1 having highest priority and P4 having lowest priority. Analyze the following bus arbitration methods with respect to communication reliability in the event of P1 failure. [10]
- Daisy Chaining (4 Marks)
- Polling (3 Marks)
- Independent Request (3 Marks)

#### PART - B (2 X 15 Marks)

**Answer all questions**

08. a) Consider a 4-drive, 200 GB per drive RAID array. What is the available data storage capacity for each of the RAID levels 0, 1, 3, 4, 5 and 6? Justify your answer. (10 Marks) [15]
- b) A manufacturer wishes to design an array of hard disks in a server with a capacity of 512 GB or more. If the technology used to manufacture the disk allows 2048-byte sectors, 4096 sectors/track and 8192 tracks/platter, how many disks are required, assuming two platters/disk? (5 Marks)
- Q9. a) Identify and explain the dependencies in the following code when executed in a 5-stage pipelined processor. How will you resolve the dependencies? Sketch the pipelining stages and identify the stall(s) if any. (10 Marks) [15]

add R3, R4, R2  
sub R5, R3, R1  
load R6, 200(R3)  
add R7, R3, R6

Note: In the above instruction format, first operand is the destination and other operands are source operands.
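
A small sketch of how the read-after-write (RAW) dependencies in this sequence could be enumerated mechanically. It follows the note above (first operand is the destination) and treats the base register inside `200(R3)` as a source. The helper names are illustrative, and the sketch lists RAW dependencies only; it does not model stalls or forwarding.

```python
import re

PROGRAM = [
    "add R3, R4, R2",
    "sub R5, R3, R1",
    "load R6, 200(R3)",
    "add R7, R3, R6",
]


def parse(instruction):
    """Split an instruction into (destination, [sources]) register names."""
    registers = re.findall(r"R\d+", instruction)
    return registers[0], registers[1:]


def raw_dependencies(program):
    """Return (producer, consumer, register) triples, instructions numbered from 1."""
    dependencies = []
    last_writer = {}                      # register -> instruction that last wrote it
    for index, instruction in enumerate(program, start=1):
        destination, sources = parse(instruction)
        for source in sources:
            if source in last_writer:
                dependencies.append((last_writer[source], index, source))
        last_writer[destination] = index
    return dependencies


for producer, consumer, register in raw_dependencies(PROGRAM):
    print(f"I{consumer} reads {register} written by I{producer}")
```
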

- b) Given the following code snippet, identify the suitable Flynn's taxonomy classification and explain it. (5 Marks)

## Final Assessment Test (FAT) - November/December 2023

|              |                                        |             |                         |
|--------------|----------------------------------------|-------------|-------------------------|
| Programme    | B.Tech.                                | Semester    | FALL SEMESTER 2023 - 24 |
| Course Title | COMPUTER ARCHITECTURE AND ORGANIZATION | Course Code | BCSE205L                |
| Faculty Name | Prof. B V A N S S Prabhakar Rao        | Slot        | F2+TF2                  |
|              |                                        | Class Nbr   | CH2023240100883         |
| Time         | 3 Hours                                | Max. Marks  | 100                     |

## PART - A (10 X 10 Marks)

Answer all questions

- Q1. i. Discuss how the stored-program concept works for the following instructions, which adhere to the *Opcode source, destination* format. Explain the operational steps involved with a suitable diagram. (5 marks)  
LOAD (R12), R6  
DIV R6, R0, R8  
ii. Comment on the speed of operation in processors that use the Von Neumann architecture and the Harvard architecture. Draw suitable diagrams to justify your inference. (5 marks)
- Q2. i. Perform the multiplication  $(-21)_{10} \times (9)_{10}$  using the modified Booth algorithm. (7 marks) [10]  
ii. If the least significant bit of the Q register is 0 in every iteration, the algorithm yields a best-case outcome. Prove whether this statement is true. (3 marks)
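
For checking a hand-worked answer, the recoding step of the modified (radix-4) Booth algorithm can be sketched as below. The sketch assumes that 9 is the multiplier, −21 is the multiplicand, and a 6-bit two's-complement multiplier width; the function names are illustrative. It recodes the multiplier into digits from {−2, −1, 0, 1, 2} and sums the weighted partial products; it does not model the A/Q register shifts.

```python
def booth_radix4_digits(multiplier, bits):
    """Radix-4 Booth digits of a two's-complement multiplier, least significant first."""
    if bits % 2:
        raise ValueError("bits must be even for radix-4 recoding")
    pattern = multiplier & ((1 << bits) - 1)   # two's-complement bit pattern
    digits = []
    previous = 0                               # implied bit to the right of bit 0
    for i in range(0, bits, 2):
        low = (pattern >> i) & 1
        high = (pattern >> (i + 1)) & 1
        digits.append(-2 * high + low + previous)
        previous = high
    return digits


def booth_radix4_multiply(multiplicand, multiplier, bits):
    """Sum the partial products digit * multiplicand * 4**k."""
    product = 0
    for k, digit in enumerate(booth_radix4_digits(multiplier, bits)):
        product += digit * multiplicand * 4 ** k
    return product


print(booth_radix4_digits(9, 6))           # digits, least significant first
print(booth_radix4_multiply(-21, 9, 6))
```

A digit of 0 corresponds to a pass with no add or subtract, which is the quantity part ii. asks about.
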
- Q3. Perform the following floating-point operation on the numbers  $(357.65)_{10}$  and  $(222.75)_{10}$ . [10]  
i. Convert the above numbers into the normalized notation of binary format. (5 marks)  
ii. Perform the addition operation for the given numbers and write the normalized result in IEEE-754 double-precision format. (5 marks)
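
A hand-derived bit pattern can be cross-checked with a short sketch like the one below, which uses Python's standard `struct` module to split a double into its sign, biased exponent and fraction fields. The helper name `double_fields` is illustrative. Note that 357.65 has no exact binary representation, so the machine result is the correctly rounded double and may differ in the last fraction bits from a truncated hand calculation.

```python
import struct


def double_fields(value):
    """Return (sign, biased_exponent, fraction) of an IEEE-754 double."""
    (raw,) = struct.unpack(">Q", struct.pack(">d", value))
    sign = raw >> 63
    biased_exponent = (raw >> 52) & 0x7FF
    fraction = raw & ((1 << 52) - 1)
    return sign, biased_exponent, fraction


for number in (357.65, 222.75, 357.65 + 222.75):
    sign, exponent, fraction = double_fields(number)
    print(f"{number}: sign={sign} exponent={exponent} "
          f"(unbiased {exponent - 1023}) fraction={fraction:052b}")
```
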
- Q4. Illustrate the architectural design of a single-cycle datapath to fetch and execute the following instructions. [10]  
MOV #20, AX  
MUL (R2), AX  
MOV AX, R1  
i. Write down the micro routine control sequence steps involved with respect to the given instructions which adhere to *Opcode source, destination* format. (8 marks)  
ii. What is the significance of WMFC signal? (2 marks)
- Q5. Consider a cache of 4 lines of 16 bytes each. Main memory is divided into blocks of 16 bytes each. That is, block 0 has bytes with addresses 0 through 15, and so on. Now consider a program that accesses memory in the following sequence of addresses:  
Repeat 2 times: 63 through 70  
Once: 15 through 32; 80 through 95  
a. Suppose the cache is organized as direct mapped. Memory blocks 0, 4, and so on are assigned to line 0; blocks 1, 5, and so on to line 1; and so on. Compute the hit ratio. (5 marks)  
b. Suppose the cache is organized as fully associative (assume flexible block assignment). Compute the hit ratio using the least recently used (LRU) replacement scheme. (5 marks)
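
A hit count for both organizations can be checked with a small simulation such as the sketch below. It assumes the cache starts empty and that "Repeat 2 times" applies only to addresses 63 through 70, with the other two ranges accessed once afterwards; the function names are illustrative.

```python
from collections import OrderedDict

LINE_BYTES = 16
NUM_LINES = 4


def access_trace():
    """Byte addresses in program order, under the assumed reading of the question."""
    trace = []
    for _ in range(2):
        trace.extend(range(63, 71))    # 63 through 70, twice
    trace.extend(range(15, 33))        # 15 through 32, once
    trace.extend(range(80, 96))        # 80 through 95, once
    return trace


def direct_mapped_hits(trace):
    lines = [None] * NUM_LINES         # block currently held by each line
    hits = 0
    for address in trace:
        block = address // LINE_BYTES
        line = block % NUM_LINES
        if lines[line] == block:
            hits += 1
        else:
            lines[line] = block        # miss: load the block into its fixed line
    return hits


def fully_associative_lru_hits(trace):
    cache = OrderedDict()              # keys ordered from least to most recently used
    hits = 0
    for address in trace:
        block = address // LINE_BYTES
        if block in cache:
            hits += 1
            cache.move_to_end(block)
        else:
            if len(cache) == NUM_LINES:
                cache.popitem(last=False)   # evict the least recently used block
            cache[block] = True
    return hits


trace = access_trace()
print("direct mapped:", direct_mapped_hits(trace), "hits of", len(trace))
print("fully associative LRU:", fully_associative_lru_hits(trace), "hits of", len(trace))
```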

- Q6. i) Discuss the principle of synchronous bus operation using a clock diagram for burst-mode data transfer. How does a high-frequency clock impact the communicating devices? (5 marks)  
ii) Differentiate between strobe control and the handshake protocol. (5 marks)
- Q7. Let us assume that 00110010 is the code word that is sent, and that 00100010 is received. The receiver has to search for appropriate parities to ascertain whether the code is correct because it has no idea what was communicated.  
a) If even parity is employed, identify transmission errors using the Hamming code algorithm. (8 marks)  
b) Discuss in detail how the bit errors are corrected as per the above algorithm. (2 marks)
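
The syndrome check can be sketched as follows. The question does not state the bit numbering, so this sketch assumes positions are numbered 1 to 8 from left to right, with even-parity check bits at positions 1, 2, 4 and 8; under that assumption the sent word 00110010 passes every parity check. The function names are illustrative.

```python
def hamming_syndrome(word):
    """Sum of the parity positions whose even-parity check fails (0 means no error found)."""
    bits = [int(ch) for ch in word]
    syndrome = 0
    parity_position = 1
    while parity_position <= len(bits):
        parity = 0
        for position in range(1, len(bits) + 1):
            if position & parity_position:       # this check covers the position
                parity ^= bits[position - 1]
        if parity:                               # odd count of 1s: check fails
            syndrome += parity_position
        parity_position <<= 1
    return syndrome


def correct_single_error(word):
    """Flip the bit the syndrome points at, assuming at most one bit is wrong."""
    syndrome = hamming_syndrome(word)
    if syndrome == 0:
        return word
    if syndrome > len(word):
        raise ValueError("syndrome points outside the word")
    bits = list(word)
    bits[syndrome - 1] = "1" if bits[syndrome - 1] == "0" else "0"
    return "".join(bits)


received = "00100010"
print("syndrome:", hamming_syndrome(received))
print("corrected:", correct_single_error(received))
```
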
- Q8. Consider the following sequence of instructions organized in a 4-stage pipeline: [10]  
Add #20, R0, R1  
Mul #3, R2, R3  
And R1, R2, R4  
Add R0, R4, R5  
(Note: In all the instructions, the destination operand is given last)  
i. Identify the specific types of data dependences observed in the pipelined execution, with an appropriate 4-stage pipeline diagram. (5 marks)  
ii. Discuss the methods to eliminate the hazards with a suitable diagram. (5 marks)
- Q9. i. An e-commerce application like Flipkart demands zero downtime and maintains sensitive payment data. How will you ensure reliability, availability and redundancy of data if the server crashes? Suggest a nested RAID level suitable for this application with an appropriate diagram. (5 marks)  
ii. A small-scale industry is looking for a cost-effective storage design. Response time is vital for the day-to-day operations of this industry, and both read and write operations must be fast. Discuss a suitable RAID level architecture that meets these requirements. (5 marks)
- Q10. Consider that the processor is executing instructions in a sequence by fetching data from the memory. Simultaneously the processor receives interrupt requests from five devices. Programmed I/O is consuming more time when issuing commands for data transfer and holds the processor in data transfer while the data processing is neglected.  
i. Address this issue and suggest how to improve processor performance when high-speed peripherals require bulk transfer of data. (7 Marks)  
ii. Discuss the interrupt-driven I/O process with a suitable example. (3 Marks)

