



দূর  
মেঘের  
আলপনা

# Architecture

Supported By "দূর মেঘের আলপনা"

Adapted By Manik Hosen



□ Define Computer Architecture.

□ Computer architecture is a set of rules and methods that describe the functionality, organization, and implementation of computer systems.


# Computer Architecture

- ❑ Define Computer Organization.
- ❑ Computer Organization refers to the level of abstraction above the digital logic level, but below the operating system level.


# Computer Architecture

What are the shortcomings of IAS Computer?

Shortcomings of IAS Computer:

- Shared memory - a defective program can overwrite another in memory, causing it to crash.
- Memory leaks - some defective programs fail to release memory when they are finished with it, which could cause the computer to crash due to insufficient memory.
- Data bus speed - the CPU is much faster than the data bus, meaning it often sits idle (Von Neumann bottleneck).
- Fetch rate - data and instructions share the same data bus, even though the rate at which each needs to be fetched is often very different.

# IAS Computer

❑ A vector of 20 non-negative numbers is stored in consecutive locations beginning at location 200 in the memory of the IAS computer. Write a program using the IAS instruction set to compute the address of the largest number in this array. If several locations contain the largest number, specify the smallest address.

## IAS Instructions

| Location | Instruction                         | Comment              |
|----------|-------------------------------------|----------------------|
| 0        | 1                                   | Constant             |
| 1        | 200                                 | Address I (result)   |
| 2        | 201                                 | Address J            |
| 3        | 18                                  | COUNT                |
| 4L       | AC:=M(200)                          | Transfer A(I) to AC  |
| 4R       | AC:=AC-M(201)                       | Compute A(I)-A(J)    |
| 5L       | If AC $\geq$ 0 then go to M(7,0:19) |                      |
| 5R       | AC:=M(2)                            | Replace I by J       |
| 6L       | M(1):=AC                            |                      |
| 6R       | M(4,8:19):=AC(28:39)                | Modify address in 4L |
| 7L       | AC:=M(3)                            | Decrement COUNT      |
| 7R       | AC:=AC-M(0)                         |                      |
| 8L       | If AC $\geq$ 0 then go to M(9,0:19) | Test COUNT           |
| 8R       | Go to M(8,20:39)                    | Halt                 |
| 9L       | M(3):=AC                            | Update COUNT         |
| 9R       | AC:=M(2)                            | Increment J          |
| 10L      | AC:=AC+M(0)                         |                      |
| 10R      | M(2):=AC                            |                      |
| 11L      | M(4,28:39):=AC(28:39)               | Modify address in 4R |
| 11R      | Go to M(4,0:19)                     |                      |
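The search that the IAS program performs can be sketched in a high-level language. The following Python version is only an illustration: the function name `address_of_largest`, the dictionary used as memory, and the sample data are assumptions, not part of the IAS solution.

```python
def address_of_largest(memory, base=200, n=20):
    """Return the smallest address that holds the largest of n values."""
    best = base  # plays the role of address I (the result)
    for j in range(base + 1, base + n):  # address J walks through the vector
        # Same test as the IAS program: keep I when A(I) - A(J) >= 0,
        # so ties are resolved in favour of the smaller address.
        if memory[best] - memory[j] < 0:
            best = j
    return best


# Hypothetical data: 20 non-negative numbers stored from location 200.
memory = {200 + k: v for k, v in enumerate([3, 9, 4, 9, 1] + [0] * 15)}
result = address_of_largest(memory)
```

Here the largest value 9 appears at locations 201 and 203, and the strict comparison keeps the first one.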


# IAS Instructions

- ❑ What do you mean by 64 bit processor?
- ❑ A 64-bit processor is a microprocessor with a word size of 64 bits, a requirement for memory and data intensive applications such as computer-aided design (CAD) applications, database management systems, technical and scientific applications, and high-performance servers.

Processor

- ❑ What do you mean by 32 bit processor?
- ❑ A 32-bit processor is a microprocessor with a word size of 32 bits, so the operating system and software work with data units that are 32 bits wide. Windows 95, Windows 98, and the common editions of Windows XP are 32-bit operating systems.

Processor

- Discuss the S/360-370 data formats.
- The S/360 architecture defines formats for characters, integers, decimal integers and hexadecimal floating point numbers. Character and integer instructions are mandatory, but decimal and floating point instructions are part of the Decimal arithmetic and Floating-point arithmetic features.
  - Characters: Characters are stored as 8-bit bytes.
  - Integers: Integers are stored as two's complement binary halfword or fullword values.
  - Packed decimal: Packed decimal numbers are stored as 1-16 8-bit bytes containing an odd number of decimal digits followed by a 4-bit sign.
  - Zoned decimal: Zoned decimal numbers are stored as 1-16 8-bit bytes, each containing a zone in bits 0-3 and a digit in bits 4-7. The zone of the rightmost byte is interpreted as a sign.
  - Floating point: Floating point numbers are only stored as fullword or doubleword values on older models.
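The packed decimal layout described above can be illustrated with a short Python sketch. The function name `pack_decimal` is hypothetical, and the sign nibbles `0xC` (plus) and `0xD` (minus) are assumed here as the commonly used sign codes.

```python
def pack_decimal(value: int) -> bytes:
    """Pack an integer as decimal digit nibbles followed by a 4-bit sign."""
    digits = str(abs(value))
    if len(digits) % 2 == 0:
        # Pad to an odd digit count so digits plus sign fill whole bytes.
        digits = "0" + digits
    sign = 0xD if value < 0 else 0xC  # assumed sign codes: C = plus, D = minus
    nibbles = [int(d) for d in digits] + [sign]
    packed = bytes((nibbles[i] << 4) | nibbles[i + 1]
                   for i in range(0, len(nibbles), 2))
    if len(packed) > 16:
        raise ValueError("packed decimal fields are at most 16 bytes long")
    return packed


negative = pack_decimal(-123)  # bytes 0x12 0x3D
positive = pack_decimal(1234)  # bytes 0x01 0x23 0x4C
```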

# S/360-370

- Discuss the S/360-370 instruction formats.
- Instructions in the S/360 are two, four or six bytes in length, with the opcode in byte 0.  
Instructions have one of the following formats:
  - RR:(two bytes) Generally byte 1 specifies two 4-bit register numbers, but in some cases, e.g., SVC, byte 1 is a single 8-bit immediate field.
  - RS:(four bytes) Byte 1 specifies two register numbers, bytes 2-3 specify a base and displacement.
  - RX:(four bytes) Byte 1 bits 0-3 specifies either a register number or a modifier, byte 1 bits 4-7 specifies the number of the general register to be used as an index, bytes 2-3 specify a base and displacement.
  - SI:(four bytes) Byte 1 specifies an immediate field, bytes 2-3 specify a base and displacement.
  - SS:(six bytes). Byte 1 specifies two 4-bit length fields or one 8-bit length field, bytes 2-3 and 4-5 each specify a base and displacement. The encoding of the length fields is length-1.
- Instructions must be on a two-byte boundary in memory; hence the low-order bit of the instruction address is always 0.

□ Draw the structure of an I/O Processor.




# I/O Processor


- ❑ Discuss the structure of an I/O Processor.
- ❑ The Input Output Processor (IOP) is similar to a CPU and handles the details of I/O operations. It has more facilities than a typical DMA controller. The IOP can fetch and execute its own instructions, which are specifically designed to characterize I/O transfers. In addition to I/O-related tasks, it can perform other processing tasks such as arithmetic, logic, branching, and code translation. The main memory unit plays the pivotal role: the IOP is a specialized processor that loads and stores data in memory while executing I/O instructions. It acts as an interface between the system and the devices, carrying out the sequence of events needed to execute an I/O operation and then storing the results in memory.

# I/O Processor

- ❑ Discuss the overview of CPU.
- ❑ Central Processing Unit (CPU) consists of the following features:
  - CPU is considered as the brain of the computer.
  - CPU performs all types of data processing operations.
  - It stores data, intermediate results, and instructions (program).
  - It controls the operation of all parts of the computer.
- ❑ CPU itself has following three components:
  - Memory or Storage Unit
  - Control Unit
  - ALU(Arithmetic Logic Unit)

CPU


## □ Draw Flowchart of CPU Organization.



**CPU**

- Discuss components of CPU.
- Memory Unit: This unit can store instructions, data, and intermediate results. This unit supplies information to other units of the computer when needed. Functions of the memory unit are:
  - It stores all the data and the instructions required for processing.
  - It stores intermediate results of processing.
  - It stores the final results of processing before these results are released to an output device.
  - All inputs and outputs are transmitted through the main memory.
- Control Unit: This unit controls the operations of all parts of the computer but does not carry out any actual data processing operations. Functions of this unit are:
  - It is responsible for controlling the transfer of data and instructions among other units of a computer.
  - It manages and coordinates all the units of the computer.
  - It obtains the instructions from the memory, interprets them, and directs the operation of the computer.
  - It communicates with Input/Output devices for transfer of data or results from storage.
- ALU (Arithmetic Logic Unit): This unit consists of two subsections namely:
  - Arithmetic Section: Function of arithmetic section is to perform arithmetic operations like addition, subtraction, multiplication, and division.
  - Logic Section: Function of logic section is to perform logic operations such as comparing, selecting, matching, and merging of data.

CPU

## □ Draw Block Diagram of CPU



**CPU**

- ❑ Discuss the register level design of a typical CPU.
- ❑ At the register level, the CPU has a register file in the DPU for data and/or address storage. The ALU obtains most of its operands from the register file and also stores most of its results there. A status register monitors the output of the ALU and other key points. The principal special-purpose address registers are the program counter and the stack pointer. Special circuits are included for address computation, although the main ALU can also be used for this purpose. The control circuits in the PCU derive their inputs from the instruction register, which stores the opcode of the current instruction, and from the status register. Communication with the outside world is via a system bus that transmits address, data, and control information among the CPU, memory, and the IO system. Various nonprogrammable "buffer" registers serve as temporary storage points between the system bus and the CPU.

# Register Level CPU

□ Draw block diagram of register level CPU.




Register Level CPU

- What do you mean by Normalization?
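- In floating-point arithmetic, normalization is the process of shifting the mantissa and adjusting the exponent so that the leading digit of the mantissa is nonzero. This gives every nonzero number a unique representation and keeps the maximum number of significant digits.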
- What do you mean by Bias Exponents?
- In floating-point arithmetic, a biased exponent is the result of adding some constant (called the bias) to the exponent chosen to make the range of the exponent nonnegative. Biased exponents are particularly useful when encoding and decoding the floating-point representations of subnormal numbers.
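A small Python sketch can make the bias concrete. It assumes an 8-bit exponent field with a bias of 127 (the IEEE 754 single-precision choice); the function names are hypothetical.

```python
BIAS = 127  # assumption: 8-bit exponent field, as in IEEE 754 single precision


def encode_exponent(true_exponent: int) -> int:
    """Add the bias so the stored exponent field is a nonnegative integer."""
    stored = true_exponent + BIAS
    if not 0 <= stored <= 255:
        raise ValueError("exponent does not fit in the 8-bit field")
    return stored


def decode_exponent(stored: int) -> int:
    """Subtract the bias to recover the true exponent."""
    return stored - BIAS


smallest = encode_exponent(-126)  # stored as 1
zero = encode_exponent(0)         # stored as 127
```

Because the stored values are nonnegative, two exponents can be compared as plain unsigned integers.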

Bias

- ❑ What is tag?
- ❑ A tag is a field attached to a data item that describes the type of the data, how it is to be interpreted, and, if it is a reference, the type of the object it points to.

Tag

Block diagram of an 8x8 bit fixed point binary multiplier using full adder.

# Multiplier

B: A multifunction cell capable of addition, subtraction and no operation  
C: A functional cell to generate control input signal.



- ❑ Discuss the operation of 8x8 multiplier.
- ❑ The multiplier is implemented by a combinational array that requires a multifunction cell B capable of addition, subtraction, and no operation. Its functions are selected by a pair of control lines H and D. A second cell type, C, generates the control signals H and D required by the B cells: cell C compares $x_i$ with $x_{i-1}$ and generates the corresponding values of H and D. The outputs of these cells form the product bits $P_i$.

# Multiplier

□ Draw the diagram of 4 bit carry lookahead adder.



# Carry Lookahead Adder

- ❑ Discuss 4-bit Carry Lookahead Adder.
- ❑ The carry lookahead adder effectively combines sets of four $x_i, y_i$ inputs into groups that are added via carry lookahead. The results computed by the various groups are then linked by ripple carries. Each group outputs a 4-bit sum and a carry bit produced by the carry lookahead logic.
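The lookahead idea can be sketched in Python using the standard generate ($g_i = x_i y_i$) and propagate ($p_i = x_i \oplus y_i$) signals. This is an illustrative bit-level model, not a hardware description; the function name `cla_add4` is hypothetical.

```python
def cla_add4(x: int, y: int, c0: int = 0):
    """Add two 4-bit numbers, computing every carry directly from g, p and c0."""
    xb = [(x >> i) & 1 for i in range(4)]
    yb = [(y >> i) & 1 for i in range(4)]
    g = [a & b for a, b in zip(xb, yb)]  # generate: this stage creates a carry
    p = [a ^ b for a, b in zip(xb, yb)]  # propagate: this stage passes a carry on
    # Each carry is a two-level expression; none waits for the previous carry.
    c1 = g[0] | (p[0] & c0)
    c2 = g[1] | (p[1] & g[0]) | (p[1] & p[0] & c0)
    c3 = g[2] | (p[2] & g[1]) | (p[2] & p[1] & g[0]) | (p[2] & p[1] & p[0] & c0)
    c4 = (g[3] | (p[3] & g[2]) | (p[3] & p[2] & g[1])
          | (p[3] & p[2] & p[1] & g[0]) | (p[3] & p[2] & p[1] & p[0] & c0))
    carries = [c0, c1, c2, c3]
    total = sum((p[i] ^ carries[i]) << i for i in range(4))
    return total, c4


sum_bits, carry_out = cla_add4(0b1011, 0b0110)  # 11 + 6 = 17
```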

# Carry Lookahead Adder

- ❑ How are the active control signals generated during an add operation $A:=A+B$?
- ❑ The CU must activate the following three types of control signals during the clock cycle in which the ADD A,B instruction is executed:
  - ❑ Function select: Add.
  - ❑ Storage control: Read A, Read B, Write A.
  - ❑ Data routing: Select p-t, Select u-w, Select v-x.
- ❑ There is usually some feedback of control information from the DP to the CU to indicate exceptional conditions encountered during instruction execution. In this example, the functional unit performing the addition sends an overflow signal to the CU whenever the sum $A+B$ exceeds the normal word size. The operation is executed in a single clock cycle.


# Control Signals

Write down the algorithm to calculate greatest common divisor (GCD) of two numbers.

```
gcd(in: X,Y; out: Z);
  register XR, YR, TEMPR;
  XR := X;
  YR := Y;
  while XR>0 do begin
    if XR≤YR then begin
      TEMPR := YR;
      YR := XR;
      XR := TEMPR;
    end
    XR := XR - YR;
  end
  Z := YR;
end gcd;
```
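For comparison, the same subtract-and-swap procedure can be written as a Python sketch. The variable names mirror the registers XR and YR; as with the procedure above, it assumes Y is positive, because with Y = 0 and X > 0 the loop never terminates.

```python
def gcd(x: int, y: int) -> int:
    """Greatest common divisor by repeated subtraction and swapping."""
    xr, yr = x, y
    while xr > 0:
        if xr <= yr:
            # Swap so that XR holds the larger value before subtracting.
            xr, yr = yr, xr
        xr = xr - yr
    return yr


z = gcd(48, 18)
```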


# GCD Algorithm

❑ Hardware needed to generate control signals to implement GCD procedure.




GCD Procedure

- ❑ Discuss the hardware implementation of GCD procedure.
- ❑ The control unit (CU) generates the control signals Load XR and Load YR to load each register independently with the input data X and Y. A control signal Select XY routes X and Y to XR and YR, respectively. Another signal, Swap, controls the swap operation, which requires routing the outputs of the XR and YR registers to each other's inputs. A final signal, Subtract, is assumed to control the subtraction $XR := XR - YR$ by routing the output of the subtractor to XR. The input signals to the CU are an asynchronous Reset signal, two comparison signals ( $XR \geq YR$ ) and ( $XR > 0$ ) generated by the DP, and the usual, implicit clock signal.

# GCD Procedure

- ❑ What is cache memory?
- ❑ A cache is a smaller, faster memory, closer to a processor core, which stores copies of the data from frequently used main memory locations. It reduces the average cost to access data from the main memory.

# Cache Memory

- ❑ Why is cache memory used in a computer?
- ❑ Cache memory is important because it provides data to a CPU faster than main memory, which increases the processor's speed. The alternative is to get the data from RAM, or random access memory, which is much slower. Cache memory is also often called CPU memory and it is usually physically located on the CPU. The data that is stored in cache is usually the data and commands most often used by the CPU. It is a very fast way to serve data to the processor, but the size of memory cache is limited.

# Cache Memory

- ❑ What do you mean by performance trade-offs for choosing memory devices?
- ❑ A performance trade-off in choosing memory devices is a situation in which improving one characteristic of the memory, such as access speed, comes at the expense of another, such as cost or capacity.

## Performance Trade-off

□ Draw a typical RAM along with its major external connections.



RAM


- ❑ Discuss the RAM design with its major external connections.
- ❑ A RAM IC typically contains all required access circuitry, including address decoders, drivers and control circuits. WE is the write enable line, a memory write (read) operation takes place if  $WE = 1$  (0). A second control line, the chip select line CS, triggers a memory operation. A word is accessed for either reading or writing only when CS is activated. This line signals that the data bus has a word ready to be written into the RAM or, in the case of a read operation, the data bus is ready to receive a data word. It has a bidirectional data bus D, which is directly wired to all addressable storage locations, and so it requires a third control line, output enable OE. In write operation OE is deactivated ( $OE = 0$ ) and in read operation OE is activated ( $OE = 1$ ) so that only the addressed memory location transfers its data to D.

**RAM**

□ Draw the structure of a fixed point ALU.




ALU

- Describe the structure of fixed point ALU.
- In a fixed-point ALU, three one-word registers are used for operand storage: the accumulator AC, the multiplier-quotient register MQ, and the data register DR. AC and MQ are organized as a single register AC.MQ capable of left and right shifting. Additional data processing is provided by a combinational ALU capable of addition, subtraction, and logical operations. This unit derives its inputs from AC and DR and places its results in AC. The MQ register stores the multiplier during multiplication and the quotient during division. DR stores the multiplicand or divisor, while the result is stored in AC.


ALU

- ❑ What do you mean by bit slice ALU?
- ❑ Bit slice ALU means a combined ALU where each module contains an ALU (arithmetic-logic unit) usually capable of handling a 4-bit field.

## Bit Sliced ALU

- ❑ Discuss how a 16-bit bit sliced ALU is designed by 4-bit ALU slices.
- ❑ To design a 16-bit ALU from 4-bit ALU slices, the data buses and register files of the individual slices are effectively juxtaposed to increase their size from 4 to 16 bits. The control lines that select and sequence the operations to be performed are connected to every slice so that all slices execute the same actions in lockstep with one another. Each slice thus performs the same operation on a different 4-bit part of the input operands and produces only the corresponding part of the results. The required control signals are derived from an external control unit, which can be hardwired or microprogrammed. Certain operations require information to be exchanged between slices. For example, to implement a shift operation, each slice must be able to send a bit to, and receive a bit from, its left or right neighbor. Similarly, when performing addition or subtraction, carry bits must be transmitted between neighboring slices.
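The carry exchange between slices can be sketched in Python. This is only a behavioral model of addition; the names `alu_slice_add` and `add16` are hypothetical.

```python
def alu_slice_add(a4: int, b4: int, carry_in: int):
    """One 4-bit slice: add two 4-bit fields plus the incoming carry."""
    total = a4 + b4 + carry_in
    return total & 0xF, total >> 4  # 4-bit partial result, carry to next slice


def add16(a: int, b: int):
    """Build a 16-bit adder from four identical 4-bit slices."""
    result, carry = 0, 0
    for k in range(4):  # slice 0 handles the least significant 4 bits
        a4 = (a >> (4 * k)) & 0xF
        b4 = (b >> (4 * k)) & 0xF
        s4, carry = alu_slice_add(a4, b4, carry)  # carry passes to the neighbor
        result |= s4 << (4 * k)
    return result, carry


total, carry_out = add16(0x0FFF, 0x0001)
```

Every slice runs the same function on its own 4-bit field, and the only coupling between neighbors is the carry bit.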

## Bit Sliced ALU

- Define the different types of bus.
- There are three types of bus:
  - Address Bus: The address bus carries memory addresses from one component to another.
  - Control Bus: The control bus carries the signals that coordinate and manage the activities of the motherboard components.
  - Data Bus: The data bus transfers data between peripherals, memory, and the CPU.

**Bus**

❑ Why do we use tri-state buffer for bus interfacing?

❑ We use tri-state buffer logic for bus interfacing because:

- They greatly increase the fan-in and fan-out limits of bus lines, permitting very large numbers of devices to be attached to the same line.
- They support bidirectional transmission over the bus by allowing the same bus connection to serve as an input port and as an output port at different times.

# Bus interfacing

- ❑ What do you mean by DMA?
- ❑ Direct memory access (DMA) is a method that allows an input/output (I/O) device to send or receive data directly to or from the main memory, bypassing the CPU to speed up memory operations.

**DMA**

## ❑ The circuitry required for Direct Memory Access (DMA).



DMA

- ❑ Discuss the circuitry required for DMA operation.
- ❑ Assuming that all access to main memory is via a shared system bus, DMA is implemented by connecting the IO device to the system bus through a special interface circuit, a DMA controller, which contains a data buffer register IODR, as in the programmed IO case. It also controls an address register IOAR and a data count register DC. These registers enable the DMA controller to transfer data to or from a contiguous region of memory. IOAR stores the address of the next word to be transferred. It is automatically incremented or decremented after each word transfer. The data counter stores the number of words that remain to be transferred. It is automatically decremented after each transfer and tested for zero. When the data count reaches zero, the DMA transfer halts. The DMA controller is normally provided with an interrupt capability, in which case it sends an interrupt to the CPU to signal the end of the IO data transfer.
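The roles of IOAR and DC can be sketched with a small Python model. The class and method names are hypothetical, memory is modeled as a plain list, and only the incrementing, device-to-memory case is shown.

```python
class DMAController:
    """Behavioral model of the IOAR and DC registers during a block transfer."""

    def __init__(self, start_address: int, word_count: int):
        self.ioar = start_address  # address of the next word to transfer
        self.dc = word_count       # number of words still to be transferred
        self.interrupt = False     # raised when the transfer is complete

    def transfer_word(self, memory: list, word: int) -> bool:
        """Write one word from the device into memory; return False when done."""
        if self.dc == 0:
            return False
        memory[self.ioar] = word
        self.ioar += 1             # incremented after each word transfer
        self.dc -= 1               # decremented and tested for zero
        if self.dc == 0:
            self.interrupt = True  # signal the end of the transfer to the CPU
        return True


memory = [0] * 8
dma = DMAController(start_address=2, word_count=3)
for word in [10, 20, 30, 40]:      # the fourth word is refused: DC is zero
    dma.transfer_word(memory, word)
```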

**DMA**

❑ What are the architectural limitations of first generation computers?

❑ The major limitations of first generation computers

1. The operating speed was very low.
2. Power consumption was very high.
3. They required large space for installation.
4. The Programming capability was quite low.
5. They were quite large, generated a lot of heat, and required special housing.
6. The Medium internal store.


1G Computer

❑ Which limitations of the first generation were solved by the second generation?

- Smaller in size compared to the first generation of computer.
- Used less energy and were not heated as much as the first one.
- Better speed and could calculate data in microseconds.
- Better portability as compared to the first generation.

## 2G Computer

□ What are the limitations of second generation computer?

- Cooling system was required.
- Constant maintenance was required.
- Commercial production was difficult.
- Only used for specific purposes.
- Costly and not versatile.
- Punch cards were used for input.


2G Computer

## ❑ What are the limitations of third generation computers?

- They used Integrated Circuits, popularly known as chips.
- These computers were smaller than the second-generation computers.
- Capacities of main memory were greatly enlarged.
- They used an operating system that allowed machines to run many different programs simultaneously.
- Power requirement became less.
- Maintenance of IC required sophisticated technology.


# 3G Computer

## ❑ What are the advantages of second generation computers?

- Smaller in size as compared to the first generation computer.
- The second generation computers were more reliable.
- Used less energy and were not heated as much as first generation computer.
- Wider commercial use.
- Better portability as compared to the first generation computers.
- Better speed and could calculate data in microseconds.
- Used faster peripherals like tape drives, magnetic disk etc.
- Used assembly language as well as machine language.
- Accuracy improved.


2G Computer

❑ What are the advantages of third generation computer?

- Smaller in size as compared to previous generations.
- More reliable.
- Used less energy.
- Produced less heat as compared to the previous two generations of computers.
- Better speed and could calculate data in nanoseconds.
- Used a fan for heat discharge to prevent damage.
- Maintenance cost was low because hardware failure is rare.
- Totally general purpose.
- Could be used for high-level language.
- Higher storage capacity than the previous generation's computers.
- Versatile to an extent.
- Less expensive.
- More accurate than previous.
- Used mouse and keyboard for input.


3G Computers

❑ Let A and B be two vectors comprising 325 numbers each that must be added in pairs to compute vector C, such that $C(I) := A(I) + B(I)$, where $I=1,2,\dots,325$. Write a program using IAS instructions that performs this computation, given that A, B, and C are stored sequentially beginning in locations 1001, 2001, and 3001 respectively.

# IAS Instructions

| Location | Instruction                          | Comment                            |
|----------|--------------------------------------|------------------------------------|
| 0        | 324                                  | Constant (count N)                 |
| 1        | 1                                    | Constant                           |
| 2        | 1000                                 | Constant (spacing between A, B, C) |
| 3L       | AC:=M(1325)                          | Load A(I) into AC                  |
| 3R       | AC:=AC+M(2325)                       | Compute A(I) + B(I)                |
| 4L       | M(3325):=AC                          | Store sum C(I)                     |
| 4R       | AC:=M(0)                             | Load count N into AC               |
| 5L       | AC:=AC-M(1)                          | Decrement count N by one           |
| 5R       | If $AC \geq 0$ then go to M(6,20:39) | Test N if nonnegative branch to 6R |
| 6L       | Go to M(6,0:19)                      | Halt                               |
| 6R       | M(0):=AC                             | Update count N                     |
| 7L       | AC:=AC+M(1)                          | Increment AC by one                |
| 7R       | AC:=AC+M(2)                          | Modify address in 3L               |
| 8L       | M(3,8:19):=AC(28:39)                 |                                    |
| 8R       | AC:=AC+M(2)                          | Modify address in 3R               |
| 9L       | M(3,28:39):=AC(28:39)                |                                    |
| 9R       | AC:=AC+M(2)                          | Modify address in 4L               |
| 10L      | M(4,8:19):=AC(28:39)                 |                                    |
| 10R      | go to M(3,0:19)                      | Branch to 3L                       |
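The loop that the IAS program implements can be sketched in Python. The base addresses 1001, 2001, and 3001 are an assumption chosen to agree with the initial addresses 1325, 2325, and 3325 in the table; the function name and the sample data are hypothetical.

```python
def vector_add(memory, n=325, a_base=1001, b_base=2001, c_base=3001):
    """Compute C(I) := A(I) + B(I), starting from the last element."""
    # The IAS program starts at I = 325 and counts down, so do the same here.
    for offset in range(n - 1, -1, -1):
        memory[c_base + offset] = memory[a_base + offset] + memory[b_base + offset]
    return memory


# Hypothetical data: A(I) = I - 1 and B(I) = 2 * (I - 1).
memory = {}
for i in range(325):
    memory[1001 + i] = i
    memory[2001 + i] = 2 * i
vector_add(memory)
```

Where the IAS program must rewrite the address fields of its own instructions on every pass, the high-level version simply changes an index.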


# IAS Instructions

- ❑ What is HDL?
- ❑ Hardware description language (HDL) is a specialized computer language used to program electronic and digital logic circuits. The structure, operation and design of the circuits are programmable using HDL.

**HDL**

- ❑ What are the advantages of HDL?
- ❑ Hardware description languages such as VHDL have several advantages. They can provide precise, technology-independent descriptions of digital circuits at various levels of abstraction, primarily the gate and register levels. Consequently, they are widely used for documentation purposes. Like programming languages, HDLs can be processed by computers and so are suitable for use with computer-aided design (CAD) programs.

**HDL**

- ❑ What are the disadvantages of HDL?
- ❑ HDL descriptions are often long and verbose. They lack the intuitive appeal and rapid insights that circuit diagrams and less formal descriptive methods provide.

- Give a VHDL description of a half adder.

```
-- Interface: two 1-bit inputs and two 1-bit outputs.
entity half_adder is
    port(x, y: in bit; sum, carry: out bit);
end half_adder;

-- Behavior: sum is the XOR of the inputs, carry is their AND.
architecture behavior of half_adder is
begin
    sum <= x xor y;
    carry <= x and y;
end behavior;
```

HDL

## ❑ Structure of IBM 7094 computer:



IBM 7094

- Explain the internal architecture of IBM 7094 computer.
- The IBM 7094 differs from the IAS computer in several ways, the most important of which is the use of data channels. A data channel is an independent I/O module with its own processor and instruction set. In a computer system with such devices, the CPU does not execute detailed I/O instructions. Such instructions are stored in main memory to be executed by a special-purpose processor in the data channel itself. The CPU initiates an I/O transfer by sending a control signal to the data channel, instructing it to execute a sequence of instructions in memory. The data channel performs its task independently of the CPU and signals the CPU when the operation is complete. This arrangement relieves the CPU of a considerable processing burden. Another new feature is the multiplexor, which is the central termination point for data channels, the CPU, and memory. The multiplexor schedules access to the memory from the CPU and data channels, allowing these devices to act independently.

# IBM 7094

## Block diagram of a simple accumulator based CPU:




CPU

- Discuss the structure of a simple accumulator based CPU.
- This organization is typical of first-generation computers and low-cost microcontrollers. Instructions are fetched by the program control unit (PCU), whose main register is the program counter PC. They are executed in the DPU, which contains an n-bit ALU and two data registers, AC and DR.

CPU

- ❑ What are the differences between ripple carry adder and carry look ahead adder?
- ❑ Ripple Carry Adder: A ripple carry adder passes its carry bit through a long logic chain, which is very straightforward to design, but can have a very large delay.
- ❑ Carry Look-Ahead Adder: A carry look-ahead adder includes additional logic that decodes the inputs directly to determine the carry output of a group of adders. This special decoding provides an alternate and faster path for the carry information.

**Adders**

## Ripple Carry Adder



## Carry Look Ahead Adder




# Adders

- ❑ Explain IEEE 754 standard floating point number format.
- ❑ The IEEE 754 single-precision format is 32 bits wide. It comprises a 23-bit mantissa field M, an 8-bit exponent field E, and a sign bit S. The base B is two. As in all signed binary number formats, both fixed point and floating point, S occupies the leftmost bit position. M is a fraction that, together with S, forms a sign-magnitude binary number.
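The three fields can be inspected with Python's standard `struct` module. The helper name `decode_float32` is hypothetical; the sketch only splits the 32-bit pattern into S, E, and M and does not interpret special values.

```python
import struct


def decode_float32(value: float):
    """Split a single-precision number into its S, E and M bit fields."""
    # Pack as a big-endian 32-bit float, then reread the same bytes as an integer.
    (bits,) = struct.unpack(">I", struct.pack(">f", value))
    s = bits >> 31            # 1-bit sign, leftmost position
    e = (bits >> 23) & 0xFF   # 8-bit exponent field (stored with a bias of 127)
    m = bits & 0x7FFFFF       # 23-bit mantissa field
    return s, e, m


fields = decode_float32(-6.5)  # -6.5 = -1.625 * 2**2
```

For -6.5 the exponent field holds 2 + 127 = 129 and the mantissa field holds the fraction 0.625, that is `0x500000`.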

# Number Format

- ❑ Why do we need overflow-underflow indicator?
- ❑ When an operation on two validly represented numbers produces a result that is too large or too small to be represented in the same format, the output is wrong unless the overflow or underflow is indicated.
- ❑ How does it work?
- ❑ Overflow is detected by dedicated circuits for signed and unsigned numbers. For unsigned addition, a carry out of the most significant bit indicates overflow. For two's-complement addition, overflow occurs when the carry into the sign position differs from the carry out of it. The circuit reports the condition on an overflow or underflow output.
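As an illustration of overflow detection for addition, the following Python sketch models an 8-bit adder with two flags. The function name and the 8-bit width are assumptions; the signed test uses the equivalent rule that overflow occurs when both operands have the same sign and the result has the opposite sign.

```python
def add_with_flags(a: int, b: int, bits: int = 8):
    """Add two bit patterns; report the unsigned carry and the signed overflow."""
    mask = (1 << bits) - 1
    a &= mask
    b &= mask
    total = a + b
    result = total & mask
    carry_out = bool(total >> bits)  # unsigned overflow: carry out of the MSB
    sign = 1 << (bits - 1)
    # Signed overflow: operands agree in sign but the result's sign differs.
    overflow = bool(~(a ^ b) & (a ^ result) & sign)
    return result, carry_out, overflow


signed_case = add_with_flags(0x7F, 0x01)    # 127 + 1 does not fit in 8 signed bits
unsigned_case = add_with_flags(0xFF, 0x01)  # 255 + 1 does not fit in 8 unsigned bits
```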

# Overflow & Underflow

- ❑ What are the advantages of excess-3 code over BCD?
- ❑ Excess-3 codes are useful because excess-3 addition and subtraction are more straightforward to implement than in binary-coded decimal (BCD).
- ❑ Excess-3 does not use the patterns 0000 and 1111, which can be produced by a fault in a memory or a basic transmission line. It is also more difficult to write the all-zero pattern to magnetic media.
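One reason excess-3 subtraction is simple is that the code is self-complementing: inverting the four bits of a digit's code gives the code of its nine's complement. A brief Python sketch, with hypothetical function names:

```python
def to_xs3(digit: int) -> int:
    """Encode a decimal digit 0-9 as its 4-bit excess-3 code."""
    if not 0 <= digit <= 9:
        raise ValueError("expected a single decimal digit")
    return digit + 3


def nines_complement_xs3(code: int) -> int:
    """Invert all four bits; the result encodes 9 minus the original digit."""
    return code ^ 0b1111


code_for_2 = to_xs3(2)                               # 0b0101
code_for_7 = nines_complement_xs3(code_for_2)        # 0b1010, which encodes 7
```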

# XS 3 Code

## ❑ What are the advantages of BCD?

- One advantage of BCD over binary representations is that there is no limit to the size of a number. To add another digit, just add a new 4-bit sequence.
- Numbers represented in pure binary format are limited to the largest number that can be represented by 8, 16, 32 or 64 bits.
- Sometimes, the right-most nibble contains the sign (positive or negative).
- It is easier to convert decimal numbers to and from BCD than binary and, though BCD is often converted to binary for arithmetic processing, it is possible to build hardware that operates directly on BCD.


BCD

## What are the disadvantages of BCD?

- Some operations are more complex to implement. Adders require extra logic to cause them to wrap and generate a carry early. 15-20 percent more circuitry is needed for BCD add compared to pure binary. Multiplication requires the use of algorithms that are somewhat more complex than shift-mask-add (a binary multiplication, requiring binary shifts and adds or the equivalent, per-digit or group of digits is required)
- Standard BCD requires four bits per digit, roughly 20 percent more space than a binary encoding (the ratio of 4 bits to log<sub>2</sub>10 bits is 1.204). When packed so that three digits are encoded in ten bits, the storage overhead is greatly reduced, at the expense of an encoding that is unaligned with the 8-bit byte boundaries common on existing hardware, resulting in slower implementations on these systems.
- Practical existing implementations of BCD are typically slower than operations on binary representations, especially on embedded systems, due to limited processor support for native BCD operations.
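The extra adder logic mentioned above is usually a correction step: when a digit sum exceeds 9, adding 6 skips the six unused codes and forces the carry. The following Python sketch of a single BCD digit adder is illustrative only; the function name is hypothetical.

```python
def bcd_add_digit(a: int, b: int, carry_in: int = 0):
    """Add two BCD digits; return the BCD sum digit and the decimal carry."""
    total = a + b + carry_in  # plain binary addition first
    if total > 9:
        total += 6            # correction: skip the unused codes 1010-1111
    return total & 0xF, total >> 4


digit, carry = bcd_add_digit(7, 5)  # 7 + 5 = 12, i.e. digit 2 with carry 1
```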

**BCD**

## Booth's Multiplication Algorithm:

### Actions

*Initialize registers  
Set  $Q[-1]$  to 0  
Subtract  $M$  from  $A$   
Right shift  $A.Q$   
Skip add/subtract  
Right shift  $A.Q$   
Add  $M$  to  $A$   
Right shift  $A.Q$   
Skip add/subtract  
Right shift  $A.Q$   
Subtract  $M$  from  $A$   
Right shift  $A.Q$   
Skip add/subtract  
Right shift  $A.Q$   
Add  $M$  to  $A$   
Right shift  $A.Q$   
Subtract  $M$  from  $A$   
Set  $Q[0]$  to 0*
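The add, subtract, skip, and shift actions listed above follow from inspecting the bit pair $Q[0], Q[-1]$ on each step. A Python sketch of the basic algorithm is given below; the function name and the default 4-bit width are assumptions, and the most negative multiplicand is excluded because its negation does not fit in the A register.

```python
def booth_multiply(m: int, q: int, bits: int = 4) -> int:
    """Multiply two signed integers of the given width with Booth's algorithm."""
    mask = (1 << bits) - 1
    a = 0          # accumulator A
    q &= mask      # multiplier Q as a two's-complement bit pattern
    q_1 = 0        # the extra bit Q[-1]
    for _ in range(bits):
        pair = (q & 1, q_1)
        if pair == (1, 0):
            a = (a - m) & mask  # subtract M from A
        elif pair == (0, 1):
            a = (a + m) & mask  # add M to A
        # pairs (0, 0) and (1, 1): skip add/subtract
        # Arithmetic right shift of the combined register A.Q.Q[-1].
        q_1 = q & 1
        q = (q >> 1) | ((a & 1) << (bits - 1))
        a = (a >> 1) | (a & (1 << (bits - 1)))  # keep the sign bit of A
    product = (a << bits) | q
    if product >> (2 * bits - 1):               # reinterpret as signed
        product -= 1 << (2 * bits)
    return product


product = booth_multiply(3, -4)
```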


# Booth's Algorithm

- ❑ Define I/O Processor.
- ❑ The input/output processor or I/O processor is a processor that is separate from the CPU and is designed to handle only input/output processes for a device or the computer. The I/O processor is capable of performing actions without interruption or intervention from the CPU.

# I/O Processor

- Discuss the operation of a DMA controller.
- A DMA controller allows a device to transfer data directly to or from memory without intervention by the CPU. The device asks the CPU to release its data, address and control buses (a hold request), so the device is free to transfer data directly to or from memory. Data can be transferred in several different ways under DMA control. In a DMA block transfer, a data word sequence of arbitrary length is transferred in a single burst while the DMA controller is master of the memory bus. This mode is needed by secondary memories such as disk drives, where data transmission cannot be stopped or slowed without loss of data, and block transfers are the norm. Block DMA transfer supports the fastest I/O data transfer rates, but it can make the CPU inactive for relatively long periods by tying up the system bus. An alternative technique called cycle stealing allows the DMA controller to use the system bus to transfer one data word, after which it must return control of the bus to the CPU. Consequently, long blocks of I/O data are transferred by a sequence of DMA bus transactions interspersed with CPU bus transactions. Cycle stealing reduces the maximum I/O transfer rate, but it also reduces the interference of the DMA controller with the CPU's memory accesses.
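
The trade-off between block transfer and cycle stealing can be sketched with a toy bus schedule in Python. This is an illustrative model under simplifying assumptions, not a description of any real controller: every bus transaction is assumed to take one cycle, and the names `bus_schedule` and `longest_cpu_wait` are hypothetical.

```python
def bus_schedule(dma_words: int, cpu_cycles: int, mode: str) -> list[str]:
    """Return the bus owner for each bus cycle (toy model)."""
    schedule = []
    if mode == "block":
        # The DMA controller keeps the bus for the whole burst.
        schedule += ["DMA"] * dma_words
        schedule += ["CPU"] * cpu_cycles
    elif mode == "cycle_stealing":
        # The DMA controller transfers one word, then returns the bus.
        while dma_words > 0 or cpu_cycles > 0:
            if dma_words > 0:
                schedule.append("DMA")
                dma_words -= 1
            if cpu_cycles > 0:
                schedule.append("CPU")
                cpu_cycles -= 1
    else:
        raise ValueError(f"unknown mode: {mode}")
    return schedule


def longest_cpu_wait(schedule: list[str]) -> int:
    """Longest run of consecutive cycles in which the CPU is locked out."""
    longest = run = 0
    for owner in schedule:
        run = run + 1 if owner == "DMA" else 0
        longest = max(longest, run)
    return longest
```

With 4 DMA words and 4 CPU cycles, the block schedule locks the CPU out for 4 consecutive cycles, while the cycle-stealing schedule never locks it out for more than 1 cycle, at the cost of spreading the DMA transfer over a longer interval.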

# DMA Controller

- ❑ Write some names of semi-random memories. Why are they semi-random?
- ❑ Magnetic hard disks and CD-ROMs.
- ❑ Memory devices such as magnetic hard disks and CD-ROMs contain many rotating storage tracks. If each track has its own read/write head, the tracks can be accessed randomly, but access within each track is serial, so these devices are called semi-random memories.

# Semi Random Memory

- ❑ How can we implement fast RAM interfaces?
- ❑ There are two basic ways we can increase the data transfer rate across the RAM's external interface:

i. Use a bigger memory word: We can design the RAM with an internal memory word size of $w = Sn$ bits. This size permits $Sn$ bits to be accessed as a unit in one memory cycle time $T_M$. We then need fast circuits inside the RAM that, in the case of a read operation, can access an $Sn$-bit word, break it into $S$ parts, and output them to the processor, all within the period $T_M$. During write operations these circuits must accept up to $S$ $n$-bit words from the processor, assemble them into an $Sn$-bit word, and store the result, again within the period $T_M$.

ii. Access more than one word at a time: We can partition the RAM into $S$ separate banks $M_0, M_1, \dots, M_{S-1}$, each covering part of the memory address space and each provided with its own addressing circuitry. Then it is possible to carry out $S$ independent accesses simultaneously in one memory clock period $T_M$. Once more, we need fast circuits inside the RAM unit to assemble and disassemble the words being accessed.
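
Both techniques can be sketched in a few lines of Python. The function names are hypothetical, and the bank mapping assumes low-order interleaving (consecutive addresses fall in consecutive banks), which is one common choice rather than something stated in the text above.

```python
def split_wide_word(word: int, s: int, n: int) -> list[int]:
    """Break an (s*n)-bit internal word into s parts of n bits, low part first."""
    part_mask = (1 << n) - 1
    return [(word >> (i * n)) & part_mask for i in range(s)]


def bank_and_offset(address: int, num_banks: int) -> tuple[int, int]:
    """Map an address to (bank index, address within bank).

    Assumes low-order interleaving across the banks.
    """
    return address % num_banks, address // num_banks


def cycles_for_sequential_read(num_words: int, num_banks: int) -> int:
    """Memory cycles T_M needed to read num_words consecutive words,
    given that up to num_banks words (one per bank) are accessed per cycle."""
    return -(-num_words // num_banks)   # ceiling division
```

Under these assumptions, reading 8 consecutive words takes 8 memory cycles with a single bank but only 2 cycles with $S = 4$ banks.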

# Fast RAM Interfaces

- What is seek time?
- Seek time is the time taken for a hard disk controller to locate a specific piece of stored data. Or, the average time to move a head from one track to another is the seek time.
- What is latency?
- Disk latency refers to the time delay between a request for data and the return of the data. Or, once the head is on the right track, the average time required for the desired cell to move from the wrong part of the rotating medium to the read/write head.
- What is data transfer rate?
- The data transfer rate (DTR) is the amount of digital data that is moved from one place to another in a unit time. Or, the speed at which data can be transferred continuously to or from the track is the data transfer rate.
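
The three quantities combine into an average access time. A minimal Python sketch follows, assuming the usual model in which the average rotational latency is half a revolution and a block is transferred at the rate at which the track passes under the head. The function name and parameter names are hypothetical.

```python
def average_access_time_ms(seek_ms: float, rpm: float,
                           bytes_per_track: int, block_bytes: int) -> float:
    """Average time to read one block: seek + rotational latency + transfer."""
    revolution_ms = 60_000 / rpm                  # time for one full rotation
    latency_ms = revolution_ms / 2                # on average, half a rotation
    transfer_ms = revolution_ms * block_bytes / bytes_per_track
    return seek_ms + latency_ms + transfer_ms
```

For example, with a 4 ms average seek time, 6000 rpm, 100,000 bytes per track and a 10,000-byte block, the estimate is 4 + 5 + 1 = 10 ms.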

# Memory Access

- What do you mean by bus arbitration?
- Bus arbitration refers to the process by which the current bus master relinquishes control of the bus and passes it to another unit that is requesting the bus. The controller that has access to the bus at a given instant is known as the bus master.
- A conflict may arise if several DMA controllers, other controllers, or processors try to access the common bus at the same time, because access can be given to only one of them. Only one processor or controller can be bus master at any point in time. To resolve these conflicts, a bus arbitration procedure is implemented to coordinate the activities of all devices requesting memory transfers. The selection of the bus master must take into account the needs of the various devices by establishing a priority system for gaining access to the bus. The bus arbiter decides which device becomes the current bus master.

# Bus Arbitration

- Discuss different types of bus arbitration schemes.
- There are two approaches to bus arbitration:
  - Centralized bus arbitration.
  - Distributed bus arbitration.
- Centralized bus arbitration: A single bus arbiter performs the required arbitration; it can be either the processor or a separate DMA controller. Three arbitration schemes are based on centralized arbitration.
  - i. Daisy Chaining: A simple and cheap method in which all the masters use the same line for making bus requests. The bus grant signal then passes serially from one master to the next until it reaches a requesting master.
  - ii. Polling Method: In this method, the controller generates the addresses of the masters on a set of address lines. For example, if 8 masters are connected in a system, at least 3 address lines are required.
  - iii. Independent Request: In this scheme, each master has its own bus request and bus grant lines. A built-in priority decoder selects the highest-priority request and asserts the corresponding bus grant signal.
- Distributed bus arbitration: Here, all the devices participate in the selection of the next bus master. Each device on the bus is assigned a 4-bit identification number. When one or more devices request control of the bus, they assert the start-arbitration signal and place their 4-bit identification numbers on arbitration lines ARB0 through ARB3. Each device compares the resulting code on the lines with its own ID, starting from the most significant bit. If it detects a difference at some bit position, it disables its drivers at that position and at all lower-order positions by placing a 0 at the input of those drivers. The device with the highest ID therefore wins. Distributed arbitration is highly reliable because bus operation does not depend on any single device.
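
The distributed scheme can be sketched as a small Python simulation. It assumes the arbitration lines behave as a wired-OR of everything the devices drive, and that a device backs off whenever a line carries a 1 in a position where its own ID has a 0. The function name `distributed_arbitration` is hypothetical.

```python
def distributed_arbitration(requesting_ids: list[int], bits: int = 4) -> int:
    """Return the code left on the arbitration lines, i.e. the winning ID.

    Toy model: the lines are the wired-OR of all driven patterns.
    """
    if not requesting_ids:
        raise ValueError("at least one device must request the bus")

    # Initially every requesting device drives its full ID onto the lines.
    driven = {dev: dev for dev in requesting_ids}
    while True:
        lines = 0
        for pattern in driven.values():
            lines |= pattern                      # wired-OR of all drivers

        next_driven = {}
        for dev in requesting_ids:
            pattern = dev
            # Compare with the lines from the most significant bit down.
            for bit in range(bits - 1, -1, -1):
                if (lines >> bit) & 1 and not (dev >> bit) & 1:
                    # Difference found: stop driving this bit and all
                    # lower-order bits.
                    pattern = dev & ~((1 << (bit + 1)) - 1)
                    break
            next_driven[dev] = pattern

        if next_driven == driven:                 # lines have settled
            return lines
        driven = next_driven
```

For devices with IDs 5 (0101) and 6 (0110), the lines first carry 0111; device 5 sees a difference at bit 1 and withdraws its two low-order bits, leaving 0110 on the lines, so device 6 becomes bus master.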

# Bus Arbitration

- What is CISC processor?
- A complex instruction set computer is a computer in which single instructions can execute several low-level operations or are capable of multi-step operations or addressing modes within single instructions. Processor used by this computer is called CISC processor.
  
- What is RISC processor?
- A reduced instruction set computer, or RISC, is one whose instruction set architecture allows it to have fewer cycles per instruction than a complex instruction set computer. The processor used by such a computer is called a RISC processor.

**CISC & RISC**

- What are the features of CISC processors?
- The standard features of CISC processors are listed below:
  1. CISC chips have complex instructions: A CISC processor comes with specific instructions for multi-step tasks. For example, a single multiply instruction, when executed, loads the two values into separate registers, multiplies the operands in the execution unit, and then stores the product in the appropriate register.
  2. CISC processors have a variety of instructions: Many of the instructions are complex, which makes for smaller assembly code and therefore low RAM consumption.
  3. CISC machines generally make use of complex addressing modes: CISC processors have a variety of addressing modes in which operands can be addressed in memory as well as in the registers of the CPU. Many instructions reference memory, in contrast to the RISC architecture.
  4. CISC processors have variable-length instructions: The decision of CISC processor designers to provide a variety of addressing modes leads to variable-length instructions. For example, instruction length increases if an operand is in memory as opposed to in a register.
  5. Easier compiler design: Compilers have relatively little to do when targeting a CISC architecture. The complex instruction set and smaller assembly code mean less work for the compiler, which eases compiler design.
  6. CISC machines use a microprogrammed control unit: Microprograms are series of microinstructions that control the CPU at a very fundamental level of hardware circuitry. They are stored in a control memory such as ROM, from which the CPU fetches them to generate control signals.
  7. CISC processors have a limited number of registers: CISC processors normally have only a single set of registers. Since the addressing modes provide for memory operands, a limited amount of "costly" register storage is sufficient.


CISC

- What are the features of RISC processors?
- The standard features of RISC processors are listed below:
  1. RISC processors use a small and limited number of instructions: RISC processors support only a small number of primitive and essential instructions. Because the instruction set is relatively simple, the emphasis shifts to software and compiler design.
  2. RISC machines mostly use a hardwired control unit: Most RISC processors are based on the hardwired control unit design approach, in which fixed logic circuits interpret instructions and generate control signals from them. It is significantly faster than a microprogrammed control unit but rather inflexible.
  3. RISC processors consume less power and have high performance: RISC processors are heavily pipelined, which keeps the hardware resources of the processor busy, giving higher throughput while also consuming less power.
  4. Each instruction is very simple and consistent: Most instructions in a RISC instruction set are very simple and execute in one clock cycle.
  5. RISC processors use simple addressing modes: RISC processors have few addressing modes, and those they have are very simple. Most instructions operate on registers and do not reference memory.
  6. RISC instructions are of uniform fixed length: The decision of RISC processor designers to provide simple addressing modes leads to uniform-length instructions. By contrast, in a variable-length design the instruction length increases if an operand is in memory as opposed to in a register, because the memory address must be specified as part of the instruction encoding, which takes many more bits and complicates instruction decoding and scheduling.
  7. Large number of registers: The RISC design philosophy generally incorporates a larger number of registers to avoid large amounts of interaction with memory.


RISC

## ❑ The characteristics of CISC:

- The instruction-decoding logic is complex.
- A single instruction is required to support multiple addressing modes.
- Less chip space is needed for general-purpose registers, because instructions operate directly on memory.
- Many CISC designs set aside special registers for the stack pointer, interrupt handling, and so on.
- MUL is referred to as a "complex instruction": it operates directly on memory and does not require the programmer to call separate loading or storing functions.

**CISC**

**RISC**

## □ The characteristics of RISC:

- Simple instructions are used in RISC architecture.
- RISC supports a few simple data types and synthesizes complex data types from them.
- RISC uses simple addressing modes and fixed-length instructions, which suit pipelining.
- RISC permits any register to be used in any context.
- One-cycle execution time.
- Separating the "LOAD" and "STORE" instructions reduces the amount of work the computer must perform.
- RISC has a large number of registers in order to reduce the number of interactions with memory.
- In RISC, pipelining is easy because all instructions execute in a uniform interval of time, i.e. one clock cycle.
- In RISC, more RAM is required to store assembly-level instructions.
- Reduced instructions need fewer transistors in RISC.
- Many RISC designs use the Harvard memory model, i.e. a Harvard architecture.
- A compiler is used to convert each high-level language statement into the corresponding sequence of simple machine instructions.

❑ Comparison between CISC & RISC:

❑ Differences between CISC & RISC:

| CISC                                                               | RISC                                                     |
|--------------------------------------------------------------------|----------------------------------------------------------|
| Emphasis is on hardware                                            | Emphasis is on software                                  |
| High cycles per instruction                                        | Low cycles per instruction                               |
| Transistors are used for storing complex instructions              | More transistors are spent on memory registers           |
| Memory-to-memory: LOAD and STORE are incorporated in instructions  | Register-to-register: LOAD and STORE are independent instructions |
| Multi-clock instructions                                           | Single-clock instructions                                |

**CISC & RISC**
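
The memory-to-memory versus register-to-register difference can be illustrated with a toy interpreter in Python. The mnemonics `MULT`, `PROD`, `LOAD` and `STORE`, the memory labels, and the function `run` are hypothetical and serve only to contrast one complex instruction with an equivalent sequence of simple ones.

```python
def run(program: list[tuple], memory: dict[str, int]) -> dict[str, int]:
    """Execute a toy program and return the resulting memory contents."""
    memory = dict(memory)      # work on a copy
    regs: dict[str, int] = {}
    for op, *args in program:
        if op == "LOAD":       # LOAD reg, addr
            regs[args[0]] = memory[args[1]]
        elif op == "STORE":    # STORE addr, reg
            memory[args[0]] = regs[args[1]]
        elif op == "PROD":     # PROD reg_a, reg_b : reg_a = reg_a * reg_b
            regs[args[0]] = regs[args[0]] * regs[args[1]]
        elif op == "MULT":     # MULT addr_a, addr_b : memory-to-memory multiply
            memory[args[0]] = memory[args[0]] * memory[args[1]]
        else:
            raise ValueError(f"unknown instruction: {op}")
    return memory


# CISC style: one complex instruction operating directly on memory.
cisc_program = [("MULT", "x", "y")]

# RISC style: explicit loads and stores around a register-only operation.
risc_program = [
    ("LOAD", "A", "x"),
    ("LOAD", "B", "y"),
    ("PROD", "A", "B"),
    ("STORE", "x", "A"),
]
```

Starting from `{"x": 6, "y": 7}`, both programs leave 42 in `x`. The CISC version needs one instruction and the RISC version needs four, which reflects the smaller code size on the CISC side and the simpler, uniform instructions on the RISC side.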

## ❑ Advantages of CISC:

- Microprogramming is as easy to implement as assembly language, and less expensive than hardwiring a control unit.
- The ease of microcoding new instructions allowed designers to make CISC machines upwardly compatible.
- As each instruction became more capable, fewer instructions could be used to implement a given task.

**CISC**

## □ Disadvantages of CISC:

- The performance of the machine slows down because different instructions take different amounts of clock time.
- Only about 20% of the existing instructions are used in typical programs, even though there are many specialized instructions that are rarely used.
- CISC instructions set the condition codes as a side effect, which takes time. Because a subsequent instruction may change the condition code bits, the compiler has to make sure they are examined before that happens.

## ❑ Advantages of RISC:

- RISC architecture has a simple set of instructions, so high-level language compilers can produce more efficient code.
- Its simplicity leaves more freedom in how the space on the microprocessor is used.
- Many RISC processors use registers for passing arguments and holding local variables.
- RISC functions use only a few parameters, and RISC processors avoid complex call instructions, using fixed-length instructions that are easy to pipeline.
- The speed of operation can be maximized and the execution time can be minimized.
- Very few instruction formats, few instructions, and few addressing modes are needed.

RISC

## □ Disadvantages of RISC:

- The performance of RISC processors depends mostly on the programmer or compiler, because the knowledge built into the compiler plays a vital role when changing CISC code to RISC code.
- Rearranging CISC code into RISC code, termed code expansion, increases the code size. The quality of this code expansion again depends on the compiler, and also on the machine's instruction set.
- RISC processors require very fast memory systems to feed them instructions, so they need large first-level caches on the chip itself, which is also a disadvantage.

- ❑ Discuss the “Delay Element Method” for designing control unit.
- ❑ Here the behavior of the control unit is represented in the form of a flowchart. The step of the flowchart at time $t_i$ activates the set of control signals $\{C_{i,j}\}$, where $C_{i,j}$ denotes the control signals needed at time $t_i$ to execute instruction $j$. Once the flowchart is complete, an individual circuit is formed for each $C_{i,j}$. Instruction $j$ is executed when all of its steps are performed in order: $C_{1,j}, C_{2,j}, \dots, C_{n,j}$. These steps must not be performed together; there must be a finite time gap between every two consecutive steps. The delay is introduced by a D flip-flop, with a delay time of one clock pulse. Thus all the steps are performed one after the other, as laid out in the flowchart.
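
The timing behavior described above can be sketched in Python as a chain of D flip-flops through which a single start pulse travels, one stage per clock pulse. The function name `run_delay_chain` and the control-signal names used in the example are hypothetical.

```python
def run_delay_chain(control_steps: list[set[str]]) -> list[tuple[int, set[str]]]:
    """Simulate a chain of D flip-flops, one per flowchart step.

    control_steps[i] is the set of control signals activated at step i+1.
    Returns a list of (clock pulse, active control signals).
    """
    n = len(control_steps)
    flip_flops = [0] * n      # outputs of the D flip-flops
    start = 1                 # start pulse entering the first flip-flop
    trace = []
    for clock in range(1, n + 1):
        # On each clock pulse every flip-flop loads its predecessor's output,
        # so the single 1 moves one stage down the chain.
        flip_flops = [start] + flip_flops[:-1]
        start = 0
        active_step = flip_flops.index(1)
        trace.append((clock, control_steps[active_step]))
    return trace


# Hypothetical control signals for a three-step instruction fetch.
fetch_steps = [{"PC_out", "MAR_in"}, {"READ"}, {"MDR_out", "IR_in"}]
```

Running `run_delay_chain(fetch_steps)` activates the first set of signals on clock pulse 1, the second on pulse 2, and the third on pulse 3, so no two steps are ever active together.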

# Control Unit Design

## □ Flowchart of Delay Element Method:




# Control Unit Design

The overall structure of a 4-bit adder-subtractor:



# Adder Subtractor

- Discuss the 4-bit adder subtractor.
- It uses a 4-bit adder with a carry line. An input selector $s$ determines whether the circuit adds or subtracts its two 4-bit input numbers, $X$ and $Y$. If $s=0$, then $Y$ and $X$ are added and the result is $Z = Y + X$. If $s=1$, then the 2's complement of $X$ is formed and added to $Y$, which subtracts $X$ from $Y$, so the output is $Z = Y - X$. The output $Z$ is a 4-bit number. There is also a carry out, which holds the carry of the sum when adding and indicates the borrow when subtracting.
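
The operation of the adder-subtractor can be sketched bit by bit in Python. This sketch assumes the usual construction in which each bit of $X$ is XORed with $s$ and $s$ is also fed in as the initial carry, which together form the 2's complement of $X$ when $s=1$. The function names are hypothetical.

```python
def full_adder(a: int, b: int, carry_in: int) -> tuple[int, int]:
    """One-bit full adder: returns (sum bit, carry out)."""
    total = a ^ b ^ carry_in
    carry_out = (a & b) | (carry_in & (a ^ b))
    return total, carry_out


def add_sub_4bit(y: int, x: int, s: int) -> tuple[int, int]:
    """Return (Z, carry_out) with Z = Y + X if s == 0, Z = Y - X if s == 1.

    Z is reduced to 4 bits (modulo 16).
    """
    carry = s                 # s doubles as carry-in: the "+1" of 2's complement
    z = 0
    for i in range(4):
        y_bit = (y >> i) & 1
        x_bit = ((x >> i) & 1) ^ s    # XOR with s inverts X when subtracting
        bit, carry = full_adder(y_bit, x_bit, carry)
        z |= bit << i
    return z, carry
```

For example, `add_sub_4bit(5, 3, 0)` gives `(8, 0)` and `add_sub_4bit(5, 3, 1)` gives `(2, 1)`. In this construction a carry out of 1 during subtraction means no borrow occurred, while `add_sub_4bit(3, 5, 1)` gives `(14, 0)`, the 4-bit 2's complement pattern for -2, with carry out 0 signalling a borrow.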

# Adder Subtractor

## □ Combinational ALU using 74181 ( 4-bit ALU ):



ALU
