



# CS & IT ENGINEERING

COMPUTER ORGANIZATION  
AND ARCHITECTURE

Floating Point Representation

One Shot - 01



By- Vishvadeep Gothi sir



# Recap of Previous Lecture



Topic

Control Unit

Topic

RISC vs CISC

Topic

Floating Point Representation

# Topics to be Covered



Topic

Floating Point Representation

Topic

IEEE-754 Floating Point Representation

Topic

Booth Algorithm

#Q. A micro-programmed control unit is required to generate a total of 25 control signals. Assume that during any microinstruction at most 2 control signals are active. What is the minimum number of bits required in the control word to generate the required control signals?

Ans = 10



Example:

Microprogrammed control unit

No. of instructions supported = 16

Each instruction's execution needs 8 microinstructions

No. of microinstructions $= 16 * 8 = 128$

$128$ microinstructions $\Rightarrow$ address = 7 bits

Size of control memory $= 128 * 81$ bits

#Q. Consider a microprogrammed control unit which has to support 64 number of instructions. For each instruction execution control unit generates a sequence of 64 control words. Each microinstruction contains 3 fields: 122 control signals to support horizontal control unit, a MUX select field to select one of 7 inputs, and a next address field. The size of control memory needed is?

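$$\text{no. of microinstructions} = 64 * 64 = 2^{12} \Rightarrow \text{address} = 12 \text{ bits}$$

$$\text{MUX select} = \lceil \log_2 7 \rceil = 3 \text{ bits}$$

$$\text{microinstruction size} = 122 + 3 + 12 = 137 \text{ bits}$$

$$\begin{aligned}\text{Control mem. size} &= 2^{12} * 137 \text{ bits} \\ &= 548 \text{ K bits}\end{aligned}$$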
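The same computation can be sketched in Python. This is only an illustrative sketch, not part of the lecture: the names `bits_needed` and `control_memory_bits` are assumptions, and it assumes the MUX select and next-address fields are binary-encoded with the minimum number of bits.

```python
def bits_needed(n):
    """Minimum number of bits that can distinguish n different values (n >= 1)."""
    return (n - 1).bit_length()


def control_memory_bits(instructions, microinstr_per_instruction,
                        control_signal_bits, mux_inputs):
    """Return (address_bits, word_bits, total_bits) of the control memory."""
    words = instructions * microinstr_per_instruction
    address_bits = bits_needed(words)      # next-address field
    mux_bits = bits_needed(mux_inputs)     # MUX select field
    word_bits = control_signal_bits + mux_bits + address_bits
    return address_bits, word_bits, words * word_bits


addr, width, total = control_memory_bits(64, 64, 122, 7)
print(addr, width, total, total // 1024)   # 12 137 561152 548
```
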



#Q. The design of a vertical microprogrammed control unit requires generating 50 control signals. Of the first 43 signals, only 4 can be active at a time. Of the remaining 7, any can be active at any time. Each microinstruction of the control unit stores the control signal information along with a 3-bit MUX select field and a 10-bit address field. What is the size of the control memory required?



## Topic : RISC vs CISC



| S. No. | RISC (Reduced Instruction-Set Computer) | CISC (Complex Instruction-Set Computer) |
|--------|------------------------------------------|------------------------------------------|
| 1. | Fewer instructions supported | More instructions supported |
| 2. | Fixed-length instructions | Variable-length instructions |
| 3. | Simple instructions | Complex instructions |
| 4. | Simple and fewer addressing modes | Complex and more addressing modes |
| 5. | Easy to implement using a hardwired control unit | Difficult to implement using a hardwired control unit |
| 6. | One cycle per instruction | More than one cycle per instruction |
| 7. | Register-to-register arithmetic operations only | Register-to-memory and memory-to-register arithmetic operations possible |
| 8. | More registers | Fewer registers |



#Q. Consider the following processor design characteristics.

- I. Register-to-register arithmetic operations only
- II. Fixed-length instruction format
- III. Hardwired control unit

Which of the characteristics above are used in the design of a RISC processor?

(A) I and II only

(B) II and III only

(C) I and III only

(D) I, II and III



## Topic : Floating-Point Numbers

Motive $\Rightarrow$ a larger range of numbers with fewer bits.




- The number is represented in the format:

| S | E | M |
|---|---|---|
| Sign | Biased exponent | Mantissa |

- The mantissa is a signed, normalized (implicit or explicit) fraction.
- The exponent is stored in biased form:

stored exponent $E$ = original exponent $e$ + bias

$$S = \begin{cases} 0 & \text{+ve} \\ 1 & \text{-ve} \end{cases}$$



## Topic : Biased Exponent

If $e$ is represented in $k$ bits:

$$\text{bias} = 2^{k-1}$$

| S | E | M |
|---|---|---|

Example: $k = 5 \Rightarrow \text{bias} = 2^{4} = 16$ (excess-16)

| Original ($e$) | Stored ($E$) |
|----------------|--------------|
| -16            | 0            |
| -15            | 1            |
| -14            | 2            |
| :              | :            |
| 0              | 16           |
| :              | :            |
| 15             | 31           |

$$E = e + 16$$
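The excess-16 table can be reproduced with a short Python sketch. The helper names `bias_for`, `to_stored` and `to_original` are illustrative assumptions, and the bias follows the $2^{k-1}$ rule stated above.

```python
def bias_for(k):
    """Bias for a k-bit exponent field under the 2**(k-1) rule."""
    return 2 ** (k - 1)


def to_stored(e, k):
    """Stored (biased) exponent E for an original exponent e."""
    E = e + bias_for(k)
    if not 0 <= E < 2 ** k:
        raise ValueError(f"exponent {e} does not fit in {k} bits")
    return E


def to_original(E, k):
    """Original exponent e recovered from a stored exponent E."""
    return E - bias_for(k)


print([to_stored(e, 5) for e in (-16, -15, 0, 15)])   # [0, 1, 16, 31]
```
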



## Topic : Mantissa



Explicit normalization: the first bit after the point should be 1.

$$101.01 \Rightarrow 0.10101 * 2^3$$

$$\begin{aligned}e &= 3 \\E &= 3 + \text{bias} \\m &= \text{bits after the point} = 10101\end{aligned}$$

Implicit normalization: the bit before the point should be 1.

$$101.01 \Rightarrow 1.0101 * 2^2$$

$$\begin{aligned}e &= 2 \\E &= 2 + \text{bias} \\m &= 0101\end{aligned}$$
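Both normalizations can be sketched as one small Python function. This is an assumption-laden sketch: `normalize` is a hypothetical helper that takes a positive, non-zero binary string such as `"101.01"` and returns the mantissa bits (trailing zeros dropped) together with the exponent $e$.

```python
def normalize(binary, implicit):
    """Return (m, e) so that the value is 0.m * 2**e (explicit)
    or 1.m * 2**e (implicit). Input must be positive and non-zero."""
    int_part, _, frac_part = binary.partition(".")
    digits = int_part + frac_part
    point = len(int_part)            # position of the binary point in digits
    first_one = digits.index("1")    # raises ValueError if the value is zero
    if implicit:
        # Point moves to just after the first 1; that 1 is not stored.
        e = point - first_one - 1
        m = digits[first_one + 1:]
    else:
        # Point moves to just before the first 1; that 1 is stored.
        e = point - first_one
        m = digits[first_one:]
    return m.rstrip("0"), e


print(normalize("101.01", implicit=False))   # ('10101', 3)
print(normalize("101.01", implicit=True))    # ('0101', 2)
```
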



## Topic : Value Formula

| S | E | M |
|---|---|---|

$$(-1)^S = \begin{cases} (-1)^0 = 1 & S = 0 \\ (-1)^1 = -1 & S = 1 \end{cases}$$

$$\text{Value}_{(\text{explicit})} = (-1)^S * 0.M * 2^{E-\text{bias}}$$

$$\text{Value}_{(\text{implicit})} = (-1)^S * 1.M * 2^{E-\text{bias}}$$
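The two value formulas can be evaluated exactly with Python's `fractions` module. This is a hedged sketch: `field_value` is an illustrative name, the mantissa is passed as a bit string, and the bias of 16 in the usage lines is only an assumed example value.

```python
from fractions import Fraction


def field_value(s, E, m, bias, implicit):
    """Evaluate (-1)**s * (0.m or 1.m) * 2**(E - bias) exactly."""
    frac = Fraction(int(m, 2), 2 ** len(m)) if m else Fraction(0)
    significand = 1 + frac if implicit else frac
    return (-1) ** s * significand * Fraction(2) ** (E - bias)


# 101.01 stored with an assumed bias of 16:
explicit = field_value(0, 3 + 16, "10101", bias=16, implicit=False)
implicit = field_value(0, 2 + 16, "0101", bias=16, implicit=True)
print(float(explicit), float(implicit))   # 5.25 5.25
```

Both encodings give back the same number; only the stored exponent and mantissa bits differ.
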

#Q. A certain well-known computer family represents the exponents of its floating-point numbers as "excess-64" integers; i.e., a typical exponent  $e_6e_5e_4e_3e_2e_1e_0$  represents the number:

**A**

$$e = -64 + \sum_{i=0}^6 2^i e_i$$

**B**

$$e = -64 + \sum_{i=0}^6 2e_i$$

**C**

$$e = 64 - \sum_{i=0}^6 2^i e_i$$

**D**

$$e = 64 - \sum_{i=0}^6 2e_i$$

#Q. Consider a 16-bit register used to store floating-point numbers. The mantissa is a normalized signed fraction (explicit normalization). The exponent is represented in excess-32 form. What is the 16-bit value for $+(23.5)_{10}$ in this register?

$$\text{bias} = 32 = 2^{k-1} \Rightarrow 2^5 = 2^{k-1} \Rightarrow 5 = k-1 \Rightarrow k = 6$$

$$\text{+ve} \Rightarrow s = 0$$

$$(23.5)_{10} = (10111.1)_2$$

Explicit normalization:

$$0.101111 * 2^5$$

$$e = 5 \Rightarrow E = 5 + 32 = 37 = (100101)_2$$

$$m = 101111$$

| Field           | s | E      | m         |
|-----------------|---|--------|-----------|
| Bits (16 total) | 1 | 6      | 9         |
| Value           | 0 | 100101 | 101111000 |

#Q. What is the 4-digit hexadecimal value for $+(23.5)_{10}$ in the register of the above question?

0 100101 101111000 = 0100 1011 0111 1000

Ans = 4B78

#Q. What is the 4-digit hexadecimal value for $+(39.75)_{10}$ in the register of the above question?

$$(39.75)_{10} = (100111.11)_2$$

Explicit normalization:

$$0.10011111 * 2^6$$

$$e = 6 \Rightarrow E = 6 + 32 = 38 = (100110)_2$$

$$m = 10011111$$

| $s$ | $E$    | $m$       |
|-----|--------|-----------|
| 0   | 100110 | 100111110 |

0 100110 100111110 = 0100 1101 0011 1110

Ans = 4D3E
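The two encodings above (4B78 and 4D3E) can be cross-checked with a Python sketch of the 16-bit register. `encode_excess32` is a hypothetical helper; it assumes 1 sign bit, a 6-bit excess-32 exponent, a 9-bit explicitly normalized mantissa, and truncation of any extra mantissa bits.

```python
from fractions import Fraction


def encode_excess32(value):
    """Encode value as s (1 bit) | E (6 bits, excess-32) | m (9 bits), in hex."""
    x = Fraction(value)
    s = 1 if x < 0 else 0
    x = abs(x)
    if x == 0:
        raise ValueError("zero has no normalized form in this format")
    e = 0
    # Scale until 0.5 <= x < 1, i.e. x = 0.1xxx... in binary.
    while x >= 1:
        x /= 2
        e += 1
    while x < Fraction(1, 2):
        x *= 2
        e -= 1
    E = e + 32
    if not 0 <= E <= 63:
        raise ValueError(f"exponent {e} is outside -32..31")
    m = int(x * 2 ** 9)              # keep 9 mantissa bits (truncate the rest)
    word = (s << 15) | (E << 9) | m
    return f"{word:04X}"


print(encode_excess32(23.5), encode_excess32(39.75))   # 4B78 4D3E
```
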



## Topic : Number Range



Underflow example:

| S | E | M |
|---|---|---|
| 1 | 6 | 9 |

$$\text{bias} = 32$$

$$E \Rightarrow 0 \text{ to } 63 \quad \Rightarrow \quad e \Rightarrow -32 \text{ to } 31$$

Value to store $\Rightarrow 0.0000\ldots 011$

Explicit normalization:

$$0.11 * 2^{-33}$$

$e = -33$ is not allowed (it is below $-32$), so the number cannot be stored.
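The range check behind this underflow can be sketched in Python. The names `exponent_range` and `fits` are illustrative assumptions; the defaults match the 6-bit excess-32 exponent of the example.

```python
def exponent_range(k, bias):
    """Smallest and largest original exponent e for a k-bit stored exponent E."""
    return 0 - bias, (2 ** k - 1) - bias


def fits(e, k=6, bias=32):
    """True if the original exponent e can be stored in the exponent field."""
    lowest, highest = exponent_range(k, bias)
    return lowest <= e <= highest


print(exponent_range(6, 32))   # (-32, 31)
print(fits(-33))               # False
```
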



## Topic : Disadvantages of Conventional Representation

1. Cannot store zero (a normalized mantissa always contains a leading 1)
2. Has underflow (numbers very close to zero cannot be stored)



# Topic : IEEE-754 Floating Point Representation



Single precision: 32 bits

| S | E | M  |
|---|---|----|
| 1 | 8 | 23 |

$$\text{bias} = 127$$

Double precision: 64 bits

| S | E  | M  |
|---|----|----|
| 1 | 11 | 52 |

$$\text{bias} = 1023$$

$$E = 00\dots0 \ \text{ or } \ 11\dots1 \Rightarrow \text{special case}$$






| S      | E                                        | M                              | Number                     |
|--------|------------------------------------------|--------------------------------|----------------------------|
| 0      | $00\dots0$                               | $00\dots0$                     | $+0$                       |
| 1      | $00\dots0$                               | $00\dots0$                     | $-0$                       |
| 0      | $11\dots1$                               | $00\dots0$                     | $+\infty$                  |
| 1      | $11\dots1$                               | $00\dots0$                     | $-\infty$                  |
| 0 or 1 | $11\dots1$                               | $M \neq 00\dots0$              | NaN (Not a Number)         |
| 0 or 1 | $00\dots0$                               | $M \neq 00\dots0$              | Denormalized number        |
| 0 or 1 | $E \neq 00\dots0$ and $E \neq 11\dots1$  | $M = 00\dots0$ to $11\dots1$   | Implicit normalized number |

## Denormalized numbers

A very small number that cannot be stored as a normalized number.

For single precision, $\text{bias} = 127$:

| $s$ | $E$ | $m$ |
|-----|-----|-----|
| 1   | 8   | 23  |

For a normalized number:

$$E \Rightarrow 00000001 \text{ to } 11111110 \quad (1 \text{ to } 254)$$

$$E_{\min} = 1$$

$$e_{\min} = 1 - 127 = -126$$

Ex: number $\Rightarrow 0.00000\dots011$

$$\text{with the minimum exponent} \Rightarrow 0.011 * 2^{-126}$$

With the minimum exponent $-126$ the leading bit is still 0, so the number cannot be normalized; store it as a denormalized number:

| $s$ | $E$    | $m$       |
|-----|--------|-----------|
|     | 00...0 | 01100...0 |

$$\text{Value (denormalized)} = (-1)^S * 0.M * 2^{-126} \quad \text{(single precision)}$$

$$\text{Value (denormalized)} = (-1)^S * 0.M * 2^{-1022} \quad \text{(double precision)}$$

$$\text{Value (implicit)} = (-1)^S * 1.M * 2^{E-\text{bias}}$$
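The special cases and the two value formulas can be combined in one decoder. The following is a minimal Python sketch for single precision only; `decode_single` is an illustrative name, finite non-zero values are returned as exact `Fraction` objects, and zero, infinity and NaN are returned as floats.

```python
from fractions import Fraction


def decode_single(bits):
    """Classify a 32-bit IEEE-754 single-precision pattern and return (kind, value)."""
    s = bits >> 31                  # 1 sign bit
    E = (bits >> 23) & 0xFF         # 8 exponent bits
    M = bits & 0x7FFFFF             # 23 mantissa bits
    sign = -1 if s else 1
    if E == 0xFF:                   # E = 11...1
        if M == 0:
            return "infinity", sign * float("inf")
        return "NaN", float("nan")
    if E == 0:                      # E = 00...0
        if M == 0:
            return "zero", sign * 0.0
        # Denormalized: (-1)^S * 0.M * 2^-126
        return "denormalized", sign * Fraction(M, 2 ** 23) * Fraction(1, 2 ** 126)
    # Implicit normalized: (-1)^S * 1.M * 2^(E - 127)
    return "normalized", sign * (1 + Fraction(M, 2 ** 23)) * Fraction(2) ** (E - 127)


kind, value = decode_single(0xC0F80000)
print(kind, float(value))           # normalized -7.75
```
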

#Q. The value of a float type variable is represented using the single-precision 32-bit floating point format of the IEEE-754 standard, which uses 1 bit for sign, 8 bits for biased exponent and 23 bits for mantissa. A float type variable X is assigned the decimal value of -19.625. The representation of X in hexadecimal notation is?

$$\begin{aligned}
 & S = 1 \\
 & (19.625)_{10} = (10011.101)_2 \\
 & \text{Implicit normalization} \Rightarrow 1.0011101 * 2^4 \\
 & e = 4 \Rightarrow E = 4 + 127 = 131 = (10000011)_2 \\
 & m = 001110100\dots0 \\
 & X = 1\ 10000011\ 001110100\dots0 = 1100\ 0001\ 1001\ 1101\ 0000 \dots 0000 = \text{C19D0000H}
 \end{aligned}$$
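A hand encoding like this one can be cross-checked with Python's standard `struct` module, which packs a float into IEEE-754 single precision. The helper names `float_to_hex` and `fields` below are illustrative assumptions, not part of the lecture.

```python
import struct


def float_to_hex(x):
    """IEEE-754 single-precision bit pattern of x as 8 hex digits."""
    (word,) = struct.unpack(">I", struct.pack(">f", x))
    return f"{word:08X}"


def fields(x):
    """Split the single-precision pattern of x into (S, E bits, M bits)."""
    (word,) = struct.unpack(">I", struct.pack(">f", x))
    return word >> 31, f"{(word >> 23) & 0xFF:08b}", f"{word & 0x7FFFFF:023b}"


print(float_to_hex(-19.625))   # C19D0000
print(fields(-19.625))
```
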

#Q. [NAT] The value represented by the following 32 bits in IEEE-754 representation is?

0 10000011 10100000...00  
(S, E, M)

$E \neq 00\dots0$ and $E \neq 11\dots1$ $\Rightarrow$ implicit normalized number

$$E = (10000011)_2 = (131)_{10}$$

$$\begin{aligned} \text{Value} &= +1.101 * 2^{131-127} \\ &= +1.101 * 2^4 \\ &= +(11010)_2 \\ &= +(26)_{10} \end{aligned}$$

Ans = $+(26)_{10}$

#Q. The value represented by the following 32 bits in IEEE-754 representation is?

0 00000000 110000...00  
(S, E, M)

$E = 00\dots0$ and $M \neq 00\dots0$ $\Rightarrow$ denormalized number

$$\begin{aligned} \text{Value} &= +0.11 * 2^{-126} \\ &= +11.0 * 2^{-2} * 2^{-126} \\ &= +(3 * 2^{-128}) \end{aligned}$$

#Q. The value of a float type variable is represented using the single-precision 32-bit floating point format of the IEEE-754 standard, which uses 1 bit for sign, 8 bits for biased exponent and 23 bits for mantissa. A float type variable X is assigned the decimal value of -14.25. The representation of X in hexadecimal notation is?

(A) C1640000H

(B) 416C0000H

(C) 41640000H

(D) C16C0000H

#Q. Consider the following representation of a number in IEEE 754 single-precision floating point format with a bias of 127.

S: 1 E: 10000001 = (129) F: 1111000000000000000000000

Here S, E and F denote the sign, exponent and fraction components of the floating-point representation.

The decimal value corresponding to the above representation (rounded to 2 decimal places) is -7.75

$$\begin{aligned} \text{Value} &= -1.1111 * 2^{129-127} \\ &= -1.1111 * 2^2 \\ &= -(111.11)_2 \\ &= -(7.75)_{10} \end{aligned}$$

H.W.

Q.49

Three floating point numbers  $X$ ,  $Y$ , and  $Z$  are stored in three registers  $R_X$ ,  $R_Y$ , and  $R_Z$ , respectively in IEEE 754 single precision format as given below in hexadecimal:

$$R_X = 0xC1100000, R_Y = 0x40C00000, \text{ and } R_Z = 0x41400000$$

Which of the following option(s) is/are CORRECT?

(A)  $4(X + Y) + Z = 0$

(B)  $2Y - Z = 0$

(C)  $4X + 3Z = 0$

(D)  $X + Y + Z = 0$



## 2 mins Summary



**Topic**

**Control Unit**

**Topic**

**RISC vs CISC**

**Topic**

**Floating Point Representation**



# Happy Learning

## THANK - YOU