

## Sequential Binary Multiplier Design

16 June 2021 11:30 PM

A binary multiplier is used to multiply two binary numbers and produce a product. A n bit multiplicand and an m bit multiplier produces a m+n bit product

This binary multiplier when implemented in parallel form, where the multiplication is performed in 1 clock cycle, though takes lesser time, it has more adders and hence more logic gates which trades off with the area and the power consumption of the circuit

On the other hand, a sequential multiplier has only one adder and one shift register to accomplish the same binary multiplication at a reduced area and power while trading off with the speed. (i.e more clock cycles are required to perform a multiplication)

Binary multiplication is adding of multiplicands and shifting based on the multiplier.

When done manually

1. Check the multiplier 1 or 0
2. If 1 then copy the multiplicand and shift the copied partial product no of positions to the left( initially multiplier is 0th position so no shift)
3. If 0 then copy all 0's and shift the copied partial product no of positions of the current multiplier bit to the left
4. Repeat the procedure m times where m is the number of bits in the multiplier
5. Add the partial products ( This addition can be done with a sequential adder)

When done with a hardware the process can be slightly altered so that the addition and the shifting can be performed in the same clock cycle instead of two different clock cycles

When we want the product in a single clock cycle a purely parallel combinational circuit will be synthesized

However when we want to implement it sequentially, the multiplicands based on the current bit focused in the multiplier, can be shifted and added to an accumulated partial product. This way we needn't do all the partial products (shifting) and then finally adding bit by bit as we did in the manual way.

The multiplier, multiplicand and the accumulated partial product are all stored in the register. The operation of shifting and adding is performed by the state machine

In a sequential design, there are several ways to distribute the effort of multiplication in multiple clock cycles

- ① In 1 clock cycle 1 partial product is formed and then accumulated
- ② In 1 clock cycle 2 partial products are formed and then accumulated  
(but this will take more logic gates & faster standard cells / slower clock)

So we opt for ①

If there are p bits in the multiplier, there will be p partial products  
So we would need p registers to store the p partial products and  
a p-bit adder to add the partial products to get the final product

Instead of doing the exact equivalent of manual calculation as  
said above. only

- ① Adder to sum 2 binary numbers (1 Full Adder)
- ② Registers to accumulate the partial product values  
(shift register)

Consider

$$\begin{array}{r} 1011 \\ \times 101 \\ \hline 1011 \\ 0000 \\ \hline 1011 \\ 111011 \end{array}$$

partial product

The multiplicand is shifted to the left

Instead of shifting the multiplicand to the left, the partial product (accumulated in the shift register) can be shifted to the right

when the corresponding bit of the multiplier is 0, there is no need to add all 0's

### REGISTER CONFIGURATION

A  $\rightarrow$  partial product

Q  $\rightarrow$  Multiplier

B  $\rightarrow$  Multiplicand

P  $\rightarrow$  no of bits in the multiplier (counter)

C  $\rightarrow$  carry of the parallel addition



### Working

① Initially the system is in the idle state, till Start becomes 1

② when start is 1, the system loads A to 0, C to 0, B  $\rightarrow$  multiplicand  
 Q  $\rightarrow$  multiplier  
 $P \leftarrow n$  (no of multiplier bits)

③ Depending on the value of  $Q[0]$ , multiplicand is added and shifted or

if  $Q[0] = 0$  then only shift happens

Now since we have started to perform the multiplication on multiplicand 1st bit P -

④ depending on  $Q[0]$ , always happens whether to add or not depending  $Q[0] = 1$  or  $Q[0] = 0$

$\{C, A, Q\} \gg 1$  (Right shift by 1).  $Q[0]$  is taken.

If  $Q[0]$  is 1, multiplicand is added to the shifted partial product (contd)

of reg A)

⑥ On the addition a carry C can/cannot be produced, but the value of this carry is stored in C, so as to be added to the next partial product when it is shifted into A

⑦ Now P is checked if it is 0, if not then goto step 4

When  $P=0$ , the content in the combined register A and Q is the product

### CONTROL PATH DESIGN

These control signals are required to data path

- |                           |           |                     |
|---------------------------|-----------|---------------------|
| ① load registers          | load-reg  |                     |
| ② shift register          | shift-reg | (Interface signals) |
| ③ add register            | add-reg   |                     |
| ④ decrement the counter P | decr-P    |                     |

Feedback from data path.

- |                                                      |
|------------------------------------------------------|
| ⑤ $Q[0]$ (current <sup>but</sup> bit of multiplier ) |
| ⑥ Zero (Zero = 1, when $P=0$ else Zero = 0)          |

⑦ Start signal to the controller

⑧ Ready signal (Since the contents of the product register are changing we need to know when the multiplier is ready)



## ASMD chart



## ASMD chart with control signals





From the above chart we have 3 states

<sup>Binary</sup>  
00 Gray code      <sup>Gray code</sup>  
01 assignment      00 =  
10 with 1 bit      01      S\_Idle  
change to simplify  
logic assignment      11      S\_add  
                        ↓  
                        2 ff

S\_Idle      } we need 2 bits  
S\_add      } to represent 3 states  
S\_Shift      so 2 flip flops.

one hot

001  
010  
100  
↓

3 flip flops



S\_Idle → S\_add

$A \in \delta$      $Q \in \eta$      $P \in \gamma$   
 $B \in \text{multiplicand}$   
 $Q \in \text{multiplier}$

S\_add → S\_Shift       $P \in \gamma - 1$

if  $Q(0) = 1$ ,  $\{C, A, B\} = A + B$   
(re)  $A \Leftarrow A + B$   
 $C \Leftarrow \text{cont}$

S\_Shift

$\{C, A, B\} \subset \{C, A, B\} \gg 1$   
[C becomes 0]

multiplicand

multiplic

C becomes 0 →

Example

multiplicand

$$B = 10111_2 = 17_{10} = 23_{10}$$

multiplicat

$$Q = 10011_2 = 13_{10} = 19_{10}$$

$Q_0 = 1$ , add B

| C | A                            | Q     | P   |
|---|------------------------------|-------|-----|
| 0 | <u>00000</u><br><u>10111</u> | 10011 | 101 |
| 0 | <u>10111</u>                 |       | 108 |

shift right C A Q

|   |              |       |     |
|---|--------------|-------|-----|
| 0 | 01011        | 11001 |     |
|   | <u>10111</u> |       |     |
| 1 | <u>00010</u> |       | 211 |

shift C A Q

|   |       |       |  |
|---|-------|-------|--|
| 0 | 10001 | 01100 |  |
|---|-------|-------|--|

$Q_0 = 0$ , shift C A Q

|   |       |       |     |
|---|-------|-------|-----|
| 0 | 01000 | 10110 | 010 |
|---|-------|-------|-----|

$Q_0 = 0$ , shift C A B

|   |       |       |     |
|---|-------|-------|-----|
| 0 | 00100 | 01011 | 001 |
|---|-------|-------|-----|

$Q_0 = 1$ , add B

|   |              |  |     |
|---|--------------|--|-----|
| 0 | <u>10111</u> |  | 000 |
|---|--------------|--|-----|

shift C A Q

|   |              |  |     |
|---|--------------|--|-----|
| 0 | <u>01101</u> |  | 000 |
|---|--------------|--|-----|

— Stop

Product  $00110110101_2$

STATE TABLE

↗ Moore type  
O/P

| State<br>Symbol | Present state | Input           | $Q_{t+1} = D$ |                                                                                              | Output                                                                                        |
|-----------------|---------------|-----------------|---------------|----------------------------------------------------------------------------------------------|-----------------------------------------------------------------------------------------------|
|                 |               |                 | Next state    | To ready                                                                                     |                                                                                               |
| S_Idle          | 0 0           | start Q[0] zero | 0 0           | 1                                                                                            |  0 0 0 0  |
| S_Idle          | 0 0           | 0 X X           | 0 1           |  1 0 0 0 |  0 0 0 0 |
| S_Add.          | 0 1           | 1 X X           | 1 0           | 0                                                                                            |  1 0 0 0 |
| S_Add.          | 0 1           | X 0 X           | 1 0           | 0                                                                                            |  1 0 0 0 |
| S_Add.          | 0 1           | X 1 X           | 1 0           | 0                                                                                            |  1 0 0 0 |
| S_Add.          | 1 0           | X X 0           | 0 1           | 0                                                                                            |  1 0 0 0 |
| S_Add.          | 1 0           | X X 1           | 0 0           | 0                                                                                            |  1 0 0 0 |

We realize with D flip

These more type o/p which is 1 at a given time is determined by

W<sub>X</sub> remains 00000000

These more type op which is 1 at a given time is determined by the state value.



$$D_{G_1} = T_1$$

$$D_{G_0} = \text{start } T_0 + \text{zero } T_2.$$

$$\text{Load-reg} = T_0 \cdot \text{start}$$

$$\text{add-reg} = T_1 \cdot Q[0]$$



Control path

Ref: Morris Mano  
One hot design

### One-Hot Design (One Flip-Flop per State)

Another method of control logic design is the one-hot assignment, which results in a sequential circuit with one flip-flop per state. Only one of the flip-flops contains a 1 at any time; all others are reset to 0. The single 1 propagates from one flip-flop to another under the control of decision logic. In such a configuration, each flip-flop represents a state that is present only when the control bit is transferred to it.

This method uses the maximum number of flip-flops for the sequential circuit. For example, a sequential circuit with 12 states requires a minimum of four flip-flops. By contrast, with the method of one flip-flop per state, the circuit requires 12 flip-flops, one for each state. At first glance, it may seem that this method would increase system cost, since more flip-flops are used. But the method offers some advantages that may not be apparent. One advantage is the simplicity with which the logic can be designed by inspection of the ASMD chart or the state diagram. No state or excitation tables are needed if D-type flip-flops are employed. The one-hot method offers a savings in design effort, an increase in operational simplicity, and a possible decrease in the total number of gates, since a decoder is not needed.

The design procedure for a one-hot state assignment will be demonstrated by obtaining the control circuit specified by the state diagram of Fig. 8.16(a). Since there are three states in the state diagram, we choose three D flip-flops and label their outputs  $G_0$ ,  $G_1$ , and  $G_2$ , corresponding to  $S_{idle}$ ,  $S_{add}$ , and  $S_{shift}$ , respectively. The input equations for setting each flip-flop to 1 are determined from the present state and the input conditions along the corresponding directed lines going into the state. For example,  $D_{G0}$ , the input to flip-flop  $G_0$ , is set to 1 if the machine is in state  $G_0$  and  $Start$  is not asserted, or if the machine is in state  $G_2$  and  $Zero$  is asserted. These conditions are specified by the input equation:

$$D_{G0} = G_0 Start' + G_2 Zero$$

In fact, the condition for setting a flip-flop to 1 is obtained directly from the state diagram, from the condition specified in the directed lines going into the corresponding flip-flop state ANDed with the previous flip-flop state. If there is more than one directed line going into a state, all conditions must be ORed. Using this procedure for the other three flip-flops, we obtain the remaining input equations:

$$D_{G1} = G_0 Start + G_2 Zero'$$

$$D_{G2} = G_1$$



**FIGURE 8.18**  
Logic diagram for one-hot state controller

The logic diagram of the one-hot controller (with one flip-flop per state) is shown in Fig. 8.18. The circuit consists of three D flip-flops labeled  $G_0$  through  $G_2$ , together with the associated gates specified by the input equations. Initially, flip-flop  $G_0$  must be set to 1 and all other flip-flops must be reset to 0, so that the flip-flop representing the initial state is enabled. This can be done by using an asynchronous preset on flip-flop  $G_0$  and an asynchronous clear for the other flip-flops. Once started, the controller with one flip-flop per state will propagate from one state to the other in the proper manner. Only one flip-flop will be set to 1 with each clock edge; all others are reset to 0, because their D inputs are equal to 0.

```

1  module Sequential_Binary_Multiplier (Product, Ready, Multiplicand, Multiplier, Start,clock, reset_b);
2 // Default configuration: five-bit datapath
3 parameter dp_width = 5; // Set to width of datapath
4 output [2*dp_width -1: 0] Product;
5 output Ready;
6 input [dp_width -1: 0] Multiplicand, Multiplier;
7 input Start, clock, reset_b;
8 parameter BC_size = 3; // Size of bit counter
9 parameter S_idle = 3'b001, // one-hot code
10 S_add = 3'b010,
11 S_shift = 3'b100;
12 reg [2: 0] state, next_state;
13 reg [dp_width -1: 0] A, B, Q; // Sized for datapath
14 reg C;
15 reg [BC_size -1: 0] P;
16 reg Load_regs, Decr_P, Add_regs, Shift_regs;
17 // Miscellaneous combinational logic
18 assign Product = {A, Q};
19 wire Zero = (P == 0); // counter is zero
20 // Zero = ~|P; // alternative
21
22 wire Ready = (state == S_idle); // controller status
23
24 // control unit
25 always @ ( posedge clock, negedge reset_b)
26 if (~reset_b) state <= S_idle; else state <= next_state;
27
28 always @ (state, Start, Q[0], Zero) begin
29 next_state = S_idle;
30 Load_regs = 0;
31 Decr_P = 0;
32 Add_regs = 0;
33 Shift_regs = 0;
34
35 case (state)
36 S_idle: begin if (Start) next_state = S_add; Load_regs = 1; end
37 S_add: begin next_state = S_shift; Decr_P = 1; if ( Q[0]) Add_regs = 1; end
38 S_shift: begin Shift_regs = 1; if (Zero) next_state = S_idle;
39 else next_state = S_add; end
40 default : next_state = S_idle;
41 endcase
42 end
43
44 // datapath unit
45 always @ ( posedge clock) begin
46 if (Load_regs) begin
47 P <= dp_width;
48 A <= 0;
49 C <= 0;
50 B <= Multiplicand;
51 Q <= Multiplier;
52 end
53 if (Add_regs) {C, A} <= A + B;
54 if (Shift_regs) {C, A, Q} <= {C, A, Q} >> 1;
55 if (Decr_P) P <= P -1;
56 end
57 endmodule

```

Test bench

```

1 // test bench for the binary multiplier
2 module t_Sequential_Binary_Multiplier;
3 parameter dp_width = 5; // Set to width of datapath
4 wire [2*dp_width -1: 0] Product; // Output from multiplier
5 wire Ready;
6 reg [dp_width -1: 0] Multiplicand, Multiplier; // Inputs to multiplier
7 reg Start, clock, reset_b;
8 // Instantiate multiplier
9 Sequential_Binary_Multiplier M0 (Product, Ready, Multiplicand, Multiplier, Start, clock,
10 reset_b);
11 // Generate stimulus waveforms
12 initial #200 $finish ;
13 initial
14 begin
15 Start = 0;
16 reset_b = 0;
17 #2 Start = 1; reset_b = 1;
18 Multiplicand = 5'b10111; Multiplier = 5'b10011;
19 #10 Start = 0;
20 end
21 initial
22 begin
23 clock = 0;
24 repeat (26) #5 clock = ~clock;
25 end
26 // Display results and compare with Table 8.5
27 always @ ( posedge clock)
28 begin
29 $dumpfile("tb_sbmm.vcd");
30 $dumpvars;
31 $strobe ("C=%b A=%b Q=%b P=%b time=%0d",M0.C,M0.A,M0.Q,M0.P, $time );
32 end
33 endmodule

```

PROBLEMS OUTPUT DEBUG CONSOLE TERMINAL

```
[root@shriharipc ~]# cd /d/RISC-V/verilog/Sequential_Binary_Multiplier
[root@shriharipc ~]# iverilog sbm.v tb_sbm.v
[root@shriharipc ~]# ./a.out
C=0 A=00000 Q=10011 P=101 time=5
C=0 A=10111 Q=10011 P=100 time=15
C=0 A=01011 Q=11001 P=100 time=25
C=1 A=00010 Q=11001 P=011 time=35
C=0 A=10001 Q=01100 P=011 time=45
C=0 A=10001 Q=01100 P=010 time=55
C=0 A=01000 Q=10110 P=010 time=65
C=0 A=01000 Q=10110 P=001 time=75
C=0 A=00100 Q=01011 P=001 time=85
C=0 A=11011 Q=01011 P=000 time=95
C=0 A=01101 Q=10101 P=000 time=105
C=0 A=00000 Q=10011 P=101 time=115
C=0 A=00000 Q=10011 P=101 time=125
[root@shriharipc ~]
```



#### Synthesis

```
yosys> read_liberty -lib /d/RISC-V/sky130RTLDesignAndSynthesisWorkshop/my_lib/lib/sky130_fd_sc_hd_tt_025C_1v80.lib
1. Executing Liberty frontend.
Imported 428 cell types from liberty file.

yosys> read_verilog sbm.v
2. Executing Verilog-2005 frontend: sbm.v
Parsing Verilog input from 'sbm.v' to AST representation.
Generating RTLIL representation for module `Sequential_Binary_Multiplier'.
Note: Assuming pure combinatorial block at sbm.v:28 in
compliance with IEC 62142(E):2005 / IEEE Std. 1364.1(E):2002. Recommending
use of @* instead of @(...) for better match of synthesis and simulation.
Successfully finished Verilog frontend.

yosys> synth -top Sequential_Binary_Multiplier
```

```
== Sequential_Binary_Multiplier ==
Number of wires: 79
Number of wire bits: 128
Number of public wires: 15
Number of public wire bits: 50
Number of memories: 0
Number of memory bits: 0
Number of processes: 0
Number of cells: 104
$_ANDNOT_ 17
$_AND_ 1
$_AOI3_ 2
$_AOI4_ 1
$_DFP_PNO_ 2
$_DFP_PNI_ 1
$_DFP_P_ 19
$_MUX_ 29
$_NAND_ 6
$_NOR_ 3
$_NOT_ 1
$_OAI3_ 3
$_OR_ 8
$_XNOR_ 4
$_XOR_ 7
```

```
yosys> abc -liberty /d/RISC-V/sky130RTLDesignAndSynthesisWorkshop/my_lib/lib/sky130_fd_sc_hd_tt_025C_1v80.lib
```

#### 4.1.2. Re-integrating ABC results.

```

ABC RESULTS: sky130_fd_sc_hd_a2toi_1 cells:      2
ABC RESULTS: sky130_fd_sc_hd_a2toi_1 cells:      5
ABC RESULTS: sky130_fd_sc_hd_a22toi_1 cells:      1
ABC RESULTS: sky130_fd_sc_hd_a3toi_1 cells:     10
ABC RESULTS: sky130_fd_sc_hd_and2_0 cells:       3
ABC RESULTS: sky130_fd_sc_hd_clkinv_1 cells:      2
ABC RESULTS: sky130_fd_sc_hd_lpfflow_inputisop_1 cells:    1
ABC RESULTS: sky130_fd_sc_hd_lpfflow_isobufsrc_1 cells:    1
ABC RESULTS: sky130_fd_sc_hd_nand2_1 cells:     38
ABC RESULTS: sky130_fd_sc_hd_nand2b_1 cells:     1
ABC RESULTS: sky130_fd_sc_hd_nand3_1 cells:      2
ABC RESULTS: sky130_fd_sc_hd_nor2_1 cells:     12
ABC RESULTS: sky130_fd_sc_hd_nor3_1 cells:      3
ABC RESULTS: sky130_fd_sc_hd_o21iai_1 cells:     1
ABC RESULTS: sky130_fd_sc_hd_o21iai_0 cells:      6
ABC RESULTS: sky130_fd_sc_hd_o22ai_1 cells:      1
ABC RESULTS: sky130_fd_sc_hd_o31ai_1 cells:      1
ABC RESULTS: sky130_fd_sc_hd_or3b_1 cells:      2
ABC RESULTS: sky130_fd_sc_hd_xnor2_1 cells:     3
ABC RESULTS: sky130_fd_sc_hd_xor2_1 cells:      1
ABC RESULTS: internal signals:          59
ABC RESULTS: input signals:           33
ABC RESULTS: output signals:          23
Removing temp directory.

```

```
yosys> dfflibmap -liberty /d/RISC-V/sky130RTLDesignAndSynthesisWorkshop/my_lib/lib/sky130_fd_sc_hd_tt_025C_1v80.lib
```

```

Mapping DFF cells in module `Sequential_Binary_Multiplier':
mapped 2 $_dff_pn0_ cells to \sky130_fd_sc_hd_dfrtp_1 cells.
mapped 1 $_dff_pn1_ cells to \sky130_fd_sc_hd_dfstp_2 cells.
mapped 19 $_dff_p_ cells to \sky130_fd_sc_hd_dfxtp_1 cells.

```

```
yosys> opt_clean -purge
```

```

6. Executing OPT_CLEAN pass (remove unused cells and wires).
Finding unused cells or wires in module Sequential_Binary_Multiplier..
Removed 0 unused cells and 118 unused wires.
<suppressed ~4 debug messages>

```

Yosys>show

