

# CDA 4213/CIS 6930 CMOS VLSI

## Fall 2020

### Final Project

|                                               |                                                                                                                                                                                                                                                                                   |
|-----------------------------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| Today's Date:                                 | 11/11/2020                                                                                                                                                                                                                                                                        |
| Team Members:                                 | Mateus A. Fernandes A.                                                                                                                                                                                                                                                            |
| No. of Hours Spent:                           | 167 hours                                                                                                                                                                                                                                                                         |
| Exercise Difficulty:<br>(Easy, Average, Hard) | Average                                                                                                                                                                                                                                                                           |
| Any Feedback:                                 | Software is very slow to get the waveform of the full design (especially when just testing). It made this simple project more frustrating than it should be. For example: it takes about 30min to run PEX and 10-30min to get each waveform from the larger/more complex designs. |

(1) Proposed Design – Bit slice design

(a) List all module bit-slices you have used for your design.

(b) For each bit slice, show the gate-level design and layout design. For layout, include the snapshot from Cadence Virtuoso. If you have used any other blocks, include them as well.

I. Full Adder (CPA):



## II. Full Adder with AND gate (CSA):



### III. Registers (Inputs and Output):

#### i. Inputs Reg (X, Y):



Note: although the X and Y registers are similar, additional metal is needed to connect ground (GND) to the GND metals of the CSA and CPA cells. The output metals are also shifted slightly to line up with the input metals of the respective CSA cells. As a result, the X and Y registers look different in layout, but they are logically the same.

#### ii. Output Reg (P):



Note: although the P registers are all similar, input metals are added and/or shifted to line up with the output metals of the neighboring cell. As a result, the bottom and right P registers look different in layout, but they are logically the same.

#### IV. 2-to-1 MUX



#### V. Inverter:



(2) Show the layout of your multiplier with the registers (outside the padframe). Explain the design and functionality of your multiplier.



## 16x16 Multiplier



### ***Functionality and design explanation:***

Main functionality: multiply two 16-bit binary numbers.

- 1) The registers are placed around the design. The X and Y input registers are on the left and top, respectively, and the P output registers are on the remaining sides. Every register has an AND gate tied to RST' (active-low reset) to clear its D flip-flop. When RST' is high, the AND gate acts as a non-inverting buffer between the input and the D flip-flop. In addition, each P register has a 2-input MUX that selects between serial and parallel load (with the LOAD pin as the select input).
- 2) Between the X and Y registers there is a 2-input MUX that switches between Normal Mode and Test Mode (with the TEST pin as the select input). There is a similar MUX between the Y and P registers.
- 3) The middle (the larger, top portion) holds a 16x16 array of CSA cells. Each CSA cell contains an AND gate that multiplies the respective bits from the X and Y registers, and a Full Adder that accumulates the partial products along the cell's diagonal. The final sum of each diagonal reaches either the respective P register (for the lower 16 output bits) or the CPA cells (for the upper 16 output bits). Any carry out is passed to the Full Adder of the cell below (either another CSA or a CPA).
- 4) The CPA cells are located underneath the CSA array. They take the upper 16 output bits from the CSA array (as mentioned in part 3). A CPA cell contains only a Full Adder. Its purpose is to add any carry out generated by the cell above or the cell to its right, and then output its sum to the respective P register below.
- 5) All cells have their input and output metals at their edges, which makes it easy to connect cells without adding extra metal in between: placing two cells so that they touch connects their respective pins.
- 6) The binary inputs are shifted serially into the X and Y registers, LSB first and MSB last. Similarly, the binary output is shifted serially out of the P registers, MSB first and LSB last. The clock (CLK), reset (RST), load (LOAD), and test (TEST) signals are 1-bit input pins/metals, each acting as a Boolean value.
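The data flow in points 3 and 4 can be sketched as a bit-level Python model. This is only an illustration under assumptions: it follows the usual carry-save array arrangement (sums move along the diagonal, carries move to the cell below, and a final ripple row produces the upper bits), which matches the description but is not extracted from the layout. The names `full_adder` and `array_multiply` are hypothetical.

```python
def full_adder(a, b, cin):
    """One-bit full adder: returns (sum, carry_out)."""
    s = a ^ b ^ cin
    cout = (a & b) | (cin & (a ^ b))
    return s, cout


def array_multiply(x, y, n=16):
    """Bit-level model of an n x n carry-save array multiplier (assumed wiring)."""
    xb = [(x >> i) & 1 for i in range(n)]
    yb = [(y >> j) & 1 for j in range(n)]
    sums = [0] * (n + 1)   # extra 0 entry so sums[i + 1] is valid at i = n - 1
    carries = [0] * n
    product_bits = []      # collected LSB first

    # CSA array: row j handles bit y[j]; each cell ANDs x[i] with y[j],
    # adds the diagonal sum from the row above and the carry from directly above.
    for j in range(n):
        new_sums = [0] * (n + 1)
        new_carries = [0] * n
        for i in range(n):
            partial = xb[i] & yb[j]
            new_sums[i], new_carries[i] = full_adder(partial, sums[i + 1], carries[i])
        sums, carries = new_sums, new_carries
        product_bits.append(sums[0])   # lower product bits leave the array directly

    # CPA row: ripple-add the leftover sums and carries to form the upper n bits.
    ripple = 0
    for i in range(n):
        s, ripple = full_adder(sums[i + 1], carries[i], ripple)
        product_bits.append(s)

    return sum(bit << k for k, bit in enumerate(product_bits))
```

For example, `array_multiply(12, 29)` returns 348, using 256 CSA cell evaluations and 16 CPA cell evaluations for `n=16`.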

(3) Simulation Results (without padframe):

(a) Individual cells:

I. Full Adder (CPA):



II. Full Adder with AND gate (CSA):



### III. Registers (Inputs and Output):

#### i. Inputs Reg (X, Y):



#### ii. Output Reg (P):



#### IV. 2-to-1 MUX:



#### V. Inverter:



(b) The final multiplier:

### Normal Mode (NM)

Notes: Clock Cycle = 50ns = 0.05μs; Setup = 2 cycles (0μs, 0.1μs);

Input time = 16 cycles (0.1μs, 0.9μs); Input serialized: LSB first, MSB last;

Output time = 32 cycles (0.9μs, 2.5μs); Output serialized: MSB first, LSB last;

Reset is active LOW (RST'); TEST (not shown on the waveform) = 0V = Logic 0
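The timing windows and bit ordering in these notes can be reproduced with a short Python sketch. It assumes that X and Y are shifted in on separate pins during the same 16 cycles (as the single 16-cycle input window suggests) and that time is counted from 0 at the start of the first cycle. The constant and function names are hypothetical.

```python
CLOCK_NS = 50        # clock period from the notes
SETUP_CYCLES = 2
INPUT_CYCLES = 16
OUTPUT_CYCLES = 32


def window_ns(cycles_before, n_cycles, clock_ns=CLOCK_NS):
    """Return (start, end) in ns of a window that begins after `cycles_before` cycles."""
    start = cycles_before * clock_ns
    return start, start + n_cycles * clock_ns


def serialize_lsb_first(value, width=16):
    """Bits in the order they enter the X or Y register (LSB first)."""
    return [(value >> i) & 1 for i in range(width)]


def serialize_msb_first(value, width=32):
    """Bits in the order they leave the P register (MSB first)."""
    return [(value >> i) & 1 for i in reversed(range(width))]


input_window = window_ns(SETUP_CYCLES, INPUT_CYCLES)
output_window = window_ns(SETUP_CYCLES + INPUT_CYCLES, OUTPUT_CYCLES)
```

Here `input_window` is `(100, 900)` and `output_window` is `(900, 2500)`, i.e. 0.1μs to 0.9μs and 0.9μs to 2.5μs.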

NM1:  $0 \times 0 = 0$  ( $0000000000000000 \times 0000000000000000 = 00000000000000000000000000000000$ )



NM2:  $0 \times 255 = 0$  ( $0000000000000000 \times 0000000011111111 = 00000000000000000000000000000000$ )



NM3:  $1 \times 255 = 255$  ( $0000000000000001 \times 0000000011111111 = 00000000000000000000000011111111$ )



NM4:  $12 \times 29 = 348$  ( $0000000000001100 \times 0000000000011101 = 00000000000000000000000101011100$ )



NM5:  $19 \times 91 = 1729$  ( $0000000000010011 \times 0000000001011011 = 00000000000000000000011011000001$ )



NM6:  $83 \times 97 = 8051$  ( $0000000001010011 \times 0000000001100001 = 00000000000000000001111101110011$ )



NM7:  $170 \times 204 = 34680$  ( $0000000010101010 \times 0000000011001100 = 00000000000000001000011101111000$ )



NM8:  $255 \times 255 = 65025$  ( $0000000011111111 \times 0000000011111111 = 00000000000000001111111000000001$ )



NM9:  $65535 \times 65535 = 4294836225$  ( $1111111111111111 \times 1111111111111111 = 11111111111111100000000000000001$ )



### Test Mode (TM)

Notes: Clock Cycle = 50ns = 0.05μs; Setup = 2 cycles (0μs, 0.1μs);

Output starts at cycle 66 (from 3.25μs);

Reset is active LOW (RST'); TEST = 5V = Logic 1
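One way to read the cycle-66 figure is sketched below in Python. It assumes that Test Mode chains the X, Y, and P registers into a single 64-bit shift register (16 + 16 + 32 flip-flops), that the chain is cleared by reset, and that shifting begins right after the 2 setup cycles. The chain length, its ordering, and the function names are assumptions for illustration, not details taken from the layout.

```python
CLOCK_NS = 50
SETUP_CYCLES = 2
CHAIN_LENGTH = 16 + 16 + 32   # assumed: X, Y and P flip-flops in one chain


def simulate_scan(bits_in, chain_length=CHAIN_LENGTH):
    """Shift bits through a cleared chain; return the last flop's value after each shift."""
    chain = [0] * chain_length
    out = []
    for bit in bits_in:
        chain = [bit] + chain[:-1]   # every flop takes its left neighbor's value
        out.append(chain[-1])
    return out


def first_output_cycle(setup_cycles=SETUP_CYCLES, chain_length=CHAIN_LENGTH):
    """Cycle number (counted from 1) in which the first shifted-in bit reaches the end."""
    return setup_cycles + chain_length


def cycle_start_ns(cycle, clock_ns=CLOCK_NS):
    """Start time of a cycle, with cycle 1 starting at t = 0."""
    return (cycle - 1) * clock_ns
```

Under these assumptions, `first_output_cycle()` gives 66 and `cycle_start_ns(66)` gives 3250 ns (3.25μs), which agrees with the notes. For the all-1's pattern, `simulate_scan([1] * 70)` first outputs a 1 on the 64th shift.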

TM1: All 1's



TM2: Alternating 0's and 1's, i.e., 010101...



(4) Layout of the final design (with padframe):



*(The extra space was originally intended for an internal clock built from a ring oscillator. However, due to time constraints, we decided not to include it.)*

(5) Simulation waveforms for the final design (with padframe):

### Normal Mode (NM)

Notes: Clock Cycle = 50ns = 0.05μs; Setup = 2 cycles (0μs, 0.1μs);

Input time = 16 cycles (0.1μs, 0.9μs); Input serialized: LSB first, MSB last;

Output time = 32 cycles (0.9μs, 2.5μs); Output serialized: MSB first, LSB last;

Reset is active HIGH (RST); TEST (not shown on the waveform) = 0V = Logic 0

NM1:  $0 \times 0 = 0$  ( $0000000000000000 \times 0000000000000000 = 00000000000000000000000000000000$ )



NM2:  $0 \times 255 = 0$  ( $0000000000000000 \times 0000000011111111 = 00000000000000000000000000000000$ )



NM3:  $1 \times 255 = 255$  ( $0000000000000001 \times 0000000011111111 = 00000000000000000000000011111111$ )



NM4:  $12 \times 29 = 348$  ( $0000000000001100 \times 0000000000011101 = 00000000000000000000000101011100$ )



NM5:  $19 \times 91 = 1729$  ( $0000000000010011 \times 0000000001011011 = 00000000000000000000011011000001$ )



NM6:  $83 \times 97 = 8051$  ( $0000000001010011 \times 0000000001100001 = 00000000000000000001111101110011$ )



NM7:  $170 \times 204 = 34680$  ( $0000000010101010 \times 0000000011001100 = 00000000000000001000011101111000$ )



NM8:  $255 \times 255 = 65025$  ( $0000000011111111 \times 0000000011111111 = 00000000000000001111111000000001$ )



NM9:  $65535 \times 65535 = 4294836225$  ( $1111111111111111 \times 1111111111111111 = 11111111111111100000000000000001$ )



### Test Mode (TM)

Notes: Clock Cycle = 50ns = 0.05μs; Setup = 2 cycles (0μs, 0.1μs);

Output starts at cycle 66 (from 3.25μs);

Reset is active HIGH (RST); TEST = 5V = Logic 1

TM1: All 1's



TM2: Alternating 0's and 1's, i.e., 010101...

