

# **Final Project Report**

**EEDG 6306 ASIC Design**

**Sandeep Jakkampudi (skj220003)**  
**Sriram Nimmala (sxn220002)**

# **INTRODUCTION**

The realm of digital audio technology has witnessed significant advances through the integration of Application Specific Integrated Circuits (ASICs), which offer customised solutions tailored to specific audio processing needs. This report details the journey of designing the Mini Stereo Digital Audio Processor (MSDAP), an ASIC aimed at elevating audio performance in consumer electronics through high-efficiency signal processing. The MSDAP project exemplifies the intricate process of ASIC design, from conceptual specifications to final verification before tape-out, illustrating the potential for ASICs to meet stringent performance and volume demands in the audio technology sector.

# **OBJECTIVE**

ASICs, though less common than FPGA and standard cell layouts, provide distinct advantages such as reduced area, enhanced speed performance, lower power consumption, and cost efficiency when produced in high volumes. These benefits are critical in audio processing applications where size, efficiency, and cost are paramount. The objective of the MSDAP project was to harness these advantages to deliver a superior audio processor capable of supporting advanced audio features while maintaining compactness and energy efficiency.

# MSDAP ALGORITHM

## Introduction:

The MSDAP serves as a finite impulse response (FIR) filter for audio streams. The FIR filter is implemented through a convolution of coefficients and data values. The convolution is calculated using the sum of products:

$$y(n) = \sum_{k=0}^N h(k) \times x(n-k)$$

Where  $h(k)$  is a coefficient value, and  $x(n-k)$  is the prior data input. This sum may be rewritten as:

$$y(n) = h(0)x(n-0) + h(1)x(n-1) + h(2)x(n-2) + \dots + h(N)x(n-N)$$

The multiplication procedure is removed by defining "u-terms" as the sum of prior inputs. Thus the computation can be expressed as a series of addition and power-of-two operations:

$$y(n) = 2^{-1}\bigl(\dots 2^{-1}\bigl(2^{-1}\bigl(2^{-1}u_1 + u_2\bigr) + u_3\bigr) + \dots + u_{16}\bigr)$$

$$u_j = x_j(1) + x_j(2) + \dots + x_j(r_j) \quad 1 \leq j \leq 16$$

This transformation has a substantial impact on the construction of the ASIC, because multiplication is expensive to implement in hardware. Each u-term is merely a sum of prior inputs, and each power of two is a one-bit shift, so the final design is smaller and more energy efficient.
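The shift-add form can be illustrated with a minimal Python sketch. It assumes each u-term is described by a list of `(sign, delay)` pairs; the `sign` field, the function names, and the sample values are illustrative and are not taken from the RTL. The sketch evaluates the nested expression with one addition and one halving per u-term, then checks the result against an ordinary sum of products that uses the equivalent coefficients.

```python
from fractions import Fraction

def shift_add_output(x_hist, groups):
    """Evaluate y(n) = 2^-1(... 2^-1(2^-1 u_1 + u_2) ... + u_J).

    x_hist[k] holds x(n-k); groups[j] lists the (sign, delay) pairs of u_{j+1}.
    """
    acc = Fraction(0)
    for group in groups:
        u = sum(sign * x_hist[delay] for sign, delay in group)  # additions only
        acc = (acc + u) / 2  # in hardware this is a one-bit right shift
    return acc

def equivalent_coefficients(groups, taps):
    """Expand the u-term grouping back into ordinary FIR coefficients h(k)."""
    h = [Fraction(0)] * taps
    total = len(groups)
    for j, group in enumerate(groups):
        weight = Fraction(1, 2 ** (total - j))  # u_{j+1} is halved (total - j) times
        for sign, delay in group:
            h[delay] += sign * weight
    return h

def direct_output(x_hist, h):
    """Reference sum of products: y(n) = sum_k h(k) * x(n-k)."""
    return sum(hk * xk for hk, xk in zip(h, x_hist))

# Illustrative data: three u-terms over five taps (the real design uses 16 u-terms).
x_hist = [3, -1, 4, 1, -5]  # x(n), x(n-1), x(n-2), ...
groups = [[(1, 0), (-1, 2)], [(1, 1)], [(-1, 3), (1, 4)]]
h = equivalent_coefficients(groups, len(x_hist))
print(shift_add_output(x_hist, groups), direct_output(x_hist, h))  # -27/8 -27/8
```

Exact fractions are used here only to make the equivalence check exact; a hardware implementation would use fixed-point words and shifts instead.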

## INPUT FORMAT:

The system stores and uses three kinds of input data (Rj values, coefficients, and data samples), each with a different bit width. Regardless of its width, every input is sent to the chip as a 16-bit word.

## MSDAP PIN INTERFACE:



| PINS            | I/O    | FUNCTIONALITY                                                                                                                                                                                                                |
|-----------------|--------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| <b>SCLK</b>     | input  | The system clock serves as the time reference for internal signals, convolution, and output sampling.                                                                                                                        |
| <b>DCLK</b>     | input  | The data clock serves as the timing reference for input samples on InputL and InputR.                                                                                                                                        |
| <b>reset_n</b>  | input  | An asynchronous reset signal for the main module.                                                                                                                                                                            |
| <b>Frame</b>    | input  | Active-high signal used to align the serial input data. Frame goes high for one DCLK cycle when the first bit of an input word is sent, then returns low.                                                                    |
| <b>Start</b>    | input  | Asynchronous, active-high signal that tells the chip to initialise and start working.                                                                                                                                        |
| <b>InputL</b>   | input  | The InputL channel transmits all input data for the left channel, such as Rjs, coefficients, and data samples. Data is transferred serially, one bit at a time with the MSB transmitted first and the LSB transmitted last.  |
| <b>InputR</b>   | input  | The InputR channel transmits all input data for the right channel, such as Rjs, coefficients, and data samples. Data is transferred serially, one bit at a time with the MSB transmitted first and the LSB transmitted last. |
| <b>InReady</b>  | output | Driven high on the rising edge of SCLK when the chip is ready to accept input data; otherwise low.                                                                                                                           |
| <b>OutReady</b> | output | Driven high on the rising edge of Frame when the chip is ready to transmit output data; otherwise low.                                                                                                                       |
| <b>OutputR</b>  | output | On the rising edge of SCLK, the right-channel output data is driven out. All bits are output on the 40-bit bus in a single clock cycle.                                                                                      |
| <b>OutputL</b>  | output | On the rising edge of SCLK, the left-channel output data is driven out. All bits are output on the 40-bit bus in a single clock cycle.                                                                                       |

## System Specification:

| Specification signal/parameter | Parameter              | Requirement                                                                                             |
|--------------------------------|------------------------|---------------------------------------------------------------------------------------------------------|
| Filter Order                   | Order                  | 256th order                                                                                             |
| SCLK                           | Frequency              | 26.88 MHz                                                                                               |
| DCLK                           | Frequency              | 768 kHz                                                                                                 |
| Reset                          | Active High/Active Low | Active Low                                                                                              |
| Dual channel input             | Left and Right Channel | Inputs are to be accepted serially in a format such that LSB is accepted first and MSB is accepted last |
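The two clock frequencies fix the cycle budget available per sample. The short calculation below is a sketch that assumes 16-bit words arrive back to back, so one sample is received every 16 DCLK cycles; the function and constant names are illustrative.

```python
SCLK_KHZ = 26_880  # 26.88 MHz system clock, from the specification table
DCLK_KHZ = 768     # 768 kHz data clock, from the specification table
WORD_BITS = 16     # every input is sent as a 16-bit word

def timing_budget(sclk_khz=SCLK_KHZ, dclk_khz=DCLK_KHZ, word_bits=WORD_BITS):
    """Return (SCLK cycles per DCLK cycle, sample rate in kHz, SCLK cycles per sample)."""
    sclk_per_dclk = sclk_khz // dclk_khz
    sample_rate_khz = dclk_khz // word_bits  # assumes one word every word_bits DCLK cycles
    sclk_per_sample = sclk_per_dclk * word_bits
    return sclk_per_dclk, sample_rate_khz, sclk_per_sample

print(timing_budget())  # (35, 48, 560)
```

Under that assumption, the filter has 560 SCLK cycles to compute each output sample at a 48 kHz sample rate.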

## MAIN FSM:



**State 0:**

When the Start signal is received, the system transitions to State 0, where it simultaneously clears the Input, $R_j$, and Coefficient memories. After clearing all memories, it progresses to State 1.

**State 1:**

In State 1, the InReady signal activates, signalling readiness to receive Rj data, and the system awaits the Frame signal from the controller. Upon receiving the Frame signal, the system transitions to State 2.

**State 2:**

State 2 maintains the InReady signal active while beginning to accept incoming bits on both the left and right channels with each rising edge of the data clock cycle. These bits are stored in Rj memory. This process is repeated for 16 bits, after which the system increments the pointer for the next input upon the next Frame signal. After receiving 16 inputs on each channel, the system moves to State 3.

**State 3:**

Here, the InReady signal activates (if previously inactive) to indicate readiness for coefficient data reception. The system waits for the Frame signal, and upon its activation, transitions to State 4.

**State 4:**

In State 4, the InReady signal remains active as the system accepts incoming bits on the left and right channels with each data clock cycle, storing them in coefficient memory. After processing 16 bits, the pointer is incremented, awaiting the next input on the subsequent Frame signal. This process continues until 512 coefficient inputs are recorded, prompting a transition to State 5.

**State 5:**

In State 5, the InReady signal activates (if it is low) to show readiness for input data reception. Following the high Frame signal, the system progresses to State 6.

**State 6:**

In State 6, with the InReady signal active, the system continues to accept incoming bits on both channels each data clock cycle, storing them in Input memory. After 16 bits, the pointer moves to accept the next input on the next Frame signal. Simultaneously, the system starts processing the previously accepted input. If the output is ready, the OutReady signal is activated, and the computed output is sent.

**State 7:**

State 7 is a reset state, triggered by a low signal on the Reset pin. In this state, all operations are paused, and previously accepted inputs are disregarded, except for Rj and coefficient data. The system exits this state upon a high signal on the Reset pin and returns to State 5 to restart input data acceptance.

**State 8:**

Upon receiving 800 consecutive zero-valued inputs on both channels, the system moves to State 8. Here, the InReady signal remains active and inputs continue to be received, but they are not stored until a non-zero input is detected on either channel. Upon detecting a non-zero input, the system transitions to State 6 to fully accept the current input.
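The state descriptions above can be summarised as a transition function. The following Python sketch is a behavioural model rather than the RTL: the state names, the argument names, and the precedence of Start over Reset are assumptions, as is the choice to honour Reset only in States 5, 6, and 8. It also assumes the 800-input condition counts consecutive zero samples.

```python
from enum import IntEnum

class State(IntEnum):  # names are illustrative; numbers follow the report
    INIT = 0
    WAIT_RJ = 1
    READ_RJ = 2
    WAIT_COEFF = 3
    READ_COEFF = 4
    WAIT_DATA = 5
    WORKING = 6
    CLEARING = 7
    SLEEP = 8

RJ_WORDS = 16      # Rj words per channel
COEFF_WORDS = 512  # coefficient words per channel
ZERO_LIMIT = 800   # consecutive zero samples before sleeping

def next_state(state, *, start=False, reset_n=True, frame=False,
               cleared=False, words=0, zero_run=0, nonzero=False):
    """Return the next FSM state for the current inputs."""
    if start:
        return State.INIT
    # Assumption: Reset is honoured only once Rj and coefficients are loaded.
    if not reset_n and state in (State.WAIT_DATA, State.WORKING, State.SLEEP):
        return State.CLEARING
    if state == State.INIT:
        return State.WAIT_RJ if cleared else state
    if state == State.WAIT_RJ:
        return State.READ_RJ if frame else state
    if state == State.READ_RJ:
        return State.WAIT_COEFF if words >= RJ_WORDS else state
    if state == State.WAIT_COEFF:
        return State.READ_COEFF if frame else state
    if state == State.READ_COEFF:
        return State.WAIT_DATA if words >= COEFF_WORDS else state
    if state == State.WAIT_DATA:
        return State.WORKING if frame else state
    if state == State.WORKING:
        return State.SLEEP if zero_run >= ZERO_LIMIT else state
    if state == State.CLEARING:
        return State.WAIT_DATA if reset_n else state
    if state == State.SLEEP:
        return State.WORKING if nonzero else state
    return state

# Example: a reset during normal operation returns to State 5, not State 0.
s = next_state(State.WORKING, reset_n=False)
s = next_state(s, reset_n=True)
print(s.name)  # WAIT_DATA
```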

# ARCHITECTURE

The design is partitioned into several blocks, each with a specific function, and a main controller coordinates them. The blocks are:

1. Serial-in Parallel-Out (SIPO) block
2. RJ memory
3. H memory
4. X memory
5. Arithmetic and Logic Unit (ALU)
6. Parallel-In Serial-Out (PISO)
7. Main Controller



## SIPO:



### Pins:

**din** : This pin receives the serial data that is to be converted into parallel form.

**clk** : The negative edge of this clock is used to shift data into the shift register.

**data\_flag** : Control input to reset the counter and clear or initialize the output on a positive edge.

**reset\_n** : When this pin is low, the shift register is reset.

**output\_flag** : A flag that indicates when the output (dout) holds valid data.

**dout** : It holds the parallel output of the serial data shifted in through din.

### Operation:

The SIPO samples serial input data on the falling edge of the DCLK clock signal, storing each bit starting with the LSB. As long as the Frame signal is high, the module continues to sample and shift the incoming bits into a parallel format. Once all 16 bits are received, the complete word is stored in the output buffer, and a word\_ready signal is asserted to indicate data availability. This signal remains high until acknowledged by a high on the received pin, which then allows for a new data cycle or a module reset.
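A behavioural sketch of this conversion is given below. It is a Python model under stated assumptions, not the RTL: the class and method names are illustrative, the flag is named `output_flag` after the pin list, and a call to `data_flag()` stands in for the positive edge that starts a new word.

```python
class Sipo:
    """Serial-in parallel-out model: the first bit received is the LSB."""

    def __init__(self, width=16):
        self.width = width
        self.count = 0
        self.shift_reg = 0
        self.dout = 0
        self.output_flag = False

    def data_flag(self):
        """Start of a new word: reset the bit counter and clear the flag."""
        self.count = 0
        self.shift_reg = 0
        self.output_flag = False

    def falling_edge(self, din):
        """Sample one serial bit on the falling clock edge."""
        if self.count >= self.width:
            return  # word complete; extra bits are ignored until data_flag()
        self.shift_reg |= (din & 1) << self.count
        self.count += 1
        if self.count == self.width:
            self.dout = self.shift_reg
            self.output_flag = True

# Example: send 0xA5C3 one bit at a time, LSB first.
sipo = Sipo()
sipo.data_flag()
for i in range(16):
    sipo.falling_edge((0xA5C3 >> i) & 1)
print(hex(sipo.dout), sipo.output_flag)  # 0xa5c3 True
```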

## MEMORY:



### Pins:

**RW0\_addr** : input address bus, used to specify the memory address for read or write operations.

**RW0\_clk**: Clock input signal for triggering memory operations.

**RW0\_wdata** : input data bus for writing data to the memory.

**RW0\_rdata** : output data bus for reading data from the memory.

**RW0\_en**: Enable signal for the memory operation. When high, it allows the module to perform a read or write operation based on the mode selected.

**RW0\_wmode**: Write mode signal. When high, it enables writing operation; when low, it enables reading.

We use the three provided SRAM macros, with different bus widths for the data, coefficient, and Rj memories.

| MEMORY    | BUS WIDTH |
|-----------|-----------|
| RJ MEM    | 8 BITS    |
| COEFF MEM | 9 BITS    |
| DATA MEM  | 16 BITS   |
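The pin behaviour described above can be modelled in a few lines. This Python sketch assumes a synchronous read whose output holds its last value while the memory is disabled, and the depths used in the example are assumptions; only the pin names and bus widths come from this section.

```python
class SinglePortRam:
    """Behavioural model of a one-port memory with enable and write-mode pins."""

    def __init__(self, depth, width):
        self.mask = (1 << width) - 1  # data words are truncated to the bus width
        self.cells = [0] * depth
        self.rdata = 0

    def clock(self, addr, en, wmode, wdata=0):
        """Model one active edge of RW0_clk and return RW0_rdata."""
        if en:
            if wmode:
                self.cells[addr] = wdata & self.mask  # write operation
            else:
                self.rdata = self.cells[addr]         # read operation
        return self.rdata

# Example: a 9-bit coefficient memory (depth of 512 is an assumption).
coeff_mem = SinglePortRam(depth=512, width=9)
coeff_mem.clock(addr=3, en=True, wmode=True, wdata=0x3FF)  # stored as 0x1FF
print(hex(coeff_mem.clock(addr=3, en=True, wmode=False)))   # 0x1ff
```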

## ALU:



### Pins:

**alu\_en:** Enables the ALU operation. When high, the ALU performs operations as configured.

**sclk :** System clock input, drives the timing of operations within the module.

**reset:** Reset signal, used to initialise or reset the state of the module.

**RjData:** Rj data input, used to specify the number of coefficients to be used for computation.

**CoeffData:** Coefficient data input, used in the addition or subtraction operations.

**NewData:** Signal indicating new data is available for processing.

**InDataAddr:** Input data address, specifies the address for input data.

**RjAddr:** Address for RJ Memory determines the addressing for RjData.

**CoeffAddr:** Address for the coefficient data.

**DataAddr:** Data address output, provides the address for output data.

**Shift :** Control signal for shifting operations in the shifter module.

**DataOut:** The output data from the module resulting from the ALU operations.

### Operation:

The ALU starts operating when it receives a Start or Reset signal and runs on the system clock, Sclk. It is activated by an enable signal and can pause when instructed to sleep. The module first fetches an Rj value from Rj memory, beginning at the first address. It then reads that number of coefficients from coefficient memory, advancing the address after each clock cycle. After processing those coefficients, it fetches the next Rj value and repeats the cycle. During each cycle, it also uses the coefficient to calculate the address of the sample to fetch from data memory. The data from data memory is processed through add and shift operations, and the final result is sent out through the DataOut port. The Computed signal is high when the output is ready. If the module is put to sleep while enabled, it stops all operations and keeps the output signal low, indicating that it is not processing any data.
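The memory walk described above can be sketched as follows. This Python model makes several assumptions that are not stated in the report: bit 8 of each 9-bit coefficient word is treated as a subtract flag, bits 7 to 0 as the sample delay, and the data memory as a circular buffer indexed backwards from the newest sample. The function and variable names are illustrative.

```python
def alu_compute(rj_mem, coeff_mem, data_mem, newest):
    """Walk the Rj and coefficient memories to build one output sample."""
    acc = 0
    coeff_addr = 0
    for rj in rj_mem:                     # one u-term per Rj entry
        u = 0
        for _ in range(rj):               # rj coefficients belong to this u-term
            word = coeff_mem[coeff_addr]
            coeff_addr += 1               # advance to the next coefficient
            negate = (word >> 8) & 1      # assumed: bit 8 selects subtraction
            offset = word & 0xFF          # assumed: bits 7..0 give the sample delay
            sample = data_mem[(newest - offset) % len(data_mem)]
            u += -sample if negate else sample
        acc = (acc + u) >> 1              # add, then arithmetic right shift by one
    return acc

# Example with two u-terms: u_1 = x(n) - x(n-2), u_2 = x(n-1).
data_mem = [0] * 256
newest = 1
data_mem[1], data_mem[0], data_mem[255] = 8, 4, -12  # x(n), x(n-1), x(n-2)
rj_mem = [2, 1]
coeff_mem = [0x000, 0x102, 0x001]
print(alu_compute(rj_mem, coeff_mem, data_mem, newest))  # 7
```

In this example, u_1 = 8 + 12 = 20 is halved to 10, then u_2 = 4 is added and the sum is halved again, giving 7.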

## PISO:

### Pins:

**din:** Parallel data input to the shift register.

**sclk:** Serial clock input, triggers the shifting of data.

**load:** when asserted (high), loads the data from din into the internal register temp.

**send\_flag:** Control signal to initiate the shifting process.

**dout**: Outputs the shifted data bit by bit.

**sent\_ack**: Acknowledgment signal, indicates when all bits have been shifted out.

### Operation:

The PISO module shifts out parallel input data on the rising edge of the sclk clock signal, transmitting the MSB first. As long as send\_flag is asserted, the module continues to shift and send the bits serially. Once all bits specified by the data\_width parameter have been sent, the sent\_ack signal is asserted to indicate that the transmission is complete. This signal remains high until it is cleared by a subsequent reset or reinitialization of the sending process, which then allows a new data cycle to commence.
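A behavioural sketch of this serialisation follows. It is a Python model rather than the RTL; the class and method names are illustrative, and returning `None` while idle is a modelling convenience.

```python
class Piso:
    """Parallel-in serial-out model: bits leave MSB first."""

    def __init__(self, data_width=8):
        self.data_width = data_width
        self.temp = 0          # internal register loaded from din
        self.remaining = 0     # bits still to be sent
        self.sent_ack = False

    def load(self, din):
        """Capture the parallel word and clear the acknowledgment."""
        self.temp = din & ((1 << self.data_width) - 1)
        self.remaining = self.data_width
        self.sent_ack = False

    def rising_edge(self, send_flag=True):
        """Return the next serial bit, or None when idle."""
        if not send_flag or self.remaining == 0:
            return None
        self.remaining -= 1
        bit = (self.temp >> self.remaining) & 1
        if self.remaining == 0:
            self.sent_ack = True  # all data_width bits have been shifted out
        return bit

# Example: serialise 0xB4 (binary 1011 0100).
piso = Piso(data_width=8)
piso.load(0xB4)
bits = [piso.rising_edge() for _ in range(8)]
print(bits, piso.sent_ack)  # [1, 0, 1, 1, 0, 1, 0, 0] True
```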

## RTL VERIFICATION:

### Waveforms:

State 1: the FSM enters State 1 and waits for Rj data.



State 2: the FSM enters State 2 and starts loading Rj data.



State 4: coefficient data is being loaded.



State 6: the algorithm executes and the output data is computed.



# SYNTHESIS:

## MSDAP TOP SYNTHESIS:



## MAIN FSM:



## PROCESSOR MODULE:



## AREA REPORT:

| Instance       | Module             | Cell Count | Cell Area | Net Area | Total Area |
|----------------|--------------------|------------|-----------|----------|------------|
| MSDAP          |                    |            |           |          |            |
| processor_R    | processor_1        | 932        | 23272.200 | 1489.291 | 24761.491  |
| CoeffMem_inst  | CO_MEMORY          | 920        | 11550.486 | 698.964  | 12249.451  |
| DataMem_inst   | DATA_MEMORY        | 44         | 5907.209  | 45.080   | 5952.289   |
| ALU_Top_inst   | ALU_Top_1          | 6          | 3725.153  | 2.724    | 3727.877   |
| ALU_fsm_inst   | ALU_fsm_1          | 589        | 1118.578  | 375.232  | 1493.810   |
| NewData_inst   | memtime_3          | 260        | 509.950   | 175.755  | 685.705    |
| ShiftReg_inst  | shifter_213        | 3          | 12.830    | 0.998    | 13.828     |
| Adder_inst     | adder_1            | 161        | 365.083   | 59.224   | 424.308    |
| sipo_inst      | sipo_1             | 167        | 242.844   | 97.322   | 340.166    |
| RjMem_inst     | R_MEMORY           | 65         | 186.157   | 37.292   | 223.449    |
| piso_inst      | piso_1             | 5          | 171.556   | 1.995    | 173.551    |
| pisoDff_inst   | pisoDff_212        | 61         | 136.702   | 34.811   | 171.513    |
| zeroCounter    | zeroCounter_1      | 1          | 6.065     | 0.000    | 6.065      |
| memclear_inst  | memclear_1         | 35         | 118.040   | 36.883   | 154.922    |
| clear_dff_inst | memclear_dff_210   | 1          | 79.782    | 17.815   | 97.596     |
| sync_with_clk  | memclear_dff_4     | 1          | 6.065     | 0.000    | 6.065      |
| mux_inst       | adder_1            | 1          | 5.832     | 0.000    | 5.832      |
| mux_coeff_inst | mux_1              | 25         | 242.844   | 97.322   | 340.166    |
| mux_data_inst  | mux8_1             | 10         | 21.228    | 2.443    | 23.671     |
| memtime_inst   | mux7_1             | 9          | 17.496    | 2.200    | 19.696     |
| mux_rj_inst    | memtime_2          | 5          | 12.830    | 0.998    | 13.828     |
| processor_L    | processor          | 920        | 11550.486 | 698.964  | 12249.451  |
| CoeffMem_inst  | CO_MEMORY          | 44         | 5907.209  | 45.080   | 5952.289   |
| DataMem_inst   | DATA_MEMORY        | 6          | 3725.153  | 2.724    | 3727.877   |
| ALU_Top_inst   | ALU_Top            | 589        | 1118.578  | 375.232  | 1493.810   |
| ALU_fsm_inst   | ALU_fsm            | 260        | 509.950   | 175.755  | 685.705    |
| NewData_inst   | memtime            | 3          | 12.830    | 0.998    | 13.828     |
| ShiftReg_inst  | shifter            | 161        | 365.083   | 59.224   | 424.308    |
| Adder_inst     | adder              | 167        | 242.844   | 97.322   | 340.166    |
| sipo_inst      | sipo               | 65         | 186.157   | 37.292   | 223.449    |
| RjMem_inst     | R_MEMORY           | 5          | 171.556   | 1.995    | 173.551    |
| piso_inst      | piso               | 61         | 136.702   | 34.811   | 171.513    |
| pisoDff_inst   | pisoDff            | 1          | 6.065     | 0.000    | 6.065      |
| zeroCounter    | zeroCounter        | 55         | 118.040   | 36.883   | 154.922    |
| memclear_inst  | memclear           | 35         | 79.782    | 17.815   | 97.596     |
| clear_dff_inst | memclear_dff       | 1          | 6.065     | 0.000    | 6.065      |
| sync_with_clk  | memclear_dff_4_211 | 1          | 5.832     | 0.000    | 5.832      |
| mux_inst       | mux                | 25         | 39.891    | 5.844    | 45.735     |
| mux_coeff_inst | mux8               | 10         | 21.228    | 2.443    | 23.671     |
| mux_data_inst  | mux7               | 9          | 17.496    | 2.200    | 19.696     |
| memtime_inst   | mux3_1             | 3          | 12.830    | 0.998    | 13.828     |
| mux_rj_inst    | mux3               | 5          | 9.098     | 1.228    | 10.326     |
| Main_FSM_inst  | main_fsm           | 90         | 169.128   | 51.922   | 221.050    |

## GENUS TIMING REPORT:

```

Path 1: MET (18272 ps) Setup Check with Pin processor_R/DataMem_inst/mem_0_1/CE->A[0]
    View: PVT_0P63V_100C.setup_view
    Group: Sclk
Startpoint: (R) Main_FSM_inst/state_reg[3]/CLK
    Clock: (R) Sclk
Endpoint: (R) processor_R/DataMem_inst/mem_0_1/A[0]
    Clock: (F) Sclk

        Capture      Launch
    Clock Edge:+ 19000          0
Src Latency:+   0             0
Net Latency:+   0 (I)         0 (I)
    Arrival:= 19000          0

    Setup:-     96
Uncertainty:- 100
Required Time:= 18804
Launch Clock:- 0
Data Path:- 532
Slack:= 18272

#-----#
#      Timing Point          Flags Arc Edge           Cell           Fanout Load Trans Delay Arrival Instance
#      #-----#          #-----# #-----# #-----# #-----# (fF) (ps) (ps) (ps) Location
#-----#
Main_FSM_inst/state_reg[3]/CLK      -      -      R  (arrival)          147  -   0   -   0   (-,-)
Main_FSM_inst/state_reg[3]/QN      -      CLK->QN R  ASYNC_DFFHx1_ASAP7_75t_SL  7  4.9  64  71  71  (-,-)
Main_FSM_inst/g2528/Y      -      A->Y  F  INVx1_ASAP7_75t_SL   3  2.1  32  18  89  (-,-)
Main_FSM_inst/g2526_5122/Y      -      B->Y  R  NAND2xp5_ASAP7_75t_SL   4  2.7  57  30  118  (-,-)
Main_FSM_inst/g2522_3680/Y      -      B->Y  R  OR2x2_ASAP7_75t_SL   3  2.1  18  27  145  (-,-)
Main_FSM_inst/g2515_5107/Y      -      B->Y  F  OAI21xp5_ASAP7_75t_SL  22 12.8 174  78  224  (-,-)
processor_R/mux_inst/g335/Y      -      A->Y  R  INVx1_ASAP7_75t_L   23 14.3 196  129  353  (-,-)
processor_R/mux_inst/g314/Y      -      A2->Y R  AO22x1_ASAP7_75t_SL   2  1.5  31  44  396  (-,-)
processor_R/mux_data_inst/g140/Y      -      B1->Y R  AO22x1_ASAP7_75t_SL   2  26.7 269  134  530  (-,-)
processor_R/DataMem_inst/mem_0_1/A[0] (P) -      R  SRAM1RW256x8          2  -   -   1   532  (-,-)
#-----#
(P) : Instance is preserved

```
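The slack in the report above follows directly from its header values. The short Python check below reproduces the arithmetic; the function name is illustrative, and clock latencies are taken as zero, as the report shows for this pre-layout run.

```python
def setup_slack(capture_edge, setup, uncertainty, launch_edge, data_path):
    """Return (required time, slack) in ps for a setup check."""
    required = capture_edge - setup - uncertainty  # latest allowed data arrival
    arrival = launch_edge + data_path              # actual data arrival
    return required, required - arrival

required, slack = setup_slack(capture_edge=19000, setup=96, uncertainty=100,
                              launch_edge=0, data_path=532)
print(required, slack)  # 18804 18272
```

The path launches on a rising Sclk edge and is captured on the falling edge at 19000 ps, so it is a half-cycle path.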

## GENUS POWER REPORT:

Instance: /MSDAP  
Power Unit: W  
PDB Frames: /stim#0/frame#0

| Category   | Leakage     | Internal    | Switching   | Total       | Row%    |
|------------|-------------|-------------|-------------|-------------|---------|
| memory     | 0.00000e+00 | 0.00000e+00 | 0.00000e+00 | 0.00000e+00 | 0.00%   |
| register   | 5.30033e-05 | 5.88571e-06 | 3.44954e-07 | 5.92340e-05 | 50.82%  |
| latch      | 0.00000e+00 | 0.00000e+00 | 0.00000e+00 | 0.00000e+00 | 0.00%   |
| logic      | 4.70389e-05 | 2.33824e-06 | 5.11661e-06 | 5.44937e-05 | 46.76%  |
| bbox       | 0.00000e+00 | 0.00000e+00 | 7.18180e-08 | 7.18180e-08 | 0.06%   |
| clock      | 6.14970e-08 | 4.59131e-09 | 2.68360e-06 | 2.74969e-06 | 2.36%   |
| pad        | 0.00000e+00 | 0.00000e+00 | 0.00000e+00 | 0.00000e+00 | 0.00%   |
| pm         | 0.00000e+00 | 0.00000e+00 | 0.00000e+00 | 0.00000e+00 | 0.00%   |
| Subtotal   | 1.00104e-04 | 8.22854e-06 | 8.21698e-06 | 1.16549e-04 | 100.00% |
| Percentage | 85.89%      | 7.06%       | 7.05%       | 100.00%     | 100.00% |

## GENUS TIMING SUMMARY:

```
@genus:root: 66> report_timing_summary
=====
Generated by:          Genus(TM) Synthesis Solution 19.14-s108_1
Generated on:        May 06 2024 11:47:08 am
Module:                MSDAP
Operating conditions: PVT_0P63V_100C
Interconnect mode:    global
Area mode:            physical library
=====

# SETUP           WNS   TNS   FEP
#
# -----
View: ALL      18272.4  0.0   0
  Group : inout     N/A   N/A   N/A
  Group : in2reg    N/A   N/A   N/A
  Group : reg2out   N/A   N/A   N/A
  Group : reg2reg   18272.4  0.0   0
#
# -----
# DRV             WNS   TNS   FEP
#
# -----
View: ALL
  Check: max_transition  -151  -1380.0   10
  Check: max_capacitance   N/A     N/A   0
  Check: max_fanout       N/A     N/A   0
#
# -----
```

## CRITICAL PATH:



Timing Report - (id: 1)

| Path    | Endpoint                            | Slack (ps) | Rise Slew (ps) | Fall Slew (ps) |
|---------|-------------------------------------|------------|----------------|----------------|
| cpath_1 | processor_L/DataMem_inst/mem_0/A[7] | 18272.40   | 201.90         | 146.60         |

  

| Pin                      | Type                   | Fanout | Load (fF) | Slew (ps) | Delay (ps) | Arrival (ps) | Edge |
|--------------------------|------------------------|--------|-----------|-----------|------------|--------------|------|
| g2528/Y                  | INVx1_ASAP7_75t_SL     | 3      | 2.10      | 32.30     | 18.20      | 88.90        | F    |
| g2526_5122/B             | NAND2xp5_ASAP7_75t_SL  | 3      |           |           | 0.00       | 88.90        | F    |
| g2526_5122/Y             | NAND2xp5_ASAP7_75t_SL  | 4      | 2.70      | 57.20     | 29.60      | 118.50       | R    |
| g2522_3680/B             | OR2x2_ASAP7_75t_SL     | 4      |           |           | 0.00       | 118.50       | R    |
| g2522_3680/Y             | OR2x2_ASAP7_75t_SL     | 3      | 2.10      | 18.10     | 26.80      | 145.30       | R    |
| g2515_5107/B             | OAI21xp5_ASAP7_75t_SL  | 3      |           |           | 0.10       | 145.40       | R    |
| g2515_5107/Y             | OAI21xp5_ASAP7_75t_SL  | 22     | 12.80     | 174.40    | 78.30      | 223.70       | F    |
| Main_FSM_inst/reset_mode | (hpin)                 | 22     | 12.80     | 174.40    |            | 223.70       | F    |
| processor_L/ResetMode    | (hpin)                 | 22     | 12.80     | 174.40    |            | 223.70       | F    |
| mux_inst/sel             | (hpin)                 | 22     | 12.80     | 174.40    |            | 223.70       | F    |
| g335/A                   | INVx1_ASAP7_75t_L      | 22     |           |           | 0.00       | 223.70       | F    |
| g335/Y                   | INVx1_ASAP7_75t_L      | 23     | 14.30     | 196.40    | 129.00     | 352.70       | R    |
| g313_2883/A2             | AO22x1_ASAP7_75t_SL    | 23     |           |           | 0.00       | 352.70       | R    |
| g313_2883/Y              | AO22x1_ASAP7_75t_SL    | 2      | 1.50      | 31.10     | 43.70      | 396.40       | R    |
| mux_inst/dout_addr[7]    | (hpin)                 | 2      | 1.50      | 31.10     |            | 396.40       | R    |
| mux_data_inst/b[7]       | (hpin)                 | 2      | 1.50      | 31.10     |            | 396.40       | R    |
| g139_6131/B1             | AO22x1_ASAP7_75t_SL    | 2      |           |           | 0.00       | 396.40       | R    |
| g139_6131/Y              | AO22x1_ASAP7_75t_SL    | 2      | 26.70     | 269.20    | 134.10     | 530.50       | R    |
| mux_data_inst/z[7]       | (hpin)                 | 2      | 26.70     | 269.20    |            | 530.50       | R    |
| DataMem_inst/RW0_addr[7] | (hpin)                 | 2      | 26.70     | 269.20    |            | 531.60       | R    |
| mem_0/A[7]               | SRAM1RW256x8           | 2      |           |           | 1.10       |              |      |
| mem_0/CE                 | setup                  |        |           | 0.00      | 96.00      | 627.60       | R    |
| (clock Sclk)             | capture                |        |           |           |            | 19000.00     | F    |
|                          | uncertainty            |        |           |           | -100.00    | 18900.00     | F    |


## PHYSICAL DESIGN:

### AFTER PLACEMENT



## CORE UTILIZATION

```
Core utilization = 57.186799
TU for group A0 = 54.493188
Effective Utilizations
Extracting standard cell pins and blockage .....
**WARN: (IMPTR-2104): Layer M10: Pitch=1280 is less than min width=640 + min spacing=32000.
**ERROR: (IMPTR-2101): Layer M10: Pitch=11520x9 is still less than min width=32000 + min spacing=640.
**WARN: (IMPTR-2108): For layer M10, the gaps of 468 out of 468 tracks are narrower than 8.160um (space 8.000 + width 0.160).
Type 'man IMPTR-2108' for more detail.
As a result, your trialRoute congestion could be incorrect.
Pin and blockage extraction finished
Extracting macro/IO cell pins and blockage .....
Pin and blockage extraction finished
**Info: (IMPSP-307): Design contains fractional 20 cells.
Average module density = 0.174.
Density for the design = 0.174.
    = stdcell_area 16174 sites (3773 um^2) / alloc_area 92917 sites (21676 um^2).
Pin Density = 0.03834.
    = total # of pins 6976 / total area 181936.

*** Summary of all messages that are not suppressed in this session:
Severity ID          Count Summary
ERROR   IMPTR-2101      1 Layer %s: Pitch=%dx%d is still less than...
WARNING IMPTR-2104      1 Layer %s: Pitch=%d is less than min wid...
WARNING IMPTR-2108      1 For layer M%d, the gaps of %d out of %d ...
*** Message Summary: 2 warning(s), 1 error(s)

@innovus 34>
```

## AFTER PIN PLACEMENT AND POWER STRAPS



## AFTER PLACEMENT WITHOUT NETS



## AFTER PLACEMENT



## CONGESTION

```
Innovus 103> report_congestion -overflow -hotspot
Usage: (3.8W 2.8%V) = (1.899e+04um 1.474e+04um) = (17582 13651)
Overflow: 9 = 1 (0.00% H) + 8 (0.02% V)

Congestion distribution:

Remain   cntH      cntV
-----
2:    0      0.00%  4      0.01%
1:    1      0.00%  2      0.01%
-----
0:    7      0.02%  1      0.00%
1:   24      0.07%  1      0.00%
2:   31      0.08%  50     0.14%
3:  112      0.30%  53     0.14%
4:  283      0.77%  60     0.16%
5: 36277     98.75% 36569   99.53%
[hotspot] +-----+ max hotspot | total hotspot +
[hotspot] |           | 0.00 | 0.00 |
[hotspot] | normalized | 0.00 | 0.00 |
[hotspot] +-----+
Local HotSpot Analysis: normalized max congestion hotspot area = 0.00, normalized total congestion hotspot area = 0.00 (area is in unit of 4 std-cell row bins)
```

## TIMING SUMMARY REPORT

```
#####
# Generated by: Cadence Innovus 19.11-s128.1
# OS: Linux x86_64(Host ID engnx02a.utdallas.edu)
# Generated on: Mon May 6 12:15:04 2024
# Design: MSDAP
# Command: report_timing_summary
#####
# SETUP          WNS   TNS   FEP
#-----
View : ALL      17934.994  0.000   0
  Group : in2out      N/A   N/A   0
  Group : reg2out     N/A   N/A   0
  Group : in2reg    37007.824  0.0   0
  Group : reg2reg    17934.994  0.0   0
#-----
# DRV           WNS   TNS   FEP
#-----
View : ALL
  Check : max_transition  N/A   N/A   0
  Check : min_transition  N/A   N/A   0
  Check : max_capacitance N/A   N/A   0
  Check : min capacitance N/A   N/A   0
```

## Utilization after Placement

```
Reporting Utilizations.....
Core utilization = 60.084048
TU for group A0 = 54.493188
Effective Utilizations
**Info: (IMPSP-307): Design contains fractional 20 cells.
Average module density = 0.215.
Density for the design = 0.215.
      = stdcell_area 19298 sites (4502 um^2) / alloc_area 89963 sites (20987 um^2).
Pin Density = 0.04296.
      = total # of pins 7816 / total area 181936.
*** Message Summary: 0 warning(s), 0 error(s)
```

# TIMING REPORT (SETUP) AFTER CTS

```
#####
# Generated by: Cadence Innovus 19.11-s128.1
# OS: Linux x86_64(Host ID engnx02a.utdallas.edu)
# Generated on: Mon May 6 12:20:10 2024
# Design: MSDAP
# Command: report_timing
#####
Path 1: MET (17934.994 ps) Setup Check with Pin processor_R/ALU_Top_inst/ALU_fsm_inst/coeff_index_reg[3]/CLK->D
    View: PVT_0P63V_100C.setup_view
    Group: Sclk
    Startpoint: (R) processor_R/CoeffMem_inst/mem_0_0/CE
    Clock: (F) Sclk
    Endpoint: (R) processor_R/ALU_Top_inst/ALU_fsm_inst/coeff_index_reg[3]/D
    Clock: (R) Sclk

        Capture          Launch
    Clock Edge:+ 38000.000      19000.000
    Src Latency:+ -84.437      -93.062
    Net Latency:+  76.600 (P)   91.199 (P)
    Arrival:= 37992.164      18998.137

        Setup:-      9.340
    Uncertainty:- 100.000
    Cppr Adjust:+ 0.000
    Required Time:= 37882.824
    Launch Clock:= 18998.137
    Data Path:+ 949.693
    Slack:= 17934.994
```

```
#####
# Generated by: Cadence Innovus 19.11-s128.1
# OS: Linux x86_64(Host ID engnx02a.utdallas.edu)
# Generated on: Mon May 6 12:23:36 2024
# Design: MSDAP
# Command: report_timing -early -max_slack 0 -nworst 100 > hold_report.txt
#####
Path 1: VIOLATED (-0.148 ps) Hold Check with Pin Main_FSM_inst/state_reg[2]/CLK->D
    View: PVT_0P77V_0C.hold_view
    Group: Sclk
    Startpoint: (R) Main_FSM_inst/state_reg[1]/CLK
    Clock: (R) Sclk
    Endpoint: (R) Main_FSM_inst/state_reg[2]/D
    Clock: (R) Sclk

        Capture          Launch
    Clock Edge:+ 0.000      0.000
    Src Latency:+ -56.717     -56.717
    Net Latency:+  60.900 (P)  56.100 (P)
    Arrival:= 4.183      -0.617

        Hold:+ 5.048
    Uncertainty:+ 100.000
    Cppr Adjust:- 0.000
    Required Time:= 109.238
    Launch Clock:= -0.617
    Data Path:+ 109.700
    Slack:= -0.148

#-----#
# Timing Point          Flags Arc      Edge Cell           Fanout Trans (ps) Delay (ps) Arrival (ps)
#-----#
Main_FSM_inst/state_reg[1]/CLK - CLK          R (arrival)          40 26.400  - -0.617
Main_FSM_inst/state_reg[1]/QN - CLK->QN F ASYNC_DFFHx1_ASAP7_75t_SL 7 26.400 36.600 35.983
Main_FSM_inst/g2791_6260/Y - A2->Y R AOI321xp33_ASAP7_75t_SL 1 24.000 10.200 46.183
Main_FSM_inst/FE_PHC407_n_34/Y - A->Y R HB1xp67_ASAP7_75t_SRAM 1 10.300 16.800 62.983
Main_FSM_inst/FE_PHC137_n_34/Y - A->Y R HB4xp67_ASAP7_75t_SRAM 1 8.900 46.100 109.083
Main_FSM_inst/state_reg[2]/D - D R ASYNC_DFFHx1_ASAP7_75t_SL 1 17.000 0.000 109.083
```

# TIMING REPORT (HOLD) AFTER CTS

```

Path 2: VIOLATED (-0.034 ps) Hold Check with Pin processor_L/ALU_Top_inst/ALU_fsm_inst/CoeffAddr_reg[6]/CLK->D
  View: PVT_0P77V_0C.hold_view
  Group: Sclk
  Startpoint: (R) processor_L/ALU_Top_inst/ALU_fsm_inst/CoeffAddr_reg[6]/CLK
    Clock: (R) Sclk
  Endpoint: (F) processor_L/ALU_Top_inst/ALU_fsm_inst/CoeffAddr_reg[6]/D
    Clock: (R) Sclk

      Capture     Launch
  Clock Edge:+ 0.000     0.000
  Src Latency:+ -56.717   -56.717
  Net Latency:+ 59.600 (P) 54.300 (P)
  Arrival:=    2.883   -2.417

    Hold:+ 10.534
  Uncertainty:+ 100.000
  Cppr Adjust:- 0.000
  Required Time:= 113.417
  Launch Clock:= -2.417
  Data Path:+ 115.800
  Slack:= -0.034

#-----
# Timing Point          Flags Arc      Edge Cell           Fanout Trans (ps) Delay (ps) Arrival (ps)
#-----
processor_L/ALU_Top_inst/ALU_fsm_inst/CoeffAddr_reg[6]/CLK - CLK R (arrival) 67 37.400 - -2.417
processor_L/ALU_Top_inst/ALU_fsm_inst/CoeffAddr_reg[6]/QN - CLK->QN R DFFHQNx1_ASAP7_75t_SL 2 37.400 28.800 26.383
processor_L/ALU_Top_inst/ALU_fsm_inst/FE_PHC345_CoeffAddr_6/Y - A->Y R HB2xp67_ASAP7_75t_R 2 12.300 24.100 50.483
processor_L/ALU_Top_inst/ALU_fsm_inst/g4099_2883/Y - B2->Y F AOI321xp33_ASAP7_75t_SL 1 14.300 8.600 59.083
processor_L/ALU_Top_inst/ALU_fsm_inst/FE_PHC201_n_171/Y - A->Y F HB4xp67_ASAP7_75t_SRAM 1 9.600 54.300 113.383
processor_L/ALU_Top_inst/ALU_fsm_inst/CoeffAddr_reg[6]/D - D F DFFHQNx1_ASAP7_75t_SL 1 17.500 0.000 113.383
#-----
```
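For a hold check the roles are reversed: the data must arrive no earlier than the capture clock arrival plus the hold time and the uncertainty. The sketch below (illustrative function name) reproduces the Path 2 slack from the figures printed in the report above:

```python
def hold_slack(capture_arrival, hold, uncertainty,
               launch_arrival, data_path, cppr=0.0):
    """Hold slack in ps: data arrival time minus required time."""
    required = capture_arrival + hold + uncertainty - cppr
    data_arrival = launch_arrival + data_path
    return data_arrival - required


# Values (ps) copied from Path 2 of the post-CTS hold report.
slack = hold_slack(capture_arrival=2.883, hold=10.534, uncertainty=100.000,
                   launch_arrival=-2.417, data_path=115.800)
print(f"{slack:.3f}")  # -0.034
```

A negative result means the data changes before the hold window closes, which is why the tool inserts delay buffers (the `FE_PHC` cells visible in the path) to slow the data path down.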

## TIMING REPORT (HOLD)

```

#-----
Path 3: MET (-0.000 ps) Hold Check with Pin processor_R/memclear_inst/din_wr_addr_reg[5]/CLK->D
  View: PVT_0P77V_0C.hold_view
  Group: Sclk
  Startpoint: (R) processor_R/memclear_inst/din_wr_addr_reg[5]/CLK
    Clock: (R) Sclk
  Endpoint: (F) processor_R/memclear_inst/din_wr_addr_reg[5]/D
    Clock: (R) Sclk

      Capture     Launch
  Clock Edge:+ 0.000     0.000
  Src Latency:+ -56.717   -56.717
  Net Latency:+ 60.900 (P) 56.300 (P)
  Arrival:=    4.183   -0.417

    Hold:+ 9.500
  Uncertainty:+ 100.000
  Cppr Adjust:- 0.000
  Required Time:= 113.683
  Launch Clock:= -0.417
  Data Path:+ 114.100
  Slack:= -0.000

#-----
# Timing Point          Flags Arc      Edge Cell           Fanout Trans (ps) Delay (ps) Arrival (ps)
#-----
processor_R/memclear_inst/din_wr_addr_reg[5]/CLK - CLK R (arrival) 40 26.400 - -0.417
processor_R/memclear_inst/din_wr_addr_reg[5]/QN - CLK->QN R DFFHQNx1_ASAP7_75t_SL 3 26.400 29.500 29.083
processor_R/memclear_inst/FE_PHC324_M1DataWAddr_5/Y - A->Y R HB2xp67_ASAP7_75t_R 1 14.700 21.800 50.883
processor_R/memclear_inst/g158_8246/Y - A1->Y F OAI211xp5_ASAP7_75t_SL 1 9.300 7.600 58.483
processor_R/memclear_inst/FE_PHC260_n_19/Y - A->Y F HB4xp67_ASAP7_75t_SRAM 1 13.300 55.200 113.683
processor_R/memclear_inst/din_wr_addr_reg[5]/D - D F DFFHQNx1_ASAP7_75t_SL 1 17.000 0.000 113.683
#-----
```

## TIMING REPORT (HOLD)

# CLOCK TREE



# CLOCK TREE REPORT

```
Clock tree Dclk:
Total FF: 40
Max Level: 4
(L1) port In_Dclk
 \_ (L2) processor_R/sipo_inst/CTS_ccl_a_buf_00054/A -> Y (BUFx24_ASAP7_75t_SL)
   \_ (L3) processor_R/sipo_inst/g395/A -> Y (CKINVDCx14_ASAP7_75t_SRAM)
     \_ ... (40 sinks omitted)

Clock tree Sclk:
Total FF: 159
Max Level: 6
(L1) port In_Sclk
 \_ (L2) CTS_ccl_a_buf_00051/A -> Y (BUFx24_ASAP7_75t_SL)
   | \_ (L3) CTS_cfo_buf_00073/A -> Y (BUFx24_ASAP7_75t_SL)
   |   \_ (L4) CTS_ccl_a_buf_00048/A -> Y (BUFx24_ASAP7_75t_SL)
   |     \_ (L5) CTS_cpc_drv_buf_00082/A -> Y (BUFx24_ASAP7_75t_SL)
   |       \_ ... (38 sinks omitted)
   |         \_ (L5) CTS_cpc_drv_buf_00083/A -> Y (BUFx24_ASAP7_75t_SL)
   |           \_ ... (40 sinks omitted)
   |             \_ ... (67 sinks omitted)
   \_ (L3) processor_L/CTS_ccl_a_buf_00046/A -> Y (BUFx12f_ASAP7_75t_SRAM)
     \_ (L4) processor_L/CTS_cpc_drv_buf_00076/A -> Y (BUFx24_ASAP7_75t_SL)
       \_ ... (1 sinks omitted)

 \_ (L2) processor_R/g31/A -> Y (INVx13_ASAP7_75t_SRAM)
   \_ (L3) processor_L/CoeffMem_inst/CTS_ccl_a_buf_00033/A -> Y (BUFx24_ASAP7_75t_SL)
     \_ ... (1 sinks omitted)
   \_ (L3) processor_L/CoeffMem_inst/CTS_ccl_a_buf_00036/A -> Y (BUFx24_ASAP7_75t_SL)
     \_ ... (1 sinks omitted)
```

# UTILIZATION REPORT

```
Reporting Utilizations.....  
  
Core utilization = 98.721134  
TU for group A0 = 54.493188  
Effective Utilizations  
**Info: (IMPSP-307): Design contains fractional 20 cells.  
Average module density = 1.001.  
Density for the design = 1.001.  
= stdcell_area 90031 sites (21002 um^2) / alloc_area 89963 sites (20987 um^2).  
Pin Density = 0.04297.  
= total # of pins 7818 / total area 181936.  
*** Message Summary: 0 warning(s), 0 error(s)
```

```
#####
# Generated by: Cadence Innovus 19.11-s128.1
# OS: Linux x86_64(Host ID engnx02a.utdallas.edu)
# Generated on: Mon May 6 13:17:42 2024
# Design: MSDAP
# Command: report_timing -early -max_slack 0 -nworst 100 > hold_postroute.rpt
#####
Path 1: VIOLATED (-0.062 ps) Hold Check with Pin processor_R/sipo_inst/counter_reg[1]/CLK->D
    View: PVT_0P77V_0C.hold_view
    Group: Dclk
    Startpoint: (R) processor_R/sipo_inst/counter_reg[1]/CLK
    Clock: (F) Dclk
    Endpoint: (F) processor_R/sipo_inst/counter_reg[1]/D
    Clock: (F) Dclk
        Capture           Launch
    Clock Edge: 651000.000   651000.000
    Src Latency: -34.714     -34.714
    Net Latency:  37.901 (P)   31.964 (P)
    Arrival: 651003.188    650997.250
        Hold: 10.436
        Uncertainty: 100.000
        Cppr Adjust:  0.000
    Required Time: 651113.625
    Launch Clock: 650997.250
    Data Path: 116.312
    Slack: -0.062
+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+
| Timing Point | Flags | Arc | Edge | Cell | Fanout | Trans | Delay | Arrival |  
|             |       |     |     |       |       | (ps) | (ps) | (ps) |  
+-----+-----+-----+-----+-----+-----+-----+-----+-----+
| processor_R/sipo_inst/counter_reg[1]/CLK |       | CLK | R | (arrival) | 48 | 30.600 | 0.000 | 650997.250 |
| processor_R/sipo_inst/counter_reg[1]/QN |       | CLK->QN | R | ASYNC_DFFHx1_ASAP7_75t_SL | 1 | 30.600 | 30.100 | 651027.375 |
| processor_R/sipo_inst/FE_PHC335_counter_1/Y |       | A->Y | R | HB2xp67_ASAP7_75t_SRAM | 1 | 11.200 | 25.688 | 651053.062 |
| processor_R/sipo_inst/FE_PHC88_counter_1/Y |       | A->Y | R | HB4xp67_ASAP7_75t_SRAM | 4 | 11.100 | 55.000 | 651108.062 |
| processor_R/sipo_inst/g354/Y |       | B1->Y | F | AOI22xp5_ASAP7_75t_SL | 1 | 28.900 | 5.500 | 651113.562 |
| processor_R/sipo_inst/counter_reg[1]/D |       | D | F | ASYNC_DFFHx1_ASAP7_75t_SL | 1 | 10.400 | 0.000 | 651113.562 |
```

## POST ROUTE TIMING REPORT (HOLD)

```

Path 2: MET (-0.004 ps) Hold Check with Pin processor_L/ALU_Top_inst/ALU_fsm_inst/coeff_max_reg[5]/CLK->SE
    View: PVT_0P77V_0C.hold_view
    Group: Sclk
    Startpoint: (F) Rst_n
    Clock:
    Endpoint: (F) processor_L/ALU_Top_inst/ALU_fsm_inst/coeff_max_reg[5]/SE
    Clock: (R) Sclk

        Capture      Launch
    Clock Edge:+ 0.000     0.000
    Src Latency:+ -56.717   0.000
    Net Latency:+ 64.300 (P) 0.000 (I)
    Arrival:= 7.583     0.000

        Hold:+ 8.321
    Uncertainty:+ 100.000
    Cppr Adjust:- 0.000
    Required Time:= 115.904
    Launch Clock:= 0.000
    Data Path:+ 115.900
    Slack:= -0.004

+-----+-----+-----+-----+-----+-----+-----+-----+-----+
| Timing Point | Flags | Arc | Edge | Cell | Fanout | Trans | Delay | Arrival |
+-----+-----+-----+-----+-----+-----+-----+-----+-----+
| Rst_n |       |      | F | (arrival) | 8 | 4.100 | 1.700 | 1.700 |
| FE_PHC412_Rst_n/Y |       | A->Y | F | HB4xp67_ASAP7_75t_SRAM | 2 | 11.600 | 61.700 | 63.400 |
| processor_L/ALU_Top_inst/ALU_fsm_inst/g4305/Y |       | A->Y | R | INVx1_ASAP7_75t_SL | 10 | 25.400 | 22.400 | 85.800 |
| processor_L/ALU_Top_inst/ALU_fsm_inst/g4227_8428/Y |       | C->Y | F | NOR3xp33_ASAP7_75t_SL | 10 | 38.100 | 30.100 | 115.900 |
| processor_L/ALU_Top_inst/ALU_fsm_inst/coeff_max_reg[5]/SE |       | SE | F | SDFHx1_ASAP7_75t_SL | 10 | 46.100 | 0.000 | 115.900 |
+-----+-----+-----+-----+-----+-----+-----+-----+-----+

```

## POST ROUTE TIMING REPORT (HOLD)

## VIOLATIONS POST ROUTE



## POST ROUTE



## POST ROUTE WITHOUT NETS



| Hinst Name                                         | Module Name        | Inst Count | Total Area |
|----------------------------------------------------|--------------------|------------|------------|
| MSDAP                                              |                    |            |            |
| Main_FSM_inst                                      | main_fsm           | 98         | 180.559    |
| processor_L                                        | processor          | 1117       | 11884.310  |
| processor_L/ALU_Top_inst                           | ALU_Top            | 680        | 1257.612   |
| processor_L/ALU_Top_inst/ALU_fsm_inst              | ALU_fsm            | 361        | 656.217    |
| processor_L/ALU_Top_inst/ALU_fsm_inst/NewData_inst | memtime            | 5          | 15.863     |
| processor_L/ALU_Top_inst/Adder_inst                | adder              | 158        | 236.313    |
| processor_L/ALU_Top_inst/ShiftReg_inst             | shifter            | 161        | 365.083    |
| processor_L/CoeffMem_inst                          | CO_MEM             | 60         | 5948.733   |
| processor_L/DataMem_inst                           | DATA_MEM           | 6          | 3737.984   |
| processor_L/RJMem_inst                             | R_MEM              | 4          | 177.155    |
| processor_L/memclear_inst                          | memclear           | 66         | 122.939    |
| processor_L/memclear_inst/clear_dff_inst           | memclear_dff       | 1          | 6.065      |
| processor_L/memclear_inst/sync_with_clk            | memclear_dff_4_211 | 1          | 5.832      |
| processor_L/memtime_inst                           | memtime_1          | 5          | 15.863     |
| processor_L/mux_coeff_inst                         | mux8               | 9          | 19.362     |
| processor_L/mux_data_inst                          | mux7               | 9          | 19.362     |
| processor_L/mux_inst                               | mux                | 24         | 39.191     |
| processor_L/mux_rj_inst                            | mux3               | 4          | 9.331      |
| processor_L/piso_inst                              | piso               | 92         | 180.325    |
| processor_L/piso_inst/pisoDff_inst                 | pisoDff            | 1          | 6.065      |
| processor_L/sipo_inst                              | sipo               | 91         | 213.218    |
| processor_L/zerocounter                            | zeroCounter        | 54         | 117.340    |
| processor_R                                        | processor_1        | 1126       | 11896.446  |
| processor_R/ALU_Top_inst                           | ALU_Top_1          | 690        | 1262.978   |
| processor_R/ALU_Top_inst/ALU_fsm_inst              | ALU_fsm_1          | 371        | 661.582    |
| processor_R/ALU_Top_inst/ALU_fsm_inst/NewData_inst | memtime_3          | 5          | 15.863     |
| processor_R/ALU_Top_inst/Adder_inst                | adder_1            | 158        | 236.313    |
| processor_R/ALU_Top_inst/ShiftReg_inst             | shifter_213        | 161        | 365.083    |
| processor_R/CoeffMem_inst                          | CO_MEM_1           | 65         | 5956.198   |
| processor_R/DataMem_inst                           | DATA_MEM_208       | 6          | 3737.744   |
| processor_R/RJMem_inst                             | R_MEM_209          | 4          | 177.155    |
| processor_R/memclear_inst                          | memclear_1         | 59         | 114.074    |
| processor_R/memclear_inst/clear_dff_inst           | memclear_dff_210   | 1          | 6.065      |
| processor_R/memclear_inst/sync_with_clk            | memclear_dff_4     | 1          | 5.832      |
| processor_R/memtime_inst                           | memtime_2          | 5          | 15.863     |
| processor_R/mux_coeff_inst                         | mux8_1             | 9          | 20.529     |
| processor_R/mux_data_inst                          | mux7_1             | 9          | 19.362     |
| processor_R/mux_inst                               | mux_1              | 24         | 39.191     |
| processor_R/mux_rj_inst                            | mux3_1             | 4          | 9.331      |
| processor_R/piso_inst                              | piso_1             | 90         | 179.159    |
| processor_R/piso_inst/pisoDff_inst                 | pisoDff_212        | 1          | 6.065      |
| processor_R/sipo_inst                              | sipo_1             | 100        | 238.412    |
| processor_R/zerocounter                            | zeroCounter_1      | 54         | 117.340    |

## AREA REPORT

```
VIAENCLOSURE: ( Enclosure ) Regular Wire of Net processor_L/DataIn[1] ( M3 )
Bounds : ( 37.260, 17.088 ) ( 37.332, 17.184 )

VIAENCLOSURE: ( Enclosure ) Regular Wire of Net processor_L/DataIn[4] ( M3 )
Bounds : ( 38.412, 10.944 ) ( 38.484, 11.040 )

VIAENCLOSURE: ( Enclosure ) Regular Wire of Net processor_L/DataIn[3] ( M3 )
Bounds : ( 40.716, 10.944 ) ( 40.788, 11.040 )

VIAENCLOSURE: ( Enclosure ) Regular Wire of Net processor_L/DataIn[7] ( M3 )
Bounds : ( 31.500, 10.752 ) ( 31.572, 10.848 )

VIAENCLOSURE: ( Enclosure ) Regular Wire of Net processor_L/DataIn[0] ( M3 )
Bounds : ( 33.804, 10.752 ) ( 33.876, 10.848 )

VIAENCLOSURE: ( Enclosure ) Regular Wire of Net processor_L/DataIn[2] ( M3 )
Bounds : ( 35.100, 10.752 ) ( 35.172, 10.848 )

VIAENCLOSURE: ( Enclosure ) Regular Wire of Net processor_L/Data[5] ( M3 )
Bounds : ( 35.532, 10.560 ) ( 35.604, 10.656 )

VIAENCLOSURE: ( Enclosure ) Regular Wire of Net processor_L/Data[1] ( M3 )
Bounds : ( 34.524, 6.720 ) ( 34.596, 6.816 )

VIAENCLOSURE: ( Enclosure ) Regular Wire of Net processor_L/Data[0] ( M3 )
Bounds : ( 35.100, 6.528 ) ( 35.172, 6.624 )

VIAENCLOSURE: ( Enclosure ) Regular Wire of Net processor_L/Data[2] ( M3 )
Bounds : ( 35.532, 8.064 ) ( 35.604, 8.160 )

VIAENCLOSURE: ( Enclosure ) Regular Wire of Net processor_L/Data[4] ( M3 )
Bounds : ( 35.676, 7.680 ) ( 35.748, 7.776 )

Total Violations : 465 Viols.
```

## DRC REPORT

| Total Power            | Value      | Percentage |
|------------------------|------------|------------|
| Total Internal Power:  | 0.02032863 | 13.6491%   |
| Total Switching Power: | 0.00980202 | 6.5813%    |
| Total Leakage Power:   | 0.11880727 | 79.7697%   |
| Total Power:           | 0.14893792 |            |

| Group                 | Internal Power | Switching Power | Leakage Power | Total Power | Percentage (%) |
|-----------------------|----------------|-----------------|---------------|-------------|----------------|
| Sequential Macro      | 0.00724        | 0.0005354       | 0.0518        | 0.05957     | 40             |
| IO                    | 0              | 0.0001952       | 0             | 0.0001952   | 0.1311         |
| Combinational         | 0.003507       | 0.005376        | 0.0489        | 0.05778     | 38.79          |
| Clock (Combinational) | 0.009582       | 0.003695        | 0.01811       | 0.03139     | 21.08          |
| Clock (Sequential)    | 0              | 0               | 0             | 0           | 0              |
| Total                 | 0.02033        | 0.009802        | 0.1188        | 0.1489      | 100            |

| Rail | Voltage | Internal Power | Switching Power | Leakage Power | Total Power | Percentage (%) |
|------|---------|----------------|-----------------|---------------|-------------|----------------|
| VDD  | 0.63    | 0.02033        | 0.009802        | 0.1188        | 0.1489      | 100            |

| Clock                        | Internal Power | Switching Power | Leakage Power | Total Power | Percentage (%) |
|------------------------------|----------------|-----------------|---------------|-------------|----------------|
| Dclk                         | 1.431e-05      | 1.091e-05       | 0.0006552     | 0.0008904   | 0.5978         |
| Sclk                         | 0.009568       | 0.003684        | 0.01725       | 0.0305      | 20.48          |
| Total (excluding duplicates) | 0.009582       | 0.003695        | 0.01811       | 0.03139     | 21.08          |

| Clock | Clock Period  | Clock Toggle Rate | Clock Static Probability |
|-------|---------------|-------------------|--------------------------|
| Dclk  | 1.302000 usec | 1.5361 Mhz        | 0.5000                   |
| Sclk  | 0.038000 usec | 52.6316 Mhz       | 0.5000                   |

## POWER REPORT
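The percentage column and the clock toggle rates in the power report can be cross-checked from the raw figures. A short Python sketch (variable and function names are illustrative; the power values are in the units the tool reported, which the table does not state):

```python
# Component powers copied from the "Total Power" section of the report.
power = {
    "internal": 0.02032863,
    "switching": 0.00980202,
    "leakage": 0.11880727,
}
total = sum(power.values())
shares = {name: 100.0 * value / total for name, value in power.items()}

for name, pct in shares.items():
    print(f"{name:10s} {pct:8.4f} %")


def toggle_rate_mhz(period_us: float) -> float:
    """A clock toggles twice per period, so the rate is 2 / period."""
    return 2.0 / period_us


print(f"Dclk {toggle_rate_mhz(1.302):.4f} MHz")   # 1.5361
print(f"Sclk {toggle_rate_mhz(0.038):.4f} MHz")   # 52.6316
```

Leakage accounts for roughly 80% of the total, which is consistent with the low clock frequencies of this design: dynamic power scales with switching activity while leakage does not.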

## POWER DISTRIBUTION SUMMARY

```

* Power Distribution Summary:
*   Highest Average Power: processor_L/CTS_cpc_drv_buf_00076 (BUFx24_ASAP7_75t_SL):      0.001681
*   Highest Leakage Power:    CTS_cpc_drv_buf_00083 (BUFx24_ASAP7_75t_SL):      0.0008622
*   Total Cap:      7.55776e-12 F
*   Total instances in design: 2353
*   Total instances in design with no power:      0
*   Total instances in design with no activity:     0
*   Total Fillers and Decap:      0

```

## PAD RING PROPOSAL:



## RECTANGULAR PAD RING WITH MATCHING ASPECT RATIO

### SPECIAL TOPIC

ASICs require highly customised physical designs, particularly with the advent of nanoscale technology nodes. As process geometries shrink, issues like signal integrity, power distribution, and timing become more intricate.

#### 1. Clock Network Layout

A robust clock network is essential for synchronised circuit operation. Key techniques for efficient clock distribution include:

##### Clock Tree Synthesis (CTS):

- **Principle:** A hierarchical tree structure balances the path delays from the clock source to flip-flops, mitigating skew.
- **Challenges:** Variability in interconnect delays, insertion delay (clock latency), and local environmental factors can induce clock skew and jitter.
- **Implementation:** In ASIC design, the clock tree can be balanced using a zero-skew approach, inserting buffers or delay elements strategically so that the delay from the clock source to every sink stays consistent.
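Skew is simply the spread of clock arrival times across sinks. As a minimal sketch, the function below (an illustrative helper, not a tool command) computes it from a map of sink latencies; the two values used are the capture and launch net latencies printed for Path 1 of the post-CTS hold report earlier in this document:

```python
def clock_skew(latency_ps: dict) -> float:
    """Skew = latest clock arrival minus earliest clock arrival (ps)."""
    return max(latency_ps.values()) - min(latency_ps.values())


# Net latencies (ps) of the capture and launch flops of hold Path 1.
latencies = {
    "Main_FSM_inst/state_reg[2]": 60.900,
    "Main_FSM_inst/state_reg[1]": 56.100,
}
print(f"{clock_skew(latencies):.3f} ps")  # 4.800 ps
```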

# CLOCK TREE



# CLOCK TREE REPORT

```
Clock tree Dclk:
Total FF: 40
Max Level: 4
  (L1) port In_Dclk
    \_ (L2) processor_R/sipo_inst/CTS_ccl_a_buf_00054/A -> Y (BUFx24_ASAP7_75t_SL)
      \_ (L3) processor_R/sipo_inst/g395/A -> Y (CKINVDCx14_ASAP7_75t_SRAM)
        \_ ... (40 sinks omitted)

Clock tree Sclk:
Total FF: 159
Max Level: 6
  (L1) port In_Sclk
    \_ (L2) CTS_ccl_a_buf_00051/A -> Y (BUFx24_ASAP7_75t_SL)
      \_ (L3) CTS_cfo_buf_00073/A -> Y (BUFx24_ASAP7_75t_SL)
        \_ (L4) CTS_ccl_a_buf_00048/A -> Y (BUFx24_ASAP7_75t_SL)
          \_ (L5) CTS_cpc_drv_buf_00082/A -> Y (BUFx24_ASAP7_75t_SL)
            \_ ... (38 sinks omitted)
          \_ (L5) CTS_cpc_drv_buf_00083/A -> Y (BUFx24_ASAP7_75t_SL)
            \_ ... (40 sinks omitted)
      \_ (L3) processor_L/CTS_ccl_a_buf_00046/A -> Y (BUFx12f_ASAP7_75t_SRAM)
        \_ (L4) processor_L/CTS_cpc_drv_buf_00076/A -> Y (BUFx24_ASAP7_75t_SL)
          \_ ... (67 sinks omitted)
    \_ (L2) processor_R/g31/A -> Y (INVx13_ASAP7_75t_SRAM)
      \_ (L3) processor_L/CoeffMem_inst/CTS_ccl_a_buf_00033/A -> Y (BUFx24_ASAP7_75t_SL)
        \_ ... (1 sinks omitted)
      \_ (L3) processor_L/CoeffMem_inst/CTS_ccl_a_buf_00036/A -> Y (BUFx24_ASAP7_75t_SL)
        \_ ... (1 sinks omitted)
```

## Clock Mesh Network:

- **Principle:** A mesh network overlays a grid of interconnects that distribute the clock signal uniformly.
- **Challenges:** A mesh drives considerably more wire capacitance than a tree, so it costs more power and routing resources. The mesh density and its drivers must therefore be chosen carefully, placing only the buffers and interconnects that are actually required.
- **Implementation:** A hybrid mesh-CTS structure can be advantageous in ASIC design, combining the reduced skew of the mesh with the low power requirements of CTS.

## 2. Power Network Layout

The power delivery network ensures stable voltage levels for all devices. Its design addresses:

### Grid Distribution:

- **Principle:** Multi-layer grids distribute power uniformly, minimising IR drop and avoiding localised current crowding.
- **Implementation:** ASIC design uses a hierarchical grid structure where higher metal layers handle global power distribution, and lower layers feed local regions.

### Decoupling Capacitors:

- **Purpose:** Minimise voltage fluctuations by providing a local charge reservoir.
- **Implementation:** Decoupling capacitors are placed near high-frequency switching regions of the ASIC to stabilise the power supply.

### IR Drop Analysis:

- **Principle:** Evaluates voltage drops across the network to identify critical points of high resistance.
- **Implementation:** ASIC designers use CAD tools to simulate IR drop, which lets them identify regions that need grid reinforcement.
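The idea behind IR drop analysis can be illustrated with a single power strap fed from one end: each segment carries the current of every tap downstream of it, so the voltage falls fastest near the source and the far end sees the worst drop. The sketch below is a simplified model under stated assumptions; the segment resistance and tap currents are hypothetical, and only the 0.63 V supply comes from the power report.

```python
def strap_voltages(vdd, seg_res_ohm, tap_currents_a):
    """Voltage at each tap of one strap fed from a single end.

    Every segment has resistance seg_res_ohm and carries the sum of the
    currents drawn by all taps at or beyond its far end.
    """
    voltages = []
    v = vdd
    remaining = sum(tap_currents_a)
    for i_tap in tap_currents_a:
        v -= remaining * seg_res_ohm  # V = I * R across this segment
        voltages.append(v)
        remaining -= i_tap            # this tap's current leaves the strap
    return voltages


# Hypothetical strap: 4 taps drawing 1 mA each, 0.5 ohm per segment.
taps = strap_voltages(vdd=0.63, seg_res_ohm=0.5, tap_currents_a=[1e-3] * 4)
worst_drop_mv = (0.63 - min(taps)) * 1e3
print([round(v, 4) for v in taps], f"worst drop = {worst_drop_mv:.1f} mV")
```

Feeding the strap from both ends, or adding straps on a higher metal layer, shortens the resistive path to the far taps, which is the kind of grid reinforcement an IR drop report points to.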

## 3. Physical Verification

Verification ensures the physical design aligns with logical design and manufacturing constraints. Key checks include:

### Design Rule Checking (DRC):

Design rule checking is performed by back-end engineers to confirm that the physical layout of the chip satisfies the process design rules defined by the foundry or library vendor.

- **Objective:** Verify compliance with foundry-specific spacing, width, and layer rules.
- **Implementation:** Automated tools flag violations in the MSDAP layout, helping designers adjust problematic geometries. Current physical verification tools such as IC Validator and Calibre are effective at identifying DRC violations and helping to resolve them. The rules become more complicated at smaller technology nodes, and the tool documentation describes how to fix each class of violation.
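To make the idea of a spacing rule concrete, the following Python sketch checks a minimum-spacing rule between axis-aligned rectangles on one layer. It is a toy illustration, not how IC Validator or Calibre work internally; the shape names, coordinates, and the 0.1 rule value are hypothetical.

```python
from itertools import combinations
from math import hypot


def spacing(a, b):
    """Edge-to-edge distance between rectangles (x1, y1, x2, y2); 0 if they touch or overlap."""
    dx = max(b[0] - a[2], a[0] - b[2], 0.0)
    dy = max(b[1] - a[3], a[1] - b[3], 0.0)
    return hypot(dx, dy)


def spacing_violations(shapes, min_space):
    """Return pairs of shape names that are closer than min_space."""
    return [
        (n1, n2)
        for (n1, r1), (n2, r2) in combinations(shapes.items(), 2)
        if spacing(r1, r2) < min_space
    ]


# Hypothetical shapes on one metal layer (units: um).
shapes = {
    "a": (0.0, 0.0, 1.0, 1.0),
    "b": (1.05, 0.0, 2.0, 1.0),   # only 0.05 um away from "a"
    "c": (3.0, 0.0, 4.0, 1.0),
}
print(spacing_violations(shapes, min_space=0.1))  # [('a', 'b')]
```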

### Antenna Effect Checking:

- **Principle:** When a design contains a long metal route, charge can accumulate on it during fabrication and potentially damage the sensitive gate oxide it connects to.
- **Implementation:** There are generally two types of fix: metal jumpers and antenna diodes. The design employs both techniques to discharge excess charge and avoid antenna effects. A metal jumper breaks up the long metal segment and routes it on a higher metal layer for a short distance, which interrupts the charge flow into the gate of the fanout cell connected to the net. An antenna diode is a standard cell placed close to the gate that is the fanout of a long net. It is a reverse-biased diode that gives the accumulated charge a path to ground.
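Antenna rules are usually expressed as a maximum ratio between the metal area connected to a gate and the gate area itself. A minimal sketch of that check follows; the areas and the limit of 400 are hypothetical values chosen for illustration, not figures from the ASAP7 rule deck.

```python
def antenna_ratio(metal_area_um2: float, gate_area_um2: float) -> float:
    """Ratio of charge-collecting metal area to the gate area it connects to."""
    return metal_area_um2 / gate_area_um2


def needs_antenna_fix(metal_area_um2, gate_area_um2, max_ratio):
    """True when the ratio exceeds the (hypothetical) rule limit."""
    return antenna_ratio(metal_area_um2, gate_area_um2) > max_ratio


MAX_RATIO = 400.0  # hypothetical limit

# Long route on one layer: violates the rule.
print(needs_antenna_fix(50.0, 0.1, MAX_RATIO))   # True
# After a metal jumper, less metal is connected to the gate at that step.
print(needs_antenna_fix(20.0, 0.1, MAX_RATIO))   # False
```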

#### **Electrical Rule Checking (ERC):**

- **Objective:** Identify violations such as short circuits and unconnected nets that may occur in the design.
- **Implementation:** Automated tools are used to analyse the ASIC's netlist, identifying potential electrical errors before the design proceeds to manufacturing. These tools detect issues like missing connections, shorts between power and ground nets, and signal conflicts.
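Two of the checks named above, missing connections and signal conflicts, can be sketched on a toy netlist. The data structure and net names below are hypothetical; production ERC tools work on the extracted layout netlist and apply many more rules.

```python
def erc_check(nets):
    """Flag undriven nets, nets with several drivers, and nets with no load.

    nets maps a net name to {"drivers": [...], "loads": [...]} pin lists.
    """
    issues = []
    for name, pins in nets.items():
        if len(pins["drivers"]) == 0:
            issues.append((name, "undriven"))
        elif len(pins["drivers"]) > 1:
            issues.append((name, "multiple drivers"))
        if not pins["loads"]:
            issues.append((name, "no load"))
    return issues


# Hypothetical netlist fragment.
nets = {
    "clk": {"drivers": ["buf0/Y"], "loads": ["ff0/CLK"]},
    "n1": {"drivers": [], "loads": ["g1/A"]},
    "n2": {"drivers": ["g1/Y", "g2/Y"], "loads": ["ff0/D"]},
    "n3": {"drivers": ["g3/Y"], "loads": []},
}
for net, problem in erc_check(nets):
    print(net, problem)
```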

#### **Layout Versus Schematic (LVS):**

- **Objective:** Ensure that the final layout matches the original circuit schematic to maintain design accuracy.
- **Implementation:** CAD tools compare the netlist extracted from the physical layout against the schematic netlist to verify logical consistency. Any mismatch is flagged for review and correction. This rules out functional mismatches between the layout and the schematic for every cell in the design.
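The comparison step can be sketched as follows. This is a deliberately simplified, name-based comparison under the assumption that instance names match between the two netlists; real LVS tools match the two circuit graphs without relying on names. The instances and nets are hypothetical.

```python
def lvs_compare(schematic, layout):
    """Compare two netlists of the form {instance: (cell_type, {pin: net})}."""
    mismatches = []
    for inst in sorted(set(schematic) | set(layout)):
        if inst not in layout:
            mismatches.append((inst, "missing in layout"))
        elif inst not in schematic:
            mismatches.append((inst, "extra in layout"))
        elif schematic[inst] != layout[inst]:
            mismatches.append((inst, "cell or connectivity mismatch"))
    return mismatches


# Hypothetical two-gate circuit; the layout has U2 pin B on the wrong net.
schematic = {
    "U1": ("INVx1", {"A": "in", "Y": "n1"}),
    "U2": ("NOR2x1", {"A": "n1", "B": "en", "Y": "out"}),
}
layout = {
    "U1": ("INVx1", {"A": "in", "Y": "n1"}),
    "U2": ("NOR2x1", {"A": "n1", "B": "n1", "Y": "out"}),
}
print(lvs_compare(schematic, layout))
```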

#### **Critical Path Analysis:**

- **Objective:** Identify the slowest signal paths that limit overall performance to improve efficiency.
- **Implementation:** Genus highlights the critical paths within an ASIC, revealing areas that require optimisation. Strategies such as buffer insertion, cell sizing, wire rerouting, or logic restructuring reduce delays and ensure that timing constraints are met.
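Finding the critical path amounts to a longest-path search over the timing graph. The sketch below shows the principle on a tiny hypothetical graph with per-node delays; it is not the algorithm Genus uses, which also accounts for slews, wire loads, and constraints. It relies on `graphlib`, available in the Python standard library from version 3.9.

```python
from graphlib import TopologicalSorter


def critical_path(preds, delay):
    """Longest-delay path through a DAG.

    preds maps each node to the list of nodes that feed it;
    delay maps each node to its own delay in ps.
    """
    arrival, parent = {}, {}
    for node in TopologicalSorter(preds).static_order():
        # Latest-arriving predecessor determines this node's arrival time.
        best = max(preds.get(node, []), key=lambda p: arrival[p], default=None)
        arrival[node] = delay[node] + (arrival[best] if best is not None else 0.0)
        parent[node] = best
    end = max(arrival, key=arrival.get)
    path = [end]
    while parent[path[-1]] is not None:
        path.append(parent[path[-1]])
    return arrival[end], path[::-1]


# Hypothetical graph: two parallel gates between a launch and a capture flop.
preds = {"g1": ["ff1"], "g2": ["ff1"], "g3": ["g1", "g2"], "ff2": ["g3"]}
delay = {"ff1": 30.0, "g1": 10.0, "g2": 25.0, "g3": 15.0, "ff2": 5.0}
total, path = critical_path(preds, delay)
print(total, " -> ".join(path))
```

In this example the path through `g2` dominates, so resizing or restructuring `g1` would not improve timing; only changes on the reported path help.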

#### **Signal Integrity:**

- **Challenges:** Crosstalk and electromagnetic interference can cause signal corruption, leading to incorrect logic states and timing violations.
- **Mitigation:** Shielding sensitive signals, carefully routing parallel traces, and increasing spacing between signal lines reduce crosstalk and electromagnetic interference. Ground planes and differential pairs further improve signal integrity.

#### **Thermal Analysis:**

- **Challenges:** High-power circuits generate significant heat, risking performance degradation and potential device failure if not managed correctly.
- **Mitigation:** Thermal-aware placement tools strategically arrange high-power components and optimise power distribution. Heat sinks, thermal vias, and airflow design help maintain safe operating temperatures.

## **ASIC DESIGN FLOW**



IMAGE: ASIC FLOW[1]

## Specification

The specification step is about translating high-level functional requirements into detailed, technical language that can be used throughout the design process.

### Process:

- **Technical Specification Development:** We converted the given specification requirements into technical terms, detailing every I/O and algorithm requirement. We specified the signal types, data formats, timing requirements, and power constraints. In MSDAP, this includes specifying the behaviour for audio signal processing, such as sampling rates and the precision of audio output.
- **Design Trade-offs:** We evaluated potential trade-offs in the design, such as power versus performance, area versus cost, and speed versus complexity. For example, we decided to serialise the 40-bit output data to reduce pin count, balancing the silicon area savings against the potential impact on data throughput and latency.
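The serialisation trade-off can be sketched in a few lines: one output pin replaces forty, at the cost of forty clock cycles per word. The Python model below is illustrative only; the MSB-first bit order is an assumption made for the example and is not taken from the MSDAP specification.

```python
def piso(word: int, width: int = 40) -> list:
    """Parallel-in serial-out: return the bits of word, MSB first (assumed order)."""
    return [(word >> i) & 1 for i in range(width - 1, -1, -1)]


def deserialise(bits) -> int:
    """Rebuild the word from an MSB-first bit stream."""
    value = 0
    for bit in bits:
        value = (value << 1) | bit
    return value


word = 0x12_3456_789A          # any value that fits in 40 bits
bits = piso(word)
print(len(bits), "cycles on 1 pin instead of 1 cycle on 40 pins")
print(hex(deserialise(bits)))
```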

## Model Validation

A C++ simulation of the MSDAP's core functionality serves as a preliminary check to validate and refine the algorithm before committing to hardware implementation. This step ensures that the algorithm performs correctly in terms of logic and efficiency, reducing the risk of costly revisions during the hardware design stages.

### **Process:**

- **Algorithm Translation to C++:** We began by converting the defined DSP functionalities of the MSDAP from the specification into executable C++ code and ran simulations using the C++ model.
- **Validation Against Specifications:** Ensured that the C++ simulation results align with the intended outcomes of the MSDAP as specified in the initial design documents. Adjusted the C++ model as needed to match the specification closely, ensuring that the algorithm behaves as expected.
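The validation loop described above can be sketched as a golden model plus a sample-by-sample comparison. The project model was written in C++; Python is used here only for brevity. The model below is the plain sum-of-products FIR from the algorithm section, assuming zero initial conditions, and does not reproduce the shift-and-add u-term formulation or the fixed-point word lengths. The coefficients and samples are hypothetical.

```python
def fir(h, x):
    """y(n) = sum over k of h(k) * x(n - k), taking x(n - k) = 0 for n < k."""
    return [
        sum(h[k] * x[n - k] for k in range(len(h)) if n - k >= 0)
        for n in range(len(x))
    ]


def first_mismatch(got, expected):
    """Index of the first differing sample, or None when the streams agree."""
    for n, (g, e) in enumerate(zip(got, expected)):
        if g != e:
            return n
    return None


h = [1, 2, 3]            # hypothetical coefficients
x = [1, 0, 0, 1]         # hypothetical input samples
y = fir(h, x)
print(y)                 # [1, 2, 3, 1]
print(first_mismatch(y, [1, 2, 3, 1]))
```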

### **Behavioural Model**

The behavioural model abstracts the MSDAP into a high-level functional unit, allowing for early simulations to validate its functionality against the specifications without concern for the physical implementation details.

## **Process:**

- **High-Level Modeling:** Used Verilog to write a model that captures the chip's intended functions. The model did not incorporate details about the physical gates or connections.
- **Validation:** Ensured that the behavioural model aligns with the technical specifications by verifying its outputs against the expected reference file.

## **Architecture**

In the architectural phase, the MSDAP's design is decomposed into smaller, manageable blocks, which helps in optimising each functional aspect and simplifying the overall design process.

## **Process:**

- **Block Definition:** Defined the functional blocks of the chip, specifying their I/O and their functionality within the system. For example, a serial-in parallel-out (SIPO) block receives the input data bit by bit and delivers it to the memories as a 16-bit word.
- **Data Flow and Control Logic:** Wrote RTL code that describes these interactions and ensures synchronous behaviour across the different modules.
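The behaviour of such a SIPO block can be modelled in a few lines. This is a behavioural sketch only, not the project RTL; the class name and the MSB-first bit order are assumptions made for the example.

```python
class Sipo:
    """Serial-in parallel-out register: collects `width` bits, then emits a word."""

    def __init__(self, width: int = 16):
        self.width = width
        self.value = 0
        self.count = 0

    def shift(self, bit: int):
        """Shift in one bit (MSB first, assumed). Returns the word when full, else None."""
        self.value = (self.value << 1) | (bit & 1)
        self.count += 1
        if self.count == self.width:
            word, self.value, self.count = self.value, 0, 0
            return word
        return None


sipo = Sipo()
sample = 0xA5C3
outputs = [sipo.shift((sample >> i) & 1) for i in range(15, -1, -1)]
print(outputs[-1] == sample)   # the word appears only on the 16th bit
```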

## **Synthesis:**

Synthesis is crucial because it turns the HDL description of the architecture into a gate-level netlist, which is the starting point for the physical layout that will be fabricated.

## **Process:**

- **Gate-Level Conversion:** Genus was used to convert the RTL descriptions into a gate-level netlist, selecting specific gates from the **ASAP7** technology library. The netlist is optimised for power, performance, and area based on predefined constraints.
- **Timing Analysis:** An initial timing analysis was done by Genus to ensure that the design meets the required clocking and performance specifications. Adjusted the RTL code to fix any timing violations.

## **Physical Design:**

This stage transforms the synthesised netlist into a physical layout that can be manufactured, focusing on placing blocks and routing connections efficiently.

## **Process:**

- **Floorplanning:** Used Innovus to place each block within the chip's area and added power straps to improve power distribution.
- **Clock Tree Synthesis:** Designed a clock distribution network that minimises clock skew and provides a reliable timing reference to all parts of the chip.
- **Routing:** Implemented global and detailed routing strategies to connect all components with minimal delay and area consumption.

## **WORKING EXPERIENCE:**

- Worked on different architectures while writing the RTL to optimise the frequency. We had no prior exposure to architecture development or RTL writing, and because the MSDAP involves many computation units, it made us consider multiple architectures.
- Learned that RTL synthesis involves generic (unmapped) synthesis, logic optimisation, and technology mapping.

## **REFERENCES:**

[1] <https://www.einfochips.com/blog/asic-design-flow-in-vlsi-engineering-services-a-quick-guide/>