



Faculty of Engineering – Ain Shams University

## ASIC Design and Automation ECE413

Course Project

# Back-End Design for a MIPS16 Processor

| Name                       | ID      | Email                  |
|----------------------------|---------|------------------------|
| Ahmed Hamdy Elhusseiny     | 2101133 | 2101133@eng.asu.edu.eg |
| Mahmoud Hesham Abdelmoniem | 2100613 | 2100613@eng.asu.edu.eg |
| Ahmed Alaa Mohamed         | 2101434 | 2101434@eng.asu.edu.eg |
| Kareem Tarek Ibrahim       | 2100386 | 2100386@eng.asu.edu.eg |

December 25, 2025

# Contents

|          |                              |
|----------|------------------------------|
| <b>1</b> | <b>Introduction</b>          |
| <b>2</b> | <b>Back-End Design Flow</b>  |
| 2.1      | Digital Synthesis            |
| 2.2      | Formal Verification          |
| 2.3      | Floorplanning                |
| 2.4      | Power Planning               |
| 2.5      | Placement                    |
| 2.6      | Clock Tree Synthesis (CTS)   |
| 2.7      | Routing                      |
| 2.8      | Chip Finish and Output       |
| 2.9      | Reports and Analysis         |
| 2.10     | Setup Timing Report          |
| 2.11     | Hold Timing Report           |
| 2.12     | Design Rule Check (DRC)      |
| 2.13     | Placement Legality Check     |
| 2.14     | Area Report                  |
| <b>3</b> | <b>Conclusions</b>           |

# Chapter 1

## Introduction

The continuous advancement of integrated circuit (IC) technology has enabled the implementation of increasingly complex digital systems on a single chip. Among these systems, microprocessors play a central role in modern computing applications, ranging from embedded systems to high-performance processors. Designing such systems requires a structured design methodology that ensures correct functionality, high performance, and reliable physical implementation.

This project focuses on the back-end design of a 16-bit MIPS processor (MIPS16), starting from a synthesized gate-level netlist and progressing through all major physical design stages. The flow covers synthesis and formal verification of the netlist, floorplanning, power planning, placement, clock tree synthesis (CTS), routing, and final chip finishing and signoff. Each stage is executed to meet the timing, power, and physical design constraints imposed by the selected technology.

The objective of this project is to apply industry-standard ASIC back-end design techniques and tools to transform a functional RTL design into a manufacturable layout. Emphasis is placed on achieving timing closure, ensuring power integrity, maintaining design rule compliance, and verifying layout correctness through DRC and LVS checks. The final outcome of this project is a clean, signoff-ready GDSII layout along with extracted parasitic data for post-layout analysis.

Through this project, practical experience is gained in understanding the challenges and trade-offs involved in ASIC physical design, providing a strong foundation for advanced digital IC design and verification tasks.

# Chapter 2

## Back-End Design Flow

## 2.1 Digital Synthesis

Digital synthesis is the first step in the back-end design flow, where the register-transfer level (RTL) description of the processor is translated into a gate-level netlist. The RTL design, typically written in a hardware description language such as Verilog, describes the intended functionality and behavior of the system without including physical implementation details.

During synthesis, the RTL code is mapped to a network of logic gates using only the standard cells provided in the selected technology library. This library defines the available gates, their logical functions, timing characteristics, power consumption, and physical properties. Consequently, the synthesis process instantiates only cells from this library while preserving functional equivalence with the original RTL design.

The output of the synthesis stage is a gate-level Verilog netlist (.v file). This netlist serves as the input to the formal verification stage, where functional equivalence between the synthesized gate-level implementation and the original RTL description is verified. The synthesized netlist is also the starting point for the subsequent physical design steps, including floorplanning, placement, and routing.

After synthesis, static timing analysis (STA) was performed on the gate-level netlist to check for setup and hold violations and to verify that the design meets the required timing constraints before proceeding to the physical design flow.

The timing report was generated using maximum delay analysis under worst-case operating conditions. The results confirm that all analyzed timing paths satisfy the required timing constraints, as indicated by positive slack values. This demonstrates that the synthesized design is timing-clean at this stage of the design flow.

The following excerpt shows a portion of the synthesis timing report for the MIPS16 processor, highlighting one of the critical timing paths analyzed during synthesis.

```
*****
Report : timing
    -path full
    -delay max
    -max_paths 10
Design : mips_16
Version: G-2012.06-SP2
Date   : Fri Dec 21 16:37:04 2018
*****

Operating Conditions: worst_low
Library: NangateOpenCellLibrary_ss0p95vn40c

Startpoint: pc_current_reg[1]
Endpoint  : pc_current_reg[15]
Path Group: clk
Path Type : max

data arrival time    0.80
data required time   2.37
slack (MET)          1.57
```

As shown in the report, the positive slack of 1.57 confirms that the data arrival time (0.80) is earlier than the data required time (2.37) for the target clock period, so the setup constraint is met on this path. Because the excerpt is a maximum-delay report, it demonstrates setup compliance only; hold timing is reported after layout in Section 2.11. The synthesized netlist is therefore suitable for subsequent back-end design steps such as floorplanning, placement, and routing.
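The slack value in the excerpt follows directly from the two reported times. The following is a minimal sketch of that arithmetic; the function name `setup_slack` is hypothetical and the numbers are copied from the report excerpt above.

```python
def setup_slack(data_required_time: float, data_arrival_time: float) -> float:
    """Setup (max-delay) slack: required time minus arrival time."""
    return data_required_time - data_arrival_time


# Values copied from the synthesis timing report excerpt.
slack = setup_slack(data_required_time=2.37, data_arrival_time=0.80)

print(f"slack = {slack:.2f}")                 # slack = 1.57
print("MET" if slack >= 0 else "VIOLATED")    # MET
```

A non-negative result means the data arrives no later than it is required, which is what the `(MET)` tag in the report indicates.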

## 2.2 Formal Verification

Formal verification is employed to ensure functional equivalence between the synthesized gate-level netlist and the original RTL design. In this project, the gate-level netlist generated by the synthesis tool is treated as the implementation design (**IMPL**), while the RTL description serves as the reference design (**REF**).

The formal verification process compares REF and IMPL by performing equivalence checking without applying any testbench or input stimuli. Instead, the tool internally matches corresponding compare points in both designs and evaluates their logical behavior to guarantee that the synthesized design preserves the exact functionality of the original RTL description.

Formal verification is preferred over gate-level simulation for large and complex designs because it provides an exhaustive and time-efficient method to prove functional correctness. Unlike simulation-based approaches, formal verification explores all possible input combinations, making it a critical step in modern ASIC design flows.
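The idea of proving equivalence over every input combination can be illustrated on a toy circuit. The sketch below is an illustration only: the two multiplexer functions are hypothetical stand-ins for a reference and an implementation design, and brute-force enumeration is used purely for clarity. Production equivalence checkers rely on symbolic techniques such as BDDs and SAT solving rather than enumerating inputs.

```python
from itertools import product


def ref_mux(a: int, b: int, sel: int) -> int:
    """Reference (RTL-style) description of a 2:1 multiplexer."""
    return b if sel else a


def impl_mux(a: int, b: int, sel: int) -> int:
    """Gate-level style implementation: (a AND NOT sel) OR (b AND sel)."""
    return (a & (1 - sel)) | (b & sel)


def equivalent(ref, impl, n_inputs: int) -> bool:
    """Return True if both functions agree on every input combination."""
    return all(
        ref(*bits) == impl(*bits)
        for bits in product((0, 1), repeat=n_inputs)
    )


print(equivalent(ref_mux, impl_mux, 3))  # True
```

No testbench or stimulus file is involved: the comparison covers the whole input space, which is the property that distinguishes equivalence checking from simulation.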

### Formal Verification Results

The equivalence checking process was completed successfully using the formal verification tool. The results indicate that all compare points between the reference and implementation designs match correctly.

#### Passing Compare Points Report

```
*****
Report      : passing_points
Reference   : Ref:/WORK/mips_16
Implementation : Imp:/WORK/mips_16
Version     : L-2016.03-SP1
Date        : Tue Jan  2 05:52:18 2024
*****
```

47 Passing compare points

#### Failing Compare Points Report

```
*****
Report      : failing_points
Reference   : Ref:/WORK/mips_16
Implementation : Imp:/WORK/mips_16
Version     : L-2016.03-SP1
Date        : Tue Jan  2 05:52:18 2024
*****
```

No failing compare points.

## 2.3 Floorplanning

In this stage, the physical design setup was initialized by defining the design name and loading the required standard cell libraries, technology files, and timing models. The **gate-level netlist generated by the synthesis tool** was imported into the place-and-route environment, and the synthesis constraints were applied. Clock propagation was enabled to allow accurate timing analysis.

A starting floorplan was created with a core utilization of 25% and predefined IO-to-core spacing, while routing constraints were set by limiting the maximum routing layer to metal6. An initial virtual flat placement was then performed, resulting in the placement of standard cells within the defined core area, as shown in Fig. 2.1. Finally, the design was saved in the P&R tool database to be used as the input for the next physical design stage.



Figure 2.1: Initial floorplan and virtual placement showing distributed standard cells in the core area
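Core utilization is the ratio of standard-cell area to core area, so a 25% target implies a core roughly four times larger than the cells it holds. The sketch below works through that relation under two stated assumptions: the cell area is taken from the final area report in Section 2.14 (the value at floorplanning time may have differed), and the core is assumed square. The resulting numbers are illustrative and are not values reported by the tool.

```python
import math


def core_area_for_utilization(cell_area: float, utilization: float) -> float:
    """Core area such that cell_area / core_area equals the target utilization."""
    if not 0.0 < utilization <= 1.0:
        raise ValueError("utilization must be in (0, 1]")
    return cell_area / utilization


# Assumption: total cell area from the final area report (library area units).
cell_area = 38084.815394
core_area = core_area_for_utilization(cell_area, utilization=0.25)

# Assumption: square core (aspect ratio of 1).
core_side = math.sqrt(core_area)

print(f"core area ~ {core_area:.1f}")
print(f"core side ~ {core_side:.1f}")
```

A low utilization such as this leaves generous room for routing, clock buffers, and hold-fixing cells at the cost of a larger die.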

## 2.4 Power Planning

In the power planning stage, the power and ground connectivity was first defined by logically connecting the VDD and VSS nets to the corresponding power and ground pins of all standard cells, ensuring correct power intent across the design.

A power ring was created around the core using upper metal layers, followed by a multi-layer power mesh across the core area to provide uniform power distribution and to minimize IR-drop. Virtual power pads were placed around the core boundary to emulate external power sources and to provide multiple current injection points for the VDD and VSS nets.

The power network was synthesized based on the specified voltage supply and power budget, and IR-drop analysis was performed to verify that the voltage drop remained within acceptable limits. Tap cells were then inserted throughout the design to tie the n-wells to VDD and the substrate to VSS, providing proper well biasing and latch-up protection for the standard cells. Finally, the design was updated and saved for use in the subsequent physical design stages.



Figure 2.2: IR-drop analysis of the power distribution network showing voltage drop across the core area

Figure 2.2 shows the IR-drop analysis results for the implemented power distribution network. The color map represents the voltage drop across the core, where lower drop regions appear in cooler colors, while higher drop regions appear in warmer colors. The maximum observed voltage drop is within the specified limit of 22 mV, corresponding to 2% of the nominal 1.1 V supply. This confirms that the power ring and multi-layer power mesh provide adequate current delivery and that the design meets power integrity requirements.
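The 22 mV limit quoted above is simply the 2% budget applied to the 1.1 V supply. A minimal sketch of that calculation follows; the function name `ir_drop_limit_mv` is hypothetical.

```python
def ir_drop_limit_mv(nominal_supply_v: float, budget_fraction: float) -> float:
    """Maximum allowed IR drop in millivolts for a given supply and budget."""
    return nominal_supply_v * budget_fraction * 1000.0


limit_mv = ir_drop_limit_mv(nominal_supply_v=1.1, budget_fraction=0.02)
print(f"IR-drop limit = {limit_mv:.1f} mV")  # IR-drop limit = 22.0 mV
```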

## 2.5 Placement

Placement is the stage where standard cells are assigned exact physical locations within the core area while optimizing timing, congestion, and area. Before starting placement, physical design and constraint checks were performed to ensure that the design was ready for optimization.

The placement flow begins with `place_opt`, which represents the **first optimization step in the placement stage**. This step performs timing-driven initial placement and basic optimization, resulting in a legalized placement that considers setup timing and congestion constraints.

After initial placement, incremental placement optimization was carried out using `psynopt`. This optimization refines cell positions and sizes to further improve timing and congestion while preserving placement legality. Because optimization can insert or resize cells, the power and ground connectivity must be re-established after each optimization step so that the supply pins and tie-off connections of every new cell are attached to the correct nets.

Therefore, the power and ground connections, including tie connections, were re-derived using:

```
derive_pg_connection -power_net VDD -ground_net VSS -tie
```

Finally, placement legality was verified, and tie cells were connected to drive constant logic values required by the design. The completed placement was then saved for the subsequent clock tree synthesis and routing stages.



Figure 2.3: Final standard-cell placement result



Figure 2.4: Terminal output of the `check_legality` command

## 2.6 Clock Tree Synthesis (CTS)

Clock Tree Synthesis (CTS) is the stage where a balanced clock distribution network is constructed to deliver the clock signal from the source to all sequential elements while minimizing clock skew and insertion delay. In this stage, clock buffers are inserted and optimized to achieve uniform clock arrival times across the design.

The CTS flow performs timing-driven buffer insertion and clock tree balancing, ensuring that both setup and hold constraints are satisfied. Special care is taken to control clock skew and transition times while preserving placement legality.

As a result of the CTS process, the inserted clock buffers exhibit equal rising and falling delays, indicating a well-balanced clock tree and symmetric clock signal propagation. The completed CTS design is then prepared for post-CTS optimization and routing.
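Clock skew and insertion delay can be made concrete with a small calculation. The sketch below uses hypothetical clock arrival times at three hypothetical sink pins; none of these values come from the project's CTS reports.

```python
from typing import Dict


def clock_skew(arrival_times: Dict[str, float]) -> float:
    """Global skew: latest minus earliest clock arrival among all sinks."""
    return max(arrival_times.values()) - min(arrival_times.values())


# Hypothetical clock arrival times at three register clock pins.
arrivals = {"reg_a/CK": 0.21, "reg_b/CK": 0.23, "reg_c/CK": 0.22}

skew = clock_skew(arrivals)
longest_insertion_delay = max(arrivals.values())

print(f"skew = {skew:.2f}")                                        # skew = 0.02
print(f"longest insertion delay = {longest_insertion_delay:.2f}")  # longest insertion delay = 0.23
```

CTS buffer insertion aims to shrink the first number without letting the second grow more than necessary.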

## 2.7 Routing

Routing is the stage where electrical connections are physically implemented between placed standard cells. At the beginning of this stage, spare cells are added to the design to allow for future Engineering Change Orders (ECOs) without requiring major layout modifications. Routing is responsible for completing signal connectivity while meeting timing, signal integrity, and design rule constraints.

The routing process is divided into three main stages. The first stage is **global routing**, which determines coarse routing paths and evaluates routing resources and congestion. The second stage is **track assignment**, where specific routing tracks are assigned to each net based on timing and congestion requirements. The final stage is **detailed routing**, which generates exact wire shapes and vias while satisfying all design rule constraints.

Timing-driven routing, signal integrity optimization, and crosstalk prevention were enabled to improve overall routing quality. Hold time fixing and incremental routing optimizations were applied as needed, followed by route verification to ensure a clean and DRC-compliant design. Finally, power and ground connectivity was re-derived, and the routed design was saved for signoff analysis.



Figure 2.5: Final routed layout of the MIPS16 processor



Figure 2.6: Post-routing DRC verification results

## 2.8 Chip Finish and Output

During the chip finishing stage, the final layout is thoroughly checked and verified to ensure full physical correctness. Design Rule Check (DRC) is performed to confirm that the layout complies with all technology design rules, while Layout Versus Schematic (LVS) verification is carried out to ensure that the extracted layout netlist matches the synthesized schematic without any connectivity mismatches. All input, output, and power pins are clearly defined and verified to guarantee correct external interfacing.

In addition, filler cells are inserted throughout the layout to eliminate any discontinuities between the n-well and p-well regions and to ensure proper well and substrate continuity. These filler cells are correctly connected to the power rails, with n-well regions tied to VDD and p-well regions tied to VSS, which helps maintain robust power distribution and prevents latch-up issues.

After successful physical verification, the final GDSII file is generated for fabrication. Parasitic extraction is then performed to produce the Standard Parasitic Exchange Format (.spf) file, which captures the extracted resistance and capacitance information. This .spf file is subsequently used as the input for post-layout simulation and analysis using the SPICE-based simulation tool (Spi-glass).

## 2.9 Reports and Analysis

## 2.10 Setup Timing Report

```
*****
Report : timing
    -path full
    -delay max
    -max_paths 1
Design : mips_16
Version: G-2012.06-ICC-SP2
Date   : Tue Dec 25 14:18:16 2018
*****

 * Some/all delay information is back-annotated.

Operating Conditions: worst_low   Library: NangateOpenCellLibrary_ss0p95vn40c
  Parasitic source      : LPE
  Parasitic mode        : RealRC
  Extraction mode       : MIN_MAX
  Extraction derating   : -40/-40/-40

Information: Percent of Arnoldi-based delays = 0.00%

Startpoint: pc_current_reg_1_
Endpoint  : pc_current_reg_15_
Path Group: clk
Path Type : max

data arrival time    0.88
data required time   2.53
slack (MET)          1.66
```

## 2.11 Hold Timing Report

```
*****
Report : timing
    -path full
    -delay min
    -max_paths 1
Design : mips_16
Version: G-2012.06-ICC-SP2
Date   : Tue Dec 25 14:17:06 2018
*****

 * Some/all delay information is back-annotated.

Operating Conditions: worst_low   Library: NangateOpenCellLibrary_ss0p95vn40c
  Parasitic source      : LPE
  Parasitic mode        : RealRC
  Extraction mode       : MIN_MAX
  Extraction derating   : -40/-40/-40

Startpoint: datamem/ram_reg_101__14_
Endpoint  : datamem/ram_reg_101__14_
Path Group: clk
Path Type : min

data arrival time    0.33
data required time   0.33
slack (MET)          0.00
```
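The two post-layout reports use opposite slack conventions: setup slack is required time minus arrival time, while hold slack is arrival time minus required time. The sketch below recomputes both from the printed values; the function names are hypothetical. Recomputing the setup slack from the two-decimal figures gives 1.65 rather than the reported 1.66. A plausible explanation, stated here as an assumption, is that the tool computes slack at full precision and rounds each printed number independently.

```python
def setup_slack(required: float, arrival: float) -> float:
    """Max-delay check: data must arrive no later than the required time."""
    return required - arrival


def hold_slack(required: float, arrival: float) -> float:
    """Min-delay check: data must arrive no earlier than the required time."""
    return arrival - required


# Values copied from the post-layout setup and hold reports.
setup = setup_slack(required=2.53, arrival=0.88)
hold = hold_slack(required=0.33, arrival=0.33)

print(f"setup slack from printed values = {setup:.2f}")  # 1.65 (report shows 1.66)
print(f"hold slack from printed values  = {hold:.2f}")   # 0.00
```

A hold slack of exactly 0.00 is reported as met, but it leaves no margin on that path.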

## 2.12 Design Rule Check (DRC)



Figure 2.7: Design Rule Check (DRC) Report

## 2.13 Placement Legality Check



Figure 2.8: Placement Legality Check

## 2.14 Area Report

```
*****
Report : area
Design : mips_16
Version: G-2012.06-ICC-SP2
Date   : Tue Dec 25 14:25:31 2018
*****

Library(s) Used:
    NangateOpenCellLibrary_ss0p95vn40c

Number of ports:                          34
Number of nets:                          387
Number of cells:                         261
Number of combinational cells:           239
Number of sequential cells:               16
Number of macros:                          0
Number of buf/inv:                        68
Number of references:                     28

Combinational area:             18796.624042
Buf/Inv area:                    4246.689916
Noncombinational area:          19288.191353
Net Interconnect area:             undefined

Total cell area:                38084.815394
```
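As a consistency check on the report, the total cell area should equal the combinational plus noncombinational areas; the buffer/inverter area is not added separately because it is normally counted inside the combinational figure. The sketch below verifies this with the reported values, allowing a small tolerance for rounding of the printed numbers. The variable names are hypothetical.

```python
import math

# Values copied from the area report.
combinational_area = 18796.624042
noncombinational_area = 19288.191353
reported_total_area = 38084.815394

computed_total_area = combinational_area + noncombinational_area
totals_match = math.isclose(
    computed_total_area, reported_total_area, abs_tol=1e-5
)

# Share of the total cell area taken by sequential (noncombinational) cells.
sequential_share = noncombinational_area / reported_total_area

print(f"computed total = {computed_total_area:.6f}")
print(f"matches report within rounding: {totals_match}")
print(f"sequential share = {sequential_share:.1%}")
```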

# Chapter 3

## Conclusions

This project successfully demonstrated a complete ASIC back-end design flow for the MIPS16 processor, starting from a synthesized gate-level netlist and ending with a signoff-ready layout. During the physical implementation, several design rule check (DRC) violations were encountered after routing. These issues were resolved by applying incremental detailed routing using the `route_zrt_detail -incremental true` command, which effectively fixed the remaining routing violations without disrupting the existing layout.

Additionally, some hold time violations were observed during timing analysis. To resolve these violations, buffer cells were inserted along the critical minimum-delay paths. After buffer insertion, placement legality was verified using the `check_legality` command, and DRC was re-run to ensure that no new physical violations were introduced. Final setup and hold timing analysis confirmed that all timing constraints were met, with a setup slack of 1.66 and a hold slack of 0.00 on the reported worst paths.

Overall, the final design is free of DRC violations, placement legality issues, and setup or hold timing violations, confirming that the MIPS16 processor meets all functional, physical, and timing requirements and is ready for fabrication.