

# Optimized vs. Unoptimized VLSI Design: A Comparative Analysis

Rajvi Desai and Piyush Maru

*Department of Electronics and Communication Engineering*

*Nirma University*

Ahmedabad, India

**Abstract**—This paper presents the design, implementation, and comparative analysis of a 2-bit Arithmetic Logic Unit (ALU) with integrated seven-segment display decoder using Microwind 3 VLSI design tool. The ALU employs a 4x1 multiplexer for operation selection, supporting addition, subtraction, and multiplication, with the fourth input grounded for future expansion. Two distinct design methodologies are implemented and compared: an unoptimized version utilizing pre-designed logic gates from Microwind's library, and an optimized version featuring transistor-level custom design with shared diffusion regions, optimized transistor placement, and compact routing strategies. The 2-bit output is decoded and displayed on a seven-segment LED display for visual verification. Comprehensive performance comparison is conducted using Microwind's built-in simulation and analysis tools, evaluating critical metrics including silicon area utilization, transistor count, power consumption, and propagation delay. Simulation results demonstrate that the optimized transistor-level design achieves superior area efficiency.

**Index Terms**—ALU, VLSI, Microwind, optimization, seven-segment decoder, full-custom design, layout design, CMOS

## I. INTRODUCTION

An Arithmetic Logic Unit (ALU) is a fundamental building block of microprocessors and digital systems, responsible for performing arithmetic and logical operations. In modern VLSI design, optimizing digital circuits is essential for reducing power consumption and improving performance. An optimized circuit also occupies less silicon area, which lowers manufacturing cost at production scale.

Our work focuses on designing a 2-bit ALU capable of performing addition, subtraction, and multiplication operations, with an integrated seven-segment decoder for displaying the output. The design is implemented using Microwind, a full-custom VLSI layout tool that provides comprehensive capabilities including:

- Physical layout with Design Rule Check (DRC) option.
- CMOS process library selection (120nm, 180nm, 350nm).
- Transient and static simulation.
- Power and delay analysis.

The primary contribution of this work is a detailed comparison between unoptimized and optimized implementations, demonstrating practical optimization techniques applicable to full-custom VLSI design.

## II. OBJECTIVES

The objectives of this project are:

- 1) Design and implement an unoptimized 2-bit ALU in Microwind.
- 2) Develop an optimized version using layout and logic optimization techniques.
- 3) Integrate a seven-segment decoder for output display.
- 4) Perform comprehensive comparison of both designs.
- 5) Analyze area, power consumption, and propagation delay.
- 6) Validate functionality through Microwind simulation.

## III. SYSTEM OVERVIEW

### A. ALU Operations

The designed 2-bit ALU performs the following operations, selected by the control lines $S[1:0]$:

- **Multiplication (MUL)**: Generates product of two 2-bit numbers.
- **Addition (ADD)**: Computes sum of two 2-bit operands.
- **Subtraction (SUB)**: Performs 2's complement subtraction.
- **Grounding**: The fourth multiplexer input is tied to ground and can later be replaced by the output of an additional operation.
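The operation selection can be sketched behaviorally in Python. The select encoding (0 = ADD, 1 = SUB, 2 = MUL, 3 = grounded) and the 4-bit result width are assumptions made for illustration; the paper does not state either.

```python
def alu_2bit(a: int, b: int, sel: int) -> int:
    """Behavioral model of the 2-bit ALU; returns a 4-bit result.

    The select encoding and the 4-bit output width are assumed, not
    taken from the layout.
    """
    if not (0 <= a <= 3 and 0 <= b <= 3 and 0 <= sel <= 3):
        raise ValueError("a, b and sel must each fit in 2 bits")
    # Candidate results feeding the 4x1 multiplexer (ordering is assumed).
    mux_inputs = (
        a + b,          # ADD: sum of two 2-bit operands (at most 6)
        (a - b) & 0xF,  # SUB: 2's complement difference, wrapped to 4 bits
        a * b,          # MUL: product of two 2-bit operands (at most 9)
        0,              # grounded input reserved for a future operation
    )
    return mux_inputs[sel] & 0xF


if __name__ == "__main__":
    for sel, name in enumerate(("ADD", "SUB", "MUL", "GND")):
        print(name, alu_2bit(3, 2, sel))
```

A model like this is mainly useful as a reference against which simulated layout outputs can be compared.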

### B. System Architecture

The complete system consists of:

- 1) Input stage: Two 2-bit operands ( $A[1:0]$ ,  $B[1:0]$ )
- 2) Control logic: Operation selector ( $S[1:0]$ )
- 3) ALU core: Arithmetic computation unit
- 4) Seven-segment decoder: BCD to 7-segment converter
- 5) Output stage: Result and display signals

## IV. UNOPTIMIZED ALU DESIGN

### A. Logic Design

The unoptimized design employs direct implementation of Boolean functions without optimization:

- Full adders implemented using standard Sum-of-Products (SOP) expressions
- Separate subtractor circuit using 2's complement method
- Independent multiplier using AND gates and adders
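For reference, the canonical SOP form of a full adder can be checked exhaustively with a short Python sketch. The expressions are the textbook minterm forms and are assumed to match the gate-level netlist, which the paper does not list.

```python
from itertools import product


def full_adder_sop(a: int, b: int, cin: int) -> tuple[int, int]:
    """Full adder written directly as Sum-of-Products, without minimization."""
    na, nb, nc = 1 - a, 1 - b, 1 - cin
    # Sum: one product term per minterm with an odd number of ones.
    s = (na & nb & cin) | (na & b & nc) | (a & nb & nc) | (a & b & cin)
    # Carry: majority function of the three inputs.
    cout = (a & b) | (b & cin) | (a & cin)
    return s, cout


def check_full_adder() -> bool:
    """Compare the SOP outputs with integer addition for all 8 input rows."""
    for a, b, cin in product((0, 1), repeat=3):
        s, cout = full_adder_sop(a, b, cin)
        if 2 * cout + s != a + b + cin:
            return False
    return True


if __name__ == "__main__":
    print("SOP full adder correct:", check_full_adder())
```

The sum output alone needs four 3-input product terms in this form, which is one reason the direct implementation is costly in transistors.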



Fig. 1. System block diagram of 2-bit ALU with seven-segment decoder.

### B. Layout Characteristics

Key features of the unoptimized layout include:

- Higher transistor count
- Increased diffusion area
- Longer interconnect routing
- Multiple metal layers with congestion
- Higher parasitic capacitance

### C. Design Methodology

The unoptimized circuit was designed using the following approach:

- 1) Direct translation of Boolean equations to gates
- 2) Standard cell placement without optimization
- 3) Conventional routing without consideration of path length
- 4) No sharing of common sub-expressions

## V. OPTIMIZED ALU DESIGN

### A. Logic Optimization

Several optimization techniques were applied:

- **Boolean Minimization:** Karnaugh maps used to reduce logic expressions
- **Common Sub-expression Elimination:** Identified and reused intermediate signals
- **Multiplexer Optimization:** Reduced control logic complexity
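One possible instance of common sub-expression reuse is sketched below in Python: the propagate term `a ^ b` is computed once per bit and shared by the sum and carry, and the same ripple adder serves subtraction by inverting B and injecting a carry-in. Whether the authors shared the adder in exactly this way is an assumption; the function names are hypothetical.

```python
def full_adder_shared(a: int, b: int, cin: int) -> tuple[int, int]:
    """Full adder that reuses the propagate term p = a XOR b."""
    p = a ^ b                   # computed once, used by both outputs
    s = p ^ cin
    cout = (a & b) | (p & cin)
    return s, cout


def add_sub_2bit(a: int, b: int, subtract: int) -> tuple[int, int]:
    """2-bit ripple adder reused for subtraction: A + (B XOR sub) + sub.

    Returns (2-bit result, carry out). For subtraction the carry out is 1
    when no borrow occurs, i.e. when a >= b.
    """
    carry = subtract            # carry-in of 1 completes the 2's complement
    result = 0
    for i in range(2):
        ai = (a >> i) & 1
        bi = ((b >> i) & 1) ^ subtract  # conditional inversion of B
        s, carry = full_adder_shared(ai, bi, carry)
        result |= s << i
    return result, carry


if __name__ == "__main__":
    print("3 + 2 ->", add_sub_2bit(3, 2, 0))
    print("3 - 2 ->", add_sub_2bit(3, 2, 1))
```

Sharing the adder removes the need for a separate subtractor datapath at the cost of one XOR per B bit.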

### B. Layout Optimization Strategies

The optimized layout incorporates several physical design techniques:

- 1) **Shared Diffusion:** Adjacent transistors share source/drain regions
- 2) **Euler Path Method:** NMOS and PMOS transistors ordered to minimize diffusion breaks
- 3) **Compact Routing:** Minimized metal-1 interconnect length
- 4) **Standard Cell Approach:** Row-based placement for regularity
- 5) **Minimum Spacing:** DRC-compliant minimum spacing rules applied
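The effect of gate ordering on diffusion sharing can be illustrated with a small Python sketch. It counts the diffusion breaks a given left-to-right gate order forces in the pull-down and pull-up rows. The example gate, $Y = \overline{A \cdot B + C}$, and its internal node names (`n1`, `p1`) are hypothetical and are not taken from the ALU layout.

```python
from itertools import permutations


def diffusion_breaks(order, network):
    """Fewest diffusion breaks for one transistor row under a gate order.

    network maps each gate input to the two terminal nodes of its
    transistor. Two neighbours can share diffusion only if the right
    terminal of one is the left terminal of the next.
    """
    best = {}  # node at the right edge of the row -> fewest breaks so far
    for idx, gate in enumerate(order):
        u, v = network[gate]
        if idx == 0:
            nxt = {u: 0, v: 0}  # first device: either orientation is free
        else:
            nxt = {u: float("inf"), v: float("inf")}
            for end, cost in best.items():
                if end == u:    # share diffusion, device oriented u -> v
                    nxt[v] = min(nxt[v], cost)
                elif end == v:  # share diffusion, device oriented v -> u
                    nxt[u] = min(nxt[u], cost)
                # A break is always possible, with either orientation.
                nxt[u] = min(nxt[u], cost + 1)
                nxt[v] = min(nxt[v], cost + 1)
        best = nxt
    return min(best.values())


# Y = NOT(A.B + C): A and B in series, in parallel with C (pull-down);
# the dual network (A parallel B, in series with C) for the pull-up.
PULL_DOWN = {"A": ("Y", "n1"), "B": ("n1", "GND"), "C": ("Y", "GND")}
PULL_UP = {"A": ("VDD", "p1"), "B": ("VDD", "p1"), "C": ("p1", "Y")}


def total_breaks(order):
    """Breaks summed over both rows for a common polysilicon order."""
    return diffusion_breaks(order, PULL_DOWN) + diffusion_breaks(order, PULL_UP)


if __name__ == "__main__":
    for order in permutations("ABC"):
        print("".join(order), total_breaks(order))
```

An order with zero total breaks corresponds to a common Euler path through both networks, which lets each row be drawn as a single unbroken diffusion strip.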

### C. Transistor-Level Optimization

- Proper sizing of PMOS and NMOS transistors (W/L ratios)
- Reduced number of series-connected transistors
- Balanced pull-up and pull-down networks
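A common first-order sizing rule from standard CMOS texts widens each device in proportion to the length of its series stack and scales PMOS widths by the electron-to-hole mobility ratio. The Python sketch below applies that rule; the mobility ratio of 2 and the unit width are illustrative assumptions, not values used in this design.

```python
def size_gate(unit_width_um: float, nmos_series: int, pmos_series: int,
              mobility_ratio: float = 2.0) -> tuple[float, float]:
    """Return (NMOS width, PMOS width) for roughly balanced drive strength.

    unit_width_um   width of the NMOS in a reference inverter (assumed)
    nmos_series     longest series stack in the pull-down network
    pmos_series     longest series stack in the pull-up network
    mobility_ratio  assumed electron/hole mobility ratio
    """
    wn = unit_width_um * nmos_series
    wp = unit_width_um * mobility_ratio * pmos_series
    return wn, wp


if __name__ == "__main__":
    print("inverter:", size_gate(1.0, 1, 1))
    print("NAND2:   ", size_gate(1.0, 2, 1))
    print("NOR2:    ", size_gate(1.0, 1, 2))
```

Under this rule a 2-input NOR needs much wider PMOS devices than a 2-input NAND, which is why reducing series-connected transistors in the pull-up network saves area.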

## VI. SEVEN-SEGMENT DECODER DESIGN

### A. Optimized Decoder

Optimization techniques applied:

- 1) **Karnaugh Map Minimization:** Reduced Boolean expressions for each segment
- 2) **Common Term Extraction:** Identified shared intermediate products
- 3) **Gate Reduction:** Eliminated redundant logic gates
- 4) **Compact Layout:** Systematic placement reducing routing complexity

Example optimization for segment 'a':

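$$a_{unopt} = \overline{Y2}\,\overline{Y0} + Y2\,Y0 + Y3 + Y1 \quad (1)$$

$$a_{opt} = Y3 + Y1 + (Y2 \odot Y0) \quad (2)$$

Here $\odot$ denotes XNOR: the first two product terms of (1) satisfy $\overline{Y2}\,\overline{Y0} + Y2\,Y0 = Y2 \odot Y0$, so a single XNOR gate replaces two AND gates and one OR input.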
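The equivalence of the two forms can be checked exhaustively with a short Python sketch. It reads the first term of (1) as the product of the complemented inputs $\overline{Y2}$ and $\overline{Y0}$ and assumes an active-high segment output.

```python
from itertools import product


def seg_a_unopt(y3: int, y2: int, y1: int, y0: int) -> int:
    """Segment 'a' in sum-of-products form, as in Eq. (1)."""
    return ((1 - y2) & (1 - y0)) | (y2 & y0) | y3 | y1


def seg_a_opt(y3: int, y2: int, y1: int, y0: int) -> int:
    """Segment 'a' with the two product terms merged into an XNOR, Eq. (2)."""
    xnor = 1 - (y2 ^ y0)
    return y3 | y1 | xnor


def equivalent() -> bool:
    """True if both forms agree on all 16 input combinations."""
    return all(seg_a_unopt(*bits) == seg_a_opt(*bits)
               for bits in product((0, 1), repeat=4))


def lit_digits() -> list[int]:
    """Decimal digits 0-9 for which segment 'a' is driven high."""
    return [d for d in range(10)
            if seg_a_opt((d >> 3) & 1, (d >> 2) & 1, (d >> 1) & 1, d & 1)]


if __name__ == "__main__":
    print("forms equivalent:", equivalent())
    print("segment a lit for:", lit_digits())
```

Segment 'a' is off only for the digits 1 and 4, which matches the usual seven-segment encoding.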

## VII. MICROWIND DESIGN FLOW

The complete design flow implemented in Microwind consists of the following steps:

- 1) **Technology Selection:** Choosing CMOS process node (120nm/180nm/350nm).
- 2) **Transistor Placement:** Positioning NMOS and PMOS devices.
- 3) **Layer Drawing:**
  - Polysilicon gates
  - N-diffusion and P-diffusion regions
  - Metal-1, Metal-2, Metal-3, Metal-4, Metal-5 interconnects
  - Contact and via placement
- 4) **Connectivity:** Establishing thick VDD and GND power rails.
- 5) **I/O Assignment:** Defining input and output nodes
- 6) **Design Rule Check (DRC):** Verifying layout compliance.
- 7) **Simulation Setup:** Configuring transient analysis parameters
- 8) **Functional Verification:** Validating logic operations through simulation.
- 9) **Performance Analysis:** Measuring delay, power, and area.

## VIII. SIMULATION RESULTS

### A. Functional Verification

Comprehensive simulations were performed for all operations:

- **Addition:** Verified for all input combinations (00+00 to 11+11)
- **Subtraction:** Tested 2's complement subtraction
- **Multiplication:** Validated product generation
- **Seven-Segment Output:** Confirmed correct segment activation

### B. Timing Analysis

Propagation delays were measured from input transition to output stabilization using Microwind's cursor measurement tool.
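When waveforms are exported as sampled data, the same measurement can be approximated in Python. The sketch below assumes the conventional 50% of VDD crossing as the reference point on both input and output, which may differ from the cursor positions used in Microwind; the sample waveform in the usage block is invented for illustration.

```python
def crossing_time(times, volts, threshold):
    """First time the waveform crosses threshold, by linear interpolation."""
    samples = list(zip(times, volts))
    for (t0, v0), (t1, v1) in zip(samples, samples[1:]):
        crosses = (v0 - threshold) * (v1 - threshold) <= 0
        if crosses and v0 != v1:
            return t0 + (threshold - v0) * (t1 - t0) / (v1 - v0)
    raise ValueError("waveform never crosses the threshold")


def propagation_delay(times, v_in, v_out, vdd=1.2):
    """Delay between the 50% crossings of the input and the output."""
    half = vdd / 2
    return crossing_time(times, v_out, half) - crossing_time(times, v_in, half)


if __name__ == "__main__":
    # Hypothetical samples in picoseconds and volts, not simulation data.
    t = [0, 100, 200, 300]
    vin = [0.0, 1.2, 1.2, 1.2]
    vout = [1.2, 1.2, 0.0, 0.0]
    print("delay (ps):", propagation_delay(t, vin, vout))
```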

### C. Power Analysis

Dynamic and static power consumption were analyzed using Microwind's integrated power analysis feature with $V_{DD} = 1.2$ V.
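As a point of reference, the first-order textbook model for CMOS power separates a switching term from a leakage term. The symbols used here (activity factor $\alpha$, switched capacitance $C_L$, toggle frequency $f$, leakage current $I_{leak}$) are introduced only for illustration and are not taken from the Microwind report; short-circuit current is neglected:

$$P_{total} \approx \underbrace{\alpha\, C_L\, V_{DD}^2\, f}_{\text{dynamic}} + \underbrace{I_{leak}\, V_{DD}}_{\text{static}}, \qquad V_{DD} = 1.2\ \text{V}$$

Because the dynamic term scales with $C_L$, shorter interconnect and smaller diffusion areas reduce power directly at a fixed supply voltage and frequency.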



Fig. 2. Optimized Layout Simulation



Fig. 3. Unoptimized Layout Simulation

## IX. PERFORMANCE COMPARISON

### A. Performance Metrics

Table I summarizes the power, delay, and area metrics.

### B. Layout Metrics

Table II presents a detailed comparison of layout characteristics.

TABLE I  
PERFORMANCE METRICS COMPARISON

| Metric             | Unoptimized | Optimized | Reduced |
|--------------------|-------------|-----------|---------|
| Power ( $\mu W$ )  | 182         | 54.532    | 70.03%  |
| Delay (ps)         | 130         | 114       | 12.3%   |
| Area ( $\mu m^2$ ) | 1960.62     | 1214      | 38.07%  |

TABLE II  
LAYOUT METRICS COMPARISON

| Parameter          | Unoptimized | Optimized | Improved |
|--------------------|-------------|-----------|----------|
| Area ( $\mu m^2$ ) | 1960.62     | 1214      | 38.07%   |
| Width ( $\mu m$ )  | 43.98       | 59.51     | —        |
| Height ( $\mu m$ ) | 44.58       | 20.4      | —        |
| NMOS Count         | 192         | 213       | -10.93%  |
| PMOS Count         | 192         | 213       | -10.93%  |
| Total Transistors  | 384         | 426       | -10.93%  |
| Metal Layers Used  | 3           | 5         | —        |
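The tabulated areas and percentages can be cross-checked from the width and height entries with a few lines of Python. The dictionary keys and function names are hypothetical; the numbers are those of Table II.

```python
def percent_reduction(before: float, after: float) -> float:
    """Relative reduction in percent; negative values mean an increase."""
    return 100.0 * (before - after) / before


# Values copied from Table II.
UNOPT = {"width_um": 43.98, "height_um": 44.58, "transistors": 384}
OPT = {"width_um": 59.51, "height_um": 20.4, "transistors": 426}


def area_um2(layout: dict) -> float:
    """Bounding-box area of a layout in square micrometres."""
    return layout["width_um"] * layout["height_um"]


if __name__ == "__main__":
    a_unopt, a_opt = area_um2(UNOPT), area_um2(OPT)
    print(f"area: {a_unopt:.2f} -> {a_opt:.2f} um^2")
    print(f"area reduction: {percent_reduction(a_unopt, a_opt):.2f} %")
    print("transistor change:",
          f"{percent_reduction(UNOPT['transistors'], OPT['transistors']):.2f} %")
```

The recomputed values agree with the table to within rounding: the area shrinks by about 38% while the transistor count rises by about 11%, so the area gain comes from denser placement rather than from fewer devices.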

### C. Analysis

The optimized design demonstrates:

- Significant area reduction through shared resources
- Lower power consumption due to reduced switching activity
- Improved delay characteristics from shorter interconnects
- Better power-delay product (PDP) indicating overall efficiency
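The power-delay product follows directly from the power and delay figures in Table I. A minimal Python sketch of the arithmetic, with a hypothetical helper name:

```python
def pdp_fj(power_uw: float, delay_ps: float) -> float:
    """Power-delay product in femtojoules (1 uW x 1 ps = 1e-3 fJ)."""
    return power_uw * delay_ps * 1e-3


if __name__ == "__main__":
    unopt = pdp_fj(182, 130)     # Table I, unoptimized design
    opt = pdp_fj(54.532, 114)    # Table I, optimized design
    print(f"PDP unoptimized: {unopt:.2f} fJ")
    print(f"PDP optimized:   {opt:.2f} fJ")
    print(f"PDP reduction:   {100 * (unopt - opt) / unopt:.1f} %")
```

With the tabulated values the PDP falls from roughly 23.7 fJ to roughly 6.2 fJ, a reduction of about 74%.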

## X. LAYOUT MISTAKES AND CORRECTIONS

During the initial stages of layout development, several common design mistakes were encountered. Identifying and correcting these issues was essential for achieving a DRC-clean, optimized, and functionally accurate layout. The major mistakes and corresponding corrections are summarized below:

### A. Overlapping or Misaligned Polysilicon Gates

One of the frequent errors was accidental overlap between polysilicon gates or misalignment with diffusion regions. This resulted in unintended transistor formation or incorrect channel lengths. The issue was resolved by:

- Maintaining proper alignment between poly and diffusion
- Following minimum spacing rules as per the Microwind DRC guidelines
- Using the grid-snap feature for precise placement

### B. Missing VDD and GND Power Rails

In the unoptimized design, each gate had separate supply connections, leading to increased routing complexity. Additionally, some transistors were left without proper VDD or GND rails. This was corrected by:

- Adding thick horizontal power rails at the top (VDD) and bottom (GND)
- Ensuring every PMOS connects to VDD and NMOS to GND
- Sharing power rails across all cells to reduce metal redundancy

### C. No Common Input Routing for Signal Tapping

Inputs such as A, B, and control signals were directly connected to individual gates without a shared metal rail, making routing messy and increasing layout height. The correction involved:

- Creating dedicated horizontal or vertical rails for each input signal
- Tapping the signal from the rail whenever required
- Reducing routing congestion and improving layout readability

### D. Improper Diffusion Sharing

Initially, adjacent transistors of the same type (NMOS next to NMOS, PMOS next to PMOS) were placed without sharing diffusion regions, leading to:

- Increased silicon area
- More contact cuts and higher parasitic capacitance

This was fixed using:

- Euler-path-based transistor ordering
- Combining source/drain regions of series-connected transistors

### E. Missing Contacts and Vias

Some transistors lacked metal-to-diffusion contacts, causing floating nodes in simulation. Corrections included:

- Adding proper contacts between poly, metal, and diffusion
- Ensuring metal-1 to metal-2 via placement wherever routing changed layers

### F. Incorrect Transistor Orientation

A few PMOS and NMOS devices were rotated improperly, causing reverse diffusion regions and logic malfunction. This was corrected by:

- Keeping PMOS in N-well and NMOS in P-substrate
- Maintaining consistent transistor orientation across the layout

### G. Overlapping Metal Layers Without Via

Some metal-1 and metal-2 lines were overlapped without vias, which does not create an electrical connection in Microwind. The solution was:

- Explicitly placing VIA12, VIA23, etc., wherever layer transitions were required
- Running DRC after every major routing modification
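The rule that overlapping metal layers connect only through a via can be modelled with a small union-find sketch in Python. The segment names, the data layout, and the simplification that any same-layer overlap is a connection are assumptions for illustration and do not reflect how Microwind extracts connectivity internally.

```python
def build_nets(segments, overlaps):
    """Group wire segments into electrical nets.

    segments: dict mapping segment name -> metal layer number (1, 2, ...)
    overlaps: iterable of (name_a, name_b, has_via) geometric overlaps
    """
    parent = {name: name for name in segments}

    def find(x):
        # Follow parent links to the set representative, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b, has_via in overlaps:
        same_layer = segments[a] == segments[b]
        adjacent_layers = abs(segments[a] - segments[b]) == 1
        # Different layers are joined only when a via sits at the overlap.
        if same_layer or (adjacent_layers and has_via):
            parent[find(a)] = find(b)

    nets = {}
    for name in segments:
        nets.setdefault(find(name), set()).add(name)
    return sorted(sorted(net) for net in nets.values())


if __name__ == "__main__":
    segs = {"m1_in": 1, "m2_bus": 2, "m1_out": 1}
    laps = [("m1_in", "m2_bus", False), ("m2_bus", "m1_out", True)]
    print(build_nets(segs, laps))
```

In the usage example the input segment stays on its own net because its crossing with the metal-2 bus has no via, which is the kind of floating node that showed up in simulation.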

### H. No Clear Labeling for Nodes

Initially, input and output nodes were not labeled, making simulation difficult. This was resolved by:

- Using consistent naming for all nets
- Labeling all primary I/Os for transient analysis

These mistakes and their corrections significantly improved the design quality, reduced layout area, and ensured functional accuracy during simulation.

## XI. CONCLUSION

This work demonstrated the impact of layout optimization techniques in full-custom VLSI design through the implementation of a 2-bit ALU with a seven-segment decoder in Microwind 3. The comparison between the unoptimized and optimized implementations shows substantial improvements in area, power, and propagation delay, although Table II shows that the optimized layout uses about 11% more transistors (426 versus 384).

The optimized design achieved a 38.07% reduction in silicon area ( $1960.62 \mu\text{m}^2$  to  $1214 \mu\text{m}^2$ ) through strategic placement and shared diffusion regions. Most significantly, power consumption fell by 70.03% (182 $\mu$W to 54.532 $\mu$W), alongside a 12.3% improvement in propagation delay (130 ps to 114 ps). The 54.27% reduction in layout height further validates the effectiveness of compact routing and optimized transistor arrangement.

These results underscore the importance of layout optimization even within transistor-level design. The 70.03% power saving is particularly relevant for battery-powered and mobile applications, where it translates into longer battery life and lower heat dissipation. The 38.07% area reduction lowers manufacturing cost at production scale, and the simultaneous delay improvement yields a better power-delay product. Careful attention to physical layout, including Euler path ordering, shared diffusion, and compact routing, therefore makes a substantial difference even when both designs are implemented at the transistor level.

This project validates that optimization is not just about choosing transistor-level over gate-level design, but about how efficiently those transistors are arranged and connected. The successful verification of all arithmetic operations confirms that meticulous layout optimization enhances area, power, and speed without compromising functional correctness. As semiconductor technology advances toward smaller nodes and higher integration densities, the ability to optimize physical layout becomes increasingly essential for competitive and sustainable VLSI design.

## XII. FUTURE SCOPE

This work establishes a foundation for several promising extensions and enhancements in ALU design and VLSI optimization:

- **Architecture Scaling:** Extend the current 2-bit ALU to 4-bit, 8-bit, 16-bit, or 32-bit implementations to handle larger data widths required in modern processors, applying the same optimization techniques demonstrated in this work.
- **Enhanced Functionality:** Expand ALU operations to include logical operations (AND, OR, XOR, NOT), shift operations (logical and arithmetic shifts), and comparison operations, creating a complete arithmetic-logic unit suitable for processor design.
- **Advanced Power Optimization:** Integrate clock gating and power gating techniques to further reduce dynamic and static power consumption. Explore multi-threshold CMOS (MTCMOS) and voltage scaling strategies for greater power efficiency.

- **Technology Migration:** Implement the design in advanced process nodes such as 45nm, 28nm, or FinFET technology to evaluate performance improvements and challenges at smaller geometries.
- **Performance Enhancement:** Introduce pipelining techniques to improve throughput by enabling concurrent execution of multiple operations, valuable for high-performance applications.
- **Design Methodology Comparison:** Compare the optimized full-custom design against standard cell library-based implementations to quantify advantages of manual optimization versus automated synthesis.
- **MUX Utilization:** Utilize the fourth input of the  $4 \times 1$  multiplexer, currently grounded, to implement division, modulo operations, or other specialized arithmetic functions.
- **Display Enhancement:** Integrate advanced display modules beyond seven-segment, such as LCD or OLED displays, for enhanced output visualization and user interface capabilities.




Figure 1: Optimized Microwind layout for 2-bit ALU



Figure 2: Unoptimized Microwind layout for 2-bit ALU



Figure 3: Optimized simulation result for 2-bit ALU



Figure 4: Unoptimized simulation result for 2-bit ALU



Figure 5: RTL view of 2-bit ALU



Figure 6: Post Mapping and Synthesis