

# Design and Implementation of 4-bit ALU Using 90nm CMOS Technology

Vishal Sanjay Raju

*Dept. of Electronics and Communication Engineering,  
Manipal Institute of Technology,  
Manipal Academy of Higher Education,  
Manipal, Udupi-576104,  
Karnataka, India  
vishal3.mitmpl2024@learner.manipal.edu*

Karthi Pradeep

*Dept. of Electronics and Communication Engineering,  
Manipal Institute of Technology,  
Manipal Academy of Higher Education,  
Manipal, Udupi-576104,  
Karnataka, India  
karthi.pradeep@manipal.edu*

**Abstract**— This paper presents the design and implementation of a 4-bit Arithmetic Logic Unit (ALU) in 90nm CMOS (Complementary Metal-Oxide-Semiconductor) technology. The ALU is a fundamental building block of central processing units (CPUs), microcontroller units (MCUs), GPUs, DSPs and similar processors, and is responsible for performing arithmetic and logical operations. The design process involves the schematic design of an adder, subtractor, comparator, 1's complement block and decoder for the ALU architecture, followed by simulation of the design using Cadence Virtuoso EDA (Electronic Design Automation) tools. The implementation phase focuses on layout design to ensure the functionality and manufacturability of the ALU. Performance metrics such as power consumption, area and speed are analyzed to validate the efficiency of the design. The successful realization of the 4-bit ALU showcases the potential for advancements in integrated circuit design and the practical application of CMOS technology in modern computing systems.

**Keywords**— ALU, adder, 90nm, CMOS, Virtuoso, processor

## I. INTRODUCTION

An Arithmetic Logic Unit (ALU) is a digital circuit that performs a range of arithmetic and logical operations; the ALU presented here operates on 4-bit binary numbers. As one of the core components of a processor, the ALU enables the execution of basic calculations and decision-making tasks, serving as the computational brain within microprocessors, digital signal processors (DSPs) and embedded systems. By working with 4-bit data units, this ALU is suited to small-scale computing applications and educational purposes, providing a compact yet functional approach.

The circuit is designed around multiple multiplexers, where each multiplexer input corresponds to a different arithmetic or logical operation [1]. We use 8-input (8:1) multiplexers, each with 3 select lines. In total there are 8 such multiplexers, so the ALU produces an output of at most 8 bits. The adder in this ALU adds two 4-bit operands at a time. In a comparison of adder designs, the CMOS implementation requires 384 transistors, consumes 5.1mW of power and has a delay of 50.7ns [2]. The ALU also contains a 4-bit magnitude comparator based on a low-power, high-speed design: the conventional comparator consumes $162\mu\text{W}$ of dynamic power, whereas this design consumes $112\mu\text{W}$ [3]. The complete ALU architecture executes arithmetic and logic operations efficiently; the low-power design technique is associated with a static power figure of around 66.2% while ensuring minimal performance degradation [4]. The ALU can also be implemented on an actual FPGA board, such as the Basys3 board or a Lattice FPGA, using the Verilog hardware description language [6].

In this paper, the ALU consists of five blocks, namely an adder, a subtractor, a magnitude comparator, a 1's complement block and a decoder. A 4-bit ALU design typically covers two main categories of operation: arithmetic operations for numerical calculations and logical operations for bitwise manipulations. This design enables efficient processing of data, allowing for the basic computing and control logic functions that are essential in various digital systems. Implementing a 4-bit ALU also provides insight into the key principles of digital design, such as binary arithmetic, circuit simplification and the integration of logic gates to achieve multifunctional capability. Its simplicity makes it a versatile tool in digital electronics, ideal for introductory learning as well as for foundational experiments in more complex digital architectures.

## II. PROPOSED ARCHITECTURE

The ALU is a crucial component of any digital system, especially of processors. The design must support all the arithmetic and logical operations required by the processor's instruction set architecture (ISA). For example, for the RISC-V base integer ("I") ISA, the ALU should be able to perform addition, subtraction, left and right shifts, and logical AND, OR and XOR operations. In the proposed architecture, the ALU performs the arithmetic operations of addition and subtraction and the logical operations of magnitude comparison, one's complement and 3:8 decoding. These 5 operations are controlled by the 3 select lines S0, S1 and S2. Based on the values of S0, S1 and S2, the ALU performs the corresponding arithmetic or logical operation, as shown in Table I.

TABLE I ALU OPERATION BASED ON SELECT LINES

| S0 | S1 | S2 | ALU Operation        |
|----|----|----|----------------------|
| 0  | 0  | 0  | Addition             |
| 1  | 0  | 0  | Subtraction          |
| 0  | 1  | 0  | Magnitude Comparator |
| 1  | 1  | 0  | One's complement     |
| 0  | 0  | 1  | 3:8 Decoder          |

The proposed architecture is designed to perform the five arithmetic and logical operations listed in Table I. The 4-bit inputs are applied to the corresponding block, which performs its operation on them. The block outputs are fed to the 8:1 multiplexers and, based on the select-line inputs, the multiplexer outputs form the final output of the ALU. The complete architecture of the ALU is shown in Fig. 1.



Fig. 1. Proposed architecture of arithmetic and logic unit

The architecture consists of eight 8:1 multiplexers, where the first input of every multiplexer is assigned to the output of the adder. Similarly, the second input of every multiplexer is assigned to the output of the subtractor block, the third input to the output of the magnitude comparator, the fourth input to the output of the one's complement block, and the fifth input to the output of the 3:8 decoder. In this way the complete architecture of the ALU is built.
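The output selection described above can be illustrated with a small behavioral sketch in Python. This is not the transistor-level design; the function names `mux8` and `alu_select` and the example bit patterns are hypothetical, and S0 is assumed to be the least significant select bit, which is consistent with the row order of Table I. Multiplexer inputs that no block drives are modeled as grounded (logic 0).

```python
from typing import List

NUM_OUTPUT_BITS = 8   # O0..O7, one 8:1 multiplexer per output bit
NUM_MUX_INPUTS = 8    # each multiplexer has 8 data inputs


def mux8(inputs: List[int], s0: int, s1: int, s2: int) -> int:
    """8:1 multiplexer; S0 is assumed to be the least significant select bit."""
    return inputs[s0 + 2 * s1 + 4 * s2]


def alu_select(block_outputs: List[List[int]], s0: int, s1: int, s2: int) -> List[int]:
    """Return the 8 ALU output bits O0..O7.

    block_outputs[k][bit] is the bit that block k drives onto input k of
    multiplexer `bit` (k = 0 adder, 1 subtractor, 2 comparator,
    3 one's complement, 4 decoder). Inputs 5..7 are treated as grounded.
    """
    grounded = [0] * NUM_OUTPUT_BITS
    rows = block_outputs + [grounded] * (NUM_MUX_INPUTS - len(block_outputs))
    return [
        mux8([rows[k][bit] for k in range(NUM_MUX_INPUTS)], s0, s1, s2)
        for bit in range(NUM_OUTPUT_BITS)
    ]


# Hypothetical block outputs, padded with 0 on the pins a block does not use.
example = [
    [1, 0, 0, 0, 1, 0, 0, 0],  # adder: sum bits on O0..O3, carry on O4
    [1, 1, 0, 0, 0, 0, 0, 0],  # subtractor: difference on O0, borrow on O1
    [0, 1, 0, 0, 0, 0, 0, 0],  # comparator: A>B, A<B, A=B on O0..O2
    [0, 1, 0, 1, 0, 0, 0, 0],  # one's complement: 4 bits on O0..O3
    [0, 0, 0, 1, 0, 0, 0, 0],  # decoder: one-hot pattern on O0..O7
]

subtract_result = alu_select(example, 1, 0, 0)   # S0=1, S1=0, S2=0
unused_result = alu_select(example, 1, 0, 1)     # select code with no block assigned
```

With these assumptions, the select code for subtraction returns the subtractor row, while a select code that has no block assigned returns all zeros.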

## III. WORKING METHODOLOGY

The complete ALU is built by carefully designing each block: the 4-bit adder, the subtractor, the 4-bit magnitude comparator, the 4-bit one's complement block, the 3:8 decoder and the 8:1 multiplexer. To design these blocks, we first design basic logic gates such as the inverter, OR gate and AND gate. These gates are then used to build the main arithmetic and logical blocks.

### A. Schematic Design

The schematic of the 4-bit adder block is shown in Fig. 2. The schematic consists of four 1-bit full adders. The sum at each bit position is evaluated and the carry is propagated to the next full adder. The sum bit of the first full adder is connected to the first pin of the first 8:1 multiplexer; similarly, the sum bit of the second full adder is connected to the first pin of the second 8:1 multiplexer. In this way all four full adders are connected to the multiplexers. The carry bit is connected to the first pin of the fifth 8:1 multiplexer.
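The carry propagation through the four full adders can be sketched behaviorally in Python. The function names are hypothetical, bits are listed least significant first, and the full-adder equations are the standard textbook ones rather than a description of the exact gate arrangement in Fig. 2.

```python
from typing import List, Tuple


def full_adder(a: int, b: int, cin: int) -> Tuple[int, int]:
    """1-bit full adder: returns (sum, carry_out)."""
    s = a ^ b ^ cin
    cout = (a & b) | (cin & (a ^ b))
    return s, cout


def ripple_carry_adder(a_bits: List[int], b_bits: List[int], cin: int = 0) -> Tuple[List[int], int]:
    """Chain one full adder per bit position; bits are given LSB first."""
    sums = []
    carry = cin
    for a, b in zip(a_bits, b_bits):
        s, carry = full_adder(a, b, carry)  # carry ripples to the next stage
        sums.append(s)
    return sums, carry


# 11 + 6 = 17: A = 1011b and B = 0110b, written LSB first.
sum_bits, carry_out = ripple_carry_adder([1, 1, 0, 1], [0, 1, 1, 0])
```

In this example the four sum bits are all that fit in the 4-bit result, and the fifth bit of the answer appears on the carry output, which is why the carry is routed to a separate multiplexer.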

The schematic of the subtractor block is shown in Fig. 3. The schematic is designed according to the Boolean equations of the subtractor circuit. The outputs of this block are the difference and borrow pins. The difference pin is connected to the second pin of the first 8:1 multiplexer and the borrow pin is connected to the second pin of the second 8:1 multiplexer.
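As an illustration of the Boolean equations mentioned above, the following Python sketch models a 1-bit full subtractor using the standard textbook expressions for difference and borrow. The function name `full_subtractor` is hypothetical, and the exact gate-level form used in Fig. 3 may differ.

```python
from typing import Tuple


def full_subtractor(a: int, b: int, bin_: int) -> Tuple[int, int]:
    """1-bit full subtractor computing a - b - bin_.

    difference = a XOR b XOR bin_
    borrow     = (NOT a AND b) OR (NOT (a XOR b) AND bin_)
    """
    difference = a ^ b ^ bin_
    borrow = ((a ^ 1) & b) | (((a ^ b) ^ 1) & bin_)
    return difference, borrow


# 0 - 1 with no incoming borrow gives difference 1 and borrow 1.
example_difference, example_borrow = full_subtractor(0, 1, 0)
```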

The schematic of the magnitude comparator is shown in Fig. 4. This block compares two 4-bit inputs and indicates whether the first is greater than, less than or equal to the second. The circuit is designed according to the Boolean equations of the magnitude comparator. The block has 3 outputs, i.e.  $A > B$ ,  $A < B$  and  $A = B$ . The  $A > B$  output pin is connected to the third pin of the first 8:1 multiplexer, the  $A < B$  output pin is connected to the third pin of the second 8:1 multiplexer and the  $A = B$  output pin is connected to the third pin of the third 8:1 multiplexer.
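The comparator logic can be sketched in Python using the standard bitwise formulation: starting from the most significant bit, a higher bit decides the result only while all bits above it are equal. The function name `magnitude_comparator` is hypothetical, bits are listed least significant first, and this is a behavioral model rather than the low-power circuit of Fig. 4.

```python
from typing import List, Tuple


def magnitude_comparator(a_bits: List[int], b_bits: List[int]) -> Tuple[int, int, int]:
    """Return (A>B, A<B, A=B) for two equal-length inputs given LSB first."""
    gt = 0
    lt = 0
    eq = 1  # stays 1 while all higher-order bits seen so far are equal
    for a, b in zip(reversed(a_bits), reversed(b_bits)):  # scan MSB first
        gt |= eq & a & (b ^ 1)      # first difference with a=1, b=0
        lt |= eq & (a ^ 1) & b      # first difference with a=0, b=1
        eq &= (a ^ b) ^ 1           # XNOR: bits match at this position
    return gt, lt, eq


# Compare A = 9 (1001b) with B = 12 (1100b), written LSB first.
result = magnitude_comparator([1, 0, 0, 1], [0, 0, 1, 1])
```

Exactly one of the three outputs is 1 for any pair of inputs, which matches the three separate output pins routed to the first three multiplexers.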

The schematic of the one's complement block is shown in Fig. 5. The design consists of 4 inverters, one for each bit of the 4-bit input. The first output of the one's complement block is connected to the fourth pin of the first 8:1 multiplexer. In the same way, the remaining output pins of the one's complement block are connected to the fourth pin of the second, third and fourth 8:1 multiplexers.

The schematic of the 3:8 decoder is shown in Fig. 6. The design is constructed using the Boolean equations of the 3:8 decoder. Since the decoder has 8 outputs, the ALU is designed with 8 multiplexers. The eight output pins of the decoder block are connected, one each, to the fifth pin of the eight consecutive 8:1 multiplexers.
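The decoder equations can be written out as one AND term per output, as in the Python sketch below. The function name `decoder_3to8` is hypothetical, and treating DA as the least significant input is an assumption, since the bit order of DA, DB and DC is not stated here.

```python
from typing import List


def decoder_3to8(da: int, db: int, dc: int) -> List[int]:
    """3:8 decoder: exactly one of the 8 outputs is 1.

    DA is assumed to be the least significant input and DC the most
    significant, so output index = da + 2*db + 4*dc.
    """
    na, nb, nc = da ^ 1, db ^ 1, dc ^ 1  # inverted inputs
    return [
        nc & nb & na,  # output 0
        nc & nb & da,  # output 1
        nc & db & na,  # output 2
        nc & db & da,  # output 3
        dc & nb & na,  # output 4
        dc & nb & da,  # output 5
        dc & db & na,  # output 6
        dc & db & da,  # output 7
    ]


# Input code 5 (DA=1, DB=0, DC=1) activates only output 5.
decoded = decoder_3to8(1, 0, 1)
```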

The schematic of the complete ALU is shown in Fig. 7. The outputs of all the designed blocks are connected to the designated inputs of the 8:1 multiplexers. The three select lines S0, S1 and S2 of all the multiplexers are tied together and driven by the required select input. The unused multiplexer inputs are connected to ground. The CMOS technology node used for the schematic design is 90nm. The schematic was designed using the Cadence Virtuoso software.

### B. Layout Design

After completing the ALU schematic, we design the layout using the 90nm CMOS transistors. The complete layout corresponding to the ALU schematic of Fig. 7 is shown in Fig. 8. The layout uses metal layers up to metal 6, although most of the routing is on metal 1 and metal 2. The layouts of the 5 blocks and the 8:1 multiplexer were designed first; in the ALU layout we then import the individual block layouts and connect their pins together.

Next, to verify the layout against the ALU schematic and against the rules of the 90nm CMOS process, the design is checked with DRC (design rule check) and LVS (layout versus schematic).

The DRC validation of the layout is shown in Fig. 9. The tool reports that the DRC test is passed. This confirms that the layout satisfies the rules of the 90nm CMOS PDK (process design kit), such as wire-to-wire spacing, pin-to-wire connections, and n and p diffusion rules.

The LVS validation of the layout is shown in Fig. 10. The tool reports zero violations and zero mismatches. This confirms that the layout completely matches the ALU schematic and that there is no mismatch in connections between the schematic and the layout.

The parasitic resistors and capacitors extracted from the ALU layout are shown in Fig. 11. The extraction gives the parasitic capacitance and resistance of the transistors, which depend on fan-out, fan-in and other parameters. These values are ultimately used to calculate the delay of the circuits. The extraction also reports the wire resistance and wire capacitance, which arise from the length and width of the metal layers and other parameters.
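To illustrate how extracted resistances and capacitances relate to delay, the sketch below applies the Elmore approximation to a simple RC ladder. This is a common first-order hand estimate and is an assumption here, not the method used in this work, where delay would normally come from simulation of the extracted netlist. The function name `elmore_delay` and the element values are hypothetical.

```python
from typing import List


def elmore_delay(resistances: List[float], capacitances: List[float]) -> float:
    """Elmore time constant of an RC ladder.

    Segment i is a series resistance R_i followed by a capacitance C_i to
    ground. Each R_i charges every capacitance at or beyond its own node:
    tau = sum_i R_i * (C_i + C_{i+1} + ... + C_n).
    """
    tau = 0.0
    downstream_c = sum(capacitances)
    for r, c in zip(resistances, capacitances):
        tau += r * downstream_c
        downstream_c -= c  # C_i is no longer downstream of the next resistor
    return tau


# Two hypothetical wire segments of 100 ohm and 1 fF each.
tau = elmore_delay([100.0, 100.0], [1e-15, 1e-15])
```

For these example values the result is 100 x 2 fF + 100 x 1 fF = 0.3 ps; a 50% propagation delay is often approximated as about 0.69 times this time constant.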



Fig. 2. Schematic of the 4-bit adder



Fig. 3. Schematic of the subtractor



Fig. 4. Schematic of the magnitude comparator



Fig. 5. Schematic of the One's complement



Fig. 6. Schematic of the 3:8 decoder



Fig. 7. Schematic of ALU



Fig. 8. Layout design of the ALU



Fig. 9. DRC validation of the ALU layout



Fig. 10. LVS validation of the ALU layout



Fig. 11. RC parasitic extraction from ALU layout



Fig. 12. Input values A and B of adder block and magnitude comparator block



Fig. 13. Input values of 4-bit one's complement block and 3:8 decoder block



Fig. 14. Input value of 3:8 decoder block and carry bit for adder and borrow bit for subtractor



Fig. 15. Select line input values and output 8-bit values for ALU

## IV. RESULTS

The functionality of the ALU design is tested by building a testbench around the ALU schematic. Pulse voltage sources drive the test pins, with logic 1 at 2.5V and logic 0 at 0V. Vdd is connected to 2.5V and Vss is connected to 0V.

The input voltage waveforms for the adder block and magnitude comparator block are shown in Fig. 12, those for the one's complement block and 3:8 decoder block in Fig. 13, and the remaining inputs for the 3:8 decoder block and the subtractor block in Fig. 14. AA0 to AA3 and AB0 to AB3 are the two 4-bit inputs of the adder, and ACin is its carry input. CA0 to CA3 and CB0 to CB3 are the two 4-bit inputs of the magnitude comparator block. CP0, CP1, CP2 and CP3 form the 4-bit input of the one's complement block. DA, DB and DC are the 3 inputs of the 3:8 decoder. SA, SB and SBin are the inputs of the full subtractor.

The output of the ALU as a function of S0, S1 and S2 is shown in Fig. 15. The output is 8 bits wide: O0, O1, O2, O3, O4, O5, O6 and O7. The operation selected by each combination of the select lines is given in Table I. In Fig. 15 we can observe that the output is valid for the first five select-line combinations, and that it is zero for the remaining combinations.

TABLE II POWER CONSUMPTION, DELAY AND TRANSISTOR COUNT FOR ALU DESIGN

| Power Consumption | Delay  | Transistor Count |
|-------------------|--------|------------------|
| 7.12mW            | 56.3ns | 808              |

Several recent approaches to 4-bit ALU design have focused on optimizing power consumption, speed or area, often trading one parameter against another [11][12]. For instance, Turaga et al. implemented a 4-bit CMOS ALU (250nm, 2.5V) with a measured average power of 78.3 $\mu$W [11]. Designs using pass-transistor logic or Gate Diffusion Input (GDI) in advanced nodes, such as the MGDI-based 4-bit ALU reported by Thakare et al. [14], consumed about 913 $\mu$W at 1.8V and 90nm, but used only 132 transistors and showed excellent area and delay metrics.

## V. CONCLUSIONS

This paper presents the complete schematic and layout design of an arithmetic and logic unit, a block that is central to the operation of processors. The schematics of the individual ALU blocks and the simulated output of the designed ALU are shown. The layout is successfully validated by DRC (design rule check) and LVS (layout versus schematic). This shows that the complete ALU has been designed correctly with the 90nm CMOS technology PDK (process design kit).

By comparison, our design in 90nm CMOS technology demonstrated a power consumption of 5.1mW for the adder block, with a delay of 50.7ns and a lower transistor count for critical modules. Unlike many prior works that focus on either area or power savings alone, our approach optimizes for both power and speed, achieving a dynamic power reduction in the magnitude comparator (112 $\mu$W versus 162 $\mu$W in conventional designs) and ensuring design rule and layout-schematic consistency through successful DRC and LVS checks. This positions our ALU as a robust compromise between transistor efficiency, power savings and timing, suitable for compact processors or embedded systems. Furthermore, while emerging technologies such as CNTFETs or adiabatic logic offer potential improvements in power or speed, our design leverages mainstream 90nm CMOS for practical manufacturability and compatibility with common EDA flows, which enhances its utility in current VLSI educational and small-scale industrial contexts.

## ACKNOWLEDGMENT

The authors would like to thank the PG lab, Department of Electronics and Communication Engineering, Manipal Institute of Technology, Manipal Academy of Higher Education, Manipal, Udupi, for providing the necessary facilities and technical support.

## REFERENCES

- [1] L. Dhulipalla and A. Lourts Deepak, "Design and implementation of 4-bit ALU using FinFETs for nano scale technology," International Conference on Nanoscience, Engineering and Technology, ICONSET 2011.
- [2] S. Balaji Ramakrishna, A. G. Prasad, P. Anand and T. Aravind, "High performance GDI ALU using 10T adder cells," 2018 3rd IEEE International Conference on Recent Trends in Electronics, Information & Communication Technology (RTEICT), Bangalore, India, 2018.
- [3] P. Singh and P. K. Jain, "Design and analysis of low power, high speed 4-bit magnitude comparator," 2018 International Conference on Recent Innovations in Electrical, Electronics & Communication Engineering (ICRIECE), Bhubaneswar, India, 2018.
- [4] A. Pathak, S. Gupta and B. Jena, "Design and evaluation of a 4-bit ALU and RAM system: a step towards ultra-low power computing," 2023 International Conference on Next Generation Electronics (NEleX), Vellore, India, 2023.
- [5] E. A. Cortés-Barrón, M. A. Reyes-Barranca, L. M. Flores-Nava and A. Medina-Santiago, "4-bit arithmetic logic unit (ALU) based on neuron MOS transistors," 2012 9th International Conference on Electrical Engineering, Computing Science and Automatic Control (CCE), Mexico City, Mexico, 2012.
- [6] A. K. Panigrahi, S. Patra, M. Agrawal and S. Satapathy, "Design and implementation of a high speed 4-bit ALU using Basys3 FPGA board," 2019 Innovations in Power and Advanced Computing Technologies (i-PACT), Vellore, India, 2019.
- [7] A. Kumar, B. Kumawat and N. Pandey, "Design of 4-bit ALU using TEAM memristor model and CMOS logic," 2020 International Conference on Communication and Signal Processing (ICCSP), Chennai, India, 2020.
- [8] R. V. Biswas, M. H. Ahanaf and S. T. Nidhi, "A 2.319 $\mu$W, 37.34 MHz transmission gate based 4-bit ALU for contemporary low-powered, high-speed microprocessors," 2024 6th International Conference on Electrical Engineering and Information & Communication Technology (ICEEICT), Dhaka, Bangladesh, 2024.
- [9] J. L. V. Ramana Kumari, Y. Varshitha, C. Gopi and M. Rakesh, "Design and implementation of ALU using ring counters," 2023 International Conference on Signal Processing, Computation, Electronics, Power and Telecommunication (IConSCEPT), Karaikal, India, 2023.
- [10] G. N. Chiranjeevi and S. Kulkarni, "Pipeline architecture for  $n=k*21$  bit modular ALU: case study between current generation computing and Vedic computing," 2021 6th International Conference for Convergence in Technology (I2CT), Maharashtra, India, 2021.
- [11] S. Turaga, K. Vanama, R. Ja and K. Jaya Sai, "Design of low power 4-bit ALU using adiabatic logic," IOSR Journal of VLSI and Signal Processing, 2014.
- [12] B. Anjaneyulu and N. Siva Sankara Reddy, "Design of high-speed low power 4-bit ALU using CNFET," SSRG International Journal of Electrical and Electronics Engineering, vol. 11, no. 8, pp. 36-49, 2024.
- [13] V. Shylaja and K. Ezhilarsan, "Design and performance efficiency analysis of a low power 4-bit arithmetic logic unit," J. Electrical Systems, 2024.
- [14] A. Thakare, S. Gupta and P. Zode, "Design of a low power area optimized 4-bit arithmetic logic unit for high speed processors," IOSR Journal of Engineering, 2019.