

# Comparative Analysis of Parallel Prefix Adders

Abhilash  
 School of ECE  
 REVA University  
 Bangalore, India  
 abhilashingaldals@gmail.com

Sarfraz Hussain  
 Asst. Prof, School of ECE  
 REVA University  
 Bangalore, India  
 s.hussaine@viteee.com

**Abstract**—Adder design always involves trade-offs among speed, area, and power consumption, and these trade-offs affect overall system performance. A key bottleneck is carry propagation, which limits the speed of addition. In terms of latency and area, Parallel Prefix Adders (PPAs) are among the most effective adder architectures currently in use. This paper covers the implementation and comparison of four widely used parallel prefix adders: the Brent-Kung Adder, the Kogge-Stone Adder, the J Sklansky Adder, and the Ladner-Fischer Adder. The measured delays are 5.910 ns, 4.580 ns, 4.148 ns, and 4.520 ns, respectively, and the corresponding power consumptions are 15.196 uW, 20.532 uW, 17.864 uW, and 11.240 uW. No single adder is best in both power and delay; different adders are preferable for different metrics, so the choice depends on the design requirements.

**Index Terms**—Kogge-Stone Adder (KSA), J Sklansky Adder (JSA), Ladner-Fischer Adder (LFA), Brent-Kung Adder (BKA), Cadence Virtuoso, Parallel Prefix Adders (PPA), Power and Delay.

## I. INTRODUCTION

Efficient arithmetic operations are necessary for high-performance computing in digital systems. Adders are essential parts of CPUs, digital signal processors (DSPs), and other computational units because they perform binary addition. High-speed adders are becoming increasingly important as the demand for faster and more efficient processing grows. Parallel adders address this need by shortening the carry propagation delay, and they are recognized for their superior performance in VLSI implementations. Recent years have seen a rise in the use of reconfigurable logic, such as FPGAs, in practical designs for mobile DSP and communications applications, because FPGAs outperform DSP- and microprocessor-based solutions in terms of power and speed [1]. Compared with ASIC designs, this architecture also offers a significant reduction in development time and cost. The power advantage is crucial given the growing popularity of portable and mobile devices, which rely heavily on DSP functions.

For 16-bit operands, KSA is the fastest of the adders compared, with the least delay, but at the cost of using more LUTs, while BKA offers area optimization at the penalty of delay. LFA gives a result between KSA and BKA [2]. Earlier technologies relied on redundant ripple carry adders (RCAs) to conduct arithmetic operations. However, multipliers, which chain many large adders in sequence, play a crucial role in operational units, and their execution time must be reduced. The topology was synthesized in 90 nm CMOS technology, where the reported dynamic power for 16-bit addition is 0.0187 W and 26% of the total cell area is covered [3]. In comparison with the most advanced adders currently available, parallel prefix adders yield an improvement in error frequency (EF) of 27% to 36% for random inputs and an improvement in picture quality metrics for image filtering of 8% to 42%. Parallel-prefix addition also provides a flexible framework for designing adders with different area-power-delay characteristics [4]. In the era of digital technology, filters can be used as memory components for data storage. PPAs are the adders most frequently employed to implement digital filters, and the Sklansky adder is the most efficient among several PPAs [5].

Chip complexity and delay remain issues despite the rise in chip use. Kogge-Stone adders work as a 3:1 compressor in static CMOS, which is enough to boost performance. However, as chip demand rises, fast computation becomes necessary, and only fast multipliers can meet this requirement. Kogge-Stone adders, or parallel adder arrays, are used to add partial products in order to speed up multiplication [6]. A parallel-prefix adder is a type of adder used in VLSI design that adds efficiently by using the prefix operation. In a 32-bit Ladner-Fischer parallel prefix adder, the gray cell has been used in place of the black cell, which improves LFA performance by lowering the delay and LUT count.

Traditional ripple carry adders cause large delays, particularly for wide operands, because each bit must wait for the carry from the previous bit. By handling several bits at once, parallel adders reduce this latency [7]. KSA, BKA, JSA, and LFA are the parallel prefix adders implemented in this project. These four PPAs are implemented in Cadence Virtuoso using 180 nm technology (GPDK). The primary goal of this project is to compare two parameters, delay and power. The main benefit of PPAs is parallel carry generation, which reduces the number of logic levels (N).

The rest of the article is organized as follows. Section II describes the three processing stages of parallel-prefix adders: 1) the pre-processing stage, 2) the carry generation and propagation stage, and 3) the post-processing stage. Section III introduces the implementation of four parallel prefix adders, namely KSA, BKA, LFA, and JSA. Experimental results and a discussion of the compared PPAs are given in Section IV. Finally, conclusions are drawn in Section V.

## II. WORKING PRINCIPLE OF PPA

The three fundamental phases in binary addition using PPAs are as follows:

### A. Pre-Processing Stage

At this stage, the propagate and generate bits for every pair of input operand bits ( $A_i$  and  $B_i$ ) are calculated. The following equations define the signals  $P_i$  (propagate) and  $G_i$  (generate) according to prefix computation:

$$P_i = A_i \text{ xor } B_i \quad (1)$$

$$G_i = A_i \text{ and } B_i \quad (2)$$
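Equations (1) and (2) can be illustrated with a minimal Python sketch. The function name `preprocess` and the LSB-first list layout are assumptions made for this illustration, and the operands are the 8-bit test vector used later in Section IV.

```python
def preprocess(a: int, b: int, width: int = 8):
    """Return per-bit propagate and generate lists (index 0 = LSB)."""
    # Eq. (1): P_i = A_i xor B_i
    p = [((a >> i) & 1) ^ ((b >> i) & 1) for i in range(width)]
    # Eq. (2): G_i = A_i and B_i
    g = [((a >> i) & 1) & ((b >> i) & 1) for i in range(width)]
    return p, g


# Test vector from Section IV: A = 10110010, B = 01101101
p, g = preprocess(0b10110010, 0b01101101)
print("P (LSB first):", p)
print("G (LSB first):", g)
```

Only bit 5 generates a carry for this input pair, since it is the only position where both operand bits are 1.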

### B. Carry-Generation and Propagation Stage

The PPAs differ in the topology of their carry tree, so this stage is what distinguishes one adder from another. All carry signals are computed in parallel during the prefix computation. At this stage, the generate and propagate signals of adjacent bits or groups are combined into group signals, which act as intermediate signals for further calculations.

$$G_2 = G_1 \text{ or } (G_0 \text{ and } P_1) \quad (3)$$

$$P_2 = P_1 \text{ and } P_0 \quad (4)$$
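Equations (3) and (4) describe a single prefix cell. In the sketch below, subscript 1 is read as the more significant input pair, subscript 0 as the less significant one, and subscript 2 as the cell output; this reading and the name `prefix_op` are assumptions of the illustration.

```python
def prefix_op(hi, lo):
    """Combine two (generate, propagate) pairs into one group pair."""
    g1, p1 = hi  # more significant bit or group
    g0, p0 = lo  # less significant bit or group
    g2 = g1 | (g0 & p1)  # Eq. (3)
    p2 = p1 & p0         # Eq. (4)
    return g2, p2


# The upper position only propagates, the lower one generates:
# the combined group generates a carry and does not propagate.
print(prefix_op((0, 1), (1, 0)))
```

The operator is associative, which is what allows the carry tree to group the cells in different ways and gives rise to the different PPA topologies.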

### C. Post-Processing Stage

In this final stage, each sum bit is obtained by an exclusive-OR of the propagate signal ( $P_i$ ) with the carry signal ( $C_{i-1}$ ) produced by the carry-generation stage.

$$S_o = P_i \text{ xor } C_{i-1} \quad (5)$$

$$C_o = G_i \quad (6)$$
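The post-processing step can be sketched as follows. The helper `postprocess` is hypothetical, the carry-in is assumed to be 0, and the propagate and carry values are those of the Section IV test vector written LSB first.

```python
def postprocess(p, carries, cin=0):
    """Sum bit i is P_i xor C_(i-1); carries[i] is the carry out of bit i."""
    carry_in = [cin] + carries[:-1]  # carry entering each bit position
    s = [pi ^ ci for pi, ci in zip(p, carry_in)]
    return s, carries[-1]  # carry-out is the carry of the last bit


# Propagate bits and carries for A = 10110010, B = 01101101 (LSB first)
p = [1, 1, 1, 1, 1, 0, 1, 1]
carries = [0, 0, 0, 0, 0, 1, 1, 1]
s, cout = postprocess(p, carries)
value = sum(bit << i for i, bit in enumerate(s))
print(f"S = {value:08b}, carry = {cout}")
```

With a carry-in of 0, the carry out of each position equals the group generate signal of that position, which is the relation expressed by Eq. (6).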

## III. IMPLEMENTATION

### A. Kogge Stone Adder

The Kogge-Stone adder, shown in Fig. 1, was proposed by P. M. Kogge and H. S. Stone in 1973. It is regarded as one of the most straightforward and basic parallel prefix designs. By computing the prefix sums in parallel, the parallel prefix approach used by the Kogge-Stone adder greatly speeds up addition [8]. The delay grows logarithmically with the number of bits, which is far slower growth than the linear increase of a ripple carry adder. Limiting the fan-out to two at every stage increases speed by lowering the load on any one gate. It is used in CPU and GPU ALUs where fast arithmetic operations are essential [9].



Fig. 1. 8-Bit Kogge Stone Adder
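A behavioural sketch of the Kogge-Stone carry tree is given below. It is a bit-level software model written for illustration only, not the transistor-level design implemented in this work; the function name and the zero carry-in are assumptions.

```python
def kogge_stone_add(a: int, b: int, width: int = 8):
    """Behavioural model of a Kogge-Stone adder; returns (sum, carry_out)."""
    p = [((a >> i) ^ (b >> i)) & 1 for i in range(width)]
    g = [((a >> i) & (b >> i)) & 1 for i in range(width)]
    G, P = g[:], p[:]
    d = 1
    while d < width:  # log2(width) levels with distances 1, 2, 4, ...
        G_new, P_new = G[:], P[:]
        for i in range(d, width):
            # every position i >= d combines with position i - d
            G_new[i] = G[i] | (P[i] & G[i - d])
            P_new[i] = P[i] & P[i - d]
        G, P = G_new, P_new
        d *= 2
    carry_in = [0] + G[:-1]  # carry entering bit i is G[i-1:0]
    s = sum((p[i] ^ carry_in[i]) << i for i in range(width))
    return s, G[-1]


s, cout = kogge_stone_add(0b10110010, 0b01101101)
print(f"S = {s:08b}, carry = {cout}")
```

Every level updates all positions at or above the current distance, so the model needs three levels for eight bits, at the cost of many prefix cells and long wires.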

### B. Ladner Fischer Adder

The Ladner-Fischer adder, shown in Fig. 2, was proposed by R. Ladner and M. Fischer in 1980. It is regarded as the quickest adder based on design time. By lowering the number of gates and the wiring complexity, the Ladner-Fischer adder improves area efficiency [10]. The small logic depth offered by this structure comes at the cost of a significant fan-out. Compared to the Kogge-Stone adder, it requires less wiring and fewer gates, although its design and implementation can still be complicated. This topology makes it simple to trade area against latency.

### C. Brent-Kung Adder

The Brent-Kung adder, shown in Fig. 3, was proposed by Brent and Kung in 1982. It is widely used and highly regarded. It consumes less area because it uses more logic levels while reducing the overall number of processing elements [11]. Consequently, its power consumption is also reduced [12]. It controls fan-out efficiently, ensuring that no gate drives an excessive number of other gates, which helps preserve speed and reliability [13].
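The Brent-Kung structure can be modelled in software as an up-sweep followed by a down-sweep over the generate and propagate signals. The sketch below is a behavioural illustration under the assumptions that the width is a power of two and the carry-in is 0; the function name is hypothetical.

```python
def brent_kung_add(a: int, b: int, width: int = 8):
    """Behavioural model of a Brent-Kung adder; width must be a power of two."""
    p = [((a >> i) ^ (b >> i)) & 1 for i in range(width)]
    g = [((a >> i) & (b >> i)) & 1 for i in range(width)]
    G, P = g[:], p[:]

    def combine(i, j):
        # merge the group ending at j into the group ending at i
        G[i] = G[i] | (P[i] & G[j])
        P[i] = P[i] & P[j]

    d = 1
    while d < width:  # up-sweep: build groups of size 2, 4, 8, ...
        for i in range(2 * d - 1, width, 2 * d):
            combine(i, i - d)
        d *= 2
    d = width // 4
    while d >= 1:  # down-sweep: fill in the remaining prefixes
        for i in range(3 * d - 1, width, 2 * d):
            combine(i, i - d)
        d //= 2
    carry_in = [0] + G[:-1]
    s = sum((p[i] ^ carry_in[i]) << i for i in range(width))
    return s, G[-1]


s, cout = brent_kung_add(0b10110010, 0b01101101)
print(f"S = {s:08b}, carry = {cout}")
```

For eight bits this model uses five levels and eleven prefix cells, which reflects the trade of extra logic depth for fewer processing elements described above.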

Fig. 2. 8-Bit Ladner Fischer Adder

Fig. 3. 8-Bit Brent-Kung Adder

### D. Sklansky Adder

The Sklansky adder, also known as the divide-and-conquer adder, breaks the problem up into smaller subproblems, solves each one, and then combines the solutions. Carry computation can be done efficiently with this method [14]. The structure reduces the number of gates and connections, which makes it more area efficient. It is recognized for its small logic depth, which results in faster computation [15]. The implemented design of the Sklansky adder is shown in Fig. 4.
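The divide-and-conquer idea can be sketched as a behavioural model in which, at level k, every position in the upper half of each block of size 2^(k+1) combines with the top position of the lower half. The function name, the power-of-two width, and the zero carry-in are assumptions of this illustration.

```python
def sklansky_add(a: int, b: int, width: int = 8):
    """Behavioural model of a Sklansky adder; width must be a power of two."""
    p = [((a >> i) ^ (b >> i)) & 1 for i in range(width)]
    g = [((a >> i) & (b >> i)) & 1 for i in range(width)]
    G, P = g[:], p[:]
    k = 0
    while (1 << k) < width:  # log2(width) levels
        for i in range(width):
            if (i >> k) & 1:  # i lies in the upper half of its block
                j = ((i >> k) << k) - 1  # top position of the lower half
                G[i] = G[i] | (P[i] & G[j])
                P[i] = P[i] & P[j]
        k += 1
    carry_in = [0] + G[:-1]
    s = sum((p[i] ^ carry_in[i]) << i for i in range(width))
    return s, G[-1]


s, cout = sklansky_add(0b10110010, 0b01101101)
print(f"S = {s:08b}, carry = {cout}")
```

At level k a single position drives 2^k cells, so the small logic depth is obtained at the price of a fan-out that doubles at every level.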

Fig. 4. 8-Bit Sklansky Adder

The colour coding of the blocks is as follows:

- Black Cell: Bit Propagate and Generate ( $P_i = A_i \text{ xor } B_i$  &  $G_i = A_i \text{ and } B_i$ )
- Blue Cell: Group Propagate and Generate ( $P_2 = P_1 \text{ and } P_0$  &  $G_2 = G_1 \text{ or } (G_0 \text{ and } P_1)$ )
- Grey Cell: Group Generate ( $G_2 = G_1 \text{ or } (G_0 \text{ and } P_1)$ )
- White Cell: Post-Processing Stage ( $S_o = P_i \text{ xor } C_{i-1}$  &  $C_o = G_i$ )

## IV. RESULTS AND DISCUSSION

The eight-bit PPAs implemented in this study are the KSA, BKA, LFA, and Sklansky adder. All the adders are simulated in the Cadence Virtuoso tool using 180 nm technology. The inputs provided to the 8-bit adders are  $A=10110010$  and  $B=01101101$ , and the output of all the adders is  $S=00011111$  with  $\text{carry}=1$ . The output waveform of the adders is shown in Fig. 5.
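The expected output for this test vector can be cross-checked with ordinary integer arithmetic, as in the short sketch below (the variable names are chosen for this illustration).

```python
A = 0b10110010  # 178
B = 0b01101101  # 109
total = A + B   # 287, which needs nine bits

S = total & 0xFF    # lower eight bits form the sum output
carry = total >> 8  # ninth bit is the carry-out
print(f"S = {S:08b}, carry = {carry}")
```

The result agrees with the stated simulation output.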

Power and delay were extracted from the transient analysis. The Kogge-Stone adder has a power of 20.532 uW and a delay of 4.580 ns. Owing to its denser wiring and larger transistor count, it consumes the most power of the four adders; in these measurements its delay is lower than that of the Brent-Kung adder but slightly higher than those of the Ladner-Fischer and Sklansky adders. The Brent-Kung adder, on the other hand, has the highest delay because of its greater logic depth, but it has the fewest transistors and consumes less power than the Kogge-Stone adder. The Brent-Kung adder yielded a power and delay of 15.196 uW and 5.91 ns, respectively.

Fig. 5. Output Waveform

The Ladner-Fischer adder balances the number of gates employed against the logic depth, which makes it especially well suited to applications that need both high speed and area efficiency. Its power and delay are 11.240 uW and 4.520 ns, respectively. The Sklansky adder reduces the number of stages required for carry propagation, which results in a reduced gate count, but at the expense of increased fan-out at the later stages. Compared with the other parallel prefix adders, its main advantage is a less complicated implementation. The Sklansky adder's power and delay are 17.864 uW and 4.148 ns, respectively.

TABLE I  
COMPARISON TABLE

| Adder   | Power (uW) | Measured Delay (ns) | Reference Delay (ns) | No. of Transistors |
|---------|-----------|--------------------|--------------------|--------------------|
| KSA [2] | 20.532    | 4.580              | 4.302              | 696                |
| LFA [2] | 11.240    | 4.520              | 5.144              | 672                |
| BKA [2] | 15.196    | 5.910              | 3.581              | 648                |
| JSA [3] | 17.864    | 4.148              | 4.273              | 672                |

Table I compares the delays obtained in this project, implemented in Cadence Virtuoso (GPDK 180 nm), with the values reported in earlier work implemented in Xilinx Vivado. The delays measured in Cadence are higher for KSA and BKA and lower for LFA and JSA. The comparison of power and delay obtained in this project is shown in Fig. 6.
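One way to view the power and delay figures together is the power-delay product (PDP). This metric is not reported in the measurements above; the sketch below simply multiplies the power and measured delay columns of Table I as an illustrative figure of merit.

```python
# Power (uW) and measured delay (ns) copied from Table I
table = {
    "KSA": (20.532, 4.580),
    "LFA": (11.240, 4.520),
    "BKA": (15.196, 5.910),
    "JSA": (17.864, 4.148),
}

# uW * ns = 1e-6 W * 1e-9 s = 1e-15 J, so the product is in femtojoules
pdp = {name: power * delay for name, (power, delay) in table.items()}

for name, value in sorted(pdp.items(), key=lambda kv: kv[1]):
    print(f"{name}: {value:.2f} fJ")
```

With the Table I values, the Ladner-Fischer adder has the lowest product (about 50.8 fJ) and the Kogge-Stone adder the highest (about 94.0 fJ).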

Fig. 6. Power and Delay Comparison

## V. CONCLUSION

This work compared the advantages and disadvantages of each parallel prefix adder architecture using Cadence Virtuoso. The Kogge-Stone adder offers high speed, but at the expense of increased power and area consumption. The Brent-Kung adder is a well-balanced option for applications where speed and area are equally important. The Sklansky adder has a simple design but moderate performance, whereas the Ladner-Fischer adder achieves a balanced trade-off, making it a versatile alternative for a variety of applications. This study helps to determine the best adder architecture for high-performance computing systems depending on specific design criteria. In future work, a Wallace multiplier can be built using the best of these four adders, and the design can be implemented in smaller GPDK technologies, such as 90 nm and 45 nm, in Cadence Virtuoso.

## REFERENCES

- [1] B. N. Shashikala, P. R, P. Prahallad, R. S K and M. V. Ramyashree, "Design and implementation of Kogge-stone, Sparse Kogge-stone and Spanning tree adder," 2023 International Conference on Smart Systems for applications in Electrical Sciences (ICSSES), Tumakuru, India, 2023.
- [2] S. Gauhar, A. Sharif and N. Alam, "Comparison of Parallel Prefix Adders Based on FPGA & ASIC Implementations," 2020 IEEE Students Conference on Engineering & Systems (SCES), Prayagraj, India, 2020, pp. 1-6.
- [3] R. Pandey, "Implementation of Approximate Adder circuit of Ladner Fischer Adder (16 bit)," 2020 4th International Conference on Electronics, Communication and Aerospace Technology (ICECA), Coimbatore, India, 2020, pp. 31-34.

- [4] A. Stefanidis, I. Zoumpoulidou, D. Filippas, G. Dimitrakopoulos and G. C. Sirakoulis, "Synthesis of Approximate Parallel-Prefix Adders," in IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 31, no. 11, pp. 1686-1699, Nov. 2023.
- [5] A. Prasath A.M., R. V. Arjun, K. Deepaknath and K. Gayathree, "Implementation of optimized digital filter using sklansky adder and kogge stone adder," 2020 6th International Conference on Advanced Computing and Communication Systems (ICACCS), Coimbatore, India, 2020, pp. 661-664.
- [6] Y. K. G. R. S. S. T. R. T and V. T, "Design and Simulation of 16x16 Vedic Multiplier using Kogge-Stone Adder," 2023 7th International Conference on Computing Methodologies and Communication (ICCMC), Erode, India, 2023, pp. 452-457.
- [7] S. K.C., S. M., G. B.C., L. D.M., Navya and P. N.V., "Performance Analysis of Parallel Prefix Adder for Datapath Vlsi Design," 2018 Second International Conference on Inventive Communication and Computational Technologies (ICICCT), Coimbatore, India, 2018, pp. 1552-1555, doi: 10.1109/ICICCT.2018.8473087.
- [8] U. Penchalaiah and S. K. VG, "Design of High-Speed and Energy-Efficient Parallel Prefix Kogge Stone Adder," 2018 IEEE International Conference on System, Computation, Automation and Networking (ICSCA), Pondicherry, India, 2018, pp. 1-7.
- [9] B. Penumutchi, S. Vella and H. Satti, "Kogge Stone Adder with GDI technique in 130nm technology for high performance DSP applications," 2017 International Conference On Smart Technologies For Smart Nation (SmartTechCon), Bengaluru, India, 2017, pp. 5-10, doi: 10.1109/SmartTechCon.2017.8358334.
- [10] K. A and C. H. Gowda, "An Efficient 32-bit Ladner Fischer Adder derived using Han-Carlson," 2021 IEEE International Conference on Mobile Networks and Wireless Communications (ICMNWC), Tumkur, Karnataka, India, 2021, pp. 1-4.
- [11] M. A. A. Amin, M. Kartwi, M. Yaacob, E. A. Z. Hamidi, T. S. Gunawan and N. Ismail, "Design of Brent Kung Prefix Form Carry Look Ahead Adder," 2022 8th International Conference on Wireless and Telematics (ICWT), Yogyakarta, Indonesia, 2022, pp. 1-6, doi: 10.1109/ICWT55831.2022.9935137.
- [12] N. U. Kumar, K. B. Sindhu, K. D. Teja and D. S. Satish, "Implementation and comparison of VLSI architectures of 16-bit carry select adder using Brent Kung adder," 2017 Innovations in Power and Advanced Computing Technologies (i-PACT), Vellore, India, 2017, pp. 1-7, doi: 10.1109/IPACT.2017.8244982.
- [13] K. G. Hepziba and C. P. Subha, "A novel implementation of high speed modified brent kung carry select adder," 2016 10th International Conference on Intelligent Systems and Control (ISCO), Coimbatore, India, 2016, pp. 1-5, doi: 10.1109/ISCO.2016.7727130.
- [14] K. Sathish and P. Jagadeesh, "An Efficient Design and Performance Analysis of Novel 8 Bit Modified Wallace Multiplier Using Sklansky Adder in Comparison with Kogge-Stone Adder (KSA)," 2023 Eighth International Conference on Science Technology Engineering and Mathematics (ICONSTEM), Chennai, India, 2023, pp. 1-7.
- [15] A. Garg, D. Agrawal, P. Kularia, N. Gaur, A. Mehra and S. Rajput, "Area efficient modified booth adder based on sklansky adder," 2017 2nd International Conference for Convergence in Technology (I2CT), Mumbai, India, 2017, pp. 308-312, doi: 10.1109/I2CT.2017.8226142.