

# Design and Implementation of Digital Down Converter for WiFi Network

Debarshi Datta and Himadri Sekhar Dutta

**Abstract**—This letter introduces a field-programmable gate array (FPGA)-based digital down converter (DDC) that reduces an input sampling frequency of about 3.64 GHz to an output sampling frequency of 28.4375 MHz to match the IEEE 802.11ah WiFi HaLow standard. The proposed DDC uses a polyphase mixer (PM) and a resampling filter. The PM adopts parallel coordinate rotation digital computer (CORDIC) processors and lowpass filter arrays to reduce high-speed data rates with minimum resource utilization. The resampling filter employs a cascaded integrator comb (CIC) filter built with parallel prefix adders (PPAs) and a multichannel systolic finite impulse response (FIR) filter in canonical form, which keeps the hardware cost low. Converting floating-point to fixed-point data types provides significant resource savings. The design is synthesized with the Xilinx Vivado tool and tested on a Kintex-7 FPGA device. Compared with other recent architectures, the proposed design substantially reduces area and power. MATLAB verification shows a spurious-free dynamic range (SFDR) of 115 dB.

**Index Terms**—Cascaded integrator comb (CIC), digital down converter (DDC), field-programmable gate array (FPGA), finite impulse response (FIR), spurious-free dynamic range (SFDR).

## I. INTRODUCTION

THE IMPLEMENTATION of an efficient digital down converter (DDC) plays a key role in communication receivers to allow a decrease in the sampling frequency [1]. It is often required to provide parallel processing in DDC architecture using the field-programmable gate array (FPGA) device.

Some recent works related to DDC designs have been discussed in the technical literature. Sikka et al. [2] suggested a DDC architecture to convert the sampling frequency from 200 to 1 MHz using a six-stage cascaded integrator comb (CIC) filter on the FPGA Kintex-7 device. An FPGA implementation of the DDC, including a polyphase CIC filter, was described for wireless applications [3]. A reconfigurable DDC structure was proposed in [4] that reduced the sample rate with an input bandwidth of 3.6 GHz to produce a flexible output using a variable sampling rate factor on the Kintex-7 device. In [5], a DDC design based on Xilinx IP cores was described for a wideband direction finder using the Kintex-7 device. An FPGA-based parallel DDC structure with decomposed numerically controlled oscillators (NCOs) converted a high-data-rate signal to a lower-data-rate signal [6].

Manuscript received 12 May 2023; accepted 10 June 2023. Date of publication 16 June 2023; date of current version 30 May 2024. This manuscript was recommended for publication by S. Katkoori. (*Corresponding author:* Debarshi Datta.)

Debarshi Datta is with the Electronics and Communication Engineering Department, Maulana Abul Kalam Azad University of Technology, Kolkata 700064, India (e-mail: debarshidatta7@gmail.com).

Himadri Sekhar Dutta is with the Electronics and Communication Engineering Department, Kalyani Government Engineering College, Kalyani 741235, India (e-mail: himadri.dutta@gmail.com).

Digital Object Identifier 10.1109/LES.2023.3286951

Fig. 1. Structure of DDC filter chain.

However, the existing DDC architectures are not resource-efficient on the FPGA platform. In addition, a DDC must be flexible enough to serve different applications. Therefore, an essential consideration in DDC design is how to minimize area and power while improving flexibility. In this letter, an area- and power-efficient DDC design is presented on an FPGA device. This letter is structured as follows. Section II presents the structure of the proposed DDC. Section III gives the details of the complete design. Section IV describes the implementation and comparison results. Section V concludes this letter.

## II. PROPOSED DDC ARCHITECTURE

The DDC filter chain is depicted in Fig. 1, showing cascaded decimation filters along with the corresponding sampling frequencies. The proposed design consists of a polyphase mixer (PM) and a resampling filter, which includes a four-stage CIC filter and a two-channel systolic finite impulse response (FIR) filter. The PM converts high-speed digital bandpass signals to lower-speed streams, allowing the DDC to be implemented on the FPGA platform. The resampling filter reduces the sampling rate and removes undesirable spectral components, yielding a complex baseband spectrum. The DDC can process an input with a high sampling frequency ( $F_s$ ) of 3.64 GHz and produce a down-converted sampling frequency ( $F_{out}$ ) of 28.4375 MHz.

Fig. 2. Structure of PM.

The antialiasing complex bandwidth at the output is  $0.8 \times F_{out}$ . The required spurious-free dynamic range (SFDR) is anticipated to be more than 80 dB. The overall decimation factor ( $R$ ) is

$$R = \frac{F_s}{F_{\text{out}}} = R_1 \times R_2 \times R_3 \quad (1)$$

where  $R_1$ ,  $R_2$ , and  $R_3$  are the decimation factors of the PM, CIC filter, and FIR filter, respectively. Programmable decimation factors make the design flexible enough for multistandard receivers.
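As a quick check of (1), the sketch below walks the sample rate through the three stages. It assumes the factors $R_1 = 8$, $R_2 = 8$, and $R_3 = 2$ quoted in the Fig. 6 caption and in Section IV-A; the function name `stage_rates` is illustrative only.

```python
from fractions import Fraction

def stage_rates(fs_mhz, factors):
    """Return the sample rate (MHz) at the input and after each decimation stage."""
    rates = [Fraction(fs_mhz)]
    for r in factors:
        rates.append(rates[-1] / r)
    return rates

# Assumed values taken from the letter: Fs = 3.64 GHz, (R1, R2, R3) = (8, 8, 2).
FS_MHZ = 3640
FACTORS = (8, 8, 2)

rates = stage_rates(FS_MHZ, FACTORS)
overall_r = FACTORS[0] * FACTORS[1] * FACTORS[2]
bandwidth_mhz = Fraction(8, 10) * rates[-1]  # antialiasing bandwidth, 0.8 * Fout

for name, rate in zip(("input", "after PM", "after CIC", "after FIR"), rates):
    print(f"{name:>9}: {float(rate):.4f} MHz")
print(f"R = {overall_r}, 0.8*Fout = {float(bandwidth_mhz):.2f} MHz")
```

Under these assumptions the rates are 3640, 455, 56.875, and 28.4375 MHz, so $R = 128$ and the antialiasing bandwidth is 22.75 MHz. The 455-MHz branch rate after the PM sits just below the 455.5-MHz $F_{max}$ reported in Section IV.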

## III. DESIGN COMPONENTS

### A. PM Structure

The PM contains a de-multiplexer followed by coordinate rotation digital computer (CORDIC) processors [7] and single-rate lowpass filters, as depicted in Fig. 2. This letter uses an 8-stage pipelined CORDIC processor to achieve the desired output. Decimation is performed before filtering so that zero-valued elements drop out of the filter calculations. The polyphase decomposition splits the input sequence  $x(n)$  into  $R_1$  subchannels, where  $n$  is the time index. Each polyphase branch runs at  $F_s/R_1$, making the DDC feasible on the FPGA platform. The intermediate signal  $g_l(n)$  is given by

$$g_l(n) = x(nR_1 - l) \quad (2)$$

where  $l = 0, 1, \dots, R_1 - 1$ . The CORDIC outputs are given by [8]

$$I_{Cl}(n) = A_t g_l(n) \cos [2\pi(nR_1 - l)f_0/F_s] \quad (3)$$

$$Q_{Cl}(n) = -A_t g_l(n) \sin [2\pi(nR_1 - l)f_0/F_s] \quad (4)$$

where  $A_t$  is the gain of the CORDIC processor after  $t$  iterations and  $f_0$  is the center frequency. The output of the PM is expressed as

$$x'(n) = x_I(n) + jx_Q(n) \quad (5)$$

where

$$x_I(n) = \sum_{l=0}^{R_1-1} A_t x(nR_1 - l) \cos \left[ 2\pi(nR_1 - l) \frac{f_0}{F_s} \right] * h_l(n) \quad (6)$$

and

$$x_Q(n) = \sum_{l=0}^{R_1-1} -A_t x(nR_1 - l) \sin \left[ 2\pi(nR_1 - l) \frac{f_0}{F_s} \right] * h_l(n) \quad (7)$$

where  $h_l(n)$  is the impulse response of the  $l$th polyphase filter branch.
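The equivalence behind (5) to (7) can be checked numerically. The following sketch assumes a unit CORDIC gain ($A_t = 1$), an ideal complex exponential in place of the CORDIC processors, and the standard polyphase tap assignment $h_l(r) = h(rR_1 + l)$; the function names and the test signal are hypothetical. It compares a direct mix, filter, and decimate chain against the polyphase form.

```python
import cmath
import math

def mixer_phase(m, f0_over_fs):
    """cos(theta) - j*sin(theta) with theta = 2*pi*m*f0/Fs, as in (3) and (4)."""
    return cmath.exp(-2j * math.pi * m * f0_over_fs)

def direct_ddc(x, h, r1, f0_over_fs):
    """Mix at the full rate, filter with h, and keep every r1-th output."""
    mixed = [x[m] * mixer_phase(m, f0_over_fs) for m in range(len(x))]
    out = []
    for n in range(len(x) // r1):
        acc = 0j
        for k, tap in enumerate(h):
            m = n * r1 - k
            if m >= 0:
                acc += tap * mixed[m]
        out.append(acc)
    return out

def polyphase_ddc(x, h, r1, f0_over_fs):
    """Sum of r1 branches, each mixing and filtering at the low rate."""
    n_out = len(x) // r1
    out = [0j] * n_out
    for l in range(r1):
        h_l = h[l::r1]                      # branch taps h(r*R1 + l)
        branch = []
        for n in range(n_out):
            m = n * r1 - l                  # g_l(n) = x(n*R1 - l)
            sample = x[m] if m >= 0 else 0.0
            branch.append(sample * mixer_phase(m, f0_over_fs))
        for n in range(n_out):
            for r, tap in enumerate(h_l):
                if n - r >= 0:
                    out[n] += tap * branch[n - r]
    return out

# Hypothetical test data; only R1 = 8 and f0/Fs = 112.2/3640 come from the letter.
x = [math.sin(0.3 * m) + 0.5 * math.cos(1.1 * m) for m in range(64)]
h = [1.0 / (1 + k) for k in range(16)]
y_direct = direct_ddc(x, h, 8, 112.2 / 3640)
y_poly = polyphase_ddc(x, h, 8, 112.2 / 3640)
max_err = max(abs(a - b) for a, b in zip(y_direct, y_poly))
print(f"max difference: {max_err:.3e}")
```

Both paths produce the same samples up to floating-point rounding, but in the polyphase form every multiplication runs at $F_s/R_1$ rather than $F_s$.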



Fig. 3. Structure of the four-stage CIC filter.

### B. CIC Filter

The computationally efficient lowpass CIC filter is used for antialiasing in systems with large sample-rate changes [9]. After decimation by  $R_1$, the incoming sample rate is  $F_s/R_1$. The passband frequency ( $\omega_p$ ) is  $0.8 \times \pi/KR_2$, where  $K$  is the number of stages. The proposed CIC uses high-speed Brent-Kung (BK) parallel prefix adders (PPAs) and partitions the decimation factor ( $R_2 = R_{2a}R_{2b}$ ) to optimize the critical path delay, as illustrated in Fig. 3. The main component of the CIC filter is the adder circuit, and in this letter, the BK adder is placed in the integrator as well as in the comb section [10]. The MATLAB tool provides the data width in each integrator (I) and comb (C) section. The suggested four-stage CIC filter achieves a gain of 60.21 dB (for  $R_2 = 8$ ). To match the design specifications, the aliasing tolerance is set to 80 dB.

Consider the addition of two  $k$-bit numbers  $X$  and  $Y$. The PPA works in three processing stages. In the first (preprocessing) stage, the generate bits  $G_u$, propagate bits  $P_u$, and half-sums  $H_u$  are precomputed as  $G_u = X_u \& Y_u$,  $P_u = X_u \vee Y_u$, and  $H_u = X_u \oplus Y_u$  for  $0 \leq u \leq k-1$, where  $\&$,  $\vee$, and  $\oplus$  denote conjunction, disjunction, and exclusive disjunction, respectively. The second stage, the parallel prefix graph, generates the carries  $C_u$  from  $G_u$  and  $P_u$. Here, the operator  $\circ$  combines pairs of generate and propagate bits and is expressed as [11]

$$(G, P) \circ (G', P') = (G \vee (P \& G'), P \& P'). \quad (8)$$

Applying the operator successively to the bit pairs  $(G, P)$  yields the group value  $(G_{u:v}, P_{u:v})$  for  $u > v$, which is defined as

$$(G_{u:v}, P_{u:v}) = (G_u, P_u) \circ (G_{u-1}, P_{u-1}) \circ \cdots \circ (G_v, P_v). \quad (9)$$

The carry operator calculates all the carries, and the carry is given by

$$C_u = G_{u:0} \quad (10)$$

for all  $u \geq 0$.

The last stage is the post-processing stage, which forms the sum bits: with no carry-in,  $S_0 = H_0$,  $S_u = H_u \oplus C_{u-1}$  for  $1 \leq u \leq k-1$, and  $S_k = C_{k-1}$. Fig. 4 shows the 8-bit BK adder network. The tree requires  $2 \log_2 k - 1$  steps to generate the carry bits. Therefore, this 8-bit BK adder implementation has five stages.
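The three PPA stages can be sketched in software as follows. This is a bit-level model, not the authors' Verilog: it assumes $k$ is a power of two, no carry-in, and one common Brent-Kung wiring (an up-sweep followed by a down-sweep). The function name `brent_kung_add` is hypothetical.

```python
def brent_kung_add(x, y, k=8):
    """Add two k-bit unsigned integers with a Brent-Kung prefix network.

    Returns (sum, levels), where levels counts the prefix-graph levels.
    Assumes k is a power of two and there is no carry-in.
    """
    # Stage 1: preprocessing (generate, propagate, half-sum per bit).
    g = [(x >> u) & (y >> u) & 1 for u in range(k)]
    p = [((x >> u) | (y >> u)) & 1 for u in range(k)]
    h = [((x >> u) ^ (y >> u)) & 1 for u in range(k)]

    def combine(u, v):
        # (G, P) o (G', P') = (G | (P & G'), P & P'), stored back at position u.
        g[u], p[u] = g[u] | (p[u] & g[v]), p[u] & p[v]

    # Stage 2: prefix graph. After both sweeps g[u] holds G_{u:0} = C_u.
    levels = 0
    d = 1
    while d < k:                        # up-sweep: spans of 2, 4, 8, ...
        for u in range(2 * d - 1, k, 2 * d):
            combine(u, u - d)
        d *= 2
        levels += 1
    d = k // 4
    while d >= 1:                       # down-sweep: fill in the remaining carries
        for u in range(3 * d - 1, k, 2 * d):
            combine(u, u - d)
        d //= 2
        levels += 1

    # Stage 3: post-processing with no carry-in, so S_0 = H_0.
    total = h[0]
    for u in range(1, k):
        total |= (h[u] ^ g[u - 1]) << u
    total |= g[k - 1] << k              # S_k = C_{k-1}
    return total, levels
```

For example, `brent_kung_add(200, 100)` returns `(300, 5)`; the five levels agree with $2 \log_2 k - 1$ for $k = 8$.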



Fig. 4. 8-bit BK prefix adder graph.
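For reference, the integrator-comb structure of Fig. 3 can be modeled behaviorally as below. The sketch assumes a differential delay of one, applies the whole decimation factor in a single step rather than the $R_{2a}R_{2b}$ partition, and uses unbounded Python integers instead of the truncated word lengths of the hardware. The names `cic_decimate` and `cic_register_bits` are hypothetical; the register-width formula follows Hogenauer [9].

```python
import math

def cic_decimate(samples, rate, stages):
    """Behavioral CIC decimator: integrators, downsampling, then combs."""
    # Integrator section, running at the input rate.
    acc = [0] * stages
    integrated = []
    for s in samples:
        v = s
        for i in range(stages):
            acc[i] += v
            v = acc[i]
        integrated.append(v)

    # Keep one sample out of every `rate`.
    decimated = integrated[rate - 1::rate]

    # Comb section, running at the output rate (differential delay of one).
    for _ in range(stages):
        prev = 0
        combed = []
        for v in decimated:
            combed.append(v - prev)
            prev = v
        decimated = combed
    return decimated

def cic_register_bits(input_bits, rate, stages, diff_delay=1):
    """Full-precision register width: B_in + ceil(N * log2(R * M))."""
    return input_bits + math.ceil(stages * math.log2(rate * diff_delay))

# Assumed parameters from the letter: four stages, R2 = 8, 12-bit input.
steady_state = cic_decimate([1] * 80, 8, 4)[-1]
width = cic_register_bits(12, 8, 4)
print(steady_state, width)
```

With these parameters the full-precision register width comes to 24 bits, the same as the output data length listed in Section IV-A.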

Fig. 5. Multichannel ( $C = 2$ ) FIR filter ( $R_3 = 2$ ).

### C. FIR Filter

Usually, the CIC filter exhibits a narrow, drooping passband, which attenuates the signal. Therefore, a multichannel FIR filter is designed to compensate for the passband droop [12]. A multichannel FIR filter can access a large number of coefficient sets. According to FDATool, the filter order is 64 with 16-bit coefficients. The filter order controls the passband and stopband attenuations. Fig. 5 shows a multichannel ( $c = 2$ ) systolic decimation ( $R_3 = 2$ ) time-multiplexed FIR filter using random access memory (RAM). The minimum input sample rate of the FIR filter is  $F_s/(R_1 \times R_2)$, and the antialiasing output bandwidth is 80% of  $F_{\text{out}}$. The passband and stopband cut-off frequencies are set to  $0.2 \times \pi/R_3$  and  $0.8 \times 2\pi/3R_3$, respectively. The FIR filter is clocked at  $R_3 \times c = 4$  times its input sample rate, that is, at  $4 \times F_s/(R_1 \times R_2)$. The input register is used for synchronization, and the output register ensures correct results for all data inputs. The coefficients are stored in a RAM of size  $2 \times 2$. The multiplexed (2:1) two-channel data is multiplied by the coefficients. The sum of the products is passed to the next stage of the FIR structure. The counters ( $\text{Cnt}$ ) ( $2 \times 2$  clock cycles) drive the multiplexer select line and the RAM address. An accumulation unit produces the final output.

Fig. 6. Power spectrum of the DDC system ( $R_1 = 8$ ,  $R_2 = 8$ ,  $R_3 = 2$ ).

This letter uses the canonical implementation of the complex-valued FIR filter to reduce the number of multiplications [13]. With this form, the presented 64-tap FIR filter requires 192 multiplications, whereas a conventional FIR design needs 256 [14].
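The 192 versus 256 count is consistent with computing each complex tap product using three real multiplications instead of four. The sketch below shows that identity; whether it matches the canonical form of [13] exactly is an assumption, and the function names are illustrative.

```python
def complex_mul_4(a, b, c, d):
    """(a + jb)(c + jd) with four real multiplications."""
    return a * c - b * d, a * d + b * c

def complex_mul_3(a, b, c, d):
    """Same product with three real multiplications and five additions."""
    k1 = c * (a + b)
    k2 = a * (d - c)
    k3 = b * (c + d)
    return k1 - k3, k1 + k2   # (ac - bd, ad + bc)

TAPS = 64                      # tap count quoted in the letter
mults_conventional = 4 * TAPS  # four multipliers per complex tap
mults_reduced = 3 * TAPS       # three multipliers per complex tap
print(mults_conventional, mults_reduced)
```

The saving trades one multiplier per tap for three extra adders, which usually suits FPGAs where DSP slices are scarcer than fabric adders.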

## IV. FPGA IMPLEMENTATION AND COMPARISONS

### A. Design Specifications

The PM provides sufficient stopband attenuation and low passband ripple to minimize signal distortion, indicating good filtering quality. The design specifications for the IEEE 802.11ah WiFi networking standard are listed below [15].

- 1) Input sampling frequency: 3.64 GHz.
- 2) Output sampling frequency: 28.4375 MHz.
- 3) Overall decimation factor: 128 ( $8 \times 8 \times 2$ ).
- 4) Channel bandwidth: 16 MHz.
- 5) Input data length: 12 bit.
- 6) Output data length: 24 bit.
- 7) Passband ripple  $\leq 0.1$  dB.
- 8) Stopband attenuation  $\geq 80$  dB.

### B. Validation

The input carrier frequency is 112.2 MHz. The signal is sampled by a 3.64-GHz clock and quantized to 12 bits for the DDC implementation. The ChipScope outputs are sent to MATLAB R2021a to measure the power spectrum. As shown in Fig. 6, the SFDR is 115 dB.

### C. Comparisons

The presented DDC has been simulated using the Xilinx Vivado 2022.1 tool targeting the Kintex-7 device (XC7K70TFCG676). The design is described in Verilog hardware description language (HDL) using efficient coding practices (such as resource sharing, minimizing data transitions, and integer-format variables [16]) to optimize the available resources. Truncation is applied at each filter node to reduce the data length so that overflow errors are prevented. Using fixed-point data types reduces FPGA resources and power while improving the maximum clock rate  $F_{max}$  compared with the floating-point implementation, as shown in Table I.

TABLE I  
IMPLEMENTING FLOATING-POINT AND FIXED-POINT DDC

| Attributes      | Single precision floating-point | Fixed-point |
|-----------------|---------------------------------|-------------|
| LUTs            | 11403                           | 1267        |
| DSP48Es         | 204                             | 40          |
| $F_{max}$ (MHz) | 370                             | 455.5       |
| Power (mW)      | 512                             | 268         |

TABLE II  
PERFORMANCE EVALUATION OF PM, CIC, AND FIR FILTERS ( $R_1 = 8$ ,  $R_2 = 8$ ,  $R_3 = 2$ ), ALL TARGETING KINTEX-7,  
NA STANDS FOR NOT AVAILABLE

| Architecture type | Slices | LUTs | Delay(ns) | Power(mW) |
|-------------------|--------|------|-----------|-----------|
| Ref.[6]           | 1378   | 1103 | NA        | 167       |
| Proposed PM       | 1063   | 774  | 1.932     | 135       |
| Ref.[17]          | 397    | 317  | 2.145     | 73        |
| Proposed CIC      | 324    | 245  | 1.632     | 62        |
| Ref.[14]          | 525    | 432  | 2.211     | 88        |
| Proposed FIR      | 402    | 258  | 1.745     | 71        |

TABLE III  
COMPARISON RESULTS OF DIFFERENT DDC SOLUTIONS

| Resource type   | Ref.[5] | Ref.[4] | Ref.[2] | Proposed solution |
|-----------------|---------|---------|---------|-------------------|
| Slices          | 37066   | 13552   | 2253    | 1789              |
| LUTs            | 69499   | 7269    | 1828    | 1267              |
| DSP48Es         | 1034    | 43      | 149     | 40                |
| BRAMs           | NA      | 22      | NA      | 8                 |
| IOBs            | NA      | NA      | 68      | 66                |
| $F_{max}$ (MHz) | NA      | 454.3   | 498     | 455.5             |
| Power (W)       | NA      | 1.446   | 0.325   | 0.268             |
| SFDR (dB)       | NA      | 80.3    | 108     | 115               |
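To illustrate the fixed-point conversion and node truncation, the sketch below quantizes a value to a signed 16-bit word and drops low-order bits. The Q1.15 format, the saturation behavior, and the helper names are assumptions; the letter states only that the coefficients are 16 bits wide and that truncation is applied at each filter node.

```python
import math

def to_fixed(value, frac_bits=15, word_bits=16):
    """Quantize a real value to a signed fixed-point word (floor, then saturate)."""
    scaled = math.floor(value * (1 << frac_bits))
    lo = -(1 << (word_bits - 1))
    hi = (1 << (word_bits - 1)) - 1
    return max(lo, min(hi, scaled))

def to_float(word, frac_bits=15):
    """Convert a fixed-point word back to a real value."""
    return word / (1 << frac_bits)

def truncate(word, drop_bits):
    """Drop low-order bits with an arithmetic right shift (rounds toward -inf)."""
    return word >> drop_bits

coeff = to_fixed(0.1)
print(coeff, to_float(coeff), truncate(coeff, 4))
```

Truncation by shifting costs no logic in hardware, but it adds a small negative bias; rounding would remove the bias at the price of one extra adder per node.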

To achieve high resolution, a multiplierless CORDIC requires a smaller lookup table (LUT) than an NCO. Table II compares the synthesis results of the proposed PM, CIC, and FIR filters with their respective counterparts.
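A rotation-mode CORDIC needs only shifts, adds, and one stored angle per iteration, which is why its table stays small. The floating-point sketch below mimics an 8-iteration rotation: multiplications by $2^{-i}$ stand in for hardware shifts, the input angle is assumed to lie within $[-\pi/2, \pi/2]$, and the function names are hypothetical.

```python
import math

def cordic_gain(iterations=8):
    """Gain A_t accumulated after the given number of micro-rotations."""
    gain = 1.0
    for i in range(iterations):
        gain *= math.sqrt(1.0 + 4.0 ** -i)
    return gain

def cordic_rotate(x, y, angle, iterations=8):
    """Rotate (x, y) by `angle` radians; the result is scaled by cordic_gain()."""
    # One arctangent entry per iteration is the only table the method needs.
    atan_table = [math.atan(2.0 ** -i) for i in range(iterations)]
    z = angle
    for i in range(iterations):
        d = 1.0 if z >= 0.0 else -1.0
        shift = 2.0 ** -i               # a right shift by i bits in hardware
        x, y = x - d * y * shift, y + d * x * shift
        z -= d * atan_table[i]
    return x, y

# Mixing a sample g by phase theta as in (3) and (4): rotate (g, 0) by -theta.
theta = 0.5
i_out, q_out = cordic_rotate(1.0, 0.0, -theta)
print(i_out / cordic_gain(), q_out / cordic_gain())
```

With eight iterations the residual angle error is bounded by $\arctan(2^{-7})$, about 0.0078 rad, so the scaled outputs approximate $\cos\theta$ and $-\sin\theta$ to roughly two decimal places; more iterations tighten this at the cost of pipeline depth.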

The synthesis results show that the presented CORDIC-based PM reduces slices (area) by 22.86% and power by 19.16% compared to a traditional mixer designed with parallel NCO arrays [6]. The described CIC filter with the BK adder improves on the conventional CIC filter [17] by 18.39% in slices, 23.92% in path delay, and 15.07% in power. The suggested FIR filter reduces slices by 23.43%, delay by 21.07%, and power by 19.32% compared to the existing FIR filter [14]. Table III shows that the complete design reduces area and power by 20.59% and 17.54%, respectively, compared to the most recent architecture [2]. The proposed solution runs at an  $F_{max}$  of 455.5 MHz. The CORDIC-based design produces an SFDR of 115 dB, which is higher than that of the previous designs. Thus, the presented DDC uses fewer slices and LUTs, consumes less power, and achieves a higher SFDR than the other DDC architectures in Table III, although [2] reports a higher  $F_{max}$.
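The quoted reductions can be reproduced from Tables II and III with the relative difference $(\text{reference} - \text{proposed})/\text{reference}$. The short script below does so; the dictionary labels are illustrative, and the numeric pairs are copied from the tables.

```python
def reduction_percent(reference, proposed):
    """Relative saving of the proposed design over the reference, in percent."""
    return 100.0 * (reference - proposed) / reference

# (reference, proposed) pairs copied from Tables II and III.
COMPARISONS = {
    "PM slices": (1378, 1063),
    "PM power (mW)": (167, 135),
    "CIC slices": (397, 324),
    "CIC delay (ns)": (2.145, 1.632),
    "CIC power (mW)": (73, 62),
    "FIR slices": (525, 402),
    "FIR delay (ns)": (2.211, 1.745),
    "FIR power (mW)": (88, 71),
    "DDC slices vs. [2]": (2253, 1789),
    "DDC power (W) vs. [2]": (0.325, 0.268),
}

for label, (ref, new) in COMPARISONS.items():
    print(f"{label:<22} {reduction_percent(ref, new):6.2f} %")
```

The computed values agree with the figures quoted in the text to within 0.01 percentage points; the small differences come from truncating rather than rounding the last digit.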

## V. CONCLUSION

In this letter, an FPGA-based DDC design is proposed for WiFi standards. The resampling filter uses a multistage structure to achieve high performance. The results show that the design saves chip area by 20.59% and power by 17.54% compared to the work in [2]. Its parallel-processing architecture makes the DDC an appealing solution for real-time signal processing.

## REFERENCES

- [1] B. Sklar, *Digital Communications: Fundamentals and Applications*, 2nd ed. Hoboken, NJ, USA: Prentice-Hall, 2017.
- [2] P. Sikka, A. R. Asati, and C. Shekhar, “Power- and area-optimized high-level synthesis implementation of a digital down converter for software-defined radio applications,” *Circuits Syst. Signal Process.*, vol. 40, pp. 2883–2894, Jun. 2021. [Online]. Available: <https://doi.org/10.1007/s00034-020-01601-9>
- [3] L. L. Motta, B. A. A. Acurio, N. F. T. Aniceto, and L. G. P. Meloni, “Design and implementation of a digital down/up conversion directly from/to RF channels in HDL,” *Integr. VLSI J.*, vol. 68, pp. 30–37, Sep. 2019. [Online]. Available: <https://doi.org/10.1016/j.vlsi.2019.05.006>
- [4] X. Liu, X.-X. Yan, Z.-K. Wang, and Q.-X. Deng, “Design and FPGA implementation of a reconfigurable digital down converter for wideband applications,” *IEEE Trans. Very Large Scale Integr. (VLSI) Syst.*, vol. 25, no. 12, pp. 3548–3552, Dec. 2017. [Online]. Available: <https://doi.org/10.1109/TVLSI.2017.2748603>
- [5] V. Obradović, P. Okiljević, N. Kozić, and D. Ivković, “Practical implementation of digital down conversion for wideband direction finder on FPGA,” *Sci. Tech. Rev.*, vol. 66, no. 4, pp. 40–46, 2016.
- [6] L. Guo, F. Tan, P. Zhan, and H. Zeng, “Decomposing numerically controlled oscillator in parallel digital down conversion architecture,” *J. Circuits Syst. Comput.*, vol. 26, no. 9, pp. 1–14, 2017. [Online]. Available: <https://doi.org/10.1142/S0218126617501262>
- [7] J. E. Volder, “The CORDIC trigonometric computing technique,” *IRE Trans. Electron. Comput.*, vol. EC-8, no. 3, pp. 330–334, Sep. 1959.
- [8] A. S. Dhar and B. Lakshmi, “CORDIC architectures: A survey,” *VLSI Des.*, vol. 2010, May 2010, Art. no. 794891, doi: [10.1155/2010/794891](https://doi.org/10.1155/2010/794891).
- [9] E. B. Hogenauer, “An economical class of digital filters for decimation and interpolation,” *IEEE Trans. Acoust., Speech, Signal Process.*, vol. ASSP-29, no. 2, pp. 155–162, Apr. 1981.
- [10] A. Abinaya, M. Maheswari, and A. S. Alqahtani, “Heuristic analysis of CIC filter design for next-generation wireless applications,” *Arabian J. Sci. Eng.*, vol. 46, pp. 1257–1268, Oct. 2020. [Online]. Available: <https://doi.org/10.1007/s13369-020-05016-1>
- [11] P. Lyakhov, M. Valueva, G. Valuev, and N. Nagornov, “A method of increasing digital filter performance based on truncated multiply-accumulate units,” *Appl. Sci.*, vol. 10, no. 24, p. 9052, 2020. [Online]. Available: <https://doi.org/10.3390/app10249052>
- [12] “Inferring Stratix V DSP blocks for FIR filtering applications,” Application Note AN639, Altera, San Jose, CA, USA, 2017.
- [13] V. Mauer, *Designing Filters for High Performance*, Altera Corp., San Jose, CA, USA, 2015, pp. 1–14.
- [14] S. Y. Park, “A low-cost FPGA implementation of multi-channel FIR filter with variable bandwidth,” *IEICE Electron. Exp.*, vol. 12, no. 22, pp. 1–7, 2015.
- [15] *IEEE 802.11ah Part, Wireless LAN Media Access Control (MAC) and Physical Layer Specifications*, IEEE Comput. Soc., New York, NY, USA, 2016.
- [16] S. N. Shahrouzi and D. G. Perera, “HDL code optimizations: Impact on hardware implementations and CAD tools,” in *Proc. IEEE PACRIM*, 2019, pp. 1–9.
- [17] R. Teymourzadeh and M. Othman, “VLSI implementation of cascaded integrator comb filters for DSP applications,” 2018, *arXiv:1808.09369*.