

# An 802.11a/b/g/n Digital Fractional-$N$ PLL With Automatic TDC Linearity Calibration for Spur Cancellation

Dongyi Liao, Hechen Wang, Fa Foster Dai, *Fellow, IEEE*, Yang Xu, Roc Berenguer, and Sara Munoz Hermoso

**Abstract**—A fractional-$N$ digital phase-locked loop (DPLL) architecture with low fractional spurs is presented in this paper. A 2-D Vernier time-to-digital convertor (TDC) is implemented to achieve a wide detection range with fine resolution. The TDC is calibrated automatically for optimal linearity using the ramp signal generated by the fractional-$N$ accumulator. A digi-phase spur cancellation technique with automatic TDC gain tracking is also implemented to further suppress the fractional spurs. The chip also includes an improved multimodulus divider (MMD) structure that overcomes the glitch problem that prior-art MMDs exhibit when the division ratio toggles, enabling continuous carrier synthesis across a wide frequency range. As part of an 802.11a/b/g/n transceiver, the DPLL covers both the 2.4- and 5-GHz Wi-Fi bands. The proposed fractional-$N$ DPLL is implemented in a 55-nm CMOS technology. The DPLL achieves a largest fractional spur level of $-55$ dBc without using a sigma-delta modulator and an in-band phase noise of $-107$ dBc/Hz (0.55 ps integrated jitter) while consuming 9.9 mW.

**Index Terms**—2-D Vernier, digital calibration, DPLL, fractional- $N$ , least-mean-square (LMS) filter, multimodulus divider (MMD), time-to-digital convertor (TDC).

## I. INTRODUCTION

AS SEMICONDUCTOR technology advances to finer feature sizes, digital circuits are becoming more efficient in both area and power. Integrating a conventional analog phase-locked loop (PLL) in such processes makes it increasingly difficult to maintain the performance of its analog components. The digital PLL (DPLL), on the other hand, is built from the same kinds of devices as digital circuits. A fully synthesizable DPLL has been proposed to take full advantage of advanced deep-submicron technology while providing easy integration with the digital system [1]. Moreover, the DPLL is highly flexible and programmable, which makes it capable of functions that are very difficult to obtain with an analog PLL. As an example, various DPLL architectures have been proposed to implement direct modulation for high-speed wireless polar transmitters [2], [3], which is a very challenging task for an analog PLL due to its nonlinear analog properties that are sensitive to process–voltage–temperature (PVT) variations.

Manuscript received August 24, 2016; revised November 9, 2016; accepted November 29, 2016. Date of publication January 16, 2017; date of current version April 20, 2017. This paper was approved by Associate Editor Danilo Manstretta.

D. Liao, H. Wang, and F. F. Dai are with the Department of Electrical and Computer Engineering, Auburn University, Auburn, AL 36849 USA (e-mail: dzl0021@auburn.edu).

Y. Xu is with the Illinois Institute of Technology, Chicago, IL 60616 USA.

R. Berenguer and S. M. Hermoso are with Innophase Inc., Chicago, IL 60605 USA.

Color versions of one or more of the figures in this paper are available online at <http://ieeexplore.ieee.org>.

Digital Object Identifier 10.1109/JSSC.2016.2638882

Fig. 1. Comparison of two DPLL architectures. (a) DPLL with DTC and TDC. (b) DPLL using TDC only.

A key aspect of DPLL operation is how the phase error between the reference clock and the feedback signal is measured. Essentially, two approaches exist. The first type, shown in Fig. 1(a), moves the feedback pulse very close to the reference clock with a digital-to-time converter (DTC), followed by a narrow-range time-to-digital convertor (TDC) that provides a finer time measurement. The required DTC can be implemented with a phase interpolator or a delay-locked loop that provides multiphase outputs of the feedback signal [4], [5]. The second type of DPLL architecture, shown in Fig. 1(b), uses only the TDC to measure the phase error [6], [7], [20]. Since all the work of the DTC is now shifted to the TDC, a wide-range TDC covering at least one oscillator period is required. Fortunately, various TDC architectures are available to meet this requirement. In addition, part of the hardware of the fine TDC can be reused to implement the coarse measurement, which speeds up loop locking.

Whether or not a DTC is used, digital calibration techniques have to be applied to suppress fractional spurs in a DPLL. Conventional fractional spur suppression using a sigma-delta modulator (SDM) [8] requires a narrow loop bandwidth in order to suppress the noise-shaped component at high frequencies. In addition, using a high-order SDM to toggle the loop division ratio spreads the feedback edge after the divider over multiple digitally controlled oscillator (DCO) cycles. This requires a TDC or DTC with a wider detectable range, which in turn leads to higher power consumption and more complicated hardware. These drawbacks motivate us to explore other spur cancellation methods, including digi-phase. Regardless of the technique employed, the spur level in a DPLL is highly dependent on the TDC linearity, necessitating accurate calibration. In this paper, we present a wideband fractional-$N$ DPLL with digital calibration for fractional spur suppression, designed for a low-power Wi-Fi transceiver in the 802.11a/b/g/n bands using a 55-nm CMOS technology. The nonlinearity of the 2-D Vernier TDC is automatically calibrated through the fractional frequency synthesis itself [9]. The implemented RFIC also includes an improved multimodulus divider (MMD) that overcomes the division ratio skipping problem associated with prior-art designs.

This paper is organized as follows. Section II discusses the design principle to achieve low fractional spurs for DPLLs. Section III describes the system architecture and the proposed TDC linearity calibration. The measurement results are presented in Section IV, and conclusions are drawn in Section V.

## II. DESIGN OF A DPLL WITH LOW FRACTIONAL SPUR

When a PLL generates a carrier frequency $f_o$ that is an integer multiple of the reference frequency $f_{ref}$, i.e., $f_o = Nf_{ref}$, the frequency divider generates one pulse every $N$ DCO cycles. When the loop is in lock, the divider output pulse is aligned with the reference pulse, and thus the phase error measured by the TDC is zero. When the DPLL generates a fractional frequency, on the other hand, the divider toggles its division ratio between $N$ and $N+1$ to achieve an equivalent fractional division ratio. Even though the average division ratio over time equals the desired fractional value, an instantaneous phase error exists between the divider output and the reference clock. This periodic error modulates the DCO control word, and thus fractional spurs arise alongside the desired carrier tone.

Various techniques including digi-phase [10] have been proposed to suppress fractional spurs. However, the cancellation at the TDC output is affected by nonideal circuit characteristics, which degrade the spur suppression. Due to the limited TDC resolution and linearity, a small residue error may still exist after the cancellation. As shown in Fig. 2, assuming that each TDC bit covers $2^{-tr}$ of one DCO cycle and each digital bit in the digi-phase cancellation signal covers $2^{-fr}$ of one DCO cycle, the TDC-resolution-induced residue has a period of $2^{fr-tr}T_{ref}$, which corresponds to a fractional spur at an offset frequency of $2^{-fr+tr}f_{ref}$. On the other hand, the divider output sweeps around the reference edge as the division ratio toggles between $N$ and $N+1$, so the TDC output waveform repeats with a period of $2^{fr}T_{ref}$, which creates a fractional spur at an offset frequency of $2^{-fr}f_{ref}$. As a result, both TDC resolution and TDC linearity affect the fractional spurs. Nevertheless, the fractional spur from the TDC nonlinearity is more critical since it is closer to the carrier tone in the spectrum. As an example, assuming a TDC resolution of 5 ps, a reference frequency of 80 MHz, a carrier frequency of 2.4 GHz, and a fractionality $2^{-fr}$ of 1/256, the resolution- and linearity-induced fractional spurs will be located at approximately 26 and 0.31 MHz, respectively. The fractional spur generated by limited TDC resolution is therefore greatly attenuated by the loop filter, leaving the spurs generated by TDC nonlinearity as the dominant source. To implement a DPLL with low fractional spurs, it is thus critical to have a highly linear TDC.

Fig. 2. Simulated TDC output and the residue signal after the digi-phase canceller, showing that TDC-resolution- and TDC-nonlinearity-induced residue errors possess different periods.
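The two offset frequencies in this example can be checked numerically. The following is a minimal sketch, not part of the original design flow; the function and variable names are illustrative, and $2^{tr}$ is taken as the number of TDC steps per DCO cycle, $T_{DCO}/\Delta t_{res}$.

```python
def spur_offsets(t_res, f_ref, f_dco, fractionality):
    """Return (resolution_spur_hz, linearity_spur_hz) of the digi-phase residue."""
    t_dco = 1.0 / f_dco
    tdc_steps_per_cycle = t_dco / t_res          # plays the role of 2**tr
    # Resolution-induced residue: period 2**(fr - tr) * T_ref
    resolution_spur = fractionality * tdc_steps_per_cycle * f_ref
    # Nonlinearity-induced residue: period 2**fr * T_ref
    linearity_spur = fractionality * f_ref
    return resolution_spur, linearity_spur


res_spur, lin_spur = spur_offsets(5e-12, 80e6, 2.4e9, 1 / 256)
print(f"{res_spur / 1e6:.2f} MHz, {lin_spur / 1e6:.4f} MHz")
```

With a loop bandwidth on the order of 1 MHz, the first spur lies far outside the loop bandwidth while the second lies inside it, which is why the nonlinearity-induced spur dominates.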

The fractional spur due to TDC nonlinearity can be further analyzed as follows: assuming that the residue error at the digi-phase canceller output can be expressed as  $\varepsilon = A_1 \sin(2\pi f_m t) + A_2 \sin^2(2\pi f_m t) + A_3 \sin^3(2\pi f_m t) + \dots$ , where  $A_1$  is the magnitude of the error's fundamental tone and  $f_m$  represents the fractional offset frequency, the power level of the closest fractional spur can be derived as

$$P_{frac}(\text{dBc}) = 20 \cdot \log_{10} \left( \frac{H(f_m) K_{DCO}}{2f_m} \cdot A_1 \right) \quad (1)$$

where $K_{DCO}$ denotes the gain of the DCO and $H(f)$ represents the loop filter transfer function. As an example, assuming a fractional frequency of 1.25 MHz, a loop bandwidth of 1 MHz such that the closest fractional spur experiences slight suppression from the loop filter, a $K_{DCO}$ of 10 kHz/b, and a TDC resolution of 5 ps/b, the calculated and simulated spur levels are shown in Fig. 3(a). The simulation result deviates slightly from the calculated value for a small TDC residue, mainly because (1) does not account for the quantization effect of a DPLL. Furthermore, the fundamental waveform of the measured TDC residue error, shown in Fig. 3(b), has a peak-to-peak magnitude of 0.6 LSB, which corresponds to a peak magnitude $A_1$ of 0.3 LSB. Using the above analysis, a spur level of $-54$ dBc is expected at the closest fractional frequency, which is very close to our measured closest spur level of $-56$ dBc.

Fig. 3. (a) Fractional spur level due to residue error at the digi-phase canceller output. (b) Measured residue error and its fundamental waveform.

Fig. 4. Proposed DPLL block diagram with automatic TDC linearity calibrations for fractional spur cancellation.
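Equation (1) is straightforward to evaluate once the loop-filter gain at $f_m$ is known. The sketch below is illustrative only: the text does not give $H(f_m)$ numerically, so `h_at_fm = 1.0` is a placeholder assumption, and the absolute result should not be expected to reproduce the $-54$ dBc quoted above. What the example does show is the scaling in (1): halving $A_1$ lowers the spur by about 6 dB.

```python
import math


def fractional_spur_dbc(a1_lsb, f_m_hz, k_dco_hz_per_lsb, h_at_fm=1.0):
    """Evaluate (1): 20*log10(H(f_m) * K_DCO * A_1 / (2 * f_m)).

    h_at_fm is a placeholder for the loop-filter gain at f_m (assumed, not from the paper).
    """
    return 20.0 * math.log10(h_at_fm * k_dco_hz_per_lsb * a1_lsb / (2.0 * f_m_hz))


spur = fractional_spur_dbc(a1_lsb=0.3, f_m_hz=1.25e6, k_dco_hz_per_lsb=10e3)
halved = fractional_spur_dbc(a1_lsb=0.15, f_m_hz=1.25e6, k_dco_hz_per_lsb=10e3)
print(round(spur, 2), round(spur - halved, 2))
```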

Several TDC topologies can be used for the proposed DPLL design. The traditional architecture is a single-delay-line TDC, whose resolution is limited to one gate delay. To achieve finer resolution, the Vernier technique was developed [11]. Using two delay chains with a slight delay difference, this kind of TDC achieves a subgate delay resolution. However, a large number of delay stages is required to cover a large detection range. An improved structure configures the Vernier delay chains into a ring [12]. By reusing the delay cells, a Vernier ring TDC achieves a large detection range and fine resolution simultaneously. Alternatively, the gated ring oscillator connects the delay cells together to form a ring oscillator [13]. Using multiple phases in the ring to clock a counter, while holding the clock phases between measurement cycles, allows accurate time measurement with intrinsic first-order quantization noise shaping. Moreover, a time amplifier (TA) TDC [14] and an ADC-based TDC [15] can both achieve fine resolution. However, each has its own drawbacks: a TA TDC is limited by the linearity of its TA, while an ADC-based TDC is limited in conversion rate. In this paper, the 2-D Vernier TDC structure is adopted to achieve subgate delay resolution. Moreover, the 2-D structure provides sufficient detectable range and a high conversion rate while consuming reasonable power and minimal hardware.

## III. SYSTEM AND BUILDING BLOCKS

### A. System Architecture

The complete DPLL architecture with digital calibration is shown in Fig. 4. The TDC adopts a three-step architecture to provide both fine and coarse measurements. A digi-phase cancellation signal is injected at the TDC output to cancel the instantaneous divider quantization error. Ideally, the waveform after the cancellation block remains constant, with only a dc component. However, various nonideal characteristics in the loop still cause a small residue phase error. In other words, the residue error after digi-phase subtraction is directly related to system imperfections including nonlinearity, mismatch, and variation. This residue can therefore be used as the error signal for the digital calibrations adopted in this design. The gain applied on the digi-phase path is automatically adjusted by a TDC gain tracking module that correlates the error signal with the digi-phase signal. The gain is optimal when the error is minimized. Likewise, the TDC calibration uses the same error signal to adjust the delay cells for optimal TDC linearity. In summary, our proposed digital calibration scheme can be described as follows.

- Initially, the PLL is locked to a known fractional frequency with the digi-phase block enabled and the TDC gain tracking fixed at a preset value.
- After lock-in, the TDC calibration block utilizes the ramp signal at the TDC output for TDC linearity calibration.
- The linearity calibration is then disabled and the TDC gain tracking is enabled.
- Next, the loop is relocked to the desired frequency.

Fig. 5. Proposed three-step TDC block diagram.

TABLE I  
SPECIFICATIONS OF A THREE-STEP TDC

| Step        | Structure  | Range    | Bits | Resolution       |
|-------------|------------|----------|------|------------------|
| Acquisition | Bang-bang  | $\infty$ | 1    | Sign only (+/-)  |
| Coarse      | Flash      | 2.08 ns  | 5    | 65 ps            |
| Fine        | 2-D Vernier | 520 ps  | 7    | 5 ps             |
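The TDC gain tracking described in this subsection can be illustrated with a one-tap LMS loop that correlates the post-cancellation residue with the accumulator ramp. This is a minimal sketch under loudly stated assumptions, not the implemented hardware: the TDC is modeled as perfectly linear and unquantized, the step size `mu`, the initial gain `g0`, and all names are hypothetical, and the true gain of about 83.3 LSB per DCO cycle simply corresponds to a 416.7-ps period divided by a 5-ps resolution.

```python
def track_tdc_gain(true_gain, alpha, mu=0.05, n_samples=5000, g0=64.0):
    """Estimate the digi-phase gain g so that tdc_out - g * phase has no ramp left."""
    g = g0
    acc = 0.0
    for _ in range(n_samples):
        acc = (acc + alpha) % 1.0        # fractional-N accumulator (sawtooth)
        phase = acc - 0.5                # zero-mean quantization error, in DCO cycles
        tdc_out = true_gain * phase      # assumption: ideal linear, unquantized TDC
        residue = tdc_out - g * phase    # digi-phase cancellation
        g += mu * residue * phase        # LMS update: correlate residue with the ramp
    return g


print(round(track_tdc_gain(true_gain=83.3, alpha=1 / 64), 3))
```

Each update shrinks the gain error by a factor of $(1 - \mu\,\phi^2)$, so the estimate settles at the true TDC gain, at which point the residue carries only the nonideal terms that the linearity calibration targets.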

### B. Three-Step TDC

Similar to the phase-frequency detector in an analog PLL, the TDC measures the phase difference between the divided feedback signal and the reference clock. The measured result is quantized into digital bits and processed by the digital loop filter. The quantization step, or TDC resolution, directly determines the in-band phase noise at the DPLL output, which can be expressed as [16]

$$\mathcal{L} = \frac{(2\pi)^2}{12} \left( \frac{\Delta t_{\text{res}}}{T_{\text{DCO}}} \right)^2 \frac{1}{f_{\text{ref}}} \quad (2)$$

where $T_{\text{DCO}}$ is the period of the DCO output and $\Delta t_{\text{res}}$ is the TDC resolution. Assuming a $T_{\text{DCO}}$ of 416 ps and an $f_{\text{ref}}$ of 80 MHz, (2) shows that a TDC resolution of 5 ps is sufficient to achieve an in-band noise floor of $-110$ dBc/Hz. Since the phase error ranges across $[-T_{\text{ref}}/2, T_{\text{ref}}/2]$, our proposed TDC is segmented into three steps to cover all possible phases during the phase-locking process, as shown in Fig. 5. The three-step structure includes a bang-bang TDC as the first stage, a single delay chain as the second stage, and a 2-D Vernier delay array as the third stage. The single delay chain is constructed as part of the Vernier delay chains in order to save area and power. Table I summarizes the specifications of these three sub-TDCs.

Fig. 6. Timing diagrams of the (a) bang-bang TDC, (b) single delay line TDC, and (c) Vernier delay line TDC.
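As a quick numerical check of (2), the sketch below evaluates the noise floor for the quoted values and inverts (2) to find the coarsest resolution that still meets a $-110$ dBc/Hz target. The function names are illustrative and not taken from the paper.

```python
import math


def inband_noise_dbc_hz(t_res, t_dco, f_ref):
    """Evaluate (2) and convert the result to dBc/Hz."""
    noise = (2 * math.pi) ** 2 / 12 * (t_res / t_dco) ** 2 / f_ref
    return 10 * math.log10(noise)


def required_resolution(noise_dbc_hz, t_dco, f_ref):
    """Invert (2): largest TDC step that still meets the given noise floor."""
    noise = 10 ** (noise_dbc_hz / 10)
    return t_dco * math.sqrt(12 * noise * f_ref) / (2 * math.pi)


print(round(inband_noise_dbc_hz(5e-12, 416e-12, 80e6), 2))       # noise floor at 5 ps
print(round(required_resolution(-110, 416e-12, 80e6) * 1e12, 2))  # step needed, in ps
```

Under these assumptions, a step of roughly 6.5 ps would just meet $-110$ dBc/Hz, so the 5-ps design value leaves about 2 dB of margin.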

More specifically, a bang-bang TDC acting as a signal-steering stage is employed as the first stage of the proposed three-step TDC. The bang-bang TDC covers an entire reference cycle (12.5 ns in an 80-MHz system). It uses the falling edge of the reference (REF) signal as a trigger. If the divided feedback (DIV) signal arrives between the rising and falling edges of the REF signal, as shown in Fig. 6(a), it is classified as a lagging event, and the two signals are propagated directly to the later TDC stages. Otherwise, if the DIV signal arrives after the falling edge of the REF signal, it is classified as a leading event with respect to the following rising edge of the REF signal. In this case, the two signals are swapped to ensure normal operation of the subsequent TDC stages.
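The steering decision can be written as a small behavioral function. This is a sketch only: it assumes a 50% duty-cycle reference so that the falling edge sits at $T_{ref}/2$, measures the DIV arrival time from the REF rising edge, and uses hypothetical names throughout.

```python
def bang_bang_steer(t_div, t_ref=12.5e-9):
    """Classify a DIV edge arriving t_div seconds after the REF rising edge.

    Returns (event, magnitude), where magnitude is the time difference
    that the later TDC stages have to measure.
    Assumes a 50% duty-cycle REF (falling edge at t_ref / 2).
    """
    if not 0.0 <= t_div < t_ref:
        raise ValueError("t_div must lie within one reference period")
    if t_div < t_ref / 2:
        # DIV falls between the REF rising and falling edges: lagging event,
        # REF and DIV are passed straight through.
        return "lag", t_div
    # DIV arrives after the REF falling edge: leading event relative to the
    # next REF rising edge, so the two signals are swapped.
    return "lead", t_ref - t_div


print(bang_bang_steer(1e-9), bang_bang_steer(12e-9)[0])
```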

In the second-stage TDC, a delay chain with 16 delay stages is adopted to provide a coarse measurement with a 4-b binary output. In conjunction with the polarity detection provided by the bang-bang TDC, the coarse TDC provides a 5-b output with a resolution of 65 ps. By reusing the delay stages of the 2-D Vernier TDC, this coarse TDC requires no extra hardware or power, while extending the TDC detectable range to 2.08 ns. With this coarse TDC, the proposed DPLL locks faster owing to the enlarged detectable range. In simulation, with an initial frequency error of 40 MHz, the loop takes about 5.6 $\mu$s to lock when both the coarse TDC and the bang-bang TDC are used, while it takes more than 30 $\mu$s to lock when only the bang-bang TDC is used. The fine TDC is then used to further lower the in-band phase noise.

The fine TDC is constructed using a Vernier structure with a 2-D arbiter array [17]. The delays of one stage in the fast delay chain and in the slow delay chain are set to 60 and 65 ps, respectively. This slight difference provides a subgate delay time resolution as fine as 5 ps. The fine 2-D TDC has a detectable range of 520 ps (7 b), which is sufficiently large to cover an entire 2.4-GHz DCO cycle (approximately 417 ps). The circuit diagram of the TDC unit delay stage is shown in Fig. 7(a).



Fig. 7. Circuit diagrams of (a) TDC unit delay stage and (b) arbiter cell.



Fig. 8. Simulated TDC nonlinearity considering common mode error and differential mode error.

Each delay stage consists of two cascaded inverters to avoid mismatches between the rising and falling edges. In order to tune each delay stage to the desired value (60 and 65 ps) for optimal TDC linearity, both delays are designed to be adjustable with 6-b control and a 0.5-ps step size covering a range of 50 ps. In addition, a first-order SDM is added at the delay control input to further improve the tuning accuracy. The arbiter cell structure is shown in Fig. 7(b). The reference and feedback signals are fed into port “Start” and port “Finish.” Based on simulation results, this arbiter structure can distinguish a minimum time difference of 200 fs with a propagation delay of less than 10 ps.

### C. Automatic TDC Linearity Calibration

Similar to the basic Vernier TDC, a fast and a slow delay chain are employed in a 2-D Vernier structure. However, rather than using a single arbiter line, multiple arbiter lines are implemented in a 2-D Vernier structure to compare each fast delay stage with multiple slow delay stages. By reusing part of the delay stages, a larger detectable range can be achieved. However, a highly linear 2-D Vernier TDC requires the delays of fast and slow chains to satisfy the following conditions:

$$\begin{cases} n(d_s - d_f) = d_s \\ d_s - d_f = t_{\text{res}} \end{cases} \quad (3)$$

where $d_s$ and $d_f$ denote the delays of a single stage in the slow chain and the fast chain, respectively, and $n$ is the number of stages in one arbiter line. The first equation is the condition for a continuous measurement with a 2-D Vernier TDC, and the second equation sets the measurement resolution. With these two equations, only one pair of $d_s$ and $d_f$ is a viable solution. Any deviation of the two delays causes an error compared with the ideal case. We define the common mode delay error as the deviation of the average of the two delays from its ideal value and the differential mode delay error as the deviation of the difference of the two delays from its ideal value. As shown in Fig. 8, a common mode delay error introduces gaps at the turning points of each arbiter line, and a differential mode delay error leads to an incorrect slope for each line. Moreover, the TDC nonlinearity induced by the common mode delay error is zero for a small TDC input located within the first arbiter line, and the deviation from the ideal transfer curve accumulates as the TDC input gets larger. On the other hand, the nonlinearity from the differential mode delay error shows up even within the first arbiter line but only repeats itself periodically for large TDC inputs. From these observations, we conclude that the dominant source of TDC nonlinearity is the differential mode delay error for small TDC inputs and the common mode delay error for large TDC inputs.

Fig. 9. Proposed TDC automatic linearity calibration loops.
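Solving (3) gives $d_s = n\,t_{res}$ and $d_f = (n-1)\,t_{res}$. The sketch below applies this and then uses a deliberately simplified threshold model of the 2-D Vernier array to show how the two error types behave. Several things here are assumptions rather than facts from the paper: $n = 13$ stages per arbiter line and 8 lines are inferred from the quoted 60/65-ps delays, 5-ps resolution, and 520-ps range; the threshold of stage $k$ in line $m$ is modeled as $m\,d_s + k\,(d_s - d_f)$; and the output code is simply the number of thresholds not exceeding the input, which ignores how the real arbiter array resolves overlapping lines.

```python
def ideal_delays(n, t_res):
    """Solve (3): n*(d_s - d_f) = d_s and d_s - d_f = t_res."""
    d_s = n * t_res
    return d_s, d_s - t_res


def tdc_code(t_in, d_s, d_f, n=13, lines=8):
    """Simplified model: count thresholds m*d_s + k*(d_s - d_f) that are <= t_in."""
    thresholds = [m * d_s + k * (d_s - d_f)
                  for m in range(lines) for k in range(1, n + 1)]
    return sum(1 for th in thresholds if th <= t_in)


d_s, d_f = ideal_delays(13, 5.0)          # -> (65.0, 60.0) ps
cm = (d_s + 1.0, d_f + 1.0)               # +1 ps common mode error, difference unchanged
dm = (d_s + 0.5, d_f - 0.5)               # +1 ps differential mode error, average unchanged
for t in (32.5, 62.5, 70.5, 202.5):       # inputs in ps
    print(t, tdc_code(t, d_s, d_f), tdc_code(t, *cm), tdc_code(t, *dm))
```

In this model, the common mode error leaves the codes inside the first arbiter line untouched and only shows up once the input crosses into later lines, whereas the differential mode error already changes the code inside the first line, in line with the observations drawn from Fig. 8.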

On closer inspection, the quantization error generated by the fractional-$N$ accumulator presents a staircase ramp waveform that can be used to sweep the TDC input from $-T_{\text{DCO}}/2$ to $T_{\text{DCO}}/2$. As illustrated in Fig. 9, the corresponding TDC output can be subtracted from an ideal ramp signal, creating an error signal that can be used to automatically adjust the TDC delays. As mentioned above, when the TDC input is within the range of the first arbiter line, only the difference between the fast and slow delays causes TDC measurement error. On the other hand, the average of the fast and slow delays dominates the TDC error when the TDC input is sufficiently large that multiple arbiter lines are used. As a result, the common and differential parts of the fast and slow delays can be calibrated separately according to the TDC input range. Two least-mean-square (LMS) loops are designed to collect the differential and common mode error signals used for the fast and slow delay chain calibrations. More specifically, the TDC generates a flag signal to indicate whether one or multiple arbiter lines are used in a measurement. This flag signal is used to activate either the common or the differential LMS loop. In this way, the two types of errors are calibrated orthogonally, without interfering with each other. As shown in Fig. 10, the measured results show that the LMS loops for the common and differential delays converge after about 150 $\mu$s. The convergence speed depends on the step size of the LMS loop. Faster convergence can be achieved with a larger step size. However, an excessively large step size may jeopardize the convergence stability.

Fig. 10. Measured convergence of TDC common mode and differential mode delays.

Fig. 11. Proposed wide-tuning DCO architecture.
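The flag-gated pair of LMS loops can be sketched behaviorally as follows. This is an illustrative model only, not the implemented calibration logic: the TDC reading is treated as unquantized, the delays are adjusted directly in picoseconds rather than through delay-control words, only non-negative inputs are swept, and the step sizes `mu_d` and `mu_c`, the ramp step, and the value $n = 13$ are assumptions chosen for the example.

```python
def tdc_out(t_in, d_s, d_f, n=13):
    """Unquantized 2-D Vernier reading in LSB, plus the arbiter-line index used."""
    line = int(t_in // d_s)                        # arbiter line that resolves t_in
    return line * n + (t_in - line * d_s) / (d_s - d_f), line


def calibrate(d_s, d_f, t_res=5.0, n=13, mu_d=1e-3, mu_c=0.05, sweeps=300):
    """Run the two gated LMS loops and return the calibrated (d_s, d_f) in ps."""
    common, diff = (d_s + d_f) / 2, d_s - d_f
    for _ in range(sweeps):
        for i in range(64):                        # staircase ramp, 0 to about 205 ps
            t_in = i * 3.25
            out, line = tdc_out(t_in, common + diff / 2, common - diff / 2, n)
            err = out - t_in / t_res               # TDC output minus the ideal ramp
            if line == 0:                          # flag: a single arbiter line is used
                diff += mu_d * err * t_in          # differential mode LMS loop
            else:                                  # flag: multiple arbiter lines are used
                common += mu_c * err * line        # common mode LMS loop
    return common + diff / 2, common - diff / 2


print(tuple(round(d, 3) for d in calibrate(67.0, 61.0)))
```

Because the first-line error depends only on the delay difference in this model, the differential loop settles independently, after which the common mode loop drives $d_s$ toward $n\,t_{res}$. As in the measured behavior, larger step sizes converge faster until the per-sample correction factor exceeds its stability bound.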

### D. Second-Order Digital Loop Filter

As shown in Fig. 4, the digital loop filter consists of proportional and integral paths to achieve a programmable bandwidth from 200 kHz to 2 MHz. In addition, two infinite-impulse-response filters are added on the proportional path to create a second-order filter. The gains on the proportional and integral paths of the digital loop filter can be programmed to achieve different natural frequencies $\omega_n$ and damping factors $\xi$, similar to an analog PLL

$$\omega_n = \sqrt{\frac{K\beta}{T_{ref}}} \quad \xi = \frac{\alpha}{2} \sqrt{K \frac{T_{ref}}{\beta}} \quad (4)$$

where  $K$  represents the total loop gain except loop filter,  $T_{ref}$  is the period of reference clock, and  $\alpha$  and  $\beta$  represent the gains in the proportional and integral paths, respectively. The loop can be programmed to a wider loop bandwidth initially for faster frequency lock and reconfigured to an optimal bandwidth that corresponds to the best phase noise performance afterward.
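Equation (4) can be used in both directions: to predict the loop dynamics from the programmed gains, or to choose the gains for a target $\omega_n$ and $\xi$. The sketch below shows both. The numeric value of $K$ is a hypothetical placeholder, since the text does not state it, and the function names are illustrative.

```python
import math


def loop_dynamics(k, alpha, beta, t_ref):
    """Evaluate (4): natural frequency and damping factor from the path gains."""
    omega_n = math.sqrt(k * beta / t_ref)
    xi = (alpha / 2.0) * math.sqrt(k * t_ref / beta)
    return omega_n, xi


def filter_gains(k, omega_n, xi, t_ref):
    """Invert (4): proportional (alpha) and integral (beta) gains for a target response."""
    beta = omega_n ** 2 * t_ref / k
    alpha = 2.0 * xi * math.sqrt(beta / (k * t_ref))
    return alpha, beta


K = 1e6                      # hypothetical loop gain, not taken from the paper
T_REF = 1 / 80e6             # 80-MHz reference
alpha, beta = filter_gains(K, omega_n=2 * math.pi * 1e6, xi=0.707, t_ref=T_REF)
print(loop_dynamics(K, alpha, beta, T_REF))
```

Reprogramming the loop from a wide acquisition bandwidth to a narrower tracking bandwidth then amounts to calling `filter_gains` with a different `omega_n` and writing the resulting gains.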

### E. Wide-Tuning DCO

As part of a multiband wireless transceiver, a DCO with a wide tuning range is required to provide sufficient spectrum coverage. Moreover, a wide-tuning DCO can also be used in a DPLL with high-data-rate direct modulation [2]. In our design, four capacitor banks [PVT, acquisition (ACQ), tracking (TRK), and the finest (FIN)] are designed as shown in Fig. 11. The DCO oscillates at 5 GHz and generates a 2.4-GHz carrier through a divide-by-2 prescaler. The implemented DPLL covers the 1.9–2.8- and 3.8–5.6-GHz bands for multiband applications, which corresponds to a DCO tuning range of 38%. In order to lock to the desired channel, the system first uses a successive approximation algorithm to tune the PVT bank, which has the widest tuning range with six binary-weighted capacitors. Next, the other three banks, the ACQ bank (5-b), the TRK bank (6-b), and the FIN bank (7-b), are activated for further locking. A thermometer-weighted structure is adopted for these three banks to ensure a monotonic tuning characteristic. Fig. 12 shows the frequency range relationships among the tuning banks. In order to minimize quantization noise from the DCO, two fixed capacitors are connected in series with the parasitic capacitor array to reduce the frequency tuning step. As a result, the FIN bank achieves a frequency resolution of 10 kHz/step.
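The quoted tuning range and the successive approximation of the PVT bank can be illustrated as follows. This is a sketch under stated assumptions: the text only says that a successive approximation algorithm tunes six binary-weighted capacitors, so the comparison rule and the linear frequency-versus-code model `pvt_model` (real LC tanks are not linear in code) are hypothetical stand-ins for the on-chip frequency measurement.

```python
def tuning_range_percent(f_min, f_max):
    """Tuning range relative to the center frequency, in percent."""
    return 200.0 * (f_max - f_min) / (f_max + f_min)


def sar_search(target_hz, freq_of_code, bits=6):
    """Successive approximation: largest code whose frequency is still >= target."""
    code = 0
    for bit in reversed(range(bits)):          # MSB first
        trial = code | (1 << bit)              # tentatively switch in this capacitor
        if freq_of_code(trial) >= target_hz:   # still at or above target: keep the bit
            code = trial
    return code


def pvt_model(code, f_max=5.6e9, f_min=3.8e9, bits=6):
    """Hypothetical linear model: more capacitance (higher code) lowers the frequency."""
    return f_max - code * (f_max - f_min) / (2 ** bits - 1)


print(round(tuning_range_percent(3.8e9, 5.6e9), 1))
print(sar_search(4.9e9, pvt_model))
```

The search needs only six frequency comparisons instead of a sweep over all 64 codes, after which the finer thermometer-coded banks take over.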

In addition, we used a common-centroid layout scheme as shown in Fig. 13 to further improve the monotonicity of the DCO tuning curve. The thermometer-weighted capacitors in each bank are placed in an array surrounding unit bit 0. Unwired dummy unit capacitors are inserted at the corners of the array to minimize layout mismatches. Moreover, for each unit capacitance, the capacitor has been split into four equal pieces with a common-centroid quadrature layout style as well.

### F. Error-Free Multimodulus Divider

A conventional MMD uses a chain of 2/3 cells connected in series [18]. In this type of divider, the frequency of the DCO waveform is divided by two or three in one cell and then propagated to the next cell to be divided further. By controlling the division ratio of each 2/3 cell, a chain of $n$ cells achieves a continuous division ratio range from $2^n$ to $2^{n+1} - 1$. The division ratio range can be extended with extra extension cells. Such an extended divider chain achieves a range from $2^m$ to $2^{n+1} - 1$, where $m$ is the chain length when all extension cells are turned off and $n$ is the chain length when all extension cells are turned on. However, this architecture may generate a glitch at the divider output during the first cycle after the chain length is modified.

Fig. 12. Proposed DCO frequency tuning banks.

Fig. 13. Schematics and layout of digitally controlled capacitor unit.

Consider the case shown in Fig. 14. Assume that the division ratio $P$ is initially set to 01...1, so that all the stages are configured as divide-by-3. Since $P_N$ is 0, the OR gate blocks the feedback signal from the last stage and generates a constant high signal. Equivalently, the second-to-last stage cannot see the last stage, since its $mod_{in}$ signal remains constantly high. In this case, the last 2/3 cell is still running as a divide-by-3 counter, but its feedback signal $mod_{out}$ is blocked by the OR gate. If $P$ is now switched to 10...0, all the stages are configured as divide-by-2. Since the OR gate no longer blocks the feedback signal from the last stage, the equivalent length of the entire chain is increased by one. Depending on the time at which $P$ switches, the last stage may still need to finish its current divide-by-3 cycle in the first period before successfully switching to divide-by-2 mode. This can cause the feedback signal $mod_{out}$ to be delayed or advanced by one reference period, thus generating an incorrect edge at the divider output. Such a glitch can cause locking to fail at fractional frequencies for which the division ratio toggles between $2^n - 1$ and $2^n$.

Fig. 14. Incorrect divider state in the first reference period after ratio switching associated with conventional MMD using extension cells.

Fig. 15. Flowchart of the proposed divider with asynchronous counter and the remapped control words.
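The condition that triggers this glitch can be made concrete with the standard behavioral model of a cascaded 2/3-cell chain, in which a chain of effective length $L$ divides by $2^L$ plus the value of its control bits. The sketch below is that generic model, not the remapped control word of the proposed divider (which is shown in Fig. 15 and not reproduced in the text); the function name and the length limits of 3 and 6 are illustrative choices that give an 8-to-127 range.

```python
def chain_state(ratio, min_len=3, max_len=6):
    """Return (effective_length, p_bits) for a 2/3-cell chain with extension cells.

    A chain of effective length L divides by 2**L + p_bits, 0 <= p_bits < 2**L.
    """
    length = ratio.bit_length() - 1            # position of the leading 1 in the ratio
    if not min_len <= length <= max_len:
        raise ValueError("ratio outside the divider range")
    return length, ratio - (1 << length)


for ratio in (62, 63, 64, 65):
    print(ratio, chain_state(ratio))
```

Toggling between 62 and 63 only changes the control bits of a chain of fixed length, whereas toggling between 63 and 64 switches every control bit and changes the effective chain length in the same step, which is exactly the 01...1 to 10...0 transition discussed above.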

To resolve this issue, some division-ratio-dependent solutions have been proposed for a limited number of extension bits [19], but extending them to more bits remains nontrivial. In this design, we propose to use a single synthesizable state machine to replace all the stages with ratio extension logic, as shown in Fig. 15. This new MMD uses an asynchronous counter to count the divided edges from the previous stages. As shown in the flowchart, the asynchronous counter is held at zero when the divider extension bits are disabled. Thus, it always counts from zero when enabled, which avoids generating a glitch at the output. When activated, it functions as a counter triggered by the divided clock from the last 2/3 cell stage. The upper limit of the counter is set by the higher bits of the division ratio word $P$. The remapping of the division ratio to the control word $P$ is also shown in Fig. 15. The 3 LSBs control the high-speed 2/3 cells, while the upper 4 MSBs set the upper limit of the asynchronous counter. A division range programmable from 8 to 127 is achieved with no division ratio switching error.

Fig. 16. Die photo of the DPLL in a low-power multistandard wireless transceiver RFIC.

Fig. 17. Measured phase noise at a 2.08-GHz output with a loop bandwidth of 1 MHz.

## IV. MEASUREMENT RESULTS

A prototype of the proposed DPLL is implemented in a standard 55-nm CMOS technology, as shown in Fig. 16. The digital and analog parts of the RFIC are separated in the layout to minimize crosstalk. The entire DPLL occupies $0.56 \text{ mm}^2$, of which the two major components, the TDC and the DCO, take most of the area. When the loop is locked to an integer frequency at 2.08 GHz, the measured in-band phase noise is $-107 \text{ dBc/Hz}$ and the integrated rms jitter (from 10 kHz to 10 MHz) is 0.55 ps, as shown in Fig. 17. The in-band spur around 250 kHz is due to the power regulator used on the board. The loop bandwidth is set to 1 MHz in order to clearly show the in-band noise floor achieved.

Fig. 18. Measured spectrum before and after digital calibrations with the fractionalities of (a) 1/64 and (b) 3/64, respectively.

To demonstrate the effectiveness of the proposed TDC calibration, the DPLL is configured to lock at various

TABLE II  
MEASURED DPLL PERFORMANCES AND COMPARISONS

|                        | Hsu [20]<br>ISSCC08 | Tasca [4]<br>JSSC11 | Elkholy[21]<br>JSSC15 | Narayanan [5]<br>JSSC16 | Gao [22]<br>ISSCC15 | This work                   |
|------------------------|---------------------|---------------------|-----------------------|-------------------------|---------------------|-----------------------------|
| Architecture           | Frac.<br>DPLL       | Frac.<br>DPLL       | Frac.<br>DPLL         | Frac.<br>SSPLL          | Integer<br>SSDPLL   | <b>Frac.<br/>DPLL</b>       |
| Technology (nm)        | 130                 | 65                  | 65                    | 65                      | 28                  | <b>55</b>                   |
| $f_{ref}$ (MHz)        | 50                  | 40                  | 50                    | 40                      | 80                  | <b>80</b>                   |
| $f_o$ (GHz)            | 3.2-4.2             | 2.9-4.0             | 4.5                   | 4.34-4.94               | 5.8                 | <b>1.9~2.8/<br/>3.8~5.6</b> |
| DCO Tuning Range (%)   | 27.2                | 31.9                | 26.8                  | 12.9                    | /                   | <b>38.3</b>                 |
| In-band PN (dBc/Hz)    | -108                | -104                | -106                  | -120                    | -105                | <b>-107</b>                 |
| Fractional Spur (dBc)  | -53                 | -53                 | -51                   | -59                     | /                   | <b>-55</b>                  |
| Closest Spur (MHz)     | 1                   | 1                   | 0.392                 | 0.03                    | /                   | <b>1.25</b>                 |
| Loop Bandwidth (MHz)   | 1.1                 | 0.312               | 2.5                   | 1                       | /                   | <b>1</b>                    |
| RMS Jitter (fs)        | 204                 | 400                 | 490                   | 133                     | 173                 | <b>549</b>                  |
| Power (mW)             | 46.7                | 4.5                 | 3.7                   | 6.2                     | 9.5                 | <b>9.9</b>                  |
| Area ( $\text{mm}^2$ ) | 0.95                | 0.22                | 0.22                  | 0.2                     | 0.3                 | <b>0.56</b>                 |

fractional frequencies. Two cases with fractionalities of 1/64 and 3/64 are shown in Fig. 18(a) and (b), respectively. Before calibration, the largest measured fractional spurs in the two cases, at offsets of 1.25 and 3.75 MHz, were  $-45$  and  $-36$  dBc, respectively. The digi-phase spur canceller was enabled in both measurements; however, a small residual error remains after the canceller because of TDC nonlinearity. Once the proposed TDC calibration is completed, the fractional spur levels drop below  $-55$  and  $-60$  dBc, respectively, a spur reduction of about 10 and 25 dB. Since the TDC gain is proportional to the delay difference between the fast and slow chains, the loop bandwidth varies slightly after delay calibration, which could also affect the final spur reduction. Furthermore, the spur level before calibration depends on the initial TDC delay, which is PVT sensitive, so different spur levels are observed at different frequency settings. With the proposed calibration turned on, however, the largest fractional spur level is always below  $-55$  dBc. Additional measurements of the largest fractional spurs at different fractional frequencies before and after TDC calibration are shown in Fig. 19.
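The spur offsets quoted above follow from the fractionality and the 80-MHz reference listed in Table II. The short Python sketch below reproduces that arithmetic for the two measured cases. The function names are illustrative and do not come from the paper. It assumes that the fundamental fractional spur sits at the fractional part of the division ratio times the reference frequency, and it treats the post-calibration levels as upper bounds, so the computed reductions are lower bounds.

```python
from fractions import Fraction

F_REF_MHZ = 80  # reference frequency in MHz (Table II)


def spur_offset_mhz(fractionality, f_ref_mhz=F_REF_MHZ):
    """Offset of the fundamental fractional spur from the carrier, in MHz.

    Assumption: the spur sits at (fractional part) x f_ref, which holds for
    the small fractionalities measured here.
    """
    return float((Fraction(fractionality) % 1) * f_ref_mhz)


def min_spur_reduction_db(before_dbc, after_bound_dbc):
    """Lower bound on the spur reduction in dB, given the spur level before
    calibration and an upper bound on the level after calibration."""
    return before_dbc - after_bound_dbc


# (fractionality, spur before calibration in dBc, bound after calibration in dBc)
cases = [
    (Fraction(1, 64), -45, -55),
    (Fraction(3, 64), -36, -60),
]
for frac, before, after_bound in cases:
    print(f"{frac}: spur at {spur_offset_mhz(frac)} MHz offset, "
          f"reduction of at least {min_spur_reduction_db(before, after_bound)} dB")
```

The two cases evaluate to offsets of 1.25 and 3.75 MHz and reductions of at least 10 and 24 dB, in line with the approximate figures quoted in the text.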

The measured TDC transfer curve is shown in Fig. 20. Before TDC calibration, gaps between different arbiter lines can be clearly observed; they result from inaccurate delays in the two delay chains and cause high spurious tones at the DPLL output. After TDC calibration, the measured TDC transfer curve is very close to the ideal transfer curve. With autocalibration, this 2-D Vernier TDC achieves an average differential nonlinearity (DNL) of 1.13 LSB and an integral nonlinearity (INL) of 0.81 LSB, compared with 1.32 and 3.49 LSB, respectively, without calibration. The DNL is mainly caused by the 2-D arbiter topology, where the turning points of the arbiter chains correspond to the worst DNL. The proposed TDC gain and linearity calibration needs to be carried out only once, initially, and adds negligible power consumption.
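For readers who want to extract DNL and INL from a measured transfer curve such as the one in Fig. 20, the sketch below shows one common way to do it. It is an illustration under stated assumptions, not the authors' procedure: the exact DNL/INL definition and averaging used for the numbers above are not given here, the function name `dnl_inl` is hypothetical, and the LSB is taken from an endpoint fit unless supplied. The input is the list of input delays at which the TDC output code increments.

```python
def dnl_inl(transitions, lsb=None):
    """Compute DNL and INL (both in LSB) from code-transition delays.

    transitions: increasing input delays at which the output code steps by one.
    lsb: ideal step size; if None, an endpoint fit is used (assumption).
    """
    # Width of each code bin, measured between adjacent transitions.
    widths = [b - a for a, b in zip(transitions, transitions[1:])]
    if lsb is None:
        lsb = (transitions[-1] - transitions[0]) / len(widths)

    # DNL: deviation of each bin width from the ideal one-LSB width.
    dnl = [w / lsb - 1.0 for w in widths]

    # INL: running sum of the DNL, i.e. accumulated deviation from the ideal line.
    inl = []
    acc = 0.0
    for d in dnl:
        acc += d
        inl.append(acc)
    return dnl, inl


# Made-up example: one bin is twice as wide as the others.
dnl, inl = dnl_inl([0.0, 1.0, 2.0, 4.0, 5.0])
print("peak |DNL| =", max(abs(d) for d in dnl))
print("peak |INL| =", max(abs(i) for i in inl))
```

With an endpoint-fit LSB the INL returns to zero at the last code by construction, so a gap between arbiter lines shows up as a single large DNL entry and a step in the INL.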

As part of a low-power 802.11a/b/g/n wireless transceiver RFIC, the proposed DPLL consumes a total power of 9.9 mW, of which the TDC, DCO, and digital circuits



Fig. 19. Measured fractional spur near 2.4 GHz with a loop bandwidth of 1 MHz for different fractional frequencies with and without TDC calibrations.



Fig. 20. Measured TDC transfer curve, INL, and DNL before and after digital calibrations.

(including MMD) consume 4.7, 4.2, and 1 mW, respectively. The reference signal is generated with an 80-MHz crystal oscillator. Performance comparisons are summarized in Table II, which shows that the proposed DPLL is competitive with state-of-the-art designs.
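As a quick consistency check on the tuning-range entry of Table II, the 38.3% figure can be reproduced from the listed output bands. This assumes the usual definition that normalizes the frequency span to the band center; the definition is not stated here, so it is an assumption. For the high band,

$$\text{TR} = \frac{f_{\max} - f_{\min}}{(f_{\max} + f_{\min})/2} = \frac{5.6 - 3.8}{(5.6 + 3.8)/2} = \frac{1.8}{4.7} \approx 38.3\%$$

The low band, 1.9 to 2.8 GHz, gives the same ratio ( $0.9/2.35 \approx 38.3\%$ ), as expected when both band edges are scaled by the same factor of two.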

## V. CONCLUSION

A fractional-*N* DPLL using a 2-D Vernier TDC with automatic linearity calibration is presented. Using a ramp signal generated from the existing fractional frequency synthesis blocks, the loop can automatically adjust the TDC's fast and slow delays to achieve the best linearity for fractional spur reduction. A digi-phase canceller with an automatic TDC gain tracking loop is implemented to further suppress the fractional spurs. A largest fractional spur of  $-55$  dBc was measured over various fractional frequencies without using a traditional sigma-delta modulator (SDM) for noise shaping. The proposed three-step TDC provides fine resolution and a wide detectable range with minimal hardware. This paper also presents an improved divider structure that resolves the glitch issues during division ratio switching associated with conventional MMDs. This divider structure provides a wide division range from 8 to 127 without transient switching glitches to support the wide DCO tuning range of 38%.

## REFERENCES

- [1] W. Deng *et al.*, "A fully synthesizable all-digital PLL with interpolative phase coupled oscillator, current-output DAC, and fine-resolution digital varactor using gated edge injection technique," *IEEE J. Solid-State Circuits*, vol. 50, no. 1, pp. 68–80, Jan. 2015.
- [2] G. Marzin, S. Levantino, C. Samori, and A. L. Lacaita, "A 20 Mb/s phase modulator based on a 3.6 GHz digital PLL with  $-36$  dB EVM at 5 mW power," *IEEE J. Solid-State Circuits*, vol. 47, no. 12, pp. 2974–2988, Dec. 2012.
- [3] S. Zheng and H. C. Luong, "A CMOS WCDMA/WLAN digital polar transmitter with AM replica feedback linearization," *IEEE J. Solid-State Circuits*, vol. 48, no. 7, pp. 1701–1709, Jul. 2013.
- [4] D. Tasca, M. Zanuso, G. Marzin, S. Levantino, C. Samori, and A. L. Lacaita, "A 2.9–4.0-GHz fractional-*N* digital PLL with bang-bang phase detector and 560-fs<sub>rms</sub> integrated jitter at 4.5-mW power," *IEEE J. Solid-State Circuits*, vol. 46, no. 12, pp. 2745–2758, Dec. 2011.
- [5] A. T. Narayanan *et al.*, "A fractional-*N* sub-sampling PLL using a pipelined phase-interpolator with an FoM of  $-250$  dB," *IEEE J. Solid-State Circuits*, vol. 51, no. 7, pp. 1630–1640, Jul. 2016.
- [6] Z. Ru, P. Geraedts, E. Klumperink, X. He, and B. Nauta, "A 12GHz 210fs 6mW digital PLL with sub-sampling binary phase detector and voltage-time modulated DCO," in *Symp. VLSI Circuits, Dig. Tech. Papers*, 2013, pp. 194–195.
- [7] J. Borremans, K. Vengattaramane, V. Giannini, B. Debaillie, W. van Thillo, and J. Craninckx, "A 86 MHz–12 GHz digital-intensive PLL for software-defined radios, using a 6 fJ/step TDC in 40 nm digital CMOS," *IEEE J. Solid-State Circuits*, vol. 45, no. 10, pp. 2116–2129, Oct. 2010.
- [8] J. W. M. Rogers, F. F. Dai, M. S. Cavin, and D. G. Rahn, "A multiband  $\Delta\Sigma$  fractional-*N* frequency synthesizer for a MIMO WLAN transceiver RFIC," *IEEE J. Solid-State Circuits*, vol. 40, no. 3, pp. 678–689, Mar. 2005.
- [9] D. Liao, H. Wang, F. F. Dai, Y. Xu, and R. Berenguer, "An 802.11 a/b/g/n digital fractional-*N* PLL with automatic TDC linearity calibration for spur cancellation," in *Proc. IEEE Radio Freq. Integr. Circuits Symp.*, May 2016, pp. 134–137.
- [10] M. A. Wheatley, L. A. Lepper, and N. K. Webb, "Frequency modulated phase locked loop with fractional divider and jitter compensation," U.S. Patent 5 038 120 A, Aug. 6, 1991.
- [11] P. Dudek, S. Szczepanski, and J. V. Hatfield, "A high-resolution CMOS time-to-digital converter utilizing a Vernier delay line," *IEEE J. Solid-State Circuits*, vol. 35, no. 2, pp. 240–247, Feb. 2000.
- [12] J. Yu, F. F. Dai, and R. C. Jaeger, "A 12-bit Vernier ring time-to-digital converter in  $0.13\text{ }\mu\text{m}$  CMOS technology," *IEEE J. Solid-State Circuits*, vol. 45, no. 4, pp. 830–842, Apr. 2010.
- [13] M. Z. Straayer and M. H. Perrott, "A multi-path gated ring oscillator TDC with first-order noise shaping," *IEEE J. Solid-State Circuits*, vol. 44, no. 4, pp. 1089–1098, Apr. 2009.
- [14] M. Lee, M. E. Heidari, and A. A. Abidi, "A low noise, wideband digital phase-locked loop based on a new time-to-digital converter with subpicosecond resolution," in *IEEE Symp. VLSI Circuits, Dig. Tech. Papers*, Jun. 2008, pp. 112–113.
- [15] Z. Xu, S. Lee, M. Miyahara, and A. Matsuzawa, "A 0.84ps-LSB 2.47mW time-to-digital converter using charge pump and SAR-ADC," in *Proc. IEEE Custom Integr. Circuits Conf.*, May 2013, pp. 1–4.
- [16] T. Tokairin, M. Okada, M. Kitsunezuka, T. Maeda, and M. Fukaishi, "A 2.1-to-2.8-GHz low-phase-noise all-digital frequency synthesizer with a time-windowed time-to-digital converter," *IEEE J. Solid-State Circuits*, vol. 45, no. 12, pp. 2582–2590, Dec. 2010.
- [17] A. Liscidini, L. Vercesi, and R. Castello, "Time to digital converter based on a 2-dimensions Vernier architecture," in *Proc. IEEE Custom Integr. Circuits Conf.*, May 2009, pp. 45–48.
- [18] C. S. Vaucher, I. Ferencic, M. Locher, S. Sedvallson, U. Voegeli, and Z. Wang, "A family of low-power truly modular programmable dividers in standard  $0.35\text{-}\mu\text{m}$  CMOS technology," *IEEE J. Solid-State Circuits*, vol. 35, no. 7, pp. 1039–1045, Jul. 2000.
- [19] P. Nuzzo, K. Vengattaramane, M. Ingels, V. Giannini, M. Steyaert, and J. Craninckx, "A 0.1–5GHz dual-VCO software-defined  $\Sigma\Delta$  frequency synthesizer in 45nm digital CMOS," in *Proc. IEEE Radio Freq. Integr. Circuits Symp.*, Jun. 2009, pp. 321–324.
- [20] C.-M. Hsu, M. Z. Straayer, and M. H. Perrott, "A low-noise wide-BW 3.6-GHz digital  $\Delta\Sigma$  fractional-*N* frequency synthesizer with a noise-shaping time-to-digital converter and quantization noise cancellation," *IEEE J. Solid-State Circuits*, vol. 43, no. 12, pp. 2776–2786, Dec. 2008.
- [21] A. Elkholy, T. Anand, W. S. Choi, A. Elshazly, and P. K. Hanumolu, "A 3.7 mW low-noise wide-bandwidth 4.5 GHz digital fractional-*N* PLL using time amplifier-based TDC," *IEEE J. Solid-State Circuits*, vol. 50, no. 4, pp. 867–881, Apr. 2015.
- [22] X. Gao *et al.*, "A 28nm CMOS digital fractional-*N* PLL with  $-245.5$  dB FOM and a frequency tripler for 802.11abgn/ac radio," in *IEEE ISSCC Dig. Tech. Papers*, Feb. 2015, pp. 1–3.

**Dongyi Liao** received the B.S. degree in electrical engineering from the University of Electronic Science and Technology of China, Chengdu, China, in 2012, and the M.S. degree in electrical engineering from Auburn University, Auburn, AL, USA, in 2013, where he is currently pursuing the Ph.D. degree.

His current research interests include RF frontend design and phase-locked loops.



**Hechen Wang** received the B.S. degree in electrical engineering from the University of Electronic Science and Technology of China, Chengdu, China, in 2012, and the M.S. degree in electrical and computer engineering from Auburn University, Auburn, AL, USA, in 2013, where he is currently pursuing the Ph.D. degree in electrical and computer engineering.

His current research interests include the design of mixed-signal circuits, data converters, and RF front-end.





**Fa Foster Dai** (M'92–SM'00–F'09) received the Ph.D. degree in electrical and computer engineering from Auburn University, Auburn, AL, USA, in 1997.

From 1997 to 2000, he was a Technical Staff Member in very large scale integration at Hughes Network Systems, Germantown, MD, USA. From 2000 to 2001, he was a Technical Manager/Principal Engineer in RFIC at YAFO Networks, Hanover, MD, USA. From 2001 to 2002, he was a Senior RFIC Engineer at Cognio Inc., Gaithersburg, MD, USA.

In 2002, he joined Auburn University, where he is currently an Ed and Peggy Reynolds Family Endowed Professor of Electrical and Computer Engineering. He has co-authored six books and book chapters such as *Integrated Circuit Design for High-Speed Frequency Synthesis* (Artech House Publishers, 2006), and holds eight U.S. patents. His current research interests include analog and mixed-signal circuit designs, RFIC and MMIC designs, and high-performance frequency synthesis.

Dr. Dai was a Technical Program Committee (TPC) Chair of the 2016 BiCMOS Circuits and Technology Meeting (BCTM) and is the General Chair of the 2017 BCTM. He was a recipient of the Senior Faculty Research Award for Excellence from the College of Engineering, Auburn University, in 2009. He served as a Guest Editor of the IEEE JOURNAL OF SOLID-STATE CIRCUITS in 2012 and 2013 and the IEEE TRANSACTIONS ON INDUSTRIAL ELECTRONICS in 2001, 2009, and 2010. He served on the TPC of the IEEE Symposium on Very Large Scale Integration Circuits from 2005 to 2008. He serves on the TPC of the IEEE Bipolar/BCTM and the TPC of the IEEE Custom Integrated Circuits Conference.



**Roc Berenguer** (SM'16) received the M.S. and Ph.D. degrees from TECNUN, San Sebastián, Spain, in 1996 and 2000, respectively.

From 1999 to 2015, he was with CEIT, San Sebastián. He served as an External Consultant with Siemens, Munich, Germany, in 2000, Hitachi Microsystems Europe, Maidenhead, U.K., in 2001, Xignal Technologies, Munich, from 2001 to 2002, Seiko-Epson, Barcelona, Spain, from 2006 to 2007, and Innophase Inc., Chicago, IL, USA, from 2012 to 2014, where he collaborated in the design of several RF front-ends for wireless standards such as GSM-EDGE, DAB, and Wibree. He is currently an Associate Professor with the Electrical, Electronic and Control Engineering Department, TECNUN, Technological Campus of the University of Navarra, San Sebastián. He is also a Senior RFIC Design Engineer with Innophase Inc. and an Assessor of the Spanish Agency of Evaluation and Prospective. He has authored or co-authored more than 70 refereed publications in journals and conferences, and holds ten patents. He has co-authored the books *Design and Test of High Quality Integrated Inductors for RF Applications in Conventional Technologies* (Springer), *GPS and Galileo: Dual RF Front-End Receiver Design, Fabrication and Test* (McGraw-Hill), and *Linear CMOS RF Power Amplifiers* (Springer). His current research interests include CMOS RF/mm-wave IC design, ultralow power analog circuit design for battery-less sensor nodes, and high-speed signal processing.

Dr. Berenguer served on the Technical Program Committees of the IEEE European Solid State Circuit Conference, the IEEE Midwest Symposium Circuits and Systems, and the IEEE Ph.D. Research in Microelectronics and Electronics. He served as a reviewer for several journals such as the IEEE JOURNAL OF SOLID-STATE CIRCUITS, the IEEE TRANSACTIONS ON MICROWAVE THEORY AND TECHNIQUES, and the IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS I AND II.



**Yang Xu** received the B.S. and M.S. degrees in electronics engineering from Fudan University, Shanghai, China, in 1997 and 1999, respectively, and the Ph.D. degree from Carnegie Mellon University, Pittsburgh, PA, USA, in 2005.

He was a Senior Researcher with Qualcomm Inc., San Diego, CA, USA, where he was involved in embedded GPS receiver and 3G cellular transceiver designs. He is currently an Associate Professor with the Department of Electrical and Computer Engineering, Illinois Institute of Technology, Chicago, IL, USA. His current research interests include CMOS analog and RF/mm-wave IC design, CAD methodology and macromodeling for very large scale integration, and high-speed signal processing.



**Sara Munoz Hermoso** received the M.S. degree in electrical engineering from the Illinois Institute of Technology, Chicago, IL, USA, in 2012, and the M.S. degree in telecommunications engineering from the Universidad Politécnica de Madrid, Madrid, Spain, in 2013.

She is currently a Senior RF System Engineer with Innophase Inc., Chicago. Her current research interests include nonlinear RF device modeling and innovative wireless transmitter and receiver architectures.