

# PICOSECOND-ACCURACY DIGITAL-TO-TIME CONVERTER FOR PHASE-INTERPOLATION DDS

F. Baronti, D. Lunardini, R. Roncella, and R. Saletti

Dipartimento di Ingegneria dell'Informazione

Via Caruso 2, 56122 Pisa, Italy

Tel: +39-050-2217629; Fax: +39-050-2217522

E-mail: [d.lunardini@iet.unipi.it](mailto:d.lunardini@iet.unipi.it)

## Abstract

A high-resolution CMOS Digital-to-Time Converter for Direct-Digital-Synthesis (DDS) applications is presented in this paper. The novel architecture provides 4096 phase-interpolation levels by introducing a delay proportional to a 12-bit digital control word with a resolution of about 2 ps. The 120 MHz accumulator clock frequency is thus virtually multiplied by a factor of 4096, which greatly reduces the spurious components of the DDS output. The phase interpolation, implemented in two steps, is based on Delay-Locked Delay-Lines that guarantee the accuracy of the introduced delay by fully compensating for environmental and process variations. The circuit is very compact in terms of occupied silicon area, since it employs only 35 delay-cells.

## I. INTRODUCTION

Direct Digital Synthesis (DDS) is a popular technique used in a wide range of applications such as measurement equipment, multistandard digital television, arbitrary waveform synthesis, FM or PM modulation, and very finely tunable clock generators. DDS solutions are now also becoming suitable for agile frequency generation in advanced wireless communication systems as a replacement for analog Phase-Locked Loops (PLL), especially in fully integrated implementations. PLL-based systems provide high output frequency, spectral purity, and low power consumption but, because of their feedback-loop architecture, they can hardly provide fast frequency switching and high frequency resolution at the same time. Present communication systems, however, often require high channel-switching flexibility, so that achieving both features simultaneously is becoming crucial in frequency synthesis.

DDS-based circuits are not limited in frequency-switching speed, because the frequency is generated directly, without a feedback loop, but they traditionally suffer from high power consumption and low operating frequency, at least in their basic implementation. A block diagram of a conventional DDS capable of generating a sine wave is shown in Figure 1. It consists of a phase accumulator, a ROM storing the sine values (LUT), and a D/A converter followed by a low-pass filter (LPF) [1]-[3]. A square wave can be obtained by adding a high-speed comparator.



Figure 1. Conventional DDS block diagram.

The phase accumulator provides the sine argument that is used as the ROM address. Every clock cycle, the corresponding sine value stored in the ROM is first read and then converted to an analog value. Once the undesired frequency components are filtered out, the consecutive analog values form the sine wave. If  $N$  is the number of bits of the accumulator output and  $\Delta P$  is the accumulator step (frequency control word), the output frequency is given by

$$f_{OUT} = \frac{\Delta P}{2^N} \cdot f_{CLK} \quad (1)$$

where  $f_{CLK} = 1/T_{CLK}$  is the operating clock frequency.
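As a quick numerical check of the output-frequency relation above, the following minimal sketch evaluates it together with the corresponding tuning step. The function names are illustrative assumptions; the values  $N=3$  and  $\Delta P=3$  used in the example are those of the small case discussed later in Figure 3, not design parameters.

```python
def dds_output_frequency(delta_p: int, n_bits: int, f_clk_hz: float) -> float:
    """f_OUT = (delta_p / 2**n_bits) * f_clk."""
    if not 0 < delta_p < 2 ** n_bits:
        raise ValueError("delta_p must lie in (0, 2**n_bits)")
    return delta_p / 2 ** n_bits * f_clk_hz


def dds_frequency_step(n_bits: int, f_clk_hz: float) -> float:
    """Smallest output-frequency increment, obtained with delta_p = 1."""
    return f_clk_hz / 2 ** n_bits


if __name__ == "__main__":
    f_clk = 120e6  # clock frequency quoted in the paper
    print(dds_output_frequency(3, 3, f_clk))  # 45000000.0
    print(dds_frequency_step(3, f_clk))       # 15000000.0
```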

The clock frequency of the synthesizer is limited by the ROM access time and most of the power dissipation is due to the ROM and the D/A converter (DAC). The maximum operating speed of the analog parts of the DDS (DAC and filter) is far below that of the digital parts. In addition, CMOS integration of analog parts is difficult, especially with low power-supply voltages. Therefore, completely digital DDS architectures that use neither ROM nor D/A converter have been developed [4]-[8], in order to overcome the above-mentioned problems and to achieve higher generated frequencies.

Several authors have proposed eliminating the analog parts of DDS architectures by performing a phase interpolation of the clock period with a digitally controllable delay generator [4]-[7]. Phase-interpolation DDS circuits are better suited to integration and easier to port from one technology to another, because they do not contain complex analog parts such as DACs. Moreover, the operating frequency is significantly higher than in traditional architectures, although none of these circuits can generate a frequency higher than its input clock. A variant of the phase-interpolation DDS is the Direct Digital Period Synthesizer (DDPS) presented in [8]. In that case, the phase interpolation is combined with DLL-based frequency multiplication, thus allowing the generation of output frequencies higher than the input one. The drawback is that the control word is no longer proportional to the frequency but to the period of the output waveform.

Both the phase-interpolation DDS and the DDPS architectures employ a high-resolution Digital-to-Time Converter (DTC) to place every edge of the output signal at the right instant in the time domain. In both cases, the time resolution of the DTC directly determines the spectral purity of the output signal. This paper describes a high-resolution DTC designed for phase-interpolation DDS architectures. According to post-layout simulations, the circuit performs a 4096-level interpolation of a 120 MHz clock, reaching a delay resolution of about 2 ps. The delay is generated in two steps. The first is a classic clock interpolator that consists of a 32-tap Delay-Locked Delay-Line (DLL) followed by a 32-to-1 multiplexer. The second is a new-concept delay generator based on novel controllable delay-cells that can also be reused in DDPS architectures as the final delay regulator.

The new delay-cell design combines a digital current-starving control with the shunt-capacitor technique. It allows the second stage to realize 128 further interpolation levels while compensating for changes in the environmental conditions.

The following section gives a brief description of the phase-interpolation DDS architecture, whereas in Section III the new digitally controllable delay generator is presented. Finally, Section IV is devoted to the simulation results.

## II. PHASE-INTERPOLATION DDS

Considering the traditional DDS architecture depicted in Figure 1, the signal available after the first block already has a frequency equal to the desired output one, at least as a mean value. Provided that the frequency control word ( $\Delta P$ ) is an odd number, the pulse width of the most significant bit (MSB) of the phase accumulator output changes periodically with a period of  $2^N/f_{CLK}$ , since the output sequence repeats itself every  $2^N$  clock cycles. In this period, the number of MSB pulses is exactly  $\Delta P$ , so the mean frequency of this signal is indeed equal to the  $f_{OUT}$  defined in (1). The signal is a rectangular wave, but its fundamental frequency can be extracted by a low-pass filter in applications that need a sine wave. Therefore, the phase accumulator alone can be considered a rough DDS circuit. However, the position of the signal edges in the time domain is quantized by the system clock period, so that high spurious components are generated in the waveform spectrum.
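The pulse-count property of the accumulator MSB can be checked with a short simulation. This is a minimal sketch: the function names are illustrative, the accumulator is assumed to start from zero, and  $\Delta P \le 2^{N-1}$  is assumed so that the MSB toggles at most once per clock cycle.

```python
def accumulator_sequence(delta_p: int, n_bits: int, n_cycles: int) -> list:
    """Phase-accumulator output for n_cycles clock cycles, starting from 0."""
    modulus = 1 << n_bits
    return [(k * delta_p) % modulus for k in range(n_cycles)]


def count_msb_pulses(delta_p: int, n_bits: int) -> int:
    """Count the rising edges of the MSB over one period of 2**n_bits cycles."""
    # One extra sample closes the period, so the last transition is counted too.
    seq = accumulator_sequence(delta_p, n_bits, (1 << n_bits) + 1)
    msb = [(value >> (n_bits - 1)) & 1 for value in seq]
    return sum(1 for a, b in zip(msb, msb[1:]) if (a, b) == (0, 1))


if __name__ == "__main__":
    print(accumulator_sequence(3, 3, 9))  # [0, 3, 6, 1, 4, 7, 2, 5, 0]
    print(count_msb_pulses(3, 3))         # 3
```

The MSB therefore carries  $\Delta P$  pulses per  $2^N$  clock cycles, although the individual pulses do not all have the same width.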

The resulting spurious components are unacceptable in most applications. Once the output frequency is fixed, a trivial way to reduce the spurious frequency components is to increase the system clock frequency. The limit, however, is the maximum working frequency of the phase accumulator, which cannot be exceeded. A great improvement in the output spectral purity can also be obtained by delaying each edge of the output signal independently with a precise controllable delay generator (DTC), so that all the output pulses have the same width. Figure 2 presents a simplified block diagram of the phase-interpolation DDS architecture.



Figure 2. Block diagram of the phase-interpolation DDS general architecture. Each MSB edge is delayed by the indicated quantity.

The delay that must be applied to each edge of MSB can be deduced directly from the phase information carried by the  $N-1$  least significant bits of the phase accumulator output ( $C$  in the figure), once the number  $M$  of interpolation levels implemented by the delay generator is known. An example of the output waveform generation for the first  $2^N+1$  clock cycles is shown in Figure 3. The figure refers to the simple case in which  $N=3$  and  $\Delta P=3$  and shows how the ideal output waveform can be generated by properly delaying each rising and falling edge of the MSB waveform.

The precision of the performed time interpolation and, consequently, the level of the output spurious signals depend now only on the accuracy of the DTC. Indeed, the position of the waveform edges at the output of the delay generator is no longer quantized by the system clock period, but is now quantized by the DTC resolution. It is worth noting that the effect of the phase interpolation on the output signal spectral purity is the same as increasing the clock frequency in the phase accumulator. In particular, performing an  $M$ -level phase interpolation is equivalent to a virtual clock multiplication by a factor  $M$ .
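As a worked example with the figures quoted in this paper ( $f_{CLK} = 120$  MHz,  $M = 4096$ ), the equivalence can be made explicit. The symbols  $T_{RES}$  and  $f_{VIRT}$  are introduced here only for illustration:

$$T_{RES} = \frac{T_{CLK}}{M} = \frac{1}{M \cdot f_{CLK}} = \frac{1}{4096 \cdot 120\ \text{MHz}} \approx 2.03\ \text{ps}, \qquad f_{VIRT} = M \cdot f_{CLK} \approx 491.5\ \text{GHz}$$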

The key factor is now the design of a DTC with the required precision. The use of digitally controlled delay-locked delay-lines (DLL) permits one to realize the phase-interpolation circuit with outstanding performance, so that very low spurious levels can be obtained [9].

| Time<br>( $T_{CLK}$ ) | Acc.<br>Out. | MSB | Delay<br>( $T_{CLK}/N$ ) | MSB<br>Waveform | Output<br>Waveform |
|-----------------------|--------------|-----|--------------------------|-----------------|--------------------|
| 1                     | 0            | 0   | -                        |                 |                    |
| 2                     | 3            | 0   | -                        |                 |                    |
| 3                     | 6            | 1   | 1                        |                 |                    |
| 4                     | 1            | 0   | 2                        |                 |                    |
| 5                     | 4            | 1   | 3                        |                 |                    |
| 6                     | 7            | 1   | -                        |                 |                    |
| 7                     | 2            | 0   | 1                        |                 |                    |
| 8                     | 5            | 1   | 2                        |                 |                    |
| 9                     | 0            | 0   | 3                        |                 |                    |

Figure 3. Example of Direct Digital Synthesis with ideal phase interpolation, using a three-bit phase accumulator, when  $\Delta P = 3$ .
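The delay column of Figure 3 can be reproduced with a few lines of code. The sketch below is not taken from the paper: it assumes that the ideal edge lies  $C/\Delta P$  clock periods before the clock edge at which MSB toggles, that a fixed offset of one  $T_{CLK}$  is added to keep every delay positive, and that  $\Delta P \le 2^{N-1}$ . Under these assumptions the delay is  $(1 - C/\Delta P) \cdot T_{CLK}$ , which matches the tabulated values for  $N=3$  and  $\Delta P=3$ .

```python
from fractions import Fraction


def edge_delays(delta_p: int, n_bits: int, n_cycles: int) -> list:
    """Return (cycle, accumulator, msb, delay) rows.

    delay is None when MSB does not toggle, otherwise a Fraction of T_CLK.
    """
    modulus = 1 << n_bits
    half = modulus >> 1  # weight of the MSB
    rows = []
    prev_msb = 0
    for k in range(n_cycles):
        acc = (k * delta_p) % modulus
        msb = acc // half
        delay = None
        if k > 0 and msb != prev_msb:
            c = acc % half  # the N-1 least significant bits
            # Assumed rule: ideal edge is c/delta_p periods early, plus one
            # clock period of constant offset.
            delay = 1 - Fraction(c, delta_p)
        rows.append((k + 1, acc, msb, delay))
        prev_msb = msb
    return rows


if __name__ == "__main__":
    for cycle, acc, msb, delay in edge_delays(3, 3, 9):
        units = "-" if delay is None else str(int(delay * 3))
        print(cycle, acc, msb, units)  # last column in units of T_CLK/3
```

For an  $M$ -level DTC the exact fraction would additionally be rounded to the nearest multiple of  $T_{CLK}/M$ .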

## III. DIGITAL-TO-TIME CONVERTER

A basic block diagram of the novel two-stage Digital-to-Time Converter is shown in Figure 4. The first stage of the delay generation is responsible for the first 32 interpolation levels. It consists of a DLL locked to the clock period, followed by a multiplexer. The 32-tap DLL consists of a cascade of 32 elemental adjustable delay-cells that is inserted in a negative feedback loop by which the line delay is kept equal to the period of the input clock ( $T_{CLK}$ ). The variations of process conditions, power supply, and temperature are, thus, compensated. Each cell of the delay-line is designed using a shunt-capacitor circuit scheme [10], so that the delay control can be carried out by a fully digital semi-custom controller. The DLL produces 32 output signals replicating the input one, each one spaced in phase from the other by  $T_{CLK}/32$ . The following multiplexer permits one to choose one of the DLL outputs that is then used as the synchronization signal for the MSB waveform coming from the phase accumulator. Given the clock frequency of 120 MHz in this design example, the time resolution reached by this first stage is about 260 ps, which is also the delay introduced by each cell of the delay line.
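The nominal tap timing of the locked delay line can be tabulated as follows. This is a minimal sketch with illustrative names; the tap indexing (tap 0 taken as the undelayed reference) is an assumption, since a constant offset common to all taps is irrelevant here.

```python
def dll_tap_delays_ps(f_clk_hz: float, n_taps: int = 32) -> list:
    """Nominal delay of each tap, in ps, when the line delay equals T_CLK."""
    t_clk_ps = 1e12 / f_clk_hz
    cell_delay_ps = t_clk_ps / n_taps
    return [i * cell_delay_ps for i in range(n_taps)]


def select_tap(tap_delays_ps: list, index: int) -> float:
    """Model of the 32-to-1 multiplexer: pick one tap by its index."""
    return tap_delays_ps[index]


if __name__ == "__main__":
    taps = dll_tap_delays_ps(120e6)
    print(len(taps))                      # 32
    print(round(select_tap(taps, 1), 1))  # 260.4
```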



Figure 4. Two-step Digital-to-Time Converter (simplified block diagram).

The second step of the delay generator is a novel architecture that realizes a further 128-level interpolation inside each delay tap provided by the first step. Since the DDS application is insensitive to a delay offset applied identically to every signal edge, the basic idea is to exploit the difference between two delays instead of the whole delay of a cell. In brief, a single adjustable delay-cell introduces the desired delay by switching between different load configurations. However, the delay of this additional cell is strongly affected by environmental and process variations. To make the fine delay generator also compensate for these variations without reducing the precision of the introduced delay, a new kind of controllable delay-cell is employed. The delay-cell architecture is presented first, followed by the structure of the fine delay generator.
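The division of the 12-bit control word between the two stages can be sketched as follows. The bit assignment (upper five bits to the tap multiplexer, lower seven bits to the fine stage) is an assumption made for illustration; the paper does not state it explicitly. The function names are likewise hypothetical.

```python
T_CLK_PS = 1e12 / 120e6  # clock period in ps for the 120 MHz design example
COARSE_BITS = 5          # 32 taps in the first stage
FINE_BITS = 7            # 128 levels in the second stage


def split_control_word(word: int) -> tuple:
    """Split a 12-bit word into (coarse tap index, fine code)."""
    if not 0 <= word < 1 << (COARSE_BITS + FINE_BITS):
        raise ValueError("control word out of range")
    return word >> FINE_BITS, word & ((1 << FINE_BITS) - 1)


def ideal_delay_ps(word: int) -> float:
    """Ideal DTC transfer: delay proportional to the control word."""
    return word * T_CLK_PS / (1 << (COARSE_BITS + FINE_BITS))


if __name__ == "__main__":
    print(split_control_word(0xABC))        # (21, 60)
    print(round(ideal_delay_ps(128), 1))    # 260.4
```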

### **CURRENT-STARVED/SHUNT-CAPACITOR DELAY-CELL**

The delay-cells used in the second stage of the delay generator consist of two cascaded inverters, like any traditional delay-cell, but their delay needs to be controlled in a particularly flexible way. The next sub-section explains in detail how this flexibility is used to obtain both a very high output delay resolution and accurate compensation of the environmental variations at the same time. Two main techniques are usually employed to make the delay of a cell controllable by an external signal: acting either on the load of the inverters (shunt-capacitors) or on their strength (current-starving). The newly developed delay-cell presented in this work exploits both methods. Figure 5 shows one of the two identical inverters composing the cell.

The MOS transistors MN0-MN4 and MP0-MP4 have been designed with binary-weighted increasing widths, and they can be switched on or off by the five-bit control word C0...C4. In this way, the current-starving control is completely digital and the strength of the inverters can be regulated with 32 possible control steps.

Moreover, a set of 127 shunt-capacitors is connected to the output of the inverter. A shunt-capacitor is a MOS transistor whose source and drain terminals are joined together and attached to the inverter output. The digital voltage applied to the gate terminal determines whether the MOS channel is formed, so that the channel capacitance is inserted or removed according to the logic value of the gate voltage. The 127 transistor/capacitors are all identical to each other, but their control signals (gates) are organized in binary-weighted groups in order to make the load controllable by a seven-bit digital control word. Therefore, the most significant bit of the control word drives 64 shunt-capacitors, whereas the least significant bit controls only one.
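The binary-weighted grouping can be illustrated with a short sketch (function names are illustrative). It shows that a seven-bit code  $k$  switches on exactly  $k$  of the 127 identical capacitors, which gives 128 distinct load values.

```python
def shunt_groups(n_bits: int = 7) -> list:
    """Number of identical capacitors driven by each control bit, LSB first."""
    return [1 << i for i in range(n_bits)]


def enabled_capacitors(code: int, n_bits: int = 7) -> int:
    """Count the capacitors switched on by a given control code."""
    if not 0 <= code < 1 << n_bits:
        raise ValueError("code out of range")
    groups = shunt_groups(n_bits)
    return sum(size for bit, size in enumerate(groups) if (code >> bit) & 1)


if __name__ == "__main__":
    print(shunt_groups())                  # [1, 2, 4, 8, 16, 32, 64]
    print(sum(shunt_groups()))             # 127
    print(enabled_capacitors(0b1000000))   # 64
```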

### **SECOND STAGE FINE DELAY-GENERATOR**

The novel architecture of the circuit that implements the second stage of the phase interpolation is shown in Figure 6 as a simplified block diagram. The circuit is essentially composed of three identical delay-cells of the type described in the previous sub-section. The third cell is responsible for the fine delay generation, whereas the other two stabilize the delay introduced by the third cell against environmental and process variations. As can be seen in the figure, the current-starving control is identical for all three cells. It is managed by a digital controller that applies the feedback correction to the three cells in parallel, according to the result of the phase comparison between the outputs of the first two cells. The shunt-capacitor configurable load, on the other hand, is controlled in a different way for each cell.



Figure 5. Current-starving/shunt-capacitor delay-cell with double digital control.

In particular, all the shunt-capacitors of the first cell are always switched off, so that the cell introduces the minimum possible delay ( $\tau_{MIN}$ ). All the capacitors of the second cell are instead always switched on, and the introduced delay is the maximum ( $\tau_{MAX}$ ). The inputs of these two cells are connected to two adjacent taps of the serial delay-line (first stage of the delay generator). The negative feedback loop acting on the current-starving control keeps the outputs of the first and second cell in phase. Therefore, the difference between  $\tau_{MAX}$  and  $\tau_{MIN}$  is kept constant and equal to the delay of one cell of the serial delay-line, which compensates for the environmental changes. Since the control is common to all three cells, the delay is also stabilized in the third cell, provided that the three cells behave identically. The feedback loop is implemented by a highly sensitive phase comparator (PC) and a simple digital controller. The following equation is satisfied when the loop is locked:

$$\tau_{MAX} - \tau_{MIN} = \tau_{SER} \quad (2)$$

where  $\tau_{SER}$  is the delay introduced by one cell of the serial delay line used to implement the first delay stage.

To understand how equation (2) can be satisfied by controlling only the current-starving transistors, let us consider a first-order approximation of the delay introduced by a generic inverter:

$$\tau = R \cdot C \quad (3)$$

where  $R$  is inversely proportional to the inverter strength and  $C$  represents the inverter load. Considering the loads of the first two cells of our system, the difference between their delays when the current-starving feedback loop is locked can be written as

$$\tau_{SER} = \tau_{MAX} - \tau_{MIN} = R \cdot (C_0 + \Delta C) - R \cdot C_0 = R \cdot \Delta C \quad (4)$$

where  $C_0$  stands for the load of the first cell (all the shunt capacitors switched off) and  $\Delta C$  is the extra load introduced by all the 127 shunt capacitors. It is now clear that equation (4) can be satisfied by regulating only the strength of the transistors ( $R$ ), that is to say, by controlling the current-starving transistors.
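The locking mechanism can be illustrated with a toy model built on the first-order approximation  $\tau = R \cdot C$ . Everything numerical below is hypothetical: the unit capacitance, the conductance values, the assumption that a larger control code means a stronger inverter, and the one-step-per-iteration bang-bang controller are chosen only to show the principle and do not describe the actual circuit.

```python
def delay_difference_ps(code: int, c_unit_ff: float = 1.0, n_caps: int = 127,
                        g_min_ms: float = 0.40, g_unit_ms: float = 0.005) -> float:
    """tau_MAX - tau_MIN = R * dC under the first-order model tau = R * C.

    R = 1 / G with G = g_min + code * g_unit (in mS), dC = n_caps * c_unit
    (in fF). Since kOhm * fF = ps, the result is in picoseconds.
    All component values are hypothetical.
    """
    conductance_ms = g_min_ms + code * g_unit_ms
    return n_caps * c_unit_ff / conductance_ms


def run_lock_loop(tau_ser_ps: float, n_bits: int = 5, n_iter: int = 64) -> list:
    """Bang-bang loop: the current-starving code moves by one step per iteration."""
    code = 0
    visited = []
    for _ in range(n_iter):
        visited.append(code)
        # Phase-comparator decision: is the delay difference still too large?
        too_slow = delay_difference_ps(code) > tau_ser_ps
        code += 1 if too_slow else -1
        code = max(0, min(code, (1 << n_bits) - 1))
    return visited


if __name__ == "__main__":
    history = run_lock_loop(260.4)
    print(sorted(set(history[-2:])))  # [17, 18]
```

With these made-up values the loop settles into alternating between two adjacent codes whose delay differences bracket  $\tau_{SER}$ , which is the expected behaviour of a bang-bang loop with a quantized control.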



Figure 6. Block diagram of the fine delay-generator, 2<sup>nd</sup> stage of the phase interpolation.

The strategy of compensating the environmental changes operating only with the current-starving control permits one to reserve the whole digital shunt-capacitor control of the third cell to generate the desired delay. Indeed, once the current-starving control loop is locked, the only difference between the second and the first cell is that the 127 shunt-capacitors are switched on and off respectively. Therefore, the difference in the introduced delay is now completely due to the capacitors. That means that a 128-level interpolation is realized by simply controlling the seven-bit digital control word that commands the number of capacitors inserted in the third cell. It is also worth noting that the 128 steps are obtained using only three delay-cells, so that only 35 active delay-cells are sufficient to achieve the 4096-level interpolation.
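Under the first-order model  $\tau = R \cdot C$ , and assuming that the 127 shunt-capacitors are perfectly identical (an idealization), the delay of the third cell for a seven-bit code  $k$  can be written explicitly. The symbols  $\tau_3$ ,  $k$ , and  $C_u$  (the capacitance of a single shunt-capacitor, with  $\Delta C = 127 \cdot C_u$ ) are introduced here for illustration only:

$$\tau_3(k) = R \cdot (C_0 + k \cdot C_u) = \tau_{MIN} + \frac{k}{127} \cdot \tau_{SER}, \qquad k = 0, 1, \ldots, 127$$

With  $\tau_{SER} \approx 260$  ps, each code step corresponds to roughly 2 ps in this model. The cell count follows as 32 cells in the first stage plus 3 in the second, and the level count as  $32 \times 128 = 4096$ .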

## IV. SIMULATION RESULTS

The Digital-to-Time Converter has been completely designed and extensively simulated using a 0.35  $\mu\text{m}$  CMOS technology. In particular, the second stage of the delay generation has been functionally verified and carefully characterized after the layout phase, since it is the major novelty of this work. Figure 7 shows the post-layout simulation results for the difference between the delays introduced by the second and the first cell of the fine delay generator as a function of the current-starving control. It is worth noting that a delay difference of 260 ps can be reached in every environment and process condition considered. The simulation shown in Figure 8 demonstrates the linearity of the digital control of the shunt-capacitors once the current-starving feedback loop is locked.

It can be observed that, since the current-starving control steps are very small, the phase comparator (PC) must be able to distinguish very slight phase variations. Particular attention has been paid in the layout design in order to avoid undesired differences in the delay paths, especially for the 32-to-1 multiplexer of the first stage. A few dummy cells were inserted to synchronize the signals and to limit nonlinearity effects.

As an example of the achievable spectral purity enhancement with the described system, the spectrum of a 37 MHz output signal with and without phase interpolation is shown in Figure 9. It must be noted that those simulations refer to the ideal case in which nonlinearity effects are not taken into account.



Figure 7. Post-layout simulation of the delay difference between the second and the first cell in the fine delay generator as a function of the current-starving digital control. The value of  $\tau_{SER} = 260$  ps is reached in every environment and process condition.



Figure 8. Post-layout simulation of the delay difference introduced by the third cell of the fine delay generator as a function of the shunt capacitor control, with respect to the delay introduced when all the configurable loads are switched off.

However, despite all the efforts in the layout design, a residual nonlinearity always remains, mainly because of unavoidable random mismatch in the circuit. These effects introduce undesired phase modulation in the DDS output signal and directly affect the output spectral purity. A possible solution is to adopt calibration techniques that reduce the delay-line nonlinearity [11].



Figure 9. Spectrum of a 37 MHz output signal before and after the 4096-level phase interpolation.

## V. CONCLUSION

A very-high-resolution digital-to-time converter for phase-interpolation DDS architectures has been designed in a  $0.35 \mu\text{m}$  CMOS technology. It consists of two cascaded delay stages, the first of which is based on a DLL, and generates a delay proportional to a 12-bit control word. A novel delay-cell design with a double digital delay control is introduced in the second stage in order to reach 128-level interpolation using only three delay-cells. The system achieves 4096 levels of interpolation with a delay resolution of about 2 ps, while compensating for variations of the operating conditions. As a result, the whole circuit is very compact in terms of occupied silicon area, since the total number of delay-cells is only 35.

## ACKNOWLEDGMENT

The authors wish to acknowledge Andrea Calleri for a valuable contribution to the development of this work.

## REFERENCES

- [1] H. T. Nicholas III and H. Samueli, 1991, "A 150 MHz Direct Digital Synthesizer in  $1.25-\mu\text{m}$  CMOS with  $-90 \text{ dBc}$  Spurious Performance," **IEEE Journal of Solid State Circuits**, **26**, 1959-1969.

- [2] A. Yamagishi, M. Ishikawa, T. Tsukahara, and S. Date, 1998, “A 2-V, 2-GHz Low-Power Direct Digital Frequency Synthesizer Chip-Set for Wireless Communication,” **IEEE Journal of Solid State Circuits**, **33**, 210-217.
- [3] J. Vankka, M. Waltari, M. Kosunen, and K. A. I. Halonen, 1998, “A direct digital synthesizer with an on-chip D/A-converter,” **IEEE Journal of Solid State Circuits**, **33**, 218-227.
- [4] T. Nakagawa and H. Nosaka, 1997, “A Direct Digital Synthesizer with Interpolation Circuits,” **IEEE Journal of Solid State Circuits**, **32**, 766-769.
- [5] A. Heiskanen, A. Mantyniemi, and T. Rahkonen, 2001, “A 30 MHz DDS clock generator with sub-ns time domain interpolator and -50 dBc spurious level,” in Proceedings of ISCAS 2001, the 2001 IEEE International Symposium on Circuits and Systems, vol. **4**, pp. 626-629.
- [6] H. Nosaka, Y. Yamaguchi, A. Yamagishi, H. Fukuyama, and M. Muraguchi, 2001, “A Low-Power Direct Digital Synthesizer Using a Self-Adjusting Phase-Interpolation Technique,” **IEEE Journal of Solid State Circuits**, **36**, 1281-1285.
- [7] R. Richter and H. J. Jentschel, 2001, “A Virtual Clock Enhancement Method for DDS Using an Analog Delay Line,” **IEEE Journal of Solid State Circuits**, **36**, 1158-1161.
- [8] D. E. Calbaza and Y. Savaria, 2002, “A Direct Digital Period Synthesis Circuit,” **IEEE Journal of Solid State Circuits**, **37**, 1039-1045.
- [9] F. Baronti, D. Lunardini, L. Fanucci, R. Roncella, and R. Saletti, 2002, “A High-Resolution DLL-based Digital-to-Time converter for DDS Applications,” in Proceedings of 2002 IEEE International Frequency Control Symposium and PDA Exhibition, 29-31 May 2002, New Orleans, Louisiana, USA (IEEE Publication 02CH37234), pp. 649-653.
- [10] P. Andreani, F. Bigongiari, R. Roncella, R. Saletti, and P. Terreni, 1999, “A digitally controlled shunt capacitor CMOS delay line,” **Analog and Integrated Circuit and Signal Processing**, **18**, 89-96.
- [11] F. Baronti, D. Lunardini, R. Roncella, and R. Saletti, “A Self-Calibrating Delay-Locked Delay-Line with Shunt-Capacitor Circuit Scheme,” **IEEE Journal of Solid State Circuits**, **39**, no. 2 (in press).

**QUESTIONS AND ANSWERS**

**JOHN PETSINGER (ITT):** What was the spur reduction? It went from minus 10 to minus what? It looked like minus 80 from back here.

**DIEGO LUNARDINI:** It was only an example of a 37 MHz signal synthesizing before and after the phase interpolation. The first one was the output of the phase accumulator, the MSB. The second spectrum is the spectrum of the final output signal.

**PETSINGER:** Yes, but what exactly was the reduction?

**LUNARDINI:** It was from minus 10 dB to minus 82 dB.

