

RESEARCH ARTICLE | FEBRUARY 05 2025

## FPGA-based process, voltage, and temperature insensitive picosecond resolution timing generators with offset correction for automatic test equipment

Zeyu Guo; Liangqi Gui; Kai Sheng


*Rev. Sci. Instrum.* 96, 024702 (2025)

<https://doi.org/10.1063/5.0244543>






# FPGA-based process, voltage, and temperature insensitive picosecond resolution timing generators with offset correction for automatic test equipment

Cite as: Rev. Sci. Instrum. 96, 024702 (2025); doi: 10.1063/5.0244543

Submitted: 20 October 2024 • Accepted: 18 January 2025 • Published Online: 5 February 2025




Zeyu Guo, Liangqi Gui, a) and Kai Sheng

## AFFILIATIONS

Research Center of 6G Mobile Communications, School of Cyber Science and Engineering, Huazhong University of Science and Technology, Wuhan 430074, China

a) Author to whom correspondence should be addressed: [guilq@hust.edu.cn](mailto:guilq@hust.edu.cn)

## ABSTRACT

This paper presents the implementation of a picosecond resolution timing generator (TG) insensitive to process, voltage, and temperature (PVT) variations for automatic test equipment. The TG is implemented in field-programmable gate arrays (FPGAs) using two-stage time interpolation, which utilizes a multi-phase generator, IDELAY3, and carry-chain resources. To enhance the test rate, each channel of the proposed TG consists of four edge generators operating in parallel. Without offset correction, the TG performance deteriorates severely because of its sensitivity to PVT variations. We therefore design a robust offset canceler that keeps the TG performance stable under PVT variations. With the proposed architecture and offset canceler, the PVT-insensitive TG achieves a time resolution of 5 ps and offers a maximum dynamic range of 10 s. It also shows an improved worst-case integral non-linearity ranging from -4.7 to +4.6 ps with the operating temperature continuously varying from 15 to 65 °C and the voltage ranging from 0.95 to 1.01 V in FPGAs. The proposed TG can be implemented on the UltraScale or UltraScale+ FPGA platform.

Published under an exclusive license by AIP Publishing. <https://doi.org/10.1063/5.0244543>

## I. INTRODUCTION

Timing generators (TGs) are widely applied in many cutting-edge industrial and scientific fields, such as quantum computation,<sup>1–4</sup> quantum metrology,<sup>5–9</sup> phase-locked loops<sup>10,11</sup> (PLLs), and especially automatic test equipment (ATE).<sup>12–14</sup> As digital chips improve in performance, developing test systems that outperform the device under test (DUT) has become increasingly important.<sup>15</sup> A timing formatter, which consists of a formatter and a TG, is regarded as the core block of ATE.<sup>12</sup> The formatter manages all workflows of the timing formatter and communicates with other ATE blocks. The TG generates accurate edges based on target delay codes from the formatter and combines the edges according to the symbol format,<sup>16</sup> such as Non-Return-to-Zero (NRZ) or Return-to-Zero (RZ). Therefore, the TG must provide high resolution and precise edge generation to achieve a high-performance ATE.

Implementing high-resolution TGs on application-specific integrated circuits (ASICs) is a practical solution for complex timing generation,<sup>13,17–21</sup> but it is costly and lacks flexibility. Recent research on field-programmable gate array (FPGA)-based TGs shows that they can achieve high performance while retaining a high degree of flexibility.<sup>22–31</sup> FPGA-based TGs offer significant advantages in applications such as mid-to-low-end chip testing and design validation during the development phase.<sup>28</sup>

TGs that simply count a high-speed clock can generate timing sequence signals with a resolution equal to the clock period. A resolution of 1 ns already requires a 1 GHz clock signal, so picosecond-level resolution is difficult to reach with this approach alone. The vernier-based digital-to-time converter (DTC)<sup>32</sup> can generate pulse signals with picosecond resolution, but it may require excessive preparation time for fine time delays. Consequently, this method may suffer from a significant symbol rate reduction with increasing timing resolution. The tapped delay line is a practical method for continuously generating timing signals. However, its resolution is limited by the intrinsic cell delays,<sup>33</sup> typically ranging from tens to hundreds of picoseconds. The time folding method proposed by Qin *et al.*<sup>23</sup> can break the limitation that the minimum chain cell delay imposes on the sequence time resolution. However, this TG does not fully utilize FPGA resources and relies on external delay chips. In addition, power consumption and area may increase significantly with the number of delay chips when expanding channels.

Among tapped delay line implementations, carry-chains offer fine linearity and resolution, enabling high-precision edge generation.<sup>25</sup> The delay time of a delay chain depends on process, voltage, and temperature (PVT).<sup>34</sup> Variations in delay time can result in timing inaccuracies and performance deterioration. Voltage-induced delay time variations can be mitigated by highly stable power supplies. However, the impact of process and temperature is unavoidable. Although temperature control equipment can mitigate temperature variations, the system cost and complexity will be increased accordingly. Offset correction methods for FPGA-based TGs are still rarely discussed.

In this paper, we propose a new structure that uses only FPGA resources to improve the TG resolution beyond the limitations of intrinsic cell delays and non-linearity. The proposed TG adopts four parallel edge generators, each of which consists of a coarse delay generator, a multi-phase generator, a fine delay generator, and an offset canceler. Multiple stages of IDELAY interpolation are cascaded to form a new type of delay chain with significantly smaller bins than the physical delay elements in the original carry-chain. To improve the linearity of the new delay chain, we select the appropriate delay codes based on the demanded time resolution and the length of the delay chain. To overcome delay time variations caused by PVT changes, we propose a robust offset canceler that ensures stable TG performance. The PVT-insensitive TG exhibits high performance and strong adaptability, making it suitable for various environmental conditions in FPGAs.

## II. ARCHITECTURE

### A. Overall architecture

The overall structure of the proposed TG is shown in Fig. 1. The prototype architecture is composed of $n$ TG channels. The proposed TG takes two external clocks: a master clock ($CLK_{mas}$) and an offset canceler operating clock ($CLK_{ope}$). $CLK_{mas}$ determines the test rate and can run at, for example, 266.5 MHz. The output of the TG is synchronized with $CLK_{mas}$. $CLK_{ope}$ is the operating clock of the offset canceler and IDELAYCTRL and has a fixed frequency of 200 MHz.

An edge generator (EG) is composed of a coarse delay generator, a multi-phase generator, a fine delay generator, and an offset canceler. The dead time of each EG is three clock periods, so a TG with a single EG could generate an edge only once every four clock periods. To enhance the test rate and reduce the resource utilization, each TG channel consists of four edge generators ($Edge Gen_0$ to $Edge Gen_3$) and an edge combiner. A 5 ps timing adjustment resolution is obtained by implementing the multistage time interpolation (MTI) module in the timing generation compartment. The coarse delay generator operates with $CLK_{mas}$. A 6-to-1 multi-phase generator is used as the first interpolation stage. The second-stage time interpolation is performed by a fine delay generator implemented with the carry-chain and IDELAY3 resources. The edge combiner is an XOR gate that combines the outputs of the four edge generators. Its function can be described by

$$SYMBOL_0 = EDGE_0 \oplus EDGE_1 \oplus EDGE_2 \oplus EDGE_3, \quad (1)$$

where  $SYMBOL_0$  is the output of the TG and  $EDGE_0$  to  $EDGE_3$  are the outputs of the EGs.
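As a minimal behavioral sketch of Eq. (1), the following Python snippet XORs four sampled edge-generator outputs. It assumes, for illustration only, that each EG output can be modeled as a level that switches from 0 to 1 at one sample index; the helper names `step` and `combine_edges` and the toggle positions are hypothetical and not taken from the paper.

```python
from functools import reduce
from operator import xor


def step(toggle_at, length):
    """Hypothetical EG output: 0 before `toggle_at`, 1 from then on."""
    return [1 if t >= toggle_at else 0 for t in range(length)]


def combine_edges(edge_samples):
    """XOR the edge-generator outputs sample by sample, as in Eq. (1)."""
    return [reduce(xor, sample) for sample in zip(*edge_samples)]


# Four EGs toggling at sample indices 2, 5, 9, and 12 (illustrative values).
edges = [step(2, 16), step(5, 16), step(9, 16), step(12, 16)]
symbol = combine_edges(edges)
print(symbol)
```

Under this model, every transition on any of the four inputs toggles the combined output, which is why an XOR gate can merge edges produced by generators that take turns.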

Figure 2 shows the timing diagram of the proposed TG generating timing sequence signals. First, the target delay codes of the coarse delay generator, multi-phase generator, and fine delay generator are loaded. The coarse delay generator counts up to the target delay code and raises the enable signal $CDG_0$ on completion. One of the outputs of the multi-phase generator is then selected, and the selected signal $MPG_0$ is delayed by the fine delay generator; $EDGE_0$ is generated by selecting delay codes from the delay code table according to the target delay. The offset canceler measures the edge delays of the fine delay generator and selects the appropriate delay codes according to the demanded resolution and the length of the delay chain to update the delay code table. The output signals of the four edge generators operating in parallel, $EDGE_0$ to $EDGE_3$, are combined by the edge combiner to generate the channel outputs, $SYMBOL_0$ to $SYMBOL_n$.

**FIG. 1.** Overall architecture of the proposed timing generator.

**FIG. 2.** Timing diagram of the proposed timing generator.

**FIG. 3.** (a) Structure of the fine delay generator. (b) The schematic of the IDELAY interpolation method.

The timing signal passes through the interpolation stages in sequence before being output, and the output time of the signal edges can be described as

$$t_{output} = N_{coarse} \times t_{clk} + N_1 \times t_1 + N_2 \times t_2, \quad (2)$$

where  $N_{coarse}$  is the delay code of the coarse delay generator, and  $t_{clk}$  is 3.75 ns, which is the period of the  $CLK_{mas}$ .  $N_1$  and  $N_2$  stand for the delay code of the multi-phase generator and the fine delay generator, respectively. The average bin sizes for the two interpolation stages are represented by  $t_1$  and  $t_2$ , with  $t_1$  being 625 ps and  $t_2$  being 5 ps.

In practice, Eq. (2) is realized through a delay code table that contains the desired delays and the corresponding delay codes.
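The decomposition implied by Eq. (2) can be sketched in a few lines of Python. This is only an idealized illustration: it uses the average bin sizes quoted above ($t_{clk}$ = 3750 ps, $t_1$ = 625 ps, $t_2$ = 5 ps) and assumes perfectly uniform bins, whereas the actual design looks up calibrated codes in the delay code table. The function names are hypothetical.

```python
T_CLK_PS = 3750  # master clock period t_clk
T1_PS = 625      # average bin of the multi-phase generator, t_1
T2_PS = 5        # average bin of the fine delay generator, t_2


def delay_to_codes(target_ps):
    """Split a target delay in integer ps into (N_coarse, N_1, N_2)."""
    if target_ps < 0 or target_ps % T2_PS:
        raise ValueError("target must be a non-negative multiple of 5 ps")
    n_coarse, rest = divmod(target_ps, T_CLK_PS)
    n1, rest = divmod(rest, T1_PS)
    n2 = rest // T2_PS
    return n_coarse, n1, n2


def codes_to_delay(n_coarse, n1, n2):
    """Evaluate Eq. (2) for ideal, uniform bins."""
    return n_coarse * T_CLK_PS + n1 * T1_PS + n2 * T2_PS


print(delay_to_codes(4480))     # (1, 1, 21)
print(delay_to_codes(1020030))  # (272, 0, 6)
```

For example, 4480 ps splits into one clock period (3750 ps), one multi-phase step (625 ps), and 21 fine steps (105 ps).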



**FIG. 4.** (a) Structure of the offset canceler. (b) The diagram of the capture process. (c) The operating principle of the select algorithm.

### B. Fine delay generator

The structure of the fine delay generator is illustrated in Fig. 3(a). The output of the multi-phase generator is fed into the fine delay generator, which consists of an $N$-stage IDELAY interpolation and a carry chain. The carry-chain selector is a multiplexer that selects which carry-chain delay is output, using the corresponding delay codes in the delay code table. IDELAY3 operates in “TIME-FIXED” mode,<sup>35</sup> in which the delay value is fixed and automatically calibrated against temperature and voltage, with a dynamic range of 0–1250 ps and a resolution of 1 ps. The schematic of the IDELAY interpolation method is shown in Fig. 3(b). The carry-chain delays are configured by different IDELAY values, and the resulting delays are combined by the IDELAY selector, a 4-to-1 multiplexer, to construct finer delay bins. The IDELAY interpolation can be cascaded to obtain a higher resolution, and the average resolution of a fine delay generator applying $N$-stage IDELAY interpolation can be calculated by

$$t_{resolution} = \frac{T_{dr2}}{4^N \times M}, \quad (3)$$

where $T_{dr2}$ is the dynamic range of the fine delay generator with $N$-stage IDELAY interpolation, $N$ is the number of IDELAY interpolation stages, and $M$ is the number of bins in the carry chain.
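To make the scaling in Eq. (3) concrete, the sketch below evaluates it for 0, 1, and 2 interpolation stages. The values $T_{dr2}$ = 640 ps and $M$ = 64 are hypothetical round numbers chosen for illustration; they are not reported in the paper.

```python
def average_resolution_ps(t_dr2_ps, n_stages, m_bins):
    """Average resolution from Eq. (3): T_dr2 / (4**N * M)."""
    return t_dr2_ps / (4 ** n_stages * m_bins)


# Hypothetical dynamic range and carry-chain length (not from the paper).
T_DR2_PS = 640.0
M_BINS = 64

for n in range(3):
    print(n, average_resolution_ps(T_DR2_PS, n, M_BINS))
```

With these assumed values the average resolution goes from 10 ps to 2.5 ps to 0.625 ps. Each additional stage divides the average bin by four, because each stage selects among four IDELAY settings.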

The IDELAY interpolation and the carry-chain obtain the target delay code from the delay code table based on the target delay. The fine delay generator generates $EDGE_0$ in the next cycle of $CDG_0$ with improved resolution and linearity.

### C. Offset canceler

Figure 4(a) illustrates the structure of the offset canceler, which uses an IDELAY3 in VAR\_LOAD mode to capture edges generated with different delay codes and to measure all delay values of the EG. In VAR\_LOAD mode,<sup>35</sup> the IDELAY3 functions as a tapped delay line with a 4.2 ps resolution and a dynamic range of 2.15 ns, and its delay value is modified dynamically through the tap value. The capture process is demonstrated in Fig. 4(b). Initially, the delay code of the EG and the tap value of IDELAY3 are set to their initial values. The signal generated by the EG is captured at the rising edge of the IDELAY3 output by the capturer, which is composed of a D-flip-flop (DFF) and an asynchronous first-in-first-out (FIFO) buffer. The rising edge of the IDELAY3 output can be adjusted by the IDELAY3 controller, and the capture results are stored in the asynchronous FIFO. The zero/one judgment declares a successful capture when the capturer's results show a transition from “0” to “1” and the count of “1”s exceeds a threshold. The IDELAY3 controller then records the current tap value of IDELAY3 as the delay value corresponding to the initial delay code. Each subsequent change of the delay code triggers the same capture process. Once the delay value of the last delay code has been measured, the select algorithm updates the delay code table based on the required time resolution and the number of bins. The operating principle of the select algorithm is shown in Fig. 4(c). At this point, all delay codes and their corresponding tap values have been stored in the FPGA by the capture process. For each ideal tap value, the select algorithm picks the measured tap value with the minimum difference and outputs the corresponding delay code. The calibrated delay values with improved linearity are finally obtained according to the delay code table.
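The capture and select steps described above can be sketched in software as follows. This is a simplified model under stated assumptions, not the authors' FPGA logic: `capture` is a hypothetical callable returning the DFF sample (0 or 1) for a given tap value, the sample count and threshold are arbitrary, and ideal tap values are assumed to be evenly spaced starting from the smallest measured tap value.

```python
def find_edge_tap(capture, num_taps, samples=64, threshold=48):
    """Sweep the tap value and return the first tap at which the captured
    samples change from mostly 0 to mostly 1 (count of 1s > threshold)."""
    previous_high = None
    for tap in range(num_taps):
        ones = sum(capture(tap) for _ in range(samples))
        high = ones > threshold
        if previous_high is False and high:
            return tap
        previous_high = high
    return None  # no 0-to-1 transition found in the sweep


def select_codes(measured_taps, step_taps, num_bins):
    """For each ideal tap value, pick the delay code whose measured tap
    value is closest (minimum absolute difference)."""
    base = min(measured_taps.values())
    table = []
    for k in range(num_bins):
        ideal = base + k * step_taps
        best = min(measured_taps, key=lambda code: abs(measured_taps[code] - ideal))
        table.append(best)
    return table


# Toy capture model: the sampled EG level is 0 before tap 7 and 1 afterwards.
edge_tap = find_edge_tap(lambda tap: int(tap >= 7), num_taps=16)

# Toy measurement results: delay code -> measured tap value.
measured = {0: 10, 1: 11, 2: 13, 3: 14, 4: 16, 5: 17, 6: 19}
table = select_codes(measured, step_taps=3, num_bins=3)
print(edge_tap, table)  # 7 [0, 2, 4]
```

In the toy data, the ideal tap values are 10, 13, and 16, so codes 0, 2, and 4 are kept and the intermediate codes are discarded. Trading unused codes for more uniform bins is the idea behind updating the delay code table.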



**FIG. 5.** Effect of the stages of IDELAY interpolation on bin size (a), DNL (b), and INL (c) of the 5 ps resolution fine delay generator. The black lines, blue lines, and red lines represent the results without IDELAY interpolation, with 1-stage IDELAY interpolation, and with 2-stage IDELAY interpolation, respectively.

## III. PERFORMANCE CHARACTERIZATION

### A. Time resolution and non-linearity

Figure 5 shows the effect of the number of IDELAY interpolation stages on the cell delay and non-linearity of the 5 ps resolution fine delay generator. The results were obtained with a time-to-digital converter (TDC) with 1.8 ps resolution and 5.4 ps root mean square (rms) precision, which measured the time intervals between the start signal and the stop signal from the fine delay generator under test. The TDC is based on the multichain averaging method described in Ref. 34. The black lines represent the results for the fine delay generator without IDELAY interpolation, which has a maximum bin size of 66 ps and poor linearity: the differential non-linearity (DNL) ranges from $-10.5$ to $+57.6$ ps, and the integral non-linearity (INL) is within $-17.7/+116.3$ ps. This non-linearity arises mainly from the carry-chain selector and from routing paths of different lengths. Thus, it is rather difficult to realize a picosecond resolution TG using only a carry chain for time interpolation. As mentioned in Sec. II, IDELAY interpolation can be used to improve the resolution and linearity of the delay chain. To achieve an average resolution of 5 ps, we apply 1-stage and 2-stage IDELAY interpolation and select the appropriate delay codes to generate the edge. With 1-stage IDELAY interpolation (blue lines), the DNL is $-4.8/+8.9$ ps and the INL is $-9.6/+9.2$ ps. With 2-stage IDELAY interpolation (red lines), the DNL and INL improve further to $-1.5/+1.4$ and $-1.5/+0.9$ ps, respectively. The resolution and linearity of the proposed TG can be improved further by increasing the number of IDELAY interpolation stages, and the improvement from no IDELAY interpolation to 2-stage IDELAY interpolation is significant.

The cell delay and non-linearity of a single TG channel with a 3.75 ns master clock period are shown in Fig. 6. The bin sizes of the proposed TG were measured with the high-precision TDC described above, by measuring the time intervals between the reference clock and the signal from the TG channel under test. To achieve better linearity, we selected the appropriate delay values based on the demanded time resolution of 5 ps. The bin sizes of the two interpolation stages range from 555 to 684 ps and from 3.0 to 6.7 ps, as shown in Figs. 6(a) and 6(b), respectively. The two interpolation stages are combined so that the 3.75 ns master clock period is fully interpolated. The bin size and non-linearity of a single TG channel are shown in Figs. 6(c)–6(e). After combining the two interpolation stages, the bin size ranges from 2.6 to 7.1 ps, the DNL is $-2.4/+2.3$ ps, and the INL is within $-2.8/+1.6$ ps.
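For readers who want to reproduce DNL and INL figures from measured bin sizes, the sketch below uses a common convention expressed in picoseconds: DNL is the deviation of each bin from the ideal bin, and INL is the running sum of the DNL. The paper does not spell out its exact definition, so this convention is an assumption, and the bin values are made up for illustration.

```python
from itertools import accumulate


def dnl_inl(bin_sizes_ps, ideal_bin_ps):
    """Return (DNL, INL) in ps for a list of measured bin sizes."""
    dnl = [b - ideal_bin_ps for b in bin_sizes_ps]
    inl = list(accumulate(dnl))  # cumulative deviation from the ideal line
    return dnl, inl


# Illustrative bin sizes in ps (not measured data) against a 5 ps ideal bin.
dnl, inl = dnl_inl([5.5, 4.0, 6.0, 4.5], ideal_bin_ps=5.0)
print(dnl)  # [0.5, -1.0, 1.0, -0.5]
print(inl)  # [0.5, -0.5, 0.5, 0.0]
```

The quoted ranges such as $-2.4/+2.3$ ps then correspond to the minimum and maximum of such a list.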

### B. Oscilloscograms

The characteristics of the signals from the TG channels were measured using a Tektronix oscilloscope DPO5104B, as illustrated in Fig. 7. Figure 7(a) shows that the falling edges of the output signals exhibit a sweeping step size of 1 ns with a dynamic range of 29 ns. In Fig. 7(b), the falling edges demonstrate a fine step resolution of 10 ps, while Fig. 7(c) illustrates the rising edges with an even finer 5 ps step. These oscilloscograms clearly validate the TG's capability to generate pulse signals with picosecond-level resolution. Moreover, the results confirm that the timing of both rising and falling edges can be adjusted dynamically, thereby enabling the proposed TG to achieve precise picosecond-level timing generation.



**FIG. 6.** Cell delay and non-linearity of a single TG channel with a 3.75 ns master clock period: (a) the bin size of the multi-phase generator, (b) the bin size of the fine delay generator, (c)–(e) the bin size and non-linearity of a single TG channel.



**FIG. 7.** Oscilloscograms of output signals from the TG. (a) The signal falling edges with a 1 ns step. (b) The signal falling edges with a 10 ps step. (c) The signal rising edges with a 5 ps step.

### C. Process, voltage, and temperature variations

To evaluate the stability of the TG performance against process, voltage, and temperature (PVT) variations, the INL of the 5 ps resolution TG using 2-stage IDELAY interpolation was tested from 15 to 65 °C, both with and without offset correction. The FPGA chip heated up naturally, and the temperature was monitored by the system monitor in the FPGA. To better demonstrate the effect of the offset canceler, the TG was corrected at 25 °C as the baseline. The effectiveness of the offset canceler is shown in Fig. 8(a). Temperature variations significantly affect the INL of each bin of the TG and cause an offset relative to the baseline at 25 °C. Without correction, the INL range is $-45.1/+5.6$, $+30.7/+78.6$, and $+88.6/+136.5$ ps at operating temperatures of 15, 35, and 65 °C, respectively. With offset correction, the INL stays within $-3.9/+4.4$ ps over the whole temperature range, showing that the offset canceler keeps the TG operating stably. The same process was used to evaluate the effect of the internal supply voltage (VCCINT) on the TG, as shown in Fig. 8(b). The TG was calibrated at 0.97 V as the baseline. Without correction, the INL ranges are $+134.8/+369.7$, $-317.9/-142.0$, and $-670.2/-296.4$ ps at VCCINT values of 0.95, 0.99, and 1.01 V, respectively. In contrast, with correction, the worst-case INL stays within $-4.7/+4.6$ ps across all VCCINT values. In addition, we tested the bin sizes and INL of the 5 ps resolution TG using 2-stage IDELAY interpolation in two different FPGA chips, as shown in Figs. 8(c) and 8(d). Owing to the robust offset canceler, the average bin size remains stable at 5 ps, and the INL stays within $-3.2/+3.7$ ps. With the help of the offset canceler, the proposed TG is insensitive to PVT variations.

### D. Dynamic range and standard deviation

The standard deviation (STD) is a key metric for evaluating the performance of a TG, as it characterizes the stability of the output timing signals. The STD of the TG was measured using a high-precision TDC over both short and long dynamic ranges. Pulse signals with fixed time intervals were generated by two TG channels, and the arrival times of these pulses were measured and recorded by the TDC 5000 times. The distributions of the measured time intervals were analyzed, and the STD values, divided by $\sqrt{2}$, can be used to quantify the time jitter of the TG. The STD within a clock period of 0–3.75 ns ranged from 9.0 to 10.1 ps, as shown in Fig. 9(a). For time intervals ranging from 1 ns to 10 s, the STD varied between 9.4 and 14.1 ps, as illustrated in Fig. 9(b). Figure 9(c) presents the time distribution plot for measuring a time interval of 1020.03 ns, with an STD value of 11.9 ps.

**FIG. 8.** Effect of the offset canceler against the process, voltage, and temperature variations. (a) The INL of TG in three temperatures: 15, 35, and 65 °C with correction and without correction. (b) The INL of TG in three voltages: 0.95, 0.99, and 1.01 V with correction and without correction. (c) The bin size of TG in two FPGAs with correction. (d) The INL of TG in two FPGAs with correction.

**FIG. 9.** Test results of the standard deviation and dynamic range measurement for the TG. (a) The standard deviation (STD) value is around 9.5 ps when the time interval varies from 0 to 3.75 ns. (b) The STD value is within 9.4–14.1 ps when the time interval varies from 1 ns to 10 s. (c) The time distribution plot of the STD measurement when measuring time intervals of 1020.03 ns.
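The jitter estimate used in this subsection can be written out as a short calculation. The sketch assumes that the two TG channels contribute equal and independent jitter, which is what motivates the division by $\sqrt{2}$; the contribution of the TDC itself is not removed here. The function names are hypothetical.

```python
import math
import statistics


def jitter_from_std(std_ps):
    """Single-channel jitter estimate: interval STD divided by sqrt(2)."""
    return std_ps / math.sqrt(2)


def jitter_from_intervals(intervals_ps):
    """Same estimate, starting from a list of measured time intervals."""
    return jitter_from_std(statistics.stdev(intervals_ps))


# STD reported for the 1020.03 ns interval in Fig. 9(c).
print(round(jitter_from_std(11.9), 2))  # 8.41
```

Applied to the 11.9 ps STD of Fig. 9(c), this gives roughly 8.4 ps per channel under the stated assumptions.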

## IV. DISCUSSION

### A. Time resolution, non-linearity, and latency

The time resolution of the proposed TG is determined by the cell delay of the fine delay generator. Increasing the number of IDELAY interpolation stages can enhance the timing resolution beyond the limitations of intrinsic carry-chain cell delays and non-linearity, at the cost of higher FPGA resource usage. The $N$-stage IDELAY interpolation cascade improves the average resolution of the carry chain as given by Eq. (3). With 2-stage IDELAY interpolation, the average timing resolution of the TG improves from 10 to 0.7 ps, but its linearity is poor. Therefore, it is necessary to select the appropriate delay codes to improve the linearity. The performance improvement from no IDELAY interpolation to 2-stage IDELAY interpolation is significant. However, increasing the number of stages beyond two yields little further gain, because other factors, such as system clock jitter and noise interference, then dominate the TG performance. Moreover, increasing the number of IDELAY interpolation stages introduces additional latency, which extends the dead time of the TG from two to three clock cycles. With only one edge generator, this increase in dead time would reduce the test rate from one third to one fourth of the master clock frequency. To mitigate the impact on system performance, four parallel edge generators are employed to form the TG.
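The relation between dead time and test rate can be illustrated with a small calculation. It assumes that an EG is busy for its dead time plus one clock cycle per edge, that parallel EGs take turns ideally, and that a channel produces at most one edge per master clock cycle. The last two points are modeling assumptions of this sketch rather than statements from the paper.

```python
def edge_rate_mhz(f_clk_mhz, dead_time_cycles, num_generators=1):
    """Maximum edge rate for interleaved edge generators.

    Each generator needs (dead_time_cycles + 1) clock cycles per edge; the
    combined rate is capped at one edge per clock cycle (assumption).
    """
    cycles_per_edge = dead_time_cycles + 1
    return f_clk_mhz * min(1.0, num_generators / cycles_per_edge)


F_CLK_MHZ = 266.5  # master clock frequency quoted in Sec. II

print(edge_rate_mhz(F_CLK_MHZ, dead_time_cycles=2))                    # one third of f_clk
print(edge_rate_mhz(F_CLK_MHZ, dead_time_cycles=3))                    # 66.625
print(edge_rate_mhz(F_CLK_MHZ, dead_time_cycles=3, num_generators=4))  # 266.5
```

Under these assumptions, four generators with a three-cycle dead time are exactly enough to recover the full master clock rate.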

### B. Process, voltage, and temperature variations

The delay time of a delay chain depends on PVT variations. Process deviations in chip manufacturing result in delay time variations among different types or batches of FPGAs. Voltage and temperature variations affect electron mobility within the FPGA, thereby influencing the delay time of the delay chain. Despite these variations, the proposed offset canceler measures all delay values of the TG, and the select algorithm updates the delay code table based on the required time resolution and the delay chain length. The calibrated TG, with improved linearity, generates signals according to the delay code table. The TG channels are tested to be fully functional in two different FPGAs with the operating temperature continuously varying from 15 to 65 °C and the voltage ranging from 0.95 to 1.01 V. The test results indicate that an increase in temperature causes the offset to grow, while an increase in voltage leads to a reduction in the offset. Furthermore, the impact of voltage variations on TG performance is more significant than that of temperature variations. Offset correction is executed every minute, ensuring stable operation under various conditions over extended periods.

**TABLE I.** FPGA resource consumption of the proposed TG.

| Resource     | Single offset canceler | Single channel TG | 8-channel TGs (with DDR4) | Available | Occupation (%) |
|--------------|------------------------|-------------------|---------------------------|-----------|----------------|
| 6-input LUTs | 226                    | 1376              | 23 775                    | 331 680   | 7.1            |
| Flip-flops   | 347                    | 1952              | 32 413                    | 663 360   | 4.8            |
| Slices       | 225                    | 1228              | 4828                      | 41 460    | 11.6           |
| IDELAYs      | 2                      | 40                | 385                       | 520       | 74.0           |
| BRAMs        | 3                      | 16                | 62.5                      | 1080      | 5.7            |

**FIG. 10.** Layout of 8-channel TGs in FPGA.

### C. Resource occupation

The proposed TG is implemented in a Kintex UltraScale FPGA, where resource utilization is a crucial consideration. Table I lists the resource utilization in the Kintex UltraScale FPGA XCKU060FFVA1156, which offers sufficient resources to support the implementation of multi-channel TGs. Because it relies on IDELAY3 resources, the proposed TG can be implemented on the UltraScale or UltraScale+ FPGA platform. To demonstrate the scalability of the proposed TG, we realized 8-channel TGs with a Double-Data-Rate-IV (DDR4) memory storing the test pattern data, including period, edge, wave format, etc. The layout of the 8-channel TGs in the FPGA is shown in Fig. 10. The resource utilization shows that IDELAY resources are the bottleneck for the scalability of the proposed TG. Exploring the possibility of using other resources to replace the IDELAYs in the fine delay generators is left as future work.
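A rough budget calculation shows why the IDELAYs limit scaling. The numbers come from Table I; the assumption, not stated in the paper, is that the IDELAYs in the 8-channel build beyond eight times the single-channel count are a fixed overhead (for example, the DDR4 interface) that does not grow with the channel count.

```python
IDELAY_AVAILABLE = 520     # Table I, available on the device
IDELAY_PER_CHANNEL = 40    # Table I, single-channel TG
IDELAY_8CH_BUILD = 385     # Table I, 8-channel TGs with DDR4


def max_channels(available, per_channel, overhead):
    """Largest channel count that fits the IDELAY budget."""
    return (available - overhead) // per_channel


# Assumed fixed overhead: IDELAYs not accounted for by the eight channels.
overhead = IDELAY_8CH_BUILD - 8 * IDELAY_PER_CHANNEL
print(overhead)                                                        # 65
print(max_channels(IDELAY_AVAILABLE, IDELAY_PER_CHANNEL, overhead))    # 11
```

Under this assumption, only about eleven channels would fit on the device, while LUTs, flip-flops, and BRAMs remain well below 15 % occupation.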

### D. Comparison with other timing generators under PVT variations

Table II compares this work with other timing generators characterized under PVT variations. The key parameters, including timing adjustment resolution, dynamic range, non-linearity, and PVT insensitivity, are summarized in the table. In Ref. 28, a 200 ps resolution timing formatter is designed and implemented, with the maximum INL reduced from 180 to 44 ps through temperature compensation. Reference 25 presents a TG based on a carry-chain, tested in a laboratory environment at a temperature of 25 °C; its maximum INL range, however, expands under PVT variations. In Ref. 27, a digital-to-time converter with 14.2 ps resolution is implemented using digital signal processing (DSP) blocks, and its maximum non-linearity range is tested in 16 different DSP locations, with voltage variations from 1.14 to 1.22 V and temperature variations from 0 to 60 °C. Compared to Refs. 25, 27, and 28, this work has a finer resolution, better linearity, and a wider dynamic range. In particular, the proposed TG with the offset canceler is PVT-insensitive, ensuring stable operation in various environments over extended periods.

**TABLE II.** Comparison between this work and other timing generators under PVT variations. Boldface denotes that this work achieves the best performance.

| References, year | Type     | Resolution (ps) | Dynamic range | INL (ps)                   | DNL (ps)        | PVT-insensitive     |
|------------------|----------|-----------------|---------------|----------------------------|-----------------|---------------------|
| 28, 2017         | FPGA     | 200             | 13 ns         | -44.2/+44.2 <sup>a</sup>   | N/A             | N/A                 |
| 25, 2021         | FPGA     | 11.3            | 0.1 s         | -80/+23.2 <sup>b</sup>     | -11.3/+21.9     | N/A                 |
| 27, 2021         | FPGA     | 14.2            | 10.9 ns       | -14.2/+312.44 <sup>c</sup> | -288.52/+267.79 | N/A                 |
| **This work**    | **FPGA** | **5**           | **10 s**      | **-4.7/+4.6**              | **-4.3/+4.2**   | **PVT-insensitive** |

<sup>a</sup>Maximum range under temperature at 31 and 44 °C.

<sup>b</sup>Maximum range under temperature at 36 °C.

<sup>c</sup>Maximum range under process variation, voltage from 1.14 to 1.22 V, and temperature from 0 to 60 °C.

## V. CONCLUSION

To achieve picosecond resolution timing generation resilient to PVT variations, this paper presents an FPGA-based TG using the two-stage MTI method. The TG incorporates four edge generators per channel, providing a resolution of 5 ps and a maximum dynamic range of 10 s. Characterization of the proposed TG shows that increasing the number of IDELAY interpolation stages helps improve the timing generation performance. We also discussed the effects of PVT variations on timing generation and calibrated the delay offset for TG channels using an offset canceler. A robust offset canceler is integrated into the FPGA design, ensuring strong stability of TG channels under PVT variations. The proposed TG exhibits flexibility in generating precise timing sequences and stability in diverse environments over extended periods, making it highly suitable for ATE applications.

## AUTHOR DECLARATIONS

### Conflict of Interest

The authors have no conflicts to disclose.

### Author Contributions

**Zeyu Guo:** Conceptualization (equal); Data curation (equal); Formal analysis (equal); Investigation (equal); Methodology (equal); Project administration (equal); Software (equal); Validation (equal); Visualization (equal); Writing – original draft (equal); Writing – review & editing (equal). **Liangqi Gui:** Funding acquisition (equal); Project administration (equal); Resources (equal); Supervision (equal); Writing – original draft (equal); Writing – review & editing (equal). **Kai Sheng:** Conceptualization (equal); Formal analysis (equal); Methodology (equal); Validation (equal).

## DATA AVAILABILITY

The data that support the findings of this study are available within the article.

## REFERENCES

- <sup>1</sup>H. Bernien, B. Hensen, W. Pfaff, G. Koolstra, M. S. Blok, L. Robledo, T. H. Taminiau, M. Markham, D. J. Twitchen, L. Childress, and R. Hanson, "Heralded entanglement between solid-state qubits separated by three metres," *Nature* **497**, 86–90 (2013).
- <sup>2</sup>X. Rong, J. Geng, F. Shi, Y. Liu, K. Xu, W. Ma, F. Kong, Z. Jiang, Y. Wu, and J. Du, "Experimental fault-tolerant universal quantum gates with solid-state spins under ambient conditions," *Nat. Commun.* **6**, 8748 (2015).
- <sup>3</sup>P. Neumann, N. Mizuochi, F. Rempp, P. Hemmer, H. Watanabe, S. Yamasaki, V. Jacques, T. Gaebel, F. Jelezko, and J. Wrachtrup, "Multiparticle entanglement among single spins in diamond," *Science* **320**, 1326–1329 (2008).
- <sup>4</sup>Y. Wu, W. Liu, J. Geng, X. Song, X. Ye, C.-K. Duan, X. Rong, and J. Du, "Observation of parity-time symmetry breaking in a single-spin system," *Science* **364**, 878–880 (2019).
- <sup>5</sup>J. Zopes, K. Sasaki, K. S. Cujia, J. M. Boss, K. Chang, T. F. Segawa, K. M. Itoh, and C. L. Degen, "High-resolution quantum sensing with shaped control pulses," *Phys. Rev. Lett.* **119**, 260501 (2017).
- <sup>6</sup>G. Balasubramanian, I. Y. Chan, R. Kolesov, M. Al-Hmoud, J. Tisler, C. Shin, C. Kim, A. Wojcik, P. R. Hemmer, A. Krueger, T. Hanke, A. Leitenstorfer, R. Bratschitsch, F. Jelezko, and J. Wrachtrup, "Nanoscale imaging magnetometry with diamond spins under ambient conditions," *Nature* **455**, 648–651 (2008).
- <sup>7</sup>J. M. Boss, K. S. Cujia, J. Zopes, and C. L. Degen, "Quantum sensing with arbitrary frequency resolution," *Science* **356**, 837–840 (2017).
- <sup>8</sup>X. Rong, M. Wang, J. Geng, X. Qin, M. Guo, M. Jiao, Y. Xie, P. Wang, P. Huang, F. Shi, Y.-F. Cai, C. Zou, and J. Du, "Searching for an exotic spin-dependent interaction with a single electron-spin quantum sensor," *Nat. Commun.* **9**, 739 (2018).
- <sup>9</sup>K. Liang, M. Zhu, X. Qin, Z. Meng, P. Wang, and J. Du, "Field-programmable-gate-array based hardware platform for nitrogen-vacancy center based fast magnetic imaging," *Rev. Sci. Instrum.* **95**, 024701 (2024).
- <sup>10</sup>Y. Wu, M. Shahmohammadi, Y. Chen, P. Lu, and R. B. Staszewski, "A 3.5–6.8-GHz wide-bandwidth DTC-assisted fractional-N all-digital PLL with a mesh $\Delta\Sigma$-TDC for low in-band phase noise," *IEEE J. Solid-State Circuits* **52**, 1885–1903 (2017).
- <sup>11</sup>W. Wu, C.-W. Yao, K. Godbole, R. Ni, P.-Y. Chiang, Y. Han, Y. Zuo, A. Verma, I. S.-C. Lu, S. W. Son, and T. B. Cho, "A 28-nm 75-fs<sub>rms</sub> analog fractional-$N$ sampling PLL with a highly linear DTC incorporating background DTC gain calibration and reference clock duty cycle correction," *IEEE J. Solid-State Circuits* **54**, 1254–1265 (2019).
- <sup>12</sup>T. Okayasu, M. Suda, K. Yamamoto, S. Kantake, S. Sudou, and D. Watanabe, "1.83 ps-resolution CMOS dynamic arbitrary timing generator for >4 GHz ATE applications," in *2006 IEEE International Solid State Circuits Conference—Digest of Technical Papers* (IEEE, 2006), pp. 2122–2131.
- <sup>13</sup>A. Syed, "RIC/DICMOS—multichannel CMOS formatter," in *International Test Conference, 2003. Proceedings. ITC 2003* (IEEE, 2003), Vol. 1, pp. 175–184.
- <sup>14</sup>T.-Y. Wang, S.-M. Lin, and H.-W. Tsao, "Multiple channel programmable timing generators with single cyclic delay line," *IEEE Trans. Instrum. Meas.* **53**, 1295–1303 (2004).
- <sup>15</sup>M. Suda, K. Yamamoto, T. Okayasu, S. Kantake, S. Sudou, and D. Watanabe, "CMOS high-speed, high-precision timing generator for 4.266-Gbps memory test system," in *IEEE International Conference on Test, 2005* (IEEE, 2005), pp. 858–866.
- <sup>16</sup>Y.-Y. Chen, J.-L. Huang, T. Kuo, and X.-L. Huang, "Design and implementation of an FPGA-based data/timing formatter," *J. Electron. Test.* **31**, 549–559 (2015).
- <sup>17</sup>B. Arkin, "Realizing a production ATE custom processor and timing IC containing 400 independent low-power and high-linearity timing verniers," in *2004 IEEE International Solid-State Circuits Conference (IEEE Cat. No. 04CH37519)* (IEEE, 2004), pp. 348–349.
- <sup>18</sup>D.-H. Jung, K. Ryu, J.-H. Park, and S.-O. Jung, "All-digital process-variation-calibrated timing generator for ATE with 1.95-ps resolution and maximum 1.2-GHz test rate," *IEEE Trans. Very Large Scale Integr. Syst.* **26**, 1015–1025 (2018).
- <sup>19</sup>T. Kawamura, Y. Ohtomo, K. Nishimura, and N. Ishihara, "A 1 ps-resolution 2 ns-span 10 Gb/s data-timing generator with spectrum conversion," in *2008 IEEE International Solid-State Circuits Conference—Digest of Technical Papers* (IEEE, 2008), pp. 456–467.
- <sup>20</sup>T. Chen, Q. Wang, Z. Wang, Z. Qi, Z. Han, L. Liu, and S. Hu, "A wide-range and high-linearity timing generator with 1.25 ps-resolution," in *2023 8th International Conference on Integrated Circuits and Microsystems (ICICM)* (IEEE, 2023), pp. 533–537.
- <sup>21</sup>J.-C. Liu, C.-J. Huang, and P.-Y. Lee, "A high-accuracy programmable pulse generator with a 10-ps timing resolution," *IEEE Trans. Very Large Scale Integr. Syst.* **26**, 621–629 (2018).
- <sup>22</sup>W.-Z. Zhang, X. Qin, L. Wang, Y. Tong, Y. Rui, X. Rong, and J.-F. Du, "A fully-adjustable picosecond resolution arbitrary timing generator based on multi-stage time interpolation," *Rev. Sci. Instrum.* **90**, 114702 (2019).
- <sup>23</sup>X. Qin, W.-Z. Zhang, L. Wang, Y. Tong, H. Yang, Y. Rui, X. Rong, and J.-F. Du, "A picosecond resolution arbitrary timing generator based on time folding and time interpolating," *Rev. Sci. Instrum.* **89**, 074701 (2018).
- <sup>24</sup>D. Kong, Z. Fu, H. Liu, and S. Gao, "A multi-functional arbitrary timing generator based on a digital-to-time converter," *Rev. Sci. Instrum.* **94**, 104702 (2023).
- <sup>25</sup>L. Wang, Y. Tong, X. Qin, W.-Z. Zhang, X. Rong, and J. Du, "A field-programmable-gate-array based high time resolution arbitrary timing generator with a time folding method utilizing multiple carry-chains," *Rev. Sci. Instrum.* **92**, 014701 (2021).
- <sup>26</sup>G. Mazin, A. Stejskal, M. Dudka, and M. Ježek, "Non-blocking programmable delay line with minimal dead time and tens of picoseconds jitter," *Rev. Sci. Instrum.* **92**, 114712 (2021).
- <sup>27</sup>P. Kwiatkowski, "Digital-to-time converter for test equipment implemented using FPGA DSP blocks," *Measurement* **177**, 109267 (2021).
- <sup>28</sup>Y.-K. Huang, K.-T. Li, C.-L. Hsiao, C.-A. Lee, J.-L. Huang, and T. Kuo, "Design and implementation of an EG-pool based FPGA formatter with temperature compensation," in *2017 IEEE 26th Asian Test Symposium (ATS)* (IEEE, 2017), pp. 88–93.
- <sup>29</sup>G.-H. Hou, W.-C. Huang, J.-L. Huang, and T. Kuo, "Design and implementation of an FPGA-based 16-channel data/timing formatter," in *2018 IEEE 27th Asian Test Symposium (ATS)* (IEEE, 2018), pp. 209–214.
- <sup>30</sup>Z. Chen, X. Wang, Z. Zhou, R. Moro, and L. Ma, "A simple Field Programmable Gate Array (FPGA) based high precision low-jitter delay generator," *Rev. Sci. Instrum.* **92**, 024701 (2021).
- <sup>31</sup>W. Qiu, J. Xie, Q. Liu, and X. Han, "A low-jitter timing generator based on completely on-chip self-measurement and calibration in a field programmable gate array," *Rev. Sci. Instrum.* **92**, 114703 (2021).
- <sup>32</sup>P. Chen, P.-Y. Chen, J.-S. Lai, and Y.-J. Chen, "FPGA vernier digital-to-time converter with 1.58 ps resolution and 59.3 minutes operation range," *IEEE Trans. Circuits Syst. I* **57**, 1134–1142 (2010).
- <sup>33</sup>Y. Wang, Q. Cao, and C. Liu, "A multi-chain merged tapped delay line for high precision time-to-digital converters in FPGAs," *IEEE Trans. Circuits Syst. II* **65**, 96–100 (2018).
- <sup>34</sup>Q. Shen, S. Liu, B. Qi, Q. An, S. Liao, P. Shang, C. Peng, and W. Liu, "A 1.7 ps equivalent bin size and 4.2 ps RMS FPGA TDC based on multichain measurements averaging method," *IEEE Trans. Nucl. Sci.* **62**, 947–954 (2015).
- <sup>35</sup>Xilinx, UltraScale Architecture SelectIO Resources, 2023, <https://docs.xilinx.com/r/en-US/ug571-ultrascale-selectio>.