

## ISSCC 2009 / SESSION 19 / ANALOG TECHNIQUES / 19.4

19.4 A 65nm CMOS Comparator with Modified Latch to Achieve 7GHz/1.3mW at 1.2V and 700MHz/47$\mu$W at 0.6V

Bernhard Goll, Horst Zimmermann

Vienna University of Technology, Vienna, Austria

Clocked regenerative comparators, which use the positive feedback of a latch to force a fast decision, are used in many applications. In [1] a 10GHz 3-stage comparator in 1.2V 0.11$\mu$m CMOS is presented, designed to extract every 4<sup>th</sup> bit of a 40Gb/s data stream. A BER<10<sup>-12</sup> for V<sub>pp</sub> at the input is achieved. Depending on the intended application, the constant tail current and the low voltage swing of the CML blocks may or may not be beneficial. In [2] a latch-type sense amplifier (in 1.5V 0.13$\mu$m CMOS) for use in SRAMs is investigated. The delay time is 119ps for an input voltage difference of 100mV. A disadvantage is that proper operation requires a sufficiently large supply voltage because of the stack of transistors, and therefore the comparison time is longer than 11ns at 0.7V. In [3] a comparator with a similar circuit structure in 1.8V 0.18$\mu$m CMOS is described, consuming 350$\mu$W at 1.4GHz. The standard deviation of the offset without compensation is $\sigma=31.6$mV. The sense amplifier presented in [4] (1.2V 90nm CMOS, 225$\mu$W @ 2GHz) also consists of a typical latch with two cross-coupled CMOS inverters. The comparator in [5] (1.5V 0.12$\mu$m CMOS, low-threshold transistors) reaches a sensitivity (BER=10<sup>-9</sup>) of 16.5mV @ 4GHz/1.5V and 25.8mV @ 500MHz/0.5V. The latch in that design still draws static current, so 2.65mW is consumed at 6GHz/1.5V.

The schematic of the comparator described in this paper is shown in Fig. 19.4.1. The comparator is fabricated in a 1.2V 65nm low-power CMOS process with V<sub>T</sub>=0.4V. In contrast to a conventional latch (see Fig. 19.4.2), which consists of two cross-coupled inverters (N0, P0 and N1, P1), the presented latch is expanded into two paths between the supply rails so that only the threshold voltage of one transistor (instead of two) has to be overcome in each path. A clock period consists of a reset phase (CLK=CLKA=CLKL=V<sub>SS</sub>), which establishes the start condition (OUT=OUT'=V<sub>CO</sub> and FB=FB'=V<sub>SS</sub>), followed by a comparison phase (CLK=V<sub>CO</sub>), in which the voltage V<sub>i+</sub> at CINP is compared with V<sub>i-</sub> at CINN. During reset, transistors N2 and N9 are switched off and reset transistors P2 and P3 pull both output nodes OUT and OUT' to V<sub>CO</sub>. This in turn switches P4 and P5 off, and nodes FB and FB' are discharged to V<sub>SS</sub> by N3 and N4, which are switched on. The gates of transistors N7 and N8 are also at V<sub>CO</sub>, so N7 and N8 are initially switched on when the comparison phase starts. For comparison, N2 and N9 are turned on (CLK=CLKA=CLKL=V<sub>CO</sub>) and P2, P3, N3 and N4 are switched off. If V<sub>i+</sub>>V<sub>i-</sub>, then OUT' is discharged by transistor N6 more than OUT by N5 and positive feedback starts. OUT' is pulled towards V<sub>SS</sub> by N6 and N1 more than OUT by N5 and N0, thus P4 is turned on before P5 and FB is pulled towards V<sub>CO</sub> while FB' remains near V<sub>SS</sub> for sufficient |V<sub>i+</sub>-V<sub>i-</sub>|. Hence, P1 and N0 are switched off, P0 and N1 are on, OUT is pulled to V<sub>CO</sub>, OUT' to V<sub>SS</sub>, and FB is at V<sub>CO</sub>. The decision is then complete and no static current flows. Transistors N7 and N8 prevent static current flow through the input part via N9 [5], because N7 is turned off due to OUT'=V<sub>SS</sub>. For the case V<sub>i+</sub><V<sub>i-</sub>, OUT is pulled to V<sub>SS</sub> and OUT' to V<sub>CO</sub>. Because the NMOS transistors sit in a separate p-well (triple-well process), the body effect can be used to reduce the threshold voltages of transistors N5 to N8 by setting PWELL to a maximum of 0.7V. According to Monte-Carlo simulations on 50 samples, the standard deviation of the comparator offset is $\sigma=22$mV at V<sub>CO</sub>=1.2V and $\sigma=47$mV at V<sub>CO</sub>=0.6V.
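
The reset and comparison phases described above can be summarized as a small behavioral sketch. This is not a circuit simulation: it is an idealized, offset-free model that only returns the settled node levels stated in the text. The names `NodeLevels` and `comparator_nodes` are hypothetical, and the FB/FB' levels for the case V<sub>i+</sub><V<sub>i-</sub> are inferred by symmetry, since the text gives them only for V<sub>i+</sub>>V<sub>i-</sub>.

```python
from dataclasses import dataclass

V_SS = 0.0  # ground rail in volts (assumed reference level)


@dataclass(frozen=True)
class NodeLevels:
    """Settled levels of the latch nodes (hypothetical container)."""
    out: float    # node OUT
    out_b: float  # node OUT'
    fb: float     # node FB
    fb_b: float   # node FB'


def comparator_nodes(clk_high: bool, v_ip: float, v_in: float,
                     v_co: float = 1.2) -> NodeLevels:
    """Idealized settled node levels; offset and noise are ignored."""
    if not clk_high:
        # Reset phase: P2/P3 pull OUT and OUT' to V_CO, N3/N4 discharge FB and FB'.
        return NodeLevels(out=v_co, out_b=v_co, fb=V_SS, fb_b=V_SS)
    if v_ip == v_in:
        # With no input difference the ideal model has no defined decision.
        raise ValueError("equal inputs: decision is undefined in this model")
    if v_ip > v_in:
        # Stated in the text: OUT -> V_CO, OUT' -> V_SS, FB -> V_CO.
        return NodeLevels(out=v_co, out_b=V_SS, fb=v_co, fb_b=V_SS)
    # Stated: OUT -> V_SS, OUT' -> V_CO. FB/FB' here are inferred by symmetry.
    return NodeLevels(out=V_SS, out_b=v_co, fb=V_SS, fb_b=v_co)
```

For example, `comparator_nodes(True, 0.61, 0.60)` returns OUT at V<sub>CO</sub> and OUT' at V<sub>SS</sub>, while any call with `clk_high=False` returns the start condition regardless of the inputs.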

The presented latch is compared with a conventional latch, as used in typical comparators, in Fig. 19.4.2. In this simulation, the input stage of Fig. 19.4.1 (transistors N5 to N9) has been connected to each latch. The diagram shows the simulated delay time (from the 50% point of the rising edge of CLK to the time at which |OUT-OUT'| has reached 50% of V<sub>CO</sub>) of both latches versus the supply voltage V<sub>CO</sub>. Both latches are designed to have the same delay time at V<sub>CO</sub>=1.2V. If the supply voltage is reduced, the delay time of the conventional latch increases much more than that of the presented latch (3ns as compared to 0.9ns at V<sub>CO</sub>=0.6V).

The block diagram of the test chip with the comparator is shown in Fig. 19.4.3. The chip is divided into two parts, one with a supply voltage of V<sub>DD</sub>=1.2V for optimal functionality of the CMOS logic (included for measurement purposes), and a second with the supply voltage V<sub>CO</sub>, where the comparator is placed. The clock is applied to CLKIN and is converted by the clock driver into two complementary square-wave clock signals (CLK, $\overline{\mathrm{CLK}}$, duty cycle $\sim 50\%$) with the logic levels V<sub>SS</sub> and V<sub>CO</sub>. Each adapter consists of two separately supplied inverters that convert the logic levels V<sub>CO</sub> and V<sub>SS</sub> to the levels V<sub>DD</sub>=1.2V and V<sub>SS</sub>. During the reset phase of the comparator, the transmission gate P6/N10 is closed and the decision is held dynamically in the transfer stage at node SHB. The digital pin DIG selects whether transmission gate N11/P7 is always open or only open when the transmission gate P6/N10 is closed; the latter gives a constant delay time at the chip outputs, which simplifies delay adjustments for BER measurements. The added T-Gates block enables sensitivity tuning [6] at lower clock rates: the currents through transistors N2 and N9 in Fig. 19.4.1 can be controlled with bias voltages at NA and NL when PM=V<sub>DD</sub>. When the transmission gates are completely open (NA=NL=V<sub>DD</sub>, PM=V<sub>SS</sub>), CLK=CLKA=CLKL and sensitivity tuning is off.

BER measurements are done by applying a bias voltage at CINP, which is superimposed with a 2<sup>31</sup>-1 PRBS, and a reference bias voltage at CINN. For the measurements, CINN and CINP are biased separately for offset compensation. Here the amplitude is defined as |V<sub>i+</sub>-V<sub>i-</sub>+offset|. The left side of Fig. 19.4.4 shows the measured BER versus the bias level at CINN. The optimal operating point (lowest BER) is at a bias level of 0.6V for V<sub>CO</sub>=1.2V and at 0.5V for V<sub>CO</sub>=0.6V. For these optimal points, the dependence of the BER on the amplitude is measured (right side of Fig. 19.4.4). To achieve BER=10<sup>-9</sup> at V<sub>CO</sub>=1.2V, an amplitude of 15mV @3GHz, 20mV @4GHz (sensitivity tuning @3GHz and 4GHz), 27.2mV @5GHz, 63mV @6GHz and 281mV @7GHz has to be applied. If V<sub>CO</sub> is lowered to 0.6V (T-Gates completely open), 16mV @500MHz and 90.2mV @700MHz are measured. The measured power consumption is 1.3mW @1.2V/7GHz, 1.19mW @1.2V/6GHz, 41$\mu$W @500MHz/0.6V and 47$\mu$W @700MHz/0.6V. Compared to [5], an improvement of more than 50% in power consumption has been achieved at 6GHz.
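
The power comparison with [5] can be checked with a few lines of arithmetic. The sketch below uses only the power and clock figures quoted in the text (2.65mW at 6GHz/1.5V for [5]); the energy-per-clock-cycle figure is a derived quantity that the text does not report, and the helper names `relative_saving` and `energy_per_cycle` are hypothetical.

```python
# Measured operating points from the text: (label, power in W, clock in Hz).
MEASURED = [
    ("1.2V / 7GHz", 1.3e-3, 7e9),
    ("1.2V / 6GHz", 1.19e-3, 6e9),
    ("0.6V / 500MHz", 41e-6, 500e6),
    ("0.6V / 700MHz", 47e-6, 700e6),
]

# Reference design [5] at 6GHz / 1.5V, as quoted in the introduction.
P_REF_6GHZ = 2.65e-3


def relative_saving(p_ref: float, p_new: float) -> float:
    """Fraction by which p_new is lower than p_ref."""
    return (p_ref - p_new) / p_ref


def energy_per_cycle(power_w: float, f_clk_hz: float) -> float:
    """Average energy per clock cycle in joules (derived, not from the text)."""
    return power_w / f_clk_hz


saving = relative_saving(P_REF_6GHZ, 1.19e-3)
print(f"power saving vs. [5] at 6GHz: {saving:.1%}")

for label, power, f_clk in MEASURED:
    femtojoule = energy_per_cycle(power, f_clk) * 1e15
    print(f"{label}: {femtojoule:.1f} fJ per clock cycle")
```

With these numbers the saving at 6GHz comes out at roughly 55%, consistent with the stated improvement of more than 50%. Note that the two designs run at different supply voltages (1.5V in [5] versus 1.2V here), so the figure compares complete operating points, not latch topologies alone.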

The delay time (see Fig. 19.4.5) of the comparator (64ps @CINN=0.6V, 18.6mV amplitude) is measured with an additional on-chip circuit (measurement circuit and comparator delay time variation have $\sigma=11$ps) that detects the time shift between CLK and OUT'. The measured result is similar to the simulated one. At lower input amplitudes, the measurements are influenced by more random switching of the comparator.

Figure 19.4.6 shows oscilloscope screenshots at 7GHz clock frequency. A micrograph of the test chip is shown in Fig. 19.4.7. The comparator occupies 19.6×16.3$\mu$m<sup>2</sup>.

## Acknowledgement:

This work was partially funded by Infineon Technologies Austria and the Austrian BMVIT in the project Soft-RoC in FIT-IT via FFG.

## References:

- [1] Y. Okaniwa, H. Tamura, M. Kibune, et al., "A 40-Gb/s CMOS Clocked Comparator With Bandwidth Modulation Technique," *IEEE J. Solid-State Circuits*, vol. 40, no. 8, pp. 1680-1687, Aug. 2005.
- [2] B. Wicht, T. Nirschl, and D. Schmitt-Landsiedel, "Yield and Speed Optimization of a Latch-Type Voltage Sense Amplifier," *IEEE J. Solid-State Circuits*, vol. 39, no. 7, pp. 1148-1158, July 2004.
- [3] K.-L.J. Wong and C.-K.K. Yang, "Offset Compensation in Comparators With Minimum Input-Referred Supply Noise," *IEEE J. Solid-State Circuits*, vol. 39, no. 5, pp. 837-840, May 2004.
- [4] D. Schinkel, E. Mensink, E. Klumperink, et al., "A Double-Tail Latch-Type Voltage Sense Amplifier with 18ps Setup+Hold Time," *ISSCC Dig. Tech. Papers*, pp. 314-315, Feb. 2007.
- [5] B. Goll and H. Zimmermann, "A 0.12$\mu$m CMOS Comparator Requiring 0.5V at 600MHz and 1.5V at 6GHz," *ISSCC Dig. Tech. Papers*, pp. 316-317, Feb. 2007.
- [6] B. Goll and H. Zimmermann, "A Clocked, Regenerative Comparator in 0.12$\mu$m CMOS with Tunable Sensitivity," *European Solid-State Circuits Conf.*, pp. 408-411, Sept. 2007.



Figure 19.4.1: Schematic of the comparator.



Figure 19.4.2: Comparison of the conventional latch with the presented latch.



Figure 19.4.3: Block diagram of the test chip with the comparator.



Figure 19.4.4: BER measurements with PRBS $2^{31}-1$.



Figure 19.4.5: Measured delay time of the comparator.



Figure 19.4.6: Oscilloscope screenshots (7GHz clock frequency).



Figure 19.4.7: Micrograph of the test chip.