



A Dissertation for the degree of

Doctor of Philosophy

# **Successive-Approximation based CMOS Process-Scaling Hybrid ADCs**

Graduate School of Science and Technology

Keio University  



Kentaro Yoshioka

August 2019

# Abstract

CMOS scaling has greatly advanced the performance of wireless and wireline communication. Realizing a system on chip (SoC) for such products requires analog circuits; for example, high-performance analog-to-digital converters (ADCs) are needed to convert the received analog signal to digital. While such SoCs use the most advanced CMOS technologies to cut the cost of the digital circuits, analog circuit performance degrades as CMOS scaling advances. For example, opamp gain degrades greatly with scaling because of reduced transistor gain and lowered supply voltages. Meanwhile, as communication standards evolve, the performance demanded of analog circuits continues to increase. Thus, ADC design in scaled CMOS processes has become one of the most challenging and critical fields of circuit design.

In this thesis, we explore Hybrid ADCs that utilize successive-approximation (SA) circuitry, which can benefit from process scaling. Ultimately, we aim to establish an ADC design methodology suitable for scaled CMOS technologies. In chapter 1, the technology trends of CMOS process scaling are discussed and the effects of scaling on analog circuitry are studied. Moreover, we show that SA circuitry is suitable for scaled CMOS and explore its limitations. Finally, recent research trends in Hybrid ADCs and their design challenges are discussed. We propose Hybrid ADCs that heavily utilize SA circuitry in chapters 2 and 3.

In chapter 2, the Digital Amplifier (DA) technique is proposed to realize power-efficient and accurate amplification in scaled CMOS by using SA circuitry for amplification. The DA cancels all errors of the low-gain amplifier through SA-based feedback. Moreover, the amplification accuracy can be set arbitrarily by configuring the number of DA bits; the amplifier gain is thus decoupled from the transistor intrinsic gain, which brings a new paradigm to amplifier design in scaled CMOS. The fabricated ADC with DA achieves an SNDR of 61.1 dB and a FoM of 12.8 fJ/conv., which is over a 3x improvement compared with conventional ADCs.

In chapter 3, we explore power-efficient and process-scalable ultra-high-speed ADCs, which are required for high-capacity wireless communications. To achieve low-power, high-speed ADCs, we propose dynamically configuring the ADC *architecture* according to the ADC clock frequency, a technique we name Dynamic Architecture and Frequency Scaling (DAFS). The ADC architecture is reconfigured between successive approximation and flash every clock cycle, depending on the conversion delay. A prototype subranging ADC fabricated in 65 nm CMOS is 2x more power efficient than previously reported subranging ADCs.

In chapter 4, we propose a comparator with a variable threshold to explore multi-bit/step comparisons, which can significantly speed up the successive-approximation circuitry implemented in chapters 2 and 3. Finally, chapter 5 concludes the thesis.

© All rights reserved by Kentaro Yoshioka. August 2019.

# Contents

|          |                                                                       |
|----------|-----------------------------------------------------------------------|
| <b>1</b> | <b>Introduction</b>                                                   |
| 1.1      | CMOS Scaling                                                          |
| 1.1.1    | Will CMOS scaling continue forever?                                   |
| 1.1.2    | Recent trends in CMOS scaling and digital circuits                    |
| 1.1.3    | Process scaling (and problems) with analog circuits                   |
| 1.1.4    | Analog circuits' scaling effect                                       |
| 1.2      | Towards process scalable analog circuits                              |
| 1.2.1    | Rise of SAR ADCs                                                      |
| 1.2.2    | Fundamental Problems of the SAR ADC                                   |
| 1.3      | Hybrid ADCs                                                           |
| 1.3.1    | Pipelined-SAR ADCs                                                    |
| 1.3.2    | Design challenges of the Pipelined-SAR ADC                            |
| 1.4      | Thesis motivation and organization                                    |
| 1.4.1    | Thesis organization                                                   |
| <b>2</b> | <b>Digital Amplifier</b>                                              |
| 2.1      | Introduction                                                          |
| 2.1.1    | Review of conventional amplifier for scaled CMOS designs              |
| 2.1.2    | Utilizing digital gain calibration                                    |
| 2.1.3    | Our approach                                                          |
| 2.2      | Digital Amplifier                                                     |
| 2.2.1    | Review of Opamp based amplifications                                  |
| 2.2.2    | Digital Amplifier Principals                                          |
| 2.2.3    | Digital Amplifier Implementation                                      |
| 2.3      | Further Analysis of Digital Amplifier                                 |
| 2.3.1    | Amplification Error Characteristics                                   |
| 2.3.2    | Power Optimization Strategy                                           |
| 2.3.3    | Spurious-free Characteristics of the DA                               |
| 2.4      | Pipelined-SAR ADC Architecture                                        |
| 2.4.1    | Asynchronous Operation                                                |
| 2.4.2    | Look-Ahead SAR Technique                                              |
| 2.4.3    | Noise Budget                                                          |
| 2.5      | Circuit Implementation                                                |
| 2.5.1    | Operational Amplifier                                                 |
| 2.5.2    | Comparator Designs                                                    |
| 2.5.3    | DA C-DAC Designs                                                      |
| 2.6      | Measurement Results                                                   |
| 2.6.1    | Scaling Effects of the Digital Amplifier                              |
| 2.6.2    | Benchmarks                                                            |
| 2.7      | Conclusions                                                           |
| <b>3</b> | <b>Dynamic Architecture Configuring</b>                               |
| 3.1      | Introduction                                                          |
| 3.2      | Dynamic Architecture and Frequency Scaling                            |
| 3.2.1    | Binary search (Successive approximation) and flash reconfigurable ADC |
| 3.2.2    | DAFS operation                                                        |
| 3.2.3    | Analysis of DAFS                                                      |
| 3.3      | 7-bit Subranging ADC                                                  |
| 3.3.1    | S/H and Folding Circuits                                              |
| 3.3.2    | Live configuring with excess-delay accumulation                       |
| 3.3.3    | Sub-ADC designs                                                       |
| 3.4      | Results and Discussion                                                |
| 3.4.1    | Measured Results                                                      |
| 3.4.2    | Discussions                                                           |
| 3.5      | Conclusions                                                           |
| <b>4</b> | <b>Threshold Configuring Comparator</b>                               |
| 4.1      | Introduction                                                          |
| 4.2      | 2-bit/Step SAR ADC Architecture                                       |
| 4.2.1    | Conventional Designs                                                  |
| 4.2.2    | 2-bit/step with threshold configuring comparators                     |
| 4.2.3    | 2-bit/step with Successively Activated Comparators                    |
| 4.3      | Wide range threshold configuring comparator                           |
| 4.3.1    | TCC Architecture                                                      |
| 4.3.2    | TCC by variable current source                                        |
| 4.3.3    | Variable current source design                                        |
| 4.3.4    | Power Supply Noise Immunity                                           |
| 4.3.5    | Temperature variation effects                                         |
| 4.4      | Measurement Results                                                   |
| 4.5      | Conclusions                                                           |
| <b>5</b> | <b>Conclusion</b>                                                     |
| 5.1      | Summary                                                               |
| 5.2      | Future research directions                                            |
| 5.2.1    | Further scaling the DA amplifier (down to 16nm, 7nm and beyond)       |

# List of Figures

|      |         |
|------|---------|
| 1.1  | 42 years of processor trend. (In courtesy of [7] [8]) |
| 1.2  | Apple A9 chip. (In courtesy of [13]) |
| 1.3  | Modern RF SiP integration |
| 1.4  | FPGA with analog circuit integration (In courtesy of [20]) |
| 1.5  | SAR ADCs published in ISSCC, VLSI (1997-2008) |
| 1.6  | SAR ADCs published in ISSCC, VLSI (1997-2018) |
| 1.7  | SAR ADC circuit block diagram. |
| 1.8  | Thesis organization. |
| 1.9  | Benchmark for high-speed high-resolution Pipelined ADCs. |
| 2.1  | Zero crossing based amplifiers |
| 2.2  | Ring amplifiers |
| 2.3  | (a) Amplification error due to the finite gain of opamps. A portion of the amplification error is observed at the virtual ground $V_x$. (b) Concept of the Digital Amplifier is shown. By <i>directly sensing</i> the $V_x$ value and applying feedback to the output, digital amplifier cancels all opamp-induced-errors (finite-gain, incomplete settling, thermal noise, etc.). |
| 2.4  | Schematic of a 2.5-bit flip-around MDAC with $n$ bit Digital Amplifier. |
| 2.5  | Operation of the Digital Amplifier broken down in 4 steps. For simplicity, the DA is shown a 3-bit but the actual design is 8-bit. |
| 2.6  | Number of DA bit versus estimated MDAC power is plotted. 0-bit case is a MDAC designed only with an opamp. MDAC power starts to increase after DA's settling error mitigation effect saturates at a certain point. |
| 2.7  | We compare the power consumption of opamp-based and DA-based MDAC, respectively. Since DA-based MDACs has a relaxed settling requirements, at DA=7-bit, 46% power savings can be expected at our target SNDR design point. |
| 2.8  | Matlab simulated FFT results of the pipelined-SAR ADC are shown, where (a) uses opamp-based MDAC and (b) utilize DA-based MDAC. Since DA's gain error does not have correlation with the input signal, the SFDR excels by 10dB. Note that the opamp gain and DA bit were tuned to achieve the same SNDR. |
| 2.9  | The architecture of the two-way interleaved 12bit 160MS/s pipelined SAR ADC. |
| 2.10 | Noise contribution breakdown of the ADC. |
| 2.11 | Schematic diagram of the designed opamp. |
| 2.12 | Simulated waveform of the DA-based MDAC. While turning off the opamp causes kickback, the noise is small enough so that it can be canceled by DA operation. |
| 2.13 | DA C-DAC settling error versus ADC SNDR is shown. Since we utilize redundancy in the DA C-DAC, it is robust to settling errors. |
| 2.14 | Simplified figure of the ADC capacitor network. |
| 2.15 | Chip photo of the prototype ADC. Evaluation results of the I-channel ADC are shown. |
| 2.16 | ADC measured performance from 3 randomly selected chips. Temperature vs ADC SNDR were measured. |
| 2.17 | ADC measured performance from 3 randomly selected chips, where $f_s$ and $f_{in}$ were varied. |
| 2.18 | ADC FFT measured results at $f_{in}=10.1$ MHz |
| 2.19 | (a) ADC measured DNL. (b) ADC measured INL. |
| 2.20 | Simulated power breakdown of the ADC. |
| 2.21 | A digital amplifier-based 11-bit pipelined ADC prototyped in 65nm CMOS. |
| 2.22 | Benchmark against Pipelined and Pipelined-SAR ADC published in ISSCC and VLSI. Our work achieves $3\times$ power efficiency improvement compared to ADCs without gain calibrations. |
| 3.1  | Aggressive power scaling with DVFS, commonly utilized in CPUs. |
| 3.2  | Dynamic power scaling of an ADC without any power scaling techniques, with DVFS, and with DAFS, respectively. |
| 3.3  | (a) Schematic of 3-bit flash ADC. (b) Schematic of 3-bit binary search ADC. |
| 3.4  | Schematic of the proposed binary search/flash reconfigurable ADC, realized by just adding OR cells to conventional Flash ADCs. |
| 3.5  | (a) Simplified test bench with a 3-bit ADC using DAFS. (b) Timing chart showing the basic operation of the ADC. |
| 3.6  | (a) DAFS operation at $f_{s_{maxBS}} > f_s$. (b) DAFS operation at $f_{s_{maxBS}} < f_s < f_{s_{maxFL}}$. (c) DAFS operation at $f_s \simeq f_{s_{maxFL}}$. |
| 3.7  | Dynamic power scaling of an ADC operating only with flash and with DAFS, respectively |
| 3.8  | Block diagram of the 7-bit subranging ADC. DAFS is applied to the 3-bit coarse and fine sub-ADCs. |
| 3.9  | Block diagram of the 7-bit subranging ADC. DAFS is applied to the 3-bit coarse and fine sub-ADCs. |
| 3.10 | Schematic of the full implementation of S/H and folding circuits. |
| 3.11 | (a) DAFS operation without $\tau_{TH}$. Lowest BF ratio will be 0.5 since flash operation will be inserted as soon as any EXD is detected. (b) DAFS operation with $\tau_{TH}$. ADC does not switch to flash until exceeds $\Sigma$ EXD. |
| 3.12 | (a) Power scaling with several values of $\tau_{TH}$. (b) versus BF ratio with several values of $\tau_{TH}$. |
| 3.13 | Schematic of the live configuring circuit which uses the pulse length of FIN as $\tau_{TH}$. |
| 3.14 | Schematic of the comparator with four channel input. The input channel is determined by signal EN[0:3]. The programmable load capacitance used for offset compensation is shown as well. |
| 3.15 | Chip micrograph. |
| 3.16 | Measured DNL/INL after foreground comparator offset calibration |
| 3.17 | Measured power scaling of the subranging ADC, with and without DAFS. The BF ratio was measured and plotted as well. |
| 3.18 | Measured 4096-point FFT spectrum at the written condition. |
| 3.19 | (a) Measured versus SNDR. (b) Measured versus SNDR |
| 3.20 | Power breakdown of the ADC at 820 MS/s with sub-ADC operated only with binary search and flash respectively. |
| 3.21 | PVT variations versus BF ratio is shown. Interestingly, DAFS can operate to cancel out PVT variation effects, relaxing the speed margins of the high-speed ADC. |
| 4.1  | Block diagram of a 2-bit/step ADC provided with TCC. |
| 4.2  | Proposed 2-bit/step SAR ADC with successively activated comparators. (a) Block diagram. (b) Operation concept. |
| 4.3  | Timing chart of the proposed ADC. |
| 4.4  | Power supply versus comparator delay, DAC settling and speed improvement respectively. |
| 4.5  | Threshold configuring comparator design. |
| 4.6  | Schematic of the threshold configuring comparator (CP2 in Fig. 4.2). |
| 4.7  | (a) Schematic of 5-bit Vcm biased variable current source. (b) Operation of capacitive dividing. |
| 4.8  | Area efficient 1 fF fringed capacitor used to provide $C_{div}$. |
| 4.9  | Power supply variation effect of (a) $V_{DD}$ biased VCS, (b) $V_{CM}$ biased VCS |
| 4.10 | Power supply variation versus ADC resolution with different settings. |
| 4.11 | Chip photo. |
| 4.12 | (a) DNL and INL before calibration at supply voltage of 0.5 V. (b) DNL and INL after calibration at supply voltage of 0.5 V. |
| 4.13 | FFT spectrum at condition shown. |
| 4.14 | Input signal frequency versus SNDR measured at 0.5 V. |
| 4.15 | Power supply voltage versus speed improvement by 2-bit/step SAC operation. |
| 4.16 | Power supply variation versus ENOB response in several calibrated supply voltages. |
| 4.17 | Effect of power supply variation with Vcm or VDD changed separately |
| 4.18 | Simulated and measured temperature variation effects. |
| 4.19 | Comparison with low power state-of-art works. |
| 5.1  | DA with 2-bit/step. |
| 5.2  | DA estimated performance with 16nm and 28nm CMOS |

# List of Tables

|     |                                                                                            |     |
|-----|--------------------------------------------------------------------------------------------|-----|
| 2.1 | Normalized settling error requirements for opamp and DA based MDACs, respectively. | 45  |
| 2.2 | The design of the 8-bit DA C-DAC. | 55  |
| 2.3 | Inter-process comparison of the digital amplifier-based MDAC. | 62  |
| 2.4 | Performance comparison with state-of-the-art Pipelined and Pipelined-SAR ADCs. | 63  |
| 3.1 | Comparison with state-of-the-art high-speed ADCs. | 96  |
| 4.1 | Comparison with conventional 2-bit/step ADC. | 100 |
| 4.2 | ADC performance summary. | 122 |

# Chapter 1

## Introduction

## 1.1 CMOS Scaling

Since 1970, the number of transistors integrated in a single microprocessor has increased continuously. As of 2019, CMOS scaling still continues: risk production of 5nm CMOS is about to begin and development of the next node (3nm CMOS) is highly active [1]. For example, the TSMC 5nm node brings 15% performance and 45% area improvements over the 7nm node [2]. Forty years ago, the CMOS scaling limit was said to be around 1um due to physical constraints (the wavelength of light); yet nearly $1000\times$ of scaling is about to be accomplished thanks to a number of technology breakthroughs. While the motivations for CMOS scaling are diverse, the largest one is probably economic. That is, by moving to a further scaled CMOS process, the unit cost of a single transistor is cut down and the chip performance improves. Therefore, a more competitive chip with higher profit margins can be obtained, which is the most important factor in the silicon business. CMOS fabrication companies (TSMC, Intel, Samsung) invest enormous amounts of money and resources in advanced CMOS processes, and chip design companies invest heavily in process porting, because the expected return on investment (ROI) from moving to advanced nodes is much greater.

### 1.1.1 Will CMOS scaling continue forever?

The end of CMOS scaling will approach when the required investment exceeds the expected return, which is expected to happen at the 3nm node or the one after it [3]. Then, what will happen to us circuit designers? Will we all lose our jobs? One potential technological direction is to adopt a dedicated process technology for each specific application. Recall that CMOS is dominantly used for economic reasons (it is far cheaper than other processes!), even though other dedicated process technologies can perform better than CMOS. However, that precondition will break down with further scaling, and a strong motivation will arise to adopt non-CMOS process technologies. For example, for RF SoCs, co-integration of CMOS and compound semiconductors (GaN, SiC) can become the mainstream. For mobile SoCs, where power consumption is crucial, SoI CMOS may be used. Co-integration of silicon photonics and CMOS is another interesting technology [4], which may produce breakthroughs in wireline communications [5] and LiDARs [6]. Such multi-device integration is an exciting direction and will bring new design paradigms even to analog circuit design. Another, more optimistic direction is that CMOS technology will achieve a breakthrough (as it has in the past decades) and process scaling will continue further.

### 1.1.2 Recent trends in CMOS scaling and digital circuits

Let us return to the CMOS scaling trends of the last decade. Figure 1.1 plots processor trends over the last 42 years [7] [8]. Although we speak of "scaling" as a single concept, "Dennard's Law" scaling [9], which keeps the power consumption of the chip constant, has already ended; only "Moore's Law" scaling [10], which simply increases the number of transistors crammed into a single chip, remains active.

When "Dennard's Law" scaling was active, the device size and the clock frequency improved by 30% every process generation. While this alone would make the chip power explode, by also scaling the power supply voltage and the load capacitance, the entire power



Figure 1.1: 42 years of processor trend. (In courtesy of [7] [8])

consumption of the chip was kept constant. Note that while the load capacitance benefits from physical scaling, the power supply voltage could be scaled down by lowering the transistor threshold voltage. However, "Dennard's Law" scaling ended around 2006 because the power supply voltage could no longer be lowered. Around this time, the transistor leakage current (or off-current) became a non-negligible power consumer in SoCs. After Dennard scaling ended, the performance of CMOS processors became restricted by thermal design power (TDP) rather than by clock speed. A chip cannot dissipate more power (as heat) than its cooling can remove; otherwise the chip itself can be severely damaged by operating at high temperatures ( $> 125\,^{\circ}\mathrm{C}$ ). One can notice the performance limitation by TDP when running a large program and monitoring the CPU clock rate: when the CPU temperature exceeds a certain level, the CPU lowers its clock rate, degrading the processing performance at the same time. Thus, cooling technologies are a highly active research area in high-performance computing [11].
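The power bookkeeping behind this argument can be sketched numerically. The snippet below is a minimal illustration, assuming the textbook dynamic-power relation $P = C V^2 f$ and a linear scaling factor of 0.7 per generation (the function name, the factor, and the $1/s$ frequency gain are illustrative assumptions, not values taken from [9]). It compares ideal Dennard scaling with the case where the supply voltage can no longer be lowered.

```python
def scale_one_generation(s=0.7, scale_vdd=True):
    """Relative change after one process generation with linear scaling factor s.

    Assumes dynamic power per transistor P = C * V^2 * f (illustrative model).
    Returns (power per transistor, transistor density, power density),
    all relative to the previous generation.
    """
    cap = s                        # load capacitance shrinks with the dimensions
    vdd = s if scale_vdd else 1.0  # supply voltage scales only under Dennard scaling
    freq = 1.0 / s                 # gate delay shrinks, so the clock can run faster
    power_per_transistor = cap * vdd ** 2 * freq
    density = 1.0 / (s * s)        # transistors per unit area
    return power_per_transistor, density, power_per_transistor * density


dennard = scale_one_generation(scale_vdd=True)
fixed_vdd = scale_one_generation(scale_vdd=False)
print(f"Dennard scaling: power density x{dennard[2]:.2f}")
print(f"Fixed supply:    power density x{fixed_vdd[2]:.2f}")
```

Under these assumptions the power density stays constant when the supply voltage scales, and roughly doubles per generation once it stops scaling, which is the thermal limit described above.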

Interestingly, the inconvenient end of "Dennard's Law" scaling became a strong motivation for developing new digital circuit technologies.

Conversely, while Dennard scaling was active, chip performance improved greatly just by porting to a new process node, so implementing new technologies was not worth the effort. One direction in which digital architectures are heading is from "general" towards "domain specific". For example, the number of logical cores in Fig. 1.1 shows that processors are increasing their parallelism and functionality. While it is difficult to improve the performance of general single-instruction operations, multi-core processors boost the performance of highly parallel operations and multi-task programs. Graphics processing unit (GPU) architectures have evolved to the extreme in this direction. State-of-the-art GPUs have over 8000 cores [12] and have become the *de facto* standard for graphics processing and deep neural network training. While each core is simple compared to an x86 core, the enormous parallelism is highly effective in "domain specific" tasks like vector / matrix processing.

A number of processors utilize specialized hardware, making use of the extra transistors available. For example, smartphones have a very strict power budget, so processor power efficiency is top of mind. The Apple A9 processor (Fig. 1.2) has a dedicated CPU and GPU and, in addition, over 50 "specialized hardware" blocks to process images, video, and audio and to ensure security. Such "specialized hardware" can perform only a dedicated operation (e.g. video encoding), but its power efficiency is extremely high compared to general processors. Moreover, dynamic voltage and frequency scaling (DVFS) techniques have become common in mobile SoCs to aggressively scale down power when the workload is small. To conclude, while performance improvements for general processors have hit a wall, "domain specific" hardware has risen. Interestingly, this can be interpreted economically: the return on investment in architectures and circuit technologies is now higher than that in process technologies.



Figure 1.2: Apple A9 chip. (In courtesy of [13])



Figure 1.3: Modern RF SiP integration

### 1.1.3 Process scaling (and problems) with analog circuits

How have analog circuits evolved over the last decade, compared to digital circuits? The largest problem in analog circuit design with process scaling is that a performance drop is common when moving to advanced nodes. We will study this effect further in the following sections. Generally, when moving to an advanced process node, the area of analog circuits scales far less than that of digital circuits. Therefore, the relative cost of analog circuits (per unit area) keeps rising. For some time, this impact on the SoC cost was masked by the large cost scaling of digital circuits. However, the cost scaling of digital circuits has also slowed, and it is becoming more challenging to accept the increasing cost of analog circuits. The latest smartphones (iPhone XS Max and Galaxy X released in 2018) [14] [15] indicate that it is more cost efficient to split the RF analog circuits and the baseband digital circuits into separate chips. Although splitting chips incurs additional integration costs, we can infer that even so it is more cost efficient to remove the analog circuits from the baseband digital chip.

Commonly, such modern wireless modules (as in smartphones) implement the analog RF ICs (RF circuits and baseband ADCs) in a relatively legacy CMOS process (e.g. 28nm CMOS). On the other hand, baseband digital ICs (or modems) are implemented in cutting-edge nodes (e.g. 7nm CMOS) [15]. Such chips can then be integrated on a silicon interposer and shipped as a single package (Fig. 1.3). However, wireless standards keep boosting the communication capacity as they evolve (e.g. 5G, 6G), even aiming at data rates of 100 Gbps [16]. Moreover, the industry is heading towards increasing the number of



Figure 1.4: FPGA with analog circuit integration (In courtesy of [20])

multi-input and multi-output (MIMO) channels and carrier aggregation (CA) bands. Thus, the number of analog transceiver circuits must scale with MIMO and CA; if the analog CMOS process stays fixed, scaling the number of analog circuits directly impacts the chip cost, making the realization of future wireless trends extremely challenging.

The failure of analog circuits to scale is a problem not only for wireless devices. For example, high-performance processors (e.g. CPUs and GPUs) utilize high-speed I/O circuits, which are commonly built with ADC-based receivers [17] [18]. While increasing the number of memory I/Os can significantly boost system performance (note that most processors are memory-bandwidth limited [19]), the poor area scaling of analog circuits is one of the main bottlenecks limiting the number of memory I/Os. Therefore, process-scalable analog circuit design techniques are an important research topic affecting various applications.

For a long time, industrial ADC research was driven by companies such as Analog Devices and Texas Instruments. However, such companies do not have a strong motivation to tackle analog design in advanced technology nodes, because their main products are discrete analog devices, for which legacy nodes work

well. Recently, Xilinx has been driving research on analog circuits for high-speed I/Os and software-defined radios. Recent publications include > 4GS/s 13-bit ADCs in 16nm FinFET [21] [22] and integrated high-speed ADC-based I/Os [23]. By implementing the ADCs in the 16nm node, such circuits can be integrated within the FPGA (the Zynq UltraScale+ RFSoC is shown in Fig. 1.4 [20]). For base stations with massive numbers of MIMO channels, FPGAs integrated with multiple channels of high-performance ADCs can lower the system bill of materials (BOM) cost and power consumption, compared to the legacy implementation which integrates multiple discrete ADC chips.

### 1.1.4 Analog circuits' scaling effect

As with digital circuits, can we take the scaling challenges of analog circuits as an opportunity to revolutionize analog circuit design? This thesis aims to establish a CMOS process-scalable analog circuit design technique, focusing especially on Nyquist ADCs. Before going into the details, we study why analog circuits cannot keep up with CMOS process scaling.

Here, we focus on the operational amplifier (Opamp), which is the key building block of many analog circuits (e.g. switched-capacitor circuits, amplifiers, and filters). While an Opamp has multiple performance figures, we focus on three: gain-bandwidth (GBW, which couples with speed), output swing (which couples with noise), and gain (which couples with precision). To start off, we study how scaling affects each of these measures. First of all, GBW improves with process scaling. Since the transistor GBW is given by:

$$GBW = g_m/C_p \quad (1.1)$$

the GBW improves as the parasitic capacitance $C_p$ shrinks with scaling. On the other hand, the output swing is affected by the decreased power supply voltage. Therefore, it is essentially impossible to improve the output swing; it is instead degraded by

scaling. If the power supply voltage decreases by 10% when moving to an advanced node, the output swing and noise performance of the analog circuit degrade by *at least* 10% accordingly.

Finally, we discuss the effects on gain. The gain is, for example, the most important parameter for obtaining high accuracy in a pipeline ADC. The adverse effects of scaling are most apparent in the gain, which is degraded by both the supply voltage drop and the degraded analog performance of the transistor. Commonly, there are three approaches to achieving high gain in Opamps: 1.) cascode the transistors, 2.) increase the transistor W/L size to increase $g_m$, 3.) increase the number of Opamp stages. A cascode configuration requires a voltage headroom of $2V_{od} + 2V_{th}$; the rest of the voltage margin is assigned to the output amplitude. Now assume $V_{th} = 0.4$ V, $V_{od} = 0.1$ V and a power supply voltage of 0.9 V in 28 nm CMOS technology. Critically, with cascoding, the voltage headroom alone reaches 1 V, exceeding the power supply voltage! Therefore, cascoding cannot be utilized under a low power supply voltage. While the Opamp gain can still be enhanced by increasing $g_m$ or the number of stages, such approaches consume much more power than cascoding.
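The headroom arithmetic above can be checked with a few lines. This is only a sketch of the bookkeeping in the text, using the headroom expression $2V_{od} + 2V_{th}$ and the example values quoted there; the function name is a hypothetical helper.

```python
def cascode_output_swing(vdd, v_th, v_od):
    """Return (headroom, remaining swing) for the headroom model 2*V_od + 2*V_th."""
    headroom = 2 * v_od + 2 * v_th
    return headroom, vdd - headroom


# Example values from the text: 28 nm CMOS, 0.9 V supply.
headroom, swing = cascode_output_swing(vdd=0.9, v_th=0.4, v_od=0.1)
print(f"headroom = {headroom:.2f} V, remaining swing = {swing:.2f} V")
```

A negative remaining swing means the stack does not fit under the supply at all, which is the situation described for the 0.9 V case.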

Another problem in scaled CMOS analog circuit design is the degraded performance of the transistor itself. As is well known, with a sufficiently large load resistance, the gain of a common-source amplifier is given by

$$Gain = g_m \times r_o \quad (1.2)$$

where $r_o$ is the output resistance of the transistor. $r_o$ directly sets the gain and is roughly proportional to the channel length ( $L$ ), so utilizing short-channel scaled transistors degrades the Opamp gain. A sufficient $r_o$ can be retained by using a large $L$, but this approach gains no benefit from process scaling, so the relative cost of analog circuits increases.



Figure 1.5: SAR ADCs published in ISSCC, VLSI (1997-2008)

## 1.2 Towards process scalable analog circuits

### 1.2.1 Rise of SAR ADCs

On the other hand, there also exist analog circuits whose performance improves with process scaling. A typical example is the SAR ADC. Conventionally, SAR ADCs were used as low-speed, high-resolution ADCs, since by nature they require multiple cycles to complete a conversion. Typically, the number of required conversion cycles equals the number of resolution bits. In contrast, the conversion time of Pipelined and Flash ADCs is far shorter, and they have been adopted for high-speed applications. For example, a Flash ADC completes the conversion with a single comparison, and a Pipelined ADC requires only a sub-ADC conversion and an amplification within the conversion cycle, naturally suiting high-speed applications. However, all circuit blocks of the SAR ADC benefit from scaling, so its performance improves with it. As a result, the development of SAR ADCs over the last decade has been remarkable, and a summary of published SAR ADCs is very informative.

Fig. 1.5 plots the performance of SAR ADCs published at ISSCC and VLSI during 1997-2008, based on the data in [24]. The x axis shows the sampling speed and the y axis shows the Walden figure of merit (FoM) [25]. In those days, the



Figure 1.6: SAR ADCs published in ISSCC, VLSI (1997-2018)

most advanced process node was 65nm CMOS, and most of the works are based on 130nm or 180nm CMOS. To plot the evolution of unit SAR ADC performance, we exclude time-interleaved ADCs from the plot; the fastest unit SAR ADC ran at 100 MS/s. The Elzakker SAR ADC [26], presented at ISSCC 2008 (and included in the plot), improved the SAR ADC power efficiency by $10\times$ (!) compared to the prior art. This work showed one form of an "accomplished" SAR ADC and gave rise to extensive research on further improving SAR ADC performance, which is still active today.
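For readers unfamiliar with the metric on the y axis, the Walden FoM can be computed as sketched below. The sketch assumes the commonly used definition $FoM = P / (2^{ENOB} \cdot f_s)$ with $ENOB = (SNDR - 1.76)/6.02$; the exact convention used for the plotted data in [24] is not restated here, and the example ADC numbers are hypothetical.

```python
def enob_from_sndr(sndr_db):
    """Effective number of bits from SNDR in dB (standard sine-wave relation)."""
    return (sndr_db - 1.76) / 6.02


def walden_fom(power_w, fs_hz, sndr_db):
    """Walden figure of merit in joules per conversion step (lower is better)."""
    return power_w / (2 ** enob_from_sndr(sndr_db) * fs_hz)


# Hypothetical ADC: 1 mW at 100 MS/s with SNDR = 61.96 dB (ENOB of 10 bits).
fom = walden_fom(power_w=1e-3, fs_hz=100e6, sndr_db=61.96)
print(f"{fom * 1e15:.2f} fJ/conv.-step")
```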

Fig. 1.6 shows the performance of SAR ADCs presented at ISSCC and VLSI up to now (1997-2018) [24]. Firstly, process technologies have evolved greatly in the past ten years, and the most advanced node presented is 14nm CMOS. Let us study the evolution in terms of both speed and power efficiency. SAR ADC research splits into mainly two paths: works that pursue power efficiency at low speeds ( $< 1\text{MS/s}$ ) and works that pursue high-speed, high-resolution performance, aiming to replace Pipelined ADCs. For the former, the power efficiency has been pushed further, even reaching $0.4 \text{ fJ/conv.}$, mainly due to supply voltage optimization (supply voltages were reduced from 1V to 0.3V, which improves the energy efficiency by about $10\times$ ) and improved process nodes. Therefore, the SAR ADC energy bounds were pushed by nearly $10\times$ in the past 10 years; such ADCs are critical components to realize



Figure 1.7: SAR ADC circuit block diagram.

low-powered sensor devices. Interestingly, many SAR ADCs that achieve $> 10$-bit resolution at high speeds ( $> 100$ MS/s) have also been published, and this is an active research area. Since $> 100$ MS/s ADCs are mandatory for mobile communications (LTE and WiFi), power-efficient SAR ADCs that replace power-hungry Pipelined ADCs are in high demand. The speed boundaries have also been pushed by $10\times$, which has greatly expanded the applications of SAR ADCs.

Before going into further details of the SAR ADC, its fundamental operation is explained briefly. The block diagram of an $n$-bit SAR ADC is shown in Fig. 1.7. After sampling the input signal $V_{in}$, the comparator determines whether the input or the reference voltage is larger. The reference voltage required for the comparison is generated by the C-DAC, which also has a resolution of $n$ bits. Since the C-DAC is the only analog component in the ADC (in the sense of having multiple voltage levels), the ADC linearity is determined by this circuit. The comparison result is stored in the logic circuit, and the reference voltage is shifted in the direction that narrows down the input range. The SAR ADC operation is basically a binary search: the initial comparison determines whether the input signal is larger than $1/2 V_{ref}$ or not. If the input signal is smaller, the reference voltage is shifted to $1/4 V_{ref}$; if larger, it is set to $3/4 V_{ref}$. The procedure above is one cycle, and by repeating it for the given number of

cycles, a fine analog-to-digital conversion result is obtained.
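The binary search described above can be written as a short behavioral model. This is a minimal sketch that assumes an ideal comparator and an ideal C-DAC (no noise, mismatch, or settling error); the function and parameter names are illustrative.

```python
def sar_convert(v_in, v_ref=1.0, n_bits=8):
    """Ideal n-bit SAR conversion of v_in in [0, v_ref) by binary search."""
    code = 0
    for bit in reversed(range(n_bits)):        # MSB first, one bit per cycle
        trial = code | (1 << bit)              # tentatively set the current bit
        v_dac = v_ref * trial / (1 << n_bits)  # reference level from the C-DAC
        if v_in >= v_dac:                      # comparator decision
            code = trial                       # keep the bit, otherwise drop it
    return code


# First cycle compares against 1/2 V_ref, the next against 1/4 or 3/4 V_ref, etc.
print(sar_convert(0.3, v_ref=1.0, n_bits=3))
```

Each loop iteration corresponds to one SAR cycle, so an $n$-bit conversion takes $n$ comparator decisions.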

Next, we review the function of each circuit block of the SAR ADC and consider the impact of process scaling. Fundamentally, the SAR ADC cycle time can be represented as the sum of the delays of the comparator, the logic, and the C-DAC:

$$Cycle = t_{Comp} + t_{Logic} + t_{CDAC} \quad (1.3)$$

First of all, the effect of process scaling appears most directly in the logic circuit delay. Since the logic circuit of the SAR ADC is mainly composed of flip-flops, its delay is almost equivalent to the digital gate delay. Therefore, as with general digital circuits, the delay improves by about 30% every time the process node advances.

Comparators also benefit from scaling; their speed improves in proportion to the GBW and the digital gate delay. In general, the comparator can be divided into two circuits: a preamplifier that converts the voltage difference between the two inputs into a current difference, and a latch that amplifies the current difference and outputs it as a digital value. The delay of the latch corresponds to the digital gate delay, as with $t_{Logic}$. The speed of the preamplifier corresponds to the transistor GBW, which also improves with process scaling.

Finally, $t_{CDAC}$ is roughly proportional to the unit capacitance of the C-DAC. In legacy process technologies, capacitors were created by inserting an insulator between vertically stacked metal layers (metal-insulator-metal, MIM, capacitors). While MIM capacitors have superior matching, their minimum capacitance is quite large (10-50 fF). On the other hand, advanced process technologies enable the use of metal-oxide-metal (MOM) capacitors, which simply utilize the parasitic capacitance between metal wires. Since the metal fabrication accuracy has improved significantly, it has become possible to create highly accurate MOM capacitors. Because MOM capacitors allow very small unit capacitances (down to 500 aF), the energy consumption and the delay of the C-DAC have greatly improved, together with the development of efficient C-DAC switching techniques [27].
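Equation (1.3) can be turned into a rough conversion-rate estimate. The sketch below assumes that an $n$-bit conversion takes $n$ cycles plus one tracking (sampling) phase; the tracking time and all delay values are hypothetical placeholders chosen only to show the bookkeeping, not figures from this thesis.

```python
def sar_cycle_time(t_comp, t_logic, t_cdac):
    """Eq. (1.3): one SAR cycle is the sum of comparator, logic and C-DAC delays."""
    return t_comp + t_logic + t_cdac


def sar_sample_rate(n_bits, t_cycle, t_track):
    """Conversion rate, assuming n_bits cycles plus one tracking phase per sample."""
    return 1.0 / (t_track + n_bits * t_cycle)


# Hypothetical delays in seconds (illustrative only).
t_cycle = sar_cycle_time(t_comp=100e-12, t_logic=50e-12, t_cdac=150e-12)
rate_old = sar_sample_rate(n_bits=10, t_cycle=t_cycle, t_track=1e-9)
# If every delay term shrinks by 30% in the next node, so does the cycle time.
rate_new = sar_sample_rate(n_bits=10, t_cycle=0.7 * t_cycle, t_track=1e-9)
print(f"{rate_old / 1e6:.0f} MS/s -> {rate_new / 1e6:.0f} MS/s")
```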

### 1.2.2 Fundamental Problems of the SAR ADC

Although the SAR ADC has made a performance breakthrough in the past decade, its performance enhancement has hit a brick wall, which is studied further in this section. Here, we show that realizing a high-resolution and high-speed SAR ADC is fundamentally difficult. One of the challenges such a SAR ADC faces is the reference voltage settling constraint. Due to the structure of the binary C-DAC, when the large MSB capacitor is switched after the first comparison, a large amount of charge is transferred. This sudden charge transfer causes ringing in the reference voltage because of the LC resonance with the bonding inductance. To obtain high accuracy, the ringing must decay to within $\text{LSB}/2$ to $\text{LSB}/4$, since a fluctuating reference voltage corrupts the conversion accuracy. Since a typical solution is to "wait" until the ringing calms down, this prolongs $t_{CDAC}$ and limits the conversion speed.

One way to reduce the voltage ringing is to place a large on-chip "decoupling" capacitor so that the C-DAC charge can be supplied on-chip. However, such decoupling capacitors can easily reach a few nF [28] [29] to achieve high accuracy. Such capacitors can be several times larger than the ADC core, and their cost overhead is unacceptable.

Another way to get around the reference settling problem is to provide an on-chip voltage buffer. If the buffer bandwidth is sufficient, the voltage fluctuations do not occur. On the other hand, this breaks the premise that SAR ADCs do not require active elements, since a voltage buffer is basically a high-bandwidth, power-hungry Opamp. While the power consumption of the voltage buffer is typically excluded from the ADC performance presented at academic conferences, some works report that the voltage buffer consumes $4\times$ more power than the SAR ADC itself [30]. If the voltage buffer were counted as part of the ADC core, most high-speed, high-resolution SAR ADCs might even underperform state-of-the-art Pipelined ADCs in power efficiency.

## 1.3 Hybrid ADCs

As mentioned in the previous section, it is fundamentally difficult to realize a high-resolution and high-speed SAR ADC. On the other hand, Pipelined and Flash ADCs are alternatives but do not meet the power efficiency requirements of mobile devices. To overcome this challenge, there has been extensive research on making use of the SAR ADC within other ADC architectures; ADCs that fuse two different architectures (e.g. Pipelined ADCs and SAR ADCs) are called "Hybrid" ADCs. By utilizing Hybrid ADC architectures, designers have accomplished performance that was difficult to achieve with "Monolithic" ADCs.

### 1.3.1 Pipelined-SAR ADCs

Here, we take a deeper look at the "Pipelined-SAR ADC", which is one of the most successful Hybrid ADCs to date.

It is a cliche that "Pipelined ADCs are power-hungry", but why is that? One of the major reasons is that a Pipelined ADC requires *multiple* power-hungry Opamps (and amplification circuitry), depending on the number of pipeline stages. Therefore, Pipelined-SAR ADCs aim to lower the ADC power consumption by minimizing the number of pipeline stages, utilizing a SAR ADC as a high-resolution quantizer [31] [32]. Conventionally, Flash ADCs were utilized as the quantizer, but their resolution was limited to about 4 bits, since the required number of comparators increases exponentially with resolution. By replacing the Flash ADC with a SAR ADC, the quantizer resolution can be raised well beyond this limit ( $> 6$ bits). While this configuration lowers the conversion speed, since SAR ADCs are much slower, this can be countered in deeply scaled CMOS, where the SAR ADC conversion speed improves.

The Pipelined-SAR ADC in [31] uses a two-stage configuration of a 6-bit 1st-stage SAR and a 6-bit 2nd-stage SAR to construct a 12-bit ADC in total. The residue voltage generated in the 1st-stage SAR ADC is amplified by $64 \times$ and sampled by the 2nd-stage SAR, realizing a two-stage operation. Since only one residue amplifier is required,

the overhead of pipelining is minimized and high power efficiency was obtained.
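The two-stage operation can be illustrated with an idealized behavioral model. The sketch below assumes ideal 6-bit SAR quantizers, no inter-stage redundancy, and a residue gain of exactly $64\times$ by default; it is not a model of the circuit in [31], and the optional `gain` argument is a hypothetical knob added only to show what happens when the residue amplification is inaccurate.

```python
def sar_quantize(v, v_ref, n_bits):
    """Ideal SAR quantizer: binary search for the largest code with DAC level <= v."""
    code = 0
    for bit in reversed(range(n_bits)):
        trial = code | (1 << bit)
        if v >= v_ref * trial / (1 << n_bits):
            code = trial
    return code


def pipelined_sar(v_in, v_ref=1.0, n1=6, n2=6, gain=None):
    """Two-stage conversion: coarse SAR, residue amplification, fine SAR."""
    if gain is None:
        gain = float(1 << n1)                        # ideal residue gain (64x for 6 bits)
    coarse = sar_quantize(v_in, v_ref, n1)           # 1st-stage decision
    residue = v_in - v_ref * coarse / (1 << n1)      # what the 1st stage left over
    fine = sar_quantize(residue * gain, v_ref, n2)   # 2nd stage sees amplified residue
    return (coarse << n2) | fine                     # combine into a 12-bit code


print(pipelined_sar(0.3))             # ideal 64x residue gain
print(pipelined_sar(0.3, gain=32.0))  # wrong gain corrupts the fine bits
```

With the ideal gain, the combined code matches a direct 12-bit quantization; any gain error directly disturbs the lower bits, which is why accurate residue amplification matters.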

Moreover, the Pipelined-SAR ADC holds several merits over the SAR ADC as well. Firstly, it excels in conversion speed. A Pipelined-SAR ADC requires only 6 SAR cycles and an amplification within the conversion cycle, in contrast to a 12-bit SAR ADC, which requires 12 SAR cycles. In addition, since the conversion is performed in two steps, the reference voltage settling requirements are greatly relaxed. Specifically, if there is 0.5-bit redundancy between stages, the reference voltage of each stage needs to settle only within 1/4 of the 6-bit LSB (which is equivalent to 16 LSB at the full 12-bit resolution). Compared to a 12-bit SAR ADC, which requires the reference to settle within 1/4 of the 12-bit LSB, the design of the reference buffer and decoupling capacitors is significantly relaxed, reducing the total system cost. Thus, the hybrid architecture combining pipeline and SAR enjoys the advantages of both architectures and achieves both high performance and high power efficiency.
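The relaxation of the reference settling requirement can be quantified with a small calculation. The sketch below reproduces the 16 LSB figure from the text and, as an additional assumption not made in the thesis, models the reference disturbance as starting at full scale and decaying with a single time constant $\tau$, so that the required waiting time can be expressed in units of $\tau$.

```python
import math


def tolerance_in_full_res_lsb(stage_bits, total_bits=12, fraction=0.25):
    """Allowed reference error of a stage, expressed in LSBs of the full resolution."""
    return fraction * 2 ** (total_bits - stage_bits)


def time_constants_to_settle(stage_bits, fraction=0.25):
    """Number of time constants until exp(-t/tau) falls below fraction * stage LSB.

    Assumes a full-scale initial disturbance and single-pole exponential decay
    (an illustrative simplification of the LC ringing described in the text).
    """
    return math.log(2 ** stage_bits / fraction)


for bits in (6, 12):
    print(bits, tolerance_in_full_res_lsb(bits), round(time_constants_to_settle(bits), 2))
```

Under this simplified model, the 6-bit stage needs to wait for markedly fewer time constants than a monolithic 12-bit SAR ADC, in line with the relaxed buffer and decoupling requirements described above.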

### 1.3.2 Design challenges of the Pipelined-SAR ADC

However, even though Pipelined-SAR ADCs achieve high performance, significant design challenges remain.

1.) **Pipelined-SAR ADCs require high-precision residue amplification.** For such amplification, a high-gain Opamp is indispensable, but such designs are difficult to achieve in scaled CMOS processes. Various approaches have been taken to realize high-gain amplifiers in scaled CMOS (detailed benchmarks are given in Chapter 2), but none has been able to completely overcome the analog process scaling challenges.

2.) **Most designs utilize complex digital gain calibrations.** A number of designs utilize digital calibration to counter the gain error and tolerate the use of a low-gain amplifier. Since precise gain is not required, this approach allows the use of efficient open-loop (or dynamic) amplifiers [28] [29]. However, sudden supply voltage variations cannot be tracked, and suppressing such fluctuations with bypass capacitors significantly impacts the chip cost. While environment variation tracking



Figure 1.8: Thesis organization.

dynamic amplifiers have been proposed, start-up calibration is still necessary. Such calibration typically takes several tens of ms, resulting in lengthy start-up times and reduced SoC power efficiency.

While Hybrid ADCs have achieved a breakthrough in performance, they are not a silver bullet for process scalability, and a number of critical design challenges remain.

## 1.4 Thesis motivation and organization

In this thesis, design techniques towards CMOS process-scalable and power-efficient Nyquist ADCs are explored. The thesis organization is shown in Fig. 1.8. The key approach we take to realize a process-scalable ADC is to **1) aggressively utilize the scalable successive approximation (SA) circuitry and 2) hybridize it with existing ADC architectures.**

We target wireless baseband ADCs for mobile devices in this thesis. Modern wireless standards (e.g. 802.11ax WiFi [33] and 5G [34]) feature mainly two frequency bands: a sub-6 GHz band for long distance communications



Figure 1.9: Benchmark for high-speed high-resolution Pipelined ADCs.

and a > 20 GHz band for extremely high-speed communications [35] [36], typically called ultra-wideband (UWB) communications. Thus, at least two types of baseband ADCs are required to realize such modern wireless systems. Most importantly, such ADCs are required to be as power efficient as possible, since the battery life of mobile devices is one of the largest concerns.

**1) We target our first ADC for < 6GHz wireless communication, which requires medium speed and high resolution.** Since the baseband bandwidth can be as large as 80 MHz, we target an ADC speed of 160 MS/s. To establish reliable communication over long distances (several km), the ADC must handle very low input signal levels. Therefore, to enhance the receiver dynamic range, we target an ADC resolution of 12 bits (effective SNDR of 60 dB). Since this ADC is the most frequently used one in wireless communications, high power efficiency is required to prolong battery life; we target an ADC power efficiency of 20fJ/conv., which is state-of-the-art performance.
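As a quick consistency check of these targets, the standard Nyquist and effective-number-of-bits relations can be applied. The ENOB formula and the Walden-type power estimate below are textbook relations assumed here for illustration, not derivations from this thesis, and the resulting power is only an implied budget for the stated targets.

$$f_s \geq 2B = 2 \times 80\,\text{MHz} = 160\,\text{MS/s}$$

$$ENOB = \frac{SNDR - 1.76}{6.02} = \frac{60 - 1.76}{6.02} \approx 9.67\ \text{bit}$$

$$P \approx FoM \times 2^{ENOB} \times f_s \approx 20\,\text{fJ} \times 2^{9.67} \times 160\,\text{MS/s} \approx 2.6\,\text{mW}$$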

Such high performance is difficult to achieve with a monolithic SAR ADC, given its performance limitations. Therefore, Pipelined-SAR architectures are the best candidate, but a large design challenge remains in realizing a high-precision amplifier in a scaled CMOS process (e.g. 28nm CMOS). We benchmark such ADCs in Fig. 1.9, where we plot the published Pipelined-SAR and Pipelined ADCs. Due

to the design challenges, most of the works achieving high power efficiency utilize complex digital calibration to relax the amplifier design requirements (plotted as blue triangles). Moreover, such designs pose problems regarding PVT variations and PSRR, which can become troublesome during system integration. On the other hand, works without gain calibration have far worse power efficiency (over $3 \times$ worse) and do not meet the demands of mobile devices. Thus, our design target is a high-performance Pipelined-SAR ADC that needs no digital calibration in a deeply scaled CMOS process.

**2) We target our second ADC for $> 20$ GHz ultra-wideband (UWB) communications, which requires high speed and low precision.** Even for mobile devices, there is a large demand to deliver large-capacity content such as videos and movies. To deliver such content with high quality, UWB communication that is as fast as wireline communication is demanded. The baseband frequencies of such UWB communications can reach up to several GHz. In this thesis, we target an ADC speed of 1 GS/s and plan to time-interleave such unit ADCs to reach higher speeds if demanded. In UWB systems, the distance between the base station and the mobile device is very short (several tens of meters), and the received signal level is considered to be relatively high. Thus, we set the ADC resolution to 7 bits (SNDR 35 dB). While such ADCs are common in applications like measurement instruments and wireline communications [37], they are very power-hungry and do not meet the demands of mobile devices. Therefore, low-power and high-speed ADC design techniques are in high demand.

#### 1.4.1 Thesis organization

Here, we will briefly discuss the organization of the thesis.

In chapter 2, design techniques for process-scalable Pipelined-SAR ADCs are explored. We focus especially on the switched-capacitor amplification circuit, which becomes the largest obstacle when implementing Pipelined-SAR ADCs in scaled CMOS processes. To tackle the problem that sufficient Opamp gain cannot be obtained in scaled processes, we propose the Digital Amplifier (DA) technique to realize power-efficient and accurate amplification in scaled CMOS. DA cancels out all errors (i.e. gain error, non-linearity, settling, and thermal noise) of the low-gain Opamp by feedback based on successive approximation (SA). Moreover, the DA accuracy can be arbitrarily set by configuring the number of bits in the DA C-DAC; the amplifier gain is decoupled from the transistor analog performance, which brings in a new design paradigm, and the design methodologies for the DA are discussed in depth. Interestingly, since the majority of the amplification is a "digital" operation, due to the nature of SA circuitry, the DA circuit is highly process scalable.

To confirm the power efficiency of the DA, we implemented a 0.7V 12-bit 160MS/s Pipelined-SAR ADC in deeply scaled 28nm CMOS, which meets our target for $< 6$ GHz baseband ADCs. The Pipelined-SAR ADC does not require any digital gain calibration and achieves SNDR=61.1dB and FoM=12.8fJ/conv. The ADC accomplished the world's best power efficiency (over $3\times$ improvement) compared to previously published calibration-free high-speed pipelined ADCs. In addition, we evaluate the DA's process scalability by comparing the measured results of the DA-based MDAC prototyped in 65nm and 28nm CMOS. We observe a $2\text{-}3\times$ improvement in speed, power and area, mainly resulting from the DA's process scalability.

In chapter 3, we explore power-efficient and process-scalable ultra-high-speed ADCs, required for high-capacity wireless communications. While conventional Flash ADCs achieve a very fast conversion rate, their power consumption is notoriously high. Moreover, while the ADC sampling rate varies dynamically in wireless systems (because the number of available channels varies with the environment), Flash ADCs always consume high power irrespective of the sampling rate. Digital circuits realize super-linear power scaling by dynamically scaling the power supply voltage according to the CPU clock frequency [38], but high-speed ADCs are very sensitive to power supply variations; dynamically scaling the supplies is not realistic.

To achieve super-linear power scaling in high-speed ADCs, we propose to dynamically configure the ADC *architecture* according to the ADC clock frequency, which we name Dynamic Architecture and Frequency Scaling (DAFS). The ADC architecture is reconfigured between successive-approximation and flash every clock cycle, depending on the conversion delay. To realize architecture reconfiguration with small overhead, a successive-approximation/flash reconfigurable ADC is proposed, which adds only a few gates to conventional successive-approximation (or binary search) ADCs. The DAFS operation is fully automatic; the flash operation is adaptively performed by detecting excess delays during conversion and no pre-programming is required. We also show that DAFS not only significantly improves the power scaling but also compensates for transistor speed shifts due to process, voltage and temperature (PVT) variations.

A prototype subranging ADC is fabricated in 65 nm CMOS, which operates up to 1220 MS/s and achieves SNDR of 36.2 dB. DAFS is active between 820–1220 MS/s and achieves peak power reduction of 30%, when compared with the power scaling when DAFS is disabled. A peak FoM of 85 fJ/conv. was obtained at 820 MS/s, which is 2x more power efficient than reported subranging ADCs, at the time the paper was presented.

The ADC techniques presented in chapters 2 and 3 rely heavily on the comparator performance. For example, the amplification speed of the DA in chapter 2 is largely dominated by the successive approximation (SA) cycle time. If the number of SA cycles can be reduced by multi-bit conversions, the ADC conversion speed can be greatly improved, but such multi-bit conversions require variable threshold comparators. Moreover, if the comparators can hold a variable threshold voltage, the binary-search ADC utilized in chapter 3 can get rid of the reference generation circuits, which consume a non-negligible amount of static power.

In chapter 4, we aim to design threshold configurable comparators (TCC) to improve the performance of successive approximation based circuits. Such TCCs are beneficial but have a number of design issues: 1) they are difficult to implement if the threshold configuring range is very large; 2) TCCs typically have low power-supply-noise-rejection (PSNR), and the threshold can easily drift with even small supply fluctuations.

We propose current source based TCCs to enable wide-range threshold configurability. Moreover, we propose simple $V_{cm}$ biased current sources, which maintain sufficient comparator PSNR and keep the ADC tolerant of power supply variations of over 10%. To prove the effectiveness of the TCC, we implement a 2-bit/step SAR ADC where the 2-bit/step comparison is carried out by TCCs instead of area- and power-consuming C-DACs. The prototype ADC fabricated in 40 nm CMOS achieved a 44.3 dB SNDR at 6.14 MS/s with a single supply voltage of 0.5 V, and achieves a peak FoM of 4.8 fJ/conv-step.

Finally in chapter 5, we summarize the thesis and establish a conclusion.

# Chapter 2

## Digital Amplifier

## 2.1 Introduction

In this chapter, we focus on process-scalable 12-bit 160 MS/s ADC designs, which mainly target mobile long-distance communications such as 5G and 802.11ax WiFi SoCs [39]. Since such transceivers are the most heavily used in mobile devices, the ADC's power efficiency is crucial to the device battery life. While SAR ADCs become the primary design candidate for obtaining peak power efficiency, they have the downside of reference settling, discussed in chapter 1. Though it is possible to design such a high-speed and high-resolution SAR ADC, the overhead of peripheral circuits cannot be ignored; the reference buffer may consume more power than the core ADC [30], or extremely large decoupling capacitors will be required.

Therefore, Pipelined-SAR ADCs become the most suitable architecture for such design targets. By pipelining, the reference settling requirements of the SAR ADCs can be greatly relaxed and the overhead of peripheral circuits will be sufficiently small. Moreover, by utilizing a SAR ADC as the quantizer, high power efficiency can be expected. However, to achieve high resolution in Pipelined-SAR ADCs, high-accuracy residue amplification and high-gain Opamps are required. As previously discussed, achieving high-gain Opamps in scaled CMOS is a major design challenge.



Figure 2.1: Zero crossing based amplifiers

### 2.1.1 Review of conventional amplifiers for scaled CMOS designs

Realizing a suitable amplifier in scaled CMOS has been an active and important research area in the field of ADC design. For example, correlated level shifting (CLS) [40] squares the effective opamp gain with two-step amplification. However, in deep-scaled CMOS, even square enhancement may be insufficient due to the degraded opamp gain.

Zero-crossing-based amplifiers [41][42][43][44] achieve efficient and accurate amplification by focusing on the virtual-ground node (Fig.2.1). The amplifier output is charged by a current source, and when the virtual ground crosses zero voltage, the current source is cut off. The virtual-ground sensing can be realized by a simple zero-crossing detector (ZCD). Ideally, this achieves ideal amplification, but several critical issues remain in practical use. Firstly, the finite detection delay of the ZCD becomes an amplification offset, and with low-output-resistance current sources such offsets depend on the input voltage and produce non-linearity. In scaled processes, improving the linearity of current sources is a big challenge since supply voltages are very low: e.g. cascoding transistors is not available. Therefore, realizing high accuracy with ZCDs in scaled CMOS faces challenges similar to those of the Opamp and is very difficult. Moreover, low-power ZCD designs are also challenging in high-speed converters because ZCDs are basically Opamps which draw constant currents. Since this current scales with the amplification speed, realizing high power efficiency is challenging.

Figure 2.2: Ring amplifiers

Finally, ring amplifiers [45][46][47] are also efficient amplifiers with emerging techniques. Ring amplifier operation differs from conventional amplification and there is a lot of room for new research. Fundamentally, the ring amplifier gain is limited by the inverter gain, which degrades with CMOS process scaling. The maximum achievable gain for a three-stage ring amplifier is the cube of a single inverter gain, which may be around 40-50dB in scaled CMOS processes in the worst corners and is thus insufficient for high-accuracy pipelined converters. The most advanced ring amplifiers, implemented in 16nm CMOS, utilize digital calibration, though the proposed calibration itself is unique and quite inexpensive [48] [49].

### 2.1.2 Utilizing digital gain calibration

Hence, a number of designs utilize digital calibration to counter gain error and tolerate the use of a low-gain amplifier. Since precise gain is not required, this approach allows the use of efficient open-loop (or dynamic) amplifiers [50][29][28][51]. However, sudden supply voltage variations cannot be tracked, and suppressing such fluctuations with bypass capacitors significantly impacts chip cost. While dynamic amplifiers that track environment variations have been proposed, start-up calibration is still necessary [52][53]. Such calibration typically takes several tens of ms [28], resulting in lengthy start-up times and reduced SoC power efficiency. Furthermore, the non-linearity of open-loop amplifiers remains unsolved; with lower supply voltages, the limited amplifier swing tightens the SAR noise requirements.

### 2.1.3 Our approach

To establish a process scalable amplifier for Pipelined-SAR ADCs, we propose the digital amplifier (DA) technique. DA cancels out all errors of the low-gain amplifier by feedback based on successive approximation (SA). Errors are detected by judging the virtual ground polarity and canceled out by a C-DAC connected to the MDAC output. Unlike conventional amplification techniques, the amplification accuracy is determined by the C-DAC LSB step and decoupled from transistor intrinsic gain, which brings a new design paradigm for ADC designs in scaled CMOS process.

The DA is used to realize a calibration-free 0.7V 12-bit 160MS/s pipelined-SAR ADC [54] [55]. Without any calibration, the ADC achieves an SNDR of 61.1dB and FoM=12.8fJ/conv., which is over a $3\times$ improvement compared with conventional calibration-free high-speed pipelined ADCs. Furthermore, the ADC area including the bypass capacitor is only $0.097\,\text{mm}^2$.

This chapter is organized as follows: Section 2.2 describes the main concept of the DA and its amplification characteristics. Further analysis of the DA is given in Section 2.3. Section 2.4 discusses the designed pipelined-SAR ADC, and the circuit implementations are disclosed in Section 2.5. Finally, measurement results are discussed in Section 2.6, along with the inter-process comparison.

## 2.2 Digital Amplifier

### 2.2.1 Review of Opamp based amplifications

Figure 2.3: (a) Amplification error due to the finite gain of opamps. A portion of the amplification error is observed at the virtual ground $V_x$. (b) Concept of the Digital Amplifier. By *directly sensing* the $V_x$ value and applying feedback to the output, the digital amplifier cancels all opamp-induced errors (finite gain, incomplete settling, thermal noise, etc.).

Figure 2.4: Schematic of a 2.5-bit flip-around MDAC with an $n$-bit Digital Amplifier.

Before going into the details, we start by studying an opamp-based switched capacitor (SC) amplifier and examine its accuracy bottlenecks in scaled CMOS. If the opamp has infinite gain, ideal amplification is achieved: the output voltage will be ideal ($V_{outideal}$) and the virtual ground $V_x$ will converge to zero. On the other hand, with finite gain (Fig.2.3(a)), an amplification error originating from the finite gain will occur. If the loop gain is $A = A_{openloop} \times \beta$ ($\beta$ = feedback factor):

$$V_{out} = V_{outideal} \times \frac{A}{1 + A} \quad (2.1)$$

$$V_{amperror} = V_{outideal} - V_{out} \approx \frac{V_{outideal}}{A} \quad (2.2)$$

Such amplification errors cause harmonic distortion in pipelined ADCs, degrading the SNDR. To design a pipelined-SAR ADC achieving our design target (SNDR>60dB), system simulations imply that $A > 60$dB is required. Designing such amplifiers in scaled CMOS is very challenging; the achievable $A$ can be as small as 20dB under worst-case conditions.
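To make the size of this error concrete, the following minimal sketch evaluates Eq. (2.1) and the approximation of Eq. (2.2) for a few loop gains. The function names and the 0.5 V ideal output are illustrative assumptions, not values from the design.

```python
def db_to_linear(gain_db: float) -> float:
    """Convert a voltage gain in dB to a linear ratio."""
    return 10.0 ** (gain_db / 20.0)


def amp_error(v_out_ideal: float, loop_gain_db: float) -> float:
    """Exact error V_outideal - V_out, with V_out taken from Eq. (2.1)."""
    a = db_to_linear(loop_gain_db)
    v_out = v_out_ideal * a / (1.0 + a)
    return v_out_ideal - v_out


def amp_error_approx(v_out_ideal: float, loop_gain_db: float) -> float:
    """Approximate error V_outideal / A from Eq. (2.2)."""
    return v_out_ideal / db_to_linear(loop_gain_db)


if __name__ == "__main__":
    V_OUT_IDEAL = 0.5  # volts, assumed example value
    for gain_db in (20, 40, 60):
        exact = amp_error(V_OUT_IDEAL, gain_db)
        approx = amp_error_approx(V_OUT_IDEAL, gain_db)
        print(f"A = {gain_db} dB: exact {exact * 1e3:.3f} mV, "
              f"approx {approx * 1e3:.3f} mV")
```

The approximation of Eq. (2.2) is loose at 20dB, where $A$ is only 10, and becomes tight as the loop gain approaches the 60dB target.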

### 2.2.2 Digital Amplifier Principles

We propose the digital amplifier (DA) technique to realize an efficient and process-scalable SC amplifier. DA cancels out all errors that the opamp generates, which include gain error, non-linearity, incomplete settling, power supply noise and thermal noise. In this section, we first study how the DA achieves a high effective loop gain. The main concept of the DA is shown in Fig.2.3(b). DA operates with a 2-step amplification, where the opamp first performs a coarse amplification and then the DA cancels out the errors the opamp produced. By *directly sensing* the value of $V_x$ with a quantizer and cancelling out the errors by feedback via a DAC connected to the amplifier output, ideal amplification can be achieved by converging $V_x$ to zero.

The amplification error of the opamp can be shown as below.

$$V_x = V_{amperror} \times \beta \quad (2.3)$$

From above, we can derive the DAC transition ( $V_{DAC}$ ) as:

$$V_{DAC} = -\frac{V_x}{\beta} + N_Q \quad (2.4)$$

$$V_{out} + V_{DAC} = V_{outideal} + N_Q \quad (2.5)$$

Here, the amplification error $N_Q$ is the total quantization noise of the quantizer and the DAC. Interestingly, this implies that while the accuracy of conventional amplifiers was limited by transistor intrinsic gain, the DA accuracy is limited only by the feedback circuit's quantization noise (or resolution). The feedback circuit resolution is a much easier parameter to configure than transistor gain in scaled processes. We will describe this point further in later sections.

### 2.2.3 Digital Amplifier Implementation

In order to implement our proposed DA concept, a multi-bit quantizer and DAC are required. The requirements are: 1) speed, so that it does not limit the amplifier speed (a few ns); 2) minimum cost (area and power). In order to satisfy the two requirements, we propose a successive approximation (SA) inspired implementation of the DA (Fig.2.4). Since SA requires only a single-bit comparator, SA logic and a C-DAC, the implementation cost is low. Moreover, SA conversions are very fast in scaled processes [56] and the amplifier speed is unlikely to be limited by the DA.

Figure 2.5: Operation of the Digital Amplifier broken down into 4 steps. For simplicity, the DA is shown as 3-bit but the actual design is 8-bit.

Here, the operation of the DA-based MDAC is explained step by step. As noted previously, the MDAC operation is split into 2 phases: opamp and DA. During the opamp phase (Fig.2.5(a)), $\phi_{OP}$ rises and the low-gain opamp is connected to the MDAC output to perform amplification. However, an error occurs owing to the non-ideal effects of the opamp. $\phi_{OP}$ is driven by a 2ns long pulse, and when $\phi_{OP}$ falls, the opamp is cut off from the loop and the DA is activated (Fig.2.5(b)). During the DA phase, the virtual ground is forced to zero by feedback based on successive approximation (SA), utilizing a clocked comparator and a C-DAC. The comparator judges the polarity of $V_x$ (Fig.2.5(c)) and the C-DAC connected at the MDAC output is controlled so that $V_x$ converges to zero. The SA operation is repeated for $n$ cycles; $V_{out}$ always converges to the ideal $V_{out}$ within an error range of the C-DAC LSB voltage ($V_{CDACLSB}$), which is also the amplification error of the DA (Fig.2.5 (d)). Note that while the DA generates digital codes to configure the C-DAC, this code is only used for amplification and is not used for the final ADC output.

We now show that the DA cancels out not only the gain error but the thermal noise of the opamp as well. While opamp thermal noise is present during the initial opamp-based amplification, when the opamp amplification ends, the opamp is cut off from the loop and its noise is sampled (Fig.2.5(b)). Since the sampled opamp noise simply appears at the $V_x$ node as a non-zero voltage, as in Eq. (2.3), the DA operation can cancel it out by successive approximation.
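The two-phase operation described above can be sketched behaviourally as follows. This is a simplified model under stated assumptions, not the circuit itself: the comparator is ideal, the C-DAC applies ideal binary-weighted steps of plus or minus `dac_range / 2**k`, the sign convention of $V_x$ is chosen so that a negative $V_x$ means the output is too low, and `beta = 0.25` and the voltages used are hypothetical example values. The model peeks at the ideal output only to compute $V_x$; the real comparator senses $V_x$ directly.

```python
def digital_amplify(v_out_ideal, v_out_coarse, dac_range, n_bits, beta=0.25):
    """Behavioural model of the DA phase: n SA cycles acting on V_out.

    Returns the corrected output voltage and the list of comparator
    decisions (1 = step up, 0 = step down).
    """
    v_out = v_out_coarse
    step = dac_range / 2.0           # first (MSB) C-DAC step
    decisions = []
    for _ in range(n_bits):
        # Assumed sign convention: V_x < 0 when V_out is below the ideal value.
        v_x = beta * (v_out - v_out_ideal)
        bit = 1 if v_x < 0.0 else 0  # the comparator only judges polarity
        v_out += step if bit else -step
        decisions.append(bit)
        step /= 2.0                  # binary-weighted C-DAC
    return v_out, decisions


if __name__ == "__main__":
    V_IDEAL = 0.4                    # volts, assumed example value
    LOOP_GAIN = 10.0                 # 20 dB low-gain opamp
    V_COARSE = V_IDEAL * LOOP_GAIN / (1.0 + LOOP_GAIN)  # Eq. (2.1)
    DAC_RANGE = 0.04                 # must cover the worst-case opamp error
    N_BITS = 7

    v_final, decisions = digital_amplify(V_IDEAL, V_COARSE, DAC_RANGE, N_BITS)
    lsb = DAC_RANGE / 2 ** N_BITS
    print(f"coarse error: {(V_IDEAL - V_COARSE) * 1e3:.3f} mV")
    print(f"final error : {(V_IDEAL - v_final) * 1e3:.3f} mV (LSB {lsb * 1e3:.3f} mV)")
    print("decisions   :", decisions)
```

As long as the coarse error lies within the C-DAC range, the residual error after $n$ cycles is bounded by the final step size regardless of the opamp gain. Sampled opamp noise is handled identically, since it is indistinguishable from any other offset on `v_out_coarse`.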

By configuring the number of bits in the SA, the DA can arbitrarily set its amplification accuracy. However, the drawback is that as the number of bits increases, the number of SA cycles during the amplification increases as well. Therefore, similar to a SAR ADC, a tradeoff exists between the amplification time and the amplification accuracy of the DA. To achieve a higher accuracy with constant amplification time, speed-enhancing techniques such as 2-bit/step [57][58] can be adopted, but they will impact the power and area.

## 2.3 Further Analysis of Digital Amplifier

### 2.3.1 Amplification Error Characteristics

In this section, the DA amplification error characteristics are analyzed for deeper understanding. A significant feature of the DA is that its gain error is determined by the step size $V_{CDACLSB}$ and is independent of the transistor intrinsic gain. $V_{CDACLSB}$ can be easily halved by increasing the DA resolution by 1 bit, which is equivalent to improving the opamp loop gain by 6dB. In this analysis, we assume that the DA C-DAC output range is equal to the maximum error the opamp will generate. The DA's effective gain principle can be expressed as below, where the DA's effective loop gain is $A_{DA}$, the opamp loop gain is $A_{OP}$ (both in dB) and the number of DA bits is $n$:

$$A_{DA} = A_{OP} + 6 \times n \quad (2.6)$$

For further understanding, we show a specific design example from our MDAC. Our opamp designed in 28nm CMOS can achieve only 20dB loop gain under worst-case conditions, in contrast to the >60dB loop gain required for the ADC target performance. From Eq. (2.6), by designing a 7-bit DA, the amplifier loop gain is boosted to:

$$20\text{dB} + 6\text{dB} \times 7\text{bit} = 62\text{dB} \quad (2.7)$$

and the design requirement can be easily met. As a result, more than a cubic enhancement of $A_{OP}$ is achieved with the DA, while techniques such as CLS are limited to a square [40]. Interestingly, since the gain error is mostly determined by the step size $V_{CDACLSB}$, it is quite robust to PVT variations. DA can greatly save design time, because little tuning is required while iterating through PVT and post-layout simulations (in contrast to conventional opamp designs, which require extensive design effort because the transistor characteristics vary widely across PVT and layout).
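The arithmetic of Eqs. (2.6) and (2.7) can be checked with a short sketch. The helper names are illustrative, and the 6dB-per-bit figure is the rounded value used in Eq. (2.6) (the exact value is $20\log_{10}2 \approx 6.02$dB).

```python
import math

DB_PER_BIT = 6.0  # rounded gain improvement per DA bit, as in Eq. (2.6)


def effective_loop_gain_db(a_op_db: float, n_bits: int) -> float:
    """Effective DA loop gain A_DA = A_OP + 6 * n, Eq. (2.6), in dB."""
    return a_op_db + DB_PER_BIT * n_bits


def min_da_bits(a_op_db: float, target_db: float) -> int:
    """Smallest number of DA bits whose effective loop gain reaches target_db."""
    return max(0, math.ceil((target_db - a_op_db) / DB_PER_BIT))


if __name__ == "__main__":
    print(effective_loop_gain_db(20.0, 7))  # Eq. (2.7): 20 dB + 6 dB * 7 bit
    print(min_da_bits(20.0, 60.0))          # bits needed to exceed 60 dB
```

With a 20dB opamp and a 60dB requirement, 6 bits give 56dB and fall short, so 7 bits is the smallest resolution that meets the target, matching the design choice above.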

Table 2.1: Normalized settling error requirements for opamp and DA based MDACs, respectively.

|                      | w/o DA | w/DA                     |
|----------------------|--------|--------------------------|
| Ideal loop-gain [dB] | $A$    | $A + 6 \times n$         |
| Allowed OPAMP error  | -      | $V_{CDACLSB} \times 2^n$ |



Figure 2.6: Estimated MDAC power versus the number of DA bits. The 0-bit case is an MDAC designed only with an opamp. MDAC power starts to increase once the DA's settling error mitigation effect saturates.

### 2.3.2 Power Optimization Strategy

In conventional high-speed pipelined ADC designs, the opamp must be designed with strict settling error requirements, which easily inflate the amplifier power consumption [59]. To obtain faster settling, a high gain-bandwidth product (GBW) is required, which is typically obtained by burning more power. In this section, we discuss the DA-based MDAC power consumption assuming that the amplifier power is determined by the settling requirements. We show that by utilizing the DA, significant power savings can be achieved compared to opamp-based designs because the DA allows opamp designs with significantly relaxed settling requirements.

Figure 2.7: Comparison of the power consumption of opamp-based and DA-based MDACs. Since DA-based MDACs have relaxed settling requirements, at DA=7-bit, 46% power savings can be expected at our target SNDR design point.

DA not only removes the opamp gain-related errors but can remove settling errors as well. Here, we consider a 2.5-bit MDAC design with a settling error requirement for achieving SNDR=66dB. According to ref.[60], the relationship between the opamp settling error and GBW can be expressed as below.

$$\text{Settling req.} \approx \exp(-GBW) \quad (2.8)$$

From the above, we can derive the relationship between the amplifier settling requirements and the required GBW (Table 2.1). As shown in the table, utilizing an $n$-bit DA can relax the opamp settling requirements by $2^n \times$. However, since the SA cycles must also be completed within the same amplification window, the effective time for opamp amplification decreases as the number of DA bits increases. The *effective* settling requirement can be derived as below.

$$Ratio = 1 - n \times t_{DA} \quad (2.9)$$

$$\text{Eff.Settling} = \text{Settling req.} \times Ratio \quad (2.10)$$

Here, $t_{DA}$ is the normalized time for a single SA cycle. The effective settling requirements saturate around DA=8-bit due to the fixed amplification time window. We also estimate the MDAC power consumption, derived from the opamp GBW. The GBW can be expressed with $g_m$:

$$GBW = \frac{g_m \times \beta}{2\pi C_L} \quad (2.11)$$

To simplify the analysis, we assume constant current density, where doubling the $g_m$ also doubles the power consumption. The opamp and DA power consumption were derived from the 28nm CMOS post-layout simulation results, and the power was scaled with respect to the required $g_m$ and number of bits. In Fig.2.6 we plot the MDAC power consumption against the number of DA bits, where the power is normalized to the 0-bit case (MDAC designed only with an opamp). Since the DA's power is mainly dominated by the comparator and the SA logic, the DA power increases almost linearly with the number of DA bits. Increasing the number of DA bits relaxes the opamp settling requirements, thereby saving power. However, since the effective settling requirement saturates around DA=8-bit, the power savings also saturate around this point. Increasing the DA resolution beyond 8-bit has no effect and may even increase the power consumption. Reflecting the results of this analysis, the DA resolution is set to 7-bit in our design. While we fix the target SNDR to 60dB in our optimization strategy, the design point will change with a higher target SNDR. Note that the comparator power increases $4 \times$ when the target SNDR rises by 6dB. Thus for a higher target SNDR, the power will be optimized with fewer DA bits.
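The trade-off between relaxed settling accuracy and a shrinking amplification window can be illustrated with a rough first-order sketch. It is one possible reading of Eqs. (2.8) to (2.11), not the model used to generate Fig.2.6: it assumes single-pole settling, so that the number of time constants needed is $\ln(1/\text{error})$, assumes the allowed error is relaxed by $2^n$, and assumes the required GBW scales inversely with the remaining window $1 - n \times t_{DA}$. The values `TARGET_ERROR` and `T_DA` are hypothetical placeholders, so the sketch is not expected to reproduce the saturation point reported above.

```python
import math


def required_gbw_ratio(n_bits: int, target_error: float, t_da: float) -> float:
    """Opamp GBW needed with an n-bit DA, normalized to the 0-bit case.

    Assumptions (not from the thesis): single-pole settling, an allowed
    settling error relaxed by 2**n, and an opamp time window shortened to
    (1 - n * t_da) as in Eq. (2.9).
    """
    if n_bits < 0:
        raise ValueError("n_bits must be non-negative")
    window = 1.0 - n_bits * t_da
    if window <= 0.0:
        raise ValueError("SA cycles use up the whole amplification window")
    relaxed_error = min(1.0, target_error * 2 ** n_bits)
    taus_needed = math.log(1.0 / relaxed_error)   # time constants with DA
    taus_baseline = math.log(1.0 / target_error)  # time constants without DA
    return (taus_needed / window) / taus_baseline


if __name__ == "__main__":
    TARGET_ERROR = 10 ** (-66 / 20)  # assumed mapping of SNDR = 66 dB to an error
    T_DA = 0.05                      # hypothetical normalized SA cycle time
    for n in range(0, 9):
        ratio = required_gbw_ratio(n, TARGET_ERROR, T_DA)
        print(f"DA bits = {n}: required GBW ratio = {ratio:.2f}")
```

Under the constant-current-density assumption, the opamp power scales with this GBW ratio, while the DA adds a power term that grows roughly linearly with the number of bits.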

We also conduct an analysis of the MDAC power versus the target ADC SNDR in Fig. 2.7. Since settling requirements become stricter with higher resolution, the DA enjoys further power savings at high SNDR as well. At our design point of SNDR=66dB, the DA-based MDAC can save 46% power compared to opamp-based designs.

### 2.3.3 Spurious-free Characteristics of the DA

Another important feature of the DA is that, fundamentally, the amplification is spurious-free. Fig.2.8 compares the system simulation results of the pipelined-SAR ADC utilizing opamp-based and DA-based MDACs, respectively. The opamp amplification error can be derived from Eqs. (2.1) and (2.2) as:

$$V_{amperror} \approx \frac{V_{in}}{\beta \times A} \quad (2.12)$$

The error is a function of the input signal $V_{in}$. Since such errors appear in the ADC spectrum as harmonic tones, the SFDR degrades (Fig.2.8(a)). The performance of wireless systems utilizing sub-carriers (e.g. OFDM) may be degraded by such spurious tones, so a higher SFDR is preferred by the system.

On the other hand, since the DA amplification error is quantization noise, the errors can be modeled as random values. Since the amplification errors appear at the noise floor, the SFDR improves compared to opamp-based implementations (Fig.2.8(b)). However, note that when the target SNDR is low, the DA quantization error becomes correlated with the signal and the SFDR may worsen. If the target SNDR is high enough, as in this design (SNDR>60dB), the spurs spread nearly to the noise-floor level and the ADC can achieve an enhanced SFDR.
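The difference between the two error types can be illustrated with a small numerical sketch. It uses the correlation between the amplification error and the ideal output as a crude proxy for signal-dependent error; it does not reproduce the FFT of Fig.2.8. The sine amplitude, loop gain, C-DAC range and the ideal binary-weighted SA model are all assumptions made for illustration.

```python
import math


def da_residual(error: float, dac_range: float, n_bits: int) -> float:
    """Error left after n SA cycles of +/- binary-weighted C-DAC steps."""
    step = dac_range / 2.0
    for _ in range(n_bits):
        error -= step if error >= 0.0 else -step
        step /= 2.0
    return error


def pearson(xs, ys) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Assumed example values (not taken from the design).
N_SAMPLES, CYCLES = 4096, 127   # coherently sampled sine
AMPLITUDE = 0.4                 # ideal output amplitude in volts
LOOP_GAIN = 10.0                # 20 dB opamp loop gain
DAC_RANGE, N_BITS = 0.04, 7     # DA C-DAC range and resolution

v_ideal = [AMPLITUDE * math.sin(2 * math.pi * CYCLES * k / N_SAMPLES)
           for k in range(N_SAMPLES)]
opamp_error = [v / (1.0 + LOOP_GAIN) for v in v_ideal]   # from Eq. (2.1)
da_error = [da_residual(e, DAC_RANGE, N_BITS) for e in opamp_error]

r_opamp = pearson(v_ideal, opamp_error)
r_da = pearson(v_ideal, da_error)

if __name__ == "__main__":
    print(f"opamp error vs signal correlation: {r_opamp:.3f}")
    print(f"DA error vs signal correlation   : {r_da:.3f}")
```

The opamp error is a scaled copy of the signal, so it is fully correlated with it, whereas the DA residual behaves like a bounded quantization error that is only weakly correlated with the signal.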

## 2.4 Pipelined-SAR ADC Architecture

Fig.2.9 shows the block diagram and timing chart of the two-way interleaved pipelined-SAR ADC. A total of 12 bits is obtained from the 1st stage 2.5-bit MDAC and the 2nd stage 10-bit fine SAR ADC (FSAR). We chose 2.5-bit as the first stage MDAC resolution to achieve higher gain mismatch tolerance. While quantizing more bits in the first stage MDAC would further relax the noise requirements of the 2nd stage SAR, such a design poses a challenge in MDAC capacitor mismatch since small unit capacitors must be used (considering an MDAC area decided by the sampling kT/C noise). Thus, complex gain calibrations would be inevitable to achieve high yields.

Figure 2.8: Matlab simulated FFT results of the pipelined-SAR ADC, where (a) uses an opamp-based MDAC and (b) uses a DA-based MDAC. Since the DA's gain error has no correlation with the input signal, the SFDR improves by 10dB. Note that the opamp gain and the number of DA bits were tuned to achieve the same SNDR.

Figure 2.9: The architecture of the two-way interleaved 12-bit 160MS/s pipelined-SAR ADC.

### 2.4.1 Asynchronous Operation

Since the DA is a charge-based amplifier, no active components exist during the hold phase after amplification. Therefore, the DA circuitry is sensitive to leakage currents and the amplification results can easily be altered in high-leakage PVT conditions. To support low sampling rate operation even in high-leakage PVT conditions, we made the pipeline operation asynchronous to minimize the hold time after amplification. As shown in the timing diagram of Fig.2.9, the ADC is not strictly pipelined: the 2nd stage FSAR conversion is triggered by the finish signal of the DA amplification, and for low sampling rate operation, the entire conversion can finish in a single period. This asynchronous operation does not add performance overhead for our two-stage pipeline configuration but can add complexity if more pipeline stages are required.

### 2.4.2 Look-Ahead SAR Technique

To improve the power efficiency, we adopt the subrange SAR technique in the FSAR [61]. On top of that, we propose a look-ahead (LA) SAR technique which foresees and converts the 3-bit MSB from the half-way DA amplification results. Right after the 3rd cycle of the DA amplification, the 3-bit LA SAR is activated. The LA SAR samples the half-way DA amplification results, and the LA SAR conversion is carried out simultaneously with the DA operation (Fig.2.9). Since the 3-bit MSB results are resolved beforehand by the LA SAR and passed to the FSAR, a total speed improvement of 25% is achieved. The amplification error, noise and offset contained in the LA SAR results are compensated by the FSAR redundancy. Therefore, the LA SAR requirements are greatly relaxed and its area is only 5% of the FSAR. Furthermore, the most power-consuming MSB transitions are done by a small C-DAC, which results in a total DAC switching power saving of 30%.

Next, the 12-bit (10-bit + 2-bit redundancy) FSAR design is discussed. The first redundant bit (with a size of $> 100$ LSB) is placed after the 3rd MSB and compensates for three errors: 1) The sampling error between the FSAR and LA SAR. This is required because the LA SAR samples the half-way amplification results of the DA and such errors must be tolerated. 2) The relative comparator offset mismatch between the FSAR and LA SAR. 3) FSAR MSB settling errors. The second redundant bit is placed after the 7th MSB conversion and is used to tolerate the settling errors caused in the SAR conversion of the 4th to 7th MSBs.



Figure 2.10: Noise contribution breakdown of the ADC.

### 2.4.3 Noise Budget

Fig. 2.10 shows the noise breakdown of the designed ADC. The 1st stage MDAC consumes about 75% of the noise budget, and the majority results from the DA comparator. Therefore, the DA comparator itself must be carefully designed to meet the overall noise requirements. The noise resulting from $kT/C$ and the MDAC capacitors ($C_S$ and $C_F$) is rather small, because the MDAC capacitor sizes were chosen to satisfy the matching requirements.

## 2.5 Circuit Implementation

### 2.5.1 Operational Amplifier

The opamp schematic of the designed MDAC is shown in Fig. 2.11. In order to accomplish low-voltage operation down to 0.7V, we did not use a cascode and adopted a simple two-stage architecture. While the second stage output drives a large output capacitance load (a few pF), the first stage drives only a small load (< 100 fF) with a small gain. To optimize the power consumption of such an opamp, we placed the dominant pole at the second stage output as in ref.[62], instead of using Miller compensation. Pole-splitting is achieved by proper sizing of the first stage, so that it achieves enough $g_m$ and places the 1st stage output pole at high frequencies. Since settling errors due to instability can also be canceled out by the DA, the phase margin design target is relaxed in our design ($40\text{-}50^\circ$).

Figure 2.11: Schematic diagram of the designed opamp.

Figure 2.12: Simulated waveform of the DA-based MDAC. While turning off the opamp causes kickback, the noise is small enough to be canceled by the DA operation.

In addition, a power gating scheme is adopted to minimize the opamp power. The opamp only operates while $\phi_{OP} = \text{High}$ and does not consume power otherwise; the source current is gated as in ref.[59]. However, since the DA operates in a sample-and-hold fashion as in SAR ADCs, we must design the opamp to minimize the kickback noise during DA operation. Due to the low off-resistance of scaled CMOS devices, the voltage nodes $V_{outP1}$ and $V_{outN1}$ may drift significantly due to leakage currents. Such voltage variation will kick back to $V_x$ (opamp input) through the gate-drain coupling of the input transistor, which interferes with the DA operation and degrades the amplification accuracy. In order to prevent such problems, the designed opamp resets $V_{outP1}$ and $V_{outN1}$ to $V_{DD}$ after $\phi_{OP}$ falls. While this causes an initial kickback noise when the DA operation starts, its size is less than 2.5% of the DA C-DAC range and can easily be canceled out (Fig.2.12).

### 2.5.2 Comparator Designs

As we have shown in the last section (Fig. 2.10), the DA comparator contributes most to the ADC noise performance and must be carefully designed. In our design, to achieve both high-speed and low-power operation, a two-stage dynamic comparator similar to ref.[63] was adopted. By careful sizing of the input transistors and bandwidth-limiting capacitors, the comparator achieves an input-referred noise of $160\mu V_{rms}$ in typical conditions. According to system-level simulations, this comparator noise requirement is similar to that of a 12-bit SAR ADC with the same input signal voltage ($1V_{pp}$) and is not an excessive requirement.

Moreover, we found that even with a such low-noise comparator, the power consumption was only 1/3 of the power-gated opamp. Therefore, the power-dominating

Table 2.2: The design of the 8-bit DA C-DAC.

| <b>Bit</b>    | <b>7</b>  | <b>6</b>  | <b>5</b>  | <b>4</b>  | <b>3</b> | <b>2</b> | <b>1</b> | <b>0</b> |
|---------------|-----------|-----------|-----------|-----------|----------|----------|----------|----------|
| <b>Weight</b> | <b>46</b> | <b>26</b> | <b>16</b> | <b>10</b> | <b>6</b> | <b>4</b> | <b>2</b> | <b>1</b> |



Figure 2.13: DA C-DAC settling error versus ADC SNDR is shown. Since we utilize redundancy in the DA C-DAC, it is robust to settling errors.

circuitry is still the opamp (the power breakdown is shown in the measurement section). However, the comparator power will increase exponentially if we target higher resolutions. To mitigate its power, we can adopt low-power techniques such as data-driven comparators[64], LSB averaging[65] and VCO comparators[66], but these will prolong the DA amplification time in return. Lastly, we would like to note that the DA comparator offset appears as the MDAC output offset, similar to an opamp output offset. Since our MDAC has 0.5-bit redundancy, such an offset does not affect the ADC performance and we do not calibrate the comparator offset in our design.

### 2.5.3 DA C-DAC Designs

The structure of the 8-bit (7-bit + 1-bit redundancy) C-DAC utilized in the DA (which we call the DA C-DAC) is shown in Table 2.2. To make most of the bits tolerant to settling errors, we design the DA C-DAC with 1-bit redundancy and a sub-binary radix of 1.73. The simulated DA C-DAC settling error tolerance is shown in Fig.2.13. Even with a settling error of 15% in every bit, the SNDR degradation is less than 1dB. While this 1-bit redundancy can relax the reference voltage designs significantly, the



Figure 2.14: Simplified figure of the ADC capacitor network.

DA amplification time is prolonged by 14% due to the extra cycles.
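The redundancy provided by the weights in Table 2.2 can be checked numerically. The sketch below is only an illustration: it uses a common definition of per-bit redundancy from the redundant-SAR literature, $q_i = 1 + \sum_{j<i} w_j - w_i$ (an assumption, not necessarily the metric used in the design), and the helper names are hypothetical.

```python
# Bit weights of the DA C-DAC from Table 2.2, MSB (bit 7) first.
WEIGHTS = [46, 26, 16, 10, 6, 4, 2, 1]


def redundancy_per_bit(weights):
    """Return the redundancy q_i (in LSBs) for each bit, MSB first.

    q_i = 1 + (sum of all lower weights) - w_i.
    q_i > 0 means a wrong decision at bit i (for example caused by
    incomplete settling) can still be corrected by the remaining bits.
    """
    result = []
    for i, w in enumerate(weights):
        lower_sum = sum(weights[i + 1:])
        result.append(1 + lower_sum - w)
    return result


def average_radix(weights):
    """Geometric-mean ratio between adjacent weights."""
    steps = len(weights) - 1
    return (weights[0] / weights[-1]) ** (1.0 / steps)


redundancy = redundancy_per_bit(WEIGHTS)
bits = range(len(WEIGHTS) - 1, -1, -1)
for bit, w, q in zip(bits, WEIGHTS, redundancy):
    print(f"bit {bit}: weight {w:2d}, redundancy {q:2d} LSB")
print(f"average radix = {average_radix(WEIGHTS):.2f}")
```

Under this definition the five upper bits have positive redundancy while the three lowest bits have none, which is in line with the statement that most (not all) of the bits tolerate settling errors.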

As discussed in the previous sections, the absolute value of $V_{CDACLSB}$ directly determines the DA accuracy and must be carefully designed. Here, we will discuss the C-DAC design methods to meet the target $V_{CDACLSB}$. According to system simulations, the $V_{CDACLSB}$ required to achieve the target is 1.6mV. Importantly, $V_{CDACLSB}$ is decided by the ratio between the DA C-DAC LSB capacitor ( $C_{DALSB}$ ) and the total load capacitance seen at the amplifier output. Fig.2.14 shows the simplified capacitor network. The main load capacitors are the total capacitance of the DA C-DAC $C_{DA}$, the total capacitance of the FSAR C-DAC $C_{SAR}$, the feedback capacitor seen from the MDAC output $C_F$ and the parasitic capacitance $C_P$. $V_{CDACLSB}$ can be derived via capacitive division as below.

$$V_{CDACLSB} = V_{ref} \times \frac{C_{DALSB}}{C_{DA} + C_{SAR} + C_P + C_{S+F}} \quad (2.13)$$

Here, the series capacitance of $C_S$ and $C_F$ is denoted as $C_{S+F}$ and $V_{ref}$ is the reference voltage of the C-DAC. Since the parasitic $C_P$ relies heavily on the layout, several iterations of layout parasitic extraction (LPE) were required to fix the value of $C_{DALSB}$. After LPE simulations, we fixed $C_{DALSB}$ to 2.4fF to meet the target $V_{CDACLSB}$.
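As a rough illustration of how (2.13) is evaluated, the following sketch computes $V_{CDACLSB}$ for one set of capacitor values. Only $C_{DALSB} = 2.4$fF comes from the text; $V_{ref}$, $C_{SAR}$, $C_P$, $C_S$ and $C_F$ are hypothetical placeholders, and $C_{DA}$ is assumed to be 111 unit capacitors (the sum of the weights in Table 2.2), which may not match the actual layout.

```python
def series_capacitance(c_s, c_f):
    """Series combination of C_S and C_F, written C_{S+F} in (2.13)."""
    return c_s * c_f / (c_s + c_f)


def v_cdac_lsb(v_ref, c_da_lsb, c_da, c_sar, c_p, c_s, c_f):
    """Evaluate (2.13): LSB step of the DA C-DAC seen at the amplifier output."""
    c_total = c_da + c_sar + c_p + series_capacitance(c_s, c_f)
    return v_ref * c_da_lsb / c_total


FEMTO = 1e-15
C_DA_LSB = 2.4 * FEMTO          # from the text
C_DA = 111 * C_DA_LSB           # assumption: weights of Table 2.2 in unit caps
V_REF = 0.7                     # hypothetical reference voltage [V]
C_SAR = 400 * FEMTO             # hypothetical
C_P = 200 * FEMTO               # hypothetical parasitic capacitance
C_S = 400 * FEMTO               # hypothetical
C_F = 400 * FEMTO               # hypothetical

v_lsb = v_cdac_lsb(V_REF, C_DA_LSB, C_DA, C_SAR, C_P, C_S, C_F)
print(f"V_CDACLSB = {v_lsb * 1e3:.3f} mV")

# A larger parasitic capacitance reduces the LSB step, which is why C_P
# has to be known from layout extraction before C_DALSB can be fixed.
v_lsb_more_parasitic = v_cdac_lsb(V_REF, C_DA_LSB, C_DA, C_SAR, 2 * C_P, C_S, C_F)
print(f"with doubled C_P: {v_lsb_more_parasitic * 1e3:.3f} mV")
```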



Figure 2.15: Chip photo of the prototype ADC. Evaluation results of the I-channel ADC are shown.



Figure 2.16: ADC measured performance from 3 randomly selected chips. Temperature vs ADC SNDR were measured.

## 2.6 Measurement Results

The ADC implemented in 28nm CMOS occupies $0.097\text{mm}^2$, which includes a 70pF bypass capacitor for the ADC reference voltage (Fig.2.15). Owing to the DA's robustness and the efficient use of the DA C-DAC's redundancy, a low-cost implementation was accomplished. At typical conditions, the ADC achieves an SNDR of 61.1dB with 160MS/s Nyquist input and the power consumption is only 1.9mW. The power



Figure 2.17: ADC measured performance from 3 randomly selected chips, where  $f_s$  and  $f_{in}$  were varied.



Figure 2.18: ADC FFT measured results at  $f_{in}=10.1$  MHz.

includes all necessary ADC components: clock buffer, error correction, reference voltage, and current reference generation. The corresponding Walden FoM is 12.8fJ/conv. To emphasize the calibration-free feature of the DA-based pipelined ADC, we did not apply any calibration for the reported measurement results. However, the effect of inter-channel offset is not included in our measurements, and the reason is described later.

To maximize the power efficiency, the main measurements were carried out with a power supply voltage of 0.7V. The ADC speed can be significantly improved by raising the supply to 0.9V; 320 MS/s can be achieved with a slightly worsened SNDR of 59.6dB. In our measurements, we fixed the input swing to 1Vpp and the SNR performance is similar for both supply voltages. The SNDR is slightly lower at 0.9V because of the higher input frequency (160MHz), which causes more distortion in the sampling. However, the power efficiency greatly degrades to 32.1 fJ/conv. because the opamp draws a larger current for high-speed operation and the digital circuit power increases with higher supply voltages.
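The reported figures of merit can be reproduced approximately from the measured power, SNDR and sampling rate using the standard Walden FoM definition, $\text{FoM} = P / (2^{ENOB} f_s)$ with $ENOB = (\text{SNDR} - 1.76)/6.02$. The sketch below is only a sanity check; small deviations from the reported values are expected because the inputs are rounded.

```python
def enob(sndr_db):
    """Effective number of bits from SNDR in dB."""
    return (sndr_db - 1.76) / 6.02


def walden_fom_fj(power_w, sndr_db, fs_hz):
    """Walden figure of merit in fJ per conversion step."""
    return power_w / (2.0 ** enob(sndr_db) * fs_hz) * 1e15


# Operating points quoted in the text: (supply, power, SNDR, sampling rate).
fom_0v7 = walden_fom_fj(power_w=1.9e-3, sndr_db=61.1, fs_hz=160e6)
fom_0v9 = walden_fom_fj(power_w=8.1e-3, sndr_db=59.6, fs_hz=320e6)

print(f"0.7V: {fom_0v7:.1f} fJ/conv.")
print(f"0.9V: {fom_0v9:.1f} fJ/conv.")
```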

Fig.2.16 shows the ADC SNDR versus temperature for 3 randomly chosen samples. To confirm the robustness of the calibration-free ADC, the temperature was varied from $-40$ to $125^{\circ}C$, and all samples achieve



Figure 2.19: (a) ADC measured DNL. (b) ADC measured INL.



Figure 2.20: Simulated power breakdown of the ADC.

$\text{SNDR} > 59.5\text{dB}$ with 160MS/s operation. At high temperatures, the comparator noise of the DA limits the SNDR. As the temperature goes down, the thermal noise decreases and the SNDR improves. Moreover, the SNDR stays flat over varied $f_s$ and $f_{in}$ (Fig.2.17).

Fig. 2.18 shows the FFT spectrum of the ADC. As analyzed earlier in this chapter, the DA is fundamentally spurious-free, but the SFDR was limited to 73dB in measurements. With further analysis, we found that layout-induced capacitor mismatches in the MDAC limit the SFDR. The spurious tones appeared similarly in all of the measured samples regardless of PVT variations. Furthermore, simulations showed that the SFDR can be improved either by capacitor rotation or by digital gain calibration. The measured ADC DNL/INL are reported in Fig. 2.19.

In 2-channel time-interleaved ADCs, the effects of inter-channel offset mismatch appear at DC and at the Nyquist frequency. However, in our measurements, we calculate the FFT and SNDR after removing the DC and Nyquist frequency bins; the inter-channel offset mismatch effect is therefore excluded from our results. Generally, wireless baseband ADCs are used in an oversampled condition, so useful information rarely exists at the Nyquist frequency and this bin can be removed without impacting the

Table 2.3: Inter-process comparison of the digital amplifier-based MDAC.

|                              | <b>65nm</b>   | <b>28nm</b>   |
|------------------------------|---------------|---------------|
| <b>Supply Voltage</b>        | <b>0.9V</b>   | <b>0.7V</b>   |
| <b>DA bit</b>                | <b>6</b>      | <b>8</b>      |
| <b>Speed</b>                 | <b>40MS/s</b> | <b>80MS/s</b> |
| <b>Power [uW/MS]</b>         | <b>23</b>     | <b>7.7</b>    |
| <b>Area [mm<sup>2</sup>]</b> | <b>0.075</b>  | <b>0.021</b>  |

wireless system performance. In cases where the Nyquist frequency is of interest, inter-channel offset calibrations should be implemented to suppress the offset mismatch effects. Offset calibrations are less complex than gain calibrations and have little impact on the start-up time. By suppressing the Nyquist tone down to SFDR < 75dB, the ADC SNDR will not be affected. For such cases, the inter-channel relative offset should be $\leq 2$ LSB, which can easily be realized by digital calibration.

Fig. 2.20 shows the simulated power breakdown of the ADC. The 1st stage MDAC consumes almost 70% of the entire energy and the rest is consumed by the 2nd stage SAR. The opamp is still the dominant power consumer, since it must complete a coarse but fast amplification. Future research may be directed at making the coarse amplifier more power-efficient; ring amplifiers [45] and dynamic amplifiers [51] would be a great fit for such roles.

### 2.6.1 Scaling Effects of the Digital Amplifier

In order to evaluate the process scaling effects of the digital amplifier, an adequate approach is to implement the same circuit in different CMOS processes and compare the performance. Therefore, to conduct an inter-process evaluation of the DA, we prototyped a DA-based 12-bit pipelined ADC in 65nm CMOS (Fig.2.21). The ADC



Figure 2.21: A digital amplifier-based 11-bit pipelined ADC prototyped in 65nm CMOS.

Table 2.4: Performance Comparison with state-of-the-art Pipelined and Pipelined-SAR ADCs.

|                                | This work                          |             | VLSI 2014 Verbruggen        | JSSC 2015 Zhou        | ISSCC 2012 Chai   |
|--------------------------------|------------------------------------|-------------|-----------------------------|-----------------------|-------------------|
| Process                        | <b>28nm</b>                        |             | 28nm                        | 40nm                  | 65nm              |
| Architecture                   | <b>Pipelined-SAR w/Digital Amp</b> |             | Pipelined-SAR w/Dynamic Amp | Pipelined-SAR w/Opamp | Pipelined w/Opamp |
| Interleave                     | <b>2x</b>                          |             | 2x                          | -                     | -                 |
| Supply [V]                     | <b>0.7</b>                         | <b>0.9</b>  | 0.9                         | 1.1                   | 1                 |
| Input range [V <sub>pp</sub> ] | <b>1</b>                           | <b>1</b>    | N.A.                        | 2                     | 1.3               |
| F <sub>s</sub> [MS/s]          | <b>160</b>                         | <b>320</b>  | 200                         | 160                   | 200               |
| SNDR [dB]                      | <b>61.1</b>                        | <b>59.6</b> | 65                          | 65.3                  | 57                |
| Power [mW]                     | <b>1.9</b>                         | <b>8.1</b>  | 2.3                         | 5                     | 5.4               |
| FoMW [fJ/conv.]                | <b>12.8</b>                        | <b>32.1</b> | 7.9                         | 20.7                  | 46.4              |
| Area [mm <sup>2</sup> ]        | <b>0.097 (Inc. Decap)</b>          |             | 0.35 (Inc. Decap)           | 1.87 (Inc. Decap)     | 0.19              |
| Calibration?                   | <b>No</b>                          |             | Yes (Gain, etc.)            | Yes (Gain)            | No                |



Figure 2.22: Benchmark against Pipelined and Pipelined-SAR ADC published in ISSCC and VLSI. Our work achieves 3× power efficiency improvement compared to ADCs without gain calibrations.

is designed with a similar noise budget and accomplishes a comparable SNDR of 61.8dB. Importantly, the DA's core circuit is identical, sharing the design of the comparator and the SA logic. While the ADC architectures differ (Pipelined and Pipelined-SAR) and a direct comparison cannot be made, the 1st MDAC stage designs are almost the same and are used to evaluate the DA's process scaling effects.

Table 2.3 compares the performance of the 1st MDAC stages. Since better opamp gain can be achieved in 65nm CMOS, its DA is designed with 6 bits. However, the DA cycle speed is much higher in 28nm CMOS, yielding a 2× improvement in speed. Moreover, the DA area and power efficiency were significantly enhanced in 28nm CMOS due to the digital nature of the DA, and 3× improvements were observed. The power efficiency also benefits from the low supply voltage (0.7V) used in 28nm CMOS. We expect continuous performance improvement of DA-based MDACs with further scaled processes, as long as digital circuit performance keeps improving.

### 2.6.2 Benchmarks

Table 2.4 compares our ADC performance against state-of-the-art pipelined-SAR and pipelined ADCs achieving similar performance [29], [28], [59]. While accomplishing an energy efficiency competitive with pipelined ADCs utilizing open-loop amplifiers and gain calibration, our ADC did not require any calibration at all. Moreover, the required overall ADC area is $3 - 18 \times$ smaller. While prior works with open-loop amplifiers utilize bypass capacitors of several nF due to low power supply rejection, the DA is robust to power supply noise and our design uses only 70pF of decoupling capacitance.

Moreover, based on [24], the author categorized ADCs published in ISSCC and VLSI by whether or not they utilize gain calibration and performed an extensive comparison (Fig. 2.22). ADCs meeting our design target ( $f_s > 100\text{MS/s}$, $\text{FoM} < 20\text{fJ/conv.}$, $\text{SNDR} > 56\text{dB}$ ) conventionally employed gain calibration, which had underlying issues regarding SoC start-up time and stability. To the best of the author's knowledge, our ADC achieves an FoM of 12.8fJ/conv. without calibration, which is a 3× improvement compared to conventional calibration-free pipelined and pipelined-SAR ADCs with $f_s > 50\text{MS/s}$ and $\text{SNDR} > 56\text{dB}$.

## 2.7 Conclusions

We introduced the concept and implementation of the digital amplifier (DA) to realize a calibration-free, process-scalable pipelined-SAR ADC. The amplification features of the DA were extensively studied, such as the gain-error principles and spurious-free characteristics. We showed that the DA accuracy is determined by the C-DAC LSB step and is independent of the intrinsic gain, showing potential for further process scalability. In addition, due to the relaxed settling requirements, we showed that significant power savings can be achieved compared to opamp-based MDACs.

Measurement results of the calibration-free 0.7V 12b 160MS/s pipelined-SAR ADC were reported. Without any calibration, the ADC achieved $\text{SNDR} = 61.1\text{dB}$ and FoM = 12.8fJ/conv., achieving over 3× power efficiency improvement compared to conventional calibration-free high-speed pipelined ADCs. Finally, an inter-process performance comparison was carried out to confirm the process scalability of the DA.

# Chapter 3

## Dynamic Architecture Configuring

### 3.1 Introduction

This chapter focuses on designs for high-speed ADCs in scaled CMOS technologies, which are required for e.g. wireless ultra-wideband (UWB) communications. Moreover, for wireless mobile devices, such ADCs should be power efficient to lessen the impact on battery life.

What are some common approaches to designing high-speed and low-power ADCs? The most common and popular approach is to time-interleave power-efficient SAR ADCs. By heavily utilizing successive approximation circuitry, the ADC becomes process scalable as well. However, the downside of time-interleaving is that the core ADC area increases in proportion to the number of interleaved channels, which impacts area cost. Moreover, inter-channel gain and timing mismatch calibrations increase complexity as well. Flash ADCs are another option, which can realize high speed with a minimum number of channels. However, flash is notorious for its power-hungriness because a significant amount of redundant circuitry operates. To summarize, both ADC architectures have a design tradeoff between area and power, and no optimum solution exists.

While ADCs must be designed to operate at the highest sampling frequency, such "highest speed" conditions are rarely used in real-life scenarios. For example, in mobile communications, it is rare that a single user will use every channel (or



Figure 3.1: Aggressive power scaling with DVFS, commonly utilized in CPUs.

frequency band) and the assigned frequency band will vary with the number of users in the particular environment. Therefore, if there are many users in the environment, the assigned frequency band per user will be reduced (even with UWB). The available frequency band for a user may be as small as 20MHz, or as large as 1 GHz if the environment is sparse.

Our research question is: can we aggressively improve the frequency power scaling of high-speed ADCs? Such a characteristic is important given that the ADC sampling frequency varies widely during use. Aggressive power scaling is commonly realized in CPUs with the dynamic voltage and frequency scaling (DVFS) technique (Fig. 3.1) [38]. When the CPU is idle, it lowers its operating frequency to save power. Simultaneously, it lowers its supply voltage to further reduce power (modern CPUs normally have a DC-DC converter per logic core). Since digital circuit power consumption is

$$Power = C \times freq. \times V_{DD}^2 \quad (3.1)$$

lowering the supply voltage can aggressively reduce the power consumption. Can we utilize the same technique in the ADC and simply lower its supply voltage when



Figure 3.2: Dynamic power scaling of an ADC without any power scaling techniques, with DVFS, and with DAFS, respectively.

the required sampling frequency is low? The answer might be negative because analog circuits have a much higher power supply sensitivity than digital circuits; even a slight reduction of the supply voltage will greatly reduce the sampling rate. Thus the voltage-scalable frequency band will be very narrow and only a small benefit will be gained. Moreover, the overhead of having a dedicated DC-DC converter per ADC core may be too large; typical high-efficiency DC-DC converters are much larger than the ADC itself.

How can we achieve better frequency power scaling without tuning the supply voltage? Our main idea is to configure between the successive approximation (SA) and flash ADC architectures dynamically, realizing a *hybrid operation ADC*. Such an ADC will have a maximum operating frequency equal to that of flash, and as the frequency is lowered, the power consumption will approach that of the SA. We name this frequency scaling technique, which dynamically switches architectures, Dynamic Architecture and Frequency Scaling (DAFS) [67][68].

Fig. 3.2 compares the ADC power scaling with and without DAFS. The flash ADC is made reconfigurable so that it can also operate as an SA ADC. By reconfiguring the ADC between SA and flash ADC every conversion cycle, the

ADC achieves a maximum speed similar to flash ADCs and a super-linear power scaling that surpasses that of flash, realizing a low-cost frequency power-scaling ADC. DAFS not only improves the ADC power scaling but also tracks the change in conversion delay caused by process, voltage and temperature (PVT) variation. For example, if the ADC operates in slow corners, more flash operations are inserted automatically to reduce the excess-delay. Since architecture configuring eases the effects of speed variation, the design margins required for high-speed ADCs can be relaxed. To prove the effectiveness of DAFS, a 7-bit subranging ADC was designed in 65nm CMOS and superlinear power scaling was observed in the range of 820 to 1220 MS/s.

This chapter is organized as follows: Section 3.2 describes the basic operation and an analysis of DAFS with a simplified ADC. Section 3.3 presents a 7-bit subranging ADC that uses DAFS and describes its operation. The specific sub-ADC design is described as well. The experimental results and discussions are given in Section 3.4.

## 3.2 Dynamic Architecture and Frequency Scaling

### 3.2.1 Binary search (Successive approximation) and flash reconfigurable ADC

The proposed DAFS technique is based on two architectures, the flash ADC and the successive approximation (or binary search) ADC [69]. These two architectures are often used for high-speed ADCs with resolutions under 6 bits and have a clear power and speed tradeoff. First, Fig. 3.3 (a) shows a schematic diagram of a 3-bit flash ADC. Seven comparators with different comparison thresholds ( $> \frac{1}{8}$, $> \frac{2}{8}$, $> \frac{3}{8} \dots$ ) are used, and the flash ADC operates by simply activating all of the comparators at once. The flash ADC's conversion delay ( $t_{FL}$ ) equals the delay of a single comparator ( $t_{comp}$ ) plus the reset time of the comparator:



Figure 3.3: (a) Schematic of 3-bit flash ADC. (b) Schematic of 3-bit binary search ADC.

$$t_{FL} \simeq 2t_{comp} \quad (3.2)$$

Although this is the fastest ADC architecture, the flash ADC is notorious for its high power consumption. When the power of a single comparator is  $P_{comp}$  and  $N$  stands for the ADC resolution, the flash ADC power consumption ( $P_{FL}$ ) can be expressed as

$$P_{FL} = (2^N - 1)P_{comp} \quad (3.3)$$

and $P_{FL}$ increases exponentially with $N$.

Secondly, a schematic of a 3-bit successive approximation (binary search) ADC is shown in Fig. 3.3 (b). While the "successive approximation" ADC mentioned here is fundamentally similar to the "SAR" ADCs discussed in chapters 1 and 2, its structure differs. Since the term "successive approximation" ADC may cause confusion, we will use the term "binary search ADC" as in the original paper. SAR ADCs conduct binary search by storing (or registering) the comparison results in the logic circuit and updating the C-DAC reference voltage based on such data. On the other hand, binary search ADCs change which comparator to activate based on the previous comparison results.

Like the flash architecture, the 3-bit binary search ADC uses seven comparators. When CLK rises, only the MSB comparator is activated, which has a threshold of $\frac{4}{8}$. If the input is larger than $\frac{4}{8}$, the comparator with the $\frac{6}{8}$ threshold is successively activated by the MSB comparator, based on a binary search algorithm. If the input is smaller than $\frac{4}{8}$, the comparator with the $\frac{2}{8}$ threshold is activated instead. Similarly, only one of the 3rd bit comparators is activated, depending on the 2nd bit comparator's result. The conversion delay $t_{BS}$ including the comparator reset time will be:

$$t_{BS} \simeq (N + 1)t_{comp} \quad (3.4)$$

While the maximum conversion speed is roughly inversely proportional to $N$, the power efficiency is superior to that of the flash ADC, since only $N$ comparators are activated per conversion:

$$P_{BS} = N \times P_{comp} \quad (3.5)$$

Interestingly, unlike SAR ADCs, binary search ADCs require no time for logic delays or C-DAC settling, potentially achieving faster conversion speeds. However, the number of comparators increases exponentially with resolution, so the architecture cannot be used for higher resolutions.

We can see that flash and binary search ADCs have a distinctive tradeoff between power and speed. DAFS exploits this characteristic to achieve both low-power and high-speed operation by adaptively configuring the ratio in which these two architectures operate during the ADC conversion.

For DAFS to work, architecture reconfiguration between flash and binary search must be realized. Therefore, a binary search/flash reconfigurable ADC which enables fast and simple reconfiguration is proposed (Fig. 3.4); it is obtained by simply inserting OR cells into the comparator activation paths. The architecture configure signal (B/F) determines which ADC architecture is used: when B/F is High, the ADC operates as a flash ADC; when B/F is Low, it operates as a binary search ADC. First, we explain the ADC operation when B/F is *High* and CLK rises. In this case, the AND cell outputs *High* to all of the OR cells, which in turn output *High* as well. Therefore, the OR cells activate all of the comparators simultaneously, which is equivalent to a flash ADC operation.

On the other hand, when B/F is *Low*, the output of the AND cell is *Low* as well. In order for an OR cell to output *High*, the previous comparator must supply *High*, which is equivalent to a binary search ADC operation. The overheads of the reconfiguration are a single AND cell and $(2^N-2)$ OR cells, which is remarkably small in terms



Figure 3.4: Schematic of the proposed binary search/flash reconfigurable ADC, realized by just adding OR cells to conventional Flash ADCs.



Figure 3.5: (a) Simplified test bench with a 3-bit ADC using DAFS. (b) Timing chart showing the basic operation of the ADC.

of area and delay. However, the additional clock path delivering the architecture control signal to each comparator increases the ADC power by 5%.
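The behavior of the two operating modes can be illustrated with a small behavioral model. This is only a sketch under simplifying assumptions (ideal comparators, input normalized to the range 0 to 1, delays in units of $t_{comp}$); the function names are hypothetical and the model does not represent the gate-level circuit of Fig. 3.4.

```python
def convert(vin, n_bits=3, flash_mode=False):
    """Convert a normalized input (0 <= vin < 1) into an n_bits code.

    Returns (code, comparators_activated).
    flash_mode=True corresponds to B/F = High, False to B/F = Low.
    """
    levels = 2 ** n_bits
    if flash_mode:
        # All 2^N - 1 comparators fire at once; the code is the number
        # of thresholds below the input (thermometer code).
        thresholds = [k / levels for k in range(1, levels)]
        code = sum(1 for th in thresholds if vin > th)
        return code, len(thresholds)

    # Binary search: each decision selects which comparator fires next,
    # so only one comparator per bit is activated.
    code = 0
    activated = 0
    for bit in range(n_bits - 1, -1, -1):
        trial = code | (1 << bit)
        activated += 1
        if vin > trial / levels:
            code = trial
    return code, activated


def conversion_delay(n_bits, flash_mode, t_comp=1.0):
    """Delay including the comparator reset, following (3.2) and (3.4)."""
    return 2 * t_comp if flash_mode else (n_bits + 1) * t_comp


for vin in (0.10, 0.55, 0.90):
    code_fl, n_fl = convert(vin, flash_mode=True)
    code_bs, n_bs = convert(vin, flash_mode=False)
    print(f"vin={vin:.2f}: flash code={code_fl} ({n_fl} comparators), "
          f"binary search code={code_bs} ({n_bs} comparators)")
```

Both modes return the same code; they differ only in how many comparators are activated (7 versus 3 for 3 bits) and in the conversion delay, which is the tradeoff DAFS exploits.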

### 3.2.2 DAFS operation

The basic concepts of DAFS are explained with a simple 3-bit ADC in Fig. 3.5 (a). As explained in 3.2.1, the ADC is architecture reconfigurable: it operates as a binary search ADC when the architecture configure signal (B/F) is Low and as a flash ADC when it is High. DAFS requires a 2-ch time-interleaved sample-and-hold circuit (S/H), which makes the sampling network more complex than that of typical ADCs. As shown in the schematic of Fig. 3.5 (a), the ADC sampling network consists of 2-ch time-interleaved S/Hs, and a MUX switches the input given to the ADC (VADC).

The basic timing chart is shown in Fig. 3.5 (b). When $CLK_{ADC}$ rises at the start of cycle 1, the ADC starts the conversion. As soon as the ADC finishes the conversion, the conversion finish signal ($FIN$) rises. $FIN$ is fed to the control circuit (DAFS CTRL), which pulls down $CLK_{ADC}$ and toggles DMUX to switch the input channel used in the next cycle. These actions are taken during the ADC conversion phase. In the subsequent ADC reset phase, as soon as $CLK_{ADC}$ falls, the comparator outputs are reset and $FIN$ falls. At this point, the ADC is ready for the next conversion.



Figure 3.6: (a) DAFS operation at  $f_{s_{maxBS}} > f_s$ . (b) DAFS operation at  $f_{s_{maxBS}} < f_s < f_{s_{maxFL}}$ . (c) DAFS operation at  $f_s \doteq f_{s_{maxFL}}$ .

Fig. 3.6 shows the ADC operation at several frequencies: $f_{s_{maxBS}} > f_s$, $f_{s_{maxBS}} < f_s < f_{s_{maxFL}}$ and $f_s \simeq f_{s_{maxFL}}$. $f_{s_{maxBS}}$ and $f_{s_{maxFL}}$ are the maximum operation frequencies for binary search and flash conversions, respectively. To start with, let us consider the DAFS ADC operation when $f_{s_{maxBS}} > f_s$ (Fig. 3.6 (a)); for comparison, the ADC operation with only flash is plotted as well. Since the flash conversion time ( $t_{FL}$ ) is much shorter than the cycle ( $\frac{1}{f_s} = t_{CYC}$ ), the conversion is completed with a large margin. In other words, the ADC is idle for over half of the given time $t_{CYC}$. The ADC reset time is indicated as $R$ in the figure. When DAFS is used, the ADC operates as binary search to reduce the power. Since $f_{s_{maxBS}} > f_s$, the binary search conversion time ( $t_{BS}$ ) is still shorter than $t_{CYC}$ and the conversion can be completed without any architecture configuration.

Next, let us examine the ADC operation when $f_{s_{maxBS}} < f_s < f_{s_{maxFL}}$ (Fig. 3.6 (b)). Since $f_s$ is still below $f_{s_{maxFL}}$, the flash conversion is completed with a margin. On the other hand, $f_s$ is now higher than $f_{s_{maxBS}}$, meaning that $t_{BS} > t_{CYC}$. The ADC operation with only binary search is also shown for comparison; in this case, the binary search conversion does not finish within cycle 1 and is prolonged into cycle 2. We can calculate the excess-delay (EXD) generated in cycle 1 as:

$$EXD[Cyc.1] = t_{BS} - t_{CYC} \quad (3.6)$$

If only binary search is used, EXD occurs in every cycle and (3.6) accumulates, so the conversion is eventually corrupted. However, by configuring the architecture to flash, the EXD can be canceled. The operation with DAFS is plotted as well, in which the DAFS CTRL circuit monitors whether EXD is positive. Since a positive amount of EXD is detected at the beginning of cycle 2, B/F is turned to High and the ADC is configured to operate as a flash ADC in cycle 2. Since $t_{FL} < t_{CYC}$, the EXD of cycle 2 can be expressed as:

$$EXD[Cyc.2] = t_{FL} - t_{CYC} < 0 \quad (3.7)$$

which is a negative value. Therefore, the total accumulated EXD ( $\Sigma$EXD) of these two cycles will be,

$$EXD[Cyc.1] + EXD[Cyc.2] = (t_{BS} - t_{CYC}) + (t_{FL} - t_{CYC}) < 0 \quad (3.8)$$

Equation (3.8) shows that by using the flash operation, the ADC succeeds in cancelling EXD produced in cycle 1. The A/D conversion can be continued while consuming significantly less power than ADCs conducting only flash operations.

Lastly, let us examine the operation when $f_s \simeq f_{s_{maxFL}}$ (Fig. 3.6 (c)). Here as well, the binary search operation in cycle 1 produces a large amount of EXD and hence the ADC is configured to flash in cycle 2. However, as $f_s$ rises, the EXD cancelling effect lessens:

$$EXD[Cyc.1] + EXD[Cyc.2] = (t_{BS} - t_{CYC}) + (t_{FL} - t_{CYC}) > 0 \quad (3.9)$$

Therefore, not all of the EXD that arose in cycle 1 can be canceled at once, and the conversion is prolonged into cycle 3. Similarly, the DAFS CTRL circuit judges that EXD is still positive and the ADC operates as a flash in cycle 3 as well. The flash operation continues until EXD is completely canceled:

$$EXD[Cyc.1] + EXD[Cyc.2] + EXD[Cyc.3] + EXD[Cyc.4] = (t_{BS} - t_{CYC}) + 3(t_{FL} - t_{CYC}) < 0 \quad (3.10)$$

In Fig. 3.6 (c), three flash operations are used to cancel the EXD produced by a single binary search operation.
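The three cases above can be reproduced with a simple cycle-by-cycle model of the control decision. This is a minimal sketch, not the actual DAFS CTRL circuit: it assumes the controller selects flash whenever the accumulated EXD is positive, that slack left over after an early finish is discarded (the ADC simply waits for the next sample), and it uses hypothetical timings of $t_{BS} = 4$ and $t_{FL} = 2$ in units of $t_{comp}$, i.e. (3.2) and (3.4) with $N = 3$.

```python
def simulate_dafs(t_bs, t_fl, t_cyc, n_cycles):
    """Return the list of architectures ("BS" or "FL") used in each cycle."""
    if t_cyc < t_fl:
        raise ValueError("f_s exceeds the maximum flash frequency")
    exd = 0.0          # accumulated excess-delay
    modes = []
    for _ in range(n_cycles):
        if exd > 0:
            # Positive EXD detected: configure to flash to cancel it.
            modes.append("FL")
            exd += t_fl - t_cyc
        else:
            modes.append("BS")
            exd += t_bs - t_cyc
        # Assumption: unused slack cannot be carried over to later cycles.
        exd = max(exd, 0.0)
    return modes


T_BS, T_FL = 4.0, 2.0   # hypothetical delays in units of t_comp

for label, t_cyc in (("(a)", 5.0), ("(b)", 3.0), ("(c)", 2.5)):
    modes = simulate_dafs(T_BS, T_FL, t_cyc, n_cycles=8)
    flash_fraction = modes.count("FL") / len(modes)
    print(f"{label} t_cyc={t_cyc}: {' '.join(modes)}  "
          f"flash fraction = {flash_fraction:.2f}")
```

With these timings, case (b) alternates between binary search and flash, and case (c) needs three flash operations per binary search, matching the sequences described for Fig. 3.6. For timings where the EXD is not an integer multiple of the per-cycle recovery, this simplified model wastes some slack and therefore uses flash slightly more often.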

### 3.2.3 Analysis of DAFS

The above study for different $f_s$ ranges makes four main points. (a) When $f_s$ is higher than $f_{s_{maxBS}}$, flash operations begin to be inserted. (b) By conducting flash operations, the excess-delay produced by a binary search operation can be canceled. (c) The flash operation continues until the excess-delay is completely canceled. (d) The occurrence of flash operations increases with $f_s$.

This section further analyzes the ADC in terms of its response to PVT variations and its power consumption. First, let us define the binary search versus flash ratio (BF ratio), which signifies how much flash operation is used at a specific $fs$.

$$\text{BF ratio} = \frac{\text{Num. of Flash conv.}}{\text{Num. of BS conv.} + \text{Num. of Flash conv.}} \quad (3.11)$$

For example, the BF ratios of the operations shown in Fig. 3.6 are 0, 0.5 and 0.75. Next, let us estimate the BF ratio for a given  $fs$ . When  $fs_{maxBS} > fs$ , the ADC operation is fully a binary search and BF ratio will always be 0. However, when  $fs > fs_{maxBS}$ , there is a positive amount of EXD and flash operation will be used. The BF ratio in this case is determined from the number of flash conversions required to cancel EXD produced by a single binary search operation. Namely,

$$\text{BF ratio} = \frac{(t_{BS} - t_{CYC}) / (t_{CYC} - t_{FL})}{1 + (t_{BS} - t_{CYC}) / (t_{CYC} - t_{FL})} = \frac{t_{BS} - t_{CYC}}{t_{BS} - t_{FL}} \quad (3.12)$$

If we suppose that  $t_{BS}$  and  $t_{FL}$  are insensitive to the input signal, the BF ratio for specific  $fs$  can be estimated. Moreover, if we substitute  $\frac{1}{fs} = t_{cyc}$ , we can express (3.12) with frequency as below.

$$\text{BF ratio} = \frac{fs_{maxFL}}{fs} \left( \frac{fs - fs_{maxBS}}{fs_{maxFL} - fs_{maxBS}} \right) \quad (3.13)$$
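The step from (3.12) to (3.13) can be written out explicitly. The derivation below assumes, in addition to $t_{CYC} = \frac{1}{fs}$, that the maximum frequencies correspond to a conversion that exactly fills one cycle, i.e. $t_{BS} = \frac{1}{fs_{maxBS}}$ and $t_{FL} = \frac{1}{fs_{maxFL}}$:

$$\begin{aligned} \frac{t_{BS} - t_{CYC}}{t_{BS} - t_{FL}} &= \frac{\frac{1}{fs_{maxBS}} - \frac{1}{fs}}{\frac{1}{fs_{maxBS}} - \frac{1}{fs_{maxFL}}} = \frac{\frac{fs - fs_{maxBS}}{fs \cdot fs_{maxBS}}}{\frac{fs_{maxFL} - fs_{maxBS}}{fs_{maxFL} \cdot fs_{maxBS}}} \\ &= \frac{fs_{maxFL}}{fs} \left( \frac{fs - fs_{maxBS}}{fs_{maxFL} - fs_{maxBS}} \right) \end{aligned}$$

As a check, the expression gives 0 at $fs = fs_{maxBS}$ and 1 at $fs = fs_{maxFL}$, as expected for pure binary search and pure flash operation.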

Two interesting characteristics of DAFS can be studied with the help of equations (3.12) and (3.13): PVT drift tracking and power consumption. To start with, let us examine how the BF ratio changes with PVT drift. Here, we will assume that a PVT drift will slow down the transistor (i.e. higher temperature, slow corners) and increase  $t_{BS}$  and  $t_{FL}$ . As a result, the binary search operation produces more EXD, and the amount of EXD cancelled by flash operation decreases as well. Thereupon,



Figure 3.7: Dynamic power scaling of an ADC operating only with flash and with DAFS, respectively

the number of flash operations increases, and so does the BF ratio. On the other hand, with faster transistors, the BF ratio decreases because less EXD is produced and more EXD can be cancelled with flash. Normally when designing high-speed ADCs, we must put large design margins into the circuit to meet the target $fs$ even in the slowest corner condition, and this can lead to a large power overhead. With DAFS, this design margin can be significantly relaxed.
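The PVT tracking behavior can be illustrated numerically with (3.12). In this sketch, the nominal delays and the ±10% delay scaling for the fast and slow corners are hypothetical values chosen only for illustration; the cycle time is fixed by the sampling clock and is assumed not to change with PVT.

```python
def bf_ratio(t_bs, t_fl, t_cyc):
    """BF ratio from (3.12), limited to the range [0, 1]."""
    ratio = (t_bs - t_cyc) / (t_bs - t_fl)
    return min(max(ratio, 0.0), 1.0)


T_BS_NOMINAL = 4.0   # hypothetical, in units of the nominal t_comp
T_FL_NOMINAL = 2.0   # hypothetical, in units of the nominal t_comp
T_CYC = 3.0          # set by the sampling clock

results = {}
for corner, delay_scale in (("fast", 0.9), ("nominal", 1.0), ("slow", 1.1)):
    results[corner] = bf_ratio(T_BS_NOMINAL * delay_scale,
                               T_FL_NOMINAL * delay_scale, T_CYC)
    print(f"{corner:8s}: BF ratio = {results[corner]:.3f}")
```

Under these assumptions the BF ratio rises from about 0.33 in the fast corner to about 0.64 in the slow corner, so the ADC spends more cycles in flash mode exactly when the comparators become slower.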

Second, the ADC power consumption ( $P_{ADC}$ ) is estimated from the BF ratio; this is useful when designing and analyzing DAFS ADCs. Our goal is to express  $P_{ADC}$  with  $fs$ , which represents the ADC power scaling. While deriving the exact power scaling is cumbersome, we can simply understand DAFS power scaling as a linear scaling having two regions.

$$P_{ADC} = fs \times P_{BS} [fs \leq fs_{maxBS}]$$

$$P_{ADC} = fs \times (P_{BS} + \alpha) [fs > fs_{maxBS}]$$

As $fs$ exceeds $fs_{maxBS}$ and flash operations begin to be inserted, the power scaling function changes to the latter.

Here, $\alpha$ is a constant expressing the additionally inserted flash operations. We can see that the DAFS ADC power scaling is piecewise linear, with a slope that increases once $f_s$ exceeds $f_{s_{maxBS}}$. The DAFS power scaling for an ADC resolution of 3 bits is plotted in Fig. 3.7, together with the power scaling of a flash ADC for comparison. By dynamic architecture configuration, superlinear power scaling can be obtained. The entire power scaling curve of the DAFS can be fitted to a quadratic scaling of $k*f_s^2$, where $k$ is a constant satisfying $k = \frac{P_{FL}}{f_{s_{maxFL}}}$; thus we can call this power scaling "super-linear".
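One way to make the two regions concrete is to weight the per-conversion energy by the BF ratio of (3.13). This is a rough sketch and not the derivation used in this thesis. It assumes that $P_{BS}$ and $P_{FL}$ act as energies per conversion, and that a 3-bit binary search and a 3-bit flash cost 3 and 7 comparator activations respectively (arbitrary units). All names are hypothetical.

```python
def dafs_power(fs, fs_max_bs, fs_max_fl, e_bs, e_fl):
    """Average power when each conversion is either binary search or flash.

    Valid for fs up to fs_max_fl; e_bs and e_fl are energies per conversion.
    """
    if fs <= fs_max_bs:
        bf = 0.0
    else:
        bf = (fs_max_fl / fs) * (fs - fs_max_bs) / (fs_max_fl - fs_max_bs)
    energy_per_conversion = (1.0 - bf) * e_bs + bf * e_fl
    return fs * energy_per_conversion


def flash_power(fs, e_fl):
    """Power of an ADC that always uses flash conversions."""
    return fs * e_fl


# Assumed energies: comparator activations per 3-bit conversion.
E_BS, E_FL = 3.0, 7.0
FS_MAX_BS, FS_MAX_FL = 820.0, 1220.0  # illustrative limits in MS/s

for fs in (400.0, 820.0, 1000.0, 1220.0):
    p_dafs = dafs_power(fs, FS_MAX_BS, FS_MAX_FL, E_BS, E_FL)
    print(f"fs = {fs:6.0f}  DAFS = {p_dafs:8.1f}  flash only = {flash_power(fs, E_FL):8.1f}")
```

Under these assumptions the DAFS curve follows the binary search line up to $fs_{maxBS}$ and meets the flash-only line at $fs_{maxFL}$.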

In ADCs using DAFS, the EXD produced by a binary search can be canceled by flash operation as long as its duration is within a cycle. Conversely, DAFS can only be used when  $t_{BS} < 2t_{CYC}$ . If the ADC does not meet this requirement and conducts a binary search in cycle  $N$ , it cannot do any conversion at cycle  $N+1$  and there will be a loss of data. Furthermore, when the resolution of the binary search/flash configurable ADC is increased,  $t_{BS}$  will become larger as in (3.5) and DAFS cannot be used up to  $f_{s_{maxBS}}$ . Hence, for higher resolution, partially active flash (PAF) architecture [5] can be used instead of a binary search to reduce  $t_{BS}$ . Since PAF is an architecture in between binary search and flash, the PAF and flash architecture reconfiguration can be achieved by modifying a binary search/flash configurable ADC.

Lastly, let us discuss the comparator metastability issues in DAFS ADCs. Here, we define the metastable state as one in which the comparator decision is prolonged so much that it corrupts the ADC results. In conventional ADCs, the conversion must satisfy $t_{ADC} < t_{cyc}$, and if comparator metastability prolongs the decision such that $t_{ADC} > t_{cyc}$, the results can become corrupted. In DAFS ADCs, $t_{ADC}$ is short in flash operation and there is only a small chance of metastability. In binary search operation, however, $t_{ADC}$ can be twice as long and metastability can become an issue. Even so, with DAFS the conversion results can still be obtained as long as $t_{ADC} < 2t_{cyc}$ is satisfied, and a comparator metastability within this range is simply accounted for as EXD. Therefore, the chance of a metastable state occurring is greatly reduced.



Figure 3.8: Block diagram of the 7-bit subranging ADC. DAFS is applied to the 3-bit coarse and fine sub-ADCs.

## 3.3 7-bit Subranging ADC

The 7-bit subranging ADC's block diagram is shown in Fig. 3.8. An MSB (1 bit) is obtained in the folding circuit, and 3 bits are acquired from each of the coarse and fine sub-ADCs. All results are added together to generate the 7-bit output. By using four-times-interleaved S/H and folding circuits, a four-phase pipeline operation is realized to enhance the subranging ADC throughput and enable the DAFS operation described in Section 3.2 (this will be explained later on). Folding circuits not only provide a low-power MSB decision; they also halve the fine ADC reference (fine ref.) transition. Since the fine ref. settling requirement is greatly relaxed, the settling can be completed within the ADC reset time and the subranging ADC does not require additional reset phases. Lastly, we should note that the coarse and fine sub-ADCs are single-channel and not time-interleaved; in each phase, the sub-ADCs switch their input and select the channel to convert.

The subranging ADC's conversion consists of four conversion phases: S/H, folding, coarse conversion, and fine conversion. Fig. 3.9 shows the operation of the four channels (Ch.0-3), and note that each channel operates with a conversion phase rotated 90 degrees. The ADC conversion is explained by focusing on the operation of Ch.0 as an example. Here, we will assume that at a certain phase P[N], Ch.0



Figure 3.9: Operation of the four channels (Ch.0-3) of the 7-bit subranging ADC.



Figure 3.10: Schematic of the full implementation of S/H and folding circuits.

performs S/H. The sampling switch is closed and the input signal ( $V_{IN}$ ) is sampled onto capacitor $C_S$; the switch opens at the end of P[N]. At P[N+1], the MSB comparator of the folding circuit is activated and decides the MSB; simultaneously, its result is used to switch the chopper circuit, which rectifies $V_{IN}$. Next, at P[N+2], the 3-bit coarse conversion is performed. In this example, the input is somewhere between 2/8 and 3/8, so the seven fine refs. are switched according to the result, to 17/64, 18/64, 19/64, etc. Finally, at P[N+3], the 3-bit fine conversion *zooms* into the coarse converted range.
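The coarse and fine steps in this example can be reproduced with exact fractions. This is a behavioural sketch only. It assumes an ideal rectified input normalized to $[0, 1)$ and ignores the MSB fold and the folding gain $A_v$; the function names are hypothetical.

```python
from fractions import Fraction


def coarse_refs():
    """Seven coarse references k/8, k = 1..7, for the rectified input."""
    return [Fraction(k, 8) for k in range(1, 8)]


def fine_refs(coarse_code):
    """Seven fine references inside the segment selected by coarse_code."""
    return [Fraction(8 * coarse_code + k, 64) for k in range(1, 8)]


def thermometer_count(value, refs):
    """Number of references at or below value (ideal 3-bit decision)."""
    return sum(1 for ref in refs if value >= ref)


def convert_rectified(value):
    """Return (coarse, fine, 6-bit code) for a rectified input in [0, 1)."""
    coarse = thermometer_count(value, coarse_refs())
    fine = thermometer_count(value, fine_refs(coarse))
    return coarse, fine, 8 * coarse + fine


# An input between 2/8 and 3/8, as in the example in the text.
sample = Fraction(21, 64) + Fraction(1, 256)
print(convert_rectified(sample))
print([str(ref) for ref in fine_refs(2)])
```

For a coarse result of 2, the fine references start at 17/64, matching the values quoted above.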

### 3.3.1 S/H and Folding Circuits

A specific schematic and timing chart of the S/H and folding circuits are shown in Fig. 3.10. The folding circuit design is based on ref. [70] and realizes rectification with chopper switches instead of power-hungry opamps. While this folding circuit is low power, the output voltage ( $V_{ADC}$ ) is set by the capacitive division between $C_S$ and $C_{in}$, so the gain is limited to $A_v < 1$. Since we designed this circuit with $C_S = 600$ fF and $C_{in} = 150$ fF, the gain is:

$$A_v = \frac{C_S}{C_S + C_{in}} \cong \frac{4}{5} \quad (3.14)$$

$C_{in}$ is the sum of the 130 fF MOM capacitor and the 20 fF comparator input capacitance, and is sized so as to suppress the comparator kickback. In folding circuits that only rectify the signal at the frontend, there are no critical issues such as gain mismatch with the backend. The non-ideal $A_v$ just attenuates the signal level the backend ADC receives, and therefore, $A_v = 0.8$ is acceptable.

### 3.3.2 Live configuring with excess-delay accumulation

Here, we describe the control circuit which configures the flash operation adaptively by detecting EXD. Since EXD is monitored in real time, we will refer to the EXD monitoring and architecture controlling circuit as a *live* configuring circuit from



Figure 3.11: (a) DAFS operation without $\tau_{TH}$. The lowest BF ratio will be 0.5 since flash operation will be inserted as soon as any EXD is detected. (b) DAFS operation with $\tau_{TH}$. The ADC does not switch to flash until $\Sigma$ EXD exceeds $\tau_{TH}$.



Figure 3.12: (a) Power scaling with several values of $\tau_{TH}$. (b) $fs$ versus BF ratio with several values of $\tau_{TH}$.

now on. The simplest live configuring can be implemented by checking whether the edge of the ADC reset phase continues into the next cycle; if there is any EXD, the ADC architecture is switched to flash in the next cycle. However, this live configuring method has a critical weakness in that the flash operation starts even if the detected EXD is very small. Fig. 3.11 (a) illustrates this issue: even with $fs$ only slightly exceeding $fs_{maxBS}$, the flash operation begins in cycle 2. This is undesirable because the lower limit of the BF ratio will be as high as 0.5 and there will be a large power penalty.

In order to lower the power consumption, the number of flash operations should be minimized. To realize this, live configuring with excess-delay accumulation is proposed. The timing chart for this technique is shown in Fig. 3.11 (b). Here, even though a limited amount of EXD is produced in cycle 1, the flash operation does not start until the accumulated excess-delay ( $\Sigma \text{EXD}$ ) exceeds the threshold $\tau_{TH}$. Therefore, if the produced EXD is very small, the ADC operates for a number of cycles until $\Sigma \text{EXD}$ has accumulated beyond $\tau_{TH}$. In this example, the ADC operates for 5 cycles until the live configuring circuit switches the ADC to flash, which results in a BF ratio of 0.16. The ideal value of $\tau_{TH}$ should be chosen to minimize the number of flash operations, which is the case when the EXD subtracted by a single flash operation is smaller than the accumulated EXD ( $\Sigma$ EXD).

$$\tau_{THideal} > t_{cyc} - t_{FL} \quad (3.15)$$

From (3.15), we can see that a large value of $\tau_{TH}$ must be set to achieve good scaling for $fs$ near $fs_{maxBS}$. However, it is challenging to implement long timing thresholds because they can easily cause instability in the system. For practical implementation, we simply used the ideal $\tau_{TH}$ for the $t_{cyc}$ ( $1/fs$ ) at which the BF ratio equals 0.5. Since the EXD produced by binary search and the EXD subtracted by flash are equal at this frequency ( $t_{BS} - t_{CYC} = t_{CYC} - t_{FL}$ ), the ideal $\tau_{TH}$ at this frequency will be:

$$\tau_{TH} = t_{BS} - t_{cyc} = \frac{t_{BS} - t_{FL}}{2} \quad (3.16)$$

By substituting values obtained from simulation into (3.16), a value of $\tau_{TH}=200$ ps was obtained. In Fig. 3.12, the power scaling and $fs$ versus BF ratio are plotted for several values of $\tau_{TH}$. From Fig. 3.12 (b), we can tell that the larger $\tau_{TH}$ becomes, the closer the power scaling comes to the ideal scaling in the range $fs_{maxBS} < fs < fs_{BFR0.5}$, and that a $\tau_{TH}$ of 200 ps is satisfactory.
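The effect of the threshold can be illustrated with a small cycle-level simulation. This is a sketch under stated assumptions rather than a model of the actual circuit. Times are integer picoseconds, the accumulated delay is assumed never to fall below zero, and the values $t_{BS}=1050$ ps, $t_{FL}=600$ ps and $t_{cyc}=1000$ ps are hypothetical.

```python
def simulate_bf_ratio(t_bs, t_fl, t_cyc, tau_th, cycles):
    """Fraction of flash cycles under excess-delay accumulation (times in ps)."""
    acc = 0            # accumulated excess delay (sum of EXD)
    use_flash = False  # the first conversion is a binary search
    flash_cycles = 0
    for _ in range(cycles):
        if use_flash:
            flash_cycles += 1
            # A flash conversion finishes early and removes EXD; the
            # accumulated delay is assumed not to go below zero.
            acc = max(0, acc - (t_cyc - t_fl))
        else:
            # A binary search that overruns the cycle adds EXD.
            acc = max(0, acc + (t_bs - t_cyc))
        # Live configuring: use flash next only once the threshold is exceeded.
        use_flash = acc > tau_th
    return flash_cycles / cycles


T_BS, T_FL, T_CYC = 1050, 600, 1000  # hypothetical delays in ps

for tau in (0, 200, 420):
    ratio = simulate_bf_ratio(T_BS, T_FL, T_CYC, tau, cycles=900)
    print(f"tau_TH = {tau:3d} ps  BF ratio = {ratio:.3f}")
```

With these numbers, no threshold gives a BF ratio of 0.5 and a 200 ps threshold gives five binary searches per flash. A threshold above $t_{cyc} - t_{FL}$, as in (3.15), approaches the value $(t_{BS}-t_{CYC})/(t_{BS}-t_{FL}) = 1/9$ predicted by (3.12).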

Next, let us explain a gate-level implementation of the live configuring circuit with threshold $\tau_{TH}$. As discussed before, long timing thresholds can cause instability in the system, but on the other hand, the power efficiency will worsen if the threshold is too small. Generating $\tau_{TH}$ with delay circuits also causes issues such as PVT drift, and calibration must be done to counter them. In our live configuring circuit, the threshold is implemented by using the rising edge of the reset (FIN) signal for EXD detection, instead of the falling edge. The FIN signal is a pulse which rises at the end of ADC conversion and lasts until completion of the ADC reset (Fig. 3.5); in other words, the live configuring circuit exploits the ADC reset time as the threshold $\tau_{TH}$. Since the ADC reset time is around 150-250 ps across PVT variations, sufficient ADC power scaling can be expected, according to Fig. 3.12.

A schematic and timing chart of the live configuring circuit are shown in Fig.



Figure 3.13: Schematic of the live configuring circuit which uses the pulse length of FIN as  $\tau_{TH}$

3.13. The subranging ADC's coarse and fine sub-ADCs share the same B/F signal given by the live configuring circuit. Moreover, the FIN signal is generated by taking an AND of the FIN signals of both the coarse and fine sub-ADCs. Therefore, the EXD monitoring is based on the sub-ADC with the slower conversion, which is often the fine sub-ADC. By unifying the conversion finish signal, the complexity of the live configuring circuit can be greatly reduced. The counter uses the rising edge of FIN as a trigger and switches the MUX output ( $V_{MUXO}$ ) between $\phi$ 1-4, each of which is a 1/4 decimation of the sampling clock.

### 3.3.3 Sub-ADC designs

Now let us describe the binary search/flash reconfigurable ADC used in the sub-ADC. While the comparator mismatch requirements can be relaxed by having redundant bits in the 3-bit sub-ADCs, this comes at the expense of increased power and area. For example, the calibration procedure can be reduced by 60% by implementing the sub-ADC with a redundancy of 3.5-bit, but the power and area each increase by 60%. We chose to implement the 3-bit sub-ADC without redundancy to minimize power consumption and area, and to compensate the comparator mismatch by



Figure 3.14: Schematic of the comparator with four channel input. The input channel is determined by signal  $EN[0:3]$ . The programmable load capacitance used for offset compensation is shown as well.

foreground calibration, which is described later on. Fig. 3.14 shows the schematic of the comparator, based on ref.[71], which is used in the sub-ADC. Note that the reset transistors are omitted for simplicity. The comparator is clocked, and it has four input transistors for the differential  $V_{IN}$  and reference.

To accommodate the four-channel S/H, this comparator has multiple input transistor pairs, each corresponding to one channel. Fig. 3.14 shows the input transistors for Ch.0 and the activation circuit made of a 3-input AND. As in Fig.10, the Ch.0 comparator input pairs are activated when EN[0] is *High*. Multi-input comparators can also be implemented by switching $V_{IN}$ every cycle, but in such cases, $V_{IN}$ settling becomes a critical issue in GS/s operation and an additional settling phase is required. While the multiple input transistor pair approach is suited for high-speed operation, mismatch generates different offset voltages between the input pairs and corrupts the ADC linearity.

Lastly, the calibration methods are briefly explained. Since the offset between the multiple input transistor pairs must be nulled, the mapping codes that cancel the offset are acquired via foreground calibration for each channel. The mapping codes are digital values which configure the programmable capacitors. To suppress the comparator offset to under LSB/2, a smallest calibration step of 3 mV was chosen, which resulted in a unit capacitor size of 0.75 fF. We chose to design a 4-bit capacitor bank to cover the comparator's $3\sigma$ mismatch of 60 mV. The comparator's foreground calibration can be done simply, since the reference voltages are supplied via the on-chip R-DAC. By shorting the comparator input to the reference voltage, a binary search can be conducted by switching the load capacitances as described in ref.[58]. After the binary search, the mapping code which cancels the offset is obtained, and a code is saved for each input pair of Ch.0-3 since each of them has a different offset. As the ADC operates, these mapping codes are switched every cycle by the MUX to cancel the varying offsets.
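The binary search over the capacitor code can be sketched as follows. The comparator model here is hypothetical. It assumes a unipolar correction of 3 mV per code step and an offset within the correction range; offsets of either sign would additionally need load capacitors on both comparator outputs. The procedure of ref.[58] may differ in detail.

```python
STEP_MV = 3.0  # smallest calibration step quoted in the text
BITS = 4       # 4-bit capacitor bank


def make_comparator(offset_mv):
    """Hypothetical comparator with its inputs shorted to the reference.

    The returned function reports True while the residual offset, after the
    correction selected by `code`, is still non-negative.
    """
    def decide(code):
        return offset_mv - STEP_MV * code >= 0.0
    return decide


def calibrate(decide, bits=BITS):
    """Binary search for the largest code that does not over-correct."""
    code = 0
    for bit in reversed(range(bits)):
        trial = code | (1 << bit)
        if decide(trial):
            code = trial
    return code


offset = 20.0  # hypothetical offset of one input pair, in mV
mapping_code = calibrate(make_comparator(offset))
residual = offset - STEP_MV * mapping_code
print(mapping_code, residual)
```

In a four-channel comparator, one such search would be run per input pair, and the resulting mapping codes stored for the MUX to select every cycle.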



Figure 3.15: Chip micrograph.

## 3.4 Results and Discussion

### 3.4.1 Measured Results

The subranging ADC with DAFS was implemented in a 65 nm CMOS process. Fig. 3.15 shows the chip micrograph; the ADC occupies an area of $250 \times 350\ \mu\text{m}^2$. A foreground calibration was done to cancel the comparator offsets. On the other hand, no tuning was applied to the live configuring circuits since they can tolerate PVT variations. Fig. 3.16 plots the DNL/INL after calibration, measured at $f_s=1024 \text{ MS/s}$ and $f_{in}=10 \text{ MHz}$. In addition, we did not observe any difference in the linearity when the sub-ADC architecture was switched between flash and binary search.

The measured power scaling characteristic of the subranging ADC is shown in Fig. 3.17. The sub-ADCs are programmable to operate either as DAFS or flash only, and the power scaling for both operation modes are shown. While the power



Figure 3.16: Measured DNL/INL after foreground comparator offset calibration



Figure 3.17: Measured power scaling of the subranging ADC, with and without DAFS. The BF ratio was measured and plotted as well.



Figure 3.18: Measured 4096-point FFT spectrum at the written condition.



Figure 3.19: (a) Measured $f_s$ versus SNDR. (b) Measured $f_{in}$ versus SNDR.

scaling is linear when the sub-ADCs operate only with flash, superlinear power scaling was observed with DAFS during high-speed operation from 820 MS/s to 1220 MS/s. The BF ratio was measured by acquiring the architecture configuring signal (B/F) and is plotted as well. Beyond 820 MS/s ( $f_{smaxBS}$ ), the live configuring circuit detected EXD and began to insert flash operations. As $fs$ increased, more flash operations were inserted, which made the power scaling superlinear. At 1220 MS/s ( $f_{smaxFL}$ ), the power consumption nearly reached that of flash-only operation and the BF ratio reached 0.98. Similar power scaling characteristics were confirmed in all ten measured samples, which shows the robustness of the live configuring. A peak FoM of 85 fJ/conv. was obtained at $fs_{maxBS}$: 820 MS/s. DAFS achieves a 30% power reduction compared with the power consumed when the sub-ADCs operate only with flash. However, this reduction is smaller than what we expected in Section 3.2. This result will be analyzed later on.



Figure 3.20: Power breakdown of the ADC at 820 MS/s with sub-ADC operated only with binary search and flash respectively.

The 4096-point FFT spectrum measured at 1220 MS/s is plotted in Fig. 3.18, and $f_s$ and $f_{in}$ versus SNDR are plotted in Fig. 3.19. The nonlinearity of the ADC was mostly due to comparator offsets, while the gain mismatch and timing skew did not impact the ADC resolution. In Fig. 3.19 (a), there was a brick wall at 1250 MS/s at which the SNDR suddenly deteriorated. This happens because, once $f_s$ exceeded $f_{s,maxFL}$, the coarse sub-ADC started to make conversion errors. In such cases, fine conversions become meaningless and the resolution greatly degrades.

### 3.4.2 Discussions

Fig. 3.20 shows the power breakdown of the ADC acquired from the post-layout simulation at 820 MS/s. Two cases are shown: one in which the sub-ADC operates only as a binary search and one in which it operates only as flash. If we focus on the sub-ADC power consumption, a 50% power reduction is achieved by reconfiguring the ADC architecture, which is close to the predictions made in Section 3.2. However, since the power of the digital and reference circuits does not change with DAFS, these become the bottleneck when scaling the entire ADC power. If we extend the sub-ADC resolution beyond 4 bits, the power consumption of the sub-ADC will be dominant since $P_{FL}$ increases exponentially. Since the power of the other circuits hardly changes, the DAFS power scaling will be emphasized. However, aiming for higher ADC resolution results in stricter timing-skew and gain mismatch requirements, and more effort must be devoted to the sampling frontend design.

Figure 3.21: BF ratio versus PVT variations. Interestingly, DAFS can operate to cancel out PVT variation effects, relaxing the speed margins of the high-speed ADC.

In Fig. 3.21, each PVT parameter was varied in post-layout simulation, and the resulting change in BF ratio was observed at $f_s=1$ GS/s. As expected, the BF ratio tracks the transistor speed shift due to the PVT variation. Since the speed of comparator-based ADCs is sensitive to PVT, architecture reconfiguration significantly relaxes the design margin. We must keep in mind that the ADC power efficiency worsens as more flash operations are inserted. For example, compared with typical conditions, the sub-ADC power consumption is 20% higher under SS conditions and 40% lower under FF conditions in Fig. 3.21 (c).

Table 3.1: Comparison with state-of-the-art high-speed ADCs.

|                 | Ohhata<br>A-SSCC2012 | Chung<br>Trans.<br>VLSI2014 | Kull<br>ISSCC2013 | Verbruggen<br>ISSCC2010 | This work  | This work  | This work  |
|-----------------|----------------------|-----------------------------|-------------------|-------------------------|------------|------------|------------|
| Technology [nm] | 65                   | 55                          | 32 (SOI)          | 40                      | 65         | 65         | 65         |
| Architecture    | Subranging           | Subranging                  | SAR               | Pipeline                | Subranging | Subranging | Subranging |
| Resolution      | 8                    | 8                           | 8                 | 6                       | 7          | 7          | 7          |
| fs [MS/s]       | 1000                 | 1000                        | 1200              | 2200                    | 820        | 1000       | 1228       |
| SNDR [dB]       | 42.4                 | 40                          | 39.3              | 29.6                    | 37.4       | 37.2       | 36.2       |
| Power [mW]      | 17.5                 | 16                          | 3.0               | 2.53                    | 4.26       | 5.91       | 8.11       |
| Calibration     | No                   | Yes                         | No                | Yes                     | Yes        | Yes        | Yes        |
| FOM [fJ/conv.]  | 162                  | 195                         | 34                | 40                      | 85         | 99         | 125        |

Lastly, Table 3.1 compares our ADC with other state-of-the-art low-resolution GS/s ADCs. Compared with conventional subranging ADCs, ours achieved a two times better power efficiency. However, the SAR and pipeline ADCs of ref. [70] and [56] have better power efficiencies. It is worth noting that the power consumption and speed of comparator-based ADCs scale significantly with CMOS device scaling. When designed with more advanced CMOS devices, this ADC is expected to achieve a performance comparable to the references. Moreover, this ADC is the first to have superlinear power scaling with GS/s operation.
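The FoM row of Table 3.1 can be cross-checked from the SNDR, power and $fs$ rows. The sketch below assumes the commonly used Walden figure of merit, $\text{FoM} = P / (2^{ENOB} \cdot fs)$ with $ENOB = (\text{SNDR} - 1.76)/6.02$; the text does not state the exact definition, so this choice and the function names are assumptions.

```python
def enob(sndr_db):
    """Effective number of bits from SNDR in dB."""
    return (sndr_db - 1.76) / 6.02


def walden_fom_fj(power_mw, sndr_db, fs_msps):
    """Energy per conversion step in fJ (assumed Walden definition)."""
    power_w = power_mw * 1e-3
    fs_hz = fs_msps * 1e6
    return power_w / (2.0 ** enob(sndr_db) * fs_hz) * 1e15


# (power [mW], SNDR [dB], fs [MS/s]) for the three "This work" columns.
this_work = [(4.26, 37.4, 820.0), (5.91, 37.2, 1000.0), (8.11, 36.2, 1228.0)]

for power, sndr, fs in this_work:
    print(f"fs = {fs:6.0f} MS/s  FoM = {walden_fom_fj(power, sndr, fs):6.1f} fJ/conv.")
```

Under this definition the computed values land within about 1 fJ of the tabulated 85, 99 and 125 fJ/conv.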

## 3.5 Conclusions

A subranging ADC with Dynamic Architecture and Frequency Scaling (DAFS) was presented. While operating at over 1 GS/s, the ADC accomplishes superlinear power scaling by adaptively reconfiguring the sub-ADC architecture between binary search and flash. The architecture reconfiguration is done by monitoring the excess-delay of the conversion, and flash operation is used to cancel the excess-delay. DAFS not only improves the power scaling significantly, but also compensates for the transistor speed shift due to PVT variation, which can be used to relax the design margin in high-speed ADCs.

A 7-bit subranging ADC was designed in 65 nm CMOS, in which DAFS was applied to the sub-ADCs. The DAFS operation was confirmed in the range of 820-1220 MS/s, achieving superlinear power scaling. When compared to the ADC performance with DAFS disabled, a maximum of 30% power reduction was achieved. This subranging ADC achieved a peak FoM of 85 fJ/conv. at 820 MS/s, which is nearly a twofold improvement over conventional subranging ADCs.

# Chapter 4

## Threshold Configuring Comparator

## 4.1 Introduction

In this chapter, we discuss how to improve the comparator circuit utilized in the successive approximation (SA) circuits in chapters 2 and 3.

In chapter 2, we proposed the digital amplifier (DA) technique to realize a high-accuracy amplifier in scaled CMOS technologies. However, the DA's amplification is based on SA and requires $n$ SA cycles to complete the amplification (given an $n$-bit DA), which can limit the total conversion speed. If we can develop techniques that speed up the SA operation, the entire Pipelined-SAR ADC can operate faster as well. Faster conversion is beneficial, given the wireless trend toward wider communication bandwidths. For example, 2-bit/step conversion is a popular approach to speeding up SAR ADC operation. For an 8-bit SAR ADC, the 8 SA cycles otherwise required to complete the conversion can be cut down to 4, ideally improving the conversion speed $2 \times$. However, 2-bit/step circuitry conventionally tripled the SAR ADC's analog circuitry: 3 sets of C-DACs and comparators were required to conduct the 2-bit quantization, introducing a large overhead.

In this chapter, we propose a power and area efficient 2-bit/step method with a novel wide-range threshold configurable comparator (TCC) design [58] [72]. We propose a 2-bit/step SAR ADC using TCCs which operates with multiple comparators but with a single C-DAC; the overhead is significantly smaller than in conventional 2-bit/step SAR ADCs. The comparator threshold is configured dynamically and widely with variable current sources (VCS). The VCS is biased by an internally generated $V_{CM}$ voltage, which makes the ADC free from power supply voltage variation. A simple foreground calibration is described, which requires only a $1/2 V_{DD}$ input throughout the calibration process; this voltage is typically supplied by the system to generate the input common mode voltage.

For extremely low-power operation, we successively activate the comparators in this design. Even though the power and area overhead is very small, an increase in speed of over 50% can be achieved at a power supply of 0.3-0.6 V. The measured power efficiency of the prototype 2-bit/step SAR ADC in 40nm CMOS is highly comparable with low power state-of-the-art works but with faster operating speeds.

By using the proposed TCC, we can re-implement the DA proposed in chapter 2 as a 2-bit/step based DA to achieve faster conversion speed with minimum area overhead. Moreover, the binary search ADC in chapter 3 requires R-DAC generated reference voltages for comparison. The time constant of such an R-DAC must be low to achieve fast reference voltage switching, which consumes a non-negligible amount of static current. By utilizing wide-range TCCs, such current-consuming R-DACs can be eliminated from the design, improving the ADC power efficiency.

Section 4.2 compares the conventional and proposed 2-bit/step SAR ADC structure. Section 4.3 describes the threshold configuring comparator designs. In section 4.4, the measurement results are shown.

Table 4.1: Comparison with conventional 2-bit/step ADC.

|                       | Cao 2009   | Wei 2011        | TCC 2bit/step Fig.4.1                | SAC 2bit/step Fig.4.2                |
|-----------------------|------------|-----------------|--------------------------------------|--------------------------------------|
| Reference Generation  | C-DAC (x3) | Resistor Ladder | Threshold Configuring Comparator(x3) | Threshold Configuring Comparator(x2) |
| Speed                 | Fast       | Fast            | Fast                                 | Fast (@Low Voltage)                  |
| Area Overhead         | Large      | Medium          | Medium                               | Small                                |
| Power Overhead        | Large      | Medium          | Medium                               | Small                                |
| Low voltage operation | O          | X               |                                      | O                                    |
| Static Power          | No         | Yes             |                                      | No                                   |
| Environment variation | O          | O               | Depends on TCC design                |                                      |

## 4.2 2-bit/Step SAR ADC Architecture

### 4.2.1 Conventional Designs

A 2-bit/step method uses a 2-bit quantizer inside the successive approximation (SA) loop to speed up the conversion. Because only  $n/2$  cycles are required for the  $n$  bit conversion, the SAR ADC speed can be ideally doubled. Since the SAR logic requires little modification to realize the 2-bit/step operation, there is small overhead in the digital circuitry. However, providing a 2-bit quantizer requires many additional analog components and the ADC experiences a large power and area overhead. For example, Flash ADC is a preferred choice for the 2-bit quantizer. It can acquire the comparison results in one clock cycle but the reference of the Flash ADC must be configured every SA cycle.

For example, at the 1st SA cycle, the references should be $1/4$, $2/4$, $3/4 V_{ref}$, respectively. Before proceeding to the next cycle, the C-DAC switches its capacitors to reflect the comparison results. Therefore, the references for the 2nd SA cycle must be $7/16$, $8/16$, $9/16 V_{ref}$, respectively. Conventional 2-bit/step SAR ADC studies with different reference generation schemes are described in Table 4.1. Previous studies require additional reference generation circuitry (R-DACs and C-DACs)



Figure 4.1: Block diagram of a 2-bit/step ADC provided with TCC.

and consume an additional power overhead. We try to minimize the overheads of the 2-bit/step operation by utilizing threshold configuring comparators.

### 4.2.2 2-bit/step with threshold configuring comparators

Our key idea is: instead of using multiple references, we realize the 2-bit/step operation by configuring comparator offsets (or threshold  $V_{offset}$ ). A simple block diagram of our proposed 2-bit/step SAR ADC implemented with a threshold configuring comparator (TCC) is shown in Fig. 4.1.

CP2 is an ordinary comparator, which simply compares the input signal $V_{in}$ with VDAC. Suppose that $V_{offset}$ values of $1/4 V_{ref}$ and $-1/4 V_{ref}$ are applied to comparators CP1 and CP3, respectively. The comparator thresholds ( $V_{THcomp}$ ) then become $3/4 V_{ref}$ and $1/4 V_{ref}$, respectively, and a 2-bit quantizer is obtained. In this method, at a certain SA cycle $N$, the $V_{offset}$ of CP1 and CP3 should be:

$$V_{offset} = \pm \frac{1}{2^{2N}} V_{ref} \quad (4.1)$$

When foreground calibration is done and $V_{offset}$ is set properly, our proposed method requires only one C-DAC and one sampling switch. Therefore, power can be significantly reduced when compared with [73], and the ADC does not consume DC power.
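The offsets of (4.1) can be tabulated per SA cycle and checked against the reference values quoted in Section 4.2.1. This is a small sketch in units of $V_{ref}$. It assumes the C-DAC recentres the residue so that CP2 always compares against $V_{ref}/2$, and it takes the offset as $V_{ref}/2^{2N}$; the function names are hypothetical.

```python
from fractions import Fraction


def tcc_offset(cycle):
    """Magnitude of the CP1/CP3 offset at SA cycle N, in units of V_ref."""
    return Fraction(1, 2 ** (2 * cycle))


def comparator_thresholds(cycle, v_dac=Fraction(1, 2)):
    """Thresholds of CP3, CP2 and CP1 (low to high), in units of V_ref.

    v_dac is the level seen by CP2; it is assumed to stay at V_ref/2 because
    the C-DAC recentres the residue after every cycle.
    """
    offset = tcc_offset(cycle)
    return [v_dac - offset, v_dac, v_dac + offset]


for n in (1, 2, 3, 4):
    print(n, [str(th) for th in comparator_thresholds(n)])
```

Cycle 1 gives 1/4, 2/4 and 3/4, and cycle 2 gives 7/16, 8/16 and 9/16, in agreement with the references listed for the conventional 2-bit/step ADC.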

However, several power and area overheads remain in this TCC based 2-bit/step SAR ADC. First, because the comparators must configure their threshold each cycle, the $V_{offset}$ control circuit consumes dynamic power. Second and most critically, there is an overhead in comparator activation. While an ordinary SAR requires only 2 comparator activations for a 2-bit conversion, such a 2-bit flash operation requires 3 comparators to be activated. As a result, the comparator power increases by 50%. The issue is made more critical because a TCC consumes more power than a normal comparator.

### 4.2.3 2-bit/step with Successively Activated Comparators

For further power reduction, we propose a 2-bit/step ADC with successively activated comparators (SAC); the block diagram and operation concept are shown in Fig. 4.2. After the external sampling clock ( $CLK_{ext}$ ) falls, SA cycle 1 starts by raising $\phi_1$, and CP1 decides the first bit ( $OUT_{CP1}$ ). After the first bit decision, $V_{THcomp}$ of CP2 ( $V_{THCP2}$ ) is set according to the result of the first bit. In this case $OUT_{CP1}$ is 1, thus $V_{THCP2}$ is set to $12/16 V_{ref}$ and the second bit ( $OUT_{CP2}$ ) is decided. In the proposed ADC the 2-bit quantizer operates like a binary-search ADC [69], where the second comparator is activated reflecting the preceding comparator's



Figure 4.2: Proposed 2-bit/step SAR ADC with successively activated comparators.  
(a) Block diagram. (b) Operation concept.



Figure 4.3: Timing chart of the proposed ADC.

results. Because the second comparator threshold is configured dynamically every cycle, only two comparators are required instead of three. The results of SA cycle 1 are stored in MSB and 2nd MSB registers respectively.
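The SAC operation can be summarized in a short behavioural model. It is a sketch assuming ideal comparators and an input normalized to $V_{ref}$. Thresholds are tracked as absolute levels for readability, whereas in the circuit the C-DAC and the configured comparator threshold realise the same comparisons. The function name is hypothetical.

```python
from fractions import Fraction


def sac_convert(vin, bits=8):
    """2-bit/step conversion with two successively activated comparators.

    vin is the input in units of V_ref, with 0 <= vin < 1.
    Returns (output code, number of comparator activations).
    """
    low = Fraction(0)    # lower edge of the remaining search range
    width = Fraction(1)  # width of the remaining search range
    code = 0
    activations = 0
    for _ in range(bits // 2):
        mid = low + width / 2
        out_cp1 = vin >= mid  # CP1 decides the first bit of this step
        activations += 1
        # The CP2 threshold follows the CP1 result: a quarter of the range
        # above or below the midpoint.
        th_cp2 = mid + width / 4 if out_cp1 else mid - width / 4
        out_cp2 = vin >= th_cp2  # CP2 decides the second bit
        activations += 1
        two_bits = (int(out_cp1) << 1) | int(out_cp2)
        code = (code << 2) | two_bits
        low = low + two_bits * width / 4
        width = width / 4
    return code, activations


print(sac_convert(Fraction(200, 256) + Fraction(1, 1024)))
```

In the first cycle with $OUT_{CP1}=1$, the model sets the CP2 threshold to $3/4 = 12/16$, as in the example above, and an 8-bit conversion uses 8 comparator activations over 4 SA cycles.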

Fig. 4.3 shows the timing chart of the proposed ADC for SA cycles 1 and 2. Here, CP1 is activated by $\phi_{CP1}$ when the sampling signal ( $CLK_{ext}$ ) falls, and then $\phi_{CP2}$ rises successively and the 2-bit output is acquired. After the registers latch the comparator outputs, the ADC cycle signal ( $\phi_{cyc.}$ ) changes and the ADC prepares for SA cycle 2. However, before the next comparison starts, a VDAC settling delay ( $t_{DAC}$ ) is inserted for the reference settling. $\phi_{cyc.[0:3]}$ is used to control $V_{THCP2}$, since it must be configured every cycle. After sufficient C-DAC settling, $\phi_{CP1}$ rises and SA cycle 2 begins. By repeating these procedures, this ADC achieves 8-bit conversion with 4 SA cycles.

A generic SAR ADC cycle time is determined by three delays: the comparator delay ( $t_{comp}$ ), the SAR logic delay ( $t_{logic}$ ), and the DAC settling time ( $t_{DAC}$ ). Therefore, the total conversion time of an 8-bit 1-bit/step SAR ADC is $8(t_{comp}+t_{logic}+t_{DAC})$ . On the other hand, the conversion time of the conventional 2-bit/step SAR ADC is only $4(t_{comp}+t_{logic}+t_{DAC})$ , since two bits are processed simultaneously. Next, our proposed



Figure 4.4: Power supply versus comparator delay, DAC settling and speed improvement respectively.

circuit is considered. The timing chart in Fig. 4.3 implies that the $t_{logic}$ and $t_{DAC}$ contributions are halved, but because the comparators are activated successively, there is no improvement in $t_{comp}$ . Therefore, the conversion time for 8-bit SAC operation is:

$$t_{conversion} = 8\,t_{comp} + 4(t_{logic} + t_{DAC}) \quad (4.2)$$

We can draw a conclusion that the improvement in SAC speed is larger when  $t_{comp}$  is shorter than  $t_{logic}+t_{DAC}$ . In a typical mid-resolution SAR ADC operated with standard supply voltage, all the delays are about the same length. However, in low-voltage SAR ADCs, it is known that  $t_{logic}$  and  $t_{DAC}$  may be much longer than  $t_{comp}$  [74]: the SAC architecture will benefit in such low-voltage settings. For standard voltage settings, one may choose the ordinary 2-bit/step architecture and simply utilize three TCCs to obtain sufficient speed improvements.
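To make the comparison concrete, the three conversion-time expressions can be evaluated side by side. The following is a minimal sketch; the delay values are normalized, hypothetical numbers chosen only to contrast a "standard supply" case (all delays equal) with a "low-voltage" case (short $t_{comp}$), and the function names are illustrative.

```python
def conversion_times(t_comp, t_logic, t_dac, n_bits=8):
    """Return (1-bit/step, conventional 2-bit/step, SAC) conversion times."""
    cycle = t_comp + t_logic + t_dac
    one_bit = n_bits * cycle
    two_bit_flash = (n_bits // 2) * cycle
    # Eq. (4.2): comparators fire one after another, so t_comp is not halved.
    sac = n_bits * t_comp + (n_bits // 2) * (t_logic + t_dac)
    return one_bit, two_bit_flash, sac


def sac_speedup(t_comp, t_logic, t_dac, n_bits=8):
    """Relative conversion-rate gain of SAC over 1-bit/step operation."""
    one_bit, _, sac = conversion_times(t_comp, t_logic, t_dac, n_bits)
    return one_bit / sac - 1.0


if __name__ == "__main__":
    # All delays equal (assumed standard-supply case).
    print(conversion_times(1.0, 1.0, 1.0), sac_speedup(1.0, 1.0, 1.0))
    # Short comparator delay (assumed low-voltage case).
    print(conversion_times(0.2, 1.0, 1.0), sac_speedup(0.2, 1.0, 1.0))
```

With equal delays the SAC conversion rate is 1.5 times that of 1-bit/step operation, and the gain approaches the conventional 2-bit/step factor of two as $t_{comp}$ becomes small relative to $t_{logic}+t_{DAC}$.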

We simulated $t_{DAC}$ and $t_{comp}$ versus the power supply voltage; the results are plotted in Fig. 4.4 together with the speed improvement obtained by SAC. At voltages lower than 0.6 V, the load capacitance determines the delay time and $t_{comp}$ is considerably shorter. Under such conditions, the proposed SAC significantly speeds up the ADC. However, as the power supply rises, the drain current exponentially



Figure 4.5: Threshold configuring comparator design.

increases and the DAC buffer quickly charges the large load capacitance. When the supply voltage exceeds 0.8 V, the overdrive voltage becomes the dominating factor and the ratio between $t_{comp}$ and $t_{DAC}$ flips. For such designs, a 2-bit/step ADC should use three TCCs to maximize the speed improvement.

## 4.3 Wide range threshold configuring comparator

### 4.3.1 TCC Architecture

To compensate for comparator offsets caused by process mismatch, TCCs have been widely used. A common TCC is realized with asymmetric capacitive loads [75] [76]; current sources are also frequently used [63] [77]. Fig. 4.6 shows the comparator

schematic with threshold configuring used in CP2. CP1 does not have the threshold configuring element but the basic architecture is the same.

Our TCC architecture is based on the Miyahara two-stage dynamic comparator [63]. To start with, the basic comparator operation is described. The comparison begins when the comparator activation signal ( $\phi_2$ ) becomes HIGH. Nodes a and b (the drain nodes of the input transistors MN1 and MN2) discharge at a speed that depends on the gate voltages of the input transistors. When either node has dropped by $V_{latch}$ , the second-stage latch operates and the output is decided.

Next, we will review several conventional threshold configuring methods and compare with our proposed method.

Suppose a cycle in which $V_{THcomp}$ is to be $V_x$ . Under this condition, the TCC should be balanced when $V_{inP}, V_{inN} = V_{CM} \pm V_x$ . The drain currents of the input transistors, $I_{dP}$ and $I_{dN}$ , can be calculated for this condition, and the time until the result is latched ( $t_{latch}$ ) can be estimated as well. If the input differential pair simply draws out the charge $Q = C \times V_{latch}$ stored at nodes a and b,

$$t_{latch} = \frac{C \, V_{latch}}{I_d} \quad (4.3)$$

Since $t_{latch}$ should be the same for both input transistors,

$$\frac{I_{dP}}{I_{dN}} = \frac{C_N}{C_P} \quad (4.4)$$

can be derived, where $C_N$ and $C_P$ are the load capacitances of nodes a and b. Equation (4.4) is very important: it implies that threshold configuring can be achieved by 1) providing a gap (an offset) between the load capacitances $C_N$ and $C_P$ , 2) providing an offset current to $I_{dP}$ or $I_{dN}$ , or 3) providing a $g_m$ offset between the input transistors.

However, very wide threshold shifts to $3/4 V_{Ref}$ and $1/4 V_{Ref}$ are required to realize 2-bit/step operation at cycle 1 (Fig. 4.2(b)), and this is challenging with offset load capacitance. Simulation results at 0.5 V show that realizing such a $V_{THcomp}$ requires an impractical capacitance of $\Delta C = 7.7 \text{ pF}$ . As a result, the comparator power would increase $5\times$ and, in addition, the comparison time would be significantly prolonged. This is because, when realizing a large threshold shift at low supply voltages, the drain currents of the two input transistors can differ by as much as $100\times$ when one enters the sub-threshold region deeply and the other does not.

The same problem appears when implementing built-in comparator offset methods [78], which create an offset by asymmetrically tuning the tail currents (or, equivalently, by changing the $g_m$ of the input transistors). Tail current configuring requires sizing that is proportional to $I_{dP}/I_{dN}$ , and at low voltages, transistor arrays with W/L ratios exceeding $100\times$ will be required. This greatly increases the area of the comparator.

In our proposed TCC, $V_{THcomp}$ is widely configured by a variable current source (VCS). For example, when $V_{THcomp}$ is set to $12/16 V_{Ref}$ , the VCS connected to the drain of the $V_{inN}$ input transistor (node a) is activated. An offset current ( $I_{VCS}$ ) is added to $I_{dN}$ in (4.4) so that $I_{dP}=I_{dN}+I_{VCS}$ . Conversely, to set $V_{THcomp}$ to $4/16 V_{Ref}$ , the VCS connected to the drain of the $V_{inP}$ input transistor (node b) is activated. Note that the offset current configures $V_{THcomp}$ while the capacitor loads are unchanged; therefore, $t_{comp}$ is not prolonged in this design. Using current sources increases the overall current, so the power consumption might be expected to increase as well. This is mitigated by operating the comparator dynamically: the transistors MP1 and MP2 are kept off during operation, so the overall charge drawn out in a single comparison does not change and the increase in comparator power is small. However, adding the VCS increases the parasitic capacitance at nodes a and b, which raises the comparator power consumption by 15%.

### 4.3.2 TCC by variable current source

Designing a bias circuit for the VCS under various voltage conditions, including extremely low voltages, is very challenging. A bias circuit such as a band-gap reference is robust against temperature variation but cannot be used at low voltages.



Figure 4.6: Schematic of the threshold configuring comparator (CP2 in Fig. 4.2).



Figure 4.7: (a) Schematic of 5-bit  $V_{CM}$  biased variable current source. (b) Operation of capacitive dividing.

Therefore, a simple biasing technique is required. A simple way is to use $V_{DD}$ as the bias voltage for the current sources. However, such a current source is critically weak against power supply noise.

To improve the immunity to power supply noise, we propose the $V_{CM}$ biased VCS. In the implementation, $V_{CM}$ biased NMOS transistors with binary-weighted W/L ratios are used, as in Fig. 4.6. Two types of transistor threshold, standard $V_{th}$ and high $V_{th}$ devices, were used for the 'coarse' and 'fine' VCS, configurable with 4 and 5 bits respectively. The use of different $V_{th}$ devices greatly relaxes the transistor sizing, since the W/L ratio does not have to increase exponentially. Although the transistor $V_{th}$ and sizing differ between the coarse and fine VCS, the same operation and design methodology discussed below apply to both.

Fig. 4.7(a) shows the detailed schematic of the 5-bit $V_{CM}$ bias generation circuit and the control circuit of the VCS. Here, $\text{Ctrl.1[0:4]}$ to $\text{Ctrl.4[0:4]}$ are values obtained from calibration, which determine the VCS output for cycles 1 to 4 respectively. The Ctrl. signals are selected by the $\phi_{cyc.}$ signals (Fig. 4.3), each of which rises at a specific cycle. The current source operates when the input of the bias generator is High. By capacitive dividing, a gate voltage of $V_{CM}$ is generated as shown in Fig. 4.7(b). The capacitance of $C_{div}$ is designed to equal the sum of the gate capacitance of the biased transistor $M_{n1}$ and the parasitic capacitance ( $C_p$ ). In this design, $C_{div}$ was constructed from a MOM capacitor and its capacitance was designed to match the estimated $C_p$ . Therefore, when the top plate of $C_{div}$ is connected to $V_{DD}$ , capacitive dividing provides a gate voltage $V_{Gen}=V_{CM}$ . To eliminate hysteresis effects, the gate voltage must be reset after each comparison. During the DAC settling phase, $\phi_{cyc.}$ is turned Low, which activates transistor Mreset in all VCS. Through Mreset, the gate voltage of Mn1 is reset to ground.

Since the $V_{CM}$ voltage is typically supplied on-chip, one could simply use this voltage directly as the bias reference. However, CMOS switches that pass $V_{CM}$ at high speed were difficult to design at low supply voltages, and fast transitions were not available. Since our target is a fast 2-bit/step SAR ADC, such speed overheads were not acceptable. Therefore, we chose to internally generate $V_{CM}$ -like voltages at the comparator level. The capacitors are charged from $V_{DD}$ , so the switching is very fast and does not degrade the ADC conversion speed.

### 4.3.3 Variable current source design

The specific design methods of the VCS are explained here. The key points when designing the VCS are deciding the fundamental W and L sizing, implementing $C_{div}$ , and managing the increase in comparator noise. The W and L sizing is heavily dependent on process mismatch characteristics and should be decided based on Monte Carlo results. In our design, the LSB current source transistor has a sizing of $W = 600 \text{ nm}$ , $L = 150 \text{ nm}$ with Finger = 1. For each higher bit, the finger count is doubled. The LSB current source is sized so that it shifts $V_{THcomp}$ by 0.25 LSB (or 1/1024 $V_{Ref}$ ), and its mismatch was confirmed to be small enough by Monte Carlo simulation. Considering process variation, this design margin is enough to generate the $V_{THcomp}$ required for SAC operation with an accuracy of 0.5 LSB.

After the fundamental values of W and L are decided, $C_{div}$ is calculated and implemented. Suppose that a $V_{CM}$ bias circuit is designed for a transistor with a sizing of $W = 600 \text{ nm}$ , $L = 150 \text{ nm}$ , Finger = 8. A large L was used to realize higher mismatch tolerance. The gate capacitance can be predicted from $C_{ox}$ , which is inversely proportional to $t_{ox}$ . For example, if $t_{ox} = 2.5 \text{ nm}$ , $C_{ox}$ will be $13.8 \text{ fF}/\mu\text{m}^2$ .



Figure 4.8: Area efficient 1 fF fringed capacitor used to provide  $C_{div}$ .

Therefore, $C_p$ can be roughly calculated as $C_p = W L C_{ox} \approx 10 \text{ fF}$ , where W is the total gate width over all eight fingers. $C_{div}$ is created from a multi-layer fringe capacitor, which has high area efficiency. The capacitor occupies M2-M6, and Fig. 4.8 shows the 1 fF capacitor that is used as the unit capacitor. Multi-layer fringe capacitors are challenging to use in circuits which require precise matching, such as C-DACs, but are efficient for circuits with relaxed matching requirements. When designing $C_{div}$ , one can run RC extraction to confirm that the calculation is correct.
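The hand calculation of $C_p$ can be reproduced with a few lines of code. This is a rough sketch that assumes a parallel-plate gate model with SiO2 relative permittivity 3.9 and an oxide thickness of 2.5 nm, which is the thickness that yields the quoted 13.8 fF/µm²; overlap and wiring parasitics are ignored, and the function names are illustrative.

```python
EPS0 = 8.854e-12      # vacuum permittivity [F/m]
EPS_R_SIO2 = 3.9      # assumed relative permittivity of the gate oxide


def cox_ff_per_um2(tox_nm):
    """Gate-oxide capacitance per area in fF/um^2 (parallel-plate model)."""
    cox_f_per_m2 = EPS0 * EPS_R_SIO2 / (tox_nm * 1e-9)
    return cox_f_per_m2 * 1e3  # 1 F/m^2 = 1e3 fF/um^2


def gate_cap_ff(w_nm, l_nm, fingers, tox_nm):
    """Approximate gate capacitance W * L * Cox of a multi-finger transistor."""
    area_um2 = (w_nm * 1e-3) * fingers * (l_nm * 1e-3)
    return area_um2 * cox_ff_per_um2(tox_nm)


if __name__ == "__main__":
    cp = gate_cap_ff(w_nm=600, l_nm=150, fingers=8, tox_nm=2.5)
    units = round(cp / 1.0)  # number of 1 fF unit capacitors for C_div
    print(f"Cox = {cox_ff_per_um2(2.5):.1f} fF/um^2, Cp = {cp:.2f} fF, units = {units}")
```

Under these assumptions the estimate lands at roughly 10 fF, so $C_{div}$ would be assembled from about ten 1 fF unit capacitors before extraction-based tuning.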

A post-layout simulation run under the above conditions showed that a 257 mV bias voltage is generated. However, $C_p$ relies heavily on W and L variation and on the operating region of the transistor as well. As a result, the capacitance can deviate from the simulation results by over 10%, which makes highly accurate extraction meaningless. In this design, the VCS does not require an accurate $V_{CM}$ voltage to be generated; even if it varies, the ADC still has power supply noise immunity. This issue is discussed in detail later.

We also simulated the noise performance of the TCC. Since the VCS injects additional noise into the comparator (and is not signal driven), the noise performance degrades compared to a normal comparator. While the CP1 comparator without a VCS had an input-referred noise of 0.15 LSB, the TCC had 0.25 LSB, an increase of about 66%. This is the worst-case condition, with all of the coarse current sources turned on. Still, the noise performance satisfies the ADC requirements in our design. Generally, the input transistor $g_m$ of a TCC faces tougher requirements than that of an ordinary comparator, in order to suppress the



Figure 4.9: Power supply variation effect of (a)  $V_{DD}$  biased VCS, (b)  $V_{CM}$  biased VCS

noise generated by the VCS. This will not happen in capacitor load based TCCs [75], because bandwidth limitations of the capacitor load will actually improve the comparator noise performance.

### 4.3.4 Power Supply Noise Immunity

First, the power supply variation effect of a simple $V_{DD}$ biased current source is studied, as shown in Fig. 4.9(a). We suppose that the ADC input common mode voltage is generated by dividing the ADC power supply voltage ( $V_{DD}$ ) by half. Therefore, when there is a power supply voltage variation of $\Delta V_{DD}$ , the ADC input ( $V_{in}$ ) varies by $\Delta V_{DD}/2$ . As a result, the gate-source voltage variation of the comparator input transistor is $\Delta V_{gsin} = \Delta V_{DD}/2$ , but the variation of the VCS transistor is $\Delta V_{gsVCS} = \Delta V_{DD}$ . To summarize, with $V_{DD}$ biasing, the power supply variation affects the input transistors and the VCS transistors differently, which is a problem: the gate-source voltage difference between the VCS and input transistors becomes an exponential difference in the current domain.

For $V_{DD}$ biased current sources, even a 10% power supply drift significantly shifts the TCC threshold, and the ADC effective resolution drops to only around 4 bits. Therefore, we must design the VCS so that no gate-source voltage difference arises between the VCS and input transistors when the supply voltage changes.

The power supply variation effect with VCS biased by  $V_{CM}$  is shown in Fig. 4.9(b). When  $V_{CM}$  bias generating circuit of Fig. 4.7(b) is used,  $V_{CM}$ -like bias voltage  $V_{Gen}$  is generated by capacitive dividing.

$$V_{Gen} = \frac{C_{div}}{C_{div} + C_p} \, V_{DD} \quad (4.5)$$

If there were no mismatches,  $C_{div}/(C_{div}+C_p)=0.5$  will be realized and bias voltage of  $V_{DD}/2$  will be generated. When the power supply voltage varies to  $V_{DD}+\Delta V_{DD}$ , the generated bias voltage will be affected as:

$$V_{Gen2} = \frac{C_{div}}{C_{div} + C_p} \, (V_{DD} + \Delta V_{DD}) \quad (4.6)$$

Hence, in the ideal case, the gate-source voltage variations of the input transistors and the VCS transistors are equal, and the ADC gains tolerance against power supply variation ( $V_{Gen2}=V_{DD}/2+\Delta V_{DD}/2$ , so $\Delta V_{gs}$ of both $M_{in}$ and $M_{VCS}$ is $\Delta V_{DD}/2$ ). However, we need to consider the non-ideal effects caused by process mismatch between $C_{div}$ and $C_p$ . When the power supply voltage varies to $V_{DD}+\Delta V_{DD}$ and the two capacitor values are mismatched,

$$|\Delta V_{gsVCS} - \Delta V_{gsin}| = \Delta V_{DD} \left| \frac{C_{div}}{C_{div} + C_p} - 0.5 \right| \quad (4.7)$$

Equation (4.7) implies that the closer $C_{div}/(C_{div}+C_p)$ is to the ideal value of 0.5, the better the TCC cancels supply variation effects and the more resistant the ADC is to power supply variation.
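The relations (4.5) to (4.7), read with the divider ratio $C_{div}/(C_{div}+C_p)$ described in the text, can be checked numerically. The following is a minimal sketch assuming ideal capacitive division; the function names and the example capacitor values (10 fF and 11 fF) are illustrative and not taken from the fabricated design.

```python
def v_gen(c_div, c_p, v_dd):
    """Bias voltage generated by capacitive dividing, Eq. (4.5)."""
    return c_div / (c_div + c_p) * v_dd


def vcm_biased_mismatch(c_div, c_p, delta_vdd):
    """|dVgs_VCS - dVgs_in| for the V_CM biased VCS, Eq. (4.7)."""
    ratio = c_div / (c_div + c_p)
    return abs(delta_vdd) * abs(ratio - 0.5)


def vdd_biased_mismatch(delta_vdd):
    """|dVgs_VCS - dVgs_in| for a V_DD biased VCS: dVDD versus dVDD / 2."""
    return abs(delta_vdd) / 2


if __name__ == "__main__":
    d_vdd = 0.05  # 10% drift of an assumed 0.5 V supply
    print(v_gen(10e-15, 10e-15, 0.5))                 # ideal divider: V_DD / 2
    print(vcm_biased_mismatch(10e-15, 11e-15, d_vdd)) # 10% capacitor mismatch
    print(vdd_biased_mismatch(d_vdd))                 # V_DD biasing
```

For this example the $V_{CM}$ biased source leaves a gate-source mismatch of about 1.2 mV, against 25 mV for $V_{DD}$ biasing, which illustrates why the divider ratio only needs to be near 0.5 rather than exact.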

Fig. 4.10 shows Matlab simulation results, plotting power supply variation versus ADC resolution for several values of $|C_{div}/(C_{div}+C_p)-0.5|$ . The assumed calibrated power supply is 0.5 V. To maximize simulation efficiency, the TCC (CP2) including the VCS was modeled in Matlab, with its consistency against the circuit simulation results carefully confirmed. The rest of the 2-bit/step SAC ADC was modeled as well to obtain the resolution, where CP1 and the DAC were assumed to be ideal. If $|C_{div}/(C_{div}+C_p)-0.5|$ is under 0.3, which can be readily achieved even in a 40 nm process, the ADC achieves 7-bit resolution with a power supply variation of 10%.

Figure 4.10: Power supply variation versus ADC resolution with different settings.

### 4.3.5 Temperature variation effects

Finally, temperature variation effects are discussed. Temperature affects the transistor drain current in two ways: 1) a change in mobility and 2) a change in the transistor $V_{th}$ . Both mobility and $V_{th}$ have a negative temperature coefficient. However, while the decrease in mobility reduces $I_d$ , the decrease in $V_{th}$ increases $I_d$ exponentially. Since these effects oppose each other, the $I_d$ calculation is complex; when $V_{gs}$ is small, the mobility change dominates the $I_d$ change, and vice versa. Thus, the offset drift due to temperature depends on the comparator's input voltage. For example, when the set threshold is large (e.g. the 1st SA cycle), the set threshold voltage will drift largely from the



Figure 4.11: Chip photo.

calibrated value, and for later SA cycles the effect will be smaller. We conducted a temperature-varying simulation based on the settings of Fig. 4.10 and plotted the results in Fig. 4.18. At 400 K, the ADC resolution can decrease by 2 bits. This is a serious issue if the ADC operates in an environment where large temperature variation is expected. However, the effect can be countered by running the $V_{THcomp}$ calibration periodically.

## 4.4 Measurement Results

The proposed ADC prototype was designed and fabricated in a 1P7M 40 nm standard CMOS process. Fig. 4.11 shows the microphotograph and layout of the chip. The core area is only  $0.0153 \text{ mm}^2$  and dummy layers are not removed since the effects can be removed by calibration.

Fig. 4.12 (a) and (b) show the DNL and INL before and after calibration, respectively, at a power supply of 0.5 V. Foreground calibration was performed automatically with Matlab, under the same power supply. Before the calibration,



Figure 4.12: (a)DNL and INL before calibration at supply voltage of 0.5 V. (b)DNL and INL after calibration at supply voltage of 0.5 V.

a large number of miscodes was confirmed, resulting from C-DAC and VCS process mismatch. After the calibration, both DNL and INL are kept within 1 LSB, which proves the effectiveness of calibration with the internally generated reference.

Fig. 4.13 shows the measured FFT spectrum with 6.144 MS/s sampling frequency and Nyquist input frequency of 3.0585 MHz. Fig. 4.14 represents the signal frequency vs. SNDR of the ADC at 0.5 V. A flat frequency response was obtained between 100 kHz and 3 MHz (Nyquist frequency), and 3 dB bandwidth is 6 MHz. The maximum ERBW was 50 MHz measured at a power supply of 0.8 V with sampling frequency of 40.96 MS/s.

Fig. 4.15 shows the speed improvement versus power supply voltage, comparing the 3 dB cutoff frequencies of the 1-bit/step and 2-bit/step modes. With the proposed method, the ADC achieves a maximum speed improvement of 60% at a 0.5 V supply, but the improvement falls below 30% when the supply rises to 0.8 V as the DAC settling time shortens. However, at supply voltages below 0.4 V, the speed improvement was smaller than expected. To maximize the SAR ADC speed, the asynchronous SAR logic delay should be set



Figure 4.13: FFT spectrum at condition shown.



Figure 4.14: Input signal frequency versus SNDR measured at 0.5 V.



Figure 4.15: Power supply voltage versus speed improvement by 2-bit/step SAC operation.

slightly longer than the DAC settling [78]. According to the post-layout simulation results, the minimum delay that the asynchronous SAR logic could generate was nearly twice as long as the required DAC settling at such supply voltages. A delay generating circuit that operates over a wide supply voltage range is challenging to design.

Fig. 4.16 shows the SNDR dependence on power supply voltage variation. The foreground calibration was performed at each of the noted conditions, and then the power supply voltage was varied. The $V_{CM}$ biased VCS provides power supply noise immunity throughout the wide operating voltage range. With 10% variation, the ENOB drop was only 0.5. In Fig. 4.16, we assumed that the same power supply is used for the ADC input buffer and the ADC itself, so that both are affected similarly by the power supply variation $\Delta V_{DD}$ . However, if the buffer and ADC run on different supplies, the effect of variation will differ: only $V_{CM}$ varies, or vice versa. The measurement result for this case is plotted in Fig. 4.17, and the ADC tolerates a 10 mV variation. To prevent resolution deterioration due to low-voltage operation, the calibration was done at a 0.7 V supply voltage. The measured ENOB degradation best matches the simulation when



Figure 4.16: Power supply variation versus ENOB response in several calibrated supply voltages.



Figure 4.17: Effect of power supply variation with  $V_{\text{cm}}$  or  $V_{\text{DD}}$  changed separately



Figure 4.18: Simulated and measured temperature variation effects.

$|C_{div}/(C_{div}+C_p)-0.5|$ was estimated as 0.25.

The temperature variation effect of this ADC is plotted in Fig. 4.18. Calibration was done at 297 K and the temperature was then raised to measure the ENOB degradation. The degradation matches the simulation results. To compensate for temperature variation without periodic foreground calibration, an additional biasing technique is required, as in [79]. However, this technique has a very large power overhead and may consume more power than the ADC itself. Low-temperature measurements were not performed because the necessary instruments were unavailable, but simulation results imply that 6.5 bits can be achieved at 200 K.

The performance of a single ADC chip is summarized in Table 4.2, and a performance comparison with low-power state-of-the-art works is shown in Fig. 4.19. Our ADC operates down to 0.3 V while keeping an excellent FoM. The threshold configuring method using $V_{CM}$ biased current sources is effective in such an extremely low voltage region as well. The FoM achieved throughout the operating supply voltage range of 0.3-0.8 V is comparable with other works that were designed for a dedicated specification. Moreover, the power efficiency is better than that of ADCs which operate at multiple voltages.

Table 4.2: ADC performance summary.

| Technology            | 40nm                  |       |       |        |        |       |
|-----------------------|-----------------------|-------|-------|--------|--------|-------|
| Core Area             | 0.0153mm <sup>2</sup> |       |       |        |        |       |
| Supply Voltage        | 0.3 V                 | 0.4 V | 0.5 V | 0.6 V  | 0.7 V  | 0.8 V |
| F <sub>S</sub> [MS/s] | 0.20                  | 1.024 | 6.144 | 12.288 | 28.672 | 40.96 |
| SNDR Nyquist [dB]     | 43.3                  | 44.8  | 44.7  | 45.5   | 45.6   | 45.8  |
| ENOB                  | 6.91                  | 7.18  | 7.17  | 7.28   | 7.29   | 7.32  |
| Power [ $\mu$ W]      | 0.20                  | 0.71  | 6.43  | 18.5   | 53.9   | 107   |
| FoM [fJ/conv.]        | 8.3                   | 4.8   | 7.1   | 9.8    | 12     | 16    |
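
The FoM row can be cross-checked against the other rows of the table. The sketch below assumes the commonly used Walden definition, FoM = P / (2^ENOB x F_S), which the text does not state explicitly; the tuple layout and function name are illustrative. The computed values agree with the reported ones to within a few percent.

```python
# (supply [V], F_S [MS/s], ENOB [bit], power [uW], reported FoM [fJ/conv.])
TABLE_4_2 = [
    (0.3, 0.20, 6.91, 0.20, 8.3),
    (0.4, 1.024, 7.18, 0.71, 4.8),
    (0.5, 6.144, 7.17, 6.43, 7.1),
    (0.6, 12.288, 7.28, 18.5, 9.8),
    (0.7, 28.672, 7.29, 53.9, 12.0),
    (0.8, 40.96, 7.32, 107.0, 16.0),
]


def walden_fom_fj(power_uw, fs_msps, enob):
    """Walden figure of merit in fJ per conversion step (assumed definition)."""
    joules_per_step = (power_uw * 1e-6) / (2 ** enob * fs_msps * 1e6)
    return joules_per_step * 1e15


if __name__ == "__main__":
    for vdd, fs, enob, power, reported in TABLE_4_2:
        fom = walden_fom_fj(power, fs, enob)
        print(f"{vdd:.1f} V: computed {fom:.2f} fJ/conv., reported {reported}")
```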



Figure 4.19: Comparison with low power state-of-art works.

Our work was one of the pioneers seeking efficiency with sub-0.5 V SAR ADCs; when our paper was published, only a few SAR ADCs had reported such operation [80]. Since then, several 0.3 V SAR ADCs with extreme efficiencies (as low as 1 fJ/conv.) have been presented [81] [82] [83], showing that lowering the supply is one of the best ways to obtain a top FoM with SAR ADCs.

## 4.5 Conclusions

An extremely low-voltage, high-speed, and low-power SAR ADC was presented. Using wide-range threshold configuring comparators, 2-bit/step operation was enabled with a small area and low power consumption. A comparator threshold configuring technique using $V_{CM}$ biased current sources was introduced. Compared with conventional threshold configuring techniques, the proposed method can generate a large comparator offset with small power. Moreover, we proposed a novel design of the variable current source with power supply noise immunity. The effect was confirmed by measurement, and the ADC was immune to power supply variation of over 10%.

The prototype ADC achieved 6.1 MS/s and 44.3 dB SNDR with a power supply of 0.5 V. At supply of 0.4 V, the ADC achieves a peak FoM of 4.8 fJ/conv. and operates down to 0.3 V. With the proposed techniques, the ADC achieved over 50% speed improvement and achieved power efficiency competing with the state-of-the-art works.

# Chapter 5

## Conclusion

## 5.1 Summary

In this chapter, we summarize the findings established in each chapter to conclude the entire thesis.

Along with CMOS scaling, wireless/wireline communication performance has greatly advanced and continues to evolve. To realize a system on chip (SoC) for such products, high-performance ADCs are required. However, while such SoCs utilize scaled CMOS technologies to cut down the cost of the digital circuits, analog circuit performance severely degrades when implemented in such processes. Thus, the design of ADCs in scaled CMOS process environments has become one of the most challenging and critical fields of circuit design.

Throughout the thesis, we explored Hybrid ADCs and novel design techniques which heavily utilize successive-approximation (SA) circuitry, to realize process-scaling ADCs. We also aimed to establish an ADC design methodology suitable for scaled CMOS technologies.

In chapter 2, we introduced the concept and implementation of the digital amplifier (DA) to realize a CMOS process scalable switched capacitor amplifier. Conventionally, the amplifier (or Opamp) gain performance degrades greatly with scaling due to worsened transistor gain and lowered supply voltages, and this has been the greatest challenge in scaling Pipelined ADCs.

We presented the DA's all-error canceling feature, where the gain error, non-linearity, incomplete settling, power supply noise and thermal noise of the low-gain amplifier can be canceled out by feedback based on successive approximation. Unlike conventional amplifiers, the DA accuracy can be arbitrarily set by configuring the number of bits in the DA C-DAC; the amplifier gain is decoupled from the transistor intrinsic gain, which is suitable for scaled CMOS integration.

We also reported the measurement results of the calibration-free 0.7V 12b 160MS/s pipelined-SAR ADC. Without any calibration, the ADC achieved SNDR=61.1dB, FoM= 12.8fJ/conv., which was a  $3\times$  higher power efficiency than conventional calibration-free ADCs. Also, an inter-process performance comparison was performed, where we fabricated a 28nm and 65nm CMOS version of the DA (and the Pipelined ADC) to confirm the process scalability of the DA. Interestingly, we observed a  $3\times$  improvement in area, power and  $2\times$  improvement in amplification speed, due to the process scalability of successive approximation circuits.

In chapter 3, we introduced the ADC with dynamic architecture and frequency scaling (DAFS). High-speed ADCs with aggressive frequency-power scaling are required for ultra-wideband communication systems, but simply configuring the ADC supply voltage is not feasible. To accomplish superlinear power scaling in high-speed ADCs, we proposed DAFS: the ADC architecture is dynamically and adaptively configured between binary search and flash, reflecting the ADC clock rate. The architecture configuration is triggered by monitoring the excess delay of the conversion, and flash operation is used to cancel the excess delay. DAFS not only improves the power scaling significantly, but also compensates for the transistor speed shift due to PVT variation, which can be used to relax the design margin in high-speed ADCs.

We designed a 7-bit subranging ADC in 65nm CMOS, where DAFS was applied to the sub-ADC. The DAFS operation was confirmed in the range of 820–1220 MS/s, making our ADC the first to achieve superlinear power scaling with 1GS/s high-speed operation. Compared to the ADC performance with DAFS disabled, a maximum power reduction of 30% was achieved. The ADC achieved a peak FoM of 85 fJ/conv. at 820 MS/s, which is nearly a twofold improvement over conventional subranging ADCs.

In chapter 4, we introduced wide-range threshold configuring comparators (TCCs), aiming to enhance the successive approximation (SA) circuitry of the ADCs presented in chapters 2 and 3. For example, by utilizing 2-bit/step searches within the Digital Amplifier (DA) of chapter 2, the amplification speed can be significantly improved. While such TCCs are useful and enhance the performance of ADCs based on successive approximation, they had a number of design issues: 1) they are difficult to implement if the threshold configuring range is very large, and 2) TCCs typically have low power-supply-noise rejection (PSNR), so the threshold easily drifted with even small supply fluctuations.

We proposed a current source based TCC design which enables both wide-range threshold configurability and power supply variation resistance. The key technology relies on the proposed simple V<sub>cm</sub> biased current sources, which maintain sufficient comparator PSNR and keep the ADC immune to power supply variations of over 10%. To prove the effectiveness of the TCC, we implemented a 2-bit/step SAR ADC where the 2-bit/step comparison was carried out by TCCs instead of area and power consuming C-DACs. The prototype ADC fabricated in 40 nm CMOS achieved a 44.3 dB SNDR at 6.14 MS/s with a single supply voltage of 0.5 V, and achieved a peak FoM of 4.8 fJ/conv-step.

## 5.2 Future research directions

Last but not least, we would like to conclude our thesis by raising a few future research directions.



Figure 5.1: DA with 2-bit/step.

The first research direction is applying the threshold configuring comparators (TCCs) proposed in chapter 4 to the digital amplifier. With TCCs, we can achieve 2-bit/step SA operation to speed up the DA amplification. Currently, the total amplification time is 8ns, where 2ns is allocated to the Opamp amplification and the remaining 6ns to the DA. Since the 8-bit SA operation is much slower than the Opamp, 75% of the total amplification time is consumed in the DA. By applying 2-bit/step operation as in Fig. 5.1, the number of SA cycles is cut in half: the amplification completes within 5ns, a speed-up of about 40%. However, with 2-bit/step operation the comparator count increases threefold and calibration to set the comparator thresholds must be added, which is a non-negligible overhead. Additional techniques to null these overheads should be proposed to keep the total cost of the ADC competitive.
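The timing budget above can be written out explicitly. This is a small arithmetic sketch that assumes the DA time scales inversely with the number of bits resolved per SA step while the Opamp time stays fixed; the function name is hypothetical.

```python
def da_amplification_time(t_opamp_ns, t_da_ns, bits_per_step):
    """Total amplification time when the DA resolves bits_per_step bits per cycle."""
    return t_opamp_ns + t_da_ns / bits_per_step


if __name__ == "__main__":
    t_1bit = da_amplification_time(2.0, 6.0, bits_per_step=1)
    t_2bit = da_amplification_time(2.0, 6.0, bits_per_step=2)
    print(t_1bit, t_2bit)                       # total times in ns
    print(f"time saved: {1 - t_2bit / t_1bit:.1%}")
    print(f"rate gain:  {t_1bit / t_2bit - 1:.1%}")
```

Under this assumption the amplification time drops from 8 ns to 5 ns, a 37.5% reduction in time (equivalently, a 60% higher amplification rate), consistent with the roughly 40% speed-up quoted in the text.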

While the above proposal improves the DA speed, what can we do to further improve the DA power efficiency? Remember that 30% of the ADC power is still burned in the Opamp (Fig. 2.20). An interesting direction will be to replace the

|                                   | <b>28nm</b> | <b>16nm</b> |
|-----------------------------------|-------------|-------------|
| Opamp gain                        | 20 dB       | 30 dB       |
| Required DA gain<br>(target=60dB) | 40 dB       | 30 dB       |
| DA bits                           | 7-bit       | 5-bit       |
| DA speed                          | 6 ns        | 3 ns        |

Figure 5.2: DA estimated performance with 16nm and 28nm CMOS

Opamp with more efficient amplifiers (e.g. ring amplifier), the power efficiency can further be improved. Such fusion between ring amplifiers and digital amplifiers will be a very interesting research direction.

In the current digital amplifier design, the digital conversion results retrieved from the SA cycles are discarded. Can we make good use of the conversion results the DA itself produces? For example, if we fuse the ADC output and the DA output, we can obtain the error the Opamp generates for a given input. Using such information, one may apply feedback and calibrate the Opamp or ring amplifier to further reduce the amplification error, similar to background calibration.

### 5.2.1 Further scaling the DA amplifier (down to 16nm, 7nm and beyond)

Does the digital amplifier's performance scale with further scaled CMOS? And what will the DA performance look like in 16nm CMOS? Answering such questions will be an interesting research direction, since the goal of this thesis was to establish a process-scalable ADC design technique that remains effective in further scaled CMOS.

Here, in Fig. 5.2, we estimate that, compared to 28nm CMOS, 16nm CMOS with FinFETs will have 30% less gate delay and $2\times$ higher transistor output resistance. Interestingly, it is known that by moving from planar CMOS to FinFETs, the output resistance of transistors improves, since fin structures have longer effective channels. Therefore, we estimate that the DC gain of the two-stage opamp will improve by 10dB (note that since we do not have access to 16nm CMOS process information, these values are only estimates based on private communication). Thus, we can design the DA with fewer bits (e.g. 5 bits), which will benefit conversion speed and power efficiency. Since the SA cycle speed will improve with scaling as well, we expect that the DA amplification speed will improve $2\times$ as a whole; even designing a 320 MS/s Pipelined-SAR ADC will be possible with 16nm CMOS!
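The entries of Fig. 5.2 can be reproduced with a back-of-the-envelope calculation. The sketch below is not the estimation procedure used in this thesis; it assumes that each DA bit contributes about 6.02 dB (a factor of two) of gain, and that the DA time scales with both the number of bits and the gate delay. These assumptions and all names are illustrative, although they happen to agree with the values in the figure.

```python
import math

DB_PER_BIT = 20 * math.log10(2)  # about 6.02 dB per bit (assumption)


def required_da_gain_db(target_db, opamp_gain_db):
    """Gain the DA must supply on top of the Opamp to reach the target."""
    return target_db - opamp_gain_db


def da_bits(da_gain_db):
    """Smallest bit count whose gain covers da_gain_db (assumed model)."""
    return math.ceil(da_gain_db / DB_PER_BIT)


def scaled_da_time_ns(ref_time_ns, ref_bits, new_bits, gate_delay_ratio):
    """DA time assumed proportional to bit count and to gate delay."""
    return ref_time_ns * (new_bits / ref_bits) * gate_delay_ratio


TARGET_DB = 60
gain_28 = required_da_gain_db(TARGET_DB, 20)  # 28nm: Opamp gain 20 dB
gain_16 = required_da_gain_db(TARGET_DB, 30)  # 16nm: Opamp gain 30 dB
bits_28 = da_bits(gain_28)
bits_16 = da_bits(gain_16)

# 6 ns is the 28nm DA time; 0.7 reflects the estimated 30% less gate delay.
time_16 = scaled_da_time_ns(6.0, bits_28, bits_16, 0.7)

print(f"28nm: DA gain {gain_28} dB, {bits_28} bits, 6.0 ns")
print(f"16nm: DA gain {gain_16} dB, {bits_16} bits, {time_16:.1f} ns")
```

With these assumptions the 16nm DA needs 5 bits and about 3 ns, half of the 6 ns needed in 28nm, in line with the expected $2\times$ improvement in DA amplification speed.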

While scaling down to 7nm CMOS will not improve the Opamp performance (and is likely to degrade it), the SA cycle speed will continue to scale, and we expect higher performance in the 7nm node as well. Since the DA compensates for the amplifier accuracy, we estimate that one can achieve high-accuracy amplifiers without any gain calibration techniques.

# Bibliography

- [1] A Mocuta, P Weckx, S Demuynck, et al. “Enabling CMOS Scaling Towards 3nm and Beyond”. In: *2018 IEEE Symposium on VLSI Technology*. IEEE. 2018, pp. 147–148.
- [2] TSMC. *TSMC and OIP Ecosystem Partners Deliver Industry’s First Complete Design Infrastructure for 5nm Process Technology*. [https://www.tsmc.com/uploadfile/pr/news/pdf/THPGWQTHTH/NEWS\\_FILE\\_EN.pdf](https://www.tsmc.com/uploadfile/pr/news/pdf/THPGWQTHTH/NEWS_FILE_EN.pdf). Accessed: 2019-6-21.
- [3] ITRS. *International Technology Roadmap for Semiconductors*. <http://www.itrs2.net/>. Accessed: 2019-5-21.
- [4] Meint Smit, Kevin Williams, and Jos van der Tol. “1.3 Integration of Photonics and Electronics”. In: *2019 IEEE International Solid-State Circuits Conference-(ISSCC)*. IEEE. 2019, pp. 29–34.
- [5] Yanfei Chen, Masaya Kibune, Asako Toda, et al. “22.2 A 25Gb/s hybrid integrated silicon photonic transceiver in 28nm CMOS and SOI”. In: *2015 IEEE International Solid-State Circuits Conference-(ISSCC) Digest of Technical Papers*. IEEE. 2015, pp. 1–3.
- [6] Taehwan Kim, Pavan Bhargava, Christopher V Poulton, et al. “29.5 A Single-Chip Optical Phased Array in a 3D-Integrated Silicon Photonics/65nm CMOS Technology”. In: *2019 IEEE International Solid-State Circuits Conference-(ISSCC)*. IEEE. 2019, pp. 464–466.

- [7] Rupp, Karl. *42 Years of Microprocessor Trend Data*. <https://www.karlrupp.net/2018/02/42-years-of-microprocessor-trend-data/>. Accessed: 2019-6-21.
- [8] Andrew Danowitz, Kyle Kelley, James Mao, et al. “CPU DB: recording microprocessor history”. In: *Communications of the ACM* 55.4 (2012), pp. 55–63.
- [9] Robert H Dennard, Fritz H Gaenslen, V Leo Rideout, et al. “Design of ion-implanted MOSFET’s with very small physical dimensions”. In: *IEEE Journal of Solid-State Circuits* 9.5 (1974), pp. 256–268.
- [10] Gordon E Moore et al. *Cramming more components onto integrated circuits*. 1965.
- [11] ABCI. *Commoditizing supercomputer cooling technologies to Cloud*. <https://abci.ai/en/about>. Accessed: 2019-6-22.
- [12] nVidia. *NVIDIA TESLA V100 GPU ARCHITECTURE THE WORLD’S MOST ADVANCED DATA CENTER GPU*. <https://images.nvidia.com/content/volta-architecture/pdf/volta-architecture-whitepaper.pdf>. Accessed: 2019-6-22.
- [13] AnandTech. *The Apple iPhone 6s and iPhone 6s Plus Review*. <https://www.anandtech.com/shapple-iphone-6s-and-iphone-6s-plus-review/3>. Accessed: 2019-6-22.
- [14] Daniel Yang and Stacy Wegner. *Apple iPhone XS Max Teardown*. <https://www.techinsights.coiphone-xs-max-teardown>. Accessed: 2019-6-22.
- [15] Daniel Yang and Stacy Wegner. *Samsung Galaxy S10 5G Teardown*. <https://www.techinsightsco galaxy-s10-5g-teardown>. Accessed: 2019-6-22.
- [16] Hirofumi Sasaki, Doohwan Lee, Hiroyuki Fukumoto, et al. “Experiment on Over-100-Gbps Wireless Transmission with OAM-MIMO Multiplexing System in 28-GHz Band”. In: *2018 IEEE Global Communications Conference (GLOBECOM)*. IEEE. 2018, pp. 1–6.

- [17] Gain Kim, Lukas Kull, Danny Luu, et al. “30.2 A 161mW 56Gb/s ADC-Based Discrete Multitone Wireline Receiver Data-Path in 14nm FinFET”. In: *2019 IEEE International Solid-State Circuits Conference-(ISSCC)*. IEEE. 2019, pp. 476–478.
- [18] Shiva Kiran, Shengchang Cai, Ying Luo, et al. “A 52-Gb/s ADC-based PAM-4 receiver with comparator-assisted 2-bit/stage SAR ADC and partially unrolled DFE in 65-nm CMOS”. In: *IEEE Journal of Solid-State Circuits* 54.3 (2018), pp. 659–671.
- [19] Mark Horowitz. “1.1 computing’s energy problem (and what we can do about it)”. In: *2014 IEEE international solid-state circuits conference digest of technical papers (ISSCC)*. IEEE. 2014, pp. 10–14.
- [20] Xilinx. *Zynq RF SoC*. <https://www.xilinx.com/products/silicon-devices/soc/rfsoc.html>. Accessed: 2019-6-22.
- [21] Bruno Vaz, Bob Verbruggen, Christophe Erdmann, et al. “A 13bit 5GS/s ADC with time-interleaved chopping calibration in 16nm FinFET”. In: *2018 IEEE Symposium on VLSI Circuits*. IEEE. 2018, pp. 99–100.
- [22] Bruno Vaz, Adrian Lynam, Bob Verbruggen, et al. “16.1 A 13b 4GS/s digitally assisted dynamic 3-stage asynchronous pipelined-SAR ADC”. In: *2017 IEEE International Solid-State Circuits Conference (ISSCC)*. IEEE. 2017, pp. 276–277.
- [23] Parag Upadhyaya, Chi Fung Poon, Siok Wei Lim, et al. “A fully adaptive 19-to-56Gb/s PAM-4 wireline transceiver with a configurable ADC in 16nm FinFET”. In: *2018 IEEE International Solid-State Circuits Conference-(ISSCC)*. IEEE. 2018, pp. 108–110.
- [24] B. Murmann. *ADC Performance Survey 1997-2018*. <http://web.stanford.edu/~murmann/adcsurvey.html>. Accessed: 2018-12-30.
- [25] Robert H Walden. “Analog-to-digital converter survey and analysis”. In: *IEEE Journal on selected areas in communications* 17.4 (1999), pp. 539–550.

- [26] Michiel Van Elzakker, Ed Van Tuijl, Paul Geraedts, et al. “A 1.9  $\mu\text{W}$  4.4 fJ/conversion-step 10b 1MS/s charge-redistribution ADC”. In: *2008 IEEE International Solid-State Circuits Conference-Digest of Technical Papers*. IEEE. 2008, pp. 244–610.
- [27] Chun-Cheng Liu, Soon-Jyh Chang, Guan-Ying Huang, et al. “A 10-bit 50-MS/s SAR ADC with a monotonic capacitor switching procedure”. In: *IEEE Journal of Solid-State Circuits* 45.4 (2010), pp. 731–740.
- [28] Yuan Zhou, Benwei Xu, and Yun Chiu. “A 12 bit 160 MS/s two-step SAR ADC with background bit-weight calibration using a time-domain proximity detector”. In: *IEEE Journal of Solid-State Circuits* 50.4 (2015), pp. 920–931.
- [29] Bob Verbruggen, Kazuaki Deguchi, Badr Malki, et al. “A 70 db snr 200 ms/s 2.3 mw dynamic pipelined sar adc in 28nm digital cmos”. In: *IEEE VLSI Circuits Digest of Technical Papers, 2014 Symposium on*. 2014, pp. 1–2.
- [30] Chun-Cheng Liu, Mu-Chen Huang, and Yu-Hsuan Tu. “A 12 bit 100 MS/s SAR-assisted digital-slope ADC”. In: *IEEE Journal of Solid-State Circuits* 51.12 (2016), pp. 2941–2950.
- [31] Chun C Lee and Michael P Flynn. “A SAR-assisted two-stage pipeline ADC”. In: *IEEE Journal of Solid-State Circuits* 46.4 (2011), pp. 859–869.
- [32] Masanori Furuta, Mai Nozawa, and Tetsuro Itakura. “A 10-bit, 40-MS/s, 1.21 mW pipelined SAR ADC using single-ended 1.5-bit/cycle conversion technique”. In: *IEEE journal of solid-state circuits* 46.6 (2011), pp. 1360–1370.
- [33] Boris Bellalta. “IEEE 802.11 ax: High-efficiency WLANs”. In: *IEEE Wireless Communications* 23.1 (2016), pp. 38–46.
- [34] Jeffrey G Andrews, Stefano Buzzi, Wan Choi, et al. “What will 5G be?” In: *IEEE Journal on selected areas in communications* 32.6 (2014), pp. 1065–1082.
- [35] Eldad Perahia, Carlos Cordeiro, Minyoung Park, et al. “IEEE 802.11 ad: Defining the next generation multi-Gbps Wi-Fi”. In: *2010 7th IEEE consumer communications and networking conference*. IEEE. 2010, pp. 1–5.

- [36] Theodore S Rappaport, Shu Sun, Rimma Mayzus, et al. “Millimeter wave mobile communications for 5G cellular: It will work!” In: *IEEE access* 1 (2013), pp. 335–349.
- [37] Lukas Kull, Thomas Toifl, Martin Schmatz, et al. “22.1 A 90GS/s 8b 667mW 64× interleaved SAR ADC in 32nm digital SOI CMOS”. In: *2014 IEEE International Solid-State Circuits Conference Digest of Technical Papers (ISSCC)*. IEEE. 2014, pp. 378–379.
- [38] Greg Semeraro, Grigoris Magklis, Rajeev Balasubramonian, et al. “Energy-efficient processor design using multiple clock domains with dynamic voltage and frequency scaling”. In: *Proceedings Eighth International Symposium on High Performance Computer Architecture*. IEEE. 2002, pp. 29–40.
- [39] Shusuke Kawai, Hiromitsu Aoyama, Rui Ito, et al. “An 802.11 ax 4× 4 spectrum-efficient WLAN AP transceiver SoC supporting 1024QAM with frequency-dependent IQ calibration and integrated interference analyzer”. In: *2018 IEEE International Solid-State Circuits Conference-(ISSCC)*. IEEE. 2018, pp. 442–444.
- [40] B Robert Gregoire and Un-Ku Moon. “An over-60dB true rail-to-rail performance using correlated level shifting and an opamp with 30dB loop gain”. In: *IEEE International Solid-State Circuits Conference Digest of Technical Papers (ISSCC)*. 2008, pp. 540–634.
- [41] John K Fiorenza, Todd Sepke, Peter Holloway, et al. “Comparator-based switched-capacitor circuits for scaled CMOS technologies”. In: *IEEE Journal of Solid-State Circuits* 41.12 (2006), pp. 2658–2668.
- [42] Lane Brooks and Hae-Seung Lee. “A 12b 50MS/s fully differential zero-crossing-based ADC without CMFB”. In: *IEEE International Solid-State Circuits Conference Digest of Technical Papers (ISSCC)*. 2009, pp. 166–167.

- [43] Lane Brooks and Hae-Seung Lee. “A 12b, 50 MS/s, fully differential zero-crossing based pipelined ADC”. In: *IEEE Journal of Solid-State Circuits* 44.12 (2009), pp. 3329–3343.
- [44] Dong-Young Chang, Carlos Munoz, Denis Daly, et al. “11.6 A 21mW 15b 48MS/s zero-crossing pipeline ADC in 0.13  $\mu\text{m}$  CMOS with 74dB SNDR”. In: *IEEE International Solid-State Circuits Conference Digest of Technical Papers (ISSCC)*. 2014, pp. 204–205.
- [45] Benjamin Hershberg, Skyler Weaver, Kazuki Sobue, et al. “Ring amplifiers for switched capacitor circuits”. In: *IEEE Journal of Solid-State Circuits* 47.12 (2012), pp. 2928–2942.
- [46] Yong Lim and Michael P Flynn. “A 1 mW 71.5 dB SNDR 50 MS/s 13 bit fully differential ring amplifier based SAR-assisted pipeline ADC”. In: *IEEE Journal of Solid-State Circuits* 50.12 (2015), pp. 2901–2911.
- [47] Benjamin Poris Hershberg. “Ring amplification for switched capacitor circuits”. In: (2012).
- [48] Benjamin Hershberg, Davide Dermit, Barend van Liempd, et al. “A 3.2 GS/s 10 ENOB 61mW Ringamp ADC in 16nm with Background Monitoring of Distortion”. In: *2019 IEEE International Solid-State Circuits Conference-(ISSCC)*. IEEE. 2019, pp. 58–60.
- [49] Benjamin Hershberg, Barend van Liempd, Nereo Markulic, et al. “A 6-to-600MS/s Fully Dynamic Ringamp Pipelined ADC with Asynchronous Event-Driven Clocking in 16nm”. In: *2019 IEEE International Solid-State Circuits Conference-(ISSCC)*. IEEE. 2019, pp. 68–70.
- [50] Boris Murmann and Bernhard E Boser. “A 12 b 75 MS/s Pipelined ADC using Open-Loop Residue Amplification”. In: *IEEE International Solid-State Circuits Conference Digest of Technical Papers (ISSCC)*. 2003, pp. 328–497.

- [51] Bob Verbruggen, Masao Iriguchi, Manuel de la Guia Solaz, et al. “A 2.1 mW 11b 410 MS/s dynamic pipelined SAR ADC with background calibration in 28nm digital CMOS”. In: *IEEE VLSI Circuits (VLSIC), 2013 Symposium on*. 2013, pp. C268–C269.
- [52] Hai Huang, Hongda Xu, Brian Elies, et al. “A non-interleaved 12-b 330-MS/s pipelined-SAR ADC with PVT-stabilized dynamic amplifier achieving sub-1-dB SNDR variation”. In: *IEEE Journal of Solid-State Circuits* 52.12 (2017), pp. 3235–3247.
- [53] Minglei Zhang, Kyoohyun Noh, Xiaohua Fan, et al. “A 0.8–1.2 V 10–50 MS/s 13-bit subranging pipelined-SAR ADC using a temperature-insensitive time-based amplifier”. In: *IEEE Journal of Solid-State Circuits* 52.11 (2017), pp. 2991–3005.
- [54] Kentaro Yoshioka, Tomohiko Sugimoto, Naoya Waki, et al. “A 0.7 V 12b 160MS/s 12.8 fJ/conv-step pipelined-SAR ADC in 28nm CMOS with digital amplifier technique”. In: *IEEE International Solid-State Circuits Conference Digest of Technical Papers (ISSCC)*. 2017, pp. 478–479.
- [55] Kentaro Yoshioka, Tomohiko Sugimoto, Naoya Waki, et al. “Digital Amplifier: An Power-Efficient and Process-Scaling Amplifier for Switched Capacitor Circuits”. In: *IEEE Transactions on Very Large Scale Integration (VLSI) Systems* Accepted (2019).
- [56] Lukas Kull, Thomas Toifl, Martin Schmatz, et al. “A 3.1 mW 8b 1.2 GS/s single-channel asynchronous SAR ADC with alternate comparators for enhanced speed in 32 nm digital SOI CMOS”. In: *IEEE Journal of Solid-State Circuits* 48.12 (2013), pp. 3049–3058.
- [57] Zhiheng Cao, Shouli Yan, and Yunchu Li. “A 32mW 1.25 GS/s 6b 2b/step SAR ADC in 0.13  $\mu\text{m}$  CMOS”. In: *IEEE International Solid-State Circuits Conference Digest of Technical Papers (ISSCC)*. 2008, pp. 542–634.

- [58] Kentaro Yoshioka, Akira Shikata, Ryota Sekimoto, et al. “An 8 bit 0.3–0.8 V 0.2–40 MS/s 2-bit/step SAR ADC with successively activated threshold configuring comparators in 40 nm CMOS”. In: *IEEE Transactions on Very Large Scale Integration (VLSI) Systems* 23.2 (2015), pp. 356–368.
- [59] Yun Chai and Jieh-Tsorng Wu. “A 5.37 mW 10b 200MS/s dual-path pipelined ADC”. In: *IEEE International Solid-State Circuits Conference Digest of Technical Papers (ISSCC)*. 2012, pp. 462–464.
- [60] R Demerow. “Settling time of operational amplifiers”. In: *Analog Dialogue* 4.1 (1970).
- [61] Hung-Yen Tai, Yao-Sheng Hu, Hung-Wei Chen, et al. “A 0.85 fJ/conversion-step 10b 200kS/s subranging SAR ADC in 40nm CMOS”. In: *IEEE International Solid-State Circuits Conference Digest of Technical Papers (ISSCC)*. 2014, pp. 196–197.
- [62] Stacy Ho, Chi-Lun Lo, Jiayun Ru, et al. “A 23 mW, 73 dB dynamic range, 80 MHz BW continuous-time delta-sigma modulator in 20 nm CMOS”. In: *IEEE Journal of Solid-State Circuits* 50.4 (2015), pp. 908–919.
- [63] Masaya Miyahara, Yusuke Asada, Daehwa Paik, et al. “A low-noise self-calibrating dynamic comparator for high-speed ADCs”. In: *IEEE Asian Solid-State Circuits Conference*. 2008, pp. 269–272.
- [64] Pieter Harpe, Eugenio Cantatore, and Arthur van Roermund. “A 2.2/2.7 fJ/conversion-step 10/12b 40kS/s SAR ADC with Data-Driven Noise Reduction”. In: *IEEE International Solid-State Circuits Conference Digest of Technical Papers (ISSCC)*. 2013, pp. 270–271.
- [65] Takashi Morie, Takuji Miki, Kazuo Matsukawa, et al. “A 71dB-SNDR 50MS/s 4.2 mW CMOS SAR ADC by SNR enhancement techniques utilizing noise”. In: *IEEE International Solid-State Circuits Conference Digest of Technical Papers (ISSCC)*. 2013, pp. 272–273.

- [66] Kentaro Yoshioka and Hiroki Ishikuro. “A 13b SAR ADC with eye-opening VCO based comparator”. In: *ESSCIRC 2014-40th European Solid State Circuits Conference (ESSCIRC)*. IEEE. 2014, pp. 411–414.
- [67] Kentaro Yoshioka, Ryo Saito, Takumi Danjo, et al. “Dynamic architecture and frequency scaling in 0.8–1.2 GS/s 7 b subranging ADC”. In: *IEEE Journal of Solid-State Circuits* 50.4 (2015), pp. 932–945.
- [68] Kentaro Yoshioka, Ryo Saito, Takumi Danjo, et al. “7-bit 0.8–1.2 GS/s dynamic architecture and frequency scaling subrange ADC with binary-search/flash live configuring technique”. In: *2014 Symposium on VLSI Circuits Digest of Technical Papers*. IEEE. 2014, pp. 1–2.
- [69] Geert Van der Plas and Bob Verbruggen. “A 150 MS/s 133uW 7 bit ADC in 90 nm Digital CMOS”. In: *IEEE Journal of Solid-State Circuits* 43.12 (2008), pp. 2631–2640.
- [70] Bob Verbruggen, Jan Craninckx, Maarten Kuijk, et al. “A 2.6 mW 6 bit 2.2 GS/s fully dynamic pipeline ADC in 40 nm digital CMOS”. In: *IEEE Journal of Solid-State Circuits* 45.10 (2010), pp. 2080–2090.
- [71] Jonathan Proesel, Gokce Keskin, Jean-Olivier Plouchart, et al. “An 8-bit 1.5 GS/s flash ADC using post-manufacturing statistical selection”. In: *IEEE Custom Integrated Circuits Conference 2010*. IEEE. 2010, pp. 1–4.
- [72] Kentaro Yoshioka, Akira Shikata, Ryota Sekimoto, et al. “An 8bit 0.35–0.8 V 0.5–30MS/s 2bit/step SAR ADC with wide range threshold configuring comparator”. In: *2012 Proceedings of the ESSCIRC (ESSCIRC)*. IEEE. 2012, pp. 381–384.
- [73] Zhiheng Cao, Shouli Yan, and Yunchu Li. “A 32 mW 1.25 GS/s 6b 2b/Step SAR ADC in 0.13um CMOS”. In: *IEEE Journal of Solid-State Circuits* 44.3 (2009), pp. 862–873.

- [74] Ryota Sekimoto, Akira Shikata, Tadahiro Kuroda, et al. “A 40nm 50S/s–8MS/s ultra low voltage SAR ADC with timing optimized asynchronous clock generator”. In: *2011 Proceedings of the ESSCIRC (ESSCIRC)*. IEEE. 2011, pp. 471–474.
- [75] Pierluigi Nuzzo, Claudio Nani, Costantino Armiento, et al. “A 6-bit 50-MS/s threshold configuring SAR ADC in 90-nm digital CMOS”. In: *IEEE Transactions on Circuits and Systems I: Regular Papers* 59.1 (2011), pp. 80–92.
- [76] Bob Verbruggen, Jan Craninckx, Maarten Kuijk, et al. “A 2.6 mW 6 bit 2.2 GS/s fully dynamic pipeline ADC in 40 nm digital CMOS”. In: *IEEE Journal of Solid-State Circuits* 45.10 (2010), pp. 2080–2090.
- [77] Manar El-Chammas and Boris Murmann. “A 12-GS/s 81-mW 5-bit time-interleaved flash ADC with background timing skew calibration”. In: *IEEE Journal of Solid-State Circuits* 46.4 (2011), pp. 838–847.
- [78] Masato Yoshioka, Kiyoshi Ishikawa, Takeshi Takayama, et al. “A 10-b 50-MS/s 820-uW SAR ADC With On-Chip Digital Calibration”. In: *IEEE transactions on biomedical circuits and systems* 4.6 (2010), pp. 410–416.
- [79] Yuji Nakajima, Norihito Kato, Akemi Sakaguchi, et al. “A 7-bit, 1.4 GS/s ADC with offset drift suppression techniques for one-time calibration”. In: *IEEE Transactions on Circuits and Systems I: Regular Papers* 60.8 (2013), pp. 1979–1990.
- [80] Hung-Yen Tai, Hung-Wei Chen, and Hsin-Shu Chen. “A 3.2 fJ/c.-s. 0.35V 10b 100ks/s SAR ADC in 90nm CMOS”. In: *2012 Symposium on VLSI Circuits (VLSIC)*. IEEE. 2012, pp. 92–93.
- [81] Jin-Yi Lin and Chih-Cheng Hsieh. “A 0.3 V 10-bit 1.17 f SAR ADC with merge and split switching in 90 nm CMOS”. In: *IEEE Transactions on Circuits and Systems I: Regular Papers* 62.1 (2014), pp. 70–79.

- [82] Pei-Chen Lee, Jin-Yi Lin, and Chih-Cheng Hsieh. “A 0.4 V 1.94 fJ/conversion-step 10 bit 750 kS/s SAR ADC with input-range-adaptive switching”. In: *IEEE Transactions on Circuits and Systems I: Regular Papers* 63.12 (2016), pp. 2149–2157.
- [83] Jin-Yi Lin and Chih-Cheng Hsieh. “A 0.3 V 10-bit SAR ADC with first 2-bit guess in 90-nm CMOS”. In: *IEEE Transactions on Circuits and Systems I: Regular Papers* 64.3 (2017), pp. 562–572.

## Publication list

### Journals

1. Kentaro Yoshioka, Tomohiko Sugimoto, Naoya Waki, Sinnyoung Kim, Daisuke Kurose, Hirotomo Ishii, Masanori Furuta, Akihide Sai, Hiroki Ishikuro, Tetsuro Itakura, “Digital Amplifier: An Power-Efficient and Process-Scaling Amplifier for Switched Capacitor Circuits” In: *IEEE Trans. VLSI Systems*, Accepted. (Chapter 2)
2. Kentaro Yoshioka, Ryo Saito, Takumi Danjo, Sanroku Tsukamoto, Hiroki Ishikuro, “Dynamic architecture and frequency scaling in 0.8–1.2 GS/s 7 b subranging ADC” In: *IEEE J. Solid-State Circuits*, vol. 50, no. 4, pp. 932–945, Apr. 2015. (Chapter 3)
3. Kentaro Yoshioka, Akira Shikata, Ryota Sekimoto, Tadahiro Kuroda, Hiroki Ishikuro, “An 8 bit 0.3–0.8 V 0.2–40 MS/s 2-bit/step SAR ADC with successively activated threshold configuring comparators in 40 nm CMOS” In: *IEEE Trans. VLSI Systems*, vol. 23, no. 2, pp. 356–368, Feb. 2015. (Chapter 4)

### International Conferences

1. Kentaro Yoshioka, Edward Lee, Simon Wong, Mark Horowitz, “Dataset Culling: Towards Efficient Training Of Distillation-Based Domain Specific Models” *To be presented at IEEE ICIP*, 2019.
2. Kentaro Yoshioka, Yosuke Toyama, Koichiro Ban, Daisuke Yashima, Shigeru Maya, Akihide Sai, Kohei Onizuka, “PhaseMAC: A 14 TOPS/W 8bit GRO based Phase Domain MAC Circuit for In-Sensor-Computed Deep Learning Accelerators” In: *IEEE Symp. VLSI Circuits*, pp.263–264, June 2018.
3. Kentaro Yoshioka, Hiroshi Kubota, Tomonori Fukushima, Satoshi Kondo, Tuan Thanh Ta, Hidenori Okuni, Kaori Watanabe, Yoshinari Ojima, Katsuyuki Kimura, Sohichiroh Hosoda, Yutaka Oota, Tomohiro Koizumi, Naoyuki Kawabe, Yasuhiro Ishii, Yoichiro Iwagami, Seitaro Yagi, Isao Fujisawa, Nobuo Kano, Tomohiro Sugimoto, Daisuke Kurose, Naoya Waki, Yumi Higashi, Tetsuya Nakamura, Yoshikazu Nagashima, Hirotomo Ishii, Akihide Sai, Nobu Matsumoto, “A 20ch TDC/ADC hybrid SoC for 240x96-pixel 10%-reflection < 0.125%-precision 200m-range imaging LiDAR with smart accumulation technique”, In: *IEEE International Solid-State Circuits Conference Digest of Technical Papers (ISSCC)*, pp.92-94, Feb.2018.
4. Kentaro Yoshioka, Tomohiko Sugimoto, Naoya Waki, Sinnyoung Kim, Daisuke Kurose, Hirotomo Ishii, Masanori Furuta, Akihide Sai, Tetsuro Itakura, “A 0.7 V 12b 160MS/s 12.8 fJ/conv-step pipelined-SAR ADC in 28nm CMOS with digital amplifier technique”, In: *IEEE International Solid-State Circuits Conference Digest of Technical Papers (ISSCC)*, pp.478-479, Feb.2017.
5. Kentaro Yoshioka, Ryo Saito, Takumi Danjo, Sanroku Tsukamoto, and Hiroki Ishikuro, “7-bit 0.8–1.2 GS/s dynamic architecture and frequency scaling subrange ADC with binary-search/flash live configuring technique,” In: *IEEE Symp. VLSI Circuits*, 2014, pp. 1–2.
6. Kentaro Yoshioka, Akira Shikata, Ryota Sekimoto, Tadahiro Kuroda, Hiroki Ishikuro, “A 0.0058mm<sup>2</sup> 7.0 ENOB 24MS/s 17fJ/conv. threshold configuring SAR ADC with source voltage shifting and interpolation technique”, *IEEE Symp. VLSI Circuits*, pp.266-267, June 2013.
7. Kentaro Yoshioka, Hiroki Ishikuro, “A 13b SAR ADC with eye-opening VCO based comparator”, *ESSCIRC*, pp.411-414, Sept. 2014.
8. Kentaro Yoshioka, Akira Shikata, Ryota Sekimoto, Tadahiro Kuroda, Hiroki Ishikuro, “An 8b extremely area efficient threshold configuring SAR ADC with source voltage shifting technique,” *IEEE ASP-DAC*, pp.31-32, Jan. 2014.
9. Kentaro Yoshioka, Yosuke Toyama, Teruo Jyo, Hiroki Ishikuro, “A voltage scaling 0.25–1.8 V delta-sigma modulator with inverter-opamp self-configuring amplifier” *IEEE ISCAS*, pp.809-812, May 2013.

10. Kentaro Yoshioka, Akira Shikata, Ryota Sekimoto, Tadahiro Kuroda, Hiroki Ishikuro, “A 0.35-0.8 V 8b 0.5-35MS/s 2bit/step extremely-low power SAR ADC”, *IEEE ASP-DAC*, pp.111-112, Jan. 2013.
11. Kentaro Yoshioka, Akira Shikata, Ryota Sekimoto, Tadahiro Kuroda, Hiroki Ishikuro, “An 8bit 0.35–0.8 V 0.5–30MS/s 2bit/step SAR ADC with wide range threshold configuring comparator” *ESSCIRC*, pp.381-384, Sept. 2012.

## Awards

1. Special Feature Award, *IEEE ASP-DAC*.
2. Co-recipient: Best Student Award, *IEEE A-SSCC 2012*.

## Other works

### Journals

1. Kentaro Yoshioka, Hiroshi Kubota, Tomonori Fukushima, Satoshi Kondo, Tuan Thanh Ta, et al, “A 20-ch TDC/ADC Hybrid Architecture LiDAR SoC for 240x96 Pixel 200-m Range Imaging With Smart Accumulation Technique and Residue Quantizing SAR ADC” *IEEE J. Solid-State Circuits*, vol. 53, no. 11, pp. 3026–3038, Nov. 2018.
2. Yosuke Toyama, Kentaro Yoshioka, Koichiro Ban, Akihide Sai, Kohei Onizuka “An 8-Bit 12.4 TOPS/W Phase-Domain MAC Circuit for Energy-Constrained Deep Learning Accelerators”, *IEEE J. Solid-State Circuits*, Accepted.
3. Shusuke Kawai, Rui Ito, Kengo Nakata, Yutaka Shimizu, Motoki Nagata, Tomohiko Takeuchi, Hiroyuki Kobayashi, Katsuyuki Ikeuchi, Takayuki Kato, Yosuke Hagiwara, Yuki Fujimura, Kentaro Yoshioka, Shigehito Saigusa, Hiroshi Yoshida, Makoto Arai, Toshiyuki Yamagishi, Hirotugu Kajihara, Kazuhisa Horiuchi, Hideki Yamada, Tomoya Suzuki, Yuki Ando, Kensuke Nakanishi, Koichiro Ban, Masahiro Sekiya, Yoshimasa Egashira, Tsuguhide Aoki, Kohei Onizuka, Toshiya Mitomo, “An 802.11ax 4×4 High-Efficiency WLAN AP Transceiver SoC Supporting 1024-QAM With Frequency-Dependent IQ Calibration and Integrated Interference Analyzer” *IEEE J. Solid-State Circuits*, vol. 53, no. 12, pp. 3688-3699, Dec. 2018.
4. Ryota Sekimoto, Akira Shikata, Kentaro Yoshioka, Tadahiro Kuroda, Hiroki Ishikuro, “A 0.5-V 5.2-fJ/conversion-step full asynchronous SAR ADC with leakage power reduction down to 650 pW by boosted self-power gating in 40-nm CMOS” *IEEE J. Solid-State Circuits*, vol. 48, no. 11, pp. 2628-2636, Nov. 2013.
5. Ryota Sekimoto, Akira Shikata, Kentaro Yoshioka, Tadahiro Kuroda, Hiroki Ishikuro, “An adaptive DAC settling waiting time optimized ultra low voltage asynchronous SAR ADC in 40 nm CMOS” *IEICE Trans. Electronics*, Vol.96, pp.820-827, June 2013.
6. Akira Shikata, Ryota Sekimoto, Kentaro Yoshioka, Tadahiro Kuroda, Hiroki Ishikuro, “A 4–10 bit, 0.4–1 V Power Supply, Power Scalable Asynchronous SAR-ADC in 40 nm-CMOS with Wide Supply Voltage Range SAR Controller” *IEICE Trans. Electronics*, Vol.96, pp.443-452, Feb 2013.

### International Conferences

1. Yosuke Toyama, Kentaro Yoshioka, Koichiro Ban, Akihide Sai, Kohei Onizuka, “A 12.4 TOPS/W, 20% Less Gate Count Bidirectional Phase Domain MAC Circuit for DNN Inference Applications”, *IEEE ASSCC*, Nov.2018.
2. Shusuke Kawai, Hiromitsu Aoyama, Rui Ito, Yutaka Shimizu, Mitsuyuki Ashida, Asuka Maki, Tomohiko Takeuchi, Hiroyuki Kobayashi, Go Urakawa, Hiroaki Hoshino, Kentaro Yoshioka, et al, “An 802.11 ax 4×4 spectrum-efficient WLAN AP transceiver SoC supporting 1024QAM with frequency-dependent IQ calibration and integrated interference analyzer” *IEEE ISSCC*, pp.442-444, Feb.2018.
3. M Nomura, A Muramatsu, H Takeno, S Hattori, D Ogawa, M Nasu, K Hirairi, S Kumashiro, S Moriwaki, Y Yamamoto, S Miyano, Y Hiraku, I Hayashi, K. Yoshioka, A Shikata, Hiroki Ishikuro, M Ahn, Y Okuma, X Zhang, Y Ryu, K Ishida, M Takamiya, Tadahiro Kuroda, H Shinohara, T Sakurai, “0.5 V image processor with 563 GOPS/W SIMD and 32bit CPU using high voltage clock distribution (HVCD) and adaptive frequency scaling (AFS) with 40nm CMOS” *IEEE Symp. VLSI Circuits*, pp.266-267, June 2013.
4. Ryota Sekimoto, Akira Shikata, Kentaro Yoshioka, Tadahiro Kuroda, Hiroki Ishikuro, “A 40nm CMOS full asynchronous nano-watt SAR ADC with 98% leakage power reduction by boosted self power gating” *IEEE ASSCC*, Nov.2012.

Revision Information.

July 1, 2019. Version 1.0 (Official version for Ph.D defence).